Annotation of rpm2html/mirror.html, revision 1.6
1.3 veillard 1: <HTML>
2: <HEAD>
3: <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
4: <META NAME="GENERATOR" CONTENT="Mozilla/4.04 [en] (X11; I; Linux 2.1.101 i686) [Netscape]">
5: <TITLE>Setting up a mirror of Rufus.W3.Org RPM database</TITLE>
6: </HEAD>
7: <BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000EF" VLINK="#51188E" ALINK="#FF0000">
8:
9: <CENTER>
10: <H1>
11: Setting up a mirror of Rufus.W3.Org RPM database</H1></CENTER>
12: This page explains how to set up a Web database for RPM packages similar
1.6 ! daniel 13: to <A HREF="../RPM/">the one running on rpmfind.net</A>. You should first
1.3 veillard 14: get acquainted with the mirroring principle described briefly in <A HREF="mirroring.html">the
15: mirroring proposal</A>. The setup itself should be fairly simple:
16: <H3>
17: Prerequisites</H3>
18:
19: <OL>
20: <LI>
21: You must of course have a Web server running. <A HREF="../rpm/apache.html">I
22: suggest Apache</A>, the obvious choice for a Linux machine; it's probably
23: installed by default anyway.</LI>
24:
25: <LI>
1.6 ! daniel 26: You should run a mirror of the RDF database available on <A HREF="ftp://rpmfind.net/linux/RDF">ftp://rpmfind.net/linux/RDF
1.3 veillard 27: </A>. To help bootstrap the mirroring process, it may prove more efficient
1.6 ! daniel 28: to first fetch a <A HREF="ftp://rpmfind.net/linux/RDF.tar.gz">compressed
1.3 veillard 29: archive of the whole RDF tree</A> and expand it. Note that you don't need
30: to mirror the full tree; you can choose to prune some of the subtrees (but
1.5 daniel 31: do not break the overall structure !). I suggest using
32: <A HREF="../RPM/rsync.html">rsync</A> to do the mirroring. An alternative
33: is to use the <A HREF="../RPM/mirror.html">mirror-2.8 perl script</A>, but it's
34: somewhat more difficult to set up.
35:
1.3 veillard 36:
37: <LI>
1.6 ! daniel 38: You should get a recent copy of rpm2html and install it; you can <A HREF="ftp://rpmfind.net/pub/rpm2html/">grab
1.3 veillard 39: an rpm</A> for example :-) (the version must be &gt;= 0.90, and it's generally
40: a good idea to follow the releases closely).</LI>
41:
42: <LI>
1.5 daniel 43: Of course, you need disk space: currently the RDF tree requires 130 MBytes
44: while the full HTML tree, once built, consumes nearly 400 MBytes (which is still
1.3 veillard 45: small compared to the 8 GBytes needed for the initial mirroring of the RPMs
46: themselves !).</LI>
47:
48: <LI>
1.6 ! daniel 49: Subscribe to the rpm2html mailing-list: send a mail to <A HREF="mailto:majordomo@rpmfind.net">majordomo@rpmfind.net</A>
1.3 veillard 50: with the line</LI>
51:
52: <BR><TT>subscribe rpm2html</TT>
53: <BR>in the body of the message. The list <A HREF="messages/">archives are
54: on-line.</A></OL>
55:
56: <H3>
57: Setting up the mirror</H3>
58: You need to replicate the RDF database available on <A HREF="ftp://ftp.rpm.org/pub/RDF">ftp://ftp.rpm.org/pub/RDF
1.5 daniel 59: </A>.
60:
61: <p>The simplest way is to use rsync; the command is simply
62: <pre>
1.6 ! daniel 63: rsync -az --delete rpmfind.net::RDF /linux/RDF
1.5 daniel 64: </pre>
65: if you want to keep the metadata mirror under /linux/RDF. Note also that
66: I am interested in people providing HTTP access to the metadata, so on a standard
67: Linux setup /home/httpd/html/linux/RDF would be even better !</p>
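<p>As a rough sketch of that suggestion, the same rsync call can be wrapped in a
small shell function that targets the HTTP-exported directory and passes extra
options through, so that a dry run (<tt>-n</tt>) can preview what <tt>--delete</tt>
would remove. The function name <tt>sync_rdf</tt> and the <tt>RSYNC</tt>,
<tt>RDF_SOURCE</tt> and <tt>RDF_DEST</tt> variables are illustrative assumptions,
not part of rsync or rpm2html:</p>
<pre>
#!/bin/sh
# Hypothetical helper; the defaults reuse the module and path named above.
RDF_SOURCE=${RDF_SOURCE:-rpmfind.net::RDF}
RDF_DEST=${RDF_DEST:-/home/httpd/html/linux/RDF}

# Mirror the RDF tree; any arguments are passed to rsync as extra options.
# Setting RSYNC overrides the program that is run (assumed default: rsync).
sync_rdf() {
    ${RSYNC:-rsync} -az --delete "$@" "$RDF_SOURCE" "$RDF_DEST"
}
</pre>
<p>With this in place, <tt>sync_rdf -nv</tt> would only list the pending changes,
while <tt>sync_rdf</tt> alone performs the actual update.</p>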
68:
69:
70: <p>If you would rather use mirror, install it
71: (this is a set of perl
72: scripts dedicated to the job of mirroring FTP sites) and add to the default
1.3 veillard 73: configuration (usually named mirror.defaults) an entry for the RDF repository.
74: Just add the following lines at the end of your mirror.defaults:
75: <PRE>package=rdf
1.6 ! daniel 76: site=rpmfind.net
1.3 veillard 77: remote_dir=/linux/RDF
1.5 daniel 78: local_dir=/home/httpd/html/linux/RDF
1.3 veillard 79: remote_user=anonymous
1.5 daniel 80: remote_password=me@machine RDF mirroring</PRE>
1.3 veillard 81: Try it by launching "mirror -d -p rdf" and check for possible problems.
82: <H3>
83: Setting up the rpm2html config file</H3>
1.6 ! daniel 84: I suggest <A HREF="ftp://rpmfind.net/pub/rpm2html/rpm2html.config.mirrors">grabbing
1.3 veillard 85: my existing config file</A> and modifying it. This is a bit painful, but hopefully
86: has to be done only once:
87: <H4>
88: Modify the Global section</H4>
89:
90: <OL>
91: <LI>
92: Change the <B>maint</B> and <B>mail</B> values to reflect your name and
93: preferred E-mail address for feedback.</LI>
94:
95: <LI>
96: Change the <B>dir</B> path to the actual directory where the HTML files
97: have to be produced (something like /home/httpd/html/RPM if you use the
98: standard apache setup). This has to be in your server's exported space and
99: <B>the tree may grow to 200 MBytes</B> so check first that you have enough
100: space !</LI>
101:
102: <LI>
103: Change <B>url</B> to the prefix used to access the pages on your HTTP server.
104: For example if you are serving them from <B>/home/httpd/html/RPM</B>, the
105: full URL to access them is <B>http://my.server.org/RPM</B> and the correct
106: value would be : <B>url=/RPM</B> .</LI>
107:
108: <LI>
109: Remove any <B>rdf=true</B> or <B>rdf_dir=/linux/RDF</B> lines if present; those
110: are used on rufus to create the .rdf files from the .rpm ones. You don't
111: need them on a mirror.</LI>
112: </OL>
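<P>Put together, the edited settings might look like the fragment below. This is
only a sketch: the name, e-mail address and server layout are placeholders, and the
remaining lines of the Global section in the downloaded file are left as they are.
<PRE>maint=Your Name
mail=you@my.server.org
dir=/home/httpd/html/RPM
url=/RPM</PRE>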
113:
114: <H4>
115: Modify each Directory section</H4>
116: After the global section, the config file is a list of directory-specific
117: entries, each usually related to one specific distribution. The goal here
118: is to adapt it to your local filesystem and point to the local FTP mirrors
119: (for example, you wouldn't point directly to the RedHat site but to one of
120: the mirrors in your area). You may drop some of the directories if you
121: are too tight on space or if there is no nearby mirror for this specific
122: distribution. Let's examine one entry:
123: <OL>
124: <LI>
125: <TT>[/linux/RDF/redhat/5.0/i386]</TT> : <B>change /linux</B><TT>
126: </TT>to the actual location on your disk for the mirror, e.g.:</LI>
127:
128: <BR><TT>[/home/ftp/pub/mirror/redhat/5.0/i386]</TT>
129: <LI>
130: <TT>name=RedHat-5.0 for i386</TT> : You probably don't have to change the
131: name of the distribution, unless you want to translate it.</LI>
132:
133: <LI>
134: <TT>subdir=redhat/5.0/i386</TT> : local path, don't change it !</LI>
135:
136: <LI>
137: <TT>ftp=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS</TT>
138: : The origin server for the packages, don't change it !</LI>
139:
140: <LI>
141: <TT>ftpsrc=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/SRPMS</TT> : The
142: origin server for the sources, you may want to point to a near server providing
143: the sources RPMs.</LI>
144:
145: <LI>
146: <TT>color=#ffe0ff </TT>: Color code for this distribution, you can change
147: that, but avoid giving nearly the same color to two different distributions.</LI>
148:
149: <LI>
1.6 ! daniel 150: <TT>mirror=ftp://rpmfind.net/linux/redhat/redhat-5.0/alpha/RedHat/RPMS</TT>
1.3 veillard 151: : The <B>first nearest mirror</B>; customize it to reduce bandwidth usage
152: (don't reference the rufus server if you are located in Australia !).</LI>
153:
154: <LI>
155: <TT>mirror=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS</TT>
156: : <B>additional mirrors</B> may be added; rpm2html currently doesn't use
157: this feature, but will in the near future ...</LI>
158: </OL>
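<P>Assembled from the lines examined above, a complete adapted entry might look
like the following sketch. The section path is the example local location given in
the first item; every other value is copied unchanged from the items above, so
replace the first <TT>mirror</TT> URL with a server near you:
<PRE>[/home/ftp/pub/mirror/redhat/5.0/i386]
name=RedHat-5.0 for i386
subdir=redhat/5.0/i386
ftp=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS
ftpsrc=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/SRPMS
color=#ffe0ff
mirror=ftp://rpmfind.net/linux/redhat/redhat-5.0/alpha/RedHat/RPMS
mirror=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS</PRE>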
159: <B>Note</B> that if you changed the configuration file for an existing
160: setup, you need to pass the -force option to rpm2html to ensure that all
161: the pages are updated.
162: <H3>
163: Run rpm2html</H3>
164: Try it:
165:
166: <P><TT>rpm2html rpm2html.config.mirrors</TT>
167:
168: <P>Check for error messages indicating path or directory permission problems,
169: then point your favorite browser to the Web pages and ensure that the links
170: generated internally are correct, as well as the outside links to the actual
171: RPM mirrors.
172: <BR>
173: <H3>
174: Automate the process</H3>
175: Add the mirror command to update the RDF directory and the call to rpm2html
176: to your crontab. <B>Note </B>that rpm2html never cleans up previously generated
177: pages that are no longer accurate, so you need to remove them in your cron job <B>before</B>
178: running rpm2html:
179: <UL>
180: <LI>
181: 0 4 * * * /usr/local/lib/mirror/mirror</LI>
182:
183: <LI>
184: 30 6 * * * find /serveur/WWW/public/linux/RPM -not -type d -mtime +15 -exec
185: rm {} \; ; /usr/bin/rpm2html -q /usr/share/rpm2html.config.mirrors</LI>
186: </UL>
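<P>The pruning and regeneration steps can also be collected in one script called
from cron. The following is a minimal sketch, not something shipped with rpm2html:
the function names and the <TT>HTML_DIR</TT>, <TT>CONFIG</TT> and
<TT>MAX_AGE_DAYS</TT> variables are assumptions, with defaults copied from the
crontab entries above.
<PRE>#!/bin/sh
# Hypothetical wrapper; default paths are copied from the crontab entries above.
HTML_DIR=${HTML_DIR:-/serveur/WWW/public/linux/RPM}
CONFIG=${CONFIG:-/usr/share/rpm2html.config.mirrors}
MAX_AGE_DAYS=${MAX_AGE_DAYS:-15}

# Remove files (never directories) under $1 last modified more than $2 days ago.
# "! -type d" is the portable spelling of the "-not -type d" test used above.
prune_stale_pages() {
    find "$1" ! -type d -mtime +"$2" -exec rm -f {} \;
}

# Prune pages that were not regenerated recently, then rebuild the HTML tree.
rebuild_pages() {
    prune_stale_pages "$HTML_DIR" "$MAX_AGE_DAYS" || return 1
    /usr/bin/rpm2html -q "$CONFIG"
}</PRE>
The script only defines the functions; add a final line calling
<TT>rebuild_pages</TT> and point the second crontab entry at the script to use it.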
187:
188: <H3>
189: Announce it and register</H3>
1.6 ! daniel 190: Once you have a working setup, it would be cool to announce it to the <A HREF="mailto:rpm2html@rpmfind.net">rpm2html
1.4 veillard 191: mailing-list</A>, and to your local Linux users group. Don't forget to
192: give location (country, state) information as well as the dataset indexed
193: if you don't run the full archive. This has to be shared !
194:
195: <a href="mailto:veillard@w3.org">Contact me</a> if you want to localize
196: the output of rpm2html, it's not that hard !
1.3 veillard 197: <BR>
198: <ADDRESS>
199: <A HREF="mailto:Daniel.Veillard@w3.org">Daniel Veillard</A></ADDRESS>
200:
1.6 ! daniel 201: <BR>$Id: mirror.html,v 1.5 1998/09/09 15:49:32 daniel Exp $
1.3 veillard 202: <BR>
203: </BODY>
204: </HTML>