Annotation of rpm2html/mirror.html, revision 1.6

1.3       veillard    1: <HTML>
                      2: <HEAD>
                      3:    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
                      4:    <META NAME="GENERATOR" CONTENT="Mozilla/4.04 [en] (X11; I; Linux 2.1.101 i686) [Netscape]">
                      5:    <TITLE>Setting up a mirror of Rufus.W3.Org RPM database</TITLE>
                      6: </HEAD>
                      7: <BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000EF" VLINK="#51188E" ALINK="#FF0000">
                      8: 
                      9: <CENTER>
                     10: <H1>
                     11: Setting up a mirror of Rufus.W3.Org RPM database</H1></CENTER>
                     12: This page explains how to set up a Web database for RPM packages similar
1.6     ! daniel     13: to <A HREF="../RPM/">the one running on rpmfind.net</A>. You should first
1.3       veillard   14: get acquainted with the mirroring principle described briefly in <A HREF="mirroring.html">the
                     15: mirroring proposal</A>. The setup itself should be fairly simple:
                     16: <H3>
                     17: Prerequisites</H3>
                     18: 
                     19: <OL>
                     20: <LI>
                     21: You must of course have a Web server running. <A HREF="../rpm/apache.html">I
                     22: suggest Apache</A>, the obvious choice for a Linux machine; it's probably
                     23: installed by default anyway.</LI>
                     24: 
                     25: <LI>
1.6     ! daniel     26: You should run a mirror of the RDF database available on <A HREF="ftp://ftp.rpm.org/pub/RDF">ftp://rpmfind.net/linux/RDF
1.3       veillard   27: </A>. To help bootstrap the mirroring process, it may prove more efficient
1.6     ! daniel     28: to first fetch a <A HREF="ftp://rpmfind.net/linux/RDF.tar.gz">compressed
1.3       veillard   29: archive of the whole RDF tree</A> and expand it. Note that you don't need
                     30: to mirror the full tree; you can choose to prune some of the subtrees (but
1.5       daniel     31: do not break the overall structure!). I suggest using
                     32: <A HREF="../RPM/rsync.html">rsync</A> to do the mirroring. An alternative
                     33: is to use the <A HREF="../RPM/mirror.html">mirror-2.8 perl script</A>, but it's
                     34: somewhat more difficult to set up.</LI>
                     35: 
1.3       veillard   36: 
                     37: <LI>
1.6     ! daniel     38: You should get a recent copy of rpm2html and install it; you can <A HREF="ftp://rpmfind.net/pub/rpm2html/">grab
1.3       veillard   39: an rpm</A> for example :-) (the version must be >= 0.90, and it's generally
                     40: a good idea to follow the releases closely).</LI>
                     41: 
                     42: <LI>
1.5       daniel     43: Of course, you need disk space: currently the RDF tree requires 130 MBytes
                     44: while the full generated HTML tree consumes nearly 400 MBytes (which is still
1.3       veillard   45: small compared to the 8 GBytes needed for the initial RPM
                     46: mirroring!).</LI>
                     47: 
                     48: <LI>
1.6     ! daniel     49: Subscribe to the rpm2html mailing list: send a mail to <A HREF="mailto:majordomo@rpmfind.net">majordomo@rpmfind.net</A>
1.3       veillard   50: with the line</LI>
                     51: 
                     52: <BR><TT>subscribe rpm2html</TT>
                     53: <BR>in the body of the message. The list <A HREF="messages/">archives are
                     54: on-line.</A></OL>
                     55: 
                     56: <H3>
                     57: Setting up the mirror</H3>
                     58: You need to replicate the RDF database available on <A HREF="ftp://ftp.rpm.org/pub/RDF">ftp://ftp.rpm.org/pub/RDF
1.5       daniel     59: </A>. 
                     60: 
                     61: <p>The simplest approach is to use rsync; the command is simply
                     62: <pre>
1.6     ! daniel     63: rsync -az --delete rpmfind.net::RDF /linux/RDF
1.5       daniel     64: </pre>
                     65: if you want to keep the metadata mirror under /linux/RDF. Note also that
                     66: I am interested in people providing HTTP access to the metadata, so on a standard
                     67: Linux setup /home/httpd/html/linux/RDF would be even better!</p>
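<P>Because <TT>--delete</TT> removes local files that are absent from the source, it can be worth previewing a run before pointing it at a new destination. The following is only a minimal sketch, assuming an rsync that supports <TT>--dry-run</TT> and <TT>--itemize-changes</TT>; the <TT>sync_rdf</TT> function, its <TT>preview</TT> argument and the <TT>RSYNC</TT> override variable are hypothetical names chosen for illustration.</P>

```shell
#!/bin/sh
# sync_rdf DEST [preview]
# Mirror the RDF module into DEST with the same rsync options as the plain
# command. With "preview" as second argument, rsync only lists what it
# would change and touches nothing.
# sync_rdf, "preview" and $RSYNC are illustrative names, not part of rpm2html.
sync_rdf() (
    dest=$1
    mode=${2:-}
    [ -n "$dest" ] || { echo "usage: sync_rdf DEST [preview]" >&2; exit 2; }
    # Start from the options used by the plain command.
    set -- -az --delete
    if [ "$mode" = preview ]; then
        set -- "$@" --dry-run --itemize-changes
    fi
    # RSYNC may name another rsync binary; it defaults to "rsync".
    ${RSYNC:-rsync} "$@" rpmfind.net::RDF "$dest"
)
```

<P>A typical use would be <TT>sync_rdf /linux/RDF preview</TT>, followed by <TT>sync_rdf /linux/RDF</TT> once the listed changes look reasonable.</P>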
                     68: 
                     69: 
                     70: <p>If you would rather use mirror, install it
                     71: (it is a set of perl
                     72: scripts dedicated to mirroring FTP sites) and add an entry for the RDF repository
1.3       veillard   73: to the default configuration (usually named mirror.defaults).
                     74: Just add the following lines at the end of your mirror.defaults:
                     75: <PRE>package=rdf
1.6     ! daniel     76: &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; site=rpmfind.net
1.3       veillard   77: &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; remote_dir=/linux/RDF
1.5       daniel     78: &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; local_dir=/home/httpd/html/linux/RDF
1.3       veillard   79: &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; remote_user=anonymous
1.5       daniel     80: &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; remote_password=me@machine RDF mirroring</PRE>
1.3       veillard   81: Try it by launching "mirror -d -p rdf" and check for possible problems.
                     82: <H3>
                     83: Setting up the rpm2html config file</H3>
1.6     ! daniel     84: I suggest <A HREF="ftp://rpmfind.net/pub/rpm2html/rpm2html.config.mirrors">grabbing
1.3       veillard   85: my existing config file</A> and modifying it. This is a bit painful, but hopefully
                     86: has to be done only once:
                     87: <H4>
                     88: Modify the Global section</H4>
                     89: 
                     90: <OL>
                     91: <LI>
                     92: Change the <B>maint</B> and <B>mail</B> values to reflect your name and
                     93: preferred e-mail address for feedback.</LI>
                     94: 
                     95: <LI>
                     96: Change the <B>dir</B> path to the actual directory where the HTML files
                     97: have to be produced (something like /home/httpd/html/RPM if you use the
                     98: standard Apache setup). This has to be in your server's exported space, and
                     99: <B>the tree may grow to 200 MBytes</B>, so check first that you have enough
                    100: space!</LI>
                    101: 
                    102: <LI>
                    103: Change <B>url</B> to the prefix used to access the pages on your HTTP server.
                    104: For example, if you are serving them from <B>/home/httpd/html/RPM</B> and the
                    105: full URL to access them is <B>http://my.server.org/RPM</B>, the correct
                    106: value would be: <B>url=/RPM</B>.</LI>
                    107: 
                    108: <LI>
                    109: Remove any <B>rdf=true</B> or <B>rdf_dir=/linux/RDF</B> if present; those
                    110: are used on rufus to create the .rdf files from the .rpm ones. You don't
                    111: need them on a mirror.</LI>
                    112: </OL>
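<P>The relation between <B>dir</B> and <B>url</B> can be illustrated with a small shell sketch: the <B>url</B> value is what remains of <B>dir</B> once the Web server's document root is stripped. The <TT>url_prefix</TT> function is a hypothetical helper, and it assumes that the pages are served straight from a single document root (no aliases or rewrites).</P>

```shell
#!/bin/sh
# url_prefix DOCROOT DIR
# Print the value to use for "url" when DIR (the rpm2html "dir" setting)
# lives under the Web server document root DOCROOT.
# url_prefix is an illustrative helper, not part of rpm2html.
url_prefix() (
    docroot=${1%/}   # drop one trailing slash, if any
    dir=${2%/}
    case $dir in
        "$docroot"/*)
            # Strip the document root, keeping the leading slash.
            printf '%s\n' "${dir#"$docroot"}"
            ;;
        *)
            echo "$dir is not under $docroot" >&2
            exit 1
            ;;
    esac
)
```

<P>With the paths used in this page, <TT>url_prefix /home/httpd/html /home/httpd/html/RPM</TT> prints <TT>/RPM</TT>.</P>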
                    113: 
                    114: <H4>
                    115: Modify each Directory section</H4>
                    116: After the global section, the config file is a list of directory-specific
                    117: sections, each usually related to one specific distribution. The goal here
                    118: is to adapt them to your local filesystem and point to the local FTP mirrors
                    119: (for example, you wouldn't point directly to the RedHat site but to one of
                    120: the mirrors in your area). You may drop some of the directories if you
                    121: are too tight on space or if there is no nearby mirror for a specific
                    122: distribution. Let's examine one entry:
                    123: <OL>
                    124: <LI>
                    125: <TT>[/linux/RDF/redhat/5.0/i386]</TT> :&nbsp; <B>change /linux</B><TT>
                    126: </TT>to the actual location on your disk for the mirror, e.g.:</LI>
                    127: 
                    128: <BR><TT>[/home/ftp/pub/mirror/redhat/5.0/i386]</TT>
                    129: <LI>
                    130: <TT>name=RedHat-5.0 for i386</TT> : You probably don't have to change the
                    131: name of the distribution, unless you want to translate it.</LI>
                    132: 
                    133: <LI>
                    134: <TT>subdir=redhat/5.0/i386</TT> : local path, don't change it !</LI>
                    135: 
                    136: <LI>
                    137: <TT>ftp=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS</TT>
                    138: : The origin server for the packages, don't change it !</LI>
                    139: 
                    140: <LI>
                    141: <TT>ftpsrc=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/SRPMS</TT> : The
                    142: origin server for the sources; you may want to point to a nearby server providing
                    143: the source RPMs.</LI>
                    144: 
                    145: <LI>
                    146: <TT>color=#ffe0ff </TT>: Color code for this distribution; you can change
                    147: it, but avoid giving nearly the same color to two different distributions.</LI>
                    148: 
                    149: <LI>
1.6     ! daniel    150: <TT>mirror=ftp://rpmfind.net/linux/redhat/redhat-5.0/alpha/RedHat/RPMS</TT>
1.3       veillard  151: : The <B>first, nearest mirror</B>; customize it to reduce bandwidth usage
                    152: (don't reference the rufus server if you are located in Australia!).</LI>
                    153: 
                    154: <LI>
                    155: <TT>mirror=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS</TT>
                    156: : <B>additional mirrors</B> may be added; rpm2html currently doesn't use
                    157: this feature, but will in the near future ...</LI>
                    158: </OL>
                    159: <B>Note</B> that if you changed the configuration file for an existing
                    160: setup, you need to pass the -force option to rpm2html to ensure that all
                    161: the pages are updated.
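<P>After editing many section headers by hand, a quick check that each one names an existing directory can save a failed run. The sketch below assumes that directory sections are introduced by a bracketed absolute path on a line of its own, as in the entry examined above; the <TT>check_config_dirs</TT> name is hypothetical, and the check says nothing about the other settings.</P>

```shell
#!/bin/sh
# check_config_dirs CONFIG
# Report every [/absolute/path] section header in CONFIG whose directory
# does not exist locally. Returns 0 when all of them exist.
# check_config_dirs is an illustrative helper, not part of rpm2html.
check_config_dirs() (
    config=$1
    [ -r "$config" ] || { echo "cannot read $config" >&2; exit 2; }
    # Keep only lines of the form [/some/path] and strip the brackets;
    # headers that do not start with a slash are ignored.
    sed -n 's|^\[\(/.*\)][[:space:]]*$|\1|p' "$config" | {
        status=0
        while IFS= read -r dir; do
            if [ ! -d "$dir" ]; then
                echo "missing directory: $dir"
                status=1
            fi
        done
        exit "$status"
    }
)
```

<P>For example, <TT>check_config_dirs rpm2html.config.mirrors</TT> prints one line per section whose directory is missing and returns a non-zero status if there is any.</P>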
                    162: <H3>
                    163: Run rpm2html</H3>
                    164: Try it:
                    165: 
                    166: <P><TT>rpm2html rpm2html.config.mirrors</TT>
                    167: 
                    168: <P>Check for error messages indicating path or directory permission problems,
                    169: then point your favorite browser to the Web pages and ensure that the links
                    170: generated internally are correct, as well as the outside links to the actual
                    171: RPM mirrors.
                    172: <BR>&nbsp;
                    173: <H3>
                    174: Automate the process</H3>
                    175: Add the mirror command to update the RDF directory and the call to rpm2html
                    176: to your crontab. <B>Note</B> that rpm2html never cleans up previously generated pages
                    177: that are no longer accurate, so you need to remove them in your cron job <B>before</B>
                    178: running rpm2html:
                    179: <UL>
                    180: <LI>
                    181: 0 4 * * * /usr/local/lib/mirror/mirror</LI>
                    182: 
                    183: <LI>
                    184: 30 6 * * * find /serveur/WWW/public/linux/RPM -not -type d -mtime +15 -exec
                    185: rm {} \; ; /usr/bin/rpm2html -q /usr/share/rpm2html.config.mirrors</LI>
                    186: </UL>
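<P>The cleanup step of the second cron entry can also be kept in a small script, which makes the directory and the age limit explicit parameters. This is only a sketch: <TT>prune_stale_pages</TT> is a hypothetical name, and it uses the portable <TT>!</TT> in place of <TT>-not</TT>.</P>

```shell
#!/bin/sh
# prune_stale_pages DIR DAYS
# Delete every non-directory file under DIR that has not been modified
# for more than DAYS days, like the find call in the cron entry.
# prune_stale_pages is an illustrative helper, not part of rpm2html.
prune_stale_pages() (
    dir=$1
    days=$2
    # Refuse to run without an existing directory and an age limit.
    [ -n "$dir" ] && [ -d "$dir" ] || { echo "no such directory: $dir" >&2; exit 1; }
    [ -n "$days" ] || { echo "usage: prune_stale_pages DIR DAYS" >&2; exit 2; }
    # "! -type d" keeps the directory tree itself in place.
    find "$dir" ! -type d -mtime "+$days" -exec rm -f {} +
)
```

<P>The cron job would then call something like <TT>prune_stale_pages /serveur/WWW/public/linux/RPM 15</TT> before running rpm2html.</P>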
                    187: 
                    188: <H3>
                    189: Announce it and register</H3>
1.6     ! daniel    190: Once you have a working setup, it would be cool to announce it to the <A HREF="mailto:rpm2html@rpmfind.net">rpm2html
1.4       veillard  191: mailing-list</A> and to your local Linux users group. Don't forget to
                    192: give location (country, state) information as well as the dataset indexed
                    193: if you don't run the full archive. This has to be shared!
                    194: 
                    195: <a href="mailto:veillard@w3.org">Contact me</a> if you want to localize
                    196: the output of rpm2html; it's not that hard!
1.3       veillard  197: <BR>&nbsp;
                    198: <ADDRESS>
                    199: <A HREF="mailto:Daniel.Veillard@w3.org">Daniel Veillard</A></ADDRESS>
                    200: 
1.6     ! daniel    201: <BR>$Id: mirror.html,v 1.5 1998/09/09 15:49:32 daniel Exp $
1.3       veillard  202: <BR>&nbsp;
                    203: </BODY>
                    204: </HTML>
