<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.imag.fr/TR/REC-html401/loose.dtd">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  <title>Setting up a mirror of the Rufus.W3.Org RPM database</title>
</head>

<body text="#000000" bgcolor="#FFFFFF" link="#0000EF" vlink="#51188E"
alink="#FF0000">

<center>
<h1>Setting up a mirror of the Rufus.W3.Org RPM database</h1>
</center>
This page explains how to set up a Web database for RPM packages similar to <a
href="../RPM/">the one running on rpmfind.net</a>. You should first get
acquainted with the mirroring principle described briefly in <a
href="mirroring.html">the mirroring proposal</a>. The setup itself should be
fairly simple:

<h3>Prerequisites</h3>
<ol>
  <li>You must of course have a Web server running. <a
    href="../rpm/apache.html">I suggest Apache</a>, the obvious choice for a
    Linux machine; it's probably installed by default anyway.</li>
  <li>You should run a mirror of the RDF database available at <a
    href="ftp://rpmfind.net/linux/RDF">ftp://rpmfind.net/linux/RDF</a>. To help
    bootstrap the mirroring process, it may prove more efficient to first
    fetch a <a href="ftp://rpmfind.net/linux/RDF.tar.gz">compressed archive of
    the whole RDF tree</a> and expand it. Note that you don't need to mirror
    the full tree; you can choose to prune some of the subtrees (but do not
    break the overall structure!). I suggest using <a
    href="../RPM/rsync.html">rsync</a> to do the mirroring. An
    alternative is to use the <a href="../RPM/mirror.html">mirror-2.8 Perl
    script</a>, but it's somewhat more difficult to set up.</li>
  <li>You should get a recent copy of rpm2html and install it; you can <a
    href="ftp://rpmfind.net/pub/rpm2html/">grab an RPM</a>, for example :-)
    The version must be &gt;= 0.90, and it's generally a good idea to follow
    the releases closely.</li>
  <li>Of course, you need disk space: currently the RDF tree requires
    1.3 GBytes, while the full HTML tree consumes nearly 4 GBytes.</li>
  <li>Subscribe to the rpm2html mailing list: send a mail to <a
    href="mailto:majordomo@rpmfind.net">majordomo@rpmfind.net</a> with the
    line<br>
    <tt>subscribe rpm2html</tt> <br>
    in the body of the message. The list <a href="messages/">archives are
    online</a>.</li>
</ol>

<h3>Setting up the mirror</h3>
You need to replicate the RDF database available at <a
href="ftp://rpmfind.net/linux/RDF">ftp://rpmfind.net/linux/RDF</a>.

<p>The simplest way is to use rsync; the command is simply</p>
<pre>rsync -az --delete rpmfind.net::RDF /linux/RDF</pre>
if you want to keep the metadata mirror under /linux/RDF. Note also that I am
interested in people providing HTTP access to the metadata, so on a standard
Linux setup /home/httpd/html/linux/RDF would be even better!
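
<p>As a hedged sketch of the pruning mentioned in the prerequisites, rsync's
standard <tt>--exclude</tt> and <tt>--dry-run</tt> options can be combined
with the command above. The excluded subtree <tt>/redhat/4.2/</tt> is a
hypothetical example, not a path confirmed to exist in the RDF tree; replace
it with the subtrees you actually want to skip:</p>
<pre># Preview what would be transferred, skipping one (hypothetical) subtree
rsync -az --delete --dry-run -v --exclude '/redhat/4.2/' rpmfind.net::RDF /linux/RDF
# When the preview looks right, run the same command without --dry-run
rsync -az --delete --exclude '/redhat/4.2/' rpmfind.net::RDF /linux/RDF</pre>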

<p>If you want to use mirror instead, install it (it is a set of Perl scripts
dedicated to mirroring FTP sites) and add an entry for the RDF repository to
the default configuration file (usually named mirror.defaults). Just add the
following lines at the end of your mirror.defaults:</p>
<pre>package=rdf
        site=rpmfind.net
        remote_dir=/linux/RDF
        local_dir=/home/httpd/html/linux/RDF
        remote_user=anonymous
        remote_password=me@machine RDF mirroring</pre>
Try it by launching <tt>mirror -d -p rdf</tt> and check for possible problems.

<h3>Setting up the rpm2html config file</h3>
I suggest <a
href="ftp://rpmfind.net/pub/rpm2html/rpm2html.config.mirrors">grabbing my
existing config file</a> and modifying it. This is a bit painful, but hopefully
it has to be done only once:

<h4>Modify the Global section</h4>
<ol>
  <li>Change the <b>maint</b> and <b>mail</b> values to reflect your name and
    preferred e-mail address for feedback.</li>
  <li>Change the <b>dir</b> path to the actual directory where the HTML files
    have to be produced (something like /home/httpd/html/RPM if you use the
    standard Apache setup). This has to be in your server's exported space and
    <b>the full tree consumes nearly 4 GBytes</b> (see the prerequisites), so check first that you have
    enough space!</li>
  <li>Change <b>url</b> to the prefix used to access the pages on your HTTP
    server. For example, if you are serving them from
    <b>/home/httpd/html/RPM</b>, the full URL to access them is
    <b>http://my.server.org/RPM</b> and the correct value would be
    <b>url=/RPM</b>.</li>
  <li>Remove any <b>rdf=true</b> or <b>rdf_dir=/linux/RDF</b> lines if present;
    those are used on rufus to create the .rdf files from the .rpm ones. You
    don't need them on a mirror.</li>
</ol>
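
<p>For illustration, the global settings after these edits might look like the
fragment below. Only the key names (<tt>maint</tt>, <tt>mail</tt>,
<tt>dir</tt>, <tt>url</tt>) come from the steps above; the values are
placeholders, and any other lines of the section in the downloaded config file
should be left as they are:</p>
<pre>maint=Your Name
mail=you@my.server.org
dir=/home/httpd/html/RPM
url=/RPM</pre>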

<h4>Modify each Directory section</h4>
After the global section, the config file is a list of directory-specific
sections, each usually related to one specific distribution. The goal here is
to adapt them to your local filesystem and point to the local FTP mirrors (for
example, you wouldn't point directly to the RedHat site but to one of the
mirrors in your area). You may drop some of the directories if you are too
tight on space or if there is no nearby mirror for a specific distribution.
Let's examine one entry:
<ol>
  <li><tt>[/linux/RDF/redhat/5.0/i386]</tt> : <b>change /linux</b> to the
    actual location on your disk for the mirror, e.g.:<br>
    <tt>[/home/ftp/pub/mirror/redhat/5.0/i386]</tt></li>
  <li><tt>name=RedHat-5.0 for i386</tt> : You probably don't have to change
    the name of the distribution, unless you want to translate it.</li>
  <li><tt>subdir=redhat/5.0/i386</tt> : local path, don't change it!</li>
  <li><tt>ftp=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS</tt>
    : The origin server for the packages, don't change it!</li>
  <li><tt>ftpsrc=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/SRPMS</tt> : The
    origin server for the sources; you may want to point to a nearby server
    providing the source RPMs.</li>
  <li><tt>color=#ffe0ff</tt> : Color code for this distribution; you can change
    it, but avoid giving nearly the same color to two different
    distributions.</li>
  <li><tt>mirror=ftp://rpmfind.net/linux/redhat/redhat-5.0/alpha/RedHat/RPMS</tt>
    : The <b>first nearest mirror</b>; customize it to reduce network
    traffic (don't reference the rufus server if you are located in
    Australia!).</li>
  <li><tt>mirror=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS</tt>
    : <b>additional mirrors</b> may be added; rpm2html currently doesn't use
    this feature, but will in the near future.</li>
</ol>
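
<p>Put together, the entry examined above would read as follows once adapted
to a local mirror. This is only a sketch assembled from the values listed in
the steps above, with the section path changed as in the first step; check it
against the corresponding entry in the config file you downloaded:</p>
<pre>[/home/ftp/pub/mirror/redhat/5.0/i386]
name=RedHat-5.0 for i386
subdir=redhat/5.0/i386
ftp=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS
ftpsrc=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/SRPMS
color=#ffe0ff
mirror=ftp://rpmfind.net/linux/redhat/redhat-5.0/alpha/RedHat/RPMS
mirror=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS</pre>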
<b>Note</b> that if you changed the configuration file for an existing setup,
you need to pass the -force option to rpm2html to ensure that all the pages
are updated.

<h3>Run rpm2html</h3>
Try it:

<p><tt>rpm2html rpm2html.config.mirrors</tt></p>

<p>Check for error messages indicating path or directory permission problems,
then point your favorite browser to the Web pages and ensure that the
internally generated links are correct, as well as the outside links to the
actual RPM mirrors.</p>

<h3>Automate the process</h3>
Add the mirror command that updates the RDF directory and the call to rpm2html
to your crontab. <b>Note</b> that rpm2html never cleans up previously generated
pages that are no longer accurate, so you need to remove them in your cron job
<b>before</b> running rpm2html:
<pre>0 4 * * * /usr/local/lib/mirror/mirror
30 6 * * * find /serveur/WWW/public/linux/RPM -not -type d -mtime +15 -exec rm {} \; ; /usr/bin/rpm2html -q /usr/share/rpm2html.config.mirrors</pre>

<h3>Announce it and register</h3>
Once you have a working setup, it would be cool to announce it to the <a
href="mailto:rpm2html@rpmfind.net">rpm2html mailing list</a> and to your
local Linux users group. Don't forget to give location (country, state)
information, as well as the dataset indexed if you don't run the full archive;
this has to be shared! <a href="mailto:Daniel.Veillard@imag.fr">Contact me</a> if you
want to localize the output of rpm2html; it's not that hard! <br>
 
<address>
  <a href="mailto:daniel@veillard.com">Daniel Veillard</a>
</address>
<br>
$Id: mirror.html,v 1.9 2001/02/21 18:45:34 veillard Exp $ <br>
 </body>
</html>
