Setting up a mirror of the Rufus.W3.Org RPM database
This page explains how to set up a Web database for RPM packages similar
to the one running on rufus.w3.org. You should first
get acquainted with the mirroring principle described briefly in the
mirroring proposal. The setup itself should be fairly simple:
Prerequisites
-
You must of course have a Web server running. I
suggest Apache, the obvious choice for a Linux machine; it's probably
installed by default anyway.
-
You should run a mirror of the RDF database available on ftp://ftp.rpm.org/pub/RDF.
To help bootstrap the mirroring process, it may prove more efficient
to first fetch a compressed
archive of the whole RDF tree and expand it. Note that you don't need
to mirror the full tree; you can prune some of the subtrees (but
do not break the overall structure!). I suggest using the mirror-2.8
perl script to do the mirroring. [NOTE: until the FTP server
is available on ftp.rpm.org, use the RDF
database on rufus and the RDF.tar.gz
here]
-
You should get a recent copy of rpm2html and install it; you can grab
an rpm, for example :-) (the version must be >= 0.90, and it's generally
a good idea to follow the releases closely).
-
Of course, you need disk space: currently the RDF tree requires 60 MBytes,
while the full HTML tree built from it consumes nearly 300 MBytes (which is still
small compared to the 8 GBytes needed for the initial RPM mirroring!).
-
Subscribe to the rpm2html mailing list: send a mail to majordomo@rufus.w3.org
with the line
subscribe rpm2html
in the body of the message. The list archives are
on-line.
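The disk space prerequisite can be checked before anything is downloaded. The following is only a minimal sketch: has_space_mb is a hypothetical helper name, the 60 and 300 MByte figures are the ones quoted above, and the paths in the usage comments are the example paths used elsewhere on this page.

```shell
#!/bin/sh
# Sketch: check that a filesystem has room for the RDF and HTML trees.
# has_space_mb is a hypothetical helper, not part of mirror or rpm2html.

# has_space_mb DIR REQUIRED_MB
# Succeeds (exit status 0) if the filesystem holding DIR has at least
# REQUIRED_MB megabytes available, fails otherwise.
has_space_mb() {
    dir=$1
    required_mb=$2
    # df -P prints one POSIX-format line per filesystem, -k uses 1K blocks.
    # Field 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $((required_mb * 1024)) ]
}

# Example use (adjust the paths to your own layout):
#   has_space_mb /home/ftp/pub/linux/RDF 60 || echo "not enough room for RDF"
#   has_space_mb /home/httpd/html/RPM 300   || echo "not enough room for HTML"
```

If both trees live on the same filesystem, check once against the sum of the two figures instead.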
Setting up the mirror
You need to replicate the RDF database available on ftp://ftp.rpm.org/pub/RDF.
Basically, you need to install mirror (a set of perl scripts
dedicated to the job of mirroring FTP sites) and add an entry for the RDF
repository to its default configuration (usually named mirror.defaults).
Just add the following lines at the end of your mirror.defaults:
package=rdf
site=ftp.rpm.org
remote_dir=/linux/RDF
local_dir=/home/ftp/pub/linux/RDF
remote_user=anonymous
remote_password=hunter@esprit.net RDF mirroring
Try it by launching "mirror -d -p rdf" and check for possible problems.
[NOTE: until the FTP server is available on ftp.rpm.org, use
the RDF database on rufus and
the RDF.tar.gz here]
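If you script your installation, the entry can be appended without ever duplicating it. This is a sketch under assumptions: add_rdf_package is a hypothetical helper, the site and directory values are copied from the entry above, and the e-mail address passed as the anonymous password is your own.

```shell
#!/bin/sh
# add_rdf_package FILE EMAIL   (hypothetical helper)
# Appends the "rdf" package entry to FILE unless an entry named rdf is
# already present. EMAIL is used as the anonymous FTP password.
add_rdf_package() {
    defaults=$1
    email=$2
    if grep -q '^package=rdf$' "$defaults" 2>/dev/null; then
        return 0    # already configured, leave the file alone
    fi
    cat >> "$defaults" <<EOF
package=rdf
site=ftp.rpm.org
remote_dir=/linux/RDF
local_dir=/home/ftp/pub/linux/RDF
remote_user=anonymous
remote_password=$email
EOF
}

# Example use (the location of mirror.defaults depends on your install):
#   add_rdf_package /path/to/mirror.defaults you@your.site
#   mirror -d -p rdf
```

Running the helper twice leaves a single entry, so it is safe to keep in a setup script.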
Setting up the rpm2html config file
I suggest grabbing
my existing config file and modifying it. This is a bit painful, but hopefully
it has to be done only once:
Modify the Global section
-
Change the maint and mail values to reflect your name and
preferred e-mail address for feedback.
-
Change the dir path to the actual directory where the HTML files
are to be produced (something like /home/httpd/html/RPM if you use the
standard Apache setup). This has to be within the space exported by your server, and
the tree may grow to 200 MBytes, so first check that you have enough
space!
-
Change url to the prefix used to access the pages on your HTTP server.
For example, if you are serving them from /home/httpd/html/RPM, the
full URL to access them is http://my.server.org/RPM and the correct
value would be: url=/RPM
-
Remove any rdf=true or rdf_dir=/linux/RDF if present; those
are used on rufus to create the .rdf files from the .rpm ones. You don't
need them on a mirror.
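The Global section changes listed above can also be applied with standard filters instead of by hand. The sketch below assumes that the Global section is everything before the first line starting with "[" and that each setting is a plain key=value line. Both helper names are hypothetical.

```shell
#!/bin/sh
# strip_rdf_keys < in > out   (hypothetical helper)
# Drops the rdf=true and rdf_dir=... lines that are only needed on rufus.
strip_rdf_keys() {
    sed -e '/^[[:space:]]*rdf=true[[:space:]]*$/d' \
        -e '/^[[:space:]]*rdf_dir=/d'
}

# set_global_key KEY VALUE < in > out   (hypothetical helper)
# Replaces KEY=... with KEY=VALUE, but only in the Global section,
# assumed here to end at the first line starting with "[".
set_global_key() {
    awk -v key="$1" -v val="$2" '
        BEGIN { in_global = 1 }
        /^\[/ { in_global = 0 }
        in_global && index($0, key "=") == 1 { print key "=" val; next }
        { print }
    '
}

# Example use:
#   strip_rdf_keys < config.rpm2html |
#       set_global_key dir /home/httpd/html/RPM |
#       set_global_key url /RPM > config.rpm2html.mirrors
```

Writing to a new file rather than editing in place keeps the original config around for comparison.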
Modify each Directory section
After the global section, the config file is a list of directory-specific
information, usually related to one specific distribution. The goal here
is to adapt it to your local filesystem and point to the local FTP mirrors
(for example, you wouldn't point directly to the RedHat site but to one of
the mirrors in your area). You may drop some of the directories if you
are too tight on space or if there is no nearby mirror for this specific
distribution. Let's examine one entry:
-
[/linux/RDF/redhat/5.0/i386] : change /linux
to the actual location on your disk for the mirror, e.g.:
[/home/ftp/pub/mirror/redhat/5.0/i386]
-
name=RedHat-5.0 for i386 : You probably don't have to change the
name of the distribution, unless you want to translate it.
-
subdir=redhat/5.0/i386 : local path, don't change it !
-
ftp=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS
: The origin server for the packages, don't change it !
-
ftpsrc=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/SRPMS : The
origin server for the sources; you may want to point to a nearby server providing
the source RPMs.
-
color=#ffe0ff : Color code for this distribution; you can change
it, but avoid giving nearly the same color to two different distributions.
-
mirror=ftp://rufus.w3.org/linux/redhat/redhat-5.0/alpha/RedHat/RPMS
: The nearest mirror; customize it to reduce bandwidth usage
(don't reference the rufus server if you are located in Australia!).
-
mirror=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS
: additional mirrors may be added; rpm2html currently doesn't use
this feature, but will in the near future ...
Note that if you changed the configuration file for an existing
setup, you need to pass the -force option to rpm2html to ensure that all
the pages are updated.
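Rewriting the bracketed section headers by hand gets tedious when the config lists many distributions. Here is a minimal sketch of doing it with sed. rewrite_section_prefix is a hypothetical helper, and both prefixes are passed as arguments because the right values depend on your disk layout.

```shell
#!/bin/sh
# rewrite_section_prefix OLD NEW < in > out   (hypothetical helper)
# Replaces the leading OLD path prefix with NEW in every [section] header.
# Only lines that start with "[" are touched. The prefixes must not
# contain "|" or regular expression metacharacters.
rewrite_section_prefix() {
    old=$1
    new=$2
    # "|" is used as the sed delimiter because the prefixes contain slashes.
    sed -e "s|^\[$old/|[$new/|"
}

# Example use, with the paths from the entry examined above:
#   rewrite_section_prefix /linux/RDF /home/ftp/pub/mirror \
#       < config.rpm2html > config.rpm2html.mirrors
```

The subdir, ftp and other key=value lines are left untouched, which matches the advice above not to change them blindly.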
Run rpm2html
Try it:
rpm2html config.rpm2html.mirrors
Check for error messages indicating path or directory permission problems,
then point your favorite browser to the Web pages and ensure that the links
generated internally are correct, as well as the outside links to the actual
RPM mirrors.
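Many path problems can be caught before rpm2html is even started. The sketch below assumes that every section header of the form [/some/path] names a local directory that must exist and be readable. list_bad_dirs is a hypothetical helper, not an rpm2html feature.

```shell
#!/bin/sh
# list_bad_dirs CONFIG   (hypothetical helper)
# Prints the path of every [/path] section header in CONFIG that is not
# a readable directory. Prints nothing when they all look fine.
list_bad_dirs() {
    while IFS= read -r line; do
        case $line in
            "[/"*"]")
                dir=${line#"["}     # strip the leading bracket
                dir=${dir%"]"}      # strip the trailing bracket
                ;;
            *)
                continue            # not a directory section header
                ;;
        esac
        if [ ! -d "$dir" ] || [ ! -r "$dir" ]; then
            printf '%s\n' "$dir"
        fi
    done < "$1"
}

# Example use:
#   list_bad_dirs config.rpm2html.mirrors
```

An empty result only means the directories exist; rpm2html may still report other problems, so its own messages remain the reference.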
Automate the process
Add the mirror command to update the RDF directory and the call to rpm2html
to your crontab. Note that rpm2html never cleans up previously generated pages
that are no longer accurate, so you need to add a cleanup step to your cron job before
running rpm2html:
-
0 4 * * * /usr/local/lib/mirror/mirror
-
30 6 * * * find /serveur/WWW/public/linux/RPM -not -type d -mtime +15 -exec rm {} \; ; /usr/bin/rpm2html -q /usr/share/rpm2html.config.mirrors
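If you prefer a single script over long crontab lines, the cleanup and the rebuild can be wrapped in shell functions. This is a sketch only: prune_old_pages and nightly_update are hypothetical names, and the paths and the 15-day limit are simply copied from the crontab entries above.

```shell
#!/bin/sh
# prune_old_pages DIR DAYS   (hypothetical helper)
# Removes everything under DIR that is not a directory and has not been
# modified for more than DAYS days. Directories themselves are kept.
prune_old_pages() {
    find "$1" ! -type d -mtime +"$2" -exec rm -f {} \;
}

# nightly_update   (hypothetical wrapper)
# Refreshes the RDF mirror, prunes stale pages, then rebuilds the HTML tree.
# It stops early if the mirror run fails; that is a choice made in this
# sketch, not something the crontab entries do.
nightly_update() {
    /usr/local/lib/mirror/mirror || return 1
    prune_old_pages /serveur/WWW/public/linux/RPM 15
    /usr/bin/rpm2html -q /usr/share/rpm2html.config.mirrors
}

# Example use: save this file as a script, add a final line calling
# nightly_update, and schedule that script from a single crontab entry.
```

Running the steps in sequence also removes the need to guess how long the mirror run takes before rpm2html starts.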
Announce it and register
Once you have a working setup, it would be cool to announce it to the rpm2html
mailing list and to your local Linux users group. Don't forget to
give location (country, state) information as well as the dataset indexed
if you don't run the full archive. This has to be shared!
Contact me if you want to localize
the output of rpm2html; it's not that hard!
Daniel Veillard
$Id: mirror.html,v 1.4 1998/05/15 16:17:57 veillard Exp $