diff --git a/en_US.ISO8859-1/articles/hubs/article.sgml b/en_US.ISO8859-1/articles/hubs/article.sgml index 15e7885dc6..812cd19ccd 100644 --- a/en_US.ISO8859-1/articles/hubs/article.sgml +++ b/en_US.ISO8859-1/articles/hubs/article.sgml @@ -1,1092 +1,1092 @@ %man; %authors; %mailing-lists; ]>
Mirroring FreeBSD $FreeBSD$ Jun Kuriyama
kuriyama@FreeBSD.org
Valentino Vaschetto
logo@FreeBSD.org
Daniel Lang
dl@leo.org
An in-progress article on how to mirror FreeBSD, aimed at hub administrators.
Requirements for FreeBSD mirrors

Disk Space

Disk space is one of the most important requirements. Depending on the set of releases, architectures, and degree of completeness you want to mirror, a huge amount of disk space may be consumed. Also keep in mind that official mirrors are probably required to be complete. The CVS repository and the web pages should always be mirrored completely. Also note that the numbers stated here reflect the current state (at 4.5-RELEASE). Further development and releases will only increase the required amount. Also make sure to keep some extra space (ca. 10-20%) around, just to be sure. Here are some approximate figures:

  Full FTP Distribution: 60 GB
  CVS repository: 1.4 GB
  CTM deltas: 1.5 GB
  Webpages: 150 MB

Network Connection/Bandwidth

- Of course, you need to be connected to the internet.
+ Of course, you need to be connected to the Internet.
The required bandwidth depends on your intended use of the mirror. If you just want to mirror some parts of FreeBSD for local use at your site/intranet, the demand may be much smaller than if you want to make the files publicly available, or even intend to become an official mirror. We can only give rough estimates here:

  Local site, no public access: basically no minimum, but < 2 Mbps could make syncing a pain.
  Unofficial public site: 34 Mbps is probably a good start.
  Official site: > 100 Mbps is recommended; your host should also be connected as close as possible to your border router.

System Requirements, CPU, RAM

This also depends on the expected number of clients, which is determined by the server's policy. It is also affected by the types of services you want to offer. Plain FTP or HTTP services may not require a huge amount of resources. Watch out if you provide CVSup, rsync or even AnonCVS. This can have a huge impact on CPU and memory requirements. Rsync in particular is considered a memory hog, and CVSup does indeed consume some CPU.
For AnonCVS it might
- be a nice idea to set up a memory resident filesystem (MFS) of at least
+ be a nice idea to set up a memory resident file system (MFS) of at least
300 MB, so you need to take this into account for your memory requirements. The following are just examples to give you a very rough hint.

For a moderately visited site that offers rsync, you might
- consider a current CPU with around 800Mhz - 1 GHz,
+ consider a current CPU with around 800MHz - 1 GHz,
and at least 512MB RAM. This is probably the minimum you want for an official site.

For a frequently used site you definitely need more RAM (consider 2GB as a good start), and possibly more CPU, which could also mean that you need to go for an SMP system.

You also want to consider a fast disk subsystem. Operations on the CVS repository require a fast disk subsystem (RAID is highly advised). A SCSI controller that has a cache of its own can also speed things up, since most of these services incur a very large number of small modifications to the disk.

You can also experiment with enlarging the portion
- of system memory which is used for the filesystem buffer cache.
+ of system memory which is used for the file system buffer cache.
This will also help to reduce the quantity of disk access. This can be done with the BUFCACHEPERCENT kernel option. The default is to use 5% of system memory.

Services to offer

Every mirror site is required to have a set of core services available. In addition to these required services, there are a number of optional services that server administrators may choose to offer. This section explains which services you can provide and how to go about implementing them.

FTP (required for FTP fileset)

This is one of the most basic services, and it is required for each mirror offering public FTP distributions. FTP access must be anonymous, and no upload/download ratios are allowed (a ridiculous thing anyway).
Upload capability is not required (and must never be allowed for the FreeBSD file space). Also, the FreeBSD archive should be available under the path /pub/FreeBSD.

There is a lot of software available which can be set up to allow anonymous FTP (in alphabetical order):

  /usr/libexec/ftpd: FreeBSD's own ftpd can be used. Be sure to read &man.ftpd.8;.
  ftp/ncftpd: A commercial package, free for educational use.
  ftp/oftpd: An ftpd designed with security as a main focus.
  ftp/proftpd: A modular and very flexible ftpd.
  ftp/pure-ftpd: Another ftpd developed with security in mind.
  ftp/twoftpd: As above.
  ftp/vsftpd: The very secure ftpd.
  ftp/wu-ftpd: The ftpd from Washington University. It has become infamous because of the large number of security issues that have been found in it. If you do choose to use this software, be sure to keep it up to date.

FreeBSD's ftpd, proftpd, wu-ftpd and maybe ncftpd are among the most commonly used ones. The others do not have a large user base among mirror sites.

RSYNC (optional for FTP fileset)

Rsync is often also offered, for convenience, for the contents of the FTP area of FreeBSD. The protocol differs from FTP in many ways, and overall it is much more bandwidth friendly, as only differences between files are transferred, not whole files. Rsync does require a significant amount of memory for each instance. The size depends on the size of the synced module in terms of the number of directories and files. Rsync can use rsh and ssh (now default) as a transport, or use its own protocol for stand-alone access (this is the preferred method for public rsync servers). Authentication, connection limits, and other restrictions may be applied. There is just one software package available:

  net/rsync

HTTP (required for webpages, optional for FTP fileset)

If you want to offer the FreeBSD webpages, you need to install a webserver, a.k.a. httpd. You may optionally offer the FTP fileset via HTTP.
The choice of webserver software is left up to the mirror administrator. Some of the most popular choices are:

  www/apache13: Apache is the most widely deployed webserver on the Internet. It is used extensively by the FreeBSD Project. You may also wish to use the next generation of the Apache webserver, available in the ports collection as www/apache2.
  www/thttpd: If you are going to be serving a large amount of static content, you may find that using an application such as tHttpd is more efficient than Apache. It is optimized for excellent performance on FreeBSD.
  www/boa: Boa is another alternative to tHttpd and Apache. It should provide considerably better performance than Apache for purely static content. It does not, at the time of writing, contain the same set of optimizations for FreeBSD that are found in tHttpd.

CVSup (desired for CVS repository)

CVSup is a very efficient way of distributing files. It works similarly to rsync, but was specially designed for use with CVS repositories. If you want to offer the FreeBSD CVS repository, you really want to consider offering it via CVSup. It is still possible to offer the CVS repository via AnonCVS, FTP, rsync or HTTP, but people would benefit much more from CVSup access. CVSup was developed by &a.jdp;. It is a bit tricky to install on non-FreeBSD platforms, since it is written in Modula-3 and therefore requires a Modula-3 environment. John Polstra has built a stripped down version of M3 that is sufficient to run CVSup and can be installed much more easily. See Ezm3 for details. Related ports are:

  net/cvsup: The native CVSup port (client and server), which now requires lang/ezm3.
  net/cvsup-mirror: The CVSup mirror kit, which requires net/cvsup and configures it mirror-ready. Some site administrators may want a different setup, though.

There are a few more, like net/cvsupit and net/cvsup-without-gui, that you might want to have a look at. If you prefer a static binary package, take a look here.
This page still refers to the S1G bug that was present in CVSup. Maybe John will set up a generic download site to get static binaries for various platforms.

It is possible to use CVSup to offer any kind of fileset, not just CVS repositories, but configuration can be complex. CVSup is known to eat some CPU on the server as well as on the client, since it needs to compare lots of files.

AnonCVS (optional for CVS repository)

If you have the CVS repository, you may want to offer anonymous CVS access. A short warning first: there is not that much demand for it, it requires some experience, and you need to know what you are doing.

Generally there are two ways to access a CVS repository remotely: via pserver or via ssh (we don't consider rsh). For anonymous access, pserver is very well suited, but some still offer ssh access as well. There is a custom crafted wrapper in the CVS repository, to be used as a login shell for the anonymous ssh account. It does a chroot, and therefore requires the CVS repository to be available under the anonymous user's home directory, which may not be possible for all sites. If you just offer pserver this restriction does not apply, but you may run more security risks. You don't need to install any special software, since &man.cvs.1; comes with FreeBSD. You need to enable access via inetd, so add an entry to your /etc/inetd.conf like this:

    cvspserver stream tcp nowait root /usr/bin/cvs cvs -f -l -R -T /anoncvstmp --allow-root=/home/ncvs pserver

See the manpage for details of the options. See also the cvs info page about additional ways to make sure access is read-only. It is advisable that you create an unprivileged account, preferably called anoncvs. You also need to create a file passwd in your /home/ncvs/CVSROOT and assign a CVS password (empty or anoncvs) to that user. The directory /anoncvstmp is a special
- purpose memory based filesystem. It is not required but
+ purpose memory based file system.
It is not required but advised, since &man.cvs.1; creates a shadow directory structure in your /tmp which is not used after the operation, but slows things down dramatically if real disk operations are required. Here is an excerpt from /etc/fstab showing how to set up such an MFS:

    /dev/da0s1b /anoncvstmp mfs rw,-s=786432,-b=4096,-f=512,-i=560,-c=3,-m=0,nosuid,nodev 0 0

This is (of course) tuned a lot, and was suggested by &a.jdp;.

How to mirror FreeBSD

Ok, now you know the requirements and how to offer the services, but not how to get the data. :-) This section explains how to actually mirror the various parts of FreeBSD, what tools to use, and where to mirror from.

FTP

The FTP area is the largest amount of data that needs to be mirrored. It includes the distribution sets required for network installation, the branches, which are actually snapshots of checked-out source trees, the ISO images to write CD-ROMs with the installation distribution,
- a live filesystem, and lots of packages, the ports tree,
+ a live file system, and lots of packages, the ports tree,
distfiles and a huge amount of packages. All of course for various FreeBSD versions, and the i386 and alpha architectures.

With FTP mirror

You can use an FTP mirror program to get the files. There are a lot around, and some are widely used, like:

  ftp/mirror
  ftp/ftpmirror
  ftp/emirror
  ftp/spegla
  ftp/omi
  some even use ftp/wget

ftp/mirror was very popular, but seemed to have some drawbacks, as it is written in &man.perl.1; and had real problems mirroring large directories like a FreeBSD site. There are rumors that the current version has fixed this by allowing a different algorithm to be specified for comparing the directory structure.

In general FTP is not really good for mirroring, since it transfers each changed file in its entirety, and does not create a single data stream that would benefit from a large TCP congestion window.

With RSYNC

A better way to mirror the FTP area is rsync.
You can install the port net/rsync and then use rsync to sync with your upstream host. rsync is already mentioned in . Since rsync access is not required, your preferred upstream site may not allow it. Since it is quite common, though, chances are small that you cannot use it. You can always consider using an upstream server that offers it, just for the benefits of rsync.

Since the number of rsync clients will have a significant impact on the server machine, most admins impose limitations on their server. For a mirror, you should ask the maintainer of the site you are syncing from about their policy, and maybe about an exception for your host (since you are a mirror).

A command line to mirror FreeBSD could look like this:

    &prompt.user; rsync -vaz --delete ftp4.de.FreeBSD.org::FreeBSD/ /pub/FreeBSD/

Consult the documentation for rsync, which is also available at http://rsync.samba.org/, about the various options to be used with rsync. You might also want to set up a script framework that calls such a command via &man.cron.8;.

With CVSup

A few sites, including the one-and-only ftp-master.FreeBSD.org, even offer CVSup to mirror the contents of the FTP space. You need to install a CVSup client, preferably from the port net/cvsup. (Also reread .)

A sample supfile suitable for ftp-master.FreeBSD.org looks like this:

    #
    # FreeBSD archive supfile from master server
    #
    *default host=ftp-master.FreeBSD.org
    *default base=/usr
    *default prefix=/pub
    #*default release=all
    *default delete use-rel-suffix
    *default umask=002

    # If your network link is a T1 or faster, comment out the following line.
    #*default compress

    FreeBSD-archive release=all preserve

It seems CVSup would be the best way to mirror the archive in terms of efficiency, but it is only available from a few sites. In fact, I only know of ftp-master.FreeBSD.org for sure.

Please have a look at the CVSup documentation, like &man.cvsup.1;, and consider using the option, as it can considerably reduce the amount of work to be done.
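
The rsync command line shown earlier in this section is typically wrapped in a small script and run from &man.cron.8;. The following is only a rough sketch of such a wrapper, not a recommended or official setup. The log file and lock directory paths, the variable names and the function name sync_mirror are all assumptions made for illustration; the upstream module and target path are simply taken from the example command.

```shell
#!/bin/sh
# Sketch of a cron-driven rsync mirror wrapper.
# All paths and names below are illustrative placeholders.

UPSTREAM="${UPSTREAM:-ftp4.de.FreeBSD.org::FreeBSD/}"  # rsync module from the example
TARGET="${TARGET:-/pub/FreeBSD/}"                      # local archive root
LOGFILE="${LOGFILE:-/var/log/freebsd-mirror.log}"      # hypothetical log location
LOCKDIR="${LOCKDIR:-/var/run/freebsd-mirror.lock}"     # hypothetical lock directory
RSYNC="${RSYNC:-rsync}"                                # overridable, e.g. for dry runs

# Run one sync and append all diagnostics to the log file.
# Returns 0 on success, 1 if the sync command failed,
# 2 if a previous run still holds the lock.
sync_mirror() {
    # mkdir either creates the directory or fails, so it serves as a simple lock.
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "$(date) previous run still active, skipping" >> "$LOGFILE"
        return 2
    fi

    echo "$(date) starting sync from $UPSTREAM" >> "$LOGFILE"
    if $RSYNC -vaz --delete "$UPSTREAM" "$TARGET" >> "$LOGFILE" 2>&1; then
        status=0
    else
        status=1
    fi
    echo "$(date) sync finished, status $status" >> "$LOGFILE"

    rmdir "$LOCKDIR"
    return $status
}

# When installing this as a cron job, call the function at the end:
# sync_mirror
```

With the last line uncommented and the script saved somewhere such as /usr/local/sbin/freebsd-mirror.sh (again a hypothetical path), a crontab entry like "0 3 * * * /usr/local/sbin/freebsd-mirror.sh" would run it once a day. The lock keeps a slow sync from overlapping with the next scheduled run.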
Mirroring the CVS repository

Again you have various possibilities, but the most recommended one is to use CVSup.

Using CVSup

CVSup was already described in some detail in and . Here we just describe an example supfile:

    #
    # FreeBSD CVS supfile from master server
    #
    *default host=cvsup-master.FreeBSD.org
    *default base=/usr
    *default prefix=/pub/FreeBSD/development/FreeBSD-CVS
    *default release=cvs
    *default delete use-rel-suffix
    *default umask=002

    # If your network link is a T1 or faster, comment out the following line.
    #*default compress

    cvs-all

You should also have a look at /usr/share/examples/cvsup. Please don't forget to consider the hint mentioned in this note above.

Using other methods

Using methods other than CVSup is generally not recommended. We describe them briefly here anyway. Since most sites offer the CVS repository as part of the FTP fileset under the path /pub/FreeBSD/development/FreeBSD-CVS, the following methods could be used:

  FTP
  RSYNC
  maybe even HTTP

If you find a site that supports it, you could use net/sup, but it is inferior to CVSup, and its deficiencies caused John Polstra to develop CVSup in the first place, so it is clearly not recommended.

You can NOT use AnonCVS to mirror the CVS repository, since CVS does not allow you to access the repository itself, but only checked out versions of the modules.

Mirroring the WWW pages

The best way is to check out the www distribution from CVS. If you have a local mirror of the CVS repository, it is probably as easy as:

    &prompt.user; cvs -d /home/ncvs co www

and a cronjob that calls cvs up -d -P on a regular basis, maybe just after your repository was updated. Of course, the files need to remain in a directory available for public WWW access. The installation and configuration of a webserver is not discussed here.

- For the website to be visable, users must execute the &man.make.1;
+ For the website to be visible, users must execute the &man.make.1;
command in the main www directory.
This command will create the standard *.html files for web viewing. For this to work, however, the textproc/docproj port must be installed.

If you don't have a local repository, you can use CVSup to maintain an up-to-date copy of the www pages. A sample supfile can be found in /usr/share/examples/cvsup/www-supfile and could look like this:

    #
    # WWW module supfile for FreeBSD
    #
    *default host=cvsup3.de.FreeBSD.org
    *default base=/usr
    *default prefix=/usr/local
    *default release=cvs tag=.
    *default delete use-rel-suffix

    # If your network link is a T1 or faster, comment out the following line.
    *default compress

    # This collection retrieves the www/ tree of the FreeBSD repository
    www

Using ftp/wget or other web-mirror tools is probably not recommended.

Mirroring the FreeBSD documentation

As the documentation is referenced a lot from the webpages, it is recommended that you mirror the FreeBSD documentation as well. However, this is not as trivial as the www pages alone.

First of all, you should get the doc sources, again preferably via CVSup. Here is a corresponding sample supfile:

    #
    # FreeBSD documentation supfile
    #
    *default host=cvsup3.de.FreeBSD.org
    *default base=/usr
    *default prefix=/usr/share
    *default release=cvs tag=.
    *default delete use-rel-suffix

    # If your network link is a T1 or faster, comment out the following line.
    #*default compress

    # This will retrieve the entire doc branch of the FreeBSD repository.
    # This includes the handbook, FAQ, and translations thereof.
    doc-all

Then you need to install a couple of ports. Fortunately, there is a meta-port, textproc/docproj, to do the work for you. You need to set up some environment variables, like SGML_CATALOG_FILES. Also have a look at your /etc/make.conf (copy /etc/defaults/make.conf if you don't have one), and look at the DOC_LANG variable. Now you are probably ready to run make in your doc directory (/usr/share/doc by default) and build the documentation.
Again you need to make it accessible to your webserver and make sure the links point to the right location.

The building of the documentation, as well as lots of side issues, is itself documented in: fdp-primer. Please read this piece of documentation, especially if you have problems building the documentation.

XXX MAYBE THIS CAN BE LINKED FROM WITHIN - NOT USING AN ABSOLUTE URL XXX

How often should I mirror?

Every mirror should be updated on a regular basis. You will certainly need some script framework for it that will be called by &man.cron.8;. Since nearly every admin does this in their own way, we cannot give specific instructions. It could work like this:

  Put the command to run your mirroring application in a script. Use of a plain /bin/sh script is recommended.
  Add some output redirections, so diagnostic messages are logged to a file.
  Test whether your script works. Check the logs.
  Use &man.crontab.1; to add the script to the appropriate user's &man.crontab.5;.

Here are some recommended schedules:

  FTP fileset: daily
  CVS repository: daily to hourly
  WWW pages: daily

Where to mirror from

This is an important issue, so this section spends some effort explaining the background.

A few words about the organization

Mirrors are organized by country. All official mirrors have a DNS entry of the form ftpN.CC.FreeBSD.org, where CC (i.e. country code) is the top level domain of the country where the mirror is located, and N is a number indicating that the host is the Nth mirror in that country. (The same applies to cvsupN.CC.FreeBSD.org, wwwN.CC.FreeBSD.org, etc.) There are mirrors with no CC part. These are usually located in the US, but don't need to be. ftp.FreeBSD.org is currently located in Denmark and is just another mirror (i.e. it is NOT a master site).

Additionally there exists a hierarchy of mirrors, which is described in terms of tiers. The master sites are not referred to by tier, but can be described as Tier-0.
Mirrors that mirror from these sites can be considered Tier-1, mirrors of Tier-1-mirrors are Tier-2, etc. Official sites are encouraged to be of a low tier, but the lower the tier, the higher the requirements in terms as described in . Also, access to low-tier-mirrors may be restricted, and access to master sites is definitely restricted. The tier-hierarchy is not reflected by DNS and generally not documented anywhere, except for the master sites. However, official mirrors with low numbers, like 1-4, are usually Tier-1 (this is just a rough hint, and there's no rule).

Ok, but where should I get the stuff now?

The short answer is: from the
- site, that is closest to you in internet terms, or gives you
+ site, that is closest to you in Internet terms, or gives you
the fastest access.

I just want to mirror from somewhere!

If you have no special intentions or requirements, the statement in applies. This means:

  Look at available mirrors in your country. The FreeBSD Mirror Database can help you with this.
  Check roughly which of those provide the fastest access (number of hops, round-trip times) and offer the services you intend to use (like rsync or CVSup).
  Contact the admins of your chosen site, stating your request and asking about their terms and policies.
  Set up your mirror as described above.

I'm an official mirror, what is the right site for me?

In general the description in still applies. Of course you may want to put some weight on the fact that your upstream should be of a low tier. There are some other considerations about official mirrors that are described in .

I want to access the master sites!

If you have good reasons and good prerequisites, you may want, and get, access to one of the master sites. Access to these sites is generally restricted, and there are special policies for access. If you are already an official mirror, this certainly helps you in getting access. In any other case, make sure your country really needs another mirror.
If it already has three or more, ask the &a.hubs; first.

There are just two master sites, one for the FTP fileset and one for the CVS repository (the webpages and docs are obtained from CVS, so they need no master site of their own).

ftp-master.FreeBSD.org

This is the master site for the FTP fileset. ftp-master.FreeBSD.org provides rsync and CVSup access in addition to the FTP protocol. Refer to and how to access via these protocols. Mirrors should be encouraged to also allow rsync access for the FTP contents, since they are Tier-1-mirrors. To get access to ftp-master.FreeBSD.org, you need to contact &a.peter;.

cvsup-master.FreeBSD.org

This is the master site for the CVS repository. cvsup-master.FreeBSD.org provides CVSup access only. See for details. To get access, you need to contact &a.jdp;. Make sure you read the FreeBSD CVSup Access Policy first! Set up the required authentication by following these instructions. Make sure you specify the server as freefall.FreeBSD.org on the cvpasswd command line, as described in this document, even when you are contacting cvsup-master.FreeBSD.org.

Official Mirrors

Official mirrors are mirrors that

  a) have a FreeBSD.org DNS entry (usually a CNAME).
  b) are listed as an official mirror in the FreeBSD documentation (like the handbook).

So much for distinguishing official mirrors. Official mirrors are not necessarily Tier-1-mirrors. However, you probably won't find a Tier-1-mirror that is not also official.

Special Requirements for official (tier-1) mirrors

It is not so easy to state requirements for all official mirrors, since the project is rather tolerant here. It is easier to say what official tier-1 mirrors are required to do. All other official mirrors can consider this a big should.

The following applies mainly to the FTP fileset, since a CVS repository should always be mirrored completely, and the webpages are a case of their own.
Tier-1 mirrors are required to:

  carry the complete fileset
  allow access to other mirror sites
  provide FTP and RSYNC access

Furthermore, admins should be subscribed to the &a.hubs;. See this link for details on how to subscribe.

It is very important for a hub administrator, especially Tier-1 hub admins, to check the release schedule for the next FreeBSD release. This is important because it tells you when the next release is scheduled to come out, giving you time to prepare for the big spike of traffic which follows it.

It is also imperative that hub administrators try to keep their mirrors as up-to-date as possible (again, even more crucial for Tier-1 mirrors). If Mirror1 doesn't update for a while, lower tier mirrors will begin to mirror old data from Mirror1, and thus begins a downward spiral... Keep your mirrors up to date!

How to become official then?

An interesting question, especially since the state of being official comes with some benefits, like a much higher bill from your ISP, as more people will be using your site. It may also be a key requirement for getting access to a master site.

Before applying, please consider (again) whether another official mirror is really needed for your region. Ask on the &a.hubs; if in doubt.

Ok, here is how to do it:

  Get the mirror running in the first place (maybe not using a master site yet).
  Subscribe to the &a.hubs;.
  If everything works so far, contact the DNS admin responsible for your region/country, and ask for a DNS entry for your site. The admin should be reachable via hostmaster@cc.FreeBSD.org, with cc being your country code/TLD again. Your DNS entry will look as described in .
  If there is no subdomain delegated yet for your country, you probably need to contact hostmaster@FreeBSD.org; however, you can try the &a.hubs; first.
  Then you can ask the &a.doc; or the &a.hubs; to add your mirror site to the mirror list in the FreeBSD Handbook.
  Make sure you tell them the email address to list as the maintainer of the site.

That's it.

Some statistics from mirror sites

Here are links to the stat pages of your favorite mirrors (a.k.a. the only ones who feel like providing stats).

FTP site statistics

  ftp2.FreeBSD.org - grisha@ispol.com - (Bandwidth)
  ftp.is.FreeBSD.org - hostmaster@is.FreeBSD.org - (Bandwidth) (FTP processes) (HTTP processes)
  ftp.cz.FreeBSD.org - cejkar@fit.vutbr.cz - (Bandwidth) (FTP processes) (Rsync processes)
  ftp4.de.FreeBSD.org - dl@leo.org - (FTP users) (RSYNC users) (Bandwidth)

CVSup site stats

  cvsup5.FreeBSD.org - staff@blackened.com - (CVSup processes)
  cvsup[23456].jp.FreeBSD.org - kuriyama@FreeBSD.org - (CVSup processes)
  cvsup.cz.FreeBSD.org - cejkar@fit.vutbr.cz - (CVSup processes)
  [cvsup3|anoncvs].de.FreeBSD.org - dl@leo.org - (CVSup processes)