diff --git a/en_US.ISO8859-1/books/handbook/config/chapter.sgml b/en_US.ISO8859-1/books/handbook/config/chapter.sgml index aaf4671a26..34d9baed68 100644 --- a/en_US.ISO8859-1/books/handbook/config/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/config/chapter.sgml @@ -1,2021 +1,2022 @@ Chern Lee Written by Mike Smith Based on a tutorial written by Matt Dillon Also based on tuning(7) written by Configuration and Tuning Synopsis system configuration/optimization One of the important aspects of FreeBSD is system configuration. Correct system configuration will help prevent headaches during future upgrades. This chapter will explain much of the FreeBSD configuration process, including some of the parameters which can be set to tune a FreeBSD system. After reading this chapter, you will know: How to efficiently work with file systems and swap partitions. The basics of rc.conf configuration and /usr/local/etc/rc.d startup systems. How to configure virtual hosts on your network devices. How to use the various configuration files in /etc. How to tune FreeBSD using sysctl variables. How to tune disk performance and modify kernel limitations. Before reading this chapter, you should: Understand Unix and FreeBSD basics (). Be familiar with keeping FreeBSD sources up to date (), and the basics of kernel configuration/compilation (). Initial Configuration Partition Layout Partition layout /etc /var /usr Base Partitions When laying out file systems with &man.disklabel.8; or &man.sysinstall.8;, remember that hard drives transfer data faster from the outer tracks to the inner. Thus smaller and heavier-accessed file systems should be closer to the outside of the drive While larger partitions like /usr should be placed toward the inner. It is a good idea to create partitions in a similar order to: root, swap, /var, /usr. The size of /var reflects the intended machine usage. /var is used to hold mailboxes, log files, and printer spools. Mailboxes and log files can grow to unexpected sizes depending on how many users exist and how long log files are kept. Most users would never require a gigabyte, but remember that /var/tmp must be large enough to contain packages. The /usr partition holds much of the files required to support the system, the &man.ports.7; collection (recommended) and the source code (optional). Both of which are optional at install time. At least 2 gigabytes would be recommended for this partition. When selecting partition sizes, keep the space requirements in mind. Running out of space in one partition while barely using another can be a hassle. Some users have found that &man.sysinstall.8;'s Auto-defaults partition sizer will sometimes select smaller than adequate /var and / partitions. Partition wisely and generously. Swap Partition swap sizing swap partition As a rule of thumb, the swap partition should be about double the size of system memory (RAM). For example, if the machine has 128 megabytes of memory, the swap file should be 256 megabytes. Systems with less memory may perform better with more swap. Less than 256 megabytes of swap is not recommended and memory expansion should be considered. The kernel's VM paging algorithms are tuned to perform best when the swap partition is at least two times the size of main memory. Configuring too little swap can lead to inefficiencies in the VM page scanning code and might create issues later if more memory is added. On larger systems with multiple SCSI disks (or multiple IDE disks operating on different controllers), it is recommend that a swap is configured on each drive (up to four drives). The swap partitions should be approximately the same size. The kernel can handle arbitrary sizes but internal data structures scale to 4 times the largest swap partition. Keeping the swap partitions near the same size will allow the kernel to optimally stripe swap space across disks. Large swap sizes are fine, even if swap is not used much. It might be easier to recover from a runaway program before being forced to reboot. Why Partition? Several users think a single large partition will be fine, but there are several reasons why this is a bad idea. First, each partition has different operational characteristics and separating them allows the file system to tune accordingly. For example, the root and /usr partitions are read-mostly, without much writing. While a lot of reading and writing could occur in /var and /var/tmp. By properly partitioning a system, fragmentation introduced in the smaller write heavy partitions will not bleed over into the mostly-read partitions. Keeping the write-loaded partitions closer to the disk's edge, will increase I/O performance in the partitions where it occurs the most. Now while I/O performance in the larger partitions may be needed, shifting them more toward the edge of the disk will not lead to a significant performance improvement over moving /var to the edge. Finally, there are safety concerns. A smaller, neater root partition which is mostly read-only has a greater chance of surviving a bad crash. Core Configuration rc files rc.conf The principal location for system configuration information is within /etc/rc.conf. This file contains a wide range of configuration information, principally used at system startup to configure the system. Its name directly implies this; it is configuration information for the rc* files. An administrator should make entries in the rc.conf file to override the default settings from /etc/defaults/rc.conf. The defaults file should not be copied verbatim to /etc - it contains default values, not examples. All system-specific changes should be made in the rc.conf file itself. A number of strategies may be applied in clustered applications to separate site-wide configuration from system-specific configuration in order to keep administration overhead down. The recommended approach is to place site-wide configuration into another file, such as /etc/rc.conf.site, and then include this file into /etc/rc.conf, which will contain only system-specific information. As rc.conf is read by &man.sh.1; it is trivial to achieve this. For example: rc.conf: . rc.conf.site hostname="node15.example.com" network_interfaces="fxp0 lo0" ifconfig_fxp0="inet 10.1.1.1" rc.conf.site: defaultrouter="10.1.1.254" saver="daemon" blanktime="100" The rc.conf.site file can then be distributed to every system using rsync or a similar program, while the rc.conf file remains unique. Upgrading the system using &man.sysinstall.8; or make world will not overwrite the rc.conf file, so system configuration information will not be lost. Application Configuration Typically, installed applications have their own configuration files, with their own syntax, etc. It is important that these files be kept separate from the base system, so that they may be easily located and managed by the package management tools. /usr/local/etc Typically, these files are installed in /usr/local/etc. In the case where an application has a large number of configuration files, a subdirectory will be created to hold them. Normally, when a port or package is installed, sample configuration files are also installed. These are usually identified with a .default suffix. If there are no existing configuration files for the application, they will be created by copying the .default files. For example, consider the contents of the directory /usr/local/etc/apache: -rw-r--r-- 1 root wheel 2184 May 20 1998 access.conf -rw-r--r-- 1 root wheel 2184 May 20 1998 access.conf.default -rw-r--r-- 1 root wheel 9555 May 20 1998 httpd.conf -rw-r--r-- 1 root wheel 9555 May 20 1998 httpd.conf.default -rw-r--r-- 1 root wheel 12205 May 20 1998 magic -rw-r--r-- 1 root wheel 12205 May 20 1998 magic.default -rw-r--r-- 1 root wheel 2700 May 20 1998 mime.types -rw-r--r-- 1 root wheel 2700 May 20 1998 mime.types.default -rw-r--r-- 1 root wheel 7980 May 20 1998 srm.conf -rw-r--r-- 1 root wheel 7933 May 20 1998 srm.conf.default The filesize difference shows that only the srm.conf file has been changed. A later update of the Apache port would not overwrite this changed file. Starting Services services It is common for a system to host a number of services. These may be started in several different fashions, each having different advantages. /usr/local/etc/rc.d Software installed from a port or the packages collection will often place a script in /usr/local/etc/rc.d which is invoked at system startup with a argument, and at system shutdown with a argument. This is the recommended way for starting system-wide services that are to be run as root, or that expect to be started as root. These scripts are registered as part of the installation of the package, and will be removed when the package is removed. A generic startup script in /usr/local/etc/rc.d looks like: #!/bin/sh echo -n ' FooBar' case "$1" in start) /usr/local/bin/foobar ;; stop) kill -9 `cat /var/run/foobar.pid` ;; *) echo "Usage: `basename $0` {start|stop}" >&2 exit 64 ;; esac exit 0 The startup scripts of FreeBSD will look in /usr/local/etc/rc.d for scripts that have an .sh extension and are executable by root. Those scripts that are found are called with an option at startup, and at shutdown to allow them to carry out their purpose. So if you wanted the above sample script to be picked up and run at the proper time during system startup, you should save it to a file called FooBar.sh in /usr/local/etc/rc.d and make sure it is executable. You can make a shell script executable with &man.chmod.1; as shown below: &prompt.root; chmod 755 FooBar.sh Some services expect to be invoked by &man.inetd.8; when a connection is received on a suitable port. This is common for mail reader servers (POP and IMAP, etc.). These services are enabled by editing the file /etc/inetd.conf. See &man.inetd.8; for details on editing this file. Some additional system services may not be covered by the toggles in /etc/rc.conf. These are traditionally enabled by placing the command(s) to invoke them in /etc/rc.local. As of FreeBSD 3.1 there is no default /etc/rc.local; if it is created by the administrator it will however be honored in the normal fashion. Note that rc.local is generally regarded as the location of last resort; if there is a better place to start a service, do it there. Do not place any commands in /etc/rc.conf. To start daemons, or run any commands at boot time, place a script in /usr/local/etc/rc.d instead. It is also possible to use the &man.cron.8; daemon to start system services. This approach has a number of advantages, not least being that because &man.cron.8; runs these processes as the owner of the crontab, services may be started and maintained by non-root users. This takes advantage of a feature of &man.cron.8;: the time specification may be replaced by @reboot, which will cause the job to be run when &man.cron.8; is started shortly after system boot. Marc Fonvieille Contributed by Setting Up Network Interface Cards Network card configuration Nowadays we can not think about a computer without thinking about a network connection. Adding and configuring a network card is a common task for any FreeBSD administrator. Locating the Correct Driver Network card configuration Locating the driver Before you begin, you should know the model of the card you have, the chip it uses, and whether it is a PCI or ISA card. FreeBSD supports a wide variety of both PCI and ISA cards. Check the Hardware Compatibility List for your release to see if your card is supported. Once you are sure your card is supported, you need to determine the proper driver for the card. The file /usr/src/sys/i386/conf/LINT will give you the list of network interfaces drivers with some information about the supported chipsets/cards. If you have doubts about which driver is the correct one, read the manual page of the driver. The manual page will give you more information about the supported hardware and even the possible problems that could occur. If you own a common card, most of the time you will not have to look very hard for a driver. Drivers for common network cards are present in the GENERIC kernel, so your card should show up during boot, like so: dc0: <82c169 PNIC 10/100BaseTX> port 0xa000-0xa0ff mem 0xd3800000-0xd38 000ff irq 15 at device 11.0 on pci0 dc0: Ethernet address: 00:a0:cc:da:da:da miibus0: <MII bus> on dc0 ukphy0: <Generic IEEE 802.3u media interface> on miibus0 ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto dc1: <82c169 PNIC 10/100BaseTX> port 0x9800-0x98ff mem 0xd3000000-0xd30 000ff irq 11 at device 12.0 on pci0 dc1: Ethernet address: 00:a0:cc:da:da:db miibus1: <MII bus> on dc1 ukphy1: <Generic IEEE 802.3u media interface> on miibus1 ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto In this example, we see that two cards using the &man.dc.4; driver are present on the system. To use your network card, you will need to load the proper driver. This may be accomplished in one of two ways. The easiest way is to simply load a kernel module for your network card with &man.kldload.8;. A module is not available for all network card drivers (ISA cards and cards using the &man.ed.4; driver, for example). Alternatively, you may statically compile the support for your card into your kernel. Check /usr/src/sys/i386/conf/LINT and the manual page of the driver to know what to add in your kernel configuration file. For more information about recompiling your kernel, please see . If your card was detected at boot by your kernel (GENERIC) you do not have to build a new kernel. Configuring the Network Card Network card configuration configuration Once the right driver is loaded for the network card, the card needs to be configured. As with many other things, the network card may have been configured at installation time by sysinstall. To display the configuration for the network interfaces on your system, enter the following command: &prompt.user; ifconfig dc0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500 inet 192.168.1.3 netmask 0xffffff00 broadcast 192.168.1.255 ether 00:a0:cc:da:da:da media: Ethernet autoselect (100baseTX <full-duplex>) status: active dc1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500 inet 10.0.0.1 netmask 0xffffff00 broadcast 10.0.0.255 ether 00:a0:cc:da:da:db media: Ethernet 10baseT/UTP status: no carrier lp0: flags=8810<POINTOPOINT,SIMPLEX,MULTICAST> mtu 1500 lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384 inet 127.0.0.1 netmask 0xff000000 tun0: flags=8010<POINTOPOINT,MULTICAST> mtu 1500 Old versions of FreeBSD may require the option following &man.ifconfig.8;, for more details about the correct syntax of &man.ifconfig.8;, please refer to the manual page. Note also that entries concerning IPv6 (inet6 etc.) were omitted in this example. In this example, the following devices were displayed: dc0: The first Ethernet interface dc1: The second Ethernet interface lp0: The parallel port interface lo0: The loopback device tun0: The tunnel device used by ppp FreeBSD uses the driver name followed by the order in which one the card is detected at the kernel boot to name the network card. For example sis2 would be the third network card on the system using the &man.sis.4; driver. In this example, the dc0 device is up and running. The key indicators are: UP means that the card is configured and ready. The card has an Internet (inet) address (in this case 192.168.1.3). It has a valid subnet mask (netmask; 0xffffff00 is the same as 255.255.255.0). It has a valid broadcast address (in this case, 192.168.1.255). The MAC address of the card (ether) is 00:a0:cc:da:da:da The physical media selection is on autoselection mode (media: Ethernet autoselect (100baseTX <full-duplex>)). We see that dc1 was configured to run with 10baseT/UTP media. For more information on available media types for a driver, please refer to its manual page. The status of the link (status) is active, i.e. the carrier is detected. For dc1, we see status: no carrier. This is normal when an ethernet cable is not plugged into the card. If the &man.ifconfig.8; output had shown something similar to: dc0: flags=8843<BROADCAST,SIMPLEX,MULTICAST> mtu 1500 ether 00:a0:cc:da:da:da it would indicate the card has not been configured. To configure your card, you need root privileges. The network card configuration can be done from the command line with &man.ifconfig.8; but you would have to do it after each reboot of the system. The file /etc/rc.conf is where to add the network card's configuration. Open /etc/rc.conf in your favorite editor. You need to add a line for each network card present on the system, for example in our case, we added these lines: ifconfig_dc0="inet 192.168.1.3 netmask 255.255.255.0" ifconfig_dc1="inet 10.0.0.1 netmask 255.255.255.0 media 10baseT/UTP" You have to replace dc0, dc1, and so on, with the correct device for your cards, and the addresses with the proper ones. You should read the card driver and &man.ifconfig.8; manual pages for more details about the allowed options and also &man.rc.conf.5; manual page for more information on the syntax of /etc/rc.conf. If you configured the network during installation, some lines about the network card(s) may be already present. Double check /etc/rc.conf before adding any lines. You will also have to edit the file /etc/hosts to add the names and the IP addresses of various machines of the LAN, if they are not already there. For more information please refer to &man.hosts.5; and to /usr/share/examples/etc/hosts. Testing and Troubleshooting Once you have made the necessary changes in /etc/rc.conf, you should reboot your system. This will allow the change(s) to the interface(s) to be applied, and verify that the system restarts without any configuration errors. Once the system has been rebooted, you should test the network interfaces. Testing the Ethernet Card Network card configuration Testing the card To verify that an Ethernet card is configured correctly, you have to try two things. First, ping the interface itself, and then ping another machine on the LAN. First test the local interface: &prompt.user; ping -c5 192.168.1.3 PING 192.168.1.3 (192.168.1.3): 56 data bytes 64 bytes from 192.168.1.3: icmp_seq=0 ttl=64 time=0.082 ms 64 bytes from 192.168.1.3: icmp_seq=1 ttl=64 time=0.074 ms 64 bytes from 192.168.1.3: icmp_seq=2 ttl=64 time=0.076 ms 64 bytes from 192.168.1.3: icmp_seq=3 ttl=64 time=0.108 ms 64 bytes from 192.168.1.3: icmp_seq=4 ttl=64 time=0.076 ms --- 192.168.1.3 ping statistics --- 5 packets transmitted, 5 packets received, 0% packet loss round-trip min/avg/max/stddev = 0.074/0.083/0.108/0.013 ms Now we have to ping another machine on the LAN: &prompt.user; ping -c5 192.168.1.2 PING 192.168.1.2 (192.168.1.2): 56 data bytes 64 bytes from 192.168.1.2: icmp_seq=0 ttl=64 time=0.726 ms 64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=0.766 ms 64 bytes from 192.168.1.2: icmp_seq=2 ttl=64 time=0.700 ms 64 bytes from 192.168.1.2: icmp_seq=3 ttl=64 time=0.747 ms 64 bytes from 192.168.1.2: icmp_seq=4 ttl=64 time=0.704 ms --- 192.168.1.2 ping statistics --- 5 packets transmitted, 5 packets received, 0% packet loss round-trip min/avg/max/stddev = 0.700/0.729/0.766/0.025 ms You could also use the machine name instead of 192.168.1.2 if you have set up the /etc/hosts file. Troubleshooting Network card configuration Troubleshooting Where can I find information about possible trouble I may experience with my network card? The manual page of the driver is the first piece of documentation to read. The mailing lists archives can also be useful. When I try to ping a machine on my LAN, I get this message: ping: sendto: Permission denied. This means that you do not have permission to send ICMP packets. Check to see if a firewall is running on the machine and if there are any rules blocking ICMP. I see a lot of watchdog timeout messages in the system logs, and when I try to ping a machine on the LAN, I get this message: ping: sendto: No route to host. The first thing to do is to check your network cable. Many cards require a PCI slot supporting the Bus Mastering. On some old motherboards, only one PCI slot allows it (most of time slot 0). Check the network card and the motherboard documentation to determine if that may be the problem. I see a lot of device timeout messages in the system logs, and my network card does not work. Having one or two of these messages is sometimes normal with some cards. However, if they persist and the network is not usable, make sure the network cable is plugged in and that there are no IRQ conflicts between the network card and another device (or devices) on the system. The performance of the card is poor, what can I do? It is difficult to answer to that question. What is your definition of poor performance? Double check everything in your configuration, read the &man.tuning.7; manual page, and try to avoid cheap network cards. Many users have noted that setting the media selection mode to autoselect results in bad performance on some hardware. Are there any recommended network cards or cards I should stay away from? You should avoid cheap cards for serious usage. Cheap cards often use buggy chipsets, and most of time do not provide very good performance. Many FreeBSD users like cards using the &man.fxp.4; chipset, however, this does not mean that all other chipsets are bad. Virtual Hosts virtual hosts ip aliases A very common use of FreeBSD is virtual site hosting, where one server appears to the network as many servers. This is achieved by assigning multiple network addresses to a single interface. A given network interface has one real address, and may have any number of alias addresses. These aliases are normally added by placing alias entries in /etc/rc.conf. An alias entry for the interface fxp0 looks like: ifconfig_fxp0_alias0="inet xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx" Note that alias entries must start with alias0 and proceed upwards in order, (for example, _alias1, _alias2, and so on). The configuration process will stop at the first missing number. The calculation of alias netmasks is important, but fortunately quite simple. For a given interface, there must be one address which correctly represents the network's netmask. Any other addresses which fall within this network must have a netmask of all 1s. For example, consider the case where the fxp0 interface is connected to two networks, the 10.1.1.0 network with a netmask of 255.255.255.0 and the 202.0.75.16 network with a netmask of 255.255.255.240. We want the system to appear at 10.1.1.1 through 10.1.1.5 and at 202.0.75.17 through 202.0.75.20. The following entries configure the adapter correctly for this arrangement: ifconfig_fxp0="inet 10.1.1.1 netmask 255.255.255.0" ifconfig_fxp0_alias0="inet 10.1.1.2 netmask 255.255.255.255" ifconfig_fxp0_alias1="inet 10.1.1.3 netmask 255.255.255.255" ifconfig_fxp0_alias2="inet 10.1.1.4 netmask 255.255.255.255" ifconfig_fxp0_alias3="inet 10.1.1.5 netmask 255.255.255.255" ifconfig_fxp0_alias4="inet 202.0.75.17 netmask 255.255.255.240" ifconfig_fxp0_alias5="inet 202.0.75.18 netmask 255.255.255.255" ifconfig_fxp0_alias6="inet 202.0.75.19 netmask 255.255.255.255" ifconfig_fxp0_alias7="inet 202.0.75.20 netmask 255.255.255.255" Configuration Files <filename>/etc</filename> Layout There are a number of directories in which configuration information is kept. These include: /etc Generic system configuration information; data here is system-specific. /etc/defaults Default versions of system configuration files. /etc/mail Extra &man.sendmail.8; configuration, other MTA configuration files. /etc/ppp Configuration for both user- and kernel-ppp programs. /etc/namedb Default location for &man.named.8; data. Normally named.conf and zone files are stored here. /usr/local/etc Configuration files for installed applications. May contain per-application subdirectories. /usr/local/etc/rc.d Start/stop scripts for installed applications. /var/db Automatically generated system-specific database files, such as the package database, the locate database, and so on Hostnames hostname DNS <filename>/etc/resolv.conf</filename> resolv.conf /etc/resolv.conf dictates how FreeBSD's resolver accesses the Internet Domain Name System (DNS). The most common entries to resolv.conf are: nameserver The IP address of a name server the resolver should query. The servers are queried in the order listed with a maximum of three. search Search list for hostname lookup. This is normally determined by the domain of the local hostname. domain The local domain name. A typical resolv.conf: search example.com nameserver 147.11.1.11 nameserver 147.11.100.30 Only one of the search and domain options should be used. If you are using DHCP, &man.dhclient.8; usually rewrites resolv.conf with information received from the DHCP server. <filename>/etc/hosts</filename> hosts /etc/hosts is a simple text database reminiscent of the old Internet. It works in conjunction with DNS and NIS providing name to IP address mappings. Local computers connected via a LAN can be placed in here for simplistic naming purposes instead of setting up a &man.named.8; server. Additionally, /etc/hosts can be used to provide a local record of Internet names, reducing the need to query externally for commonly accessed names. # $FreeBSD$ # # Host Database # This file should contain the addresses and aliases # for local hosts that share this file. # In the presence of the domain name service or NIS, this file may # not be consulted at all; see /etc/nsswitch.conf for the resolution order. # # ::1 localhost localhost.my.domain myname.my.domain 127.0.0.1 localhost localhost.my.domain myname.my.domain # # Imaginary network. #10.0.0.2 myname.my.domain myname #10.0.0.3 myfriend.my.domain myfriend # # According to RFC 1918, you can use the following IP networks for # private nets which will never be connected to the Internet: # # 10.0.0.0 - 10.255.255.255 # 172.16.0.0 - 172.31.255.255 # 192.168.0.0 - 192.168.255.255 # # In case you want to be able to connect to the Internet, you need # real official assigned numbers. PLEASE PLEASE PLEASE do not try # to invent your own network numbers but instead get one from your # network provider (if any) or from the Internet Registry (ftp to # rs.internic.net, directory `/templates'). # /etc/hosts takes on the simple format of: [Internet address] [official hostname] [alias1] [alias2] ... For example: 10.0.0.1 myRealHostname.example.com myRealHostname foobar1 foobar2 Consult &man.hosts.5; for more information. Log File Configuration log files <filename>syslog.conf</filename> syslog.conf syslog.conf is the configuration file for the &man.syslogd.8; program. It indicates which types of syslog messages are logged to particular log files. # $FreeBSD$ # # Spaces ARE valid field separators in this file. However, # other *nix-like systems still insist on using tabs as field # separators. If you are sharing this file between systems, you # may want to use only tabs as field separators here. # Consult the syslog.conf(5) manual page. *.err;kern.debug;auth.notice;mail.crit /dev/console *.notice;kern.debug;lpr.info;mail.crit;news.err /var/log/messages security.* /var/log/security mail.info /var/log/maillog lpr.info /var/log/lpd-errs cron.* /var/log/cron *.err root *.notice;news.err root *.alert root *.emerg * # uncomment this to log all writes to /dev/console to /var/log/console.log #console.info /var/log/console.log # uncomment this to enable logging of all log messages to /var/log/all.log #*.* /var/log/all.log # uncomment this to enable logging to a remote log host named loghost #*.* @loghost # uncomment these if you're running inn # news.crit /var/log/news/news.crit # news.err /var/log/news/news.err # news.notice /var/log/news/news.notice !startslip *.* /var/log/slip.log !ppp *.* /var/log/ppp.log Consult the &man.syslog.conf.5; manual page for more information. <filename>newsyslog.conf</filename> newsyslog.conf newsyslog.conf is the configuration file for &man.newsyslog.8;, a program that is normally scheduled to run by &man.cron.8;. &man.newsyslog.8; determines when log files require archiving or rearranging. logfile is moved to logfile.0, logfile.0 is moved to logfile.1, and so on. Alternatively, the log files may be archived in &man.gzip.1; format causing them to be named: logfile.0.gz, logfile.1.gz, and so on. newsyslog.conf indicates which log files are to be managed, how many are to be kept, and when they are to be touched. Log files can be rearranged and/or archived when they have either reached a certain size, or at a certain periodic time/date. # configuration file for newsyslog # $FreeBSD$ # # filename [owner:group] mode count size when [ZB] [/pid_file] [sig_num] /var/log/cron 600 3 100 * Z /var/log/amd.log 644 7 100 * Z /var/log/kerberos.log 644 7 100 * Z /var/log/lpd-errs 644 7 100 * Z /var/log/maillog 644 7 * @T00 Z /var/log/sendmail.st 644 10 * 168 B /var/log/messages 644 5 100 * Z /var/log/all.log 600 7 * @T00 Z /var/log/slip.log 600 3 100 * Z /var/log/ppp.log 600 3 100 * Z /var/log/security 600 10 100 * Z /var/log/wtmp 644 3 * @01T05 B /var/log/daily.log 640 7 * @T00 Z /var/log/weekly.log 640 5 1 $W6D0 Z /var/log/monthly.log 640 12 * $M1D0 Z /var/log/console.log 640 5 100 * Z Consult the &man.newsyslog.8; manual page for more information. <filename>sysctl.conf</filename> sysctl.conf sysctl sysctl.conf looks much like rc.conf. Values are set in a variable=value form. The specified values are set after the system goes into multi-user mode. Not all variables are settable in this mode. A sample sysctl.conf turning off logging of fatal signal exits and letting Linux programs know they are really running under FreeBSD: kern.logsigexit=0 # Do not log fatal signal exits (e.g. sig 11) compat.linux.osname=FreeBSD compat.linux.osrelease=4.3-STABLE Tuning with sysctl sysctl Tuning with sysctl &man.sysctl.8; is an interface that allows you to make changes to a running FreeBSD system. This includes many advanced options of the TCP/IP stack and virtual memory system that can dramatically improve performance for an experienced system administrator. Over five hundred system variables can be read and set using &man.sysctl.8;. At its core, &man.sysctl.8; serves two functions: to read and to modify system settings. To view all readable variables: &prompt.user; sysctl -a To read a particular variable, for example, kern.maxproc: &prompt.user; sysctl kern.maxproc kern.maxproc: 1044 To set a particular variable, use the intuitive variable=value syntax: &prompt.root; sysctl kern.maxfiles=5000 kern.maxfiles: 2088 -> 5000 Settings of sysctl variables are usually either strings, numbers, or booleans (a boolean being 1 for yes or a 0 for no). Tom Rhodes Contributed by &man.sysctl.8; read only In some cases it may be desirable to modify read-only &man.sysctl.8; values. While this is not recommended, it is also sometimes unavoidable. For instance on some laptop models the &man.cardbus.4; device will not probe memory ranges, and fail with errors which look similar to: cbb0: Could not map register memory device_probe_and_attach: cbb0 attach returned 12 Cases like the one above usually require the modification of some default &man.sysctl.8; settings which are set read only. To overcome these situations a user can put &man.sysctl.8; OIDs in their local /boot/loader.conf.local. Default settings are located in the /boot/defaults/loader.conf file. Fixing the problem mentioned above would require a user to set in the aforementioned file. Now &man.cardbus.4; will work properly. Tuning Disks Sysctl Variables <varname>vfs.vmiodirenable</varname> vfs.vmiodirenable The vfs.vmiodirenable sysctl variable may be set to either 0 (off) or 1 (on); it is 1 by default. This variable controls how directories are cached by the system. Most directories are small, using just a single fragment (typically 1 K) in the file system and less (typically 512 bytes) in the buffer cache. However, when operating in the default mode the buffer cache will only cache a fixed number of directories even if you have a huge amount of memory. Turning on this sysctl allows the buffer cache to use the VM Page Cache to cache the directories, making all the memory available for caching directories. However, the minimum in-core memory used to cache a directory is the physical page size (typically 4 K) rather than 512 bytes. We recommend turning this option on if you are running any services which manipulate large numbers of files. Such services can include web caches, large mail systems, and news systems. Turning on this option will generally not reduce performance even with the wasted memory but you should experiment to find out. <varname>vfs.write_behind</varname> vfs.write_behind The vfs.write_behind sysctl variable defaults to 1 (on). This tells the file system to issue media writes as full clusters are collected, which typically occurs when writing large sequential files. The idea is to avoid saturating the buffer cache with dirty buffers when it would not benefit I/O performance. However, this may stall processes and under certain circumstances you may wish to turn it off. <varname>vfs.hirunningspace</varname> vfs.hirunningspace The vfs.hirunningspace sysctl variable determines how much outstanding write I/O may be queued to disk controllers system-wide at any given instance. The default is usually sufficient but on machines with lots of disks you may want to bump it up to four or five megabytes. Note that setting too high a value (exceeding the buffer cache's write threshold) can lead to extremely bad clustering performance. Do not set this value arbitrarily high! Higher write values may add latency to reads occurring at the same time. There are various other buffer-cache and VM page cache related sysctls. We do not recommend modifying these values. As of FreeBSD 4.3, the VM system does an extremely good job of automatically tuning itself. <varname>vm.swap_idle_enabled</varname> vm.swap_idle_enabled The vm.swap_idle_enabled sysctl variable is useful in large multi-user systems where you have lots of users entering and leaving the system and lots of idle processes. Such systems tend to generate a great deal of continuous pressure on free memory reserves. Turning this feature on and tweaking the swapout hysteresis (in idle seconds) via vm.swap_idle_threshold1 and vm.swap_idle_threshold2 allows you to depress the priority of memory pages associated with idle processes more quickly then the normal pageout algorithm. This gives a helping hand to the pageout daemon. Do not turn this option on unless you need it, because the tradeoff you are making is essentially pre-page memory sooner rather than later; thus eating more swap and disk bandwidth. In a small system this option will have a determinable effect but in a large system that is already doing moderate paging this option allows the VM system to stage whole processes into and out of memory easily. <varname>hw.ata.wc</varname> hw.ata.wc FreeBSD 4.3 flirted with turning off IDE write caching. This reduced write bandwidth to IDE disks but was considered necessary due to serious data consistency issues introduced by hard drive vendors. The problem is that IDE drives lie about when a write completes. With IDE write caching turned on, IDE hard drives not only write data to disk out of order, but will sometimes delay writing some blocks indefinitely when under heavy disk loads. A crash or power failure may cause serious file system corruption. FreeBSD's default was changed to be safe. Unfortunately, the result was such a huge performance loss that we changed write caching back to on by default after the release. You should check the default on your system by observing the hw.ata.wc sysctl variable. If IDE write caching is turned off, you can turn it back on by setting the kernel variable back to 1. This must be done from the boot loader at boot time. Attempting to do it after the kernel boots will have no effect. For more information, please see &man.ata.4;. <option>SCSI_DELAY</option> (<varname>kern.cam.scsi_delay</varname>) kern.cam.scsi_delay The kernel config may be used to reduce system boot times. The defaults are fairly high and can be responsible for 15+ seconds of delay in the boot process. Reducing it to 5 seconds usually works (especially with modern drives). Newer versions of FreeBSD (5.0+) should use the kern.cam.scsi_delay boot time tunable. The tunable, and kernel config option accept values in terms of milliseconds and not seconds. Soft Updates Soft Updates tunefs The &man.tunefs.8; program can be used to fine-tune a file system. This program has many different options, but for now we are only concerned with toggling Soft Updates on and off, which is done by: &prompt.root; tunefs -n enable /filesystem &prompt.root; tunefs -n disable /filesystem A filesystem cannot be modified with &man.tunefs.8; while it is mounted. A good time to enable Soft Updates is before any partitions have been mounted, in single-user mode. As of FreeBSD 4.5, it is possible to enable Soft Updates at filesystem creation time, through use of the -U option to &man.newfs.8;. Soft Updates drastically improves meta-data performance, mainly file creation and deletion, through the use of a memory cache. We recommend to use Soft Updates on all of your file systems. There are two downsides to Soft Updates that you should be aware of: First, Soft Updates guarantees filesystem consistency in the case of a crash but could very easily be several seconds (even a minute!) behind updating the physical disk. If your system crashes you may lose more work than otherwise. Secondly, Soft Updates delays the freeing of filesystem blocks. If you have a filesystem (such as the root filesystem) which is almost full, performing a major update, such as make installworld, can cause the filesystem to run out of space and the update to fail. More details about Soft Updates Soft Updates (Details) There are two traditional approaches to writing a file systems meta-data back to disk. (Meta-data updates are updates to non-content data like inodes or directories.) Historically, the default behavior was to write out meta-data updates synchronously. If a directory had been changed, the system waited until the change was actually written to disk. The file data buffers (file contents) were passed through the buffer cache and backed up to disk later on asynchronously. The advantage of this implementation is that it operates safely. If there is a failure during an update, the meta-data are always in a consistent state. A file is either created completely or not at all. If the data blocks of a file did not find their way out of the buffer cache onto the disk by the time of the crash, &man.fsck.8; is able to recognize this and repair the filesystem by setting the file length to 0. Additionally, the implementation is clear and simple. The disadvantage is that meta-data changes are slow. An rm -r, for instance, touches all the files in a directory sequentially, but each directory change (deletion of a file) will be written synchronously to the disk. This includes updates to the directory itself, to the inode table, and possibly to indirect blocks allocated by the file. Similar considerations apply for unrolling large hierarchies (tar -x). The second case is asynchronous meta-data updates. This is the default for Linux/ext2fs and mount -o async for *BSD ufs. All meta-data updates are simply being passed through the buffer cache too, that is, they will be intermixed with the updates of the file content data. The advantage of this implementation is there is no need to wait until each meta-data update has been written to disk, so all operations which cause huge amounts of meta-data updates work much faster than in the synchronous case. Also, the implementation is still clear and simple, so there is a low risk for bugs creeping into the code. The disadvantage is that there is no guarantee at all for a consistent state of the filesystem. If there is a failure during an operation that updated large amounts of meta-data (like a power failure, or someone pressing the reset button), the filesystem will be left in an unpredictable state. There is no opportunity to examine the state of the filesystem when the system comes up again; the data blocks of a file could already have been written to the disk while the updates of the inode table or the associated directory were not. It is actually impossible to implement a fsck which is able to clean up the resulting chaos (because the necessary information is not available on the disk). If the filesystem has been damaged beyond repair, the only choice is to use &man.newfs.8; on it and restore it from backup. The usual solution for this problem was to implement dirty region logging, which is also referred to as journaling, although that term is not used consistently and is occasionally applied to other forms of transaction logging as well. Meta-data updates are still written synchronously, but only into a small region of the disk. Later on they will be moved to their proper location. Because the logging area is a small, contiguous region on the disk, there are no long distances for the disk heads to move, even during heavy operations, so these operations are quicker than synchronous updates. Additionally the complexity of the implementation is fairly limited, so the risk of bugs being present is low. A disadvantage is that all meta-data are written twice (once into the logging region and once to the proper location) so for normal work, a performance pessimization might result. On the other hand, in case of a crash, all pending meta-data operations can be quickly either rolled-back or completed from the logging area after the system comes up again, resulting in a fast filesystem startup. Kirk McKusick, the developer of Berkeley FFS, solved this problem with Soft Updates: all pending meta-data updates are kept in memory and written out to disk in a sorted sequence (ordered meta-data updates). This has the effect that, in case of heavy meta-data operations, later updates to an item catch the earlier ones if the earlier ones are still in memory and have not already been written to disk. So all operations on, say, a directory are generally performed in memory before the update is written to disk (the data blocks are sorted according to their position so that they will not be on the disk ahead of their meta-data). If the system crashes, this causes an implicit log rewind: all operations which did not find their way to the disk appear as if they had never happened. A consistent filesystem state is maintained that appears to be the one of 30 to 60 seconds earlier. The algorithm used guarantees that all resources in use are marked as such in their appropriate bitmaps: blocks and inodes. After a crash, the only resource allocation error that occurs is that resources are marked as used which are actually free. &man.fsck.8; recognizes this situation, and frees the resources that are no longer used. It is safe to ignore the dirty state of the filesystem after a crash by forcibly mounting it with mount -f. In order to free resources that may be unused, &man.fsck.8; needs to be run at a later time. This is the idea behind the background fsck: at system startup time, only a snapshot of the filesystem is recorded. The fsck can be run later on. All file systems can then be mounted dirty, so the system startup proceeds in multiuser mode. Then, background fscks will be scheduled for all file systems where this is required, to free resources that may be unused. (File systems that do not use Soft Updates still need the usual foreground fsck though.) The advantage is that meta-data operations are nearly as fast as asynchronous updates (i.e. faster than with logging, which has to write the meta-data twice). The disadvantages are the complexity of the code (implying a higher risk for bugs in an area that is highly sensitive regarding loss of user data), and a higher memory consumption. Additionally there are some idiosyncrasies one has to get used to. After a crash, the state of the filesystem appears to be somewhat older. In situations where the standard synchronous approach would have caused some zero-length files to remain after the fsck, these files do not exist at all with a Soft Updates filesystem because neither the meta-data nor the file contents have ever been written to disk. Disk space is not released until the updates have been written to disk, which may take place some time after running rm. This may cause problems when installing large amounts of data on a filesystem that does not have enough free space to hold all the files twice. Tuning Kernel Limits Tuning kernel limits File/Process Limits <varname>kern.maxfiles</varname> kern.maxfiles kern.maxfiles can be raised or lowered based upon your system requirements. This variable indicates the maximum number of file descriptors on your system. When the file descriptor table is full, file: table is full will show up repeatedly in the system message buffer, which can be viewed with the dmesg command. Each open file, socket, or fifo uses one file descriptor. A large-scale production server may easily require many thousands of file descriptors, depending on the kind and number of services running concurrently. kern.maxfile's default value is dictated by the option in your kernel configuration file. kern.maxfiles grows proportionally to the value of . When compiling a custom kernel, it is a good idea to set this kernel configuration option according to the uses of your system. From this number, the kernel is given most of its pre-defined limits. Even though a production machine may not actually have 256 users connected as once, the resources needed may be similar to a high-scale web server. As of FreeBSD 4.5, setting to 0 in your kernel configuration file will choose a reasonable default value based on the amount of RAM present in your system. <varname>kern.ipc.somaxconn</varname> kern.ipc.somaxconn The kern.ipc.somaxconn sysctl variable limits the size of the listen queue for accepting new TCP connections. The default value of 128 is typically too low for robust handling of new connections in a heavily loaded web server environment. For such environments, it is recommended to increase this value to 1024 or higher. The service daemon may itself limit the listen queue size (e.g. &man.sendmail.8;, or Apache) but will often have a directive in it's configuration file to adjust the queue size. Large listen queues also do a better job of avoiding Denial of Service (DoS) attacks. Network Limits The kernel configuration option dictates the amount of network Mbufs available to the system. A heavily-trafficked server with a low number of Mbufs will hinder FreeBSD's ability. Each cluster represents approximately 2 K of memory, so a value of 1024 represents 2 megabytes of kernel memory reserved for network buffers. A simple calculation can be done to figure out how many are needed. If you have a web server which maxes out at 1000 simultaneous connections, and each connection eats a 16 K receive and 16 K send buffer, you need approximately 32 MB worth of network buffers to cover the web server. A good rule of thumb is to multiply by 2, so 2x32 MB / 2 KB = 64 MB / 2 kB = 32768. We recommend values between 4096 and 32768 for machines with greater amounts of memory. Under no circumstances should you specify an arbitrarily high value for this parameter as it could lead to a boot time crash. The option to &man.netstat.1; may be used to observe network cluster use. kern.ipc.nmbclusters loader tunable should be used to tune this at boot time. Only older versions of FreeBSD will require you to use the kernel &man.config.8; option. For busy servers that make extensive use of the &man.sendfile.2; system call, it may be necessary to increase the number of &man.sendfile.2; buffers via the - kernel configuration option. A - common indicator that this parameter needs to be adjusted is - when processes are seen in the sfbufa - state. The sysctl variable - kern.ipc.nsfbufs is a read-only glimpse at - the kernel configured variable. This parameter nominally - scales with kern.maxusers, however it may - be necessary to tune accordingly. + kernel configuration option or by + setting its value in /boot/loader.conf + (see &man.loader.8; for details). A common indicator that + this parameter needs to be adjusted is when processes are seen + in the sfbufa state. The sysctl + variable kern.ipc.nsfbufs is a read-only + glimpse at the kernel configured variable. This parameter + nominally scales with kern.maxusers, + however it may be necessary to tune accordingly. Even though a socket has been marked as non-blocking, calling &man.sendfile.2; on the non-blocking socket may result in the &man.sendfile.2; call blocking until enough struct sf_buf's are made available. <varname>net.inet.ip.portrange.*</varname> net.inet.ip.portrange.* The net.inet.ip.portrange.* sysctl variables control the port number ranges automatically bound to TCP and UDP sockets. There are three ranges: a low range, a default range, and a high range. Most network programs use the default range which is controlled by the net.inet.ip.portrange.first and net.inet.ip.portrange.last, which default to 1024 and 5000, respectively. Bound port ranges are used for outgoing connections, and it is possible to run the system out of ports under certain circumstances. This most commonly occurs when you are running a heavily loaded web proxy. The port range is not an issue when running servers which handle mainly incoming connections, such as a normal web server, or has a limited number of outgoing connections, such as a mail relay. For situations where you may run yourself out of ports, it is recommended to increase net.inet.ip.portrange.last modestly. A value of 10000, 20000 or 30000 may be reasonable. You should also consider firewall effects when changing the port range. Some firewalls may block large ranges of ports (usually low-numbered ports) and expect systems to use higher ranges of ports for outgoing connections — for this reason it is recommended that net.inet.ip.portrange.first be lowered. TCP Bandwidth Delay Product TCP Bandwidth Delay Product Limiting net.inet.tcp.inflight_enable The TCP Bandwidth Delay Product Limiting is similar to TCP/Vegas in NetBSD. It can be enabled by setting net.inet.tcp.inflight_enable sysctl variable to 1. The system will attempt to calculate the bandwidth delay product for each connection and limit the amount of data queued to the network to just the amount required to maintain optimum throughput. This feature is useful if you are serving data over modems, Gigabit Ethernet, or even high speed WAN links (or any other link with a high bandwidth delay product), especially if you are also using window scaling or have configured a large send window. If you enable this option, you should also be sure to set net.inet.tcp.inflight_debug to 0 (disable debugging), and for production use setting net.inet.tcp.inflight_min to at least 6144 may be beneficial. However, note that setting high minimums may effectively disable bandwidth limiting depending on the link. The limiting feature reduces the amount of data built up in intermediate route and switch packet queues as well as reduces the amount of data built up in the local host's interface queue. With fewer packets queued up, interactive connections, especially over slow modems, will also be able to operate with lower Round Trip Times. However, note that this feature only effects data transmission (uploading / server side). It has no effect on data reception (downloading). Adjusting net.inet.tcp.inflight_stab is not recommended. This parameter defaults to 20, representing 2 maximal packets added to the bandwidth delay product window calculation. The additional window is required to stabilize the algorithm and improve responsiveness to changing conditions, but it can also result in higher ping times over slow links (though still much lower than you would get without the inflight algorithm). In such cases, you may wish to try reducing this parameter to 15, 10, or 5; and may also have to reduce net.inet.tcp.inflight_min (for example, to 3500) to get the desired effect. Reducing these parameters should be done as a last resort only. Adding Swap Space No matter how well you plan, sometimes a system does not run as you expect. If you find you need more swap space, it is simple enough to add. You have three ways to increase swap space: adding a new hard drive, enabling swap over NFS, and creating a swap file on an existing partition. Swap on a New Hard Drive The best way to add swap, of course, is to use this as an excuse to add another hard drive. You can always use another hard drive, after all. If you can do this, go reread the discussion of swap space from the Initial Configuration section of the Handbook for some suggestions on how to best arrange your swap. Swapping over NFS Swapping over NFS is only recommended if you do not have a local hard disk to swap to. Swapping over NFS is slow and inefficient in versions of FreeBSD prior to 4.X. It is reasonably fast and efficient in 4.0-RELEASE and newer. Even with newer versions of FreeBSD, NFS swapping will be limited by the available network bandwidth and puts an additional burden on the NFS server. Swapfiles You can create a file of a specified size to use as a swap file. In our example here we will use a 64MB file called /usr/swap0. You can use any name you want, of course. Creating a Swapfile on FreeBSD 4.X Be certain that your kernel configuration includes the vnode driver. It is not in recent versions of GENERIC. pseudo-device vn 1 #Vnode driver (turns a file into a device) create a vn-device: &prompt.root; cd /dev &prompt.root; sh MAKEDEV vn0 create a swapfile (/usr/swap0): &prompt.root; dd if=/dev/zero of=/usr/swap0 bs=1024k count=64 set proper permissions on (/usr/swap0): &prompt.root; chmod 0600 /usr/swap0 enable the swap file in /etc/rc.conf: swapfile="/usr/swap0" # Set to name of swapfile if aux swapfile desired. Reboot the machine or to enable the swap file immediately, type: &prompt.root; vnconfig -e /dev/vn0b /usr/swap0 swap Creating a Swapfile on FreeBSD 5.X Be certain that your kernel configuration includes the memory disk driver (&man.md.4;). It is default in GENERIC kernel. device md # Memory "disks" create a swapfile (/usr/swap0): &prompt.root; dd if=/dev/zero of=/usr/swap0 bs=1024k count=64 set proper permissions on (/usr/swap0): &prompt.root; chmod 0600 /usr/swap0 enable the swap file in /etc/rc.conf: swapfile="/usr/swap0" # Set to name of swapfile if aux swapfile desired. Reboot the machine or to enable the swap file immediately, type: &prompt.root; mdconfig -a -t vnode -f /usr/swap0 -u 0 && swapon /dev/md0 Hiten Pandya Written by Tom Rhodes ACPI and FreeBSD It is very important to utilize hardware resources in an efficient manner. Before ACPI was introduced, it was very difficult and inflexible for operating systems to manage the power usage and thermal properties of a system. The hardware was either controlled by some sort of BIOS embedded interface, i.e.: Plug and Play BIOS (PNPBIOS), Advanced Power Management (APM) and so on. Power and Resource Management is one of the key components of a modern operating system. For example, you would want an operating system to monitor system limits (and possibly take an action), in case your system temperature increased unexpectedly. In this section of the FreeBSD Handbook, we will provide comprehensive information about ACPI. References will be provided for further reading, at the end. Please be aware that ACPI is only available on FreeBSD 5.X and above. What is ACPI? Advanced Configuration and Power Interface (ACPI) is a standard written by an alliance of vendors to provide a standard interface for hardware resources and power management (hence the name). It is a key element in Operating System-directed configuration and Power Management, i.e.: it provides more control and flexibility to the operating system (OS). Modern systems stretched the limits of the current Plug and Play interfaces (such as APM, which is used in FreeBSD 4.X), prior to the introduction of ACPI. ACPI is the direct successor to APM (Advanced Power Management). Configuring <acronym>ACPI</acronym> The acpi.ko driver is loaded by default at start up by the &man.loader.8; and should not be compiled into the kernel. The reasoning behind this is that modules are easier to work with, say if switching to another acpi.ko without doing a kernel rebuild. This has the advantage of making testing easier. Another reason is that starting ACPI after a system has been brought up is not too useful, and in some cases can be fatal. In doubt, just disable ACPI all together. This driver should not and can not be unloaded because the system bus uses it for various hardware interactions. ACPI can be disabled with the &man.acpiconf.8; utility. In fact most of the interaction with ACPI can be done via &man.acpiconf.8;. Basically this means, if anything about ACPI is in the &man.dmesg.8; output, then most likely it is already running. ACPI and APM cannot coexist and should be used separately. The last one to load will terminate if the driver notices the other running. In the simplest form, ACPI can be used to put the system into a sleep mode with &man.acpiconf.8;, the flag, and a 1-5 option. Most users will only need 1. Option 5 will do a soft-off which is the same action as: &prompt.root; halt -p The other options are available. Check out the &man.acpiconf.8; manual page for more information. Debugging <acronym>ACPI</acronym> Almost everything in ACPI is transparent, until it does not work. That is usually when you as a user will know there is something not working properly.