diff --git a/en_US.ISO8859-1/books/arch-handbook/boot/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/boot/chapter.sgml index ef2711bbc4..a370d2a7eb 100644 --- a/en_US.ISO8859-1/books/arch-handbook/boot/chapter.sgml +++ b/en_US.ISO8859-1/books/arch-handbook/boot/chapter.sgml @@ -1,1044 +1,1044 @@ Sergey Lyubka Contributed by Bootstrapping and kernel initialization Synopsis BIOS firmware POST IA-32 booting system initialization This chapter is an overview of the boot and system initialization process, starting from the BIOS (firmware) POST to the creation of the first user process. Since the initial steps of system startup are very architecture dependent, the IA-32 architecture is used as an example. Overview A computer running FreeBSD can boot by several methods, although the most common method, booting from a harddisk where the OS is installed, is the one discussed here. The boot process is divided into several steps: BIOS POST boot0 stage boot2 stage loader stage kernel initialization BIOS POST boot0 boot2 loader The boot0 and boot2 stages are also referred to as bootstrap stages 1 and 2 in &man.boot.8;, the first steps in FreeBSD's 3-stage bootstrapping procedure. Various information is printed on the screen at each stage, so you may visually recognize them using the table that follows. Please note that the actual data may differ from machine to machine: BIOS (firmware) messages F1 FreeBSD F2 BSD F5 Disk 2 boot0 ->>FreeBSD/i386 BOOT +>>FreeBSD/i386 BOOT Default: 1:ad(1,a)/boot/loader boot: boot2 This prompt will appear if the user presses a key just after selecting an OS to boot at the boot0 stage. BTX loader 1.0 BTX version is 1.01 BIOS drive A: is disk0 BIOS drive C: is disk1 BIOS 639kB/64512kB available memory FreeBSD/i386 bootstrap loader, Revision 0.8 Console internal video/keyboard (jkh@bento.freebsd.org, Mon Nov 20 11:41:23 GMT 2000) /kernel text=0x1234 data=0x2345 syms=[0x4+0x3456] Hit [Enter] to boot immediately, or any other key for command prompt Booting [kernel] in 9 seconds..._ loader Copyright (c) 1992-2002 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD 4.6-RC #0: Sat May 4 22:49:02 GMT 2002 devnull@kukas:/usr/obj/usr/src/sys/DEVNULL Timecounter "i8254" frequency 1193182 Hz kernel BIOS POST When the PC powers on, the processor's registers are set to some predefined values. One of the registers is the instruction pointer register, and its value after power on is well defined: it is a 32-bit value of 0xfffffff0. The instruction pointer register points to the code to be executed by the processor. Another is the cr0 32-bit control register, and its value just after a reboot is 0. One of cr0's bits, the PE (Protection Enable) bit, indicates whether the processor is running in protected or real mode. Since this bit is cleared at boot time, the processor boots in real mode. Real mode means, among other things, that linear and physical addresses are identical. The value of 0xfffffff0 is slightly less than 4 GB, so unless the machine has 4 GB of physical memory, it cannot point to a valid memory address. The computer's hardware translates this address so that it points to a BIOS memory block. BIOS stands for Basic Input Output System, and it is a chip on the motherboard that has a relatively small amount of read-only memory (ROM). This memory contains various low-level routines that are specific to the hardware supplied with the motherboard.
So, the processor will first jump to the address 0xfffffff0, which really resides in the BIOS's memory. Usually this address contains a jump instruction to the BIOS's POST routines. POST stands for Power On Self Test. This is a set of routines including the memory check, system bus check and other low-level things, so that the CPU can initialize the computer properly. An important step at this stage is determining the boot device. All modern BIOSes allow the boot device to be set manually, so you can boot from a floppy, CD-ROM, harddisk etc. The very last thing the POST does is issue the INT 0x19 instruction. That instruction reads 512 bytes from the first sector of the boot device into memory at address 0x7c00. The term first sector originates from harddrive architecture, where the magnetic plate is divided into a number of cylindrical tracks. Tracks are numbered, and every track is divided into a number of sectors (usually 64). Track number 0 is the outermost on the magnetic plate, and sector 1, the first sector (tracks, or cylinders, are numbered starting from 0, but sectors starting from 1), has a special meaning. It is also called the Master Boot Record, or MBR. The remaining sectors on the first track are never used. (Some utilities such as &man.disklabel.8; may store their information in this area, mostly in the second sector.) <literal>boot0</literal> stage MBR Take a look at the file /boot/boot0. This is a small 512-byte file, and it is exactly what FreeBSD's installation procedure wrote to your harddisk's MBR if you chose the bootmanager option at installation time. As mentioned previously, the INT 0x19 instruction loads the MBR, i.e. the boot0 content, into memory at address 0x7c00. A look at the file sys/boot/i386/boot0/boot0.s gives an idea of what is happening there: this is the boot manager, an awesome piece of code written by Robert Nordier. The MBR, or boot0, has a special structure starting at offset 0x1be, called the partition table. It has 4 records of 16 bytes each, called partition records, which represent how the harddisk(s) are partitioned, or, in FreeBSD's terminology, sliced. One byte of those 16 says whether a partition (slice) is bootable or not. Exactly one record must have that flag set, otherwise boot0's code will refuse to proceed. A partition record has the following fields: the 1-byte filesystem type, the 1-byte bootable flag, the 6-byte descriptor in CHS format, and the 8-byte descriptor in LBA format (a short C sketch at the end of this section makes this layout concrete). A partition record descriptor holds the information about where exactly the partition resides on the drive. Both descriptors, LBA and CHS, describe the same information, but in different ways: LBA (Logical Block Addressing) has the starting sector for the partition and the partition's length, while CHS (Cylinder Head Sector) has coordinates for the first and last sectors of the partition. The boot manager scans the partition table and prints a menu on the screen so the user can select what disk and what slice to boot. By pressing an appropriate key, boot0 performs the following actions: modifies the bootable flag for the selected partition to make it bootable, and clears the previous one; saves itself to disk to remember what partition (slice) has been selected, so as to use it as the default on the next boot; loads the first sector of the selected partition (slice) into memory and jumps there. What kind of data should reside on the very first sector of a bootable partition (slice), in our case, a FreeBSD slice? As you may have already guessed, it is boot2.
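Before moving on to boot2, here is a small C sketch that makes the partition-record layout described above concrete. This is illustrative code only, not part of the bootstrap sources; the struct and field names are invented, and the on-disk field order shown (bootable flag, CHS start, type, CHS end, LBA start, LBA count) is the conventional MBR layout, assuming a little-endian host such as the i386:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MBR_PTABLE_OFF	0x1be	/* the partition table starts here */
#define MBR_BOOTABLE	0x80	/* value of the bootable flag */

/* One 16-byte partition record (names are ours, not FreeBSD's). */
struct part_rec {
	uint8_t	 flag;		/* 1-byte bootable flag */
	uint8_t	 chs_first[3];	/* CHS descriptor of the first sector */
	uint8_t	 type;		/* 1-byte filesystem (slice) type */
	uint8_t	 chs_last[3];	/* CHS descriptor of the last sector */
	uint32_t lba_start;	/* LBA: starting sector of the slice */
	uint32_t lba_count;	/* LBA: length of the slice in sectors */
};

/* Report the bootable slice found in a 512-byte MBR image. */
static void
scan_mbr(const uint8_t *mbr)
{
	struct part_rec p;
	int i;

	for (i = 0; i < 4; i++) {
		memcpy(&p, mbr + MBR_PTABLE_OFF + i * 16, sizeof(p));
		if (p.flag == MBR_BOOTABLE)
			printf("slice %d: type 0x%02x, LBA start %u, "
			    "%u sectors\n", i + 1, p.type,
			    p.lba_start, p.lba_count);
	}
}

Exactly one of the four records should report itself bootable here; as noted above, boot0 refuses to proceed otherwise.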
<literal>boot2</literal> stage You might wonder, why boot2 comes after boot0, and not boot1. Actually, there is a 512-byte file called boot1 in the directory /boot as well. It is used for booting from a floppy. When booting from a floppy, boot1 plays the same role as boot0 for a harddisk: it locates boot2 and runs it. You may have realized that a file /boot/mbr exists as well. It is a simplified version of boot0. The code in mbr does not provide a menu for the user, it just blindly boots the partition marked active. The code implementing boot2 resides in sys/boot/i386/boot2/, and the executable itself is in /boot. The files boot0 and boot2 that are in /boot are not used by the bootstrap, but by utilities such as boot0cfg. The actual position for boot0 is in the MBR. For boot2 it is the beginning of a bootable FreeBSD slice. These locations are not under the filesystem's control, so they are invisible to commands like ls. The main task for boot2 is to load the file /boot/loader, which is the third stage in the bootstrapping procedure. The code in boot2 cannot use any services like open() and read(), since the kernel is not yet loaded. It must scan the harddisk, knowing about the filesystem structure, find the file /boot/loader, read it into memory using a BIOS service, and then pass the execution to the loader's entry point. Besides that, boot2 prompts for user input so the loader can be booted from different disk, unit, slice and partition. The boot2 binary is created in special way: sys/boot/i386/boot2/Makefile boot2: boot2.ldr boot2.bin ${BTX}/btx/btx btxld -v -E ${ORG2} -f bin -b ${BTX}/btx/btx -l boot2.ldr \ -o boot2.ld -P 1 boot2.bin BTX This Makefile snippet shows that &man.btxld.8; is used to link the binary. BTX, which stands for BooT eXtender, is a piece of code that provides a protected mode environment for the program, called the client, that it is linked with. So boot2 is a BTX client, i.e. it uses the service provided by BTX. linker The btxld utility is the linker. It links two binaries together. The difference between &man.btxld.8; and &man.ld.1; is that ld usually links object files into a shared object or executable, while btxld links an object file with the BTX, producing the binary file suitable to be put on the beginning of the partition for the system boot. boot0 passes the execution to BTX's entry point. BTX then switches the processor to protected mode, and prepares a simple environment before calling the client. This includes: virtual v86 mode virtual v86 mode. That means, the BTX is a v86 monitor. Real mode instructions like pushf, popf, cli, sti, if called by the client, will work. Interrupt Descriptor Table (IDT) is set up so all hardware interrupts are routed to the default BIOS's handlers, and interrupt 0x30 is set up to be the syscall gate. 
Two system calls, exec and exit, are defined: sys/boot/i386/btx/lib/btxsys.s: .set INT_SYS,0x30 # Interrupt number # # System call: exit # __exit: xorl %eax,%eax # BTX system int $INT_SYS # call 0x0 # # System call: exec # __exec: movl $0x1,%eax # BTX system int $INT_SYS # call 0x1 BTX creates a Global Descriptor Table (GDT): sys/boot/i386/btx/btx/btx.s: gdt: .word 0x0,0x0,0x0,0x0 # Null entry .word 0xffff,0x0,0x9a00,0xcf # SEL_SCODE .word 0xffff,0x0,0x9200,0xcf # SEL_SDATA .word 0xffff,0x0,0x9a00,0x0 # SEL_RCODE .word 0xffff,0x0,0x9200,0x0 # SEL_RDATA .word 0xffff,MEM_USR,0xfa00,0xcf# SEL_UCODE .word 0xffff,MEM_USR,0xf200,0xcf# SEL_UDATA .word _TSSLM,MEM_TSS,0x8900,0x0 # SEL_TSS The client's code and data start from address MEM_USR (0xa000), and a selector (SEL_UCODE) points to the client's code segment. The SEL_UCODE descriptor has Descriptor Privilege Level (DPL) 3, which is the lowest privilege level. But the INT 0x30 instruction handler resides in a segment pointed to by the SEL_SCODE (supervisor code) selector, as shown by the code that creates the IDT: mov $SEL_SCODE,%dh # Segment selector init.2: shr %bx # Handle this int? jnc init.3 # No mov %ax,(%di) # Set handler offset mov %dh,0x2(%di) # and selector mov %dl,0x5(%di) # Set P:DPL:type add $0x4,%ax # Next handler So, when the client calls __exec(), the code will be executed with the highest privileges. This allows the kernel to change the protected mode data structures, such as page tables, GDT, IDT, etc. later, if needed. boot2 defines an important structure, struct bootinfo. This structure is initialized by boot2 and passed to the loader, and then further to the kernel. Some fields of this structure are set by boot2, the rest by the loader. This structure, among other information, contains the kernel filename, the BIOS harddisk geometry, the BIOS drive number for the boot device, the physical memory available, the envp pointer etc. The definition for it is: /usr/include/machine/bootinfo.h struct bootinfo { u_int32_t bi_version; u_int32_t bi_kernelname; /* represents a char * */ u_int32_t bi_nfs_diskless; /* struct nfs_diskless * */ /* End of fields that are always present. */ #define bi_endcommon bi_n_bios_used u_int32_t bi_n_bios_used; u_int32_t bi_bios_geom[N_BIOS_GEOM]; u_int32_t bi_size; u_int8_t bi_memsizes_valid; u_int8_t bi_bios_dev; /* bootdev BIOS unit number */ u_int8_t bi_pad[2]; u_int32_t bi_basemem; u_int32_t bi_extmem; u_int32_t bi_symtab; /* struct symtab * */ u_int32_t bi_esymtab; /* struct symtab * */ /* Items below only from advanced bootloader */ u_int32_t bi_kernend; /* end of kernel space */ u_int32_t bi_envp; /* environment */ u_int32_t bi_modulep; /* preloaded modules */ }; boot2 enters an infinite loop waiting for user input, then calls load(). If the user does not press anything, the loop is broken by a timeout, and load() loads the default file (/boot/loader). The functions ino_t lookup(char *filename) and int xfsread(ino_t inode, void *buf, size_t nbyte) are used to read the content of a file into memory. /boot/loader is an ELF binary, but one where the ELF header is prepended with a.out's struct exec structure.
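As an aside, this dual header can be recognized with a few lines of C. The sketch below is ours, not boot2's actual code; the 32-byte size assumed for the i386 a.out struct exec is stated explicitly rather than taken from a header:

#include <stdint.h>
#include <string.h>

#define AOUT_EXEC_SIZE	32	/* assumed sizeof(struct exec) on i386 */

/*
 * Given the first bytes of /boot/loader, return a pointer to the
 * real ELF header: either at the very beginning, or right behind
 * the prepended a.out struct exec described above.
 */
static const uint8_t *
find_elf_header(const uint8_t *img)
{
	static const uint8_t elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

	if (memcmp(img, elf_magic, sizeof(elf_magic)) == 0)
		return (img);			/* plain ELF */
	if (memcmp(img + AOUT_EXEC_SIZE, elf_magic,
	    sizeof(elf_magic)) == 0)
		return (img + AOUT_EXEC_SIZE);	/* a.out header first */
	return (NULL);				/* unrecognized image */
}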
load() scans the loader's ELF header, loads the content of /boot/loader into memory, and passes execution to the loader's entry point: sys/boot/i386/boot2/boot2.c: - __exec((caddr_t)addr, RB_BOOTINFO | (opts & RBX_MASK), + __exec((caddr_t)addr, RB_BOOTINFO | (opts & RBX_MASK), MAKEBOOTDEV(dev_maj[dsk.type], 0, dsk.slice, dsk.unit, dsk.part), 0, 0, 0, VTOP(&bootinfo)); <application>loader</application> stage loader is a BTX client as well. I will not describe it here in detail; there is a comprehensive manual page written by Mike Smith, &man.loader.8;. The underlying mechanisms and BTX were discussed above. The main task for the loader is to boot the kernel. When the kernel is loaded into memory, it is called by the loader: sys/boot/common/boot.c: /* Call the exec handler from the loader matching the kernel */ - module_formats[km->m_loader]->l_exec(km); + module_formats[km->m_loader]->l_exec(km); Kernel initialization Where exactly is execution passed by the loader, i.e. what is the kernel's actual entry point? Let us take a look at the command that links the kernel: sys/conf/Makefile.i386: ld -elf -Bdynamic -T /usr/src/sys/conf/ldscript.i386 -export-dynamic \ -dynamic-linker /red/herring -o kernel -X locore.o \ <lots of kernel .o files> ELF A few interesting things can be seen in this line. First, the kernel is an ELF dynamically linked binary, but the dynamic linker for the kernel is /red/herring, which is deliberately a bogus file. Second, taking a look at the file sys/conf/ldscript.i386 gives an idea about what ld options are used when linking a kernel. Reading through the first few lines, the string sys/conf/ldscript.i386: ENTRY(btext) says that the kernel's entry point is the symbol `btext'. This symbol is defined in locore.s: sys/i386/i386/locore.s: .text /********************************************************************** * * This is where the bootblocks start us, set the ball rolling... * */ NON_GPROF_ENTRY(btext) The first thing done is setting the register EFLAGS to a predefined value of 0x00000002; then all the segment registers are initialized: sys/i386/i386/locore.s /* Don't trust what the BIOS gives for eflags. */ pushl $PSL_KERNEL popfl /* * Don't trust what the BIOS gives for %fs and %gs. Trust the bootstrap * to set %cs, %ds, %es and %ss. */ mov %ds, %ax mov %ax, %fs mov %ax, %gs btext calls the routines recover_bootinfo(), identify_cpu(), create_pagetables(), which are also defined in locore.s. Here is a description of what they do: - + recover_bootinfo This routine parses the parameters passed to the kernel from the bootstrap. The kernel may have been booted in 3 ways: by the loader, described above, by the old disk boot blocks, and by the old diskless boot procedure. This function determines the booting method, and stores the struct bootinfo structure into the kernel memory. identify_cpu This function tries to find out what CPU it is running on, storing the value found in the variable _cpu. create_pagetables This function allocates and fills out a Page Table Directory at the top of the kernel memory area. The next step is enabling VME, if the CPU supports it: testl $CPUID_VME, R(_cpu_feature) jz 1f movl %cr4, %eax orl $CR4_VME, %eax movl %eax, %cr4 Then, enabling paging: /* Now enable paging */ movl R(_IdlePTD), %eax movl %eax,%cr3 /* load ptd addr into mmu */ movl %cr0,%eax /* get control word */ orl $CR0_PE|CR0_PG,%eax /* enable paging */ movl %eax,%cr0 /* and let's page NOW!
*/ The next three lines of code are needed because paging has been enabled, so a jump is required to continue execution in the virtualized address space: pushl $begin /* jump to high virtualized address */ ret /* now running relocated at KERNBASE where the system is linked to run */ begin: The function init386() is called with a pointer to the first free physical page, followed by mi_startup(). init386() is an architecture-dependent initialization function, and mi_startup() is an architecture-independent one (the mi_ prefix stands for Machine Independent). The kernel never returns from mi_startup(), and by calling it, the kernel finishes booting: sys/i386/i386/locore.s: movl physfree, %esi pushl %esi /* value of first for init386(first) */ call _init386 /* wire 386 chip for unix operation */ call _mi_startup /* autoconfiguration, mountroot etc */ hlt /* never returns to here */ <function>init386()</function> init386() is defined in sys/i386/i386/machdep.c and performs low-level initialization specific to the i386 chip. The switch to protected mode was performed by the loader. The loader has created the very first task, in which the kernel continues to operate. Before going straight to the code, I will enumerate the tasks the processor must complete to initialize protected mode execution: Initialize the kernel tunable parameters, passed from the bootstrapping program. Prepare the GDT. Prepare the IDT. Initialize the system console. Initialize the DDB, if it is compiled into the kernel. Initialize the TSS. Prepare the LDT. Set up proc0's pcb. parameters The first thing init386() does is initialize the tunable parameters passed from the bootstrap. This is done by setting the environment pointer (envp) and calling init_param1(). The envp pointer has been passed from the loader in the bootinfo structure: sys/i386/i386/machdep.c: kern_envp = (caddr_t)bootinfo.bi_envp + KERNBASE; /* Init basic tunables, hz etc */ init_param1(); init_param1() is defined in sys/kern/subr_param.c. That file has a number of sysctls, and two functions, init_param1() and init_param2(), that are called from init386(): sys/kern/subr_param.c hz = HZ; TUNABLE_INT_FETCH("kern.hz", &hz); TUNABLE_<typename>_FETCH is used to fetch the value from the environment: /usr/src/sys/sys/kernel.h #define TUNABLE_INT_FETCH(path, var) getenv_int((path), (var)) Sysctl kern.hz is the system clock tick. Along with this, the following sysctls are set by init_param1(): kern.maxswzone, kern.maxbcache, kern.maxtsiz, kern.dfldsiz, kern.dflssiz, kern.maxssiz, kern.sgrowsiz. Global Descriptor Table (GDT) Then init386() prepares the Global Descriptor Table (GDT). Every task on an x86 is running in its own virtual address space, and this space is addressed by a segment:offset pair. Say, for instance, that the current instruction to be executed by the processor lies at CS:EIP; then the linear virtual address for that instruction is the virtual address of the code segment CS plus EIP. For convenience, segments begin at virtual address 0 and end at a 4 GB boundary. Therefore, the instruction's linear virtual address for this example would just be the value of EIP. Segment registers such as CS, DS etc. are the selectors, i.e. indexes, into the GDT (to be more precise, an index is not a selector itself, but the INDEX field of a selector). The encoding is sketched below.
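To make the selector encoding concrete: bits 3-15 of a selector hold the INDEX field, bit 2 is the table indicator (0 for the GDT, 1 for the LDT), and bits 0-1 hold the Requested Privilege Level (RPL). A minimal sketch, mirroring the GSEL() macro from sys/i386/include/segments.h:

/* Build a GDT selector from a descriptor index and an RPL. */
#define MY_GSEL(index, rpl)	(((index) << 3) | (rpl))

/*
 * GCODE_SEL is index 1 in the table quoted below, so the kernel
 * code selector is MY_GSEL(1, 0) == 0x08.
 */

For GDT selectors the table-indicator bit stays 0, so shifting the index left by 3 and OR-ing in the RPL is all there is to it.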
FreeBSD's GDT holds descriptors for 15 selectors per CPU: sys/i386/i386/machdep.c: union descriptor gdt[NGDT * MAXCPU]; /* global descriptor table */ sys/i386/include/segments.h: /* * Entries in the Global Descriptor Table (GDT) */ #define GNULL_SEL 0 /* Null Descriptor */ #define GCODE_SEL 1 /* Kernel Code Descriptor */ #define GDATA_SEL 2 /* Kernel Data Descriptor */ #define GPRIV_SEL 3 /* SMP Per-Processor Private Data */ #define GPROC0_SEL 4 /* Task state process slot zero and up */ #define GLDT_SEL 5 /* LDT - eventually one per process */ #define GUSERLDT_SEL 6 /* User LDT */ #define GTGATE_SEL 7 /* Process task switch gate */ #define GBIOSLOWMEM_SEL 8 /* BIOS low memory access (must be entry 8) */ #define GPANIC_SEL 9 /* Task state to consider panic from */ #define GBIOSCODE32_SEL 10 /* BIOS interface (32bit Code) */ #define GBIOSCODE16_SEL 11 /* BIOS interface (16bit Code) */ #define GBIOSDATA_SEL 12 /* BIOS interface (Data) */ #define GBIOSUTIL_SEL 13 /* BIOS interface (Utility) */ #define GBIOSARGS_SEL 14 /* BIOS interface (Arguments) */ Note that those #defines are not selectors themselves, but just a field INDEX of a selector, so they are exactly the indices of the GDT. for example, an actual selector for the kernel code (GCODE_SEL) has the value 0x08. Interrupt Descriptor Table (IDT) The next step is to initialize the Interrupt Descriptor Table (IDT). This table is to be referenced by the processor when a software or hardware interrupt occurs. For example, to make a system call, user application issues the INT 0x80 instruction. This is a software interrupt, so the processor's hardware looks up a record with index 0x80 in the IDT. This record points to the routine that handles this interrupt, in this particular case, this will be the kernel's syscall gate. The IDT may have a maximum of 256 (0x100) records. The kernel allocates NIDT records for the IDT, where NIDT is the maximum (256): sys/i386/i386/machdep.c: static struct gate_descriptor idt0[NIDT]; struct gate_descriptor *idt = &idt0[0]; /* interrupt descriptor table */ For each interrupt, an appropriate handler is set. The syscall gate for INT 0x80 is set as well: sys/i386/i386/machdep.c: setidt(0x80, &IDTVEC(int0x80_syscall), SDT_SYS386TGT, SEL_UPL, GSEL(GCODE_SEL, SEL_KPL)); So when a userland application issues the INT 0x80 instruction, control will transfer to the function _Xint0x80_syscall, which is in the kernel code segment and will be executed with supervisor privileges. Console and DDB are then initialized: DDB sys/i386/i386/machdep.c: cninit(); /* skipped */ #ifdef DDB kdb_init(); - if (boothowto & RB_KDB) + if (boothowto & RB_KDB) Debugger("Boot flags requested debugger"); #endif The Task State Segment is another x86 protected mode structure, the TSS is used by the hardware to store task information when a task switch occurs. The Local Descriptors Table is used to reference userland code and data. Several selectors are defined to point to the LDT, they are the system call gates and the user code and data selectors: /usr/include/machine/segments.h #define LSYS5CALLS_SEL 0 /* forced by intel BCS */ #define LSYS5SIGR_SEL 1 #define L43BSDCALLS_SEL 2 /* notyet */ #define LUCODE_SEL 3 -#define LSOL26CALLS_SEL 4 /* Solaris >= 2.6 system call gate */ +#define LSOL26CALLS_SEL 4 /* Solaris >= 2.6 system call gate */ #define LUDATA_SEL 5 /* separate stack, es,fs,gs sels ? 
*/ /* #define LPOSIXCALLS_SEL 5*/ /* notyet */ #define LBSDICALLS_SEL 16 /* BSDI system call gate */ #define NLDT (LBSDICALLS_SEL + 1) Next, proc0's Process Control Block (struct pcb) structure is initialized. proc0 is a struct proc structure that describes a kernel process. It is always present while the kernel is running, therefore it is declared as global: sys/kern/kern_init.c: struct proc proc0; The structure struct pcb is a part of a proc structure. It is defined in /usr/include/machine/pcb.h and has a process's information specific to the i386 architecture, such as registers values. <function>mi_startup()</function> This function performs a bubble sort of all the system initialization objects and then calls the entry of each object one by one: sys/kern/init_main.c: for (sipp = sysinit; *sipp; sipp++) { /* ... skipped ... */ /* Call function */ - (*((*sipp)->func))((*sipp)->udata); + (*((*sipp)->func))((*sipp)->udata); /* ... skipped ... */ } Although the sysinit framework is described in the Developers' Handbook, I will discuss the internals of it. sysinit objects Every system initialization object (sysinit object) is created by calling a SYSINIT() macro. Let us take as example an announce sysinit object. This object prints the copyright message: sys/kern/init_main.c: static void print_caddr_t(void *data __unused) { printf("%s", (char *)data); } SYSINIT(announce, SI_SUB_COPYRIGHT, SI_ORDER_FIRST, print_caddr_t, copyright) The subsystem ID for this object is SI_SUB_COPYRIGHT (0x0800001), which comes right after the SI_SUB_CONSOLE (0x0800000). So, the copyright message will be printed out first, just after the console initialization. Let us take a look at what exactly the macro SYSINIT() does. It expands to a C_SYSINIT() macro. The C_SYSINIT() macro then expands to a static struct sysinit structure declaration with another DATA_SET macro call: /usr/include/sys/kernel.h: #define C_SYSINIT(uniquifier, subsystem, order, func, ident) \ static struct sysinit uniquifier ## _sys_init = { \ subsystem, \ order, \ func, \ ident \ }; \ DATA_SET(sysinit_set,uniquifier ## _sys_init); #define SYSINIT(uniquifier, subsystem, order, func, ident) \ C_SYSINIT(uniquifier, subsystem, order, \ (sysinit_cfunc_t)(sysinit_nfunc_t)func, (void *)ident) The DATA_SET() macro expands to a MAKE_SET(), and that macro is the point where the all sysinit magic is hidden: /usr/include/linker_set.h #define MAKE_SET(set, sym) \ static void const * const __set_##set##_sym_##sym = &sym; \ __asm(".section .set." #set ",\"aw\""); \ __asm(".long " #sym); \ __asm(".previous") #endif #define TEXT_SET(set, sym) MAKE_SET(set, sym) #define DATA_SET(set, sym) MAKE_SET(set, sym) In our case, the following declaration will occur: static struct sysinit announce_sys_init = { SI_SUB_COPYRIGHT, SI_ORDER_FIRST, (sysinit_cfunc_t)(sysinit_nfunc_t) print_caddr_t, (void *) copyright }; static void const *const __set_sysinit_set_sym_announce_sys_init = &announce_sys_init; __asm(".section .set.sysinit_set" ",\"aw\""); __asm(".long " "announce_sys_init"); __asm(".previous"); The first __asm instruction will create an ELF section within the kernel's executable. This will happen at kernel link time. The section will have the name .set.sysinit_set. The content of this section is one 32-bit value, the address of announce_sys_init structure, and that is what the second __asm is. The third __asm instruction marks the end of a section. If a directive with the same section name occurred before, the content, i.e. 
the 32-bit value, will be appended to the existing section, thus forming an array of 32-bit pointers. Running objdump on a kernel binary, you may notice the presence of such small sections: &prompt.user; objdump -h /kernel 7 .set.cons_set 00000014 c03164c0 c03164c0 002154c0 2**2 CONTENTS, ALLOC, LOAD, DATA 8 .set.kbddriver_set 00000010 c03164d4 c03164d4 002154d4 2**2 CONTENTS, ALLOC, LOAD, DATA 9 .set.scrndr_set 00000024 c03164e4 c03164e4 002154e4 2**2 CONTENTS, ALLOC, LOAD, DATA 10 .set.scterm_set 0000000c c0316508 c0316508 00215508 2**2 CONTENTS, ALLOC, LOAD, DATA 11 .set.sysctl_set 0000097c c0316514 c0316514 00215514 2**2 CONTENTS, ALLOC, LOAD, DATA 12 .set.sysinit_set 00000664 c0316e90 c0316e90 00215e90 2**2 CONTENTS, ALLOC, LOAD, DATA This screen dump shows that the size of the .set.sysinit_set section is 0x664 bytes, so 0x664/sizeof(void *) sysinit objects are compiled into the kernel. The other sections such as .set.sysctl_set represent other linker sets. By defining a variable of type struct linker_set, the content of the .set.sysinit_set section will be collected into that variable: sys/kern/init_main.c: extern struct linker_set sysinit_set; /* XXX */ The struct linker_set is defined as follows: /usr/include/linker_set.h: struct linker_set { int ls_length; void *ls_items[1]; /* really ls_length of them, trailing NULL */ }; The first field will be equal to the number of sysinit objects, and the second field will be a NULL-terminated array of pointers to them. Returning to the mi_startup() discussion, it should be clear now how the sysinit objects are organized. The mi_startup() function sorts them and calls each one. The very last object is the system scheduler: /usr/include/sys/kernel.h: enum sysinit_sub_id { SI_SUB_DUMMY = 0x0000000, /* not executed; for linker*/ SI_SUB_DONE = 0x0000001, /* processed*/ SI_SUB_CONSOLE = 0x0800000, /* console*/ SI_SUB_COPYRIGHT = 0x0800001, /* first use of console*/ ... SI_SUB_RUN_SCHEDULER = 0xfffffff /* scheduler: no return*/ }; The system scheduler sysinit object is defined in the file sys/vm/vm_glue.c, and the entry point for that object is scheduler(). That function is actually an infinite loop, and it represents a process with PID 0, the swapper process. The proc0 structure, mentioned before, is used to describe it. The first user process, called init, is created by the sysinit object init: sys/kern/init_main.c: static void create_init(const void *udata __unused) { int error; int s; s = splhigh(); error = fork1(&proc0, RFFDG | RFPROC, &initproc); if (error) panic("cannot fork init: %d\n", error); - initproc->p_flag |= P_INMEM | P_SYSTEM; + initproc->p_flag |= P_INMEM | P_SYSTEM; cpu_set_fork_handler(initproc, start_init, NULL); remrunqueue(initproc); splx(s); } SYSINIT(init,SI_SUB_CREATE_INIT, SI_ORDER_FIRST, create_init, NULL) create_init() allocates a new process by calling fork1(), but does not mark it runnable. When this new process is scheduled for execution by the scheduler, start_init() will be called. That function is defined in init_main.c.
It tries to load and exec the init binary, probing /sbin/init first, then /sbin/oinit, /sbin/init.bak, and finally /stand/sysinstall: sys/kern/init_main.c: static char init_path[MAXPATHLEN] = #ifdef INIT_PATH __XSTRING(INIT_PATH); #else "/sbin/init:/sbin/oinit:/sbin/init.bak:/stand/sysinstall"; #endif diff --git a/en_US.ISO8859-1/books/arch-handbook/driverbasics/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/driverbasics/chapter.sgml index 7de6991778..c42f39b41b 100644 --- a/en_US.ISO8859-1/books/arch-handbook/driverbasics/chapter.sgml +++ b/en_US.ISO8859-1/books/arch-handbook/driverbasics/chapter.sgml @@ -1,616 +1,616 @@ Murray Stokely Written by Jörg Wunsch Based on intro(4) manual page by Writing FreeBSD Device Drivers Introduction device driver pseudo-device This chapter provides a brief introduction to writing device drivers for FreeBSD. A device in this context is a term used mostly for hardware-related stuff that belongs to the system, like disks, printers, or a graphics display with its keyboard. A device driver is the software component of the operating system that controls a specific device. There are also so-called pseudo-devices where a device driver emulates the behavior of a device in software without any particular underlying hardware. Device drivers can be compiled into the system statically or loaded on demand through the dynamic kernel linker facility `kld'. device nodes MAKEDEV Most devices in a &unix;-like operating system are accessed through device-nodes, sometimes also called special files. These files are usually located under the directory /dev in the filesystem hierarchy. In releases of FreeBSD older than 5.0-RELEASE, where &man.devfs.5; support is not integrated into FreeBSD, each device node must be created statically and independent of the existence of the associated device driver. Most device nodes on the system are created by running MAKEDEV. Device drivers can roughly be broken down into two categories; character and network device drivers. Dynamic Kernel Linker Facility - KLD kernel linkingdynamic kernel loadable modules (KLD) The kld interface allows system administrators to dynamically add and remove functionality from a running system. This allows device driver writers to load their new changes into a running kernel without constantly rebooting to test changes. The kld interface is used through the following privileged commands: kernel modulesloading kernel modulesunloading kernel moduleslisting kldload - loads a new kernel module kldunload - unloads a kernel module kldstat - lists the currently loaded modules Skeleton Layout of a kernel module /* * KLD Skeleton * Inspired by Andrew Reiter's Daemonnews article */ #include <sys/types.h> #include <sys/module.h> #include <sys/systm.h> /* uprintf */ #include <sys/errno.h> #include <sys/param.h> /* defines used in kernel.h */ #include <sys/kernel.h> /* types used in module initialization */ /* * Load handler that deals with the loading and unloading of a KLD. */ static int skel_loader(struct module *m, int what, void *arg) { int err = 0; switch (what) { case MOD_LOAD: /* kldload */ uprintf("Skeleton KLD loaded.\n"); break; case MOD_UNLOAD: uprintf("Skeleton KLD unloaded.\n"); break; default: err = EINVAL; break; } return(err); } /* Declare this module to the rest of the kernel */ static moduledata_t skel_mod = { "skel", skel_loader, NULL }; DECLARE_MODULE(skeleton, skel_mod, SI_SUB_KLD, SI_ORDER_ANY); Makefile FreeBSD provides a makefile include that you can use to quickly compile your kernel addition. 
SRCS=skeleton.c KMOD=skeleton .include <bsd.kmod.mk> Simply running make with this makefile will create a file skeleton.ko that can be loaded into your system by typing: &prompt.root; kldload -v ./skeleton.ko Accessing a device driver &unix; provides a common set of system calls for user applications to use. The upper layers of the kernel dispatch these calls to the corresponding device driver when a user accesses a device node. The /dev/MAKEDEV script makes most of the device nodes for your system but if you are doing your own driver development it may be necessary to create your own device nodes with mknod. Creating static device nodes device nodesstatic mknod The mknod command requires four arguments to create a device node. You must specify the name of the device node, the type of device, the major number of the device, and the minor number of the device. Dynamic device nodes device nodesdynamic devfs The device filesystem, or devfs, provides access to the kernel's device namespace in the global filesystem namespace. This eliminates the problems of potentially having a device driver without a static device node, or a device node without an installed device driver. Devfs is still a work in progress, but it is already working quite nicely. Character Devices character devices A character device driver is one that transfers data directly to and from a user process. This is the most common type of device driver and there are plenty of simple examples in the source tree. This simple example pseudo-device remembers whatever values you write to it and can then supply them back to you when you - read from it. Two versions are shown, one for &os;  4.X and - one for &os;  5.X. + read from it. Two versions are shown, one for &os; 4.X and + one for &os; 5.X. Example of a Sample Echo Pseudo-Device Driver for &os; 4.X /* * Simple `echo' pseudo-device KLD * * Murray Stokely */ -#define MIN(a,b) (((a) < (b)) ? (a) : (b)) +#define MIN(a,b) (((a) < (b)) ? (a) : (b)) #include <sys/types.h> #include <sys/module.h> #include <sys/systm.h> /* uprintf */ #include <sys/errno.h> #include <sys/param.h> /* defines used in kernel.h */ #include <sys/kernel.h> /* types used in module initialization */ #include <sys/conf.h> /* cdevsw struct */ #include <sys/uio.h> /* uio struct */ #include <sys/malloc.h> #define BUFFERSIZE 256 /* Function prototypes */ d_open_t echo_open; d_close_t echo_close; d_read_t echo_read; d_write_t echo_write; /* Character device entry points */ static struct cdevsw echo_cdevsw = { echo_open, echo_close, echo_read, echo_write, noioctl, nopoll, nommap, nostrategy, "echo", 33, /* reserved for lkms - /usr/src/sys/conf/majors */ nodump, nopsize, D_TTY, -1 }; struct s_echo { char msg[BUFFERSIZE]; int len; } t_echo; /* vars */ static dev_t sdev; static int len; static int count; static t_echo *echomsg; MALLOC_DECLARE(M_ECHOBUF); MALLOC_DEFINE(M_ECHOBUF, "echobuffer", "buffer for echo module"); /* * This function is called by the kld[un]load(2) system calls to * determine what actions to take when a module is loaded or unloaded. 
*/ static int echo_loader(struct module *m, int what, void *arg) { int err = 0; switch (what) { case MOD_LOAD: /* kldload */ - sdev = make_dev(&echo_cdevsw, + sdev = make_dev(&echo_cdevsw, 0, UID_ROOT, GID_WHEEL, 0600, "echo"); /* kmalloc memory for use by this driver */ MALLOC(echomsg, t_echo *, sizeof(t_echo), M_ECHOBUF, M_WAITOK); printf("Echo device loaded.\n"); break; case MOD_UNLOAD: destroy_dev(sdev); FREE(echomsg,M_ECHOBUF); printf("Echo device unloaded.\n"); break; default: err = EINVAL; break; } return(err); } int echo_open(dev_t dev, int oflags, int devtype, struct proc *p) { int err = 0; uprintf("Opened device \"echo\" successfully.\n"); return(err); } int echo_close(dev_t dev, int fflag, int devtype, struct proc *p) { uprintf("Closing device \"echo.\"\n"); return(0); } /* * The read function just takes the buf that was saved via * echo_write() and returns it to userland for accessing. * uio(9) */ int echo_read(dev_t dev, struct uio *uio, int ioflag) { int err = 0; int amt; /* * How big is this read operation? Either as big as the user wants, * or as big as the remaining data */ - amt = MIN(uio->uio_resid, (echomsg->len - uio->uio_offset > 0) ? - echomsg->len - uio->uio_offset : 0); - if ((err = uiomove(echomsg->msg + uio->uio_offset,amt,uio)) != 0) { + amt = MIN(uio->uio_resid, (echomsg->len - uio->uio_offset > 0) ? + echomsg->len - uio->uio_offset : 0); + if ((err = uiomove(echomsg->msg + uio->uio_offset,amt,uio)) != 0) { uprintf("uiomove failed!\n"); } return(err); } /* * echo_write takes in a character string and saves it * to buf for later accessing. */ int echo_write(dev_t dev, struct uio *uio, int ioflag) { int err = 0; /* Copy the string in from user memory to kernel memory */ - err = copyin(uio->uio_iov->iov_base, echomsg->msg, - MIN(uio->uio_iov->iov_len,BUFFERSIZE - 1)); + err = copyin(uio->uio_iov->iov_base, echomsg->msg, + MIN(uio->uio_iov->iov_len,BUFFERSIZE - 1)); /* Now we need to null terminate, then record the length */ - *(echomsg->msg + MIN(uio->uio_iov->iov_len,BUFFERSIZE - 1)) = 0; - echomsg->len = MIN(uio->uio_iov->iov_len,BUFFERSIZE); + *(echomsg->msg + MIN(uio->uio_iov->iov_len,BUFFERSIZE - 1)) = 0; + echomsg->len = MIN(uio->uio_iov->iov_len,BUFFERSIZE); if (err != 0) { uprintf("Write failed: bad address!\n"); } count++; return(err); } DEV_MODULE(echo,echo_loader,NULL); Example of a Sample Echo Pseudo-Device Driver for &os; 5.X /* * Simple `echo' pseudo-device KLD * * Murray Stokely * - * Converted to 5.X by Søren (Xride) Straarup + * Converted to 5.X by Søren (Xride) Straarup */ #include <sys/types.h> #include <sys/module.h> #include <sys/systm.h> /* uprintf */ #include <sys/errno.h> #include <sys/param.h> /* defines used in kernel.h */ #include <sys/kernel.h> /* types used in module initialization */ #include <sys/conf.h> /* cdevsw struct */ #include <sys/uio.h> /* uio struct */ #include <sys/malloc.h> #define BUFFERSIZE 256 /* Function prototypes */ static d_open_t echo_open; static d_close_t echo_close; static d_read_t echo_read; static d_write_t echo_write; /* Character device entry points */ static struct cdevsw echo_cdevsw = { .d_version = D_VERSION, .d_open = echo_open, .d_close = echo_close, .d_read = echo_read, .d_write = echo_write, .d_name = "echo", }; typedef struct s_echo { char msg[BUFFERSIZE]; int len; } t_echo; /* vars */ static struct cdev *echo_dev; static int count; static t_echo *echomsg; MALLOC_DECLARE(M_ECHOBUF); MALLOC_DEFINE(M_ECHOBUF, "echobuffer", "buffer for echo module"); /* * This function is called by the kld[un]load(2) 
system calls to * determine what actions to take when a module is loaded or unloaded. */ static int echo_loader(struct module *m, int what, void *arg) { int err = 0; switch (what) { case MOD_LOAD: /* kldload */ - echo_dev = make_dev(&echo_cdevsw, + echo_dev = make_dev(&echo_cdevsw, 0, UID_ROOT, GID_WHEEL, 0600, "echo"); /* kmalloc memory for use by this driver */ echomsg = malloc(sizeof(t_echo), M_ECHOBUF, M_WAITOK); printf("Echo device loaded.\n"); break; case MOD_UNLOAD: destroy_dev(echo_dev); free(echomsg, M_ECHOBUF); printf("Echo device unloaded.\n"); break; default: err = EINVAL; break; } return(err); } static int echo_open(struct cdev *dev, int oflags, int devtype, struct thread *p) { int err = 0; uprintf("Opened device \"echo\" successfully.\n"); return(err); } static int echo_close(struct cdev *dev, int fflag, int devtype, struct thread *p) { uprintf("Closing device \"echo.\"\n"); return(0); } /* * The read function just takes the buf that was saved via * echo_write() and returns it to userland for accessing. * uio(9) */ static int echo_read(struct cdev *dev, struct uio *uio, int ioflag) { int err = 0; int amt; /* * How big is this read operation? Either as big as the user wants, * or as big as the remaining data */ amt = MIN(uio->uio_resid, (echomsg->len - uio->uio_offset > 0) ? echomsg->len - uio->uio_offset : 0); if ((err = uiomove(echomsg->msg + uio->uio_offset,amt,uio)) != 0) { uprintf("uiomove failed!\n"); } return(err); } /* * echo_write takes in a character string and saves it * to buf for later accessing. */ static int echo_write(struct cdev *dev, struct uio *uio, int ioflag) { int err = 0; /* Copy the string in from user memory to kernel memory */ err = copyin(uio->uio_iov->iov_base, echomsg->msg, MIN(uio->uio_iov->iov_len,BUFFERSIZE)); /* Now we need to null terminate, then record the length */ *(echomsg->msg + MIN(uio->uio_iov->iov_len,BUFFERSIZE)) = 0; echomsg->len = MIN(uio->uio_iov->iov_len,BUFFERSIZE); if (err != 0) { uprintf("Write failed: bad address!\n"); } count++; return(err); } DEV_MODULE(echo,echo_loader,NULL); To install this driver on &os; 4.X you will first need to make a node on your filesystem with a command such as: &prompt.root; mknod /dev/echo c 33 0 With this driver loaded you should now be able to type something like: &prompt.root; echo -n "Test Data" > /dev/echo &prompt.root; cat /dev/echo Test Data Real hardware devices are described in the next chapter. Additional Resources Dynamic Kernel Linker (KLD) Facility Programming Tutorial - Daemonnews October 2000 How to Write Kernel Drivers with NEWBUS - Daemonnews July 2000 Block Devices (Are Gone) block devices Other &unix; systems may support a second type of disk device known as block devices. Block devices are disk devices for which the kernel provides caching. This caching makes block-devices almost unusable, or at least dangerously unreliable. The caching will reorder the sequence of write operations, depriving the application of the ability to know the exact disk contents at any one instant in time. This makes predictable and reliable crash recovery of on-disk data structures (filesystems, databases etc.) impossible. Since writes may be delayed, there is no way the kernel can report to the application which particular write operation encountered a write error, this further compounds the consistency problem. 
For this reason, no serious applications rely on block devices, and in fact, almost all applications which access disks directly take great pains to specify that character (or raw) devices should always be used. Because the implementation of the aliasing of each disk (partition) to two devices with different semantics significantly complicated the relevant kernel code, &os; dropped support for cached disk devices as part of the modernization of the disk I/O infrastructure. Network Drivers network devices Drivers for network devices do not use device nodes in order to be accessed. Their selection is based on other decisions made inside the kernel, and instead of calling open(), use of a network device is generally introduced by the socket(2) system call. For more information see ifnet(9), the source of the loopback device, and Bill Paul's network drivers. diff --git a/en_US.ISO8859-1/books/arch-handbook/isa/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/isa/chapter.sgml index 71dd7b3bd3..db94c8a095 100644 --- a/en_US.ISO8859-1/books/arch-handbook/isa/chapter.sgml +++ b/en_US.ISO8859-1/books/arch-handbook/isa/chapter.sgml @@ -1,2528 +1,2528 @@ Sergey Babkin Written by Murray Stokely Modifications for Handbook made by Valentino Vaschetto Wylie Stilwell ISA device drivers Synopsis ISA device driver ISA This chapter introduces the issues relevant to writing a driver for an ISA device. The pseudo-code presented here is rather detailed and reminiscent of the real code, but is still only pseudo-code. It avoids the details irrelevant to the subject of the discussion. Real-life examples can be found in the source code of real drivers. In particular, the drivers ep and aha are good sources of information. Basic information A typical ISA driver would need the following include files: #include <sys/module.h> #include <sys/bus.h> #include <machine/bus.h> #include <machine/resource.h> #include <sys/rman.h> #include <isa/isavar.h> #include <isa/pnpvar.h> They describe the things specific to the ISA and generic bus subsystem. object-oriented The bus subsystem is implemented in an object-oriented fashion; its main structures are accessed by associated method functions. bus methods The list of bus methods implemented by an ISA driver is like that for any other bus. For a hypothetical driver named xxx they would be: static void xxx_isa_identify (driver_t *, device_t); Normally used for bus drivers, not device drivers. But for ISA devices this method may have special use: if the device provides some device-specific (non-PnP) way to auto-detect devices, this routine may implement it. static int xxx_isa_probe (device_t dev); Probe for a device at a known (or PnP) location. This routine can also accommodate device-specific auto-detection of parameters for partially configured devices. static int xxx_isa_attach (device_t dev); Attach and initialize the device. static int xxx_isa_detach (device_t dev); Detach the device before unloading the driver module. static int xxx_isa_shutdown (device_t dev); Execute shutdown of the device before system shutdown. static int xxx_isa_suspend (device_t dev); Suspend the device before the system goes to the power-save state. May also abort the transition to the power-save state. static int xxx_isa_resume (device_t dev); Resume the device activity after return from the power-save state. xxx_isa_probe() and xxx_isa_attach() are mandatory; the rest of the routines are optional, depending on the device's needs. The driver is linked to the system with the following set of descriptions.
/* table of supported bus methods */ static device_method_t xxx_isa_methods[] = { /* list all the bus method functions supported by the driver */ /* omit the unsupported methods */ DEVMETHOD(device_identify, xxx_isa_identify), DEVMETHOD(device_probe, xxx_isa_probe), DEVMETHOD(device_attach, xxx_isa_attach), DEVMETHOD(device_detach, xxx_isa_detach), DEVMETHOD(device_shutdown, xxx_isa_shutdown), DEVMETHOD(device_suspend, xxx_isa_suspend), DEVMETHOD(device_resume, xxx_isa_resume), { 0, 0 } }; static driver_t xxx_isa_driver = { "xxx", xxx_isa_methods, sizeof(struct xxx_softc), }; static devclass_t xxx_devclass; DRIVER_MODULE(xxx, isa, xxx_isa_driver, xxx_devclass, load_function, load_argument); softc Here struct xxx_softc is a device-specific structure that contains private driver data and descriptors for the driver's resources. The bus code automatically allocates one softc descriptor per device as needed. kernel module If the driver is implemented as a loadable module then load_function() is called to do driver-specific initialization or clean-up when the driver is loaded or unloaded and load_argument is passed as one of its arguments. If the driver does not support dynamic loading (in other words it must always be linked into the kernel) then these values should be set to 0 and the last definition would look like: DRIVER_MODULE(xxx, isa, xxx_isa_driver, xxx_devclass, 0, 0); PnP If the driver is for a device which supports PnP then a table of supported PnP IDs must be defined. The table consists of a list of PnP IDs supported by this driver and human-readable descriptions of the hardware types and models having these IDs. It looks like: static struct isa_pnp_id xxx_pnp_ids[] = { /* a line for each supported PnP ID */ { 0x12345678, "Our device model 1234A" }, { 0x12345679, "Our device model 1234B" }, { 0, NULL }, /* end of table */ }; If the driver does not support PnP devices it still needs an empty PnP ID table, like: static struct isa_pnp_id xxx_pnp_ids[] = { { 0, NULL }, /* end of table */ }; Device_t pointer Device_t is the pointer type for the device structure. Here we consider only the methods interesting from the device driver writer's standpoint. The methods to manipulate values in the device structure are: device_t device_get_parent(dev) Get the parent bus of a device. driver_t device_get_driver(dev) Get pointer to its driver structure. char *device_get_name(dev) Get the driver name, such as "xxx" for our example. int device_get_unit(dev) Get the unit number (units are numbered from 0 for the devices associated with each driver). char *device_get_nameunit(dev) Get the device name including the unit number, such as xxx0, xxx1 and so on. char *device_get_desc(dev) Get the device description. Normally it describes the exact model of device in human-readable form. device_set_desc(dev, desc) Set the description. This makes the device description point to the string desc which may not be deallocated or changed after that. device_set_desc_copy(dev, desc) Set the description. The description is copied into an internal dynamically allocated buffer, so the string desc may be changed afterwards without adverse effects. void *device_get_softc(dev) Get pointer to the device descriptor (struct xxx_softc) associated with this device. u_int32_t device_get_flags(dev) Get the flags specified for the device in the configuration file. A convenience function device_printf(dev, fmt, ...) may be used to print the messages from the device driver. 
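As an illustration of how these accessors are typically combined, here is a minimal sketch of an attach routine for the hypothetical xxx driver; the softc contents and the description string are invented for the example:

static int
xxx_isa_attach(device_t dev)
{
	/* Per-device private data, allocated by the bus code. */
	struct xxx_softc *sc = device_get_softc(dev);

	sc->dev = dev;		/* assumed field in our softc */
	device_set_desc(dev, "Our device model 1234A");

	/* Prints a message like "xxx0: flags 0x1". */
	device_printf(dev, "flags 0x%x\n", device_get_flags(dev));
	return (0);
}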
device_printf() automatically prepends the unit name and a colon to the message. The device_t methods are implemented in the file kern/subr_bus.c. Configuration file and the order of identifying and probing during auto-configuration ISA probing The ISA devices are described in the kernel configuration file like: device xxx0 at isa? port 0x300 irq 10 drq 5 iomem 0xd0000 flags 0x1 sensitive IRQ The values of port, IRQ and so on are converted to the resource values associated with the device. They are optional, depending on the device's needs and abilities for auto-configuration. For example, some devices do not need DRQ at all and some allow the driver to read the IRQ setting from the device configuration ports. If a machine has multiple ISA buses the exact bus may be specified in the configuration line, like isa0 or isa1; otherwise the device would be searched for on all the ISA buses. sensitive is a resource requesting that this device must be probed before all non-sensitive devices. It is supported but does not seem to be used in any current driver. For legacy ISA devices in many cases the drivers are still able to detect the configuration parameters. But each device to be configured in the system must have a config line. If two devices of some type are installed in the system but there is only one configuration line for the corresponding driver, i.e.: device xxx0 at isa? then only one device will be configured. But for the devices supporting automatic identification by means of Plug-n-Play or some proprietary protocol, one configuration line is enough to configure all the devices in the system, like the one above or just simply: device xxx at isa? If a driver supports both auto-identified and legacy devices and both kinds are installed at once in one machine then it is enough to describe in the config file the legacy devices only. The auto-identified devices will be added automatically. When an ISA bus is auto-configured the events happen as follows: All the drivers' identify routines (including the PnP identify routine which identifies all the PnP devices) are called in random order. As they identify the devices they add them to the list on the ISA bus. Normally the drivers' identify routines associate their drivers with the new devices. The PnP identify routine does not know about the other drivers yet, so it does not associate any with the new devices it adds. The PnP devices are put to sleep using the PnP protocol to prevent them from being probed as legacy devices. The probe routines of non-PnP devices marked as sensitive are called. If the probe for a device succeeds, the attach routine is called for it. The probe and attach routines of all other non-PnP devices are called likewise. The PnP devices are brought back from the sleep state and assigned the resources they request: I/O and memory address ranges, IRQs and DRQs, all of them not conflicting with the attached legacy devices. Then for each PnP device the probe routines of all the present ISA drivers are called. The first one that claims the device gets attached. It is possible that multiple drivers would claim the device with different priority; in this case, the highest-priority driver wins. The probe routines must call ISA_PNP_PROBE() to compare the actual PnP ID with the list of the IDs supported by the driver, and if the ID is not in the table, return failure (a usage sketch follows).
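A probe routine observing this rule might start out as in the sketch below. The return-value handling reflects the usual convention of the ISA PnP code (0 for a matching PnP ID, ENXIO for a PnP device that is not ours, ENOENT for a device with no PnP ID at all); the legacy-detection part is left as a placeholder:

static int
xxx_isa_probe(device_t dev)
{
	int error;

	/* Compare the device's PnP ID against our table. */
	error = ISA_PNP_PROBE(device_get_parent(dev), dev, xxx_pnp_ids);
	if (error == 0)
		return (0);	/* one of our PnP IDs matched */
	if (error == ENXIO)
		return (ENXIO);	/* a PnP device, but not one of ours */

	/* error == ENOENT: no PnP ID, try legacy auto-detection here */
	...
}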
This requirement means that absolutely every driver, even those not supporting any PnP devices, must call ISA_PNP_PROBE(), at least with an empty PnP ID table, to return failure on unknown PnP devices. The probe routine returns a positive value (the error code) on error, and zero or a negative value on success. The negative return values are used when a PnP device supports multiple interfaces, for example, an older compatibility interface and a newer advanced interface which are supported by different drivers. Then both drivers would detect the device. The driver which returns a higher value in the probe routine takes precedence (in other words, the driver returning 0 has the highest precedence, the one returning -1 is next, the one returning -2 is after it, and so on). As a result, devices which support only the old interface will be handled by the old driver (which should return -1 from the probe routine) while devices supporting the new interface as well will be handled by the new driver (which should return 0 from the probe routine). If multiple drivers return the same value then the one called first wins. So if a driver returns value 0 it may be sure that it won the priority arbitration. The device-specific identify routines can also assign not a driver but a class of drivers to the device. Then all the drivers in the class are probed for this device, as in the case with PnP. This feature is not implemented in any existing driver and is not considered further in this document. Because the PnP devices are disabled when probing the legacy devices they will not be attached twice (once as legacy and once as PnP). But in the case of device-dependent identify routines it is the responsibility of the driver to make sure that the same device will not be attached by the driver twice: once as legacy user-configured and once as auto-identified. Another practical consequence for the auto-identified devices (both PnP and device-specific) is that the flags can not be passed to them from the kernel configuration file. So they must either not use the flags at all, or use the flags from device unit 0 for all the auto-identified devices, or use the sysctl interface instead of flags. Other unusual configurations may be accommodated by accessing the configuration resources directly with functions of the families resource_query_*() and resource_*_value(). Their implementations are located in kern/subr_bus.c. The old IDE disk driver i386/isa/wd.c contains examples of such use. But the standard means of configuration must always be preferred. Leave parsing the configuration resources to the bus configuration code. Resources resources device driver resources The information that a user enters into the kernel configuration file is processed and passed to the kernel as configuration resources. This information is parsed by the bus configuration code and transformed into a value of structure device_t and the bus resources associated with it. The drivers may access the configuration resources directly using the resource_* functions for more complex cases of configuration. However, generally this is neither needed nor recommended, so this issue is not discussed further here. The bus resources are associated with each device. They are identified by type and number within the type.
For the ISA bus the following types are defined:

SYS_RES_IRQ - interrupt number
SYS_RES_DRQ - ISA DMA channel number
SYS_RES_MEMORY - range of device memory mapped into the system memory space
SYS_RES_IOPORT - range of device I/O registers

The enumeration within types starts from 0, so if a device has two memory regions it would have resources of type SYS_RES_MEMORY numbered 0 and 1. The resource type has nothing to do with the C language type; all the resource values have the C type unsigned long and must be cast as necessary. The resource numbers do not have to be contiguous, although for ISA they normally would be. The permitted resource numbers for ISA devices are:

IRQ: 0-1
DRQ: 0-1
MEMORY: 0-3
IOPORT: 0-7

All the resources are represented as ranges, with a start value and count. For IRQ and DRQ resources the count would normally be equal to 1. The values for memory refer to the physical addresses. Three types of activities can be performed on resources:

set/get
allocate/release
activate/deactivate

Setting sets the range used by the resource. Allocation reserves the requested range so that no other driver would be able to reserve it (and checks that no other driver has reserved this range already). Activation makes the resource accessible to the driver by doing whatever is necessary for that (for example, for memory it would be mapping into the kernel virtual address space). The functions to manipulate resources are:

int bus_set_resource(device_t dev, int type, int rid, u_long start, u_long count)

Set a range for a resource. Returns 0 if successful, an error code otherwise. Normally, this function will return an error only if one of type, rid, start or count has a value that falls outside the permitted range.

dev - the driver's device
type - type of resource, SYS_RES_*
rid - resource number (ID) within type
start, count - resource range

int bus_get_resource(device_t dev, int type, int rid, u_long *startp, u_long *countp)

Get the range of a resource. Returns 0 if successful, an error code if the resource is not defined yet.

u_long bus_get_resource_start(device_t dev, int type, int rid) u_long bus_get_resource_count(device_t dev, int type, int rid)

Convenience functions to get only the start or count. They return 0 in case of error, so if the resource start had 0 among its legitimate values it would be impossible to tell if the value is 0 or an error occurred. Luckily, no ISA resources for add-on drivers may have a start value equal to 0.

void bus_delete_resource(device_t dev, int type, int rid)

Delete a resource, making it undefined.

struct resource * bus_alloc_resource(device_t dev, int type, int *rid, u_long start, u_long end, u_long count, u_int flags)

Allocate a resource as a range of count values not allocated by anyone else, somewhere between start and end. Alas, alignment is not supported. If the resource was not set yet it is automatically created. The special values of start 0 and end ~0 (all ones) mean that the fixed values previously set by bus_set_resource() must be used instead: start and count as themselves and end=(start+count); in this case, if the resource was not defined before, an error is returned. Although rid is passed by reference it is not set anywhere by the resource allocation code of the ISA bus. (The other buses may use a different approach and modify it.) Flags are a bitmap; the flags interesting for the caller are:

RF_ACTIVE - causes the resource to be automatically activated after allocation.
RF_SHAREABLE - resource may be shared at the same time by multiple drivers.

RF_TIMESHARE - resource may be time-shared by multiple drivers, i.e. allocated at the same time by many but activated only by one at any given moment.

Returns NULL on error. The allocated values may be obtained from the returned handle using the rman_get_*() methods.

int bus_release_resource(device_t dev, int type, int rid, struct resource *r)

Release the resource; r is the handle returned by bus_alloc_resource(). Returns 0 on success, an error code otherwise.

int bus_activate_resource(device_t dev, int type, int rid, struct resource *r) int bus_deactivate_resource(device_t dev, int type, int rid, struct resource *r)

Activate or deactivate a resource. Return 0 on success, an error code otherwise. If the resource is time-shared and currently activated by another driver then EBUSY is returned.

int bus_setup_intr(device_t dev, struct resource *r, int flags, driver_intr_t *handler, void *arg, void **cookiep) int bus_teardown_intr(device_t dev, struct resource *r, void *cookie)

Associate or de-associate the interrupt handler with a device. Return 0 on success, an error code otherwise.

r - the activated resource handle describing the IRQ

flags - the interrupt priority level, one of:

INTR_TYPE_TTY - terminals and other similar character-type devices. To mask them use spltty().
(INTR_TYPE_TTY | INTR_TYPE_FAST) - terminal-type devices with a small input buffer, where data loss on input is critical (such as the old-fashioned serial ports). To mask them use spltty().
INTR_TYPE_BIO - block-type devices, except those on the CAM controllers. To mask them use splbio().
INTR_TYPE_CAM - CAM (Common Access Method) bus controllers. To mask them use splcam().
INTR_TYPE_NET - network interface controllers. To mask them use splimp().
INTR_TYPE_MISC - miscellaneous devices. There is no other way to mask them than by splhigh() which masks all interrupts.

When an interrupt handler executes, all the other interrupts matching its priority level will be masked. The only exception is the MISC level for which no other interrupts are masked and which is not masked by any other interrupt.

handler - pointer to the handler function; the type driver_intr_t is defined as void driver_intr_t(void *)

arg - the argument passed to the handler to identify this particular device. It is cast from void* to any real type by the handler. The old convention for the ISA interrupt handlers was to use the unit number as the argument, the new (recommended) convention is using a pointer to the device softc structure.

cookie[p] - the value received from setup() is used to identify the handler when passed to teardown().

A number of methods are defined to operate on the resource handles (struct resource *). Those of interest to the device driver writers are:

u_long rman_get_start(r) u_long rman_get_end(r) Get the start and end of the allocated resource range.

void *rman_get_virtual(r) Get the virtual address of an activated memory resource.

Bus memory mapping

In many cases data is exchanged between the driver and the device through the memory. Two variants are possible: (a) memory is located on the device card (b) memory is the main memory of the computer. In case (a) the driver always copies the data back and forth between the on-card memory and the main memory as necessary. To map the on-card memory into the kernel virtual address space the physical address and length of the on-card memory must be defined as a SYS_RES_MEMORY resource.
That resource can then be allocated and activated, and its virtual address obtained using rman_get_virtual(). The older drivers used the function pmap_mapdev() for this purpose, which should not be used directly any more. Now it is one of the internal steps of resource activation. Most of the ISA cards will have their memory configured for a physical location somewhere in the range 640KB-1MB. Some of the ISA cards require larger memory ranges which should be placed somewhere under 16MB (because of the 24-bit address limitation on the ISA bus). In that case, if the machine has more memory than the start address of the device memory (in other words, they overlap), a memory hole must be configured at the address range used by devices. Many BIOSes allow configuration of a memory hole of 1MB starting at 14MB or 15MB. FreeBSD can handle the memory holes properly if the BIOS reports them properly (this feature may be broken on old BIOSes). In case (b) just the address of the data is sent to the device, and the device uses DMA to actually access the data in the main memory. Two limitations are present: First, ISA cards can only access memory below 16MB. Second, the contiguous pages in virtual address space may not be contiguous in physical address space, so the device may have to do scatter/gather operations. The bus subsystem provides ready solutions for some of these problems, the rest has to be done by the drivers themselves. Two structures are used for DMA memory allocation, bus_dma_tag_t and bus_dmamap_t. The tag describes the properties required for the DMA memory. The map represents a memory block allocated according to these properties. Multiple maps may be associated with the same tag. Tags are organized into a tree-like hierarchy with inheritance of the properties. A child tag inherits all the requirements of its parent tag, and may make them stricter but never looser. Normally one top-level tag (with no parent) is created for each device unit. If multiple memory areas with different requirements are needed for each device then a tag for each of them may be created as a child of the parent tag. The tags can be used to create a map in two ways. First, a chunk of contiguous memory conformant with the tag requirements may be allocated (and later may be freed). This is normally used to allocate relatively long-lived areas of memory for communication with the device. Loading of such memory into a map is trivial: it is always considered as one chunk in the appropriate physical memory range. Second, an arbitrary area of virtual memory may be loaded into a map. Each page of this memory will be checked for conformance to the map requirement. If it conforms then it is left at its original location. If not, a fresh conformant bounce page is allocated and used as intermediate storage. When writing, the data from the non-conformant original pages is copied to the bounce pages first and then transferred from the bounce pages to the device. When reading, the data goes from the device to the bounce pages and is then copied to the non-conformant original pages. The process of copying between the original and bounce pages is called synchronization. This is normally used on a per-transfer basis: the buffer for each transfer is loaded, the transfer done, and the buffer unloaded.
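As a minimal sketch of that per-transfer pattern, a single write transfer might look like the fragment below. All the names here (xfer_tag, xfer_map, xxx_transfer_callback) are hypothetical, and the functions themselves are described in detail in the reference that follows:

/* Hypothetical per-transfer flow for a write; every name here is
 * illustrative only. The load may use bounce pages internally. */
error = bus_dmamap_load(sc->xfer_tag, sc->xfer_map, buf, buflen,
    xxx_transfer_callback, sc, /*flags*/ 0);
/* ...the callback receives the physical segments and programs
 * them into the device... */
bus_dmamap_sync(sc->xfer_tag, sc->xfer_map, BUS_DMASYNC_PREWRITE);
/* ...start the transfer on the device and wait for it to finish... */
bus_dmamap_sync(sc->xfer_tag, sc->xfer_map, BUS_DMASYNC_POSTWRITE);
bus_dmamap_unload(sc->xfer_tag, sc->xfer_map);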
The functions working on the DMA memory are:

int bus_dma_tag_create(bus_dma_tag_t parent, bus_size_t alignment, bus_size_t boundary, bus_addr_t lowaddr, bus_addr_t highaddr, bus_dma_filter_t *filter, void *filterarg, bus_size_t maxsize, int nsegments, bus_size_t maxsegsz, int flags, bus_dma_tag_t *dmat)

Create a new tag. Returns 0 on success, the error code otherwise.

parent - parent tag, or NULL to create a top-level tag.

alignment - required physical alignment of the memory area to be allocated for this tag. Use value 1 for no specific alignment. Applies only to the future bus_dmamem_alloc() but not bus_dmamap_create() calls.

boundary - physical address boundary that must not be crossed when allocating the memory. Use value 0 for no boundary. Applies only to the future bus_dmamem_alloc() but not bus_dmamap_create() calls. Must be a power of 2. If the memory is planned to be used in non-cascaded DMA mode (i.e. the DMA addresses will be supplied not by the device itself but by the ISA DMA controller) then the boundary must be no larger than 64KB (64*1024) due to the limitations of the DMA hardware.

lowaddr, highaddr - the names are slightly misleading; these values are used to limit the permitted range of physical addresses used to allocate the memory. The exact meaning varies depending on the planned future use:

For bus_dmamem_alloc() all the addresses from 0 to lowaddr-1 are considered permitted, the higher ones are forbidden.

For bus_dmamap_create() all the addresses outside the inclusive range [lowaddr; highaddr] are considered accessible. The addresses of pages inside the range are passed to the filter function which decides if they are accessible. If no filter function is supplied then the whole range is considered inaccessible.

For the ISA devices the normal values (with no filter function) are: lowaddr = BUS_SPACE_MAXADDR_24BIT highaddr = BUS_SPACE_MAXADDR

filter, filterarg - the filter function and its argument. If NULL is passed for filter then the whole range [lowaddr, highaddr] is considered inaccessible when doing bus_dmamap_create(). Otherwise the physical address of each attempted page in the range [lowaddr; highaddr] is passed to the filter function which decides if it is accessible. The prototype of the filter function is: int filterfunc(void *arg, bus_addr_t paddr). It must return 0 if the page is accessible, non-zero otherwise.

maxsize - the maximal size of memory (in bytes) that may be allocated through this tag. In case it is difficult to estimate or could be arbitrarily big, the value for ISA devices would be BUS_SPACE_MAXSIZE_24BIT.

nsegments - maximal number of scatter-gather segments supported by the device. If unrestricted then the value BUS_SPACE_UNRESTRICTED should be used. This value is recommended for the parent tags; the actual restrictions would then be specified for the descendant tags. Tags with nsegments equal to BUS_SPACE_UNRESTRICTED may not be used to actually load maps, they may be used only as parent tags. The practical limit for nsegments seems to be about 250-300, higher values will cause kernel stack overflow (the hardware cannot normally support that many scatter-gather buffers anyway).

maxsegsz - maximal size of a scatter-gather segment supported by the device. The maximal value for an ISA device would be BUS_SPACE_MAXSIZE_24BIT.

flags - a bitmap of flags. The only interesting flags are:

BUS_DMA_ALLOCNOW - requests to allocate all the potentially needed bounce pages when creating the tag.

BUS_DMA_ISA - mysterious flag used only on Alpha machines.
It is not defined for the i386 machines. Probably it should be used by all the ISA drivers for Alpha machines but it looks like there are no such drivers yet.

dmat - pointer to the storage for the new tag to be returned.

int bus_dma_tag_destroy(bus_dma_tag_t dmat)

Destroy a tag. Returns 0 on success, the error code otherwise.

dmat - the tag to be destroyed.

int bus_dmamem_alloc(bus_dma_tag_t dmat, void** vaddr, int flags, bus_dmamap_t *mapp)

Allocate an area of contiguous memory described by the tag. The size of memory to be allocated is the tag's maxsize. Returns 0 on success, the error code otherwise. The result still has to be loaded by bus_dmamap_load() before being used to get the physical address of the memory.

dmat - the tag

vaddr - pointer to the storage for the kernel virtual address of the allocated area to be returned.

flags - a bitmap of flags. The only interesting flag is:

BUS_DMA_NOWAIT - if the memory is not immediately available return an error. If this flag is not set then the routine is allowed to sleep until the memory becomes available.

mapp - pointer to the storage for the new map to be returned.

void bus_dmamem_free(bus_dma_tag_t dmat, void *vaddr, bus_dmamap_t map)

Free the memory allocated by bus_dmamem_alloc(). At present, freeing of the memory allocated with ISA restrictions is not implemented. Because of this the recommended model of use is to keep and re-use the allocated areas for as long as possible. Do not lightly free some area and then shortly afterwards allocate it again. That does not mean that bus_dmamem_free() should not be used at all: hopefully it will be properly implemented soon.

dmat - the tag
vaddr - the kernel virtual address of the memory
map - the map of the memory (as returned from bus_dmamem_alloc())

int bus_dmamap_create(bus_dma_tag_t dmat, int flags, bus_dmamap_t *mapp)

Create a map for the tag, to be used in bus_dmamap_load() later. Returns 0 on success, the error code otherwise.

dmat - the tag
flags - theoretically, a bitmap of flags. But no flags are defined yet, so at present it will always be 0.
mapp - pointer to the storage for the new map to be returned

int bus_dmamap_destroy(bus_dma_tag_t dmat, bus_dmamap_t map)

Destroy a map. Returns 0 on success, the error code otherwise.

dmat - the tag to which the map is associated
map - the map to be destroyed

int bus_dmamap_load(bus_dma_tag_t dmat, bus_dmamap_t map, void *buf, bus_size_t buflen, bus_dmamap_callback_t *callback, void *callback_arg, int flags)

Load a buffer into the map (the map must be previously created by bus_dmamap_create() or bus_dmamem_alloc()). All the pages of the buffer are checked for conformance to the tag requirements and bounce pages are allocated for those that do not conform. An array of physical segment descriptors is built and passed to the callback routine. This callback routine is then expected to handle it in some way. The number of bounce buffers in the system is limited, so if the bounce buffers are needed but not immediately available the request will be queued and the callback will be called when the bounce buffers become available. Returns 0 if the callback was executed immediately or EINPROGRESS if the request was queued for future execution. In the latter case the synchronization with the queued callback routine is the responsibility of the driver.
dmat - the tag
map - the map
buf - kernel virtual address of the buffer
buflen - length of the buffer
callback, callback_arg - the callback function and its argument

The prototype of the callback function is:

void callback(void *arg, bus_dma_segment_t *seg, int nseg, int error)

arg - the same as callback_arg passed to bus_dmamap_load()
seg - array of the segment descriptors
nseg - number of descriptors in the array
error - indication of the segment number overflow: if it is set to EFBIG then the buffer did not fit into the maximal number of segments permitted by the tag. In this case only the permitted number of descriptors will be in the array. Handling of this situation is up to the driver: depending on the desired semantics it can either consider this an error or split the buffer in two and handle the second part separately.

Each entry in the segments array contains the fields:

ds_addr - physical bus address of the segment
ds_len - length of the segment

void bus_dmamap_unload(bus_dma_tag_t dmat, bus_dmamap_t map)

Unload the map.

dmat - tag
map - loaded map

void bus_dmamap_sync(bus_dma_tag_t dmat, bus_dmamap_t map, bus_dmasync_op_t op)

Synchronize a loaded buffer with its bounce pages before and after a physical transfer to or from the device. This is the function that does all the necessary copying of data between the original buffer and its mapped version. The buffers must be synchronized both before and after doing the transfer.

dmat - tag
map - loaded map
op - type of synchronization operation to perform:

BUS_DMASYNC_PREREAD - before reading from device into buffer
BUS_DMASYNC_POSTREAD - after reading from device into buffer
BUS_DMASYNC_PREWRITE - before writing the buffer to device
BUS_DMASYNC_POSTWRITE - after writing the buffer to device

As of now PREREAD and POSTWRITE are null operations but that may change in the future, so they must not be ignored in the driver. Synchronization is not needed for the memory obtained from bus_dmamem_alloc(). Before calling the callback function from bus_dmamap_load() the segment array is stored on the stack, pre-allocated for the maximal number of segments allowed by the tag. Because of this the practical limit for the number of segments on the i386 architecture is about 250-300 (the kernel stack is 4KB minus the size of the user structure, the size of a segment array entry is 8 bytes, and some space must be left). Because the array is allocated based on the maximal number this value must not be set higher than really needed. Fortunately, for most hardware the maximal supported number of segments is much lower. But if the driver wants to handle buffers with a very large number of scatter-gather segments it should do that in portions: load part of the buffer, transfer it to the device, load the next part of the buffer, and so on. Another practical consequence is that the number of segments may limit the size of the buffer. If all the pages in the buffer happen to be physically non-contiguous then the maximal supported buffer size for that fragmented case would be (nsegments * page_size). For example, if a maximal number of 10 segments is supported then on i386 the maximal guaranteed supported buffer size would be 40K. If a higher size is desired then special tricks should be used in the driver.
If the hardware does not support scatter-gather at all or the driver wants to support some buffer size even if it is heavily fragmented then the solution is to allocate a contiguous buffer in the driver and use it as intermediate storage if the original buffer does not fit. Below are the typical call sequences when using a map, depending on the use of the map. The characters -> are used to show the flow of time. For a buffer which stays practically fixed during all the time between attachment and detachment of a device:

bus_dmamem_alloc -> bus_dmamap_load -> ...use buffer... ->
-> bus_dmamap_unload -> bus_dmamem_free

For a buffer that changes frequently and is passed from outside the driver:

bus_dmamap_create ->
-> bus_dmamap_load -> bus_dmamap_sync(PRE...) -> do transfer ->
-> bus_dmamap_sync(POST...) -> bus_dmamap_unload ->
...
-> bus_dmamap_load -> bus_dmamap_sync(PRE...) -> do transfer ->
-> bus_dmamap_sync(POST...) -> bus_dmamap_unload ->
-> bus_dmamap_destroy

When loading a map created by bus_dmamem_alloc() the passed address and size of the buffer must be the same as used in bus_dmamem_alloc(). In this case it is guaranteed that the whole buffer will be mapped as one segment (so the callback may be based on this assumption) and the request will be executed immediately (EINPROGRESS will never be returned). All the callback needs to do in this case is to save the physical address. A typical example would be:

static void
alloc_callback(void *arg, bus_dma_segment_t *seg, int nseg, int error)
{
  *(bus_addr_t *)arg = seg[0].ds_addr;
}
...
int error;
struct somedata {
  ....
};
struct somedata *vsomedata; /* virtual address */
bus_addr_t psomedata; /* physical bus-relative address */
bus_dma_tag_t tag_somedata;
bus_dmamap_t map_somedata;
...

error=bus_dma_tag_create(parent_tag, alignment,
  boundary, lowaddr, highaddr, /*filter*/ NULL, /*filterarg*/ NULL,
  /*maxsize*/ sizeof(struct somedata), /*nsegments*/ 1,
  /*maxsegsz*/ sizeof(struct somedata), /*flags*/ 0,
  &tag_somedata);
if(error)
  return error;

error = bus_dmamem_alloc(tag_somedata, (void **)&vsomedata, /* flags*/ 0,
  &map_somedata);
if(error)
  return error;

bus_dmamap_load(tag_somedata, map_somedata, (void *)vsomedata,
  sizeof (struct somedata), alloc_callback, (void *) &psomedata, /*flags*/0);

Looks a bit long and complicated but that is the way to do it. The practical consequence is: if multiple memory areas are always allocated together it would be a really good idea to combine them all into one structure and allocate them as one (if the alignment and boundary limitations permit). When loading an arbitrary buffer into the map created by bus_dmamap_create() special measures must be taken to synchronize with the callback in case it is delayed. The code would look like:

{
  int s;
  int error;

  s = splsoftvm();
  error = bus_dmamap_load(
    dmat, dmamap,
    buffer_ptr, buffer_len,
    callback, /*callback_arg*/ buffer_descriptor,
    /*flags*/0);
  if (error == EINPROGRESS) {
    /*
     * Do whatever is needed to ensure synchronization
     * with callback. Callback is guaranteed not to be started
     * until we do splx() or tsleep().
     */
  }
  splx(s);
}

Two possible approaches for the processing of requests are:

1. If requests are completed by marking them explicitly as done (such as the CAM requests) then it would be simpler to put all the further processing into the callback routine which would mark the request when it is done. Then not much extra synchronization is needed. For flow control reasons it may be a good idea to freeze the request queue until this request gets completed.

2. If requests are completed when the function returns (such as classic read or write requests on character devices) then a synchronization flag should be set in the buffer descriptor and tsleep() called. Later when the callback gets called it will do its processing and check this synchronization flag. If it is set then the callback should issue a wakeup. In this approach the callback function could either do all the needed processing (just like the previous case) or simply save the segments array in the buffer descriptor. Then after the callback completes the calling function could use this saved segments array and do all the processing.

DMA Direct Memory Access (DMA)

Direct Memory Access (DMA) is implemented in the ISA bus through the DMA controller (actually, two of them but that is an irrelevant detail). To make the early ISA devices simple and cheap the logic of the bus control and address generation was concentrated in the DMA controller. Fortunately, FreeBSD provides a set of functions that mostly hide the annoying details of the DMA controller from the device drivers. The simplest case is for the fairly intelligent devices. Like the bus master devices on PCI they can generate the bus cycles and memory addresses all by themselves. The only thing they really need from the DMA controller is bus arbitration. So for this purpose they pretend to be cascaded slave DMA controllers. And the only thing needed from the system DMA controller is to enable the cascaded mode on a DMA channel by calling the following function when attaching the driver:

void isa_dmacascade(int channel_number)

All the further activity is done by programming the device. When detaching the driver no DMA-related functions need to be called. For the simpler devices things get more complicated. The functions used are:

int isa_dma_acquire(int channel_number)

Reserve a DMA channel. Returns 0 on success or EBUSY if the channel was already reserved by this or a different driver. Most of the ISA devices are not able to share DMA channels anyway, so normally this function is called when attaching a device. This reservation was made redundant by the modern interface of bus resources but still must be used in addition to the latter. If it is not used then the other DMA routines will panic later.

int isa_dma_release(int channel_number)

Release a previously reserved DMA channel. No transfers must be in progress when the channel is released (in addition the device must not try to initiate a transfer after the channel is released).

void isa_dmainit(int chan, u_int bouncebufsize)

Allocate a bounce buffer for use with the specified channel. The requested size of the buffer cannot exceed 64KB. This bounce buffer will be automatically used later if a transfer buffer happens to be not physically contiguous, outside of the memory accessible by the ISA bus, or crossing the 64KB boundary. If the transfers will always be done from buffers which conform to these conditions (such as those allocated by bus_dmamem_alloc() with proper limitations) then isa_dmainit() does not have to be called.
But it is quite convenient to transfer arbitrary data using the DMA controller. The bounce buffer will automatically take care of the scatter-gather issues.

chan - channel number
bouncebufsize - size of the bounce buffer in bytes

void isa_dmastart(int flags, caddr_t addr, u_int nbytes, int chan)

Prepare to start a DMA transfer. This function must be called to set up the DMA controller before actually starting the transfer on the device. It checks that the buffer is contiguous and falls into the ISA memory range; if not, the bounce buffer is automatically used. If a bounce buffer is required but was not set up by isa_dmainit(), or is too small for the requested transfer size, the system will panic. In case of a write request with a bounce buffer the data will be automatically copied to the bounce buffer.

flags - a bitmask determining the type of operation to be done. The direction bits B_READ and B_WRITE are mutually exclusive.

B_READ - read from the ISA bus into memory
B_WRITE - write from the memory to the ISA bus
B_RAW - if set then the DMA controller will remember the buffer and after the end of transfer will automatically re-initialize itself to repeat transfer of the same buffer again (of course, the driver may change the data in the buffer before initiating another transfer in the device). If not set then the parameters will work only for one transfer, and isa_dmastart() will have to be called again before initiating the next transfer. Using B_RAW makes sense only if the bounce buffer is not used.

addr - virtual address of the buffer

nbytes - length of the buffer. Must be less than or equal to 64KB. Length of 0 is not allowed: the DMA controller will understand it as 64KB while the kernel code will understand it as 0 and that would cause unpredictable effects. For channel numbers 4 and higher the length must be even because these channels transfer 2 bytes at a time. In case of an odd length the last byte will not be transferred.

chan - channel number

void isa_dmadone(int flags, caddr_t addr, int nbytes, int chan)

Synchronize the memory after the device reports that the transfer is done. If that was a read operation with a bounce buffer then the data will be copied from the bounce buffer to the original buffer. Arguments are the same as for isa_dmastart(). Flag B_RAW is permitted but it does not affect isa_dmadone() in any way.

int isa_dmastatus(int channel_number)

Returns the number of bytes left in the current transfer to be transferred. In case the flag B_RAW was set in isa_dmastart() the number returned will never be equal to zero: at the end of the transfer it will be automatically reset back to the length of the buffer. The normal use is to check the number of bytes left after the device signals that the transfer is completed. If the number of bytes is not 0 then something probably went wrong with that transfer.

int isa_dmastop(int channel_number)

Aborts the current transfer and returns the number of bytes left untransferred.

xxx_isa_probe

This function probes if a device is present. If the driver supports auto-detection of some part of the device configuration (such as the interrupt vector or memory address) this auto-detection must be done in this routine. As for any other bus, if the device cannot be detected, or was detected but failed its self-test, or some other problem occurred, then it returns a positive error value. The value ENXIO must be returned if the device is not present. Other error values may mean other conditions. Zero or negative values mean success. Most of the drivers return zero as success.
The negative return values are used when a PnP device supports multiple interfaces. For example, an older compatibility interface and a newer advanced interface which are supported by different drivers. Then both drivers would detect the device. The driver which returns a higher value in the probe routine takes precedence (in other words, the driver returning 0 has the highest precedence, one returning -1 is next, one returning -2 is after it and so on). As a result the devices which support only the old interface will be handled by the old driver (which should return -1 from the probe routine) while the devices supporting the new interface as well will be handled by the new driver (which should return 0 from the probe routine). The device descriptor struct xxx_softc is allocated by the system before calling the probe routine. If the probe routine returns an error the descriptor will be automatically deallocated by the system. So if a probing error occurs the driver must make sure that all the resources it used during probe are deallocated and that nothing keeps the descriptor from being safely deallocated. If the probe completes successfully the descriptor will be preserved by the system and later passed to the routine xxx_isa_attach(). If a driver returns a negative value it cannot be sure that it will have the highest priority and that its attach routine will be called. So in this case it also must release all the resources before returning and, if necessary, allocate them again in the attach routine. When xxx_isa_probe() returns 0, releasing the resources before returning is also a good idea, and a well-behaved driver should do so. But in cases where there is some problem with releasing the resources the driver is allowed to keep resources between returning 0 from the probe routine and execution of the attach routine. A typical probe routine starts with getting the device descriptor and unit:

struct xxx_softc *sc = device_get_softc(dev);
int unit = device_get_unit(dev);
int pnperror;
int error = 0;

sc->dev = dev; /* link it back */
sc->unit = unit;

Then check for the PnP devices. The check is carried out by a table containing the list of PnP IDs supported by this driver and human-readable descriptions of the device models corresponding to these IDs (a sketch of such a table follows below).

pnperror=ISA_PNP_PROBE(device_get_parent(dev), dev,
xxx_pnp_ids); if(pnperror == ENXIO) return ENXIO;

The logic of ISA_PNP_PROBE is the following: If this card (device unit) was not detected as PnP then ENOENT will be returned. If it was detected as PnP but its detected ID does not match any of the IDs in the table then ENXIO is returned. Finally, if it has PnP support and it matches one of the IDs in the table, 0 is returned and the appropriate description from the table is set by device_set_desc(). If a driver supports only PnP devices then the condition would look like:

if(pnperror != 0)
return pnperror;

No special treatment is required for the drivers which do not support PnP because they pass an empty PnP ID table and will always get ENXIO if called on a PnP card. The probe routine normally needs at least some minimal set of resources, such as an I/O port number, to find the card and probe it. Depending on the hardware the driver may be able to discover the other necessary resources automatically. The PnP devices have all the resources pre-set by the PnP subsystem, so the driver does not need to discover them by itself.
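The xxx_pnp_ids table referenced in the ISA_PNP_PROBE() call above could be defined roughly as follows. This is only a sketch: the numeric vendor IDs are invented for illustration, and the isa_pnp_id structure is assumed to come from isa/isavar.h:

#include <isa/isavar.h>	/* struct isa_pnp_id, ISA_PNP_PROBE() */

static struct isa_pnp_id xxx_pnp_ids[] = {
	{ 0x12345678, "Our device model 1234" },	/* hypothetical ID */
	{ 0x9abcdef0, "Our device model 5678" },	/* hypothetical ID */
	{ 0, NULL }					/* end of table */
};

A driver that supports no PnP devices at all would pass a table containing only the terminating entry, so that ISA_PNP_PROBE() fails cleanly on any PnP card.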
Typically the minimal information required to get access to the device is the I/O port number. Then some devices allow the driver to get the rest of the information from the device configuration registers (though not all devices do that). So first we try to get the port start value:

sc->port0 = bus_get_resource_start(dev,
SYS_RES_IOPORT, 0 /*rid*/);
if(sc->port0 == 0)
return ENXIO;

The base port address is saved in the structure softc for future use. If it will be used very often then calling the resource function each time would be prohibitively slow. If we do not get a port we just return an error. Some device drivers can instead be clever and try to probe all the possible ports, like this:

/* table of all possible base I/O port addresses for this device */
static struct xxx_allports {
	u_short port; /* port address */
	short used; /* flag: if this port is already used by some unit */
} xxx_allports[] = {
	{ 0x300, 0 },
	{ 0x320, 0 },
	{ 0x340, 0 },
	{ 0, 0 } /* end of table */
};

...
int port, i, error;
...

port = bus_get_resource_start(dev, SYS_RES_IOPORT, 0 /*rid*/);
if(port != 0) {
	for(i=0; xxx_allports[i].port!=0; i++) {
		if(xxx_allports[i].used || xxx_allports[i].port != port)
			continue;

		/* found it */
		xxx_allports[i].used = 1;
		/* do probe on a known port */
		return xxx_really_probe(dev, port);
	}
	return ENXIO; /* port is unknown or already used */
}

/* we get here only if we need to guess the port */
for(i=0; xxx_allports[i].port!=0; i++) {
	if(xxx_allports[i].used)
		continue;

	/* mark as used - even if we find nothing at this port
	 * at least we won't probe it in future */
	xxx_allports[i].used = 1;

	error = xxx_really_probe(dev, xxx_allports[i].port);
	if(error == 0) /* found a device at that port */
		return 0;
}

/* probed all possible addresses, none worked */
return ENXIO;

Of course, normally the driver's identify() routine should be used for such things. But there may be one valid reason why it may be better done in probe(): if this probe would drive some other sensitive device crazy. The probe routines are ordered with consideration of the sensitive flag: the sensitive devices get probed first and the rest of the devices later. But the identify() routines are called before any probes, so they show no respect to the sensitive devices and may upset them. Now, after we have the starting port, we need to set the port count (except for PnP devices) because the kernel does not have this information in the configuration file.

if(pnperror /* only for non-PnP devices */
&& bus_set_resource(dev, SYS_RES_IOPORT, 0, sc->port0,
XXX_PORT_COUNT)<0)
return ENXIO;

Finally allocate and activate a piece of port address space (special values of start and end mean use those we set by bus_set_resource()):

sc->port0_rid = 0;
sc->port0_r = bus_alloc_resource(dev, SYS_RES_IOPORT,
&sc->port0_rid,
/*start*/ 0, /*end*/ ~0, /*count*/ 0, RF_ACTIVE);

if(sc->port0_r == NULL)
return ENXIO;

Now having access to the port-mapped registers we can poke the device in some way and check if it reacts like it is expected to. If it does not then there is probably some other device or no device at all at this address. Normally drivers do not set up the interrupt handlers until the attach routine.
Instead they do probes in polling mode, using the DELAY() function for timeouts. The probe routine must never hang forever; all the waits for the device must be done with timeouts. If the device does not respond within that time it is probably broken or misconfigured and the driver must return an error. When determining the timeout interval give the device some extra time to be on the safe side: although DELAY() is supposed to delay for the same amount of time on any machine it has some margin of error, depending on the exact CPU. If the probe routine really wants to check that the interrupts work it may configure and probe the interrupts too. But that is not recommended.

/* implemented in some very device-specific way */
if(error = xxx_probe_ports(sc))
goto bad; /* will deallocate the resources before returning */

The function xxx_probe_ports() may also set the device description depending on the exact model of device it discovers. But if there is only one supported device model this can just as well be done in a hardcoded way. Of course, for the PnP devices the PnP support sets the description from the table automatically.

if(pnperror)
device_set_desc(dev, "Our device model 1234");

Then the probe routine should either discover the ranges of all the resources by reading the device configuration registers or make sure that they were set explicitly by the user. We will consider it with an example of on-board memory. The probe routine should be as non-intrusive as possible, so allocating and checking the functionality of the rest of the resources (besides the ports) would be better left to the attach routine. The memory address may be specified in the kernel configuration file or on some devices it may be pre-configured in non-volatile configuration registers. If both sources are available and different, which one should be used? Probably if the user bothered to set the address explicitly in the kernel configuration file they know what they are doing and this one should take precedence. An example of implementation could be:

/* try to find out the config address first */
sc->mem0_p = bus_get_resource_start(dev, SYS_RES_MEMORY, 0 /*rid*/);
if(sc->mem0_p == 0) { /* nope, not specified by user */
sc->mem0_p = xxx_read_mem0_from_device_config(sc);

if(sc->mem0_p == 0)
/* can't get it from device config registers either */
goto bad;
} else {
if(xxx_set_mem0_address_on_device(sc) < 0)
goto bad; /* device does not support that address */
}

/* just like the port, set the memory size,
* for some devices the memory size would not be constant
* but should be read from the device configuration registers instead
* to accommodate different models of devices. Another option would
* be to let the user set the memory size as "msize" configuration
* resource which will be automatically handled by the ISA bus.
*/
if(pnperror) { /* only for non-PnP devices */
sc->mem0_size = bus_get_resource_count(dev, SYS_RES_MEMORY, 0 /*rid*/);
if(sc->mem0_size == 0) /* not specified by user */
sc->mem0_size = xxx_read_mem0_size_from_device_config(sc);

if(sc->mem0_size == 0) {
/* suppose this is a very old model of device without
* auto-configuration features and the user gave no preference,
* so assume the minimalistic case
* (of course, the real value will vary with the driver) */
sc->mem0_size = 8*1024;
}
if(xxx_set_mem0_size_on_device(sc) < 0)
goto bad; /* device does not support that size */

if(bus_set_resource(dev, SYS_RES_MEMORY, /*rid*/0,
sc->mem0_p, sc->mem0_size)<0)
goto bad;
} else {
sc->mem0_size = bus_get_resource_count(dev, SYS_RES_MEMORY, 0 /*rid*/);
}

Resources for IRQ and DRQ are easy to check by analogy. If all went well then release all the resources and return success.

xxx_free_resources(sc);
return 0;

Finally, handle the troublesome situations. All the resources should be deallocated before returning. We make use of the fact that before the structure softc is passed to us it gets zeroed out, so we can find out if some resource was allocated: then its descriptor is non-zero.

bad:

xxx_free_resources(sc);
if(error)
return error;
else /* exact error is unknown */
return ENXIO;

That would be all for the probe routine. Freeing of resources is done from multiple places, so it is moved to a function which may look like:

static void
xxx_free_resources(sc)
	struct xxx_softc *sc;
{
	/* check every resource and free if not zero */

	/* interrupt handler */
	if(sc->intr_r) {
		bus_teardown_intr(sc->dev, sc->intr_r, sc->intr_cookie);
		bus_release_resource(sc->dev, SYS_RES_IRQ, sc->intr_rid,
			sc->intr_r);
		sc->intr_r = 0;
	}

	/* all kinds of memory maps we could have allocated */
	if(sc->data_p) {
		bus_dmamap_unload(sc->data_tag, sc->data_map);
		sc->data_p = 0;
	}
	if(sc->data) { /* sc->data_map may be legitimately equal to 0 */
		/* the map will also be freed */
		bus_dmamem_free(sc->data_tag, sc->data, sc->data_map);
		sc->data = 0;
	}
	if(sc->data_tag) {
		bus_dma_tag_destroy(sc->data_tag);
		sc->data_tag = 0;
	}

	... free other maps and tags if we have them ...

	if(sc->parent_tag) {
		bus_dma_tag_destroy(sc->parent_tag);
		sc->parent_tag = 0;
	}

	/* release all the bus resources */
	if(sc->mem0_r) {
		bus_release_resource(sc->dev, SYS_RES_MEMORY, sc->mem0_rid,
			sc->mem0_r);
		sc->mem0_r = 0;
	}
	...
	if(sc->port0_r) {
		bus_release_resource(sc->dev, SYS_RES_IOPORT, sc->port0_rid,
			sc->port0_r);
		sc->port0_r = 0;
	}
}

xxx_isa_attach

The attach routine actually connects the driver to the system if the probe routine returned success and the system has chosen to attach that driver. If the probe routine returned 0 then the attach routine may expect to receive the device structure softc intact, as it was set by the probe routine. Also if the probe routine returns 0 it may expect that the attach routine for this device will be called at some point in the future. If the probe routine returns a negative value then the driver may make none of these assumptions. The attach routine returns 0 if it completed successfully or an error code otherwise. The attach routine starts just like the probe routine, with getting some frequently used data into more accessible variables.

struct xxx_softc *sc = device_get_softc(dev);
int unit = device_get_unit(dev);
int error = 0;

Then allocate and activate all the necessary resources. Because normally the port range will be released before returning from probe, it has to be allocated again. We expect that the probe routine had properly set all the resource ranges, as well as saved them in the structure softc. If the probe routine had left some resource allocated then it does not need to be allocated again (which would be considered an error).

sc->port0_rid = 0;
sc->port0_r = bus_alloc_resource(dev, SYS_RES_IOPORT, &sc->port0_rid,
/*start*/ 0, /*end*/ ~0, /*count*/ 0, RF_ACTIVE);

if(sc->port0_r == NULL)
return ENXIO;

/* on-board memory */
sc->mem0_rid = 0;
sc->mem0_r = bus_alloc_resource(dev, SYS_RES_MEMORY, &sc->mem0_rid,
/*start*/ 0, /*end*/ ~0, /*count*/ 0, RF_ACTIVE);

if(sc->mem0_r == NULL)
goto bad;

/* get its virtual address */
sc->mem0_v = rman_get_virtual(sc->mem0_r);

The DMA request channel (DRQ) is allocated likewise. To initialize it use functions of the isa_dma*() family. For example:

isa_dmacascade(sc->drq0);

The interrupt request line (IRQ) is a bit special. Besides allocation, the driver's interrupt handler should be associated with it. Historically in the old ISA drivers the argument passed by the system to the interrupt handler was the device unit number. But in modern drivers the convention suggests passing the pointer to the structure softc. The important reason is that when the structures softc are allocated dynamically then getting the unit number from softc is easy while getting softc from the unit number is difficult. Also this convention makes the drivers for different buses look more uniform and allows them to share the code: each bus gets its own probe, attach, detach and other bus-specific routines while the bulk of the driver code may be shared among them.
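For reference, a softc layout consistent with the fragments used throughout this chapter might look like the sketch below. This is purely illustrative: the field names match the examples here, but a real driver would declare only what it needs.

/* A hypothetical softc matching the names used in the examples.
 * Assumes the usual bus headers: <sys/bus.h>, <machine/bus.h>,
 * <sys/rman.h>. */
struct xxx_softc {
	device_t dev;			/* back-link, set in probe */
	int unit;

	int port0_rid;			/* I/O port range */
	struct resource *port0_r;
	u_long port0;			/* start of the port range */

	int mem0_rid;			/* on-board memory */
	struct resource *mem0_r;
	u_long mem0_p;			/* physical address */
	u_long mem0_size;
	void *mem0_v;			/* virtual address when mapped */

	int intr_rid;			/* interrupt */
	struct resource *intr_r;
	void *intr_cookie;

	int drq0;			/* DMA channel */
	bus_dma_tag_t parent_tag;	/* DMA tags and the shared area */
	bus_dma_tag_t data_tag;
	bus_dmamap_t data_map;
	void *data;			/* shared data, e.g. a ring buffer */
	bus_addr_t data_p;		/* its physical bus address */
};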
sc->intr_rid = 0;
sc->intr_r = bus_alloc_resource(dev, SYS_RES_IRQ, &sc->intr_rid,
/*start*/ 0, /*end*/ ~0, /*count*/ 0, RF_ACTIVE);

if(sc->intr_r == NULL)
goto bad;

/*
* XXX_INTR_TYPE is supposed to be defined depending on the type of
* the driver, for example as INTR_TYPE_CAM for a CAM driver
*/
error = bus_setup_intr(dev, sc->intr_r, XXX_INTR_TYPE,
(driver_intr_t *) xxx_intr, (void *) sc, &sc->intr_cookie);
if(error)
goto bad;

If the device needs to do DMA to the main memory then this memory should be allocated as described before:

error=bus_dma_tag_create(NULL, /*alignment*/ 4,
/*boundary*/ 0, /*lowaddr*/ BUS_SPACE_MAXADDR_24BIT,
/*highaddr*/ BUS_SPACE_MAXADDR, /*filter*/ NULL, /*filterarg*/ NULL,
/*maxsize*/ BUS_SPACE_MAXSIZE_24BIT,
/*nsegments*/ BUS_SPACE_UNRESTRICTED,
/*maxsegsz*/ BUS_SPACE_MAXSIZE_24BIT, /*flags*/ 0,
&sc->parent_tag);
if(error)
goto bad;

/* many things get inherited from the parent tag
* sc->data is supposed to point to the structure with the shared data,
* for example for a ring buffer it could be:
* struct {
*   u_short rd_pos;
*   u_short wr_pos;
*   char bf[XXX_RING_BUFFER_SIZE]
* } *data;
*/
error=bus_dma_tag_create(sc->parent_tag, 1,
0, BUS_SPACE_MAXADDR, 0, /*filter*/ NULL, /*filterarg*/ NULL,
/*maxsize*/ sizeof(* sc->data), /*nsegments*/ 1,
/*maxsegsz*/ sizeof(* sc->data), /*flags*/ 0,
&sc->data_tag);
if(error)
goto bad;

error = bus_dmamem_alloc(sc->data_tag, (void **)&sc->data, /* flags*/ 0,
&sc->data_map);
if(error)
goto bad;

/* xxx_alloc_callback() just saves the physical address at
* the pointer passed as its argument, in this case &sc->data_p.
* See details in the section on bus memory mapping.
* It can be implemented like:
*
* static void
* xxx_alloc_callback(void *arg, bus_dma_segment_t *seg,
*     int nseg, int error)
* {
*   *(bus_addr_t *)arg = seg[0].ds_addr;
* }
*/
bus_dmamap_load(sc->data_tag, sc->data_map, (void *)sc->data,
sizeof (* sc->data), xxx_alloc_callback, (void *) &sc->data_p,
/*flags*/0);

After all the necessary resources are allocated the device should be initialized. The initialization may include testing that all the expected features are functional.

if(xxx_initialize(sc) < 0)
goto bad;

The bus subsystem will automatically print on the console the device description set by probe. But if the driver wants to print some extra information about the device it may do so, for example:

device_printf(dev, "has on-card FIFO buffer of %d bytes\n", sc->fifosize);

If the initialization routine experiences any problems then printing messages about them before returning an error is also recommended.
The final step of the attach routine is attaching the device to its functional subsystem in the kernel. The exact way to do it depends on the type of the driver: a character device, a block device, a network device, a CAM SCSI bus device and so on. If all went well then return success.

error = xxx_attach_subsystem(sc);
if(error)
goto bad;

return 0;

Finally, handle the troublesome situations. All the resources should be deallocated before returning an error. We make use of the fact that before the structure softc is passed to us it gets zeroed out, so we can find out if some resource was allocated: then its descriptor is non-zero.

bad:

xxx_free_resources(sc);
if(error)
return error;
else /* exact error is unknown */
return ENXIO;

That would be all for the attach routine.

xxx_isa_detach

If this function is present in the driver and the driver is compiled as a loadable module then the driver gets the ability to be unloaded. This is an important feature if the hardware supports hot plug. But the ISA bus does not support hot plug, so this feature is not particularly important for the ISA devices. The ability to unload a driver may be useful when debugging it, but in many cases installation of the new version of the driver would be required only after the old version somehow wedges the system and a reboot will be needed anyway, so the effort spent on writing the detach routine may not be worth it. Another argument, that unloading would allow upgrading the drivers on a production machine, seems to be mostly theoretical. Installing a new version of a driver is a dangerous operation which should never be performed on a production machine (and which is not permitted when the system is running in secure mode). Still, the detach routine may be provided for the sake of completeness. The detach routine returns 0 if the driver was successfully detached or an error code otherwise. The logic of detach is a mirror image of the attach. The first thing to do is to detach the driver from its kernel subsystem. If the device is currently open then the driver has two choices: refuse to be detached or forcibly close it and proceed with the detach. The choice used depends on the ability of the particular kernel subsystem to do a forced close and on the preferences of the driver's author. Generally the forced close seems to be the preferred alternative.

struct xxx_softc *sc = device_get_softc(dev);
int error;

error = xxx_detach_subsystem(sc);
if(error)
return error;

Next the driver may want to reset the hardware to some consistent state. That includes stopping any ongoing transfers, disabling the DMA channels and interrupts to avoid memory corruption by the device. For most of the drivers this is exactly what the shutdown routine does, so if it is included in the driver we can just call it.

xxx_isa_shutdown(dev);

And finally release all the resources and return success.

xxx_free_resources(sc);
return 0;

xxx_isa_shutdown

This routine is called when the system is about to be shut down. It is expected to bring the hardware to some consistent state. For most of the ISA devices no special action is required, so the function is not really necessary because the device will be re-initialized on reboot anyway. But some devices have to be shut down with a special procedure, to make sure that they will be properly detected after a soft reboot (this is especially true for many devices with proprietary identification protocols). In any case disabling DMA and interrupts in the device registers and stopping any ongoing transfers is a good idea.
The exact action depends on the hardware, so we do not consider it here in any detail.

xxx_intr interrupt handler

The interrupt handler is called when an interrupt is received which may be from this particular device. The ISA bus does not support interrupt sharing (except in some special cases) so in practice if the interrupt handler is called then the interrupt almost certainly came from its device. Still, the interrupt handler must poll the device registers and make sure that the interrupt was generated by its device. If not, it should just return. The old convention for the ISA drivers was getting the device unit number as an argument. This is obsolete, and the new drivers receive whatever argument was specified for them in the attach routine when calling bus_setup_intr(). By the new convention it should be the pointer to the structure softc. So the interrupt handler commonly starts as:

static void
xxx_intr(struct xxx_softc *sc)
{

It runs at the interrupt priority level specified by the interrupt type parameter of bus_setup_intr(). That means that all the other interrupts of the same type as well as all the software interrupts are disabled. To avoid races it is commonly written as a loop:

while(xxx_interrupt_pending(sc)) {
xxx_process_interrupt(sc);
xxx_acknowledge_interrupt(sc);
}

The interrupt handler has to acknowledge the interrupt to the device only, not to the interrupt controller; the system takes care of the latter. diff --git a/en_US.ISO8859-1/books/arch-handbook/jail/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/jail/chapter.sgml index e9765b8ea3..f5f886f4fd 100644 --- a/en_US.ISO8859-1/books/arch-handbook/jail/chapter.sgml +++ b/en_US.ISO8859-1/books/arch-handbook/jail/chapter.sgml @@ -1,622 +1,622 @@ Evan Sarmiento
evms@cs.bu.edu
2001 Evan Sarmiento
The Jail Subsystem security Jail root

On most &unix; systems, root has omnipotent power. This promotes insecurity. If an attacker were to gain root on a system, he would have every function at his fingertips. In FreeBSD there are sysctls which dilute the power of root, in order to minimize the damage caused by an attacker. Specifically, one of these functions is called secure levels. Similarly, another function, present from FreeBSD 4.0 onward, is the &man.jail.8; utility. Jail chroots an environment and sets certain restrictions on processes which are forked from within. For example, a jailed process cannot affect processes outside of the jail, utilize certain system calls, or inflict any damage on the main computer. Jail is becoming the new security model. People are running potentially vulnerable servers such as Apache, BIND, and sendmail within jails, so that if an attacker gains root within the Jail, it is only an annoyance, and not a devastation. This article focuses on the internals (source code) of Jail. It will also suggest improvements upon the jail code base which are already being worked on. If you are looking for a how-to on setting up a Jail, I suggest you look at my other article in Sys Admin Magazine, May 2001, entitled "Securing FreeBSD using Jail."

Architecture

Jail consists of two realms: the user-space program, jail, and the code implemented within the kernel: the jail() system call and associated restrictions. I will be discussing the user-space program and then how jail is implemented within the kernel.

Userland code Jail userland program

The source for the user-land jail is located in /usr/src/usr.sbin/jail, consisting of one file, jail.c. The program takes these arguments: the path of the jail, the hostname, the IP address, and the command to be executed.

Data Structures

In jail.c, the first thing I would note is the declaration of an important structure struct jail j; which was included from /usr/include/sys/jail.h. The definition of the jail structure is:

/usr/include/sys/jail.h:

struct jail {
        u_int32_t       version;
        char            *path;
        char            *hostname;
        u_int32_t       ip_number;
};

As you can see, there is an entry for each of the arguments passed to the jail program, and indeed, they are set during its execution.

/usr/src/usr.sbin/jail/jail.c:

j.version = 0;
j.path = argv[1];
j.hostname = argv[2];

Networking

One of the arguments passed to the Jail program is an IP address with which the jail can be accessed over the network. Jail translates the given IP address into host byte order and then stores it in j (the jail structure).

/usr/src/usr.sbin/jail/jail.c:

struct in_addr in;
...
i = inet_aton(argv[3], &in);
...
j.ip_number = ntohl(in.s_addr);

The &man.inet.aton.3; function "interprets the specified character string as an Internet address, placing the address into the structure provided." The ip_number field in the jail structure is set only when the IP address placed into the in structure by inet_aton() has been converted into host byte order by ntohl().

Jailing The Process

Finally, the userland program jails the process, and executes the command specified. Jail now becomes an imprisoned process itself and forks a child process which then executes the command given using &man.execv.3;.

/usr/src/usr.sbin/jail/jail.c:

i = jail(&j);
...
i = execv(argv[4], argv + 4);

As you can see, the jail function is being called, and its argument is the jail structure which has been filled with the arguments given to the program. Finally, the program you specify is executed.
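To make the sequence concrete, here is a minimal, self-contained sketch of invoking the &man.jail.2; system call directly. The path, hostname and address are made-up example values, and error handling is reduced to the bare minimum:

#include <sys/types.h>
#include <sys/jail.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <err.h>
#include <unistd.h>

int
main(void)
{
	struct jail j;
	struct in_addr in;

	j.version = 0;
	j.path = "/usr/jail";			/* hypothetical jail root */
	j.hostname = "jail.example.org";	/* hypothetical hostname */
	if (inet_aton("192.0.2.1", &in) == 0)	/* example address */
		errx(1, "bad address");
	j.ip_number = ntohl(in.s_addr);		/* host byte order */
	if (jail(&j) == -1)
		err(1, "jail");
	/* now imprisoned: exec the desired command, e.g. a shell */
	execl("/bin/sh", "sh", (char *)NULL);
	err(1, "execl");
}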
I will now discuss how Jail is implemented within the kernel.

Kernel Space Jail kernel architecture

We will now be looking at the file /usr/src/sys/kern/kern_jail.c. This is the file where the jail system call, appropriate sysctls, and networking functions are defined.

sysctls sysctl

In kern_jail.c, the following sysctls are defined:

/usr/src/sys/kern/kern_jail.c:

int     jail_set_hostname_allowed = 1;
SYSCTL_INT(_jail, OID_AUTO, set_hostname_allowed, CTLFLAG_RW,
    &jail_set_hostname_allowed, 0,
    "Processes in jail can set their hostnames");

int     jail_socket_unixiproute_only = 1;
SYSCTL_INT(_jail, OID_AUTO, socket_unixiproute_only, CTLFLAG_RW,
    &jail_socket_unixiproute_only, 0,
    "Processes in jail are limited to creating &unix;/IPv4/route sockets only");

int     jail_sysvipc_allowed = 0;
SYSCTL_INT(_jail, OID_AUTO, sysvipc_allowed, CTLFLAG_RW,
    &jail_sysvipc_allowed, 0,
    "Processes in jail can use System V IPC primitives");

Each of these sysctls can be accessed by the user through the sysctl program. Throughout the kernel, these specific sysctls are recognized by their name. For example, the name of the first sysctl is jail.set_hostname_allowed.

&man.jail.2; system call

Like all system calls, the &man.jail.2; system call takes two arguments, struct proc *p and struct jail_args *uap. p is a pointer to a proc structure which describes the calling process. In this context, uap is a pointer to a structure which specifies the arguments given to &man.jail.2; from the userland program jail.c. When I described the userland program before, you saw that the &man.jail.2; system call was given a jail structure as its own argument.

/usr/src/sys/kern/kern_jail.c:

int
jail(p, uap)
        struct proc *p;
        struct jail_args /* {
                syscallarg(struct jail *) jail;
        } */ *uap;

Therefore, uap->jail would access the jail structure which was passed to the system call. Next, the system call copies the jail structure into kernel space using the copyin() function. copyin() takes three arguments: the data which is to be copied into kernel space, uap->jail; where to store it, &j; and the size of the storage. The jail structure uap->jail is copied into kernel space and stored in another jail structure, j.

/usr/src/sys/kern/kern_jail.c:

error = copyin(uap->jail, &j, sizeof j);

There is another important structure defined in jail.h. It is the prison structure (pr). The prison structure is used exclusively within kernel space. The &man.jail.2; system call copies everything from the jail structure onto the prison structure. Here is the definition of the prison structure.

/usr/include/sys/jail.h:

struct prison {
        int             pr_ref;
        char            pr_host[MAXHOSTNAMELEN];
        u_int32_t       pr_ip;
        void            *pr_linux;
};

The jail() system call then allocates memory for a prison structure and copies data between the two structures.

/usr/src/sys/kern/kern_jail.c:

MALLOC(pr, struct prison *, sizeof *pr, M_PRISON, M_WAITOK);
bzero((caddr_t)pr, sizeof *pr);
error = copyinstr(j.hostname, &pr->pr_host, sizeof pr->pr_host, 0);
if (error)
        goto bail;

chroot

Finally, the jail system call chroots the path specified. The chroot function is given two arguments. The first is p, which represents the calling process; the second is a pointer to the structure chroot_args. The structure chroot_args contains the path which is to be chrooted.
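Before looking at how that path is used, here is a condensed, hypothetical sketch of the jail(2) flow described so far, using FreeBSD 4.x-era kernel interfaces. The function name jail_sketch, the exact ordering, and the cleanup label are illustrative, not the actual kern_jail.c source:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/proc.h>
#include <sys/malloc.h>
#include <sys/jail.h>

static int
jail_sketch(struct proc *p, struct jail_args *uap)
{
        struct jail j;
        struct prison *pr;
        struct chroot_args ca;
        int error;

        /* Copy the userland jail description into kernel space. */
        error = copyin(uap->jail, &j, sizeof j);
        if (error)
                return (error);

        /* Allocate and zero the kernel-only prison structure. */
        MALLOC(pr, struct prison *, sizeof *pr, M_PRISON, M_WAITOK);
        bzero((caddr_t)pr, sizeof *pr);

        /* Fill it in from the user-supplied jail structure. */
        error = copyinstr(j.hostname, &pr->pr_host, sizeof pr->pr_host, 0);
        if (error)
                goto bail;
        pr->pr_ip = j.ip_number;

        /* Confine the calling process to the jail's directory tree. */
        ca.path = j.path;
        error = chroot(p, &ca);
        if (error)
                goto bail;

        /* The process is then marked as jailed; see below. */
        pr->pr_ref = 1;
        p->p_prison = pr;
        p->p_flag |= P_JAILED;
        return (0);

bail:
        FREE(pr, M_PRISON);
        return (error);
}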
As you can see, the path specified in the jail structure is copied to the chroot_args structure and used.

/usr/src/sys/kern/kern_jail.c:

ca.path = j.path;
error = chroot(p, &ca);

These next three lines in the source are very important, as they specify how the kernel recognizes a process as jailed. Each process on a &unix; system is described by its own proc structure. You can see the whole proc structure in /usr/include/sys/proc.h. For example, the p argument in any system call is actually a pointer to that process' proc structure, as stated before. The proc structure contains nodes which can describe the owner's identity (p_cred), the process resource limits (p_limit), and so on. In the definition of the process structure, there is a pointer to a prison structure (p_prison).

/usr/include/sys/proc.h:

struct proc {
        ...
        struct prison   *p_prison;
        ...
};

In kern_jail.c, the function then copies the pr structure, which is filled with all the information from the original jail structure, over to p->p_prison. It then does a bitwise OR of p->p_flag with the constant P_JAILED, meaning that the calling process is now recognized as jailed. The parent process of each process forked within the jail is the program jail itself, as it calls the &man.jail.2; system call. When the program is executed through execve, it inherits the properties of its parent's proc structure, therefore it has the p->p_flag set, and the p->p_prison structure is filled.

/usr/src/sys/kern/kern_jail.c:

p->p_prison = pr;
p->p_flag |= P_JAILED;

When a process is forked from a parent process, the &man.fork.2; system call deals differently with imprisoned processes. In the fork system call, there are two pointers to a proc structure, p1 and p2. p1 points to the parent's proc structure and p2 points to the child's unfilled proc structure. After copying all relevant data between the structures, &man.fork.2; checks if the p->p_prison structure is filled on p2. If it is, it increments pr_ref by one, and sets the P_JAILED flag on the child process.

/usr/src/sys/kern/kern_fork.c:

if (p2->p_prison) {
        p2->p_prison->pr_ref++;
        p2->p_flag |= P_JAILED;
}

Restrictions

Throughout the kernel there are access restrictions relating to jailed processes. Usually, these restrictions only check if the process is jailed, and if so, return an error. For example:

if (p->p_prison)
        return EPERM;

SysV IPC System V IPC

System V IPC is based on messages. Processes can send each other these messages which tell them how to act. The functions which deal with messages are: msgsys, msgctl, msgget, msgsnd and msgrcv. Earlier, I mentioned that there were certain sysctls you could turn on or off in order to affect the behavior of Jail. One of these sysctls was jail_sysvipc_allowed. On most systems, this sysctl is set to 0. If it were set to 1, it would defeat the whole purpose of having a jail; privileged users from within the jail would be able to affect processes outside of the environment. The difference between a message and a signal is that the message only consists of the signal number.
/usr/src/sys/kern/sysv_msg.c:

&man.msgget.3;: msgget returns (and possibly creates) a message descriptor that designates a message queue for use in other system calls.

&man.msgctl.3;: Using this function, a process can query the status of a message descriptor.

&man.msgsnd.3;: msgsnd sends a message to a process.

&man.msgrcv.3;: A process receives messages using this function.

In each of these system calls, there is this conditional:

/usr/src/sys/kern/sysv_msg.c:

if (!jail_sysvipc_allowed && p->p_prison != NULL)
        return (ENOSYS);

semaphores

Semaphore system calls allow processes to synchronize execution by doing a set of operations atomically on a set of semaphores. Basically, semaphores provide another way for processes to lock resources. However, a process waiting on a semaphore that is in use will sleep until the resources are relinquished. The following semaphore system calls are blocked inside a jail: semsys, semget, semctl and semop.

/usr/src/sys/kern/sysv_sem.c:

&man.semctl.2;(id, num, cmd, arg): Semctl does the specified cmd on the semaphore queue indicated by id.

&man.semget.2;(key, nsems, flag): Semget creates an array of semaphores, corresponding to key. Key and flag take on the same meaning as they do in msgget.

&man.semop.2;(id, ops, num): Semop does the set of semaphore operations in the array of structures ops, to the set of semaphores identified by id.

shared memory

System V IPC allows for processes to share memory. Processes can communicate directly with each other by sharing parts of their virtual address space and then reading and writing data stored in the shared memory. These system calls are blocked within a jailed environment: shmdt, shmat, oshmctl, shmctl, shmget, and shmsys.

/usr/src/sys/kern/sysv_shm.c:

&man.shmctl.2;(id, cmd, buf): shmctl does various control operations on the shared memory region identified by id.

&man.shmget.2;(key, size, flag): shmget accesses or creates a shared memory region of size bytes.

&man.shmat.2;(id, addr, flag): shmat attaches a shared memory region identified by id to the address space of a process.

&man.shmdt.2;(addr): shmdt detaches the shared memory region previously attached at addr.

Sockets sockets

Jail treats the &man.socket.2; system call and related lower-level socket functions in a special manner. In order to determine whether a certain socket is allowed to be created, it first checks to see if the sysctl jail.socket_unixiproute_only is set. If set, sockets are only allowed to be created if the family specified is either PF_LOCAL, PF_INET or PF_ROUTE. Otherwise, it returns an error.

/usr/src/sys/kern/uipc_socket.c:

int
socreate(dom, aso, type, proto, p)
        ...
        register struct protosw *prp;
        ...
{
        if (p->p_prison && jail_socket_unixiproute_only &&
            prp->pr_domain->dom_family != PF_LOCAL &&
            prp->pr_domain->dom_family != PF_INET &&
            prp->pr_domain->dom_family != PF_ROUTE)
                return (EPROTONOSUPPORT);
        ...
}

Berkeley Packet Filter Berkeley Packet Filter data link layer

The Berkeley Packet Filter provides a raw interface to data link layers in a protocol independent fashion. The function bpfopen() opens an Ethernet device. There is a conditional which disallows any jailed processes from accessing this function.

/usr/src/sys/net/bpf.c:

static int
bpfopen(dev, flags, fmt, p)
        ...
{
        if (p->p_prison)
                return (EPERM);
        ...
}

Protocols protocols

There are certain protocols which are very common, such as TCP, UDP, IP and ICMP. IP and ICMP are on the same level: the network layer. There are certain precautions which are taken in order to prevent a jailed process from binding a protocol to a certain port; these are applied only if the nam parameter is set. nam is a pointer to a sockaddr structure, which describes the address on which to bind the service. A more exact definition is that sockaddr "may be used as a template for referring to the identifying tag and length of each address"[2]. In the function in_pcbbind(), sin is a pointer to a sockaddr_in structure, which contains the port, address, length and domain family of the socket which is to be bound. Basically, this disallows any process from within a jail from specifying the domain family.

/usr/src/sys/netinet/in_pcb.c:

int
in_pcbbind(inp, nam, p)
        ...
        struct sockaddr *nam;
        struct proc *p;
{
        ...
        struct sockaddr_in *sin;
        ...
        if (nam) {
                sin = (struct sockaddr_in *)nam;
                ...
                if (sin->sin_addr.s_addr != INADDR_ANY)
                        if (prison_ip(p, 0, &sin->sin_addr.s_addr))
                                return (EINVAL);
                ...
        }
        ...
}

You might be wondering what the function prison_ip() does. prison_ip() is given three arguments: the current process (represented by p), any flags, and an ip address. It returns 0 if the address is usable by the jailed process, and 1 if the address does not belong to the jail. As you can see from the code, if the address does not belong to the jail, the protocol is not allowed to bind to it. (A small standalone demonstration of this byte-order handling appears at the end of this section.)

/usr/src/sys/kern/kern_jail.c:

int
prison_ip(struct proc *p, int flag, u_int32_t *ip)
{
        u_int32_t tmp;

        if (!p->p_prison)
                return (0);
        if (flag)
                tmp = *ip;
        else
                tmp = ntohl(*ip);
        if (tmp == INADDR_ANY) {
                if (flag)
                        *ip = p->p_prison->pr_ip;
                else
                        *ip = htonl(p->p_prison->pr_ip);
                return (0);
        }
        if (p->p_prison->pr_ip != tmp)
                return (1);
        return (0);
}

Jailed users are not allowed to bind services to an ip which does not belong to the jail. The restriction is also written within the function in_pcbbind():

/usr/src/sys/netinet/in_pcb.c:

if (nam) {
        ...
        lport = sin->sin_port;
        ...
        if (lport) {
                ...
                if (p && p->p_prison)
                        prison = 1;
                if (prison &&
                    prison_ip(p, 0, &sin->sin_addr.s_addr))
                        return (EADDRNOTAVAIL);

Filesystem filesystem

Even root users within the jail are not allowed to set any file flags, such as the immutable, append, and no-unlink flags, if the securelevel is greater than 0.

/usr/src/sys/ufs/ufs/ufs_vnops.c:

int
ufs_setattr(ap)
        ...
{
        if ((cred->cr_uid == 0) && (p->p_prison == NULL)) {
                if ((ip->i_flags
                    & (SF_NOUNLINK | SF_IMMUTABLE | SF_APPEND)) &&
                    securelevel > 0)
                        return (EPERM);
        }
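To close this section, here is the promised demonstration of the prison_ip() byte-order handling. This is an illustrative userland mirror of the non-flag branch (not kernel code), assuming a jail bound to the made-up address 10.0.0.5: INADDR_ANY is silently rewritten to the jail's address, while a foreign address is rejected.

#include <sys/types.h>
#include <stdio.h>
#include <netinet/in.h>
#include <arpa/inet.h>

static u_int32_t pr_ip;                 /* jail's IP, host byte order */

static int
prison_ip_demo(u_int32_t *ip)           /* *ip in network byte order */
{
        u_int32_t tmp = ntohl(*ip);

        if (tmp == INADDR_ANY) {
                *ip = htonl(pr_ip);     /* substitute the jail's address */
                return (0);
        }
        if (pr_ip != tmp)
                return (1);             /* caller returns EADDRNOTAVAIL */
        return (0);
}

int
main(void)
{
        struct in_addr a;
        int rejected;

        pr_ip = ntohl(inet_addr("10.0.0.5"));

        a.s_addr = htonl(INADDR_ANY);
        rejected = prison_ip_demo(&a.s_addr);
        printf("INADDR_ANY -> rejected=%d, bound to %s\n",
            rejected, inet_ntoa(a));

        a.s_addr = inet_addr("192.168.1.1");
        rejected = prison_ip_demo(&a.s_addr);
        printf("192.168.1.1 -> rejected=%d\n", rejected);
        return (0);
}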
diff --git a/en_US.ISO8859-1/books/arch-handbook/kobj/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/kobj/chapter.sgml
index ab80c62fbb..feba4172df 100644
--- a/en_US.ISO8859-1/books/arch-handbook/kobj/chapter.sgml
+++ b/en_US.ISO8859-1/books/arch-handbook/kobj/chapter.sgml
@@ -1,315 +1,315 @@

Kernel Objects Kernel Objects Object-Oriented binary compatibility

Kernel Objects, or Kobj, provides an object-oriented C programming system for the kernel. As such, the data being operated on carries the description of how to operate on it. This allows operations to be added and removed from an interface at run time and without breaking binary compatibility.

Terminology object method class interface

Object: A set of data - data structure - data allocation.
Method: An operation - function.
Class: One or more methods.
Interface: A standard set of one or more methods.

Kobj Operation

Kobj works by generating descriptions of methods. Each description holds a unique id as well as a default function. The description's address is used to uniquely identify the method within a class' method table.

A class is built by creating a method table associating one or more functions with method descriptions. Before use, the class is compiled. The compilation allocates a cache and associates it with the class. A unique id is assigned to each method description within the method table of the class, if not already done by another referencing class compilation. For every method to be used, a function is generated by script to qualify arguments and automatically reference the method description for a lookup. The generated function looks up the method by using the unique id associated with the method description as a hash into the cache associated with the object's class. If the method is not cached, the generated function proceeds to use the class' table to find the method. If the method is found, then the associated function within the class is used; otherwise, the default function associated with the method description is used.

These indirections can be visualized as the following:

object->cache<->class

Using Kobj

Structures

struct kobj_method

Functions

void kobj_class_compile(kobj_class_t cls);
void kobj_class_compile_static(kobj_class_t cls, kobj_ops_t ops);
void kobj_class_free(kobj_class_t cls);
kobj_t kobj_create(kobj_class_t cls, struct malloc_type *mtype, int mflags);
void kobj_init(kobj_t obj, kobj_class_t cls);
void kobj_delete(kobj_t obj, struct malloc_type *mtype);

Macros

KOBJ_CLASS_FIELDS
KOBJ_FIELDS
DEFINE_CLASS(name, methods, size)
KOBJMETHOD(NAME, FUNC)

Headers

<sys/param.h>
<sys/kobj.h>

Creating an interface template Kernel Objects interface

The first step in using Kobj is to create an Interface. Creating the interface involves creating a template that the script src/sys/kern/makeobjops.pl can use to generate the header and code for the method declarations and method lookup functions.

Within this template the following keywords are used: #include, INTERFACE, CODE, METHOD, STATICMETHOD, and DEFAULT.

The #include statement and what follows it is copied verbatim to the head of the generated code file. For example:

#include <sys/foo.h>

The INTERFACE keyword is used to define the interface name. This name is concatenated with each method name as [interface name]_[method name]. Its syntax is INTERFACE [interface name];. For example:

INTERFACE foo;

The CODE keyword copies its arguments verbatim into the code file.
Its syntax is CODE { [whatever] };. For example:

CODE {
        struct foo *
        foo_alloc_null(struct bar *)
        {
                return NULL;
        }
};

The METHOD keyword describes a method. Its syntax is METHOD [return type] [method name] { [object [, arguments]] };. For example:

METHOD int bar {
        struct object *;
        struct foo *;
        struct bar;
};

The DEFAULT keyword may follow the METHOD keyword. It extends the METHOD keyword to include the default function for the method. The extended syntax is METHOD [return type] [method name] { [object; [other arguments]] } DEFAULT [default function];. For example:

METHOD int bar {
        struct object *;
        struct foo *;
        int bar;
} DEFAULT foo_hack;

The STATICMETHOD keyword is used like the METHOD keyword, except that the kobj data is not at the head of the object structure, so casting to kobj_t would be incorrect. Instead, STATICMETHOD relies on the Kobj data being referenced as 'ops'. This is also useful for calling methods directly out of a class's method table.

Other complete examples:

src/sys/kern/bus_if.m
src/sys/kern/device_if.m

Creating a Class Kernel Objects class

The second step in using Kobj is to create a class. A class consists of a name, a table of methods, and the size of objects if Kobj's object handling facilities are used. To create the class, use the macro DEFINE_CLASS(). To create the method table, create an array of kobj_method_t terminated by a NULL entry. Each non-NULL entry may be created using the macro KOBJMETHOD(). For example:

DEFINE_CLASS(fooclass, foomethods, sizeof(struct foodata));

kobj_method_t foomethods[] = {
        KOBJMETHOD(bar_doo, foo_doo),
        KOBJMETHOD(bar_foo, foo_foo),
        { NULL, NULL }
};

The class must be compiled. Depending on the state of the system at the time that the class is to be initialized, a statically allocated cache (ops table) may have to be used. This can be accomplished by declaring a struct kobj_ops and using kobj_class_compile_static(); otherwise, kobj_class_compile() should be used.

Creating an Object Kernel Objects object

The third step in using Kobj involves how to define the object. Kobj object creation routines assume that Kobj data is at the head of an object. If this is not appropriate, you will have to allocate the object yourself and then use kobj_init() on the Kobj portion of it; otherwise, you may use kobj_create() to allocate and initialize the Kobj portion of the object automatically. kobj_init() may also be used to change the class that an object uses.

To integrate Kobj into the object, you should use the macro KOBJ_FIELDS. For example:

struct foo_data {
        KOBJ_FIELDS;
        foo_foo;
        foo_bar;
};

Calling Methods

The last step in using Kobj is to simply use the generated functions to use the desired method within the object's class. This is as simple as using the interface name and the method name with a few modifications. The interface name should be concatenated with the method name using a '_' between them, all in upper case. For example, if the interface name was foo and the method was bar, then the call would be:

[return value = ] FOO_BAR(object [, other parameters]);

Cleaning Up

When an object allocated through kobj_create() is no longer needed, kobj_delete() may be called on it, and when a class is no longer being used, kobj_class_free() may be called on it.
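Tying the steps together, here is a hedged end-to-end sketch. It assumes a template such as:

        INTERFACE foo;
        METHOD int doo { kobj_t obj; int x; };

has been processed by makeobjops.pl into generated files; the names foo_if.h, FOO_DOO and myclass_class follow Kobj's naming conventions but are assumptions invented for this example, not files from the source tree:

#include <sys/param.h>
#include <sys/kobj.h>
#include <sys/malloc.h>
#include "foo_if.h"

struct foodata {
        KOBJ_FIELDS;            /* Kobj data must be at the head. */
        int     counter;
};

/* This class's implementation of the foo interface's doo method. */
static int
myclass_doo(kobj_t obj, int x)
{
        struct foodata *d = (struct foodata *)obj;

        d->counter += x;
        return (d->counter);
}

static kobj_method_t myclass_methods[] = {
        KOBJMETHOD(foo_doo, myclass_doo),
        { NULL, NULL }
};

DEFINE_CLASS(myclass, myclass_methods, sizeof(struct foodata));

static void
example(void)
{
        kobj_t obj;

        /* Allocate, initialize, dispatch, and free the object. */
        obj = kobj_create(&myclass_class, M_TEMP, M_WAITOK);
        (void)FOO_DOO(obj, 42);         /* resolves to myclass_doo() */
        kobj_delete(obj, M_TEMP);
}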
diff --git a/en_US.ISO8859-1/books/arch-handbook/locking/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/locking/chapter.sgml
index 8a9b6d938d..3f5dbec7cf 100644
--- a/en_US.ISO8859-1/books/arch-handbook/locking/chapter.sgml
+++ b/en_US.ISO8859-1/books/arch-handbook/locking/chapter.sgml
@@ -1,341 +1,341 @@

Locking Notes SMP Next Generation Project

This chapter is maintained by the FreeBSD SMP Next Generation Project. Please direct any comments or suggestions to its &a.smp;.

locking multi-processing mutexes lockmgr atomic operations

This document outlines the locking used in the FreeBSD kernel to permit effective multi-processing within the kernel. Locking can be achieved via several means. Data structures can be protected by mutexes or &man.lockmgr.9; locks. A few variables are protected simply by always using atomic operations to access them.

Mutexes

A mutex is simply a lock used to guarantee mutual exclusion. Specifically, a mutex may only be owned by one entity at a time. If another entity wishes to obtain a mutex that is already owned, it must wait until the mutex is released. In the FreeBSD kernel, mutexes are owned by processes.

Mutexes may be recursively acquired, but they are intended to be held for a short period of time. Specifically, one may not sleep while holding a mutex. If you need to hold a lock across a sleep, use a &man.lockmgr.9; lock.

Each mutex has several properties of interest:

Variable Name: The name of the struct mtx variable in the kernel source.

Logical Name: The name of the mutex assigned to it by mtx_init. This name is displayed in KTR trace messages and witness errors and warnings and is used to distinguish mutexes in the witness code.

Type: The type of the mutex in terms of the MTX_* flags. The meaning for each flag is related to its meaning as documented in &man.mutex.9;.
        MTX_DEF: A sleep mutex
        MTX_SPIN: A spin mutex
        MTX_RECURSE: This mutex is allowed to recurse.

Protectees: A list of data structures or data structure members that this entry protects. For data structure members, the name will be in the form of structure name.member name.

Dependent Functions: Functions that can only be called if this mutex is held.

Mutex List locks sched_lock locks vm86pcb_lock locks Giant locks callout_lock

sched_lock
        Logical Name: "sched lock"
        Type: MTX_SPIN | MTX_RECURSE
        Protectees: _gmonparam, cnt.v_swtch, cp_time, curpriority, pscnt, slpque, itqueuebits, itqueues, rtqueuebits, rtqueues, queuebits, queues, idqueuebits, idqueues, switchtime, switchticks
        Dependent Functions: setrunqueue, remrunqueue, mi_switch, chooseproc, schedclock, resetpriority, updatepri, maybe_resched, cpu_switch, cpu_throw, need_resched, resched_wanted, clear_resched, aston, astoff, astpending, calcru, proc_compare

vm86pcb_lock
        Logical Name: "vm86pcb lock"
        Type: MTX_DEF
        Protectees: vm86pcb
        Dependent Functions: vm86_bioscall

Giant
        Logical Name: "Giant"
        Type: MTX_DEF | MTX_RECURSE
        Protectees: nearly everything
        Dependent Functions: lots

callout_lock
        Logical Name: "callout lock"
        Type: MTX_SPIN | MTX_RECURSE
        Protectees: callfree, callwheel, nextsoftcheck, softticks, ticks
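As a concrete illustration of these properties, here is a minimal mutex usage sketch. Note that the mutex(9) interface has varied across FreeBSD releases (this follows the later four-argument mtx_init()), and all names here are invented for the example:

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>

static struct mtx example_mtx;
static int example_count;       /* protected by example_mtx */

static void
example_init(void)
{
        /* "example lock" is the logical name shown by KTR and witness. */
        mtx_init(&example_mtx, "example lock", NULL, MTX_DEF);
}

static void
example_bump(void)
{
        mtx_lock(&example_mtx);
        example_count++;        /* must not sleep while the mutex is held */
        mtx_unlock(&example_mtx);
}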
Shared Exclusive Locks

These locks provide basic reader-writer type functionality and may be held by a sleeping process. Currently they are backed by &man.lockmgr.9;. locks shared exclusive

Shared Exclusive Lock List locks allproc_lock locks proctree_lock

allproc_lock
        Protectees: allproc, zombproc, pidhashtbl, proc.p_list, nextpid

proctree_lock
        Protectees: proc.p_children, proc.p_sibling
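As a rough illustration, a lockmgr-backed shared/exclusive lock can be used along these lines. This follows the FreeBSD 4.x-era lockmgr(9) and lockinit(9) signatures, and all names are invented for the example:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/proc.h>
#include <sys/lock.h>

static struct lock example_sxlock;

static void
example_init(void)
{
        /* Priority, wait message, timeout and flags per lockinit(9). */
        lockinit(&example_sxlock, PZERO, "exmplk", 0, 0);
}

static void
example_read(void)
{
        /* Many readers may hold the lock shared at once... */
        lockmgr(&example_sxlock, LK_SHARED, NULL, curproc);
        /* ... read the protected data; sleeping is permitted ... */
        lockmgr(&example_sxlock, LK_RELEASE, NULL, curproc);
}

static void
example_write(void)
{
        /* ... while a writer holds it exclusively. */
        lockmgr(&example_sxlock, LK_EXCLUSIVE, NULL, curproc);
        /* ... modify the protected data ... */
        lockmgr(&example_sxlock, LK_RELEASE, NULL, curproc);
}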
Atomically Protected Variables atomically protected variables

An atomically protected variable is a special variable that is not protected by an explicit lock. Instead, all data accesses to the variables use special atomic operations as described in &man.atomic.9;. Very few variables are treated this way, although other synchronization primitives such as mutexes are implemented with atomically protected variables.

mtx.mtx_lock
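A short sketch of the idea follows; the names are invented, and atomic(9) provides the full set of operations (availability of individual operations varies by release and architecture):

#include <sys/types.h>
#include <machine/atomic.h>

static volatile u_int counter;

static void
counter_hit(void)
{
        /* Safe from any CPU without holding a lock. */
        atomic_add_int(&counter, 1);
}

static int
try_take_flag(volatile u_int *flag)
{
        /*
         * Atomically transition 0 -> 1; returns non-zero on success.
         * A mutex's mtx_lock field is acquired with essentially this
         * kind of compare-and-set.
         */
        return (atomic_cmpset_int(flag, 0, 1));
}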
diff --git a/en_US.ISO8859-1/books/arch-handbook/mac/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/mac/chapter.sgml index bfeeae352b..e732a6e8e5 100644 --- a/en_US.ISO8859-1/books/arch-handbook/mac/chapter.sgml +++ b/en_US.ISO8859-1/books/arch-handbook/mac/chapter.sgml @@ -1,8030 +1,8028 @@ Chris Costello TrustedBSD Project
chris@FreeBSD.org
Robert Watson TrustedBSD Project
rwatson@FreeBSD.org
The TrustedBSD MAC Framework MAC Documentation Copyright This documentation was developed for the FreeBSD Project by Chris Costello at Safeport Network Services and Network Associates Laboratories, the Security Research Division of Network Associates, Inc. under DARPA/SPAWAR contract N66001-01-C-8035 (CBOSS), as part of the DARPA CHATS research program. Redistribution and use in source (SGML DocBook) and 'compiled' forms (SGML, HTML, PDF, PostScript, RTF and so forth) with or without modification, are permitted provided that the following conditions are met: Redistributions of source code (SGML DocBook) must retain the above copyright notice, this list of conditions and the following disclaimer as the first lines of this file unmodified. Redistributions in compiled form (transformed to other DTDs, converted to PDF, PostScript, RTF and other formats) must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS DOCUMENTATION IS PROVIDED BY THE NETWORKS ASSOCIATES TECHNOLOGY, INC "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NETWORKS ASSOCIATES TECHNOLOGY, INC BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Synopsis FreeBSD includes experimental support for several mandatory access control policies, as well as a framework for kernel security extensibility, the TrustedBSD MAC Framework. The MAC Framework is a pluggable access control framework, permitting new security policies to be easily linked into the kernel, loaded at boot, or loaded dynamically at run-time. The framework provides a variety of features to make it easier to implement new security policies, including the ability to easily tag security labels (such as confidentiality information) onto system objects. This chapter introduces the MAC policy framework and provides documentation for a sample MAC policy module. Introduction The TrustedBSD MAC framework provides a mechanism to allow the compile-time or run-time extension of the kernel access control model. New system policies may be implemented as kernel modules and linked to the kernel; if multiple policy modules are present, their results will be composed. The MAC Framework provides a variety of access control infrastructure services to assist policy writers, including support for transient and persistent policy-agnostic object security labels. This support is currently considered experimental. This chapter provides information appropriate for developers of policy modules, as well as potential consumers of MAC-enabled environments, to learn about how the MAC Framework supports access control extension of the kernel. Policy Background Mandatory Access Control (MAC), refers to a set of access control policies that are mandatorily enforced on users by the operating system. 
MAC policies may be contrasted with Discretionary Access Control (DAC) protections, by which non-administrative users may (at their discretion) protect objects. In traditional UNIX systems, DAC protections include file permissions and access control lists; MAC protections include process controls preventing inter-user debugging and firewalls. A variety of MAC policies have been formulated by operating system designers and security researchers, including the Multi-Level Security (MLS) confidentiality policy, the Biba integrity policy, Role-Based Access Control (RBAC), Domain and Type Enforcement (DTE), and Type Enforcement (TE). Each model bases decisions on a variety of factors, including user identity, role, and security clearance, as well as security labels on objects representing concepts such as data sensitivity and integrity.

The TrustedBSD MAC Framework is capable of supporting policy modules that implement all of these policies, as well as a broad class of system hardening policies, which may use existing security attributes, such as user and group IDs, as well as extended attributes on files, and other system properties. In addition, despite the name, the MAC Framework can also be used to implement purely discretionary policies, as policy modules are given substantial flexibility in how they authorize protections.

MAC Framework Kernel Architecture

The TrustedBSD MAC Framework permits kernel modules to extend the operating system security policy, as well as providing infrastructure functionality required by many access control modules. If multiple policies are simultaneously loaded, the MAC Framework will usefully (for some definition of useful) compose the results of the policies.

Kernel Elements

The MAC Framework contains a number of kernel elements:

Framework management interfaces
Concurrency and synchronization primitives
Policy registration
Extensible security label for kernel objects
Policy entry point composition operators
Label management primitives
Entry point API invoked by kernel services
Entry point API to policy modules
Entry point implementations (policy life cycle, object life cycle/label management, access control checks)
Policy-agnostic label-management system calls
mac_syscall() multiplex system call
Various security policies implemented as MAC policy modules

Framework Management Interfaces

The TrustedBSD MAC Framework may be directly managed using sysctl's, loader tunables, and system calls. In most cases, sysctl's and loader tunables of the same name modify the same parameters, and control behavior such as enforcement of protections relating to various kernel subsystems. In addition, if MAC debugging support is compiled into the kernel, several counters will be maintained tracking label allocation. It is generally advisable that per-subsystem enforcement controls not be used to control policy behavior in production environments, as they broadly impact the operation of all active policies. Instead, per-policy controls should be preferred, as they provide greater granularity and greater operational consistency for policy modules. Loading and unloading of policy modules is performed using the system module management system calls and other system interfaces, including boot loader variables; policy modules will have the opportunity to influence load and unload events, including preventing undesired unloading of the policy.
Policy List Concurrency and Synchronization

As the set of active policies may change at run-time, and the invocation of entry points is non-atomic, synchronization is required to prevent loading or unloading of policies while an entry point invocation is in progress, freezing the set of active policies for the duration. This is accomplished by means of a framework busy count: whenever an entry point is entered, the busy count is incremented; whenever it is exited, the busy count is decremented. While the busy count is elevated, policy list changes are not permitted, and threads attempting to modify the policy list will sleep until the list is not busy. The busy count is protected by a mutex, and a condition variable is used to wake up sleepers waiting on policy list modifications. One side effect of this synchronization model is that recursion into the MAC Framework from within a policy module is permitted, although not generally used.

Various optimizations are used to reduce the overhead of the busy count, including avoiding the full cost of incrementing and decrementing if the list is empty or contains only static entries (policies that are loaded before the system starts, and cannot be unloaded). A compile-time option is also provided which prevents any change in the set of loaded policies at run-time, which eliminates the mutex locking costs associated with supporting dynamically loaded and unloaded policies as synchronization is no longer required.

As the MAC Framework is not permitted to block in some entry points, a normal sleep lock cannot be used; as a result, it is possible for the load or unload attempt to block for a substantial period of time waiting for the framework to become idle.

Label Synchronization

As kernel objects of interest may generally be accessed from more than one thread at a time, and simultaneous entry of more than one thread into the MAC Framework is permitted, security attribute storage maintained by the MAC Framework is carefully synchronized. In general, existing kernel synchronization on kernel object data is used to protect MAC Framework security labels on the object: for example, MAC labels on sockets are protected using the existing socket mutex. Likewise, semantics for concurrent access are generally identical to those of the container objects: for credentials, copy-on-write semantics are maintained for label contents as with the remainder of the credential structure. The MAC Framework asserts necessary locks on objects when invoked with an object reference. Policy authors must be aware of these synchronization semantics, as they will sometimes limit the types of accesses permitted on labels: for example, when a read-only reference to a credential is passed to a policy via an entry point, only read operations are permitted on the label state attached to the credential.

Policy Synchronization and Concurrency

Policy modules must be written to assume that many kernel threads may simultaneously enter one or more policy entry points due to the parallel and preemptive nature of the FreeBSD kernel. If the policy module makes use of mutable state, this may require the use of synchronization primitives within the policy to prevent inconsistent views on that state resulting in incorrect operation of the policy. Policies will generally be able to make use of existing FreeBSD synchronization primitives for this purpose, including mutexes, sleep locks, condition variables, and counting semaphores.
However, policies should be written to employ these primitives carefully, respecting existing kernel lock orders, and recognizing that some entry points are not permitted to sleep, limiting the use of primitives in those entry points to mutexes and wakeup operations. When policy modules call out to other kernel subsystems, they will generally need to release any in-policy locks in order to avoid violating the kernel lock order or risking lock recursion. This will maintain policy locks as leaf locks in the global lock order, helping to avoid deadlock.

Policy Registration

The MAC Framework maintains two lists of active policies: a static list, and a dynamic list. The lists differ only with regard to their locking semantics: an elevated reference count is not required to make use of the static list. When kernel modules containing MAC Framework policies are loaded, the policy module will use SYSINIT to invoke a registration function; when a policy module is unloaded, SYSINIT will likewise invoke a de-registration function. Registration may fail if a policy module is loaded more than once, if insufficient resources are available for the registration (for example, the policy might require labeling and insufficient labeling state might be available), or other policy prerequisites might not be met (some policies may only be loaded prior to boot). Likewise, de-registration may fail if a policy is flagged as not unloadable.

Entry Points

Kernel services interact with the MAC Framework in two ways: they invoke a series of APIs to notify the framework of relevant events, and they provide a policy-agnostic label structure pointer in security-relevant objects. The label pointer is maintained by the MAC Framework via label management entry points, and permits the Framework to offer a labeling service to policy modules through relatively non-invasive changes to the kernel subsystem maintaining the object. For example, label pointers have been added to processes, process credentials, sockets, pipes, vnodes, Mbufs, network interfaces, IP reassembly queues, and a variety of other security-relevant structures. Kernel services also invoke the MAC Framework when they perform important security decisions, permitting policy modules to augment those decisions based on their own criteria (possibly including data stored in security labels). Most of these security critical decisions will be explicit access control checks; however, some affect more general decision functions such as packet matching for sockets and label transition at program execution.

Policy Composition

When more than one policy module is loaded into the kernel at a time, the results of the policy modules will be composed by the framework using a composition operator. This operator is currently hard-coded, and requires that all active policies must approve a request for it to return success. As policies may return a variety of error conditions (success, access denied, object doesn't exist, ...), a precedence operator selects the resulting error from the set of errors returned by policies. In general, errors indicating that an object does not exist will be preferred to errors indicating that access to an object is denied. While it is not guaranteed that the resulting composition will be useful or secure, we've found that it is for many useful selections of policies. For example, traditional trusted systems often ship with two or more policies using a similar composition.
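The following is an illustrative precedence helper in the spirit of the composition just described; it is not the framework's actual source, and the function name is invented:

#include <errno.h>

static int
error_select_demo(int error1, int error2)
{
        /* "Object does not exist" takes precedence... */
        if (error1 == ENOENT || error2 == ENOENT)
                return (ENOENT);
        /* ...otherwise any failure wins over success. */
        if (error1 != 0)
                return (error1);
        return (error2);
}

Under this scheme a request succeeds only if every policy's entry point returned 0, since composing 0 with 0 yields 0.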
Labeling Support

As many interesting access control extensions rely on security labels on objects, the MAC Framework provides a set of policy-agnostic label management system calls covering a variety of user-exposed objects. Common label types include partition identifiers, sensitivity labels, integrity labels, compartments, domains, roles, and types. By policy agnostic, we mean that policy modules are able to completely define the semantics of meta-data associated with an object. Policy modules participate in the internalization and externalization of string-based labels provided by user applications, and can expose multiple label elements to applications if desired.

In-memory labels are stored in slab-allocated struct label, which consists of a fixed-length array of unions, each holding a void * pointer and a long. Policies registering for label storage will be assigned a "slot" identifier, which may be used to dereference the label storage. The semantics of the storage are left entirely up to the policy module: modules are provided with a variety of entry points associated with the kernel object life cycle, including initialization, association/creation, and destruction. Using these interfaces, it is possible to implement reference counting and other storage models. Direct access to the object structure is generally not required by policy modules to retrieve a label, as the MAC Framework generally passes both a pointer to the object and a direct pointer to the object's label into entry points. The primary exception to this rule is the process credential, which must be manually dereferenced to access the credential label. This may change in future revisions of the MAC Framework.

Initialization entry points frequently include a sleeping disposition flag indicating whether or not an initialization is permitted to sleep; if sleeping is not permitted, a failure may be returned to cancel allocation of the label (and hence object). This may occur, for example, in the network stack during interrupt handling, where sleeping is not permitted, or while the caller holds a mutex. Due to the performance cost of maintaining labels on in-flight network packets (Mbufs), policies must specifically declare a requirement that Mbuf labels be allocated. Dynamically loaded policies making use of labels must be able to handle the case where their init function has not been called on an object, as objects may already exist when the policy is loaded. The MAC Framework guarantees that uninitialized label slots will hold a 0 or NULL value, which policies may use to detect uninitialized values. However, as allocation of Mbuf labels is conditional, policies must also be able to handle a NULL label pointer for Mbufs if they have been loaded dynamically.

In the case of file system labels, special support is provided for the persistent storage of security labels in extended attributes. Where available, extended attribute transactions are used to permit consistent compound updates of security labels on vnodes--currently this support is present only in the UFS2 file system. Policy authors may choose to implement multilabel file system object labels using one (or more) extended attributes. For efficiency reasons, the vnode label (v_label) is a cache of any on-disk label; policies are able to load values into the cache when the vnode is instantiated, and update the cache as needed. As a result, the extended attribute need not be directly accessed with every access control check.
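A hedged sketch of slot-based label storage follows. The LABEL_TO_SLOT() accessor and the M_MACTEMP malloc type are assumed from the early framework, and all "mypolicy" names are invented for the example:

#include <sys/param.h>
#include <sys/malloc.h>
#include <sys/mac.h>
#include <sys/mac_policy.h>

static int mypolicy_slot;               /* assigned at registration */

#define SLOT(l) (LABEL_TO_SLOT((l), mypolicy_slot).l_ptr)

struct mypolicy_label {
        int     ml_sensitivity;         /* whatever the policy stores */
};

static void
mypolicy_init_cred_label(struct label *label)
{
        /* Sleeping is permitted when initializing credential labels. */
        SLOT(label) = malloc(sizeof(struct mypolicy_label), M_MACTEMP,
            M_WAITOK | M_ZERO);
}

static void
mypolicy_destroy_cred_label(struct label *label)
{
        free(SLOT(label), M_MACTEMP);
        SLOT(label) = NULL;             /* uninitialized slots hold NULL */
}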
Currently, if a labeled policy permits dynamic unloading, its state slot cannot be reclaimed, which places a strict (and relatively low) bound on the number of unload-reload operations for labeled policies.

System Calls

The MAC Framework implements a number of system calls: most of these calls support the policy-agnostic label retrieval and manipulation APIs exposed to user applications. The label management calls accept a label description structure, struct mac, which contains a series of MAC label elements. Each element contains a character string name, and character string value. Each policy will be given the chance to claim a particular element name, permitting policies to expose multiple independent elements if desired. Policy modules perform the internalization and externalization between kernel labels and user-provided labels via entry points, permitting a variety of semantics. Label management system calls are generally wrapped by user library functions to perform memory allocation and error handling, simplifying user applications that must manage labels.

The following MAC-related system calls are present in the FreeBSD kernel:

mac_get_proc() may be used to retrieve the label of the current process.

mac_set_proc() may be used to request a change in the label of the current process.

mac_get_fd() may be used to retrieve the label of an object (file, socket, pipe, ...) referenced by a file descriptor.

mac_get_file() may be used to retrieve the label of an object referenced by a file system path.

mac_set_fd() may be used to request a change in the label of an object (file, socket, pipe, ...) referenced by a file descriptor.

mac_set_file() may be used to request a change in the label of an object referenced by a file system path.

mac_syscall() permits policy modules to create new system calls without modifying the system call table; it accepts a target policy name, operation number, and opaque argument for use by the policy.

mac_get_pid() may be used to request the label of another process by process id.

mac_get_link() is identical to mac_get_file(), only it will not follow a symbolic link if it is the final entry in the path, so may be used to retrieve the label on a symlink.

mac_set_link() is identical to mac_set_file(), only it will not follow a symbolic link if it is the final entry in a path, so may be used to manipulate the label on a symlink.

mac_execve() is identical to the execve() system call, only it also accepts a requested label to set the process label to when beginning execution of a new program. This change in label on execution is referred to as a "transition".

mac_get_peer(), actually implemented via a socket option, retrieves the label of a remote peer on a socket, if available.

In addition to these system calls, the SIOCGIFMAC and SIOCSIFMAC network interface ioctls permit the labels on network interfaces to be retrieved and set.
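As a usage example, a small userland program can retrieve and print its own label using the mac(3) library wrappers around these system calls (error handling simplified):

#include <sys/mac.h>
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
        mac_t label;
        char *text;

        /* Prepare a label buffer for all process-relevant elements. */
        if (mac_prepare_process_label(&label) == -1) {
                perror("mac_prepare_process_label");
                return (1);
        }
        /* Fetch the current process label, then externalize it. */
        if (mac_get_proc(label) == -1) {
                perror("mac_get_proc");
                return (1);
        }
        if (mac_to_text(label, &text) == -1) {
                perror("mac_to_text");
                return (1);
        }
        printf("%s\n", text);
        free(text);
        mac_free(label);
        return (0);
}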
Optional implementation of policy life cycle events, such as initialization and destruction. Optional support for initializing, maintaining, and destroying labels on selected kernel objects. Optional support for user process inspection and modification of labels on selected objects. Implementation of selected access control entry points that are of interest to the policy. Declaration of policy identity, module entry points, and policy properties. Policy Declaration Modules may be declared using the MAC_POLICY_SET() macro, which names the policy, provides a reference to the MAC entry point vector, provides load-time flags determining how the policy framework should handle the policy, and optionally requests the allocation of label state by the framework. static struct mac_policy_ops mac_policy_ops = { .mpo_destroy = mac_policy_destroy, .mpo_init = mac_policy_init, .mpo_init_bpfdesc_label = mac_policy_init_bpfdesc_label, .mpo_init_cred_label = mac_policy_init_label, /* ... */ .mpo_check_vnode_setutimes = mac_policy_check_vnode_setutimes, .mpo_check_vnode_stat = mac_policy_check_vnode_stat, .mpo_check_vnode_write = mac_policy_check_vnode_write, }; The MAC policy entry point vector, mac_policy_ops in this example, associates functions defined in the module with specific entry points. A complete listing of available entry points and their prototypes may be found in the MAC entry point reference section. Of specific interest during module registration are the .mpo_destroy and .mpo_init entry points. .mpo_init will be invoked once a policy is successfully registered with the module framework but prior to any other entry points becoming active. This permits the policy to perform any policy-specific allocation and initialization, such as initialization of any data or locks. .mpo_destroy will be invoked when a policy module is unloaded to permit releasing of any allocated memory and destruction of locks. Currently, these two entry points are invoked with the MAC policy list mutex held to prevent any other entry points from being invoked: this will be changed, but in the mean time, policies should be careful about what kernel primitives they invoke so as to avoid lock ordering or sleeping problems. The policy declaration's module name field exists so that the module may be uniquely identified for the purposes of module dependencies. An appropriate string should be selected. The full string name of the policy is displayed to the user via the kernel log during load and unload events, and also exported when providing status information to userland processes. Policy Flags The policy declaration flags field permits the module to provide the framework with information about its capabilities at the time the module is loaded. Currently, three flags are defined: MPC_LOADTIME_FLAG_UNLOADOK This flag indicates that the policy module may be unloaded. If this flag is not provided, then the policy framework will reject requests to unload the module. This flag might be used by modules that allocate label state and are unable to free that state at runtime. MPC_LOADTIME_FLAG_NOTLATE This flag indicates that the policy module must be loaded and initialized early in the boot process. If the flag is specified, attempts to register the module following boot will be rejected. The flag may be used by policies that require pervasive labeling of all system objects, and cannot handle objects that have not been properly initialized by the policy. 
MPC_LOADTIME_FLAG_LABELMBUFS This flag indicates that the policy module requires labeling of Mbufs, and that memory should always be allocated for the storage of Mbuf labels. By default, the MAC Framework will not allocate label storage for Mbufs unless at least one loaded policy has this flag set. This measurably improves network performance when policies do not require Mbuf labeling. A kernel option, MAC_ALWAYS_LABEL_MBUF, exists to force the MAC Framework to allocate Mbuf label storage regardless of the setting of this flag, and may be useful in some environments.

Policies using the MPC_LOADTIME_FLAG_LABELMBUFS flag without the MPC_LOADTIME_FLAG_NOTLATE flag set must be able to correctly handle NULL Mbuf label pointers passed into entry points. This is necessary as in-flight Mbufs without label storage may persist after a policy enabling Mbuf labeling has been loaded. If a policy is loaded before the network subsystem is active (i.e., the policy is not being loaded late), then all Mbufs are guaranteed to have label storage.

Policy Entry Points

Four classes of entry points are offered to policies registered with the framework: entry points associated with the registration and management of policies, entry points denoting initialization, creation, destruction, and other life cycle events for kernel objects, events associated with access control decisions that the policy module may influence, and calls associated with the management of labels on objects. In addition, a mac_syscall() entry point is provided so that policies may extend the kernel interface without registering new system calls.

Policy module writers should be aware of the kernel locking strategy, as well as what object locks are available during which entry points. Writers should attempt to avoid deadlock scenarios by avoiding grabbing non-leaf locks inside of entry points, and also follow the locking protocol for object access and modification. In particular, writers should be aware that while necessary locks to access objects and their labels are generally held, sufficient locks to modify an object or its label may not be present for all entry points. Locking information for arguments is documented in the MAC framework entry point document.

Policy entry points will pass a reference to the object label along with the object itself. This permits labeled policies to be unaware of the internals of the object yet still make decisions based on the label. The exception to this is the process credential, which is assumed to be understood by policies as a first class security object in the kernel.

MAC Policy Entry Point Reference

General-Purpose Module Entry Points

<function>&mac.mpo;_init</function> void &mac.mpo;_init struct mac_policy_conf *conf &mac.thead; conf MAC policy definition

Policy load event. The policy list mutex is held, so sleep operations cannot be performed, and calls out to other kernel subsystems must be made with caution. If potentially sleeping memory allocations are required during policy initialization, they should be made using a separate module SYSINIT().

<function>&mac.mpo;_destroy</function> void &mac.mpo;_destroy struct mac_policy_conf *conf &mac.thead; conf MAC policy definition

Policy unload event. The policy list mutex is held, so caution should be applied.
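Here is a hedged sketch tying the declaration and these two life-cycle entry points together, assuming the 5.x-era MAC_POLICY_SET() argument order; all "mypolicy" names are invented for the example:

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/mac_policy.h>

static struct mtx mypolicy_mtx;

static void
mypolicy_init(struct mac_policy_conf *conf)
{
        /*
         * The policy list mutex is held here, so defer anything that
         * might sleep to a separate module SYSINIT().
         */
        mtx_init(&mypolicy_mtx, "mypolicy lock", NULL, MTX_DEF);
}

static void
mypolicy_destroy(struct mac_policy_conf *conf)
{
        mtx_destroy(&mypolicy_mtx);
}

static struct mac_policy_ops mypolicy_ops = {
        .mpo_init = mypolicy_init,
        .mpo_destroy = mypolicy_destroy,
};

MAC_POLICY_SET(&mypolicy_ops, mac_mypolicy, "Example Policy",
    MPC_LOADTIME_FLAG_UNLOADOK, NULL);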
<function>&mac.mpo;_syscall</function> int &mac.mpo;_syscall struct thread *td int call void *arg &mac.thead; td Calling thread call Policy-specific syscall number arg Pointer to syscall arguments

This entry point provides a policy-multiplexed system call so that policies may provide additional services to user processes without registering specific system calls. The policy name provided during registration is used to demux calls from userland, and the arguments will be forwarded to this entry point. When implementing new services, security modules should be sure to invoke appropriate access control checks from the MAC framework as needed. For example, if a policy implements an augmented signal functionality, it should call the necessary signal access control checks to invoke the MAC framework and other registered policies. Modules must currently perform the copyin() of the syscall data on their own.

<function>&mac.mpo;_thread_userret</function> void &mac.mpo;_thread_userret struct thread *td &mac.thead; td Returning thread

This entry point permits policy modules to perform MAC-related events when a thread returns to user space, via a system call return, trap return, or otherwise. This is required for policies that have floating process labels, as it is not always possible to acquire the process lock at arbitrary points in the stack during system call processing; process labels might represent traditional authentication data, process history information, or other data. To employ this mechanism, intended changes to the process credential label may be stored in the p_label protected by a per-policy spin lock, and then set the per-thread TDF_ASTPENDING flag and per-process PS_MACPEND flag to schedule a call to the userret entry point. From this entry point, the policy may create a replacement credential with less concern about the locking context. Policy writers are cautioned that event ordering relating to scheduling an AST and the AST being performed may be complex and interlaced in multithreaded applications.

Label Operations

<function>&mac.mpo;_init_bpfdesc_label</function> void &mac.mpo;_init_bpfdesc_label struct label *label &mac.thead; label New label to apply

Initialize the label on a newly instantiated bpfdesc (BPF descriptor). Sleeping is permitted.

<function>&mac.mpo;_init_cred_label</function> void &mac.mpo;_init_cred_label struct label *label &mac.thead; label New label to initialize

Initialize the label for a newly instantiated user credential. Sleeping is permitted.

<function>&mac.mpo;_init_devfsdirent_label</function> void &mac.mpo;_init_devfsdirent_label struct label *label &mac.thead; label New label to apply

Initialize the label on a newly instantiated devfs entry. Sleeping is permitted.

<function>&mac.mpo;_init_ifnet_label</function> void &mac.mpo;_init_ifnet_label struct label *label &mac.thead; label New label to apply

Initialize the label on a newly instantiated network interface. Sleeping is permitted.

<function>&mac.mpo;_init_ipq_label</function> void &mac.mpo;_init_ipq_label struct label *label int flag &mac.thead; label New label to apply flag Sleeping/non-sleeping &man.malloc.9;; see below

Initialize the label on a newly instantiated IP fragment reassembly queue. The flag field may be one of M_WAITOK and M_NOWAIT, and should be employed to avoid performing a sleeping &man.malloc.9; during this initialization call.
IP fragment reassembly queue allocation frequently occurs in performance sensitive environments, and the implementation should be careful to avoid sleeping or long-lived operations. This entry point is permitted to fail resulting in the failure to allocate the IP fragment reassembly queue. <function>&mac.mpo;_init_mbuf_label</function> void &mac.mpo;_init_mbuf_label int flag struct label *label &mac.thead; flag Sleeping/non-sleeping &man.malloc.9;; see below label Policy label to initialize Initialize the label on a newly instantiated mbuf packet header (mbuf). The flag field may be one of M_WAITOK and M_NOWAIT, and should be employed to avoid performing a sleeping &man.malloc.9; during this initialization call. Mbuf allocation frequently occurs in performance sensitive environments, and the implementation should be careful to avoid sleeping or long-lived operations. This entry point is permitted to fail resulting in the failure to allocate the mbuf header. <function>&mac.mpo;_init_mount_label</function> void &mac.mpo;_init_mount_label struct label *mntlabel struct label *fslabel &mac.thead; mntlabel Policy label to be initialized for the mount itself fslabel Policy label to be initialized for the file system Initialize the labels on a newly instantiated mount point. Sleeping is permitted. <function>&mac.mpo;_init_mount_fs_label</function> void &mac.mpo;_init_mount_fs_label struct label *label &mac.thead; label Label to be initialized Initialize the label on a newly mounted file system. Sleeping is permitted <function>&mac.mpo;_init_pipe_label</function> void &mac.mpo;_init_pipe_label struct label*label &mac.thead; label Label to be filled in Initialize a label for a newly instantiated pipe. Sleeping is permitted. <function>&mac.mpo;_init_socket_label</function> void &mac.mpo;_init_socket_label struct label *label int flag &mac.thead; label New label to initialize flag &man.malloc.9; flags Initialize a label for a newly instantiated socket. The flag field may be one of M_WAITOK and M_NOWAIT, and should be employed to avoid performing a sleeping &man.malloc.9; during this initialization call. <function>&mac.mpo;_init_socket_peer_label</function> void &mac.mpo;_init_socket_peer_label struct label *label int flag &mac.thead; label New label to initialize flag &man.malloc.9; flags Initialize the peer label for a newly instantiated socket. The flag field may be one of M_WAITOK and M_NOWAIT, and should be employed to avoid performing a sleeping &man.malloc.9; during this initialization call. <function>&mac.mpo;_init_proc_label</function> void &mac.mpo;_init_proc_label struct label *label &mac.thead; label New label to initialize Initialize the label for a newly instantiated process. Sleeping is permitted. <function>&mac.mpo;_init_vnode_label</function> void &mac.mpo;_init_vnode_label struct label *label &mac.thead; label New label to initialize Initialize the label on a newly instantiated vnode. Sleeping is permitted. <function>&mac.mpo;_destroy_bpfdesc_label</function> void &mac.mpo;_destroy_bpfdesc_label struct label *label &mac.thead; label bpfdesc label Destroy the label on a BPF descriptor. In this entry point a policy should free any internal storage associated with label so that it may be destroyed. <function>&mac.mpo;_destroy_cred_label</function> void &mac.mpo;_destroy_cred_label struct label *label &mac.thead; label Label being destroyed Destroy the label on a credential. 
In this entry point, a policy module should free any internal storage associated with label so that it may be destroyed. <function>&mac.mpo;_destroy_devfsdirent_label</function> void &mac.mpo;_destroy_devfsdirent_label struct label *label &mac.thead; label Label being destroyed Destroy the label on a devfs entry. In this entry point, a policy module should free any internal storage associated with label so that it may be destroyed. <function>&mac.mpo;_destroy_ifnet_label</function> void &mac.mpo;_destroy_ifnet_label struct label *label &mac.thead; label Label being destroyed Destroy the label on a removed interface. In this entry point, a policy module should free any internal storage associated with label so that it may be destroyed. <function>&mac.mpo;_destroy_ipq_label</function> void &mac.mpo;_destroy_ipq_label struct label *label &mac.thead; label Label being destroyed Destroy the label on an IP fragment queue. In this entry point, a policy module should free any internal storage associated with label so that it may be destroyed. <function>&mac.mpo;_destroy_mbuf_label</function> void &mac.mpo;_destroy_mbuf_label struct label *label &mac.thead; label Label being destroyed Destroy the label on an mbuf header. In this entry point, a policy module should free any internal storage associated with label so that it may be destroyed. <function>&mac.mpo;_destroy_mount_label</function> void &mac.mpo;_destroy_mount_label struct label *label &mac.thead; label Mount point label being destroyed Destroy the label on a mount point. In this entry point, a policy module should free any internal storage associated with label so that it may be destroyed. <function>&mac.mpo;_destroy_mount_label</function> void &mac.mpo;_destroy_mount_label struct label *mntlabel struct label *fslabel &mac.thead; mntlabel Mount point label being destroyed fslabel File system label being destroyed Destroy the labels on a mount point. In this entry point, a policy module should free the internal storage associated with mntlabel and fslabel so that they may be destroyed. <function>&mac.mpo;_destroy_socket_label</function> void &mac.mpo;_destroy_socket_label struct label *label &mac.thead; label Socket label being destroyed Destroy the label on a socket. In this entry point, a policy module should free any internal storage associated with label so that it may be destroyed. <function>&mac.mpo;_destroy_socket_peer_label</function> void &mac.mpo;_destroy_socket_peer_label struct label *peerlabel &mac.thead; peerlabel Socket peer label being destroyed Destroy the peer label on a socket. In this entry point, a policy module should free any internal storage associated with peerlabel so that it may be destroyed. <function>&mac.mpo;_destroy_pipe_label</function> void &mac.mpo;_destroy_pipe_label struct label *label &mac.thead; label Pipe label Destroy the label on a pipe. In this entry point, a policy module should free any internal storage associated with label so that it may be destroyed. <function>&mac.mpo;_destroy_proc_label</function> void &mac.mpo;_destroy_proc_label struct label *label &mac.thead; label Process label Destroy the label on a process. In this entry point, a policy module should free any internal storage associated with label so that it may be destroyed. <function>&mac.mpo;_destroy_vnode_label</function> void &mac.mpo;_destroy_vnode_label struct label *label &mac.thead; label Vnode label Destroy the label on a vnode.
In this entry point, a policy module should free any internal storage associated with label so that it may be destroyed. <function>&mac.mpo;_copy_mbuf_label</function> void &mac.mpo;_copy_mbuf_label struct label *src struct label *dest &mac.thead; src Source label dest Destination label Copy the label information in src into dest. <function>&mac.mpo;_copy_pipe_label</function> void &mac.mpo;_copy_pipe_label struct label *src struct label *dest &mac.thead; src Source label dest Destination label Copy the label information in src into dest. <function>&mac.mpo;_copy_vnode_label</function> void &mac.mpo;_copy_vnode_label struct label *src struct label *dest &mac.thead; src Source label dest Destination label Copy the label information in src into dest. <function>&mac.mpo;_externalize_cred_label</function> int &mac.mpo;_externalize_cred_label &mac.externalize.paramdefs; &mac.thead; &mac.externalize.tbody; &mac.externalize.para; <function>&mac.mpo;_externalize_ifnet_label</function> int &mac.mpo;_externalize_ifnet_label &mac.externalize.paramdefs; &mac.thead; &mac.externalize.tbody; &mac.externalize.para; <function>&mac.mpo;_externalize_pipe_label</function> int &mac.mpo;_externalize_pipe_label &mac.externalize.paramdefs; &mac.thead; &mac.externalize.tbody; &mac.externalize.para; <function>&mac.mpo;_externalize_socket_label</function> int &mac.mpo;_externalize_socket_label &mac.externalize.paramdefs; &mac.thead; &mac.externalize.tbody; &mac.externalize.para; <function>&mac.mpo;_externalize_socket_peer_label</function> int &mac.mpo;_externalize_socket_peer_label &mac.externalize.paramdefs; &mac.thead; &mac.externalize.tbody; &mac.externalize.para; <function>&mac.mpo;_externalize_vnode_label</function> int &mac.mpo;_externalize_vnode_label &mac.externalize.paramdefs; &mac.thead; &mac.externalize.tbody; &mac.externalize.para; <function>&mac.mpo;_internalize_cred_label</function> int &mac.mpo;_internalize_cred_label &mac.internalize.paramdefs; &mac.thead; &mac.internalize.tbody; &mac.internalize.para; <function>&mac.mpo;_internalize_ifnet_label</function> int &mac.mpo;_internalize_ifnet_label &mac.internalize.paramdefs; &mac.thead; &mac.internalize.tbody; &mac.internalize.para; <function>&mac.mpo;_internalize_pipe_label</function> int &mac.mpo;_internalize_pipe_label &mac.internalize.paramdefs; &mac.thead; &mac.internalize.tbody; &mac.internalize.para; <function>&mac.mpo;_internalize_socket_label</function> int &mac.mpo;_internalize_socket_label &mac.internalize.paramdefs; &mac.thead; &mac.internalize.tbody; &mac.internalize.para; <function>&mac.mpo;_internalize_vnode_label</function> int &mac.mpo;_internalize_vnode_label &mac.internalize.paramdefs; &mac.thead; &mac.internalize.tbody; &mac.internalize.para; Label Events This class of entry points is used by the MAC framework to permit policies to maintain label information on kernel objects. For each labeled kernel object of interest to a MAC policy, entry points may be registered for relevant life cycle events. All objects implement initialization, creation, and destruction hooks. Some objects will also implement relabeling, allowing user processes to change the labels on objects. Some objects will also implement object-specific events, such as label events associated with IP reassembly. A typical labeled object will have the following life cycle of entry points: Label initialization o (object-specific wait) \ Label creation o \ Relabel events, o--<--.
Various object-specific, | | Access control events ~-->--o \ Label destruction o Label initialization permits policies to allocate memory and set initial values for labels without context for the use of the object. The label slot allocated to a policy will be zeroed by default, so some policies may not need to perform initialization. Label creation occurs when the kernel structure is associated with an actual kernel object. For example, mbufs may be allocated and remain unused in a pool until they are required. Mbuf allocation causes label initialization on the mbuf to take place, but mbuf creation occurs when the mbuf is associated with a datagram. Typically, context will be provided for a creation event, including the circumstances of the creation, and labels of other relevant objects in the creation process. For example, when an mbuf is created from a socket, the socket and its label will be presented to registered policies in addition to the new mbuf and its label. Memory allocation in creation events is discouraged, as it may occur in performance sensitive parts of the kernel; in addition, creation calls are not permitted to fail, so a failure to allocate memory cannot be reported. Object-specific events do not generally fall into the other broad classes of label events, but will generally provide an opportunity to modify or update the label on an object based on additional context. For example, the label on an IP fragment reassembly queue may be updated during the MAC_UPDATE_IPQ entry point as a result of the acceptance of an additional mbuf to that queue. Access control events are discussed in detail in the following section. Label destruction permits policies to release storage or state associated with a label during its association with an object so that the kernel data structures supporting the object may be reused or released. In addition to labels associated with specific kernel objects, an additional class of labels exists: temporary labels. These labels are used to store update information submitted by user processes. These labels are initialized and destroyed as with other label types, but the creation event is MAC_INTERNALIZE, which accepts a user label to be converted to an in-kernel representation. File System Object Labeling Event Operations <function>&mac.mpo;_associate_vnode_devfs</function> void &mac.mpo;_associate_vnode_devfs struct mount *mp struct label *fslabel struct devfs_dirent *de struct label *delabel struct vnode *vp struct label *vlabel &mac.thead; mp Devfs mount point fslabel Devfs file system label (mp->mnt_fslabel) de Devfs directory entry delabel Policy label associated with de vp vnode associated with de vlabel Policy label associated with vp Fill in the label (vlabel) for a newly created devfs vnode based on the devfs directory entry passed in de and its label. <function>&mac.mpo;_associate_vnode_extattr</function> int &mac.mpo;_associate_vnode_extattr struct mount *mp struct label *fslabel struct vnode *vp struct label *vlabel &mac.thead; mp File system mount point fslabel File system label vp Vnode to label vlabel Policy label associated with vp Attempt to retrieve the label for vp from the file system extended attributes. Upon success, the value 0 is returned. Should extended attribute retrieval not be supported, an accepted fallback is to copy fslabel into vlabel. In the event of an error, an appropriate value for errno should be returned.
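To make the initialization and destruction ends of this life cycle concrete, the following fragment sketches how a policy might back a label slot with allocated storage, honoring the M_WAITOK/M_NOWAIT flag discussed earlier. This is an illustrative sketch rather than code from a shipped policy: the mypolicy_slot_set() and mypolicy_slot_get() accessors and the struct mypolicy_data contents are hypothetical, and the initialization entry point is shown returning an error code since the text above permits it to fail when M_NOWAIT is passed.

<programlisting>
static MALLOC_DEFINE(M_MYPOLICY, "mypolicy", "mypolicy label storage");

struct mypolicy_data {
	int	md_level;		/* policy-specific label state */
};

static int
mypolicy_init_mbuf_label(int flag, struct label *label)
{
	struct mypolicy_data *d;

	/* flag is M_WAITOK or M_NOWAIT; honoring it keeps this
	 * allocation from sleeping in performance sensitive paths. */
	d = malloc(sizeof(*d), M_MYPOLICY, flag | M_ZERO);
	if (d == NULL)
		return (ENOMEM);	/* only possible with M_NOWAIT */
	mypolicy_slot_set(label, d);	/* hypothetical slot accessor */
	return (0);
}

static void
mypolicy_destroy_mbuf_label(struct label *label)
{
	/* Release our storage so the label slot may be reused. */
	free(mypolicy_slot_get(label), M_MYPOLICY);
	mypolicy_slot_set(label, NULL);
}
</programlisting>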
<function>&mac.mpo;_associate_vnode_singlelabel</function> void &mac.mpo;_associate_vnode_singlelabel struct mount *mp struct label *fslabel struct vnode *vp struct label *vlabel &mac.thead; mp File system mount point fslabel File system label vp Vnode to label vlabel Policy label associated with vp On non-multilabel file systems, this entry point is called to set the policy label for vp based on the file system label, fslabel. <function>&mac.mpo;_create_devfs_device</function> void &mac.mpo;_create_devfs_device dev_t dev struct devfs_dirent *devfs_dirent struct label *label &mac.thead; dev Device corresponding with devfs_dirent devfs_dirent Devfs directory entry to be labeled. label Label for devfs_dirent to be filled in. Fill out the label on a devfs_dirent being created for the passed device. This call will be made when the device file system is mounted, regenerated, or a new device is made available. <function>&mac.mpo;_create_devfs_directory</function> void &mac.mpo;_create_devfs_directory char *dirname int dirnamelen struct devfs_dirent *devfs_dirent struct label *label &mac.thead; dirname Name of directory being created dirnamelen Length of string dirname devfs_dirent Devfs directory entry for directory being created. label Label for devfs_dirent to be filled in. Fill out the label on a devfs_dirent being created for the passed directory. This call will be made when the device file system is mounted, regenerated, or a new device requiring a specific directory hierarchy is made available. <function>&mac.mpo;_create_devfs_symlink</function> void &mac.mpo;_create_devfs_symlink struct ucred *cred struct mount *mp struct devfs_dirent *dd struct label *ddlabel struct devfs_dirent *de struct label *delabel &mac.thead; cred Subject credential mp Devfs mount point dd Link destination ddlabel Label associated with dd de Symlink entry delabel Label associated with de Fill in the label (delabel) for a newly created &man.devfs.5; symbolic link entry. <function>&mac.mpo;_create_vnode_extattr</function> int &mac.mpo;_create_vnode_extattr struct ucred *cred struct mount *mp struct label *fslabel struct vnode *dvp struct label *dlabel struct vnode *vp struct label *vlabel struct componentname *cnp &mac.thead; cred Subject credential mp File system mount point fslabel File system label dvp Parent directory vnode dlabel Label associated with dvp vp Newly created vnode vlabel Policy label associated with vp cnp Component name for vp Write out the label for vp to the appropriate extended attribute. If the write succeeds, fill in vlabel with the label, and return 0. Otherwise, return an appropriate error. <function>&mac.mpo;_create_mount</function> void &mac.mpo;_create_mount struct ucred *cred struct mount *mp struct label *mntlabel struct label *fslabel &mac.thead; cred Subject credential mp Object; file system being mounted mntlabel Policy label to be filled in for mp fslabel Policy label for the file system mp mounts. Fill out the labels on the mount point being created by the passed subject credential. This call will be made when a new file system is mounted. <function>&mac.mpo;_create_root_mount</function> void &mac.mpo;_create_root_mount struct ucred *cred struct mount *mp struct label *mntlabel struct label *fslabel &mac.thead; See &mac.mpo;_create_mount. Fill out the labels on the mount point being created by the passed subject credential. This call will be made when the root file system is mounted, after &mac.mpo;_create_mount;.
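As an illustration of the extended attribute entry points above, a policy that encodes its label in a small fixed-size value might implement &mac.mpo;_create_vnode_extattr roughly as follows. This is a sketch under stated assumptions: the attribute name, the integer encoding, and the mypolicy_*() helpers are invented for the example, while vn_extattr_set(), EXTATTR_NAMESPACE_SYSTEM, and IO_NODELOCKED are standard kernel interfaces.

<programlisting>
#define	MYPOLICY_EXTATTR_NAME	"mypolicy"	/* hypothetical */

static int
mypolicy_create_vnode_extattr(struct ucred *cred, struct mount *mp,
    struct label *fslabel, struct vnode *dvp, struct label *dlabel,
    struct vnode *vp, struct label *vlabel, struct componentname *cnp)
{
	int error, val;

	/* Derive the new file's label from the creating credential;
	 * mypolicy_cred_value() is a hypothetical accessor. */
	val = mypolicy_cred_value(cred);
	error = vn_extattr_set(vp, IO_NODELOCKED,
	    EXTATTR_NAMESPACE_SYSTEM, MYPOLICY_EXTATTR_NAME,
	    sizeof(val), (char *)&val, curthread);
	if (error)
		return (error);		/* leave vlabel untouched */
	mypolicy_slot_set_value(vlabel, val);	/* fill in on success */
	return (0);
}
</programlisting>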
<function>&mac.mpo;_relabel_vnode</function> void &mac.mpo;_relabel_vnode struct ucred *cred struct vnode *vp struct label *vnodelabel struct label *newlabel &mac.thead; cred Subject credential vp vnode to relabel vnodelabel Existing policy label for vp newlabel New, possibly partial label to replace vnodelabel Update the label on the passed vnode given the passed update vnode label and the passed subject credential. <function>&mac.mpo;_setlabel_vnode_extattr</function> int &mac.mpo;_setlabel_vnode_extattr struct ucred *cred struct vnode *vp struct label *vlabel struct label *intlabel &mac.thead; cred Subject credential vp Vnode for which the label is being written vlabel Policy label associated with vp intlabel Label to write out Write out the policy label intlabel to an extended attribute. This is called from vop_stdcreatevnode_ea. <function>&mac.mpo;_update_devfsdirent</function> void &mac.mpo;_update_devfsdirent struct devfs_dirent *devfs_dirent struct label *direntlabel struct vnode *vp struct label *vnodelabel &mac.thead; devfs_dirent Object; devfs directory entry direntlabel Policy label for devfs_dirent to be updated. vp Devfs vnode Locked vnodelabel Policy label for vp Update the devfs_dirent label from the passed devfs vnode label. This call will be made when a devfs vnode has been successfully relabeled to commit the label change such that it lasts even if the vnode is recycled. It will also be made when a symlink is created in devfs, following a call to mac_vnode_create_from_vnode to initialize the vnode label. IPC Object Labeling Event Operations <function>&mac.mpo;_create_mbuf_from_socket</function> void &mac.mpo;_create_mbuf_from_socket struct socket *so struct label *socketlabel struct mbuf *m struct label *mbuflabel &mac.thead; so Socket Socket locking WIP socketlabel Policy label for so m Object; mbuf mbuflabel Policy label to fill in for m Set the label on a newly created mbuf header from the passed socket label. This call is made when a new datagram or message is generated by the socket and stored in the passed mbuf. <function>&mac.mpo;_create_pipe</function> void &mac.mpo;_create_pipe struct ucred *cred struct pipe *pipe struct label *pipelabel &mac.thead; cred Subject credential pipe Pipe pipelabel Policy label associated with pipe Set the label on a newly created pipe from the passed subject credential. This call is made when a new pipe is created. <function>&mac.mpo;_create_socket</function> void &mac.mpo;_create_socket struct ucred *cred struct socket *so struct label *socketlabel &mac.thead; cred Subject credential Immutable so Object; socket to label socketlabel Label to fill in for so Set the label on a newly created socket from the passed subject credential. This call is made when a socket is created. <function>&mac.mpo;_create_socket_from_socket</function> void &mac.mpo;_create_socket_from_socket struct socket *oldsocket struct label *oldsocketlabel struct socket *newsocket struct label *newsocketlabel &mac.thead; oldsocket Listening socket oldsocketlabel Policy label associated with oldsocket newsocket New socket newsocketlabel Policy label associated with newsocket Label a socket, newsocket, newly &man.accept.2;ed, based on the &man.listen.2; socket, oldsocket.
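Creation events such as &mac.mpo;_create_socket_from_socket are not permitted to fail, so the typical implementation is a straightforward copy of label state from the context object. A sketch, with mypolicy_slot_copy() standing in for a policy's own copy helper:

<programlisting>
static void
mypolicy_create_socket_from_socket(struct socket *oldsocket,
    struct label *oldsocketlabel, struct socket *newsocket,
    struct label *newsocketlabel)
{
	/* The accepted socket inherits the listen socket's label. */
	mypolicy_slot_copy(oldsocketlabel, newsocketlabel);
}
</programlisting>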
<function>&mac.mpo;_relabel_pipe</function> void &mac.mpo;_relabel_pipe struct ucred *cred struct pipe *pipe struct label *oldlabel struct label *newlabel &mac.thead; cred Subject credential pipe Pipe oldlabel Current policy label associated with pipe newlabel Policy label update to apply to pipe Apply a new label, newlabel, to pipe. <function>&mac.mpo;_relabel_socket</function> void &mac.mpo;_relabel_socket struct ucred *cred struct socket *so struct label *oldlabel struct label *newlabel &mac.thead; cred Subject credential Immutable so Object; socket oldlabel Current label for so newlabel Label update for so Update the label on a socket from the passed socket label update. <function>&mac.mpo;_set_socket_peer_from_mbuf</function> void &mac.mpo;_set_socket_peer_from_mbuf struct mbuf *mbuf struct label *mbuflabel struct label *oldlabel struct label *newlabel &mac.thead; mbuf First datagram received over socket mbuflabel Label for mbuf oldlabel Current label for the socket newlabel Policy label to be filled out for the socket Set the peer label on a stream socket from the passed mbuf label. This call will be made when the first datagram is received by the stream socket, with the exception of Unix domain sockets. <function>&mac.mpo;_set_socket_peer_from_socket</function> void &mac.mpo;_set_socket_peer_from_socket struct socket *oldsocket struct label *oldsocketlabel struct socket *newsocket struct label *newsocketpeerlabel &mac.thead; oldsocket Local socket oldsocketlabel Policy label for oldsocket newsocket Peer socket newsocketpeerlabel Policy label to fill in for newsocket Set the peer label on a stream UNIX domain socket from the passed remote socket endpoint. This call will be made when the socket pair is connected, and will be made for both endpoints. Network Object Labeling Event Operations <function>&mac.mpo;_create_bpfdesc</function> void &mac.mpo;_create_bpfdesc struct ucred *cred struct bpf_d *bpf_d struct label *bpflabel &mac.thead; cred Subject credential Immutable bpf_d Object; bpf descriptor bpflabel Policy label to be filled in for bpf_d Set the label on a newly created BPF descriptor from the passed subject credential. This call will be made when a BPF device node is opened by a process with the passed subject credential. <function>&mac.mpo;_create_ifnet</function> void &mac.mpo;_create_ifnet struct ifnet *ifnet struct label *ifnetlabel &mac.thead; ifnet Network interface ifnetlabel Policy label to fill in for ifnet Set the label on a newly created interface. This call may be made when a new physical interface becomes available to the system, or when a pseudo-interface is instantiated during boot or as a result of a user action. <function>&mac.mpo;_create_ipq</function> void &mac.mpo;_create_ipq struct mbuf *fragment struct label *fragmentlabel struct ipq *ipq struct label *ipqlabel &mac.thead; fragment First received IP fragment fragmentlabel Policy label for fragment ipq IP reassembly queue to be labeled ipqlabel Policy label to be filled in for ipq Set the label on a newly created IP fragment reassembly queue from the mbuf header of the first received fragment.
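Since no creating credential is available when an interface is labeled, &mac.mpo;_create_ifnet implementations typically assign defaults from static context. In the sketch below the label values and slot helper are hypothetical, while IFF_LOOPBACK is the standard interface flag:

<programlisting>
static void
mypolicy_create_ifnet(struct ifnet *ifnet, struct label *ifnetlabel)
{
	/* Single out the loopback interface; everything else
	 * receives the policy's default label. */
	if (ifnet->if_flags & IFF_LOOPBACK)
		mypolicy_slot_set_value(ifnetlabel, MYPOLICY_TRUSTED);
	else
		mypolicy_slot_set_value(ifnetlabel, MYPOLICY_DEFAULT);
}
</programlisting>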
<function>&mac.mpo;_create_datagram_from_ipq</function> void &mac.mpo;_create_datagram_from_ipq struct ipq *ipq struct label *ipqlabel struct mbuf *datagram struct label *datagramlabel &mac.thead; ipq IP reassembly queue ipqlabel Policy label for ipq datagram Datagram to be labeled datagramlabel Policy label to be filled in for datagram Set the label on a newly reassembled IP datagram from the IP fragment reassembly queue from which it was generated. <function>&mac.mpo;_create_fragment</function> void &mac.mpo;_create_fragment struct mbuf *datagram struct label *datagramlabel struct mbuf *fragment struct label *fragmentlabel &mac.thead; datagram Datagram datagramlabel Policy label for datagram fragment Fragment to be labeled fragmentlabel Policy label to be filled in for fragment Set the label on the mbuf header of a newly created IP fragment from the label on the mbuf header of the datagram it was generated from. <function>&mac.mpo;_create_mbuf_from_mbuf</function> void &mac.mpo;_create_mbuf_from_mbuf struct mbuf *oldmbuf struct label *oldmbuflabel struct mbuf *newmbuf struct label *newmbuflabel &mac.thead; oldmbuf Existing (source) mbuf oldmbuflabel Policy label for oldmbuf newmbuf New mbuf to be labeled newmbuflabel Policy label to be filled in for newmbuf Set the label on the mbuf header of a newly created datagram from the mbuf header of an existing datagram. This call may be made in a number of situations, including when an mbuf is re-allocated for alignment purposes. <function>&mac.mpo;_create_mbuf_linklayer</function> void &mac.mpo;_create_mbuf_linklayer struct ifnet *ifnet struct label *ifnetlabel struct mbuf *mbuf struct label *mbuflabel &mac.thead; ifnet Network interface ifnetlabel Policy label for ifnet mbuf mbuf header for new datagram mbuflabel Policy label to be filled in for mbuf Set the label on the mbuf header of a newly created datagram generated for the purposes of a link layer response for the passed interface. This call may be made in a number of situations, including for ARP or ND6 responses in the IPv4 and IPv6 stacks. <function>&mac.mpo;_create_mbuf_from_bpfdesc</function> void &mac.mpo;_create_mbuf_from_bpfdesc struct bpf_d *bpf_d struct label *bpflabel struct mbuf *mbuf struct label *mbuflabel &mac.thead; bpf_d BPF descriptor bpflabel Policy label for bpf_d mbuf New mbuf to be labeled mbuflabel Policy label to fill in for mbuf Set the label on the mbuf header of a newly created datagram generated using the passed BPF descriptor. This call is made when a write is performed to the BPF device associated with the passed BPF descriptor. <function>&mac.mpo;_create_mbuf_from_ifnet</function> void &mac.mpo;_create_mbuf_from_ifnet struct ifnet *ifnet struct label *ifnetlabel struct mbuf *mbuf struct label *mbuflabel &mac.thead; ifnet Network interface ifnetlabel Policy label for ifnet mbuf mbuf header for new datagram mbuflabel Policy label to be filled in for mbuf Set the label on the mbuf header of a newly created datagram generated from the passed network interface.
<function>&mac.mpo;_create_mbuf_multicast_encap</function> void &mac.mpo;_create_mbuf_multicast_encap struct mbuf *oldmbuf struct label *oldmbuflabel struct ifnet *ifnet struct label *ifnetlabel struct mbuf *newmbuf struct label *newmbuflabel &mac.thead; oldmbuf mbuf header for existing datagram oldmbuflabel Policy label for oldmbuf ifnet Network interface ifnetlabel Policy label for ifnet newmbuf mbuf header to be labeled for new datagram newmbuflabel Policy label to be filled in for newmbuf Set the label on the mbuf header of a newly created datagram generated from the existing passed datagram when it is processed by the passed multicast encapsulation interface. This call is made when an mbuf is to be delivered using the virtual interface. <function>&mac.mpo;_create_mbuf_netlayer</function> void &mac.mpo;_create_mbuf_netlayer struct mbuf *oldmbuf struct label *oldmbuflabel struct mbuf *newmbuf struct label *newmbuflabel &mac.thead; oldmbuf Received datagram oldmbuflabel Policy label for oldmbuf newmbuf Newly created datagram newmbuflabel Policy label for newmbuf Set the label on the mbuf header of a newly created datagram generated by the IP stack in response to an existing received datagram (oldmbuf). This call may be made in a number of situations, including when responding to ICMP request datagrams. <function>&mac.mpo;_fragment_match</function> int &mac.mpo;_fragment_match struct mbuf *fragment struct label *fragmentlabel struct ipq *ipq struct label *ipqlabel &mac.thead; fragment IP datagram fragment fragmentlabel Policy label for fragment ipq IP fragment reassembly queue ipqlabel Policy label for ipq Determine whether an mbuf header containing an IP datagram fragment (fragment) matches the label of the passed IP fragment reassembly queue (ipq). Return (1) for a successful match, or (0) for no match. This call is made when the IP stack attempts to find an existing fragment reassembly queue for a newly received fragment; if this fails, a new fragment reassembly queue may be instantiated for the fragment. Policies may use this entry point to prevent the reassembly of otherwise matching IP fragments if policy does not permit them to be reassembled based on the label or other information. <function>&mac.mpo;_relabel_ifnet</function> void &mac.mpo;_relabel_ifnet struct ucred *cred struct ifnet *ifnet struct label *ifnetlabel struct label *newlabel &mac.thead; cred Subject credential ifnet Object; Network interface ifnetlabel Policy label for ifnet newlabel Label update to apply to ifnet Update the label of network interface, ifnet, based on the passed update label, newlabel, and the passed subject credential, cred. <function>&mac.mpo;_update_ipq</function> void &mac.mpo;_update_ipq struct mbuf *fragment struct label *fragmentlabel struct ipq *ipq struct label *ipqlabel &mac.thead; fragment IP fragment fragmentlabel Policy label for fragment ipq IP fragment reassembly queue ipqlabel Policy label to be updated for ipq Update the label on an IP fragment reassembly queue (ipq) based on the acceptance of the passed IP fragment mbuf header (fragment). Process Labeling Event Operations <function>&mac.mpo;_create_cred</function> void &mac.mpo;_create_cred struct ucred *parent_cred struct ucred *child_cred &mac.thead; parent_cred Parent subject credential child_cred Child subject credential Set the label of a newly created subject credential from the passed subject credential. This call will be made when &man.crcopy.9; is invoked on a newly created struct ucred.
This call should not be confused with a process forking or creation event. <function>&mac.mpo;_execve_transition</function> void &mac.mpo;_execve_transition struct ucred *old struct ucred *new struct vnode *vp struct label *vnodelabel &mac.thead; old Existing subject credential Immutable new New subject credential to be labeled vp File to execute Locked vnodelabel Policy label for vp Update the label of a newly created subject credential (new) from the passed existing subject credential (old) based on a label transition caused by executing the passed vnode (vp). This call occurs when a process executes the passed vnode and one of the policies returns a success from the mpo_execve_will_transition entry point. Policies may choose to implement this call simply by invoking mpo_create_cred and passing the two subject credentials so as not to implement a transitioning event. Policies should not leave this entry point unimplemented if they implement mpo_create_cred, even if they do not implement mpo_execve_will_transition. <function>&mac.mpo;_execve_will_transition</function> int &mac.mpo;_execve_will_transition struct ucred *old struct vnode *vp struct label *vnodelabel &mac.thead; old Subject credential prior to &man.execve.2; Immutable vp File to execute vnodelabel Policy label for vp Determine whether the policy will want to perform a transition event as a result of the execution of the passed vnode by the passed subject credential. Return 1 if a transition is required, 0 if not. Even if a policy returns 0, it should behave correctly in the presence of an unexpected invocation of mpo_execve_transition, as that call may happen as a result of another policy requesting a transition. <function>&mac.mpo;_create_proc0</function> void &mac.mpo;_create_proc0 struct ucred *cred &mac.thead; cred Subject credential to be filled in Create the subject credential of process 0, the parent of all kernel processes. <function>&mac.mpo;_create_proc1</function> void &mac.mpo;_create_proc1 struct ucred *cred &mac.thead; cred Subject credential to be filled in Create the subject credential of process 1, the parent of all user processes. <function>&mac.mpo;_relabel_cred</function> void &mac.mpo;_relabel_cred struct ucred *cred struct label *newlabel &mac.thead; cred Subject credential newlabel Label update to apply to cred Update the label on a subject credential from the passed update label. Access Control Checks Access control entry points permit policy modules to influence access control decisions made by the kernel. Generally, although not always, arguments to an access control entry point will include one or more authorizing credentials, and information (possibly including a label) for any other objects involved in the operation. An access control entry point may return 0 to permit the operation, or an &man.errno.2; error value. The results of invoking the entry point across various registered policy modules will be composed as follows: if all modules permit the operation to succeed, success will be returned. If one or more modules return a failure, a failure will be returned. If more than one module returns a failure, the errno value to return to the user will be selected using the following precedence, implemented by the error_select() function in kern_mac.c: Most precedence EDEADLK EINVAL ESRCH EACCES Least precedence EPERM If none of the error values returned by the modules is listed in the precedence chart, then an arbitrarily selected value from the set will be returned.
In general, the rules provide precedence to errors in the following order: kernel failures, invalid arguments, object not present, access not permitted, other. <function>&mac.mpo;_check_bpfdesc_receive</function> int &mac.mpo;_check_bpfdesc_receive struct bpf_d *bpf_d struct label *bpflabel struct ifnet *ifnet struct label *ifnetlabel &mac.thead; bpf_d Subject; BPF descriptor bpflabel Policy label for bpf_d ifnet Object; network interface ifnetlabel Policy label for ifnet Determine whether the MAC framework should permit datagrams from the passed interface to be delivered to the buffers of the passed BPF descriptor. Return (0) for success, or an errno value for failure. Suggested failure: EACCES for label mismatches, EPERM for lack of privilege. <function>&mac.mpo;_check_kenv_dump</function> int &mac.mpo;_check_kenv_dump struct ucred *cred &mac.thead; cred Subject credential Determine whether the subject should be allowed to retrieve the kernel environment (see &man.kenv.2;). <function>&mac.mpo;_check_kenv_get</function> int &mac.mpo;_check_kenv_get struct ucred *cred char *name &mac.thead; cred Subject credential name Kernel environment variable name Determine whether the subject should be allowed to retrieve the value of the specified kernel environment variable. <function>&mac.mpo;_check_kenv_set</function> int &mac.mpo;_check_kenv_set struct ucred *cred char *name &mac.thead; cred Subject credential name Kernel environment variable name Determine whether the subject should be allowed to set the specified kernel environment variable. <function>&mac.mpo;_check_kenv_unset</function> int &mac.mpo;_check_kenv_unset struct ucred *cred char *name &mac.thead; cred Subject credential name Kernel environment variable name Determine whether the subject should be allowed to unset the specified kernel environment variable. <function>&mac.mpo;_check_kld_load</function> int &mac.mpo;_check_kld_load struct ucred *cred struct vnode *vp struct label *vlabel &mac.thead; cred Subject credential vp Kernel module vnode vlabel Label associated with vp Determine whether the subject should be allowed to load the specified module file. <function>&mac.mpo;_check_kld_stat</function> int &mac.mpo;_check_kld_stat struct ucred *cred &mac.thead; cred Subject credential Determine whether the subject should be allowed to retrieve a list of loaded kernel module files and associated statistics. <function>&mac.mpo;_check_kld_unload</function> int &mac.mpo;_check_kld_unload struct ucred *cred &mac.thead; cred Subject credential Determine whether the subject should be allowed to unload a kernel module. <function>&mac.mpo;_check_pipe_ioctl</function> int &mac.mpo;_check_pipe_ioctl struct ucred *cred struct pipe *pipe struct label *pipelabel unsigned long cmd void *data &mac.thead; cred Subject credential pipe Pipe pipelabel Policy label associated with pipe cmd &man.ioctl.2; command data &man.ioctl.2; data Determine whether the subject should be allowed to make the specified &man.ioctl.2; call. <function>&mac.mpo;_check_pipe_poll</function> int &mac.mpo;_check_pipe_poll struct ucred *cred struct pipe *pipe struct label *pipelabel &mac.thead; cred Subject credential pipe Pipe pipelabel Policy label associated with pipe Determine whether the subject should be allowed to poll pipe.
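The error composition described at the beginning of this section can be pictured with the following sketch, written in the spirit of the error_select() routine in kern_mac.c rather than copied from it:

<programlisting>
static int
error_select(int error1, int error2)
{
	/* Errors with the most precedence are tested first. */
	if (error1 == EDEADLK || error2 == EDEADLK)
		return (EDEADLK);
	if (error1 == EINVAL || error2 == EINVAL)
		return (EINVAL);
	if (error1 == ESRCH || error2 == ESRCH)
		return (ESRCH);
	if (error1 == EACCES || error2 == EACCES)
		return (EACCES);
	if (error1 == EPERM || error2 == EPERM)
		return (EPERM);
	/* Neither value is ranked; pick one arbitrarily. */
	return (error2 != 0 ? error2 : error1);
}
</programlisting>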
<function>&mac.mpo;_check_pipe_read</function> int &mac.mpo;_check_pipe_read struct ucred *cred struct pipe *pipe struct label *pipelabel &mac.thead; cred Subject credential pipe Pipe pipelabel Policy label associated with pipe Determine whether the subject should be allowed read access to pipe. <function>&mac.mpo;_check_pipe_relabel</function> int &mac.mpo;_check_pipe_relabel struct ucred *cred struct pipe *pipe struct label *pipelabel struct label *newlabel &mac.thead; cred Subject credential pipe Pipe pipelabel Current policy label associated with pipe newlabel Label update to pipelabel Determine whether the subject should be allowed to relabel pipe. <function>&mac.mpo;_check_pipe_stat</function> int &mac.mpo;_check_pipe_stat struct ucred *cred struct pipe *pipe struct label *pipelabel &mac.thead; cred Subject credential pipe Pipe pipelabel Policy label associated with pipe Determine whether the subject should be allowed to retrieve statistics related to pipe. <function>&mac.mpo;_check_pipe_write</function> int &mac.mpo;_check_pipe_write struct ucred *cred struct pipe *pipe struct label *pipelabel &mac.thead; cred Subject credential pipe Pipe pipelabel Policy label associated with pipe Determine whether the subject should be allowed to write to pipe. <function>&mac.mpo;_check_socket_bind</function> int &mac.mpo;_check_socket_bind struct ucred *cred struct socket *socket struct label *socketlabel struct sockaddr *sockaddr &mac.thead; cred Subject credential socket Socket to be bound socketlabel Policy label for socket sockaddr Address of socket Determine whether the subject credential (cred) can bind the passed socket (socket) to the passed socket address (sockaddr). Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatches, EPERM for lack of privilege. <function>&mac.mpo;_check_socket_connect</function> int &mac.mpo;_check_socket_connect struct ucred *cred struct socket *socket struct label *socketlabel struct sockaddr *sockaddr &mac.thead; cred Subject credential socket Socket to be connected socketlabel Policy label for socket sockaddr Address of socket Determine whether the subject credential (cred) can connect the passed socket (socket) to the passed socket address (sockaddr). Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatches, EPERM for lack of privilege. <function>&mac.mpo;_check_socket_receive</function> int &mac.mpo;_check_socket_receive struct ucred *cred struct socket *so struct label *socketlabel &mac.thead; cred Subject credential so Socket socketlabel Policy label associated with so Determine whether the subject should be allowed to receive information from the socket so. <function>&mac.mpo;_check_socket_send</function> int &mac.mpo;_check_socket_send struct ucred *cred struct socket *so struct label *socketlabel &mac.thead; cred Subject credential so Socket socketlabel Policy label associated with so Determine whether the subject should be allowed to send information across the socket so. <function>&mac.mpo;_check_cred_visible</function> int &mac.mpo;_check_cred_visible struct ucred *u1 struct ucred *u2 &mac.thead; u1 Subject credential u2 Object credential Determine whether the subject credential u1 can see other subjects with the passed subject credential u2. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatches, EPERM for lack of privilege, or ESRCH to hide visibility. This call may be made in a number of situations, including inter-process status sysctls used by &man.ps.1;, and in procfs lookups.
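A sketch of the visibility convention just described for &mac.mpo;_check_cred_visible: returning ESRCH rather than EACCES causes the target to appear not to exist at all. The label comparison helpers are hypothetical.

<programlisting>
static int
mypolicy_check_cred_visible(struct ucred *u1, struct ucred *u2)
{
	/* Hide subjects whose labels differ from the viewer's. */
	if (!mypolicy_labels_equal(mypolicy_cred_label(u1),
	    mypolicy_cred_label(u2)))
		return (ESRCH);
	return (0);
}
</programlisting>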
<function>&mac.mpo;_check_ifnet_relabel</function> int &mac.mpo;_check_ifnet_relabel struct ucred *cred struct ifnet *ifnet struct label *ifnetlabel struct label *newlabel &mac.thead; cred Subject credential ifnet Object; network interface ifnetlabel Existing policy label for ifnet newlabel Policy label update to later be applied to ifnet Determine whether the subject credential can relabel the passed network interface to the passed label update. <function>&mac.mpo;_check_socket_relabel</function> int &mac.mpo;_check_socket_relabel struct ucred *cred struct socket *socket struct label *socketlabel struct label *newlabel &mac.thead; cred Subject credential socket Object; socket socketlabel Existing policy label for socket newlabel Label update to later be applied to socketlabel Determine whether the subject credential can relabel the passed socket to the passed label update. <function>&mac.mpo;_check_cred_relabel</function> int &mac.mpo;_check_cred_relabel struct ucred *cred struct label *newlabel &mac.thead; cred Subject credential newlabel Label update to later be applied to cred Determine whether the subject credential can relabel itself to the passed label update. <function>&mac.mpo;_check_vnode_relabel</function> int &mac.mpo;_check_vnode_relabel struct ucred *cred struct vnode *vp struct label *vnodelabel struct label *newlabel &mac.thead; cred Subject credential Immutable vp Object; vnode Locked vnodelabel Existing policy label for vp newlabel Policy label update to later be applied to vp Determine whether the subject credential can relabel the passed vnode to the passed label update. <function>&mac.mpo;_check_mount_stat</function> int &mac.mpo;_check_mount_stat struct ucred *cred struct mount *mp struct label *mountlabel &mac.thead; cred Subject credential mp Object; file system mount mountlabel Policy label for mp Determine whether the subject credential can see the results of a statfs performed on the file system. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatches or EPERM for lack of privilege. This call may be made in a number of situations, including during invocations of &man.statfs.2; and related calls, as well as to determine what file systems to exclude from listings of file systems, such as when &man.getfsstat.2; is invoked. <function>&mac.mpo;_check_proc_debug</function> int &mac.mpo;_check_proc_debug struct ucred *cred struct proc *proc &mac.thead; cred Subject credential Immutable proc Object; process Determine whether the subject credential can debug the passed process. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, EPERM for lack of privilege, or ESRCH to hide visibility of the target. This call may be made in a number of situations, including use of the &man.ptrace.2; and &man.ktrace.2; APIs, as well as for some types of procfs operations.
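Returning to the relabel checks above, a minimal sketch of &mac.mpo;_check_cred_relabel for a policy that only permits a subject to lower its own label; mypolicy_dominates() and mypolicy_cred_label() are hypothetical helpers:

<programlisting>
static int
mypolicy_check_cred_relabel(struct ucred *cred, struct label *newlabel)
{
	/* Permit only downgrades: the current label must dominate
	 * the requested one; anything else is a privilege failure. */
	if (!mypolicy_dominates(mypolicy_cred_label(cred), newlabel))
		return (EPERM);
	return (0);
}
</programlisting>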
<function>&mac.mpo;_check_vnode_access</function> int &mac.mpo;_check_vnode_access struct ucred *cred struct vnode *vp struct label *label int flags &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp flags &man.access.2; flags Determine how invocations of &man.access.2; and related calls by the subject credential should return when performed on the passed vnode using the passed access flags. This should generally be implemented using the same semantics used in &mac.mpo;_check_vnode_open. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatches or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_chdir</function> int &mac.mpo;_check_vnode_chdir struct ucred *cred struct vnode *dvp struct label *dlabel &mac.thead; cred Subject credential dvp Object; vnode to &man.chdir.2; into dlabel Policy label for dvp Determine whether the subject credential can change the process working directory to the passed vnode. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_chroot</function> int &mac.mpo;_check_vnode_chroot struct ucred *cred struct vnode *dvp struct label *dlabel &mac.thead; cred Subject credential dvp Directory vnode dlabel Policy label associated with dvp Determine whether the subject should be allowed to &man.chroot.2; into the specified directory (dvp). <function>&mac.mpo;_check_vnode_create</function> int &mac.mpo;_check_vnode_create struct ucred *cred struct vnode *dvp struct label *dlabel struct componentname *cnp struct vattr *vap &mac.thead; cred Subject credential dvp Object; vnode dlabel Policy label for dvp cnp Component name for dvp vap Vnode attributes for vp Determine whether the subject credential can create a vnode with the passed parent directory, passed name information, and passed attribute information. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. This call may be made in a number of situations, including as a result of calls to &man.open.2; with O_CREAT, &man.mknod.2;, &man.mkfifo.2;, and others. <function>&mac.mpo;_check_vnode_delete</function> int &mac.mpo;_check_vnode_delete struct ucred *cred struct vnode *dvp struct label *dlabel struct vnode *vp struct label *label struct componentname *cnp &mac.thead; cred Subject credential dvp Parent directory vnode dlabel Policy label for dvp vp Object; vnode to delete label Policy label for vp cnp Component name for vp Determine whether the subject credential can delete a vnode from the passed parent directory and passed name information. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. This call may be made in a number of situations, including as a result of calls to &man.unlink.2; and &man.rmdir.2;. Policies implementing this entry point should also implement mpo_check_vnode_rename_to to authorize deletion of objects as a result of being the target of a rename. <function>&mac.mpo;_check_vnode_deleteacl</function> int &mac.mpo;_check_vnode_deleteacl struct ucred *cred struct vnode *vp struct label *label acl_type_t type &mac.thead; cred Subject credential Immutable vp Object; vnode Locked label Policy label for vp type ACL type Determine whether the subject credential can delete the ACL of passed type from the passed vnode. Return 0 for success, or an errno value for failure.
Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_exec</function> int &mac.mpo;_check_vnode_exec struct ucred *cred struct vnode *vp struct label *label &mac.thead; cred Subject credential vp Object; vnode to execute label Policy label for vp Determine whether the subject credential can execute the passed vnode. Determination of execute privilege is made separately from decisions about any transitioning event. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_getacl</function> int &mac.mpo;_check_vnode_getacl struct ucred *cred struct vnode *vp struct label *label acl_type_t type &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp type ACL type Determine whether the subject credential can retrieve the ACL of passed type from the passed vnode. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_getextattr</function> int &mac.mpo;_check_vnode_getextattr struct ucred *cred struct vnode *vp struct label *label int attrnamespace const char *name struct uio *uio &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp attrnamespace Extended attribute namespace name Extended attribute name uio I/O structure pointer; see &man.uio.9; Determine whether the subject credential can retrieve the extended attribute with the passed namespace and name from the passed vnode. Policies implementing labeling using extended attributes may be interested in special handling of operations on those extended attributes. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_link</function> int &mac.mpo;_check_vnode_link struct ucred *cred struct vnode *dvp struct label *dlabel struct vnode *vp struct label *label struct componentname *cnp &mac.thead; cred Subject credential dvp Directory vnode dlabel Policy label associated with dvp vp Link destination vnode label Policy label associated with vp cnp Component name for the link being created Determine whether the subject should be allowed to create a link to the vnode vp with the name specified by cnp. <function>&mac.mpo;_check_vnode_mmap</function> int &mac.mpo;_check_vnode_mmap struct ucred *cred struct vnode *vp struct label *label int prot &mac.thead; cred Subject credential vp Vnode to map label Policy label associated with vp prot Mmap protections (see &man.mmap.2;) Determine whether the subject should be allowed to map the vnode vp with the protections specified in prot. <function>&mac.mpo;_check_vnode_mmap_downgrade</function> void &mac.mpo;_check_vnode_mmap_downgrade struct ucred *cred struct vnode *vp struct label *label int *prot &mac.thead; cred, vp, label See &mac.mpo;_check_vnode_mmap. prot Mmap protections to be downgraded Downgrade the mmap protections based on the subject and object labels. <function>&mac.mpo;_check_vnode_mprotect</function> int &mac.mpo;_check_vnode_mprotect struct ucred *cred struct vnode *vp struct label *label int prot &mac.thead; cred Subject credential vp Mapped vnode label Policy label associated with vp prot Memory protections Determine whether the subject should be allowed to set the specified memory protections on memory mapped from the vnode vp.
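Unlike most checks, &mac.mpo;_check_vnode_mmap_downgrade returns void and instead narrows the protections in place. A sketch, assuming a hypothetical mypolicy_subject_can_write() predicate; VM_PROT_WRITE is the standard protection bit:

<programlisting>
static void
mypolicy_check_vnode_mmap_downgrade(struct ucred *cred, struct vnode *vp,
    struct label *label, int *prot)
{
	/* If the subject may not write the object, strip write
	 * access from the requested mapping protections. */
	if (!mypolicy_subject_can_write(cred, label))
		*prot &= ~VM_PROT_WRITE;
}
</programlisting>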
<function>&mac.mpo;_check_vnode_poll</function> int &mac.mpo;_check_vnode_poll struct ucred *active_cred struct ucred *file_cred struct vnode *vp struct label *label &mac.thead; active_cred Subject credential file_cred Credential associated with the struct file vp Polled vnode label Policy label associated with vp Determine whether the subject should be allowed to poll the vnode vp. <function>&mac.mpo;_check_vnode_rename_from</function> int &mac.mpo;_check_vnode_rename_from struct ucred *cred struct vnode *dvp struct label *dlabel struct vnode *vp struct label *label struct componentname *cnp &mac.thead; cred Subject credential dvp Directory vnode dlabel Policy label associated with dvp vp Vnode to be renamed label Policy label associated with vp cnp Component name for vp Determine whether the subject should be allowed to rename the vnode vp to something else. <function>&mac.mpo;_check_vnode_rename_to</function> int &mac.mpo;_check_vnode_rename_to struct ucred *cred struct vnode *dvp struct label *dlabel struct vnode *vp struct label *label int samedir struct componentname *cnp &mac.thead; cred Subject credential dvp Directory vnode dlabel Policy label associated with dvp vp Overwritten vnode label Policy label associated with vp samedir Boolean; 1 if the source and destination directories are the same cnp Destination component name Determine whether the subject should be allowed to rename to the vnode vp, into the directory dvp, or to the name represented by cnp. If there is no existing file to overwrite, vp and label will be NULL. <function>&mac.mpo;_check_socket_listen</function> int &mac.mpo;_check_socket_listen struct ucred *cred struct socket *socket struct label *socketlabel &mac.thead; cred Subject credential socket Object; socket socketlabel Policy label for socket Determine whether the subject credential can listen on the passed socket. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_lookup</function> int &mac.mpo;_check_vnode_lookup struct ucred *cred struct vnode *dvp struct label *dlabel struct componentname *cnp &mac.thead; cred Subject credential dvp Object; vnode dlabel Policy label for dvp cnp Component name being looked up Determine whether the subject credential can perform a lookup in the passed directory vnode for the passed name. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_open</function> int &mac.mpo;_check_vnode_open struct ucred *cred struct vnode *vp struct label *label int acc_mode &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp acc_mode &man.open.2; access mode Determine whether the subject credential can perform an open operation on the passed vnode with the passed access mode. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_readdir</function> int &mac.mpo;_check_vnode_readdir struct ucred *cred struct vnode *dvp struct label *dlabel &mac.thead; cred Subject credential dvp Object; directory vnode dlabel Policy label for dvp Determine whether the subject credential can perform a readdir operation on the passed directory vnode. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege.
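An information-flow flavored sketch of &mac.mpo;_check_vnode_open: reads require the subject's label to dominate the object's, and writes the reverse. This assumes acc_mode carries VREAD/VWRITE-style bits as used by &man.vaccess.9;; the dominance helpers are hypothetical.

<programlisting>
static int
mypolicy_check_vnode_open(struct ucred *cred, struct vnode *vp,
    struct label *label, int acc_mode)
{
	/* Read down: the subject must dominate the object. */
	if ((acc_mode & VREAD) &&
	    !mypolicy_dominates(mypolicy_cred_label(cred), label))
		return (EACCES);
	/* Write up: the object must dominate the subject. */
	if ((acc_mode & (VWRITE | VAPPEND)) &&
	    !mypolicy_dominates(label, mypolicy_cred_label(cred)))
		return (EACCES);
	return (0);
}
</programlisting>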
<function>&mac.mpo;_check_vnode_readlink</function> int &mac.mpo;_check_vnode_readlink struct ucred *cred struct vnode *vp struct label *label &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp Determine whether the subject credential can perform a readlink operation on the passed symlink vnode. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. This call may be made in a number of situations, including an explicit readlink call by the user process, or as a result of an implicit readlink during a name lookup by the process. <function>&mac.mpo;_check_vnode_revoke</function> int &mac.mpo;_check_vnode_revoke struct ucred *cred struct vnode *vp struct label *label &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp Determine whether the subject credential can revoke access to the passed vnode. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_setacl</function> int &mac.mpo;_check_vnode_setacl struct ucred *cred struct vnode *vp struct label *label acl_type_t type struct acl *acl &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp type ACL type acl ACL Determine whether the subject credential can set the passed ACL of passed type on the passed vnode. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_setextattr</function> int &mac.mpo;_check_vnode_setextattr struct ucred *cred struct vnode *vp struct label *label int attrnamespace const char *name struct uio *uio &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp attrnamespace Extended attribute namespace name Extended attribute name uio I/O structure pointer; see &man.uio.9; Determine whether the subject credential can set the extended attribute of passed name and passed namespace on the passed vnode. Policies implementing security labels backed by extended attributes may want to provide additional protections for those attributes. Additionally, policies should avoid making decisions based on the data referenced from uio, as there is a potential race condition between this check and the actual operation. The uio may also be NULL if a delete operation is being performed. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_setflags</function> int &mac.mpo;_check_vnode_setflags struct ucred *cred struct vnode *vp struct label *label u_long flags &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp flags File flags; see &man.chflags.2; Determine whether the subject credential can set the passed flags on the passed vnode. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_setmode</function> int &mac.mpo;_check_vnode_setmode struct ucred *cred struct vnode *vp struct label *label mode_t mode &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp mode File mode; see &man.chmod.2; Determine whether the subject credential can set the passed mode on the passed vnode. Return 0 for success, or an errno value for failure.
Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_setowner</function> int &mac.mpo;_check_vnode_setowner struct ucred *cred struct vnode *vp struct label *label uid_t uid gid_t gid &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp uid User ID gid Group ID Determine whether the subject credential can set the passed uid and passed gid as file uid and file gid on the passed vnode. The IDs may be set to (-1) to request no update. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_vnode_setutimes</function> int &mac.mpo;_check_vnode_setutimes struct ucred *cred struct vnode *vp struct label *label struct timespec atime struct timespec mtime &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp atime Access time; see &man.utimes.2; mtime Modification time; see &man.utimes.2; Determine whether the subject credential can set the passed access timestamps on the passed vnode. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_proc_sched</function> int &mac.mpo;_check_proc_sched struct ucred *cred struct proc *proc &mac.thead; cred Subject credential proc Object; process Determine whether the subject credential can change the scheduling parameters of the passed process. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, EPERM for lack of privilege, or ESRCH to limit visibility. See &man.setpriority.2; for more information. <function>&mac.mpo;_check_proc_signal</function> int &mac.mpo;_check_proc_signal struct ucred *cred struct proc *proc int signal &mac.thead; cred Subject credential proc Object; process signal Signal; see &man.kill.2; Determine whether the subject credential can deliver the passed signal to the passed process. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, EPERM for lack of privilege, or ESRCH to limit visibility. <function>&mac.mpo;_check_vnode_stat</function> int &mac.mpo;_check_vnode_stat struct ucred *cred struct vnode *vp struct label *label &mac.thead; cred Subject credential vp Object; vnode label Policy label for vp Determine whether the subject credential can stat the passed vnode. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. See &man.stat.2; for more information. <function>&mac.mpo;_check_ifnet_transmit</function> int &mac.mpo;_check_ifnet_transmit struct ucred *cred struct ifnet *ifnet struct label *ifnetlabel struct mbuf *mbuf struct label *mbuflabel &mac.thead; cred Subject credential ifnet Network interface ifnetlabel Policy label for ifnet mbuf Object; mbuf to be sent mbuflabel Policy label for mbuf Determine whether the network interface can transmit the passed mbuf. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege.
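A sketch of &mac.mpo;_check_ifnet_transmit gating transmission on a comparison between the datagram label and the interface label; mypolicy_in_range() is a hypothetical helper:

<programlisting>
static int
mypolicy_check_ifnet_transmit(struct ucred *cred, struct ifnet *ifnet,
    struct label *ifnetlabel, struct mbuf *mbuf, struct label *mbuflabel)
{
	/* Only datagrams within the interface's range may leave. */
	if (!mypolicy_in_range(mbuflabel, ifnetlabel))
		return (EACCES);
	return (0);
}
</programlisting>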
<function>&mac.mpo;_check_socket_deliver</function> int &mac.mpo;_check_socket_deliver struct ucred *cred struct ifnet *ifnet struct label *ifnetlabel struct mbuf *mbuf struct label *mbuflabel &mac.thead; cred Subject credential ifnet Network interface ifnetlabel Policy label for ifnet mbuf Object; mbuf to be delivered mbuflabel Policy label for mbuf Determine whether the socket may receive the datagram stored in the passed mbuf header. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, or EPERM for lack of privilege. <function>&mac.mpo;_check_socket_visible</function> int &mac.mpo;_check_socket_visible struct ucred *cred struct socket *so struct label *socketlabel &mac.thead; cred Subject credential (immutable) so Object; socket socketlabel Policy label for so Determine whether the subject credential cred can "see" the passed socket (socket) using system monitoring functions, such as those employed by &man.netstat.8; and &man.sockstat.1;. Return 0 for success, or an errno value for failure. Suggested failure: EACCES for label mismatch, EPERM for lack of privilege, or ESRCH to limit visibility. <function>&mac.mpo;_check_system_acct</function> int &mac.mpo;_check_system_acct struct ucred *ucred struct vnode *vp struct label *vlabel &mac.thead; ucred Subject credential vp Accounting file; &man.acct.5; vlabel Label associated with vp Determine whether the subject should be allowed to enable accounting, based on its label and the label of the accounting log file. <function>&mac.mpo;_check_system_nfsd</function> int &mac.mpo;_check_system_nfsd struct ucred *cred &mac.thead; cred Subject credential Determine whether the subject should be allowed to call &man.nfssvc.2;. <function>&mac.mpo;_check_system_reboot</function> int &mac.mpo;_check_system_reboot struct ucred *cred int howto &mac.thead; cred Subject credential howto howto parameter from &man.reboot.2; Determine whether the subject should be allowed to reboot the system in the specified manner. <function>&mac.mpo;_check_system_settime</function> int &mac.mpo;_check_system_settime struct ucred *cred &mac.thead; cred Subject credential Determine whether the subject should be allowed to set the system clock. <function>&mac.mpo;_check_system_swapon</function> int &mac.mpo;_check_system_swapon struct ucred *cred struct vnode *vp struct label *vlabel &mac.thead; cred Subject credential vp Swap device vlabel Label associated with vp Determine whether the subject should be allowed to add vp as a swap device. <function>&mac.mpo;_check_system_sysctl</function> int &mac.mpo;_check_system_sysctl struct ucred *cred int *name u_int *namelen void *old size_t *oldlenp int inkernel void *new size_t newlen &mac.thead; cred Subject credential name See &man.sysctl.3; namelen old oldlenp inkernel Boolean; 1 if called from kernel new See &man.sysctl.3; newlen Determine whether the subject should be allowed to make the specified &man.sysctl.3; transaction. Label Management Calls Relabel events occur when a user process has requested that the label on an object be modified. A two-phase update occurs: first, an access control check is performed to determine if the update is both valid and permitted, and then the update itself is performed via a separate entry point. Relabel entry points typically accept the object, an object label reference, and an update label submitted by the process.
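As a pseudo-code illustration of this two-phase pattern, in the spirit of the examples elsewhere in this book: the check entry point validates and authorizes the requested update, and the relabel entry point then applies it unconditionally. The entry point signatures follow the framework's vnode relabel entry points, but the mypolicy_* helper functions are hypothetical stand-ins for a policy's own label handling and are not part of any real module.

/* Phase one: is the update valid, and may this subject perform it? */
static int
mypolicy_check_vnode_relabel(struct ucred *cred, struct vnode *vp,
    struct label *vnodelabel, struct label *newlabel)
{
	if (!mypolicy_update_is_valid(newlabel))
		return (EINVAL);	/* not a meaningful label for us */
	if (!mypolicy_labels_compatible(cred, vnodelabel))
		return (EACCES);	/* subject may not relabel this object */
	return (0);
}

/* Phase two: apply the update; this entry point is not permitted to fail. */
static void
mypolicy_relabel_vnode(struct ucred *cred, struct vnode *vp,
    struct label *vnodelabel, struct label *newlabel)
{
	mypolicy_copy_label_data(newlabel, vnodelabel);
}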
Memory allocation during relabel is discouraged, as relabel calls are not permitted to fail (failure should be reported earlier in the relabel check). Userland Architecture The TrustedBSD MAC Framework includes a number of policy-agnostic elements, including MAC library interfaces for abstractly managing labels, modifications to the system credential management and login libraries to support the assignment of MAC labels to users, and a set of tools to monitor and modify labels on processes, files, and network interfaces. More details on the user architecture will be added to this section in the near future. APIs for Policy-Agnostic Label Management The TrustedBSD MAC Framework provides a number of library and system calls permitting applications to manage MAC labels on objects using a policy-agnostic interface. This permits applications to manipulate labels for a variety of policies without being written to support specific policies. These interfaces are used by general-purpose tools such as &man.ifconfig.8;, &man.ls.1; and &man.ps.1; to view labels on network interfaces, files, and processes. The APIs also support MAC management tools including &man.getfmac.8;, &man.getpmac.8;, &man.setfmac.8;, &man.setfsmac.8;, and &man.setpmac.8;. The MAC APIs are documented in &man.mac.3;. Applications handle MAC labels in two forms: an internalized form used to return and set labels on processes and objects (mac_t), and an externalized form based on C strings appropriate for storage in configuration files, display to the user, or input from the user. Each MAC label contains a number of elements, each consisting of a name and value pair. Policy modules in the kernel bind to specific names and interpret the values in policy-specific ways. In the externalized string form, labels are represented by a comma-delimited list of name and value pairs, with the name and value within each pair separated by the / character. Labels may be directly converted to and from text using provided APIs; when retrieving labels from the kernel, internalized label storage must first be prepared for the desired label element set. Typically, this is done in one of two ways: using &man.mac.prepare.3; and an arbitrary list of desired label elements, or one of the variants of the call that loads a default element set from the &man.mac.conf.5; configuration file. Per-object defaults permit application writers to usefully display labels associated with objects without being aware of the policies present in the system. Currently, direct manipulation of label elements other than by conversion to a text string, string editing, and conversion back to an internalized label is not supported by the MAC library. Such interfaces may be added in the future if they prove necessary for application writers. Binding of Labels to Users The standard user context management interface, &man.setusercontext.3;, has been modified to retrieve MAC labels associated with a user's class from &man.login.conf.5;. These labels are then set along with other user context when either LOGIN_SETALL is specified, or when LOGIN_SETMAC is explicitly specified. It is expected that, in a future version of FreeBSD, the MAC label database will be separated from the login.conf user class abstraction, and be maintained in a separate database. However, the &man.setusercontext.3; API should remain the same following such a change. Conclusion The TrustedBSD MAC framework permits kernel modules to augment the system security policy in a highly integrated manner.
They may do this based on existing object properties, or based on label data that is maintained with the assistance of the MAC framework. The framework is sufficiently flexible to implement a variety of policy types, including information flow security policies such as MLS and Biba, as well as policies based on existing BSD credentials or file protections. Policy authors may wish to consult this documentation as well as existing security modules when implementing a new security service.
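As a closing illustration of the policy-agnostic label APIs described earlier, the small userland sketch below retrieves a file's label and prints its externalized text form. The element list "biba,mls" is only an example of the arbitrary element sets that &man.mac.prepare.3; accepts, and error handling is kept minimal.

#include <sys/mac.h>

#include <err.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
	mac_t label;
	char *text;

	if (argc != 2)
		errx(1, "usage: getlabel file");
	/* Prepare internalized storage for the desired label elements. */
	if (mac_prepare(&label, "biba,mls") != 0)
		err(1, "mac_prepare");
	/* Fetch the file's label and convert it to the text form. */
	if (mac_get_file(argv[1], label) != 0)
		err(1, "mac_get_file");
	if (mac_to_text(label, &text) != 0)
		err(1, "mac_to_text");
	printf("%s: %s\n", argv[1], text);
	free(text);
	mac_free(label);
	return (0);
}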
diff --git a/en_US.ISO8859-1/books/arch-handbook/pccard/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/pccard/chapter.sgml index 138dbf3e37..55a39bf163 100644 --- a/en_US.ISO8859-1/books/arch-handbook/pccard/chapter.sgml +++ b/en_US.ISO8859-1/books/arch-handbook/pccard/chapter.sgml @@ -1,379 +1,379 @@ PC Card PC Card CardBus This chapter will talk about the FreeBSD mechanisms for writing a device driver for a PC Card or CardBus device. However, at the present time, it just documents how to add support for a new device to an existing pccard driver. Adding a device The procedure for adding a new device to the list of supported pccard devices has changed from the system used through FreeBSD 4. In prior versions, editing a file in /etc to list the device was necessary. Starting in FreeBSD 5.0, device drivers know what devices they support. There is now a table of supported devices in the kernel that drivers use to attach to a device. Overview CIS PC Cards are identified in one of two ways, both based on information in the CIS of the card. The first method is to use numeric manufacturer and product numbers. The second method is to use the human readable strings that are also contained in the CIS. The PC Card bus uses a centralized database and some macros to facilitate a design pattern to help the driver writer match devices to his driver. There is a widespread practice of one company developing a reference design for a PC Card product and then selling this design to other companies to market. Those companies refine the design, market the product to their target audience or geographic area and put their own name plate onto the card. However, the refinements to the physical card typically are very minor, if any changes are made at all. Often, however, to strengthen their branding of their version of the card, these vendors will place their company name in the human strings in the CIS space, but leave the manufacturer and product ids unchanged. NetGear Linksys D-Link Because of the above practice, it is a smaller work load for FreeBSD to use the numeric IDs. It also introduces some minor complications into the process of adding IDs to the system. One must carefully check to see who really made the card, especially when it appears that the vendor the card was purchased from might already have a different manufacturer id listed in the central database. Linksys, D-Link and NetGear are US manufacturers of LAN hardware that often sell the same design. These same designs can be sold in Japan under names such as Buffalo and Corega. Yet often, these devices will all have the same manufacturer and product id. The PC Card bus keeps its central database of card information, but not which driver is associated with them, in /sys/dev/pccard/pccarddevs. It also provides a set of macros that allow one to easily construct simple entries in the table the driver uses to claim devices. Finally, some really low end devices do not contain manufacturer identification at all. These devices require that one matches them using the human readable CIS strings. While it would be nice if we did not need this method as a fallback, it is necessary for some very low end CD-ROM players that are quite popular. This method should generally be avoided, but a number of devices are listed in this section because they were added prior to the recognition of the OEM nature of the PC Card business. When adding new devices, prefer using the numeric method.
Format of <filename>pccarddevs</filename> There are four sections in the pccarddevs file. The first section lists the manufacturer numbers for those vendors that use them. This section is sorted in numerical order. The next section has all of the products that are used by these vendors, along with their product ID numbers and a description string. The description string typically is not used (instead we set the device's description based on the human readable CIS, even if we match on the numeric version). These two sections are then repeated for those devices that use the string matching method. Finally, C-style comments are allowed anywhere in the file. The first section of the file contains the vendor IDs. Please keep this list sorted in numeric order. Also, please coordinate changes to this file because we share it with NetBSD to help facilitate a common clearing house for this information. For example: vendor FUJITSU 0x0004 Fujitsu Corporation vendor NETGEAR_2 0x000b Netgear vendor PANASONIC 0x0032 Matsushita Electric Industrial Co. vendor SANDISK 0x0045 Sandisk Corporation shows the first few vendor ids. Chances are very good that the NETGEAR_2 entry is really an OEM that NETGEAR purchased cards from and the author of support for those cards was unaware at the time that Netgear was using someone else's id. These entries are fairly straightforward. There is the vendor keyword used to denote the kind of line that this is. There is the name of the vendor. This name will be repeated later in the pccarddevs file, as well as used in the driver's match tables, so keep it short and make it a valid C identifier. There is a numeric ID, in hex, for the manufacturer. Do not add IDs of the form 0xffffffff or 0xffff because these are reserved ids (the former is 'no id set' while the latter is sometimes seen in extremely poor quality cards to try to indicate 'none'). Finally there is a string description of the company that makes the card. This string is not used in FreeBSD for anything but commentary purposes. The second section of the file contains the products. As you can see in the following example: /* Allied Telesis K.K. */ product ALLIEDTELESIS LA_PCM 0x0002 Allied Telesis LA-PCM /* Archos */ product ARCHOS ARC_ATAPI 0x0043 MiniCD the format is similar to the vendor lines. There is the product keyword. Then there is the vendor name, repeated from above. This is followed by the product name, which is used by the driver and should be a valid C identifier, but may also start with a number. There is then the product id for this card, in hex. As with the vendors, there is the same convention for 0xffffffff and 0xffff. Finally, there is a string description of the device itself. This string typically is not used in FreeBSD, since FreeBSD's pccard bus driver will construct a string from the human readable CIS entries, but it can be used in the rare cases where this is somehow insufficient. The products are in alphabetical order by manufacturer, then numerical order by product id. They have a C comment before each manufacturer's entries and there is a blank line between entries. The third section is like the previous vendor section, but with all of the manufacturer numeric ids as -1. To the FreeBSD pccard bus code, -1 means match anything. Since these are C identifiers, their names must be unique. Otherwise the format is identical to the first section of the file. The final section contains the entries for those cards that we must match with string entries.
This section's format is a little different from the generic section: product ADDTRON AWP100 { "Addtron", "AWP-100&spWireless&spPCMCIA", "Version&sp01.02", NULL } product ALLIEDTELESIS WR211PCM { "Allied&spTelesis&spK.K.", "WR211PCM", NULL, NULL } Allied Telesis WR211PCM We have the familiar product keyword, followed by the vendor name followed by the card name, just as in the second section of the file. However, then we deviate from that format. There is a {} grouping, followed by a number of strings. These strings correspond to the vendor, product and extra information that is defined in a CIS_INFO tuple. These strings are filtered by the program that generates pccarddevs.h to replace &sp with a real space. NULL entries mean that that part of the entry should be ignored. In the example I have picked, there is a bad entry. It should not contain the version number unless that is critical for the operation of the card. Sometimes vendors will have many different versions of the card in the field that all work, in which case that information only makes it harder for someone with a similar card to use it with FreeBSD. Sometimes it is necessary when a vendor wishes to sell many different parts under the same brand due to market considerations (availability, price, and so forth). Then it can be critical for disambiguating the card in those rare cases where the vendor kept the same manufacturer/product pair. Regular expression matching is not available at this time. Sample probe routine PC Cardprobe To understand how to add a device to the list of supported devices, one must understand the probe and/or match routines that many drivers have. It is complicated a little in FreeBSD 5.x because there is a compatibility layer for OLDCARD present as well. Since only the window-dressing is different, an idealized version will be presented here. static const struct pccard_product wi_pccard_products[] = { PCMCIA_CARD(3COM, 3CRWE737A, 0), PCMCIA_CARD(BUFFALO, WLI_PCM_S11, 0), PCMCIA_CARD(BUFFALO, WLI_CF_S11G, 0), PCMCIA_CARD(TDK, LAK_CD011WL, 0), { NULL } }; static int wi_pccard_probe(dev) device_t dev; { const struct pccard_product *pp; if ((pp = pccard_product_lookup(dev, wi_pccard_products, sizeof(wi_pccard_products[0]), NULL)) != NULL) { - if (pp->pp_name != NULL) - device_set_desc(dev, pp->pp_name); + if (pp->pp_name != NULL) + device_set_desc(dev, pp->pp_name); return (0); } return (ENXIO); } Here we have a simple pccard probe routine that matches a few devices. As stated above, the name may vary (if it is not foo_pccard_probe() it will be foo_pccard_match()). The function pccard_product_lookup() is a generalized function that walks the table and returns a pointer to the first entry that it matches. Some drivers may use this mechanism to convey additional information about some cards to the rest of the driver, so there may be some variance in the table. The only requirement is that if you have a different table, the first element of the structure in your table must be a struct pccard_product. Looking at the table wi_pccard_products, one notices that all the entries are of the form PCMCIA_CARD(foo, bar, baz). The foo part is the manufacturer id from pccarddevs. The bar part is the product. The baz part is the expected function number for this card. Many pccards can have multiple functions, and some way to disambiguate function 1 from function 0 is needed. You may see PCMCIA_CARD_D, which includes the device description from the pccarddevs file.
You may also see PCMCIA_CARD2 and PCMCIA_CARD2_D, which are used when you need to match both CIS strings and manufacturer numbers; they come in the same two flavors: using the default description, and taking the description from pccarddevs. Putting it all together So, to add a new device, one must perform the following steps. First, one must obtain the identification information from the device. The easiest way to do this is to insert the device into a PC Card or CF slot and issue devinfo -v. You will likely see something like: cbb1 pnpinfo vendor=0x104c device=0xac51 subvendor=0x1265 subdevice=0x0300 class=0x060700 at slot=10 function=1 cardbus1 pccard1 unknown pnpinfo manufacturer=0x026f product=0x030c cisvendor="BUFFALO" cisproduct="WLI2-CF-S11" function_type=6 at function=0 as part of the output. The manufacturer and product are the numeric IDs for this product, while the cisvendor and cisproduct are the strings present in the CIS that describe it. Since we prefer the numeric option, first try to construct an entry based on it. The above card has been slightly fictionalized for the purpose of this example. The vendor is BUFFALO, which we see already has an entry: vendor BUFFALO 0x026f BUFFALO (Melco Corporation) so we are good there. Looking for an entry for this card, we do not find one. Instead we find: /* BUFFALO */ product BUFFALO WLI_PCM_S11 0x0305 BUFFALO AirStation 11Mbps WLAN product BUFFALO LPC_CF_CLT 0x0307 BUFFALO LPC-CF-CLT product BUFFALO LPC3_CLT 0x030a BUFFALO LPC3-CLT Ethernet Adapter product BUFFALO WLI_CF_S11G 0x030b BUFFALO AirStation 11Mbps CF WLAN we can just add product BUFFALO WLI2_CF_S11G 0x030c BUFFALO AirStation ultra 802.11b CF to pccarddevs. Presently, there is a manual step to regenerate the pccarddevs.h file used to convey these identifiers to the client driver. The following steps must be done before you can use them in the driver: &prompt.root; cd src/sys/dev/pccard &prompt.root; make -f Makefile.pccarddevs Once these steps are complete, you can add the card to the driver. That is a simple operation of adding one line: static const struct pccard_product wi_pccard_products[] = { PCMCIA_CARD(3COM, 3CRWE737A, 0), PCMCIA_CARD(BUFFALO, WLI_PCM_S11, 0), PCMCIA_CARD(BUFFALO, WLI_CF_S11G, 0), + PCMCIA_CARD(BUFFALO, WLI2_CF_S11G, 0), PCMCIA_CARD(TDK, LAK_CD011WL, 0), { NULL } }; Note that I have included a '+' at the beginning of the line that I added, but that is simply to highlight the line. Do not add it to the actual driver. Once you have added the line, you can recompile your kernel or module and try to see if it recognizes the device. If it does and works, please submit a patch. If it does not work, please figure out what is needed to make it work and submit a patch. If it did not recognize it at all, you have done something wrong and should recheck each step. If you are a FreeBSD src committer, and everything appears to be working, then you can commit the changes to the tree. However, there are some minor tricky things that you need to worry about. First, you must commit the pccarddevs file to the tree. After you have done that, you must regenerate pccarddevs.h and commit it as a second commit (this is to make sure that the right $FreeBSD$ tag is in the latter file). Finally, you need to commit the additions to the driver. Submitting a new device Many people send entries for new devices to the author directly. Please do not do this. Please submit them as a PR and send the author the PR number for his records.
This makes sure that entries are not lost. When submitting a PR, it is unnecessary to include the pccarddevs.h diffs in the patch, since those will be regenerated. It is necessary to include a description of the device, as well as the patches to the client driver. If you do not know the name, use OEM99 as the name, and the author will adjust OEM99 accordingly after investigation. Committers should not commit OEM99, but instead find the highest OEM entry and commit one more than that. diff --git a/en_US.ISO8859-1/books/arch-handbook/pci/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/pci/chapter.sgml index 2123ec5861..76e5730f42 100644 --- a/en_US.ISO8859-1/books/arch-handbook/pci/chapter.sgml +++ b/en_US.ISO8859-1/books/arch-handbook/pci/chapter.sgml @@ -1,386 +1,386 @@ PCI Devices PCI bus This chapter will talk about the FreeBSD mechanisms for writing a device driver for a device on a PCI bus. Probe and Attach Information here about how the PCI bus code iterates through the unattached devices to see if a newly loaded kld will attach to any of them. /* * Simple KLD to play with the PCI functions. * * Murray Stokely */ -#define MIN(a,b) (((a) < (b)) ? (a) : (b)) +#define MIN(a,b) (((a) < (b)) ? (a) : (b)) #include <sys/param.h> /* defines used in kernel.h */ #include <sys/module.h> #include <sys/systm.h> #include <sys/errno.h> #include <sys/kernel.h> /* types used in module initialization */ #include <sys/conf.h> /* cdevsw struct */ #include <sys/uio.h> /* uio struct */ #include <sys/malloc.h> #include <sys/bus.h> /* structs, prototypes for pci bus stuff */ #include <machine/bus.h> #include <sys/rman.h> #include <machine/resource.h> #include <dev/pci/pcivar.h> /* For get_pci macros! */ #include <dev/pci/pcireg.h> /* Function prototypes */ d_open_t mypci_open; d_close_t mypci_close; d_read_t mypci_read; d_write_t mypci_write; /* Character device entry points */ static struct cdevsw mypci_cdevsw = { .d_open = mypci_open, .d_close = mypci_close, .d_read = mypci_read, .d_write = mypci_write, .d_name = "mypci", }; /* vars */ static dev_t sdev; /* We're more interested in probe/attach than in open/close/read/write at this point */ int mypci_open(dev_t dev, int oflags, int devtype, d_thread_t *td) { int err = 0; printf("Opened device \"mypci\" successfully.\n"); return (err); } int mypci_close(dev_t dev, int fflag, int devtype, d_thread_t *td) { int err = 0; printf("Closing device \"mypci.\"\n"); return (err); } int mypci_read(dev_t dev, struct uio *uio, int ioflag) { int err = 0; printf("mypci read!\n"); return (err); } int mypci_write(dev_t dev, struct uio *uio, int ioflag) { int err = 0; printf("mypci write!\n"); return (err); } /* PCI Support Functions */ /* * Return identification string if this device is ours. */ static int mypci_probe(device_t dev) { device_printf(dev, "MyPCI Probe\nVendor ID : 0x%x\nDevice ID : 0x%x\n", pci_get_vendor(dev), pci_get_device(dev)); if (pci_get_vendor(dev) == 0x11c1) { printf("We've got the Winmodem, probe successful!\n"); return (0); } return (ENXIO); } /* Attach function is only called if the probe is successful */ static int mypci_attach(device_t dev) { printf("MyPCI Attach for : deviceID : 0x%x\n",pci_get_vendor(dev)); - sdev = make_dev(&mypci_cdevsw, 0, UID_ROOT, + sdev = make_dev(&mypci_cdevsw, 0, UID_ROOT, GID_WHEEL, 0600, "mypci"); printf("Mypci device loaded.\n"); return (0); } /* Detach device. */ static int mypci_detach(device_t dev) { printf("Mypci detach!\n"); return (0); } /* Called during system shutdown after sync.
*/ static int mypci_shutdown(device_t dev) { printf("Mypci shutdown!\n"); return (0); } /* * Device suspend routine. */ static int mypci_suspend(device_t dev) { printf("Mypci suspend!\n"); return (0); } /* * Device resume routine. */ static int mypci_resume(device_t dev) { printf("Mypci resume!\n"); return (0); } static device_method_t mypci_methods[] = { /* Device interface */ DEVMETHOD(device_probe, mypci_probe), DEVMETHOD(device_attach, mypci_attach), DEVMETHOD(device_detach, mypci_detach), DEVMETHOD(device_shutdown, mypci_shutdown), DEVMETHOD(device_suspend, mypci_suspend), DEVMETHOD(device_resume, mypci_resume), { 0, 0 } }; static driver_t mypci_driver = { "mypci", mypci_methods, 0, /* sizeof(struct mypci_softc), */ }; static devclass_t mypci_devclass; DRIVER_MODULE(mypci, pci, mypci_driver, mypci_devclass, 0, 0); Additional Resources PCI Special Interest Group PCI System Architecture, Fourth Edition by Tom Shanley, et al. Bus Resources PCI busresources FreeBSD provides an object-oriented mechanism for requesting resources from a parent bus. Almost all devices will be a child member of some sort of bus (PCI, ISA, USB, SCSI, etc.) and these devices need to acquire resources from their parent bus (such as memory segments, interrupt lines, or DMA channels). Base Address Registers PCI busBase Address Registers To do anything particularly useful with a PCI device you will need to obtain the Base Address Registers (BARs) from the PCI Configuration space. The PCI-specific details of obtaining the BAR are abstracted in the bus_alloc_resource() function. For example, a typical driver might have something similar to this in the attach() function: - sc->bar0id = PCIR_BAR(0); - sc->bar0res = bus_alloc_resource(dev, SYS_RES_MEMORY, &(sc->bar0id), + sc->bar0id = PCIR_BAR(0); + sc->bar0res = bus_alloc_resource(dev, SYS_RES_MEMORY, &(sc->bar0id), 0, ~0, 1, RF_ACTIVE); - if (sc->bar0res == NULL) { + if (sc->bar0res == NULL) { printf("Memory allocation of PCI base register 0 failed!\n"); error = ENXIO; goto fail1; } - sc->bar1id = PCIR_BAR(1); - sc->bar1res = bus_alloc_resource(dev, SYS_RES_MEMORY, &(sc->bar1id), + sc->bar1id = PCIR_BAR(1); + sc->bar1res = bus_alloc_resource(dev, SYS_RES_MEMORY, &(sc->bar1id), 0, ~0, 1, RF_ACTIVE); - if (sc->bar1res == NULL) { + if (sc->bar1res == NULL) { printf("Memory allocation of PCI base register 1 failed!\n"); error = ENXIO; goto fail2; } - sc->bar0_bt = rman_get_bustag(sc->bar0res); - sc->bar0_bh = rman_get_bushandle(sc->bar0res); - sc->bar1_bt = rman_get_bustag(sc->bar1res); - sc->bar1_bh = rman_get_bushandle(sc->bar1res); + sc->bar0_bt = rman_get_bustag(sc->bar0res); + sc->bar0_bh = rman_get_bushandle(sc->bar0res); + sc->bar1_bt = rman_get_bustag(sc->bar1res); + sc->bar1_bh = rman_get_bushandle(sc->bar1res); Handles for each base address register are kept in the softc structure so that they can be used to write to the device later. These handles can then be used to read from or write to the device registers with the bus_space_* functions.
For example, a driver might contain a shorthand function to read from a board specific register like this: uint16_t board_read(struct ni_softc *sc, uint16_t address) { - return bus_space_read_2(sc->bar1_bt, sc->bar1_bh, address); + return bus_space_read_2(sc->bar1_bt, sc->bar1_bh, address); } Similarly, one could write to the registers with: void board_write(struct ni_softc *sc, uint16_t address, uint16_t value) { - bus_space_write_2(sc->bar1_bt, sc->bar1_bh, address, value); + bus_space_write_2(sc->bar1_bt, sc->bar1_bh, address, value); } These functions exist in 8-bit, 16-bit, and 32-bit versions and you should use bus_space_{read|write}_{1|2|4} accordingly. Interrupts PCI businterrupts Interrupts are allocated from the object-oriented bus code in a way similar to the memory resources. First an IRQ resource must be allocated from the parent bus, and then the interrupt handler must be set up to deal with this IRQ. Again, a sample from a device attach() function says more than words. /* Get the IRQ resource */ - sc->irqid = 0x0; - sc->irqres = bus_alloc_resource(dev, SYS_RES_IRQ, &(sc->irqid), + sc->irqid = 0x0; + sc->irqres = bus_alloc_resource(dev, SYS_RES_IRQ, &(sc->irqid), 0, ~0, 1, RF_SHAREABLE | RF_ACTIVE); - if (sc->irqres == NULL) { + if (sc->irqres == NULL) { printf("IRQ allocation failed!\n"); error = ENXIO; goto fail3; } /* Now we should set up the interrupt handler */ - error = bus_setup_intr(dev, sc->irqres, INTR_TYPE_MISC, - my_handler, sc, &(sc->handler)); + error = bus_setup_intr(dev, sc->irqres, INTR_TYPE_MISC, + my_handler, sc, &(sc->handler)); if (error) { printf("Couldn't set up irq\n"); goto fail4; } - sc->irq_bt = rman_get_bustag(sc->irqres); - sc->irq_bh = rman_get_bushandle(sc->irqres); + sc->irq_bt = rman_get_bustag(sc->irqres); + sc->irq_bh = rman_get_bushandle(sc->irqres); Some care must be taken in the detach routine of the driver. You must quiesce the device's interrupt stream, and remove the interrupt handler. Once bus_teardown_intr() has returned, you know that your interrupt handler will no longer be called and that all threads that might have been executing this interrupt handler have returned. Since this function can sleep, you must not hold any mutexes when calling this function. DMA PCI busDMA This section is obsolete, and present only for historical reasons. The proper method for dealing with these issues is to use the bus_dma*() functions instead. This paragraph can be removed when this section is updated to reflect that usage. However, at the moment, the API is in a bit of flux, so once that settles down, it would be good to update this section to reflect that. On the PC, peripherals that want to do bus-mastering DMA must deal with physical addresses. This is a problem since FreeBSD uses virtual memory and deals almost exclusively with virtual addresses. Fortunately, there is a function, vtophys(), to help. #include <vm/vm.h> #include <vm/pmap.h> #define vtophys(virtual_address) (...) The solution is a bit different on the alpha however, and what we really want is a function called vtobus(). #if defined(__alpha__) #define vtobus(va) alpha_XXX_dmamap((vm_offset_t)va) #else #define vtobus(va) vtophys(va) #endif Deallocating Resources It is very important to deallocate all of the resources that were allocated during attach(). Care must be taken to deallocate the correct stuff even on a failure condition so that the system will remain usable while your driver dies.
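As a sketch of what this might look like for the attach() fragments above, the detach() below releases everything in reverse order of allocation, reusing the same softc field names; it is illustrative only, and a real driver would also quiesce the hardware and destroy any character devices it created.

static int
mydev_detach(device_t dev)
{
	struct ni_softc *sc = device_get_softc(dev);

	/* Stop the interrupt stream, then remove the interrupt handler. */
	bus_teardown_intr(dev, sc->irqres, sc->handler);
	bus_release_resource(dev, SYS_RES_IRQ, sc->irqid, sc->irqres);

	/* Release the BARs in the reverse order of their allocation. */
	bus_release_resource(dev, SYS_RES_MEMORY, sc->bar1id, sc->bar1res);
	bus_release_resource(dev, SYS_RES_MEMORY, sc->bar0id, sc->bar0res);

	return (0);
}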
diff --git a/en_US.ISO8859-1/books/arch-handbook/scsi/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/scsi/chapter.sgml index 11caed4363..764effd926 100644 --- a/en_US.ISO8859-1/books/arch-handbook/scsi/chapter.sgml +++ b/en_US.ISO8859-1/books/arch-handbook/scsi/chapter.sgml @@ -1,2013 +1,2013 @@ Sergey Babkin Written by Murray Stokely Modifications for Handbook made by Common Access Method SCSI Controllers Synopsis SCSI This document assumes that the reader has a general understanding of device drivers in FreeBSD and of the SCSI protocol. Much of the information in this document was extracted from the drivers: ncr (/sys/pci/ncr.c) by Wolfgang Stanglmeier and Stefan Esser sym (/sys/dev/sym/sym_hipd.c) by Gerard Roudier aic7xxx (/sys/dev/aic7xxx/aic7xxx.c) by Justin T. Gibbs and from the CAM code itself (by Justin T. Gibbs, see /sys/cam/*). When some solution looked the most logical and was extracted essentially verbatim from the code by Justin T. Gibbs, I marked it as recommended. The document is illustrated with examples in pseudo-code. Although sometimes the examples have many details and look like real code, it is still pseudo-code. It was written to demonstrate the concepts in an understandable way. For a real driver other approaches may be more modular and efficient. It also abstracts away the hardware details, as well as issues that would cloud the demonstrated concepts or that are supposed to be described in the other chapters of the developers handbook. Such details are commonly shown as calls to functions with descriptive names, comments or pseudo-statements. Fortunately real life full-size examples with all the details can be found in the real drivers. General architecture Common Access Method (CAM) CAM stands for Common Access Method. It is a generic way to address the I/O buses in a SCSI-like way. This allows a separation of the generic device drivers from the drivers controlling the I/O bus: for example the disk driver becomes able to control disks on SCSI, IDE, or any other bus, so the disk driver portion does not have to be rewritten (or copied and modified) for every new I/O bus. Thus the two most important active entities are: CD-ROM tape IDE Peripheral Modules - drivers for peripheral devices (disk, tape, CD-ROM, etc.) SCSI Interface Modules (SIM) - Host Bus Adapter drivers for connecting to an I/O bus such as SCSI or IDE. A peripheral driver receives requests from the OS, converts them to a sequence of SCSI commands and passes these SCSI commands to a SCSI Interface Module. The SCSI Interface Module is responsible for passing these commands to the actual hardware (or if the actual hardware is not SCSI but, for example, IDE then also converting the SCSI commands to the native commands of the hardware). Because we are interested in writing a SCSI adapter driver here, from this point on we will consider everything from the SIM standpoint. A typical SIM driver needs to include the following CAM-related header files: #include <cam/cam.h> #include <cam/cam_ccb.h> #include <cam/cam_sim.h> #include <cam/cam_xpt_sim.h> #include <cam/cam_debug.h> #include <cam/scsi/scsi_all.h> The first thing each SIM driver must do is register itself with the CAM subsystem. This is done during the driver's xxx_attach() function (here and further xxx_ is used to denote the unique driver name prefix). The xxx_attach() function itself is called by the system bus auto-configuration code which we do not describe here.
This is achieved in multiple steps: first it is necessary to allocate the queue of requests associated with this SIM: struct cam_devq *devq; if(( devq = cam_simq_alloc(SIZE) )==NULL) { error; /* some code to handle the error */ } Here SIZE is the size of the queue to be allocated, i.e., the maximal number of requests it can contain. It is the number of requests that the SIM driver can handle in parallel on one SCSI card. Commonly it can be calculated as: SIZE = NUMBER_OF_SUPPORTED_TARGETS * MAX_SIMULTANEOUS_COMMANDS_PER_TARGET Next we create a descriptor of our SIM: struct cam_sim *sim; if(( sim = cam_sim_alloc(action_func, poll_func, driver_name, softc, unit, max_dev_transactions, max_tagged_dev_transactions, devq) )==NULL) { cam_simq_free(devq); error; /* some code to handle the error */ } Note that if we are not able to create a SIM descriptor we free the devq also because we can do nothing else with it and we want to conserve memory. SCSIbus If a SCSI card has multiple SCSI buses on it then each bus requires its own cam_sim structure. An interesting question is what to do if a SCSI card has more than one SCSI bus: do we need one devq structure per card or per SCSI bus? The answer given in the comments to the CAM code is: either way, as the driver's author prefers. The arguments are: action_func - pointer to the driver's xxx_action function. static void xxx_action struct cam_sim *sim, union ccb *ccb poll_func - pointer to the driver's xxx_poll() static void xxx_poll struct cam_sim *sim driver_name - the name of the actual driver, such as ncr or wds. softc - pointer to the driver's internal descriptor for this SCSI card. This pointer will be used by the driver in the future to get private data. unit - the controller unit number, for example for controller wds0 this number will be 0 max_dev_transactions - maximal number of simultaneous transactions per SCSI target in the non-tagged mode. This value will be almost universally equal to 1, with possible exceptions only for the non-SCSI cards. Also the drivers that hope to take advantage by preparing one transaction while another one is executed may set it to 2 but this does not seem to be worth the complexity. max_tagged_dev_transactions - the same thing, but in the tagged mode. Tags are the SCSI way to initiate multiple transactions on a device: each transaction is assigned a unique tag and the transaction is sent to the device. When the device completes some transaction it sends back the result together with the tag so that the SCSI adapter (and the driver) can tell which transaction was completed. This argument is also known as the maximal tag depth. It depends on the abilities of the SCSI adapter. SCSIadapter Finally we register the SCSI buses associated with our SCSI adapter: if(xpt_bus_register(sim, bus_number) != CAM_SUCCESS) { cam_sim_free(sim, /*free_devq*/ TRUE); error; /* some code to handle the error */ } If there is one devq structure per SCSI bus (i.e. we consider a card with multiple buses as multiple cards with one bus each) then the bus number will always be 0, otherwise each bus on the SCSI card should get a distinct number. Each bus needs its own separate cam_sim structure. After that our controller is completely hooked to the CAM system. The value of devq can be discarded now: sim will be passed as an argument in all further calls from CAM and devq can be derived from it. CAM also provides a framework for asynchronous events.
Some events originate from the lower levels (the SIM drivers), some events originate from the peripheral drivers, some events originate from the CAM subsystem itself. Any driver can register callbacks for some types of asynchronous events, so that it would be notified if these events occur. A typical example of such an event is a device reset. Each transaction and event identifies the devices to which it applies by means of a path. The target-specific events normally occur during a transaction with the device. So the path from that transaction may be re-used to report this event (this is safe because the event path is copied in the event reporting routine but not deallocated nor passed anywhere further). Also it is safe to allocate paths dynamically at any time including the interrupt routines, although that incurs certain overhead, and a possible problem with this approach is that there may be no free memory at that time. For a bus reset event we need to define a wildcard path including all devices on the bus. So we can create the path for the future bus reset events in advance and avoid problems with a future memory shortage: struct cam_path *path; if(xpt_create_path(&path, /*periph*/NULL, cam_sim_path(sim), CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD) != CAM_REQ_CMP) { xpt_bus_deregister(cam_sim_path(sim)); cam_sim_free(sim, /*free_devq*/TRUE); error; /* some code to handle the error */ } - softc->wpath = path; - softc->sim = sim; + softc->wpath = path; + softc->sim = sim; As you can see the path includes: ID of the peripheral driver (NULL here because we have none) ID of the SIM driver (cam_sim_path(sim)) SCSI target number of the device (CAM_TARGET_WILDCARD means all devices) SCSI LUN number of the subdevice (CAM_LUN_WILDCARD means all LUNs) If the driver can not allocate this path it will not be able to work normally, so in that case we dismantle that SCSI bus. And we save the path pointer in the softc structure for future use. After that we save the value of sim (or we can also discard it on the exit from xxx_probe() if we wish). That is all for a minimalistic initialization. To do things right, there is one more issue left. For a SIM driver there is one particularly interesting event: when a target device is considered lost. In this case resetting the SCSI negotiations with this device may be a good idea. So we register a callback for this event with CAM. The request is passed to CAM by requesting CAM action on a CAM control block for this type of request: struct ccb_setasync csa; xpt_setup_ccb(&csa.ccb_h, path, /*priority*/5); csa.ccb_h.func_code = XPT_SASYNC_CB; csa.event_enable = AC_LOST_DEVICE; csa.callback = xxx_async; csa.callback_arg = sim; xpt_action((union ccb *)&csa); Now we take a look at the xxx_action() and xxx_poll() driver entry points. static void xxx_action struct cam_sim *sim, union ccb *ccb Do some action on request of the CAM subsystem. Sim describes the SIM for the request, and CCB is the request itself. CCB stands for CAM Control Block. It is a union of many specific instances, each describing arguments for some type of transaction. All of these instances share the CCB header where the common part of arguments is stored. CAM supports the SCSI controllers working in both initiator (normal) mode and target (simulating a SCSI device) mode. Here we only consider the part relevant to the initiator mode.
There are a few functions and macros (in other words, methods) defined to access the public data in the struct sim: cam_sim_path(sim) - the path ID (see above) cam_sim_name(sim) - the name of the sim cam_sim_softc(sim) - the pointer to the softc (driver private data) structure cam_sim_unit(sim) - the unit number cam_sim_bus(sim) - the bus ID To identify the device, xxx_action() can get the unit number and pointer to its structure softc using these functions. The type of request is stored in ccb->ccb_h.func_code. So generally xxx_action() consists of a big switch: struct xxx_softc *softc = (struct xxx_softc *) cam_sim_softc(sim); - struct ccb_hdr *ccb_h = &ccb->ccb_h; + struct ccb_hdr *ccb_h = &ccb->ccb_h; int unit = cam_sim_unit(sim); int bus = cam_sim_bus(sim); - switch(ccb_h->func_code) { + switch(ccb_h->func_code) { case ...: ... default: - ccb_h->status = CAM_REQ_INVALID; + ccb_h->status = CAM_REQ_INVALID; xpt_done(ccb); break; } As can be seen from the default case (if an unknown command was received) the return code of the command is set into ccb->ccb_h.status and the completed CCB is returned back to CAM by calling xpt_done(ccb). xpt_done() does not have to be called from xxx_action(): for example an I/O request may be enqueued inside the SIM driver and/or its SCSI controller. Then, when the device posts an interrupt signaling that the processing of this request is complete, xpt_done() may be called from the interrupt handling routine. Actually, the CCB status is not only assigned as a return code but a CCB has some status all the time. Before a CCB is passed to the xxx_action() routine it gets the status CAM_REQ_INPROG, meaning that it is in progress. There are a surprising number of status values defined in /sys/cam/cam.h which should be able to represent the status of a request in great detail. More interesting yet, the status is in fact a bitwise or of an enumerated status value (the lower 6 bits) and possible additional flag-like bits (the upper bits). The enumerated values will be discussed later in more detail. The summary of them can be found in the Errors Summary section. The possible status flags are: CAM_DEV_QFRZN - if the SIM driver gets a serious error (for example, the device does not respond to the selection or breaks the SCSI protocol) when processing a CCB it should freeze the request queue by calling xpt_freeze_simq(), return the other enqueued but not yet processed CCBs for this device back to the CAM queue, then set this flag for the troublesome CCB and call xpt_done(). This flag causes the CAM subsystem to unfreeze the queue after it handles the error. CAM_AUTOSNS_VALID - if the device returned an error condition and the flag CAM_DIS_AUTOSENSE is not set in CCB the SIM driver must execute the REQUEST SENSE command automatically to extract the sense (extended error information) data from the device. If this attempt was successful the sense data should be saved in the CCB and this flag set. CAM_RELEASE_SIMQ - like CAM_DEV_QFRZN but used in case there is some problem (or resource shortage) with the SCSI controller itself. Then all the future requests to the controller should be stopped by xpt_freeze_simq(). The controller queue will be restarted after the SIM driver overcomes the shortage and informs CAM by returning some CCB with this flag set. CAM_SIM_QUEUED - when SIM puts a CCB into its request queue this flag should be set (and removed when this CCB gets dequeued before being returned back to CAM).
This flag is not used anywhere in the CAM code now, so its purpose is purely diagnostic. The function xxx_action() is not allowed to sleep, so all the synchronization for resource access must be done using SIM or device queue freezing. Besides the aforementioned flags the CAM subsystem provides functions xpt_release_simq() and xpt_release_devq() to unfreeze the queues directly, without passing a CCB to CAM. The CCB header contains the following fields: path - path ID for the request target_id - target device ID for the request target_lun - LUN ID of the target device timeout - timeout interval for this command, in milliseconds timeout_ch - a convenience place for the SIM driver to store the timeout handle (the CAM subsystem itself does not make any assumptions about it) flags - various bits of information about the request spriv_ptr0, spriv_ptr1 - fields reserved for private use by the SIM driver (such as linking to the SIM queues or SIM private control blocks); actually, they exist as unions: spriv_ptr0 and spriv_ptr1 have the type (void *), spriv_field0 and spriv_field1 have the type unsigned long, sim_priv.entries[0].bytes and sim_priv.entries[1].bytes are byte arrays of the size consistent with the other incarnations of the union and sim_priv.bytes is one array, twice as big. The recommended way of using the SIM private fields of CCB is to define some meaningful names for them and use these meaningful names in the driver, like: #define ccb_some_meaningful_name sim_priv.entries[0].bytes #define ccb_hcb spriv_ptr1 /* for hardware control block */ The most common initiator mode requests are: XPT_SCSI_IO - execute an I/O transaction The instance struct ccb_scsiio csio of the union ccb is used to transfer the arguments. They are: cdb_io - pointer to the SCSI command buffer or the buffer itself cdb_len - SCSI command length data_ptr - pointer to the data buffer (gets a bit complicated if scatter/gather is used) dxfer_len - length of the data to transfer sglist_cnt - counter of the scatter/gather segments scsi_status - place to return the SCSI status sense_data - buffer for the SCSI sense information if the command returns an error (the SIM driver is supposed to run the REQUEST SENSE command automatically in this case if the CCB flag CAM_DIS_AUTOSENSE is not set) sense_len - the length of that buffer (if it happens to be larger than the size of sense_data the SIM driver must silently assume the smaller value) resid, sense_resid - if the transfer of data or SCSI sense returned an error these are the returned counters of the residual (not transferred) data. They do not seem to be especially meaningful, so in cases where they are difficult to compute (say, counting bytes in the SCSI controller's FIFO buffer) an approximate value will do as well. For a successfully completed transfer they must be set to zero.
tag_action - the kind of tag to use: CAM_TAG_ACTION_NONE - do not use tags for this transaction MSG_SIMPLE_Q_TAG, MSG_HEAD_OF_Q_TAG, MSG_ORDERED_Q_TAG - value equal to the appropriate tag message (see /sys/cam/scsi/scsi_message.h); this gives only the tag type; the SIM driver must assign the tag value itself The general logic of handling this request is the following: The first thing to do is to check for possible races, to make sure that the command did not get aborted when it was sitting in the queue: - struct ccb_scsiio *csio = &ccb->csio; + struct ccb_scsiio *csio = &ccb->csio; - if ((ccb_h->status & CAM_STATUS_MASK) != CAM_REQ_INPROG) { + if ((ccb_h->status & CAM_STATUS_MASK) != CAM_REQ_INPROG) { xpt_done(ccb); return; } Also we check that the device is supported at all by our controller: - if(ccb_h->target_id > OUR_MAX_SUPPORTED_TARGET_ID - || ccb_h->target_id == OUR_SCSI_CONTROLLERS_OWN_ID) { - ccb_h->status = CAM_TID_INVALID; + if(ccb_h->target_id > OUR_MAX_SUPPORTED_TARGET_ID + || ccb_h->target_id == OUR_SCSI_CONTROLLERS_OWN_ID) { + ccb_h->status = CAM_TID_INVALID; xpt_done(ccb); return; } - if(ccb_h->target_lun > OUR_MAX_SUPPORTED_LUN) { - ccb_h->status = CAM_LUN_INVALID; + if(ccb_h->target_lun > OUR_MAX_SUPPORTED_LUN) { + ccb_h->status = CAM_LUN_INVALID; xpt_done(ccb); return; } hardware control block Then allocate whatever data structures (such as card-dependent hardware control block) we need to process this request. If we can not, then freeze the SIM queue and remember that we have a pending operation, return the CCB back and ask CAM to re-queue it. Later when the resources become available the SIM queue must be unfrozen by returning a ccb with the CAM_RELEASE_SIMQ bit set in its status. Otherwise, if all went well, link the CCB with the hardware control block (HCB) and mark it as queued. struct xxx_hcb *hcb = allocate_hcb(softc, unit, bus); if(hcb == NULL) { - softc->flags |= RESOURCE_SHORTAGE; + softc->flags |= RESOURCE_SHORTAGE; xpt_freeze_simq(sim, /*count*/1); - ccb_h->status = CAM_REQUEUE_REQ; + ccb_h->status = CAM_REQUEUE_REQ; xpt_done(ccb); return; } - hcb->ccb = ccb; ccb_h->ccb_hcb = (void *)hcb; - ccb_h->status |= CAM_SIM_QUEUED; + hcb->ccb = ccb; ccb_h->ccb_hcb = (void *)hcb; + ccb_h->status |= CAM_SIM_QUEUED; Extract the target data from CCB into the hardware control block. Check if we are asked to assign a tag and if yes then generate a unique tag and build the SCSI tag messages. The SIM driver is also responsible for negotiations with the devices to set the maximal mutually supported bus width, synchronous rate and offset. - hcb->target = ccb_h->target_id; hcb->lun = ccb_h->target_lun; + hcb->target = ccb_h->target_id; hcb->lun = ccb_h->target_lun; generate_identify_message(hcb); - if( ccb_h->tag_action != CAM_TAG_ACTION_NONE ) - generate_unique_tag_message(hcb, ccb_h->tag_action); + if( ccb_h->tag_action != CAM_TAG_ACTION_NONE ) + generate_unique_tag_message(hcb, ccb_h->tag_action); if( !target_negotiated(hcb) ) generate_negotiation_messages(hcb); Then set up the SCSI command. The command storage may be specified in the CCB in many interesting ways, as specified by the CCB flags. The command buffer can be contained in CCB or pointed to; in the latter case the pointer may be physical or virtual. Since the hardware commonly needs a physical address, we always convert the address to a physical one.
A NOT-QUITE RELATED NOTE: Normally this is done by a call to vtophys(), but for portability of the PCI device drivers (which account for most of the SCSI controllers now) to the Alpha architecture, the conversion must be done by vtobus() instead, due to special Alpha quirks. [IMHO it would be much better to have two separate functions, vtop() and ptobus(); then vtobus() would be a simple composition of them.] If a physical address is requested, it is OK to return the CCB with the status CAM_REQ_INVALID; the current drivers do that. But it is also possible to compile the Alpha-specific piece of code, as in this example (there should be a more direct way to do that, without conditional compilation in the drivers). If necessary a physical address can also be converted or mapped back to a virtual address, but only with great pain, so we do not do that. - if(ccb_h->flags & CAM_CDB_POINTER) { + if(ccb_h->flags & CAM_CDB_POINTER) { /* CDB is a pointer */ - if(!(ccb_h->flags & CAM_CDB_PHYS)) { + if(!(ccb_h->flags & CAM_CDB_PHYS)) { /* CDB pointer is virtual */ - hcb->cmd = vtobus(csio->cdb_io.cdb_ptr); + hcb->cmd = vtobus(csio->cdb_io.cdb_ptr); } else { /* CDB pointer is physical */ #if defined(__alpha__) - hcb->cmd = csio->cdb_io.cdb_ptr | alpha_XXX_dmamap_or ; + hcb->cmd = csio->cdb_io.cdb_ptr | alpha_XXX_dmamap_or ; #else - hcb->cmd = csio->cdb_io.cdb_ptr ; + hcb->cmd = csio->cdb_io.cdb_ptr ; #endif } } else { /* CDB is in the ccb (buffer) */ - hcb->cmd = vtobus(csio->cdb_io.cdb_bytes); + hcb->cmd = vtobus(csio->cdb_io.cdb_bytes); } - hcb->cmdlen = csio->cdb_len; + hcb->cmdlen = csio->cdb_len; Now it is time to set up the data. Again, the data storage may be specified in the CCB in many interesting ways, as specified by the CCB flags. First we get the direction of the data transfer. The simplest case is if there is no data to transfer: - int dir = (ccb_h->flags & CAM_DIR_MASK); + int dir = (ccb_h->flags & CAM_DIR_MASK); if (dir == CAM_DIR_NONE) goto end_data; Then we check if the data is in one chunk or in a scatter-gather list, and whether the addresses are physical or virtual. The SCSI controller may be able to handle only a limited number of chunks of limited length. If the request hits this limitation we return an error. We use a special function to return the CCB, so that the HCB resource shortages are handled in one place. The functions to add chunks are driver-dependent, and here we leave them without detailed implementation. See the description of the SCSI command (CDB) handling for the details on the address-translation issues. If some variation is too difficult or impossible to implement with a particular card it is OK to return the status CAM_REQ_INVALID. Actually, it seems like the scatter-gather ability is not used anywhere in the CAM code now. But at least the case of a single non-scattered virtual buffer must be implemented, as it is actively used by CAM.
int rv; initialize_hcb_for_data(hcb); - if(!(ccb_h->flags & CAM_SCATTER_VALID)) { + if(!(ccb_h->flags & CAM_SCATTER_VALID)) { /* single buffer */ - if(!(ccb_h->flags & CAM_DATA_PHYS)) { - rv = add_virtual_chunk(hcb, csio->data_ptr, csio->dxfer_len, dir); + if(!(ccb_h->flags & CAM_DATA_PHYS)) { + rv = add_virtual_chunk(hcb, csio->data_ptr, csio->dxfer_len, dir); } else { - rv = add_physical_chunk(hcb, csio->data_ptr, csio->dxfer_len, dir); + rv = add_physical_chunk(hcb, csio->data_ptr, csio->dxfer_len, dir); } } else { int i; struct bus_dma_segment *segs; - segs = (struct bus_dma_segment *)csio->data_ptr; + segs = (struct bus_dma_segment *)csio->data_ptr; - if ((ccb_h->flags & CAM_SG_LIST_PHYS) != 0) { + if ((ccb_h->flags & CAM_SG_LIST_PHYS) != 0) { /* The SG list pointer is physical */ - rv = setup_hcb_for_physical_sg_list(hcb, segs, csio->sglist_cnt); - } else if (!(ccb_h->flags & CAM_DATA_PHYS)) { + rv = setup_hcb_for_physical_sg_list(hcb, segs, csio->sglist_cnt); + } else if (!(ccb_h->flags & CAM_DATA_PHYS)) { /* SG buffer pointers are virtual */ - for (i = 0; i < csio->sglist_cnt; i++) { + for (i = 0; i < csio->sglist_cnt; i++) { rv = add_virtual_chunk(hcb, segs[i].ds_addr, segs[i].ds_len, dir); if (rv != CAM_REQ_CMP) break; } } else { /* SG buffer pointers are physical */ - for (i = 0; i < csio->sglist_cnt; i++) { + for (i = 0; i < csio->sglist_cnt; i++) { rv = add_physical_chunk(hcb, segs[i].ds_addr, segs[i].ds_len, dir); if (rv != CAM_REQ_CMP) break; } } } if(rv != CAM_REQ_CMP) { /* we expect that add_*_chunk() functions return CAM_REQ_CMP * if they added a chunk successfully, CAM_REQ_TOO_BIG if * the request is too big (too many bytes or too many chunks), * CAM_REQ_INVALID in case of other troubles */ free_hcb_and_ccb_done(hcb, ccb, rv); return; } end_data: If disconnection is disabled for this CCB we pass this information to the hcb: - if(ccb_h->flags & CAM_DIS_DISCONNECT) + if(ccb_h->flags & CAM_DIS_DISCONNECT) hcb_disable_disconnect(hcb); If the controller is able to run the REQUEST SENSE command all by itself then the value of the flag CAM_DIS_AUTOSENSE should also be passed to it, to prevent automatic REQUEST SENSE if the CAM subsystem does not want it. The only thing left is to set up the timeout, pass our hcb to the hardware and return; the rest will be done by the interrupt handler (or timeout handler).
- ccb_h->timeout_ch = timeout(xxx_timeout, (caddr_t) hcb, - (ccb_h->timeout * hz) / 1000); /* convert milliseconds to ticks */ + ccb_h->timeout_ch = timeout(xxx_timeout, (caddr_t) hcb, + (ccb_h->timeout * hz) / 1000); /* convert milliseconds to ticks */ put_hcb_into_hardware_queue(hcb); return; And here is a possible implementation of the function returning CCB: static void free_hcb_and_ccb_done(struct xxx_hcb *hcb, union ccb *ccb, u_int32_t status) { - struct xxx_softc *softc = hcb->softc; + struct xxx_softc *softc = hcb->softc; - ccb->ccb_h.ccb_hcb = 0; + ccb->ccb_h.ccb_hcb = 0; if(hcb != NULL) { - untimeout(xxx_timeout, (caddr_t) hcb, ccb->ccb_h.timeout_ch); + untimeout(xxx_timeout, (caddr_t) hcb, ccb->ccb_h.timeout_ch); /* we're about to free a hcb, so the shortage has ended */ - if(softc->flags & RESOURCE_SHORTAGE) { - softc->flags &= ~RESOURCE_SHORTAGE; + if(softc->flags & RESOURCE_SHORTAGE) { + softc->flags &= ~RESOURCE_SHORTAGE; status |= CAM_RELEASE_SIMQ; } free_hcb(hcb); /* also removes hcb from any internal lists */ } - ccb->ccb_h.status = status | - (ccb->ccb_h.status & ~(CAM_STATUS_MASK|CAM_SIM_QUEUED)); + ccb->ccb_h.status = status | + (ccb->ccb_h.status & ~(CAM_STATUS_MASK|CAM_SIM_QUEUED)); xpt_done(ccb); } XPT_RESET_DEV - send the SCSI BUS DEVICE RESET message to a device No data is transferred in the CCB except the header, and its most interesting argument is target_id. Depending on the controller hardware, a hardware control block just like for the XPT_SCSI_IO request may be constructed (see the XPT_SCSI_IO request description) and sent to the controller, or the SCSI controller may be immediately programmed to send this RESET message to the device, or this request may simply not be supported (and return the status CAM_REQ_INVALID). Also on completion of the request all the disconnected transactions for this target must be aborted (probably in the interrupt routine). Also all the current negotiations for the target are lost on reset, so they might be cleaned too. Or their clearing may be deferred, because the target would request re-negotiation on the next transaction anyway. XPT_RESET_BUS - send the RESET signal to the SCSI bus No arguments are passed in the CCB; the only interesting argument is the SCSI bus indicated by the struct sim pointer. A minimalistic implementation would forget the SCSI negotiations for all the devices on the bus and return the status CAM_REQ_CMP. The proper implementation would in addition actually reset the SCSI bus (possibly also reset the SCSI controller) and mark all the CCBs being processed, both those in the hardware queue and those being disconnected, as done with the status CAM_SCSI_BUS_RESET. Like: int targ, lun; struct xxx_hcb *h, *hh; struct ccb_trans_settings neg; struct cam_path *path; /* The SCSI bus reset may take a long time; in this case its completion * should be checked by interrupt or timeout. But for simplicity * we assume here that it is really fast.
*/ reset_scsi_bus(softc); /* drop all enqueued CCBs */ - for(h = softc->first_queued_hcb; h != NULL; h = hh) { - hh = h->next; - free_hcb_and_ccb_done(h, h->ccb, CAM_SCSI_BUS_RESET); + for(h = softc->first_queued_hcb; h != NULL; h = hh) { + hh = h->next; + free_hcb_and_ccb_done(h, h->ccb, CAM_SCSI_BUS_RESET); } /* the clean values of negotiations to report */ neg.bus_width = 8; neg.sync_period = neg.sync_offset = 0; neg.valid = (CCB_TRANS_BUS_WIDTH_VALID | CCB_TRANS_SYNC_RATE_VALID | CCB_TRANS_SYNC_OFFSET_VALID); /* drop all disconnected CCBs and clean negotiations */ - for(targ=0; targ <= OUR_MAX_SUPPORTED_TARGET; targ++) { + for(targ=0; targ <= OUR_MAX_SUPPORTED_TARGET; targ++) { clean_negotiations(softc, targ); /* report the event if possible */ if(xpt_create_path(&path, /*periph*/NULL, cam_sim_path(sim), targ, CAM_LUN_WILDCARD) == CAM_REQ_CMP) { xpt_async(AC_TRANSFER_NEG, path, &neg); xpt_free_path(path); } - for(lun=0; lun <= OUR_MAX_SUPPORTED_LUN; lun++) - for(h = softc->first_discon_hcb[targ][lun]; h != NULL; h = hh) { - hh=h->next; - free_hcb_and_ccb_done(h, h->ccb, CAM_SCSI_BUS_RESET); + for(lun=0; lun <= OUR_MAX_SUPPORTED_LUN; lun++) + for(h = softc->first_discon_hcb[targ][lun]; h != NULL; h = hh) { + hh=h->next; + free_hcb_and_ccb_done(h, h->ccb, CAM_SCSI_BUS_RESET); } } - ccb->ccb_h.status = CAM_REQ_CMP; + ccb->ccb_h.status = CAM_REQ_CMP; xpt_done(ccb); /* report the event */ - xpt_async(AC_BUS_RESET, softc->wpath, NULL); + xpt_async(AC_BUS_RESET, softc->wpath, NULL); return; Implementing the SCSI bus reset as a function may be a good idea because it would be re-used by the timeout function as a last resort if the things go wrong. XPT_ABORT - abort the specified CCB The arguments are transferred in the instance struct ccb_abort cab of the union ccb. The only argument field in it is: abort_ccb - pointer to the CCB to be aborted If the abort is not supported just return the status CAM_UA_ABORT. This is also the easy way to minimally implement this call, return CAM_UA_ABORT in any case. The hard way is to implement this request honestly. First check that abort applies to a SCSI transaction: struct ccb *abort_ccb; - abort_ccb = ccb->cab.abort_ccb; + abort_ccb = ccb->cab.abort_ccb; - if(abort_ccb->ccb_h.func_code != XPT_SCSI_IO) { - ccb->ccb_h.status = CAM_UA_ABORT; + if(abort_ccb->ccb_h.func_code != XPT_SCSI_IO) { + ccb->ccb_h.status = CAM_UA_ABORT; xpt_done(ccb); return; } Then it is necessary to find this CCB in our queue. This can be done by walking the list of all our hardware control blocks in search for one associated with this CCB: struct xxx_hcb *hcb, *h; hcb = NULL; - /* We assume that softc->first_hcb is the head of the list of all + /* We assume that softc->first_hcb is the head of the list of all * HCBs associated with this bus, including those enqueued for * processing, being processed by hardware and disconnected ones. */ - for(h = softc->first_hcb; h != NULL; h = h->next) { - if(h->ccb == abort_ccb) { + for(h = softc->first_hcb; h != NULL; h = h->next) { + if(h->ccb == abort_ccb) { hcb = h; break; } } if(hcb == NULL) { /* no such CCB in our queue */ - ccb->ccb_h.status = CAM_PATH_INVALID; + ccb->ccb_h.status = CAM_PATH_INVALID; xpt_done(ccb); return; } hcb=found_hcb; Now we look at the current processing status of the HCB. It may be either sitting in the queue waiting to be sent to the SCSI bus, being transferred right now, or disconnected and waiting for the result of the command, or actually completed by hardware but not yet marked as done by software. 
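These four states might be written down as an enumeration like the following sketch (the constant names are the ones used in the switch below; representing the state as a value returned by get_hcb_status() is an assumption consistent with that code):

/* processing states of an HCB, as returned by get_hcb_status() */
enum hcb_state {
	HCB_SITTING_IN_QUEUE,	/* enqueued, not sent to the SCSI bus yet */
	HCB_BEING_TRANSFERRED,	/* being transferred on the bus right now */
	HCB_DISCONNECTED,	/* disconnected, waiting for the result */
	HCB_COMPLETED		/* done by hardware, not yet done by software */
};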
To make sure that we do not get in any races with hardware we mark the HCB as being aborted, so that if this HCB is about to be sent to the SCSI bus the SCSI controller will see this flag and skip it. int hstatus; /* shown as a function, in case special action is needed to make * this flag visible to hardware */ set_hcb_flags(hcb, HCB_BEING_ABORTED); abort_again: hstatus = get_hcb_status(hcb); switch(hstatus) { case HCB_SITTING_IN_QUEUE: remove_hcb_from_hardware_queue(hcb); /* FALLTHROUGH */ case HCB_COMPLETED: /* this is an easy case */ free_hcb_and_ccb_done(hcb, abort_ccb, CAM_REQ_ABORTED); break; If the CCB is being transferred right now we would like to signal to the SCSI controller in some hardware-dependent way that we want to abort the current transfer. The SCSI controller would set the SCSI ATTENTION signal and when the target responds to it send an ABORT message. We also reset the timeout to make sure that the target is not sleeping forever. If the command would not get aborted in some reasonable time like 10 seconds the timeout routine would go ahead and reset the whole SCSI bus. Because the command will be aborted in some reasonable time we can just return the abort request now as successfully completed, and mark the aborted CCB as aborted (but not mark it as done yet). case HCB_BEING_TRANSFERRED: - untimeout(xxx_timeout, (caddr_t) hcb, abort_ccb->ccb_h.timeout_ch); - abort_ccb->ccb_h.timeout_ch = + untimeout(xxx_timeout, (caddr_t) hcb, abort_ccb->ccb_h.timeout_ch); + abort_ccb->ccb_h.timeout_ch = timeout(xxx_timeout, (caddr_t) hcb, 10 * hz); - abort_ccb->ccb_h.status = CAM_REQ_ABORTED; + abort_ccb->ccb_h.status = CAM_REQ_ABORTED; /* ask the controller to abort that HCB, then generate * an interrupt and stop */ - if(signal_hardware_to_abort_hcb_and_stop(hcb) < 0) { + if(signal_hardware_to_abort_hcb_and_stop(hcb) < 0) { /* oops, we missed the race with hardware, this transaction * got off the bus before we aborted it, try again */ goto abort_again; } break; If the CCB is in the list of disconnected then set it up as an abort request and re-queue it at the front of hardware queue. Reset the timeout and report the abort request to be completed. case HCB_DISCONNECTED: - untimeout(xxx_timeout, (caddr_t) hcb, abort_ccb->ccb_h.timeout_ch); - abort_ccb->ccb_h.timeout_ch = + untimeout(xxx_timeout, (caddr_t) hcb, abort_ccb->ccb_h.timeout_ch); + abort_ccb->ccb_h.timeout_ch = timeout(xxx_timeout, (caddr_t) hcb, 10 * hz); put_abort_message_into_hcb(hcb); put_hcb_at_the_front_of_hardware_queue(hcb); break; } - ccb->ccb_h.status = CAM_REQ_CMP; + ccb->ccb_h.status = CAM_REQ_CMP; xpt_done(ccb); return; That is all for the ABORT request, although there is one more issue. Because the ABORT message cleans all the ongoing transactions on a LUN we have to mark all the other active transactions on this LUN as aborted. That should be done in the interrupt routine, after the transaction gets aborted. Implementing the CCB abort as a function may be quite a good idea, this function can be re-used if an I/O transaction times out. The only difference would be that the timed out transaction would return the status CAM_CMD_TIMEOUT for the timed out request. 
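Such a reusable abort function might have a shape like this sketch (the name xxx_abort_ccb and the convention of returning a negative value when the CCB is not known to the driver are assumptions, chosen to match the fragment that follows):

/* abort the transaction associated with abort_ccb and complete it
 * with the given status (CAM_REQ_ABORTED or CAM_CMD_TIMEOUT);
 * returns < 0 if no such CCB is in our queue */
static int xxx_abort_ccb(union ccb *abort_ccb, u_int32_t status);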
Then the case XPT_ABORT would be small, like this: case XPT_ABORT: struct ccb *abort_ccb; - abort_ccb = ccb->cab.abort_ccb; + abort_ccb = ccb->cab.abort_ccb; - if(abort_ccb->ccb_h.func_code != XPT_SCSI_IO) { - ccb->ccb_h.status = CAM_UA_ABORT; + if(abort_ccb->ccb_h.func_code != XPT_SCSI_IO) { + ccb->ccb_h.status = CAM_UA_ABORT; xpt_done(ccb); return; } - if(xxx_abort_ccb(abort_ccb, CAM_REQ_ABORTED) < 0) + if(xxx_abort_ccb(abort_ccb, CAM_REQ_ABORTED) < 0) /* no such CCB in our queue */ - ccb->ccb_h.status = CAM_PATH_INVALID; + ccb->ccb_h.status = CAM_PATH_INVALID; else - ccb->ccb_h.status = CAM_REQ_CMP; + ccb->ccb_h.status = CAM_REQ_CMP; xpt_done(ccb); return; XPT_SET_TRAN_SETTINGS - explicitly set values of SCSI transfer settings The arguments are transferred in the instance struct ccb_trans_setting cts of the union ccb: valid - a bitmask showing which settings should be updated: CCB_TRANS_SYNC_RATE_VALID - synchronous transfer rate CCB_TRANS_SYNC_OFFSET_VALID - synchronous offset CCB_TRANS_BUS_WIDTH_VALID - bus width CCB_TRANS_DISC_VALID - set enable/disable disconnection CCB_TRANS_TQ_VALID - set enable/disable tagged queuing flags - consists of two parts, binary arguments and identification of sub-operations. The binary arguments are: CCB_TRANS_DISC_ENB - enable disconnection CCB_TRANS_TAG_ENB - enable tagged queuing The sub-operations are: CCB_TRANS_CURRENT_SETTINGS - change the current negotiations CCB_TRANS_USER_SETTINGS - remember the desired user values sync_period, sync_offset - self-explanatory; if sync_offset==0 then the asynchronous mode is requested bus_width - bus width, in bits (not bytes) Two sets of negotiated parameters are supported: the user settings and the current settings. The user settings are not really used much in the SIM drivers; this is mostly just a piece of memory where the upper levels can store (and later recall) their ideas about the parameters. Setting the user parameters does not cause re-negotiation of the transfer rates. But when the SCSI controller does a negotiation it must never set the values higher than the user parameters, so they are essentially the upper boundary. The current settings are, as the name says, current. Changing them means that the parameters must be re-negotiated on the next transfer. Again, these new current settings are not supposed to be forced on the device; they are just used as the initial step of negotiations. Also they must be limited by the actual capabilities of the SCSI controller: for example, if the SCSI controller has an 8-bit bus and the request asks to set 16-bit wide transfers, this parameter must be silently truncated to 8-bit transfers before sending it to the device. One caveat is that the bus width and synchronous parameters are per target while the disconnection and tag enabling parameters are per LUN.
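In the softc this split might be laid out like the following sketch (the field names mirror the fragment below; the exact types and array bounds are assumptions):

/* per-target negotiation parameters */
u_int8_t user_sync_period[OUR_MAX_SUPPORTED_TARGET + 1];
u_int8_t user_sync_offset[OUR_MAX_SUPPORTED_TARGET + 1];
u_int8_t user_bus_width[OUR_MAX_SUPPORTED_TARGET + 1];
/* per-target, per-LUN disconnection and tagged queuing flags */
u_int8_t user_tflags[OUR_MAX_SUPPORTED_TARGET + 1][OUR_MAX_SUPPORTED_LUN + 1];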
The recommended implementation is to keep 3 sets of negotiated (bus width and synchronous transfer) parameters: user - the user set, as above current - those actually in effect goal - those requested by setting of the current parameters The code looks like: struct ccb_trans_settings *cts; int targ, lun; int flags; - cts = &ccb->cts; - targ = ccb_h->target_id; - lun = ccb_h->target_lun; - flags = cts->flags; + cts = &ccb->cts; + targ = ccb_h->target_id; + lun = ccb_h->target_lun; + flags = cts->flags; if(flags & CCB_TRANS_USER_SETTINGS) { if(flags & CCB_TRANS_SYNC_RATE_VALID) - softc->user_sync_period[targ] = cts->sync_period; + softc->user_sync_period[targ] = cts->sync_period; if(flags & CCB_TRANS_SYNC_OFFSET_VALID) - softc->user_sync_offset[targ] = cts->sync_offset; + softc->user_sync_offset[targ] = cts->sync_offset; if(flags & CCB_TRANS_BUS_WIDTH_VALID) - softc->user_bus_width[targ] = cts->bus_width; + softc->user_bus_width[targ] = cts->bus_width; if(flags & CCB_TRANS_DISC_VALID) { - softc->user_tflags[targ][lun] &= ~CCB_TRANS_DISC_ENB; - softc->user_tflags[targ][lun] |= flags & CCB_TRANS_DISC_ENB; + softc->user_tflags[targ][lun] &= ~CCB_TRANS_DISC_ENB; + softc->user_tflags[targ][lun] |= flags & CCB_TRANS_DISC_ENB; } if(flags & CCB_TRANS_TQ_VALID) { - softc->user_tflags[targ][lun] &= ~CCB_TRANS_TQ_ENB; - softc->user_tflags[targ][lun] |= flags & CCB_TRANS_TQ_ENB; + softc->user_tflags[targ][lun] &= ~CCB_TRANS_TQ_ENB; + softc->user_tflags[targ][lun] |= flags & CCB_TRANS_TQ_ENB; } } if(flags & CCB_TRANS_CURRENT_SETTINGS) { if(flags & CCB_TRANS_SYNC_RATE_VALID) - softc->goal_sync_period[targ] = - max(cts->sync_period, OUR_MIN_SUPPORTED_PERIOD); + softc->goal_sync_period[targ] = + max(cts->sync_period, OUR_MIN_SUPPORTED_PERIOD); if(flags & CCB_TRANS_SYNC_OFFSET_VALID) - softc->goal_sync_offset[targ] = - min(cts->sync_offset, OUR_MAX_SUPPORTED_OFFSET); + softc->goal_sync_offset[targ] = + min(cts->sync_offset, OUR_MAX_SUPPORTED_OFFSET); if(flags & CCB_TRANS_BUS_WIDTH_VALID) - softc->goal_bus_width[targ] = min(cts->bus_width, OUR_BUS_WIDTH); + softc->goal_bus_width[targ] = min(cts->bus_width, OUR_BUS_WIDTH); if(flags & CCB_TRANS_DISC_VALID) { - softc->current_tflags[targ][lun] &= ~CCB_TRANS_DISC_ENB; - softc->current_tflags[targ][lun] |= flags & CCB_TRANS_DISC_ENB; + softc->current_tflags[targ][lun] &= ~CCB_TRANS_DISC_ENB; + softc->current_tflags[targ][lun] |= flags & CCB_TRANS_DISC_ENB; } if(flags & CCB_TRANS_TQ_VALID) { - softc->current_tflags[targ][lun] &= ~CCB_TRANS_TQ_ENB; - softc->current_tflags[targ][lun] |= flags & CCB_TRANS_TQ_ENB; + softc->current_tflags[targ][lun] &= ~CCB_TRANS_TQ_ENB; + softc->current_tflags[targ][lun] |= flags & CCB_TRANS_TQ_ENB; } } - ccb->ccb_h.status = CAM_REQ_CMP; + ccb->ccb_h.status = CAM_REQ_CMP; xpt_done(ccb); return; Then when the next I/O request will be processed it will check if it has to re-negotiate, for example by calling the function target_negotiated(hcb). 
It can be implemented like this: int target_negotiated(struct xxx_hcb *hcb) { - struct softc *softc = hcb->softc; - int targ = hcb->targ; + struct softc *softc = hcb->softc; + int targ = hcb->targ; - if( softc->current_sync_period[targ] != softc->goal_sync_period[targ] - || softc->current_sync_offset[targ] != softc->goal_sync_offset[targ] - || softc->current_bus_width[targ] != softc->goal_bus_width[targ] ) + if( softc->current_sync_period[targ] != softc->goal_sync_period[targ] + || softc->current_sync_offset[targ] != softc->goal_sync_offset[targ] + || softc->current_bus_width[targ] != softc->goal_bus_width[targ] ) return 0; /* FALSE */ else return 1; /* TRUE */ } After the values are re-negotiated the resulting values must be assigned to both the current and goal parameters, so for future I/O transactions the current and goal parameters would be the same and target_negotiated() would return TRUE. When the card is initialized (in xxx_attach()) the current negotiation values must be initialized to narrow asynchronous mode, and the user and goal values must be initialized to the maximal values supported by the controller. XPT_GET_TRAN_SETTINGS - get values of SCSI transfer settings This operation is the reverse of XPT_SET_TRAN_SETTINGS. Fill up the CCB instance struct ccb_trans_setting cts with data as requested by the flags CCB_TRANS_CURRENT_SETTINGS or CCB_TRANS_USER_SETTINGS (if both are set then the existing drivers return the current settings). Set all the bits in the valid field. BIOS XPT_CALC_GEOMETRY - calculate logical (BIOS) geometry of the disk The arguments are transferred in the instance struct ccb_calc_geometry ccg of the union ccb: block_size - input, block (a.k.a. sector) size in bytes volume_size - input, volume size in blocks cylinders - output, logical cylinders heads - output, logical heads secs_per_track - output, logical sectors per track SCSIBIOS If the returned geometry differs enough from what the SCSI controller BIOS expects and a disk on this SCSI controller is used as a boot disk, the system may not be able to boot. The typical calculation example taken from the aic7xxx driver is: struct ccb_calc_geometry *ccg; u_int32_t size_mb; u_int32_t secs_per_cylinder; int extended; - ccg = &ccb->ccg; - size_mb = ccg->volume_size - / ((1024L * 1024L) / ccg->block_size); + ccg = &ccb->ccg; + size_mb = ccg->volume_size + / ((1024L * 1024L) / ccg->block_size); extended = check_cards_EEPROM_for_extended_geometry(softc); - if (size_mb > 1024 && extended) { - ccg->heads = 255; - ccg->secs_per_track = 63; + if (size_mb > 1024 && extended) { + ccg->heads = 255; + ccg->secs_per_track = 63; } else { - ccg->heads = 64; - ccg->secs_per_track = 32; + ccg->heads = 64; + ccg->secs_per_track = 32; } - secs_per_cylinder = ccg->heads * ccg->secs_per_track; - ccg->cylinders = ccg->volume_size / secs_per_cylinder; - ccb->ccb_h.status = CAM_REQ_CMP; + secs_per_cylinder = ccg->heads * ccg->secs_per_track; + ccg->cylinders = ccg->volume_size / secs_per_cylinder; + ccb->ccb_h.status = CAM_REQ_CMP; xpt_done(ccb); return; This gives the general idea; the exact calculation depends on the quirks of the particular BIOS. If the BIOS provides no way to set the extended translation flag in EEPROM, this flag should normally be assumed equal to 1.
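As a worked example with assumed numbers: for a volume of 8388608 blocks of 512 bytes each, size_mb = 8388608 / ((1024 * 1024) / 512) = 4096; that is greater than 1024, so with extended translation the geometry becomes 255 heads and 63 sectors per track, giving secs_per_cylinder = 255 * 63 = 16065 and cylinders = 8388608 / 16065 = 522.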
Other popular geometries are: 128 heads, 63 sectors - Symbios controllers 16 heads, 63 sectors - old controllers Some system BIOSes and SCSI BIOSes fight with each other with variable success, for example a combination of Symbios 875/895 SCSI and Phoenix BIOS can give geometry 128/63 after power up and 255/63 after a hard reset or soft reboot. XPT_PATH_INQ - path inquiry, in other words get the SIM driver and SCSI controller (also known as HBA - Host Bus Adapter) properties The properties are returned in the instance struct ccb_pathinq cpi of the union ccb: version_num - the SIM driver version number, now all drivers use 1 hba_inquiry - bitmask of features supported by the controller: PI_MDP_ABLE - supports MDP message (something from SCSI3?) PI_WIDE_32 - supports 32 bit wide SCSI PI_WIDE_16 - supports 16 bit wide SCSI PI_SDTR_ABLE - can negotiate synchronous transfer rate PI_LINKED_CDB - supports linked commands PI_TAG_ABLE - supports tagged commands PI_SOFT_RST - supports soft reset alternative (hard reset and soft reset are mutually exclusive within a SCSI bus) target_sprt - flags for target mode support, 0 if unsupported hba_misc - miscellaneous controller features: PIM_SCANHILO - bus scans from high ID to low ID PIM_NOREMOVE - removable devices not included in scan PIM_NOINITIATOR - initiator role not supported PIM_NOBUSRESET - user has disabled initial BUS RESET hba_eng_cnt - mysterious HBA engine count, something related to compression, now is always set to 0 vuhba_flags - vendor-unique flags, unused now max_target - maximal supported target ID (7 for 8-bit bus, 15 for 16-bit bus, 127 for Fibre Channel) max_lun - maximal supported LUN ID (7 for older SCSI controllers, 63 for newer ones) async_flags - bitmask of installed Async handler, unused now hpath_id - highest Path ID in the subsystem, unused now unit_number - the controller unit number, cam_sim_unit(sim) bus_id - the bus number, cam_sim_bus(sim) initiator_id - the SCSI ID of the controller itself base_transfer_speed - nominal transfer speed in KB/s for asynchronous narrow transfers, equals to 3300 for SCSI sim_vid - SIM driver's vendor id, a zero-terminated string of maximal length SIM_IDLEN including the terminating zero hba_vid - SCSI controller's vendor id, a zero-terminated string of maximal length HBA_IDLEN including the terminating zero dev_name - device driver name, a zero-terminated string of maximal length DEV_IDLEN including the terminating zero, equal to cam_sim_name(sim) The recommended way of setting the string fields is using strncpy, like: - strncpy(cpi->dev_name, cam_sim_name(sim), DEV_IDLEN); + strncpy(cpi->dev_name, cam_sim_name(sim), DEV_IDLEN); After setting the values set the status to CAM_REQ_CMP and mark the CCB as done. Polling static void xxx_poll struct cam_sim *sim The poll function is used to simulate the interrupts when the interrupt subsystem is not functioning (for example, when the system has crashed and is creating the system dump). The CAM subsystem sets the proper interrupt level before calling the poll routine. So all it needs to do is to call the interrupt routine (or the other way around, the poll routine may be doing the real action and the interrupt routine would just call the poll routine). Why bother about a separate function then? Because of different calling conventions. 
The xxx_poll routine gets the struct cam_sim pointer as its argument when the PCI interrupt routine by common convention gets pointer to the struct xxx_softc and the ISA interrupt routine gets just the device unit number. So the poll routine would normally look as: static void xxx_poll(struct cam_sim *sim) { xxx_intr((struct xxx_softc *)cam_sim_softc(sim)); /* for PCI device */ } or static void xxx_poll(struct cam_sim *sim) { xxx_intr(cam_sim_unit(sim)); /* for ISA device */ } Asynchronous Events If an asynchronous event callback has been set up then the callback function should be defined. static void ahc_async(void *callback_arg, u_int32_t code, struct cam_path *path, void *arg) callback_arg - the value supplied when registering the callback code - identifies the type of event path - identifies the devices to which the event applies arg - event-specific argument Implementation for a single type of event, AC_LOST_DEVICE, looks like: struct xxx_softc *softc; struct cam_sim *sim; int targ; struct ccb_trans_settings neg; sim = (struct cam_sim *)callback_arg; softc = (struct xxx_softc *)cam_sim_softc(sim); switch (code) { case AC_LOST_DEVICE: targ = xpt_path_target_id(path); - if(targ <= OUR_MAX_SUPPORTED_TARGET) { + if(targ <= OUR_MAX_SUPPORTED_TARGET) { clean_negotiations(softc, targ); /* send indication to CAM */ neg.bus_width = 8; neg.sync_period = neg.sync_offset = 0; neg.valid = (CCB_TRANS_BUS_WIDTH_VALID | CCB_TRANS_SYNC_RATE_VALID | CCB_TRANS_SYNC_OFFSET_VALID); xpt_async(AC_TRANSFER_NEG, path, &neg); } break; default: break; } Interrupts SCSIinterrupts The exact type of the interrupt routine depends on the type of the peripheral bus (PCI, ISA and so on) to which the SCSI controller is connected. The interrupt routines of the SIM drivers run at the interrupt level splcam. So splcam() should be used in the driver to synchronize activity between the interrupt routine and the rest of the driver (for a multiprocessor-aware driver things get yet more interesting but we ignore this case here). The pseudo-code in this document happily ignores the problems of synchronization. The real code must not ignore them. A simple-minded approach is to set splcam() on the entry to the other routines and reset it on return thus protecting them by one big critical section. To make sure that the interrupt level will be always restored a wrapper function can be defined, like: static void xxx_action(struct cam_sim *sim, union ccb *ccb) { int s; s = splcam(); xxx_action1(sim, ccb); splx(s); } static void xxx_action1(struct cam_sim *sim, union ccb *ccb) { ... process the request ... } This approach is simple and robust but the problem with it is that interrupts may get blocked for a relatively long time and this would negatively affect the system's performance. On the other hand the functions of the spl() family have rather high overhead, so vast amount of tiny critical sections may not be good either. The conditions handled by the interrupt routine and the details depend very much on the hardware. We consider the set of typical conditions. First, we check if a SCSI reset was encountered on the bus (probably caused by another SCSI controller on the same SCSI bus). If so we drop all the enqueued and disconnected requests, report the events and re-initialize our SCSI controller. It is important that during this initialization the controller will not issue another reset or else two controllers on the same SCSI bus could ping-pong resets forever. 
The case of fatal controller error/hang could be handled in the same place, but it will probably need also sending RESET signal to the SCSI bus to reset the status of the connections with the SCSI devices. int fatal=0; struct ccb_trans_settings neg; struct cam_path *path; if( detected_scsi_reset(softc) || (fatal = detected_fatal_controller_error(softc)) ) { int targ, lun; struct xxx_hcb *h, *hh; /* drop all enqueued CCBs */ - for(h = softc->first_queued_hcb; h != NULL; h = hh) { - hh = h->next; - free_hcb_and_ccb_done(h, h->ccb, CAM_SCSI_BUS_RESET); + for(h = softc->first_queued_hcb; h != NULL; h = hh) { + hh = h->next; + free_hcb_and_ccb_done(h, h->ccb, CAM_SCSI_BUS_RESET); } /* the clean values of negotiations to report */ neg.bus_width = 8; neg.sync_period = neg.sync_offset = 0; neg.valid = (CCB_TRANS_BUS_WIDTH_VALID | CCB_TRANS_SYNC_RATE_VALID | CCB_TRANS_SYNC_OFFSET_VALID); /* drop all disconnected CCBs and clean negotiations */ - for(targ=0; targ <= OUR_MAX_SUPPORTED_TARGET; targ++) { + for(targ=0; targ <= OUR_MAX_SUPPORTED_TARGET; targ++) { clean_negotiations(softc, targ); /* report the event if possible */ if(xpt_create_path(&path, /*periph*/NULL, cam_sim_path(sim), targ, CAM_LUN_WILDCARD) == CAM_REQ_CMP) { xpt_async(AC_TRANSFER_NEG, path, &neg); xpt_free_path(path); } - for(lun=0; lun <= OUR_MAX_SUPPORTED_LUN; lun++) - for(h = softc->first_discon_hcb[targ][lun]; h != NULL; h = hh) { - hh=h->next; + for(lun=0; lun <= OUR_MAX_SUPPORTED_LUN; lun++) + for(h = softc->first_discon_hcb[targ][lun]; h != NULL; h = hh) { + hh=h->next; if(fatal) - free_hcb_and_ccb_done(h, h->ccb, CAM_UNREC_HBA_ERROR); + free_hcb_and_ccb_done(h, h->ccb, CAM_UNREC_HBA_ERROR); else - free_hcb_and_ccb_done(h, h->ccb, CAM_SCSI_BUS_RESET); + free_hcb_and_ccb_done(h, h->ccb, CAM_SCSI_BUS_RESET); } } /* report the event */ - xpt_async(AC_BUS_RESET, softc->wpath, NULL); + xpt_async(AC_BUS_RESET, softc->wpath, NULL); /* re-initialization may take a lot of time, in such case * its completion should be signaled by another interrupt or * checked on timeout - but for simplicity we assume here that * it is really fast */ if(!fatal) { reinitialize_controller_without_scsi_reset(softc); } else { reinitialize_controller_with_scsi_reset(softc); } schedule_next_hcb(softc); return; } If interrupt is not caused by a controller-wide condition then probably something has happened to the current hardware control block. Depending on the hardware there may be other non-HCB-related events, we just do not consider them here. Then we analyze what happened to this HCB: struct xxx_hcb *hcb, *h, *hh; int hcb_status, scsi_status; int ccb_status; int targ; int lun_to_freeze; hcb = get_current_hcb(softc); if(hcb == NULL) { /* either stray interrupt or something went very wrong * or this is something hardware-dependent */ handle as necessary; return; } - targ = hcb->target; + targ = hcb->target; hcb_status = get_status_of_current_hcb(softc); First we check if the HCB has completed and if so we check the returned SCSI status. if(hcb_status == COMPLETED) { scsi_status = get_completion_status(hcb); Then look if this status is related to the REQUEST SENSE command and if so handle it in a simple way. 
- if(hcb->flags & DOING_AUTOSENSE) { + if(hcb->flags & DOING_AUTOSENSE) { if(scsi_status == GOOD) { /* autosense was successful */ - hcb->ccb->ccb_h.status |= CAM_AUTOSNS_VALID; - free_hcb_and_ccb_done(hcb, hcb->ccb, CAM_SCSI_STATUS_ERROR); + hcb->ccb->ccb_h.status |= CAM_AUTOSNS_VALID; + free_hcb_and_ccb_done(hcb, hcb->ccb, CAM_SCSI_STATUS_ERROR); } else { autosense_failed: - free_hcb_and_ccb_done(hcb, hcb->ccb, CAM_AUTOSENSE_FAIL); + free_hcb_and_ccb_done(hcb, hcb->ccb, CAM_AUTOSENSE_FAIL); } schedule_next_hcb(softc); return; } Else the command itself has completed, pay more attention to details. If auto-sense is not disabled for this CCB and the command has failed with sense data then run REQUEST SENSE command to receive that data. - hcb->ccb->csio.scsi_status = scsi_status; + hcb->ccb->csio.scsi_status = scsi_status; calculate_residue(hcb); - if( (hcb->ccb->ccb_h.flags & CAM_DIS_AUTOSENSE)==0 + if( (hcb->ccb->ccb_h.flags & CAM_DIS_AUTOSENSE)==0 && ( scsi_status == CHECK_CONDITION || scsi_status == COMMAND_TERMINATED) ) { /* start auto-SENSE */ - hcb->flags |= DOING_AUTOSENSE; + hcb->flags |= DOING_AUTOSENSE; setup_autosense_command_in_hcb(hcb); restart_current_hcb(softc); return; } if(scsi_status == GOOD) - free_hcb_and_ccb_done(hcb, hcb->ccb, CAM_REQ_CMP); + free_hcb_and_ccb_done(hcb, hcb->ccb, CAM_REQ_CMP); else - free_hcb_and_ccb_done(hcb, hcb->ccb, CAM_SCSI_STATUS_ERROR); + free_hcb_and_ccb_done(hcb, hcb->ccb, CAM_SCSI_STATUS_ERROR); schedule_next_hcb(softc); return; } One typical thing would be negotiation events: negotiation messages received from a SCSI target (in answer to our negotiation attempt or by target's initiative) or the target is unable to negotiate (rejects our negotiation messages or does not answer them). switch(hcb_status) { case TARGET_REJECTED_WIDE_NEG: /* revert to 8-bit bus */ - softc->current_bus_width[targ] = softc->goal_bus_width[targ] = 8; + softc->current_bus_width[targ] = softc->goal_bus_width[targ] = 8; /* report the event */ neg.bus_width = 8; neg.valid = CCB_TRANS_BUS_WIDTH_VALID; - xpt_async(AC_TRANSFER_NEG, hcb->ccb.ccb_h.path_id, &neg); + xpt_async(AC_TRANSFER_NEG, hcb->ccb.ccb_h.path_id, &neg); continue_current_hcb(softc); return; case TARGET_ANSWERED_WIDE_NEG: { int wd; wd = get_target_bus_width_request(softc); - if(wd <= softc->goal_bus_width[targ]) { + if(wd <= softc->goal_bus_width[targ]) { /* answer is acceptable */ - softc->current_bus_width[targ] = - softc->goal_bus_width[targ] = neg.bus_width = wd; + softc->current_bus_width[targ] = + softc->goal_bus_width[targ] = neg.bus_width = wd; /* report the event */ neg.valid = CCB_TRANS_BUS_WIDTH_VALID; - xpt_async(AC_TRANSFER_NEG, hcb->ccb.ccb_h.path_id, &neg); + xpt_async(AC_TRANSFER_NEG, hcb->ccb.ccb_h.path_id, &neg); } else { prepare_reject_message(hcb); } } continue_current_hcb(softc); return; case TARGET_REQUESTED_WIDE_NEG: { int wd; wd = get_target_bus_width_request(softc); wd = min (wd, OUR_BUS_WIDTH); - wd = min (wd, softc->user_bus_width[targ]); + wd = min (wd, softc->user_bus_width[targ]); - if(wd != softc->current_bus_width[targ]) { + if(wd != softc->current_bus_width[targ]) { /* the bus width has changed */ - softc->current_bus_width[targ] = - softc->goal_bus_width[targ] = neg.bus_width = wd; + softc->current_bus_width[targ] = + softc->goal_bus_width[targ] = neg.bus_width = wd; /* report the event */ neg.valid = CCB_TRANS_BUS_WIDTH_VALID; - xpt_async(AC_TRANSFER_NEG, hcb->ccb.ccb_h.path_id, &neg); + xpt_async(AC_TRANSFER_NEG, hcb->ccb.ccb_h.path_id, &neg); } 
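/* answer the target with the (possibly truncated) width we can run */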
prepare_width_nego_response(hcb, wd); } continue_current_hcb(softc); return; } Then we handle any errors that could have happened during auto-sense in the same simple-minded way as before. Otherwise we look closer at the details again. - if(hcb->flags & DOING_AUTOSENSE) + if(hcb->flags & DOING_AUTOSENSE) goto autosense_failed; switch(hcb_status) { The next event we consider is an unexpected disconnect, which is considered normal after an ABORT or BUS DEVICE RESET message and abnormal in other cases. case UNEXPECTED_DISCONNECT: if(requested_abort(hcb)) { /* abort affects all commands on that target+LUN, so * mark all disconnected HCBs on that target+LUN as aborted too */ - for(h = softc->first_discon_hcb[hcb->target][hcb->lun]; + for(h = softc->first_discon_hcb[hcb->target][hcb->lun]; h != NULL; h = hh) { - hh=h->next; - free_hcb_and_ccb_done(h, h->ccb, CAM_REQ_ABORTED); + hh=h->next; + free_hcb_and_ccb_done(h, h->ccb, CAM_REQ_ABORTED); } ccb_status = CAM_REQ_ABORTED; } else if(requested_bus_device_reset(hcb)) { int lun; /* reset affects all commands on that target, so * mark all disconnected HCBs on that target+LUN as reset */ - for(lun=0; lun <= OUR_MAX_SUPPORTED_LUN; lun++) - for(h = softc->first_discon_hcb[hcb->target][lun]; + for(lun=0; lun <= OUR_MAX_SUPPORTED_LUN; lun++) + for(h = softc->first_discon_hcb[hcb->target][lun]; h != NULL; h = hh) { - hh=h->next; - free_hcb_and_ccb_done(h, h->ccb, CAM_SCSI_BUS_RESET); + hh=h->next; + free_hcb_and_ccb_done(h, h->ccb, CAM_SCSI_BUS_RESET); } /* send event */ - xpt_async(AC_SENT_BDR, hcb->ccb->ccb_h.path_id, NULL); + xpt_async(AC_SENT_BDR, hcb->ccb->ccb_h.path_id, NULL); /* this was the CAM_RESET_DEV request itself, it is completed */ ccb_status = CAM_REQ_CMP; } else { calculate_residue(hcb); ccb_status = CAM_UNEXP_BUSFREE; /* request the further code to freeze the queue */ - hcb->ccb->ccb_h.status |= CAM_DEV_QFRZN; - lun_to_freeze = hcb->lun; + hcb->ccb->ccb_h.status |= CAM_DEV_QFRZN; + lun_to_freeze = hcb->lun; } break; If the target refuses to accept tags we notify CAM about that and return all the commands for this LUN: case TAGS_REJECTED: /* report the event */ neg.flags = 0 & ~CCB_TRANS_TAG_ENB; neg.valid = CCB_TRANS_TQ_VALID; - xpt_async(AC_TRANSFER_NEG, hcb->ccb.ccb_h.path_id, &neg); + xpt_async(AC_TRANSFER_NEG, hcb->ccb.ccb_h.path_id, &neg); ccb_status = CAM_MSG_REJECT_REC; /* request the further code to freeze the queue */ - hcb->ccb->ccb_h.status |= CAM_DEV_QFRZN; - lun_to_freeze = hcb->lun; + hcb->ccb->ccb_h.status |= CAM_DEV_QFRZN; + lun_to_freeze = hcb->lun; break; Then we check a number of other conditions, with processing basically limited to setting the CCB status: case SELECTION_TIMEOUT: ccb_status = CAM_SEL_TIMEOUT; /* request the further code to freeze the queue */ - hcb->ccb->ccb_h.status |= CAM_DEV_QFRZN; + hcb->ccb->ccb_h.status |= CAM_DEV_QFRZN; lun_to_freeze = CAM_LUN_WILDCARD; break; case PARITY_ERROR: ccb_status = CAM_UNCOR_PARITY; break; case DATA_OVERRUN: case ODD_WIDE_TRANSFER: ccb_status = CAM_DATA_RUN_ERR; break; default: /* all other errors are handled in a generic way */ ccb_status = CAM_REQ_CMP_ERR; /* request the further code to freeze the queue */ - hcb->ccb->ccb_h.status |= CAM_DEV_QFRZN; + hcb->ccb->ccb_h.status |= CAM_DEV_QFRZN; lun_to_freeze = CAM_LUN_WILDCARD; break; } Then we check if the error was serious enough to freeze the input queue until it gets processed, and do so if it is: - if(hcb->ccb->ccb_h.status & CAM_DEV_QFRZN) { + if(hcb->ccb->ccb_h.status & CAM_DEV_QFRZN) { /* freeze the queue */ -
xpt_freeze_devq(ccb->ccb_h.path, /*count*/1); + xpt_freeze_devq(ccb->ccb_h.path, /*count*/1); /* re-queue all commands for this target/LUN back to CAM */ - for(h = softc->first_queued_hcb; h != NULL; h = hh) { - hh = h->next; + for(h = softc->first_queued_hcb; h != NULL; h = hh) { + hh = h->next; - if(targ == h->targ - && (lun_to_freeze == CAM_LUN_WILDCARD || lun_to_freeze == h->lun) ) - free_hcb_and_ccb_done(h, h->ccb, CAM_REQUEUE_REQ); + if(targ == h->targ + && (lun_to_freeze == CAM_LUN_WILDCARD || lun_to_freeze == h->lun) ) + free_hcb_and_ccb_done(h, h->ccb, CAM_REQUEUE_REQ); } } - free_hcb_and_ccb_done(hcb, hcb->ccb, ccb_status); + free_hcb_and_ccb_done(hcb, hcb->ccb, ccb_status); schedule_next_hcb(softc); return; This concludes the generic interrupt handling, although specific controllers may require some additions. Errors Summary SCSIerrors When executing an I/O request many things may go wrong. The cause of the error can be reported in the CCB status in great detail. Examples of use are spread throughout this document. For completeness, here is the summary of recommended responses for the typical error conditions: CAM_RESRC_UNAVAIL - some resource is temporarily unavailable and the SIM driver cannot generate an event when it will become available. An example of this resource would be some intra-controller hardware resource for which the controller does not generate an interrupt when it becomes available. CAM_UNCOR_PARITY - an unrecovered parity error occurred CAM_DATA_RUN_ERR - data overrun, or unexpected data phase (going in the other direction than specified in CAM_DIR_MASK), or odd transfer length for a wide transfer CAM_SEL_TIMEOUT - selection timeout occurred (the target does not respond) CAM_CMD_TIMEOUT - command timeout occurred (the timeout function ran) CAM_SCSI_STATUS_ERROR - the device returned an error CAM_AUTOSENSE_FAIL - the device returned an error and the REQUEST SENSE command failed CAM_MSG_REJECT_REC - a MESSAGE REJECT message was received CAM_SCSI_BUS_RESET - a SCSI bus reset was received CAM_REQ_CMP_ERR - an impossible SCSI phase occurred, or something else just as weird, or just a generic error if further detail is not available CAM_UNEXP_BUSFREE - an unexpected disconnect occurred CAM_BDR_SENT - a BUS DEVICE RESET message was sent to the target CAM_UNREC_HBA_ERROR - unrecoverable Host Bus Adapter error CAM_REQ_TOO_BIG - the request was too large for this controller CAM_REQUEUE_REQ - this request should be re-queued to preserve transaction ordering. This typically occurs when the SIM recognizes an error that should freeze the queue and must place other queued requests for the target at the SIM level back into the XPT queue. Typical cases of such errors are selection timeouts, command timeouts and other similar conditions. In such cases the troublesome command returns the status indicating the error, and the other commands which have not been sent to the bus yet get re-queued. CAM_LUN_INVALID - the LUN ID in the request is not supported by the SCSI controller CAM_TID_INVALID - the target ID in the request is not supported by the SCSI controller Timeout Handling When the timeout for an HCB expires, that request should be aborted, just like with an XPT_ABORT request. The only difference is that the returned status of the aborted request should be CAM_CMD_TIMEOUT instead of CAM_REQ_ABORTED (which is why the abort is better implemented as a function). But there is one more possible problem: what if the abort request itself gets stuck?
In this case the SCSI bus should be reset, just like with an XPT_RESET_BUS request (and the idea about implementing it as a function called from both places applies here too). Also we should reset the whole SCSI bus if a device reset request got stuck. So after all the timeout function would look like: static void xxx_timeout(void *arg) { struct xxx_hcb *hcb = (struct xxx_hcb *)arg; struct xxx_softc *softc; struct ccb_hdr *ccb_h; - softc = hcb->softc; - ccb_h = &hcb->ccb->ccb_h; + softc = hcb->softc; + ccb_h = &hcb->ccb->ccb_h; - if(hcb->flags & HCB_BEING_ABORTED - || ccb_h->func_code == XPT_RESET_DEV) { + if(hcb->flags & HCB_BEING_ABORTED + || ccb_h->func_code == XPT_RESET_DEV) { xxx_reset_bus(softc); } else { - xxx_abort_ccb(hcb->ccb, CAM_CMD_TIMEOUT); + xxx_abort_ccb(hcb->ccb, CAM_CMD_TIMEOUT); } } When we abort a request all the other disconnected requests to the same target/LUN get aborted too. So there appears a question, should we return them with status CAM_REQ_ABORTED or CAM_CMD_TIMEOUT? The current drivers use CAM_CMD_TIMEOUT. This seems logical because if one request got timed out then probably something really bad is happening to the device, so if they would not be disturbed they would time out by themselves. diff --git a/en_US.ISO8859-1/books/arch-handbook/sound/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/sound/chapter.sgml index cbefbeb46b..f529ae5961 100644 --- a/en_US.ISO8859-1/books/arch-handbook/sound/chapter.sgml +++ b/en_US.ISO8859-1/books/arch-handbook/sound/chapter.sgml @@ -1,697 +1,697 @@ Jean-Francois Dockes Contributed by Sound subsystem Introduction sound subsystem The FreeBSD sound subsystem cleanly separates generic sound handling issues from device-specific ones. This makes it easier to add support for new hardware. The &man.pcm.4; framework is the central piece of the sound subsystem. It mainly implements the following elements: system call interface A system call interface (read, write, ioctls) to digitized sound and mixer functions. The ioctl command set is compatible with the legacy OSS or Voxware interface, allowing common multimedia applications to be ported without modification. Common code for processing sound data (format conversions, virtual channels). A uniform software interface to hardware-specific audio interface modules. Additional support for some common hardware interfaces (ac97), or shared hardware-specific code (ex: ISA DMA routines). The support for specific sound cards is implemented by hardware-specific drivers, which provide channel and mixer interfaces to plug into the generic pcm code. In this chapter, the term pcm will refer to the central, common part of the sound driver, as opposed to the hardware-specific modules. The prospective driver writer will of course want to start from an existing module and use the code as the ultimate reference. But, while the sound code is nice and clean, it is also mostly devoid of comments. This document tries to give an overview of the framework interface and answer some questions that may arise while adapting the existing code. 
As an alternative, or in addition to starting from a working example, you can find a commented driver template at http://people.FreeBSD.org/~cg/template.c Files All the relevant code currently (FreeBSD 4.4) lives in /usr/src/sys/dev/sound/, except for the public ioctl interface definitions, found in /usr/src/sys/sys/soundcard.h Under /usr/src/sys/dev/sound/, the pcm/ directory holds the central code, while the isa/ and pci/ directories have the drivers for ISA and PCI boards. Probing, attaching, etc. Sound drivers probe and attach in almost the same way as any hardware driver module. You might want to look at the ISA or PCI specific sections of the handbook for more information. However, sound drivers differ in some ways: They declare themselves as pcm class devices, with a struct snddev_info device private structure: static driver_t xxx_driver = { "pcm", xxx_methods, sizeof(struct snddev_info) }; DRIVER_MODULE(snd_xxxpci, pci, xxx_driver, pcm_devclass, 0, 0); MODULE_DEPEND(snd_xxxpci, snd_pcm, PCM_MINVER, PCM_PREFVER,PCM_MAXVER); device driverssound Most sound drivers need to store additional private information about their device. A private data structure is usually allocated in the attach routine. Its address is passed to pcm by the calls to pcm_register() and mixer_init(). pcm later passes back this address as a parameter in calls to the sound driver interfaces. The sound driver attach routine should declare its MIXER or AC97 interface to pcm by calling mixer_init(). For a MIXER interface, this causes in turn a call to xxxmixer_init(). The sound driver attach routine declares its general CHANNEL configuration to pcm by calling pcm_register(dev, sc, nplay, nrec), where sc is the address for the device data structure, used in further calls from pcm, and nplay and nrec are the number of play and record channels. The sound driver attach routine declares each of its channel objects by calls to pcm_addchan(). This sets up the channel glue in pcm and causes in turn a call to xxxchannel_init(). The sound driver detach routine should call pcm_unregister() before releasing its resources. There are two possible methods to handle non-PnP devices: Use a device_identify() method (example: sound/isa/es1888.c). The device_identify() method probes for the hardware at known addresses and, if it finds a supported device, creates a new pcm device which is then passed to probe/attach. Use a custom kernel configuration with appropriate hints for pcm devices (example: sound/isa/mss.c). pcm drivers should implement device_suspend, device_resume and device_shutdown routines, so that power management and module unloading function correctly. Interfaces The interface between the pcm core and the sound drivers is defined in terms of kernel objects. There are two main interfaces that a sound driver will usually provide: CHANNEL and either MIXER or AC97. The AC97 interface is a very small hardware access (register read/write) interface, implemented by drivers for hardware with an AC97 codec. In this case, the actual MIXER interface is provided by the shared AC97 code in pcm. The CHANNEL interface Common notes for function parameters Sound drivers usually have a private data structure to describe their device, and one structure for each play and record data channel that it supports. For all CHANNEL interface functions, the first parameter is an opaque pointer. 
The second parameter is a pointer to the private channel data structure, except for channel_init() which has a pointer to the private device structure (and returns the channel pointer for further use by pcm). Overview of data transfer operations For sound data transfers, the pcm core and the sound drivers communicate through a shared memory area, described by a struct snd_dbuf. struct snd_dbuf is private to pcm, and sound drivers obtain values of interest by calls to accessor functions (sndbuf_getxxx()). The shared memory area has a size of sndbuf_getsize() and is divided into fixed size blocks of sndbuf_getblksz() bytes. When playing, the general transfer mechanism is as follows (reverse the idea for recording): pcm initially fills up the buffer, then calls the sound driver's xxxchannel_trigger() function with a parameter of PCMTRIG_START. The sound driver then arranges to repeatedly transfer the whole memory area (sndbuf_getbuf(), sndbuf_getsize()) to the device, in blocks of sndbuf_getblksz() bytes. It calls back the chn_intr() pcm function for each transferred block (this will typically happen at interrupt time). chn_intr() arranges to copy new data to the area that was transferred to the device (now free), and make appropriate updates to the snd_dbuf structure. channel_init xxxchannel_init() is called to initialize each of the play or record channels. The calls are initiated from the sound driver attach routine. (See the probe and attach section). static void * xxxchannel_init(kobj_t obj, void *data, struct snd_dbuf *b, struct pcm_channel *c, int dir) { struct xxx_info *sc = data; struct xxx_chinfo *ch; ... return ch; } b is the address for the channel struct snd_dbuf. It should be initialized in the function by calling sndbuf_alloc(). The buffer size to use is normally a small multiple of the 'typical' unit transfer size for your device. c is the pcm channel control structure pointer. This is an opaque object. The function should store it in the local channel structure, to be used in later calls to pcm (ie: chn_intr(c)). dir indicates the channel direction (PCMDIR_PLAY or PCMDIR_REC). The function should return a pointer to the private area used to control this channel. This will be passed as a parameter to other channel interface calls. channel_setformat xxxchannel_setformat() should set up the hardware for the specified channel for the specified sound format. static int xxxchannel_setformat(kobj_t obj, void *data, u_int32_t format) { struct xxx_chinfo *ch = data; ... return 0; } format is specified as an AFMT_XXX value (soundcard.h). channel_setspeed xxxchannel_setspeed() sets up the channel hardware for the specified sampling speed, and returns the possibly adjusted speed. static int xxxchannel_setspeed(kobj_t obj, void *data, u_int32_t speed) { struct xxx_chinfo *ch = data; ... return speed; } channel_setblocksize xxxchannel_setblocksize() sets the block size, which is the size of unit transactions between pcm and the sound driver, and between the sound driver and the device. Typically, this would be the number of bytes transferred before an interrupt occurs. During a transfer, the sound driver should call pcm's chn_intr() every time this size has been transferred. Most sound drivers only take note of the block size here, to be used when an actual transfer will be started. static int xxxchannel_setblocksize(kobj_t obj, void *data, u_int32_t blocksize) { struct xxx_chinfo *ch = data; ... return blocksize; } The function returns the possibly adjusted block size. 
In case the block size is indeed changed, sndbuf_resize() should be called to adjust the buffer. channel_trigger xxxchannel_trigger() is called by pcm to control data transfer operations in the driver. static int xxxchannel_trigger(kobj_t obj, void *data, int go) { struct xxx_chinfo *ch = data; ... return 0; } go defines the action for the current call. The possible values are: PCMTRIG_START: the driver should start a data transfer from or to the channel buffer. If needed, the buffer base and size can be retrieved through sndbuf_getbuf() and sndbuf_getsize(). PCMTRIG_EMLDMAWR / PCMTRIG_EMLDMARD: this tells the driver that the input or output buffer may have been updated. Most drivers just ignore these calls. PCMTRIG_STOP / PCMTRIG_ABORT: the driver should stop the current transfer. If the driver uses ISA DMA, sndbuf_isadma() should be called before performing actions on the device, and will take care of the DMA chip side of things. channel_getptr xxxchannel_getptr() returns the current offset in the transfer buffer. This will typically be called by chn_intr(), and this is how pcm knows where it can transfer new data. channel_free xxxchannel_free() is called to free up channel resources, for example when the driver is unloaded, and should be implemented if the channel data structures are dynamically allocated or if sndbuf_alloc() was not used for buffer allocation. channel_getcaps struct pcmchan_caps * xxxchannel_getcaps(kobj_t obj, void *data) { return &xxx_caps; } The routine returns a pointer to a (usually statically-defined) pcmchan_caps structure (defined in sound/pcm/channel.h. The structure holds the minimum and maximum sampling frequencies, and the accepted sound formats. Look at any sound driver for an example. More functions channel_reset(), channel_resetdone(), and channel_notify() are for special purposes and should not be implemented in a driver without discussing it with the authorities (&a.cg;). channel_setdir() is deprecated. The MIXER interface mixer_init xxxmixer_init() initializes the hardware and tells pcm what mixer devices are available for playing and recording static int xxxmixer_init(struct snd_mixer *m) { struct xxx_info *sc = mix_getdevinfo(m); u_int32_t v; [Initialize hardware] [Set appropriate bits in v for play mixers] mix_setdevs(m, v); [Set appropriate bits in v for record mixers] mix_setrecdevs(m, v) return 0; } Set bits in an integer value and call mix_setdevs() and mix_setrecdevs() to tell pcm what devices exist. Mixer bits definitions can be found in soundcard.h (SOUND_MASK_XXX values and SOUND_MIXER_XXX bit shifts). mixer_set xxxmixer_set() sets the volume level for one mixer device. static int xxxmixer_set(struct snd_mixer *m, unsigned dev, unsigned left, unsigned right) { struct sc_info *sc = mix_getdevinfo(m); [set volume level] - return left | (right << 8); + return left | (right << 8); } The device is specified as a SOUND_MIXER_XXX value The volume values are specified in range [0-100]. A value of zero should mute the device. As the hardware levels probably will not match the input scale, and some rounding will occur, the routine returns the actual level values (in range 0-100) as shown. mixer_setrecsrc xxxmixer_setrecsrc() sets the recording source device. 
static int xxxmixer_setrecsrc(struct snd_mixer *m, u_int32_t src) { struct xxx_info *sc = mix_getdevinfo(m); [look for non zero bit(s) in src, set up hardware] [update src to reflect actual action] return src; } The desired recording devices are specified as a bit field. The actual devices set for recording are returned. Some drivers can only set one device for recording. The function should return -1 if an error occurs. mixer_uninit, mixer_reinit xxxmixer_uninit() should ensure that all sound is muted and, if possible, that the mixer hardware is powered down. xxxmixer_reinit() should ensure that the mixer hardware is powered up and any settings not controlled by mixer_set() or mixer_setrecsrc() are restored. The AC97 interface AC97 The AC97 interface is implemented by drivers with an AC97 codec. It only has three methods: xxxac97_init() returns the number of ac97 codecs found. ac97_read() and ac97_write() read or write a specified register. The AC97 interface is used by the AC97 code in pcm to perform higher level operations. Look at sound/pci/maestro3.c or many others under sound/pci/ for an example. diff --git a/en_US.ISO8859-1/books/developers-handbook/introduction/chapter.sgml b/en_US.ISO8859-1/books/developers-handbook/introduction/chapter.sgml index 680a868792..200f52c858 100644 --- a/en_US.ISO8859-1/books/developers-handbook/introduction/chapter.sgml +++ b/en_US.ISO8859-1/books/developers-handbook/introduction/chapter.sgml @@ -1,225 +1,225 @@ Murray Stokely Contributed by Jeroen Ruigrok van der Werven Introduction Developing on FreeBSD So here we are. The system is all installed and you are ready to start programming. But where to start? What does FreeBSD provide? What can it do for me, as a programmer? These are some questions which this chapter tries to answer. Of course, programming has different levels of proficiency like any other trade. For some it is a hobby, for others it is their profession. The information in this chapter is aimed toward the beginning programmer; indeed, it could prove useful for any programmer unfamiliar with the &os; platform. The BSD Vision To produce the best &unix;-like operating system package possible, with due respect to the original software tools ideology as well as usability, performance and stability. Architectural Guidelines Our ideology can be described by the following guidelines: Do not add new functionality unless an implementor cannot complete a real application without it. It is as important to decide what a system is not as to decide what it is. Do not serve all the world's needs; rather, make the system extensible so that additional needs can be met in an upwardly compatible fashion. The only thing worse than generalizing from one example is generalizing from no examples at all. If a problem is not completely understood, it is probably best to provide no solution at all. If you can get 90 percent of the desired effect for 10 percent of the work, use the simpler solution. Isolate complexity as much as possible. Provide mechanism, rather than policy. In particular, place user interface policy in the client's hands. From Scheifler & Gettys: "X Window System" The Layout of /usr/src The complete source code to FreeBSD is available from our public CVS repository.
The source code is normally installed in - /usr/src which contains the + /usr/src which contains the following subdirectories: Directory Description - bin/ + bin/ Source for files in /bin - contrib/ + contrib/ Source for files from contributed software. - crypto/ + crypto/ Cryptographical sources - etc/ + etc/ Source for files in /etc + class="directory">/etc - games/ + games/ Source for files in /usr/games + class="directory">/usr/games - gnu/ + gnu/ Utilities covered by the GNU Public License - include/ + include/ Source for files in /usr/include + class="directory">/usr/include kerberos5/ + class="directory">kerberos5/ Source for Kerberos version 5 - lib/ + lib/ Source for files in /usr/lib + class="directory">/usr/lib - libexec/ + libexec/ Source for files in /usr/libexec + class="directory">/usr/libexec release/ + class="directory">release/ Files required to produce a FreeBSD release rescue/ Build system for the /rescue utilities - sbin/ + sbin/ Source for files in /sbin + class="directory">/sbin - secure/ + secure/ FreeSec sources - share/ + share/ Source for files in /usr/share + class="directory">/usr/share - sys/ + sys/ Kernel source files - tools/ + tools/ Tools used for maintenance and testing of FreeBSD usr.bin/ + class="directory">usr.bin/ Source for files in /usr/bin + class="directory">/usr/bin usr.sbin/ + class="directory">usr.sbin/ Source for files in /usr/sbin + class="directory">/usr/sbin diff --git a/en_US.ISO8859-1/books/developers-handbook/ipv6/chapter.sgml b/en_US.ISO8859-1/books/developers-handbook/ipv6/chapter.sgml index 58637dc6a6..71e76092df 100644 --- a/en_US.ISO8859-1/books/developers-handbook/ipv6/chapter.sgml +++ b/en_US.ISO8859-1/books/developers-handbook/ipv6/chapter.sgml @@ -1,1593 +1,1593 @@ IPv6 Internals Yoshinobu Inoue Contributed by IPv6/IPsec Implementation This section should explain IPv6 and IPsec related implementation internals. These functionalities are derived from KAME project IPv6 Conformance The IPv6 related functions conforms, or tries to conform to the latest set of IPv6 specifications. For future reference we list some of the relevant documents below (NOTE: this is not a complete list - this is too hard to maintain...). For details please refer to specific chapter in the document, RFCs, manual pages, or comments in the source code. Conformance tests have been performed on the KAME STABLE kit at TAHI project. Results can be viewed at . We also attended Univ. of New Hampshire IOL tests () in the past, with our past snapshots. RFC1639: FTP Operation Over Big Address Records (FOOBAR) RFC2428 is preferred over RFC1639. FTP clients will first try RFC2428, then RFC1639 if failed. RFC1886: DNS Extensions to support IPv6 RFC1933: Transition Mechanisms for IPv6 Hosts and Routers IPv4 compatible address is not supported. automatic tunneling (described in 4.3 of this RFC) is not supported. &man.gif.4; interface implements IPv[46]-over-IPv[46] tunnel in a generic way, and it covers "configured tunnel" described in the spec. See 23.5.1.5 in this document for details. RFC1981: Path MTU Discovery for IPv6 RFC2080: RIPng for IPv6 usr.sbin/route6d support this. RFC2292: Advanced Sockets API for IPv6 For supported library functions/kernel APIs, see sys/netinet6/ADVAPI. RFC2362: Protocol Independent Multicast-Sparse Mode (PIM-SM) RFC2362 defines packet formats for PIM-SM. draft-ietf-pim-ipv6-01.txt is written based on this. RFC2373: IPv6 Addressing Architecture supports node required addresses, and conforms to the scope requirement. 
RFC2374: An IPv6 Aggregatable Global Unicast Address Format supports a 64-bit Interface ID. RFC2375: IPv6 Multicast Address Assignments Userland applications use the well-known addresses assigned in the RFC. RFC2428: FTP Extensions for IPv6 and NATs RFC2428 is preferred over RFC1639. FTP clients will first try RFC2428, then RFC1639 if that fails. RFC2460: IPv6 specification RFC2461: Neighbor discovery for IPv6 See 23.5.1.2 in this document for details. RFC2462: IPv6 Stateless Address Autoconfiguration See 23.5.1.4 in this document for details. RFC2463: ICMPv6 for IPv6 specification See 23.5.1.9 in this document for details. RFC2464: Transmission of IPv6 Packets over Ethernet Networks RFC2465: MIB for IPv6: Textual Conventions and General Group Necessary statistics are gathered by the kernel. Actual IPv6 MIB support is provided as a patchkit for ucd-snmp. RFC2466: MIB for IPv6: ICMPv6 group Necessary statistics are gathered by the kernel. Actual IPv6 MIB support is provided as a patchkit for ucd-snmp. RFC2467: Transmission of IPv6 Packets over FDDI Networks RFC2497: Transmission of IPv6 Packets over ARCnet Networks RFC2553: Basic Socket Interface Extensions for IPv6 IPv4 mapped address (3.7) and special behavior of the IPv6 wildcard bind socket (3.8) are supported. See 23.5.1.12 in this document for details. RFC2675: IPv6 Jumbograms See 23.5.1.7 in this document for details. RFC2710: Multicast Listener Discovery for IPv6 RFC2711: IPv6 router alert option draft-ietf-ipngwg-router-renum-08: Router renumbering for IPv6 draft-ietf-ipngwg-icmp-namelookups-02: IPv6 Name Lookups Through ICMP draft-ietf-ipngwg-icmp-name-lookups-03: IPv6 Name Lookups Through ICMP draft-ietf-pim-ipv6-01.txt: PIM for IPv6 &man.pim6dd.8; implements dense mode. &man.pim6sd.8; implements sparse mode. draft-itojun-ipv6-tcp-to-anycast-00: Disconnecting TCP connection toward IPv6 anycast address draft-yamamoto-wideipv6-comm-model-00 See 23.5.1.6 in this document for details. draft-ietf-ipngwg-scopedaddr-format-00.txt: An Extension of Format for IPv6 Scoped Addresses Neighbor Discovery Neighbor Discovery is fairly stable. Currently Address Resolution, Duplicated Address Detection, and Neighbor Unreachability Detection are supported. In the near future we will be adding Proxy Neighbor Advertisement support in the kernel and an Unsolicited Neighbor Advertisement transmission command as an admin tool. If DAD fails, the address will be marked "duplicated" and a message will be generated to syslog (and usually to the console). The "duplicated" mark can be checked with &man.ifconfig.8;. It is the administrator's responsibility to check for and recover from DAD failures. The behavior should be improved in the near future. Some network drivers loop multicast packets back to the node itself, even if instructed not to do so (especially in promiscuous mode). In such cases DAD may fail, because the DAD engine sees the inbound NS packet (actually from the node itself) and considers it a sign of a duplicate. You may want to look at the #if condition marked "heuristics" in sys/netinet6/nd6_nbr.c:nd6_dad_timer() as a workaround (note that the code fragment in the "heuristics" section is not spec conformant).
The Neighbor Discovery specification (RFC2461) does not discuss neighbor cache handling in the following cases: when there was no neighbor cache entry and the node received an unsolicited RS/NS/NA/redirect packet without a link-layer address; neighbor cache handling on a medium without link-layer addresses (we need a neighbor cache entry to hold the IsRouter bit) For the first case, we implemented a workaround based on discussions on the IETF ipngwg mailing list. For more details, see the comments in the source code and the email thread started from (IPng 7155), dated Feb 6 1999. The IPv6 on-link determination rule (RFC2461) is quite different from the assumptions in the BSD network code. At this moment, no on-link determination rule is supported when the default router list is empty (RFC2461, section 5.2, last sentence in the 2nd paragraph - note that the spec misuses the words "host" and "node" in several places in that section). To avoid possible DoS attacks and infinite loops, only 10 options on an ND packet are accepted now. Therefore, if you have 20 prefix options attached to an RA, only the first 10 prefixes will be recognized. If this troubles you, please raise it on the FREEBSD-CURRENT mailing list and/or modify nd6_maxndopt in sys/netinet6/nd6.c. If there is high demand we may provide a sysctl knob for the variable. Scope Index IPv6 uses scoped addresses, so it is very important to specify the scope index (interface index for a link-local address, or site index for a site-local address) together with an IPv6 address. Without a scope index, a scoped IPv6 address is ambiguous to the kernel, and the kernel will not be able to determine the outbound interface for a packet. Ordinary userland applications should use the advanced API (RFC2292) to specify the scope index, or interface index. For a similar purpose, the sin6_scope_id member of the sockaddr_in6 structure is defined in RFC2553. However, the semantics of sin6_scope_id are rather vague. If you care about the portability of your application, we suggest you use the advanced API rather than sin6_scope_id. In the kernel, the interface index for a link-local scoped address is embedded into the 2nd 16-bit word (the 3rd and 4th bytes) of the IPv6 address. For example, you may see something like: fe80:1::200:f8ff:fe01:6317 in the routing table and in the interface address structure (struct in6_ifaddr). The address above is a link-local unicast address that belongs to the network interface whose interface identifier is 1. The embedded index enables us to identify IPv6 link-local addresses across multiple interfaces effectively, with only a small code change. Routing daemons and configuration programs, like &man.route6d.8; and &man.ifconfig.8;, will need to manipulate the "embedded" scope index. These programs use routing sockets and ioctls (like SIOCGIFADDR_IN6), and the kernel API will return IPv6 addresses with the 2nd 16-bit word filled in. These APIs are for manipulating kernel internal structures; programs that use them have to be prepared for differences between kernels anyway. When you specify a scoped address on the command line, NEVER write the embedded form (such as ff02:1::1 or fe80:2::fedc); it is not supposed to work. Always use the standard form, like ff02::1 or fe80::fedc, with a command line option to specify the interface (like ping6 -I ne0 ff02::1). In general, if a command does not have a command line option to specify the outgoing interface, that command is not ready to accept scoped addresses. This may seem to run counter to IPv6's premise of supporting the "dentist office" situation; we believe the specifications need some improvement here.
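To make the embedded form concrete, here is a minimal sketch - these are hypothetical helpers written for this document, not actual kernel functions - of how an interface index could be stored into, and cleared from, the 2nd 16-bit word of a link-local address:

#include <sys/types.h>
#include <netinet/in.h>

/* Hypothetical illustration: embed an interface index into the
 * 3rd and 4th bytes (2nd 16-bit word) of a link-local address. */
static void
embed_scope(struct in6_addr *a, u_int16_t ifindex)
{
	a->s6_addr[2] = (ifindex >> 8) & 0xff;
	a->s6_addr[3] = ifindex & 0xff;
}

/* Hypothetical illustration: clear the embedded index, since it is
 * kernel-internal and must never appear on the wire. */
static void
clear_scope(struct in6_addr *a)
{
	a->s6_addr[2] = 0;
	a->s6_addr[3] = 0;
}

With an interface index of 1, fe80::200:f8ff:fe01:6317 becomes the fe80:1::200:f8ff:fe01:6317 form seen in the routing table above.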
Some of the userland tools support an extended numeric IPv6 syntax, as documented in draft-ietf-ipngwg-scopedaddr-format-00.txt. You can specify the outgoing link by using the name of the outgoing interface, as in "fe80::1%ne0". This way you can specify a link-local scoped address without much trouble. To use this extension in your program, you will need to use &man.getaddrinfo.3; and &man.getnameinfo.3; with NI_WITHSCOPEID. The implementation currently assumes a 1-to-1 relationship between a link and an interface, which is stronger than what the specs say. Plug and Play Most of IPv6 stateless address autoconfiguration is implemented in the kernel. Neighbor Discovery functions are implemented in the kernel as a whole. Router Advertisement (RA) input for hosts is implemented in the kernel. Router Solicitation (RS) output for endhosts, RS input for routers, and RA output for routers are implemented in userland. Assignment of link-local, and special addresses An IPv6 link-local address is generated from the IEEE802 address (Ethernet MAC address). Each interface is assigned an IPv6 link-local address automatically when the interface comes up (IFF_UP). Also, a direct route for the link-local address is added to the routing table. Here is output from the netstat command: Internet6: Destination Gateway Flags Netif Expire fe80:1::%ed0/64 link#1 UC ed0 fe80:2::%ep0/64 link#2 UC ep0 Interfaces that have no IEEE802 address (pseudo interfaces like tunnel interfaces, or ppp interfaces) will borrow an IEEE802 address from other interfaces, such as Ethernet interfaces, whenever possible. If no IEEE802 hardware is attached, a last-resort pseudo-random value, MD5(hostname), will be used as the source of the link-local address. If that is not suitable for your usage, you will need to configure the link-local address manually. If an interface is not capable of handling IPv6 (for example, due to lack of multicast support), no link-local address will be assigned to that interface. See section 2 for details. Each interface joins the solicited-node multicast address and the link-local all-nodes multicast address (e.g. fe80::1:ff01:6317 and ff02::1, respectively, on the link the interface is attached to). In addition to the link-local address, the loopback address (::1) will be assigned to the loopback interface. Also, ::1/128 and ff01::/32 are automatically added to the routing table, and the loopback interface joins the node-local multicast group ff01::1. Stateless address autoconfiguration on hosts In the IPv6 specification, nodes are separated into two categories: routers and hosts. Routers forward packets addressed to others; hosts do not forward packets. net.inet6.ip6.forwarding defines whether this node is a router or a host (router if it is 1, host if it is 0). When a host hears a Router Advertisement from a router, it may configure itself by stateless address autoconfiguration. This behavior can be controlled by net.inet6.ip6.accept_rtadv (the host autoconfigures itself if it is set to 1). By autoconfiguration, the network address prefix for the receiving interface (usually a global address prefix) is added. The default route is also configured. Routers periodically generate Router Advertisement packets. To ask an adjacent router to generate an RA packet, a host can transmit a Router Solicitation. To generate an RS packet at any time, use the rtsol command. The &man.rtsold.8; daemon is also available; it generates Router Solicitations whenever necessary, and it works great for nomadic usage (notebooks/laptops).
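For example, a typical host that should configure itself from Router Advertisements could be set up as follows; the interface name ne0 is arbitrary, and the sysctl variables are the ones described above:

&prompt.root; sysctl net.inet6.ip6.forwarding=0
&prompt.root; sysctl net.inet6.ip6.accept_rtadv=1
&prompt.root; rtsol ne0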
If one wishes to ignore Router Advertisements, use sysctl to set net.inet6.ip6.accept_rtadv to 0. To generate Router Advertisement from a router, use the &man.rtadvd.8; daemon. Note that, IPv6 specification assumes the following items, and nonconforming cases are left unspecified: Only hosts will listen to router advertisements Hosts have single network interface (except loopback) Therefore, this is unwise to enable net.inet6.ip6.accept_rtadv on routers, or multi-interface host. A misconfigured node can behave strange (nonconforming configuration allowed for those who would like to do some experiments). To summarize the sysctl knob: accept_rtadv forwarding role of the node --- --- --- 0 0 host (to be manually configured) 0 1 router 1 0 autoconfigured host (spec assumes that host has single interface only, autoconfigured host with multiple interface is out-of-scope) 1 1 invalid, or experimental (out-of-scope of spec) RFC2462 has validation rule against incoming RA prefix information option, in 5.5.3 (e). This is to protect hosts from malicious (or misconfigured) routers that advertise very short prefix lifetime. There was an update from Jim Bound to ipngwg mailing list (look for "(ipng 6712)" in the archive) and it is implemented Jim's update. See 23.5.1.2 in the document for relationship between DAD and autoconfiguration. Generic tunnel interface GIF (Generic InterFace) is a pseudo interface for configured tunnel. Details are described in &man.gif.4;. Currently v6 in v6 v6 in v4 v4 in v6 v4 in v4 are available. Use &man.gifconfig.8; to assign physical (outer) source and destination address to gif interfaces. Configuration that uses same address family for inner and outer IP header (v4 in v4, or v6 in v6) is dangerous. It is very easy to configure interfaces and routing tables to perform infinite level of tunneling. Please be warned. gif can be configured to be ECN-friendly. See 23.5.4.5 for ECN-friendliness of tunnels, and &man.gif.4; for how to configure. If you would like to configure an IPv4-in-IPv6 tunnel with gif interface, read &man.gif.4; carefully. You will need to remove IPv6 link-local address automatically assigned to the gif interface. Source Address Selection Current source selection rule is scope oriented (there are some exceptions - see below). For a given destination, a source IPv6 address is selected by the following rule: If the source address is explicitly specified by the user (e.g. via the advanced API), the specified address is used. If there is an address assigned to the outgoing interface (which is usually determined by looking up the routing table) that has the same scope as the destination address, the address is used. This is the most typical case. If there is no address that satisfies the above condition, choose a global address assigned to one of the interfaces on the sending node. If there is no address that satisfies the above condition, and destination address is site local scope, choose a site local address assigned to one of the interfaces on the sending node. If there is no address that satisfies the above condition, choose the address associated with the routing table entry for the destination. This is the last resort, which may cause scope violation. For instance, ::1 is selected for ff01::1, fe80:1::200:f8ff:fe01:6317 for fe80:1::2a0:24ff:feab:839b (note that embedded interface index - described in 23.5.1.3 - helps us choose the right source address. Those embedded indices will not be on the wire). 
If the outgoing interface has multiple address for the scope, a source is selected longest match basis (rule 3). Suppose 3ffe:501:808:1:200:f8ff:fe01:6317 and 3ffe:2001:9:124:200:f8ff:fe01:6317 are given to the outgoing interface. 3ffe:501:808:1:200:f8ff:fe01:6317 is chosen as the source for the destination 3ffe:501:800::1. Note that the above rule is not documented in the IPv6 spec. It is considered "up to implementation" item. There are some cases where we do not use the above rule. One example is connected TCP session, and we use the address kept in tcb as the source. Another example is source address for Neighbor Advertisement. Under the spec (RFC2461 7.2.2) NA's source should be the target address of the corresponding NS's target. In this case we follow the spec rather than the above longest-match rule. For new connections (when rule 1 does not apply), deprecated addresses (addresses with preferred lifetime = 0) will not be chosen as source address if other choices are available. If no other choices are available, deprecated address will be used as a last resort. If there are multiple choice of deprecated addresses, the above scope rule will be used to choose from those deprecated addresses. If you would like to prohibit the use of deprecated address for some reason, configure net.inet6.ip6.use_deprecated to 0. The issue related to deprecated address is described in RFC2462 5.5.4 (NOTE: there is some debate underway in IETF ipngwg on how to use "deprecated" address). Jumbo Payload The Jumbo Payload hop-by-hop option is implemented and can be used to send IPv6 packets with payloads longer than 65,535 octets. But currently no physical interface whose MTU is more than 65,535 is supported, so such payloads can be seen only on the loopback interface (i.e. lo0). If you want to try jumbo payloads, you first have to reconfigure the kernel so that the MTU of the loopback interface is more than 65,535 bytes; add the following to the kernel configuration file: options "LARGE_LOMTU" #To test jumbo payload and recompile the new kernel. Then you can test jumbo payloads by the &man.ping6.8; command with -b and -s options. The -b option must be specified to enlarge the size of the socket buffer and the -s option specifies the length of the packet, which should be more than 65,535. For example, type as follows: &prompt.user; ping6 -b 70000 -s 68000 ::1 The IPv6 specification requires that the Jumbo Payload option must not be used in a packet that carries a fragment header. If this condition is broken, an ICMPv6 Parameter Problem message must be sent to the sender. specification is followed, but you cannot usually see an ICMPv6 error caused by this requirement. When an IPv6 packet is received, the frame length is checked and compared to the length specified in the payload length field of the IPv6 header or in the value of the Jumbo Payload option, if any. If the former is shorter than the latter, the packet is discarded and statistics are incremented. You can see the statistics as output of &man.netstat.8; command with `-s -p ip6' option: &prompt.user; netstat -s -p ip6 ip6: (snip) 1 with data size < data length So, kernel does not send an ICMPv6 error unless the erroneous packet is an actual Jumbo Payload, that is, its packet size is more than 65,535 bytes. As described above, currently no physical interface with such a huge MTU is supported, so it rarely returns an ICMPv6 error. TCP/UDP over jumbogram is not supported at this moment. 
This is because we have no medium (other than the loopback interface) on which to test it. Contact us if you need it. IPsec does not work on jumbograms. This is due to some specification twists in supporting AH with jumbograms (the AH header size influences the payload length, which makes it very hard to authenticate an inbound packet that carries both the jumbo payload option and AH). There are fundamental issues in *BSD support for jumbograms. We would like to address them, but we need more time to finalize the work. To name a few: The mbuf pkthdr.len field is typed as "int" in 4.4BSD, so it cannot hold a jumbogram with len > 2G on 32-bit architecture CPUs. To support jumbograms properly, the field must be expanded to hold 4G + IPv6 header + link-layer header; it must therefore be expanded to at least int64_t (u_int32_t is NOT enough). We mistakenly use "int" to hold packet lengths in many places. We need to convert those to a larger integral type. This requires great care, as we may experience overflow during packet length computations. We mistakenly check the ip6_plen field of the IPv6 header for the packet payload length in various places. We should be checking mbuf pkthdr.len instead. ip6_input() performs a sanity check on the jumbo payload option on input, so mbuf pkthdr.len can safely be used afterwards. The TCP code needs a careful update in a bunch of places, of course. Loop prevention in header processing The IPv6 specification allows an arbitrary number of extension headers to be placed in a packet. If we implemented the IPv6 packet processing code the way the BSD IPv4 code is implemented, the kernel stack could overflow due to a long function call chain. The sys/netinet6 code is carefully designed to avoid kernel stack overflow, and for this purpose it defines its own protocol switch structure, "struct ip6protosw" (see netinet6/ip6protosw.h). The IPv4 part (sys/netinet) was not updated in the same way, for compatibility, but a small change was made to its pr_input() prototype, so "struct ipprotosw" is also defined. As a consequence, if you receive an IPsec-over-IPv4 packet with a massive number of IPsec headers, the kernel stack may blow up; IPsec-over-IPv6 is okay. (Of course, for all those IPsec headers to be processed, each IPsec header must pass each IPsec check, so an anonymous attacker will not be able to mount such an attack.) ICMPv6 After RFC2463 was published, the IETF ipngwg decided to disallow ICMPv6 error packets sent in response to ICMPv6 redirects, to prevent ICMPv6 storms on a network medium. This is already implemented in the kernel. Applications For userland programming, we support the IPv6 socket API as specified in RFC2553, RFC2292 and upcoming Internet drafts. TCP/UDP over IPv6 is available and quite stable. You can enjoy &man.telnet.1;, &man.ftp.1;, &man.rlogin.1;, &man.rsh.1;, &man.ssh.1;, etc. These applications are protocol independent; that is, they automatically choose IPv4 or IPv6 according to DNS. Kernel Internals While ip_forward() calls ip_output(), ip6_forward() directly calls if_output(), since routers must not divide IPv6 packets into fragments. An ICMPv6 error should contain the original packet as long as possible, up to 1280 bytes. A UDP6/IP6 port unreach, for instance, should contain all extension headers and the *unchanged* UDP6 and IP6 headers. So, all IP6 functions except TCP never convert network byte order to host byte order, in order to preserve the original packet. tcp_input(), udp6_input() and icmp6_input() cannot assume that the IP6 header immediately precedes the transport headers, due to extension headers.
So, in6_cksum() was implemented to handle packets whose IP6 header and transport header is not continuous. TCP/IP6 nor UDP6/IP6 header structures do not exist for checksum calculation. To process IP6 header, extension headers and transport headers easily, network drivers are now required to store packets in one internal mbuf or one or more external mbufs. A typical old driver prepares two internal mbufs for 96 - 204 bytes data, however, now such packet data is stored in one external mbuf. netstat -s -p ip6 tells you whether or not your driver conforms such requirement. In the following example, "cce0" violates the requirement. (For more information, refer to Section 2.) Mbuf statistics: 317 one mbuf two or more mbuf:: lo0 = 8 cce0 = 10 3282 one ext mbuf 0 two or more ext mbuf Each input function calls IP6_EXTHDR_CHECK in the beginning to check if the region between IP6 and its header is continuous. IP6_EXTHDR_CHECK calls m_pullup() only if the mbuf has M_LOOP flag, that is, the packet comes from the loopback interface. m_pullup() is never called for packets coming from physical network interfaces. Both IP and IP6 reassemble functions never call m_pullup(). IPv4 mapped address and IPv6 wildcard socket RFC2553 describes IPv4 mapped address (3.7) and special behavior of IPv6 wildcard bind socket (3.8). The spec allows you to: Accept IPv4 connections by AF_INET6 wildcard bind socket. Transmit IPv4 packet over AF_INET6 socket by using special form of the address like ::ffff:10.1.1.1. but the spec itself is very complicated and does not specify how the socket layer should behave. Here we call the former one "listening side" and the latter one "initiating side", for reference purposes. You can perform wildcard bind on both of the address families, on the same port. The following table show the behavior of FreeBSD 4.x. listening side initiating side (AF_INET6 wildcard (connection to ::ffff:10.1.1.1) socket gets IPv4 conn.) --- --- FreeBSD 4.x configurable supported default: enabled The following sections will give you more details, and how you can configure the behavior. Comments on listening side: It looks that RFC2553 talks too little on wildcard bind issue, especially on the port space issue, failure mode and relationship between AF_INET/INET6 wildcard bind. There can be several separate interpretation for this RFC which conform to it but behaves differently. So, to implement portable application you should assume nothing about the behavior in the kernel. Using &man.getaddrinfo.3; is the safest way. Port number space and wildcard bind issues were discussed in detail on ipv6imp mailing list, in mid March 1999 and it looks that there is no concrete consensus (means, up to implementers). You may want to check the mailing list archives. If a server application would like to accept IPv4 and IPv6 connections, there will be two alternatives. One is using AF_INET and AF_INET6 socket (you will need two sockets). Use &man.getaddrinfo.3; with AI_PASSIVE into ai_flags, and &man.socket.2; and &man.bind.2; to all the addresses returned. By opening multiple sockets, you can accept connections onto the socket with proper address family. IPv4 connections will be accepted by AF_INET socket, and IPv6 connections will be accepted by AF_INET6 socket. Another way is using one AF_INET6 wildcard bind socket. Use &man.getaddrinfo.3; with AI_PASSIVE into ai_flags and with AF_INET6 into ai_family, and set the 1st argument hostname to NULL. And &man.socket.2; and &man.bind.2; to the address returned. 
(this should be the IPv6 unspecified address). You can accept both IPv4 and IPv6 packets via this one socket. To portably support only IPv6 traffic on an AF_INET6 wildcard bound socket, always check the peer address when a connection is made toward the AF_INET6 listening socket. If the address is an IPv4 mapped address, you may want to reject the connection. You can check the condition by using the IN6_IS_ADDR_V4MAPPED() macro. To resolve this issue more easily, there is a system-dependent &man.setsockopt.2; option, IPV6_BINDV6ONLY, used like below: int on = 1; if (setsockopt(s, IPPROTO_IPV6, IPV6_BINDV6ONLY, (char *)&on, sizeof (on)) < 0) { /* handle error */ } When this call succeeds, the socket receives IPv6 packets only. Comments on initiating side: Advice to application implementers: to implement a portable IPv6 application (one that works on multiple IPv6 kernels), we believe the following are the keys to success: NEVER hardcode AF_INET or AF_INET6. Use &man.getaddrinfo.3; and &man.getnameinfo.3; throughout the system. Never use gethostby*(), getaddrby*(), inet_*() or getipnodeby*(). (To update existing applications to be IPv6 aware easily, getipnodeby*() will sometimes be useful. But if possible, try to rewrite the code to use &man.getaddrinfo.3; and &man.getnameinfo.3;.) If you would like to connect to a destination, use &man.getaddrinfo.3; and try all the destinations returned, like &man.telnet.1; does. Some IPv6 stacks ship with a buggy &man.getaddrinfo.3;. Ship a minimal working version with your application and use it as a last resort. If you would like to use an AF_INET6 socket for both IPv4 and IPv6 outgoing connections, you will need to use &man.getipnodebyname.3;. When you would like to update your existing application to be IPv6 aware with minimal effort, this approach may be chosen. But please note that it is a temporary solution, because &man.getipnodebyname.3; itself is not recommended, as it does not handle scoped IPv6 addresses at all. For IPv6 name resolution, &man.getaddrinfo.3; is the preferred API, so you should rewrite your application to use &man.getaddrinfo.3; when you get the time to do it. When writing applications that make outgoing connections, the story becomes much simpler if you treat AF_INET and AF_INET6 as totally separate address families: the {set,get}sockopt issues become simpler, and the DNS issues are made simpler. We do not recommend relying upon IPv4 mapped addresses. unified tcp and inpcb code FreeBSD 4.x uses shared TCP code between IPv4 and IPv6 (from sys/netinet/tcp*) and separate udp4/6 code. It uses a unified inpcb structure. The platform can be configured to support IPv4 mapped addresses. The kernel configuration is summarized as follows: By default, an AF_INET6 socket will grab IPv4 connections under certain conditions, and can initiate connections to an IPv4 destination embedded in an IPv4 mapped IPv6 address. You can disable this for the entire system with sysctl, like below. sysctl net.inet6.ip6.mapped_addr=0 listening side Each socket can be configured to support the special AF_INET6 wildcard bind (enabled by default). You can disable it on a per-socket basis with &man.setsockopt.2;, using the same IPV6_BINDV6ONLY call shown above. A wildcard AF_INET6 socket grabs an IPv4 connection if and only if the following conditions are satisfied: there is no AF_INET socket that matches the IPv4 connection the AF_INET6 socket is configured to accept IPv4 traffic, i.e. getsockopt(IPV6_BINDV6ONLY) returns 0.
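As an aside, here is a minimal sketch of the two-socket approach recommended above for servers; the service name "9999" is an arbitrary example, and error handling is abbreviated:

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>
#include <unistd.h>

/*
 * Sketch: open one listening socket per address family returned by
 * getaddrinfo(3) with AI_PASSIVE, as described above.
 */
int
listen_all(int *socks, int maxsocks)
{
	struct addrinfo hints, *res, *ai;
	int nsock = 0;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;	/* both AF_INET and AF_INET6 */
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = AI_PASSIVE;	/* wildcard addresses */
	if (getaddrinfo(NULL, "9999", &hints, &res) != 0)
		return (-1);
	for (ai = res; ai != NULL && nsock < maxsocks; ai = ai->ai_next) {
		socks[nsock] = socket(ai->ai_family, ai->ai_socktype,
		    ai->ai_protocol);
		if (socks[nsock] < 0)
			continue;
		if (bind(socks[nsock], ai->ai_addr, ai->ai_addrlen) < 0 ||
		    listen(socks[nsock], 5) < 0) {
			close(socks[nsock]);
			continue;
		}
		nsock++;
	}
	freeaddrinfo(res);
	return (nsock);
}

IPv4 connections then arrive on the AF_INET socket and IPv6 connections on the AF_INET6 socket, so no IN6_IS_ADDR_V4MAPPED() check is needed.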
There is no problem with open/close ordering. initiating side FreeBSD 4.x supports outgoing connection to IPv4 mapped address (::ffff:10.1.1.1), if the node is configured to support IPv4 mapped address. sockaddr_storage When RFC2553 was about to be finalized, there was discussion on how struct sockaddr_storage members are named. One proposal is to prepend "__" to the members (like "__ss_len") as they should not be touched. The other proposal was not to prepend it (like "ss_len") as we need to touch those members directly. There was no clear consensus on it. As a result, RFC2553 defines struct sockaddr_storage as follows: struct sockaddr_storage { u_char __ss_len; /* address length */ u_char __ss_family; /* address family */ /* and bunch of padding */ }; On the contrary, XNET draft defines as follows: struct sockaddr_storage { u_char ss_len; /* address length */ u_char ss_family; /* address family */ /* and bunch of padding */ }; In December 1999, it was agreed that RFC2553bis should pick the latter (XNET) definition. Current implementation conforms to XNET definition, based on RFC2553bis discussion. If you look at multiple IPv6 implementations, you will be able to see both definitions. As an userland programmer, the most portable way of dealing with it is to: ensure ss_family and/or ss_len are available on the platform, by using GNU autoconf, have -Dss_family=__ss_family to unify all occurrences (including header file) into __ss_family, or never touch __ss_family. cast to sockaddr * and use sa_family like: struct sockaddr_storage ss; - family = ((struct sockaddr *)&ss)->sa_family + family = ((struct sockaddr *)&ss)->sa_family Network Drivers Now following two items are required to be supported by standard drivers: mbuf clustering requirement. In this stable release, we changed MINCLSIZE into MHLEN+1 for all the operating systems in order to make all the drivers behave as we expect. multicast. If &man.ifmcstat.8; yields no multicast group for a interface, that interface has to be patched. If any of the drivers do not support the requirements, then the drivers can not be used for IPv6 and/or IPsec communication. If you find any problem with your card using IPv6/IPsec, then, please report it to the &a.bugs;. (NOTE: In the past we required all PCMCIA drivers to have a call to in6_ifattach(). We have no such requirement any more) Translator We categorize IPv4/IPv6 translator into 4 types: Translator A --- It is used in the early stage of transition to make it possible to establish a connection from an IPv6 host in an IPv6 island to an IPv4 host in the IPv4 ocean. Translator B --- It is used in the early stage of transition to make it possible to establish a connection from an IPv4 host in the IPv4 ocean to an IPv6 host in an IPv6 island. Translator C --- It is used in the late stage of transition to make it possible to establish a connection from an IPv4 host in an IPv4 island to an IPv6 host in the IPv6 ocean. Translator D --- It is used in the late stage of transition to make it possible to establish a connection from an IPv6 host in the IPv6 ocean to an IPv4 host in an IPv4 island. TCP relay translator for category A is supported. This is called "FAITH". We also provide IP header translator for category A. (The latter is not yet put into FreeBSD 4.x yet.) FAITH TCP relay translator FAITH system uses TCP relay daemon called &man.faithd.8; helped by the kernel. FAITH will reserve an IPv6 address prefix, and relay TCP connection toward that prefix to IPv4 destination. 
For example, if the reserved IPv6 prefix is 3ffe:0501:0200:ffff::, and the IPv6 destination for TCP connection is 3ffe:0501:0200:ffff::163.221.202.12, the connection will be relayed toward IPv4 destination 163.221.202.12. destination IPv4 node (163.221.202.12) ^ | IPv4 tcp toward 163.221.202.12 FAITH-relay dual stack node ^ | IPv6 TCP toward 3ffe:0501:0200:ffff::163.221.202.12 source IPv6 node &man.faithd.8; must be invoked on FAITH-relay dual stack node. For more details, consult src/usr.sbin/faithd/README IPsec IPsec is mainly organized by three components. Policy Management Key Management AH and ESP handling Policy Management The kernel implements experimental policy management code. There are two way to manage security policy. One is to configure per-socket policy using &man.setsockopt.2;. In this cases, policy configuration is described in &man.ipsec.set.policy.3;. The other is to configure kernel packet filter-based policy using PF_KEY interface, via &man.setkey.8;. The policy entry is not re-ordered with its indexes, so the order of entry when you add is very significant. Key Management The key management code implemented in this kit (sys/netkey) is a home-brew PFKEY v2 implementation. This conforms to RFC2367. The home-brew IKE daemon, "racoon" is included in the kit (kame/kame/racoon). Basically you will need to run racoon as daemon, then set up a policy to require keys (like ping -P 'out ipsec esp/transport//use'). The kernel will contact racoon daemon as necessary to exchange keys. AH and ESP handling IPsec module is implemented as "hooks" to the standard IPv4/IPv6 processing. When sending a packet, ip{,6}_output() checks if ESP/AH processing is required by checking if a matching SPD (Security Policy Database) is found. If ESP/AH is needed, {esp,ah}{4,6}_output() will be called and mbuf will be updated accordingly. When a packet is received, {esp,ah}4_input() will be called based on protocol number, i.e. (*inetsw[proto])(). {esp,ah}4_input() will decrypt/check authenticity of the packet, and strips off daisy-chained header and padding for ESP/AH. It is safe to strip off the ESP/AH header on packet reception, since we will never use the received packet in "as is" form. By using ESP/AH, TCP4/6 effective data segment size will be affected by extra daisy-chained headers inserted by ESP/AH. Our code takes care of the case. Basic crypto functions can be found in directory "sys/crypto". ESP/AH transform are listed in {esp,ah}_core.c with wrapper functions. If you wish to add some algorithm, add wrapper function in {esp,ah}_core.c, and add your crypto algorithm code into sys/crypto. Tunnel mode is partially supported in this release, with the following restrictions: IPsec tunnel is not combined with GIF generic tunneling interface. It needs a great care because we may create an - infinite loop between ip_output() and tunnelifp->if_output(). + infinite loop between ip_output() and tunnelifp->if_output(). Opinion varies if it is better to unify them, or not. MTU and Don't Fragment bit (IPv4) considerations need more checking, but basically works fine. Authentication model for AH tunnel must be revisited. We will need to improve the policy management engine, eventually. 
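To make the policy management interfaces concrete, a transport mode ESP setup via &man.setkey.8; might look like the following sketch; the addresses, SPI and key are arbitrary examples:

&prompt.root; setkey -c <<EOF
add 10.0.1.1 10.0.2.2 esp 9876 -E 3des-cbc "hogehogehogehogehogehoge";
spdadd 10.0.1.1 10.0.2.2 any -P out ipsec esp/transport//require;
EOF

Remember that, as noted above, policy entries are evaluated in the order they were added.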
Conformance to RFCs and IDs The IPsec code in the kernel conforms (or, tries to conform) to the following standards: "old IPsec" specification documented in rfc182[5-9].txt "new IPsec" specification documented in rfc240[1-6].txt, rfc241[01].txt, rfc2451.txt and draft-mcdonald-simple-ipsec-api-01.txt (draft expired, but you can take from ftp://ftp.kame.net/pub/internet-drafts/). (NOTE: IKE specifications, rfc241[7-9].txt are implemented in userland, as "racoon" IKE daemon) Currently supported algorithms are: old IPsec AH null crypto checksum (no document, just for debugging) keyed MD5 with 128bit crypto checksum (rfc1828.txt) keyed SHA1 with 128bit crypto checksum (no document) HMAC MD5 with 128bit crypto checksum (rfc2085.txt) HMAC SHA1 with 128bit crypto checksum (no document) old IPsec ESP null encryption (no document, similar to rfc2410.txt) DES-CBC mode (rfc1829.txt) new IPsec AH null crypto checksum (no document, just for debugging) keyed MD5 with 96bit crypto checksum (no document) keyed SHA1 with 96bit crypto checksum (no document) HMAC MD5 with 96bit crypto checksum (rfc2403.txt) HMAC SHA1 with 96bit crypto checksum (rfc2404.txt) new IPsec ESP null encryption (rfc2410.txt) DES-CBC with derived IV (draft-ietf-ipsec-ciph-des-derived-01.txt, draft expired) DES-CBC with explicit IV (rfc2405.txt) 3DES-CBC with explicit IV (rfc2451.txt) BLOWFISH CBC (rfc2451.txt) CAST128 CBC (rfc2451.txt) RC5 CBC (rfc2451.txt) each of the above can be combined with: ESP authentication with HMAC-MD5(96bit) ESP authentication with HMAC-SHA1(96bit) The following algorithms are NOT supported: old IPsec AH HMAC MD5 with 128bit crypto checksum + 64bit replay prevention (rfc2085.txt) keyed SHA1 with 160bit crypto checksum + 32bit padding (rfc1852.txt) IPsec (in kernel) and IKE (in userland as "racoon") has been tested at several interoperability test events, and it is known to interoperate with many other implementations well. Also, current IPsec implementation as quite wide coverage for IPsec crypto algorithms documented in RFC (we cover algorithms without intellectual property issues only). ECN consideration on IPsec tunnels ECN-friendly IPsec tunnel is supported as described in draft-ipsec-ecn-00.txt. Normal IPsec tunnel is described in RFC2401. On encapsulation, IPv4 TOS field (or, IPv6 traffic class field) will be copied from inner IP header to outer IP header. On decapsulation outer IP header will be simply dropped. The decapsulation rule is not compatible with ECN, since ECN bit on the outer IP TOS/traffic class field will be lost. To make IPsec tunnel ECN-friendly, we should modify encapsulation and decapsulation procedure. This is described in http://www.aciri.org/floyd/papers/draft-ipsec-ecn-00.txt, chapter 3. IPsec tunnel implementation can give you three behaviors, by setting net.inet.ipsec.ecn (or net.inet6.ipsec6.ecn) to some value: RFC2401: no consideration for ECN (sysctl value -1) ECN forbidden (sysctl value 0) ECN allowed (sysctl value 1) Note that the behavior is configurable in per-node manner, not per-SA manner (draft-ipsec-ecn-00 wants per-SA configuration, but it looks too much for me). The behavior is summarized as follows (see source code for more detail): encapsulate decapsulate --- --- RFC2401 copy all TOS bits drop TOS bits on outer from inner to outer. (use inner TOS bits as is) ECN forbidden copy TOS bits except for ECN drop TOS bits on outer (masked with 0xfc) from inner (use inner TOS bits as is) to outer. set ECN bits to 0. 
ECN allowed copy TOS bits except for ECN use inner TOS bits with some CE (masked with 0xfe) from change. if outer ECN CE bit inner to outer. is 1, enable ECN CE bit on set ECN CE bit to 0. the inner. General strategy for configuration is as follows: if both IPsec tunnel endpoint are capable of ECN-friendly behavior, you should better configure both end to ECN allowed (sysctl value 1). if the other end is very strict about TOS bit, use "RFC2401" (sysctl value -1). in other cases, use "ECN forbidden" (sysctl value 0). The default behavior is "ECN forbidden" (sysctl value 0). For more information, please refer to: http://www.aciri.org/floyd/papers/draft-ipsec-ecn-00.txt, RFC2481 (Explicit Congestion Notification), src/sys/netinet6/{ah,esp}_input.c (Thanks goes to Kenjiro Cho kjc@csl.sony.co.jp for detailed analysis) Interoperability Here are (some of) platforms that KAME code have tested IPsec/IKE interoperability in the past. Note that both ends may have modified their implementation, so use the following list just for reference purposes. Altiga, Ashley-laurent (vpcom.com), Data Fellows (F-Secure), Ericsson ACC, FreeS/WAN, HITACHI, IBM &aix;, IIJ, Intel, µsoft; &windowsnt;, NIST (linux IPsec + plutoplus), Netscreen, OpenBSD, RedCreek, Routerware, SSH, Secure Computing, Soliton, Toshiba, VPNet, Yamaha RT100i diff --git a/en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml b/en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml index 8bf0047a45..85851742d2 100644 --- a/en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml +++ b/en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml @@ -1,836 +1,836 @@ Paul Richards Contributed by Jörg Wunsch Kernel Debugging Obtaining a Kernel Crash Dump When running a development kernel (eg: &os.current;), such as a kernel under extreme conditions (eg: very high load averages, tens of thousands of connections, exceedingly high number of concurrent users, hundreds of &man.jail.8;s, etc.), or using a new feature or device driver on &os.stable; (eg: PAE), sometimes a kernel will panic. In the event that it does, this chapter will demonstrate how to extract useful information out of a crash. A system reboot is inevitable once a kernel panics. Once a system is rebooted, the contents of a system's physical memory (RAM) is lost, as well as any bits that are on the swap device before the panic. To preserve the bits in physical memory, the kernel makes use of the swap device as a temporary place to store the bits that are in RAM across a reboot after a crash. In doing this, when &os; boots after a crash, a kernel image can now be extracted and debugging can take place. A swap device that has been configured as a dump device still acts as a swap device. Dumps to non-swap devices (such as tapes or CDRWs, for example) are not supported at this time. A swap device is synonymous with a swap partition. To be able to extract a usable core, it is required that at least one swap partition be large enough to hold all of the bits in physical memory. When a kernel panics, before the system reboots, the kernel is smart enough to check to see if a swap device has been configured as a dump device. If there is a valid dump device, the kernel dumps the contents of what is in physical memory to the swap device. Configuring the Dump Device Before the kernel will dump the contents of its physical memory to a dump device, a dump device must be configured. 
A dump device is specified by using the &man.dumpon.8; command to tell the kernel where to save kernel crash dumps. The &man.dumpon.8; program must be called after the swap partition has been configured with &man.swapon.8;. This is normally handled by setting the dumpdev variable in &man.rc.conf.5; to the path of the swap device (the recommended way to extract a kernel dump). Alternatively, the dump device can be hard-coded via the dump clause in the &man.config.5; line of a kernel configuration file. This approach is deprecated and should be used only if a kernel is crashing before &man.dumpon.8; can be executed. Check /etc/fstab or &man.swapinfo.8; for a list of swap devices. Make sure the dumpdir specified in &man.rc.conf.5; exists before a kernel crash! &prompt.root; mkdir /var/crash &prompt.root; chmod 700 /var/crash Also, remember that the contents of /var/crash is sensitive and very likely contains confidential information such as passwords. Extracting a Kernel Dump Once a dump has been written to a dump device, the dump must be extracted before the swap device is mounted. To extract a dump from a dump device, use the &man.savecore.8; program. If dumpdev has been set in &man.rc.conf.5;, &man.savecore.8; will be called automatically on the first multi-user boot after the crash and before the swap device is mounted. The location of the extracted core is placed in the &man.rc.conf.5; value dumpdir, by default /var/crash and will be named vmcore.0. In the event that there is already a file called vmcore.0 in /var/crash (or whatever dumpdir is set to), the kernel will increment the trailing number for every crash to avoid overwriting an existing vmcore (eg: vmcore.1). While debugging, it is highly likely that you will want to use the highest version vmcore in /var/crash when searching for the right vmcore. If you are testing a new kernel but need to boot a different one in order to get your system up and running again, boot it only into single user mode using the flag at the boot prompt, and then perform the following steps: &prompt.root; fsck -p &prompt.root; mount -a -t ufs # make sure /var/crash is writable &prompt.root; savecore /var/crash /dev/ad0s1b &prompt.root; exit # exit to multi-user This instructs &man.savecore.8; to extract a kernel dump from /dev/ad0s1b and place the contents in /var/crash. Do not forget to make sure the destination directory /var/crash has enough space for the dump. Also, do not forget to specify the correct path to your swap device as it is likely different than /dev/ad0s1b! The recommended, and certainly the easiest way to automate obtaining crash dumps is to use the dumpdev variable in &man.rc.conf.5;. Debugging a Kernel Crash Dump with <command>kgdb</command> This section covers &man.kgdb.1; as found in &os; 5.3 and later. In previous versions, one must use gdb -k to read a core dump file. Once a dump has been obtained, getting useful information out of the dump is relatively easy for simple problems. Before launching into the internals of &man.kgdb.1; to debug the crash dump, locate the debug version of your kernel (normally called kernel.debug) and the path to the source files used to build your kernel (normally /usr/obj/usr/src/sys/KERNCONF, where KERNCONF is the ident specified in a kernel &man.config.5;). With those two pieces of info, let the debugging commence! 
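As a reference point before diving in, the &man.rc.conf.5; settings described earlier might look like this; the device name is only an example, so substitute your own swap device:

dumpdev="/dev/ad0s1b"   # dump device for kernel crash dumps
dumpdir="/var/crash"    # where savecore(8) will place the core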
To enter into the debugger and begin getting information from the dump, the following steps are required at a minimum: &prompt.root; cd /usr/obj/usr/src/sys/KERNCONF &prompt.root; kgdb kernel.debug /var/crash/vmcore.0 You can debug the crash dump using the kernel sources just like you can for any other program. This first dump is from a 5.2-BETA kernel and the crash comes from deep within the kernel. The output below has been modified to include line numbers on the left. This first trace inspects the instruction pointer and obtains a back trace. The address that is used on line 41 for the list command is the instruction pointer and can be found on line 17. Most developers will request having at least this information sent to them if you are unable to debug the problem yourself. If, however, you do solve the problem, make sure that your patch winds its way into the source tree via a problem report, mailing lists, or by being able to commit it! 1:&prompt.root; cd /usr/obj/usr/src/sys/KERNCONF 2:&prompt.root; kgdb kernel.debug /var/crash/vmcore.0 3:GNU gdb 5.2.1 (FreeBSD) 4:Copyright 2002 Free Software Foundation, Inc. 5:GDB is free software, covered by the GNU General Public License, and you are 6:welcome to change it and/or distribute copies of it under certain conditions. 7:Type "show copying" to see the conditions. 8:There is absolutely no warranty for GDB. Type "show warranty" for details. 9:This GDB was configured as "i386-undermydesk-freebsd"... 10:panic: page fault 11:panic messages: 12:--- 13:Fatal trap 12: page fault while in kernel mode 14:cpuid = 0; apic id = 00 15:fault virtual address = 0x300 16:fault code: = supervisor read, page not present 17:instruction pointer = 0x8:0xc0713860 18:stack pointer = 0x10:0xdc1d0b70 19:frame pointer = 0x10:0xdc1d0b7c 20:code segment = base 0x0, limit 0xfffff, type 0x1b 21: = DPL 0, pres 1, def32 1, gran 1 22:processor eflags = resume, IOPL = 0 23:current process = 14394 (uname) 24:trap number = 12 25:panic: page fault 26 cpuid = 0; 27:Stack backtrace: 28 29:syncing disks, buffers remaining... 2199 2199 panic: mi_switch: switch in a critical section 30:cpuid = 0; 31:Uptime: 2h43m19s 32:Dumping 255 MB 33: 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 34:--- 35:Reading symbols from /boot/kernel/snd_maestro3.ko...done. 36:Loaded symbols for /boot/kernel/snd_maestro3.ko 37:Reading symbols from /boot/kernel/snd_pcm.ko...done. 38:Loaded symbols for /boot/kernel/snd_pcm.ko 39:#0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240 40:240 dumping++; 41:(kgdb) list *0xc0713860 42:0xc0713860 is in lapic_ipi_wait (/usr/src/sys/i386/i386/local_apic.c:663). 
43:658 incr = 0; 44:659 delay = 1; 45:660 } else 46:661 incr = 1; 47:662 for (x = 0; x < delay; x += incr) { 48:663 if ((lapic->icr_lo & APIC_DELSTAT_MASK) == APIC_DELSTAT_IDLE) 49:664 return (1); 50:665 ia32_pause(); 51:666 } 52:667 return (0); 53:(kgdb) backtrace 54:#0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240 55:#1 0xc055fd9b in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:372 56:#2 0xc056019d in panic () at /usr/src/sys/kern/kern_shutdown.c:550 57:#3 0xc0567ef5 in mi_switch () at /usr/src/sys/kern/kern_synch.c:470 58:#4 0xc055fa87 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:312 59:#5 0xc056019d in panic () at /usr/src/sys/kern/kern_shutdown.c:550 60:#6 0xc0720c66 in trap_fatal (frame=0xdc1d0b30, eva=0) 61: at /usr/src/sys/i386/i386/trap.c:821 62:#7 0xc07202b3 in trap (frame= 63: {tf_fs = -1065484264, tf_es = -1065484272, tf_ds = -1065484272, tf_edi = 1, tf_esi = 0, tf_ebp = -602076292, tf_isp = -602076324, tf_ebx = 0, tf_edx = 0, tf_ecx = 1000000, tf_eax = 243, tf_trapno = 12, tf_err = 0, tf_eip = -1066321824, tf_cs = 8, tf_eflags = 65671, tf_esp = 243, tf_ss = 0}) 64: at /usr/src/sys/i386/i386/trap.c:250 65:#8 0xc070c9f8 in calltrap () at {standard input}:94 66:#9 0xc07139f3 in lapic_ipi_vectored (vector=0, dest=0) 67: at /usr/src/sys/i386/i386/local_apic.c:733 68:#10 0xc0718b23 in ipi_selected (cpus=1, ipi=1) 69: at /usr/src/sys/i386/i386/mp_machdep.c:1115 70:#11 0xc057473e in kseq_notify (ke=0xcc05e360, cpu=0) 71: at /usr/src/sys/kern/sched_ule.c:520 72:#12 0xc0575cad in sched_add (td=0xcbcf5c80) 73: at /usr/src/sys/kern/sched_ule.c:1366 74:#13 0xc05666c6 in setrunqueue (td=0xcc05e360) 75: at /usr/src/sys/kern/kern_switch.c:422 76:#14 0xc05752f4 in sched_wakeup (td=0xcbcf5c80) 77: at /usr/src/sys/kern/sched_ule.c:999 78:#15 0xc056816c in setrunnable (td=0xcbcf5c80) 79: at /usr/src/sys/kern/kern_synch.c:570 80:#16 0xc0567d53 in wakeup (ident=0xcbcf5c80) 81: at /usr/src/sys/kern/kern_synch.c:411 82:#17 0xc05490a8 in exit1 (td=0xcbcf5b40, rv=0) 83: at /usr/src/sys/kern/kern_exit.c:509 84:#18 0xc0548011 in sys_exit () at /usr/src/sys/kern/kern_exit.c:102 85:#19 0xc0720fd0 in syscall (frame= 86: {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 0, tf_esi = -1, tf_ebp = -1077940712, tf_isp = -602075788, tf_ebx = 672411944, tf_edx = 10, tf_ecx = 672411600, tf_eax = 1, tf_trapno = 12, tf_err = 2, tf_eip = 671899563, tf_cs = 31, tf_eflags = 642, tf_esp = -1077940740, tf_ss = 47}) 87: at /usr/src/sys/i386/i386/trap.c:1010 88:#20 0xc070ca4d in Xint0x80_syscall () at {standard input}:136 89:---Can't read userspace from dump, or kernel process--- 90:(kgdb) quit This next trace is an older dump from the FreeBSD 2 time frame, but is more involved and demonstrates more of the features of gdb. Long lines have been folded to improve readability, and the lines are numbered for reference. Despite this, it is a real-world error trace taken during the development of the pcvt console driver. 1:Script started on Fri Dec 30 23:15:22 1994 2:&prompt.root; cd /sys/compile/URIAH 3:&prompt.root; gdb -k kernel /var/crash/vmcore.1 4:Reading symbol data from /usr/src/sys/compile/URIAH/kernel ...done. 5:IdlePTD 1f3000 6:panic: because you said to! 7:current pcb at 1e3f70 8:Reading in symbols for ../../i386/i386/machdep.c...done. 
9:(kgdb) backtrace 10:#0 boot (arghowto=256) (../../i386/i386/machdep.c line 767) 11:#1 0xf0115159 in panic () 12:#2 0xf01955bd in diediedie () (../../i386/i386/machdep.c line 698) 13:#3 0xf010185e in db_fncall () 14:#4 0xf0101586 in db_command (-266509132, -266509516, -267381073) 15:#5 0xf0101711 in db_command_loop () 16:#6 0xf01040a0 in db_trap () 17:#7 0xf0192976 in kdb_trap (12, 0, -272630436, -266743723) 18:#8 0xf019d2eb in trap_fatal (...) 19:#9 0xf019ce60 in trap_pfault (...) 20:#10 0xf019cb2f in trap (...) 21:#11 0xf01932a1 in exception:calltrap () 22:#12 0xf0191503 in cnopen (...) 23:#13 0xf0132c34 in spec_open () 24:#14 0xf012d014 in vn_open () 25:#15 0xf012a183 in open () 26:#16 0xf019d4eb in syscall (...) 27:(kgdb) up 10 28:Reading in symbols for ../../i386/i386/trap.c...done. 29:#10 0xf019cb2f in trap (frame={tf_es = -260440048, tf_ds = 16, tf_\ 30:edi = 3072, tf_esi = -266445372, tf_ebp = -272630356, tf_isp = -27\ 31:2630396, tf_ebx = -266427884, tf_edx = 12, tf_ecx = -266427884, tf\ 32:_eax = 64772224, tf_trapno = 12, tf_err = -272695296, tf_eip = -26\ 33:6672343, tf_cs = -266469368, tf_eflags = 66066, tf_esp = 3072, tf_\ 34:ss = -266427884}) (../../i386/i386/trap.c line 283) 35:283 (void) trap_pfault(&frame, FALSE); 36:(kgdb) frame frame->tf_ebp frame->tf_eip 37:Reading in symbols for ../../i386/isa/pcvt/pcvt_drv.c...done. 38:#0 0xf01ae729 in pcopen (dev=3072, flag=3, mode=8192, p=(struct p\ 39:roc *) 0xf07c0c00) (../../i386/isa/pcvt/pcvt_drv.c line 403) 40:403 return ((*linesw[tp->t_line].l_open)(dev, tp)); 41:(kgdb) list 42:398 43:399 tp->t_state |= TS_CARR_ON; 44:400 tp->t_cflag |= CLOCAL; /* cannot be a modem (:-) */ 45:401 -46:402 #if PCVT_NETBSD || (PCVT_FREEBSD >= 200) +46:402 #if PCVT_NETBSD || (PCVT_FREEBSD >= 200) 47:403 return ((*linesw[tp->t_line].l_open)(dev, tp)); 48:404 #else 49:405 return ((*linesw[tp->t_line].l_open)(dev, tp, flag)); -50:406 #endif /* PCVT_NETBSD || (PCVT_FREEBSD >= 200) */ +50:406 #endif /* PCVT_NETBSD || (PCVT_FREEBSD >= 200) */ 51:407 } 52:(kgdb) print tp 53:Reading in symbols for ../../i386/i386/cons.c...done. 54:$1 = (struct tty *) 0x1bae 55:(kgdb) print tp->t_line 56:$2 = 1767990816 57:(kgdb) up 58:#1 0xf0191503 in cnopen (dev=0x00000000, flag=3, mode=8192, p=(st\ 59:ruct proc *) 0xf07c0c00) (../../i386/i386/cons.c line 126) 60: return ((*cdevsw[major(dev)].d_open)(dev, flag, mode, p)); 61:(kgdb) up 62:#2 0xf0132c34 in spec_open () 63:(kgdb) up 64:#3 0xf012d014 in vn_open () 65:(kgdb) up 66:#4 0xf012a183 in open () 67:(kgdb) up 68:#5 0xf019d4eb in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi =\ 69: 2158592, tf_esi = 0, tf_ebp = -272638436, tf_isp = -272629788, tf\ 70:_ebx = 7086, tf_edx = 1, tf_ecx = 0, tf_eax = 5, tf_trapno = 582, \ 71:tf_err = 582, tf_eip = 75749, tf_cs = 31, tf_eflags = 582, tf_esp \ 72:= -272638456, tf_ss = 39}) (../../i386/i386/trap.c line 673) 73:673 error = (*callp->sy_call)(p, args, rval); 74:(kgdb) up 75:Initial frame selected; you cannot go up. 76:(kgdb) quit Comments to the above script: line 6: This is a dump taken from within DDB (see below), hence the panic comment because you said to!, and a rather long stack trace; the initial reason for going into DDB has been a page fault trap though. line 20: This is the location of function trap() in the stack trace. line 36: Force usage of a new stack frame; this is no longer necessary. The stack frames are supposed to point to the right locations now, even in case of a trap. 
From looking at the code in source line 403, there is a high probability that either the pointer access for tp was messed up, or the array access was out of bounds. line 52: The pointer looks suspicious, but happens to be a valid address. line 56: However, it obviously points to garbage, so we have found our error! (For those unfamiliar with that particular piece of code: tp->t_line refers to the line discipline of the console device here, which must be a rather small integer number.) If your system is crashing regularly and you are running out of disk space, deleting old vmcore files in /var/crash could save a considerable amount of disk space! Debugging a Crash Dump with DDD Examining a kernel crash dump with a graphical debugger like ddd is also possible (you will need to install the devel/ddd port in order to use the ddd debugger). Add the option to the ddd command line you would use normally. For example; &prompt.root; ddd -k /var/crash/kernel.0 /var/crash/vmcore.0 You should then be able to go about looking at the crash dump using ddd's graphical interface. Post-Mortem Analysis of a Dump What do you do if a kernel dumped core but you did not expect it, and it is therefore not compiled using config -g? Not everything is lost here. Do not panic! Of course, you still need to enable crash dumps. See above for the options you have to specify in order to do this. Go to your kernel config directory (/usr/src/sys/arch/conf) and edit your configuration file. Uncomment (or add, if it does not exist) the following line: makeoptions DEBUG=-g #Build kernel with gdb(1) debug symbols Rebuild the kernel. Due to the time stamp change on the Makefile, some other object files will be rebuilt, for example trap.o. With a bit of luck, the added option will not change anything for the generated code, so you will finally get a new kernel with similar code to the faulting one but with some debugging symbols. You should at least verify the old and new sizes with the &man.size.1; command. If there is a mismatch, you probably need to give up here. Go and examine the dump as described above. The debugging symbols might be incomplete for some places, as can be seen in the stack trace in the example above where some functions are displayed without line numbers and argument lists. If you need more debugging symbols, remove the appropriate object files, recompile the kernel again and repeat the gdb session until you know enough. All this is not guaranteed to work, but it will do it fine in most cases. On-Line Kernel Debugging Using DDB While gdb as an off-line debugger provides a very high level of user interface, there are some things it cannot do. The most important ones being breakpointing and single-stepping kernel code. If you need to do low-level debugging on your kernel, there is an on-line debugger available called DDB. It allows setting of breakpoints, single-stepping kernel functions, examining and changing kernel variables, etc. However, it cannot access kernel source files, and only has access to the global and static symbols, not to the full debug information like gdb does. To configure your kernel to include DDB, add the option line options DDB to your config file, and rebuild. (See The FreeBSD Handbook for details on configuring the FreeBSD kernel). If you have an older version of the boot blocks, your debugger symbols might not be loaded at all. Update the boot blocks; the recent ones load the DDB symbols automatically. Once your DDB kernel is running, there are several ways to enter DDB. 
The first, and earliest way is to type the boot flag right at the boot prompt. The kernel will start up in debug mode and enter DDB prior to any device probing. Hence you can even debug the device probe/attach functions. The second scenario is to drop to the debugger once the system has booted. There are two simple ways to accomplish this. If you would like to break to the debugger from the command prompt, simply type the command: &prompt.root; sysctl debug.enter_debugger=ddb Alternatively, if you are at the system console, you may use a hot-key on the keyboard. The default break-to-debugger sequence is Ctrl AltESC. For syscons, this sequence can be remapped and some of the distributed maps out there do this, so check to make sure you know the right sequence to use. There is an option available for serial consoles that allows the use of a serial line BREAK on the console line to enter DDB (options BREAK_TO_DEBUGGER in the kernel config file). It is not the default since there are a lot of serial adapters around that gratuitously generate a BREAK condition, for example when pulling the cable. The third way is that any panic condition will branch to DDB if the kernel is configured to use it. For this reason, it is not wise to configure a kernel with DDB for a machine running unattended. The DDB commands roughly resemble some gdb commands. The first thing you probably need to do is to set a breakpoint: b function-name b address Numbers are taken hexadecimal by default, but to make them distinct from symbol names; hexadecimal numbers starting with the letters a-f need to be preceded with 0x (this is optional for other numbers). Simple expressions are allowed, for example: function-name + 0x103. To continue the operation of an interrupted kernel, simply type: c To get a stack trace, use: trace Note that when entering DDB via a hot-key, the kernel is currently servicing an interrupt, so the stack trace might be not of much use to you. If you want to remove a breakpoint, use del del address-expression The first form will be accepted immediately after a breakpoint hit, and deletes the current breakpoint. The second form can remove any breakpoint, but you need to specify the exact address; this can be obtained from: show b To single-step the kernel, try: s This will step into functions, but you can make DDB trace them until the matching return statement is reached by: n This is different from gdb's next statement; it is like gdb's finish. To examine data from memory, use (for example): x/wx 0xf0133fe0,40 x/hd db_symtab_space x/bc termbuf,10 x/s stringbuf for word/halfword/byte access, and hexadecimal/decimal/character/ string display. The number after the comma is the object count. To display the next 0x10 items, simply use: x ,10 Similarly, use x/ia foofunc,10 to disassemble the first 0x10 instructions of foofunc, and display them along with their offset from the beginning of foofunc. To modify memory, use the write command: w/b termbuf 0xa 0xb 0 w/w 0xf0010030 0 0 The command modifier (b/h/w) specifies the size of the data to be written, the first following expression is the address to write to and the remainder is interpreted as data to write to successive memory locations. If you need to know the current registers, use: show reg Alternatively, you can display a single register value by e.g. p $eax and modify it by: set $eax new-value Should you need to call some kernel functions from DDB, simply say: call func(arg1, arg2, ...) The return value will be printed. 
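Putting a few of these commands together, a short and purely illustrative session - the function name is made up, and the exact output will differ - could look like:

db> b mywrite
db> c
[the system runs until the breakpoint is hit]
Breakpoint at   mywrite
db> trace
db> del
db> c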
For a &man.ps.1; style summary of all running processes, use: ps Now you have examined why your kernel failed, and you wish to reboot. Remember that, depending on the severity of previous malfunctioning, not all parts of the kernel might still be working as expected. Perform one of the following actions to shut down and reboot your system: panic This will cause your kernel to dump core and reboot, so you can later analyze the core on a higher level with gdb. This command usually must be followed by another continue statement. call boot(0) Which might be a good way to cleanly shut down the running system, sync() all disks, and finally reboot. As long as the disk and filesystem interfaces of the kernel are not damaged, this might be a good way for an almost clean shutdown. call cpu_reset() This is the final way out of disaster and almost the same as hitting the Big Red Button. If you need a short command summary, simply type: help However, it is highly recommended to have a printed copy of the &man.ddb.4; manual page ready for a debugging session. Remember that it is hard to read the on-line manual while single-stepping the kernel. On-Line Kernel Debugging Using Remote GDB This feature has been supported since FreeBSD 2.2, and it is actually a very neat one. GDB has already supported remote debugging for a long time. This is done using a very simple protocol along a serial line. Unlike the other methods described above, you will need two machines for doing this. One is the host providing the debugging environment, including all the sources, and a copy of the kernel binary with all the symbols in it, and the other one is the target machine that simply runs a similar copy of the very same kernel (but stripped of the debugging information). You should configure the kernel in question with config -g, include into the configuration, and compile it as usual. This gives a large binary, due to the debugging information. Copy this kernel to the target machine, strip the debugging symbols off with strip -x, and boot it using the boot option. Connect the serial line of the target machine that has "flags 080" set on its sio device to any serial line of the debugging host. Now, on the debugging machine, go to the compile directory of the target kernel, and start gdb: &prompt.user; gdb -k kernel GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.16 (i386-unknown-freebsd), Copyright 1996 Free Software Foundation, Inc... (kgdb) Initialize the remote debugging session (assuming the first serial port is being used) by: (kgdb) target remote /dev/cuaa0 Now, on the target host (the one that entered DDB right before even starting the device probe), type: Debugger("Boot flags requested debugger") Stopped at Debugger+0x35: movb $0, edata+0x51bc db> gdb DDB will respond with: Next trap will enter GDB remote protocol mode Every time you type gdb, the mode will be toggled between remote GDB and local DDB. In order to force a next trap immediately, simply type s (step). 
Your hosting GDB will now gain control over the target kernel:

Remote debugging using /dev/cuaa0
Debugger (msg=0xf01b0383 "Boot flags requested debugger") at ../../i386/i386/db_interface.c:257
(kgdb)

You can use this session almost like any other GDB session, including full access to the source, running it in gud-mode inside an Emacs window (which gives you an automatic source code display in another Emacs window), etc.

Debugging Loadable Modules Using GDB

When debugging a panic that occurred within a module, or using remote GDB against a machine that uses dynamic modules, you need to tell GDB how to obtain symbol information for those modules. First, you need to build the module(s) with debugging information:

&prompt.root; cd /sys/modules/linux
&prompt.root; make clean; make COPTS=-g

If you are using remote GDB, you can run kldstat on the target machine to find out where the module was loaded:

&prompt.root; kldstat
Id Refs Address    Size     Name
 1    4 0xc0100000 1c1678   kernel
 2    1 0xc0a9e000 6000     linprocfs.ko
 3    1 0xc0ad7000 2000     warp_saver.ko
 4    1 0xc0adc000 11000    linux.ko

If you are debugging a crash dump, you will need to walk the linker_files list, starting at linker_files->tqh_first and following the link.tqe_next pointers until you find the entry with the filename you are looking for. The address member of that entry is the load address of the module. Next, you need to find out the offset of the text section within the module:

&prompt.root; objdump --section-headers /sys/modules/linux/linux.ko | grep text
 3 .rel.text 000016e0 000038e0 000038e0 000038e0 2**2
10 .text     00007f34 000062d0 000062d0 000062d0 2**2

The one you want is the .text section, section 10 in the above example. The fourth hexadecimal field (sixth field overall) is the offset of the text section within the file. Add this offset to the load address of the module to obtain the relocation address for the module's code. In our example, we get 0xc0adc000 + 0x62d0 = 0xc0ae22d0. Use the add-symbol-file command in GDB to tell the debugger about the module:

(kgdb) add-symbol-file /sys/modules/linux/linux.ko 0xc0ae22d0
add symbol table from file "/sys/modules/linux/linux.ko" at text_addr = 0xc0ae22d0? (y or n) y
Reading symbols from /sys/modules/linux/linux.ko...done.
(kgdb)

You should now have access to all the symbols in the module.

Debugging a Console Driver

Since you need a console driver to run DDB on, things are more complicated if the console driver itself is failing. You might remember the use of a serial console (either with modified boot blocks, or by specifying at the Boot: prompt), and hook up a standard terminal to your first serial port. DDB works on any configured console driver, including a serial console.

diff --git a/en_US.ISO8859-1/books/developers-handbook/secure/chapter.sgml b/en_US.ISO8859-1/books/developers-handbook/secure/chapter.sgml index 94c2e8ef43..2a6fddb297 100644 --- a/en_US.ISO8859-1/books/developers-handbook/secure/chapter.sgml +++ b/en_US.ISO8859-1/books/developers-handbook/secure/chapter.sgml @@ -1,525 +1,525 @@ Murray Stokely Contributed by

Secure Programming

Synopsis

This chapter describes some of the security issues that have plagued &unix; programmers for decades and some of the new tools available to help programmers avoid writing exploitable code.

Secure Design Methodology

Writing secure applications requires a very scrutinizing and pessimistic outlook on life.
Applications should be run with the principle of least privilege so that no process is ever running with more than the bare minimum access that it needs to accomplish its function. Previously tested code should be reused whenever possible to avoid common mistakes that others may have already fixed. One of the pitfalls of the &unix; environment is how easy it is to make assumptions about the sanity of the environment. Applications should never trust user input (in all its forms), system resources, inter-process communication, or the timing of events. &unix; processes do not execute synchronously, so logical operations are rarely atomic.

Buffer Overflows

Buffer overflows have been around since the very beginnings of the von Neumann architecture. They first gained widespread notoriety in 1988 with the Morris Internet worm. Unfortunately, the same basic attack remains effective today. Of the 17 CERT security advisories of 1999, 10 of them were directly caused by buffer-overflow software bugs. By far the most common type of buffer overflow attack is based on corrupting the stack.

Most modern computer systems use a stack to pass arguments to procedures and to store local variables. A stack is a last in, first out (LIFO) buffer in the high memory area of a process image. When a program invokes a function, a new "stack frame" is created. This stack frame consists of the arguments passed to the function as well as a dynamic amount of local variable space. The "stack pointer" is a register that holds the current location of the top of the stack. Since this value is constantly changing as new values are pushed onto the top of the stack, many implementations also provide a "frame pointer" that is located near the beginning of a stack frame so that local variables can more easily be addressed relative to this value. The return address for function calls is also stored on the stack, and this is the cause of stack-overflow exploits, since overflowing a local variable in a function can overwrite the return address of that function, potentially allowing a malicious user to execute any code he or she wants. Although stack-based attacks are by far the most common, it would also be possible to overrun the stack with a heap-based (malloc/free) attack.

The C programming language does not perform automatic bounds checking on arrays or pointers as many other languages do. In addition, the standard C library is filled with a handful of very dangerous functions:

strcpy(char *dest, const char *src)  May overflow the dest buffer
strcat(char *dest, const char *src)  May overflow the dest buffer
getwd(char *buf)  May overflow the buf buffer
gets(char *s)  May overflow the s buffer
[vf]scanf(const char *format, ...)  May overflow its arguments
realpath(char *path, char resolved_path[])  May overflow the path buffer
[v]sprintf(char *str, const char *format, ...)  May overflow the str buffer
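In contrast, the standard library also offers routines that keep writes within bounds. The following minimal sketch (an illustration, not one of this chapter's examples) reads and copies input using fgets, which never writes more than the given size, and snprintf, which truncates instead of overflowing:

#include <stdio.h>

int main(void) {
	char input[64];
	char copy[16];

	/* fgets() writes at most sizeof input bytes, including
	 * the terminating NUL, no matter how long the line is. */
	if (fgets(input, sizeof input, stdin) == NULL)
		return 1;

	/* snprintf() truncates rather than overflowing, and it
	 * always NUL terminates the destination. */
	snprintf(copy, sizeof copy, "%s", input);

	printf("stored: %s\n", copy);
	return 0;
}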
Example Buffer Overflow

The following example code contains a buffer overflow designed to overwrite the return address and skip the instruction immediately following the function call. (Inspired by )

#include <stdio.h>
#include <string.h>

void manipulate(char *buffer) {
	char newbuffer[80];
	strcpy(newbuffer, buffer);
}

int main() {
	char ch, buffer[4096];
	int i = 0;

	while ((buffer[i++] = getchar()) != '\n') {};

	i = 1;
	manipulate(buffer);
	i = 2;
	printf("The value of i is : %d\n", i);
	return 0;
}

Let us examine what the memory image of this process would look like if we were to input 160 spaces into our little program before hitting return. [XXX figure here!] Obviously more malicious input can be devised to execute actual compiled instructions (such as exec(/bin/sh)).

Avoiding Buffer Overflows

The most straightforward solution to the problem of stack-overflows is to always use length-restricted memory and string copy functions. strncpy and strncat are part of the standard C library. These functions accept a length value as a parameter which should be no larger than the size of the destination buffer. These functions will then copy up to `length' bytes from the source to the destination. However, there are a number of problems with these functions. Neither function guarantees NUL termination if the size of the input buffer is as large as the destination. The length parameter is also used inconsistently between strncpy and strncat, so it is easy for programmers to get confused as to their proper usage. There is also a significant performance loss compared to strcpy when copying a short string into a large buffer, since strncpy NUL fills up to the size specified.

In OpenBSD, another memory copy implementation has been created to get around these problems. The strlcpy and strlcat functions guarantee that they will always null terminate the destination string when given a non-zero length argument. For more information about these functions see . The OpenBSD strlcpy and strlcat functions have been in FreeBSD since 3.3.

Compiler based run-time bounds checking

Unfortunately there is still a very large assortment of code in public use which blindly copies memory around without using any of the bounded copy routines we just discussed. Fortunately, there is another solution. Several compiler add-ons and libraries exist to do run-time bounds checking in C/C++. StackGuard is one such add-on that is implemented as a small patch to the gcc code generator. From the StackGuard website:
"StackGuard detects and defeats stack smashing attacks by protecting the return address on the stack from being altered. StackGuard places a "canary" word next to the return address when a function is called. If the canary word has been altered when the function returns, then a stack smashing attack has been attempted, and the program responds by emitting an intruder alert into syslog, and then halts."
"StackGuard is implemented as a small patch to the gcc code generator, specifically the function_prolog() and function_epilog() routines. function_prolog() has been enhanced to lay down canaries on the stack when functions start, and function_epilog() checks canary integrity when the function exits. Any attempt at corrupting the return address is thus detected before the function returns."
Recompiling your application with StackGuard is an effective means of stopping most buffer-overflow attacks, but it can still be compromised.
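To make the canary idea from the quotes above concrete, here is a hand-written sketch of the kind of check StackGuard emits automatically. Treat it only as an illustration of the principle: the actual placement of locals is compiler dependent, a real implementation uses an unpredictable canary value, and StackGuard protects the return address itself rather than a local variable:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CANARY 0xdeadbeefUL	/* a real canary is chosen at run time */

void vulnerable(const char *input) {
	unsigned long canary = CANARY;	/* hand-placed canary word */
	char buffer[16];

	strcpy(buffer, input);	/* an overflow here may clobber the canary */

	if (canary != CANARY) {	/* integrity check before returning */
		fprintf(stderr, "stack smashing detected\n");
		abort();
	}
}

int main(int argc, char *argv[]) {
	vulnerable(argc > 1 ? argv[1] : "ok");
	return 0;
}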
Library based run-time bounds checking

Compiler-based mechanisms are completely useless for binary-only software which you cannot recompile. For these situations there are a number of libraries, such as libsafe, libverify, and libparanoia, which re-implement the unsafe functions of the C library (strcpy, fscanf, getwd, etc.) and ensure that these functions can never write past the stack pointer. Unfortunately these library-based defenses have a number of shortcomings. These libraries only protect against a very small set of security related issues, and they neglect to fix the actual problem. These defenses may fail if the application was compiled with -fomit-frame-pointer. Also, the LD_PRELOAD and LD_LIBRARY_PATH environment variables can be overwritten/unset by the user.
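The interposition mechanism such libraries rely on can be sketched as follows. This hypothetical wrapper merely logs and forwards strcpy calls; the real checking libraries add bounds tests derived from the frame pointer. Build details vary by system:

/*
 * strcpy_wrap.c - a sketch of library interposition.
 * Build: cc -shared -fPIC strcpy_wrap.c -o strcpy_wrap.so
 * Use:   env LD_PRELOAD=./strcpy_wrap.so program
 */
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

char *strcpy(char *dest, const char *src) {
	static char *(*real_strcpy)(char *, const char *);

	if (real_strcpy == NULL)
		real_strcpy = (char *(*)(char *, const char *))
		    dlsym(RTLD_NEXT, "strcpy");

	/* A checking library would verify here that the copy
	 * cannot run past the bounds of dest's stack frame. */
	fprintf(stderr, "strcpy: copying %lu bytes\n",
	    (unsigned long)strlen(src) + 1);

	return real_strcpy(dest, src);
}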
SetUID issues

There are at least 6 different IDs associated with any given process. Because of this you have to be very careful with the access that your process has at any given time. In particular, all seteuid applications should give up their privileges as soon as they are no longer required.

The real user ID can only be changed by a superuser process. The login program sets this when a user initially logs in and it is seldom changed. The effective user ID is set by the exec() functions if a program has its seteuid bit set. An application can call seteuid() at any time to set the effective user ID to either the real user ID or the saved set-user-ID. When the effective user ID is set by the exec() functions, the previous value is saved in the saved set-user-ID.

Limiting your program's environment

The traditional method of restricting a process is with the chroot() system call. This system call changes the root directory from which all other paths are referenced for a process and any child processes. For this call to succeed the process must have execute (search) permission on the directory being referenced. The new environment does not actually take effect until you chdir() into it. It should also be noted that a process can easily break out of a chroot environment if it has root privilege. This could be accomplished by creating device nodes to read kernel memory, attaching a debugger to a process outside of the jail, or in many other creative ways.

The behavior of the chroot() system call can be controlled somewhat with the kern.chroot_allow_open_directories sysctl variable. When this value is set to 0, chroot() will fail with EPERM if there are any directories open. If set to the default value of 1, then chroot() will fail with EPERM if there are any directories open and the process is already subject to a chroot() call. For any other value, the check for open directories will be bypassed completely.

FreeBSD's jail functionality

The concept of a Jail extends upon chroot() by limiting the powers of the superuser to create a true `virtual server'. Once a prison is set up, all network communication must take place through the specified IP address, and the power of "root privilege" in this jail is severely constrained. While in a prison, any tests of superuser power within the kernel using the suser() call will fail. However, some calls to suser() have been changed to a new interface, suser_xxx(). This function is responsible for recognizing or denying access to superuser power for imprisoned processes. A superuser process within a jailed environment has the power to:

Manipulate credentials with setuid, seteuid, setgid, setegid, setgroups, setreuid, setregid, setlogin
Set resource limits with setrlimit
Modify some sysctl nodes (kern.hostname)
chroot()
Set flags on a vnode: chflags, fchflags
Set attributes of a vnode such as file permission, owner, group, size, access time, and modification time
Bind to privileged ports in the Internet domain (ports < 1024)

Jail is a very useful tool for running applications in a secure environment, but it does have some shortcomings. Currently, the IPC mechanisms have not been converted to the suser_xxx interface, so applications such as MySQL cannot be run within a jail. Superuser access may have a very limited meaning within a jail, but there is no way to specify exactly what "very limited" means.
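Putting the two classic confinement steps from this section together, a privilege-limited process might start up along these lines. This is only a sketch; /var/empty is a placeholder directory, and the program is assumed to have been started with root privilege:

#include <stdio.h>
#include <unistd.h>

int main(void) {
	/* Confine the process while we still hold the privilege
	 * needed to call chroot(). */
	if (chroot("/var/empty") < 0) {
		perror("chroot");
		return 1;
	}

	/* The new root does not take effect until we chdir() into it. */
	if (chdir("/") < 0) {
		perror("chdir");
		return 1;
	}

	/* Give up superuser privileges as soon as possible. */
	if (setuid(getuid()) < 0) {
		perror("setuid");
		return 1;
	}

	/* Real work, now unprivileged and confined, starts here. */
	return 0;
}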
&posix;.1e Process Capabilities

&posix; has released a working draft that adds event auditing, access control lists, fine grained privileges, information labeling, and mandatory access control. This is a work in progress and is the focus of the TrustedBSD project. Some of the initial work has been committed to &os.current; (cap_set_proc(3)).

Trust

An application should never assume that anything about the user's environment is sane. This includes (but is certainly not limited to): user input, signals, environment variables, resources, IPC, mmaps, the filesystem working directory, file descriptors, the number of open files, etc.

You should never assume that you can catch all forms of invalid input that a user might supply. Instead, your application should use positive filtering to only allow a specific subset of inputs that you deem safe. Improper data validation has been the cause of many exploits, especially with CGI scripts on the world wide web. For filenames you need to be extra careful about paths ("../", "/"), symbolic links, and shell escape characters.

Perl has a really cool feature called "Taint" mode which can be used to prevent scripts from using data derived outside the program in an unsafe way. This mode will check command line arguments, environment variables, locale information, the results of certain syscalls (readdir(), readlink(), getpwxxx()), and all file input.

Race Conditions

A race condition is anomalous behavior caused by the unexpected dependence on the relative timing of events. In other words, a programmer incorrectly assumed that a particular event would always happen before another. Some of the common causes of race conditions are signals, access checks, and file opens. Signals are asynchronous events by nature, so special care must be taken in dealing with them. Checking access with access(2) and then opening the file with open(2) is clearly non-atomic: users can move files in between the two calls. Instead, privileged applications should seteuid() and then call open() directly. Along the same lines, an application should always set a proper umask before open() to obviate the need for spurious chmod() calls.
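A minimal sketch of that last piece of advice might look like this. The helper name is made up for illustration, and the seteuid(0) at the end assumes the process started with superuser privilege:

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open a file on the user's behalf without the racy
 * access()/open() pair. */
int open_as_user(const char *path, uid_t user) {
	int fd;

	if (seteuid(user) < 0)	/* let open() do the permission check */
		return -1;

	umask(077);		/* restrictive mode from the start */
	fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

	if (seteuid(0) < 0) {	/* regain privilege; fd stays valid */
		if (fd >= 0)
			close(fd);
		return -1;
	}

	return fd;
}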
diff --git a/en_US.ISO8859-1/books/developers-handbook/sockets/chapter.sgml b/en_US.ISO8859-1/books/developers-handbook/sockets/chapter.sgml index cb392c23d4..d4d4d6556d 100644 --- a/en_US.ISO8859-1/books/developers-handbook/sockets/chapter.sgml +++ b/en_US.ISO8859-1/books/developers-handbook/sockets/chapter.sgml @@ -1,1790 +1,1790 @@ G. Adam Stanislav Contributed by Sockets Synopsis BSD sockets take interprocess communications to a new level. It is no longer necessary for the communicating processes to run on the same machine. They still can, but they do not have to. Not only do these processes not have to run on the same machine, they do not have to run under the same operating system. Thanks to BSD sockets, your FreeBSD software can smoothly cooperate with a program running on a &macintosh;, another one running on a &sun; workstation, yet another one running under &windows; 2000, all connected with an Ethernet-based local area network. But your software can equally well cooperate with processes running in another building, or on another continent, inside a submarine, or a space shuttle. It can also cooperate with processes that are not part of a computer (at least not in the strict sense of the word), but of such devices as printers, digital cameras, medical equipment. Just about anything capable of digital communications. Networking and Diversity We have already hinted on the diversity of networking. Many different systems have to talk to each other. And they have to speak the same language. They also have to understand the same language the same way. People often think that body language is universal. But it is not. Back in my early teens, my father took me to Bulgaria. We were sitting at a table in a park in Sofia, when a vendor approached us trying to sell us some roasted almonds. I had not learned much Bulgarian by then, so, instead of saying no, I shook my head from side to side, the universal body language for no. The vendor quickly started serving us some almonds. I then remembered I had been told that in Bulgaria shaking your head sideways meant yes. Quickly, I started nodding my head up and down. The vendor noticed, took his almonds, and walked away. To an uninformed observer, I did not change the body language: I continued using the language of shaking and nodding my head. What changed was the meaning of the body language. At first, the vendor and I interpreted the same language as having completely different meaning. I had to adjust my own interpretation of that language so the vendor would understand. It is the same with computers: The same symbols may have different, even outright opposite meaning. Therefore, for two computers to understand each other, they must not only agree on the same language, but on the same interpretation of the language. Protocols While various programming languages tend to have complex syntax and use a number of multi-letter reserved words (which makes them easy for the human programmer to understand), the languages of data communications tend to be very terse. Instead of multi-byte words, they often use individual bits. There is a very convincing reason for it: While data travels inside your computer at speeds approaching the speed of light, it often travels considerably slower between two computers. Because the languages used in data communications are so terse, we usually refer to them as protocols rather than languages. As data travels from one computer to another, it always uses more than one protocol. These protocols are layered. 
The data can be compared to the inside of an onion: You have to peel off several layers of skin to get to the data. This is best illustrated with a picture: +----------------+ | Ethernet | |+--------------+| || IP || ||+------------+|| ||| TCP ||| |||+----------+||| |||| HTTP |||| ||||+--------+|||| ||||| PNG ||||| |||||+------+||||| |||||| Data |||||| |||||+------+||||| ||||+--------+|||| |||+----------+||| ||+------------+|| |+--------------+| +----------------+ Protocol Layers In this example, we are trying to get an image from a web page we are connected to via an Ethernet. The image consists of raw data, which is simply a sequence of RGB values that our software can process, i.e., convert into an image and display on our monitor. Alas, our software has no way of knowing how the raw data is organized: Is it a sequence of RGB values, or a sequence of grayscale intensities, or perhaps of CMYK encoded colors? Is the data represented by 8-bit quanta, or are they 16 bits in size, or perhaps 4 bits? How many rows and columns does the image consist of? Should certain pixels be transparent? I think you get the picture... To inform our software how to handle the raw data, it is encoded as a PNG file. It could be a GIF, or a JPEG, but it is a PNG. And PNG is a protocol. At this point, I can hear some of you yelling, No, it is not! It is a file format! Well, of course it is a file format. But from the perspective of data communications, a file format is a protocol: The file structure is a language, a terse one at that, communicating to our process how the data is organized. Ergo, it is a protocol. Alas, if all we received was the PNG file, our software would be facing a serious problem: How is it supposed to know the data is representing an image, as opposed to some text, or perhaps a sound, or what not? Secondly, how is it supposed to know the image is in the PNG format as opposed to GIF, or JPEG, or some other image format? To obtain that information, we are using another protocol: HTTP. This protocol can tell us exactly that the data represents an image, and that it uses the PNG protocol. It can also tell us some other things, but let us stay focused on protocol layers here. So, now we have some data wrapped in the PNG protocol, wrapped in the HTTP protocol. How did we get it from the server? By using TCP/IP over Ethernet, that is how. Indeed, that is three more protocols. Instead of continuing inside out, I am now going to talk about Ethernet, simply because it is easier to explain the rest that way. Ethernet is an interesting system of connecting computers in a local area network (LAN). Each computer has a network interface card (NIC), which has a unique 48-bit ID called its address. No two Ethernet NICs in the world have the same address. These NICs are all connected with each other. Whenever one computer wants to communicate with another in the same Ethernet LAN, it sends a message over the network. Every NIC sees the message. But as part of the Ethernet protocol, the data contains the address of the destination NIC (among other things). So, only one of all the network interface cards will pay attention to it, the rest will ignore it. But not all computers are connected to the same network. Just because we have received the data over our Ethernet does not mean it originated in our own local area network. It could have come to us from some other network (which may not even be Ethernet based) connected with our own network via the Internet. 
All data is transferred over the Internet using IP, which stands for Internet Protocol. Its basic role is to let us know where in the world the data has arrived from, and where it is supposed to go to. It does not guarantee we will receive the data, only that we will know where it came from if we do receive it. Even if we do receive the data, IP does not guarantee we will receive various chunks of data in the same order the other computer has sent it to us. So, we can receive the center of our image before we receive the upper left corner and after the lower right, for example. It is TCP (Transmission Control Protocol) that asks the sender to resend any lost data and that places it all into the proper order. All in all, it took five different protocols for one computer to communicate to another what an image looks like. We received the data wrapped into the PNG protocol, which was wrapped into the HTTP protocol, which was wrapped into the TCP protocol, which was wrapped into the IP protocol, which was wrapped into the Ethernet protocol. Oh, and by the way, there probably were several other protocols involved somewhere on the way. For example, if our LAN was connected to the Internet through a dial-up call, it used the PPP protocol over the modem which used one (or several) of the various modem protocols, et cetera, et cetera, et cetera... As a developer you should be asking by now, How am I supposed to handle it all? Luckily for you, you are not supposed to handle it all. You are supposed to handle some of it, but not all of it. Specifically, you need not worry about the physical connection (in our case Ethernet and possibly PPP, etc). Nor do you need to handle the Internet Protocol, or the Transmission Control Protocol. In other words, you do not have to do anything to receive the data from the other computer. Well, you do have to ask for it, but that is almost as simple as opening a file. Once you have received the data, it is up to you to figure out what to do with it. In our case, you would need to understand the HTTP protocol and the PNG file structure. To use an analogy, all the internetworking protocols become a gray area: Not so much because we do not understand how it works, but because we are no longer concerned about it. The sockets interface takes care of this gray area for us: +----------------+ |xxxxEthernetxxxx| |+--------------+| ||xxxxxxIPxxxxxx|| ||+------------+|| |||xxxxxTCPxxxx||| |||+----------+||| |||| HTTP |||| ||||+--------+|||| ||||| PNG ||||| |||||+------+||||| |||||| Data |||||| |||||+------+||||| ||||+--------+|||| |||+----------+||| ||+------------+|| |+--------------+| +----------------+ Sockets Covered Protocol Layers We only need to understand any protocols that tell us how to interpret the data, not how to receive it from another process, nor how to send it to another process. The Sockets Model BSD sockets are built on the basic &unix; model: Everything is a file. In our example, then, sockets would let us receive an HTTP file, so to speak. It would then be up to us to extract the PNG file from it. Because of the complexity of internetworking, we cannot just use the open system call, or the open() C function. Instead, we need to take several steps to opening a socket. Once we do, however, we can start treating the socket the same way we treat any file descriptor: We can read from it, write to it, pipe it, and, eventually, close it. Essential Socket Functions While FreeBSD offers different functions to work with sockets, we only need four to open a socket. 
And in some cases we only need two. The Client-Server Difference Typically, one of the ends of a socket-based data communication is a server, the other is a client. The Common Elements <function>socket</function> The one function used by both, clients and servers, is &man.socket.2;. It is declared this way: int socket(int domain, int type, int protocol); The return value is of the same type as that of open, an integer. FreeBSD allocates its value from the same pool as that of file handles. That is what allows sockets to be treated the same way as files. The domain argument tells the system what protocol family you want it to use. Many of them exist, some are vendor specific, others are very common. They are declared in sys/socket.h. Use PF_INET for UDP, TCP and other Internet protocols (IPv4). Five values are defined for the type argument, again, in sys/socket.h. All of them start with SOCK_. The most common one is SOCK_STREAM, which tells the system you are asking for a reliable stream delivery service (which is TCP when used with PF_INET). If you asked for SOCK_DGRAM, you would be requesting a connectionless datagram delivery service (in our case, UDP). If you wanted to be in charge of the low-level protocols (such as IP), or even network interfaces (e.g., the Ethernet), you would need to specify SOCK_RAW. Finally, the protocol argument depends on the previous two arguments, and is not always meaningful. In that case, use 0 for its value. The Unconnected Socket Nowhere, in the socket function have we specified to what other system we should be connected. Our newly created socket remains unconnected. This is on purpose: To use a telephone analogy, we have just attached a modem to the phone line. We have neither told the modem to make a call, nor to answer if the phone rings. <varname>sockaddr</varname> Various functions of the sockets family expect the address of (or pointer to, to use C terminology) a small area of the memory. The various C declarations in the sys/socket.h refer to it as struct sockaddr. This structure is declared in the same file: /* * Structure used by kernel to store most * addresses. */ struct sockaddr { unsigned char sa_len; /* total length */ sa_family_t sa_family; /* address family */ char sa_data[14]; /* actually longer; address value */ }; #define SOCK_MAXADDRLEN 255 /* longest possible addresses */ Please note the vagueness with which the sa_data field is declared, just as an array of 14 bytes, with the comment hinting there can be more than 14 of them. This vagueness is quite deliberate. Sockets is a very powerful interface. While most people perhaps think of it as nothing more than the Internet interface—and most applications probably use it for that nowadays—sockets can be used for just about any kind of interprocess communications, of which the Internet (or, more precisely, IP) is only one. The sys/socket.h refers to the various types of protocols sockets will handle as address families, and lists them right before the definition of sockaddr: /* * Address families. */ #define AF_UNSPEC 0 /* unspecified */ #define AF_LOCAL 1 /* local to host (pipes, portals) */ #define AF_UNIX AF_LOCAL /* backward compatibility */ #define AF_INET 2 /* internetwork: UDP, TCP, etc. */ #define AF_IMPLINK 3 /* arpanet imp addresses */ #define AF_PUP 4 /* pup protocols: e.g. 
BSP */ #define AF_CHAOS 5 /* mit CHAOS protocols */ #define AF_NS 6 /* XEROX NS protocols */ #define AF_ISO 7 /* ISO protocols */ #define AF_OSI AF_ISO #define AF_ECMA 8 /* European computer manufacturers */ #define AF_DATAKIT 9 /* datakit protocols */ #define AF_CCITT 10 /* CCITT protocols, X.25 etc */ #define AF_SNA 11 /* IBM SNA */ #define AF_DECnet 12 /* DECnet */ #define AF_DLI 13 /* DEC Direct data link interface */ #define AF_LAT 14 /* LAT */ #define AF_HYLINK 15 /* NSC Hyperchannel */ #define AF_APPLETALK 16 /* Apple Talk */ #define AF_ROUTE 17 /* Internal Routing Protocol */ #define AF_LINK 18 /* Link layer interface */ #define pseudo_AF_XTP 19 /* eXpress Transfer Protocol (no AF) */ #define AF_COIP 20 /* connection-oriented IP, aka ST II */ #define AF_CNT 21 /* Computer Network Technology */ #define pseudo_AF_RTIP 22 /* Help Identify RTIP packets */ #define AF_IPX 23 /* Novell Internet Protocol */ #define AF_SIP 24 /* Simple Internet Protocol */ #define pseudo_AF_PIP 25 /* Help Identify PIP packets */ #define AF_ISDN 26 /* Integrated Services Digital Network*/ #define AF_E164 AF_ISDN /* CCITT E.164 recommendation */ #define pseudo_AF_KEY 27 /* Internal key-management function */ #define AF_INET6 28 /* IPv6 */ #define AF_NATM 29 /* native ATM access */ #define AF_ATM 30 /* ATM */ #define pseudo_AF_HDRCMPLT 31 /* Used by BPF to not rewrite headers * in interface output routine */ #define AF_NETGRAPH 32 /* Netgraph sockets */ #define AF_SLOW 33 /* 802.3ad slow protocol */ #define AF_SCLUSTER 34 /* Sitara cluster protocol */ #define AF_ARP 35 #define AF_BLUETOOTH 36 /* Bluetooth sockets */ #define AF_MAX 37 The one used for IP is AF_INET. It is a symbol for the constant 2. It is the address family listed in the sa_family field of sockaddr that decides how exactly the vaguely named bytes of sa_data will be used. Specifically, whenever the address family is AF_INET, we can use struct sockaddr_in found in netinet/in.h, wherever sockaddr is expected: /* * Socket address, internet style. */ struct sockaddr_in { uint8_t sin_len; sa_family_t sin_family; in_port_t sin_port; struct in_addr sin_addr; char sin_zero[8]; }; We can visualize its organization this way: 0 1 2 3 +--------+--------+-----------------+ 0 | 0 | Family | Port | +--------+--------+-----------------+ 4 | IP Address | +-----------------------------------+ 8 | 0 | +-----------------------------------+ 12 | 0 | +-----------------------------------+ sockaddr_in The three important fields are sin_family, which is byte 1 of the structure, sin_port, a 16-bit value found in bytes 2 and 3, and sin_addr, a 32-bit integer representation of the IP address, stored in bytes 4-7. Now, let us try to fill it out. Let us assume we are trying to write a client for the daytime protocol, which simply states that its server will write a text string representing the current date and time to port 13. We want to use TCP/IP, so we need to specify AF_INET in the address family field. AF_INET is defined as 2. Let us use the IP address of 192.43.244.18, which is the time server of US federal government (time.nist.gov). 
    0        1        2        3
  +--------+--------+-----------------+
0 |    0   |    2   |        13       |
  +-----------------+-----------------+
4 |           192.43.244.18           |
  +-----------------------------------+
8 |                 0                 |
  +-----------------------------------+
12|                 0                 |
  +-----------------------------------+

Specific example of sockaddr_in

By the way, the sin_addr field is declared as being of the struct in_addr type, which is defined in netinet/in.h:

/*
 * Internet address (a structure for historical reasons)
 */
struct in_addr {
	in_addr_t s_addr;
};

In addition, in_addr_t is a 32-bit integer. The 192.43.244.18 is just a convenient notation of expressing a 32-bit integer by listing all of its 8-bit bytes, starting with the most significant one. So far, we have viewed sockaddr as an abstraction. Our computer does not store short integers as a single 16-bit entity, but as a sequence of 2 bytes. Similarly, it stores 32-bit integers as a sequence of 4 bytes. Suppose we coded something like this:

sa.sin_family = AF_INET;
sa.sin_port = 13;
sa.sin_addr.s_addr = (((((192 << 8) | 43) << 8) | 244) << 8) | 18;

What would the result look like? Well, that depends, of course. On a &pentium;, or other x86, based computer, it would look like this:

    0        1        2        3
  +--------+--------+--------+--------+
0 |    0   |    2   |   13   |    0   |
  +--------+--------+--------+--------+
4 |   18   |  244   |   43   |  192   |
  +-----------------------------------+
8 |                 0                 |
  +-----------------------------------+
12|                 0                 |
  +-----------------------------------+

sockaddr_in on an Intel system

On a different system, it might look like this:

    0        1        2        3
  +--------+--------+--------+--------+
0 |    0   |    2   |    0   |   13   |
  +--------+--------+--------+--------+
4 |  192   |   43   |  244   |   18   |
  +-----------------------------------+
8 |                 0                 |
  +-----------------------------------+
12|                 0                 |
  +-----------------------------------+

sockaddr_in on an MSB system

And on a PDP it might look different yet. But the above two are the most common ways in use today. Ordinarily, wanting to write portable code, programmers pretend that these differences do not exist. And they get away with it (except when they code in assembly language). Alas, you cannot get away with it that easily when coding for sockets. Why? Because when communicating with another computer, you usually do not know whether it stores data most significant byte (MSB) or least significant byte (LSB) first. You might be wondering, So, will sockets not handle it for me? It will not. While that answer may surprise you at first, remember that the general sockets interface only understands the sa_len and sa_family fields of the sockaddr structure. You do not have to worry about the byte order there (of course, on FreeBSD sa_family is only 1 byte anyway, but many other &unix; systems do not have sa_len and use 2 bytes for sa_family, and expect the data in whatever order is native to the computer). But the rest of the data is just sa_data[14] as far as sockets goes. Depending on the address family, sockets just forwards that data to its destination. Indeed, when we enter a port number, it is because we want the other computer to know what service we are asking for. And, when we are the server, we read the port number so we know what service the other computer is expecting from us. Either way, sockets only has to forward the port number as data. It does not interpret it in any way. Similarly, we enter the IP address to tell everyone on the way where to send our data to. Sockets, again, only forwards it as data.
That is why we (the programmers, not the sockets) have to distinguish between the byte order used by our computer and a conventional byte order to send the data in to the other computer. We will call the byte order our computer uses the host byte order, or just the host order. There is a convention of sending the multi-byte data over IP MSB first. This, we will refer to as the network byte order, or simply the network order. Now, if we compiled the above code for an Intel based computer, our host byte order would produce:

    0        1        2        3
  +--------+--------+--------+--------+
0 |    0   |    2   |   13   |    0   |
  +--------+--------+--------+--------+
4 |   18   |  244   |   43   |  192   |
  +-----------------------------------+
8 |                 0                 |
  +-----------------------------------+
12|                 0                 |
  +-----------------------------------+

Host byte order on an Intel system

But the network byte order requires that we store the data MSB first:

    0        1        2        3
  +--------+--------+--------+--------+
0 |    0   |    2   |    0   |   13   |
  +--------+--------+--------+--------+
4 |  192   |   43   |  244   |   18   |
  +-----------------------------------+
8 |                 0                 |
  +-----------------------------------+
12|                 0                 |
  +-----------------------------------+

Network byte order

Unfortunately, our host order is the exact opposite of the network order. We have several ways of dealing with it. One would be to reverse the values in our code:

sa.sin_family = AF_INET;
sa.sin_port = 13 << 8;
sa.sin_addr.s_addr = (((((18 << 8) | 244) << 8) | 43) << 8) | 192;

This will trick our compiler into storing the data in the network byte order. In some cases, this is exactly the way to do it (e.g., when programming in assembly language). In most cases, however, it can cause a problem. Suppose you wrote a sockets-based program in C. You know it is going to run on a &pentium;, so you enter all your constants in reverse and force them to the network byte order. It works well. Then, some day, your trusted old &pentium; becomes a rusty old &pentium;. You replace it with a system whose host order is the same as the network order. You need to recompile all your software. All of your software continues to perform well, except the one program you wrote. You have since forgotten that you had forced all of your constants to the opposite of the host order. You spend some quality time tearing out your hair, calling the names of all gods you ever heard of (and some you made up), hitting your monitor with a nerf bat, and performing all the other traditional ceremonies of trying to figure out why something that has worked so well is suddenly not working at all. Eventually, you figure it out, say a couple of swear words, and start rewriting your code. Luckily, you are not the first one to face the problem. Someone else has created the &man.htons.3; and &man.htonl.3; C functions to convert a short and long respectively from the host byte order to the network byte order, and the &man.ntohs.3; and &man.ntohl.3; C functions to go the other way. On MSB-first systems these functions do nothing. On LSB-first systems they convert values to the proper order. So, regardless of what system your software is compiled on, your data will end up in the correct order if you use these functions.

Client Functions

Typically, the client initiates the connection to the server. The client knows which server it is about to call: It knows its IP address, and it knows the port the server resides at.
It is akin to you picking up the phone and dialing the number (the address), then, after someone answers, asking for the person in charge of wingdings (the port).

<function>connect</function>

Once a client has created a socket, it needs to connect it to a specific port on a remote system. It uses &man.connect.2;:

int connect(int s, const struct sockaddr *name, socklen_t namelen);

The s argument is the socket, i.e., the value returned by the socket function. The name is a pointer to sockaddr, the structure we have talked about extensively. Finally, namelen informs the system how many bytes are in our sockaddr structure. If connect is successful, it returns 0. Otherwise it returns -1 and stores the error code in errno. There are many reasons why connect may fail. For example, with an attempt at an Internet connection, the IP address may not exist, or it may be down, or just too busy, or it may not have a server listening at the specified port. Or it may outright refuse any request for specific code.

Our First Client

We now know enough to write a very simple client, one that will get the current time from 192.43.244.18 and print it to stdout.

/*
 * daytime.c
 *
 * Programmed by G. Adam Stanislav
 */
#include <stdio.h>
#include <strings.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main() {
	register int s;
	register int bytes;
	struct sockaddr_in sa;
	char buffer[BUFSIZ+1];

	if ((s = socket(PF_INET, SOCK_STREAM, 0)) < 0) {
		perror("socket");
		return 1;
	}

	bzero(&sa, sizeof sa);

	sa.sin_family = AF_INET;
	sa.sin_port = htons(13);
	sa.sin_addr.s_addr = htonl((((((192 << 8) | 43) << 8) | 244) << 8) | 18);
	if (connect(s, (struct sockaddr *)&sa, sizeof sa) < 0) {
		perror("connect");
		close(s);
		return 2;
	}

	while ((bytes = read(s, buffer, BUFSIZ)) > 0)
		write(1, buffer, bytes);

	close(s);
	return 0;
}

Go ahead, enter it in your editor, save it as daytime.c, then compile and run it:

&prompt.user; cc -O3 -o daytime daytime.c
&prompt.user; ./daytime
52079 01-06-19 02:29:25 50 0 1 543.9 UTC(NIST) *
&prompt.user;

In this case, the date was June 19, 2001, the time was 02:29:25 UTC. Naturally, your results will vary.

Server Functions

The typical server does not initiate the connection. Instead, it waits for a client to call it and request services. It does not know when the client will call, nor how many clients will call. It may be just sitting there, waiting patiently, one moment. The next moment, it can find itself swamped with requests from a number of clients, all calling in at the same time. The sockets interface offers three basic functions to handle this.

<function>bind</function>

Ports are like extensions to a phone line: After you dial a number, you dial the extension to get to a specific person or department. There are 65535 IP ports, but a server usually processes requests that come in on only one of them. It is like telling the phone room operator that we are now at work and available to answer the phone at a specific extension. We use &man.bind.2; to tell sockets which port we want to serve.

int bind(int s, const struct sockaddr *addr, socklen_t addrlen);

Besides specifying the port in addr, the server may include its IP address. However, it can just use the symbolic constant INADDR_ANY to indicate it will serve all requests to the specified port regardless of what its IP address is.
This symbol, along with several similar ones, is declared in netinet/in.h #define INADDR_ANY (u_int32_t)0x00000000 Suppose we were writing a server for the daytime protocol over TCP/IP. Recall that it uses port 13. Our sockaddr_in structure would look like this: 0 1 2 3 +--------+--------+--------+--------+ 0 | 0 | 2 | 0 | 13 | +--------+--------+--------+--------+ 4 | 0 | +-----------------------------------+ 8 | 0 | +-----------------------------------+ 12 | 0 | +-----------------------------------+ Example Server sockaddr_in <function>listen</function> To continue our office phone analogy, after you have told the phone central operator what extension you will be at, you now walk into your office, and make sure your own phone is plugged in and the ringer is turned on. Plus, you make sure your call waiting is activated, so you can hear the phone ring even while you are talking to someone. The server ensures all of that with the &man.listen.2; function. int listen(int s, int backlog); In here, the backlog variable tells sockets how many incoming requests to accept while you are busy processing the last request. In other words, it determines the maximum size of the queue of pending connections. <function>accept</function> After you hear the phone ringing, you accept the call by answering the call. You have now established a connection with your client. This connection remains active until either you or your client hang up. The server accepts the connection by using the &man.accept.2; function. int accept(int s, struct sockaddr *addr, socklen_t *addrlen); Note that this time addrlen is a pointer. This is necessary because in this case it is the socket that fills out addr, the sockaddr_in structure. The return value is an integer. Indeed, the accept returns a new socket. You will use this new socket to communicate with the client. What happens to the old socket? It continues to listen for more requests (remember the backlog variable we passed to listen?) until we close it. Now, the new socket is meant only for communications. It is fully connected. We cannot pass it to listen again, trying to accept additional connections. Our First Server Our first server will be somewhat more complex than our first client was: Not only do we have more sockets functions to use, but we need to write it as a daemon. This is best achieved by creating a child process after binding the port. The main process then exits and returns control to the shell (or whatever program invoked it). The child calls listen, then starts an endless loop, which accepts a connection, serves it, and eventually closes its socket. /* * daytimed - a port 13 server * * Programmed by G. 
Adam Stanislav
 * June 19, 2001
 */
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define BACKLOG 4

int main() {
	register int s, c;
	int b;
	struct sockaddr_in sa;
	time_t t;
	struct tm *tm;
	FILE *client;

	if ((s = socket(PF_INET, SOCK_STREAM, 0)) < 0) {
		perror("socket");
		return 1;
	}

	bzero(&sa, sizeof sa);

	sa.sin_family = AF_INET;
	sa.sin_port = htons(13);

	if (INADDR_ANY)
		sa.sin_addr.s_addr = htonl(INADDR_ANY);

	if (bind(s, (struct sockaddr *)&sa, sizeof sa) < 0) {
		perror("bind");
		return 2;
	}

	switch (fork()) {
		case -1:
			perror("fork");
			return 3;
			break;
		default:
			close(s);
			return 0;
			break;
		case 0:
			break;
	}

	listen(s, BACKLOG);

	for (;;) {
		b = sizeof sa;

		if ((c = accept(s, (struct sockaddr *)&sa, &b)) < 0) {
			perror("daytimed accept");
			return 4;
		}

		if ((client = fdopen(c, "w")) == NULL) {
			perror("daytimed fdopen");
			return 5;
		}

		if ((t = time(NULL)) < 0) {
			perror("daytimed time");
			return 6;
		}

		tm = gmtime(&t);
		fprintf(client, "%.4i-%.2i-%.2iT%.2i:%.2i:%.2iZ\n",
		    tm->tm_year + 1900,
		    tm->tm_mon + 1,
		    tm->tm_mday,
		    tm->tm_hour,
		    tm->tm_min,
		    tm->tm_sec);

		fclose(client);
	}
}

We start by creating a socket. Then we fill out the sockaddr_in structure in sa. Note the conditional use of INADDR_ANY:

if (INADDR_ANY)
	sa.sin_addr.s_addr = htonl(INADDR_ANY);

Its value is 0. Since we have just used bzero on the entire structure, it would be redundant to set it to 0 again. But if we port our code to some other system where INADDR_ANY is perhaps not a zero, we need to assign it to sa.sin_addr.s_addr. Most modern C compilers are clever enough to notice that INADDR_ANY is a constant. As long as it is a zero, they will optimize the entire conditional statement out of the code.

After we have called bind successfully, we are ready to become a daemon: We use fork to create a child process. In both the parent and the child, the s variable is our socket. The parent process will not need it, so it calls close, then it returns 0 to inform its own parent it had terminated successfully. Meanwhile, the child process continues working in the background. It calls listen and sets its backlog to 4. It does not need a large value here because daytime is not a protocol many clients request all the time, and because it can process each request instantly anyway. Finally, the daemon starts an endless loop, which performs the following steps:

Call accept. It waits here until a client contacts it. At that point, it receives a new socket, c, which it can use to communicate with this particular client.
It uses the C function fdopen to turn the socket from a low-level file descriptor to a C-style FILE pointer. This will allow the use of fprintf later on.
It checks the time, and prints it in the ISO 8601 format to the client file.
It then uses fclose to close the file. That will automatically close the socket as well.
We can generalize this, and use it as a model for many other servers: +-----------------+ | Create Socket | +-----------------+ | +-----------------+ | Bind Port | Daemon Process +-----------------+ | +--------+ +-------------+-->| Init | | | +--------+ +-----------------+ | | | Exit | | +--------+ +-----------------+ | | Listen | | +--------+ | | | +--------+ | | Accept | | +--------+ | | | +--------+ | | Serve | | +--------+ | | | +--------+ | | Close | |<--------+ Sequential Server This flowchart is good for sequential servers, i.e., servers that can serve one client at a time, just as we were able to with our daytime server. This is only possible whenever there is no real conversation going on between the client and the server: As soon as the server detects a connection to the client, it sends out some data and closes the connection. The entire operation may take nanoseconds, and it is finished. The advantage of this flowchart is that, except for the brief moment after the parent forks and before it exits, there is always only one process active: Our server does not take up much memory and other system resources. Note that we have added initialize daemon in our flowchart. We did not need to initialize our own daemon, but this is a good place in the flow of the program to set up any signal handlers, open any files we may need, etc. Just about everything in the flow chart can be used literally on many different servers. The serve entry is the exception. We think of it as a black box, i.e., something you design specifically for your own server, and just plug it into the rest. Not all protocols are that simple. Many receive a request from the client, reply to it, then receive another request from the same client. Because of that, they do not know in advance how long they will be serving the client. Such servers usually start a new process for each client. While the new process is serving its client, the daemon can continue listening for more connections. Now, go ahead, save the above source code as daytimed.c (it is customary to end the names of daemons with the letter d). After you have compiled it, try running it: &prompt.user; ./daytimed bind: Permission denied &prompt.user; What happened here? As you will recall, the daytime protocol uses port 13. But all ports below 1024 are reserved to the superuser (otherwise, anyone could start a daemon pretending to serve a commonly used port, while causing a security breach). Try again, this time as the superuser: &prompt.root; ./daytimed &prompt.root; What... Nothing? Let us try again: &prompt.root; ./daytimed bind: Address already in use &prompt.root; Every port can only be bound by one program at a time. Our first attempt was indeed successful: It started the child daemon and returned quietly. It is still running and will continue to run until you either kill it, or any of its system calls fail, or you reboot the system. Fine, we know it is running in the background. But is it working? How do we know it is a proper daytime server? Simple: &prompt.user; telnet localhost 13 Trying ::1... telnet: connect to address ::1: Connection refused Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. 2001-06-19T21:04:42Z Connection closed by foreign host. &prompt.user; telnet tried the new IPv6, and failed. It retried with IPv4 and succeeded. The daemon works. If you have access to another &unix; system via telnet, you can use it to test accessing the server remotely. 
My computer does not have a static IP address, so this is what I did: &prompt.user; who whizkid ttyp0 Jun 19 16:59 (216.127.220.143) xxx ttyp1 Jun 19 16:06 (xx.xx.xx.xx) &prompt.user; telnet 216.127.220.143 13 Trying 216.127.220.143... Connected to r47.bfm.org. Escape character is '^]'. 2001-06-19T21:31:11Z Connection closed by foreign host. &prompt.user; Again, it worked. Will it work using the domain name? &prompt.user; telnet r47.bfm.org 13 Trying 216.127.220.143... Connected to r47.bfm.org. Escape character is '^]'. 2001-06-19T21:31:40Z Connection closed by foreign host. &prompt.user; By the way, telnet prints the Connection closed by foreign host message after our daemon has closed the socket. This shows us that, indeed, using fclose(client); in our code works as advertised. Helper Functions FreeBSD C library contains many helper functions for sockets programming. For example, in our sample client we hard coded the time.nist.gov IP address. But we do not always know the IP address. Even if we do, our software is more flexible if it allows the user to enter the IP address, or even the domain name. <function>gethostbyname</function> While there is no way to pass the domain name directly to any of the sockets functions, the FreeBSD C library comes with the &man.gethostbyname.3; and &man.gethostbyname2.3; functions, declared in netdb.h. struct hostent * gethostbyname(const char *name); struct hostent * gethostbyname2(const char *name, int af); Both return a pointer to the hostent structure, with much information about the domain. For our purposes, the h_addr_list[0] field of the structure points at h_length bytes of the correct address, already stored in the network byte order. This allows us to create a much more flexible—and much more useful—version of our daytime program: /* * daytime.c * * Programmed by G. Adam Stanislav * 19 June 2001 */ #include <stdio.h> #include <string.h> #include <sys/types.h> #include <sys/socket.h> #include <netinet/in.h> #include <netdb.h> int main(int argc, char *argv[]) { register int s; register int bytes; struct sockaddr_in sa; struct hostent *he; char buf[BUFSIZ+1]; char *host; if ((s = socket(PF_INET, SOCK_STREAM, 0)) < 0) { perror("socket"); return 1; } bzero(&sa, sizeof sa); sa.sin_family = AF_INET; sa.sin_port = htons(13); host = (argc > 1) ? (char *)argv[1] : "time.nist.gov"; if ((he = gethostbyname(host)) == NULL) { perror(host); return 2; } bcopy(he->h_addr_list[0],&sa.sin_addr, he->h_length); if (connect(s, (struct sockaddr *)&sa, sizeof sa) < 0) { perror("connect"); return 3; } while ((bytes = read(s, buf, BUFSIZ)) > 0) write(1, buf, bytes); close(s); return 0; } We now can type a domain name (or an IP address, it works both ways) on the command line, and the program will try to connect to its daytime server. Otherwise, it will still default to time.nist.gov. However, even in this case we will use gethostbyname rather than hard coding 192.43.244.18. That way, even if its IP address changes in the future, we will still find it. Since it takes virtually no time to get the time from your local server, you could run daytime twice in a row: First to get the time from time.nist.gov, the second time from your own system. You can then compare the results and see how exact your system clock is: &prompt.user; daytime ; daytime localhost 52080 01-06-20 04:02:33 50 0 0 390.2 UTC(NIST) * 2001-06-20T04:02:35Z &prompt.user; As you can see, my system was two seconds ahead of the NIST time. 
<function>getservbyname</function>

Sometimes you may not be sure what port a certain service uses. The &man.getservbyname.3; function, also declared in netdb.h, comes in very handy in those cases:

struct servent * getservbyname(const char *name, const char *proto);

The servent structure contains the s_port member, which holds the proper port number, already in network byte order.

Had we not known the correct port for the daytime service, we could have found it this way:

struct servent *se;
...
	if ((se = getservbyname("daytime", "tcp")) == NULL) {
		fprintf(stderr, "Cannot determine which port to use.\n");
		return 7;
	}
-	sa.sin_port = se->s_port;
+	sa.sin_port = se->s_port;

You usually do know the port. But if you are developing a new protocol, you may be testing it on an unofficial port. Some day, you will register the protocol and its port (if nowhere else, at least in your /etc/services, which is where getservbyname looks). Instead of returning an error in the above code, you would then just use the temporary port number. Once you have listed the protocol in /etc/services, your software will find its port without you having to rewrite the code.

Concurrent Servers

Unlike a sequential server, a concurrent server has to be able to serve more than one client at a time. For example, a chat server may be serving a specific client for hours—it cannot wait till it stops serving a client before it serves the next one.

This requires a significant change in our flowchart:

    +-----------------+
    |  Create Socket  |
    +-----------------+
             |
    +-----------------+
    |    Bind Port    |             Daemon Process
    +-----------------+
             |                        +--------+
    +--------+--------+-------------->|  Init  |
    |                 |               +--------+
    +-----------------+                   |
    |      Exit       |               +--------+
    +-----------------+               | Listen |
                                      +--------+
                                          |
                                      +--------+
                          +-----------| Accept |<--------+
                          |           +--------+         |
       Server Process     |               |              |
    +------------------+  |          +--------+          |
    | Close Top Socket |<-+          | Close  |----------+
    +------------------+             +--------+
             |
    +------------------+             +--------+
    |      Serve       |             | Signal |
    +------------------+             +--------+
             |
    +------------------+
    | Close Acc Socket |
    +------------------+
             |
    +------------------+
    |       Exit       |
    +------------------+

    Concurrent Server

We moved the serve from the daemon process to its own server process. However, because each child process inherits all open files (and a socket is treated just like a file), the new process inherits not only the accepted handle, i.e., the socket returned by the accept call, but also the top socket, i.e., the one opened by the top process right at the beginning. However, the server process does not need this socket and should close it immediately. Similarly, the daemon process no longer needs the accepted socket, and not only should, but must close it—otherwise, it will run out of available file descriptors sooner or later.

After the server process is done serving, it should close the accepted socket. Instead of returning to accept, it now exits.

Under &unix;, a process does not really exit. Instead, it returns to its parent. Typically, a parent process waits for its child process, and obtains a return value. However, our daemon process cannot simply stop and wait. That would defeat the whole purpose of creating additional processes. But if it never does wait, its children will become zombies—no longer functional but still roaming around.

For that reason, the daemon process needs to set signal handlers in its initialize daemon phase.
At least a SIGCHLD signal has to be processed, so the daemon can remove the zombie return values from the system and release the system resources they are taking up. That is why our flowchart now contains a process signals box, which is not connected to any other box.

By the way, many servers also process SIGHUP, and typically interpret it as a signal from the superuser that they should reread their configuration files. This allows us to change settings without having to kill and restart these servers.
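To tie the flowchart together, here is a sketch of what the daemon's main loop and its SIGCHLD handler might look like in C. This is not a complete program: error checking is abbreviated, and serve() is a placeholder for whatever your own protocol does.

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>

static void
reaper(int sig)
{
	/* Collect zombie children; WNOHANG keeps us from blocking. */
	while (waitpid(-1, NULL, WNOHANG) > 0)
		;
}

/* Inside the daemon, after a successful bind(): */
	signal(SIGCHLD, reaper);	/* part of "initialize daemon" */
	listen(s, 5);

	for (;;) {
		int c = accept(s, NULL, NULL);

		if (c < 0)
			continue;	/* e.g., EINTR when SIGCHLD interrupts accept */

		if (fork() == 0) {
			/* Server process. */
			close(s);	/* close the top socket */
			serve(c);	/* your protocol goes here */
			close(c);	/* close the accepted socket */
			_exit(0);
		}

		/* Daemon process: must close the accepted socket. */
		close(c);
	}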
diff --git a/en_US.ISO8859-1/books/developers-handbook/tools/chapter.sgml b/en_US.ISO8859-1/books/developers-handbook/tools/chapter.sgml
index 78e65c084b..7508065173 100644
--- a/en_US.ISO8859-1/books/developers-handbook/tools/chapter.sgml
+++ b/en_US.ISO8859-1/books/developers-handbook/tools/chapter.sgml
@@ -1,2366 +1,2366 @@

James Raynard Contributed by Murray Stokely

Programming Tools

Synopsis

This chapter is an introduction to using some of the programming tools supplied with FreeBSD, although much of it will be applicable to many other versions of &unix;. It does not attempt to describe coding in any detail. Most of the chapter assumes little or no previous programming knowledge, although it is hoped that most programmers will find something of value in it.

Introduction

FreeBSD offers an excellent development environment. Compilers for C, C++, and Fortran and an assembler come with the basic system, not to mention a Perl interpreter and classic &unix; tools such as sed and awk. If that is not enough, there are many more compilers and interpreters in the Ports collection. FreeBSD is very compatible with standards such as &posix; and ANSI C, as well as with its own BSD heritage, so it is possible to write applications that will compile and run with little or no modification on a wide range of platforms.

However, all this power can be rather overwhelming at first if you have never written programs on a &unix; platform before. This document aims to help you get up and running, without getting too deeply into more advanced topics. The intention is that this document should give you enough of the basics to be able to make some sense of the documentation.

Most of the document requires little or no knowledge of programming, although it does assume a basic competence with using &unix; and a willingness to learn!

Introduction to Programming

A program is a set of instructions that tell the computer to do various things; sometimes the instruction it has to perform depends on what happened when it performed a previous instruction. This section gives an overview of the two main ways in which you can give these instructions, or commands as they are usually called. One way uses an interpreter, the other a compiler. As human languages are too difficult for a computer to understand in an unambiguous way, commands are usually written in one or another of the languages specially designed for the purpose.

Interpreters

With an interpreter, the language comes as an environment, where you type in commands at a prompt and the environment executes them for you. For more complicated programs, you can type the commands into a file and get the interpreter to load the file and execute the commands in it. If anything goes wrong, many interpreters will drop you into a debugger to help you track down the problem.

The advantage of this is that you can see the results of your commands immediately, and mistakes can be corrected readily. The biggest disadvantage comes when you want to share your programs with someone. They must have the same interpreter, or you must have some way of giving it to them, and they need to understand how to use it. Also, users may not appreciate being thrown into a debugger if they press the wrong key! From a performance point of view, interpreters can use up a lot of memory, and generally do not generate code as efficiently as compilers.

In my opinion, interpreted languages are the best way to start if you have not done any programming before. This kind of environment is typically found with languages like Lisp, Smalltalk, Perl and Basic. It could also be argued that the &unix; shell (sh, csh) is itself an interpreter, and many people do in fact write shell scripts to help with various housekeeping tasks on their machine. Indeed, part of the original &unix; philosophy was to provide lots of small utility programs that could be linked together in shell scripts to perform useful tasks.

Interpreters available with FreeBSD

Here is a list of interpreters that are available from the &os; Ports Collection, with a brief discussion of some of the more popular interpreted languages. Instructions on how to get and install applications from the Ports Collection can be found in the Ports section of the handbook.

BASIC

Short for Beginner's All-purpose Symbolic Instruction Code. Developed in the 1960s for teaching University students to program and provided with every self-respecting personal computer in the 1980s, BASIC has been the first programming language for many programmers. It is also the foundation for Visual Basic. The Bywater Basic Interpreter can be found in the Ports Collection as lang/bwbasic and Phil Cockroft's Basic Interpreter (formerly Rabbit Basic) is available as lang/pbasic.

Lisp

A language that was developed in the late 1950s as an alternative to the number-crunching languages that were popular at the time. Instead of being based on numbers, Lisp is based on lists; in fact the name is short for List Processing. Very popular in AI (Artificial Intelligence) circles. Lisp is an extremely powerful and sophisticated language, but can be rather large and unwieldy. Various implementations of Lisp that can run on &unix; systems are available in the Ports Collection for &os;. GNU Common Lisp can be found as lang/gcl. CLISP by Bruno Haible and Michael Stoll is available as lang/clisp. For CMUCL, which includes a highly-optimizing compiler too, or simpler Lisp implementations like SLisp, which implements most of the Common Lisp constructs in a few hundred lines of C code, lang/cmucl and lang/slisp are available respectively.

Perl

Very popular with system administrators for writing scripts; also often used on World Wide Web servers for writing CGI scripts. Perl is available in the Ports Collection as lang/perl5 for all &os; releases, and is installed as /usr/bin/perl in the base system of 4.X releases.

Scheme

A dialect of Lisp that is rather more compact and cleaner than Common Lisp. Popular in Universities as it is simple enough to teach to undergraduates as a first language, while it has a high enough level of abstraction to be used in research work. Scheme is available from the Ports Collection as lang/elk for the Elk Scheme Interpreter. The MIT Scheme Interpreter can be found in lang/mit-scheme and the SCM Scheme Interpreter in lang/scm.

Icon

Icon is a high-level language with extensive facilities for processing strings and structures. The version of Icon for &os; can be found in the Ports Collection as lang/icon.
Logo

Logo is a language that is easy to learn, and has been used as an introductory programming language in various courses. It is an excellent tool to work with when teaching programming to young students, as it makes the creation of elaborate geometric shapes an easy task even for very small children. The latest version of Logo for &os; is available from the Ports Collection in lang/logo.

Python

Python is an Object-Oriented, interpreted language. Its advocates argue that it is one of the best languages to start programming with, since it is relatively easy to start with, but is not limited in comparison to other popular interpreted languages that are used for the development of large, complex applications (Perl and Tcl are two other languages that are popular for such tasks). The latest version of Python is available from the Ports Collection in lang/python.

Ruby

Ruby is an interpreted, purely object-oriented programming language. It has become widely popular because of its easy to understand syntax, flexibility when writing code, and the ability to easily develop and maintain large, complex programs. Ruby is available from the Ports Collection as lang/ruby18.

Tcl and Tk

Tcl is an embeddable, interpreted language that has become widely used and popular mostly because of its portability to many platforms. It can be used both for quickly writing small, prototype applications, or (when combined with Tk, a GUI toolkit) fully-fledged, featureful programs. Various versions of Tcl are available as ports for &os;. The latest version, Tcl 8.4, can be found in lang/tcl84.

Compilers

Compilers are rather different. First of all, you write your code in a file (or files) using an editor. You then run the compiler and see if it accepts your program. If it did not compile, grit your teeth and go back to the editor; if it did compile and gave you a program, you can run it either at a shell command prompt or in a debugger to see if it works properly. If you run it in the shell, you may get a core dump.

Obviously, this is not quite as direct as using an interpreter. However it allows you to do a lot of things which are very difficult or even impossible with an interpreter, such as writing code which interacts closely with the operating system—or even writing your own operating system! It is also useful if you need to write very efficient code, as the compiler can take its time and optimize the code, which would not be acceptable in an interpreter. Moreover, distributing a program written for a compiler is usually more straightforward than one written for an interpreter—you can just give them a copy of the executable, assuming they have the same operating system as you.

Compiled languages include Pascal, C and C++. C and C++ are rather unforgiving languages, and best suited to more experienced programmers; Pascal, on the other hand, was designed as an educational language, and is quite a good language to start with. FreeBSD does not include Pascal support in the base system, but both GNU Pascal Compiler (GPC) and the Free Pascal Compiler are available in the ports collection as lang/gpc and lang/fpc.

As the edit-compile-run-debug cycle is rather tedious when using separate programs, many commercial compiler makers have produced Integrated Development Environments (IDEs for short). FreeBSD does not include an IDE in the base system, but devel/kdevelop is available in the ports tree and many use Emacs for this purpose. Using Emacs as an IDE is discussed later in this chapter.
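If you would like something small to experiment with as you read the next section, the traditional first C program will do; we will also meet it again when discussing make:

#include <stdio.h>

int main() {
	printf("hello, world\n");
	return 0;
}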
Compiling with <command>cc</command>

This section deals only with the GNU compiler for C and C++, since that comes with the base FreeBSD system. It can be invoked by either cc or gcc. The details of producing a program with an interpreter vary considerably between interpreters, and are usually well covered in the documentation and on-line help for the interpreter.

Once you have written your masterpiece, the next step is to convert it into something that will (hopefully!) run on FreeBSD. This usually involves several steps, each of which is done by a separate program.

1. Pre-process your source code to remove comments and do other tricks like expanding macros in C.

2. Check the syntax of your code to see if you have obeyed the rules of the language. If you have not, it will complain!

3. Convert the source code into assembly language—this is very close to machine code, but still understandable by humans. Allegedly. To be strictly accurate, cc converts the source code into its own, machine-independent p-code instead of assembly language at this stage.

4. Convert the assembly language into machine code—yep, we are talking bits and bytes, ones and zeros here.

5. Check that you have used things like functions and global variables in a consistent way. For example, if you have called a non-existent function, it will complain.

6. If you are trying to produce an executable from several source code files, work out how to fit them all together.

7. Work out how to produce something that the system's run-time loader will be able to load into memory and run.

8. Finally, write the executable on the filesystem.

The word compiling is often used to refer to just steps 1 to 4—the others are referred to as linking. Sometimes step 1 is referred to as pre-processing and steps 3-4 as assembling.

Fortunately, almost all this detail is hidden from you, as cc is a front end that manages calling all these programs with the right arguments for you; simply typing

&prompt.user; cc foobar.c

will cause foobar.c to be compiled by all the steps above. If you have more than one file to compile, just do something like

&prompt.user; cc foo.c bar.c

Note that the syntax checking is just that—checking the syntax. It will not check for any logical mistakes you may have made, like putting the program into an infinite loop, or using a bubble sort when you meant to use a binary sort. In case you did not know, a binary sort is an efficient way of sorting things into order and a bubble sort is not.

There are lots and lots of options for cc, which are all in the manual page. Here are a few of the most important ones, with examples of how to use them.

-o filename

The output name of the file. If you do not use this option, cc will produce an executable called a.out. The reasons for this are buried in the mists of history.

&prompt.user; cc foobar.c               executable is a.out
&prompt.user; cc -o foobar foobar.c     executable is foobar

-c

Just compile the file, do not link it. Useful for toy programs where you just want to check the syntax, or if you are using a Makefile.

&prompt.user; cc -c foobar.c

This will produce an object file (not an executable) called foobar.o. This can be linked together with other object files into an executable.
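For example, a program split across two source files might be built in stages like this (the file names here are invented for the illustration):

&prompt.user; cc -c main.c                produces main.o
&prompt.user; cc -c util.c                produces util.o
&prompt.user; cc -o prog main.o util.o    links them into prog

If you later change util.c only, re-running the second and third commands is enough; main.o does not need to be rebuilt. This is exactly the property that the make section below takes advantage of.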
-g

Create a debug version of the executable. This makes the compiler put information into the executable about which line of which source file corresponds to which function call. A debugger can use this information to show the source code as you step through the program, which is very useful; the disadvantage is that all this extra information makes the program much bigger. Normally, you compile with -g while you are developing a program and then compile a release version without -g when you are satisfied it works properly.

&prompt.user; cc -g foobar.c

This will produce a debug version of the program. Note, we did not use the -o flag to specify the executable name, so we will get an executable called a.out. Producing a debug version called foobar is left as an exercise for the reader!

-O

Create an optimized version of the executable. The compiler performs various clever tricks to try to produce an executable that runs faster than normal. You can add a number after the -O to specify a higher level of optimization, but this often exposes bugs in the compiler's optimizer. For instance, the version of cc that comes with the 2.1.0 release of FreeBSD is known to produce bad code with the -O2 option in some circumstances. Optimization is usually only turned on when compiling a release version.

&prompt.user; cc -O -o foobar foobar.c

This will produce an optimized version of foobar.

The following three flags will force cc to check that your code complies to the relevant international standard, often referred to as the ANSI standard, though strictly speaking it is an ISO standard.

-Wall

Enable all the warnings which the authors of cc believe are worthwhile. Despite the name, it will not enable all the warnings cc is capable of.

-ansi

Turn off most, but not all, of the non-ANSI C features provided by cc. Despite the name, it does not guarantee strictly that your code will comply to the standard.

-pedantic

Turn off all cc's non-ANSI C features.

Without these flags, cc will allow you to use some of its non-standard extensions to the standard. Some of these are very useful, but will not work with other compilers—in fact, one of the main aims of the standard is to allow people to write code that will work with any compiler on any system. This is known as portable code. Generally, you should try to make your code as portable as possible, as otherwise you may have to completely rewrite the program later to get it to work somewhere else—and who knows what you may be using in a few years time?

&prompt.user; cc -Wall -ansi -pedantic -o foobar foobar.c

This will produce an executable foobar after checking foobar.c for standard compliance.

-llibrary

Specify a function library to be used at link time. The most common example of this is when compiling a program that uses some of the mathematical functions in C. Unlike most other platforms, these are in a separate library from the standard C one and you have to tell the compiler to add it. The rule is that if the library is called libsomething.a, you give cc the argument -lsomething. For example, the math library is libm.a, so you give cc the argument -lm. A common gotcha with the math library is that it has to be the last library on the command line.

&prompt.user; cc -o foobar foobar.c -lm

This will link the math library functions into foobar.

If you are compiling C++ code, you need to add -lg++, or -lstdc++ if you are using FreeBSD 2.2 or later, to the command line to link the C++ library functions. Alternatively, you can run c++ instead of cc, which does this for you. c++ can also be invoked as g++ on FreeBSD.
&prompt.user; cc -o foobar foobar.cc -lg++     For FreeBSD 2.1.6 and earlier
&prompt.user; cc -o foobar foobar.cc -lstdc++  For FreeBSD 2.2 and later
&prompt.user; c++ -o foobar foobar.cc

Each of these will produce an executable foobar from the C++ source file foobar.cc. Note that, on &unix; systems, C++ source files traditionally end in .C, .cxx or .cc, rather than the &ms-dos; style .cpp (which was already used for something else). gcc used to rely on this to work out what kind of compiler to use on the source file; however, this restriction no longer applies, so you may now call your C++ files .cpp with impunity!

Common <command>cc</command> Queries and Problems

I am trying to write a program which uses the sin() function and I get an error like this. What does it mean?

/var/tmp/cc0143941.o: Undefined symbol `_sin' referenced from text segment

When using mathematical functions like sin(), you have to tell cc to link in the math library, like so:

&prompt.user; cc -o foobar foobar.c -lm

All right, I wrote this simple program to practice using -lm. All it does is raise 2.1 to the power of 6.

#include <stdio.h>

int main() {
	float f;

	f = pow(2.1, 6);
	printf("2.1 ^ 6 = %f\n", f);
	return 0;
}

and I compiled it as:

&prompt.user; cc temp.c -lm

like you said I should, but I get this when I run it:

&prompt.user; ./a.out
2.1 ^ 6 = 1023.000000

This is not the right answer! What is going on?

When the compiler sees you call a function, it checks if it has already seen a prototype for it. If it has not, it assumes the function returns an int, which is definitely not what you want here.

So how do I fix this?

The prototypes for the mathematical functions are in math.h. If you include this file, the compiler will be able to find the prototype and it will stop doing strange things to your calculation!

#include <math.h>
#include <stdio.h>

int main() {
...

After recompiling it as you did before, run it:

&prompt.user; ./a.out
2.1 ^ 6 = 85.766121

If you are using any of the mathematical functions, always include math.h and remember to link in the math library.
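Putting the two pieces together, the complete, corrected program looks like this:

#include <math.h>
#include <stdio.h>

int main() {
	float f;

	/* pow() is prototyped in math.h, so the compiler now knows
	   it returns a double rather than assuming an int. */
	f = pow(2.1, 6);
	printf("2.1 ^ 6 = %f\n", f);
	return 0;
}

Compile it with cc temp.c -lm as before, and the result comes out right.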
I compiled a file called foobar.c and I cannot find an executable called foobar. Where has it gone?

Remember, cc will call the executable a.out unless you tell it differently. Use the -o option:

&prompt.user; cc -o foobar foobar.c

OK, I have an executable called foobar, I can see it when I run ls, but when I type in foobar at the command prompt it tells me there is no such file. Why can it not find it?

Unlike &ms-dos;, &unix; does not look in the current directory when it is trying to find out which executable you want it to run, unless you tell it to. Either type ./foobar, which means run the file called foobar in the current directory, or change your PATH environment variable so that it looks something like /bin:/usr/bin:/usr/local/bin:. The dot at the end means look in the current directory if it is not in any of the others.

I called my executable test, but nothing happens when I run it. What is going on?

Most &unix; systems have a program called test in /usr/bin and the shell is picking that one up before it gets to checking the current directory. Either type:

&prompt.user; ./test

or choose a better name for your program!

I compiled my program and it seemed to run all right at first, then there was an error and it said something about core dumped. What does that mean?

The name core dump dates back to the very early days of &unix;, when the machines used core memory for storing data. Basically, if the program failed under certain conditions, the system would write the contents of core memory to disk in a file called core, which the programmer could then pore over to find out what went wrong.

Fascinating stuff, but what am I supposed to do now?

Use gdb to analyze the core (see the Debugging section below).

When my program dumped core, it said something about a segmentation fault. What is that?

This basically means that your program tried to perform some sort of illegal operation on memory; &unix; is designed to protect the operating system and other programs from rogue programs. Common causes for this are:

Trying to write to a NULL pointer, e.g.,

char *foo = NULL;
strcpy(foo, "bang!");

Using a pointer that has not been initialized, e.g.,

char *foo;
strcpy(foo, "bang!");

The pointer will have some random value that, with luck, will point into an area of memory that is not available to your program and the kernel will kill your program before it can do any damage. If you are unlucky, it will point somewhere inside your own program and corrupt one of your data structures, causing the program to fail mysteriously.

Trying to access past the end of an array, e.g.,

int bar[20];
bar[27] = 6;

Trying to store something in read-only memory, e.g.,

char *foo = "My string";
strcpy(foo, "bang!");

&unix; compilers often put string literals like "My string" into read-only areas of memory.

Doing naughty things with malloc() and free(), e.g.,

char bar[80];
free(bar);

or

char *foo = malloc(27);
free(foo);
free(foo);

Making one of these mistakes will not always lead to an error, but they are always bad practice. Some systems and compilers are more tolerant than others, which is why programs that ran well on one system can crash when you try them on another.

Sometimes when I get a core dump it says bus error. It says in my &unix; book that this means a hardware problem, but the computer still seems to be working. Is this true?

No, fortunately not (unless of course you really do have a hardware problem…). This is usually another way of saying that you accessed memory in a way you should not have.

This dumping core business sounds as though it could be quite useful, if I can make it happen when I want to. Can I do this, or do I have to wait until there is an error?

Yes, just go to another console or xterm, do

&prompt.user; ps

to find out the process ID of your program, and do

&prompt.user; kill -ABRT pid

where pid is the process ID you looked up. This is useful if your program has got stuck in an infinite loop, for instance. If your program happens to trap SIGABRT, there are several other signals which have a similar effect. Alternatively, you can create a core dump from inside your program, by calling the abort() function. See the manual page of &man.abort.3; to learn more.

If you want to create a core dump from outside your program, but do not want the process to terminate, you can use the gcore program. See the manual page of &man.gcore.1; for more information.

Make

What is <command>make</command>?

When you are working on a simple program with only one or two source files, typing in

&prompt.user; cc file1.c file2.c

is not too bad, but it quickly becomes very tedious when there are several files—and it can take a while to compile, too.

One way to get around this is to use object files and only recompile the source file if the source code has changed. So we could have something like:

&prompt.user; cc file1.o file2.o … file37.c

if we had changed file37.c, but not any of the others, since the last time we compiled.
This may speed up the compilation quite a bit, but does not solve the typing problem. Or we could write a shell script to solve the typing problem, but it would have to re-compile everything, making it very inefficient on a large project.

What happens if we have hundreds of source files lying about? What if we are working in a team with other people who forget to tell us when they have changed one of their source files that we use?

Perhaps we could put the two solutions together and write something like a shell script that would contain some kind of magic rule saying when a source file needs compiling. Now all we need is a program that can understand these rules, as it is a bit too complicated for the shell.

This program is called make. It reads in a file, called a makefile, that tells it how different files depend on each other, and works out which files need to be re-compiled and which ones do not. For example, a rule could say something like if fromboz.o is older than fromboz.c, that means someone must have changed fromboz.c, so it needs to be re-compiled. The makefile also has rules telling make how to re-compile the source file, making it a much more powerful tool.

Makefiles are typically kept in the same directory as the source they apply to, and can be called makefile, Makefile or MAKEFILE. Most programmers use the name Makefile, as this puts it near the top of a directory listing, where it can easily be seen. They do not use the MAKEFILE form as block capitals are often used for documentation files like README.

Example of using <command>make</command>

Here is a very simple make file:

foo: foo.c
	cc -o foo foo.c

It consists of two lines, a dependency line and a creation line.

The dependency line here consists of the name of the program (known as the target), followed by a colon, then whitespace, then the name of the source file. When make reads this line, it looks to see if foo exists; if it exists, it compares the time foo was last modified to the time foo.c was last modified. If foo does not exist, or is older than foo.c, it then looks at the creation line to find out what to do. In other words, this is the rule for working out when foo.c needs to be re-compiled.

The creation line starts with a tab (press the tab key) and then the command you would type to create foo if you were doing it at a command prompt. If foo is out of date, or does not exist, make then executes this command to create it. In other words, this is the rule which tells make how to re-compile foo.c.

So, when you type make, it will make sure that foo is up to date with respect to your latest changes to foo.c. This principle can be extended to Makefiles with hundreds of targets—in fact, on FreeBSD, it is possible to compile the entire operating system just by typing make world in the appropriate directory!

Another useful property of makefiles is that the targets do not have to be programs. For instance, we could have a make file that looks like this:

foo: foo.c
	cc -o foo foo.c

install:
	cp foo /home/me

We can tell make which target we want to make by typing:

&prompt.user; make target

make will then only look at that target and ignore any others. For example, if we type make foo with the makefile above, make will ignore the install target.

If we just type make on its own, make will always look at the first target and then stop without looking at any others. So if we typed make here, it will just go to the foo target, re-compile foo if necessary, and then stop without going on to the install target.

Notice that the install target does not actually depend on anything! This means that the command on the following line is always executed when we try to make that target by typing make install. In this case, it will copy foo into the user's home directory. This is often used by application makefiles, so that the application can be installed in the correct directory when it has been correctly compiled.
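As a slightly larger sketch (the file names are invented for the example), here is how the same ideas scale to a program built from two source files that share a header; remember that each creation line must start with a tab:

prog: main.o util.o
	cc -o prog main.o util.o

main.o: main.c defs.h
	cc -c main.c

util.o: util.c defs.h
	cc -c util.c

install: prog
	cp prog /home/me

Typing make rebuilds only the object files whose dependencies have changed, then re-links prog; make install additionally copies the result into place.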
This is a slightly confusing subject to try to explain. If you do not quite understand how make works, the best thing to do is to write a simple program like hello world and a make file like the one above and experiment. Then progress to using more than one source file, or having the source file include a header file. The touch command is very useful here—it changes the date on a file without you having to edit it.

Make and include-files

C code often starts with a list of files to include, for example stdio.h. Some of these files are system-include files, some of them are from the project you are now working on:

#include <stdio.h>
#include "foo.h"

int main(....

To make sure that this file is recompiled the moment foo.h is changed, you have to add it in your Makefile:

foo: foo.c foo.h

The moment your project is getting bigger and you have more and more include-files of your own to maintain, it will be a pain to keep track of all include files and the files which depend on them. If you change an include-file but forget to recompile all the files which depend on it, the results will be devastating. gcc has an option to analyze your files and to produce a list of include-files and their dependencies: -MM.

If you add this to your Makefile:

depend:
	gcc -E -MM *.c > .depend

and run make depend, the file .depend will appear with a list of object-files, C-files and the include-files:

foo.o: foo.c foo.h

If you change foo.h, next time you run make all files depending on foo.h will be recompiled.

Do not forget to run make depend each time you add an include-file to one of your files.

FreeBSD Makefiles

Makefiles can be rather complicated to write. Fortunately, BSD-based systems like FreeBSD come with some very powerful ones as part of the system. One very good example of this is the FreeBSD ports system. Here is the essential part of a typical ports Makefile:

MASTER_SITES=	ftp://freefall.cdrom.com/pub/FreeBSD/LOCAL_PORTS/
DISTFILES=	scheme-microcode+dist-7.3-freebsd.tgz

.include <bsd.port.mk>

Now, if we go to the directory for this port and type make, the following happens:

1. A check is made to see if the source code for this port is already on the system.

2. If it is not, an FTP connection to the URL in MASTER_SITES is set up to download the source.

3. The checksum for the source is calculated and compared with the one for a known, good, copy of the source. This is to make sure that the source was not corrupted while in transit.

4. Any changes required to make the source work on FreeBSD are applied—this is known as patching.

5. Any special configuration needed for the source is done. (Many &unix; program distributions try to work out which version of &unix; they are being compiled on and which optional &unix; features are present—this is where they are given the information in the FreeBSD ports scenario).

6. The source code for the program is compiled. In effect, we change to the directory where the source was unpacked and do make—the program's own make file has the necessary information to build the program.

We now have a compiled version of the program.
If we wish, we can test it now; when we feel confident about the program, we can type make install. This will cause the program and any supporting files it needs to be copied into the correct location; an entry is also made into a package database, so that the port can easily be uninstalled later if we change our mind about it. Now I think you will agree that is rather impressive for a four line script! The secret lies in the last line, which tells make to look in the system makefile called bsd.port.mk. It is easy to overlook this line, but this is where all the clever stuff comes from—someone has written a makefile that tells make to do all the things above (plus a couple of other things I did not mention, including handling any errors that may occur) and anyone can get access to that just by putting a single line in their own make file! If you want to have a look at these system makefiles, they are in /usr/share/mk, but it is probably best to wait until you have had a bit of practice with makefiles, as they are very complicated (and if you do look at them, make sure you have a flask of strong coffee handy!) More advanced uses of <command>make</command> Make is a very powerful tool, and can do much more than the simple example above shows. Unfortunately, there are several different versions of make, and they all differ considerably. The best way to learn what they can do is probably to read the documentation—hopefully this introduction will have given you a base from which you can do this. The version of make that comes with FreeBSD is the Berkeley make; there is a tutorial for it in /usr/share/doc/psd/12.make. To view it, do &prompt.user; zmore paper.ascii.gz in that directory. Many applications in the ports use GNU make, which has a very good set of info pages. If you have installed any of these ports, GNU make will automatically have been installed as gmake. It is also available as a port and package in its own right. To view the info pages for GNU make, you will have to edit the dir file in the /usr/local/info directory to add an entry for it. This involves adding a line like * Make: (make). The GNU Make utility. to the file. Once you have done this, you can type info and then select make from the menu (or in Emacs, do C-h i). Debugging The Debugger The debugger that comes with FreeBSD is called gdb (GNU debugger). You start it up by typing &prompt.user; gdb progname although most people prefer to run it inside Emacs. You can do this by: M-x gdb RET progname RET Using a debugger allows you to run the program under more controlled circumstances. Typically, you can step through the program a line at a time, inspect the value of variables, change them, tell the debugger to run up to a certain point and then stop, and so on. You can even attach to a program that is already running, or load a core file to investigate why the program crashed. It is even possible to debug the kernel, though that is a little trickier than the user applications we will be discussing in this section. gdb has quite good on-line help, as well as a set of info pages, so this section will concentrate on a few of the basic commands. Finally, if you find its text-based command-prompt style off-putting, there is a graphical front-end for it (xxgdb) in the ports collection. This section is intended to be an introduction to using gdb and does not cover specialized topics such as debugging the kernel. 
Running a program in the debugger

You will need to have compiled the program with the -g option to get the most out of using gdb. It will work without, but you will only see the name of the function you are in, instead of the source code. If you see a line like:

… (no debugging symbols found) …

when gdb starts up, you will know that the program was not compiled with the -g option.

At the gdb prompt, type break main. This will tell the debugger to skip over the preliminary set-up code in the program and start at the beginning of your code. Now type run to start the program—it will start at the beginning of the set-up code and then get stopped by the debugger when it calls main(). (If you have ever wondered where main() gets called from, now you know!).

You can now step through the program, a line at a time, by pressing n. If you get to a function call, you can step into it by pressing s. Once you are in a function call, you can return from it by pressing f. You can also use up and down to take a quick look at the caller.

Here is a simple example of how to spot a mistake in a program with gdb. This is our program (with a deliberate mistake):

#include <stdio.h>

int bazz(int anint);

main() {
	int i;

	printf("This is my program\n");
	bazz(i);
	return 0;
}

int bazz(int anint) {
	printf("anint = %d\n", anint);
	return anint;
}

This program sets i to be 5 and passes it to a function bazz() which prints out the number we gave it.

When we compile and run the program we get

&prompt.user; cc -g -o temp temp.c
&prompt.user; ./temp
This is my program
anint = 4231

That was not what we expected! Time to see what is going on!

&prompt.user; gdb temp
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.13 (i386-unknown-freebsd), Copyright 1994 Free Software Foundation, Inc.
(gdb) break main			Skip the set-up code
Breakpoint 1 at 0x160f: file temp.c, line 9.	gdb puts breakpoint at main()
(gdb) run				Run as far as main()
Starting program: /home/james/tmp/temp	Program starts running
Breakpoint 1, main () at temp.c:9	gdb stops at main()
(gdb) n					Go to next line
This is my program			Program prints out
(gdb) s					step into bazz()
bazz (anint=4231) at temp.c:17		gdb displays stack frame
(gdb)

Hang on a minute! How did anint get to be 4231? Did we not set it to be 5 in main()? Let's move up to main() and have a look.

(gdb) up				Move up call stack
#1  0x1625 in main () at temp.c:11	gdb displays stack frame
(gdb) p i				Show us the value of i
$1 = 4231				gdb displays 4231

Oh dear! Looking at the code, we forgot to initialize i. We meant to put

main() {
	int i;

	i = 5;
	printf("This is my program\n");
...

but we left the i=5; line out. As we did not initialize i, it had whatever number happened to be in that area of memory when the program ran, which in this case happened to be 4231.

gdb displays the stack frame every time we go into or out of a function, even if we are using up and down to move around the call stack. This shows the name of the function and the values of its arguments, which helps us keep track of where we are and what is going on. (The stack is a storage area where the program stores information about the arguments passed to functions and where to go when it returns from a function call).
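For reference, here is the corrected program in full:

#include <stdio.h>

int bazz(int anint);

main() {
	int i;

	i = 5;		/* the line we forgot */
	printf("This is my program\n");
	bazz(i);
	return 0;
}

int bazz(int anint) {
	printf("anint = %d\n", anint);
	return anint;
}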
Examining a core file

A core file is basically a file which contains the complete state of the process when it crashed. In the good old days, programmers had to print out hex listings of core files and sweat over machine code manuals, but now life is a bit easier. Incidentally, under FreeBSD and other 4.4BSD systems, a core file is called progname.core instead of just core, to make it clearer which program a core file belongs to.

To examine a core file, start up gdb in the usual way. Instead of typing break or run, type

(gdb) core progname.core

If you are not in the same directory as the core file, you will have to do dir /path/to/core/file first.

You should see something like this:

&prompt.user; gdb a.out
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.13 (i386-unknown-freebsd), Copyright 1994 Free Software Foundation, Inc.
(gdb) core a.out.core
Core was generated by `a.out'.
Program terminated with signal 11, Segmentation fault.
Cannot access memory at address 0x7020796d.
#0  0x164a in bazz (anint=0x5) at temp.c:17
(gdb)

In this case, the program was called a.out, so the core file is called a.out.core. We can see that the program crashed due to trying to access an area in memory that was not available to it in a function called bazz.

Sometimes it is useful to be able to see how a function was called, as the problem could have occurred a long way up the call stack in a complex program. The bt command causes gdb to print out a back-trace of the call stack:

(gdb) bt
#0  0x164a in bazz (anint=0x5) at temp.c:17
#1  0xefbfd888 in end ()
#2  0x162c in main () at temp.c:11
(gdb)

The end() function is called when a program crashes; in this case, the bazz() function was called from main().

Attaching to a running program

One of the neatest features about gdb is that it can attach to a program that is already running. Of course, that assumes you have sufficient permissions to do so. A common problem is when you are stepping through a program that forks, and you want to trace the child, but the debugger will only let you trace the parent.

What you do is start up another gdb, use ps to find the process ID for the child, and do

(gdb) attach pid

in gdb, and then debug as usual.

That is all very well, you are probably thinking, but by the time I have done that, the child process will be over the hill and far away. Fear not, gentle reader, here is how to do it (courtesy of the gdb info pages):

-if ((pid = fork()) < 0)	/* _Always_ check this */
+if ((pid = fork()) < 0)	/* _Always_ check this */
	error();
else if (pid == 0) {		/* child */
	int PauseMode = 1;

	while (PauseMode)
		sleep(10);	/* Wait until someone attaches to us */
	…
} else {			/* parent */
	…

Now all you have to do is attach to the child, set PauseMode to 0, and wait for the sleep() call to return!
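The attach-and-release sequence itself looks like this at the gdb prompt, where pid stands for the child's process ID that you looked up with ps:

(gdb) attach pid
(gdb) set var PauseMode = 0
(gdb) continue

set var assigns a new value to a variable in the running program, and continue lets the child run on past the sleep() loop.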
Using Emacs as a Development Environment

Emacs

Unfortunately, &unix; systems do not come with the kind of everything-you-ever-wanted-and-lots-more-you-did-not-in-one-gigantic-package integrated development environments that other systems have. Some powerful, free IDEs now exist, such as KDevelop in the ports collection. However, it is possible to set up your own environment. It may not be as pretty, and it may not be quite as integrated, but you can set it up the way you want it. And it is free. And you have the source to it.

The key to it all is Emacs. Now there are some people who loathe it, but many who love it. If you are one of the former, I am afraid this section will hold little of interest to you. Also, you will need a fair amount of memory to run it—I would recommend 8MB in text mode and 16MB in X as the bare minimum to get reasonable performance.

Emacs is basically a highly customizable editor—indeed, it has been customized to the point where it is more like an operating system than an editor! Many developers and sysadmins do in fact spend practically all their time working inside Emacs, leaving it only to log out.

It is impossible even to summarize everything Emacs can do here, but here are some of the features of interest to developers:

Very powerful editor, allowing search-and-replace on both strings and regular expressions (patterns), jumping to start/end of block expression, etc, etc.

Pull-down menus and online help.

Language-dependent syntax highlighting and indentation.

Completely customizable.

You can compile and debug programs within Emacs. On a compilation error, you can jump to the offending line of source code.

Friendly-ish front-end to the info program used for reading GNU hypertext documentation, including the documentation on Emacs itself.

Friendly front-end to gdb, allowing you to look at the source code as you step through your program.

You can read Usenet news and mail while your program is compiling.

And doubtless many more that I have overlooked.

Emacs can be installed on FreeBSD using the Emacs port. Once it is installed, start it up and do C-h t to read an Emacs tutorial—that means hold down the control key, press h, let go of the control key, and then press t. (Alternatively, you can use the mouse to select Emacs Tutorial from the Help menu).

Although Emacs does have menus, it is well worth learning the key bindings, as it is much quicker when you are editing something to press a couple of keys than to try to find the mouse and then click on the right place. And, when you are talking to seasoned Emacs users, you will find they often casually throw around expressions like M-x replace-s RET foo RET bar RET so it is useful to know what they mean. And in any case, Emacs has far too many useful functions for them to all fit on the menu bars.

Fortunately, it is quite easy to pick up the key-bindings, as they are displayed next to the menu item. My advice is to use the menu item for, say, opening a file until you understand how it works and feel confident with it, then try doing C-x C-f. When you are happy with that, move on to another menu command.

If you can not remember what a particular combination of keys does, select Describe Key from the Help menu and type it in—Emacs will tell you what it does. You can also use the Command Apropos menu item to find out all the commands which contain a particular word in them, with the key binding next to it.

By the way, the expression above means hold down the Meta key, press x, release the Meta key, type replace-s (short for replace-string—another feature of Emacs is that you can abbreviate commands), press the return key, type foo (the string you want replaced), press the return key, type bar (the string you want to replace foo with) and press return again. Emacs will then do the search-and-replace operation you have just requested. If you are wondering what on earth the Meta key is, it is a special key that many &unix; workstations have. Unfortunately, PCs do not have one, so it is usually the alt key (or if you are unlucky, the escape key).

Oh, and to get out of Emacs, do C-x C-c (that means hold down the control key, press x, press c and release the control key).
If you have any unsaved files open, Emacs will ask you if you want to save them. (Ignore the bit in the documentation where it says C-z is the usual way to leave Emacs—that leaves Emacs hanging around in the background, and is only really useful if you are on a system which does not have virtual terminals).

Configuring Emacs

Emacs does many wonderful things; some of them are built in, some of them need to be configured.

Instead of using a proprietary macro language for configuration, Emacs uses a version of Lisp specially adapted for editors, known as Emacs Lisp. Working with Emacs Lisp can be quite helpful if you want to go on and learn something like Common Lisp. Emacs Lisp has many features of Common Lisp, although it is considerably smaller (and thus easier to master).

The best way to learn Emacs Lisp is to download the Emacs Tutorial. However, there is no need to actually know any Lisp to get started with configuring Emacs, as I have included a sample .emacs file, which should be enough to get you started. Just copy it into your home directory and restart Emacs if it is already running; it will read the commands from the file and (hopefully) give you a useful basic setup.

A sample <filename>.emacs</filename> file

Unfortunately, there is far too much here to explain it in detail; however there are one or two points worth mentioning.

Everything beginning with a ; is a comment and is ignored by Emacs.

In the first line, the -*- Emacs-Lisp -*- is so that we can edit the .emacs file itself within Emacs and get all the fancy features for editing Emacs Lisp. Emacs usually tries to guess this based on the filename, and may not get it right for .emacs.

The tab key is bound to an indentation function in some modes, so when you press the tab key, it will indent the current line of code. If you want to put a tab character in whatever you are writing, hold the control key down while you are pressing the tab key.

This file supports syntax highlighting for C, C++, Perl, Lisp and Scheme, by guessing the language from the filename.

Emacs already has a pre-defined function called next-error. In a compilation output window, this allows you to move from one compilation error to the next by doing M-n; we define a complementary function, previous-error, that allows you to go to a previous error by doing M-p. The nicest feature of all is that C-c C-c will open up the source file in which the error occurred and jump to the appropriate line.

We enable Emacs's ability to act as a server, so that if you are doing something outside Emacs and you want to edit a file, you can just type in

&prompt.user; emacsclient filename

and then you can edit the file in your Emacs! Many Emacs users set their EDITOR environment variable to emacsclient so this happens every time they need to edit a file.

A sample <filename>.emacs</filename> file

;; -*-Emacs-Lisp-*-
;; This file is designed to be re-evaled; use the variable first-time
;; to avoid any problems with this.
(defvar first-time t
  "Flag signifying this is the first time that .emacs has been evaled")

;; Meta
(global-set-key "\M- " 'set-mark-command)
(global-set-key "\M-\C-h" 'backward-kill-word)
(global-set-key "\M-\C-r" 'query-replace)
(global-set-key "\M-r" 'replace-string)
(global-set-key "\M-g" 'goto-line)
(global-set-key "\M-h" 'help-command)

;; Function keys
(global-set-key [f1] 'manual-entry)
(global-set-key [f2] 'info)
(global-set-key [f3] 'repeat-complex-command)
(global-set-key [f4] 'advertised-undo)
(global-set-key [f5] 'eval-current-buffer)
(global-set-key [f6] 'buffer-menu)
(global-set-key [f7] 'other-window)
(global-set-key [f8] 'find-file)
(global-set-key [f9] 'save-buffer)
(global-set-key [f10] 'next-error)
(global-set-key [f11] 'compile)
(global-set-key [f12] 'grep)
(global-set-key [C-f1] 'compile)
(global-set-key [C-f2] 'grep)
(global-set-key [C-f3] 'next-error)
(global-set-key [C-f4] 'previous-error)
(global-set-key [C-f5] 'display-faces)
(global-set-key [C-f8] 'dired)
(global-set-key [C-f10] 'kill-compilation)

;; Keypad bindings
(global-set-key [up] "\C-p")
(global-set-key [down] "\C-n")
(global-set-key [left] "\C-b")
(global-set-key [right] "\C-f")
(global-set-key [home] "\C-a")
(global-set-key [end] "\C-e")
(global-set-key [prior] "\M-v")
(global-set-key [next] "\C-v")
(global-set-key [C-up] "\M-\C-b")
(global-set-key [C-down] "\M-\C-f")
(global-set-key [C-left] "\M-b")
(global-set-key [C-right] "\M-f")
(global-set-key [C-home] "\M-<")
(global-set-key [C-end] "\M->")
(global-set-key [C-prior] "\M-<")
(global-set-key [C-next] "\M->")

;; Mouse
(global-set-key [mouse-3] 'imenu)

;; Misc
(global-set-key [C-tab] "\C-q\t")	; Control tab quotes a tab.
(setq backup-by-copying-when-mismatch t)

;; Treat 'y' or <CR> as yes, 'n' as no.
(fset 'yes-or-no-p 'y-or-n-p)
(define-key query-replace-map [return] 'act)
(define-key query-replace-map [?\C-m] 'act)

;; Load packages
(require 'desktop)
(require 'tar-mode)

;; Pretty diff mode
(autoload 'ediff-buffers "ediff" "Intelligent Emacs interface to diff" t)
(autoload 'ediff-files "ediff" "Intelligent Emacs interface to diff" t)
(autoload 'ediff-files-remote "ediff"
  "Intelligent Emacs interface to diff")

(if first-time
    (setq auto-mode-alist
	  (append '(("\\.cpp$" . c++-mode)
		    ("\\.hpp$" . c++-mode)
		    ("\\.lsp$" . lisp-mode)
		    ("\\.scm$" . scheme-mode)
		    ("\\.pl$" . perl-mode)
		    ) auto-mode-alist)))

;; Auto font lock mode
(defvar font-lock-auto-mode-list
  (list 'c-mode 'c++-mode 'c++-c-mode 'emacs-lisp-mode 'lisp-mode
	'perl-mode 'scheme-mode)
  "List of modes to always start in font-lock-mode")

(defvar font-lock-mode-keyword-alist
  '((c++-c-mode . c-font-lock-keywords)
    (perl-mode . perl-font-lock-keywords))
  "Associations between modes and keywords")

(defun font-lock-auto-mode-select ()
  "Automatically select font-lock-mode if the current major mode is
in font-lock-auto-mode-list"
  (if (memq major-mode font-lock-auto-mode-list)
      (progn
	(font-lock-mode t))
    )
  )

(global-set-key [M-f1] 'font-lock-fontify-buffer)

;; New dabbrev stuff
;(require 'new-dabbrev)
(setq dabbrev-always-check-other-buffers t)
(setq dabbrev-abbrev-char-regexp "\\sw\\|\\s_")
(add-hook 'emacs-lisp-mode-hook
	  '(lambda ()
	     (set (make-local-variable 'dabbrev-case-fold-search) nil)
	     (set (make-local-variable 'dabbrev-case-replace) nil)))
(add-hook 'c-mode-hook
	  '(lambda ()
	     (set (make-local-variable 'dabbrev-case-fold-search) nil)
	     (set (make-local-variable 'dabbrev-case-replace) nil)))
(add-hook 'text-mode-hook
	  '(lambda ()
	     (set (make-local-variable 'dabbrev-case-fold-search) t)
	     (set (make-local-variable 'dabbrev-case-replace) t)))

;; C++ and C mode...
(defun my-c++-mode-hook ()
  (setq tab-width 4)
  (define-key c++-mode-map "\C-m" 'reindent-then-newline-and-indent)
  (define-key c++-mode-map "\C-ce" 'c-comment-edit)
  (setq c++-auto-hungry-initial-state 'none)
  (setq c++-delete-function 'backward-delete-char)
  (setq c++-tab-always-indent t)
  (setq c-indent-level 4)
  (setq c-continued-statement-offset 4)
  (setq c++-empty-arglist-indent 4))

(defun my-c-mode-hook ()
  (setq tab-width 4)
  (define-key c-mode-map "\C-m" 'reindent-then-newline-and-indent)
  (define-key c-mode-map "\C-ce" 'c-comment-edit)
  (setq c-auto-hungry-initial-state 'none)
  (setq c-delete-function 'backward-delete-char)
  (setq c-tab-always-indent t)
  ;; BSD-ish indentation style
  (setq c-indent-level 4)
  (setq c-continued-statement-offset 4)
  (setq c-brace-offset -4)
  (setq c-argdecl-indent 0)
  (setq c-label-offset -4))

;; Perl mode
(defun my-perl-mode-hook ()
  (setq tab-width 4)
  (define-key c++-mode-map "\C-m" 'reindent-then-newline-and-indent)
  (setq perl-indent-level 4)
  (setq perl-continued-statement-offset 4))

;; Scheme mode...
(defun my-scheme-mode-hook ()
  (define-key scheme-mode-map "\C-m" 'reindent-then-newline-and-indent))

;; Emacs-Lisp mode...
(defun my-lisp-mode-hook ()
  (define-key lisp-mode-map "\C-m" 'reindent-then-newline-and-indent)
  (define-key lisp-mode-map "\C-i" 'lisp-indent-line)
  (define-key lisp-mode-map "\C-j" 'eval-print-last-sexp))

;; Add all of the hooks...
(add-hook 'c++-mode-hook 'my-c++-mode-hook)
(add-hook 'c-mode-hook 'my-c-mode-hook)
(add-hook 'scheme-mode-hook 'my-scheme-mode-hook)
(add-hook 'emacs-lisp-mode-hook 'my-lisp-mode-hook)
(add-hook 'lisp-mode-hook 'my-lisp-mode-hook)
(add-hook 'perl-mode-hook 'my-perl-mode-hook)

;; Complement to next-error
(defun previous-error (n)
  "Visit previous compilation error message and corresponding source code."
  (interactive "p")
  (next-error (- n)))

;; Misc...
(transient-mark-mode 1) (setq mark-even-if-inactive t) (setq visible-bell nil) (setq next-line-add-newlines nil) (setq compile-command "make") (setq suggest-key-bindings nil) (put 'eval-expression 'disabled nil) (put 'narrow-to-region 'disabled nil) (put 'set-goal-column 'disabled nil) (if (>= emacs-major-version 21) (setq show-trailing-whitespace t)) ;; Elisp archive searching (autoload 'format-lisp-code-directory "lispdir" nil t) (autoload 'lisp-dir-apropos "lispdir" nil t) (autoload 'lisp-dir-retrieve "lispdir" nil t) (autoload 'lisp-dir-verify "lispdir" nil t) ;; Font lock mode (defun my-make-face (face color &optional bold) "Create a face from a color and optionally make it bold" (make-face face) (copy-face 'default face) (set-face-foreground face color) (if bold (make-face-bold face)) ) (if (eq window-system 'x) (progn (my-make-face 'blue "blue") (my-make-face 'red "red") (my-make-face 'green "dark green") (setq font-lock-comment-face 'blue) (setq font-lock-string-face 'bold) (setq font-lock-type-face 'bold) (setq font-lock-keyword-face 'bold) (setq font-lock-function-name-face 'red) (setq font-lock-doc-string-face 'green) (add-hook 'find-file-hooks 'font-lock-auto-mode-select) (setq baud-rate 1000000) (global-set-key "\C-cmm" 'menu-bar-mode) (global-set-key "\C-cms" 'scroll-bar-mode) (global-set-key [backspace] 'backward-delete-char) ; (global-set-key [delete] 'delete-char) (standard-display-european t) (load-library "iso-transl"))) ;; X11 or PC using direct screen writes (if window-system (progn ;; (global-set-key [M-f1] 'hilit-repaint-command) ;; (global-set-key [M-f2] [?\C-u M-f1]) (setq hilit-mode-enable-list '(not text-mode c-mode c++-mode emacs-lisp-mode lisp-mode scheme-mode) hilit-auto-highlight nil hilit-auto-rehighlight 'visible hilit-inhibit-hooks nil hilit-inhibit-rebinding t) (require 'hilit19) (require 'paren)) (setq baud-rate 2400) ; For slow serial connections ) ;; TTY type terminal (if (and (not window-system) (not (equal system-type 'ms-dos))) (progn (if first-time (progn (keyboard-translate ?\C-h ?\C-?) (keyboard-translate ?\C-? ?\C-h))))) ;; Under UNIX (if (not (equal system-type 'ms-dos)) (progn (if first-time (server-start)))) ;; Add any face changes here (add-hook 'term-setup-hook 'my-term-setup-hook) (defun my-term-setup-hook () (if (eq window-system 'pc) (progn ;; (set-face-background 'default "red") ))) ;; Restore the "desktop" - do this as late as possible (if first-time (progn (desktop-load-default) (desktop-read))) ;; Indicate that this file has been read at least once (setq first-time nil) ;; No need to debug anything now (setq debug-on-error nil) ;; All done (message "All done, %s%s" (user-login-name) ".") Extending the Range of Languages Emacs Understands Now, this is all very well if you only want to program in the languages already catered for in the .emacs file (C, C++, Perl, Lisp and Scheme), but what happens if a new language called whizbang comes out, full of exciting features? The first thing to do is find out if whizbang comes with any files that tell Emacs about the language. These usually end in .el, short for Emacs Lisp. For example, if whizbang is a FreeBSD port, we can locate these files by doing &prompt.user; find /usr/ports/lang/whizbang -name "*.el" -print and install them by copying them into the Emacs site Lisp directory. On FreeBSD 2.1.0-RELEASE, this is /usr/local/share/emacs/site-lisp. 
So for example, if the output from the find command was /usr/ports/lang/whizbang/work/misc/whizbang.el we would do &prompt.root; cp /usr/ports/lang/whizbang/work/misc/whizbang.el /usr/local/share/emacs/site-lisp Next, we need to decide what extension whizbang source files have. Let's say for the sake of argument that they all end in .wiz. We need to add an entry to our .emacs file to make sure Emacs will be able to use the information in whizbang.el. Find the auto-mode-alist entry in .emacs and add a line for whizbang, such as: ("\\.lsp$" . lisp-mode) ("\\.wiz$" . whizbang-mode) ("\\.scm$" . scheme-mode) This means that Emacs will automatically go into whizbang-mode when you edit a file ending in .wiz. Just below this, you will find the font-lock-auto-mode-list entry. Add whizbang-mode to it like so: ;; Auto font lock mode (defvar font-lock-auto-mode-list (list 'c-mode 'c++-mode 'c++-c-mode 'emacs-lisp-mode 'whizbang-mode 'lisp-mode 'perl-mode 'scheme-mode) "List of modes to always start in font-lock-mode") This means that Emacs will always enable font-lock-mode (i.e., syntax highlighting) when editing a .wiz file. And that is all that is needed. If there is anything else you want done automatically when you open up a .wiz file, you can add a whizbang-mode hook (see my-scheme-mode-hook for a simple example that adds auto-indent). Further Reading For information about setting up a development environment for contributing fixes to FreeBSD itself, please see &man.development.7;. Brian Harvey and Matthew Wright Simply Scheme MIT Press 1994 ISBN 0-262-08226-8 Randal Schwartz Learning Perl O'Reilly 1993 ISBN 1-56592-042-2 Patrick Henry Winston and Berthold Klaus Paul Horn Lisp (3rd Edition) Addison-Wesley 1989 ISBN 0-201-08319-1 Brian W. Kernighan and Rob Pike The Unix Programming Environment Prentice-Hall 1984 ISBN 0-13-937681-X Brian W. Kernighan and Dennis M. Ritchie The C Programming Language (2nd Edition) Prentice-Hall 1988 ISBN 0-13-110362-8 Bjarne Stroustrup The C++ Programming Language Addison-Wesley 1991 ISBN 0-201-53992-6 W. Richard Stevens Advanced Programming in the Unix Environment Addison-Wesley 1992 ISBN 0-201-56317-7 W. Richard Stevens Unix Network Programming Prentice-Hall 1990 ISBN 0-13-949876-1 diff --git a/en_US.ISO8859-1/books/developers-handbook/x86/chapter.sgml b/en_US.ISO8859-1/books/developers-handbook/x86/chapter.sgml index 0b3232e008..b4102f9fe4 100644 --- a/en_US.ISO8859-1/books/developers-handbook/x86/chapter.sgml +++ b/en_US.ISO8859-1/books/developers-handbook/x86/chapter.sgml @@ -1,6486 +1,6486 @@ x86 Assembly Language Programming This chapter was written by &a.stanislav;. Synopsis Assembly language programming under &unix; is highly undocumented. It is generally assumed that no one would ever want to use it because various &unix; systems run on different microprocessors, so everything should be written in C for portability. In reality, C portability is quite a myth. Even C programs need to be modified when ported from one &unix; to another, regardless of what processor each runs on. Typically, such a program is full of conditional statements depending on the system it is compiled for. Even if we believe that all of &unix; software should be written in C, or some other high-level language, we still need assembly language programmers: Who else would write the section of the C library that accesses the kernel? In this chapter I will attempt to show you how you can use assembly language to write &unix; programs, specifically under FreeBSD. 
This chapter does not explain the basics of assembly language. There are enough resources about that (for a complete online course in assembly language, see Randall Hyde's Art of Assembly Language; or if you prefer a printed book, take a look at Jeff Duntemann's Assembly Language Step-by-Step). However, once the chapter is finished, any assembly language programmer will be able to write programs for FreeBSD quickly and efficiently. Copyright © 2000-2001 G. Adam Stanislav. All rights reserved. The Tools The Assembler The most important tool for assembly language programming is the assembler, the software that converts assembly language code into machine language. Two very different assemblers are available for FreeBSD. One is &man.as.1;, which uses the traditional &unix; assembly language syntax. It comes with the system. The other is /usr/ports/devel/nasm. It uses the Intel syntax. Its main advantage is that it can assemble code for many operating systems. It needs to be installed separately, but is completely free. This chapter uses nasm syntax because most assembly language programmers coming to FreeBSD from other operating systems will find it easier to understand. And, because, quite frankly, that is what I am used to. The Linker The output of the assembler, like that of any compiler, needs to be linked to form an executable file. The standard &man.ld.1; linker comes with FreeBSD. It works with the code assembled with either assembler. System Calls Default Calling Convention By default, the FreeBSD kernel uses the C calling convention. Further, although the kernel is accessed using int 80h, it is assumed the program will call a function that issues int 80h, rather than issuing int 80h directly. This convention is very convenient, and quite superior to the &microsoft; convention used by &ms-dos;. Why? Because the &unix; convention allows any program written in any language to access the kernel. An assembly language program can do that as well. For example, we could open a file: kernel: int 80h ; Call kernel ret open: push dword mode push dword flags push dword path mov eax, 5 call kernel add esp, byte 12 ret This is a very clean and portable way of coding. If you need to port the code to a &unix; system which uses a different interrupt, or a different way of passing parameters, all you need to change is the kernel procedure. But assembly language programmers like to shave off cycles. The above example requires a call/ret combination. We can eliminate it by pushing an extra dword: open: push dword mode push dword flags push dword path mov eax, 5 push eax ; Or any other dword int 80h add esp, byte 16 The 5 that we have placed in EAX identifies the kernel function, in this case open. Alternate Calling Convention FreeBSD is an extremely flexible system. It offers other ways of calling the kernel. For it to work, however, the system must have Linux emulation installed. Linux is a &unix;-like system. However, its kernel uses the same system-call convention of passing parameters in registers that &ms-dos; does. As with the &unix; convention, the function number is placed in EAX. The parameters, however, are not passed on the stack but in EBX, ECX, EDX, ESI, EDI, EBP: open: mov eax, 5 mov ebx, path mov ecx, flags mov edx, mode int 80h This convention has a great disadvantage over the &unix; way, at least as far as assembly language programming is concerned: Every time you make a kernel call you must push the registers, then pop them later. This makes your code bulkier and slower. 
Nevertheless, FreeBSD gives you a choice. If you do choose the Linux convention, you must let the system know about it. After your program is assembled and linked, you need to brand the executable: &prompt.user; brandelf -t Linux filename Which Convention Should You Use? If you are coding specifically for FreeBSD, you should always use the &unix; convention: It is faster, you can store global variables in registers, you do not have to brand the executable, and you do not impose the installation of the Linux emulation package on the target system. If you want to create portable code that can also run on Linux, you will probably still want to give the FreeBSD users as efficient code as possible. I will show you how you can accomplish that after I have explained the basics. Call Numbers To tell the kernel which system service you are calling, place its number in EAX. Of course, you need to know what the number is. The syscalls File The numbers are listed in syscalls. locate syscalls finds this file in several different formats, all produced automatically from syscalls.master. You can find the master file for the default &unix; calling convention in /usr/src/sys/kern/syscalls.master. If you need to use the other convention implemented in the Linux emulation mode, read /usr/src/sys/i386/linux/syscalls.master. Not only do FreeBSD and Linux use different calling conventions, they sometimes use different numbers for the same functions. syscalls.master describes how the call is to be made: 0 STD NOHIDE { int nosys(void); } syscall nosys_args int 1 STD NOHIDE { void exit(int rval); } exit rexit_args void 2 STD POSIX { int fork(void); } 3 STD POSIX { ssize_t read(int fd, void *buf, size_t nbyte); } 4 STD POSIX { ssize_t write(int fd, const void *buf, size_t nbyte); } 5 STD POSIX { int open(char *path, int flags, int mode); } 6 STD POSIX { int close(int fd); } etc... It is the leftmost column that tells us the number to place in EAX. The rightmost column tells us what parameters to push. They are pushed from right to left. For example, to open a file, we need to push the mode first, then flags, then the address at which the path is stored. Return Values A system call would not be useful most of the time if it did not return some kind of a value: The file descriptor of an open file, the number of bytes read to a buffer, the system time, etc. Additionally, the system needs to inform us if an error occurs: A file does not exist, system resources are exhausted, we passed an invalid parameter, etc. Man Pages The traditional place to look for information about various system calls under &unix; systems is the manual pages. FreeBSD describes its system calls in section 2, sometimes in section 3. For example, &man.open.2; says:
If successful, open() returns a non-negative integer, termed a file descriptor. It returns -1 on failure, and sets errno to indicate the error.
The assembly language programmer new to &unix; and FreeBSD will immediately ask the puzzling question: Where is errno and how do I get to it? The information presented in the manual pages applies to C programs. The assembly language programmer needs additional information.
Where Are the Return Values? Unfortunately, it depends... For most system calls it is in EAX, but not for all. A good rule of thumb, when working with a system call for the first time, is to look for the return value in EAX. If it is not there, you need further research. I am aware of one system call that returns the value in EDX: SYS_fork. All others I have worked with use EAX. But I have not worked with them all yet. If you cannot find the answer here or anywhere else, study libc source code and see how it interfaces with the kernel. Where Is <varname>errno</varname>? Actually, nowhere... errno is part of the C language, not the &unix; kernel. When accessing kernel services directly, the error code is returned in EAX, the same register the proper return value generally ends up in. This makes perfect sense. If there is no error, there is no error code. If there is an error, there is no return value. One register can contain either. Determining an Error Occurred When using the standard FreeBSD calling convention, the carry flag is cleared upon success, set upon failure. When using the Linux emulation mode, the signed value in EAX is non-negative upon success, and contains the return value. In case of an error, the value is negative, i.e., -errno.
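To make this concrete, here is a minimal sketch of our own (the .failed label and the surrounding code are hypothetical, not taken from a later listing) showing the native FreeBSD test. Note that the carry flag must be examined before the stack is adjusted, because add modifies the carry flag:

	push dword mode
	push dword flags
	push dword path
	mov eax, 5	; SYS_open
	push eax	; or any other dword
	int 80h
	jc .failed	; carry set: EAX holds the error number
	add esp, byte 16	; carry clear: EAX holds the file descriptor

Under the Linux emulation convention, the same test becomes a sign check, or eax, eax followed by js .failed, with neg eax recovering the error number afterwards.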
Creating Portable Code Portability is generally not one of the strengths of assembly language. Yet, writing assembly language programs for different platforms is possible, especially with nasm. I have written assembly language libraries that can be assembled for such different operating systems as &windows; and FreeBSD. It is all the more possible when you want your code to run on two platforms which, while different, are based on similar architectures. For example, FreeBSD is &unix;, Linux is &unix;-like. I only mentioned three differences between them (from an assembly language programmer's perspective): The calling convention, the function numbers, and the way of returning values. Dealing with Function Numbers In many cases the function numbers are the same. However, even when they are not, the problem is easy to deal with: Instead of using numbers in your code, use constants which you have declared differently depending on the target architecture: %ifdef LINUX %define SYS_execve 11 %else %define SYS_execve 59 %endif Dealing with Conventions Both the calling convention and the return value (the errno problem) can be resolved with macros: %ifdef LINUX %macro system 0 call kernel %endmacro align 4 kernel: push ebx push ecx push edx push esi push edi push ebp mov ebx, [esp+32] mov ecx, [esp+36] mov edx, [esp+40] mov esi, [esp+44] mov edi, [esp+48] mov ebp, [esp+52] int 80h pop ebp pop edi pop esi pop edx pop ecx pop ebx or eax, eax js .errno clc ret .errno: neg eax stc ret %else %macro system 0 int 80h %endmacro %endif Dealing with Other Portability Issues The above solutions can handle most cases of writing code portable between FreeBSD and Linux. Nevertheless, with some kernel services the differences are deeper. In that case, you need to write two different handlers for those particular system calls, and use conditional assembly. Luckily, most of your code does something other than calling the kernel, so usually you will only need a few such conditional sections in your code. Using a Library You can avoid portability issues in your main code altogether by writing a library of system calls. Create a separate library for FreeBSD, a different one for Linux, and yet other libraries for more operating systems. In your library, write a separate function (or procedure, if you prefer the traditional assembly language terminology) for each system call. Use the C calling convention of passing parameters. But still use EAX to pass the call number in. In that case, your FreeBSD library can be very simple, as many seemingly different functions can be just labels to the same code: sys.open: sys.close: [etc...] int 80h ret Your Linux library will require more distinct functions. But even here you can group system calls using the same number of parameters: sys.exit: sys.close: [etc... one-parameter functions] push ebx mov ebx, [esp+8] int 80h pop ebx jmp sys.return ... sys.return: or eax, eax js sys.err clc ret sys.err: neg eax stc ret The library approach may seem inconvenient at first because it requires you to produce a separate file your code depends on. But it has many advantages: For one, you only need to write it once and can use it for all your programs. You can even let other assembly language programmers use it, or perhaps use one written by someone else. But perhaps the greatest advantage of the library is that your code can be ported to other systems, even by other programmers, by simply writing a new library without any changes to your code. 
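For illustration, here is a sketch of the caller's side (our own example; the SYS_open constant and the .error label are assumptions, not part of the library shown above). The sequence stays the same no matter which library is linked in, because both libraries report failure through the carry flag:

	push dword mode	; C calling convention: arguments right to left
	push dword flags
	push dword path
	mov eax, SYS_open	; the call number still travels in EAX
	call sys.open
	jc .error	; carry set means failure on either system
	add esp, byte 12	; the caller removes its own arguments

Note that with the FreeBSD library the call instruction itself supplies the extra dword the kernel expects on the stack, so the caller no longer pushes one.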
If you do not like the idea of having a library, you can at least place all your system calls in a separate assembly language file and link it with your main program. Here, again, all porters have to do is create a new object file to link with your main program. Using an Include File If you are releasing your software as (or with) source code, you can use macros and place them in a separate file, which you include in your code. Porters of your software will simply write a new include file. No library or external object file is necessary, yet your code is portable without any need to edit the code. This is the approach we will use throughout this chapter. We will name our include file system.inc, and add to it whenever we deal with a new system call. We can start our system.inc by declaring the standard file descriptors: %define stdin 0 %define stdout 1 %define stderr 2 Next, we create a symbolic name for each system call: %define SYS_nosys 0 %define SYS_exit 1 %define SYS_fork 2 %define SYS_read 3 %define SYS_write 4 ; [etc...] We add a short, non-global procedure with a long name, so we do not accidentally reuse the name in our code: section .text align 4 access.the.bsd.kernel: int 80h ret We create a macro which takes one argument, the syscall number: %macro system 1 mov eax, %1 call access.the.bsd.kernel %endmacro Finally, we create macros for each syscall. These macros take no arguments. %macro sys.exit 0 system SYS_exit %endmacro %macro sys.fork 0 system SYS_fork %endmacro %macro sys.read 0 system SYS_read %endmacro %macro sys.write 0 system SYS_write %endmacro ; [etc...] Go ahead, enter it into your editor and save it as system.inc. We will add more to it as we discuss more syscalls. Our First Program We are now ready for our first program, the mandatory Hello, World! 1: %include 'system.inc' 2: 3: section .data 4: hello db 'Hello, World!', 0Ah 5: hbytes equ $-hello 6: 7: section .text 8: global _start 9: _start: 10: push dword hbytes 11: push dword hello 12: push dword stdout 13: sys.write 14: 15: push dword 0 16: sys.exit Here is what it does: Line 1 includes the defines, the macros, and the code from system.inc. Lines 3-5 are the data: Line 3 starts the data section/segment. Line 4 contains the string "Hello, World!" followed by a new line (0Ah). Line 5 creates a constant that contains the length of the string from line 4 in bytes. Lines 7-16 contain the code. Note that FreeBSD uses the elf file format for its executables, which requires every program to start at the point labeled _start (or, more precisely, the linker expects that). This label has to be global. Lines 10-13 ask the system to write hbytes bytes of the hello string to stdout. Lines 15-16 ask the system to end the program with the return value of 0. The SYS_exit syscall never returns, so the code ends there. If you have come to &unix; from &ms-dos; assembly language background, you may be used to writing directly to the video hardware. You will never have to worry about this in FreeBSD, or any other flavor of &unix;. As far as you are concerned, you are writing to a file known as stdout. This can be the video screen, or a telnet terminal, or an actual file, or even the input of another program. Which one it is, is for the system to figure out. Assembling the Code Type the code (except the line numbers) in an editor, and save it in a file named hello.asm. You need nasm to assemble it. 
Installing <application>nasm</application> If you do not have nasm, type: &prompt.user; su Password:your root password &prompt.root; cd /usr/ports/devel/nasm &prompt.root; make install &prompt.root; exit &prompt.user; You may type make install clean instead of just make install if you do not want to keep nasm source code. Either way, FreeBSD will automatically download nasm from the Internet, compile it, and install it on your system. If your system is not FreeBSD, you need to get nasm from its home page. You can still use it to assemble FreeBSD code. Now you can assemble, link, and run the code: &prompt.user; nasm -f elf hello.asm &prompt.user; ld -s -o hello hello.o &prompt.user; ./hello Hello, World! &prompt.user; Writing &unix; Filters A common type of &unix; application is a filter—a program that reads data from the stdin, processes it somehow, then writes the result to stdout. In this chapter, we shall develop a simple filter, and learn how to read from stdin and write to stdout. This filter will convert each byte of its input into a hexadecimal number followed by a blank space. %include 'system.inc' section .data hex db '0123456789ABCDEF' buffer db 0, 0, ' ' section .text global _start _start: ; read a byte from stdin push dword 1 push dword buffer push dword stdin sys.read add esp, byte 12 or eax, eax je .done ; convert it to hex movzx eax, byte [buffer] mov edx, eax shr dl, 4 mov dl, [hex+edx] mov [buffer], dl and al, 0Fh mov al, [hex+eax] mov [buffer+1], al ; print it push dword 3 push dword buffer push dword stdout sys.write add esp, byte 12 jmp short _start .done: push dword 0 sys.exit In the data section we create an array called hex. It contains the 16 hexadecimal digits in ascending order. The array is followed by a buffer which we will use for both input and output. The first two bytes of the buffer are initially set to 0. This is where we will write the two hexadecimal digits (the first byte also is where we will read the input). The third byte is a space. The code section consists of four parts: Reading the byte, converting it to a hexadecimal number, writing the result, and eventually exiting the program. To read the byte, we ask the system to read one byte from stdin, and store it in the first byte of the buffer. The system returns the number of bytes read in EAX. This will be 1 while data is coming, or 0, when no more input data is available. Therefore, we check the value of EAX. If it is 0, we jump to .done, otherwise we continue. For simplicity sake, we are ignoring the possibility of an error condition at this time. The hexadecimal conversion reads the byte from the buffer into EAX, or actually just AL, while clearing the remaining bits of EAX to zeros. We also copy the byte to EDX because we need to convert the upper four bits (nibble) separately from the lower four bits. We store the result in the first two bytes of the buffer. Next, we ask the system to write the three bytes of the buffer, i.e., the two hexadecimal digits and the blank space, to stdout. We then jump back to the beginning of the program and process the next byte. Once there is no more input left, we ask the system to exit our program, returning a zero, which is the traditional value meaning the program was successful. Go ahead, and save the code in a file named hex.asm, then type the following (the ^D means press the control key and type D while holding the control key down): &prompt.user; nasm -f elf hex.asm &prompt.user; ld -s -o hex hex.o &prompt.user; ./hex Hello, World! 
48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 0A Here I come! 48 65 72 65 20 49 20 63 6F 6D 65 21 0A ^D &prompt.user; If you are migrating to &unix; from &ms-dos;, you may be wondering why each line ends with 0A instead of 0D 0A. This is because &unix; does not use the cr/lf convention, but a "new line" convention, which is 0A in hexadecimal. Can we improve this? Well, for one, it is a bit confusing because once we have converted a line of text, our input no longer starts at the beginning of the line. We can modify it to print a new line instead of a space after each 0A: %include 'system.inc' section .data hex db '0123456789ABCDEF' buffer db 0, 0, ' ' section .text global _start _start: mov cl, ' ' .loop: ; read a byte from stdin push dword 1 push dword buffer push dword stdin sys.read add esp, byte 12 or eax, eax je .done ; convert it to hex movzx eax, byte [buffer] mov [buffer+2], cl cmp al, 0Ah jne .hex mov [buffer+2], al .hex: mov edx, eax shr dl, 4 mov dl, [hex+edx] mov [buffer], dl and al, 0Fh mov al, [hex+eax] mov [buffer+1], al ; print it push dword 3 push dword buffer push dword stdout sys.write add esp, byte 12 jmp short .loop .done: push dword 0 sys.exit We have stored the space in the CL register. We can do this safely because, unlike &microsoft.windows;, &unix; system calls do not modify the value of any register they do not use to return a value in. That means we only need to set CL once. We have, therefore, added a new label .loop and jump to it for the next byte instead of jumping at _start. We have also added the .hex label so we can either have a blank space or a new line as the third byte of the buffer. Once you have changed hex.asm to reflect these changes, type: &prompt.user; nasm -f elf hex.asm &prompt.user; ld -s -o hex hex.o &prompt.user; ./hex Hello, World! 48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 0A Here I come! 48 65 72 65 20 49 20 63 6F 6D 65 21 0A ^D &prompt.user; That looks better. But this code is quite inefficient! We are making a system call for every single byte twice (once to read it, another time to write the output). Buffered Input and Output We can improve the efficiency of our code by buffering our input and output. We create an input buffer and read a whole sequence of bytes at one time. Then we fetch them one by one from the buffer. We also create an output buffer. We store our output in it until it is full. At that time we ask the kernel to write the contents of the buffer to stdout. The program ends when there is no more input. But we still need to ask the kernel to write the contents of our output buffer to stdout one last time, otherwise some of our output would make it to the output buffer, but never be sent out. Do not forget that, or you will be wondering why some of your output is missing. 
%include 'system.inc' %define BUFSIZE 2048 section .data hex db '0123456789ABCDEF' section .bss ibuffer resb BUFSIZE obuffer resb BUFSIZE section .text global _start _start: sub eax, eax sub ebx, ebx sub ecx, ecx mov edi, obuffer .loop: ; read a byte from stdin call getchar ; convert it to hex mov dl, al shr al, 4 mov al, [hex+eax] call putchar mov al, dl and al, 0Fh mov al, [hex+eax] call putchar mov al, ' ' cmp dl, 0Ah jne .put mov al, dl .put: call putchar jmp short .loop align 4 getchar: or ebx, ebx jne .fetch call read .fetch: lodsb dec ebx ret read: push dword BUFSIZE mov esi, ibuffer push esi push dword stdin sys.read add esp, byte 12 mov ebx, eax or eax, eax je .done sub eax, eax ret align 4 .done: call write ; flush output buffer push dword 0 sys.exit align 4 putchar: stosb inc ecx cmp ecx, BUFSIZE je write ret align 4 write: sub edi, ecx ; start of buffer push ecx push edi push dword stdout sys.write add esp, byte 12 sub eax, eax sub ecx, ecx ; buffer is empty now ret We now have a third section in the source code, named .bss. This section is not included in our executable file, and, therefore, cannot be initialized. We use resb instead of db. It simply reserves the requested size of uninitialized memory for our use. We take advantage of the fact that the system does not modify the registers: We use registers for what, otherwise, would have to be global variables stored in the .data section. This is also why the &unix; convention of passing parameters to system calls on the stack is superior to the Microsoft convention of passing them in the registers: We can keep the registers for our own use. We use EDI and ESI as pointers to the next byte to be read from or written to. We use EBX and ECX to keep count of the number of bytes in the two buffers, so we know when to dump the output to, or read more input from, the system. Let us see how it works now: &prompt.user; nasm -f elf hex.asm &prompt.user; ld -s -o hex hex.o &prompt.user; ./hex Hello, World! Here I come! 48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 0A 48 65 72 65 20 49 20 63 6F 6D 65 21 0A ^D &prompt.user; Not what you expected? The program did not print the output until we pressed ^D. That is easy to fix by inserting three lines of code to write the output every time we have converted a new line to 0A. I have marked the three lines with > (do not copy the > in your hex.asm). %include 'system.inc' %define BUFSIZE 2048 section .data hex db '0123456789ABCDEF' section .bss ibuffer resb BUFSIZE obuffer resb BUFSIZE section .text global _start _start: sub eax, eax sub ebx, ebx sub ecx, ecx mov edi, obuffer .loop: ; read a byte from stdin call getchar ; convert it to hex mov dl, al shr al, 4 mov al, [hex+eax] call putchar mov al, dl and al, 0Fh mov al, [hex+eax] call putchar mov al, ' ' cmp dl, 0Ah jne .put mov al, dl .put: call putchar > cmp al, 0Ah > jne .loop > call write jmp short .loop align 4 getchar: or ebx, ebx jne .fetch call read .fetch: lodsb dec ebx ret read: push dword BUFSIZE mov esi, ibuffer push esi push dword stdin sys.read add esp, byte 12 mov ebx, eax or eax, eax je .done sub eax, eax ret align 4 .done: call write ; flush output buffer push dword 0 sys.exit align 4 putchar: stosb inc ecx cmp ecx, BUFSIZE je write ret align 4 write: sub edi, ecx ; start of buffer push ecx push edi push dword stdout sys.write add esp, byte 12 sub eax, eax sub ecx, ecx ; buffer is empty now ret Now, let us see how it works: &prompt.user; nasm -f elf hex.asm &prompt.user; ld -s -o hex hex.o &prompt.user; ./hex Hello, World! 
48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 0A Here I come! 48 65 72 65 20 49 20 63 6F 6D 65 21 0A ^D &prompt.user; Not bad for a 644-byte executable, is it! This approach to buffered input/output still contains a hidden danger. I will discuss—and fix—it later, when I talk about the dark side of buffering. How to Unread a Character This may be a somewhat advanced topic, mostly of interest to programmers familiar with the theory of compilers. If you wish, you may skip to the next section, and perhaps read this later. While our sample program does not require it, more sophisticated filters often need to look ahead. In other words, they may need to see what the next character is (or even several characters). If the next character is of a certain value, it is part of the token currently being processed. Otherwise, it is not. For example, you may be parsing the input stream for a textual string (e.g., when implementing a language compiler): If a character is followed by another character, or perhaps a digit, it is part of the token you are processing. If it is followed by white space, or some other value, then it is not part of the current token. This presents an interesting problem: How to return the next character back to the input stream, so it can be read again later? One possible solution is to store it in a character variable, then set a flag. We can modify getchar to check the flag, and if it is set, fetch the byte from that variable instead of the input buffer, and reset the flag. But, of course, that slows us down. The C language has an ungetc() function, just for that purpose. Is there a quick way to implement it in our code? I would like you to scroll back up and take a look at the getchar procedure and see if you can find a nice and fast solution before reading the next paragraph. Then come back here and see my own solution. The key to returning a character back to the stream is in how we are getting the characters to start with: First we check if the buffer is empty by testing the value of EBX. If it is zero, we call the read procedure. If we do have a character available, we use lodsb, then decrease the value of EBX. The lodsb instruction is effectively identical to: mov al, [esi] inc esi The byte we have fetched remains in the buffer until the next time read is called. We do not know when that happens, but we do know it will not happen until the next call to getchar. Hence, to "return" the last-read byte back to the stream, all we have to do is decrease the value of ESI and increase the value of EBX: ungetc: dec esi inc ebx ret But, be careful! We are perfectly safe doing this if our look-ahead is at most one character at a time. If we are examining more than one upcoming character and call ungetc several times in a row, it will work most of the time, but not all the time (and will be tough to debug). Why? Because as long as getchar does not have to call read, all of the pre-read bytes are still in the buffer, and our ungetc works without a glitch. But the moment getchar calls read, the contents of the buffer change. We can always rely on ungetc working properly on the last character we have read with getchar, but not on anything we have read before that. If your program reads more than one byte ahead, you have at least two choices: If possible, modify the program so it only reads one byte ahead. This is the simplest solution. If that option is not available, first of all determine the maximum number of characters your program needs to return to the input stream at one time. 
Increase that number slightly, just to be sure, preferably to a multiple of 16—so it aligns nicely. Then modify the .bss section of your code, and create a small "spare" buffer right before your input buffer, something like this: section .bss resb 16 ; or whatever the value you came up with ibuffer resb BUFSIZE obuffer resb BUFSIZE You also need to modify your ungetc to pass the value of the byte to unget in AL: ungetc: dec esi inc ebx mov [esi], al ret With this modification, you can call ungetc up to 17 times in a row safely (the first call will still be within the buffer, the remaining 16 may be either within the buffer or within the "spare"). Command Line Arguments Our hex program will be more useful if it can read the names of an input and output file from its command line, i.e., if it can process the command line arguments. But... Where are they? Before a &unix; system starts a program, it pushes some data on the stack, then jumps at the _start label of the program. Yes, I said jumps, not calls. That means the data can be accessed by reading [esp+offset], or by simply popping it. The value at the top of the stack contains the number of command line arguments. It is traditionally called argc, for "argument count." Command line arguments follow next, all argc of them. These are typically referred to as argv, for "argument value(s)." That is, we get argv[0], argv[1], ..., argv[argc-1]. These are not the actual arguments, but pointers to arguments, i.e., memory addresses of the actual arguments. The arguments themselves are NUL-terminated character strings. The argv list is followed by a NULL pointer, which is simply a 0. There is more, but this is enough for our purposes right now. If you have come from the &ms-dos; programming environment, the main difference is that each argument is in a separate string. The second difference is that there is no practical limit on how many arguments there can be. Armed with this knowledge, we are almost ready for the next version of hex.asm. 
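As a quick sketch (our own, separate from the program that follows; the label names are arbitrary), picking these values off the stack at the program's entry point could look like this:

_start:
	pop eax	; argc, the number of command line arguments
	pop ebx	; argv[0], a pointer to the program name
	pop ecx	; argv[1], or the NULL pointer if no argument was given
	jecxz .no.arguments	; a zero pointer marks the end of the list
	; ECX now points at the first real argument, a NUL-terminated string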
First, however, we need to add a few lines to system.inc: First, we need to add two new entries to our list of system call numbers: %define SYS_open 5 %define SYS_close 6 Then we add two new macros at the end of the file: %macro sys.open 0 system SYS_open %endmacro %macro sys.close 0 system SYS_close %endmacro Here, then, is our modified source code: %include 'system.inc' %define BUFSIZE 2048 section .data fd.in dd stdin fd.out dd stdout hex db '0123456789ABCDEF' section .bss ibuffer resb BUFSIZE obuffer resb BUFSIZE section .text align 4 err: push dword 1 ; return failure sys.exit align 4 global _start _start: add esp, byte 8 ; discard argc and argv[0] pop ecx jecxz .init ; no more arguments ; ECX contains the path to input file push dword 0 ; O_RDONLY push ecx sys.open jc err ; open failed add esp, byte 8 mov [fd.in], eax pop ecx jecxz .init ; no more arguments ; ECX contains the path to output file push dword 420 ; file mode (644 octal) push dword 0200h | 0400h | 01h ; O_CREAT | O_TRUNC | O_WRONLY push ecx sys.open jc err add esp, byte 12 mov [fd.out], eax .init: sub eax, eax sub ebx, ebx sub ecx, ecx mov edi, obuffer .loop: ; read a byte from input file or stdin call getchar ; convert it to hex mov dl, al shr al, 4 mov al, [hex+eax] call putchar mov al, dl and al, 0Fh mov al, [hex+eax] call putchar mov al, ' ' cmp dl, 0Ah jne .put mov al, dl .put: call putchar cmp al, dl jne .loop call write jmp short .loop align 4 getchar: or ebx, ebx jne .fetch call read .fetch: lodsb dec ebx ret read: push dword BUFSIZE mov esi, ibuffer push esi push dword [fd.in] sys.read add esp, byte 12 mov ebx, eax or eax, eax je .done sub eax, eax ret align 4 .done: call write ; flush output buffer ; close files push dword [fd.in] sys.close push dword [fd.out] sys.close ; return success push dword 0 sys.exit align 4 putchar: stosb inc ecx cmp ecx, BUFSIZE je write ret align 4 write: sub edi, ecx ; start of buffer push ecx push edi push dword [fd.out] sys.write add esp, byte 12 sub eax, eax sub ecx, ecx ; buffer is empty now ret In our .data section we now have two new variables, fd.in and fd.out. We store the input and output file descriptors here. In the .text section we have replaced the references to stdin and stdout with [fd.in] and [fd.out]. The .text section now starts with a simple error handler, which does nothing but exit the program with a return value of 1. The error handler is before _start so we are within a short distance from where the errors occur. Naturally, the program execution still begins at _start. First, we remove argc and argv[0] from the stack: They are of no interest to us (in this program, that is). We pop argv[1] to ECX. This register is particularly suited for pointers, as we can handle NULL pointers with jecxz. If argv[1] is not NULL, we try to open the file named in the first argument. Otherwise, we continue the program as before: Reading from stdin, writing to stdout. If we fail to open the input file (e.g., it does not exist), we jump to the error handler and quit. If all went well, we now check for the second argument. If it is there, we open the output file. Otherwise, we send the output to stdout. If we fail to open the output file (e.g., it exists and we do not have the write permission), we, again, jump to the error handler. The rest of the code is the same as before, except we close the input and output files before exiting, and, as mentioned, we use [fd.in] and [fd.out]. Our executable is now a whopping 768 bytes long. Can we still improve it? Of course! 
Every program can be improved. Here are a few ideas of what we could do: Have our error handler print a message to stderr. Add error handlers to the read and write functions. Close stdin when we open an input file, stdout when we open an output file. Add command line switches, such as -i and -o, so we can list the input and output files in any order, or perhaps read from stdin and write to a file. Print a usage message if command line arguments are incorrect. I shall leave these enhancements as an exercise to the reader: You already know everything you need to know to implement them. &unix; Environment An important &unix; concept is the environment, which is defined by environment variables. Some are set by the system, others by you, yet others by the shell, or any program that loads another program. How to Find Environment Variables I said earlier that when a program starts executing, the stack contains argc followed by the NULL-terminated argv array, followed by something else. The "something else" is the environment, or, to be more precise, a NULL-terminated array of pointers to environment variables. This is often referred to as env. The structure of env is the same as that of argv, a list of memory addresses followed by a NULL (0). In this case, there is no "envc"—we figure out where the array ends by searching for the final NULL. The variables usually come in the name=value format, but sometimes the =value part may be missing. We need to account for that possibility. webvars I could just show you some code that prints the environment the same way the &unix; env command does. But I thought it would be more interesting to write a simple assembly language CGI utility. CGI: A Quick Overview I have a detailed CGI tutorial on my web site, but here is a very quick overview of CGI: The web server communicates with the CGI program by setting environment variables. The CGI program sends its output to stdout. The web server reads it from there. It must start with an HTTP header followed by two blank lines. It then prints the HTML code, or whatever other type of data it is producing. While certain environment variables use standard names, others vary, depending on the web server. That makes webvars quite a useful diagnostic tool. The Code Our webvars program, then, must send out the HTTP header followed by some HTML mark-up. It then must read the environment variables one by one and send them out as part of the HTML page. The code follows. I placed comments and explanations right inside the code: ;;;;;;; webvars.asm ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; ; Copyright (c) 2000 G. Adam Stanislav ; All rights reserved. ; ; Redistribution and use in source and binary forms, with or without ; modification, are permitted provided that the following conditions ; are met: ; 1. Redistributions of source code must retain the above copyright ; notice, this list of conditions and the following disclaimer. ; 2. Redistributions in binary form must reproduce the above copyright ; notice, this list of conditions and the following disclaimer in the ; documentation and/or other materials provided with the distribution. ; ; THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND ; ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE ; IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ; ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE ; FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL ; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS ; OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) ; HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT ; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY ; OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF ; SUCH DAMAGE. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; ; Version 1.0 ; ; Started: 8-Dec-2000 ; Updated: 8-Dec-2000 ; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; %include 'system.inc' section .data http db 'Content-type: text/html', 0Ah, 0Ah db '<?xml version="1.0" encoding="UTF-8"?>', 0Ah db '<!DOCTYPE html PUBLIC "-//W3C/DTD XHTML Strict//EN" ' db '"DTD/xhtml1-strict.dtd">', 0Ah db '<html xmlns="http://www.w3.org/1999/xhtml" ' db 'xml.lang="en" lang="en">', 0Ah db '<head>', 0Ah db '<title>Web Environment</title>', 0Ah db '<meta name="author" content="G. Adam Stanislav" />', 0Ah db '</head>', 0Ah, 0Ah db '<body bgcolor="#ffffff" text="#000000" link="#0000ff" ' db 'vlink="#840084" alink="#0000ff">', 0Ah db '<div class="webvars">', 0Ah db '<h1>Web Environment</h1>', 0Ah db '<p>The following <b>environment variables</b> are defined ' db 'on this web server:</p>', 0Ah, 0Ah db '<table align="center" width="80" border="0" cellpadding="10" ' db 'cellspacing="0" class="webvars">', 0Ah httplen equ $-http left db '<tr>', 0Ah db '<td class="name"><tt>' leftlen equ $-left middle db '</tt></td>', 0Ah db '<td class="value"><tt><b>' midlen equ $-middle undef db '<i>(undefined)</i>' undeflen equ $-undef right db '</b></tt></td>', 0Ah db '</tr>', 0Ah rightlen equ $-right wrap db '</table>', 0Ah db '</div>', 0Ah db '</body>', 0Ah db '</html>', 0Ah, 0Ah wraplen equ $-wrap section .text global _start _start: ; First, send out all the http and xhtml stuff that is ; needed before we start showing the environment push dword httplen push dword http push dword stdout sys.write ; Now find how far on the stack the environment pointers ; are. We have 12 bytes we have pushed before "argc" mov eax, [esp+12] ; We need to remove the following from the stack: ; ; The 12 bytes we pushed for sys.write ; The 4 bytes of argc ; The EAX*4 bytes of argv ; The 4 bytes of the NULL after argv ; ; Total: ; 20 + eax * 4 ; ; Because stack grows down, we need to ADD that many bytes ; to ESP. lea esp, [esp+20+eax*4] cld ; This should already be the case, but let's be sure. ; Loop through the environment, printing it out .loop: pop edi or edi, edi ; Done yet? je near .wrap ; Print the left part of HTML push dword leftlen push dword left push dword stdout sys.write ; It may be tempting to search for the '=' in the env string next. ; But it is possible there is no '=', so we search for the ; terminating NUL first. 
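; (A note on the idiom that follows: ECX is first set to 0FFFFFFFFh, and
; with AL = 0, repne scasb walks EDI forward, decrementing ECX for each
; byte examined, until it finds the terminating NUL. Since NOT x equals
; 0FFFFFFFFh minus x, the subsequent "not ecx" turns the decremented
; counter into the number of bytes scanned, i.e., the string length plus
; the NUL. A similar bounded scan with AL = '=' then measures the length
; of the variable's name.)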
mov esi, edi ; Save start of string sub ecx, ecx not ecx ; ECX = FFFFFFFF sub eax, eax repne scasb not ecx ; ECX = string length + 1 mov ebx, ecx ; Save it in EBX ; Now is the time to find '=' mov edi, esi ; Start of string mov al, '=' repne scasb not ecx add ecx, ebx ; Length of name push ecx push esi push dword stdout sys.write ; Print the middle part of HTML table code push dword midlen push dword middle push dword stdout sys.write ; Find the length of the value not ecx lea ebx, [ebx+ecx-1] ; Print "undefined" if 0 or ebx, ebx jne .value mov ebx, undeflen mov edi, undef .value: push ebx push edi push dword stdout sys.write ; Print the right part of the table row push dword rightlen push dword right push dword stdout sys.write ; Get rid of the 60 bytes we have pushed add esp, byte 60 ; Get the next variable jmp .loop .wrap: ; Print the rest of HTML push dword wraplen push dword wrap push dword stdout sys.write ; Return success push dword 0 sys.exit This code produces a 1,396-byte executable. Most of it is data, i.e., the HTML mark-up we need to send out. Assemble and link it as usual: &prompt.user; nasm -f elf webvars.asm &prompt.user; ld -s -o webvars webvars.o To use it, you need to upload webvars to your web server. Depending on how your web server is set up, you may have to store it in a special cgi-bin directory, or perhaps rename it with a .cgi extension. Then you need to use your browser to view its output. To see its output on my web server, please go to http://www.int80h.org/webvars/. If curious about the additional environment variables present in a password protected web directory, go to http://www.int80h.org/private/, using the name asm and password programmer. Working with Files We have already done some basic file work: We know how to open and close them, how to read and write them using buffers. But &unix; offers much more functionality when it comes to files. We will examine some of it in this section, and end up with a nice file conversion utility. Indeed, let us start at the end, that is, with the file conversion utility. It always makes programming easier when we know from the start what the end product is supposed to do. One of the first programs I wrote for &unix; was tuc, a text-to-&unix; file converter. It converts a text file from other operating systems to a &unix; text file. In other words, it changes from different kind of line endings to the newline convention of &unix;. It saves the output in a different file. Optionally, it converts a &unix; text file to a DOS text file. I have used tuc extensively, but always only to convert from some other OS to &unix;, never the other way. I have always wished it would just overwrite the file instead of me having to send the output to a different file. Most of the time, I end up using it like this: &prompt.user; tuc myfile tempfile &prompt.user; mv tempfile myfile It would be nice to have a ftuc, i.e., fast tuc, and use it like this: &prompt.user; ftuc myfile In this chapter, then, we will write ftuc in assembly language (the original tuc is in C), and study various file-oriented kernel services in the process. At first sight, such a file conversion is very simple: All you have to do is strip the carriage returns, right? If you answered yes, think again: That approach will work most of the time (at least with MS DOS text files), but will fail occasionally. The problem is that not all non &unix; text files end their line with the carriage return / line feed sequence. Some use carriage returns without line feeds. 
Others combine several blank lines into a single carriage return followed by several line feeds. And so on. A text file converter, then, must be able to handle any possible line endings: carriage return / line feed carriage return line feed / carriage return line feed It should also handle files that use some kind of a combination of the above (e.g., carriage return followed by several line feeds). Finite State Machine The problem is easily solved by the use of a technique called finite state machine, originally developed by the designers of digital electronic circuits. A finite state machine is a digital circuit whose output is dependent not only on its input but on its previous input, i.e., on its state. The microprocessor is an example of a finite state machine: Our assembly language code is assembled to machine language in which some assembly language code produces a single byte of machine language, while others produce several bytes. As the microprocessor fetches the bytes from the memory one by one, some of them simply change its state rather than produce some output. When all the bytes of the op code are fetched, the microprocessor produces some output, or changes the value of a register, etc. Because of that, all software is essentially a sequence of state instructions for the microprocessor. Nevertheless, the concept of finite state machine is useful in software design as well. Our text file converter can be designed as a finite state machine with three possible states. We could call them states 0-2, but it will make our life easier if we give them symbolic names: ordinary cr lf Our program will start in the ordinary state. During this state, the program action depends on its input as follows: If the input is anything other than a carriage return or line feed, the input is simply passed on to the output. The state remains unchanged. If the input is a carriage return, the state is changed to cr. The input is then discarded, i.e., no output is made. If the input is a line feed, the state is changed to lf. The input is then discarded. Whenever we are in the cr state, it is because the last input was a carriage return, which was unprocessed. What our software does in this state again depends on the current input: If the input is anything other than a carriage return or line feed, output a line feed, then output the input, then change the state to ordinary. If the input is a carriage return, we have received two (or more) carriage returns in a row. We discard the input, we output a line feed, and leave the state unchanged. If the input is a line feed, we output the line feed and change the state to ordinary. Note that this is not the same as the first case above – if we tried to combine them, we would be outputting two line feeds instead of one. Finally, we are in the lf state after we have received a line feed that was not preceded by a carriage return. This will happen when our file already is in &unix; format, or whenever several lines in a row are expressed by a single carriage return followed by several line feeds, or when line ends with a line feed / carriage return sequence. Here is how we need to handle our input in this state: If the input is anything other than a carriage return or line feed, we output a line feed, then output the input, then change the state to ordinary. This is exactly the same action as in the cr state upon receiving the same kind of input. If the input is a carriage return, we discard the input, we output a line feed, then change the state to ordinary. 
If the input is a line feed, we output the line feed, and leave the state unchanged. The Final State The above finite state machine works for the entire file, but leaves the possibility that the final line end will be ignored. That will happen whenever the file ends with a single carriage return or a single line feed. I did not think of it when I wrote tuc, just to discover that occasionally it strips the last line ending. This problem is easily fixed by checking the state after the entire file was processed. If the state is not ordinary, we simply need to output one last line feed. Now that we have expressed our algorithm as a finite state machine, we could easily design a dedicated digital electronic circuit (a "chip") to do the conversion for us. Of course, doing so would be considerably more expensive than writing an assembly language program. The Output Counter Because our file conversion program may be combining two characters into one, we need to use an output counter. We initialize it to 0, and increase it every time we send a character to the output. At the end of the program, the counter will tell us what size we need to set the file to. Implementing FSM in Software The hardest part of working with a finite state machine is analyzing the problem and expressing it as a finite state machine. That accomplished, the software almost writes itself. In a high-level language, such as C, there are several main approaches. One is to use a switch statement which chooses what function should be run. For example, switch (state) { default: case REGULAR: regular(inputchar); break; case CR: cr(inputchar); break; case LF: lf(inputchar); break; } Another approach is by using an array of function pointers, something like this: (output[state])(inputchar); Yet another is to have state be a function pointer, set to point at the appropriate function: (*state)(inputchar); This is the approach we will use in our program because it is very easy to do in assembly language, and very fast, too. We will simply keep the address of the right procedure in EBX, and then just issue: call ebx This is possibly faster than hardcoding the address in the code because the microprocessor does not have to fetch the address from the memory—it is already stored in one of its registers. I said possibly because with the caching modern microprocessors do, either way may be equally fast. Memory Mapped Files Because our program works on a single file, we cannot use the approach that worked for us before, i.e., to read from an input file and to write to an output file. &unix; allows us to map a file, or a section of a file, into memory. To do that, we first need to open the file with the appropriate read/write flags. Then we use the mmap system call to map it into the memory. One nice thing about mmap is that it automatically works with virtual memory: We can map more of the file into the memory than we have physical memory available, yet still access it through regular memory op codes, such as mov, lods, and stos. Whatever changes we make to the memory image of the file will be written to the file by the system. We do not even have to keep the file open: As long as it stays mapped, we can read from it and write to it. The 32-bit Intel microprocessors can access up to four gigabytes of memory – physical or virtual. The FreeBSD system allows us to use up to a half of it for file mapping. For simplicity sake, in this tutorial we will only convert files that can be mapped into the memory in their entirety. 
There are probably not too many text files that exceed two gigabytes in size. If our program encounters one, it will simply display a message suggesting we use the original tuc instead.

If you examine your copy of syscalls.master, you will find two separate syscalls named mmap. This is because of the evolution of &unix;: There was the traditional BSD mmap, syscall 71. That one was superseded by the &posix; mmap, syscall 197. The FreeBSD system supports both because older programs were written using the original BSD version. But new software uses the &posix; version, which is what we will use.

The syscalls.master file lists the &posix; version like this:

197	STD	BSD	{ caddr_t mmap(caddr_t addr, size_t len, int prot, \
			int flags, int fd, long pad, off_t pos); }

This differs slightly from what the mmap(2) manual page says. That is because the manual page describes the C version. The difference is in the long pad argument, which is not present in the C version. However, the FreeBSD syscalls add a 32-bit pad after pushing a 64-bit argument. In this case, off_t is a 64-bit value.

When we are finished working with a memory-mapped file, we unmap it with the munmap syscall.

For an in-depth treatment of mmap, see W. Richard Stevens' Unix Network Programming, Volume 2, Chapter 12.

Determining File Size

Because we need to tell mmap how many bytes of the file to map into the memory, and because we want to map the entire file, we need to determine the size of the file.

We can use the fstat syscall to get all the information about an open file that the system can give us. That includes the file size. Again, syscalls.master lists two versions of fstat, a traditional one (syscall 62), and a &posix; one (syscall 189). Naturally, we will use the &posix; version:

189	STD	POSIX	{ int fstat(int fd, struct stat *sb); }

This is a very straightforward call: We pass to it the address of a stat structure and the descriptor of an open file. It will fill out the contents of the stat structure.

I do, however, have to say that I tried to declare the stat structure in the .bss section, and fstat did not like it: It set the carry flag indicating an error. After I changed the code to allocate the structure on the stack, everything was working fine.

Changing the File Size

Because our program may combine carriage return / line feed sequences into straight line feeds, our output may be smaller than our input. However, since we are placing our output into the same file we read the input from, we may have to change the size of the file.

The ftruncate system call allows us to do just that. Despite its somewhat misleading name, the ftruncate system call can be used both to truncate the file (make it smaller) and to grow it.

And yes, we will find two versions of ftruncate in syscalls.master, an older one (130), and a newer one (201). We will use the newer one:

201	STD	BSD	{ int ftruncate(int fd, int pad, off_t length); }

Please note that this one contains an int pad again.

ftuc

We now know everything we need to write ftuc. We start by adding some new lines in system.inc.
First, we define some constants and structures, somewhere at or near the beginning of the file: ;;;;;;; open flags %define O_RDONLY 0 %define O_WRONLY 1 %define O_RDWR 2 ;;;;;;; mmap flags %define PROT_NONE 0 %define PROT_READ 1 %define PROT_WRITE 2 %define PROT_EXEC 4 ;; %define MAP_SHARED 0001h %define MAP_PRIVATE 0002h ;;;;;;; stat structure struc stat st_dev resd 1 ; = 0 st_ino resd 1 ; = 4 st_mode resw 1 ; = 8, size is 16 bits st_nlink resw 1 ; = 10, ditto st_uid resd 1 ; = 12 st_gid resd 1 ; = 16 st_rdev resd 1 ; = 20 st_atime resd 1 ; = 24 st_atimensec resd 1 ; = 28 st_mtime resd 1 ; = 32 st_mtimensec resd 1 ; = 36 st_ctime resd 1 ; = 40 st_ctimensec resd 1 ; = 44 st_size resd 2 ; = 48, size is 64 bits st_blocks resd 2 ; = 56, ditto st_blksize resd 1 ; = 64 st_flags resd 1 ; = 68 st_gen resd 1 ; = 72 st_lspare resd 1 ; = 76 st_qspare resd 4 ; = 80 endstruc We define the new syscalls: %define SYS_mmap 197 %define SYS_munmap 73 %define SYS_fstat 189 %define SYS_ftruncate 201 We add the macros for their use: %macro sys.mmap 0 system SYS_mmap %endmacro %macro sys.munmap 0 system SYS_munmap %endmacro %macro sys.ftruncate 0 system SYS_ftruncate %endmacro %macro sys.fstat 0 system SYS_fstat %endmacro And here is our code: ;;;;;;; Fast Text-to-Unix Conversion (ftuc.asm) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; ;; Started: 21-Dec-2000 ;; Updated: 22-Dec-2000 ;; ;; Copyright 2000 G. Adam Stanislav. ;; All rights reserved. ;; ;;;;;;; v.1 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; %include 'system.inc' section .data db 'Copyright 2000 G. Adam Stanislav.', 0Ah db 'All rights reserved.', 0Ah usg db 'Usage: ftuc filename', 0Ah usglen equ $-usg co db "ftuc: Can't open file.", 0Ah colen equ $-co fae db 'ftuc: File access error.', 0Ah faelen equ $-fae ftl db 'ftuc: File too long, use regular tuc instead.', 0Ah ftllen equ $-ftl mae db 'ftuc: Memory allocation error.', 0Ah maelen equ $-mae section .text align 4 memerr: push dword maelen push dword mae jmp short error align 4 toolong: push dword ftllen push dword ftl jmp short error align 4 facerr: push dword faelen push dword fae jmp short error align 4 cantopen: push dword colen push dword co jmp short error align 4 usage: push dword usglen push dword usg error: push dword stderr sys.write push dword 1 sys.exit align 4 global _start _start: pop eax ; argc pop eax ; program name pop ecx ; file to convert jecxz usage pop eax or eax, eax ; Too many arguments? jne usage ; Open the file push dword O_RDWR push ecx sys.open jc cantopen mov ebp, eax ; Save fd sub esp, byte stat_size mov ebx, esp ; Find file size push ebx push ebp ; fd sys.fstat jc facerr mov edx, [ebx + st_size + 4] ; File is too long if EDX != 0 ... or edx, edx jne near toolong mov ecx, [ebx + st_size] ; ... 
or if it is above 2 GB or ecx, ecx js near toolong ; Do nothing if the file is 0 bytes in size jecxz .quit ; Map the entire file in memory push edx push edx ; starting at offset 0 push edx ; pad push ebp ; fd push dword MAP_SHARED push dword PROT_READ | PROT_WRITE push ecx ; entire file size push edx ; let system decide on the address sys.mmap jc near memerr mov edi, eax mov esi, eax push ecx ; for SYS_munmap push edi ; Use EBX for state machine mov ebx, ordinary mov ah, 0Ah cld .loop: lodsb call ebx loop .loop cmp ebx, ordinary je .filesize ; Output final lf mov al, ah stosb inc edx .filesize: ; truncate file to new size push dword 0 ; high dword push edx ; low dword push eax ; pad push ebp sys.ftruncate ; close it (ebp still pushed) sys.close add esp, byte 16 sys.munmap .quit: push dword 0 sys.exit align 4 ordinary: cmp al, 0Dh je .cr cmp al, ah je .lf stosb inc edx ret align 4 .cr: mov ebx, cr ret align 4 .lf: mov ebx, lf ret align 4 cr: cmp al, 0Dh je .cr cmp al, ah je .lf xchg al, ah stosb inc edx xchg al, ah ; fall through .lf: stosb inc edx mov ebx, ordinary ret align 4 .cr: mov al, ah stosb inc edx ret align 4 lf: cmp al, ah je .lf cmp al, 0Dh je .cr xchg al, ah stosb inc edx xchg al, ah stosb inc edx mov ebx, ordinary ret align 4 .cr: mov ebx, ordinary mov al, ah ; fall through .lf: stosb inc edx ret Do not use this program on files stored on a disk formated by &ms-dos; or &windows;. There seems to be a subtle bug in the FreeBSD code when using mmap on these drives mounted under FreeBSD: If the file is over a certain size, mmap will just fill the memory with zeros, and then copy them to the file overwriting its contents. One-Pointed Mind As a student of Zen, I like the idea of a one-pointed mind: Do one thing at a time, and do it well. This, indeed, is very much how &unix; works as well. While a typical &windows; application is attempting to do everything imaginable (and is, therefore, riddled with bugs), a typical &unix; program does only one thing, and it does it well. The typical &unix; user then essentially assembles his own applications by writing a shell script which combines the various existing programs by piping the output of one program to the input of another. When writing your own &unix; software, it is generally a good idea to see what parts of the problem you need to solve can be handled by existing programs, and only write your own programs for that part of the problem that you do not have an existing solution for. CSV I will illustrate this principle with a specific real-life example I was faced with recently: I needed to extract the 11th field of each record from a database I downloaded from a web site. The database was a CSV file, i.e., a list of comma-separated values. That is quite a standard format for sharing data among people who may be using different database software. The first line of the file contains the list of various fields separated by commas. The rest of the file contains the data listed line by line, with values separated by commas. I tried awk, using the comma as a separator. But because several lines contained a quoted comma, awk was extracting the wrong field from those lines. Therefore, I needed to write my own software to extract the 11th field from the CSV file. However, going with the &unix; spirit, I only needed to write a simple filter that would do the following: Remove the first line from the file; Change all unquoted commas to a different character; Remove all quotation marks. 
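In C terms, the core of such a filter might look like the rough, hypothetical sketch below. It reads stdin and writes stdout, and leaves out all the command-line options the real program will have:

/* Illustrative sketch of the filter logic only; the real
 * program, shown later, is in assembly language and adds
 * option parsing, buffering, and error messages. */
#include <stdio.h>

int main(void) {
	int c, first = 1, quoted = 0;

	while ((c = getchar()) != EOF) {
		if (first) {			/* discard the header line */
			if (c == '\n')
				first = 0;
			continue;
		}
		if (c == '"') {			/* drop quotes, toggle mode */
			quoted = !quoted;
			continue;
		}
		if (c == '\n')
			quoted = 0;		/* end of line ends quoting */
		if (c == ',' && !quoted)
			c = '\t';		/* unquoted comma becomes tab */
		putchar(c);
	}
	return 0;
}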
Strictly speaking, I could use sed to remove the first line from the file, but doing so in my own program was very easy, so I decided to do it and reduce the size of the pipeline. At any rate, writing a program like this took me about 20 minutes. Writing a program that extracts the 11th field from the CSV file would take a lot longer, and I could not reuse it to extract some other field from some other database. This time I decided to let it do a little more work than a typical tutorial program would: It parses its command line for options; It displays proper usage if it finds wrong arguments; It produces meaningful error messages. Here is its usage message: Usage: csv [-t<delim>] [-c<comma>] [-p] [-o <outfile>] [-i <infile>] All parameters are optional, and can appear in any order. The -t parameter declares what to replace the commas with. The tab is the default here. For example, -t; will replace all unquoted commas with semicolons. I did not need the -c option, but it may come in handy in the future. It lets me declare that I want a character other than a comma replaced with something else. For example, -c@ will replace all at signs (useful if you want to split a list of email addresses to their user names and domains). The -p option preserves the first line, i.e., it does not delete it. By default, we delete the first line because in a CSV file it contains the field names rather than data. The -i and -o options let me specify the input and the output files. Defaults are stdin and stdout, so this is a regular &unix; filter. I made sure that both -i filename and -ifilename are accepted. I also made sure that only one input and one output files may be specified. To get the 11th field of each record, I can now do: &prompt.user; csv '-t;' data.csv | awk '-F;' '{print $11}' The code stores the options (except for the file descriptors) in EDX: The comma in DH, the new separator in DL, and the flag for the -p option in the highest bit of EDX, so a check for its sign will give us a quick decision what to do. Here is the code: ;;;;;;; csv.asm ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; ; Convert a comma-separated file to a something-else separated file. ; ; Started: 31-May-2001 ; Updated: 1-Jun-2001 ; ; Copyright (c) 2001 G. Adam Stanislav ; All rights reserved. 
; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; %include 'system.inc' %define BUFSIZE 2048 section .data fd.in dd stdin fd.out dd stdout usg db 'Usage: csv [-t<delim>] [-c<comma>] [-p] [-o <outfile>] [-i <infile>]', 0Ah usglen equ $-usg iemsg db "csv: Can't open input file", 0Ah iemlen equ $-iemsg oemsg db "csv: Can't create output file", 0Ah oemlen equ $-oemsg section .bss ibuffer resb BUFSIZE obuffer resb BUFSIZE section .text align 4 ierr: push dword iemlen push dword iemsg push dword stderr sys.write push dword 1 ; return failure sys.exit align 4 oerr: push dword oemlen push dword oemsg push dword stderr sys.write push dword 2 sys.exit align 4 usage: push dword usglen push dword usg push dword stderr sys.write push dword 3 sys.exit align 4 global _start _start: add esp, byte 8 ; discard argc and argv[0] mov edx, (',' << 8) | 9 .arg: pop ecx or ecx, ecx je near .init ; no more arguments ; ECX contains the pointer to an argument cmp byte [ecx], '-' jne usage inc ecx mov ax, [ecx] .o: cmp al, 'o' jne .i ; Make sure we are not asked for the output file twice cmp dword [fd.out], stdout jne usage ; Find the path to output file - it is either at [ECX+1], ; i.e., -ofile -- ; or in the next argument, ; i.e., -o file inc ecx or ah, ah jne .openoutput pop ecx jecxz usage .openoutput: push dword 420 ; file mode (644 octal) push dword 0200h | 0400h | 01h ; O_CREAT | O_TRUNC | O_WRONLY push ecx sys.open jc near oerr add esp, byte 12 mov [fd.out], eax jmp short .arg .i: cmp al, 'i' jne .p ; Make sure we are not asked twice cmp dword [fd.in], stdin jne near usage ; Find the path to the input file inc ecx or ah, ah jne .openinput pop ecx or ecx, ecx je near usage .openinput: push dword 0 ; O_RDONLY push ecx sys.open jc near ierr ; open failed add esp, byte 8 mov [fd.in], eax jmp .arg .p: cmp al, 'p' jne .t or ah, ah jne near usage or edx, 1 << 31 jmp .arg .t: cmp al, 't' ; redefine output delimiter jne .c or ah, ah je near usage mov dl, ah jmp .arg .c: cmp al, 'c' jne near usage or ah, ah je near usage mov dh, ah jmp .arg align 4 .init: sub eax, eax sub ebx, ebx sub ecx, ecx mov edi, obuffer ; See if we are to preserve the first line or edx, edx js .loop .firstline: ; get rid of the first line call getchar cmp al, 0Ah jne .firstline .loop: ; read a byte from stdin call getchar ; is it a comma (or whatever the user asked for)? cmp al, dh jne .quote ; Replace the comma with a tab (or whatever the user wants) mov al, dl .put: call putchar jmp short .loop .quote: cmp al, '"' jne .put ; Print everything until you get another quote or EOL. If it ; is a quote, skip it. If it is EOL, print it. .qloop: call getchar cmp al, '"' je .loop cmp al, 0Ah je .put call putchar jmp short .qloop align 4 getchar: or ebx, ebx jne .fetch call read .fetch: lodsb dec ebx ret read: jecxz .read call write .read: push dword BUFSIZE mov esi, ibuffer push esi push dword [fd.in] sys.read add esp, byte 12 mov ebx, eax or eax, eax je .done sub eax, eax ret align 4 .done: call write ; flush output buffer ; close files push dword [fd.in] sys.close push dword [fd.out] sys.close ; return success push dword 0 sys.exit align 4 putchar: stosb inc ecx cmp ecx, BUFSIZE je write ret align 4 write: jecxz .ret ; nothing to write sub edi, ecx ; start of buffer push ecx push edi push dword [fd.out] sys.write add esp, byte 12 sub eax, eax sub ecx, ecx ; buffer is empty now .ret: ret Much of it is taken from hex.asm above. 
But there is one important difference: I no longer call write whenever I am outputting a line feed. Yet, the code can be used interactively.

I have found a better solution for the interactive problem since I first started writing this chapter. I wanted to make sure each line is printed out separately only when needed. After all, there is no need to flush out every line when used non-interactively.

The new solution I use now is to call write every time I find the input buffer empty. That way, when running in the interactive mode, the program reads one line from the user's keyboard, processes it, and sees its input buffer is empty. It flushes its output and reads the next line.

The Dark Side of Buffering

This change prevents a mysterious lockup in a very specific case. I refer to it as the dark side of buffering, mostly because it presents a danger that is not quite obvious.

It is unlikely to happen with a program like the csv above, so let us consider yet another filter: In this case we expect our input to be raw data representing color values, such as the red, green, and blue intensities of a pixel. Our output will be the negative of our input.

Such a filter would be very simple to write. Most of it would look just like all the other filters we have written so far, so I am only going to show you its inner loop:

.loop:
	call	getchar
	not	al		; Create a negative
	call	putchar
	jmp	short .loop

Because this filter works with raw data, it is unlikely to be used interactively. But it could be called by image manipulation software. And, unless it calls write before each call to read, chances are it will lock up. Here is what might happen:

The image editor will load our filter using the C function popen().

It will read the first row of pixels from a bitmap or pixmap.

It will write the first row of pixels to the pipe leading to the fd.in of our filter.

Our filter will read each pixel from its input, turn it into a negative, and write it to its output buffer.

Our filter will call getchar to fetch the next pixel.

getchar will find an empty input buffer, so it will call read.

read will call the SYS_read system call.

The kernel will suspend our filter until the image editor sends more data to the pipe.

The image editor will read from the other pipe, connected to the fd.out of our filter, so it can set the first row of the output image before it sends us the second row of the input.

The kernel suspends the image editor until it receives some output from our filter, so it can pass it on to the image editor.

At this point our filter waits for the image editor to send it more data to process, while the image editor is waiting for our filter to send it the result of the processing of the first row. But the result sits in our output buffer.

The filter and the image editor will continue waiting for each other forever (or, at least, until they are killed). Our software has just entered a deadlock.

This problem does not exist if our filter flushes its output buffer before asking the kernel for more input data.

Using the FPU

Strangely enough, most assembly language literature does not even mention the existence of the FPU, or floating point unit, let alone discuss programming it.

Yet, never does assembly language shine more than when we create highly optimized FPU code by doing things that can be done only in assembly language.

Organization of the FPU

The FPU consists of 8 80–bit floating–point registers.
These are organized in a stack fashion—you can push a value on TOS (top of stack) and you can pop it. That said, the assembly language op codes are not push and pop because those are already taken.

You can push a value on TOS by using fld, fild, and fbld. Several other op codes let you push many common constants—such as pi—on the TOS.

Similarly, you can pop a value by using fst, fstp, fist, fistp, and fbstp. Actually, only the op codes that end with a p will literally pop the value; the rest will store it somewhere else without removing it from the TOS.

We can transfer the data between the TOS and the computer memory either as a 32–bit, 64–bit, or 80–bit real, a 16–bit, 32–bit, or 64–bit integer, or an 80–bit packed decimal. The 80–bit packed decimal is a special case of binary coded decimal which is very convenient when converting between the ASCII representation of data and the internal data of the FPU. It allows us to use 18 significant digits.

No matter how we represent data in the memory, the FPU always stores it in the 80–bit real format in its registers. Its internal precision is at least 19 decimal digits, so even if we choose to display results as ASCII in the full 18–digit precision, we are still showing correct results.

We can perform mathematical operations on the TOS: We can calculate its sine, we can scale it (i.e., we can multiply or divide it by a power of 2), we can calculate its base–2 logarithm, and many other things. We can also multiply or divide it by, add it to, or subtract it from, any of the FPU registers (including itself).

The official Intel name for the TOS is st, and for the registers st(0) through st(7). st and st(0), then, refer to the same register. For whatever reasons, the original author of nasm has decided to use a different notation, namely st0 through st7. In other words, there are no parentheses, and the TOS is always st0, never just st.

The Packed Decimal Format

The packed decimal format uses 10 bytes (80 bits) of memory to represent 18 digits. The number represented there is always an integer. You can use it to get decimal places by multiplying the TOS by a power of 10 first.

The highest bit of the highest byte (byte 9) is the sign bit: If it is set, the number is negative; otherwise, it is positive. The rest of the bits of this byte are unused/ignored.

The remaining 9 bytes store the 18 digits of the number: 2 digits per byte. The more significant digit is stored in the high nibble (4 bits), the less significant digit in the low nibble.

Given that, you might think that -1234567 would be stored in the memory like this (using hexadecimal notation):

80 00 00 00 00 00 01 23 45 67

Alas, it is not! As with everything else of Intel make, even the packed decimal is little–endian. That means our -1234567 is stored like this:

67 45 23 01 00 00 00 00 00 80

Remember that, or you will be pulling your hair out in desperation!

The book to read—if you can find it—is Richard Startz' 8087/80287/80387 for the IBM PC & Compatibles, though it does seem to take the little–endian storage of the packed decimal for granted. I kid you not about the desperation of trying to figure out what was wrong with the filter I show below before it occurred to me I should try the little–endian order even for this type of data.

Excursion to Pinhole Photography

To write meaningful software, we must not only understand our programming tools, but also the field we are creating software for.
Our next filter will help us whenever we want to build a pinhole camera, so, we need some background in pinhole photography before we can continue. The Camera The easiest way to describe any camera ever built is as some empty space enclosed in some lightproof material, with a small hole in the enclosure. The enclosure is usually sturdy (e.g., a box), though sometimes it is flexible (the bellows). It is quite dark inside the camera. However, the hole lets light rays in through a single point (though in some cases there may be several). These light rays form an image, a representation of whatever is outside the camera, in front of the hole. If some light sensitive material (such as film) is placed inside the camera, it can capture the image. The hole often contains a lens, or a lens assembly, often called the objective. The Pinhole But, strictly speaking, the lens is not necessary: The original cameras did not use a lens but a pinhole. Even today, pinholes are used, both as a tool to study how cameras work, and to achieve a special kind of image. The image produced by the pinhole is all equally sharp. Or blurred. There is an ideal size for a pinhole: If it is either larger or smaller, the image loses its sharpness. Focal Length This ideal pinhole diameter is a function of the square root of focal length, which is the distance of the pinhole from the film. D = PC * sqrt(FL) In here, D is the ideal diameter of the pinhole, FL is the focal length, and PC is a pinhole constant. According to Jay Bender, its value is 0.04, while Kenneth Connors has determined it to be 0.037. Others have proposed other values. Plus, this value is for the daylight only: Other types of light will require a different constant, whose value can only be determined by experimentation. The F–Number The f–number is a very useful measure of how much light reaches the film. A light meter can determine that, for example, to expose a film of specific sensitivity with f5.6 may require the exposure to last 1/1000 sec. It does not matter whether it is a 35–mm camera, or a 6x9cm camera, etc. As long as we know the f–number, we can determine the proper exposure. The f–number is easy to calculate: F = FL / D In other words, the f–number equals the focal length divided by the diameter of the pinhole. It also means a higher f–number either implies a smaller pinhole or a larger focal distance, or both. That, in turn, implies, the higher the f–number, the longer the exposure has to be. Furthermore, while pinhole diameter and focal distance are one–dimensional measurements, both, the film and the pinhole, are two–dimensional. That means that if you have measured the exposure at f–number A as t, then the exposure at f–number B is: t * (B / A)² Normalized F–Number While many modern cameras can change the diameter of their pinhole, and thus their f–number, quite smoothly and gradually, such was not always the case. To allow for different f–numbers, cameras typically contained a metal plate with several holes of different sizes drilled to them. Their sizes were chosen according to the above formula in such a way that the resultant f–number was one of standard f–numbers used on all cameras everywhere. For example, a very old Kodak Duaflex IV camera in my possession has three such holes for f–numbers 8, 11, and 16. A more recently made camera may offer f–numbers of 2.8, 4, 5.6, 8, 11, 16, 22, and 32 (as well as others). These numbers were not chosen arbitrarily: They all are powers of the square root of 2, though they may be rounded somewhat. 
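These relationships are easy to verify numerically. Here is a small, hypothetical C sketch of the formulas given so far; the values in it are illustrative (compile with -lm):

#include <math.h>
#include <stdio.h>

int main(void) {
	double PC = 0.04;		/* pinhole constant (Bender's value) */
	double FL = 150.0;		/* focal length in millimeters */
	double D  = PC * sqrt(FL);	/* ideal pinhole diameter */
	double F  = FL / D;		/* the f-number */

	/* Exposure scaling: if the meter says t at f-number A,
	 * the exposure at f-number B is t * (B/A) squared. */
	double A = 5.6, t = 1.0 / 1000.0;
	double tF = t * (F / A) * (F / A);

	printf("D = %.2f mm, F = f%.0f, exposure = %.2f sec\n", D, F, tF);
	return 0;
}

For the focal length of 150 mm this gives a diameter of about 0.49 mm and an f-number of about f306, numbers we will meet again when we run the finished program.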
The F–Stop A typical camera is designed in such a way that setting any of the normalized f–numbers changes the feel of the dial. It will naturally stop in that position. Because of that, these positions of the dial are called f–stops. Since the f–numbers at each stop are powers of the square root of 2, moving the dial by 1 stop will double the amount of light required for proper exposure. Moving it by 2 stops will quadruple the required exposure. Moving the dial by 3 stops will require the increase in exposure 8 times, etc. Designing the Pinhole Software We are now ready to decide what exactly we want our pinhole software to do. Processing Program Input Since its main purpose is to help us design a working pinhole camera, we will use the focal length as the input to the program. This is something we can determine without software: Proper focal length is determined by the size of the film and by the need to shoot "regular" pictures, wide angle pictures, or telephoto pictures. Most of the programs we have written so far worked with individual characters, or bytes, as their input: The hex program converted individual bytes into a hexadecimal number, the csv program either let a character through, or deleted it, or changed it to a different character, etc. One program, ftuc used the state machine to consider at most two input bytes at a time. But our pinhole program cannot just work with individual characters, it has to deal with larger syntactic units. For example, if we want the program to calculate the pinhole diameter (and other values we will discuss later) at the focal lengths of 100 mm, 150 mm, and 210 mm, we may want to enter something like this: 100, 150, 210 Our program needs to consider more than a single byte of input at a time. When it sees the first 1, it must understand it is seeing the first digit of a decimal number. When it sees the 0 and the other 0, it must know it is seeing more digits of the same number. When it encounters the first comma, it must know it is no longer receiving the digits of the first number. It must be able to convert the digits of the first number into the value of 100. And the digits of the second number into the value of 150. And, of course, the digits of the third number into the numeric value of 210. We need to decide what delimiters to accept: Do the input numbers have to be separated by a comma? If so, how do we treat two numbers separated by something else? Personally, I like to keep it simple. Something either is a number, so I process it. Or it is not a number, so I discard it. I do not like the computer complaining about me typing in an extra character when it is obvious that it is an extra character. Duh! Plus, it allows me to break up the monotony of computing and type in a query instead of just a number: What is the best pinhole diameter for the focal length of 150? There is no reason for the computer to spit out a number of complaints: Syntax error: What Syntax error: is Syntax error: the Syntax error: best Et cetera, et cetera, et cetera. Secondly, I like the # character to denote the start of a comment which extends to the end of the line. This does not take too much effort to code, and lets me treat input files for my software as executable scripts. In our case, we also need to decide what units the input should come in: We choose millimeters because that is how most photographers measure the focus length. 
Finally, we need to decide whether to allow the use of the decimal point (in which case we must also consider the fact that much of the world uses a decimal comma).

In our case, allowing for the decimal point/comma would offer a false sense of precision: There is little if any noticeable difference between the focal lengths of 50 and 51, so allowing the user to input something like 50.5 is not a good idea. This is my opinion, mind you, but I am the one writing this program. You can make other choices in yours, of course.

Offering Options

The most important thing we need to know when building a pinhole camera is the diameter of the pinhole. Since we want to shoot sharp images, we will use the above formula to calculate the pinhole diameter from the focal length. As experts offer several different values for the PC constant, we will need to have a choice.

It is traditional in &unix; programming to have two main ways of choosing program parameters, plus to have a default for when the user does not make a choice.

Why have two ways of choosing?

One is to allow a (relatively) permanent choice that applies automatically each time the software is run without us having to tell it over and over what we want it to do.

The permanent choices may be stored in a configuration file, typically found in the user's home directory. The file usually has the same name as the application but starts with a dot. Often "rc" is added to the file name. So, ours could be ~/.pinhole or ~/.pinholerc. (The ~/ means the current user's home directory.)

The configuration file is used mostly by programs that have many configurable parameters. Those that have only one (or a few) often use a different method: They expect to find the parameter in an environment variable. In our case, we might look at an environment variable named PINHOLE.

Usually, a program uses one or the other of the above methods. Otherwise, if a configuration file said one thing, but an environment variable another, the program might get confused (or just too complicated).

Because we only need to choose one such parameter, we will go with the second method and search the environment for a variable named PINHOLE.

The other way allows us to make ad hoc decisions: "Though I usually want you to use 0.039, this time I want 0.03872." In other words, it allows us to override the permanent choice. This type of choice is usually done with command line parameters.

Finally, a program always needs a default. The user may not make any choices. Perhaps he does not know what to choose. Perhaps he is "just browsing." Preferably, the default will be the value most users would choose anyway. That way they do not need to choose. Or, rather, they can choose the default without an additional effort.

Given this system, the program may find conflicting options, and handle them this way:

If it finds an ad hoc choice (e.g., a command line parameter), it should accept that choice. It must ignore any permanent choice and any default.

Otherwise, if it finds a permanent option (e.g., an environment variable), it should accept it, and ignore the default.

Otherwise, it should use the default.

We also need to decide what format our PC option should have. At first sight, it seems obvious to use the PINHOLE=0.04 format for the environment variable, and -p0.04 for the command line. Allowing that is actually a security risk. The PC constant is a very small number. Naturally, we will test our software using various small values of PC.
But what will happen if someone runs the program choosing a huge value? It may crash the program because we have not designed it to handle huge numbers.

Or, we may spend more time on the program so it can handle huge numbers. We might do that if we were writing commercial software for a computer-illiterate audience.

Or, we might say, "Tough! The user should know better."

Or, we may just make it impossible for the user to enter a huge number. This is the approach we will take: We will use an implied 0. prefix.

In other words, if the user wants 0.04, we will expect him to type -p04, or set PINHOLE=04 in his environment. So, if he says -p9999999, we will interpret it as 0.9999999—still ridiculous but at least safer.

Secondly, many users will just want to go with either Bender's constant or Connors' constant. To make it easier on them, we will interpret -b as identical to -p04, and -c as identical to -p037.

The Output

We need to decide what we want our software to send to the output, and in what format.

Since our input allows for an unspecified number of focal length entries, it makes sense to use a traditional database–style output of showing the result of the calculation for each focal length on a separate line, while separating all values on one line by a tab character.

Optionally, we should also allow the user to specify the use of the CSV format we have studied earlier. In this case, we will print out a line of comma–separated names describing each field of every line, then show our results as before, but substituting a comma for the tab.

We need a command line option for the CSV format. We cannot use -c because that already means use Connors' constant. For some strange reason, many web sites refer to CSV files as "Excel spreadsheets" (though the CSV format predates Excel). We will, therefore, use the -e switch to inform our software we want the output in the CSV format.

We will start each line of the output with the focal length. This may sound repetitious at first, especially in the interactive mode: The user types in the focal length, and we are repeating it.

But the user can type several focal lengths on one line. The input can also come in from a file or from the output of another program. In that case the user does not see the input at all. By the same token, the output can go to a file which we will want to examine later, or it could go to the printer, or become the input of another program.

So, it makes perfect sense to start each line with the focal length as entered by the user.

No, wait! Not as entered by the user. What if the user types in something like this:

00000000150

Clearly, we need to strip those leading zeros.

So, we might consider reading the user input as is, converting it to binary inside the FPU, and printing it out from there. But...

What if the user types something like this:

17459765723452353453534535353530530534563507309676764423

Ha! The packed decimal FPU format lets us input 18–digit numbers. But the user has entered more than 18 digits. How do we handle that?

Well, we could modify our code to read the first 18 digits, enter them into the FPU, then read more, multiply what we already have on the TOS by 10 raised to the number of additional digits, then add to it.

Yes, we could do that. But in this program it would be ridiculous (in a different one it may be just the thing to do): Even the circumference of the Earth expressed in millimeters only takes 11 digits. Clearly, we cannot build a camera that large (not yet, anyway).
So, if the user enters such a huge number, he is either bored, or testing us, or trying to break into the system, or playing games—doing anything but designing a pinhole camera. What will we do? We will slap him in the face, in a manner of speaking: 17459765723452353453534535353530530534563507309676764423 ??? ??? ??? ??? ??? To achieve that, we will simply ignore any leading zeros. Once we find a non–zero digit, we will initialize a counter to 0 and start taking three steps: Send the digit to the output. Append the digit to a buffer we will use later to produce the packed decimal we can send to the FPU. Increase the counter. Now, while we are taking these three steps, we also need to watch out for one of two conditions: If the counter grows above 18, we stop appending to the buffer. We continue reading the digits and sending them to the output. If, or rather when, the next input character is not a digit, we are done inputting for now. Incidentally, we can simply discard the non–digit, unless it is a #, which we must return to the input stream. It starts a comment, so we must see it after we are done producing output and start looking for more input. That still leaves one possibility uncovered: If all the user enters is a zero (or several zeros), we will never find a non–zero to display. We can determine this has happened whenever our counter stays at 0. In that case we need to send 0 to the output, and perform another "slap in the face": 0 ??? ??? ??? ??? ??? Once we have displayed the focal length and determined it is valid (greater than 0 but not exceeding 18 digits), we can calculate the pinhole diameter. It is not by coincidence that pinhole contains the word pin. Indeed, many a pinhole literally is a pin hole, a hole carefully punched with the tip of a pin. That is because a typical pinhole is very small. Our formula gets the result in millimeters. We will multiply it by 1000, so we can output the result in microns. At this point we have yet another trap to face: Too much precision. Yes, the FPU was designed for high precision mathematics. But we are not dealing with high precision mathematics. We are dealing with physics (optics, specifically). Suppose we want to convert a truck into a pinhole camera (we would not be the first ones to do that!). Suppose its box is 12 meters long, so we have the focal length of 12000. Well, using Bender's constant, it gives us square root of 12000 multiplied by 0.04, which is 4.381780460 millimeters, or 4381.780460 microns. Put either way, the result is absurdly precise. Our truck is not exactly 12000 millimeters long. We did not measure its length with such a precision, so stating we need a pinhole with the diameter of 4.381780460 millimeters is, well, deceiving. 4.4 millimeters would do just fine. I "only" used ten digits in the above example. Imagine the absurdity of going for all 18! We need to limit the number of significant digits of our result. One way of doing it is by using an integer representing microns. So, our truck would need a pinhole with the diameter of 4382 microns. Looking at that number, we still decide that 4400 microns, or 4.4 millimeters is close enough. Additionally, we can decide that no matter how big a result we get, we only want to display four significant digits (or any other number of them, of course). Alas, the FPU does not offer rounding to a specific number of digits (after all, it does not view the numbers as decimal but as binary). We, therefore, must devise an algorithm to reduce the number of significant digits. 
Here is mine (I think it is awkward—if you know a better one, please, let me know): Initialize a counter to 0. While the number is greater than or equal to 10000, divide it by 10 and increase the counter. Output the result. While the counter is greater than 0, output 0 and decrease the counter. The 10000 is only good if you want four significant digits. For any other number of significant digits, replace 10000 with 10 raised to the number of significant digits. We will, then, output the pinhole diameter in microns, rounded off to four significant digits. At this point, we know the focal length and the pinhole diameter. That means we have enough information to also calculate the f–number. We will display the f–number, rounded to four significant digits. Chances are the f–number will tell us very little. To make it more meaningful, we can find the nearest normalized f–number, i.e., the nearest power of the square root of 2. We do that by multiplying the actual f–number by itself, which, of course, will give us its square. We will then calculate its base–2 logarithm, which is much easier to do than calculating the base–square–root–of–2 logarithm! We will round the result to the nearest integer. Next, we will raise 2 to the result. Actually, the FPU gives us a good shortcut to do that: We can use the fscale op code to "scale" 1, which is analogous to shifting an integer left. Finally, we calculate the square root of it all, and we have the nearest normalized f–number. If all that sounds overwhelming—or too much work, perhaps—it may become much clearer if you see the code. It takes 9 op codes altogether: fmul st0, st0 fld1 fld st1 fyl2x frndint fld1 fscale fsqrt fstp st1 The first line, fmul st0, st0, squares the contents of the TOS (top of the stack, same as st, called st0 by nasm). The fld1 pushes 1 on the TOS. The next line, fld st1, pushes the square back to the TOS. At this point the square is both in st and st(2) (it will become clear why we leave a second copy on the stack in a moment). st(1) contains 1. Next, fyl2x calculates base–2 logarithm of st multiplied by st(1). That is why we placed 1 on st(1) before. At this point, st contains the logarithm we have just calculated, st(1) contains the square of the actual f–number we saved for later. frndint rounds the TOS to the nearest integer. fld1 pushes a 1. fscale shifts the 1 we have on the TOS by the value in st(1), effectively raising 2 to st(1). Finally, fsqrt calculates the square root of the result, i.e., the nearest normalized f–number. We now have the nearest normalized f–number on the TOS, the base–2 logarithm rounded to the nearest integer in st(1), and the square of the actual f–number in st(2). We are saving the value in st(2) for later. But we do not need the contents of st(1) anymore. The last line, fstp st1, places the contents of st to st(1), and pops. As a result, what was st(1) is now st, what was st(2) is now st(1), etc. The new st contains the normalized f–number. The new st(1) contains the square of the actual f–number we have stored there for posterity. At this point, we are ready to output the normalized f–number. Because it is normalized, we will not round it off to four significant digits, but will send it out in its full precision. The normalized f-number is useful as long as it is reasonably small and can be found on our light meter. Otherwise we need a different method of determining proper exposure. 
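If the op codes are easier to follow with a reference, here is a hypothetical C restatement of both the rounding algorithm and the nine-op-code sequence. The function names are illustrative, and the sketch cannot express one thing the assembly version does, namely leaving the square of the f-number on the FPU stack for later (compile with -lm):

#include <math.h>
#include <stdio.h>

/* The four-significant-digit rounding described above. */
void print4sig(unsigned long x) {
	int zeros = 0;

	while (x >= 10000) {	/* 10^4 for four significant digits */
		x /= 10;
		zeros++;
	}
	printf("%lu", x);
	while (zeros--)
		putchar('0');
}

/* The nine op codes: find the nearest normalized f-number,
 * i.e., the nearest power of the square root of 2. */
double normalized(double F) {
	double sq = F * F;		/* fmul st0, st0 */
	double n  = rint(log2(sq));	/* fld1 / fld st1 / fyl2x / frndint */
	return sqrt(exp2(n));		/* fld1 / fscale / fsqrt */
}

For an actual f-number of about 306, for instance, this returns about 362, which is sqrt(2) raised to the 17th power.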
Earlier we have figured out the formula of calculating proper exposure at an arbitrary f–number from that measured at a different f–number. Every light meter I have ever seen can determine proper exposure at f5.6. We will, therefore, calculate an "f5.6 multiplier," i.e., by how much we need to multiply the exposure measured at f5.6 to determine the proper exposure for our pinhole camera. From the above formula we know this factor can be calculated by dividing our f–number (the actual one, not the normalized one) by 5.6, and squaring the result. Mathematically, dividing the square of our f–number by the square of 5.6 will give us the same result. Computationally, we do not want to square two numbers when we can only square one. So, the first solution seems better at first. But... 5.6 is a constant. We do not have to have our FPU waste precious cycles. We can just tell it to divide the square of the f–number by whatever 5.6² equals to. Or we can divide the f–number by 5.6, and then square the result. The two ways now seem equal. But, they are not! Having studied the principles of photography above, we remember that the 5.6 is actually square root of 2 raised to the fifth power. An irrational number. The square of this number is exactly 32. Not only is 32 an integer, it is a power of 2. We do not need to divide the square of the f–number by 32. We only need to use fscale to shift it right by five positions. In the FPU lingo it means we will fscale it with st(1) equal to -5. That is much faster than a division. So, now it has become clear why we have saved the square of the f–number on the top of the FPU stack. The calculation of the f5.6 multiplier is the easiest calculation of this entire program! We will output it rounded to four significant digits. There is one more useful number we can calculate: The number of stops our f–number is from f5.6. This may help us if our f–number is just outside the range of our light meter, but we have a shutter which lets us set various speeds, and this shutter uses stops. Say, our f–number is 5 stops from f5.6, and the light meter says we should use 1/1000 sec. Then we can set our shutter speed to 1/1000 first, then move the dial by 5 stops. This calculation is quite easy as well. All we have to do is to calculate the base-2 logarithm of the f5.6 multiplier we had just calculated (though we need its value from before we rounded it off). We then output the result rounded to the nearest integer. We do not need to worry about having more than four significant digits in this one: The result is most likely to have only one or two digits anyway. FPU Optimizations In assembly language we can optimize the FPU code in ways impossible in high languages, including C. Whenever a C function needs to calculate a floating–point value, it loads all necessary variables and constants into FPU registers. It then does whatever calculation is required to get the correct result. Good C compilers can optimize that part of the code really well. It "returns" the value by leaving the result on the TOS. However, before it returns, it cleans up. Any variables and constants it used in its calculation are now gone from the FPU. It cannot do what we just did above: We calculated the square of the f–number and kept it on the stack for later use by another function. We knew we would need that value later on. We also knew we had enough room on the stack (which only has room for 8 numbers) to store it there. 
A C compiler has no way of knowing that a value it has on the stack will be required again in the very near future. Of course, the C programmer may know it. But the only recourse he has is to store the value in a memory variable. That means, for one, the value will be changed from the 80-bit precision used internally by the FPU to a C double (64 bits) or even single (32 bits). That also means that the value must be moved from the TOS into the memory, and then back again. Alas, of all FPU operations, the ones that access the computer memory are the slowest. So, whenever programming the FPU in assembly language, look for the ways of keeping intermediate results on the FPU stack. We can take that idea even further! In our program we are using a constant (the one we named PC). It does not matter how many pinhole diameters we are calculating: 1, 10, 20, 1000, we are always using the same constant. Therefore, we can optimize our program by keeping the constant on the stack all the time. Early on in our program, we are calculating the value of the above constant. We need to divide our input by 10 for every digit in the constant. It is much faster to multiply than to divide. So, at the start of our program, we divide 10 into 1 to obtain 0.1, which we then keep on the stack: Instead of dividing the input by 10 for every digit, we multiply it by 0.1. By the way, we do not input 0.1 directly, even though we could. We have a reason for that: While 0.1 can be expressed with just one decimal place, we do not know how many binary places it takes. We, therefore, let the FPU calculate its binary value to its own high precision. We are using other constants: We multiply the pinhole diameter by 1000 to convert it from millimeters to microns. We compare numbers to 10000 when we are rounding them off to four significant digits. So, we keep both, 1000 and 10000, on the stack. And, of course, we reuse the 0.1 when rounding off numbers to four digits. Last but not least, we keep -5 on the stack. We need it to scale the square of the f–number, instead of dividing it by 32. It is not by coincidence we load this constant last. That makes it the top of the stack when only the constants are on it. So, when the square of the f–number is being scaled, the -5 is at st(1), precisely where fscale expects it to be. It is common to create certain constants from scratch instead of loading them from the memory. That is what we are doing with -5: fld1 ; TOS = 1 fadd st0, st0 ; TOS = 2 fadd st0, st0 ; TOS = 4 fld1 ; TOS = 1 faddp st1, st0 ; TOS = 5 fchs ; TOS = -5 We can generalize all these optimizations into one rule: Keep repeat values on the stack! &postscript; is a stack–oriented programming language. There are many more books available about &postscript; than about the FPU assembly language: Mastering &postscript; will help you master the FPU. <application>pinhole</application>—The Code ;;;;;;; pinhole.asm ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; ; Find various parameters of a pinhole camera construction and use ; ; Started: 9-Jun-2001 ; Updated: 10-Jun-2001 ; ; Copyright (c) 2001 G. Adam Stanislav ; All rights reserved. 
; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; %include 'system.inc' %define BUFSIZE 2048 section .data align 4 ten dd 10 thousand dd 1000 tthou dd 10000 fd.in dd stdin fd.out dd stdout envar db 'PINHOLE=' ; Exactly 8 bytes, or 2 dwords long pinhole db '04,', ; Bender's constant (0.04) connors db '037', 0Ah ; Connors' constant usg db 'Usage: pinhole [-b] [-c] [-e] [-p <value>] [-o <outfile>] [-i <infile>]', 0Ah usglen equ $-usg iemsg db "pinhole: Can't open input file", 0Ah iemlen equ $-iemsg oemsg db "pinhole: Can't create output file", 0Ah oemlen equ $-oemsg pinmsg db "pinhole: The PINHOLE constant must not be 0", 0Ah pinlen equ $-pinmsg toobig db "pinhole: The PINHOLE constant may not exceed 18 decimal places", 0Ah biglen equ $-toobig huhmsg db 9, '???' separ db 9, '???' sep2 db 9, '???' sep3 db 9, '???' sep4 db 9, '???', 0Ah huhlen equ $-huhmsg header db 'focal length in millimeters,pinhole diameter in microns,' db 'F-number,normalized F-number,F-5.6 multiplier,stops ' db 'from F-5.6', 0Ah headlen equ $-header section .bss ibuffer resb BUFSIZE obuffer resb BUFSIZE dbuffer resb 20 ; decimal input buffer bbuffer resb 10 ; BCD buffer section .text align 4 huh: call write push dword huhlen push dword huhmsg push dword [fd.out] sys.write add esp, byte 12 ret align 4 perr: push dword pinlen push dword pinmsg push dword stderr sys.write push dword 4 ; return failure sys.exit align 4 consttoobig: push dword biglen push dword toobig push dword stderr sys.write push dword 5 ; return failure sys.exit align 4 ierr: push dword iemlen push dword iemsg push dword stderr sys.write push dword 1 ; return failure sys.exit align 4 oerr: push dword oemlen push dword oemsg push dword stderr sys.write push dword 2 sys.exit align 4 usage: push dword usglen push dword usg push dword stderr sys.write push dword 3 sys.exit align 4 global _start _start: add esp, byte 8 ; discard argc and argv[0] sub esi, esi .arg: pop ecx or ecx, ecx je near .getenv ; no more arguments ; ECX contains the pointer to an argument cmp byte [ecx], '-' jne usage inc ecx mov ax, [ecx] inc ecx .o: cmp al, 'o' jne .i ; Make sure we are not asked for the output file twice cmp dword [fd.out], stdout jne usage ; Find the path to output file - it is either at [ECX+1], ; i.e., -ofile -- ; or in the next argument, ; i.e., -o file or ah, ah jne .openoutput pop ecx jecxz usage .openoutput: push dword 420 ; file mode (644 octal) push dword 0200h | 0400h | 01h ; O_CREAT | O_TRUNC | O_WRONLY push ecx sys.open jc near oerr add esp, byte 12 mov [fd.out], eax jmp short .arg .i: cmp al, 'i' jne .p ; Make sure we are not asked twice cmp dword [fd.in], stdin jne near usage ; Find the path to the input file or ah, ah jne .openinput pop ecx or ecx, ecx je near usage .openinput: push dword 0 ; O_RDONLY push ecx sys.open jc near ierr ; open failed add esp, byte 8 mov [fd.in], eax jmp .arg .p: cmp al, 'p' jne .c or ah, ah jne .pcheck pop ecx or ecx, ecx je near usage mov ah, [ecx] .pcheck: cmp ah, '0' jl near usage cmp ah, '9' ja near usage mov esi, ecx jmp .arg .c: cmp al, 'c' jne .b or ah, ah jne near usage mov esi, connors jmp .arg .b: cmp al, 'b' jne .e or ah, ah jne near usage mov esi, pinhole jmp .arg .e: cmp al, 'e' jne near usage or ah, ah jne near usage mov al, ',' mov [huhmsg], al mov [separ], al mov [sep2], al mov [sep3], al mov [sep4], al jmp .arg align 4 .getenv: ; If ESI = 0, we did not have a -p argument, ; and need to check the environment for "PINHOLE=" or esi, esi jne .init sub ecx, ecx .nextenv: pop esi or 
esi, esi je .default ; no PINHOLE envar found ; check if this envar starts with 'PINHOLE=' mov edi, envar mov cl, 2 ; 'PINHOLE=' is 2 dwords long rep cmpsd jne .nextenv ; Check if it is followed by a digit mov al, [esi] cmp al, '0' jl .default cmp al, '9' jbe .init ; fall through align 4 .default: ; We got here because we had no -p argument, ; and did not find the PINHOLE envar. mov esi, pinhole ; fall through align 4 .init: sub eax, eax sub ebx, ebx sub ecx, ecx sub edx, edx mov edi, dbuffer+1 mov byte [dbuffer], '0' ; Convert the pinhole constant to real .constloop: lodsb cmp al, '9' ja .setconst cmp al, '0' je .processconst jb .setconst inc dl .processconst: inc cl cmp cl, 18 ja near consttoobig stosb jmp short .constloop align 4 .setconst: or dl, dl je near perr finit fild dword [tthou] fld1 fild dword [ten] fdivp st1, st0 fild dword [thousand] mov edi, obuffer mov ebp, ecx call bcdload .constdiv: fmul st0, st2 loop .constdiv fld1 fadd st0, st0 fadd st0, st0 fld1 faddp st1, st0 fchs ; If we are creating a CSV file, ; print header cmp byte [separ], ',' jne .bigloop push dword headlen push dword header push dword [fd.out] sys.write .bigloop: call getchar jc near done ; Skip to the end of the line if you got '#' cmp al, '#' jne .num call skiptoeol jmp short .bigloop .num: ; See if you got a number cmp al, '0' jl .bigloop cmp al, '9' ja .bigloop ; Yes, we have a number sub ebp, ebp sub edx, edx .number: cmp al, '0' je .number0 mov dl, 1 .number0: or dl, dl ; Skip leading 0's je .nextnumber push eax call putchar pop eax inc ebp cmp ebp, 19 jae .nextnumber mov [dbuffer+ebp], al .nextnumber: call getchar jc .work cmp al, '#' je .ungetc cmp al, '0' jl .work cmp al, '9' ja .work jmp short .number .ungetc: dec esi inc ebx .work: ; Now, do all the work or dl, dl je near .work0 cmp ebp, 19 jae near .toobig call bcdload ; Calculate pinhole diameter fld st0 ; save it fsqrt fmul st0, st3 fld st0 fmul st5 sub ebp, ebp ; Round off to 4 significant digits .diameter: fcom st0, st7 fstsw ax sahf jb .printdiameter fmul st0, st6 inc ebp jmp short .diameter .printdiameter: call printnumber ; pinhole diameter ; Calculate F-number fdivp st1, st0 fld st0 sub ebp, ebp .fnumber: fcom st0, st6 fstsw ax sahf jb .printfnumber fmul st0, st5 inc ebp jmp short .fnumber .printfnumber: call printnumber ; F number ; Calculate normalized F-number fmul st0, st0 fld1 fld st1 fyl2x frndint fld1 fscale fsqrt fstp st1 sub ebp, ebp call printnumber ; Calculate time multiplier from F-5.6 fscale fld st0 ; Round off to 4 significant digits .fmul: fcom st0, st6 fstsw ax sahf jb .printfmul inc ebp fmul st0, st5 jmp short .fmul .printfmul: call printnumber ; F multiplier ; Calculate F-stops from 5.6 fld1 fxch st1 fyl2x sub ebp, ebp call printnumber mov al, 0Ah call putchar jmp .bigloop .work0: mov al, '0' call putchar align 4 .toobig: call huh jmp .bigloop align 4 done: call write ; flush output buffer ; close files push dword [fd.in] sys.close push dword [fd.out] sys.close finit ; return success push dword 0 sys.exit align 4 skiptoeol: ; Keep reading until you come to cr, lf, or eof call getchar jc done cmp al, 0Ah jne .cr ret .cr: cmp al, 0Dh jne skiptoeol ret align 4 getchar: or ebx, ebx jne .fetch call read .fetch: lodsb dec ebx clc ret read: jecxz .read call write .read: push dword BUFSIZE mov esi, ibuffer push esi push dword [fd.in] sys.read add esp, byte 12 mov ebx, eax or eax, eax je .empty sub eax, eax ret align 4 .empty: add esp, byte 4 stc ret align 4 putchar: stosb inc ecx cmp ecx, BUFSIZE je write ret align 4 write: jecxz 
.ret ; nothing to write sub edi, ecx ; start of buffer push ecx push edi push dword [fd.out] sys.write add esp, byte 12 sub eax, eax sub ecx, ecx ; buffer is empty now .ret: ret align 4 bcdload: ; EBP contains the number of chars in dbuffer push ecx push esi push edi lea ecx, [ebp+1] lea esi, [dbuffer+ebp-1] shr ecx, 1 std mov edi, bbuffer sub eax, eax mov [edi], eax mov [edi+4], eax mov [edi+2], ax .loop: lodsw sub ax, 3030h shl al, 4 or al, ah mov [edi], al inc edi loop .loop fbld [bbuffer] cld pop edi pop esi pop ecx sub eax, eax ret align 4 printnumber: push ebp mov al, [separ] call putchar ; Print the integer at the TOS mov ebp, bbuffer+9 fbstp [bbuffer] ; Check the sign mov al, [ebp] dec ebp or al, al jns .leading ; We got a negative number (should never happen) mov al, '-' call putchar .leading: ; Skip leading zeros mov al, [ebp] dec ebp or al, al jne .first cmp ebp, bbuffer jae .leading ; We are here because the result was 0. ; Print '0' and return mov al, '0' jmp putchar .first: ; We have found the first non-zero. ; But it is still packed test al, 0F0h jz .second push eax shr al, 4 add al, '0' call putchar pop eax and al, 0Fh .second: add al, '0' call putchar .next: cmp ebp, bbuffer jb .done mov al, [ebp] push eax shr al, 4 add al, '0' call putchar pop eax and al, 0Fh add al, '0' call putchar dec ebp jmp short .next .done: pop ebp or ebp, ebp je .ret .zeros: mov al, '0' call putchar dec ebp jne .zeros .ret: ret The code follows the same format as all the other filters we have seen before, with one subtle exception:
We are no longer assuming that the end of input implies the end of things to do, something we took for granted in the character-oriented filters.

This filter does not process characters. It processes a language (albeit a very simple one, consisting only of numbers). When we have no more input, it can mean one of two things. Either we are done and can quit, which is the same as before; or the last character we read was a digit. We have stored it at the end of our ASCII-to-float conversion buffer, and now need to convert the contents of that buffer into a number and write the last line of our output.

For that reason, we have modified our getchar and our read routines to return with the carry flag clear whenever we are fetching another character from the input, or with the carry flag set whenever there is no more input.

Of course, we are still using assembly language magic to do that! Take a good look at getchar. It always returns with the carry flag clear. Yet, our main code relies on the carry flag to tell it when to quit, and it works.

The magic is in read. Whenever it receives more input from the system, it just returns to getchar, which fetches a character from the input buffer, clears the carry flag, and returns. But when read receives no more input from the system, it does not return to getchar at all. Instead, at the .empty label, the add esp, byte 4 op code adds 4 to ESP, the stc that follows sets the carry flag, and only then does it return.

So, where does it return to? Whenever a program uses the call op code, the microprocessor pushes the return address, i.e., it stores it on the top of the stack (not the FPU stack; the system stack, which is in memory). When a program uses the ret op code, the microprocessor pops the return address from the stack and jumps to the address that was stored there. But since we added 4 to ESP (which is the stack pointer register), we have effectively given the microprocessor a minor case of amnesia: It no longer remembers it was getchar that called read. And since getchar never pushed anything before calling read, the top of the stack now contains the return address to whatever or whoever called getchar. As far as that caller is concerned, it called getchar, which returned with the carry flag set!
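If the trick is hard to follow inside the full program, here is a minimal sketch of the same stack manipulation on its own. The labels are invented for the illustration: middle plays the role of getchar, bottom the role of read at its .empty exit.

middle:				; the role of getchar
	call	bottom		; pushes return address B
	clc			; normal path: carry clear
	ret			; pops B; never reached when
				; bottom takes the shortcut below

bottom:				; the role of read at .empty
	add	esp, byte 4	; discard return address B
	stc			; signal "no more input"
	ret			; pops A, the address pushed by
				; whoever called middle, so we
				; return straight to that caller

Whoever calls middle sees a single call that comes back either with the carry flag clear (bottom behaved) or with the carry flag set (bottom returned past middle entirely).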
Other than that, the bcdload routine is caught up in the middle of a Lilliputian conflict between the Big-Endians and the Little-Endians. It is converting the text representation of a number into that number: The text is stored in big-endian order, but the packed decimal is little-endian.

To solve the conflict, we use the std op code early on. We cancel it with cld later on: It is quite important that we not call anything that may depend on the default setting of the direction flag while std is active (see the short sketch below).

Everything else in this code should be quite clear, provided you have read the entire chapter that precedes it. It is a classical example of the adage that programming requires a lot of thought and only a little coding. Once we have thought through every tiny detail, the code almost writes itself.
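As an aside, the direction-flag discipline is easy to see in a tiny sketch, separate from pinhole itself (buf and BUFLEN are invented names for a byte buffer and its length):

	mov	ecx, BUFLEN		; hypothetical byte count
	lea	esi, [buf+BUFLEN-1]	; point at the last byte
	std				; lodsb now decrements ESI
revloop:
	lodsb				; walk the buffer right to left
	; examine AL here, but call nothing while STD is active
	loop	revloop
	cld				; restore the default direction
					; before any other code runs

Anything called between the std and the cld would inherit a backwards direction flag, which is exactly the trap bcdload avoids.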
Using <application>pinhole</application>

Because we have decided to make the program ignore any input except for numbers (and even those inside a comment), we can actually perform textual queries. We do not have to, but we can. In my humble opinion, forming a textual query, instead of having to follow a very strict syntax, makes software much more user friendly.

Suppose we want to build a pinhole camera to use the 4x5 inch film. The standard focal length for that film is about 150mm. We want to fine-tune our focal length so the pinhole diameter is as round a number as possible. Let us also suppose we are quite comfortable with cameras but somewhat intimidated by computers. Rather than just typing in a bunch of numbers, we want to ask a couple of questions.

Our session might look like this:

&prompt.user; pinhole
Computer, what size pinhole do I need for the focal length of 150?
150	490	306	362	2930	12
Hmmm... How about 160?
160	506	316	362	3125	12
Let's make it 155, please.
155	498	311	362	3027	12
Ah, let's try 157...
157	501	313	362	3066	12
156?
156	500	312	362	3047	12
That's it! Perfect! Thank you very much!
^D

We have found that while for the focal length of 150 our pinhole diameter should be 490 microns, or 0.49 mm, if we go with the almost identical focal length of 156 mm, we can get away with a pinhole diameter of exactly one half of a millimeter.

Scripting

Because we have chosen the # character to denote the start of a comment, we can treat our pinhole software as a scripting language.

You have probably seen shell scripts that start with:

#! /bin/sh

...or...

#!/bin/sh

...because the blank space after the #! is optional.

Whenever &unix; is asked to run an executable file which starts with the #!, it assumes the file is a script. It takes the command from the rest of the first line, appends the name of the script file to it, and tries to execute that.

Suppose now that we have installed pinhole in /usr/local/bin/. We can now write a script to calculate various pinhole diameters suitable for various focal lengths commonly used with the 120 film.

The script might look something like this:

#! /usr/local/bin/pinhole -b -i
# Find the best pinhole diameter
# for the 120 film

### Standard
80

### Wide angle
30, 40, 50, 60, 70

### Telephoto
100, 120, 140

Because 120 is a medium-size film, we may name this file medium. We can set its permissions to allow execution, and run it as if it were a program:

&prompt.user; chmod 755 medium
&prompt.user; ./medium

&unix; will interpret that last command as:

&prompt.user; /usr/local/bin/pinhole -b -i ./medium

It will run that command and display:

80	358	224	256	1562	11
30	219	137	128	586	9
40	253	158	181	781	10
50	283	177	181	977	10
60	310	194	181	1172	10
70	335	209	181	1367	10
100	400	250	256	1953	11
120	438	274	256	2344	11
140	473	296	256	2734	11

Now, let us enter:

&prompt.user; ./medium -c

&unix; will treat that as:

&prompt.user; /usr/local/bin/pinhole -b -i ./medium -c

That gives it two conflicting options: -b and -c (use Bender's constant and use Connors' constant). We have programmed it so later options override earlier ones; our program will calculate everything using Connors' constant:

80	331	242	256	1826	11
30	203	148	128	685	9
40	234	171	181	913	10
50	262	191	181	1141	10
60	287	209	181	1370	10
70	310	226	256	1598	11
100	370	270	256	2283	11
120	405	296	256	2739	11
140	438	320	362	3196	12

We decide we want to go with Bender's constant after all.
We want to save its values as a comma-separated file:

&prompt.user; ./medium -b -e > bender
&prompt.user; cat bender
focal length in millimeters,pinhole diameter in microns,F-number,normalized F-number,F-5.6 multiplier,stops from F-5.6
80,358,224,256,1562,11
30,219,137,128,586,9
40,253,158,181,781,10
50,283,177,181,977,10
60,310,194,181,1172,10
70,335,209,181,1367,10
100,400,250,256,1953,11
120,438,274,256,2344,11
140,473,296,256,2734,11
&prompt.user;
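Since the output is plain CSV, the rest of the &unix; toolbox can slice it further. For example, to keep just the first two columns of the file we just produced (cut is part of any &unix; base system):

&prompt.user; cut -d, -f1,2 bender
focal length in millimeters,pinhole diameter in microns
80,358
30,219
40,253
50,283
60,310
70,335
100,400
120,438
140,473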
Caveats

Assembly language programmers who "grew up" under &ms-dos; and &windows; often tend to take shortcuts. Reading the keyboard scan codes and writing directly to video memory are two classical examples of practices which, under &ms-dos;, are not frowned upon but considered the right thing to do. The reason? Both the PC BIOS and &ms-dos; are notoriously slow when performing these operations.

You may be tempted to continue similar practices in the &unix; environment. For example, I have seen a web site which explains how to access the keyboard scan codes on a popular &unix; clone. That is generally a very bad idea in a &unix; environment! Let me explain why.

&unix; Is Protected

For one thing, it may simply not be possible. &unix; runs in protected mode. Only the kernel and device drivers are allowed to access hardware directly. Perhaps a particular &unix; clone will let you read the keyboard scan codes, but chances are a real &unix; operating system will not. And even if one version may let you do it, the next one may not, so your carefully crafted software may become a dinosaur overnight.

&unix; Is an Abstraction

But there is a much more important reason not to try accessing the hardware directly (unless, of course, you are writing a device driver), even on the &unix;-like systems that let you do it: &unix; is an abstraction!

There is a major difference in the philosophy of design between &ms-dos; and &unix;. &ms-dos; was designed as a single-user system. It runs on a computer with a keyboard and a video screen attached directly to that computer. User input is almost guaranteed to come from that keyboard. Your program's output virtually always ends up on that screen.

This is NEVER guaranteed under &unix;. It is quite common for a &unix; user to pipe and redirect program input and output:

-&prompt.user; program1 | program2 | program3 > file1
+&prompt.user; program1 | program2 | program3 > file1

If you have written program2, your input does not come from the keyboard but from the output of program1. Similarly, your output does not go to the screen but becomes the input for program3, whose output, in turn, goes to file1.

But there is more! Even if you made sure that your input comes from, and your output goes to, the terminal, there is no guarantee the terminal is a PC: It may not have its video memory where you expect it, nor may its keyboard be producing PC-style scan codes. It may be a &macintosh;, or any other computer.

Now you may be shaking your head: My software is in PC assembly language, how can it run on a &macintosh;? But I did not say your software would be running on a &macintosh;, only that its terminal may be a &macintosh;.

Under &unix;, the terminal does not have to be directly attached to the computer that runs your software; it can even be on another continent, or, for that matter, on another planet. It is perfectly possible that a &macintosh; user in Australia connects to a &unix; system in North America (or anywhere else) via telnet. The software then runs on one computer, while the terminal is on a different computer: If you try to read the scan codes, you will get the wrong input!

The same holds true for any other hardware: A file you are reading may be on a disk you have no direct access to. A camera you are reading images from may be on a space shuttle, connected to you via satellites.

That is why under &unix; you must never make any assumptions about where your data is coming from and going to. Always let the system handle the physical access to the hardware.
These are caveats, not absolute rules. Exceptions are possible. For example, if a text editor has determined it is running on a local machine, it may want to read the scan codes directly for improved control. I am not mentioning these caveats to tell you what to do or what not to do, just to make you aware of certain pitfalls that await you if you have just arrived in &unix; from &ms-dos;. Of course, creative people often break rules, and it is OK as long as they know they are breaking them and why.

Acknowledgements

This tutorial would never have been possible without the help of many experienced FreeBSD programmers from the &a.hackers;, many of whom have patiently answered my questions, and pointed me in the right direction in my attempts to explore the inner workings of &unix; system programming in general and FreeBSD in particular.

Thomas M. Sommers opened the door for me. His How do I write "Hello, world" in FreeBSD assembler? web page was my first encounter with an example of assembly language programming under FreeBSD.

Jake Burkholder has kept the door open by willingly answering all of my questions and supplying me with example assembly language source code.

Copyright © 2000-2001 G. Adam Stanislav. All rights reserved.
diff --git a/en_US.ISO8859-1/books/fdp-primer/doc-build/chapter.sgml b/en_US.ISO8859-1/books/fdp-primer/doc-build/chapter.sgml index ca0beb0f5e..38b62abc46 100644 --- a/en_US.ISO8859-1/books/fdp-primer/doc-build/chapter.sgml +++ b/en_US.ISO8859-1/books/fdp-primer/doc-build/chapter.sgml @@ -1,498 +1,498 @@

The Documentation Build Process

This chapter's main purpose is to clearly explain how the documentation build process is organized, and how to effect modifications to this process.

After you have finished reading this chapter you should:

Know which tools are needed to build the FDP documentation, in addition to those mentioned in the SGML tools chapter.

Be able to read and understand the make instructions that are present in each document's Makefiles, as well as an overview of the doc.project.mk includes.

Be able to customize the build process by using make variables and make targets.

The FreeBSD Documentation Build Toolset

Here are your tools. Use them every way you can.

The primary build tool you will need is make, specifically Berkeley Make.

Package building is handled by FreeBSD's pkg_create. If you are not using FreeBSD, you will either have to live without packages, or compile the source yourself.

gzip is needed to create compressed versions of the document. bzip2 compression and zip archives are also supported. tar is supported, and package building requires it.

install is the default method to install the documentation. There are alternatives, however.

It is unlikely you will have any trouble finding these last two; they are mentioned for completeness only.

Understanding Makefiles in the Documentation tree

There are three main types of Makefiles in the FreeBSD Documentation Project tree.

Subdirectory Makefiles simply pass commands to those directories below them.

Documentation Makefiles describe the document(s) that should be produced from this directory.

Make includes are the glue that perform the document production, and are usually of the form doc.xxx.mk.

Subdirectory Makefiles

These Makefiles usually take the form of:

SUBDIR =articles
SUBDIR+=books

COMPAT_SYMLINK = en

DOC_PREFIX?= ${.CURDIR}/..
.include "${DOC_PREFIX}/share/mk/doc.project.mk"

In quick summary, the first four non-empty lines define the make variables SUBDIR, COMPAT_SYMLINK, and DOC_PREFIX.

The first SUBDIR statement, as well as the COMPAT_SYMLINK statement, shows how to assign a value to a variable, overriding any previous value.

The second SUBDIR statement shows how a value is appended to the current value of a variable. The SUBDIR variable is now articles books.

The DOC_PREFIX assignment shows how a value is assigned to the variable, but only if it is not already defined. This is useful if DOC_PREFIX is not where this Makefile thinks it is; the user can override this and provide the correct value.

Now what does it all mean? SUBDIR mentions which subdirectories below this one the build process should pass any work on to.

COMPAT_SYMLINK is specific to compatibility symlinks (amazingly enough) for languages to their official encoding (doc/en would point to en_US.ISO-8859-1).

DOC_PREFIX is the path to the root of the FreeBSD Documentation Project tree. This is not always that easy to find, and is also easily overridden, to allow for flexibility. .CURDIR is a make builtin variable with the path to the current directory.

The final line includes the FreeBSD Documentation Project's project-wide make system file doc.project.mk, which is the glue that converts these variables into build instructions.
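If you want to see the three assignment operators in action, a throwaway Makefile (not part of the documentation tree) makes their behavior easy to poke at:

SUBDIR =articles
SUBDIR+=books
DOC_PREFIX?= /usr/doc

show:
	@echo "SUBDIR is ${SUBDIR}"
	@echo "DOC_PREFIX is ${DOC_PREFIX}"

Running make show prints SUBDIR is articles books and DOC_PREFIX is /usr/doc, while env DOC_PREFIX=/home/user/doc make show prints the user's path instead, because ?= only assigns when the variable is not already defined.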
Documentation Makefiles

These Makefiles set a bunch of make variables that describe how to build the documentation contained in that directory.

Here is an example:

MAINTAINER=nik@FreeBSD.org

DOC?= book

FORMATS?= html-split html

INSTALL_COMPRESSED?= gz
INSTALL_ONLY_COMPRESSED?=

# SGML content
SRCS= book.sgml

DOC_PREFIX?= ${.CURDIR}/../../..

.include "$(DOC_PREFIX)/share/mk/docproj.docbook.mk"

The MAINTAINER variable is a very important one. This variable provides the ability to claim ownership over a document in the FreeBSD Documentation Project, whereby you gain the responsibility for maintaining it.

DOC is the name (sans the .sgml extension) of the main document created by this directory. SRCS lists all the individual files that make up the document. This should also include any important file in which a change should result in a rebuild.

FORMATS indicates the default formats that should be built for this document. INSTALL_COMPRESSED is the default list of compression techniques that should be used in the document build. INSTALL_ONLY_COMPRESSED, empty by default, should be non-empty if only compressed documents are desired in the build.

We covered optional variable assignments in the previous section. The DOC_PREFIX and include statements should be familiar already.

FreeBSD Documentation Project make includes

This is best explained by inspection of the code. Here are the system include files:

doc.project.mk is the main project include file, which includes all the following include files, as necessary.

doc.subdir.mk handles traversing of the document tree during the build and install processes.

doc.install.mk provides variables that affect ownership and installation of documents.

doc.docbook.mk is included if DOCFORMAT is docbook and DOC is set.

doc.project.mk

By inspection:

DOCFORMAT?=	docbook
MAINTAINER?=	doc@FreeBSD.org

PREFIX?=	/usr/local
PRI_LANG?=	en_US.ISO8859-1

.if defined(DOC)
.if ${DOCFORMAT} == "docbook"
.include "doc.docbook.mk"
.endif
.endif

.include "doc.subdir.mk"
.include "doc.install.mk"

Variables

DOCFORMAT and MAINTAINER are assigned default values, if these are not set by the document's Makefile.

PREFIX is the prefix under which the documentation building tools are installed. For normal package and port installation, this is /usr/local.

PRI_LANG should be set to whatever language and encoding is natural for the users these documents are being built for. US English is the default.

PRI_LANG in no way affects which documents can, or even will, be built. Its main use is in creating links to commonly referenced documents in the FreeBSD documentation install root.

Conditionals

The .if defined(DOC) line is an example of a make conditional which, as in other languages, defines behavior depending on whether some condition is true or false. defined is a function which returns whether the variable given is defined or not.

.if ${DOCFORMAT} == "docbook", next, tests whether the DOCFORMAT variable is "docbook", and in this case, includes doc.docbook.mk.

The two .endifs close the two above conditionals, marking the end of their application.

doc.subdir.mk

This is too long to explain by inspection; you should be able to work it out with the knowledge gained from the previous chapters, and a little help given here.

Variables

SUBDIR is a list of subdirectories that the build process should go further down into.

ROOT_SYMLINKS is the name of directories that should be linked to the document install root from their actual locations, if the current language is the primary language (specified by PRI_LANG).
COMPAT_SYMLINK is described in the Subdirectory Makefile section.

Targets and macros

Dependencies are described by target: dependency1 dependency2 ... tuples, where to build target, you need to build the given dependencies first.

After that descriptive tuple, instructions on how to build the target may be given, if the conversion process between the target and its dependencies is not previously defined, or if this particular conversion is not the same as the default conversion method.

A special dependency, .USE, defines the equivalent of a macro.

_SUBDIRUSE: .USE
.for entry in ${SUBDIR}
-	@${ECHO} "===> ${DIRPRFX}${entry}"
-	@(cd ${.CURDIR}/${entry} && \
+	@${ECHO} "===> ${DIRPRFX}${entry}"
+	@(cd ${.CURDIR}/${entry} && \
	${MAKE} ${.TARGET:S/realpackage/package/:S/realinstall/install/} DIRPRFX=${DIRPRFX}${entry}/ )
.endfor

In the above, _SUBDIRUSE is now a macro which will execute the given commands when it is listed as a dependency.

What sets this macro apart from other targets? Basically, it is executed after the instructions given in the build procedure it is listed as a dependency to, and it does not adjust .TARGET, which is the variable which contains the name of the target currently being built.

clean: _SUBDIRUSE
	rm -f ${CLEANFILES}

In the above, clean will use the _SUBDIRUSE macro after it has executed the instruction rm -f ${CLEANFILES}. In effect, this causes clean to go further and further down the directory tree, deleting built files as it goes down, not on the way back up.

Provided targets

install and package both go down the directory tree calling the real versions of themselves in the subdirectories (realinstall and realpackage respectively).

clean removes files created by the build process (and goes down the directory tree too). cleandir does the same, and also removes the object directory, if any.

More on conditionals

exists is another condition function which returns true if the given file exists.

empty returns true if the given variable is empty.

target returns true if the given target does not already exist.

Looping constructs in make (.for)

.for provides a way to repeat a set of instructions for each space-separated element in a variable. It does this by assigning a variable to contain the current element in the list being examined.

_SUBDIRUSE: .USE
.for entry in ${SUBDIR}
-	@${ECHO} "===> ${DIRPRFX}${entry}"
-	@(cd ${.CURDIR}/${entry} && \
+	@${ECHO} "===> ${DIRPRFX}${entry}"
+	@(cd ${.CURDIR}/${entry} && \
	${MAKE} ${.TARGET:S/realpackage/package/:S/realinstall/install/} DIRPRFX=${DIRPRFX}${entry}/ )
.endfor

In the above, if SUBDIR is empty, no action is taken; if it has one or more elements, the instructions between .for and .endfor would repeat for every element, with entry being replaced with the value of the current element.

diff --git a/en_US.ISO8859-1/books/fdp-primer/examples/appendix.sgml b/en_US.ISO8859-1/books/fdp-primer/examples/appendix.sgml index c17f37ed65..4e4196b78a 100644 --- a/en_US.ISO8859-1/books/fdp-primer/examples/appendix.sgml +++ b/en_US.ISO8859-1/books/fdp-primer/examples/appendix.sgml @@ -1,355 +1,355 @@

Examples

This appendix contains example SGML files and command lines you can use to convert them from one output format to another. If you have successfully installed the Documentation Project tools then you should be able to use these examples directly.

These examples are not exhaustive: they do not contain all the elements you might want to use, particularly in your document's front matter.
For more examples of DocBook markup you should examine the SGML source for this and other documents, available in the CVSup doc collection, or available online.

To avoid confusion, these examples use the standard DocBook 4.1 DTD rather than the FreeBSD extension. They also use the stock stylesheets distributed by Norm Walsh, rather than any customizations made to those stylesheets by the FreeBSD Documentation Project. This makes them more useful as generic DocBook examples.

DocBook <sgmltag>book</sgmltag>

DocBook <sgmltag>book</sgmltag>

<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.1//EN">

<book>
  <bookinfo>
    <title>An Example Book</title>

    <author>
      <firstname>Your first name</firstname>
      <surname>Your surname</surname>
      <affiliation>
        <address><email>foo@example.com</email></address>
      </affiliation>
    </author>

    <copyright>
      <year>2000</year>
      <holder>Copyright string here</holder>
    </copyright>

    <abstract>
      <para>If your book has an abstract then it should go here.</para>
    </abstract>
  </bookinfo>

  <preface>
    <title>Preface</title>

    <para>Your book may have a preface, in which case it should
      be placed here.</para>
  </preface>

  <chapter>
    <title>My first chapter</title>

    <para>This is the first chapter in my book.</para>

    <sect1>
      <title>My first section</title>

      <para>This is the first section in my book.</para>
    </sect1>
  </chapter>
</book>]]>
DocBook <sgmltag>article</sgmltag>

DocBook <sgmltag>article</sgmltag>

<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.1//EN">

<article>
  <articleinfo>
    <title>An example article</title>

    <author>
      <firstname>Your first name</firstname>
      <surname>Your surname</surname>
      <affiliation>
        <address><email>foo@example.com</email></address>
      </affiliation>
    </author>

    <copyright>
      <year>2000</year>
      <holder>Copyright string here</holder>
    </copyright>

    <abstract>
      <para>If your article has an abstract then it should go here.</para>
    </abstract>
  </articleinfo>

  <sect1>
    <title>My first section</title>

    <para>This is the first section in my article.</para>

    <sect2>
      <title>My first sub-section</title>

      <para>This is the first sub-section in my article.</para>
    </sect2>
  </sect1>
</article>]]>
Producing formatted output

This section assumes that you have installed the software listed in the textproc/docproj port, either by hand or by using the port. Further, it is assumed that your software is installed in subdirectories under /usr/local/, and the directory where binaries have been installed is in your PATH. Adjust the paths as necessary for your system.

Using Jade

Converting DocBook to HTML (one large file)

&prompt.user; jade -V nochunks \
    -c /usr/local/share/sgml/docbook/dsssl/modular/catalog \
    -c /usr/local/share/sgml/docbook/catalog \
    -c /usr/local/share/sgml/jade/catalog \
    -d /usr/local/share/sgml/docbook/dsssl/modular/html/docbook.dsl \
-    -t sgml file.sgml > file.html
+    -t sgml file.sgml > file.html

Specifies the nochunks parameter to the stylesheets, forcing all output to be written to STDOUT (using Norm Walsh's stylesheets).

Specifies the catalogs that Jade will need to process. Three catalogs are required. The first is a catalog that contains information about the DSSSL stylesheets. The second contains information about the DocBook DTD. The third contains information specific to Jade.

Specifies the full path to the DSSSL stylesheet that Jade will use when processing the document.

Instructs Jade to perform a transformation from one DTD to another. In this case, the input is being transformed from the DocBook DTD to the HTML DTD.

Specifies the file that Jade should process, and redirects output to the specified .html file.

Converting DocBook to HTML (several small files)

&prompt.user; jade \
    -c /usr/local/share/sgml/docbook/dsssl/modular/catalog \
    -c /usr/local/share/sgml/docbook/catalog \
    -c /usr/local/share/sgml/jade/catalog \
    -d /usr/local/share/sgml/docbook/dsssl/modular/html/docbook.dsl \
    -t sgml file.sgml

Specifies the catalogs that Jade will need to process. Three catalogs are required. The first is a catalog that contains information about the DSSSL stylesheets. The second contains information about the DocBook DTD. The third contains information specific to Jade.

Specifies the full path to the DSSSL stylesheet that Jade will use when processing the document.

Instructs Jade to perform a transformation from one DTD to another. In this case, the input is being transformed from the DocBook DTD to the HTML DTD.

Specifies the file that Jade should process. The stylesheets determine how the individual HTML files will be named, and the name of the root file (i.e., the one that contains the start of the document).

This example may still only generate one HTML file, depending on the structure of the document you are processing, and the stylesheet's rules for splitting output.

Converting DocBook to Postscript

The source SGML file must be converted to a &tex; file.

&prompt.user; jade -Vtex-backend \
    -c /usr/local/share/sgml/docbook/dsssl/modular/catalog \
    -c /usr/local/share/sgml/docbook/catalog \
    -c /usr/local/share/sgml/jade/catalog \
    -d /usr/local/share/sgml/docbook/dsssl/modular/print/docbook.dsl \
    -t tex file.sgml

Customizes the stylesheets to use various options specific to producing output for &tex;.

Specifies the catalogs that Jade will need to process. Three catalogs are required. The first is a catalog that contains information about the DSSSL stylesheets. The second contains information about the DocBook DTD. The third contains information specific to Jade.

Specifies the full path to the DSSSL stylesheet that Jade will use when processing the document.

Instructs Jade to convert the output to &tex;.
The generated .tex file must now be run through tex, specifying the &jadetex macro package.

&prompt.user; tex "&jadetex" file.tex

You have to run tex at least three times. The first run processes the document, and determines areas of the document which are referenced from other parts of the document, for use in indexing, and so on. Do not be alarmed if you see warning messages such as LaTeX Warning: Reference `136' on page 5 undefined on input line 728. at this point.

The second run reprocesses the document now that certain pieces of information are known (such as the document's page length). This allows index entries and other cross-references to be fixed up.

The third pass performs any final cleanup necessary.

The output from this stage will be file.dvi.

Finally, run dvips to convert the .dvi file to Postscript.

&prompt.user; dvips -o file.ps file.dvi

Converting DocBook to PDF

The first part of this process is identical to that when converting DocBook to Postscript, using the same jade command line.

When the .tex file has been generated you run pdfTeX, using the &pdfjadetex macro package instead.

&prompt.user; pdftex "&pdfjadetex" file.tex

Again, run this command three times.

This will generate file.pdf, which does not need to be processed any further.
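If you do this often, the whole Postscript chain is easy to wrap in a small shell script. This is only a convenience sketch assembled from the commands above; the catalog and stylesheet paths are the ones this appendix assumes, and may differ on your system:

#!/bin/sh
# docbook2ps: DocBook -> TeX -> DVI -> Postscript
file=${1%.sgml}
jade -Vtex-backend \
    -c /usr/local/share/sgml/docbook/dsssl/modular/catalog \
    -c /usr/local/share/sgml/docbook/catalog \
    -c /usr/local/share/sgml/jade/catalog \
    -d /usr/local/share/sgml/docbook/dsssl/modular/print/docbook.dsl \
    -t tex "$file.sgml"
tex "&jadetex" "$file.tex"      # first pass: find cross-references
tex "&jadetex" "$file.tex"      # second pass: fix them up
tex "&jadetex" "$file.tex"      # third pass: final cleanup
dvips -o "$file.ps" "$file.dvi"

Swap the three tex runs for pdftex "&pdfjadetex" to get a PDF instead, as described above.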
diff --git a/en_US.ISO8859-1/books/fdp-primer/sgml-primer/chapter.sgml b/en_US.ISO8859-1/books/fdp-primer/sgml-primer/chapter.sgml index 01bd6cd622..c90154b62a 100644 --- a/en_US.ISO8859-1/books/fdp-primer/sgml-primer/chapter.sgml +++ b/en_US.ISO8859-1/books/fdp-primer/sgml-primer/chapter.sgml @@ -1,1591 +1,1591 @@ SGML Primer The majority of FDP documentation is written in applications of SGML. This chapter explains exactly what that means, how to read and understand the source to the documentation, and the sort of SGML tricks you will see used in the documentation. Portions of this section were inspired by Mark Galassi's Get Going With DocBook. Overview Way back when, electronic text was simple to deal with. Admittedly, you had to know which character set your document was written in (ASCII, EBCDIC, or one of a number of others) but that was about it. Text was text, and what you saw really was what you got. No frills, no formatting, no intelligence. Inevitably, this was not enough. Once you have text in a machine-usable format, you expect machines to be able to use it and manipulate it intelligently. You would like to indicate that certain phrases should be emphasized, or added to a glossary, or be hyperlinks. You might want filenames to be shown in a typewriter style font for viewing on screen, but as italics when printed, or any of a myriad of other options for presentation. It was once hoped that Artificial Intelligence (AI) would make this easy. Your computer would read in the document and automatically identify key phrases, filenames, text that the reader should type in, examples, and more. Unfortunately, real life has not happened quite like that, and our computers require some assistance before they can meaningfully process our text. More precisely, they need help identifying what is what. You or I can look at
To remove /tmp/foo use &man.rm.1;. &prompt.user; rm /tmp/foo
and easily see which parts are filenames, which are commands to be typed in, which parts are references to manual pages, and so on. But the computer processing the document cannot. For this we need markup.
Markup is commonly used to describe adding value or increasing cost. The term takes on both these meanings when applied to text. Markup is additional text included in the document, distinguished from the document's content in some way, so that programs that process the document can read the markup and use it when making decisions about the document. Editors can hide the markup from the user, so the user is not distracted by it.

The extra information stored in the markup adds value to the document. Adding the markup to the document must typically be done by a person; after all, if computers could recognize the text sufficiently well to add the markup then there would be no need to add it in the first place. This increases the cost (i.e., the effort required) to create the document.

The previous example is actually represented in this document like this:

<para>To remove <filename>/tmp/foo</filename> use &man.rm.1;.</para>

<screen>&prompt.user; <userinput>rm /tmp/foo</userinput></screen>]]>

As you can see, the markup is clearly separate from the content.

Obviously, if you are going to use markup you need to define what your markup means, and how it should be interpreted. You will need a markup language that you can follow when marking up your documents.

Of course, one markup language might not be enough. A markup language for technical documentation has very different requirements than a markup language used for cookery recipes. This, in turn, would be very different from a markup language used to describe poetry. What you really need is a first language that you use to write these other markup languages. A meta markup language.

This is exactly what the Standard Generalized Markup Language (SGML) is. Many markup languages have been written in SGML, including the two most used by the FDP, HTML and DocBook.

Each language definition is more properly called a Document Type Definition (DTD). The DTD specifies the name of the elements that can be used, what order they appear in (and whether some markup can be used inside other markup) and related information. A DTD is sometimes referred to as an application of SGML.

A DTD is a complete specification of all the elements that are allowed to appear, the order in which they should appear, which elements are mandatory, which are optional, and so forth. This makes it possible to write an SGML parser which reads in both the DTD and a document which claims to conform to the DTD. The parser can then confirm whether or not all the elements required by the DTD are in the document in the right order, and whether there are any errors in the markup. This is normally referred to as validating the document.

This processing simply confirms that the choice of elements, their ordering, and so on, conforms to that listed in the DTD. It does not check that you have used appropriate markup for the content. If you tried to mark up all the filenames in your document as function names, the parser would not flag this as an error (assuming, of course, that your DTD defines elements for filenames and functions, and that they are allowed to appear in the same place).

It is likely that most of your contributions to the Documentation Project will consist of content marked up in either HTML or DocBook, rather than alterations to the DTDs. For this reason this book will not touch on how to write a DTD.
Elements, tags, and attributes

All the DTDs written in SGML share certain characteristics. This is hardly surprising, as the philosophy behind SGML will inevitably show through. One of the most obvious manifestations of this philosophy is that of content and elements.

Your documentation (whether it is a single web page, or a lengthy book) is considered to consist of content. This content is then divided (and further subdivided) into elements. The purpose of adding markup is to name and identify the boundaries of these elements for further processing.

For example, consider a typical book. At the very top level, the book is itself an element. This book element obviously contains chapters, which can be considered to be elements in their own right. Each chapter will contain more elements, such as paragraphs, quotations, and footnotes. Each paragraph might contain further elements, identifying content that was direct speech, or the name of a character in the story.

You might like to think of this as chunking content. At the very top level you have one chunk, the book. Look a little deeper, and you have more chunks, the individual chapters. These are chunked further into paragraphs, footnotes, character names, and so on.

Notice how you can make this differentiation between different elements of the content without resorting to any SGML terms. It really is surprisingly straightforward. You could do this with a highlighter pen and a printout of the book, using different colors to indicate different chunks of content.

Of course, we do not have an electronic highlighter pen, so we need some other way of indicating which element each piece of content belongs to. In languages written in SGML (HTML, DocBook, et al) this is done by means of tags.

A tag is used to identify where a particular element starts, and where the element ends. The tag is not part of the element itself. Because each DTD was normally written to mark up specific types of information, each one will recognize different elements, and will therefore have different names for the tags.

For an element called element-name the start tag will normally look like <element-name>. The corresponding closing tag for this element is </element-name>.

Using an element (start and end tags)

HTML has an element for indicating that the content enclosed by the element is a paragraph, called p. This element has both start and end tags.

<p>This is a paragraph.  It starts with the start tag for
  the 'p' element, and it will end with the end tag for the
  'p' element.</p>

<p>This is another paragraph.  But this one is much shorter.</p>
]]>
Not all elements require an end tag. Some elements have no content. For example, in HTML you can indicate that you want a horizontal line to appear in the document. Obviously, this line has no content, so just the start tag is required for this element.

Using an element (start tag only)

HTML has an element for indicating a horizontal rule, called hr. This element does not wrap content, so it only has a start tag.

<p>This is a paragraph.</p>

<hr>

<p>This is another paragraph.  A horizontal rule separates
  this from the previous paragraph.</p>
]]>
If it is not obvious by now, elements can contain other elements. In the book example earlier, the book element contained all the chapter elements, which in turn contained all the paragraph elements, and so on.

Elements within elements; em

<p>This is a simple <em>paragraph</em> where some
  of the words have been <em>emphasized</em>.</p>
]]>
The DTD will specify the rules detailing which elements can contain other elements, and exactly what they can contain.

People often confuse the terms tags and elements, and use the terms as if they were interchangeable. They are not.

An element is a conceptual part of your document. An element has a defined start and end. The tags mark where the element starts and ends.

When this document (or anyone else knowledgeable about SGML) refers to the <p> tag they mean the literal text consisting of the three characters <, p, and >. But the phrase the <p> element refers to the whole element.

This distinction is very subtle. But keep it in mind.

Elements can have attributes. An attribute has a name and a value, and is used for adding extra information to the element. This might be information that indicates how the content should be rendered, or might be something that uniquely identifies that occurrence of the element, or it might be something else.

An element's attributes are written inside the start tag for that element, and take the form attribute-name="attribute-value".

In sufficiently recent versions of HTML, the p element has an attribute called align, which suggests an alignment (justification) for the paragraph to the program displaying the HTML.

The align attribute can take one of four defined values: left, center, right, and justify. If the attribute is not specified then the default is left.

Using an element with an attribute

<p align="left">The inclusion of the align attribute
  on this paragraph was superfluous, since the default is left.</p>

<p align="center">This may appear in the center.</p>
]]>
Some attributes will only take specific values, such as left or justify. Others will allow you to enter anything you want. If you need to include quotes (") within an attribute then use single quotes around the attribute value.

Single quotes around attributes

<p align='right'>I am on the right!</p>
]]>
Sometimes you do not need to use quotes around attribute values at all. However, the rules for doing this are subtle, and it is far simpler just to always quote your attribute values.

The information on attributes, elements, and tags is stored in SGML catalogs. The various Documentation Project tools use these catalog files to validate your work. The tools in textproc/docproj include a variety of SGML catalog files. The FreeBSD Documentation Project includes its own set of catalog files. Your tools need to know about both sorts of catalog files.

For you to do…

In order to run the examples in this document you will need to install some software on your system and ensure that an environment variable is set correctly.

Download and install textproc/docproj from the FreeBSD ports system. This is a meta-port that should download and install all of the programs and supporting files that are used by the Documentation Project.

Add lines to your shell startup files to set SGML_CATALOG_FILES. (If you are not working on the English version of the documentation, you will want to substitute the correct directory for your language.)

<filename>.profile</filename>, for &man.sh.1; and &man.bash.1; users

SGML_ROOT=/usr/local/share/sgml
SGML_CATALOG_FILES=${SGML_ROOT}/jade/catalog
SGML_CATALOG_FILES=${SGML_ROOT}/iso8879/catalog:$SGML_CATALOG_FILES
SGML_CATALOG_FILES=${SGML_ROOT}/html/catalog:$SGML_CATALOG_FILES
SGML_CATALOG_FILES=${SGML_ROOT}/docbook/4.1/catalog:$SGML_CATALOG_FILES
SGML_CATALOG_FILES=/usr/doc/share/sgml/catalog:$SGML_CATALOG_FILES
SGML_CATALOG_FILES=/usr/doc/en_US.ISO8859-1/share/sgml/catalog:$SGML_CATALOG_FILES
export SGML_CATALOG_FILES

<filename>.cshrc</filename>, for &man.csh.1; and &man.tcsh.1; users

setenv SGML_ROOT /usr/local/share/sgml
setenv SGML_CATALOG_FILES ${SGML_ROOT}/jade/catalog
setenv SGML_CATALOG_FILES ${SGML_ROOT}/iso8879/catalog:$SGML_CATALOG_FILES
setenv SGML_CATALOG_FILES ${SGML_ROOT}/html/catalog:$SGML_CATALOG_FILES
setenv SGML_CATALOG_FILES ${SGML_ROOT}/docbook/4.1/catalog:$SGML_CATALOG_FILES
setenv SGML_CATALOG_FILES /usr/doc/share/sgml/catalog:$SGML_CATALOG_FILES
setenv SGML_CATALOG_FILES /usr/doc/en_US.ISO8859-1/share/sgml/catalog:$SGML_CATALOG_FILES

Then either log out, and log back in again, or run those commands from the command line to set the variable values.

Create example.sgml, and enter the following text:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">

<html>
  <head>
    <title>An example HTML file</title>
  </head>

  <body>
    <p>This is a paragraph containing some text.</p>

    <p>This paragraph contains some more text.</p>

    <p align="right">This paragraph might be right-justified.</p>
  </body>
</html>
]]>
Try to validate this file using an SGML parser.

Part of textproc/docproj is the nsgmls validating parser. Normally, nsgmls reads in a document marked up according to an SGML DTD and returns a copy of the document's Element Structure Information Set (ESIS, but that is not important right now).

However, when nsgmls is given the -s parameter, nsgmls will suppress its normal output, and just print error messages. This makes it a useful way to check to see if your document is valid or not.

Use nsgmls to check that your document is valid:

&prompt.user; nsgmls -s example.sgml

As you will see, nsgmls returns without displaying any output. This means that your document validated successfully.

See what happens when required elements are omitted. Try removing the <title> and </title> tags, and re-run the validation.

&prompt.user; nsgmls -s example.sgml
nsgmls:example.sgml:5:4:E: character data is not allowed here
nsgmls:example.sgml:6:8:E: end tag for "HEAD" which is not finished

The error output from nsgmls is organized into colon-separated groups, or columns.

Column	Meaning
1	The name of the program generating the error. This will always be nsgmls.
2	The name of the file that contains the error.
3	The line number where the error appears.
4	The column number where the error appears.
5	A one-letter code indicating the nature of the message: I indicates an informational message, W is for warnings, E is for errors, and X is for cross-references. (It is not always the fifth column, either: nsgmls -sv displays nsgmls:I: SP version "1.3", an informational message, with the version depending on the installed nsgmls.)
6	The text of the error message.

As you can see, the two messages above are errors.

Simply omitting the <title> tags has generated two different errors.

The first error indicates that content (in this case, characters, rather than the start tag for an element) has occurred where the SGML parser was expecting something else. In this case, the parser was expecting to see one of the start tags for elements that are valid inside head (such as title).

The second error is because head elements must contain a title element. Because it does not, nsgmls considers that the element has not been properly finished. However, the closing tag indicates that the element has been closed before it has been finished.

Put the title element back in.
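If you find yourself validating often, a small loop saves retyping. This is plain Bourne shell; the *.sgml glob is just an assumption about where your files live:

&prompt.user; for f in *.sgml; do echo "==> $f"; nsgmls -s "$f"; done

Any file that produces no output after its ==> marker is valid.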
The DOCTYPE declaration

The beginning of each document that you write must specify the name of the DTD that the document conforms to. This is so that SGML parsers can determine the DTD and ensure that the document does conform to it.

This information is generally expressed on one line, in the DOCTYPE declaration.

A typical declaration for a document written to conform with version 4.0 of the HTML DTD looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">]]>

That line contains a number of different components.

<!
	Is the indicator that this is an SGML declaration. This line is declaring the document type.

DOCTYPE
	Shows that this is an SGML declaration for the document type.

html
	Names the first element that will appear in the document.

PUBLIC "-//W3C//DTD HTML 4.0//EN"
	Lists the Formal Public Identifier (FPI) for the DTD that this document conforms to. Your SGML parser will use this to find the correct DTD when processing this document. PUBLIC is not a part of the FPI, but indicates to the SGML processor how to find the DTD referenced in the FPI. Other ways of telling the SGML parser how to find the DTD are shown later.

>
	Returns to the document.

Formal Public Identifiers (FPIs)

You do not need to know this, but it is useful background, and might help you debug problems when your SGML processor cannot locate the DTD you are using.

FPIs must follow a specific syntax. This syntax is as follows:

"Owner//Keyword Description//Language"

Owner
	This indicates the owner of the FPI. If this string starts with ISO then this is an ISO owned FPI. For example, the FPI "ISO 8879:1986//ENTITIES Greek Symbols//EN" lists ISO 8879:1986 as being the owner for the set of entities for Greek symbols. ISO 8879:1986 is the ISO number for the SGML standard. Otherwise, this string will either look like -//Owner or +//Owner (notice the only difference is the leading + or -). If the string starts with - then the owner information is unregistered; with a + it is registered. ISO 9070:1991 defines how registered names are generated; it might be derived from the number of an ISO publication, an ISBN code, or an organization code assigned according to ISO 6523. In addition, a registration authority could be created in order to assign registered names. The ISO council delegated this to the American National Standards Institute (ANSI). Because the FreeBSD Project has not been registered, the owner string is -//FreeBSD. And as you can see, the W3C are not a registered owner either.

Keyword
	There are several keywords that indicate the type of information in the file. Some of the most common keywords are DTD, ELEMENT, ENTITIES, and TEXT. DTD is used only for DTD files, ELEMENT is usually used for DTD fragments that contain only entity or element declarations. TEXT is used for SGML content (text and tags).

Description
	Any description you want to supply for the contents of this file. This may include version numbers or any short text that is meaningful to you and unique for the SGML system.

Language
	This is an ISO two-character code that identifies the native language for the file. EN is used for English.

<filename>catalog</filename> files

If you use the syntax above and process this document using an SGML processor, the processor will need to have some way of turning the FPI into the name of the file on your computer that contains the DTD. In order to do this it can use a catalog file.
A catalog file (typically called catalog) contains lines that map FPIs to filenames. For example, if the catalog file contained the line:

PUBLIC "-//W3C//DTD HTML 4.0//EN" "4.0/strict.dtd"

the SGML processor would know to look up the DTD from strict.dtd in the 4.0 subdirectory of whichever directory held the catalog file that contained that line.

Look at the contents of /usr/local/share/sgml/html/catalog. This is the catalog file for the HTML DTDs that will have been installed as part of the textproc/docproj port.

<envar>SGML_CATALOG_FILES</envar>

In order to locate a catalog file, your SGML processor will need to know where to look. Many of them feature command line parameters for specifying the path to one or more catalogs.

In addition, you can set SGML_CATALOG_FILES to point to the files. This environment variable should consist of a colon-separated list of catalog files (including their full path). Typically, you will want to include the following files:

/usr/local/share/sgml/docbook/4.1/catalog
/usr/local/share/sgml/html/catalog
/usr/local/share/sgml/iso8879/catalog
/usr/local/share/sgml/jade/catalog

You should already have done this.

Alternatives to FPIs

Instead of using an FPI to indicate the DTD that the document conforms to (and therefore, which file on the system contains the DTD) you can explicitly specify the name of the file.

The syntax for this is slightly different:

<!DOCTYPE html SYSTEM "/path/to/file.dtd">]]>

The SYSTEM keyword indicates that the SGML processor should locate the DTD in a system specific fashion. This typically (but not always) means the DTD will be provided as a filename.

Using FPIs is preferred for reasons of portability. You do not want to have to ship a copy of the DTD around with your document, and if you used the SYSTEM identifier then everyone would need to keep their DTDs in the same place.

Escaping back to SGML

Earlier in this primer I said that SGML is only used when writing a DTD. This is not strictly true. There is certain SGML syntax that you will want to be able to use within your documents. For example, comments can be included in your document, and will be ignored by the parser. Comments are entered using SGML syntax. Other uses for SGML syntax in your document will be shown later too.

Obviously, you need some way of indicating to the SGML processor that the following content is not elements within the document, but is SGML that the parser should act upon.

These sections are marked by <! ... > in your document. Everything between these delimiters is SGML syntax as you might find within a DTD.

As you may just have realized, the DOCTYPE declaration is an example of SGML syntax that you need to include in your document…

Comments

Comments are an SGML construction, and are normally only valid inside a DTD. However, as the previous section shows, it is possible to use SGML syntax within your document.

The delimiter for SGML comments is the string --. The first occurrence of this string opens a comment, and the second closes it.

SGML generic comment

<!-- test comment -->
]]>

Use two dashes

There is a problem with producing the Postscript and PDF versions of this document. The above example probably shows just one hyphen symbol, -, after the <! and before the >. You must use two -, not one. The Postscript and PDF versions have translated the two - in the original to a longer, more professional em-dash, and broken this example in the process. The HTML, plain text, and RTF versions of this document are not affected.

]]>

If you have used HTML before you may have been shown different rules for comments.
In particular, you may think that the string <!-- opens a comment, and it is only closed by -->. This is not the case. A lot of web browsers have broken HTML parsers, and will accept that as valid. However, the SGML parsers used by the Documentation Project are much stricter, and will reject documents that make that error.

Erroneous SGML comments

<!-- This is in the comment --

     THIS IS OUTSIDE THE COMMENT!

  -- back inside the comment -->
]]>

The SGML parser will treat this as though it were actually:

<!THIS IS OUTSIDE THE COMMENT>

This is not valid SGML, and may give confusing error messages.

<!--------------- This is a very bad idea --------------->
]]>

As the example suggests, do not write comments like that.

<!--===================================================-->
]]>

That is a (slightly) better approach, but it is still potentially confusing to people new to SGML.

For you to do…

Add some comments to example.sgml, and check that the file still validates using nsgmls.

Add some invalid comments to example.sgml, and see the error messages that nsgmls gives when it encounters an invalid comment.

Entities

Entities are a mechanism for assigning names to chunks of content. As an SGML parser processes your document, any entities it finds are replaced by the content of the entity.

This is a good way to have re-usable, easily changeable chunks of content in your SGML documents. It is also the only way to include one marked up file inside another using SGML.

There are two types of entities which can be used in two different situations: general entities and parameter entities.

General Entities

You cannot use general entities in an SGML context (although you define them in one). They can only be used in your document. Contrast this with parameter entities.

Each general entity has a name. When you want to reference a general entity (and therefore include whatever text it represents in your document), you write &entity-name;. For example, suppose you had an entity called current.version which expanded to the current version number of your product. You could write:

The current version of our product is &current.version;.]]>

When the version number changes you can simply change the definition of the value of the general entity and reprocess your document.

You can also use general entities to enter characters that you could not otherwise include in an SGML document. For example, < and & cannot normally appear in an SGML document. When the SGML parser sees the < symbol it assumes that a tag (either a start tag or an end tag) is about to appear, and when it sees the & symbol it assumes the next text will be the name of an entity.

Fortunately, you can use the two general entities &lt; and &amp; whenever you need to include one or other of these.

A general entity can only be defined within an SGML context. Typically, this is done immediately after the DOCTYPE declaration.

Defining general entities

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
<!ENTITY current.version "3.0-RELEASE">
<!ENTITY last.version "2.2.7-RELEASE">
]>]]>

Notice how the DOCTYPE declaration has been extended by adding a square bracket at the end of the first line. The two entities are then defined over the next two lines, before the square bracket is closed, and then the DOCTYPE declaration is closed.

The square brackets are necessary to indicate that we are extending the DTD indicated by the DOCTYPE declaration.

Parameter entities

Like general entities, parameter entities are used to assign names to reusable chunks of text. However, whereas general entities can only be used within your document, parameter entities can only be used within an SGML context.

Parameter entities are defined in a similar way to general entities. However, instead of using &entity-name; to refer to them, use %entity-name; (parameter entities use the percent symbol).
The definition also includes the % between the ENTITY keyword and the name of the entity.

Defining parameter entities

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
<!ENTITY % param.some "some">
]>]]>

This may not seem particularly useful. It will be.

For you to do…

Add a general entity to example.sgml.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
<!ENTITY version "1.1">
]>

<html>
  <head>
    <title>An example HTML file</title>
  </head>

  <body>
    <p>This is a paragraph containing some text.</p>

    <p>This paragraph contains some more text.</p>

    <p align="right">This paragraph might be right-justified.</p>

    <p>The current version of this document is: &version;</p>
  </body>
</html>
]]>
Validate the document using nsgmls.

Load example.sgml into your web browser (you may need to copy it to example.html before your browser recognizes it as an HTML document).

Unless your browser is very advanced, you will not see the entity reference &version; replaced with the version number. Most web browsers have very simplistic parsers which do not handle proper SGML (this is a shame: imagine all the problems and hacks, such as Server Side Includes, that could be avoided if they did).

The solution is to normalize your document using an SGML normalizer. The normalizer reads in valid SGML and outputs equally valid SGML which has been transformed in some way. One of the ways in which the normalizer transforms the SGML is to expand all the entity references in the document, replacing the entities with the text that they represent.

You can use sgmlnorm to do this.

-&prompt.user; sgmlnorm example.sgml > example.html
+&prompt.user; sgmlnorm example.sgml > example.html

You should find a normalized (i.e., entity references expanded) copy of your document in example.html, ready to load into your web browser.

If you look at the output from sgmlnorm you will see that it does not include a DOCTYPE declaration at the start. To include this you need to use the -d option:

-&prompt.user; sgmlnorm -d example.sgml > example.html
+&prompt.user; sgmlnorm -d example.sgml > example.html
Using entities to include files Entities (both general and parameter) are particularly useful when used to include one file inside another. Using general entities to include files Suppose you have some content for an SGML book organized into files, one file per chapter, called chapter1.sgml, chapter2.sgml, and so forth, with a book.sgml file that will contain these chapters. In order to use the contents of these files as the values for your entities, you declare them with the SYSTEM keyword. This directs the SGML parser to use the contents of the named file as the value of the entity. Using general entities to include files

<!ENTITY chapter.1 SYSTEM "chapter1.sgml">
<!ENTITY chapter.2 SYSTEM "chapter2.sgml">
<!ENTITY chapter.3 SYSTEM "chapter3.sgml">
]>

&chapter.1;
&chapter.2;
&chapter.3;
]]>

When using general entities to include other files within a document, the files being included (chapter1.sgml, chapter2.sgml, and so on) must not start with a DOCTYPE declaration. This is a syntax error. Using parameter entities to include files Recall that parameter entities can only be used inside an SGML context. Why then would you want to include a file within an SGML context? You can use this to ensure that you can reuse your general entities. Suppose that you had many chapters in your document, and you reused these chapters in two different books, each book organizing the chapters in a different fashion. You could list the entities at the top of each book, but this quickly becomes cumbersome to manage. Instead, place the general entity definitions inside one file, and use a parameter entity to include that file within your document. Using parameter entities to include files First, place your entity definitions in a separate file, called chapters.ent. This file contains the following:

<!ENTITY chapter.1 SYSTEM "chapter1.sgml">
<!ENTITY chapter.2 SYSTEM "chapter2.sgml">
<!ENTITY chapter.3 SYSTEM "chapter3.sgml">
]]>

Now create a parameter entity to refer to the contents of the file. Then use the parameter entity to load the file into the document, which will then make all the general entities available for use. Then use the general entities as before:

<!ENTITY % chapters SYSTEM "chapters.ent">
%chapters;
]>

&chapter.1;
&chapter.2;
&chapter.3;
]]>

For you to do… Use general entities to include files Create three files, para1.sgml, para2.sgml, and para3.sgml. Put content similar to the following in each file:

<p>This is the first paragraph.</p>

]]>
Edit example.sgml so that it looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
<!ENTITY version "1.1">
<!ENTITY para1 SYSTEM "para1.sgml">
<!ENTITY para2 SYSTEM "para2.sgml">
<!ENTITY para3 SYSTEM "para3.sgml">
]>

<html>
  <head>
    <title>An example HTML file</title>
  </head>

  <body>

    <p>The current version of this document is: &version;</p>

    &para1;
    &para2;
    &para3;
  </body>
</html>
]]>
Produce example.html by normalizing example.sgml. - &prompt.user; sgmlnorm -d example.sgml > example.html + &prompt.user; sgmlnorm -d example.sgml > example.html Load example.html into your web browser, and confirm that the paraN.sgml files have been included in example.html.
Use parameter entities to include files You must have taken the previous steps first. Edit example.sgml so that it looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
<!ENTITY % entities SYSTEM "entities.sgml"> %entities;
]>

<html>
  <head>
    <title>An example HTML file</title>
  </head>

  <body>

    <p>The current version of this document is: &version;</p>

    &para1;
    &para2;
    &para3;
  </body>
</html>
]]>
Create a new file, entities.sgml, with this content:

<!ENTITY version "1.1">
<!ENTITY para1 SYSTEM "para1.sgml">
<!ENTITY para2 SYSTEM "para2.sgml">
<!ENTITY para3 SYSTEM "para3.sgml">
]]>

Produce example.html by normalizing example.sgml. - &prompt.user; sgmlnorm -d example.sgml > example.html + &prompt.user; sgmlnorm -d example.sgml > example.html Load example.html into your web browser, and confirm that the paraN.sgml files have been included in example.html.
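As an optional sanity check (a sketch, not part of the original exercise): the general-entity and parameter-entity versions of example.sgml should normalize to identical output. Assuming you saved the normalized output of the previous exercise under the hypothetical name example-general.html, a comparison should print no differences:

&prompt.user; sgmlnorm -d example.sgml > example.html
&prompt.user; diff example-general.html example.html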
Marked sections SGML provides a mechanism to indicate that particular pieces of the document should be processed in a special way. These are termed marked sections. Structure of a marked section <![ KEYWORD [ Contents of marked section ]]> As you would expect, being an SGML construct, a marked section starts with <!. The first square bracket begins to delimit the marked section. KEYWORD describes how this marked section should be processed by the parser. The second square bracket indicates that the content of the marked section starts here. The marked section is finished by closing the two square brackets, and then returning to the document context from the SGML context with >. Marked section keywords <literal>CDATA</literal>, <literal>RCDATA</literal> These keywords denote the marked section's content model, and allow you to change it from the default. When an SGML parser is processing a document it keeps track of what is called the content model. Briefly, the content model describes what sort of content the parser is expecting to see, and what it will do with it when it finds it. The two content models you will probably find most useful are CDATA and RCDATA. CDATA is for Character Data. If the parser is in this content model then it is expecting to see characters, and characters only. In this model the < and & symbols lose their special status, and will be treated as ordinary characters. RCDATA is for entity references and character data. If the parser is in this content model then it is expecting to see characters and entities. < loses its special status, but & will still be treated as the start of a general entity. This is particularly useful if you are including some verbatim text that contains lots of < and & characters. While you could go through the text ensuring that every < is converted to a &lt; and every & is converted to a &amp;, it can be easier to mark the section as only containing CDATA. When the SGML parser encounters this it will ignore the < and & symbols embedded in the content. When you use CDATA or RCDATA in examples of text marked up in SGML, keep in mind that the content of CDATA is not validated. You have to check the included SGML text using other means. You could, for example, write the example in another document, validate the example code, and then paste it into your CDATA content. Using a CDATA marked section <para>Here is an example of how you would include some text that contained many <literal>&lt;</literal> and <literal>&amp;</literal> symbols. The sample text is a fragment of HTML. The surrounding text (<para> and <programlisting>) are from DocBook.</para> <programlisting> <![ CDATA [

<p>This is a sample that shows you some of the elements within HTML. Since the angle brackets are used so many times, it is simpler to say the whole example is a CDATA marked section than to use the entity names for the left and right angle brackets throughout.</p>

<ul>
  <li>This is a listitem</li>
  <li>This is a second listitem</li>
  <li>This is a third listitem</li>
</ul>

<p>This is the end of the example.</p>

]]> ]]> </programlisting>
If you look at the source for this document you will see this technique used throughout.
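To illustrate the difference between the two models, here is a hedged sketch (it assumes a version entity is declared in the document's DOCTYPE, as in the earlier examples). Inside the following RCDATA marked section the < characters are left alone, but the entity reference is still expanded:

<![ RCDATA [
The current version is &version;, and 1 < 2 is printed literally.
]]>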
<literal>INCLUDE</literal> and <literal>IGNORE</literal> If the keyword is INCLUDE then the contents of the marked section will be processed. If the keyword is IGNORE then the marked section is ignored and will not be processed. It will not appear in the output. Using <literal>INCLUDE</literal> and <literal>IGNORE</literal> in marked sections <![ INCLUDE [ This text will be processed and included. ]]> <![ IGNORE [ This text will not be processed or included. ]]> By itself, this is not too useful. If you wanted to remove text from your document you could cut it out, or wrap it in comments. It becomes more useful when you realize you can use parameter entities to control this. Remember that parameter entities can only be used in SGML contexts, and the keyword of a marked section is an SGML context. For example, suppose that you produced a hard-copy version of some documentation and an electronic version. In the electronic version you wanted to include some extra content that was not to appear in the hard-copy. Create a parameter entity, and set its value to INCLUDE. Write your document, using marked sections to delimit content that should only appear in the electronic version. In these marked sections use the parameter entity in place of the keyword. When you want to produce the hard-copy version of the document, change the parameter entity's value to IGNORE and reprocess the document. Using a parameter entity to control a marked section <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [ <!ENTITY % electronic.copy "INCLUDE"> ]> ... <![ %electronic.copy [ This content should only appear in the electronic version of the document. ]]> When producing the hard-copy version, change the entity's definition to: <!ENTITY % electronic.copy "IGNORE"> On reprocessing the document, the marked sections that use %electronic.copy as their keyword will be ignored.
For you to do… Create a new file, section.sgml, that contains the following: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [ <!ENTITY % text.output "INCLUDE"> ]> <html> <head> <title>An example using marked sections</title> </head> <body> <p>This paragraph <![ CDATA [contains many < characters (< < < < <) so it is easier to wrap it in a CDATA marked section ]]></p> <![ IGNORE [ <p>This paragraph will definitely not be included in the output.</p> ]]> <![ %text.output [ <p>This paragraph might appear in the output, or it might not.</p> <p>Its appearance is controlled by the parameter entity.</p> ]]> </body> </html> Normalize this file using &man.sgmlnorm.1; and examine the output. Notice which paragraphs have appeared, which have disappeared, and what has happened to the content of the CDATA marked section. Change the definition of the text.output entity from INCLUDE to IGNORE. Re-normalize the file, and examine the output to see what has changed.
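The normalization step for this exercise looks just like the earlier ones; a minimal sketch (the output file name section.html is an assumption):

&prompt.user; sgmlnorm -d section.sgml > section.html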
Conclusion That is the conclusion of this SGML primer. For reasons of space and complexity several things have not been covered in depth (or at all). However, the previous sections cover enough SGML for you to be able to follow the organization of the FDP documentation.
diff --git a/en_US.ISO8859-1/books/handbook/advanced-networking/chapter.sgml b/en_US.ISO8859-1/books/handbook/advanced-networking/chapter.sgml index 2b15eaac57..90cda0c018 100644 --- a/en_US.ISO8859-1/books/handbook/advanced-networking/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/advanced-networking/chapter.sgml @@ -1,4238 +1,4238 @@ Advanced Networking Synopsis This chapter will cover a number of advanced networking topics. After reading this chapter, you will know: The basics of gateways and routes. How to set up IEEE 802.11 and &bluetooth; devices. How to make FreeBSD act as a bridge. How to set up network booting on a diskless machine. How to set up network address translation. How to connect two computers via PLIP. How to set up IPv6 on a FreeBSD machine. How to configure ATM. Before reading this chapter, you should: Understand the basics of the /etc/rc scripts. Be familiar with basic network terminology. Know how to configure and install a new FreeBSD kernel. Know how to install additional third-party software. Coranth Gryphon Contributed by Gateways and Routes routing gateway subnet For one machine to be able to find another over a network, there must be a mechanism in place to describe how to get from one to the other. This is called routing. A route is a defined pair of addresses: a destination and a gateway. The pair indicates that if you are trying to get to this destination, communicate through this gateway. There are three types of destinations: individual hosts, subnets, and default. The default route is used if none of the other routes apply. We will talk a little bit more about default routes later on. There are also three types of gateways: individual hosts, interfaces (also called links), and Ethernet hardware addresses (MAC addresses). An Example To illustrate different aspects of routing, we will use the following example from netstat:

&prompt.user; netstat -r
Routing tables

Destination        Gateway            Flags     Refs     Use     Netif Expire

default            outside-gw         UGSc       37      418      ppp0
localhost          localhost          UH          0      181       lo0
test0              0:e0:b5:36:cf:4f   UHLW        5    63288       ed0     77
10.20.30.255       link#1             UHLW        1     2421
example.com        link#1             UC          0        0
host1              0:e0:a8:37:8:1e    UHLW        3     4601       lo0
host2              0:e0:a8:37:8:1e    UHLW        0        5       lo0 =>
host2.example.com  link#1             UC          0        0
224                link#1             UC          0        0

default route The first two lines specify the default route (which we will cover in the next section) and the localhost route. loopback device The interface (Netif column) that this routing table specifies to use for localhost is lo0, also known as the loopback device. This says to keep all traffic for this destination internal, rather than sending it out over the LAN, since it will only end up back where it started. Ethernet MAC address The next thing that stands out are the addresses beginning with 0:e0:. These are Ethernet hardware addresses, which are also known as MAC addresses. FreeBSD will automatically identify any hosts (test0 in the example) on the local Ethernet and add a route for that host, directly to it over the Ethernet interface, ed0. There is also a timeout (Expire column) associated with this type of route, which is used if we fail to hear from the host in a specific amount of time. When this happens, the route to this host will be automatically deleted. These hosts are identified using a mechanism known as RIP (Routing Information Protocol), which figures out routes to local hosts based upon a shortest path determination.
subnet FreeBSD will also add subnet routes for the local subnet (10.20.30.255 is the broadcast address for the subnet 10.20.30, and example.com is the domain name associated with that subnet). The designation link#1 refers to the first Ethernet card in the machine. You will notice no additional interface is specified for those. Both of these groups (local network hosts and local subnets) have their routes automatically configured by a daemon called routed. If this is not run, then only routes which are statically defined (i.e. entered explicitly) will exist. The host1 line refers to our host, which it knows by Ethernet address. Since we are the sending host, FreeBSD knows to use the loopback interface (lo0) rather than sending it out over the Ethernet interface. The two host2 lines are an example of what happens when we use an &man.ifconfig.8; alias (see the section on Ethernet for reasons why we would do this). The => symbol after the lo0 interface says that not only are we using the loopback (since this address also refers to the local host), but specifically it is an alias. Such routes only show up on the host that supports the alias; all other hosts on the local network will simply have a link#1 line for such routes. The final line (destination subnet 224) deals with multicasting, which will be covered in another section. Finally, various attributes of each route can be seen in the Flags column. Below is a short table of some of these flags and their meanings:

U (Up): The route is active.
H (Host): The route destination is a single host.
G (Gateway): Send anything for this destination on to this remote system, which will figure out from there where to send it.
S (Static): This route was configured manually, not automatically generated by the system.
C (Clone): Generates a new route based upon this route for machines we connect to. This type of route is normally used for local networks.
W (WasCloned): Indicates a route that was auto-configured based upon a local area network (Clone) route.
L (Link): Route involves references to Ethernet hardware.

Default Routes default route When the local system needs to make a connection to a remote host, it checks the routing table to determine if a known path exists. If the remote host falls into a subnet that we know how to reach (Cloned routes), then the system checks to see if it can connect along that interface. If all known paths fail, the system has one last option: the default route. This route is a special type of gateway route (usually the only one present in the system), and is always marked with a c in the flags field. For hosts on a local area network, this gateway is set to whatever machine has a direct connection to the outside world (whether via PPP link, DSL, cable modem, T1, or another network interface). If you are configuring the default route for a machine which itself is functioning as the gateway to the outside world, then the default route will be the gateway machine at your Internet Service Provider's (ISP) site. Let us look at an example of default routes. This is a common configuration:

[Local2] <--ether--> [Local1] <--PPP--> [ISP-Serv] <--ether--> [T1-GW]

The hosts Local1 and Local2 are at your site. Local1 is connected to an ISP via a dial-up PPP connection. This PPP server computer is connected through a local area network to another gateway computer through an external interface to the ISP's Internet feed.
The default routes for each of your machines will be:

Host      Default Gateway    Interface
Local2    Local1             Ethernet
Local1    T1-GW              PPP

A common question is Why (or how) would we set the T1-GW to be the default gateway for Local1, rather than the ISP server it is connected to? Remember, since the PPP interface is using an address on the ISP's local network for your side of the connection, routes for any other machines on the ISP's local network will be automatically generated. Hence, you will already know how to reach the T1-GW machine, so there is no need for the intermediate step of sending traffic to the ISP server. It is common to use the address X.X.X.1 as the gateway address for your local network. So (using the same example), if your local class-C address space was 10.20.30 and your ISP was using 10.9.9 then the default routes would be:

Host                              Default Route
Local2 (10.20.30.2)               Local1 (10.20.30.1)
Local1 (10.20.30.1, 10.9.9.30)    T1-GW (10.9.9.1)

You can easily define the default route via the /etc/rc.conf file. In our example, on the Local2 machine, we added the following line in /etc/rc.conf: defaultrouter="10.20.30.1" It is also possible to do it directly from the command line with the &man.route.8; command: &prompt.root; route add default 10.20.30.1 For more information on manual manipulation of network routing tables, consult the &man.route.8; manual page. Dual Homed Hosts dual homed hosts There is one other type of configuration that we should cover, and that is a host that sits on two different networks. Technically, any machine functioning as a gateway (in the example above, using a PPP connection) counts as a dual-homed host. But the term is really only used to refer to a machine that sits on two local-area networks. In one case, the machine has two Ethernet cards, each having an address on the separate subnets. Alternately, the machine may only have one Ethernet card, and be using &man.ifconfig.8; aliasing. The former is used if two physically separate Ethernet networks are in use, the latter if there is one physical network segment, but two logically separate subnets. Either way, routing tables are set up so that each subnet knows that this machine is the defined gateway (inbound route) to the other subnet. This configuration, with the machine acting as a router between the two subnets, is often used when we need to implement packet filtering or firewall security in either or both directions. If you want this machine to actually forward packets between the two interfaces, you need to tell FreeBSD to enable this ability. See the next section for more details on how to do this. Building a Router router A network router is simply a system that forwards packets from one interface to another. Internet standards and good engineering practice prevent the FreeBSD Project from enabling this by default in FreeBSD. You can enable this feature by changing the following variable to YES in &man.rc.conf.5;: gateway_enable=YES # Set to YES if this host will be a gateway This option will set the &man.sysctl.8; variable net.inet.ip.forwarding to 1. If you need to stop routing, you can reset this to 0 temporarily. Your new router will need routes to know where to send the traffic. If your network is simple enough you can use static routes. FreeBSD also comes with the standard BSD routing daemon &man.routed.8;, which speaks RIP (both version 1 and version 2) and IRDP. Support for BGP v4, OSPF v2, and other sophisticated routing protocols is available with the net/zebra package.
Commercial products such as &gated; are also available for more complex network routing solutions. BGP RIP OSPF Al Hoang Contributed by Setting Up Static Routes Manual Configuration Let us assume we have a network as follows:

    INTERNET
    | (10.0.0.1/24) Default Router to Internet
    |
    |Interface xl0
    |10.0.0.10/24
+------+
|      | RouterA
|      | (FreeBSD gateway)
+------+
    | Interface xl1
    | 192.168.1.1/24
    |
+--------------------------------+ Internal Net 1
    | 192.168.1.2/24
    |
+------+
|      | RouterB
|      |
+------+
    | 192.168.2.1/24
    |
    Internal Net 2

In this scenario, RouterA is our &os; machine that is acting as a router to the rest of the Internet. It has a default route set to 10.0.0.1 which allows it to connect with the outside world. We will assume that RouterB is already configured properly and knows how to get wherever it needs to go. (This is simple in this picture. Just add a default route on RouterB using 192.168.1.1 as the gateway.) If we look at the routing table for RouterA we would see something like the following:

&prompt.user; netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif  Expire
default            10.0.0.1           UGS         0    49378    xl0
127.0.0.1          127.0.0.1          UH          0        6    lo0
10.0.0/24          link#1             UC          0        0    xl0
192.168.1/24       link#2             UC          0        0    xl1

With the current routing table RouterA will not be able to reach our Internal Net 2. It does not have a route for 192.168.2.0/24. One way to alleviate this is to manually add the route. The following command would add the Internal Net 2 network to RouterA's routing table using 192.168.1.2 as the next hop: &prompt.root; route add -net 192.168.2.0/24 192.168.1.2 Now RouterA can reach any hosts on the 192.168.2.0/24 network. Persistent Configuration The above example is perfect for configuring a static route on a running system. However, one problem is that the routing information will not persist if you reboot your &os; machine. The way to handle the addition of a static route is to put it in your /etc/rc.conf file:

# Add Internal Net 2 as a static route
static_routes="internalnet2"
route_internalnet2="-net 192.168.2.0/24 192.168.1.2"

The static_routes configuration variable is a list of strings separated by a space. Each string references a route name. In our above example we only have one string in static_routes. This string is internalnet2. We then add a configuration variable called route_internalnet2 where we put all of the configuration parameters we would give to the &man.route.8; command. For our example above we would have used the command: &prompt.root; route add -net 192.168.2.0/24 192.168.1.2 so we need "-net 192.168.2.0/24 192.168.1.2". As said above, we can have more than one string in static_routes. This allows us to create multiple static routes. The following lines show an example of adding static routes for the 192.168.0.0/24 and 192.168.1.0/24 networks on an imaginary router:

static_routes="net1 net2"
route_net1="-net 192.168.0.0/24 192.168.0.1"
route_net2="-net 192.168.1.0/24 192.168.1.1"

Routing Propagation routing propagation We have already talked about how we define our routes to the outside world, but not about how the outside world finds us. We already know that routing tables can be set up so that all traffic for a particular address space (in our examples, a class-C subnet) can be sent to a particular host on that network, which will forward the packets inbound.
When you get an address space assigned to your site, your service provider will set up their routing tables so that all traffic for your subnet will be sent down your PPP link to your site. But how do sites across the country know to send to your ISP? There is a system (much like the distributed DNS information) that keeps track of all assigned address-spaces, and defines their point of connection to the Internet Backbone. The backbone consists of the main trunk lines that carry Internet traffic across the country, and around the world. Each backbone machine has a copy of a master set of tables, which direct traffic for a particular network to a specific backbone carrier, and from there down the chain of service providers until it reaches your network. It is the task of your service provider to advertise to the backbone sites that they are the point of connection (and thus the path inward) for your site. This is known as route propagation. Troubleshooting traceroute Sometimes, there is a problem with routing propagation, and some sites are unable to connect to you. Perhaps the most useful command for trying to figure out where routing is breaking down is the &man.traceroute.8; command. It is equally useful if you cannot seem to make a connection to a remote machine (i.e. &man.ping.8; fails). The &man.traceroute.8; command is run with the name of the remote host you are trying to connect to. It will show the gateway hosts along the path of the attempt, eventually either reaching the target host, or terminating because of a lack of connection. For more information, see the manual page for &man.traceroute.8;. Multicast Routing multicast routing kernel options MROUTING FreeBSD supports both multicast applications and multicast routing natively. Multicast applications do not require any special configuration of FreeBSD; applications will generally run out of the box. Multicast routing requires that support be compiled into the kernel: options MROUTING In addition, the multicast routing daemon, &man.mrouted.8;, must be configured to set up tunnels and DVMRP via /etc/mrouted.conf. More details on multicast configuration may be found in the manual page for &man.mrouted.8;. Eric Anderson Written by Wireless Networking wireless networking 802.11 wireless networking Introduction It can be very useful to be able to use a computer without the annoyance of having a network cable attached at all times. FreeBSD can be used as a wireless client, and even as a wireless access point. Wireless Modes of Operation There are two different ways to configure 802.11 wireless devices: BSS and IBSS. BSS Mode BSS mode is the mode that typically is used. BSS mode is also called infrastructure mode. In this mode, a number of wireless access points are connected to a wired network. Each wireless network has its own name. This name is called the SSID of the network. Wireless clients connect to these wireless access points. The IEEE 802.11 standard defines the protocol that wireless networks use to connect. A wireless client can be tied to a specific network when an SSID is set. A wireless client can also attach to any network by not explicitly setting an SSID. IBSS Mode IBSS mode, also called ad-hoc mode, is designed for point-to-point connections. There are actually two types of ad-hoc mode. One is IBSS mode, also called ad-hoc or IEEE ad-hoc mode. This mode is defined by the IEEE 802.11 standards. The second is called demo ad-hoc mode or Lucent ad-hoc mode (and sometimes, confusingly, ad-hoc mode).
This is the old, pre-802.11 ad-hoc mode and should only be used for legacy installations. We will not cover either of the ad-hoc modes further. Infrastructure Mode Access Points Access points are wireless networking devices that allow one or more wireless clients to use the device as a central hub. When using an access point, all clients communicate through the access point. Multiple access points are often used to cover a complete area such as a house, business, or park with a wireless network. Access points typically have multiple network connections: the wireless card, and one or more wired Ethernet adapters for connection to the rest of the network. Access points can either be purchased prebuilt, or you can build your own with FreeBSD and a supported wireless card. Several vendors make wireless access points and wireless cards with various features. Building a FreeBSD Access Point wireless networking access point Requirements In order to set up a wireless access point with FreeBSD, you need to have a compatible wireless card. Currently, only cards with the Prism chipset are supported. You will also need a wired network card that is supported by FreeBSD (this should not be difficult to find; FreeBSD supports a lot of different devices). For this guide, we will assume you want to &man.bridge.4; all traffic between the wireless device and the network attached to the wired network card. The hostap functionality that FreeBSD uses to implement the access point works best with certain versions of firmware. Prism 2 cards should use firmware version 1.3.4 or newer. Prism 2.5 and Prism 3 cards should use firmware 1.4.9. Older versions of the firmware may or may not function correctly. At this time, the only way to update cards is with &windows; firmware update utilities available from your card's manufacturer. Setting It Up First, make sure your system can see the wireless card:

&prompt.root; ifconfig -a
wi0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet6 fe80::202:2dff:fe2d:c938%wi0 prefixlen 64 scopeid 0x7
        inet 0.0.0.0 netmask 0xff000000 broadcast 255.255.255.255
        ether 00:09:2d:2d:c9:50
        media: IEEE 802.11 Wireless Ethernet autoselect (DS/2Mbps)
        status: no carrier
        ssid "" stationname "FreeBSD Wireless node"
        channel 10 authmode OPEN powersavemode OFF powersavesleep 100
        wepmode OFF weptxkey 1

Do not worry about the details now, just make sure it shows you something to indicate you have a wireless card installed. If you have trouble seeing the wireless interface, and you are using a PC Card, you may want to check the &man.pccardc.8; and &man.pccardd.8; manual pages for more information. Next, you will need to load a module in order to get the bridging part of FreeBSD ready for the access point. To load the &man.bridge.4; module, simply run the following command: &prompt.root; kldload bridge It should not have produced any errors when loading the module. If it did, you may need to compile the &man.bridge.4; code into your kernel. The Bridging section of this handbook should be able to help you accomplish that task. Now that you have the bridging stuff done, we need to tell the FreeBSD kernel which interfaces to bridge together.
We do that by using &man.sysctl.8;:

&prompt.root; sysctl net.link.ether.bridge.enable=1
&prompt.root; sysctl net.link.ether.bridge.config="wi0 xl0"
&prompt.root; sysctl net.inet.ip.forwarding=1

On &os; versions earlier than 5.2, you need to use the following options instead:

&prompt.root; sysctl net.link.ether.bridge=1
&prompt.root; sysctl net.link.ether.bridge_cfg="wi0,xl0"
&prompt.root; sysctl net.inet.ip.forwarding=1

Now it is time for the wireless card setup. The following command will set the card into an access point: &prompt.root; ifconfig wi0 ssid my_net channel 11 media DS/11Mbps mediaopt hostap up stationname "FreeBSD AP" The &man.ifconfig.8; line brings the wi0 interface up, sets its SSID to my_net, and sets the station name to FreeBSD AP. The media DS/11Mbps option sets the card into 11Mbps mode and is needed for the mediaopt setting to take effect. The mediaopt hostap option places the interface into access point mode. The channel 11 option sets the 802.11b channel to use. The &man.wicontrol.8; manual page has valid channel options for your regulatory domain. Now you should have a complete, functioning access point up and running. You are encouraged to read &man.wicontrol.8;, &man.ifconfig.8;, and &man.wi.4; for further information. It is also suggested that you read the section on encryption that follows. Status Information Once the access point is configured and operational, operators will want to see the clients that are associated with the access point. At any time, the operator may type:

&prompt.root; wicontrol -l
1 station:
00:09:b7:7b:9d:16  asid=04c0, flags=3<ASSOC,AUTH>, caps=1<ESS>, rates=f<1M,2M,5.5M,11M>, sig=38/15

This shows that there is one station associated, along with its parameters. The signal indicated should be used as a relative indication of strength only. Its translation to dBm or other units varies between different firmware revisions. Clients A wireless client is a system that accesses an access point or another client directly. Typically, wireless clients only have one network device, the wireless networking card. There are a few different ways to configure a wireless client. These are based on the different wireless modes, generally BSS (infrastructure mode, which requires an access point), and IBSS (ad-hoc, or peer-to-peer mode). In our example, we will use the more popular of the two, BSS mode, to talk to an access point. Requirements There is only one real requirement for setting up FreeBSD as a wireless client. You will need a wireless card that is supported by FreeBSD. Setting Up a Wireless FreeBSD Client You will need to know a few things about the wireless network you are joining before you start. In this example, we are joining a network that has a name of my_net, and encryption turned off. In this example, we are not using encryption, which is a dangerous situation. In the next section, you will learn how to turn on encryption, why it is important to do so, and why some encryption technologies still do not completely protect you.
Make sure your card is recognized by FreeBSD:

&prompt.root; ifconfig -a
wi0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet6 fe80::202:2dff:fe2d:c938%wi0 prefixlen 64 scopeid 0x7
        inet 0.0.0.0 netmask 0xff000000 broadcast 255.255.255.255
        ether 00:09:2d:2d:c9:50
        media: IEEE 802.11 Wireless Ethernet autoselect (DS/2Mbps)
        status: no carrier
        ssid "" stationname "FreeBSD Wireless node"
        channel 10 authmode OPEN powersavemode OFF powersavesleep 100
        wepmode OFF weptxkey 1

Now, we can set the card to the correct settings for our network: &prompt.root; ifconfig wi0 inet 192.168.0.20 netmask 255.255.255.0 ssid my_net Replace 192.168.0.20 and 255.255.255.0 with a valid IP address and netmask on your wired network. Remember, our access point is bridging the data between the wireless network and the wired network, so it will appear to the other devices on your network that you are on the wired network just as they are. Once you have done that, you should be able to ping hosts on the wired network just as if you were connected using a standard wired connection. If you are experiencing problems with your wireless connection, check to make sure that you are associated (connected) to the access point: &prompt.root; ifconfig wi0 should return some information, and you should see: status: associated If it does not show associated, then you may be out of range of the access point, have encryption on, or possibly have a configuration problem. Encryption wireless networking encryption Encryption on a wireless network is important because you no longer have the ability to keep the network contained in a well protected area. Your wireless data will be broadcast across your entire neighborhood, so anyone who cares to read it can. This is where encryption comes in. By encrypting the data that is sent over the airwaves, you make it much more difficult for any interested party to grab your data right out of the air. The two most common ways to encrypt the data between your client and the access point are WEP and &man.ipsec.4;. WEP WEP WEP is an abbreviation for Wired Equivalent Privacy. WEP is an attempt to make wireless networks as safe and secure as a wired network. Unfortunately, it has been cracked, and is fairly trivial to break. This also means it is not something to rely on when it comes to encrypting sensitive data. It is better than nothing, so use the following to turn on WEP on your new FreeBSD access point: &prompt.root; ifconfig wi0 inet up ssid my_net wepmode on wepkey 0x1234567890 media DS/11Mbps mediaopt hostap And you can turn on WEP on a client with this command: &prompt.root; ifconfig wi0 inet 192.168.0.20 netmask 255.255.255.0 ssid my_net wepmode on wepkey 0x1234567890 Note that you should replace the 0x1234567890 with a key of your own.
Information on installing ports can be found in the Ports chapter of this handbook. The program dstumbler is the packaged tool that allows for access point discovery and signal to noise ratio graphing. If you are having a hard time getting your access point up and running, dstumbler may help you get started. To test your wireless network security, you may choose to use dweputils (dwepcrack, dwepdump and dwepkeygen) to help you determine if WEP is the right solution to your wireless security needs. The <command>wicontrol</command>, <command>ancontrol</command> and <command>raycontrol</command> Utilities These are the tools you can use to control how your wireless card behaves on the wireless network. In the examples above, we have chosen to use &man.wicontrol.8;, since our wireless card is a wi0 interface. If you had a Cisco wireless device, it would come up as an0, and therefore you would use &man.ancontrol.8;. The <command>ifconfig</command> Command ifconfig The &man.ifconfig.8; command can be used to set many of the same options as &man.wicontrol.8;; however, it does lack a few options. Check &man.ifconfig.8; for command line parameters and options. Supported Cards Access Points The only cards that are currently supported for BSS (as an access point) mode are devices based on the Prism 2, 2.5, or 3 chipsets. For a complete list, look at &man.wi.4;. 802.11b Clients Almost all 802.11b wireless cards are currently supported under FreeBSD. Most cards based on Prism, Spectrum24, Hermes, Aironet, and Raylink will work as a wireless network card in IBSS (ad-hoc, peer-to-peer, and BSS) mode. - 802.11a & 802.11g Clients + 802.11a & 802.11g Clients The &man.ath.4; device driver supports 802.11a and 802.11g. If your card is based on an Atheros chipset, you may be able to use this driver. Unfortunately, there are still many vendors that do not provide schematics for their drivers to the open source community because they regard such information as trade secrets. Consequently, the developers of FreeBSD and other operating systems are left with two choices: develop the drivers by a long and painstaking process of reverse engineering, or use the existing driver binaries available for the &microsoft.windows; platforms. Most developers, including those involved with FreeBSD, have taken the latter approach. Thanks to the contributions of Bill Paul (wpaul), as of FreeBSD 5.3-RELEASE there is native support for the Network Driver Interface Specification (NDIS). The FreeBSD NDISulator (otherwise known as Project Evil) takes a &windows; driver binary and basically tricks it into thinking it is running on &windows;. This feature is still relatively new, but most test cases seem to work adequately. NDIS NDISulator &windows; drivers Microsoft Windows Microsoft Windows device drivers KLD (kernel loadable object) In order to use the NDISulator, you need three things: Kernel sources &windowsxp; driver binary (.SYS extension) &windowsxp; driver configuration file (.INF extension) You may need to compile the &man.ndis.4; miniport driver wrapper module. As root: &prompt.root; cd /usr/src/sys/modules/ndis -&prompt.root; make && make install +&prompt.root; make && make install Locate the files for your specific card. Generally, they can be found on the included CDs or at the vendors' websites. In the following examples, we will use W32DRIVER.SYS and W32DRIVER.INF. The next step is to compile the driver binary into a loadable kernel module.
To accomplish this, as root, go into the if_ndis module directory and copy the &windows; driver files into it:

&prompt.root; cd /usr/src/sys/modules/if_ndis
&prompt.root; cp /path/to/driver/W32DRIVER.SYS ./
&prompt.root; cp /path/to/driver/W32DRIVER.INF ./

We will now use the ndiscvt utility to create the driver definition header ndis_driver_data.h to build the module: &prompt.root; ndiscvt -i W32DRIVER.INF -s W32DRIVER.SYS -o ndis_driver_data.h The -i and -s options specify the configuration and binary files, respectively. We use the -o option because the Makefile will be looking for this file when it comes time to build the module. Some &windows; drivers require additional files to operate; these can also be passed to ndiscvt. Consult the &man.ndiscvt.8; manual page for more information. Finally, we can build and install the driver module: - &prompt.root; make && make install + &prompt.root; make && make install To use the driver, you must load the appropriate modules: &prompt.root; kldload ndis &prompt.root; kldload if_ndis The first command loads the NDIS miniport driver wrapper, the second loads the actual network interface. Check &man.dmesg.8; to see if there were any errors loading. If all went well, you should get output resembling the following:

ndis0: <Wireless-G PCI Adapter> mem 0xf4100000-0xf4101fff irq 3 at device 8.0 on pci1
ndis0: NDIS API version: 5.0
ndis0: Ethernet address: 0a:b1:2c:d3:4e:f5
ndis0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
ndis0: 11g rates: 6Mbps 9Mbps 12Mbps 18Mbps 36Mbps 48Mbps 54Mbps

From here you can treat the ndis0 device like any other wireless device (e.g. wi0) and consult the earlier sections of this chapter. Pav Lucistnik Written by
pav@FreeBSD.org
Bluetooth Bluetooth Introduction Bluetooth is a wireless technology for creating personal networks operating in the 2.4 GHz unlicensed band, with a range of 10 meters. Networks are usually formed ad-hoc from portable devices such as cellular phones, handhelds and laptops. Unlike the other popular wireless technology, Wi-Fi, Bluetooth offers higher level service profiles, e.g. FTP-like file servers, file pushing, voice transport, serial line emulation, and more. The Bluetooth stack in &os; is implemented using the Netgraph framework (see &man.netgraph.4;). A broad variety of Bluetooth USB dongles is supported by the &man.ng.ubt.4; driver. The Broadcom BCM2033 chip based Bluetooth devices are supported via the &man.ubtbcmfw.4; and &man.ng.ubt.4; drivers. The 3Com Bluetooth PC Card 3CRWB60-A is supported by the &man.ng.bt3c.4; driver. Serial and UART based Bluetooth devices are supported via &man.sio.4;, &man.ng.h4.4; and &man.hcseriald.8;. This section describes the use of the USB Bluetooth dongle. Bluetooth support is available in &os; 5.0 and newer systems. Plugging in the Device By default Bluetooth device drivers are available as kernel modules. Before attaching a device, you will need to load the driver into the kernel: &prompt.root; kldload ng_ubt If the Bluetooth device is present in the system during system startup, load the module from /boot/loader.conf: ng_ubt_load="YES" Plug in your USB dongle. Output similar to the following will appear on the console (or in syslog):

ubt0: vendor 0x0a12 product 0x0001, rev 1.10/5.25, addr 2
ubt0: Interface 0 endpoints: interrupt=0x81, bulk-in=0x82, bulk-out=0x2
ubt0: Interface 1 (alt.config 5) endpoints: isoc-in=0x83, isoc-out=0x3, wMaxPacketSize=49, nframes=6, buffer size=294

The Bluetooth stack has to be started manually on &os; 6.0, and on &os; 5.X before 5.5. It is done automatically from &man.devd.8; on &os; 5.5, 6.1 and newer. Copy /usr/share/examples/netgraph/bluetooth/rc.bluetooth into some convenient place, like /etc/rc.bluetooth. This script is used to start and stop the Bluetooth stack. It is a good idea to stop the stack before unplugging the device, but it is not (usually) fatal. When starting the stack, you will receive output similar to the following:

&prompt.root; /etc/rc.bluetooth start ubt0
BD_ADDR: 00:02:72:00:d4:1a
Features: 0xff 0xff 0xf 00 00 00 00 00
<3-Slot> <5-Slot> <Encryption> <Slot offset>
<Timing accuracy> <Switch> <Hold mode> <Sniff mode>
<Park mode> <RSSI> <Channel quality> <SCO link>
<HV2 packets> <HV3 packets> <u-law log> <A-law log> <CVSD>
<Paging scheme> <Power control> <Transparent SCO data>
Max. ACL packet size: 192 bytes
Number of ACL packets: 8
Max. SCO packet size: 64 bytes
Number of SCO packets: 8

HCI Host Controller Interface (HCI) Host Controller Interface (HCI) provides a command interface to the baseband controller and link manager, and access to hardware status and control registers. This interface provides a uniform method of accessing the Bluetooth baseband capabilities. The HCI layer on the Host exchanges data and commands with the HCI firmware on the Bluetooth hardware. The Host Controller Transport Layer (i.e. physical bus) driver provides both HCI layers with the ability to exchange information with each other. A single Netgraph node of type hci is created for a single Bluetooth device. The HCI node is normally connected to the Bluetooth device driver node (downstream) and the L2CAP node (upstream). All HCI operations must be performed on the HCI node and not on the device driver node.
The default name for the HCI node is devicehci. For more details refer to the &man.ng.hci.4; manual page. One of the most common tasks is discovery of Bluetooth devices in RF proximity. This operation is called inquiry. Inquiry and other HCI related operations are done with the &man.hccontrol.8; utility. The example below shows how to find out which Bluetooth devices are in range. You should receive the list of devices in a few seconds. Note that a remote device will only answer the inquiry if it is put into discoverable mode.

&prompt.user; hccontrol -n ubt0hci inquiry
Inquiry result, num_responses=1
Inquiry result #0
       BD_ADDR: 00:80:37:29:19:a4
       Page Scan Rep. Mode: 0x1
       Page Scan Period Mode: 00
       Page Scan Mode: 00
       Class: 52:02:04
       Clock offset: 0x78ef
Inquiry complete. Status: No error [00]

A BD_ADDR is the unique address of a Bluetooth device, similar to the MAC address of a network card. This address is needed for further communication with a device. It is possible to assign a human readable name to a BD_ADDR. The /etc/bluetooth/hosts file contains information regarding the known Bluetooth hosts. The following example shows how to obtain the human readable name that was assigned to the remote device: &prompt.user; hccontrol -n ubt0hci remote_name_request 00:80:37:29:19:a4 BD_ADDR: 00:80:37:29:19:a4 Name: Pav's T39 If you perform an inquiry on a remote Bluetooth device, it will find your computer as your.host.name (ubt0). The name assigned to the local device can be changed at any time. The Bluetooth system provides a point-to-point connection (only two Bluetooth units involved), or a point-to-multipoint connection. In the point-to-multipoint connection the connection is shared among several Bluetooth devices. The following example shows how to obtain the list of active baseband connections for the local device:

&prompt.user; hccontrol -n ubt0hci read_connection_list
Remote BD_ADDR    Handle Type Mode Role Encrypt Pending Queue State
00:80:37:29:19:a4     41  ACL    0 MAST    NONE       0     0 OPEN

A connection handle is useful when termination of the baseband connection is required. Note that it is normally not required to do this by hand. The stack will automatically terminate inactive baseband connections. &prompt.root; hccontrol -n ubt0hci disconnect 41 Connection handle: 41 Reason: Connection terminated by local host [0x16] Refer to hccontrol help for a complete listing of available HCI commands. Most of the HCI commands do not require superuser privileges. L2CAP Logical Link Control and Adaptation Protocol (L2CAP) Logical Link Control and Adaptation Protocol (L2CAP) provides connection-oriented and connectionless data services to upper layer protocols with protocol multiplexing capability and segmentation and reassembly operation. L2CAP permits higher level protocols and applications to transmit and receive L2CAP data packets up to 64 kilobytes in length. L2CAP is based around the concept of channels. A channel is a logical connection on top of a baseband connection. Each channel is bound to a single protocol in a many-to-one fashion. Multiple channels can be bound to the same protocol, but a channel cannot be bound to multiple protocols. Each L2CAP packet received on a channel is directed to the appropriate higher level protocol. Multiple channels can share the same baseband connection. A single Netgraph node of type l2cap is created for a single Bluetooth device. The L2CAP node is normally connected to the Bluetooth HCI node (downstream) and Bluetooth sockets nodes (upstream). The default name for the L2CAP node is devicel2cap.
For more details refer to the &man.ng.l2cap.4; manual page. A useful command is &man.l2ping.8;, which can be used to ping other devices. Some Bluetooth implementations might not return all of the data sent to them, so 0 bytes in the following example is normal.

&prompt.root; l2ping -a 00:80:37:29:19:a4
0 bytes from 0:80:37:29:19:a4 seq_no=0 time=48.633 ms result=0
0 bytes from 0:80:37:29:19:a4 seq_no=1 time=37.551 ms result=0
0 bytes from 0:80:37:29:19:a4 seq_no=2 time=28.324 ms result=0
0 bytes from 0:80:37:29:19:a4 seq_no=3 time=46.150 ms result=0

The &man.l2control.8; utility is used to perform various operations on L2CAP nodes. This example shows how to obtain the list of logical connections (channels) and the list of baseband connections for the local device:

&prompt.user; l2control -a 00:02:72:00:d4:1a read_channel_list
L2CAP channels:
Remote BD_ADDR     SCID/ DCID   PSM  IMTU/ OMTU State
00:07:e0:00:0b:ca    66/   64     3   132/  672 OPEN
&prompt.user; l2control -a 00:02:72:00:d4:1a read_connection_list
L2CAP connections:
Remote BD_ADDR    Handle Flags Pending State
00:07:e0:00:0b:ca     41 O           0 OPEN

Another diagnostic tool is &man.btsockstat.1;. It does a job similar to the one &man.netstat.1; does, but for Bluetooth network-related data structures. The example below shows the same logical connection as &man.l2control.8; above.

&prompt.user; btsockstat
Active L2CAP sockets
PCB      Recv-Q Send-Q Local address/PSM       Foreign address   CID   State
c2afe900      0      0 00:02:72:00:d4:1a/3     00:07:e0:00:0b:ca 66    OPEN
Active RFCOMM sessions
L2PCB    PCB      Flag MTU   Out-Q DLCs State
c2afe900 c2b53380 1    127   0     Yes  OPEN
Active RFCOMM sockets
PCB      Recv-Q Send-Q Local address     Foreign address   Chan DLCI State
c2e8bc80      0    250 00:02:72:00:d4:1a 00:07:e0:00:0b:ca 3    6    OPEN

RFCOMM RFCOMM Protocol The RFCOMM protocol provides emulation of serial ports over the L2CAP protocol. The protocol is based on the ETSI standard TS 07.10. RFCOMM is a simple transport protocol, with additional provisions for emulating the 9 circuits of RS-232 (EIA/TIA-232-E) serial ports. The RFCOMM protocol supports up to 60 simultaneous connections (RFCOMM channels) between two Bluetooth devices. For the purposes of RFCOMM, a complete communication path involves two applications running on different devices (the communication endpoints) with a communication segment between them. RFCOMM is intended to cover applications that make use of the serial ports of the devices in which they reside. The communication segment is a Bluetooth link from one device to another (direct connect). RFCOMM is only concerned with the connection between the devices in the direct connect case, or between the device and a modem in the network case. RFCOMM can support other configurations, such as modules that communicate via Bluetooth wireless technology on one side and provide a wired interface on the other side. In &os; the RFCOMM protocol is implemented at the Bluetooth sockets layer. pairing Pairing of Devices By default, Bluetooth communication is not authenticated, and any device can talk to any other device. A Bluetooth device (for example, a cellular phone) may choose to require authentication to provide a particular service (for example, Dial-Up service). Bluetooth authentication is normally done with PIN codes. A PIN code is an ASCII string up to 16 characters in length. The user is required to enter the same PIN code on both devices. Once the user has entered the PIN code, both devices will generate a link key. After that the link key can be stored either in the devices themselves or in persistent storage.
The next time they connect, both devices will use the previously generated link key. The procedure described above is called pairing. Note that if the link key is lost by either device then pairing must be repeated. The &man.hcsecd.8; daemon is responsible for handling all Bluetooth authentication requests. The default configuration file is /etc/bluetooth/hcsecd.conf. An example section for a cellular phone with the PIN code arbitrarily set to 1234 is shown below:

device {
        bdaddr  00:80:37:29:19:a4;
        name    "Pav's T39";
        key     nokey;
        pin     "1234";
}

There is no limitation on PIN codes (except length). Some devices (for example Bluetooth headsets) may have a fixed PIN code built in. The -d switch forces the &man.hcsecd.8; daemon to stay in the foreground, so it is easy to see what is happening. Set the remote device to receive pairing and initiate the Bluetooth connection to the remote device. The remote device should say that pairing was accepted, and request the PIN code. Enter the same PIN code as you have in hcsecd.conf. Now your PC and the remote device are paired. Alternatively, you can initiate pairing on the remote device. On &os; 5.5, 6.1 and newer, the following line can be added to the /etc/rc.conf file to have hcsecd started automatically on system start: hcsecd_enable="YES" The following is a sample of the hcsecd daemon output:

hcsecd[16484]: Got Link_Key_Request event from 'ubt0hci', remote bdaddr 0:80:37:29:19:a4
hcsecd[16484]: Found matching entry, remote bdaddr 0:80:37:29:19:a4, name 'Pav's T39', link key doesn't exist
hcsecd[16484]: Sending Link_Key_Negative_Reply to 'ubt0hci' for remote bdaddr 0:80:37:29:19:a4
hcsecd[16484]: Got PIN_Code_Request event from 'ubt0hci', remote bdaddr 0:80:37:29:19:a4
hcsecd[16484]: Found matching entry, remote bdaddr 0:80:37:29:19:a4, name 'Pav's T39', PIN code exists
hcsecd[16484]: Sending PIN_Code_Reply to 'ubt0hci' for remote bdaddr 0:80:37:29:19:a4

SDP Service Discovery Protocol (SDP) The Service Discovery Protocol (SDP) provides the means for client applications to discover the existence of services provided by server applications as well as the attributes of those services. The attributes of a service include the type or class of service offered and the mechanism or protocol information needed to utilize the service. SDP involves communication between an SDP server and an SDP client. The server maintains a list of service records that describe the characteristics of services associated with the server. Each service record contains information about a single service. A client may retrieve information from a service record maintained by the SDP server by issuing an SDP request. If the client, or an application associated with the client, decides to use a service, it must open a separate connection to the service provider in order to utilize the service. SDP provides a mechanism for discovering services and their attributes, but it does not provide a mechanism for utilizing those services. Normally, an SDP client searches for services based on some desired characteristics of the services. However, there are times when it is desirable to discover which types of services are described by an SDP server's service records without any a priori information about the services. This process of looking for any offered services is called browsing. The Bluetooth SDP server &man.sdpd.8; and command line client &man.sdpcontrol.8; are included in the standard &os; installation. The following example shows how to perform an SDP browse query.
&prompt.user; sdpcontrol -a 00:01:03:fc:6e:ec browse
Record Handle: 00000000
Service Class ID List:
        Service Discovery Server (0x1000)
Protocol Descriptor List:
        L2CAP (0x0100)
                Protocol specific parameter #1: u/int/uuid16 1
                Protocol specific parameter #2: u/int/uuid16 1

Record Handle: 0x00000001
Service Class ID List:
        Browse Group Descriptor (0x1001)

Record Handle: 0x00000002
Service Class ID List:
        LAN Access Using PPP (0x1102)
Protocol Descriptor List:
        L2CAP (0x0100)
        RFCOMM (0x0003)
                Protocol specific parameter #1: u/int8/bool 1
Bluetooth Profile Descriptor List:
        LAN Access Using PPP (0x1102) ver. 1.0

... and so on. Note that each service has a list of attributes (the RFCOMM channel, for example). Depending on the service you might need to make a note of some of the attributes. Some Bluetooth implementations do not support service browsing and may return an empty list. In this case it is possible to search for the specific service. The example below shows how to search for the OBEX Object Push (OPUSH) service: &prompt.user; sdpcontrol -a 00:01:03:fc:6e:ec search OPUSH Offering services on &os; to Bluetooth clients is done with the &man.sdpd.8; server. On &os; 5.5, 6.1 and newer, the following line can be added to the /etc/rc.conf file: sdpd_enable="YES" Then the sdpd daemon can be started with: &prompt.root; /etc/rc.d/sdpd start On &os; 6.0, and on &os; 5.X before 5.5, sdpd is not integrated into the system startup scripts. It has to be started manually with: &prompt.root; sdpd The local server application that wants to provide Bluetooth service to the remote clients will register the service with the local SDP daemon. An example of such an application is &man.rfcomm.pppd.8;. Once started, it will register the Bluetooth LAN service with the local SDP daemon. The list of services registered with the local SDP server can be obtained by issuing an SDP browse query via the local control channel: &prompt.root; sdpcontrol -l browse Dial-Up Networking (DUN) and Network Access with PPP (LAN) Profiles The Dial-Up Networking (DUN) profile is mostly used with modems and cellular phones. The scenarios covered by this profile are the following: use of a cellular phone or modem by a computer as a wireless modem for connecting to a dial-up Internet access server, or using other dial-up services; use of a cellular phone or modem by a computer to receive data calls. Network Access with PPP (LAN) profile can be used in the following situations: LAN access for a single Bluetooth device; LAN access for multiple Bluetooth devices; PC to PC (using PPP networking over serial cable emulation). In &os; both profiles are implemented with &man.ppp.8; and &man.rfcomm.pppd.8; - a wrapper that converts an RFCOMM Bluetooth connection into something PPP can operate with. Before any profile can be used, a new PPP label must be created in /etc/ppp/ppp.conf. Consult the &man.rfcomm.pppd.8; manual page for examples. In the following example &man.rfcomm.pppd.8; will be used to open an RFCOMM connection to the remote device with BD_ADDR 00:80:37:29:19:a4 on the DUN RFCOMM channel. The actual RFCOMM channel number will be obtained from the remote device via SDP. It is possible to specify the RFCOMM channel by hand, and in this case &man.rfcomm.pppd.8; will not perform the SDP query. Use &man.sdpcontrol.8; to find out the RFCOMM channel on the remote device. &prompt.root; rfcomm_pppd -a 00:80:37:29:19:a4 -c -C dun -l rfcomm-dialup In order to provide the Network Access with PPP (LAN) service, the &man.sdpd.8; server must be running.
A new entry for LAN clients must be created in the /etc/ppp/ppp.conf file. Consult the &man.rfcomm.pppd.8; manual page for examples. Finally, start the RFCOMM PPP server on a valid RFCOMM channel number. The RFCOMM PPP server will automatically register the Bluetooth LAN service with the local SDP daemon. The example below shows how to start the RFCOMM PPP server:

&prompt.root; rfcomm_pppd -s -C 7 -l rfcomm-server

OBEX OBEX Object Push (OPUSH) Profile OBEX is a widely used protocol for simple file transfers between mobile devices. Its main use is in infrared communication, where it is used for generic file transfers between notebooks or PDAs, and for sending business cards or calendar entries between cellular phones and other devices with PIM applications. The OBEX server and client are implemented as a third-party package, obexapp, which is available as the comms/obexapp port. The OBEX client is used to push objects to and/or pull objects from the OBEX server. An object can, for example, be a business card or an appointment. The OBEX client can obtain the RFCOMM channel number from the remote device via SDP. This can be done by specifying a service name instead of an RFCOMM channel number. Supported service names are: IrMC, FTRN and OPUSH. It is also possible to specify the RFCOMM channel as a number. Below is an example of an OBEX session, where the device information object is pulled from the cellular phone, and a new object (a business card) is pushed into the phone's directory.

&prompt.user; obexapp -a 00:80:37:29:19:a4 -C IrMC
obex> get telecom/devinfo.txt devinfo-t39.txt
Success, response: OK, Success (0x20)
obex> put new.vcf
Success, response: OK, Success (0x20)
obex> di
Success, response: OK, Success (0x20)

In order to provide the OBEX Object Push service, the &man.sdpd.8; server must be running. A root folder, where all incoming objects will be stored, must be created. The default path to the root folder is /var/spool/obex. Finally, start the OBEX server on a valid RFCOMM channel number. The OBEX server will automatically register the OBEX Object Push service with the local SDP daemon. The example below shows how to start the OBEX server:

&prompt.root; obexapp -s -C 10

Serial Port Profile (SPP) The Serial Port Profile (SPP) allows Bluetooth devices to perform RS232 (or similar) serial cable emulation. The scenario covered by this profile deals with legacy applications using Bluetooth as a cable replacement, through a virtual serial port abstraction. The &man.rfcomm.sppd.1; utility implements the Serial Port Profile. A pseudo tty is used as a virtual serial port abstraction. The example below shows how to connect to a remote device's Serial Port service. Note that you do not have to specify an RFCOMM channel - &man.rfcomm.sppd.1; can obtain it from the remote device via SDP. If you would like to override this, specify an RFCOMM channel on the command line.

&prompt.root; rfcomm_sppd -a 00:07:E0:00:0B:CA -t /dev/ttyp6
rfcomm_sppd[94692]: Starting on /dev/ttyp6...

Once connected, the pseudo tty can be used as a serial port:

&prompt.root; cu -l ttyp6

Troubleshooting A remote device cannot connect Some older Bluetooth devices do not support role switching. By default, when &os; is accepting a new connection, it tries to perform a role switch and become master. Devices which do not support role switching will not be able to connect. Note that role switching is performed when a new connection is being established, so it is not possible to ask the remote device whether it supports role switching.
There is an HCI option to disable role switching on the local side:

&prompt.root; hccontrol -n ubt0hci write_node_role_switch 0

Something is going wrong, can I see what exactly is happening? Yes, you can. Use the third-party package hcidump, which is available as the comms/hcidump port. The hcidump utility is similar to &man.tcpdump.1;. It can be used to display the contents of Bluetooth packets on the terminal and to dump Bluetooth packets to a file.
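For example, a problematic exchange could be captured to a file and examined later, along these lines (the -w and -r flags below are an assumption based on hcidump's tcpdump-like interface; check the hcidump manual page on your system):

&prompt.root; hcidump -w /tmp/bt.dump   # write raw Bluetooth packets to a file
&prompt.root; hcidump -r /tmp/bt.dump   # read the saved capture back and display it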
Steve Peterson Written by Bridging Introduction IP subnet bridge It is sometimes useful to divide one physical network (such as an Ethernet segment) into two separate network segments without having to create IP subnets and use a router to connect the segments together. A device that connects two networks together in this fashion is called a bridge. A FreeBSD system with two network interface cards can act as a bridge. The bridge works by learning the MAC layer addresses (Ethernet addresses) of the devices on each of its network interfaces. It forwards traffic between two networks only when the source and destination are on different networks. In many respects, a bridge is like an Ethernet switch with very few ports. Situations Where Bridging Is Appropriate There are two common situations in which a bridge is used today. High Traffic on a Segment Situation one is where your physical network segment is overloaded with traffic, but you do not want, for whatever reason, to subnet the network and interconnect the subnets with a router. Let us consider the example of a newspaper where the Editorial and Production departments are on the same subnetwork. The Editorial users all use server A for file service, and the Production users are on server B. An Ethernet network is used to connect all users together, and high loads on the network are slowing things down. If the Editorial users could be segregated on one network segment and the Production users on another, the two network segments could be connected with a bridge. Only the network traffic destined for interfaces on the other side of the bridge would be sent to the other network, reducing congestion on each network segment. Filtering/Traffic Shaping Firewall firewall NAT The second common situation is where firewall functionality is needed without network address translation (NAT). An example is a small company that is connected via DSL or ISDN to their ISP. They have 13 globally-accessible IP addresses from their ISP and have 10 PCs on their network. In this situation, using a router-based firewall is difficult because of subnetting issues. router DSL ISDN A bridge-based firewall can be configured and dropped into the path just downstream of their DSL/ISDN router without any IP numbering issues. Configuring a Bridge Network Interface Card Selection A bridge requires at least two network cards to function. Unfortunately, not all network interface cards as of FreeBSD 4.0 support bridging. Read &man.bridge.4; for details on the cards that are supported. Install and test the two network cards before continuing. Kernel Configuration Changes kernel options BRIDGE To enable kernel support for bridging, add the:

options BRIDGE

statement to your kernel configuration file, and rebuild your kernel. Firewall Support firewall If you are planning to use the bridge as a firewall, you will need to add the IPFIREWALL option as well. Read the firewalls section of this book for general information on configuring the bridge as a firewall. If you need to allow non-IP packets (such as ARP) to flow through the bridge, there is a firewall option that must be set. This option is IPFIREWALL_DEFAULT_TO_ACCEPT. Note that this changes the default rule for the firewall to accept any packet. Make sure you know how this changes the meaning of your ruleset before you set it. Traffic Shaping Support If you want to use the bridge as a traffic shaper, you will need to add the DUMMYNET option to your kernel configuration. Read &man.dummynet.4; for further information. A combined kernel configuration fragment is sketched below.
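Putting the pieces together, a kernel configuration for a bridge that also filters and shapes traffic might contain lines like the following (illustrative only; include just the options you actually need):

options BRIDGE
options IPFIREWALL
options IPFIREWALL_DEFAULT_TO_ACCEPT   # only if non-IP packets such as ARP must pass
options DUMMYNET                       # only if traffic shaping is wanted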
Enabling the Bridge Add the line:

net.link.ether.bridge.enable=1

to /etc/sysctl.conf to enable the bridge at runtime, and the line:

net.link.ether.bridge.config=if1,if2

to enable bridging on the specified interfaces (replace if1 and if2 with the names of your two network interfaces). If you want the bridged packets to be filtered by &man.ipfw.8;, you should add:

net.link.ether.bridge.ipfw=1

as well. For versions prior to &os; 5.2-RELEASE, use the following lines instead:

net.link.ether.bridge=1
net.link.ether.bridge_cfg=if1,if2
net.link.ether.bridge_ipfw=1

Other Information If you want to be able to &man.ssh.1; into the bridge from the network, it is correct to assign one of the network cards an IP address. The consensus is that assigning both cards an address is a bad idea. If you have multiple bridges on your network, there cannot be more than one path between any two workstations. Technically, this means that there is no support for spanning tree link management. A bridge can add latency to your &man.ping.8; times, especially for traffic from one segment to another. Jean-François Dockès Updated by Alex Dupre Reorganized and enhanced by Diskless Operation diskless workstation diskless operation A FreeBSD machine can boot over the network and operate without a local disk, using file systems mounted from an NFS server. No system modification is necessary, beyond standard configuration files. Such a system is relatively easy to set up because all the necessary elements are readily available: There are at least two possible methods to load the kernel over the network: PXE: The &intel; Preboot eXecution Environment system is a form of smart boot ROM built into some networking cards or motherboards. See &man.pxeboot.8; for more details. The Etherboot port (net/etherboot) produces ROM-able code to boot kernels over the network. The code can be either burnt into a boot PROM on a network card, or loaded from a local floppy (or hard) disk drive, or from a running &ms-dos; system. Many network cards are supported. A sample script (/usr/share/examples/diskless/clone_root) eases the creation and maintenance of the workstation's root file system on the server. The script will probably require a little customization but it will get you started very quickly. Standard system startup files exist in /etc to detect and support a diskless system startup. Swapping, if needed, can be done either to an NFS file or to a local disk. There are many ways to set up diskless workstations. Many elements are involved, and most can be customized to suit local taste. The following will describe variations on the setup of a complete system, emphasizing simplicity and compatibility with the standard FreeBSD startup scripts. The system described has the following characteristics: The diskless workstations use a shared read-only / file system, and a shared read-only /usr. The root file system is a copy of a standard FreeBSD root (typically the server's), with some configuration files overridden by ones specific to diskless operation or, possibly, to the workstation they belong to. The parts of the root which have to be writable are overlaid with &man.mfs.8; (&os; 4.X) or &man.md.4; (&os; 5.X) file systems. Any changes will be lost when the system reboots. The kernel is transferred and loaded either with Etherboot or PXE, as some situations may mandate the use of one method or the other. As described, this system is insecure. It should live in a protected area of a network, and be untrusted by other hosts.
All the information in this section has been tested using &os; releases 4.9-RELEASE and 5.2.1-RELEASE. The text is primarily structured for 4.X usage. Notes have been inserted where appropriate to indicate 5.X changes. Background Information Setting up diskless workstations is both relatively straightforward and prone to errors. These errors are sometimes difficult to diagnose for a number of reasons. For example: Compile time options may determine different behaviors at runtime. Error messages are often cryptic or totally absent. In this context, having some knowledge of the background mechanisms involved is very useful to solve the problems that may arise. Several operations need to be performed for a successful bootstrap: The machine needs to obtain initial parameters such as its IP address, executable filename, server name, and root path. This is done using the DHCP or BOOTP protocols. DHCP is a compatible extension of BOOTP, and uses the same port numbers and basic packet format. It is possible to configure a system to use only BOOTP. The &man.bootpd.8; server program is included in the base &os; system. However, DHCP has a number of advantages over BOOTP (nicer configuration files, the possibility of using PXE, plus many others not directly related to diskless operation), and we will describe mainly a DHCP configuration, with equivalent examples using &man.bootpd.8; when possible. The sample configuration will use the ISC DHCP software package (release 3.0.1.r12 was installed on the test server). The machine needs to transfer one or several programs to local memory. Either TFTP or NFS is used. The choice between TFTP and NFS is a compile time option in several places. A common source of error is to specify filenames for the wrong protocol: TFTP typically transfers all files from a single directory on the server, and would expect filenames relative to this directory. NFS needs absolute file paths. The possible intermediate bootstrap programs and the kernel need to be initialized and executed. There are several important variations in this area: PXE will load &man.pxeboot.8;, which is a modified version of the &os; third stage loader. The &man.loader.8; will obtain most parameters necessary to system startup, and leave them in the kernel environment before transferring control. It is possible to use a GENERIC kernel in this case. Etherboot will directly load the kernel, with less preparation. You will need to build a kernel with specific options. PXE and Etherboot work equally well with 4.X systems. Because 5.X kernels normally let the &man.loader.8; do more work for them, PXE is preferred for 5.X systems. If your BIOS and network cards support PXE, you should probably use it. However, it is still possible to start a 5.X system with Etherboot. Finally, the machine needs to access its file systems. NFS is used in all cases. See also the &man.diskless.8; manual page. Setup Instructions Configuration Using <application>ISC DHCP</application> DHCP diskless operation The ISC DHCP server can answer both BOOTP and DHCP requests. As of release 4.9, ISC DHCP 3.0 is not part of the base system. You will first need to install the net/isc-dhcp3-server port or the corresponding package. Once ISC DHCP is installed, it needs a configuration file to run (normally named /usr/local/etc/dhcpd.conf).
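If you choose the port, the installation follows the standard ports procedure, for example:

&prompt.root; cd /usr/ports/net/isc-dhcp3-server
&prompt.root; make install clean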
Here follows a commented example, where host margaux uses Etherboot and host corbieres uses PXE:

default-lease-time 600;
max-lease-time 7200;
authoritative;

option domain-name "example.com";
option domain-name-servers 192.168.4.1;
option routers 192.168.4.1;

subnet 192.168.4.0 netmask 255.255.255.0 {
  use-host-decl-names on;
  option subnet-mask 255.255.255.0;
  option broadcast-address 192.168.4.255;

  host margaux {
    hardware ethernet 01:23:45:67:89:ab;
    fixed-address margaux.example.com;
    next-server 192.168.4.4;
    filename "/data/misc/kernel.diskless";
    option root-path "192.168.4.4:/data/misc/diskless";
  }
  host corbieres {
    hardware ethernet 00:02:b3:27:62:df;
    fixed-address corbieres.example.com;
    next-server 192.168.4.4;
    filename "pxeboot";
    option root-path "192.168.4.4:/data/misc/diskless";
  }
}

The use-host-decl-names option tells dhcpd to send the value in the host declarations as the hostname for the diskless host. An alternate way would be to add an option host-name margaux inside the host declarations. The next-server directive designates the TFTP or NFS server to use for loading the loader or kernel file (the default is to use the same host as the DHCP server). The filename directive defines the file that Etherboot or PXE will load for the next execution step. It must be specified according to the transfer method used. Etherboot can be compiled to use NFS or TFTP. The &os; port configures NFS by default. PXE uses TFTP, which is why a relative filename is used here (this may depend on the TFTP server configuration, but would be fairly typical). Also, PXE loads pxeboot, not the kernel. There are other interesting possibilities, like loading pxeboot from a &os; CD-ROM /boot directory (as &man.pxeboot.8; can load a GENERIC kernel, this makes it possible to use PXE to boot from a remote CD-ROM). The root-path option defines the path to the root file system, in usual NFS notation. When using PXE, it is possible to leave off the host's IP as long as you do not enable the kernel option BOOTP. The NFS server will then be the same as the TFTP one. Configuration Using BOOTP BOOTP diskless operation Here follows an equivalent bootpd configuration (reduced to one client). This would be found in /etc/bootptab. Please note that Etherboot must be compiled with the non-default option NO_DHCP_SUPPORT in order to use BOOTP, and that PXE needs DHCP. The only obvious advantage of bootpd is that it exists in the base system.

.def100:\
  :hn:ht=1:sa=192.168.4.4:vm=rfc1048:\
  :sm=255.255.255.0:\
  :ds=192.168.4.1:\
  :gw=192.168.4.1:\
  :hd="/tftpboot":\
  :bf="/kernel.diskless":\
  :rp="192.168.4.4:/data/misc/diskless":

margaux:ha=0123456789ab:tc=.def100

Preparing a Boot Program with <application>Etherboot</application> Etherboot Etherboot's Web site contains extensive documentation mainly intended for Linux systems, but nonetheless containing useful information. The following will just outline how you would use Etherboot on a FreeBSD system. You must first install the net/etherboot package or port. You can change the Etherboot configuration (i.e. to use TFTP instead of NFS) by editing the Config file in the Etherboot source directory. For our setup, we shall use a boot floppy. For other methods (PROM, or &ms-dos; program), please refer to the Etherboot documentation.
To make a boot floppy, insert a floppy in the drive on the machine where you installed Etherboot, then change your current directory to the src directory in the Etherboot tree and type:

&prompt.root; gmake bin32/devicetype.fd0

devicetype depends on the type of the Ethernet card in the diskless workstation. Refer to the NIC file in the same directory to determine the right devicetype. Booting with <acronym>PXE</acronym> By default, the &man.pxeboot.8; loader loads the kernel via NFS. It can be compiled to use TFTP instead by specifying the LOADER_TFTP_SUPPORT option in /etc/make.conf. See the comments in /etc/defaults/make.conf (or /usr/share/examples/etc/make.conf for 5.X systems) for instructions. There are two other undocumented make.conf options which may be useful for setting up a serial console diskless machine: BOOT_PXELDR_PROBE_KEYBOARD, and BOOT_PXELDR_ALWAYS_SERIAL (the latter only exists on &os; 5.X). To use PXE when the machine starts, you will usually need to select the Boot from network option in your BIOS setup, or type a function key during the PC initialization. Configuring the <acronym>TFTP</acronym> and <acronym>NFS</acronym> Servers TFTP diskless operation NFS diskless operation If you are using PXE or Etherboot configured to use TFTP, you need to enable tftpd on the file server: Create a directory from which tftpd will serve the files, e.g. /tftpboot. Add this line to your /etc/inetd.conf:

tftp dgram udp wait root /usr/libexec/tftpd tftpd -l -s /tftpboot

It appears that at least some PXE versions want the TCP version of TFTP. In this case, add a second line, replacing dgram udp with stream tcp. Tell inetd to reread its configuration file:

&prompt.root; kill -HUP `cat /var/run/inetd.pid`

You can place the tftpboot directory anywhere on the server. Make sure that the location is set consistently in both inetd.conf and dhcpd.conf. In all cases, you also need to enable NFS and export the appropriate file system on the NFS server. Add this to /etc/rc.conf:

nfs_server_enable="YES"

Export the file system where the diskless root directory is located by adding the following to /etc/exports (adjust the volume mount point and replace margaux corbieres with the names of the diskless workstations):

/data/misc -alldirs -ro margaux corbieres

Tell mountd to reread its configuration file. If you actually needed to enable NFS in /etc/rc.conf at the first step, you probably want to reboot instead.

&prompt.root; kill -HUP `cat /var/run/mountd.pid`

Building a Diskless Kernel diskless operation kernel configuration If using Etherboot, you need to create a kernel configuration file for the diskless client with the following options (in addition to the usual ones):

options BOOTP          # Use BOOTP to obtain IP address/hostname
options BOOTP_NFSROOT  # NFS mount root file system using BOOTP info

You may also want to use BOOTP_NFSV3, BOOT_COMPAT and BOOTP_WIRED_TO (refer to LINT on 4.X or NOTES on 5.X). These option names are historical and slightly misleading, as they actually allow the kernel to use DHCP and BOOTP interchangeably (it is also possible to force strict BOOTP or DHCP use). Build the kernel (see the kernel configuration chapter), and copy it to the place specified in dhcpd.conf. When using PXE, building a kernel with the above options is not strictly necessary (though suggested). Enabling them will cause more DHCP requests to be issued during kernel startup, with a small risk of inconsistency between the new values and those retrieved by &man.pxeboot.8; in some special cases.
The advantage of using them is that the host name will be set as a side effect. Otherwise you will need to set the host name by another method, for example in a client-specific rc.conf file. In order to be loadable with Etherboot, a 5.X kernel needs to have the device hints compiled in. You would typically set the following option in the configuration file (see the NOTES configuration comments file):

hints "GENERIC.hints"

Preparing the Root Filesystem root file system diskless operation You need to create a root file system for the diskless workstations, in the location listed as root-path in dhcpd.conf. The following sections describe two ways to do it. Using the <filename>clone_root</filename> Script This is the quickest way to create a root file system, but currently it is only supported on &os; 4.X. This shell script is located at /usr/share/examples/diskless/clone_root and needs customization, at least to adjust the place where the file system will be created (the DEST variable). Refer to the comments at the top of the script for instructions. They explain how the base file system is built, and how files may be selectively overridden by versions specific to diskless operation, to a subnetwork, or to an individual workstation. They also give examples for the diskless /etc/fstab and /etc/rc.conf files. The README files in /usr/share/examples/diskless contain a lot of interesting background information, but, together with the other examples in the diskless directory, they actually document a configuration method which is distinct from the one used by clone_root and the system startup scripts in /etc, which is a little confusing. Use them for reference only, except if you prefer the method that they describe, in which case you will need customized rc scripts. Using the Standard <command>make world</command> Procedure This method can be applied to either &os; 4.X or 5.X and will install a complete virgin system (not only the root file system) into DESTDIR. All you have to do is simply execute the following script:

#!/bin/sh
export DESTDIR=/data/misc/diskless
mkdir -p ${DESTDIR}
cd /usr/src; make world && make kernel
cd /usr/src/etc; make distribution

Once done, you may need to customize your /etc/rc.conf and /etc/fstab placed into DESTDIR according to your needs. Configuring Swap If needed, a swap file located on the server can be accessed via NFS. One of the methods commonly used to do this has been discontinued in release 5.X. <acronym>NFS</acronym> Swap with &os; 4.X The swap file location and size can be specified with BOOTP/DHCP &os;-specific options 128 and 129. Examples of configuration files for ISC DHCP 3.0 or bootpd follow: Add the following lines to dhcpd.conf:

# Global section
option swap-path code 128 = string;
option swap-size code 129 = integer 32;

host margaux {
  ... # Standard lines, see above
  option swap-path "192.168.4.4:/netswapvolume/netswap";
  option swap-size 64000;
}

swap-path is the path to a directory where swap files will be located. Each file will be named swap.client-ip. Older versions of dhcpd used a syntax of option option-128 "...", which is no longer supported. /etc/bootptab would use the following syntax instead:

T128="192.168.4.4:/netswapvolume/netswap":T129=0000fa00

In /etc/bootptab, the swap size must be expressed in hexadecimal format (for example, the 64000 KB above is 0xfa00 in hexadecimal, hence T129=0000fa00).
On the NFS swap file server, create the swap file(s):

&prompt.root; mkdir /netswapvolume/netswap
&prompt.root; cd /netswapvolume/netswap
&prompt.root; dd if=/dev/zero bs=1024 count=64000 of=swap.192.168.4.6
&prompt.root; chmod 0600 swap.192.168.4.6

192.168.4.6 is the IP address of the diskless client. On the NFS swap file server, add the following line to /etc/exports:

/netswapvolume -maproot=0:10 -alldirs margaux corbieres

Then tell mountd to reread the exports file, as above. <acronym>NFS</acronym> Swap with &os; 5.X The kernel does not support enabling NFS swap at boot time. Swap must be enabled by the startup scripts, by mounting a writeable file system and creating and enabling a swap file. To create a swap file of the appropriate size, you can do it like this:

&prompt.root; dd if=/dev/zero of=/path/to/swapfile bs=1k count=1 oseek=100000

To enable it you have to add the following line to your rc.conf:

swapfile=/path/to/swapfile

Miscellaneous Issues Running with a Read-only <filename>/usr</filename> diskless operation /usr read-only If the diskless workstation is configured to run X, you will have to adjust the XDM configuration file, which puts the error log on /usr by default. Using a Non-FreeBSD Server When the server for the root file system is not running FreeBSD, you will have to create the root file system on a FreeBSD machine, then copy it to its destination, using tar or cpio. In this situation, there are sometimes problems with the special files in /dev, due to differing major/minor integer sizes. A solution to this problem is to export a directory from the non-FreeBSD server, mount this directory onto a FreeBSD machine, and run MAKEDEV on the FreeBSD machine to create the correct device entries (FreeBSD 5.0 and later use &man.devfs.5; to allocate device nodes transparently for the user, so running MAKEDEV on these versions is pointless). ISDN ISDN A good resource for information on ISDN technology and hardware is Dan Kegel's ISDN Page. A quick simple road map to ISDN follows: If you live in Europe you might want to investigate the ISDN card section. If you are planning to use ISDN primarily to connect to the Internet with an Internet Provider on a dial-up non-dedicated basis, you might look into Terminal Adapters. This will give you the most flexibility, with the fewest problems, if you change providers. If you are connecting two LANs together, or connecting to the Internet with a dedicated ISDN connection, you might consider the stand-alone router/bridge option. Cost is a significant factor in determining what solution you will choose. The following options are listed from least expensive to most expensive. Hellmuth Michaelis Contributed by ISDN Cards ISDN cards FreeBSD's ISDN implementation supports only the DSS1/Q.931 (or Euro-ISDN) standard using passive cards. Starting with FreeBSD 4.4, some active cards are supported where the firmware also supports other signaling protocols; this also includes the first supported Primary Rate (PRI) ISDN card. The isdn4bsd software allows you to connect to other ISDN routers using either IP over raw HDLC or synchronous PPP: either kernel PPP with isppp, a modified &man.sppp.4; driver, or userland &man.ppp.8;. By using userland &man.ppp.8;, channel bonding of two or more ISDN B-channels is possible. A telephone answering machine application is also available, as well as many utilities such as a software 300 Baud modem.
A growing number of PC ISDN cards are supported under FreeBSD and reports show that it is successfully used all over Europe and in many other parts of the world. The passive ISDN cards supported are mostly the ones with the Infineon (formerly Siemens) ISAC/HSCX/IPAC ISDN chipsets, but also ISDN cards with chips from Cologne Chip (ISA bus only), PCI cards with Winbond W6692 chips, some cards with the Tiger300/320/ISAC chipset combinations and some vendor specific chipset based cards such as the AVM Fritz!Card PCI V.1.0 and the AVM Fritz!Card PnP. Currently the active ISDN cards supported are the AVM B1 (ISA and PCI) BRI cards and the AVM T1 PCI PRI cards. For documentation on isdn4bsd, have a look at the /usr/share/examples/isdn/ directory on your FreeBSD system or at the homepage of isdn4bsd, which also has pointers to hints, errata and much more documentation such as the isdn4bsd handbook. In case you are interested in adding support for a different ISDN protocol, a currently unsupported ISDN PC card, or otherwise enhancing isdn4bsd, please get in touch with &a.hm;. For questions regarding the installation, configuration and troubleshooting of isdn4bsd, the &a.isdn.name; mailing list is available. ISDN Terminal Adapters Terminal adapters (TA) are to ISDN what modems are to regular phone lines. modem Most TA's use the standard Hayes modem AT command set, and can be used as a drop-in replacement for a modem. A TA will operate basically the same as a modem, except connection and throughput speeds will be much faster than your old modem. You will need to configure PPP exactly the same as for a modem setup. Make sure you set your serial speed as high as possible. PPP The main advantage of using a TA to connect to an Internet Provider is that you can do Dynamic PPP. As IP address space becomes more and more scarce, most providers are not willing to provide you with a static IP anymore. Most stand-alone routers are not able to accommodate dynamic IP allocation. TA's completely rely on the PPP daemon that you are running for their features and stability of connection. This allows you to upgrade easily from using a modem to ISDN on a FreeBSD machine, if you already have PPP set up. However, at the same time any problems you experienced with the PPP program are going to persist. If you want maximum stability, use the kernel PPP option, not the userland PPP. The following TA's are known to work with FreeBSD: Motorola BitSurfer and Bitsurfer Pro Adtran Most other TA's will probably work as well, as TA vendors try to make sure their product can accept most of the standard modem AT command set. The real problem with external TA's is that, like modems, you need a good serial card in your computer. You should read the FreeBSD Serial Hardware tutorial for a detailed understanding of serial devices, and the differences between asynchronous and synchronous serial ports. A TA running off a standard PC serial port (asynchronous) limits you to 115.2 Kbs, even though you have a 128 Kbs connection. To fully utilize the 128 Kbs that ISDN is capable of, you must move the TA to a synchronous serial card. Do not be fooled into buying an internal TA and thinking you have avoided the synchronous/asynchronous issue. Internal TA's simply have a standard PC serial port chip built into them. All this will do is save you having to buy another serial cable and find another empty electrical socket.
A synchronous card with a TA is at least as fast as a stand-alone router, and with a simple 386 FreeBSD box driving it, probably more flexible. The choice of synchronous card/TA vs. stand-alone router is largely a religious issue. There has been some discussion of this in the mailing lists. We suggest you search the archives for the complete discussion. Stand-alone ISDN Bridges/Routers ISDN stand-alone bridges/routers ISDN bridges or routers are not at all specific to FreeBSD or any other operating system. For a more complete description of routing and bridging technology, please refer to a networking reference book. In the context of this section, the terms router and bridge will be used interchangeably. As the cost of low end ISDN routers/bridges comes down, it will likely become a more and more popular choice. An ISDN router is a small box that plugs directly into your local Ethernet network, and manages its own connection to the other bridge/router. It has built in software to communicate via PPP and other popular protocols. A router will allow you much faster throughput than a standard TA, since it will be using a full synchronous ISDN connection. The main problem with ISDN routers and bridges is that interoperability between manufacturers can still be a problem. If you are planning to connect to an Internet provider, you should discuss your needs with them. If you are planning to connect two LAN segments together, such as your home LAN to the office LAN, this is the simplest lowest maintenance solution. Since you are buying the equipment for both sides of the connection you can be assured that the link will work. For example, to connect a home computer or branch office network to a head office network, the following setup could be used: Branch Office or Home Network 10 base 2 The network uses a bus based topology with 10 base 2 Ethernet (thinnet). Connect the router to the network cable with an AUI/10BT transceiver, if necessary.

---Sun workstation
|
---FreeBSD box
|
---Windows 95
|
Stand-alone router
      |
ISDN BRI line

10 Base 2 Ethernet

If your home/branch office is only one computer, you can use a twisted pair crossover cable to connect to the stand-alone router directly. Head Office or Other LAN 10 base T The network uses a star topology with 10 base T Ethernet (Twisted Pair).

    -------Novell Server
    | H |
    |   ---Sun
    |   |
    | U ---FreeBSD
    |   |
    |   ---Windows 95
    | B |
    |___---Stand-alone router
          |
    ISDN BRI line

ISDN Network Diagram

One large advantage of most routers/bridges is that they allow you to have 2 separate independent PPP connections to 2 separate sites at the same time. This is not supported on most TA's, except for specific (usually expensive) models that have two serial ports. Do not confuse this with channel bonding, MPP, etc. This can be a very useful feature if, for example, you have a dedicated ISDN connection at your office and would like to tap into it, but do not want to get another ISDN line at work. A router at the office location can manage a dedicated B channel connection (64 Kbps) to the Internet and use the other B channel for a separate data connection. The second B channel can be used for dial-in, dial-out or dynamic bonding (MPP, etc.) with the first B channel for more bandwidth. IPX/SPX An Ethernet bridge will also allow you to transmit more than just IP traffic. You can also send IPX/SPX or whatever other protocols you use.
Chern Lee Contributed by Network Address Translation Overview natd FreeBSD's Network Address Translation daemon, commonly known as &man.natd.8;, is a daemon that accepts incoming raw IP packets, changes the source to the local machine and re-injects these packets back into the outgoing IP packet stream. &man.natd.8; does this by changing the source IP address and port such that when data is received back, it is able to determine the original location of the data and forward it back to its original requester. Internet connection sharing NAT The most common use of NAT is to perform what is commonly known as Internet Connection Sharing. Setup Due to the diminishing IP space in IPv4, and the increased number of users on high-speed consumer lines such as cable or DSL, people are increasingly in need of an Internet Connection Sharing solution. The ability to connect several computers online through one connection and IP address makes &man.natd.8; a reasonable choice. Most commonly, a user has a machine connected to a cable or DSL line with one IP address and wishes to use this one connected computer to provide Internet access to several more over a LAN. To do this, the FreeBSD machine on the Internet must act as a gateway. This gateway machine must have two NICs—one for connecting to the Internet router, the other connecting to a LAN. All the machines on the LAN are connected through a hub or switch. There are many ways to get a LAN connected to the Internet through a &os; gateway. This example will only cover a gateway with at least two NICs.

 _______           __________           ________
|       |         |          |         |        |
|  Hub  |---------| Client B |---------| Router |----- Internet
|_______|         |__________|         |________|
    |
 ___|______
|          |
| Client A |
|__________|

Network Layout

A setup like this is commonly used to share an Internet connection. One of the LAN machines is connected to the Internet. The rest of the machines access the Internet through that gateway machine. kernel configuration Configuration The following options must be in the kernel configuration file:

options IPFIREWALL
options IPDIVERT

Optionally, the following may also be useful:

options IPFIREWALL_DEFAULT_TO_ACCEPT
options IPFIREWALL_VERBOSE

The following must be in /etc/rc.conf:

gateway_enable="YES"
firewall_enable="YES"
firewall_type="OPEN"
natd_enable="YES"
natd_interface="fxp0"
natd_flags=""

gateway_enable sets up the machine to act as a gateway; running sysctl net.inet.ip.forwarding=1 would have the same effect. firewall_enable enables the firewall rules in /etc/rc.firewall at boot. firewall_type specifies a predefined firewall ruleset that allows anything in; see /etc/rc.firewall for additional types. natd_interface indicates which interface to forward packets through (the interface connected to the Internet). natd_flags holds any additional configuration options passed to &man.natd.8; on boot. Having the previous options defined in /etc/rc.conf would run natd -interface fxp0 at boot. This can also be run manually. It is also possible to use a configuration file for &man.natd.8; when there are too many options to pass. In this case, the configuration file must be defined by adding the following line to /etc/rc.conf:

natd_flags="-f /etc/natd.conf"

The /etc/natd.conf file will contain a list of configuration options, one per line. For example, the case in the next section would use the following file:

redirect_port tcp 192.168.0.2:6667 6667
redirect_port tcp 192.168.0.3:80 80

For more information about the configuration file, consult the &man.natd.8; manual page about the -f option.
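For reference, the OPEN ruleset in /etc/rc.firewall diverts packets to &man.natd.8; with a rule of roughly the following shape (shown only as an illustration; the stock startup scripts create it for you, with your natd_interface in place of fxp0):

&prompt.root; ipfw add divert natd all from any to any via fxp0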
Each machine and interface behind the LAN should be assigned an IP address in the private network space as defined by RFC 1918, and have a default gateway of the natd machine's internal IP address. For example, clients A and B behind the LAN have IP addresses of 192.168.0.2 and 192.168.0.3, while the natd machine's LAN interface has an IP address of 192.168.0.1. Client A and B's default gateway must be set to that of the natd machine, 192.168.0.1. The natd machine's external, or Internet, interface does not require any special modification for &man.natd.8; to work. Port Redirection The drawback with &man.natd.8; is that the LAN clients are not accessible from the Internet. Clients on the LAN can make outgoing connections to the world but cannot receive incoming ones. This presents a problem if trying to run Internet services on one of the LAN client machines. A simple way around this is to redirect selected Internet ports on the natd machine to a LAN client. For example, an IRC server runs on client A, and a web server runs on client B. For this to work properly, connections received on ports 6667 (IRC) and 80 (web) must be redirected to the respective machines. The -redirect_port option must be passed to &man.natd.8; with the proper arguments. The syntax is as follows:

-redirect_port proto targetIP:targetPORT[-targetPORT]
  [aliasIP:]aliasPORT[-aliasPORT]
  [remoteIP[:remotePORT[-remotePORT]]]

In the above example, the arguments should be:

-redirect_port tcp 192.168.0.2:6667 6667
-redirect_port tcp 192.168.0.3:80 80

This will redirect the proper tcp ports to the LAN client machines. The -redirect_port option can be used to indicate port ranges as well as individual ports. For example, tcp 192.168.0.2:2000-3000 2000-3000 would redirect all connections received on ports 2000 to 3000 to ports 2000 to 3000 on client A. These options can be used when directly running &man.natd.8;, placed within the natd_flags="" option in /etc/rc.conf, or passed via a configuration file. For further configuration options, consult &man.natd.8;. Address Redirection address redirection Address redirection is useful if several IP addresses are available, yet they must be on one machine. With this, &man.natd.8; can assign each LAN client its own external IP address. &man.natd.8; then rewrites outgoing packets from the LAN clients with the proper external IP address and redirects all traffic incoming on that particular IP address back to the specific LAN client. This is also known as static NAT. For example, the IP addresses 128.1.1.1, 128.1.1.2, and 128.1.1.3 belong to the natd gateway machine. 128.1.1.1 can be used as the natd gateway machine's external IP address, while 128.1.1.2 and 128.1.1.3 are forwarded back to LAN clients A and B. The -redirect_address syntax is as follows:

-redirect_address localIP publicIP

localIP is the internal IP address of the LAN client; publicIP is the external IP address corresponding to the LAN client. In the example, the arguments would read:

-redirect_address 192.168.0.2 128.1.1.2
-redirect_address 192.168.0.3 128.1.1.3

Like -redirect_port, these arguments are also placed within the natd_flags="" option of /etc/rc.conf, or passed via a configuration file. With address redirection, there is no need for port redirection, since all data received on a particular IP address is redirected. The external IP addresses on the natd machine must be active and aliased to the external interface. Look at &man.rc.conf.5; to do so. Parallel Line IP (PLIP) PLIP Parallel Line IP PLIP PLIP lets us run TCP/IP between parallel ports.
It is useful on machines without network cards, or to install on laptops. In this section, we will discuss: Creating a parallel (laplink) cable. Connecting two computers with PLIP. Creating a Parallel Cable You can purchase a parallel cable at most computer supply stores. If you cannot do that, or you just want to know how it is done, the following table shows how to make one out of a normal parallel printer cable.

Wiring a Parallel Cable for Networking

A-name        A-End   B-End   Descr.  Port/Bit
DATA0 -ERROR  2 15    15 2    Data    0/0x01 1/0x08
DATA1 +SLCT   3 13    13 3    Data    0/0x02 1/0x10
DATA2 +PE     4 12    12 4    Data    0/0x04 1/0x20
DATA3 -ACK    5 10    10 5    Strobe  0/0x08 1/0x40
DATA4 BUSY    6 11    11 6    Data    0/0x10 1/0x80
GND           18-25   18-25   GND     -
Setting Up PLIP First, you have to get a laplink cable. Then, confirm that both computers have a kernel with &man.lpt.4; driver support:

&prompt.root; grep lp /var/run/dmesg.boot
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port

The parallel port must be an interrupt driven port. Under &os; 4.X, you should have a line similar to the following in your kernel configuration file:

device ppc0 at isa? irq 7

Under &os; 5.X, the /boot/device.hints file should contain the following lines:

hint.ppc.0.at="isa"
hint.ppc.0.irq="7"

Then check if the kernel configuration file has a device plip line or if the plip.ko kernel module is loaded. In both cases the parallel networking interface should appear when you use the &man.ifconfig.8; command directly. Under &os; 4.X like this:

&prompt.root; ifconfig lp0
lp0: flags=8810<POINTOPOINT,SIMPLEX,MULTICAST> mtu 1500

and for &os; 5.X:

&prompt.root; ifconfig plip0
plip0: flags=8810<POINTOPOINT,SIMPLEX,MULTICAST> mtu 1500

The device name used for the parallel interface is different between &os; 4.X (lpX) and &os; 5.X (plipX). Plug the laplink cable into the parallel interface on both computers. Configure the network interface parameters on both sides as root. For example, if you want to connect the host host1, running &os; 4.X, with host2, running &os; 5.X:

                 host1 <-----> host2
IP Address    10.0.0.1      10.0.0.2

Configure the interface on host1 by doing:

&prompt.root; ifconfig lp0 10.0.0.1 10.0.0.2

Configure the interface on host2 by doing:

&prompt.root; ifconfig plip0 10.0.0.2 10.0.0.1

You should now have a working connection. Please read the manual pages &man.lp.4; and &man.lpt.4; for more details. You should also add both hosts to /etc/hosts:

127.0.0.1       localhost.my.domain localhost
10.0.0.1        host1.my.domain host1
10.0.0.2        host2.my.domain host2

To confirm the connection works, go to each host and ping the other. For example, on host1:

&prompt.root; ifconfig lp0
lp0: flags=8851<UP,POINTOPOINT,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet 10.0.0.1 --> 10.0.0.2 netmask 0xff000000
&prompt.root; netstat -r
Routing tables

Internet:
Destination        Gateway            Flags     Refs     Use     Netif Expire
host2              host1              UH          0       0       lp0
&prompt.root; ping -c 4 host2
PING host2 (10.0.0.2): 56 data bytes
64 bytes from 10.0.0.2: icmp_seq=0 ttl=255 time=2.774 ms
64 bytes from 10.0.0.2: icmp_seq=1 ttl=255 time=2.530 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=255 time=2.556 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=255 time=2.714 ms

--- host2 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/stddev = 2.530/2.643/2.774/0.103 ms
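The same check can be done from the other side. On host2, which runs &os; 5.X and therefore uses the plip0 name, it would look like:

&prompt.root; ifconfig plip0
&prompt.root; ping -c 4 host1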
Aaron Kaplan Originally Written by Tom Rhodes Restructured and Added by Brad Davis Extended by IPv6 IPv6 (also known as IPng, IP next generation) is the new version of the well known IP protocol (also known as IPv4). Like the other current *BSD systems, FreeBSD includes the KAME IPv6 reference implementation. So your FreeBSD system comes with all you will need to experiment with IPv6. This section focuses on getting IPv6 configured and running. In the early 1990s, people became aware of the rapidly diminishing address space of IPv4. Given the expansion rate of the Internet there were two major concerns: Running out of addresses. Today this is not so much of a concern anymore since private address spaces (10.0.0.0/8, 192.168.0.0/16, etc.) and Network Address Translation (NAT) are being employed. Router table entries were getting too large. This is still a concern today. IPv6 deals with these and many other issues: 128 bit address space. In other words, theoretically there are 340,282,366,920,938,463,463,374,607,431,768,211,456 addresses available. This means there are approximately 6.67 * 10^27 IPv6 addresses per square meter on our planet. Routers will only store network aggregation addresses in their routing tables, thus reducing the average space of a routing table to 8192 entries. There are also lots of other useful features of IPv6, such as: Address autoconfiguration (RFC2462) Anycast addresses (one-out-of-many) Mandatory multicast addresses IPsec (IP security) Simplified header structure Mobile IP IPv6-to-IPv4 transition mechanisms For more information see: IPv6 overview at playground.sun.com KAME.net 6bone.net Background on IPv6 Addresses There are different types of IPv6 addresses: Unicast, Anycast and Multicast. Unicast addresses are the well known addresses. A packet sent to a unicast address arrives exactly at the interface belonging to the address. Anycast addresses are syntactically indistinguishable from unicast addresses but they address a group of interfaces. A packet destined for an anycast address will arrive at the nearest (in router metric) interface. Anycast addresses may only be used by routers. Multicast addresses identify a group of interfaces. A packet destined for a multicast address will arrive at all interfaces belonging to the multicast group. The IPv4 broadcast address (usually xxx.xxx.xxx.255) is expressed by multicast addresses in IPv6.

Reserved IPv6 addresses

IPv6 address      Prefixlength (Bits)  Description               Notes
::                128 bits             unspecified               cf. 0.0.0.0 in IPv4
::1               128 bits             loopback address          cf. 127.0.0.1 in IPv4
::00:xx:xx:xx:xx  96 bits              embedded IPv4             The lower 32 bits are the IPv4 address. Also called IPv4 compatible IPv6 address
::ff:xx:xx:xx:xx  96 bits              IPv4 mapped IPv6 address  The lower 32 bits are the IPv4 address. For hosts which do not support IPv6.
fe80:: - feb::    10 bits              link-local                cf. loopback address in IPv4
fec0:: - fef::    10 bits              site-local
ff::              8 bits               multicast
001 (base 2)      3 bits               global unicast            All global unicast addresses are assigned from this pool. The first 3 bits are 001.
Reading IPv6 Addresses The canonical form is represented as x:x:x:x:x:x:x:x, each x being a 16 bit hex value. For example FEBC:A574:382B:23C1:AA49:4592:4EFE:9982. Often an address will have long substrings of all zeros; one such substring per address can be abbreviated by ::. Also, up to three leading zeros per hex quad can be omitted. For example fe80::1 corresponds to the canonical form fe80:0000:0000:0000:0000:0000:0000:0001. A third form is to write the last 32 bit part in the well known (decimal) IPv4 style with dots . as separators. For example 2002::10.0.0.1 corresponds to the (hexadecimal) canonical representation 2002:0000:0000:0000:0000:0000:0a00:0001, which in turn is equivalent to writing 2002::a00:1. By now the reader should be able to understand the following:

&prompt.root; ifconfig
rl0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        inet 10.0.0.10 netmask 0xffffff00 broadcast 10.0.0.255
        inet6 fe80::200:21ff:fe03:8e1%rl0 prefixlen 64 scopeid 0x1
        ether 00:00:21:03:08:e1
        media: Ethernet autoselect (100baseTX )
        status: active

fe80::200:21ff:fe03:8e1%rl0 is an auto configured link-local address. It is generated from the MAC address as part of the auto configuration. For further information on the structure of IPv6 addresses see RFC3513. Getting Connected Currently there are four ways to connect to other IPv6 hosts and networks: Join the experimental 6bone. Get an IPv6 network from your upstream provider; talk to your Internet provider for instructions. Tunnel via 6-to-4 (RFC3068). Use the net/freenet6 port if you are on a dial-up connection. Here we will talk about how to connect to the 6bone, since it currently seems to be the most popular way. First take a look at the 6bone site and find a 6bone connection nearest to you. Write to the responsible person and, with a little bit of luck, you will be given instructions on how to set up your connection. Usually this involves setting up a GRE (gif) tunnel. Here is a typical example of setting up a &man.gif.4; tunnel:

&prompt.root; ifconfig gif0 create
&prompt.root; ifconfig gif0
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
&prompt.root; ifconfig gif0 tunnel MY_IPv4_ADDR HIS_IPv4_ADDR
&prompt.root; ifconfig gif0 inet6 alias MY_ASSIGNED_IPv6_TUNNEL_ENDPOINT_ADDR

Replace the capitalized words with the information you received from the upstream 6bone node. This establishes the tunnel. Check if the tunnel is working by &man.ping6.8;'ing ff02::1%gif0. You should receive two ping replies. In case you are intrigued by the address ff02::1%gif0, this is a multicast address. %gif0 states that the multicast address at network interface gif0 is to be used. Since we ping a multicast address, the other endpoint of the tunnel should reply as well. By now, setting up a route to your 6bone uplink should be rather straightforward:

&prompt.root; route add -inet6 default -interface gif0
&prompt.root; ping6 -n MY_UPLINK
&prompt.root; traceroute6 www.jp.FreeBSD.org
(3ffe:505:2008:1:2a0:24ff:fe57:e561) from 3ffe:8060:100::40:2, 30 hops max, 12 byte packets
 1  atnet-meta6  14.147 ms  15.499 ms  24.319 ms
 2  6bone-gw2-ATNET-NT.ipv6.tilab.com  103.408 ms  95.072 ms *
 3  3ffe:1831:0:ffff::4  138.645 ms  134.437 ms  144.257 ms
 4  3ffe:1810:0:6:290:27ff:fe79:7677  282.975 ms  278.666 ms  292.811 ms
 5  3ffe:1800:0:ff00::4  400.131 ms  396.324 ms  394.769 ms
 6  3ffe:1800:0:3:290:27ff:fe14:cdee  394.712 ms  397.19 ms  394.102 ms

This output will differ from machine to machine.
By now you should be able to reach the IPv6 site www.kame.net and see the dancing tortoise — that is, if you have an IPv6 enabled browser such as www/mozilla, Konqueror, which is part of x11/kdebase3, or www/epiphany. DNS in the IPv6 World There used to be two types of DNS records for IPv6. The IETF has declared A6 records obsolete; AAAA records are the standard now. Using AAAA records is straightforward. Assign your hostname to the new IPv6 address you just received by adding:

MYHOSTNAME           AAAA    MYIPv6ADDR

to your primary zone DNS file. In case you do not serve your own DNS zones, ask your DNS provider. Current versions of bind (versions 8.3 and 9) and dns/djbdns (with the IPv6 patch) support AAAA records. Applying the needed changes to <filename>/etc/rc.conf</filename> IPv6 Client Settings These settings will help you configure a machine that will be on your LAN and act as a client, not a router. To have &man.rtsol.8; autoconfigure your interface on boot, all you need to add is:

ipv6_enable="YES"

To statically assign an IP address such as 2001:471:1f11:251:290:27ff:fee0:2093 to your fxp0 interface, add:

ipv6_ifconfig_fxp0="2001:471:1f11:251:290:27ff:fee0:2093"

To assign a default router of 2001:471:1f11:251::1, add the following to /etc/rc.conf:

ipv6_defaultrouter="2001:471:1f11:251::1"

IPv6 Router/Gateway Settings This will help you take the directions that your tunnel provider, such as the 6bone, has given you and convert them into settings that will persist through reboots. To restore your tunnel on startup, use something like the following in /etc/rc.conf: List the Generic Tunneling interfaces that will be configured, for example gif0:

gif_interfaces="gif0"

To configure the interface with a local endpoint of MY_IPv4_ADDR and a remote endpoint of REMOTE_IPv4_ADDR:

gifconfig_gif0="MY_IPv4_ADDR REMOTE_IPv4_ADDR"

To apply the IPv6 address you have been assigned for use as your IPv6 tunnel endpoint, add:

ipv6_ifconfig_gif0="MY_ASSIGNED_IPv6_TUNNEL_ENDPOINT_ADDR"

Then all you have to do is set the default route for IPv6. This is the other side of the IPv6 tunnel:

ipv6_defaultrouter="MY_IPv6_REMOTE_TUNNEL_ENDPOINT_ADDR"

IPv6 Tunnel Settings If the server is to route IPv6 between the rest of your network and the world, the following /etc/rc.conf setting will also be needed:

ipv6_gateway_enable="YES"

Router Advertisement and Host Auto Configuration This section will help you set up &man.rtadvd.8; to advertise the IPv6 default route. To enable &man.rtadvd.8; you will need the following in your /etc/rc.conf:

rtadvd_enable="YES"

It is important that you specify the interface on which to do IPv6 router solicitation. For example, to tell &man.rtadvd.8; to use fxp0:

rtadvd_interfaces="fxp0"

Now we must create the configuration file, /etc/rtadvd.conf. Here is an example:

fxp0:\
        :addrs#1:addr="2001:471:1f11:246::":prefixlen#64:tc=ether:

Replace fxp0 with the interface you are going to be using. Next, replace 2001:471:1f11:246:: with the prefix of your allocation. If you have been delegated a /64 subnet, you will not need to change anything else. Otherwise, you will need to change the prefixlen# to the correct value.
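Pulling the client settings above together, a statically configured LAN client's /etc/rc.conf fragment would be (same example addresses as above; substitute your own, and note that ipv6_enable="YES" alone is enough for an autoconfigured client):

ipv6_enable="YES"
ipv6_ifconfig_fxp0="2001:471:1f11:251:290:27ff:fee0:2093"
ipv6_defaultrouter="2001:471:1f11:251::1"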
Harti Brandt Contributed by Asynchronous Transfer Mode (ATM) Configuring classical IP over ATM (PVCs) Classical IP over ATM (CLIP) is the simplest method to use Asynchronous Transfer Mode (ATM) with IP. It can be used with switched connections (SVCs) and with permanent connections (PVCs). This section describes how to set up a network based on PVCs. Fully meshed configurations The first method to set up a CLIP with PVCs is to connect each machine to each other machine in the network via a dedicated PVC. While this is simple to configure, it tends to become impractical for a larger number of machines. The example assumes that we have four machines in the network, each connected to the ATM network with an ATM adapter card. The first step is the planning of the IP addresses and the ATM connections between the machines. We use the following:

Host    IP Address
hostA   192.168.173.1
hostB   192.168.173.2
hostC   192.168.173.3
hostD   192.168.173.4

To build a fully meshed net we need one ATM connection between each pair of machines:

Machines        VPI.VCI couple
hostA - hostB   0.100
hostA - hostC   0.101
hostA - hostD   0.102
hostB - hostC   0.103
hostB - hostD   0.104
hostC - hostD   0.105

The VPI and VCI values at each end of the connection may of course differ, but for simplicity we assume that they are the same. Next we need to configure the ATM interfaces on each host:

hostA&prompt.root; ifconfig hatm0 192.168.173.1 up
hostB&prompt.root; ifconfig hatm0 192.168.173.2 up
hostC&prompt.root; ifconfig hatm0 192.168.173.3 up
hostD&prompt.root; ifconfig hatm0 192.168.173.4 up

assuming that the ATM interface is hatm0 on all hosts. Now the PVCs need to be configured on each host (we assume that they are already configured on the ATM switches; you need to consult the manual for the switch on how to do this):

hostA&prompt.root; atmconfig natm add 192.168.173.2 hatm0 0 100 llc/snap ubr
hostA&prompt.root; atmconfig natm add 192.168.173.3 hatm0 0 101 llc/snap ubr
hostA&prompt.root; atmconfig natm add 192.168.173.4 hatm0 0 102 llc/snap ubr

hostB&prompt.root; atmconfig natm add 192.168.173.1 hatm0 0 100 llc/snap ubr
hostB&prompt.root; atmconfig natm add 192.168.173.3 hatm0 0 103 llc/snap ubr
hostB&prompt.root; atmconfig natm add 192.168.173.4 hatm0 0 104 llc/snap ubr

hostC&prompt.root; atmconfig natm add 192.168.173.1 hatm0 0 101 llc/snap ubr
hostC&prompt.root; atmconfig natm add 192.168.173.2 hatm0 0 103 llc/snap ubr
hostC&prompt.root; atmconfig natm add 192.168.173.4 hatm0 0 105 llc/snap ubr

hostD&prompt.root; atmconfig natm add 192.168.173.1 hatm0 0 102 llc/snap ubr
hostD&prompt.root; atmconfig natm add 192.168.173.2 hatm0 0 104 llc/snap ubr
hostD&prompt.root; atmconfig natm add 192.168.173.3 hatm0 0 105 llc/snap ubr

Of course, traffic contracts other than UBR can be used, provided the ATM adapter supports them. In this case the name of the traffic contract is followed by the parameters of the traffic. Help for the &man.atmconfig.8; tool can be obtained with:

&prompt.root; atmconfig help natm add

or in the &man.atmconfig.8; manual page. The same configuration can also be done via /etc/rc.conf. For hostA this would look like:

network_interfaces="lo0 hatm0"
ifconfig_hatm0="inet 192.168.173.1 up"
natm_static_routes="hostB hostC hostD"
route_hostB="192.168.173.2 hatm0 0 100 llc/snap ubr"
route_hostC="192.168.173.3 hatm0 0 101 llc/snap ubr"
route_hostD="192.168.173.4 hatm0 0 102 llc/snap ubr"

The current state of all CLIP routes can be obtained with:

hostA&prompt.root; atmconfig natm show
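As a simple sanity check (not part of the original setup steps, but an obvious one), every host should now be able to reach each of its peers over the corresponding PVC, for example:

hostA&prompt.root; ping -c 4 192.168.173.2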
diff --git a/en_US.ISO8859-1/books/handbook/basics/chapter.sgml b/en_US.ISO8859-1/books/handbook/basics/chapter.sgml index d6c48fbef1..9e1f7f5953 100644 --- a/en_US.ISO8859-1/books/handbook/basics/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/basics/chapter.sgml @@ -1,2579 +1,2579 @@ Chris Shumway Rewritten by UNIX Basics Synopsis The following chapter will cover the basic commands and functionality of the FreeBSD operating system. Much of this material is relevant for any &unix;-like operating system. Feel free to skim over this chapter if you are familiar with the material. If you are new to FreeBSD, then you will definitely want to read through this chapter carefully. After reading this chapter, you will know: How to use the virtual consoles of FreeBSD. How &unix; file permissions work along with understanding file flags in &os;. The default &os; file system layout. The &os; disk organization. How to mount and unmount file systems. What processes, daemons, and signals are. What a shell is, and how to change your default login environment. How to use basic text editors. What devices and device nodes are. What binary format is used under &os;. How to read manual pages for more information. Virtual Consoles and Terminals virtual consoles terminals FreeBSD can be used in various ways. One of them is typing commands to a text terminal. A lot of the flexibility and power of a &unix; operating system is readily available at your hands when using FreeBSD this way. This section describes what terminals and consoles are, and how you can use them in FreeBSD. The Console console If you have not configured FreeBSD to automatically start a graphical environment during startup, the system will present you with a login prompt after it boots, right after the startup scripts finish running. You will see something similar to: Additional ABI support:. Local package initialization:. Additional TCP options:. Fri Sep 20 13:01:06 EEST 2002 FreeBSD/i386 (pc3.example.org) (ttyv0) login: The messages might be a bit different on your system, but you will see something similar. The last two lines are what we are interested in right now. The second last line reads: FreeBSD/i386 (pc3.example.org) (ttyv0) This line contains some bits of information about the system you have just booted. You are looking at a FreeBSD console, running on an Intel or compatible processor of the x86 architecture This is what i386 means. Note that even if you are not running FreeBSD on an Intel 386 CPU, this is going to be i386. It is not the type of your processor, but the processor architecture that is shown here. . The name of this machine (every &unix; machine has a name) is pc3.example.org, and you are now looking at its system console—the ttyv0 terminal. Finally, the last line is always: login: This is the part where you are supposed to type in your username to log into FreeBSD. The next section describes how you can do this. Logging into FreeBSD FreeBSD is a multiuser, multiprocessing system. This is the formal description that is usually given to a system that can be used by many different people, who simultaneously run a lot of programs on a single machine. Every multiuser system needs some way to distinguish one user from the rest. In FreeBSD (and all the &unix;-like operating systems), this is accomplished by requiring that every user must log into the system before being able to run programs. Every user has a unique name (the username) and a personal, secret key (the password). 
FreeBSD will ask for these two before allowing a user to run any programs. startup scripts Right after FreeBSD boots and finishes running its startup scripts Startup scripts are programs that are run automatically by FreeBSD when booting. Their main function is to set things up for everything else to run, and start any services that you have configured to run in the background doing useful things. , it will present you with a prompt and ask for a valid username: login: For the sake of this example, let us assume that your username is john. Type john at this prompt and press Enter. You should then be presented with a prompt to enter a password: login: john Password: Type in john's password now, and press Enter. The password is not echoed! You need not worry about this right now. Suffice it to say that it is done for security reasons. If you have typed your password correctly, you should by now be logged into FreeBSD and ready to try out all the available commands. You should see the MOTD or message of the day followed by a command prompt (a #, $, or % character). This indicates you have successfully logged into FreeBSD. Multiple Consoles Running &unix; commands in one console is fine, but FreeBSD can run many programs at once. Having one console where commands can be typed would be a bit of a waste when an operating system like FreeBSD can run dozens of programs at the same time. This is where virtual consoles can be very helpful. FreeBSD can be configured to present you with many different virtual consoles. You can switch from one of them to any other virtual console by pressing a couple of keys on your keyboard. Each console has its own different output channel, and FreeBSD takes care of properly redirecting keyboard input and monitor output as you switch from one virtual console to the next. Special key combinations have been reserved by FreeBSD for switching consoles A fairly technical and accurate description of all the details of the FreeBSD console and keyboard drivers can be found in the manual pages of &man.syscons.4;, &man.atkbd.4;, &man.vidcontrol.1; and &man.kbdcontrol.1;. We will not expand on the details here, but the interested reader can always consult the manual pages for a more detailed and thorough explanation of how things work. . You can use AltF1, AltF2, through AltF8 to switch to a different virtual console in FreeBSD. As you are switching from one console to the next, FreeBSD takes care of saving and restoring the screen output. The result is an illusion of having multiple virtual screens and keyboards that you can use to type commands for FreeBSD to run. The programs that you launch on one virtual console do not stop running when that console is not visible. They continue running when you have switched to a different virtual console. The <filename>/etc/ttys</filename> File The default configuration of FreeBSD will start up with eight virtual consoles. This is not a hardwired setting though, and you can easily customize your installation to boot with more or fewer virtual consoles. The number and settings of the virtual consoles are configured in the /etc/ttys file. You can use the /etc/ttys file to configure the virtual consoles of FreeBSD. Each uncommented line in this file (lines that do not start with a # character) contains settings for a single terminal or virtual console. The default version of this file that ships with FreeBSD configures nine virtual consoles, and enables eight of them. 
They are the lines that start with ttyv: # name getty type status comments # ttyv0 "/usr/libexec/getty Pc" cons25 on secure # Virtual terminals ttyv1 "/usr/libexec/getty Pc" cons25 on secure ttyv2 "/usr/libexec/getty Pc" cons25 on secure ttyv3 "/usr/libexec/getty Pc" cons25 on secure ttyv4 "/usr/libexec/getty Pc" cons25 on secure ttyv5 "/usr/libexec/getty Pc" cons25 on secure ttyv6 "/usr/libexec/getty Pc" cons25 on secure ttyv7 "/usr/libexec/getty Pc" cons25 on secure ttyv8 "/usr/X11R6/bin/xdm -nodaemon" xterm off secure For a detailed description of every column in this file and all the options you can use to set things up for the virtual consoles, consult the &man.ttys.5; manual page. Single User Mode Console A detailed description of what single user mode is can be found in . It is worth noting that there is only one console when you are running FreeBSD in single user mode. There are no virtual consoles available. The settings of the single user mode console can also be found in the /etc/ttys file. Look for the line that starts with console: # name getty type status comments # # If console is marked "insecure", then init will ask for the root password # when going to single-user mode. console none unknown off secure As the comments above the console line indicate, you can edit this line and change secure to insecure. If you do that, when FreeBSD boots into single user mode, it will still ask for the root password. Be careful when changing this to insecure. If you ever forget the root password, booting into single user mode is a bit involved. It is still possible, but it might be a bit hard for someone who is not very comfortable with the FreeBSD booting process and the programs involved. Permissions UNIX FreeBSD, being a direct descendant of BSD &unix;, is based on several key &unix; concepts. The first and most pronounced is that FreeBSD is a multi-user operating system. The system can handle several users all working simultaneously on completely unrelated tasks. The system is responsible for properly sharing and managing requests for hardware devices, peripherals, memory, and CPU time fairly to each user. Because the system is capable of supporting multiple users, everything the system manages has a set of permissions governing who can read, write, and execute the resource. These permissions are stored as three octets broken into three pieces, one for the owner of the file, one for the group that the file belongs to, and one for everyone else. This numerical representation works like this: permissions file permissions Value Permission Directory Listing 0 No read, no write, no execute --- 1 No read, no write, execute --x 2 No read, write, no execute -w- 3 No read, write, execute -wx 4 Read, no write, no execute r-- 5 Read, no write, execute r-x 6 Read, write, no execute rw- 7 Read, write, execute rwx ls directories You can use the command line argument to &man.ls.1; to view a long directory listing that includes a column with information about a file's permissions for the owner, group, and everyone else. For example, a ls -l in an arbitrary directory may show: &prompt.user; ls -l total 530 -rw-r--r-- 1 root wheel 512 Sep 5 12:31 myfile -rw-r--r-- 1 root wheel 512 Sep 5 12:31 otherfile -rw-r--r-- 1 root wheel 7680 Sep 5 12:31 email.txt ... Here is how the first column of ls -l is broken up: -rw-r--r-- The first (leftmost) character tells if this file is a regular file, a directory, a special character device, a socket, or any other special pseudo-file device. 
In this case, the - indicates a regular file. The next three characters, rw- in this example, give the permissions for the owner of the file. The next three characters, r--, give the permissions for the group that the file belongs to. The final three characters, r--, give the permissions for the rest of the world. A dash means that the permission is turned off. In the case of this file, the permissions are set so the owner can read and write to the file, the group can read the file, and the rest of the world can only read the file. According to the table above, the permissions for this file would be 644, where each digit represents the three parts of the file's permission. This is all well and good, but how does the system control permissions on devices? FreeBSD actually treats most hardware devices as a file that programs can open, read, and write data to just like any other file. These special device files are stored in the /dev directory. Directories are also treated as files. They have read, write, and execute permissions. The executable bit for a directory has a slightly different meaning than that of files. When a directory is marked executable, it means it can be traversed into, that is, it is possible to cd (change directory) into it. This also means that within the directory it is possible to access files whose names are known (subject, of course, to the permissions on the files themselves). In particular, in order to perform a directory listing, read permission must be set on the directory, whilst to delete a file that one knows the name of, it is necessary to have write and execute permissions to the directory containing the file. There are more permission bits, but they are primarily used in special circumstances such as setuid binaries and sticky directories. If you want more information on file permissions and how to set them, be sure to look at the &man.chmod.1; manual page. Tom Rhodes Contributed by Symbolic Permissions permissionssymbolic Symbolic permissions, sometimes referred to as symbolic expressions, use characters in place of octal values to assign permissions to files or directories. Symbolic expressions use the syntax of (who) (action) (permissions), where the following values are available:

Option           Letter    Represents
(who)            u         User
(who)            g         Group owner
(who)            o         Other
(who)            a         All (world)
(action)         +         Adding permissions
(action)         -         Removing permissions
(action)         =         Explicitly set permissions
(permissions)    r         Read
(permissions)    w         Write
(permissions)    x         Execute
(permissions)    t         Sticky bit
(permissions)    s         Set UID or GID

These values are used with the &man.chmod.1; command just like before, but with letters. For example, you could use the following command to block other users from accessing FILE:

&prompt.user; chmod go= FILE

A comma-separated list can be provided when more than one set of changes to a file must be made. For example, the following command will remove the group's and world's write permission on FILE, and then add execute permission for everyone:

&prompt.user; chmod go-w,a+x FILE

Tom Rhodes Contributed by &os; File Flags In addition to the file permissions discussed previously, &os; supports the use of file flags. These flags add an additional level of security and control over files, but not directories, and help to ensure that in some cases not even root can remove or alter files. File flags are altered by using the &man.chflags.1; utility, which has a simple interface.
For example, to enable the system undeletable flag on the file file1, issue the following command:

&prompt.root; chflags sunlink file1

To disable the system undeletable flag, simply issue the previous command with no in front of the flag:

&prompt.root; chflags nosunlink file1

To view the flags of this file, use &man.ls.1; with the -lo flags:

&prompt.root; ls -lo file1

The output should look like the following:

-rw-r--r--  1 trhodes  trhodes  sunlnk 0 Mar  1 05:54 file1

Several flags may only be added to or removed from files by the root user. In other cases, the file owner may set these flags. It is recommended that an administrator read over the &man.chflags.1; and &man.chflags.2; manual pages for more information. Directory Structure directory hierarchy The FreeBSD directory hierarchy is fundamental to obtaining an overall understanding of the system. The most important concept to grasp is that of the root directory, /. This directory is the first one mounted at boot time and it contains the base system necessary to prepare the operating system for multi-user operation. The root directory also contains mount points for every other file system that you may want to mount. A mount point is a directory where additional file systems can be grafted onto the root file system. This is further described in . Standard mount points include /usr, /var, /tmp, /mnt, and /cdrom. These directories usually correspond to entries in the file /etc/fstab. /etc/fstab is a table of various file systems and mount points for reference by the system. Most of the file systems in /etc/fstab are mounted automatically at boot time from the script &man.rc.8; unless they contain the noauto option. Details can be found in . A complete description of the file system hierarchy is available in &man.hier.7;. For now, a brief overview of the most common directories will suffice.

Directory          Description
/                  Root directory of the file system.
/bin/              User utilities fundamental to both single-user and multi-user environments.
/boot/             Programs and configuration files used during operating system bootstrap.
/boot/defaults/    Default bootstrapping configuration files; see &man.loader.conf.5;.
/dev/              Device nodes; see &man.intro.4;.
/etc/              System configuration files and scripts.
/etc/defaults/     Default system configuration files; see &man.rc.8;.
/etc/mail/         Configuration files for mail transport agents such as &man.sendmail.8;.
/etc/namedb/       named configuration files; see &man.named.8;.
/etc/periodic/     Scripts that are run daily, weekly, and monthly, via &man.cron.8;; see &man.periodic.8;.
/etc/ppp/          ppp configuration files; see &man.ppp.8;.
/mnt/              Empty directory commonly used by system administrators as a temporary mount point.
/proc/             Process file system; see &man.procfs.5;, &man.mount.procfs.8;.
/rescue/           Statically linked programs for emergency recovery; see &man.rescue.8;.
/root/             Home directory for the root account.
/sbin/             System programs and administration utilities fundamental to both single-user and multi-user environments.
/stand/            Programs used in a standalone environment.
/tmp/              Temporary files. The contents of /tmp are usually NOT preserved across a system reboot. A memory-based file system is often mounted at /tmp. This can be automated using the tmpmfs-related variables of &man.rc.conf.5; (or with an entry in /etc/fstab; see &man.mdmfs.8;, or for FreeBSD 4.X, &man.mfs.8;).
/usr/              The majority of user utilities and applications.
/usr/bin/          Common utilities, programming tools, and applications.
/usr/include/      Standard C include files.
/usr/lib/          Archive libraries.
/usr/libdata/      Miscellaneous utility data files.
/usr/libexec/      System daemons & system utilities (executed by other programs).
/usr/local/        Local executables, libraries, etc. Also used as the default destination for the FreeBSD ports framework. Within /usr/local, the general layout sketched out by &man.hier.7; for /usr should be used. Exceptions are the man directory, which is directly under /usr/local rather than under /usr/local/share, and the ports documentation is in share/doc/port.
/usr/obj/          Architecture-specific target tree produced by building the /usr/src tree.
/usr/ports         The FreeBSD Ports Collection (optional).
/usr/sbin/         System daemons & system utilities (executed by users).
/usr/share/        Architecture-independent files.
/usr/src/          BSD and/or local source files.
/usr/X11R6/        X11R6 distribution executables, libraries, etc (optional).
/var/              Multi-purpose log, temporary, transient, and spool files. A memory-based file system is sometimes mounted at /var. This can be automated using the varmfs-related variables of &man.rc.conf.5; (or with an entry in /etc/fstab; see &man.mdmfs.8;, or for FreeBSD 4.X, &man.mfs.8;).
/var/log/          Miscellaneous system log files.
/var/mail/         User mailbox files.
/var/spool/        Miscellaneous printer and mail system spooling directories.
/var/tmp/          Temporary files. The files are usually preserved across a system reboot, unless /var is a memory-based file system.
/var/yp            NIS maps.

Disk Organization The smallest unit of organization that FreeBSD uses to find files is the filename. Filenames are case-sensitive, which means that readme.txt and README.TXT are two separate files. FreeBSD does not use the extension (.txt) of a file to determine whether the file is a program, or a document, or some other form of data. Files are stored in directories. A directory may contain no files, or it may contain many hundreds of files. A directory can also contain other directories, allowing you to build up a hierarchy of directories within one another. This makes it much easier to organize your data. Files and directories are referenced by giving the file or directory name, followed by a forward slash, /, followed by any other directory names that are necessary. If you have directory foo, which contains directory bar, which contains the file readme.txt, then the full name, or path, to the file is foo/bar/readme.txt. Directories and files are stored in a file system. Each file system contains exactly one directory at the very top level, called the root directory for that file system. This root directory can then contain other directories. So far this is probably similar to any other operating system you may have used. There are a few differences; for example, &ms-dos; uses \ to separate file and directory names, while &macos; uses :. FreeBSD does not use drive letters, or other drive names in the path. You would not write c:/foo/bar/readme.txt on FreeBSD. Instead, one file system is designated the root file system. The root file system's root directory is referred to as /. Every other file system is then mounted under the root file system. No matter how many disks you have on your FreeBSD system, every directory appears to be part of the same disk. Suppose you have three file systems, called A, B, and C. Each file system has one root directory, which contains two other directories, called A1, A2 (and likewise B1, B2 and C1, C2).
Call A the root file system. If you used the ls command to view the contents of this directory you would see two subdirectories, A1 and A2. The directory tree looks like this:

     /
     |
     +--- A1
     |
     `--- A2

A file system must be mounted on to a directory in another file system. So now suppose that you mount file system B on to the directory A1. The root directory of B replaces A1, and the directories in B appear accordingly:

     /
     |
     +--- A1
     |    |
     |    +--- B1
     |    |
     |    `--- B2
     |
     `--- A2

Any files that are in the B1 or B2 directories can be reached with the path /A1/B1 or /A1/B2 as necessary. Any files that were in /A1 have been temporarily hidden. They will reappear if B is unmounted from A. If B had been mounted on A2 then the diagram would look like this:

     /
     |
     +--- A1
     |
     `--- A2
          |
          +--- B1
          |
          `--- B2

and the paths would be /A2/B1 and /A2/B2 respectively. File systems can be mounted on top of one another. Continuing the last example, the C file system could be mounted on top of the B1 directory in the B file system, leading to this arrangement:

     /
     |
     +--- A1
     |    |
     |    +--- B1
     |    |    |
     |    |    +--- C1
     |    |    |
     |    |    `--- C2
     |    |
     |    `--- B2
     |
     `--- A2

Or C could be mounted directly on to the A file system, under the A1 directory:

     /
     |
     +--- A1
     |    |
     |    +--- C1
     |    |
     |    `--- C2
     |
     `--- A2
          |
          +--- B1
          |
          `--- B2

If you are familiar with &ms-dos;, this is similar, although not identical, to the join command. This is not normally something you need to concern yourself with. Typically you create file systems when installing FreeBSD and decide where to mount them, and then never change them unless you add a new disk. It is entirely possible to have one large root file system, and not need to create any others. There are some drawbacks to this approach, and one advantage. Benefits of Multiple File Systems Different file systems can have different mount options. For example, with careful planning, the root file system can be mounted read-only, making it impossible for you to inadvertently delete or edit a critical file. Separating user-writable file systems, such as /home, from other file systems also allows them to be mounted nosuid; this option prevents the suid/sgid bits on executables stored on the file system from taking effect, possibly improving security. FreeBSD automatically optimizes the layout of files on a file system, depending on how the file system is being used. So a file system that contains many small files that are written frequently will have a different optimization to one that contains fewer, larger files. By having one big file system this optimization breaks down. FreeBSD's file systems are very robust should you lose power. However, a power loss at a critical point could still damage the structure of the file system. By splitting your data over multiple file systems it is more likely that the system will still come up, making it easier for you to restore from backup as necessary. Benefit of a Single File System File systems are a fixed size. If you create a file system when you install FreeBSD and give it a specific size, you may later discover that you need to make the partition bigger. This is not easily accomplished without backing up, recreating the file system with the new size, and then restoring the backed up data. FreeBSD 4.4 and later versions feature the &man.growfs.8; command, which makes it possible to increase the size of a file system on the fly, removing this limitation. File systems are contained in partitions.
This does not have the same meaning as the common usage of the term partition (for example, &ms-dos; partition), because of &os;'s &unix; heritage. Each partition is identified by a letter from a through to h. Each partition can contain only one file system, which means that file systems are often described by either their typical mount point in the file system hierarchy, or the letter of the partition they are contained in. FreeBSD also uses disk space for swap space. Swap space provides FreeBSD with virtual memory. This allows your computer to behave as though it has much more memory than it actually does. When FreeBSD runs out of memory it moves some of the data that is not currently being used to the swap space, and moves it back in (moving something else out) when it needs it. Some partitions have certain conventions associated with them.

Partition    Convention
a            Normally contains the root file system.
b            Normally contains swap space.
c            Normally the same size as the enclosing slice. This allows utilities that need to work on the entire slice (for example, a bad block scanner) to work on the c partition. You would not normally create a file system on this partition.
d            Partition d used to have a special meaning associated with it, although that is now gone. To this day, some tools may operate oddly if told to work on partition d, so sysinstall will not normally create partition d.

Each partition-that-contains-a-file-system is stored in what FreeBSD calls a slice. Slice is FreeBSD's term for what are commonly called partitions, and again, this is because of FreeBSD's &unix; background. Slices are numbered, starting at 1, through to 4. slices partitions dangerously dedicated Slice numbers follow the device name, prefixed with an s, starting at 1. So da0s1 is the first slice on the first SCSI drive. There can only be four physical slices on a disk, but you can have logical slices inside physical slices of the appropriate type. These extended slices are numbered starting at 5, so ad0s5 is the first extended slice on the first IDE disk. These devices are used by file systems that expect to occupy a slice. Slices, dangerously dedicated physical drives, and other drives contain partitions, which are represented as letters from a to h. This letter is appended to the device name, so da0a is the a partition on the first da drive, which is dangerously dedicated. ad1s3e is the fifth partition in the third slice of the second IDE disk drive. Finally, each disk on the system is identified. A disk name starts with a code that indicates the type of disk, and then a number, indicating which disk it is. Unlike slices, disk numbering starts at 0. Common codes that you will see are listed in the Disk Device Codes table below. When referring to a partition FreeBSD requires that you also name the slice and disk that contains the partition, and when referring to a slice you should also refer to the disk name. Do this by listing the disk name, s, the slice number, and then the partition letter. Examples are shown in the Sample Disk, Slice, and Partition Names table, and the Conceptual Model of a Disk that follows it should help make things clearer. In order to install FreeBSD you must first configure the disk slices, then create partitions within the slice you will use for FreeBSD, and then create a file system (or swap space) in each partition, and decide where that file system will be mounted.

Disk Device Codes
Code    Meaning
ad      ATAPI (IDE) disk
da      SCSI direct access disk
acd     ATAPI (IDE) CDROM
cd      SCSI CDROM
fd      Floppy disk
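As a quick sanity check of this naming scheme, you can list the device nodes that exist for a disk. On a hypothetical machine whose first IDE disk carries a DOS slice and a FreeBSD slice (as in the conceptual model below), the listing might resemble the following; your own device names will differ with your hardware and partitioning:

&prompt.user; ls /dev/ad0*
/dev/ad0        /dev/ad0s1      /dev/ad0s2      /dev/ad0s2a
/dev/ad0s2b     /dev/ad0s2c     /dev/ad0s2e     /dev/ad0s2f

Here ad0s2a through ad0s2f are the FreeBSD partitions inside the second slice, named exactly as described above.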
Sample Disk, Slice, and Partition Names
Name      Meaning
ad0s1a    The first partition (a) on the first slice (s1) on the first IDE disk (ad0).
da1s2e    The fifth partition (e) on the second slice (s2) on the second SCSI disk (da1).

Conceptual Model of a Disk This diagram shows FreeBSD's view of the first IDE disk attached to the system. Assume that the disk is 4 GB in size, and contains two 2 GB slices (&ms-dos; partitions). The first slice contains a &ms-dos; disk, C:, and the second slice contains a FreeBSD installation. This example FreeBSD installation has three partitions, and a swap partition. The three partitions will each hold a file system. Partition a will be used for the root file system, e for the /var directory hierarchy, and f for the /usr directory hierarchy.

.-----------------.  --.
|                 |    |
| DOS / Windows   |    |
:                 :    > First slice, ad0s1
:                 :    |
|                 |    |
:=================: ==:                               --.
|                 |    | Partition a, mounted as /      |
|                 |    > referred to as ad0s2a          |
|                 |    |                                |
:-----------------: ==:                                 |
|                 |    | Partition b, used as swap      |
|                 |    > referred to as ad0s2b          |
|                 |    |                                |
:-----------------: ==:                                 | Partition c, no
|                 |    | Partition e, used as /var      > file system, all
|                 |    > referred to as ad0s2e          | of FreeBSD slice,
|                 |    |                                | ad0s2c
:-----------------: ==:                                 |
|                 |    |                                |
:                 :    | Partition f, used as /usr      |
:                 :    > referred to as ad0s2f          |
:                 :    |                                |
|                 |    |                                |
|                 |    |                                |
`-----------------'  --'                              --'
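Tying the model to the next section, an /etc/fstab for this example layout might look like the sketch below. The device names are the ones from the diagram and are illustrative only; the meaning of each field is explained next:

# Device        Mountpoint    FStype    Options    Dump    Pass#
/dev/ad0s2b     none          swap      sw         0       0
/dev/ad0s2a     /             ufs       rw         1       1
/dev/ad0s2e     /var          ufs       rw         2       2
/dev/ad0s2f     /usr          ufs       rw         2       2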
Mounting and Unmounting File Systems The file system is best visualized as a tree, rooted, as it were, at /. /dev, /usr, and the other directories in the root directory are branches, which may have their own branches, such as /usr/local, and so on. root file system There are various reasons to house some of these directories on separate file systems. /var contains the directories log/, spool/, and various types of temporary files, and as such, may get filled up. Filling up the root file system is not a good idea, so splitting /var from / is often favorable. Another common reason to contain certain directory trees on other file systems is if they are to be housed on separate physical disks, or are separate virtual disks, such as Network File System mounts, or CDROM drives. The <filename>fstab</filename> File file systems mounted with fstab During the boot process, file systems listed in /etc/fstab are automatically mounted (unless they are listed with the noauto option). The /etc/fstab file contains a list of lines of the following format:

device /mount-point fstype options dumpfreq passno

device: A device name (which should exist), as explained in the Disk Organization section above.
mount-point: A directory (which should exist), on which to mount the file system.
fstype: The file system type to pass to &man.mount.8;. The default FreeBSD file system is ufs.
options: Either rw for read-write file systems, or ro for read-only file systems, followed by any other options that may be needed. A common option is noauto for file systems not normally mounted during the boot sequence. Other options are listed in the &man.mount.8; manual page.
dumpfreq: This is used by &man.dump.8; to determine which file systems require dumping. If the field is missing, a value of zero is assumed.
passno: This determines the order in which file systems should be checked. File systems that should be skipped should have their passno set to zero. The root file system (which needs to be checked before everything else) should have its passno set to one, and other file systems' passno should be set to values greater than one. If more than one file system has the same passno then &man.fsck.8; will attempt to check file systems in parallel if possible.

Consult the &man.fstab.5; manual page for more information on the format of the /etc/fstab file and the options it contains. The <command>mount</command> Command file systems mounting The &man.mount.8; command is what is ultimately used to mount file systems. In its most basic form, you use:

&prompt.root; mount device mountpoint

There are plenty of options, as mentioned in the &man.mount.8; manual page, but the most common are:

Mount Options
-a: Mount all the file systems listed in /etc/fstab, except those marked as noauto, excluded by the -t flag, or those that are already mounted.
-d: Do everything except for the actual mount system call. This option is useful in conjunction with the -v flag to determine what &man.mount.8; is actually trying to do.
-f: Force the mount of an unclean file system (dangerous), or force the revocation of write access when downgrading a file system's mount status from read-write to read-only.
-r: Mount the file system read-only. This is identical to using the ro (rdonly for &os; versions older than 5.2) argument to the -o option.
-t fstype: Mount the given file system as the given file system type, or mount only file systems of the given type, if given the -a option. ufs is the default file system type.
-u: Update mount options on the file system.
-v: Be verbose.
-w: Mount the file system read-write.
The -o option takes a comma-separated list of options, including the following:

nodev: Do not interpret special devices on the file system. This is a useful security option.
noexec: Do not allow execution of binaries on this file system. This is also a useful security option.
nosuid: Do not interpret setuid or setgid flags on the file system. This is also a useful security option.

The <command>umount</command> Command file systems unmounting The &man.umount.8; command takes, as a parameter, one of a mountpoint, a device name, or the -a or -A option. All forms take -f to force unmounting, and -v for verbosity. Be warned that -f is not generally a good idea. Forcibly unmounting file systems might crash the computer or damage data on the file system. -a and -A are used to unmount all mounted file systems, possibly modified by the file system types listed after -t. -A, however, does not attempt to unmount the root file system. Processes FreeBSD is a multi-tasking operating system. This means that it seems as though more than one program is running at once. Each program running at any one time is called a process. Every command you run will start at least one new process, and there are a number of system processes that run all the time, keeping the system functional. Each process is uniquely identified by a number called a process ID, or PID, and, like files, each process also has one owner and group. The owner and group information is used to determine what files and devices the process can open, using the file permissions discussed earlier. Most processes also have a parent process. The parent process is the process that started them. For example, if you are typing commands to the shell then the shell is a process, and any commands you run are also processes. Each process you run in this way will have your shell as its parent process. The exception to this is a special process called &man.init.8;. init is always the first process, so its PID is always 1. init is started automatically by the kernel when FreeBSD starts. Two commands are particularly useful to see the processes on the system, &man.ps.1; and &man.top.1;. The ps command is used to show a static list of the currently running processes, and can show their PID, how much memory they are using, the command line they were started with, and so on. The top command displays all the running processes, and updates the display every few seconds, so that you can interactively see what your computer is doing. By default, ps only shows you the commands that are running and are owned by you. For example:

&prompt.user; ps
  PID  TT  STAT      TIME COMMAND
  298  p0  Ss     0:01.10 tcsh
 7078  p0  S      2:40.88 xemacs mdoc.xsl (xemacs-21.1.14)
37393  p0  I      0:03.11 xemacs freebsd.dsl (xemacs-21.1.14)
48630  p0  S      2:50.89 /usr/local/lib/netscape-linux/navigator-linux-4.77.bi
48730  p0  IW     0:00.00 (dns helper) (navigator-linux-)
72210  p0  R+     0:00.00 ps
  390  p1  Is     0:01.14 tcsh
 7059  p2  Is+    1:36.18 /usr/local/bin/mutt -y
 6688  p3  IWs    0:00.00 tcsh
10735  p4  IWs    0:00.00 tcsh
20256  p5  IWs    0:00.00 tcsh
  262  v0  IWs    0:00.00 -tcsh (tcsh)
  270  v0  IW+    0:00.00 /bin/sh /usr/X11R6/bin/startx -- -bpp 16
  280  v0  IW+    0:00.00 xinit /home/nik/.xinitrc -- -bpp 16
  284  v0  IW     0:00.00 /bin/sh /home/nik/.xinitrc
  285  v0  S      0:38.45 /usr/X11R6/bin/sawfish

As you can see in this example, the output from &man.ps.1; is organized into a number of columns. PID is the process ID discussed earlier. PIDs are assigned starting from 1, go up to 99999, and wrap around back to the beginning when you run out.
The TT column shows the tty the program is running on, and can safely be ignored for the moment. STAT shows the program's state, and again, can be safely ignored. TIME is the amount of time the program has been running on the CPU—this is usually not the elapsed time since you started the program, as most programs spend a lot of time waiting for things to happen before they need to spend time on the CPU. Finally, COMMAND is the command line that was used to run the program. &man.ps.1; supports a number of different options to change the information that is displayed. One of the most useful sets is auxww. a displays information about all the running processes, not just your own. u displays the username of the process' owner, as well as memory usage. x displays information about daemon processes, and ww causes &man.ps.1; to display the full command line, rather than truncating it once it gets too long to fit on the screen. The output from &man.top.1; is similar. A sample session looks like this:

&prompt.user; top
last pid: 72257;  load averages: 0.13, 0.09, 0.03    up 0+13:38:33  22:39:10
47 processes:  1 running, 46 sleeping
CPU states: 12.6% user,  0.0% nice,  7.8% system,  0.0% interrupt, 79.7% idle
Mem: 36M Active, 5256K Inact, 13M Wired, 6312K Cache, 15M Buf, 408K Free
Swap: 256M Total, 38M Used, 217M Free, 15% Inuse

  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
72257 nik       28   0  1960K  1044K RUN      0:00 14.86%  1.42% top
 7078 nik        2   0 15280K 10960K select   2:54  0.88%  0.88% xemacs-21.1.14
  281 nik        2   0 18636K  7112K select   5:36  0.73%  0.73% XF86_SVGA
  296 nik        2   0  3240K  1644K select   0:12  0.05%  0.05% xterm
48630 nik        2   0 29816K  9148K select   3:18  0.00%  0.00% navigator-linu
  175 root       2   0   924K   252K select   1:41  0.00%  0.00% syslogd
 7059 nik        2   0  7260K  4644K poll     1:38  0.00%  0.00% mutt
...

The output is split into two sections. The header (the first five lines) shows the PID of the last process to run, the system load averages (which are a measure of how busy the system is), the system uptime (time since the last reboot) and the current time. The other figures in the header relate to how many processes are running (47 in this case), how much memory and swap space has been taken up, and how much time the system is spending in different CPU states. Below that are a series of columns containing similar information to the output from &man.ps.1;. As before you can see the PID, the username, the amount of CPU time taken, and the command that was run. &man.top.1; also defaults to showing you the amount of memory space taken by the process. This is split into two columns, one for total size, and one for resident size—total size is how much memory the application has needed, and the resident size is how much it is actually using at the moment. In this example you can see that &netscape; has required almost 30 MB of RAM, but is currently only using 9 MB. &man.top.1; automatically updates this display every two seconds; this can be changed with the s option. Daemons, Signals, and Killing Processes When you run an editor it is easy to control the editor, tell it to load files, and so on. You can do this because the editor provides facilities to do so, and because the editor is attached to a terminal. Some programs are not designed to be run with continuous user input, and so they disconnect from the terminal at the first opportunity. For example, a web server spends all day responding to web requests; it normally does not need any input from you.
Programs that transport email from site to site are another example of this class of application. We call these programs daemons. Daemons were characters in Greek mythology; neither good nor evil, they were little attendant spirits that, by and large, did useful things for mankind, much like the web servers and mail servers of today do useful things. This is why the BSD mascot has, for a long time, been the cheerful-looking daemon with sneakers and a pitchfork. There is a convention to name programs that normally run as daemons with a trailing d. BIND is the Berkeley Internet Name Daemon (and the actual program that executes is called named), the Apache web server program is called httpd, the line printer spooling daemon is lpd and so on. This is a convention, not a hard and fast rule; for example, the main mail daemon for the Sendmail application is called sendmail, and not maild, as you might imagine. Sometimes you will need to communicate with a daemon process. These communications are called signals, and you can communicate with a daemon (or with any other running process) by sending it a signal. There are a number of different signals that you can send—some of them have a specific meaning, others are interpreted by the application, and the application's documentation will tell you how that application interprets signals. You can only send a signal to a process that you own. If you send a signal to someone else's process with &man.kill.1; or &man.kill.2;, permission will be denied. The exception to this is the root user, who can send signals to everyone's processes. FreeBSD will also send applications signals in some cases. If an application is badly written, and tries to access memory that it is not supposed to, FreeBSD sends the process the Segmentation Violation signal (SIGSEGV). If an application has used the &man.alarm.3; system call to be alerted after a period of time has elapsed then it will be sent the Alarm signal (SIGALRM), and so on. Two signals can be used to stop a process, SIGTERM and SIGKILL. SIGTERM is the polite way to kill a process; the process can catch the signal, realize that you want it to shut down, close any log files it may have open, and generally finish whatever it is doing at the time before shutting down. In some cases a process may even ignore SIGTERM if it is in the middle of some task that can not be interrupted. SIGKILL can not be ignored by a process. This is the I do not care what you are doing, stop right now signal. If you send SIGKILL to a process then FreeBSD will stop that process there and then. (Not quite true—there are a few things that can not be interrupted. For example, if the process is trying to read from a file that is on another computer on the network, and the other computer has gone away for some reason (been turned off, or the network has a fault), then the process is said to be uninterruptible. Eventually the process will time out, typically after two minutes. As soon as this time out occurs the process will be killed.) The other signals you might want to use are SIGHUP, SIGUSR1, and SIGUSR2. These are general purpose signals, and different applications will do different things when they are sent. Suppose that you have changed your web server's configuration file—you would like to tell the web server to re-read its configuration. You could stop and restart httpd, but this would result in a brief outage period on your web server, which may be undesirable.
Most daemons are written to respond to the SIGHUP signal by re-reading their configuration file. So instead of killing and restarting httpd you would send it the SIGHUP signal. Because there is no standard way to respond to these signals, different daemons will have different behavior, so be sure to read the documentation for the daemon in question. Signals are sent using the &man.kill.1; command, as this example shows. Sending a Signal to a Process This example shows how to send a signal to &man.inetd.8;. The inetd configuration file is /etc/inetd.conf, and inetd will re-read this configuration file when it is sent SIGHUP. Find the process ID of the process you want to send the signal to. Do this using &man.ps.1; and &man.grep.1;. The &man.grep.1; command is used to search through output, looking for the string you specify. This command is run as a normal user, and &man.inetd.8; is run as root, so the ax options must be given to &man.ps.1;.

&prompt.user; ps -ax | grep inetd
  198  ??  IWs    0:00.00 inetd -wW

So the &man.inetd.8; PID is 198. In some cases the grep inetd command might also occur in this output. This is because of the way &man.ps.1; has to find the list of running processes. Use &man.kill.1; to send the signal. Because &man.inetd.8; is being run by root, you must use &man.su.1; to become root first.

&prompt.user; su
Password:
&prompt.root; /bin/kill -s HUP 198

In common with most &unix; commands, &man.kill.1; will not print any output if it is successful. If you send a signal to a process that you do not own then you will see kill: PID: Operation not permitted. If you mistype the PID you will either send the signal to the wrong process, which could be bad, or, if you are lucky, you will have sent the signal to a PID that is not currently in use, and you will see kill: PID: No such process. Why Use <command>/bin/kill</command>? Many shells provide the kill command as a built-in command; that is, the shell will send the signal directly, rather than running /bin/kill. This can be very useful, but different shells have a different syntax for specifying the name of the signal to send. Rather than try to learn all of them, it can be simpler just to use the /bin/kill ... command directly. Sending other signals is very similar, just substitute TERM or KILL in the command line as necessary. Killing random processes on the system can be a bad idea. In particular, &man.init.8;, process ID 1, is very special. Running /bin/kill -s KILL 1 is a quick way to shut down your system. Always double check the arguments you run &man.kill.1; with before you press Return. Shells shells command line In FreeBSD, a lot of everyday work is done in a command line interface called a shell. A shell's main job is to take commands from the input channel and execute them. A lot of shells also have built-in functions to help everyday tasks such as file management, file globbing, command line editing, command macros, and environment variables. FreeBSD comes with a set of shells, such as sh, the Bourne Shell, and tcsh, the improved C-shell. Many other shells are available from the FreeBSD Ports Collection, such as zsh and bash. Which shell do you use? It is really a matter of taste. If you are a C programmer you might feel more comfortable with a C-like shell such as tcsh. If you have come from Linux or are new to a &unix; command line interface you might try bash.
The point is that each shell has unique properties that may or may not work with your preferred working environment, and that you have a choice of what shell to use. One common feature in a shell is filename completion. After typing the first few letters of a command or filename, you can usually have the shell automatically complete the rest of the command or filename by hitting the Tab key on the keyboard. Here is an example. Suppose you have two files called foobar and foo.bar. You want to delete foo.bar. So what you would type on the keyboard is: rm fo[Tab].[Tab]. The shell would print out rm foo[BEEP].bar. The [BEEP] is the console bell, which is the shell telling you it was unable to totally complete the filename because there is more than one match. Both foobar and foo.bar start with fo, but it was able to complete to foo. If you type in ., then hit Tab again, the shell would be able to fill in the rest of the filename for you. environment variables Another feature of the shell is the use of environment variables. Environment variables are key/value pairs stored in the shell's environment space. This space can be read by any program invoked by the shell, and thus contains a lot of program configuration. Here is a list of common environment variables and what they mean: environment variables

Variable    Description
USER        Current logged in user's name.
PATH        Colon separated list of directories to search for binaries.
DISPLAY     Network name of the X11 display to connect to, if available.
SHELL       The current shell.
TERM        The name of the user's terminal. Used to determine the capabilities of the terminal.
TERMCAP     Database entry of the terminal escape codes to perform various terminal functions.
OSTYPE      Type of operating system. e.g., FreeBSD.
MACHTYPE    The CPU architecture that the system is running on.
EDITOR      The user's preferred text editor.
PAGER       The user's preferred text pager.
MANPATH     Colon separated list of directories to search for manual pages.

Bourne shells Setting an environment variable differs somewhat from shell to shell. For example, in the C-style shells such as tcsh and csh, you would use setenv to set environment variables. Under Bourne shells such as sh and bash, you would use export to set your current environment variables. For example, to set or modify the EDITOR environment variable, under csh or tcsh a command like this would set EDITOR to /usr/local/bin/emacs:

&prompt.user; setenv EDITOR /usr/local/bin/emacs

Under Bourne shells:

&prompt.user; export EDITOR="/usr/local/bin/emacs"

You can also make most shells expand the environment variable by placing a $ character in front of it on the command line. For example, echo $TERM would print out whatever $TERM is set to, because the shell expands $TERM and passes it on to echo. Shells treat a lot of special characters, called meta-characters, as special representations of data. The most common one is the * character, which represents any number of characters in a filename. These special meta-characters can be used to do filename globbing. For example, typing in echo * is almost the same as typing in ls because the shell takes all the files that match * and puts them on the command line for echo to see. To prevent the shell from interpreting these special characters, they can be escaped from the shell by putting a backslash (\) character in front of them. echo $TERM prints whatever your terminal is set to. echo \$TERM prints $TERM as is. Changing Your Shell The easiest way to change your shell is to use the chsh command.
Running chsh will place you into the editor that is in your EDITOR environment variable; if it is not set, you will be placed in vi. Change the Shell: line accordingly. You can also give chsh the -s option; this will set your shell for you, without requiring you to enter an editor. For example, if you wanted to change your shell to bash, the following should do the trick:

&prompt.user; chsh -s /usr/local/bin/bash

The shell that you wish to use must be present in the /etc/shells file. If you have installed a shell from the ports collection, then this should have been done for you already. If you installed the shell by hand, you must do this. For example, if you installed bash by hand and placed it into /usr/local/bin, you would want to:

&prompt.root; echo "/usr/local/bin/bash" >> /etc/shells

Then rerun chsh. Text Editors text editors editors A lot of configuration in FreeBSD is done by editing text files. Because of this, it would be a good idea to become familiar with a text editor. FreeBSD comes with a few as part of the base system, and many more are available in the Ports Collection. ee editors ee The easiest and simplest editor to learn is an editor called ee, which stands for easy editor. To start ee, one would type at the command line ee filename where filename is the name of the file to be edited. For example, to edit /etc/rc.conf, type in ee /etc/rc.conf. Once inside of ee, all of the commands for manipulating the editor's functions are listed at the top of the display. The caret ^ character represents the Ctrl key on the keyboard, so ^e expands to the key combination Ctrl+e. To leave ee, hit the Esc key, then choose leave editor. The editor will prompt you to save any changes if the file has been modified. vi editors vi emacs editors emacs FreeBSD also comes with more powerful text editors such as vi as part of the base system, while other editors, like Emacs and vim, are part of the FreeBSD Ports Collection (editors/emacs and editors/vim). These editors offer much more functionality and power at the expense of being a little more complicated to learn. However, if you plan on doing a lot of text editing, learning a more powerful editor such as vim or Emacs will save you much more time in the long run. Devices and Device Nodes A device is a term used mostly for hardware-related activities in a system, including disks, printers, graphics cards, and keyboards. When FreeBSD boots, the majority of what FreeBSD displays are devices being detected. You can look through the boot messages again by viewing /var/run/dmesg.boot. For example, acd0 is the first IDE CDROM drive, while kbd0 represents the keyboard. Most of these devices in a &unix; operating system must be accessed through special files called device nodes, which are located in the /dev directory. Creating Device Nodes When adding a new device to your system, or compiling in support for additional devices, you may need to create one or more device nodes for the new devices. MAKEDEV Script On systems without DEVFS (this concerns all FreeBSD versions before 5.0), device nodes are created using the &man.MAKEDEV.8; script as shown below:

&prompt.root; cd /dev
&prompt.root; sh MAKEDEV ad1

This example would make the proper device nodes for the second IDE drive when installed. <literal>DEVFS</literal> (DEVice File System) The device file system, or DEVFS, provides access to the kernel's device namespace in the global file system namespace. Instead of having to create and modify device nodes, DEVFS maintains this particular file system for you.
See the &man.devfs.5; manual page for more information. DEVFS is used by default in FreeBSD 5.0 and above. Binary Formats To understand why &os; uses the &man.elf.5; format, you must first know a little about the three currently dominant executable formats for &unix;: &man.a.out.5; The oldest and classic &unix; object format. It uses a short and compact header with a magic number at the beginning that is often used to characterize the format (see &man.a.out.5; for more details). It contains three loaded segments: .text, .data, and .bss plus a symbol table and a string table. COFF The SVR3 object format. The header now comprises a section table, so you can have more than just .text, .data, and .bss sections. &man.elf.5; The successor to COFF, featuring multiple sections and 32-bit or 64-bit possible values. One major drawback: ELF was also designed with the assumption that there would be only one ABI per system architecture. That assumption is actually quite incorrect, and not even in the commercial SYSV world (which has at least three ABIs: SVR4, Solaris, SCO) does it hold true. FreeBSD tries to work around this problem somewhat by providing a utility for branding a known ELF executable with information about the ABI it is compliant with. See the manual page for &man.brandelf.1; for more information. FreeBSD comes from the classic camp and used the &man.a.out.5; format, a technology tried and proven through many generations of BSD releases, until the beginning of the 3.X branch. Though it was possible to build and run native ELF binaries (and kernels) on a FreeBSD system for some time before that, FreeBSD initially resisted the push to switch to ELF as the default format. Why? Well, when the Linux camp made their painful transition to ELF, it was not so much to flee the a.out executable format as it was their inflexible jump-table based shared library mechanism, which made the construction of shared libraries very difficult for vendors and developers alike. Since the ELF tools available offered a solution to the shared library problem and were generally seen as the way forward anyway, the migration cost was accepted as necessary and the transition made. FreeBSD's shared library mechanism is based more closely on Sun's &sunos; style shared library mechanism and, as such, is very easy to use. So, why are there so many different formats? Back in the dim, dark past, there was simple hardware. This simple hardware supported a simple, small system. a.out was completely adequate for the job of representing binaries on this simple system (a PDP-11). As people ported &unix; from this simple system, they retained the a.out format because it was sufficient for the early ports of &unix; to architectures like the Motorola 68k, VAXen, etc. Then some bright hardware engineer decided that if he could force software to do some sleazy tricks, then he would be able to shave a few gates off the design and allow his CPU core to run faster. While it was made to work with this new kind of hardware (known these days as RISC), a.out was ill-suited for this hardware, so many formats were developed to get to a better performance from this hardware than the limited, simple a.out format could offer. Things like COFF, ECOFF, and a few obscure others were invented and their limitations explored before things seemed to settle on ELF. In addition, program sizes were getting huge and disks (and physical memory) were still relatively small so the concept of a shared library was born. 
The VM system also became more sophisticated. While each one of these advancements was done using the a.out format, its usefulness was stretched more and more with each new feature. In addition, people wanted to dynamically load things at run time, or to junk parts of their program after the init code had run to save core memory and swap space. Languages became more sophisticated and people wanted code called automatically before main. Lots of hacks were done to the a.out format to allow all of these things to happen, and they basically worked for a time. In time, a.out was not up to handling all these problems without an ever-increasing overhead in code and complexity. While ELF solved many of these problems, it would be painful to switch from the system that basically worked. So ELF had to wait until it was more painful to remain with a.out than it was to migrate to ELF. However, as time passed, the tools from which FreeBSD derived its build tools (the assembler and loader especially) evolved in two parallel trees. The FreeBSD tree added shared libraries and fixed some bugs. The GNU folks that originally wrote these programs rewrote them and added simpler support for building cross compilers, plugging in different formats at will, and so on. Many people wanted to build cross compilers targeting FreeBSD, but they were out of luck since the older sources that FreeBSD had for as and ld were not up to the task. The new GNU toolchain (binutils) does support cross compiling, ELF, shared libraries, C++ extensions, etc. In addition, many vendors are releasing ELF binaries, and it is a good thing for FreeBSD to be able to run them. ELF is more expressive than a.out and allows more extensibility in the base system. The ELF tools are better maintained, and offer cross compilation support, which is important to many people. ELF may be a little slower than a.out, but trying to measure it can be difficult. There are also numerous details that are different between the two in how they map pages, handle init code, etc. None of these are very important, but they are differences. In time support for a.out will be moved out of the GENERIC kernel, and eventually removed from the kernel once the need to run legacy a.out programs is past. For More Information Manual Pages manual pages The most comprehensive documentation on FreeBSD is in the form of manual pages. Nearly every program on the system comes with a short reference manual explaining the basic operation and various arguments. These manuals can be viewed with the man command. Use of the man command is simple: &prompt.user; man command command is the name of the command you wish to learn about. For example, to learn more about the ls command, type: &prompt.user; man ls The online manual is divided up into numbered sections: User commands. System calls and error numbers. Functions in the C libraries. Device drivers. File formats. Games and other diversions. Miscellaneous information. System maintenance and operation commands. Kernel developers. In some cases, the same topic may appear in more than one section of the online manual. For example, there is a chmod user command and a chmod() system call. In this case, you can tell the man command which one you want by specifying the section: &prompt.user; man 1 chmod This will display the manual page for the user command chmod.
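Similarly, to display the manual page for the system call instead, specify section 2: &prompt.user; man 2 chmod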
References to a particular section of the online manual are traditionally placed in parentheses in written documentation, so &man.chmod.1; refers to the chmod user command and &man.chmod.2; refers to the system call. This is fine if you know the name of the command and simply wish to know how to use it, but what if you cannot recall the command name? You can use man to search for keywords in the command descriptions by using the -k switch: &prompt.user; man -k mail With this command you will be presented with a list of commands that have the keyword mail in their descriptions. This is actually functionally equivalent to using the apropos command. So, you are looking at all those fancy commands in /usr/bin but do not have the faintest idea what most of them actually do? Simply do: &prompt.user; cd /usr/bin &prompt.user; man -f * or &prompt.user; cd /usr/bin &prompt.user; whatis * which does the same thing. GNU Info Files Free Software Foundation FreeBSD includes many applications and utilities produced by the Free Software Foundation (FSF). In addition to manual pages, these programs come with more extensive hypertext documents called info files which can be viewed with the info command or, if you installed emacs, the info mode of emacs. To use the &man.info.1; command, simply type: &prompt.user; info For a brief introduction, type h. For a quick command reference, type ?.
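To jump directly to a particular document rather than starting at info's top-level menu, name it on the command line. For example, assuming the texinfo documentation for gzip is installed on your system, the following would open it directly: &prompt.user; info gzip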
diff --git a/en_US.ISO8859-1/books/handbook/config/chapter.sgml b/en_US.ISO8859-1/books/handbook/config/chapter.sgml index f2411afab5..f3b2831e54 100644 --- a/en_US.ISO8859-1/books/handbook/config/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/config/chapter.sgml @@ -1,3173 +1,3173 @@ Chern Lee Written by Mike Smith Based on a tutorial written by Matt Dillon Also based on tuning(7) written by Configuration and Tuning Synopsis system configuration system optimization One of the important aspects of &os; is system configuration. Correct system configuration will help prevent headaches during future upgrades. This chapter will explain much of the &os; configuration process, including some of the parameters which can be set to tune a &os; system. After reading this chapter, you will know: How to efficiently work with file systems and swap partitions. The basics of rc.conf configuration and /usr/local/etc/rc.d startup systems. How to configure and test a network card. How to configure virtual hosts on your network devices. How to use the various configuration files in /etc. How to tune &os; using sysctl variables. How to tune disk performance and modify kernel limitations. Before reading this chapter, you should: Understand &unix; and &os; basics (). Be familiar with the basics of kernel configuration/compilation (). Initial Configuration Partition Layout partition layout /etc /var /usr Base Partitions When laying out file systems with &man.disklabel.8; or &man.sysinstall.8;, remember that hard drives transfer data faster from the outer tracks than from the inner ones. Thus smaller and heavier-accessed file systems should be closer to the outside of the drive, while larger partitions like /usr should be placed toward the inner part of the disk. It is a good idea to create partitions in a similar order to: root, swap, /var, /usr. The size of /var reflects the intended machine usage. /var is used to hold mailboxes, log files, and printer spools. Mailboxes and log files can grow to unexpected sizes depending on how many users exist and how long log files are kept. Most users would never require a gigabyte, but remember that /var/tmp must be large enough to contain packages. The /usr partition holds many of the files required to support the system, including the &man.ports.7; collection (recommended) and the source code (optional), both of which are optional at install time. At least 2 gigabytes would be recommended for this partition. When selecting partition sizes, keep the space requirements in mind. Running out of space in one partition while barely using another can be a hassle. Some users have found that &man.sysinstall.8;'s Auto-defaults partition sizer will sometimes select smaller than adequate /var and / partitions. Partition wisely and generously. Swap Partition swap sizing swap partition As a rule of thumb, the swap partition should be about double the size of system memory (RAM). For example, if the machine has 128 megabytes of memory, the swap partition should be 256 megabytes. Systems with less memory may perform better with more swap. Less than 256 megabytes of swap is not recommended and memory expansion should be considered. The kernel's VM paging algorithms are tuned to perform best when the swap partition is at least two times the size of main memory. Configuring too little swap can lead to inefficiencies in the VM page scanning code and might create issues later if more memory is added.
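To see how this rule of thumb works out on a running system, the &man.swapinfo.8; utility lists the configured swap devices and their usage; the device name and numbers below are purely illustrative: &prompt.user; swapinfo Device 1K-blocks Used Avail Capacity /dev/ad0s1b 262144 0 262144 0%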
On larger systems with multiple SCSI disks (or multiple IDE disks operating on different controllers), it is recommended that swap be configured on each drive (up to four drives). The swap partitions should be approximately the same size. The kernel can handle arbitrary sizes but internal data structures scale to 4 times the largest swap partition. Keeping the swap partitions near the same size will allow the kernel to optimally stripe swap space across disks. Large swap sizes are fine, even if swap is not used much; they can make it easier to recover from a runaway program before being forced to reboot. Why Partition? Some users think a single large partition will be fine, but there are several reasons why this is a bad idea. First, each partition has different operational characteristics and separating them allows the file system to tune accordingly. For example, the root and /usr partitions are read-mostly, with little writing, while a lot of reading and writing can occur in /var and /var/tmp. By properly partitioning a system, fragmentation introduced in the smaller write-heavy partitions will not bleed over into the mostly-read partitions. Keeping the write-loaded partitions closer to the disk's edge will increase I/O performance in the partitions where it occurs the most. Now while I/O performance in the larger partitions may be needed, shifting them more toward the edge of the disk will not lead to a significant performance improvement over moving /var to the edge. Finally, there are safety concerns. A smaller, neater root partition which is mostly read-only has a greater chance of surviving a bad crash. Core Configuration rc files rc.conf The principal location for system configuration information is within /etc/rc.conf. This file contains a wide range of configuration information, principally used at system startup to configure the system. Its name directly implies this; it is configuration information for the rc* files. An administrator should make entries in the rc.conf file to override the default settings from /etc/defaults/rc.conf. The defaults file should not be copied verbatim to /etc - it contains default values, not examples. All system-specific changes should be made in the rc.conf file itself. A number of strategies may be applied in clustered applications to separate site-wide configuration from system-specific configuration in order to keep administration overhead down. The recommended approach is to place site-wide configuration into another file, such as /etc/rc.conf.site, and then include this file into /etc/rc.conf, which will contain only system-specific information. As rc.conf is read by &man.sh.1; it is trivial to achieve this. For example: rc.conf: . /etc/rc.conf.site hostname="node15.example.com" network_interfaces="fxp0 lo0" ifconfig_fxp0="inet 10.1.1.1" rc.conf.site: defaultrouter="10.1.1.254" saver="daemon" blanktime="100" The rc.conf.site file can then be distributed to every system using rsync or a similar program, while the rc.conf file remains unique. Upgrading the system using &man.sysinstall.8; or make world will not overwrite the rc.conf file, so system configuration information will not be lost. Application Configuration Typically, installed applications have their own configuration files, with their own syntax, etc. It is important that these files be kept separate from the base system, so that they may be easily located and managed by the package management tools. /usr/local/etc Typically, these files are installed in /usr/local/etc.
In the case where an application has a large number of configuration files, a subdirectory will be created to hold them. Normally, when a port or package is installed, sample configuration files are also installed. These are usually identified with a .default suffix. If there are no existing configuration files for the application, they will be created by copying the .default files. For example, consider the contents of the directory /usr/local/etc/apache: -rw-r--r-- 1 root wheel 2184 May 20 1998 access.conf -rw-r--r-- 1 root wheel 2184 May 20 1998 access.conf.default -rw-r--r-- 1 root wheel 9555 May 20 1998 httpd.conf -rw-r--r-- 1 root wheel 9555 May 20 1998 httpd.conf.default -rw-r--r-- 1 root wheel 12205 May 20 1998 magic -rw-r--r-- 1 root wheel 12205 May 20 1998 magic.default -rw-r--r-- 1 root wheel 2700 May 20 1998 mime.types -rw-r--r-- 1 root wheel 2700 May 20 1998 mime.types.default -rw-r--r-- 1 root wheel 7980 May 20 1998 srm.conf -rw-r--r-- 1 root wheel 7933 May 20 1998 srm.conf.default The file sizes show that only the srm.conf file has been changed. A later update of the Apache port would not overwrite this changed file. Tom Rhodes Contributed by Starting Services services Many users choose to install third party software on &os; from the Ports Collection. In many of these situations it may be necessary to configure the software in a manner which will allow it to be started upon system initialization. Services such as mail/postfix or www/apache13 are just two of the many software packages which may be started during system initialization. This section explains the procedures available for starting third party software. In &os;, most included services, such as &man.cron.8;, are started through the system start up scripts. These scripts may differ depending on &os; or vendor version; however, the most important aspect to consider is that their start up configuration can be handled through simple startup scripts. Before the advent of rcNG, applications would drop a simple start up script into the /usr/local/etc/rc.d directory which would be read by the system initialization scripts. These scripts would then be executed during the latter stages of system start up. While many individuals have spent hours trying to merge the old configuration style into the new system, the fact remains that some third party utilities still require a script simply dropped into the aforementioned directory. The subtle differences in the scripts depend on whether or not rcNG is being used. Prior to &os; 5.1 the old configuration style is used and in almost all cases a new style script would do just fine. While every script must meet some minimal requirements, most of the time these requirements are &os; version agnostic. Each script must have a .sh extension and must be executable by the system. The latter may be achieved by using the chmod command and setting permissions of 755. There should also be, at minimum, an option to start the application and an option to stop the application. The simplest start up script would probably look a little bit like this one: #!/bin/sh echo -n ' utility' case "$1" in start) /usr/local/bin/utility ;; stop) kill -9 `cat /var/run/utility.pid` ;; *) - echo "Usage: `basename $0` {start|stop}" >&2 + echo "Usage: `basename $0` {start|stop}" >&2 exit 64 ;; esac exit 0 This script provides stop and start options for the application, referred to here simply as utility.
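Assuming the script above was saved as /usr/local/etc/rc.d/utility.sh (a hypothetical name), it would be made executable with the permissions mentioned earlier: &prompt.root; chmod 755 /usr/local/etc/rc.d/utility.sh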
It could be started manually with: &prompt.root; /usr/local/etc/rc.d/utility.sh start While not all third party software requires an enabling line in rc.conf, almost every day a new port will be modified to accept this configuration. Check the final output of the installation for more information on a specific application. Some third party software will provide start up scripts which permit the application to be used with rcNG; this will be discussed in the next section. Extended Application Configuration Now that &os; includes rcNG, configuration of application start up has become more flexible; indeed, it has become a bit more in depth. Using the key words discussed in the rcNG section, applications may now be set to start after certain other services, for example DNS; may permit extra flags to be passed through rc.conf in place of flags hard coded in the start up script; etc. A basic script may look similar to the following: #!/bin/sh # # PROVIDE: utility # REQUIRE: DAEMON # BEFORE: LOGIN # KEYWORD: FreeBSD shutdown # # DO NOT CHANGE THESE DEFAULT VALUES HERE # SET THEM IN THE /etc/rc.conf FILE # utility_enable=${utility_enable-"NO"} utility_flags=${utility_flags-""} utility_pidfile=${utility_pidfile-"/var/run/utility.pid"} . /etc/rc.subr name="utility" rcvar=`set_rcvar` command="/usr/local/sbin/utility" load_rc_config $name pidfile="${utility_pidfile}" start_cmd="echo \"Starting ${name}.\"; /usr/bin/nice -5 ${command} ${utility_flags} ${command_args}" run_rc_command "$1" This script will ensure that the provided utility will be started before the login service but after the daemon service. It also provides a method for setting and tracking the PID, or process ID file. This application could then have the following line placed in /etc/rc.conf: utility_enable="YES" This new method also allows for easier manipulation of the command line arguments, inclusion of the default functions provided in /etc/rc.subr, compatibility with the &man.rcorder.8; utility, and provides for easier configuration via the rc.conf file. In essence, this script could even be placed in the /etc/rc.d directory. Yet, that has the potential to upset the &man.mergemaster.8; utility when used in conjunction with software upgrades. Using Services to Start Services Other services, such as POP3 server daemons, IMAP, etc. could be started using &man.inetd.8;. This involves installing the service utility from the Ports Collection with a configuration line appended to the /etc/inetd.conf file, or uncommenting one of the current configuration lines. Working with inetd and its configuration is described in depth in the inetd section. In some cases, it may be more practical to use the &man.cron.8; daemon to start system services. This approach has a number of advantages because cron runs these processes as the owner of the crontab file. This allows regular users to start and maintain some applications. The cron utility provides a unique feature, @reboot, which may be used in place of the time specification. This will cause the job to be run when &man.cron.8; is started, normally during system initialization. Tom Rhodes Contributed by Configuring the <command>cron</command> Utility cron configuration One of the most useful utilities in &os; is &man.cron.8;. The cron utility runs in the background and constantly checks the /etc/crontab file. The cron utility also checks the /var/cron/tabs directory, in search of new crontab files.
These crontab files store information about specific functions which cron is supposed to perform at certain times. The cron utility uses two different types of configuration files: the system crontab and user crontabs. The only difference between these two formats is the sixth field. In the system crontab, the sixth field is the name of a user for the command to run as. This gives the system crontab the ability to run commands as any user. In a user crontab, the sixth field is the command to run, and all commands run as the user who created the crontab; this is an important security feature. User crontabs allow individual users to schedule tasks without the need for root privileges. Commands in a user's crontab run with the permissions of the user who owns the crontab. The root user can have a user crontab just like any other user; this is separate from /etc/crontab (the system crontab). Because of the system crontab, there is usually no need to create a user crontab for root. Let us take a look at the /etc/crontab file (the system crontab): # /etc/crontab - root's crontab for &os; # # $&os;: src/etc/crontab,v 1.32 2002/11/22 16:13:39 tom Exp $ # # SHELL=/bin/sh PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin HOME=/var/log # # #minute hour mday month wday who command # # */5 * * * * root /usr/libexec/atrun Like most &os; configuration files, the # character represents a comment. A comment can be placed in the file as a reminder of what and why a desired action is performed. Comments cannot be on the same line as a command or else they will be interpreted as part of the command; they must be on a new line. Blank lines are ignored. First, the environment must be defined. The equals (=) character is used to define any environment settings, as with this example where it is used for the SHELL, PATH, and HOME options. If the shell line is omitted, cron will use the default, which is sh. If the PATH variable is omitted, no default will be used and file locations will need to be absolute. If HOME is omitted, cron will use the invoking user's home directory. This line defines a total of seven fields. Listed here are the values minute, hour, mday, month, wday, who, and command. These are almost all self-explanatory. minute is the minute at which the command will be run. hour is similar to the minute option, just in hours. mday stands for day of the month. month is similar to hour and minute, as it designates the month. The wday option stands for day of the week. All these fields must be numeric values, and follow the twenty-four hour clock. The who field is special, and only exists in the /etc/crontab file. This field specifies which user the command should be run as. When a user installs his or her crontab file, they will not have this field. Finally, the command option is listed. This is the last field, so naturally it should designate the command to be executed. This last line will define the values discussed above. Notice here we have a */5 listing, followed by several more * characters. These * characters mean first-last, and can be interpreted as every time. So, judging by this line, it is apparent that the atrun command is to be invoked by root every five minutes regardless of what day or month it is. For more information on the atrun command, see the &man.atrun.8; manual page. Commands can have any number of flags passed to them; however, commands which extend to multiple lines need to be broken with the backslash \ continuation character.
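For comparison, a single entry in a user crontab might look like the following; the script path is hypothetical, and there is no who field because everything in a user crontab runs as the crontab's owner. This entry would run the script at 2:00 AM every day: 0 2 * * * /home/user/backup.sh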
This is the basic layout of every crontab file, although there is one thing different about the system version: field number six, where we specified the username, only exists in the system /etc/crontab file and should be omitted for individual user crontab files. Installing a Crontab You must not use the procedure described here to edit/install the system crontab. Simply use your favorite editor: the cron utility will notice that the file has changed and immediately begin using the updated version. See this FAQ entry for more information. To install a freshly written user crontab, first use your favorite editor to create a file in the proper format, and then use the crontab utility. The most common usage is: &prompt.user; crontab crontab-file In this example, crontab-file is the filename of a crontab that was previously created. There is also an option to list installed crontab files: just pass the -l option to crontab and look over the output. For users who wish to begin their own crontab file from scratch, without the use of a template, the crontab -e option is available. This will invoke the selected editor with an empty file. When the file is saved, it will be automatically installed by the crontab command. If you later want to remove your user crontab completely, use crontab with the -r option. Tom Rhodes Contributed by Using rc under &os; 5.X and newer &os; has recently integrated the NetBSD rc.d system for system initialization. Users should notice the files listed in the /etc/rc.d directory. Many of these files are for basic services which can be controlled with the start, stop, and restart options. For instance, &man.sshd.8; can be restarted with the following command: &prompt.root; /etc/rc.d/sshd restart This procedure is similar for other services. Of course, services are usually started automatically as specified in &man.rc.conf.5;. For example, enabling the Network Address Translation daemon at startup is as simple as adding the following line to /etc/rc.conf: natd_enable="YES" If a natd_enable="NO" line is already present, then simply change the NO to YES. The rc scripts will automatically load any other dependent services during the next reboot, as described below. Since the rc.d system is primarily intended to start/stop services at system startup/shutdown time, the standard start, stop, and restart options will only perform their action if the appropriate /etc/rc.conf variables are set. For instance the above sshd restart command will only work if sshd_enable is set to YES in /etc/rc.conf. To start, stop, or restart a service regardless of the settings in /etc/rc.conf, the commands should be prefixed with force. For instance to restart sshd regardless of the current /etc/rc.conf setting, execute the following command: &prompt.root; /etc/rc.d/sshd forcerestart It is easy to check if a service is enabled in /etc/rc.conf by running the appropriate rc.d script with the rcvar option. Thus, an administrator can check that sshd is in fact enabled in /etc/rc.conf by running: &prompt.root; /etc/rc.d/sshd rcvar # sshd $sshd_enable=YES The second line (# sshd) is the output from the sshd command, not a root console. To determine if a service is running, a status option is available. For instance to verify that sshd is actually started: &prompt.root; /etc/rc.d/sshd status sshd is running as pid 433. It is also possible to reload a service. This will attempt to send a signal to an individual service, forcing the service to reload its configuration files. In most cases this means sending the service a SIGHUP signal.
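For example, &man.sshd.8; supports this, so it can be asked to re-read its configuration file without fully restarting: &prompt.root; /etc/rc.d/sshd reload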
The rc.d system is not only used for network services, it also contributes to most of the system initialization. For instance, consider the bgfsck file. When this script is executed, it will print out the following message: Starting background file system checks in 60 seconds. Therefore this file is used for background file system checks, which are done only during system initialization. Many system services depend on other services to function properly. For example, NIS and other RPC-based services may fail to start until after the rpcbind (portmapper) service has started. To resolve this issue, information about dependencies and other meta-data is included in the comments at the top of each startup script. The &man.rcorder.8; program is then used to parse these comments during system initialization to determine the order in which system services should be invoked to satisfy the dependencies. The following words may be included at the top of each startup file: PROVIDE: Specifies the services this file provides. REQUIRE: Lists services which are required for this service. This file will run after the specified services. BEFORE: Lists services which depend on this service. This file will run before the specified services. KEYWORD: &os; or NetBSD. This is used for *BSD dependent features. By using this method, an administrator can easily control system services without the hassle of the runlevels found in some other &unix; operating systems. Additional information about the rc.d system can be found in the &man.rc.8; and &man.rc.subr.8; manual pages. Marc Fonvieille Contributed by Setting Up Network Interface Cards network cards configuration Nowadays we cannot think about a computer without thinking about a network connection. Adding and configuring a network card is a common task for any &os; administrator. Locating the Correct Driver network cards driver Before you begin, you should know the model of the card you have, the chip it uses, and whether it is a PCI or ISA card. &os; supports a wide variety of both PCI and ISA cards. Check the Hardware Compatibility List for your release to see if your card is supported. Once you are sure your card is supported, you need to determine the proper driver for the card. /usr/src/sys/conf/NOTES and /usr/src/sys/arch/conf/NOTES will give you the list of network interface drivers with some information about the supported chipsets/cards. If you have doubts about which driver is the correct one, read the manual page of the driver. The manual page will give you more information about the supported hardware and even the possible problems that could occur. NOTES does not exist on &os; 4.X. Instead, check the LINT file for information about various network interfaces. See for a more detailed summary of NOTES versus LINT. If you own a common card, most of the time you will not have to look very hard for a driver.
Drivers for common network cards are present in the GENERIC kernel, so your card should show up during boot, like so: dc0: <82c169 PNIC 10/100BaseTX> port 0xa000-0xa0ff mem 0xd3800000-0xd38000ff irq 15 at device 11.0 on pci0 dc0: Ethernet address: 00:a0:cc:da:da:da miibus0: <MII bus> on dc0 ukphy0: <Generic IEEE 802.3u media interface> on miibus0 ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto dc1: <82c169 PNIC 10/100BaseTX> port 0x9800-0x98ff mem 0xd3000000-0xd30000ff irq 11 at device 12.0 on pci0 dc1: Ethernet address: 00:a0:cc:da:da:db miibus1: <MII bus> on dc1 ukphy1: <Generic IEEE 802.3u media interface> on miibus1 ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto In this example, we see that two cards using the &man.dc.4; driver are present on the system. If the driver for your NIC is not present in GENERIC, you will need to load the proper driver to use your NIC. This may be accomplished in one of two ways: The easiest way is to simply load a kernel module for your network card with &man.kldload.8;. Not all NIC drivers are available as modules; notable examples of devices for which modules do not exist are ISA cards. Alternatively, you may statically compile the support for your card into your kernel. Check /usr/src/sys/conf/NOTES, /usr/src/sys/arch/conf/NOTES and the manual page of the driver to know what to add in your kernel configuration file. For more information about recompiling your kernel, please see . If your card was detected at boot by your kernel (GENERIC) you do not have to build a new kernel. Configuring the Network Card network cards configuration Once the right driver is loaded for the network card, the card needs to be configured. As with many other things, the network card may have been configured at installation time by sysinstall. To display the configuration for the network interfaces on your system, enter the following command: &prompt.user; ifconfig dc0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500 inet 192.168.1.3 netmask 0xffffff00 broadcast 192.168.1.255 ether 00:a0:cc:da:da:da media: Ethernet autoselect (100baseTX <full-duplex>) status: active dc1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500 inet 10.0.0.1 netmask 0xffffff00 broadcast 10.0.0.255 ether 00:a0:cc:da:da:db media: Ethernet 10baseT/UTP status: no carrier lp0: flags=8810<POINTOPOINT,SIMPLEX,MULTICAST> mtu 1500 lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384 inet 127.0.0.1 netmask 0xff000000 tun0: flags=8010<POINTOPOINT,MULTICAST> mtu 1500 Old versions of &os; may require the -a option following &man.ifconfig.8;; for more details about the correct syntax of &man.ifconfig.8;, please refer to the manual page. Note also that entries concerning IPv6 (inet6 etc.) were omitted in this example. In this example, the following devices were displayed: dc0: The first Ethernet interface dc1: The second Ethernet interface lp0: The parallel port interface lo0: The loopback device tun0: The tunnel device used by ppp &os; names the network card using the driver name followed by the order in which the card is detected at kernel boot. For example sis2 would be the third network card on the system using the &man.sis.4; driver. In this example, the dc0 device is up and running. The key indicators are: UP means that the card is configured and ready. The card has an Internet (inet) address (in this case 192.168.1.3). It has a valid subnet mask (netmask; 0xffffff00 is the same as 255.255.255.0).
It has a valid broadcast address (in this case, 192.168.1.255). The MAC address of the card (ether) is 00:a0:cc:da:da:da. The physical media selection is in autoselect mode (media: Ethernet autoselect (100baseTX <full-duplex>)). We see that dc1 was configured to run with 10baseT/UTP media. For more information on available media types for a driver, please refer to its manual page. The status of the link (status) is active, i.e. the carrier is detected. For dc1, we see status: no carrier. This is normal when an Ethernet cable is not plugged into the card. If the &man.ifconfig.8; output had shown something similar to: dc0: flags=8843<BROADCAST,SIMPLEX,MULTICAST> mtu 1500 ether 00:a0:cc:da:da:da it would indicate the card has not been configured. To configure your card, you need root privileges. The network card configuration can be done from the command line with &man.ifconfig.8;, but then you would have to do it after each reboot of the system. The file /etc/rc.conf is where the network card's configuration should be added. Open /etc/rc.conf in your favorite editor. You need to add a line for each network card present on the system; for example, in our case, we added these lines: ifconfig_dc0="inet 192.168.1.3 netmask 255.255.255.0" ifconfig_dc1="inet 10.0.0.1 netmask 255.255.255.0 media 10baseT/UTP" You have to replace dc0, dc1, and so on, with the correct device for your cards, and the addresses with the proper ones. You should read the card driver and &man.ifconfig.8; manual pages for more details about the allowed options and also the &man.rc.conf.5; manual page for more information on the syntax of /etc/rc.conf. If you configured the network during installation, some lines about the network card(s) may already be present. Double check /etc/rc.conf before adding any lines. You will also have to edit the file /etc/hosts to add the names and the IP addresses of various machines of the LAN, if they are not already there. For more information please refer to &man.hosts.5; and to /usr/share/examples/etc/hosts. Testing and Troubleshooting Once you have made the necessary changes in /etc/rc.conf, you should reboot your system. This will allow the change(s) to the interface(s) to be applied, and verify that the system restarts without any configuration errors. Once the system has been rebooted, you should test the network interfaces. Testing the Ethernet Card network cards testing To verify that an Ethernet card is configured correctly, you have to try two things. First, ping the interface itself, and then ping another machine on the LAN.
First test the local interface: &prompt.user; ping -c5 192.168.1.3 PING 192.168.1.3 (192.168.1.3): 56 data bytes 64 bytes from 192.168.1.3: icmp_seq=0 ttl=64 time=0.082 ms 64 bytes from 192.168.1.3: icmp_seq=1 ttl=64 time=0.074 ms 64 bytes from 192.168.1.3: icmp_seq=2 ttl=64 time=0.076 ms 64 bytes from 192.168.1.3: icmp_seq=3 ttl=64 time=0.108 ms 64 bytes from 192.168.1.3: icmp_seq=4 ttl=64 time=0.076 ms --- 192.168.1.3 ping statistics --- 5 packets transmitted, 5 packets received, 0% packet loss round-trip min/avg/max/stddev = 0.074/0.083/0.108/0.013 ms Now we have to ping another machine on the LAN: &prompt.user; ping -c5 192.168.1.2 PING 192.168.1.2 (192.168.1.2): 56 data bytes 64 bytes from 192.168.1.2: icmp_seq=0 ttl=64 time=0.726 ms 64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=0.766 ms 64 bytes from 192.168.1.2: icmp_seq=2 ttl=64 time=0.700 ms 64 bytes from 192.168.1.2: icmp_seq=3 ttl=64 time=0.747 ms 64 bytes from 192.168.1.2: icmp_seq=4 ttl=64 time=0.704 ms --- 192.168.1.2 ping statistics --- 5 packets transmitted, 5 packets received, 0% packet loss round-trip min/avg/max/stddev = 0.700/0.729/0.766/0.025 ms You could also use the machine name instead of 192.168.1.2 if you have set up the /etc/hosts file. Troubleshooting network cards troubleshooting Troubleshooting hardware and software configurations is always a pain, and a pain which can be alleviated by checking the simple things first. Is your network cable plugged in? Have you properly configured the network services? Did you configure the firewall correctly? Is the card you are using supported by &os;? Always check the hardware notes before sending off a bug report. Update your version of &os; to the latest STABLE version. Check the mailing list archives, or perhaps search the Internet. If the card works, yet performance is poor, it would be worthwhile to read over the &man.tuning.7; manual page. You can also check the network configuration as incorrect network settings can cause slow connections. Some users experience one or two device timeout messages, which is normal for some cards. If they continue, or are bothersome, you may wish to be sure the device is not conflicting with another device. Double check the cable connections. Perhaps you may just need to get another card. At times, users see a few watchdog timeout errors. The first thing to do here is to check your network cable. Many cards require a PCI slot which supports Bus Mastering. On some old motherboards, only one PCI slot allows it (usually slot 0). Check the network card and the motherboard documentation to determine if that may be the problem. No route to host messages occur if the system is unable to route a packet to the destination host. This can happen if no default route is specified, or if a cable is unplugged. Check the output of netstat -rn and make sure there is a valid route to the host you are trying to reach. If there is not, read on to . ping: sendto: Permission denied error messages are often caused by a misconfigured firewall. If ipfw is enabled in the kernel but no rules have been defined, then the default policy is to deny all traffic, even ping requests! Read on to for more information. Sometimes performance of the card is poor, or below average. In these cases it is best to set the media selection mode from autoselect to the correct media selection. While this usually works for most hardware, it may not resolve this issue for everyone. Again, check all the network settings, and read over the &man.tuning.7; manual page. 
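As a sketch of that last suggestion, using the dc0 interface from the earlier examples, the media could be locked to 100baseTX full-duplex from the command line, and the same media options could then be appended to the interface's line in /etc/rc.conf to make the setting permanent: &prompt.root; ifconfig dc0 media 100baseTX mediaopt full-duplex ifconfig_dc0="inet 192.168.1.3 netmask 255.255.255.0 media 100baseTX mediaopt full-duplex"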
Virtual Hosts virtual hosts IP aliases A very common use of &os; is virtual site hosting, where one server appears to the network as many servers. This is achieved by assigning multiple network addresses to a single interface. A given network interface has one real address, and may have any number of alias addresses. These aliases are normally added by placing alias entries in /etc/rc.conf. An alias entry for the interface fxp0 looks like: ifconfig_fxp0_alias0="inet xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx" Note that alias entries must start with alias0 and proceed upwards in order (for example, _alias1, _alias2, and so on). The configuration process will stop at the first missing number. The calculation of alias netmasks is important, but fortunately quite simple. For a given interface, there must be one address which correctly represents the network's netmask. Any other addresses which fall within this network must have a netmask of all 1s (expressed as either 255.255.255.255 or 0xffffffff). For example, consider the case where the fxp0 interface is connected to two networks, the 10.1.1.0 network with a netmask of 255.255.255.0 and the 202.0.75.16 network with a netmask of 255.255.255.240. We want the system to appear at 10.1.1.1 through 10.1.1.5 and at 202.0.75.17 through 202.0.75.20. As noted above, only the first address in a given network range (in this case, 10.1.1.1 and 202.0.75.17) should have a real netmask; all the rest (10.1.1.2 through 10.1.1.5 and 202.0.75.18 through 202.0.75.20) must be configured with a netmask of 255.255.255.255. The following /etc/rc.conf entries configure the adapter correctly for this arrangement: ifconfig_fxp0="inet 10.1.1.1 netmask 255.255.255.0" ifconfig_fxp0_alias0="inet 10.1.1.2 netmask 255.255.255.255" ifconfig_fxp0_alias1="inet 10.1.1.3 netmask 255.255.255.255" ifconfig_fxp0_alias2="inet 10.1.1.4 netmask 255.255.255.255" ifconfig_fxp0_alias3="inet 10.1.1.5 netmask 255.255.255.255" ifconfig_fxp0_alias4="inet 202.0.75.17 netmask 255.255.255.240" ifconfig_fxp0_alias5="inet 202.0.75.18 netmask 255.255.255.255" ifconfig_fxp0_alias6="inet 202.0.75.19 netmask 255.255.255.255" ifconfig_fxp0_alias7="inet 202.0.75.20 netmask 255.255.255.255" Configuration Files <filename>/etc</filename> Layout There are a number of directories in which configuration information is kept. These include: /etc Generic system configuration information; data here is system-specific. /etc/defaults Default versions of system configuration files. /etc/mail Extra &man.sendmail.8; configuration, other MTA configuration files. /etc/ppp Configuration for both user- and kernel-ppp programs. /etc/namedb Default location for &man.named.8; data. Normally named.conf and zone files are stored here. /usr/local/etc Configuration files for installed applications. May contain per-application subdirectories. /usr/local/etc/rc.d Start/stop scripts for installed applications. /var/db Automatically generated system-specific database files, such as the package database, the locate database, and so on Hostnames hostname DNS <filename>/etc/resolv.conf</filename> resolv.conf /etc/resolv.conf dictates how &os;'s resolver accesses the Internet Domain Name System (DNS). The most common entries to resolv.conf are: nameserver The IP address of a name server the resolver should query. The servers are queried in the order listed with a maximum of three. search Search list for hostname lookup. This is normally determined by the domain of the local hostname. domain The local domain name.
A typical resolv.conf: search example.com nameserver 147.11.1.11 nameserver 147.11.100.30 Only one of the search and domain options should be used. If you are using DHCP, &man.dhclient.8; usually rewrites resolv.conf with information received from the DHCP server. <filename>/etc/hosts</filename> hosts /etc/hosts is a simple text database reminiscent of the old Internet. It works in conjunction with DNS and NIS providing name to IP address mappings. Local computers connected via a LAN can be placed in here for simple naming purposes instead of setting up a &man.named.8; server. Additionally, /etc/hosts can be used to provide a local record of Internet names, reducing the need to query externally for commonly accessed names. # $&os;$ # # Host Database # This file should contain the addresses and aliases # for local hosts that share this file. # In the presence of the domain name service or NIS, this file may # not be consulted at all; see /etc/nsswitch.conf for the resolution order. # # ::1 localhost localhost.my.domain myname.my.domain 127.0.0.1 localhost localhost.my.domain myname.my.domain # # Imaginary network. #10.0.0.2 myname.my.domain myname #10.0.0.3 myfriend.my.domain myfriend # # According to RFC 1918, you can use the following IP networks for # private nets which will never be connected to the Internet: # # 10.0.0.0 - 10.255.255.255 # 172.16.0.0 - 172.31.255.255 # 192.168.0.0 - 192.168.255.255 # # In case you want to be able to connect to the Internet, you need # real official assigned numbers. PLEASE PLEASE PLEASE do not try # to invent your own network numbers but instead get one from your # network provider (if any) or from the Internet Registry (ftp to # rs.internic.net, directory `/templates'). # /etc/hosts takes on the simple format of: [Internet address] [official hostname] [alias1] [alias2] ... For example: 10.0.0.1 myRealHostname.example.com myRealHostname foobar1 foobar2 Consult &man.hosts.5; for more information. Log File Configuration log files <filename>syslog.conf</filename> syslog.conf syslog.conf is the configuration file for the &man.syslogd.8; program. It indicates which types of syslog messages are logged to particular log files. # $&os;$ # # Spaces ARE valid field separators in this file. However, # other *nix-like systems still insist on using tabs as field # separators. If you are sharing this file between systems, you # may want to use only tabs as field separators here. # Consult the syslog.conf(5) manual page. *.err;kern.debug;auth.notice;mail.crit /dev/console *.notice;kern.debug;lpr.info;mail.crit;news.err /var/log/messages security.* /var/log/security mail.info /var/log/maillog lpr.info /var/log/lpd-errs cron.* /var/log/cron *.err root *.notice;news.err root *.alert root *.emerg * # uncomment this to log all writes to /dev/console to /var/log/console.log #console.info /var/log/console.log # uncomment this to enable logging of all log messages to /var/log/all.log #*.* /var/log/all.log # uncomment this to enable logging to a remote log host named loghost #*.* @loghost # uncomment these if you're running inn # news.crit /var/log/news/news.crit # news.err /var/log/news/news.err # news.notice /var/log/news/news.notice !startslip *.* /var/log/slip.log !ppp *.* /var/log/ppp.log Consult the &man.syslog.conf.5; manual page for more information. <filename>newsyslog.conf</filename> newsyslog.conf newsyslog.conf is the configuration file for &man.newsyslog.8;, a program that is normally scheduled to run by &man.cron.8;.
&man.newsyslog.8; determines when log files require archiving or rearranging. logfile is moved to logfile.0, logfile.0 is moved to logfile.1, and so on. Alternatively, the log files may be archived in &man.gzip.1; format causing them to be named: logfile.0.gz, logfile.1.gz, and so on. newsyslog.conf indicates which log files are to be managed, how many are to be kept, and when they are to be touched. Log files can be rearranged and/or archived when they have either reached a certain size, or at a certain periodic time/date. # configuration file for newsyslog # $&os;$ # # filename [owner:group] mode count size when [ZB] [/pid_file] [sig_num] /var/log/cron 600 3 100 * Z /var/log/amd.log 644 7 100 * Z /var/log/kerberos.log 644 7 100 * Z /var/log/lpd-errs 644 7 100 * Z /var/log/maillog 644 7 * @T00 Z /var/log/sendmail.st 644 10 * 168 B /var/log/messages 644 5 100 * Z /var/log/all.log 600 7 * @T00 Z /var/log/slip.log 600 3 100 * Z /var/log/ppp.log 600 3 100 * Z /var/log/security 600 10 100 * Z /var/log/wtmp 644 3 * @01T05 B /var/log/daily.log 640 7 * @T00 Z /var/log/weekly.log 640 5 1 $W6D0 Z /var/log/monthly.log 640 12 * $M1D0 Z /var/log/console.log 640 5 100 * Z Consult the &man.newsyslog.8; manual page for more information. <filename>sysctl.conf</filename> sysctl.conf sysctl sysctl.conf looks much like rc.conf. Values are set in a variable=value form. The specified values are set after the system goes into multi-user mode. Not all variables are settable in this mode. A sample sysctl.conf turning off logging of fatal signal exits and letting Linux programs know they are really running under &os;: kern.logsigexit=0 # Do not log fatal signal exits (e.g. sig 11) compat.linux.osname=&os; compat.linux.osrelease=4.3-STABLE Tuning with sysctl sysctl tuning with sysctl &man.sysctl.8; is an interface that allows you to make changes to a running &os; system. This includes many advanced options of the TCP/IP stack and virtual memory system that can dramatically improve performance for an experienced system administrator. Over five hundred system variables can be read and set using &man.sysctl.8;. At its core, &man.sysctl.8; serves two functions: to read and to modify system settings. To view all readable variables: &prompt.user; sysctl -a To read a particular variable, for example, kern.maxproc: &prompt.user; sysctl kern.maxproc kern.maxproc: 1044 To set a particular variable, use the intuitive variable=value syntax: &prompt.root; sysctl kern.maxfiles=5000 -kern.maxfiles: 2088 -> 5000 +kern.maxfiles: 2088 -> 5000 Settings of sysctl variables are usually either strings, numbers, or booleans (a boolean being 1 for yes or a 0 for no). If you want some variables to be set automatically each time the machine boots, add them to the /etc/sysctl.conf file. For more information see the &man.sysctl.conf.5; manual page and the . Tom Rhodes Contributed by &man.sysctl.8; Read-only In some cases it may be desirable to modify read-only &man.sysctl.8; values. While this is sometimes unavoidable, it can only be done on (re)boot. For instance on some laptop models the &man.cardbus.4; device will not probe memory ranges, and will fail with errors which look similar to: cbb0: Could not map register memory device_probe_and_attach: cbb0 attach returned 12 Cases like the one above usually require the modification of some default &man.sysctl.8; settings which are set read-only. To overcome these situations a user can put &man.sysctl.8; OIDs in their local /boot/loader.conf.
Default settings are located in the /boot/defaults/loader.conf file. Fixing the problem mentioned above would require a user to set the appropriate tunable in the aforementioned file. Now &man.cardbus.4; will work properly. Tuning Disks Sysctl Variables <varname>vfs.vmiodirenable</varname> vfs.vmiodirenable The vfs.vmiodirenable sysctl variable may be set to either 0 (off) or 1 (on); it is 1 by default. This variable controls how directories are cached by the system. Most directories are small, using just a single fragment (typically 1 K) in the file system and less (typically 512 bytes) in the buffer cache. With this variable turned off (to 0), the buffer cache will only cache a fixed number of directories even if you have a huge amount of memory. When turned on (to 1), this sysctl allows the buffer cache to use the VM Page Cache to cache the directories, making all the memory available for caching directories. However, the minimum in-core memory used to cache a directory is the physical page size (typically 4 K) rather than 512 bytes. We recommend keeping this option on if you are running any services which manipulate large numbers of files. Such services can include web caches, large mail systems, and news systems. Keeping this option on will generally not reduce performance even with the wasted memory but you should experiment to find out. <varname>vfs.write_behind</varname> vfs.write_behind The vfs.write_behind sysctl variable defaults to 1 (on). This tells the file system to issue media writes as full clusters are collected, which typically occurs when writing large sequential files. The idea is to avoid saturating the buffer cache with dirty buffers when it would not benefit I/O performance. However, this may stall processes and under certain circumstances you may wish to turn it off. <varname>vfs.hirunningspace</varname> vfs.hirunningspace The vfs.hirunningspace sysctl variable determines how much outstanding write I/O may be queued to disk controllers system-wide at any given instant. The default is usually sufficient but on machines with lots of disks you may want to bump it up to four or five megabytes. Note that setting too high a value (exceeding the buffer cache's write threshold) can lead to extremely bad clustering performance. Do not set this value arbitrarily high! Higher write values may add latency to reads occurring at the same time. There are various other buffer-cache and VM page cache related sysctls. We do not recommend modifying these values. As of &os; 4.3, the VM system does an extremely good job of automatically tuning itself. <varname>vm.swap_idle_enabled</varname> vm.swap_idle_enabled The vm.swap_idle_enabled sysctl variable is useful in large multi-user systems where you have lots of users entering and leaving the system and lots of idle processes. Such systems tend to generate a great deal of continuous pressure on free memory reserves. Turning this feature on and tweaking the swapout hysteresis (in idle seconds) via vm.swap_idle_threshold1 and vm.swap_idle_threshold2 allows you to depress the priority of memory pages associated with idle processes more quickly than the normal pageout algorithm. This gives a helping hand to the pageout daemon. Do not turn this option on unless you need it, because the tradeoff you are making is to essentially pre-page memory sooner rather than later, thus eating more swap and disk bandwidth.
In a small system this option will have a detrimental effect but in a large system that is already doing moderate paging this option allows the VM system to stage whole processes into and out of memory easily. <varname>hw.ata.wc</varname> hw.ata.wc &os; 4.3 flirted with turning off IDE write caching. This reduced write bandwidth to IDE disks but was considered necessary due to serious data consistency issues introduced by hard drive vendors. The problem is that IDE drives lie about when a write completes. With IDE write caching turned on, IDE hard drives not only write data to disk out of order, but will sometimes delay writing some blocks indefinitely when under heavy disk loads. A crash or power failure may cause serious file system corruption. &os;'s default was changed to be safe. Unfortunately, the result was such a huge performance loss that we changed write caching back to on by default after the release. You should check the default on your system by observing the hw.ata.wc sysctl variable. If IDE write caching is turned off, you can turn it back on by setting the kernel variable back to 1. This must be done from the boot loader at boot time. Attempting to do it after the kernel boots will have no effect. For more information, please see &man.ata.4;. <literal>SCSI_DELAY</literal> (<varname>kern.cam.scsi_delay</varname>) kern.cam.scsi_delay kernel options SCSI_DELAY The SCSI_DELAY kernel config option may be used to reduce system boot times. The defaults are fairly high and can be responsible for 15 seconds of delay in the boot process. Reducing it to 5 seconds usually works (especially with modern drives). Newer versions of &os; (5.0 and higher) should use the kern.cam.scsi_delay boot time tunable. Both the tunable and the kernel config option accept values in milliseconds, not seconds. Soft Updates Soft Updates tunefs The &man.tunefs.8; program can be used to fine-tune a file system. This program has many different options, but for now we are only concerned with toggling Soft Updates on and off, which is done by: &prompt.root; tunefs -n enable /filesystem &prompt.root; tunefs -n disable /filesystem A filesystem cannot be modified with &man.tunefs.8; while it is mounted. A good time to enable Soft Updates is before any partitions have been mounted, in single-user mode. As of &os; 4.5, it is possible to enable Soft Updates at filesystem creation time, through use of the -U option to &man.newfs.8;. Soft Updates drastically improves meta-data performance, mainly file creation and deletion, through the use of a memory cache. We recommend using Soft Updates on all of your file systems. There are two downsides to Soft Updates that you should be aware of: First, Soft Updates guarantees filesystem consistency in the case of a crash but could very easily be several seconds (even a minute!) behind updating the physical disk. If your system crashes you may lose more work than otherwise. Secondly, Soft Updates delays the freeing of filesystem blocks. If you have a filesystem (such as the root filesystem) which is almost full, performing a major update, such as make installworld, can cause the filesystem to run out of space and the update to fail. More Details about Soft Updates Soft Updates details There are two traditional approaches to writing a file system's meta-data back to disk. (Meta-data updates are updates to non-content data like inodes or directories.) Historically, the default behavior was to write out meta-data updates synchronously.
If a directory had been changed, the system waited until the change was actually written to disk. The file data buffers (file contents) were passed through the buffer cache and backed up to disk later on asynchronously. The advantage of this implementation is that it operates safely. If there is a failure during an update, the meta-data are always in a consistent state. A file is either created completely or not at all. If the data blocks of a file did not find their way out of the buffer cache onto the disk by the time of the crash, &man.fsck.8; is able to recognize this and repair the filesystem by setting the file length to 0. Additionally, the implementation is clear and simple. The disadvantage is that meta-data changes are slow. An rm -r, for instance, touches all the files in a directory sequentially, but each directory change (deletion of a file) will be written synchronously to the disk. This includes updates to the directory itself, to the inode table, and possibly to indirect blocks allocated by the file. Similar considerations apply for unrolling large hierarchies (tar -x). The second case is asynchronous meta-data updates. This is the default for Linux/ext2fs and mount -o async for *BSD ufs. All meta-data updates are simply being passed through the buffer cache too, that is, they will be intermixed with the updates of the file content data. The advantage of this implementation is there is no need to wait until each meta-data update has been written to disk, so all operations which cause huge amounts of meta-data updates work much faster than in the synchronous case. Also, the implementation is still clear and simple, so there is a low risk for bugs creeping into the code. The disadvantage is that there is no guarantee at all for a consistent state of the filesystem. If there is a failure during an operation that updated large amounts of meta-data (like a power failure, or someone pressing the reset button), the filesystem will be left in an unpredictable state. There is no opportunity to examine the state of the filesystem when the system comes up again; the data blocks of a file could already have been written to the disk while the updates of the inode table or the associated directory were not. It is actually impossible to implement a fsck which is able to clean up the resulting chaos (because the necessary information is not available on the disk). If the filesystem has been damaged beyond repair, the only choice is to use &man.newfs.8; on it and restore it from backup. The usual solution for this problem was to implement dirty region logging, which is also referred to as journaling, although that term is not used consistently and is occasionally applied to other forms of transaction logging as well. Meta-data updates are still written synchronously, but only into a small region of the disk. Later on they will be moved to their proper location. Because the logging area is a small, contiguous region on the disk, there are no long distances for the disk heads to move, even during heavy operations, so these operations are quicker than synchronous updates. Additionally the complexity of the implementation is fairly limited, so the risk of bugs being present is low. A disadvantage is that all meta-data are written twice (once into the logging region and once to the proper location) so for normal work, a performance pessimization might result. 
On the other hand, in case of a crash, all pending meta-data operations can be quickly either rolled back or completed from the logging area after the system comes up again, resulting in a fast filesystem startup. Kirk McKusick, the developer of Berkeley FFS, solved this problem with Soft Updates: all pending meta-data updates are kept in memory and written out to disk in a sorted sequence (ordered meta-data updates). This has the effect that, in case of heavy meta-data operations, later updates to an item can subsume the earlier ones if the earlier ones are still in memory and have not already been written to disk. So all operations on, say, a directory are generally performed in memory before the update is written to disk (the data blocks are sorted according to their position so that they will not be on the disk ahead of their meta-data). If the system crashes, this causes an implicit log rewind: all operations which did not find their way to the disk appear as if they had never happened. A consistent filesystem state is maintained that appears to be that of 30 to 60 seconds earlier. The algorithm used guarantees that all resources in use are marked as such in their appropriate bitmaps: blocks and inodes. After a crash, the only resource allocation error that occurs is that resources are marked as used which are actually free. &man.fsck.8; recognizes this situation, and frees the resources that are no longer used. It is safe to ignore the dirty state of the filesystem after a crash by forcibly mounting it with mount -f. In order to free resources that may be unused, &man.fsck.8; needs to be run at a later time. This is the idea behind the background fsck: at system startup time, only a snapshot of the filesystem is recorded. The fsck can be run later on. All file systems can then be mounted dirty, so the system startup proceeds in multiuser mode. Then, background fscks will be scheduled for all file systems where this is required, to free resources that may be unused. (File systems that do not use Soft Updates still need the usual foreground fsck though.) The advantage is that meta-data operations are nearly as fast as asynchronous updates (i.e. faster than with logging, which has to write the meta-data twice). The disadvantages are the complexity of the code (implying a higher risk for bugs in an area that is highly sensitive regarding loss of user data), and a higher memory consumption. Additionally there are some idiosyncrasies one has to get used to. After a crash, the state of the filesystem appears to be somewhat older. In situations where the standard synchronous approach would have caused some zero-length files to remain after the fsck, these files do not exist at all with a Soft Updates filesystem because neither the meta-data nor the file contents have ever been written to disk. Disk space is not released until the updates have been written to disk, which may take place some time after running rm. This may cause problems when installing large amounts of data on a filesystem that does not have enough free space to hold all the files twice. Tuning Kernel Limits tuning kernel limits File/Process Limits <varname>kern.maxfiles</varname> kern.maxfiles kern.maxfiles can be raised or lowered based upon your system requirements. This variable indicates the maximum number of file descriptors on your system. When the file descriptor table is full, file: table is full will show up repeatedly in the system message buffer, which can be viewed with the dmesg command.
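As a quick check, you can compare the current number of open files against the limit; this is only a sketch, the figures shown are illustrative, and it assumes your release provides the kern.openfiles sysctl: &prompt.root; sysctl kern.openfiles kern.maxfiles kern.openfiles: 1024 kern.maxfiles: 8232 If the first value regularly approaches the second, the limit can be raised at runtime with sysctl kern.maxfiles=65536, or set as kern.maxfiles="65536" in /boot/loader.conf so that it applies from boot.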
Each open file, socket, or fifo uses one file descriptor. A large-scale production server may easily require many thousands of file descriptors, depending on the kind and number of services running concurrently. The default value of kern.maxfiles is dictated by the maxusers option in your kernel configuration file. kern.maxfiles grows proportionally to the value of maxusers. When compiling a custom kernel, it is a good idea to set this kernel configuration option according to the uses of your system. From this number, the kernel is given most of its pre-defined limits. Even though a production machine may not actually have 256 users connected at once, the resources needed may be similar to a high-scale web server. Starting with &os; 4.5, the system will auto-tune maxusers for you if you explicitly set it to 0. (The auto-tuning algorithm sets maxusers equal to the amount of memory in the system, with a minimum of 32 and a maximum of 384.) In &os; 5.X and above, maxusers will default to 0 if not specified. If you are using a version of &os; earlier than 4.5, or you want to manage it yourself, you will want to set maxusers to at least 4, especially if you are using the X Window System or compiling software. The reason is that the most important table set by maxusers is the maximum number of processes, which is set to 20 + 16 * maxusers, so if you set maxusers to 1, then you can only have 36 simultaneous processes, including the 18 or so that the system starts up at boot time and the 15 or so you will probably create when you start the X Window System. Even a simple task like reading a manual page will start up nine processes to filter, decompress, and view it. Setting maxusers to 64 will allow you to have up to 1044 simultaneous processes, which should be enough for nearly all uses. If, however, you see the dreaded proc table full error when trying to start another program, or are running a server with a large number of simultaneous users (like ftp.FreeBSD.org), you can always increase the number and rebuild. maxusers does not limit the number of users which can log into your machine. It simply sets various table sizes to reasonable values considering the maximum number of users you will likely have on your system and how many processes each of them will be running. One keyword which does limit the number of simultaneous remote logins and X terminal windows is pseudo-device pty 16. With &os; 5.X, you do not have to worry about this number since the &man.pty.4; driver is auto-cloning; you simply use the line device pty in your configuration file. <varname>kern.ipc.somaxconn</varname> kern.ipc.somaxconn The kern.ipc.somaxconn sysctl variable limits the size of the listen queue for accepting new TCP connections. The default value of 128 is typically too low for robust handling of new connections in a heavily loaded web server environment. For such environments, it is recommended to increase this value to 1024 or higher. The service daemon (e.g., &man.sendmail.8; or Apache) may itself limit the listen queue size, but it will often have a directive in its configuration file to adjust the queue size. Large listen queues also do a better job of avoiding Denial of Service (DoS) attacks. Network Limits The NMBCLUSTERS kernel configuration option dictates the number of network Mbufs available to the system. A heavily-trafficked server with a low number of Mbufs will hinder &os;'s ability to serve network traffic. Each cluster represents approximately 2 K of memory, so a value of 1024 represents 2 megabytes of kernel memory reserved for network buffers.
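Before sizing NMBCLUSTERS, it is worth looking at what the system is consuming right now. A minimal sketch, assuming the usual invocation: &prompt.root; netstat -m The output reports how many mbufs and mbuf clusters are currently allocated and the peak usage since boot, which you can compare against the calculation that follows.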
A simple calculation can be done to figure out how many are needed. If you have a web server which maxes out at 1000 simultaneous connections, and each connection eats a 16 K receive and 16 K send buffer, you need approximately 32 MB worth of network buffers to cover the web server. A good rule of thumb is to multiply by 2, so 2 x 32 MB / 2 KB = 64 MB / 2 KB = 32768. We recommend values between 4096 and 32768 for machines with greater amounts of memory. Under no circumstances should you specify an arbitrarily high value for this parameter as it could lead to a boot time crash. The -m option to &man.netstat.1; may be used to observe network cluster use. The kern.ipc.nmbclusters loader tunable should be used to tune this at boot time. Only older versions of &os; will require you to use the NMBCLUSTERS kernel &man.config.8; option. For busy servers that make extensive use of the &man.sendfile.2; system call, it may be necessary to increase the number of &man.sendfile.2; buffers via the NSFBUFS kernel configuration option or by setting its value in /boot/loader.conf (see &man.loader.8; for details). A common indicator that this parameter needs to be adjusted is when processes are seen in the sfbufa state. The sysctl variable kern.ipc.nsfbufs is a read-only glimpse at the kernel configured variable. This parameter nominally scales with kern.maxusers; however, it may be necessary to tune it separately. Even though a socket has been marked as non-blocking, calling &man.sendfile.2; on the non-blocking socket may result in the &man.sendfile.2; call blocking until enough struct sf_buf's are made available. <varname>net.inet.ip.portrange.*</varname> net.inet.ip.portrange.* The net.inet.ip.portrange.* sysctl variables control the port number ranges automatically bound to TCP and UDP sockets. There are three ranges: a low range, a default range, and a high range. Most network programs use the default range, which is controlled by net.inet.ip.portrange.first and net.inet.ip.portrange.last; these default to 1024 and 5000, respectively. Bound port ranges are used for outgoing connections, and it is possible to run the system out of ports under certain circumstances. This most commonly occurs when you are running a heavily loaded web proxy. The port range is not an issue when running servers which handle mainly incoming connections, such as a normal web server, or ones that have a limited number of outgoing connections, such as a mail relay. For situations where you may run out of ports, it is recommended to increase net.inet.ip.portrange.last modestly. A value of 10000, 20000 or 30000 may be reasonable. You should also consider firewall effects when changing the port range. Some firewalls may block large ranges of ports (usually low-numbered ports) and expect systems to use higher ranges of ports for outgoing connections — for this reason it is not recommended that net.inet.ip.portrange.first be lowered. TCP Bandwidth Delay Product TCP Bandwidth Delay Product Limiting net.inet.tcp.inflight.enable TCP Bandwidth Delay Product Limiting is similar to TCP/Vegas in NetBSD. It can be enabled by setting the net.inet.tcp.inflight.enable sysctl variable to 1. The system will attempt to calculate the bandwidth delay product for each connection and limit the amount of data queued to the network to just the amount required to maintain optimum throughput.
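As a hedged example of the procedure just described, the limiter can be switched on at runtime and made persistent; the /etc/sysctl.conf entry is an assumption about where you keep such settings: &prompt.root; sysctl net.inet.tcp.inflight.enable=1 net.inet.tcp.inflight.enable: 0 -> 1 Adding the line net.inet.tcp.inflight.enable=1 to /etc/sysctl.conf preserves the setting across reboots.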
This feature is useful if you are serving data over modems, Gigabit Ethernet, or even high speed WAN links (or any other link with a high bandwidth delay product), especially if you are also using window scaling or have configured a large send window. If you enable this option, you should also be sure to set net.inet.tcp.inflight.debug to 0 (disable debugging), and for production use setting net.inet.tcp.inflight.min to at least 6144 may be beneficial. However, note that setting high minimums may effectively disable bandwidth limiting depending on the link. The limiting feature reduces the amount of data built up in intermediate route and switch packet queues as well as reducing the amount of data built up in the local host's interface queue. With fewer packets queued up, interactive connections, especially over slow modems, will also be able to operate with lower Round Trip Times. However, note that this feature only affects data transmission (uploading / server side). It has no effect on data reception (downloading). Adjusting net.inet.tcp.inflight.stab is not recommended. This parameter defaults to 20, representing 2 maximal packets added to the bandwidth delay product window calculation. The additional window is required to stabilize the algorithm and improve responsiveness to changing conditions, but it can also result in higher ping times over slow links (though still much lower than you would get without the inflight algorithm). In such cases, you may wish to try reducing this parameter to 15, 10, or 5; and may also have to reduce net.inet.tcp.inflight.min (for example, to 3500) to get the desired effect. Reducing these parameters should be done as a last resort only. In 4.X and earlier releases of &os;, the inflight sysctl variables are found directly under net.inet.tcp. Their names are (in alphabetical order): net.inet.tcp.inflight_debug, net.inet.tcp.inflight_enable, net.inet.tcp.inflight_max, net.inet.tcp.inflight_min, net.inet.tcp.inflight_stab. Virtual Memory <varname>kern.maxvnodes</varname> A vnode is the internal representation of a file or directory, so increasing the number of vnodes available to the operating system cuts down on disk I/O. Normally this is handled by the operating system and does not need to be changed. In some cases where disk I/O is a bottleneck and the system is running out of vnodes, this setting will need to be increased. The amount of inactive and free RAM will need to be taken into account. To see the current number of vnodes in use: &prompt.root; sysctl vfs.numvnodes vfs.numvnodes: 91349 To see the maximum vnodes: &prompt.root; sysctl kern.maxvnodes kern.maxvnodes: 100000 If the current vnode usage is near the maximum, increasing kern.maxvnodes by a value of 1,000 is probably a good idea. Keep an eye on the number of vfs.numvnodes. If it climbs up to the maximum again, kern.maxvnodes will need to be increased further. A shift in your memory usage as reported by &man.top.1; should be visible. More memory should be active. Adding Swap Space No matter how well you plan, sometimes a system does not run as you expect. If you find you need more swap space, it is simple enough to add. You have three ways to increase swap space: adding a new hard drive, enabling swap over NFS, and creating a swap file on an existing partition. Swap on a New Hard Drive The best way to add swap, of course, is to use this as an excuse to add another hard drive. You can always use another hard drive, after all.
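As a sketch of that approach, once the new drive has a dedicated swap partition (the device name /dev/ad1s1b below is purely illustrative; substitute your own), add a line to /etc/fstab and activate it: /dev/ad1s1b none swap sw 0 0 &prompt.root; swapon -a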
If you do add a drive, go reread the discussion of swap space elsewhere in the Handbook for some suggestions on how to best arrange your swap. Swapping over NFS Swapping over NFS is only recommended if you do not have a local hard disk to swap to. Swapping over NFS is slow and inefficient in versions of &os; prior to 4.X. It is reasonably fast and efficient in 4.0-RELEASE and newer. Even with newer versions of &os;, NFS swapping will be limited by the available network bandwidth and puts an additional burden on the NFS server. Swapfiles You can create a file of a specified size to use as a swap file. In our example here we will use a 64 MB file called /usr/swap0. You can use any name you want, of course. Creating a Swapfile on &os; 4.X Be certain that your kernel configuration includes the vnode driver. It is not in recent versions of GENERIC. pseudo-device vn 1 #Vnode driver (turns a file into a device) Create a vn-device: &prompt.root; cd /dev &prompt.root; sh MAKEDEV vn0 Create a swapfile (/usr/swap0): &prompt.root; dd if=/dev/zero of=/usr/swap0 bs=1024k count=64 Set proper permissions on /usr/swap0: &prompt.root; chmod 0600 /usr/swap0 Enable the swap file in /etc/rc.conf: swapfile="/usr/swap0" # Set to name of swapfile if aux swapfile desired. Reboot the machine or, to enable the swap file immediately, type: &prompt.root; vnconfig -e /dev/vn0b /usr/swap0 swap Creating a Swapfile on &os; 5.X Be certain that your kernel configuration includes the memory disk driver (&man.md.4;). It is included in the GENERIC kernel by default. device md # Memory "disks" Create a swapfile (/usr/swap0): &prompt.root; dd if=/dev/zero of=/usr/swap0 bs=1024k count=64 Set proper permissions on /usr/swap0: &prompt.root; chmod 0600 /usr/swap0 Enable the swap file in /etc/rc.conf: swapfile="/usr/swap0" # Set to name of swapfile if aux swapfile desired. Reboot the machine or, to enable the swap file immediately, type: &prompt.root; mdconfig -a -t vnode -f /usr/swap0 -u 0 && swapon /dev/md0 Hiten Pandya Written by Tom Rhodes Power and Resource Management It is very important to utilize hardware resources in an efficient manner. Before ACPI was introduced, managing the power usage and thermal properties of a system was difficult and inflexible for operating systems. The hardware was controlled by some sort of BIOS-embedded interface, such as Plug and Play BIOS (PNPBIOS), or Advanced Power Management (APM) and so on. Power and Resource Management is one of the key components of a modern operating system. For example, you may want an operating system to monitor system limits (and possibly alert you) in case your system temperature increased unexpectedly. In this section of the &os; Handbook, we will provide comprehensive information about ACPI. References will be provided for further reading at the end. Please be aware that ACPI is available on &os; 5.X and above systems as a default kernel module. For &os; 4.9, ACPI can be enabled by adding the line device acpica to a kernel configuration and rebuilding. What Is ACPI? ACPI APM Advanced Configuration and Power Interface (ACPI) is a standard written by an alliance of vendors to provide a common interface for hardware resources and power management (hence the name). It is a key element in Operating System-directed configuration and Power Management, i.e., it provides more control and flexibility to the operating system (OS).
Prior to the introduction of ACPI, modern systems had stretched the limits of the existing Plug and Play interfaces (such as APM, which is used in &os; 4.X). ACPI is the direct successor to APM (Advanced Power Management). Shortcomings of Advanced Power Management (APM) The Advanced Power Management (APM) facility controls the power usage of a system based on its activity. The APM BIOS is supplied by the (system) vendor and it is specific to the hardware platform. An APM driver in the OS mediates access to the APM Software Interface, which allows management of power levels. There are four major problems with APM. Firstly, power management is done by the (vendor-specific) BIOS, and the OS does not have any knowledge of it. One example of this is when the user sets idle-time values for a hard drive in the APM BIOS; when those values are exceeded, the BIOS spins down the hard drive without the consent of the OS. Secondly, the APM logic is embedded in the BIOS, and it operates outside the scope of the OS. This means users can only fix problems in their APM BIOS by flashing a new one into the ROM, which is a very dangerous procedure with the potential to leave the system in an unrecoverable state if it fails. Thirdly, APM is a vendor-specific technology, which means that there is a lot of duplication of effort, and bugs found in one vendor's BIOS may not be solved in others. Last but not least, the APM BIOS did not have enough room to implement a sophisticated power policy, or one that can adapt very well to the purpose of the machine. Plug and Play BIOS (PNPBIOS) was unreliable in many situations. PNPBIOS is 16-bit technology, so the OS has to use 16-bit emulation in order to interface with PNPBIOS methods. The &os; APM driver is documented in the &man.apm.4; manual page. Configuring <acronym>ACPI</acronym> The acpi.ko driver is loaded by default at start up by the &man.loader.8; and should not be compiled into the kernel. The reasoning behind this is that modules are easier to work with, say if switching to another acpi.ko without doing a kernel rebuild. This has the advantage of making testing easier. Another reason is that starting ACPI after a system has been brought up is not too useful, and in some cases can be fatal. If in doubt, just disable ACPI altogether. This driver should not and cannot be unloaded because the system bus uses it for various hardware interactions. ACPI can be disabled with the &man.acpiconf.8; utility. In fact, most of the interaction with ACPI can be done via &man.acpiconf.8;. Basically, this means that if anything about ACPI appears in the &man.dmesg.8; output, it is most likely already running. ACPI and APM cannot coexist and should be used separately. The last one to load will terminate if the driver notices the other running. In the simplest form, ACPI can be used to put the system into a sleep mode with &man.acpiconf.8;, the -s flag, and a 1-5 option. Most users will only need 1. Option 5 will do a soft-off, which is the same action as: &prompt.root; halt -p Check out the &man.acpiconf.8; manual page for the other available options. Nate Lawson Written by Peter Schultz With contributions from Tom Rhodes Using and Debugging &os; <acronym>ACPI</acronym> ACPI problems ACPI is a fundamentally new way of discovering devices, managing power usage, and providing standardized access to various hardware previously managed by the BIOS.
Progress is being made toward ACPI working on all systems, but bugs in some motherboards' ACPI Machine Language (AML) bytecode, incompleteness in &os;'s kernel subsystems, and bugs in the &intel; ACPI-CA interpreter continue to appear. This document is intended to help you assist the &os; ACPI maintainers in identifying the root cause of problems you observe and debugging and developing a solution. Thanks for reading this and we hope we can solve your system's problems. Submitting Debugging Information Before submitting a problem, be sure you are running the latest BIOS version and, if available, embedded controller firmware version. For those of you that want to submit a problem right away, please send the following information to freebsd-acpi@FreeBSD.org: Description of the buggy behavior, including system type and model and anything that causes the bug to appear. Also, please note as accurately as possible when the bug began occurring if it is new for you. The &man.dmesg.8; output after boot -v, including any error messages generated by you exercising the bug. The &man.dmesg.8; output from boot -v with ACPI disabled, if disabling it helps fix the problem. Output from sysctl hw.acpi. This is also a good way of figuring out what features your system offers. URL where your ACPI Source Language (ASL) can be found. Do not send the ASL directly to the list as it can be very large. Generate a copy of your ASL by running this command: &prompt.root; acpidump -t -d > name-system.asl (Substitute your login name for name and manufacturer/model for system. Example: njl-FooCo6000.asl) Most of the developers watch the &a.current; but please submit problems to &a.acpi.name; to be sure it is seen. Please be patient, all of us have full-time jobs elsewhere. If your bug is not immediately apparent, we will probably ask you to submit a PR via &man.send-pr.1;. When entering a PR, please include the same information as requested above. This will help us track the problem and resolve it. Do not send a PR without emailing &a.acpi.name; first as we use PRs as reminders of existing problems, not a reporting mechanism. It is likely that your problem has been reported by someone before. Background ACPI ACPI is present in all modern computers that conform to the ia32 (x86), ia64 (Itanium), and amd64 (AMD) architectures. The full standard has many features including CPU performance management, power planes control, thermal zones, various battery systems, embedded controllers, and bus enumeration. Most systems implement less than the full standard. For instance, a desktop system usually only implements the bus enumeration parts while a laptop might have cooling and battery management support as well. Laptops also have suspend and resume, with their own associated complexity. An ACPI-compliant system has various components. The BIOS and chipset vendors provide various fixed tables (e.g., FADT) in memory that specify things like the APIC map (used for SMP), config registers, and simple configuration values. Additionally, a table of bytecode (the Differentiated System Description Table DSDT) is provided that specifies a tree-like name space of devices and methods. The ACPI driver must parse the fixed tables, implement an interpreter for the bytecode, and modify device drivers and the kernel to accept information from the ACPI subsystem. For &os;, &intel; has provided an interpreter (ACPI-CA) that is shared with Linux and NetBSD. The path to the ACPI-CA source code is src/sys/contrib/dev/acpica. 
The glue code that allows ACPI-CA to work on &os; is in src/sys/dev/acpica/Osd. Finally, drivers that implement various ACPI devices are found in src/sys/dev/acpica. Common Problems ACPI problems For ACPI to work correctly, all the parts have to work correctly. Here are some common problems, in order of frequency of appearance, and some possible workarounds or fixes. Mouse Issues In some cases, resuming from a suspend operation will cause the mouse to fail. A known work around is to add hint.psm.0.flags="0x3000" to the /boot/loader.conf file. If this does not work then please consider sending a bug report as described above. Suspend/Resume ACPI has three suspend to RAM (STR) states, S1-S3, and one suspend to disk state (STD), called S4. S5 is soft off and is the normal state your system is in when plugged in but not powered up. S4 can actually be implemented in two separate ways. S4BIOS is a BIOS-assisted suspend to disk. S4OS is implemented entirely by the operating system. Start by checking sysctl hw.acpi for the suspend-related items. Here are the results for a Thinkpad: hw.acpi.supported_sleep_state: S3 S4 S5 hw.acpi.s4bios: 0 This means that we can use acpiconf -s to test S3, S4OS, and S5. If hw.acpi.s4bios was one (1), we would have S4BIOS support instead of S4OS. When testing suspend/resume, start with S1, if supported. This state is most likely to work since it does not require much driver support. No one has implemented S2, but if you have it, it is similar to S1. The next thing to try is S3. This is the deepest STR state and requires a lot of driver support to properly reinitialize your hardware. If you have problems resuming, feel free to email the &a.acpi.name; list but do not expect the problem to be resolved since there are a lot of drivers/hardware that need more testing and work. To help isolate the problem, remove as many drivers from your kernel as possible. If it works, you can narrow down which driver is the problem by loading drivers until it fails again. Typically binary drivers like nvidia.ko, X11 display drivers, and USB will have the most problems while Ethernet interfaces usually work fine. If you can properly load/unload the drivers, you can automate this by putting the appropriate commands in /etc/rc.suspend and /etc/rc.resume. There is a commented-out example for unloading and loading a driver. Try setting hw.acpi.reset_video to zero (0) if your display is messed up after resume. Try setting longer or shorter values for hw.acpi.sleep_delay to see if that helps. Another thing to try is to load a recent Linux distribution with ACPI support and test their suspend/resume support on the same hardware. If it works on Linux, it is likely a &os; driver problem and narrowing down which driver causes the problems will help us fix the problem. Note that the ACPI maintainers do not usually maintain other drivers (e.g., sound, ATA, etc.) so any work done on tracking down a driver problem should probably eventually be posted to the &a.current.name; list and mailed to the driver maintainer. If you are feeling adventurous, go ahead and start putting some debugging &man.printf.3;s in a problematic driver to track down where in its resume function it hangs. Finally, try disabling ACPI and enabling APM instead. If suspend/resume works with APM, you may be better off sticking with APM, especially on older hardware (pre-2000). It took vendors a while to get ACPI support correct and older hardware is more likely to have BIOS problems with ACPI.
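To put the testing advice above into practice on a machine like the Thinkpad example, which reports S3 support, a first suspend test from the console is simply: &prompt.root; acpiconf -s 3 Save your work beforehand; a failed resume is precisely the symptom being tested for, and the driver-elimination steps described earlier are the way to narrow it down.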
System Hangs (temporary or permanent) Most system hangs are a result of lost interrupts or an interrupt storm. Chipsets have a lot of problems based on how the BIOS configures interrupts before boot, correctness of the APIC (MADT) table, and routing of the System Control Interrupt (SCI). interrupt storms Interrupt storms can be distinguished from lost interrupts by checking the output of vmstat -i and looking at the line that has acpi0. If the counter is increasing at more than a couple per second, you have an interrupt storm. If the system appears hung, try breaking to DDB (CTRL+ALT+ESC on console) and type show interrupts. APIC disabling Your best hope when dealing with interrupt problems is to try disabling APIC support with hint.apic.0.disabled="1" in loader.conf. Panics Panics are relatively rare for ACPI and are the top priority to be fixed. The first step is to isolate the steps to reproduce the panic (if possible) and get a backtrace. Follow the advice for enabling options DDB and setting up a serial console or a &man.dump.8; partition. You can get a backtrace in DDB with tr. If you have to handwrite the backtrace, be sure to at least get the lowest five (5) and top five (5) lines in the trace. Then, try to isolate the problem by booting with ACPI disabled. If that works, you can isolate the ACPI subsystem by using various values of debug.acpi.disabled. See the &man.acpi.4; manual page for some examples. System Powers Up After Suspend or Shutdown First, try setting hw.acpi.disable_on_poweroff="0" in &man.loader.conf.5;. This keeps ACPI from disabling various events during the shutdown process. Some systems need this value set to 1 (the default) for the same reason. This usually fixes the problem of a system powering up spontaneously after a suspend or poweroff. Other Problems If you have other problems with ACPI (working with a docking station, devices not detected, etc.), please email a description to the mailing list as well; however, some of these issues may be related to unfinished parts of the ACPI subsystem so they might take a while to be implemented. Please be patient and prepared to test patches we may send you. <acronym>ASL</acronym>, <command>acpidump</command>, and <acronym>IASL</acronym> ACPI ASL The most common problem is the BIOS vendors providing incorrect (or outright buggy!) bytecode. This is usually manifested by kernel console messages like this: ACPI-1287: *** Error: Method execution failed [\\_SB_.PCI0.LPC0.FIGD._STA] \\ (Node 0xc3f6d160), AE_NOT_FOUND Often, you can resolve these problems by updating your BIOS to the latest revision. Most console messages are harmless but if you have other problems like battery status not working, they are a good place to start looking for problems in the AML. The bytecode, known as AML, is compiled from a source language called ASL. The AML is found in the table known as the DSDT. To get a copy of your ASL, use &man.acpidump.8;. You should use both the -t (show contents of the fixed tables) and -d (disassemble AML to ASL) options. See the Submitting Debugging Information section for an example syntax. The simplest first check you can do is to recompile your ASL to check for errors. Warnings can usually be ignored, but errors are bugs that will usually prevent ACPI from working correctly. To recompile your ASL, issue the following command: &prompt.root; iasl your.asl Fixing Your <acronym>ASL</acronym> ACPI ASL In the long run, our goal is for almost everyone to have ACPI work without any user intervention.
At this point, however, we are still developing workarounds for common mistakes made by the BIOS vendors. The &microsoft; interpreter (acpi.sys and acpiec.sys) does not strictly check for adherence to the standard, and thus many BIOS vendors who only test ACPI under &windows; never fix their ASL. We hope to continue to identify and document exactly what non-standard behavior is allowed by &microsoft;'s interpreter and replicate it so &os; can work without forcing users to fix the ASL. As a workaround and to help us identify behavior, you can fix the ASL manually. If this works for you, please send a &man.diff.1; of the old and new ASL so we can possibly work around the buggy behavior in ACPI-CA and thus make your fix unnecessary. ACPI error messages Here is a list of common error messages, their cause, and how to fix them: _OS dependencies Some AML assumes the world consists of various &windows; versions. You can tell &os; to claim it is any OS to see if this fixes problems you may have. An easy way to override this is to set hw.acpi.osname="Windows 2001" in /boot/loader.conf or other similar strings you find in the ASL. Missing Return statements Some methods do not explicitly return a value as the standard requires. While ACPI-CA does not handle this, &os; has a workaround that allows it to return the value implicitly. You can also add explicit Return statements where required if you know what value should be returned. To force iasl to compile the ASL, use the -f flag. Overriding the Default <acronym>AML</acronym> After you customize your.asl, you will want to compile it. Run: &prompt.root; iasl your.asl You can add the -f flag to force creation of the AML, even if there are errors during compilation. Remember that some errors (e.g., missing Return statements) are automatically worked around by the interpreter. DSDT.aml is the default output filename for iasl. You can load this instead of your BIOS's buggy copy (which is still present in flash memory) by editing /boot/loader.conf as follows: acpi_dsdt_load="YES" acpi_dsdt_name="/boot/DSDT.aml" Be sure to copy your DSDT.aml to the /boot directory. Getting Debugging Output From <acronym>ACPI</acronym> ACPI problems ACPI debugging The ACPI driver has a very flexible debugging facility. It allows you to specify a set of subsystems as well as the level of verbosity. The subsystems you wish to debug are specified as layers and are broken down into ACPI-CA components (ACPI_ALL_COMPONENTS) and ACPI hardware support (ACPI_ALL_DRIVERS). The verbosity of debugging output is specified as the level and ranges from ACPI_LV_ERROR (just report errors) to ACPI_LV_VERBOSE (everything). The level is a bitmask, so multiple options can be set at once, separated by spaces. In practice, you will want to use a serial console to log the output if it is so long it flushes the console message buffer. A full list of the individual layers and levels is found in the &man.acpi.4; manual page. Debugging output is not enabled by default. To enable it, add options ACPI_DEBUG to your kernel configuration file if ACPI is compiled into the kernel. You can add ACPI_DEBUG=1 to your /etc/make.conf to enable it globally. If it is a module, you can recompile just your acpi.ko module as follows: &prompt.root; cd /sys/modules/acpi/acpi && make clean && make ACPI_DEBUG=1 Install acpi.ko in /boot/kernel and add your desired level and layer to loader.conf. This example enables debug messages for all ACPI-CA components and all ACPI hardware drivers (CPU, LID, etc.)
It will only output error messages, the least verbose level. debug.acpi.layer="ACPI_ALL_COMPONENTS ACPI_ALL_DRIVERS" debug.acpi.level="ACPI_LV_ERROR" If the information you want is triggered by a specific event (say, a suspend and then resume), you can leave out changes to loader.conf and instead use sysctl to specify the layer and level after booting and preparing your system for the specific event. The sysctls are named the same as the tunables in loader.conf. References More information about ACPI may be found in the following locations: The &a.acpi; The ACPI Mailing List Archives The old ACPI Mailing List Archives The ACPI 2.0 Specification &os; Manual pages: &man.acpi.4;, &man.acpi.thermal.4;, &man.acpidump.8;, &man.iasl.8;, &man.acpidb.8; DSDT debugging resource. (Uses Compaq as an example but generally useful.) diff --git a/en_US.ISO8859-1/books/handbook/cutting-edge/chapter.sgml b/en_US.ISO8859-1/books/handbook/cutting-edge/chapter.sgml index ba194452b7..147c962f99 100644 --- a/en_US.ISO8859-1/books/handbook/cutting-edge/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/cutting-edge/chapter.sgml @@ -1,1850 +1,1850 @@ Jim Mock Restructured, reorganized, and parts updated by Jordan Hubbard Original work by Poul-Henning Kamp John Polstra Nik Clayton The Cutting Edge Synopsis &os; is under constant development between releases. For people who want to be on the cutting edge, there are several easy mechanisms for keeping your system in sync with the latest developments. Be warned—the cutting edge is not for everyone! This chapter will help you decide if you want to track the development system, or stick with one of the released versions. After reading this chapter, you will know: The difference between the two development branches: &os.stable; and &os.current;. How to keep your system up to date with CVSup, CVS, or CTM. How to rebuild and reinstall the entire base system with make buildworld (etc). Before reading this chapter, you should: Properly set up your network connection. Know how to install additional third-party software. &os.current; vs. &os.stable; -CURRENT -STABLE There are two development branches to FreeBSD: &os.current; and &os.stable;. This section will explain a bit about each and describe how to keep your system up-to-date with each respective tree. &os.current; will be discussed first, then &os.stable;. Staying Current with &os; As you read this, keep in mind that &os.current; is the bleeding edge of &os; development. &os.current; users are expected to have a high degree of technical skill, and should be capable of solving difficult system problems on their own. If you are new to &os;, think twice before installing it. What Is &os.current;? snapshot &os.current; is the latest working sources for &os;. This includes work in progress, experimental changes, and transitional mechanisms that might or might not be present in the next official release of the software. While many &os; developers compile the &os.current; source code daily, there are periods of time when the sources are not buildable. These problems are resolved as expeditiously as possible, but whether &os.current; brings disaster or greatly desired functionality can be a matter of the exact moment you grabbed the source code! Who Needs &os.current;? &os.current; is made available for three primary interest groups: Members of the &os; community who are actively working on some part of the source tree and for whom keeping current is an absolute requirement.
Members of the &os; community who are active testers, willing to spend time solving problems in order to ensure that &os.current; remains as sane as possible. These are also people who wish to make topical suggestions on changes and the general direction of &os;, and submit patches to implement them. Those who merely wish to keep an eye on things, or to use the current sources for reference purposes (e.g., for reading, not running). These people also make the occasional comment or contribute code. What Is &os.current; <emphasis>Not</emphasis>? A fast-track to getting pre-release bits because you heard there is some cool new feature in there and you want to be the first on your block to have it. Being the first on the block to get the new feature means that you are the first on the block to get the new bugs. A quick way of getting bug fixes. Any given version of &os.current; is just as likely to introduce new bugs as to fix existing ones. In any way officially supported. We do our best to help people genuinely in one of the three legitimate &os.current; groups, but we simply do not have the time to provide tech support. This is not because we are mean and nasty people who do not like helping people out (we would not even be doing &os; if we were). We simply cannot answer hundreds of messages a day and work on FreeBSD! Given the choice between improving &os; and answering lots of questions on experimental code, the developers opt for the former. Using &os.current; -CURRENT using Join the &a.current.name; and the &a.cvsall.name; lists. This is not just a good idea, it is essential. If you are not on the &a.current.name; list, you will not see the comments that people are making about the current state of the system and thus will probably end up stumbling over a lot of problems that others have already found and solved. Even more importantly, you will miss out on important bulletins which may be critical to your system's continued health. The &a.cvsall.name; list will allow you to see the commit log entry for each change as it is made along with any pertinent information on possible side-effects. To join these lists, or one of the others available, go to &a.mailman.lists.link; and click on the list that you wish to subscribe to. Instructions on the rest of the procedure are available there. Grab the sources from a &os; mirror site. You can do this in one of two ways: cvsup cron -CURRENT Syncing with CVSup Use the cvsup program with the supfile named standard-supfile available from /usr/share/examples/cvsup. This is the most recommended method, since it allows you to grab the entire collection once and then only what has changed from then on. Many people run cvsup from cron and keep their sources up-to-date automatically. You have to customize the sample supfile above, and configure cvsup for your environment. -CURRENT Syncing with CTM Use the CTM facility. If you have very bad connectivity (high-price connections or only email access), CTM is an option. However, it is a lot of hassle and can give you broken files. This leads to it being rarely used, which again increases the chance of it not working for fairly long periods of time. We recommend using CVSup for anybody with a 9600 bps modem or faster connection. If you are grabbing the sources to run, and not just look at, then grab all of &os.current;, not just selected portions. The reason for this is that various parts of the source depend on updates elsewhere, and trying to compile just a subset is almost guaranteed to get you into trouble.
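As a concrete sketch of the CVSup method described above (the copy destination is arbitrary, and the copied supfile must first be edited so that its *default host line points at a mirror near you): &prompt.root; cp /usr/share/examples/cvsup/standard-supfile /root &prompt.root; cvsup -g -L 2 /root/standard-supfile The -g flag disables the graphical interface and -L 2 makes cvsup report each file as it is updated.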
-CURRENT compiling Before compiling &os.current;, read the Makefile in /usr/src carefully. You should at least install a new kernel and rebuild the world the first time through as part of the upgrading process. Reading the &a.current; and /usr/src/UPDATING will keep you up-to-date on other bootstrapping procedures that sometimes become necessary as we move toward the next release. Be active! If you are running &os.current;, we want to know what you have to say about it, especially if you have suggestions for enhancements or bug fixes. Suggestions with accompanying code are received most enthusiastically! Staying Stable with &os; What Is &os.stable;? -STABLE &os.stable; is our development branch from which major releases are made. Changes go into this branch at a different pace, and with the general assumption that they have first gone into &os.current; for testing. This is still a development branch, however, and this means that at any given time, the sources for &os.stable; may or may not be suitable for any particular purpose. It is simply another engineering development track, not a resource for end-users. Who Needs &os.stable;? If you are interested in tracking or contributing to the FreeBSD development process, especially as it relates to the next point release of FreeBSD, then you should consider following &os.stable;. While it is true that security fixes also go into the &os.stable; branch, you do not need to track &os.stable; to do this. Every security advisory for FreeBSD explains how to fix the problem for the releases it affects That is not quite true. We cannot continue to support old releases of FreeBSD forever, although we do support them for many years. For a complete description of the current security policy for old releases of FreeBSD, please see http://www.FreeBSD.org/security/. , and tracking an entire development branch just for security reasons is likely to bring in a lot of unwanted changes as well. Although we endeavor to ensure that the &os.stable; branch compiles and runs at all times, this cannot be guaranteed. In addition, while code is developed in &os.current; before including it in &os.stable;, more people run &os.stable; than &os.current;, so it is inevitable that bugs and corner cases will sometimes be found in &os.stable; that were not apparent in &os.current;. For these reasons, we do not recommend that you blindly track &os.stable;, and it is particularly important that you do not update any production servers to &os.stable; without first thoroughly testing the code in your development environment. If you do not have the resources to do this, then we recommend that you run the most recent release of FreeBSD, and use the binary update mechanism to move from release to release.
If you are going to install a new system and want it to run a monthly snapshot built from &os.stable;, please check the Snapshots web page for more information. Alternatively, it is possible to install the most recent &os.stable; release from the mirror sites and follow the instructions below to upgrade your system to the most up-to-date &os.stable; source code. If you are already running a previous release of &os; and wish to upgrade via sources, then you can easily do so from a &os; mirror site. This can be done in one of two ways: cvsup cron -STABLE syncing with CVSup Use the cvsup program with the supfile named stable-supfile from the directory /usr/share/examples/cvsup. This is the most recommended method, since it allows you to grab the entire collection once and then only what has changed from then on. Many people run cvsup from cron to keep their sources up-to-date automatically. You have to customize the sample supfile above, and configure cvsup for your environment. -STABLE syncing with CTM Use the CTM facility. If you do not have a fast and inexpensive connection to the Internet, this is the method you should consider using. Essentially, if you need rapid on-demand access to the source and communications bandwidth is not a consideration, use cvsup or ftp. Otherwise, use CTM. -STABLE compiling Before compiling &os.stable;, read the Makefile in /usr/src carefully. You should at least install a new kernel and rebuild the world the first time through as part of the upgrading process. Reading the &a.stable; and /usr/src/UPDATING will keep you up-to-date on other bootstrapping procedures that sometimes become necessary as we move toward the next release. Synchronizing Your Source There are various ways of using an Internet (or email) connection to stay up-to-date with any given area of the &os; project sources, or all areas, depending on what interests you. The primary services we offer are Anonymous CVS, CVSup, and CTM. While it is possible to update only parts of your source tree, the only supported update procedure is to update the entire tree and recompile both userland (i.e., all the programs that run in user space, such as those in /bin and /sbin) and kernel sources. Updating only part of your source tree, only the kernel, or only userland will often result in problems. These problems may range from compile errors to kernel panics or data corruption. CVS anonymous Anonymous CVS and CVSup use the pull model of updating sources. In the case of CVSup the user (or a cron script) invokes the cvsup program, and it interacts with a cvsupd server somewhere to bring your files up-to-date. The updates you receive are up-to-the-minute and you get them when, and only when, you want them. You can easily restrict your updates to the specific files or directories that are of interest to you. Updates are generated on the fly by the server, according to what you have and what you want to have. Anonymous CVS is quite a bit simpler than CVSup in that it is just an extension to CVS which allows it to pull changes directly from a remote CVS repository. CVSup can do this far more efficiently, but Anonymous CVS is easier to use. CTM CTM, on the other hand, does not interactively compare the sources you have with those on the master archive or otherwise pull them across.
Instead, a script which identifies changes in files since its previous run is executed several times a day on the master CTM machine, any detected changes being compressed, stamped with a sequence-number and encoded for transmission over email (in printable ASCII only). Once received, these CTM deltas can then be handed to the &man.ctm.rmail.1; utility which will automatically decode, verify and apply the changes to the user's copy of the sources. This process is far more efficient than CVSup, and places less strain on our server resources since it is a push rather than a pull model. There are other trade-offs, of course. If you inadvertently wipe out portions of your archive, CVSup will detect and rebuild the damaged portions for you. CTM will not do this, and if you wipe some portion of your source tree out (and do not have it backed up) then you will have to start from scratch (from the most recent CVS base delta) and rebuild it all with CTM or, with Anonymous CVS, simply delete the bad bits and resync. Rebuilding <quote>world</quote> Rebuilding world Once you have synchronized your local source tree against a particular version of &os; (&os.stable;, &os.current;, and so on) you can then use the source tree to rebuild the system. Take a Backup It cannot be stressed enough how important it is to take a backup of your system before you do this. While rebuilding the world is (as long as you follow these instructions) an easy task to do, there will inevitably be times when you make mistakes, or when mistakes made by others in the source tree render your system unbootable. Make sure you have taken a backup. And have a fixit floppy or bootable CD at hand. You will probably never have to use it, but it is better to be safe than sorry! Subscribe to the Right Mailing List mailing list The &os.stable; and &os.current; branches are, by their nature, in development. People that contribute to &os; are human, and mistakes occasionally happen. Sometimes these mistakes can be quite harmless, just causing your system to print a new diagnostic warning. Or the change may be catastrophic, and render your system unbootable or destroy your file systems (or worse). If problems like these occur, a heads up is posted to the appropriate mailing list, explaining the nature of the problem and which systems it affects. And an all clear announcement is posted when the problem has been solved. If you try to track &os.stable; or &os.current; and do not read the &a.stable; or the &a.current; respectively, then you are asking for trouble. Do not use <command>make world</command> A lot of older documentation recommends using make world for this. Doing that skips some important steps and should only be used if you are sure of what you are doing. For almost all circumstances make world is the wrong thing to do, and the procedure described here should be used instead. The Canonical Way to Update Your System To update your system, you should check /usr/src/UPDATING for any pre-buildworld steps necessary for your version of the sources and then use the following procedure: &prompt.root; make buildworld &prompt.root; make buildkernel &prompt.root; make installkernel &prompt.root; reboot There are a few rare cases when an extra run of mergemaster -p is needed before the buildworld step. These are described in UPDATING. In general, though, you can safely omit this step if you are not updating across one or more major &os; versions. After installkernel finishes successfully, you should boot in single user mode (i.e. 
using boot -s from the loader prompt). Then run: &prompt.root; mergemaster -p &prompt.root; make installworld &prompt.root; mergemaster &prompt.root; reboot Read Further Explanations The sequence described above is only a brief summary to help you get started. You should, however, read the following sections to clearly understand each step, especially if you want to use a custom kernel configuration. Read <filename>/usr/src/UPDATING</filename> Before you do anything else, read /usr/src/UPDATING (or the equivalent file wherever you have a copy of the source code). This file should contain important information about problems you might encounter, or specify the order in which you might have to run certain commands. If UPDATING contradicts something you read here, UPDATING takes precedence. Reading UPDATING is not an acceptable substitute for subscribing to the correct mailing list, as described previously. The two requirements are complementary, not exclusive. Check <filename>/etc/make.conf</filename> make.conf Examine the files /usr/share/examples/etc/make.conf (called /etc/defaults/make.conf in &os; 4.X) and /etc/make.conf. The first contains some default defines – most of which are commented out. To make use of them when you rebuild your system from source, add them to /etc/make.conf. Keep in mind that anything you add to /etc/make.conf is also used every time you run make, so it is a good idea to set them to something sensible for your system. A typical user will probably want to copy the CFLAGS and NO_PROFILE (or NOPROFILE on &os; 5.X and older) lines found in /usr/share/examples/etc/make.conf (or in /etc/defaults/make.conf on &os; 4.X) to /etc/make.conf and uncomment them. Examine the other definitions (COPTFLAGS, NOPORTDOCS and so on) and decide if they are relevant to you. Update the Files in <filename>/etc</filename> The /etc directory contains a large part of your system's configuration information, as well as scripts that are run at system startup. Some of these scripts change from version to version of FreeBSD. Some of the configuration files are also used in the day to day running of the system. In particular, /etc/group. There have been occasions when the installation part of make installworld has expected certain usernames or groups to exist. When performing an upgrade it is likely that these users or groups did not exist, which caused problems when upgrading. In some cases make buildworld will check to see if these users or groups exist. A recent example of this is when the smmsp user was added. Users had the installation process fail for them when &man.mtree.8; was trying to create /var/spool/clientmqueue. The solution is to examine /usr/src/etc/group and compare its list of groups with your own. If there are any groups in the new file that are not in your file then copy them over. Similarly, you should rename any groups in /etc/group which have the same GID but a different name to those in /usr/src/etc/group. Since 4.6-RELEASE you can run &man.mergemaster.8; in pre-buildworld mode by providing the -p option. This will compare only those files that are essential for the success of buildworld or installworld.
If your old version of mergemaster does not support -p, use the new version in the source tree when running for the first time: &prompt.root; cd /usr/src/usr.sbin/mergemaster &prompt.root; ./mergemaster.sh -p If you are feeling particularly paranoid, you can check your system to see which files are owned by the group you are renaming or deleting: &prompt.root; find / -group GID -print will show all files owned by group GID (which can be either a group name or a numeric group ID). Drop to Single User Mode single-user mode You may want to compile the system in single user mode. Apart from the obvious benefit of making things go slightly faster, reinstalling the system will touch a lot of important system files, all the standard system binaries, libraries, include files and so on. Changing these on a running system (particularly if you have active users on the system at the time) is asking for trouble. multi-user mode Another method is to compile the system in multi-user mode, and then drop into single user mode for the installation. If you would like to do it this way, simply hold off on the following steps until the build has completed. You can postpone dropping to single user mode until you have to run installkernel or installworld. As the superuser, you can execute: &prompt.root; shutdown now from a running system, which will drop it to single user mode. Alternatively, reboot the system, and at the boot prompt, enter the -s flag. The system will then boot single user. At the shell prompt you should then run: &prompt.root; fsck -p &prompt.root; mount -u / &prompt.root; mount -a -t ufs &prompt.root; swapon -a This checks the file systems, remounts / read/write, mounts all the other UFS file systems referenced in /etc/fstab and then turns swapping on. If your CMOS clock is set to local time and not to GMT (this is true if the output of the &man.date.1; command does not show the correct time and zone), you may also need to run the following command: &prompt.root; adjkerntz -i This will make sure that your local time-zone settings get set up correctly — without this, you may later run into some problems. Remove <filename>/usr/obj</filename> As parts of the system are rebuilt they are placed in directories which (by default) go under /usr/obj. The directories shadow those under /usr/src. You can speed up the make buildworld process, and possibly save yourself some dependency headaches, by removing this directory as well. Some files below /usr/obj may have the immutable flag set (see &man.chflags.1; for more information) which must be removed first. &prompt.root; cd /usr/obj &prompt.root; chflags -R noschg * &prompt.root; rm -rf * Recompile the Source Saving the Output It is a good idea to save the output you get from running &man.make.1; to another file. If something goes wrong you will have a copy of the error message. While this might not help you in diagnosing what has gone wrong, it can help others if you post your problem to one of the &os; mailing lists. The easiest way to do this is to use the &man.script.1; command, with a parameter that specifies the name of the file to save all output to. You would do this immediately before rebuilding the world, and then type exit when the process has finished. &prompt.root; script /var/tmp/mw.out Script started, output file is /var/tmp/mw.out &prompt.root; make TARGET … compile, compile, compile … &prompt.root; exit Script done, … If you do this, do not save the output in /tmp. This directory may be cleared next time you reboot.
A better place to store it is in /var/tmp (as in the previous example) or in root's home directory. Compile the Base System You must be in the /usr/src directory: &prompt.root; cd /usr/src (unless, of course, your source code is elsewhere, in which case change to that directory instead). make To rebuild the world you use the &man.make.1; command. This command reads instructions from the Makefile, which describes how the programs that comprise &os; should be rebuilt, the order in which they should be built, and so on. The general format of the command line you will type is as follows: &prompt.root; make -x -DVARIABLE target In this example, -x is an option that you would pass to &man.make.1;. See the &man.make.1; manual page for the options you can pass. -DVARIABLE passes a variable to the Makefile. The behavior of the Makefile is controlled by these variables. These are the same variables as are set in /etc/make.conf, and this provides another way of setting them. &prompt.root; make -DNO_PROFILE target is another way of specifying that profiled libraries should not be built, and corresponds with the NO_PROFILE= true # Avoid compiling profiled libraries line in /etc/make.conf. target tells &man.make.1; what you want to do. Each Makefile defines a number of different targets, and your choice of target determines what happens. Some targets are listed in the Makefile, but are not meant for you to run. Instead, they are used by the build process to break out the steps necessary to rebuild the system into a number of sub-steps. Most of the time you will not need to pass any parameters to &man.make.1;, and so your command line will look like this: &prompt.root; make target Beginning with version 2.2.5 of &os; (actually, it was first created on the &os.current; branch, and then retrofitted to &os.stable; midway between 2.2.2 and 2.2.5) the world target has been split in two: buildworld and installworld. Beginning with version 5.3 of &os; the world target will be changed so it will not work at all by default, because it is actually dangerous for most users. As the names imply, buildworld builds a complete new tree under /usr/obj, and installworld installs this tree on the current machine. This is very useful for two reasons. First, it allows you to do the build safe in the knowledge that no components of your running system will be affected. The build is self-hosted. Because of this, you can safely run buildworld on a machine running in multi-user mode with no fear of ill-effects. It is still recommended that you run the installworld part in single user mode, though. Secondly, it allows you to use NFS mounts to upgrade multiple machines on your network. If you have three machines, A, B, and C, that you want to upgrade, run make buildworld and make installworld on A. B and C should then NFS mount /usr/src and /usr/obj from A, and you can then run make installworld to install the results of the build on B and C. Although the world target still exists, you are strongly encouraged not to use it. Run &prompt.root; make buildworld It is now possible to specify a -j option to make which will cause it to spawn several simultaneous processes. This is most useful on multi-CPU machines. However, since much of the compiling process is I/O bound rather than CPU bound, it is also useful on single-CPU machines. On a typical single-CPU machine you would run: &prompt.root; make -j4 buildworld &man.make.1; will then have up to 4 processes running at any one time.
Empirical evidence posted to the mailing lists shows this generally gives the best performance benefit. If you have a multi-CPU machine and you are using an SMP-configured kernel, try values between 6 and 10 and see how they speed things up. Be aware that this is still somewhat experimental, and commits to the source tree may occasionally break this feature. If the world fails to compile using this parameter, try again without it before you report any problems. Timings rebuilding world timings Many factors influence the build time, but currently a 500 MHz &pentium; III with 128 MB of RAM takes about 2 hours to build the &os.stable; tree, with no tricks or shortcuts used during the process. A &os.current; tree will take somewhat longer. Compile and Install a New Kernel kernel compiling To take full advantage of your new system you should recompile the kernel. This is practically a necessity, as certain memory structures may have changed, and programs like &man.ps.1; and &man.top.1; will fail to work until the kernel and source code versions are the same. The simplest, safest way to do this is to build and install a kernel based on GENERIC. While GENERIC may not have all the necessary devices for your system, it should contain everything necessary to boot your system back to single user mode. This is a good test that the new system works properly. After booting from GENERIC and verifying that your system works, you can then build a new kernel based on your normal kernel configuration file. On modern versions of FreeBSD it is important to build world before building a new kernel. If you want to build a custom kernel, and already have a configuration file, just use KERNCONF=MYKERNEL like this: &prompt.root; cd /usr/src &prompt.root; make buildkernel KERNCONF=MYKERNEL &prompt.root; make installkernel KERNCONF=MYKERNEL Note that if you have raised kern.securelevel above 1 and you have set the noschg or similar flags on your kernel binary, you might find it necessary to drop into single user mode to use installkernel. Otherwise you should be able to run both these commands from multi-user mode without problems. See &man.init.8; for details about kern.securelevel and &man.chflags.1; for details about the various file flags. Reboot into Single User Mode single-user mode You should reboot into single user mode to test that the new kernel works. Do this by following the instructions given earlier in this chapter. Install the New System Binaries If you were building a version of &os; recent enough to have used make buildworld, then you should now use installworld to install the new system binaries. Run &prompt.root; cd /usr/src &prompt.root; make installworld If you specified variables on the make buildworld command line, you must specify the same variables in the make installworld command line. This does not necessarily hold true for other options; for example, -j must never be used with installworld. For example, if you ran: &prompt.root; make -DNO_PROFILE buildworld you must install the results with: &prompt.root; make -DNO_PROFILE installworld otherwise it would try to install profiled libraries that had not been built during the make buildworld phase. Update Files Not Updated by make installworld Remaking the world will not update certain directories (in particular, /etc, /var and /usr) with new or changed configuration files. The simplest way to update these files is to use &man.mergemaster.8;, though it is possible to do it manually if you would prefer to do that.
Regardless of which way you choose, be sure to make a backup of /etc in case anything goes wrong. Tom Rhodes Contributed by mergemaster mergemaster The &man.mergemaster.8; utility is a Bourne shell script that will aid you in determining the differences between your configuration files in /etc, and the configuration files in the source tree /usr/src/etc. This is the recommended solution for keeping the system configuration files up to date with those located in the source tree. To begin, simply type mergemaster at your prompt, and watch it start going. mergemaster will then build a temporary root environment, from / down, and populate it with various system configuration files. Those files are then compared to the ones currently installed in your system. At this point, files that differ will be shown in &man.diff.1; format, with the + sign representing added or modified lines, and - representing lines that will be either removed completely, or replaced with a new line. See the &man.diff.1; manual page for more information about the &man.diff.1; syntax and how file differences are shown. &man.mergemaster.8; will then show you each file that displays variances, and at this point you will have the option of either deleting the new file (referred to as the temporary file), installing the temporary file in its unmodified state, merging the temporary file with the currently installed file, or viewing the &man.diff.1; results again. Choosing to delete the temporary file will tell &man.mergemaster.8; that you wish to keep your current file unchanged, and to delete the new version. This option is not recommended, unless you see no reason to change the current file. You can get help at any time by typing ? at the &man.mergemaster.8; prompt. If the user chooses to skip a file, it will be presented again after all other files have been dealt with. Choosing to install the unmodified temporary file will replace the current file with the new one. For most unmodified files, this is the best option. Choosing to merge the file will present you with a text editor, and the contents of both files. You can now merge them by reviewing both files side by side on the screen, and choosing parts from both to create a finished product. When the files are compared side by side, the l key will select the left contents and the r key will select contents from the right. The final output will be a file consisting of both parts, which can then be installed. This option is customarily used for files where settings have been modified by the user. Choosing to view the &man.diff.1; results again will show you the file differences just like &man.mergemaster.8; did before prompting you for an option. After &man.mergemaster.8; is done with the system files you will be prompted for other options. &man.mergemaster.8; may ask if you want to rebuild the password file and/or run &man.MAKEDEV.8; if you run a FreeBSD version prior to 5.0, and will finish up with an option to remove left-over temporary files. Manual Update If you wish to do the update manually, however, you cannot just copy over the files from /usr/src/etc to /etc and have it work. Some of these files must be installed first. This is because the /usr/src/etc directory is not a copy of what your /etc directory should look like. In addition, there are files that should be in /etc that are not in /usr/src/etc. If you are using &man.mergemaster.8; (as recommended), you can skip forward to the next section.
The simplest way to do this by hand is to install the files into a new directory, and then work through them looking for differences. Backup Your Existing /etc Although, in theory, nothing is going to touch this directory automatically, it is always better to be sure. So copy your existing /etc directory somewhere safe. Something like: &prompt.root; cp -Rp /etc /etc.old This does a recursive copy, preserving times, file ownerships, and the like. You need to build a dummy set of directories to install the new /etc and other files into. /var/tmp/root is a reasonable choice, and there are a number of subdirectories required under this as well. &prompt.root; mkdir /var/tmp/root &prompt.root; cd /usr/src/etc &prompt.root; make DESTDIR=/var/tmp/root distrib-dirs distribution This will build the necessary directory structure and install the files. A lot of the subdirectories that have been created under /var/tmp/root are empty and should be deleted. The simplest way to do this is to: &prompt.root; cd /var/tmp/root &prompt.root; find -d . -type d | xargs rmdir 2>/dev/null This will remove all empty directories. (Standard error is redirected to /dev/null to prevent the warnings about the directories that are not empty.) /var/tmp/root now contains all the files that should be placed in appropriate locations below /. You now have to go through each of these files, determining how they differ from your existing files. Note that some of the files that will have been installed in /var/tmp/root have a leading dot (.). At the time of writing the only files like this are shell startup files in /var/tmp/root/ and /var/tmp/root/root/, although there may be others (depending on when you are reading this). Make sure you use ls -a to catch them. The simplest way to do this is to use &man.diff.1; to compare the two files: &prompt.root; diff /etc/shells /var/tmp/root/etc/shells This will show you the differences between your /etc/shells file and the new /var/tmp/root/etc/shells file. Use these to decide whether to merge in changes that you have made or whether to copy over your old file. Name the New Root Directory (/var/tmp/root) with a Time Stamp, so You Can Easily Compare Differences Between Versions Frequently rebuilding the world means that you have to update /etc frequently as well, which can be a bit of a chore. You can speed this process up by keeping a copy of the last set of changed files that you merged into /etc. The following procedure gives one idea of how to do this. Make the world as normal. When you want to update /etc and the other directories, give the target directory a name based on the current date. If you were doing this on the 14th of February 1998, you could do the following: &prompt.root; mkdir /var/tmp/root-19980214 &prompt.root; cd /usr/src/etc &prompt.root; make DESTDIR=/var/tmp/root-19980214 \ distrib-dirs distribution Merge in the changes from this directory as outlined above. Do not remove the /var/tmp/root-19980214 directory when you have finished. When you have downloaded the latest version of the source and remade it, follow step 1. This will give you a new directory, which might be called /var/tmp/root-19980221 (if you wait a week between doing updates).
You can now see the differences that have been made in the intervening week using &man.diff.1; to create a recursive diff between the two directories: &prompt.root; cd /var/tmp &prompt.root; diff -r root-19980214 root-19980221 Typically, this will be a much smaller set of differences than those between /var/tmp/root-19980221/etc and /etc. Because the set of differences is smaller, it is easier to migrate those changes across into your /etc directory. You can now remove the older of the two /var/tmp/root-* directories: &prompt.root; rm -rf /var/tmp/root-19980214 Repeat this process every time you need to merge in changes to /etc. You can use &man.date.1; to automate the generation of the directory names: &prompt.root; mkdir /var/tmp/root-`date "+%Y%m%d"` Update /dev DEVFS If you are running FreeBSD 5.0 or later, you can safely skip this section. These versions use &man.devfs.5; to allocate device nodes transparently for the user. In most cases, the &man.mergemaster.8; tool will realize when it is necessary to update the device nodes, and offer to complete it automatically. These instructions explain how to update the device nodes manually. For safety's sake, this is a multi-step process. Copy /var/tmp/root/dev/MAKEDEV to /dev: &prompt.root; cp /var/tmp/root/dev/MAKEDEV /dev MAKEDEV If you used &man.mergemaster.8; to update /etc, then your MAKEDEV script should have been updated already, though it cannot hurt to check (with &man.diff.1;) and copy it manually if necessary. Now, take a snapshot of your current /dev. This snapshot needs to contain the permissions, ownerships, major and minor numbers of each filename, but it should not contain the time stamps. The easiest way to do this is to use &man.awk.1; to strip out some of the information: &prompt.root; cd /dev &prompt.root; ls -l | awk '{print $1, $2, $3, $4, $5, $6, $NF}' > /var/tmp/dev.out Remake all the device nodes: &prompt.root; sh MAKEDEV all Write another snapshot of the directory, this time to /var/tmp/dev2.out. Now look through these two files for any device node that you missed creating. There should not be any, but it is better to be safe than sorry. &prompt.root; diff /var/tmp/dev.out /var/tmp/dev2.out You are most likely to notice disk slice discrepancies, which will involve commands such as: &prompt.root; sh MAKEDEV sd0s1 to recreate the slice entries. Your precise circumstances may vary. Update /stand This step is included only for completeness. It can safely be omitted. If you are using FreeBSD 5.2 or later, the /rescue directory is automatically updated for the user with current, statically compiled binaries during make installworld, thus obsoleting the need to update /stand (which does not exist at all on &os; 6.0 and later). For the sake of completeness, you may want to update the files in /stand as well. These files consist of hard links to the /stand/sysinstall binary. This binary should be statically linked, so that it can work when no other file systems (and in particular /usr) have been mounted. &prompt.root; cd /usr/src/release/sysinstall &prompt.root; make all install Rebooting You are now done. After you have verified that everything appears to be in the right place you can reboot the system. A simple &man.shutdown.8; should do it: &prompt.root; shutdown -r now Finished You should now have successfully upgraded your &os; system. Congratulations.
If things went slightly wrong, it is easy to rebuild a particular piece of the system. For example, if you accidentally deleted /etc/magic as part of the upgrade or merge of /etc, the &man.file.1; command will stop working. In this case, the fix would be to run: &prompt.root; cd /usr/src/usr.bin/file &prompt.root; make all install Questions Do I need to re-make the world for every change? There is no easy answer to this one, as it depends on the nature of the change. For example, if you just ran CVSup, and it has shown the following files as being updated: src/games/cribbage/instr.c src/games/sail/pl_main.c src/release/sysinstall/config.c src/release/sysinstall/media.c src/share/mk/bsd.port.mk it probably is not worth rebuilding the entire world. You could just go to the appropriate sub-directories and make all install, and that's about it. But if something major changed, for example src/lib/libc/stdlib, then you should either re-make the world, or at least those parts of it that are statically linked (as well as anything else you might have added that is statically linked). At the end of the day, it is your call. You might be happy re-making the world every fortnight, say, and let changes accumulate over that fortnight. Or you might want to re-make just those things that have changed, and be confident you can spot all the dependencies. And, of course, this all depends on how often you want to upgrade, and whether you are tracking &os.stable; or &os.current;. My compile failed with lots of signal 11 (or other signal number) errors. What has happened? signal 11 This is normally indicative of hardware problems. (Re)making the world is an effective way to stress test your hardware, and will frequently throw up memory problems. These normally manifest themselves as the compiler mysteriously dying on receipt of strange signals. A sure indicator of this is if you can restart the make and it dies at a different point in the process. In this instance there is little you can do except start swapping around the components in your machine to determine which one is failing. Can I remove /usr/obj when I have finished? The short answer is yes. /usr/obj contains all the object files that were produced during the compilation phase. Normally, one of the first steps in the make buildworld process is to remove this directory and start afresh. In this case, keeping /usr/obj around after you have finished makes little sense, and removing it will free up a large chunk of disk space (currently about 340 MB). However, if you know what you are doing, you can have make buildworld skip this step. This will make subsequent builds run much faster, since most of the sources will not need to be recompiled. The flip side of this is that subtle dependency problems can creep in, causing your build to fail in odd ways. This frequently generates noise on the &os; mailing lists, when one person complains that their build has failed, not realizing that it is because they have tried to cut corners. Can interrupted builds be resumed? This depends on how far through the process you got before you found a problem. In general (and this is not a hard and fast rule) the make buildworld process builds new copies of essential tools (such as &man.gcc.1; and &man.make.1;) and the system libraries. These tools and libraries are then installed. The new tools and libraries are then used to rebuild themselves, and are installed again.
The entire system (now including regular user programs, such as &man.ls.1; or &man.grep.1;) is then rebuilt with the new system files. If you are at the last stage, and you know it (because you have looked through the output that you were storing), then you can (fairly safely) do: … fix the problem … &prompt.root; cd /usr/src &prompt.root; make -DNO_CLEAN all On &os; 5.X and older, use -DNOCLEAN instead. This will not undo the work of the previous make buildworld. If you see the message: -------------------------------------------------------------- Building everything.. -------------------------------------------------------------- in the make buildworld output, then it is probably fairly safe to do so. If you do not see that message, or you are not sure, then it is always better to be safe than sorry, and restart the build from scratch. How can I speed up making the world? Run in single user mode. Put the /usr/src and /usr/obj directories on separate file systems held on separate disks. If possible, put these disks on separate disk controllers. Better still, put these file systems across multiple disks using the &man.ccd.4; (concatenated disk driver) device. Turn off profiling (set NO_PROFILE=true in /etc/make.conf). You almost certainly do not need it. Also in /etc/make.conf, set CFLAGS to something like -O -pipe. The -O2 optimization is much slower, and the optimization difference between -O and -O2 is normally negligible. -pipe lets the compiler use pipes rather than temporary files for communication, which saves disk access (at the expense of memory). Pass the -j option to &man.make.1; to run multiple processes in parallel. This usually helps regardless of whether you have a single- or a multi-processor machine. The file system holding /usr/src can be mounted (or remounted) with the noatime option. This prevents the file system from recording the file access time. You probably do not need this information anyway. &prompt.root; mount -u -o noatime /usr/src The example assumes /usr/src is on its own file system. If it is not (if it is a part of /usr for example) then you will need to use that file system mount point, and not /usr/src. The file system holding /usr/obj can be mounted (or remounted) with the async option. This causes disk writes to happen asynchronously. In other words, the write completes immediately, and the data is written to the disk a few seconds later. This allows writes to be clustered together, and can be a dramatic performance boost. Keep in mind that this option makes your file system more fragile. With this option there is an increased chance that, should power fail, the file system will be in an unrecoverable state when the machine restarts. If /usr/obj is the only thing on this file system then it is not a problem. If you have other, valuable data on the same file system then ensure your backups are fresh before you enable this option. &prompt.root; mount -u -o async /usr/obj As above, if /usr/obj is not on its own file system, replace it in the example with the name of the appropriate mount point. What do I do if something goes wrong? Make absolutely sure your environment has no extraneous cruft from earlier builds. This is simple enough. &prompt.root; chflags -R noschg /usr/obj/usr &prompt.root; rm -rf /usr/obj/usr &prompt.root; cd /usr/src &prompt.root; make cleandir &prompt.root; make cleandir Yes, make cleandir really should be run twice. Then restart the whole process, starting with make buildworld. If you still have problems, send the error and the output of uname -a to &a.questions;.
Be prepared to answer other questions about your setup! Mike Meyer Contributed by Tracking for Multiple Machines NFS installing multiple machines If you have multiple machines that you want to track the same source tree, then having all of them download sources and rebuild everything seems like a waste of resources: disk space, network bandwidth, and CPU cycles. It is, and the solution is to have one machine do most of the work, while the rest of the machines mount that work via NFS. This section outlines a method of doing so. Preliminaries First, identify a set of machines that is going to run the same set of binaries, which we will call a build set. Each machine can have a custom kernel, but they will be running the same userland binaries. From that set, choose a machine to be the build machine. It is going to be the machine that the world and kernel are built on. Ideally, it should be a fast machine that has sufficient spare CPU to run make buildworld and make buildkernel. You will also want to choose a machine to be the test machine, which will test software updates before they are put into production. This must be a machine that you can afford to have down for an extended period of time. It can be the build machine, but need not be. All the machines in this build set need to mount /usr/obj and /usr/src from the same machine, and at the same point. Ideally, those are on two different drives on the build machine, but they can be NFS mounted on that machine as well. If you have multiple build sets, /usr/src should be on one build machine, and NFS mounted on the rest. Finally, make sure that /etc/make.conf on all the machines in the build set agrees with the build machine. That means that the build machine must build all the parts of the base system that any machine in the build set is going to install. Also, each machine should have its kernel name set with KERNCONF in /etc/make.conf, and the build machine should list them all in KERNCONF, listing its own kernel first. The build machine must have the kernel configuration files for each machine in /usr/src/sys/arch/conf if it is going to build their kernels. The Base System Now that all that is done, you are ready to build everything. Build the kernel and world on the build machine as described earlier in this chapter, but do not install anything. After the build has finished, go to the test machine, and install the kernel you just built. If this machine mounts /usr/src and /usr/obj via NFS, when you reboot to single user you will need to enable the network and mount them. The easiest way to do this is to boot to multi-user, then run shutdown now to go to single user mode. Once there, you can install the new kernel and world and run mergemaster just as you normally would. When done, reboot to return to normal multi-user operations for this machine. After you are certain that everything on the test machine is working properly, use the same procedure to install the new software on each of the other machines in the build set. Ports The same ideas can be used for the ports tree. The first critical step is mounting /usr/ports from the same machine to all the machines in the build set. You can then set up /etc/make.conf properly to share distfiles. You should set DISTDIR to a common shared directory that is writable by whichever user root is mapped to by your NFS mounts. Each machine should set WRKDIRPREFIX to a local build directory. Finally, if you are going to be building and distributing packages, you should set PACKAGES to a directory similar to DISTDIR.
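As an illustration only (the directory names below are assumptions and will depend on how your NFS exports are laid out), the ports-related part of a shared /etc/make.conf might look like this:
DISTDIR= /usr/ports/distfiles # shared over NFS, writable by the mapped root user
WRKDIRPREFIX= /var/tmp # a local build directory on each machine
PACKAGES= /usr/ports/packages # shared; only needed if you build and distribute packages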
diff --git a/en_US.ISO8859-1/books/handbook/disks/chapter.sgml b/en_US.ISO8859-1/books/handbook/disks/chapter.sgml index 88feeefee0..3a9e064b04 100644 --- a/en_US.ISO8859-1/books/handbook/disks/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/disks/chapter.sgml @@ -1,4139 +1,4139 @@ Storage Synopsis This chapter covers the use of disks in FreeBSD. This includes memory-backed disks, network-attached disks, standard SCSI/IDE storage devices, and devices using the USB interface. After reading this chapter, you will know: The terminology FreeBSD uses to describe the organization of data on a physical disk (partitions and slices). How to add additional hard disks to your system. How to configure &os; to use USB storage devices. How to set up virtual file systems, such as memory disks. How to use quotas to limit disk space usage. How to encrypt disks to secure them against attackers. How to create and burn CDs and DVDs on FreeBSD. The various storage media options for backups. How to use backup programs available under FreeBSD. How to backup to floppy disks. What snapshots are and how to use them efficiently. Before reading this chapter, you should: Know how to configure and install a new FreeBSD kernel (). Device Names The following is a list of physical storage devices supported in FreeBSD, and the device names associated with them. Physical Disk Naming Conventions Drive type Drive device name IDE hard drives ad IDE CDROM drives acd SCSI hard drives and USB Mass storage devices da SCSI CDROM drives cd Assorted non-standard CDROM drives mcd for Mitsumi CD-ROM, scd for Sony CD-ROM, matcd for Matsushita/Panasonic CD-ROM The &man.matcd.4; driver has been removed in FreeBSD 4.X branch since October 5th, 2002 and does not exist in FreeBSD 5.0 and later releases. Floppy drives fd SCSI tape drives sa IDE tape drives ast Flash drives fla for &diskonchip; Flash device RAID drives aacd for &adaptec; AdvancedRAID, mlxd and mlyd for &mylex;, amrd for AMI &megaraid;, idad for Compaq Smart RAID, twed for &tm.3ware; RAID.
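To see which of these names the kernel actually assigned on a given system, you can search the boot messages; a minimal illustration (adjust the pattern to the drive types you are interested in): &prompt.root; grep -E '^(ad|da|acd|cd)[0-9]' /var/run/dmesg.boot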
David O'Brien Originally contributed by Adding Disks disks adding Let's say we want to add a new SCSI disk to a machine that currently only has a single drive. First, turn off the computer and install the drive in the computer following the instructions of the computer, controller, and drive manufacturer. Due to the wide variation in procedures to do this, the details are beyond the scope of this document. Log in as user root. After you have installed the drive, inspect /var/run/dmesg.boot to ensure the new disk was found. Continuing with our example, the newly added drive will be da1 and we want to mount it on /1 (if you are adding an IDE drive, the device name will be wd1 in pre-4.0 systems, or ad1 in 4.X and 5.X systems). partitions slices fdisk FreeBSD runs on IBM-PC compatible computers, and therefore it must take into account the PC BIOS partitions. These are different from the traditional BSD partitions. A PC disk has up to four BIOS partition entries. If the disk is going to be truly dedicated to FreeBSD, you can use the dedicated mode. Otherwise, FreeBSD will have to live within one of the PC BIOS partitions. FreeBSD calls the PC BIOS partitions slices so as not to confuse them with traditional BSD partitions. You may also use slices on a disk that is dedicated to FreeBSD, but used in a computer that also has another operating system installed. This is a good way to avoid confusing the fdisk utility of other, non-FreeBSD operating systems. In the slice case, the drive will be added as /dev/da1s1e. This is read as: SCSI disk, unit number 1 (second SCSI disk), slice 1 (PC BIOS partition 1), and e BSD partition. In the dedicated case, the drive will be added simply as /dev/da1e. Due to the use of 32-bit integers to store the number of sectors, &man.bsdlabel.8; (called &man.disklabel.8; in &os; 4.X) is limited to 2^32-1 sectors per disk or 2TB in most cases. The &man.fdisk.8; format allows a starting sector of no more than 2^32-1 and a length of no more than 2^32-1, limiting partitions to 2TB and disks to 4TB in most cases. The &man.sunlabel.8; format is limited to 2^32-1 sectors per partition and 8 partitions for a total of 16TB. For larger disks, &man.gpt.8; partitions may be used. Using &man.sysinstall.8; sysinstall adding disks su Navigating Sysinstall You may use sysinstall (/stand/sysinstall in &os; versions older than 5.2) to partition and label a new disk using its easy-to-use menus. Either log in as user root or use the su command. Run sysinstall and enter the Configure menu. Within the FreeBSD Configuration Menu, scroll down and select the Fdisk option. fdisk Partition Editor Once inside fdisk, typing A will use the entire disk for FreeBSD. When asked if you want to remain cooperative with any future possible operating systems, answer YES. Write the changes to the disk using W. Now exit the FDISK editor by typing q. Next you will be asked about the Master Boot Record. Since you are adding a disk to an already running system, choose None. Disk Label Editor BSD partitions Next, you need to exit sysinstall and start it again. Follow the directions above, although this time choose the Label option. This will enter the Disk Label Editor. This is where you will create the traditional BSD partitions. A disk can have up to eight partitions, labeled a-h. A few of the partition labels have special uses. The a partition is used for the root partition (/). Thus only your system disk (e.g., the disk you boot from) should have an a partition.
The b partition is used for swap partitions, and you may have many disks with swap partitions. The c partition addresses the entire disk in dedicated mode, or the entire FreeBSD slice in slice mode. The other partitions are for general use. sysinstall's Label editor favors the e partition for non-root, non-swap partitions. Within the Label editor, create a single file system by typing C. When prompted if this will be a FS (file system) or swap, choose FS and type in a mount point (e.g., /mnt). When adding a disk in post-install mode, sysinstall will not create entries in /etc/fstab for you, so the mount point you specify is not important. You are now ready to write the new label to the disk and create a file system on it. Do this by typing W. Ignore any errors from sysinstall saying that it could not mount the new partition. Exit the Label Editor and sysinstall completely. Finish The last step is to edit /etc/fstab to add an entry for your new disk. Using Command Line Utilities Using Slices This setup will allow your disk to work correctly with other operating systems that might be installed on your computer and will not confuse other operating systems' fdisk utilities. It is recommended to use this method for new disk installs. Only use dedicated mode if you have a good reason to do so! &prompt.root; dd if=/dev/zero of=/dev/da1 bs=1k count=1 &prompt.root; fdisk -BI da1 #Initialize your new disk &prompt.root; disklabel -B -w -r da1s1 auto #Label it. &prompt.root; disklabel -e da1s1 # Edit the disklabel just created and add any partitions. &prompt.root; mkdir -p /1 &prompt.root; newfs /dev/da1s1e # Repeat this for every partition you created. &prompt.root; mount /dev/da1s1e /1 # Mount the partition(s) &prompt.root; vi /etc/fstab # Add the appropriate entry/entries to your /etc/fstab. If you have an IDE disk, substitute ad for da. On pre-4.X systems, use wd. Dedicated OS/2 If you will not be sharing the new drive with another operating system, you may use the dedicated mode. Remember this mode can confuse Microsoft operating systems; however, no damage will be done by them. IBM's &os2;, however, will appropriate any partition it finds which it does not understand. &prompt.root; dd if=/dev/zero of=/dev/da1 bs=1k count=1 &prompt.root; disklabel -Brw da1 auto &prompt.root; disklabel -e da1 # create the `e' partition &prompt.root; newfs -d0 /dev/da1e &prompt.root; mkdir -p /1 &prompt.root; vi /etc/fstab # add an entry for /dev/da1e &prompt.root; mount /1 An alternate method is: &prompt.root; dd if=/dev/zero of=/dev/da1 count=2 &prompt.root; disklabel /dev/da1 | disklabel -BrR da1 /dev/stdin &prompt.root; newfs /dev/da1e &prompt.root; mkdir -p /1 &prompt.root; vi /etc/fstab # add an entry for /dev/da1e &prompt.root; mount /1 Since &os; 5.1-RELEASE, the &man.bsdlabel.8; utility replaces the old &man.disklabel.8; program. With &man.bsdlabel.8; a number of obsolete options and parameters have been retired; in the examples above, the -r option should be removed when using &man.bsdlabel.8;. For more information, please refer to the &man.bsdlabel.8; manual page. RAID Software RAID Christopher Shumway Original work by Jim Brown Revised by RAIDsoftware RAIDCCD Concatenated Disk Driver (CCD) Configuration When choosing a mass storage solution, the most important factors to consider are speed, reliability, and cost. It is rare to have all three in balance; normally a fast, reliable mass storage device is expensive, and to cut back on cost either speed or reliability must be sacrificed.
In designing the system described below, cost was chosen as the most important factor, followed by speed, then reliability. Data transfer speed for this system is ultimately constrained by the network. And while reliability is very important, the CCD drive described below serves online data that is already fully backed up on CD-R's and can easily be replaced. Defining your own requirements is the first step in choosing a mass storage solution. If your requirements prefer speed or reliability over cost, your solution will differ from the system described in this section. Installing the Hardware In addition to the IDE system disk, three Western Digital 30GB, 5400 RPM IDE disks form the core of the CCD disk described below, providing approximately 90GB of online storage. Ideally, each IDE disk would have its own IDE controller and cable, but to minimize cost, additional IDE controllers were not used. Instead, the disks were configured with jumpers so that each IDE controller has one master, and one slave. Upon reboot, the system BIOS was configured to automatically detect the disks attached. More importantly, FreeBSD detected them on reboot: ad0: 19574MB <WDC WD205BA> [39770/16/63] at ata0-master UDMA33 ad1: 29333MB <WDC WD307AA> [59598/16/63] at ata0-slave UDMA33 ad2: 29333MB <WDC WD307AA> [59598/16/63] at ata1-master UDMA33 ad3: 29333MB <WDC WD307AA> [59598/16/63] at ata1-slave UDMA33 If FreeBSD does not detect all the disks, ensure that you have jumpered them correctly. Most IDE drives also have a Cable Select jumper. This is not the jumper for the master/slave relationship. Consult the drive documentation for help in identifying the correct jumper. Next, consider how to attach them as part of the file system. You should research both &man.vinum.8; and &man.ccd.4;. In this particular configuration, &man.ccd.4; was chosen. Setting Up the CCD The &man.ccd.4; driver allows you to take several identical disks and concatenate them into one logical file system. In order to use &man.ccd.4;, you need a kernel with &man.ccd.4; support built in. Add this line to your kernel configuration file, rebuild, and reinstall the kernel: pseudo-device ccd 4 On 5.X systems, you have to use the following line instead: device ccd In FreeBSD 5.X, it is not necessary to specify a number of &man.ccd.4; devices, as the &man.ccd.4; device driver is now self-cloning: new device instances will automatically be created on demand. The &man.ccd.4; support can also be loaded as a kernel loadable module in FreeBSD 3.0 or later. To set up &man.ccd.4;, you must first use &man.disklabel.8; to label the disks: disklabel -r -w ad1 auto disklabel -r -w ad2 auto disklabel -r -w ad3 auto This creates a disklabel for ad1c, ad2c and ad3c that spans the entire disk. Since &os; 5.1-RELEASE, the &man.bsdlabel.8; utility replaces the old &man.disklabel.8; program. With &man.bsdlabel.8; a number of obsolete options and parameters have been retired; in the examples above, the -r option should be removed. For more information, please refer to the &man.bsdlabel.8; manual page. The next step is to change the disk label type. You can use &man.disklabel.8; to edit the disks: disklabel -e ad1 disklabel -e ad2 disklabel -e ad3 This opens up the current disk label on each disk with the editor specified by the EDITOR environment variable, typically &man.vi.1;. An unmodified disk label will look something like this: 8 partitions: # size offset fstype [fsize bsize bps/cpg] c: 60074784 0 unused 0 0 0 # (Cyl.
0 - 59597) Add a new e partition for &man.ccd.4; to use. This can usually be copied from the c partition, but the fstype must be 4.2BSD. The disk label should now look something like this: 8 partitions: # size offset fstype [fsize bsize bps/cpg] c: 60074784 0 unused 0 0 0 # (Cyl. 0 - 59597) e: 60074784 0 4.2BSD 0 0 0 # (Cyl. 0 - 59597) Building the File System The device node for ccd0c may not exist yet, so to create it, perform the following commands: cd /dev sh MAKEDEV ccd0 In FreeBSD 5.0, &man.devfs.5; will automatically manage device nodes in /dev, so use of MAKEDEV is not necessary. Now that you have all the disks labeled, you must build the &man.ccd.4;. To do that, use &man.ccdconfig.8;, with options similar to the following: ccdconfig ccd0 32 0 /dev/ad1e /dev/ad2e /dev/ad3e The use and meaning of each option is shown below: The first argument is the device to configure, in this case, /dev/ccd0c. The /dev/ portion is optional. The interleave for the file system. The interleave defines the size of a stripe in disk blocks, each normally 512 bytes. So, an interleave of 32 would be 16,384 bytes. Flags for &man.ccdconfig.8;. If you want to enable drive mirroring, you can specify a flag here. This configuration does not provide mirroring for &man.ccd.4;, so it is set at 0 (zero). The final arguments to &man.ccdconfig.8; are the devices to place into the array. Use the complete pathname for each device. After running &man.ccdconfig.8;, the &man.ccd.4; is configured. A file system can be installed. Refer to &man.newfs.8; for options, or simply run: newfs /dev/ccd0c Making it All Automatic Generally, you will want to mount the &man.ccd.4; upon each reboot. To do this, you must configure it first. Write out your current configuration to /etc/ccd.conf using the following command: ccdconfig -g > /etc/ccd.conf During reboot, the script /etc/rc runs ccdconfig -C if /etc/ccd.conf exists. This automatically configures the &man.ccd.4; so it can be mounted. If you are booting into single user mode, before you can &man.mount.8; the &man.ccd.4;, you need to issue the following command to configure the array: ccdconfig -C To automatically mount the &man.ccd.4;, place an entry for the &man.ccd.4; in /etc/fstab so it will be mounted at boot time: /dev/ccd0c /media ufs rw 2 2 The Vinum Volume Manager RAIDsoftware RAID Vinum The Vinum Volume Manager is a block device driver which implements virtual disk drives. It isolates disk hardware from the block device interface and maps data in ways which result in an increase in flexibility, performance and reliability compared to the traditional slice view of disk storage. &man.vinum.8; implements the RAID-0, RAID-1 and RAID-5 models, both individually and in combination. See the chapter on Vinum in this book for more information about &man.vinum.8;. Hardware RAID RAID hardware FreeBSD also supports a variety of hardware RAID controllers. These devices control a RAID subsystem without the need for FreeBSD-specific software to manage the array. Using an on-card BIOS, the card controls most of the disk operations itself. The following is a brief setup description using a Promise IDE RAID controller. When this card is installed and the system is started up, it displays a prompt requesting information. Follow the instructions to enter the card's setup screen. From here, you have the ability to combine all the attached drives. After doing so, the disk(s) will look like a single drive to FreeBSD. Other RAID levels can be set up accordingly.
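Once the array has been created in the card's BIOS, it generally appears to &os; as a single device; assuming the first array shows up as ar0 (the unit number depends on your setup), its health can be checked at any time with: &prompt.root; atacontrol status ar0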
Rebuilding ATA RAID1 Arrays FreeBSD allows you to hot-replace a failed disk in an array. This requires that you catch it before you reboot. You will probably see something like the following in /var/log/messages or in the &man.dmesg.8; output: ad6 on monster1 suffered a hard error. ad6: READ command timeout tag=0 serv=0 - resetting ad6: trying fallback to PIO mode ata3: resetting devices .. done ad6: hard error reading fsbn 1116119 of 0-7 (ad6 bn 1116119; cn 1107 tn 4 sn 11)\\ status=59 error=40 ar0: WARNING - mirror lost Using &man.atacontrol.8;, check for further information: &prompt.root; atacontrol list ATA channel 0: Master: no device present Slave: acd0 <HL-DT-ST CD-ROM GCR-8520B/1.00> ATA/ATAPI rev 0 ATA channel 1: Master: no device present Slave: no device present ATA channel 2: Master: ad4 <MAXTOR 6L080J4/A93.0500> ATA/ATAPI rev 5 Slave: no device present ATA channel 3: Master: ad6 <MAXTOR 6L080J4/A93.0500> ATA/ATAPI rev 5 Slave: no device present &prompt.root; atacontrol status ar0 ar0: ATA RAID1 subdisks: ad4 ad6 status: DEGRADED You will first need to detach the ata channel with the failed disk so you can safely remove it: &prompt.root; atacontrol detach ata3 Replace the disk. Reattach the ata channel: &prompt.root; atacontrol attach ata3 Master: ad6 <MAXTOR 6L080J4/A93.0500> ATA/ATAPI rev 5 Slave: no device present Add the new disk to the array as a spare: &prompt.root; atacontrol addspare ar0 ad6 Rebuild the array: &prompt.root; atacontrol rebuild ar0 It is possible to check on the progress by issuing the following command: &prompt.root; dmesg | tail -10 [output removed] ad6: removed from configuration ad6: deleted from ar0 disk1 ad6: inserted into ar0 disk1 as spare &prompt.root; atacontrol status ar0 ar0: ATA RAID1 subdisks: ad4 ad6 status: REBUILDING 0% completed Wait until this operation completes. Marc Fonvieille Contributed by USB Storage Devices USB disks Nowadays, a lot of external storage solutions use the Universal Serial Bus (USB): hard drives, USB thumbdrives, CD-R burners, etc. &os; provides support for these devices. Configuration The USB mass storage device driver, &man.umass.4;, provides support for USB storage devices. If you use the GENERIC kernel, you do not have to change anything in your configuration. If you use a custom kernel, be sure that the following lines are present in your kernel configuration file: device scbus device da device pass device uhci device ohci device usb device umass The &man.umass.4; driver uses the SCSI subsystem to access the USB storage devices, so your USB device will be seen as a SCSI device by the system. Depending on the USB chipset on your motherboard, you only need either device uhci or device ohci; however, having both in the kernel configuration file is harmless. Do not forget to compile and install the new kernel if you added any lines. If your USB device is a CD-R or DVD burner, the SCSI CD-ROM driver, &man.cd.4;, must be added to the kernel via the line: device cd Since the burner is seen as a SCSI drive, the driver &man.atapicam.4; should not be used in the kernel configuration. Support for USB 2.0 controllers is provided on &os; 5.X, and on the 4.X branch since &os; 4.10-RELEASE. You have to add: device ehci to your configuration file for USB 2.0 support. Note that the &man.uhci.4; and &man.ohci.4; drivers are still needed if you want USB 1.X support. On &os; 4.X, the USB daemon (&man.usbd.8;) must be running to be able to see some USB devices.
To enable it, add usbd_enable="YES" to your /etc/rc.conf file and reboot the machine. Testing the Configuration The configuration is ready to be tested: plug in your USB device, and in the system message buffer (&man.dmesg.8;), the drive should appear as something like: umass0: USB Solid state disk, rev 1.10/1.00, addr 2 GEOM: create disk da0 dp=0xc2d74850 da0 at umass-sim0 bus 0 target 0 lun 0 da0: <Generic Traveling Disk 1.11> Removable Direct Access SCSI-2 device da0: 1.000MB/s transfers da0: 126MB (258048 512 byte sectors: 64H 32S/T 126C) Of course, the brand, the device node (da0) and other details can differ according to your configuration. Since the USB device is seen as a SCSI one, the camcontrol command can be used to list the USB storage devices attached to the system: &prompt.root; camcontrol devlist <Generic Traveling Disk 1.11> at scbus0 target 0 lun 0 (da0,pass0) If the drive comes with a file system, you should be able to mount it. The Adding Disks section earlier in this chapter will help you format and create partitions on the USB drive if needed. If you unplug the device (the disk must be unmounted first), you should see, in the system message buffer, something like the following: umass0: at uhub0 port 1 (addr 2) disconnected (da0:umass-sim0:0:0:0): lost device (da0:umass-sim0:0:0:0): removing device entry GEOM: destroy disk da0 dp=0xc2d74850 umass0: detached Further Reading Besides the Adding Disks and Mounting and Unmounting File Systems sections, reading various manual pages may also be useful: &man.umass.4;, &man.camcontrol.8;, and &man.usbdevs.8;. Mike Meyer Contributed by Creating and Using Optical Media (CDs) CDROMs creating Introduction CDs have a number of features that differentiate them from conventional disks. Initially, they were not writable by the user. They are designed so that they can be read continuously without delays to move the head between tracks. They are also much easier to transport between systems than similarly sized media were at the time. CDs do have tracks, but this refers to a section of data to be read continuously and not a physical property of the disk. To produce a CD on FreeBSD, you prepare the data files that are going to make up the tracks on the CD, then write the tracks to the CD. ISO 9660 file systems ISO 9660 The ISO 9660 file system was designed to deal with these differences. It unfortunately codifies file system limits that were common then. Fortunately, it provides an extension mechanism that allows properly written CDs to exceed those limits while still working with systems that do not support those extensions. sysutils/cdrtools The sysutils/cdrtools port includes &man.mkisofs.8;, a program that you can use to produce a data file containing an ISO 9660 file system. It has options that support various extensions, and is described below. CD burner ATAPI Which tool to use to burn the CD depends on whether your CD burner is ATAPI or something else. ATAPI CD burners use the burncd program that is part of the base system. SCSI and USB CD burners should use cdrecord from the sysutils/cdrtools port. burncd has a limited number of supported drives. To find out if a drive is supported, see the CD-R/RW supported drives list. CD burner ATAPI/CAM driver If you run &os; 5.X, or &os; 4.8-RELEASE or higher, it is possible to use cdrecord and other tools for SCSI drives on ATAPI hardware with the ATAPI/CAM module. If you want CD burning software with a graphical user interface, you should have a look at X-CD-Roast or K3b.
These tools are available as packages or from the sysutils/xcdroast and sysutils/k3b ports. X-CD-Roast and K3b require the ATAPI/CAM module with ATAPI hardware. mkisofs The &man.mkisofs.8; program, which is part of the sysutils/cdrtools port, produces an ISO 9660 file system that is an image of a directory tree in the &unix; file system name space. The simplest usage is: &prompt.root; mkisofs -o imagefile.iso /path/to/tree file systems ISO 9660 This command will create an imagefile.iso containing an ISO 9660 file system that is a copy of the tree at /path/to/tree. In the process, it will map the file names to names that fit the limitations of the standard ISO 9660 file system, and will exclude files that have names uncharacteristic of ISO file systems. file systems HFS file systems Joliet A number of options are available to overcome those restrictions. In particular, -R enables the Rock Ridge extensions common to &unix; systems, -J enables Joliet extensions used by Microsoft systems, and -hfs can be used to create HFS file systems used by &macos;. For CDs that are going to be used only on FreeBSD systems, -U can be used to disable all filename restrictions. When used with -R, it produces a file system image that is identical to the FreeBSD tree you started from, though it may violate the ISO 9660 standard in a number of ways. CDROMs creating bootable The last option of general use is -b. This is used to specify the location of the boot image for use in producing an El Torito bootable CD. This option takes an argument, which is the path to a boot image from the top of the tree being written to the CD. By default, &man.mkisofs.8; creates an ISO image in the so-called floppy disk emulation mode, and thus expects the boot image to be exactly 1200, 1440 or 2880 KB in size. Some boot loaders, like the one used by the FreeBSD distribution disks, do not use emulation mode; in this case, the -no-emul-boot option should be used. So, if /tmp/myboot holds a bootable FreeBSD system with the boot image in /tmp/myboot/boot/cdboot, you could produce the image of an ISO 9660 file system in /tmp/bootable.iso like so: &prompt.root; mkisofs -R -no-emul-boot -b boot/cdboot -o /tmp/bootable.iso /tmp/myboot Having done that, if you have vn (FreeBSD 4.X), or md (FreeBSD 5.X) configured in your kernel, you can mount the file system with: &prompt.root; vnconfig -e vn0c /tmp/bootable.iso &prompt.root; mount -t cd9660 /dev/vn0c /mnt for FreeBSD 4.X, and for FreeBSD 5.X: &prompt.root; mdconfig -a -t vnode -f /tmp/bootable.iso -u 0 &prompt.root; mount -t cd9660 /dev/md0 /mnt At which point you can verify that /mnt and /tmp/myboot are identical. There are many other options you can use with &man.mkisofs.8; to fine-tune its behavior, in particular options controlling modifications to the ISO 9660 layout and the creation of Joliet and HFS discs. See the &man.mkisofs.8; manual page for details. burncd CDROMs burning If you have an ATAPI CD burner, you can use the burncd command to burn an ISO image onto a CD. burncd is part of the base system, installed as /usr/sbin/burncd. Usage is very simple, as it has few options: &prompt.root; burncd -f cddevice data imagefile.iso fixate This will burn a copy of imagefile.iso onto cddevice. The default device is /dev/acd0 (or /dev/acd0c under &os; 4.X). See &man.burncd.8; for options to set the write speed, eject the CD after burning, and write audio data. cdrecord If you do not have an ATAPI CD burner, you will have to use cdrecord to burn your CDs.
cdrecord is not part of the base system; you must install it from either the port at sysutils/cdrtools or the appropriate package. Changes to the base system can cause binary versions of this program to fail, possibly resulting in a coaster. You should therefore either upgrade the port when you upgrade your system, or if you are tracking -STABLE, upgrade the port when a new version becomes available. While cdrecord has many options, basic usage is even simpler than burncd. Burning an ISO 9660 image is done with: &prompt.root; cdrecord dev=device imagefile.iso The tricky part of using cdrecord is finding the dev value to use. To find the proper setting, use the -scanbus flag of cdrecord, which might produce results like this: CDROMs burning &prompt.root; cdrecord -scanbus Cdrecord 1.9 (i386-unknown-freebsd4.2) Copyright (C) 1995-2000 Jörg Schilling Using libscg version 'schily-0.1' scsibus0: 0,0,0 0) 'SEAGATE ' 'ST39236LW ' '0004' Disk 0,1,0 1) 'SEAGATE ' 'ST39173W ' '5958' Disk 0,2,0 2) * 0,3,0 3) 'iomega ' 'jaz 1GB ' 'J.86' Removable Disk 0,4,0 4) 'NEC ' 'CD-ROM DRIVE:466' '1.26' Removable CD-ROM 0,5,0 5) * 0,6,0 6) * 0,7,0 7) * scsibus1: 1,0,0 100) * 1,1,0 101) * 1,2,0 102) * 1,3,0 103) * 1,4,0 104) * 1,5,0 105) 'YAMAHA ' 'CRW4260 ' '1.0q' Removable CD-ROM 1,6,0 106) 'ARTEC ' 'AM12S ' '1.06' Scanner 1,7,0 107) * This lists the appropriate dev value for the devices on the list. Locate your CD burner, and use the three numbers separated by commas as the value for dev. In this case, the CRW device is 1,5,0, so the appropriate input would be dev=1,5,0. There are easier ways to specify this value; see &man.cdrecord.1; for details. That is also the place to look for information on writing audio tracks, controlling the speed, and other things. Duplicating Audio CDs You can duplicate an audio CD by extracting the audio data from the CD to a series of files, and then writing these files to a blank CD. The process is slightly different for ATAPI and SCSI drives. SCSI Drives Use cdda2wav to extract the audio. &prompt.user; cdda2wav -v255 -D2,0 -B -Owav Use cdrecord to write the .wav files. &prompt.user; cdrecord -v dev=2,0 -dao -useinfo *.wav Make sure that 2,0 is set appropriately, as described above. ATAPI Drives The ATAPI CD driver makes each track available as /dev/acddtnn, where d is the drive number, and nn is the track number written with two decimal digits, prefixed with zero as needed. So the first track on the first disk is /dev/acd0t01, the second is /dev/acd0t02, the third is /dev/acd0t03, and so on. Make sure the appropriate files exist in /dev. &prompt.root; cd /dev &prompt.root; sh MAKEDEV acd0t99 In FreeBSD 5.0, &man.devfs.5; will automatically create and manage entries in /dev for you, so it is not necessary to use MAKEDEV. Extract each track using &man.dd.1;. You must also use a specific block size when extracting the files. &prompt.root; dd if=/dev/acd0t01 of=track1.cdr bs=2352 &prompt.root; dd if=/dev/acd0t02 of=track2.cdr bs=2352 ... Burn the extracted files to disk using burncd. You must specify that these are audio files, and that burncd should fixate the disk when finished. &prompt.root; burncd -f /dev/acd0 audio track1.cdr track2.cdr ... fixate Duplicating Data CDs You can copy a data CD to an image file that is functionally equivalent to the image file created with &man.mkisofs.8;, and you can use it to duplicate any data CD. The example given here assumes that your CDROM device is acd0. Substitute your correct CDROM device.
Duplicating Data CDs You can copy a data CD to an image file that is functionally equivalent to an image file created with &man.mkisofs.8;, and you can use it to duplicate any data CD. The example given here assumes that your CDROM device is acd0; substitute your correct CDROM device. Under &os; 4.X, a c must be appended to the end of the device name to indicate the entire partition or, in the case of CDROMs, the entire disc.
&prompt.root; dd if=/dev/acd0 of=file.iso bs=2048
Now that you have an image, you can burn it to CD as described above. Using Data CDs Now that you have created a standard data CDROM, you probably want to mount it and read the data on it. By default, &man.mount.8; assumes that a file system is of type ufs. If you try something like:
&prompt.root; mount /dev/cd0 /mnt
you will get a complaint about Incorrect super block, and no mount. The CDROM is not a UFS file system, so attempts to mount it as such will fail. You just need to tell &man.mount.8; that the file system is of type ISO9660, and everything will work. You do this by specifying the -t cd9660 option to &man.mount.8;. For example, if you want to mount the CDROM device, /dev/cd0, under /mnt, you would execute:
&prompt.root; mount -t cd9660 /dev/cd0 /mnt
Note that your device name (/dev/cd0 in this example) could be different, depending on the interface your CDROM uses. Also, specifying -t cd9660 just executes &man.mount.cd9660.8;, so the above example could be shortened to:
&prompt.root; mount_cd9660 /dev/cd0 /mnt
You can generally use data CDROMs from any vendor in this way. Disks with certain ISO 9660 extensions might behave oddly, however. For example, Joliet disks store all filenames in two-byte Unicode characters. The FreeBSD kernel does not speak Unicode (yet!), so non-English characters show up as question marks. (If you are running FreeBSD 4.3 or later, the CD9660 driver includes hooks to load an appropriate Unicode conversion table on the fly. Modules for some of the common encodings are available via the sysutils/cd9660_unicode port.) Occasionally, you might get Device not configured when trying to mount a CDROM. This usually means that the CDROM drive thinks that there is no disk in the tray, or that the drive is not visible on the bus. It can take a couple of seconds for a CDROM drive to realize that it has been fed, so be patient. Sometimes, a SCSI CDROM may be missed because it did not have enough time to answer the bus reset. If you have a SCSI CDROM, add the following option to your kernel configuration and rebuild your kernel:
options SCSI_DELAY=15000
This tells your SCSI bus to pause 15 seconds during boot, to give your CDROM drive every possible chance to answer the bus reset. Burning Raw Data CDs You can choose to burn a file directly to CD, without creating an ISO 9660 file system. Some people do this for backup purposes. This runs more quickly than burning a standard CD:
&prompt.root; burncd -f /dev/acd1 -s 12 data archive.tar.gz fixate
In order to retrieve the data burned to such a CD, you must read the data from the raw device node:
&prompt.root; tar xzvf /dev/acd1
You cannot mount this disk as you would a normal CDROM. Such a CDROM cannot be read under any operating system except FreeBSD. If you want to be able to mount the CD, or share data with another operating system, you must use &man.mkisofs.8; as described above. Marc Fonvieille Contributed by CD burner ATAPI/CAM driver Using the ATAPI/CAM Driver This driver allows ATAPI devices (CD-ROM, CD-RW, DVD drives etc.) to be accessed through the SCSI subsystem, and so allows the use of applications like sysutils/cdrdao or &man.cdrecord.1;.
To use this driver, you will need to add the following line to your kernel configuration file:
device atapicam
You also need the following lines in your kernel configuration file:
device ata
device scbus
device cd
device pass
which should already be present. Then rebuild, install your new kernel, and reboot your machine. During the boot process, your burner should show up, like so:
acd0: CD-RW <MATSHITA CD-RW/DVD-ROM UJDA740> at ata1-master PIO4
cd0 at ata1 bus 0 target 0 lun 0
cd0: <MATSHITA CDRW/DVD UJDA740 1.00> Removable CD-ROM SCSI-0 device
cd0: 16.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
The drive can now be accessed via the /dev/cd0 device name; for example, to mount a CD-ROM on /mnt, just type the following:
&prompt.root; mount -t cd9660 /dev/cd0 /mnt
As root, you can run the following command to get the SCSI address of the burner:
&prompt.root; camcontrol devlist
<MATSHITA CDRW/DVD UJDA740 1.00> at scbus1 target 0 lun 0 (pass0,cd0)
So 1,0,0 will be the SCSI address to use with &man.cdrecord.1; and other SCSI applications. For more information about ATAPI/CAM and the SCSI system, refer to the &man.atapicam.4; and &man.cam.4; manual pages.
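With the address reported by camcontrol, burning an image through the SCSI subsystem is a one-liner; a minimal sketch, assuming the burner answered at 1,0,0 and the image is named imagefile.iso:
&prompt.root; cdrecord -v dev=1,0,0 imagefile.iso
The -v flag simply makes &man.cdrecord.1; report progress while it writes.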
Marc Fonvieille Contributed by Andy Polyakov With inputs from Creating and Using Optical Media (DVDs) DVD burning Introduction Compared to the CD, the DVD is the next generation of optical media storage technology. The DVD can hold more data than any CD and is nowadays the standard for video publishing. Five physical recordable formats can be defined for what we will call a recordable DVD: DVD-R: This was the first DVD recordable format available. The DVD-R standard is defined by the DVD Forum. This format is write once. DVD-RW: This is the rewriteable version of the DVD-R standard. A DVD-RW can be rewritten about 1000 times. DVD-RAM: This is also a rewriteable format supported by the DVD Forum. A DVD-RAM can be seen as a removable hard drive. However, this media is not compatible with most DVD-ROM drives and DVD-Video players; only a few DVD writers support the DVD-RAM format. DVD+RW: This is a rewriteable format defined by the DVD+RW Alliance. A DVD+RW can be rewritten about 1000 times. DVD+R: This format is the write once variation of the DVD+RW format. A single layer recordable DVD can hold up to 4,700,000,000 bytes, which is actually 4.38 GB or 4485 MB (1 kilobyte is 1024 bytes). A distinction must be made between the physical media and the application. For example, a DVD-Video is a specific file layout that can be written on any recordable DVD physical media: DVD-R, DVD+R, DVD-RW etc. Before choosing the type of media, you must be sure that both the burner and the DVD-Video player (a standalone player or a DVD-ROM drive on a computer) are compatible with the media under consideration. Configuration The program &man.growisofs.1; will be used to perform DVD recording. This command is part of the dvd+rw-tools utilities (sysutils/dvd+rw-tools). The dvd+rw-tools support all DVD media types. These tools use the SCSI subsystem to access the devices, therefore the ATAPI/CAM support must be added to your kernel. If your burner uses the USB interface this addition is useless, and you should read the section on USB devices for more details on their configuration. You also have to enable DMA access for ATAPI devices; this can be done by adding the following line to the /boot/loader.conf file:
hw.ata.atapi_dma="1"
Before attempting to use the dvd+rw-tools you should consult the dvd+rw-tools' hardware compatibility notes for any information related to your DVD burner. If you want a graphical user interface, you should have a look at K3b (sysutils/k3b), which provides a user-friendly interface to &man.growisofs.1; and many other burning tools. Burning Data DVDs The &man.growisofs.1; command is a frontend to mkisofs: it will invoke &man.mkisofs.8; to create the file system layout and will perform the write on the DVD. This means you do not need to create an image of the data before the burning process. To burn onto a DVD+R or a DVD-R the data from the /path/to/data directory, use the following command:
&prompt.root; growisofs -dvd-compat -Z /dev/cd0 -J -R /path/to/data
The -J and -R options are passed to &man.mkisofs.8; for the file system creation (in this case: an ISO 9660 file system with Joliet and Rock Ridge extensions); consult the &man.mkisofs.8; manual page for more details. The -Z option is used for the initial session recording in any case: multiple sessions or not. The DVD device, /dev/cd0, must be changed according to your configuration. The -dvd-compat option will close the disk, so the recording will be unappendable. In return this should provide better media compatibility with DVD-ROM drives. It is also possible to burn a pre-mastered image; for example, to burn the image imagefile.iso, we will run:
&prompt.root; growisofs -dvd-compat -Z /dev/cd0=imagefile.iso
The write speed should be detected and automatically set according to the media and the drive being used. If you want to force the write speed, use the -speed= parameter. For more information, read the &man.growisofs.1; manual page. DVD DVD-Video Burning a DVD-Video A DVD-Video is a specific file layout based on the ISO 9660 and micro-UDF (M-UDF) specifications. A DVD-Video also presents a specific data structure hierarchy; this is why you need a particular program such as multimedia/dvdauthor to author the DVD. If you already have an image of the DVD-Video file system, just burn it in the same way as for any other image; see the previous section for an example. If you have done the DVD authoring and the result is in, for example, the directory /path/to/video, the following command should be used to burn the DVD-Video:
&prompt.root; growisofs -Z /dev/cd0 -dvd-video /path/to/video
The -dvd-video option will be passed down to &man.mkisofs.8; and will instruct it to create a DVD-Video file system layout. Besides this, the -dvd-video option implies the &man.growisofs.1; -dvd-compat option. DVD DVD+RW Using a DVD+RW Unlike CD-RW, a virgin DVD+RW needs to be formatted before first use. The &man.growisofs.1; program will take care of it automatically whenever appropriate, which is the recommended way. However, you can use the dvd+rw-format command to format the DVD+RW:
&prompt.root; dvd+rw-format /dev/cd0
You need to perform this operation just once; keep in mind that only virgin DVD+RW media need to be formatted. Then you can burn the DVD+RW in the ways seen in previous sections. If you want to burn new data (a totally new file system, not appended data) onto a DVD+RW, you do not need to blank it; you just have to write over the previous recording (by performing a new initial session), like this:
&prompt.root; growisofs -Z /dev/cd0 -J -R /path/to/newdata
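A quick way to confirm what actually landed on the disc is dvd+rw-mediainfo, which ships with the same tools; assuming the drive is /dev/cd0:
&prompt.root; dvd+rw-mediainfo /dev/cd0
Its output reports the media type, format status and session layout of whatever is in the drive.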
The DVD+RW format offers the possibility to easily append data to a previous recording. The operation consists of merging a new session into the existing one; it is not multisession writing, as &man.growisofs.1; will grow the ISO 9660 file system present on the media. For example, if we want to append data to our previous DVD+RW, we have to use the following:
&prompt.root; growisofs -M /dev/cd0 -J -R /path/to/nextdata
The same &man.mkisofs.8; options we used to burn the initial session should be used during next writes. You may want to add the -dvd-compat option if you want better media compatibility with DVD-ROM drives. In the DVD+RW case, this will not prevent you from adding data. If for any reason you really want to blank the media, do the following:
&prompt.root; growisofs -Z /dev/cd0=/dev/zero
DVD DVD-RW Using a DVD-RW A DVD-RW accepts two disc formats: incremental sequential and restricted overwrite. By default DVD-RW discs are in sequential format. A virgin DVD-RW can be directly written without the need of a formatting operation; however, a non-virgin DVD-RW in sequential format needs to be blanked before a new initial session can be written. To blank a DVD-RW in sequential mode, run:
&prompt.root; dvd+rw-format -blank=full /dev/cd0
A full blanking (-blank=full) will take about one hour on 1x media. A fast blanking can be performed using the -blank option, if the DVD-RW will be recorded in Disk-At-Once (DAO) mode. To burn the DVD-RW in DAO mode, use the command:
&prompt.root; growisofs -use-the-force-luke=dao -Z /dev/cd0=imagefile.iso
The -use-the-force-luke=dao option should not be required, since &man.growisofs.1; attempts to detect minimally (fast blanked) media and engage DAO write. In fact one should use restricted overwrite mode with any DVD-RW; this format is more flexible than the default incremental sequential one. To write data on a sequential DVD-RW, use the same instructions as for the other DVD formats:
&prompt.root; growisofs -Z /dev/cd0 -J -R /path/to/data
If you want to append some data to your previous recording, you will have to use the &man.growisofs.1; -M option. However, if you perform data addition on a DVD-RW in incremental sequential mode, a new session will be created on the disc and the result will be a multi-session disc. A DVD-RW in restricted overwrite format does not need to be blanked before a new initial session; you just have to overwrite the disc with the -Z option, similarly to the DVD+RW case. It is also possible to grow an existing ISO 9660 file system written on the disc in the same way as for a DVD+RW, with the -M option. The result will be a one-session DVD. To put a DVD-RW in the restricted overwrite format, the following command must be used:
&prompt.root; dvd+rw-format /dev/cd0
To change back to the sequential format, use:
&prompt.root; dvd+rw-format -blank=full /dev/cd0
Multisession Very few DVD-ROM drives support multisession DVDs; most of the time they will, hopefully, only read the first session. DVD+R, DVD-R and DVD-RW in sequential format can accept multiple sessions; the notion of multiple sessions does not exist for the DVD+RW and DVD-RW restricted overwrite formats. Using the following command after an initial (non-closed) session on a DVD+R, DVD-R, or DVD-RW in sequential format will add a new session to the disc:
&prompt.root; growisofs -M /dev/cd0 -J -R /path/to/nextdata
Using this command line with a DVD+RW or a DVD-RW in restricted overwrite mode will append data by merging the new session into the existing one. The result will be a single-session disc. This is the way used to add data after an initial write on these media.
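Putting the pieces together, a complete sequence for a sequential-format disc might look like this sketch (device and paths hypothetical):
&prompt.root; growisofs -Z /dev/cd0 -J -R /path/to/data
&prompt.root; growisofs -M /dev/cd0 -J -R /path/to/nextdata
The first command writes the initial session without closing the disc; each subsequent -M run adds another session until the disc is closed or full.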
Some space on the media is used between each session to mark the end and start of sessions. Therefore, one should add sessions with large amounts of data to optimize media space. The number of sessions is limited to 154 for a DVD+R, about 2000 for a DVD-R, and 127 for a DVD+R Double Layer. For More Information To obtain more information about a DVD, the dvd+rw-mediainfo /dev/cd0 command can be run with the disc in the drive. More information about the dvd+rw-tools can be found in the &man.growisofs.1; manual page, on the dvd+rw-tools web site and in the cdwrite mailing list archives. The dvd+rw-mediainfo output of the resulting recording, or of the media with issues, is mandatory for any problem report. Without this output, it will be quite impossible to help you. Julio Merino Original work by Martin Karlsson Rewritten by Creating and Using Floppy Disks Storing data on floppy disks is sometimes useful, for example when one does not have any other removable storage media or when one needs to transfer small amounts of data to another computer. This section will explain how to use floppy disks in FreeBSD. It will primarily cover formatting and usage of 3.5 inch DOS floppies, but the concepts are similar for other floppy disk formats. Formatting Floppies The Device Floppy disks are accessed through entries in /dev, just like other devices. To access the raw floppy disk in 4.X and earlier releases, one uses /dev/fdN, where N stands for the drive number, usually 0, or /dev/fdNX, where X stands for a letter. In 5.0 or newer releases, simply use /dev/fdN. The Disk Size in 4.X and Earlier Releases There are also /dev/fdN.size devices, where size is a floppy disk size in kilobytes. These entries are used at low-level format time to determine the disk size. 1440kB is the size that will be used in the following examples. Sometimes the entries under /dev will have to be (re)created. To do that, issue:
&prompt.root; cd /dev && ./MAKEDEV "fd*"
The Disk Size in 5.0 and Newer Releases In 5.0, &man.devfs.5; will automatically manage device nodes in /dev, so use of MAKEDEV is not necessary. The desired disk size is passed to &man.fdformat.1; through the -f flag. Supported sizes are listed in &man.fdcontrol.8;, but be advised that 1440kB is what works best. Formatting A floppy disk needs to be low-level formatted before it can be used. This is usually done by the vendor, but formatting is a good way to check media integrity. Although it is possible to force larger (or smaller) disk sizes, 1440kB is what most floppy disks are designed for. To low-level format the floppy disk you need to use &man.fdformat.1;. This utility expects the device name as an argument. Make note of any error messages, as these can help determine if the disk is good or bad. Formatting in 4.X and Earlier Releases Use the /dev/fdN.size devices to format the floppy. Insert a new 3.5 inch floppy disk in your drive and issue:
&prompt.root; /usr/sbin/fdformat /dev/fd0.1440
Formatting in 5.0 and Newer Releases Use the /dev/fdN devices to format the floppy. Insert a new 3.5 inch floppy disk in your drive and issue:
&prompt.root; /usr/sbin/fdformat -f 1440 /dev/fd0
The Disk Label After low-level formatting the disk, you will need to place a disk label on it. This disk label will be destroyed later, but it is needed by the system to determine the size of the disk and its geometry. The new disk label will take over the whole disk, and will contain all the proper information about the geometry of the floppy.
The geometry values for the disk label are listed in /etc/disktab. You can now run &man.disklabel.8; like so:
&prompt.root; /sbin/disklabel -B -r -w /dev/fd0 fd1440
Since &os; 5.1-RELEASE, the &man.bsdlabel.8; utility replaces the old &man.disklabel.8; program. With &man.bsdlabel.8; a number of obsolete options and parameters have been retired; in the example above the -r option should be removed. For more information, please refer to the &man.bsdlabel.8; manual page. The File System Now the floppy is ready to be high-level formatted. This will place a new file system on it, which will let FreeBSD read and write to the disk. After creating the new file system, the disk label is destroyed, so if you want to reformat the disk, you will have to recreate the disk label. The floppy's file system can be either UFS or FAT. FAT is generally a better choice for floppies. To put a new file system on the floppy, issue:
&prompt.root; /sbin/newfs_msdos /dev/fd0
The disk is now ready for use. Using the Floppy To use the floppy, mount it with &man.mount.msdos.8; (in 4.X and earlier releases) or &man.mount.msdosfs.8; (in 5.0 or newer releases). One can also use emulators/mtools from the ports collection. Creating and Using Data Tapes tape media The major tape media are the 4mm, 8mm, QIC, mini-cartridge and DLT. 4mm (DDS: Digital Data Storage) tape media DDS (4mm) tapes tape media QIC tapes 4mm tapes are replacing QIC as the workstation backup media of choice. This trend accelerated greatly when Conner purchased Archive, a leading manufacturer of QIC drives, and then stopped production of QIC drives. 4mm drives are small and quiet but do not have the reputation for reliability that is enjoyed by 8mm drives. The cartridges are less expensive and smaller (3 x 2 x 0.5 inches, 76 x 51 x 12 mm) than 8mm cartridges. 4mm, like 8mm, has comparatively short head life, for the same reason: both use helical scan. Data throughput on these drives starts at ~150 kB/s, peaking at ~500 kB/s. Data capacity starts at 1.3 GB and ends at 2.0 GB. Hardware compression, available with most of these drives, approximately doubles the capacity. Multi-drive tape library units can have 6 drives in a single cabinet with automatic tape changing. Library capacities reach 240 GB. The DDS-3 standard now supports tape capacities up to 12 GB (or 24 GB compressed). 4mm drives, like 8mm drives, use helical-scan. All the benefits and drawbacks of helical-scan apply to both 4mm and 8mm drives. Tapes should be retired from use after 2,000 passes or 100 full backups. 8mm (Exabyte) tape media Exabyte (8mm) tapes 8mm tapes are the most common SCSI tape drives; they are the best choice for exchanging tapes. Nearly every site has an Exabyte 2 GB 8mm tape drive. 8mm drives are reliable, convenient and quiet. Cartridges are inexpensive and small (4.8 x 3.3 x 0.6 inches; 122 x 84 x 15 mm). One downside of 8mm tape is relatively short head and tape life due to the high rate of relative motion of the tape across the heads. Data throughput ranges from ~250 kB/s to ~500 kB/s. Data sizes start at 300 MB and go up to 7 GB. Hardware compression, available with most of these drives, approximately doubles the capacity. These drives are available as single units or multi-drive tape libraries with 6 drives and 120 tapes in a single cabinet. Tapes are changed automatically by the unit. Library capacities reach 840+ GB. The Exabyte Mammoth model supports 12 GB on one tape (24 GB with compression) and costs approximately twice as much as conventional tape drives.
Data is recorded onto the tape using helical scan: the heads are positioned at an angle to the media (approximately 6 degrees). The tape wraps around 270 degrees of the spool that holds the heads. The spool spins while the tape slides over it. The result is a high density of data and closely packed tracks that angle across the tape from one edge to the other. QIC tape media QIC-150 QIC-150 tapes and drives are, perhaps, the most common tape drive and media around. QIC tape drives are the least expensive serious backup drives. The downside is the cost of media. QIC tapes are expensive compared to 8mm or 4mm tapes, up to 5 times the price per GB of data storage. But, if your needs can be satisfied with a half-dozen tapes, QIC may be the correct choice. QIC is the most common tape drive. Every site has a QIC drive of some density or another. Therein lies the rub: QIC has a large number of densities on physically similar (sometimes identical) tapes. QIC drives are not quiet. These drives audibly seek before they begin to record data and are clearly audible whenever reading, writing or seeking. QIC tapes measure 6 x 4 x 0.7 inches (152 x 102 x 17 mm). Data throughput ranges from ~150 kB/s to ~500 kB/s. Data capacity ranges from 40 MB to 15 GB. Hardware compression is available on many of the newer QIC drives. QIC drives are less frequently installed; they are being supplanted by DAT drives. Data is recorded onto the tape in tracks. The tracks run along the long axis of the tape media from one end to the other. The number of tracks, and therefore the width of a track, varies with the tape's capacity. Most if not all newer drives provide backward-compatibility at least for reading (but often also for writing). QIC has a good reputation regarding the safety of the data (the mechanics are simpler and more robust than for helical scan drives). Tapes should be retired from use after 5,000 backups. DLT tape media DLT DLT has the fastest data transfer rate of all the drive types listed here. The 1/2" (12.5mm) tape is contained in a single spool cartridge (4 x 4 x 1 inches; 100 x 100 x 25 mm). The cartridge has a swinging gate along one entire side. The drive mechanism opens this gate to extract the tape leader. The tape leader has an oval hole in it which the drive uses to hook the tape. The take-up spool is located inside the tape drive. All the other tape cartridges listed here (9 track tapes are the only exception) have both the supply and take-up spools located inside the tape cartridge itself. Data throughput is approximately 1.5 MB/s, three times the throughput of 4mm, 8mm, or QIC tape drives. Data capacities range from 10 GB to 20 GB for a single drive. Drives are available in both multi-tape changers and multi-tape, multi-drive tape libraries containing from 5 to 900 tapes over 1 to 20 drives, providing from 50 GB to 9 TB of storage. With compression, DLT Type IV format supports up to 70 GB capacity. Data is recorded onto the tape in tracks parallel to the direction of travel (just like QIC tapes). Two tracks are written at once. Read/write head lifetimes are relatively long; once the tape stops moving, there is no relative motion between the heads and the tape. AIT tape media AIT AIT is a new format from Sony, and can hold up to 50 GB (with compression) per tape. The tapes contain memory chips which retain an index of the tape's contents.
This index can be rapidly read by the tape drive to determine the position of files on the tape, instead of the several minutes that would be required for other tapes. Software such as SAMS:Alexandria can operate forty or more AIT tape libraries, communicating directly with the tape's memory chip to display the contents on screen, determine what files were backed up to which tape, locate the correct tape, load it, and restore the data from the tape. Libraries like this cost in the region of $20,000, pricing them a little out of the hobbyist market. Using a New Tape for the First Time The first time that you try to read or write a new, completely blank tape, the operation will fail. The console messages should be similar to:
sa0(ncr1:4:0): NOT READY asc:4,1
sa0(ncr1:4:0): Logical unit is in process of becoming ready
The tape does not contain an Identifier Block (block number 0). All QIC tape drives since the adoption of the QIC-525 standard write an Identifier Block to the tape. There are two solutions: either mt fsf 1 causes the tape drive to write an Identifier Block to the tape, or you can use the front panel button to eject the tape, re-insert it, and dump data to the tape. dump will report DUMP: End of tape detected and the console will show: HARDWARE FAILURE info:280 asc:80,96. Rewind the tape using mt rewind; subsequent tape operations will then be successful. Backups to Floppies Can I Use Floppies for Backing Up My Data? backup floppies floppy disks Floppy disks are not really a suitable media for making backups as: The media is unreliable, especially over long periods of time. Backing up and restoring is very slow. They have a very limited capacity (the days of backing up an entire hard disk onto a dozen or so floppies have long since passed). However, if you have no other method of backing up your data then floppy disks are better than no backup at all. If you do have to use floppy disks then ensure that you use good quality ones. Floppies that have been lying around the office for a couple of years are a bad choice. Ideally use new ones from a reputable manufacturer. So How Do I Backup My Data to Floppies? The best way to back up to floppy disk is to use &man.tar.1; with the -M (multi volume) option, which allows backups to span multiple floppies. To back up all the files in the current directory and sub-directory use this (as root):
&prompt.root; tar Mcvf /dev/fd0 *
When the first floppy is full &man.tar.1; will prompt you to insert the next volume (because &man.tar.1; is media independent it refers to volumes; in this context it means floppy disk):
Prepare volume #2 for /dev/fd0 and hit return:
This is repeated (with the volume number incrementing) until all the specified files have been archived. Can I Compress My Backups? tar gzip compression Unfortunately, &man.tar.1; will not allow the -z option to be used for multi-volume archives. You could, of course, &man.gzip.1; all the files, &man.tar.1; them to the floppies, then &man.gunzip.1; the files again! How Do I Restore My Backups? To restore the entire archive use:
&prompt.root; tar Mxvf /dev/fd0
There are two ways that you can use to restore only specific files. First, you can start with the first floppy and use:
&prompt.root; tar Mxvf /dev/fd0 filename
The utility &man.tar.1; will prompt you to insert subsequent floppies until it finds the required file. Alternatively, if you know which floppy the file is on then you can simply insert that floppy and use the same command as above.
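For example, to pull a single file out of a multi-volume set (the member path here is purely hypothetical; give the name exactly as it was stored in the archive):
&prompt.root; tar Mxvf /dev/fd0 home/alice/report.txt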
Note that if the first file on the floppy is a continuation from the previous one then &man.tar.1; will warn you that it cannot restore it, even if you have not asked it to! Lowell Gilbert Original work by Backup Strategies The first requirement in devising a backup plan is to make sure that all of the following problems are covered: Disk failure Accidental file deletion Random file corruption Complete machine destruction (e.g. fire), including destruction of any on-site backups. It is perfectly possible that some systems will be best served by having each of these problems covered by a completely different technique. Except for strictly personal systems with very low-value data, it is unlikely that one technique would cover all of them. Some of the techniques in the toolbox are: Archives of the whole system, backed up onto permanent media offsite. This actually provides protection against all of the possible problems listed above, but is slow and inconvenient to restore from. You can keep copies of the backups onsite and/or online, but there will still be inconveniences in restoring files, especially for non-privileged users. Filesystem snapshots. This is really only helpful in the accidental file deletion scenario, but it can be very helpful in that case, and is quick and easy to deal with. Copies of whole filesystems and/or disks (e.g. periodic rsync of the whole machine). This is generally most useful in networks with unique requirements. For general protection against disk failure, it is usually inferior to RAID. For restoring accidentally deleted files, it can be comparable to UFS snapshots, but that depends on your preferences. RAID. This minimizes or avoids downtime when a disk fails, at the expense of having to deal with disk failures more often (because you have more disks), albeit at a much lower urgency. Checking fingerprints of files. The &man.mtree.8; utility is very useful for this. Although it is not a backup technique, it helps guarantee that you will notice when you need to resort to your backups. This is particularly important for offline backups, and should be checked periodically. It is quite easy to come up with even more techniques, many of them variations on the ones listed above. Specialized requirements will usually lead to specialized techniques (for example, backing up a live database usually requires a method particular to the database software as an intermediate step). The important thing is to know what dangers you want to protect against, and how you will handle each. Backup Basics The three major backup programs are &man.dump.8;, &man.tar.1;, and &man.cpio.1;. Dump and Restore backup software dump / restore dump restore The traditional &unix; backup programs are dump and restore. They operate on the drive as a collection of disk blocks, below the abstractions of the files, links and directories that are created by the file systems. dump backs up an entire file system on a device. It is unable to back up only part of a file system, or a directory tree that spans more than one file system. dump does not write files and directories to tape, but rather writes the raw data blocks that comprise files and directories. If you use dump on your root directory, you would not back up /home, /usr or many other directories, since these are typically mount points for other file systems or symbolic links into those file systems. dump has quirks that remain from its early days in Version 6 of AT&T UNIX (circa 1975).
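Despite its age, basic usage is straightforward; a minimal local sketch, assuming the tape drive is /dev/sa0 and /usr is the file system being saved, with the matching restore run from inside an empty target directory:
&prompt.root; dump -0ua -f /dev/sa0 /usr
&prompt.root; restore -rf /dev/sa0
Here -0 requests a full (level 0) dump, -u records the dump in /etc/dumpdates, and -a lets dump write until it reaches the end of the media.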
The default parameters are suitable for 9-track tapes (6250 bpi), not the high-density media available today (up to 62,182 ftpi). These defaults must be overridden on the command line to utilize the capacity of current tape drives. .rhosts It is also possible to back up data across the network to a tape drive attached to another computer with rdump and rrestore. Both programs rely upon &man.rcmd.3; and &man.ruserok.3; to access the remote tape drive. Therefore, the user performing the backup must be listed in the .rhosts file on the remote computer. The arguments to rdump and rrestore must be suitable to use on the remote computer. When rdumping from a FreeBSD computer to an Exabyte tape drive connected to a Sun called komodo, use:
&prompt.root; /sbin/rdump 0dsbfu 54000 13000 126 komodo:/dev/nsa8 /dev/da0a 2>&1
Beware: there are security implications to allowing .rhosts authentication. Evaluate your situation carefully. It is also possible to use dump and restore in a more secure fashion over ssh. Using <command>dump</command> over <application>ssh</application>
&prompt.root; /sbin/dump -0uan -f - /usr | gzip -2 | ssh -c blowfish \
targetuser@targetmachine.example.com dd of=/mybigfiles/dump-usr-l0.gz
Or using dump's built-in method, setting the environment variable RSH: Using <command>dump</command> over <application>ssh</application> with <envar>RSH</envar> set
&prompt.root; RSH=/usr/bin/ssh /sbin/dump -0uan -f targetuser@targetmachine.example.com:/dev/sa0 /usr
<command>tar</command> backup software tar &man.tar.1; also dates back to Version 6 of AT&T UNIX (circa 1975). tar operates in cooperation with the file system; it writes files and directories to tape. tar does not support the full range of options that are available from &man.cpio.1;, but it does not require the unusual command pipeline that cpio uses. tar On FreeBSD 5.3 and later, both GNU tar and the default bsdtar are available. The GNU version can be invoked with gtar. It supports remote devices using the same syntax as rdump. To tar to an Exabyte tape drive connected to a Sun called komodo, use:
&prompt.root; /usr/bin/gtar cf komodo:/dev/nsa8 . 2>&1
The same could be accomplished with bsdtar by using a pipeline and rsh to send the data to a remote tape drive:
&prompt.root; tar cf - . | rsh hostname dd of=tape-device obs=20b
If you are worried about the security of backing up over a network you should use the ssh command instead of rsh. <command>cpio</command> backup software cpio &man.cpio.1; is the original &unix; file interchange tape program for magnetic media. cpio has options (among many others) to perform byte-swapping, write a number of different archive formats, and pipe the data to other programs. This last feature makes cpio an excellent choice for installation media. cpio does not know how to walk the directory tree, so a list of files must be provided through stdin. cpio cpio does not support backups across the network. You can use a pipeline and ssh to send the data to a remote tape drive:
&prompt.root; for f in directory_list; do
find $f >> backup.list
done
&prompt.root; cpio -v -o --format=newc < backup.list | ssh user@host "cat > backup_device"
Where directory_list is the list of directories you want to back up, user@host is the user/hostname combination that will be performing the backups, and backup_device is where the backups should be written to (e.g., /dev/nsa0).
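The list file can be skipped entirely by piping find straight into cpio; a sketch, assuming a local tape at /dev/sa0 and a single tree to save:
&prompt.root; find /home -print | cpio -v -o --format=newc > /dev/sa0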
<command>pax</command> backup software pax pax POSIX IEEE &man.pax.1; is IEEE/&posix;'s answer to tar and cpio. Over the years the various versions of tar and cpio have become slightly incompatible. So rather than fight it out to fully standardize them, &posix; created a new archive utility. pax attempts to read and write many of the various cpio and tar formats, plus new formats of its own. Its command set more resembles cpio than tar. <application>Amanda</application> backup software Amanda Amanda Amanda (Advanced Maryland Network Disk Archiver) is a client/server backup system, rather than a single program. An Amanda server will back up to a single tape drive any number of computers that have Amanda clients and a network connection to the Amanda server. A common problem at sites with a number of large disks is that the length of time required to back up data directly to tape exceeds the amount of time available for the task. Amanda solves this problem. Amanda can use a holding disk to back up several file systems at the same time. Amanda creates archive sets: a group of tapes used over a period of time to create full backups of all the file systems listed in Amanda's configuration file. The archive set also contains nightly incremental (or differential) backups of all the file systems. Restoring a damaged file system requires the most recent full backup and the incremental backups. The configuration file provides fine control of backups and the network traffic that Amanda generates. Amanda will use any of the above backup programs to write the data to tape. Amanda is available as either a port or a package; it is not installed by default. Do Nothing Do nothing is not a computer program, but it is the most widely used backup strategy. There are no initial costs. There is no backup schedule to follow. Just say no. If something happens to your data, grin and bear it! If your time and your data are worth little to nothing, then Do nothing is the most suitable backup program for your computer. But beware: &unix; is a useful tool, and you may find that within six months you have a collection of files that are valuable to you. Do nothing is the correct backup method for /usr/obj and other directory trees that can be exactly recreated by your computer. An example is the files that comprise the HTML or &postscript; version of this Handbook. These document formats have been created from SGML input files. Creating backups of the HTML or &postscript; files is not necessary. The SGML files are backed up regularly. Which Backup Program Is Best? LISA &man.dump.8; Period. Elizabeth D. Zwicky torture tested all the backup programs discussed here. The clear choice for preserving all your data and all the peculiarities of &unix; file systems is dump. Elizabeth created file systems containing a large variety of unusual conditions (and some not so unusual ones) and tested each program by doing a backup and restore of those file systems. The peculiarities included: files with holes, files with holes and a block of nulls, files with funny characters in their names, unreadable and unwritable files, devices, files that change size during the backup, files that are created/deleted during the backup, and more. She presented the results at LISA V in Oct. 1991. See torture-testing Backup and Archive Programs. Emergency Restore Procedure Before the Disaster There are only four steps that you need to perform in preparation for any disaster that may occur. disklabel First, print the disklabel from each of your disks (e.g.
disklabel da0 | lpr), your file system table (/etc/fstab) and all boot messages, two copies of each. fix-it floppies Second, determine that the boot and fix-it floppies (boot.flp and fixit.flp) have all your devices. The easiest way to check is to reboot your machine with the boot floppy in the floppy drive and check the boot messages. If all your devices are listed and functional, skip on to step three. Otherwise, you have to create two custom bootable floppies which have a kernel that can mount all of your disks and access your tape drive. These floppies must contain: fdisk, disklabel, newfs, mount, and whichever backup program you use. These programs must be statically linked. If you use dump, the floppy must contain restore. Third, create backup tapes regularly. Any changes that you make after your last backup may be irretrievably lost. Write-protect the backup tapes. Fourth, test the floppies (either boot.flp and fixit.flp, or the two custom bootable floppies you made in step two) and backup tapes. Make notes of the procedure. Store these notes with the bootable floppy, the printouts and the backup tapes. You will be so distraught when restoring that the notes may prevent you from destroying your backup tapes (How? In place of tar xvf /dev/sa0, you might accidentally type tar cvf /dev/sa0 and over-write your backup tape). For an added measure of security, make bootable floppies and two backup tapes each time. Store one of each at a remote location. A remote location is NOT the basement of the same office building. A number of firms in the World Trade Center learned this lesson the hard way. A remote location should be physically separated from your computers and disk drives by a significant distance. A Script for Creating a Bootable Floppy
[...]
gzip -c -best /sbin/init > /mnt/sbin/init
gzip -c -best /sbin/fsck > /mnt/sbin/fsck
gzip -c -best /sbin/mount > /mnt/sbin/mount
gzip -c -best /sbin/halt > /mnt/sbin/halt
gzip -c -best /sbin/restore > /mnt/sbin/restore
gzip -c -best /bin/sh > /mnt/bin/sh
gzip -c -best /bin/sync > /mnt/bin/sync
cp /root/.profile /mnt/root
cp -f /dev/MAKEDEV /mnt/dev
chmod 755 /mnt/dev/MAKEDEV
chmod 500 /mnt/sbin/init
chmod 555 /mnt/sbin/fsck /mnt/sbin/mount /mnt/sbin/halt
chmod 555 /mnt/bin/sh /mnt/bin/sync
chmod 6555 /mnt/sbin/restore
#
# create the device nodes
#
cd /mnt/dev
./MAKEDEV std
./MAKEDEV da0
./MAKEDEV da1
./MAKEDEV da2
./MAKEDEV sa0
./MAKEDEV pty0
cd /
#
# create minimum file system table
#
cat > /mnt/etc/fstab <<EOM
[...]
EOM
cat > /mnt/etc/passwd <<EOM
[...]
EOM
cat > /mnt/etc/master.passwd <<EOM
[...]
EOM
After the Disaster The key question is: did your hardware survive? You have been doing regular backups so there is no need to worry about the software. If the hardware has been damaged, the parts should be replaced before attempting to use the computer. If your hardware is okay, check your floppies. If you are using a custom boot floppy, boot single-user (type -s at the boot: prompt). Skip the following paragraph. If you are using the boot.flp and fixit.flp floppies, keep reading. Insert the boot.flp floppy in the first floppy drive and boot the computer. The original install menu will be displayed on the screen. Select the Fixit--Repair mode with CDROM or floppy option. Insert the fixit.flp when prompted.
restore and the other programs that you need are located in /mnt2/rescue (/mnt2/stand for &os; versions older than 5.2). Recover each file system separately. mount root partition disklabel newfs Try to mount (e.g. mount /dev/da0a /mnt) the root partition of your first disk. If the disklabel was damaged, use disklabel to re-partition and label the disk to match the label that you printed and saved. Use newfs to re-create the file systems. Re-mount the root partition read-write (mount -u -o rw /mnt). Use your backup program and backup tapes to recover the data for this file system (e.g. restore vrf /dev/sa0). Unmount the file system (e.g. umount /mnt). Repeat for each file system that was damaged. Once your system is running, back up your data onto new tapes. Whatever caused the crash or data loss may strike again. Another hour spent now may save you from further distress later. * I Did Not Prepare for the Disaster, What Now? ]]> Marc Fonvieille Reorganized and enhanced by Network, Memory, and File-Backed File Systems virtual disks disks virtual Aside from the disks you physically insert into your computer (floppies, CDs, hard drives, and so forth), other forms of disks are understood by FreeBSD: the virtual disks. NFS Coda disks memory These include network file systems such as the Network File System and Coda, memory-based file systems, and file-backed file systems. Depending on the FreeBSD version you run, you will have to use different tools for the creation and use of file-backed and memory-based file systems. FreeBSD 4.X users will have to use &man.MAKEDEV.8; to create the required devices. FreeBSD 5.0 and later use &man.devfs.5; to allocate device nodes transparently for the user. File-Backed File System under FreeBSD 4.X disks file-backed (4.X) The utility &man.vnconfig.8; configures and enables vnode pseudo-disk devices. A vnode is a representation of a file, and is the focus of file activity. This means that &man.vnconfig.8; uses files to create and operate a file system. One possible use is the mounting of floppy or CD images kept in files. To use &man.vnconfig.8;, you need &man.vn.4; support in your kernel configuration file:
pseudo-device vn
To mount an existing file system image: Using vnconfig to Mount an Existing File System Image under FreeBSD 4.X
&prompt.root; vnconfig vn0 diskimage
&prompt.root; mount /dev/vn0c /mnt
To create a new file system image with &man.vnconfig.8;: Creating a New File-Backed Disk with <command>vnconfig</command>
&prompt.root; dd if=/dev/zero of=newimage bs=1k count=5k
5120+0 records in
5120+0 records out
&prompt.root; vnconfig -s labels -c vn0 newimage
&prompt.root; disklabel -r -w vn0 auto
&prompt.root; newfs vn0c
Warning: 2048 sector(s) in last cylinder unallocated
/dev/vn0c: 10240 sectors in 3 cylinders of 1 tracks, 4096 sectors
5.0MB in 1 cyl groups (16 c/g, 32.00MB/g, 1280 i/g)
super-block backups (for fsck -b #) at: 32
&prompt.root; mount /dev/vn0c /mnt
&prompt.root; df /mnt
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/vn0c 4927 1 4532 0% /mnt
File-Backed File System under FreeBSD 5.X disks file-backed (5.X) The utility &man.mdconfig.8; is used to configure and enable memory disks, &man.md.4;, under FreeBSD 5.X. To use &man.mdconfig.8;, you have to load the &man.md.4; module or add the support to your kernel configuration file:
device md
The &man.mdconfig.8; command supports three kinds of memory backed virtual disks: memory disks allocated with &man.malloc.9;, and memory disks using either a file or swap space as backing.
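The backing type is selected with -t; a quick sketch of all three variants (the unit numbers and sizes here are arbitrary):
&prompt.root; mdconfig -a -t malloc -s 10m -u 0
&prompt.root; mdconfig -a -t vnode -f diskimage -u 1
&prompt.root; mdconfig -a -t swap -s 10m -u 2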
One possible use is the mounting of floppy or CD images kept in files. To mount an existing file system image: Using <command>mdconfig</command> to Mount an Existing File System Image under FreeBSD 5.X
&prompt.root; mdconfig -a -t vnode -f diskimage -u 0
&prompt.root; mount /dev/md0 /mnt
To create a new file system image with &man.mdconfig.8;: Creating a New File-Backed Disk with <command>mdconfig</command>
&prompt.root; dd if=/dev/zero of=newimage bs=1k count=5k
5120+0 records in
5120+0 records out
&prompt.root; mdconfig -a -t vnode -f newimage -u 0
&prompt.root; disklabel -r -w md0 auto
&prompt.root; newfs md0c
/dev/md0c: 5.0MB (10240 sectors) block size 16384, fragment size 2048
using 4 cylinder groups of 1.27MB, 81 blks, 256 inodes.
super-block backups (for fsck -b #) at: 32, 2624, 5216, 7808
&prompt.root; mount /dev/md0c /mnt
&prompt.root; df /mnt
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/md0c 4846 2 4458 0% /mnt
If you do not specify the unit number with the -u option, &man.mdconfig.8; will use the &man.md.4; automatic allocation to select an unused device. The name of the allocated unit will be output on stdout, for example md4. For more details about &man.mdconfig.8;, please refer to the manual page. Since &os; 5.1-RELEASE, the &man.bsdlabel.8; utility replaces the old &man.disklabel.8; program. With &man.bsdlabel.8; a number of obsolete options and parameters have been retired; in the example above the -r option should be removed. For more information, please refer to the &man.bsdlabel.8; manual page. The utility &man.mdconfig.8; is very useful; however, it takes several commands to create a file-backed file system. FreeBSD 5.0 also comes with a tool called &man.mdmfs.8;: this program configures a &man.md.4; disk using &man.mdconfig.8;, puts a UFS file system on it using &man.newfs.8;, and mounts it using &man.mount.8;. For example, if you want to create and mount the same file system image as above, simply type the following: Configure and Mount a File-Backed Disk with <command>mdmfs</command>
&prompt.root; dd if=/dev/zero of=newimage bs=1k count=5k
5120+0 records in
5120+0 records out
&prompt.root; mdmfs -F newimage -s 5m md0 /mnt
&prompt.root; df /mnt
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/md0 4846 2 4458 0% /mnt
If you use md without a unit number, &man.mdmfs.8; will use the &man.md.4; auto-unit feature to automatically select an unused device. For more details about &man.mdmfs.8;, please refer to the manual page. Memory-Based File System under FreeBSD 4.X disks memory file system (4.X) The &man.md.4; driver is a simple, efficient means to create memory file systems under FreeBSD 4.X. &man.malloc.9; is used to allocate the memory. Simply take a file system you have prepared with, for example, &man.vnconfig.8;, and: md Memory Disk under FreeBSD 4.X
&prompt.root; dd if=newimage of=/dev/md0
5120+0 records in
5120+0 records out
&prompt.root; mount /dev/md0c /mnt
&prompt.root; df /mnt
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/md0c 4927 1 4532 0% /mnt
For more details, please refer to the &man.md.4; manual page. Memory-Based File System under FreeBSD 5.X disks memory file system (5.X) The same tools are used for memory-based and file-backed file systems: &man.mdconfig.8; or &man.mdmfs.8;. The storage for a memory-based file system is allocated with &man.malloc.9;.
Creating a New Memory-Based Disk with <command>mdconfig</command>
&prompt.root; mdconfig -a -t malloc -s 5m -u 1
&prompt.root; newfs -U md1
/dev/md1: 5.0MB (10240 sectors) block size 16384, fragment size 2048
using 4 cylinder groups of 1.27MB, 81 blks, 256 inodes.
with soft updates
super-block backups (for fsck -b #) at: 32, 2624, 5216, 7808
&prompt.root; mount /dev/md1 /mnt
&prompt.root; df /mnt
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/md1 4846 2 4458 0% /mnt
Creating a New Memory-Based Disk with <command>mdmfs</command>
&prompt.root; mdmfs -M -s 5m md2 /mnt
&prompt.root; df /mnt
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/md2 4846 2 4458 0% /mnt
Instead of using a &man.malloc.9; backed file system, it is possible to use swap; for that, just replace -t malloc with -t swap in the &man.mdconfig.8; command line. The &man.mdmfs.8; utility by default (without -M) creates a swap-based disk. For more details, please refer to the &man.mdconfig.8; and &man.mdmfs.8; manual pages. Detaching a Memory Disk from the System disks detaching a memory disk When a memory-based or file-based file system is not in use, you should release all its resources to the system. The first thing to do is to unmount the file system, then use &man.mdconfig.8; to detach the disk from the system and release the resources. For example, to detach and free all resources used by /dev/md4:
&prompt.root; mdconfig -d -u 4
It is possible to list information about configured &man.md.4; devices by using the command mdconfig -l. For FreeBSD 4.X, &man.vnconfig.8; is used to detach the device. For example, to detach and free all resources used by /dev/vn4:
&prompt.root; vnconfig -u vn4
Tom Rhodes Contributed by File System Snapshots file systems snapshots FreeBSD 5.0 offers a new feature in conjunction with Soft Updates: file system snapshots. Snapshots allow a user to create images of specified file systems, and treat them as a file. Snapshot files must be created in the file system that the action is performed on, and a user may create no more than 20 snapshots per file system. Active snapshots are recorded in the superblock, so they are persistent across unmount and remount operations along with system reboots. When a snapshot is no longer required, it can be removed with the standard &man.rm.1; command. Snapshots may be removed in any order; however, not all of the used space may be reclaimed, because another snapshot may claim some of the released blocks. The un-alterable file flag is set by &man.mksnap.ffs.8; after initial creation of a snapshot file. The &man.unlink.1; command makes an exception for snapshot files, since it allows them to be removed. Snapshots are created with the &man.mount.8; command. To place a snapshot of /var in the file /var/snapshot/snap use the following command:
&prompt.root; mount -u -o snapshot /var/snapshot/snap /var
Alternatively, you can use &man.mksnap.ffs.8; to create a snapshot:
&prompt.root; mksnap_ffs /var /var/snapshot/snap
One can find snapshot files on a file system (e.g. /var) by using the &man.find.1; command:
&prompt.root; find /var -flags snapshot
Once a snapshot has been created, it has several uses: Some administrators will use a snapshot file for backup purposes, because the snapshot can be transferred to CDs or tape. File integrity checking: &man.fsck.8; may be run on the snapshot. Assuming that the file system was clean when it was mounted, you should always get a clean (and unchanging) result. This is essentially what the background &man.fsck.8; process does.
Run the &man.dump.8; utility on the snapshot. A dump will be returned that is consistent with the file system and the timestamp of the snapshot. &man.dump.8; can also take a snapshot, create a dump image and then remove the snapshot in one command by using the -L flag. &man.mount.8; the snapshot as a frozen image of the file system. To &man.mount.8; the snapshot /var/snapshot/snap run:
&prompt.root; mdconfig -a -t vnode -f /var/snapshot/snap -u 4
&prompt.root; mount -r /dev/md4 /mnt
You can now walk the hierarchy of your frozen /var file system mounted at /mnt. Everything will initially be in the same state it was in at snapshot creation time. The only exception is that any earlier snapshots will appear as zero length files. When a snapshot is no longer needed, it can be unmounted with:
&prompt.root; umount /mnt
&prompt.root; mdconfig -d -u 4
For more information about soft updates and file system snapshots, including technical papers, you can visit Marshall Kirk McKusick's website. File System Quotas accounting disk space disk quotas Quotas are an optional feature of the operating system that allow you to limit the amount of disk space and/or the number of files a user or members of a group may allocate, on a per-file system basis. This is used most often on timesharing systems, where it is desirable to limit the amount of resources any one user or group of users may allocate. This will prevent one user or group of users from consuming all of the available disk space. Configuring Your System to Enable Disk Quotas Before attempting to use disk quotas, it is necessary to make sure that quotas are configured in your kernel. This is done by adding the following line to your kernel configuration file:
options QUOTA
The stock GENERIC kernel does not have this enabled by default, so you will have to configure, build and install a custom kernel in order to use disk quotas. Please refer to the kernel configuration chapter for more information. Next you will need to enable disk quotas in /etc/rc.conf. This is done by adding the line:
enable_quotas="YES"
disk quotas checking For finer control over your quota startup, there is an additional configuration variable available. Normally on bootup, the quota integrity of each file system is checked by the &man.quotacheck.8; program. The &man.quotacheck.8; facility ensures that the data in the quota database properly reflects the data on the file system. This is a very time consuming process that will significantly affect the time your system takes to boot. If you would like to skip this step, a variable in /etc/rc.conf is made available for the purpose:
check_quotas="NO"
Finally you will need to edit /etc/fstab to enable disk quotas on a per-file system basis. This is where you can enable user or group quotas, or both, for all of your file systems. To enable per-user quotas on a file system, add the userquota option to the options field in the /etc/fstab entry for the file system you want to enable quotas on. For example:
/dev/da1s2g /home ufs rw,userquota 1 2
Similarly, to enable group quotas, use the groupquota option instead of userquota. To enable both user and group quotas, change the entry as follows:
/dev/da1s2g /home ufs rw,userquota,groupquota 1 2
By default, the quota files are stored in the root directory of the file system with the names quota.user and quota.group, for user and group quotas respectively. See &man.fstab.5; for more information.
Even though the &man.fstab.5; manual page says that you can specify an alternate location for the quota files, this is not recommended, because the various quota utilities do not seem to handle this properly. At this point you should reboot your system with your new kernel. /etc/rc will automatically run the appropriate commands to create the initial quota files for all of the quotas you enabled in /etc/fstab, so there is no need to manually create any zero length quota files. In the normal course of operations you should not be required to run the &man.quotacheck.8;, &man.quotaon.8;, or &man.quotaoff.8; commands manually. However, you may want to read their manual pages just to be familiar with their operation. Setting Quota Limits disk quotas limits Once you have configured your system to enable quotas, verify that they really are enabled. An easy way to do this is to run:
&prompt.root; quota -v
You should see a one line summary of disk usage and current quota limits for each file system that quotas are enabled on. You are now ready to start assigning quota limits with the &man.edquota.8; command. You have several options on how to enforce limits on the amount of disk space a user or group may allocate, and how many files they may create. You may limit allocations based on disk space (block quotas), or number of files (inode quotas), or a combination of both. Each of these limits is further broken down into two categories: hard and soft limits. hard limit A hard limit may not be exceeded. Once a user reaches his hard limit he may not make any further allocations on the file system in question. For example, if the user has a hard limit of 500 kbytes on a file system and is currently using 490 kbytes, the user can only allocate an additional 10 kbytes. Attempting to allocate an additional 11 kbytes will fail. soft limit Soft limits, on the other hand, can be exceeded for a limited amount of time. This period of time is known as the grace period, which is one week by default. If a user stays over his or her soft limit longer than the grace period, the soft limit will turn into a hard limit and no further allocations will be allowed. When the user drops back below the soft limit, the grace period will be reset. The following is an example of what you might see when you run the &man.edquota.8; command. When the &man.edquota.8; command is invoked, you are placed into the editor specified by the EDITOR environment variable, or in the vi editor if the EDITOR variable is not set, to allow you to edit the quota limits.
&prompt.root; edquota -u test
Quotas for user test:
/usr: kbytes in use: 65, limits (soft = 50, hard = 75)
inodes in use: 7, limits (soft = 50, hard = 60)
/usr/var: kbytes in use: 0, limits (soft = 50, hard = 75)
inodes in use: 0, limits (soft = 50, hard = 60)
You will normally see two lines for each file system that has quotas enabled: one line for the block limits, and one line for the inode limits. Simply change the value you want updated to modify the quota limit. For example, to raise this user's block limit from a soft limit of 50 and a hard limit of 75 to a soft limit of 500 and a hard limit of 600, change:
/usr: kbytes in use: 65, limits (soft = 50, hard = 75)
to:
/usr: kbytes in use: 65, limits (soft = 500, hard = 600)
The new quota limits will be in place when you exit the editor.
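The grace period for soft limits can be adjusted in the same editor-based way; &man.edquota.8; provides the -t flag for this:
&prompt.root; edquota -t
Whether the default of one week needs changing is a judgment call; for most systems it is fine as is.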
First, assign the desired quota limit to a user, and then run edquota -p protouser startuid-enduid. For example, if user test has the desired quota limits, the following command can be used to duplicate those quota limits for UIDs 10,000 through 19,999: &prompt.root; edquota -p test 10000-19999 For more information see the &man.edquota.8; manual page. Checking Quota Limits and Disk Usage disk quotas checking You can use either the &man.quota.1; or the &man.repquota.8; commands to check quota limits and disk usage. The &man.quota.1; command can be used to check individual user or group quotas and disk usage. A user may only examine his own quota and the quota of a group he is a member of. Only the super-user may view all user and group quotas. The &man.repquota.8; command can be used to get a summary of all quotas and disk usage for file systems with quotas enabled. The following is some sample output from the quota -v command for a user that has quota limits on two file systems. Disk quotas for user test (uid 1002): Filesystem usage quota limit grace files quota limit grace /usr 65* 50 75 5days 7 50 60 /usr/var 0 50 75 0 50 60 grace period On the /usr file system in the above example, this user is currently 15 kbytes over the soft limit of 50 kbytes and has 5 days of the grace period left. Note the asterisk *, which indicates that the user is currently over his quota limit. Normally, file systems that the user is not using any disk space on will not show up in the output from the &man.quota.1; command, even if he has a quota limit assigned for that file system. The -v option will display those file systems, such as the /usr/var file system in the above example. Quotas over NFS NFS Quotas are enforced by the quota subsystem on the NFS server. The &man.rpc.rquotad.8; daemon makes quota information available to the &man.quota.1; command on NFS clients, allowing users on those machines to see their quota statistics. Enable rpc.rquotad in /etc/inetd.conf like so: rquotad/1 dgram rpc/udp wait root /usr/libexec/rpc.rquotad rpc.rquotad Now restart inetd: &prompt.root; kill -HUP `cat /var/run/inetd.pid` Lucky Green Contributed by
shamrock@cypherpunks.to
Encrypting Disk Partitions disks encrypting FreeBSD offers excellent online protections against unauthorized data access. File permissions and Mandatory Access Control (MAC) (see ) help prevent unauthorized third-parties from accessing data while the operating system is active and the computer is powered up. However, the permissions enforced by the operating system are irrelevant if an attacker has physical access to a computer and can simply move the computer's hard drive to another system to copy and analyze the sensitive data. Regardless of how an attacker may have come into possession of a hard drive or powered-down computer, both GEOM Based Disk Encryption (gbde) and geli cryptographic subsystems in &os; are able to protect the data on the computer's file systems against even highly-motivated attackers with significant resources. Unlike cumbersome encryption methods that encrypt only individual files, gbde and geli transparently encrypt entire file systems. No cleartext ever touches the hard drive's platter. Disk Encryption with <application>gbde</application> Become <username>root</username> Configuring gbde requires super-user privileges. &prompt.user; su - Password: Verify the Operating System Version &man.gbde.4; requires FreeBSD 5.0 or higher. &prompt.root; uname -r 5.0-RELEASE Add &man.gbde.4; Support to the Kernel Configuration File Add the following line to the kernel configuration file: options GEOM_BDE Rebuild the kernel as described in . Reboot into the new kernel. Preparing the Encrypted Hard Drive The following example assumes that you are adding a new hard drive to your system that will hold a single encrypted partition. This partition will be mounted as /private. gbde can also be used to encrypt /home and /var/mail, but this requires more complex instructions which exceed the scope of this introduction. Add the New Hard Drive Install the new drive to the system as explained in . For the purposes of this example, a new hard drive partition has been added as /dev/ad4s1c. The /dev/ad0s1* devices represent existing standard FreeBSD partitions on the example system. &prompt.root; ls /dev/ad* /dev/ad0 /dev/ad0s1b /dev/ad0s1e /dev/ad4s1 /dev/ad0s1 /dev/ad0s1c /dev/ad0s1f /dev/ad4s1c /dev/ad0s1a /dev/ad0s1d /dev/ad4 Create a Directory to Hold gbde Lock Files &prompt.root; mkdir /etc/gbde The gbde lock file contains information that gbde requires to access encrypted partitions. Without access to the lock file, gbde will not be able to decrypt the data contained in the encrypted partition without significant manual intervention which is not supported by the software. Each encrypted partition uses a separate lock file. Initialize the gbde Partition A gbde partition must be initialized before it can be used. This initialization needs to be performed only once: &prompt.root; gbde init /dev/ad4s1c -i -L /etc/gbde/ad4s1c &man.gbde.8; will open your editor, permitting you to set various configuration options in a template. For use with UFS1 or UFS2, set the sector_size to 2048: $FreeBSD: src/sbin/gbde/template.txt,v 1.1 2002/10/20 11:16:13 phk Exp $ # # Sector size is the smallest unit of data which can be read or written. # Making it too small decreases performance and decreases available space. # Making it too large may prevent filesystems from working. 512 is the # minimum and always safe. For UFS, use the fragment size # sector_size = 2048 [...] &man.gbde.8; will ask you twice to type the passphrase that should be used to secure the data. The passphrase must be the same both times. 
gbde's ability to protect your data depends entirely on the quality of the passphrase that you choose. For tips on how to select a secure passphrase that is easy to remember, see the Diceware Passphrase website. The gbde init command creates a lock file for your gbde partition that in this example is stored as /etc/gbde/ad4s1c. gbde lock files must be backed up together with the contents of any encrypted partitions. While deleting a lock file alone cannot prevent a determined attacker from decrypting a gbde partition, without the lock file the legitimate owner will be unable to access the data on the encrypted partition without a significant amount of work that is totally unsupported by &man.gbde.8; and its designer. Attach the Encrypted Partition to the Kernel &prompt.root; gbde attach /dev/ad4s1c -l /etc/gbde/ad4s1c You will be asked to provide the passphrase that you selected during the initialization of the encrypted partition. The new encrypted device will show up in /dev as /dev/device_name.bde: &prompt.root; ls /dev/ad* /dev/ad0 /dev/ad0s1b /dev/ad0s1e /dev/ad4s1 /dev/ad0s1 /dev/ad0s1c /dev/ad0s1f /dev/ad4s1c /dev/ad0s1a /dev/ad0s1d /dev/ad4 /dev/ad4s1c.bde Create a File System on the Encrypted Device Once the encrypted device has been attached to the kernel, you can create a file system on the device. To create a file system on the encrypted device, use &man.newfs.8;. Since it is much faster to initialize a new UFS2 file system than it is to initialize the old UFS1 file system, using &man.newfs.8; with the -O2 option is recommended. The -O2 option is the default with &os; 5.1-RELEASE and later. &prompt.root; newfs -U -O2 /dev/ad4s1c.bde The &man.newfs.8; command must be performed on an attached gbde partition, which is identified by a *.bde extension to the device name. Mount the Encrypted Partition Create a mount point for the encrypted file system. &prompt.root; mkdir /private Mount the encrypted file system. &prompt.root; mount /dev/ad4s1c.bde /private Verify That the Encrypted File System is Available The encrypted file system should now be visible to &man.df.1; and be available for use. &prompt.user; df -H Filesystem Size Used Avail Capacity Mounted on /dev/ad0s1a 1037M 72M 883M 8% / /devfs 1.0K 1.0K 0B 100% /dev /dev/ad0s1f 8.1G 55K 7.5G 0% /home /dev/ad0s1e 1037M 1.1M 953M 0% /tmp /dev/ad0s1d 6.1G 1.9G 3.7G 35% /usr /dev/ad4s1c.bde 150G 4.1K 138G 0% /private Mounting Existing Encrypted File Systems After each boot, any encrypted file systems must be re-attached to the kernel, checked for errors, and mounted, before the file systems can be used. The required commands must be executed as user root. Attach the gbde Partition to the Kernel &prompt.root; gbde attach /dev/ad4s1c -l /etc/gbde/ad4s1c You will be asked to provide the passphrase that you selected during initialization of the encrypted gbde partition. Check the File System for Errors Since encrypted file systems cannot yet be listed in /etc/fstab for automatic mounting, the file systems must be checked for errors by running &man.fsck.8; manually before mounting. &prompt.root; fsck -p -t ffs /dev/ad4s1c.bde Mount the Encrypted File System &prompt.root; mount /dev/ad4s1c.bde /private The encrypted file system is now available for use. Automatically Mounting Encrypted Partitions It is possible to create a script to automatically attach, check, and mount an encrypted partition, but for security reasons the script should not contain the &man.gbde.8; password.
Instead, it is recommended that such scripts be run manually while providing the password via the console or &man.ssh.1;. As of &os; 5.2-RELEASE, there is a new rcNG script provided. Arguments for this script can be passed via &man.rc.conf.5;, for example: gbde_autoattach_all="YES" gbde_devices="ad4s1c" This will require that the gbde passphrase be entered at boot time. After typing the correct passphrase, the gbde encrypted partition will be mounted automatically. This can be very useful when using gbde on notebooks. Cryptographic Protections Employed by gbde &man.gbde.8; encrypts the sector payload using 128-bit AES in CBC mode. Each sector on the disk is encrypted with a different AES key. For more information on gbde's cryptographic design, including how the sector keys are derived from the user-supplied passphrase, see &man.gbde.4;. Compatibility Issues &man.sysinstall.8; is incompatible with gbde-encrypted devices. All *.bde devices must be detached from the kernel before starting &man.sysinstall.8; or it will crash during its initial probing for devices. To detach the encrypted device used in our example, use the following command: &prompt.root; gbde detach /dev/ad4s1c Also note that, as &man.vinum.4; does not use the &man.geom.4; subsystem, you cannot use gbde with vinum volumes. Daniel Gerzo Contributed by Disk Encryption with <command>geli</command> A new cryptographic GEOM class is available as of &os; 6.0: geli. It is currently being developed by &a.pjd;. Geli is different from gbde; it offers different features and uses a different scheme for doing cryptographic work. The most important features of &man.geli.8; are: Utilizes the &man.crypto.9; framework: when cryptographic hardware is available, geli will use it automatically. Supports multiple cryptographic algorithms (currently AES, Blowfish, and 3DES). Allows the root partition to be encrypted. The passphrase used to access the encrypted root partition will be requested during the system boot. Allows the use of two independent keys (e.g. a user key and a company key). geli is fast; it performs simple sector-to-sector encryption. Allows backup and restore of Master Keys. When a user has to destroy his keys, it is still possible to get access to the data by restoring keys from the backup. Allows attaching a disk with a random, one-time key, which is useful for swap partitions and temporary file systems. More geli features can be found in the &man.geli.8; manual page. The next steps will describe how to enable support for geli in the &os; kernel and will explain how to create a new geli encryption provider. At the end it will be demonstrated how to create an encrypted swap partition using features provided by geli. In order to use geli, you must be running &os; 6.0-RELEASE or later. Super-user privileges will be required since modifications to the kernel are necessary. Adding <command>geli</command> Support to the Kernel Configuration File Add the following lines to the kernel configuration file: options GEOM_ELI device crypto Rebuild the kernel as described in . Alternatively, the geli module can be loaded at boot time. Add the following line to /boot/loader.conf: geom_eli_load="YES" &man.geli.8; should now be supported by the kernel. Generating the Master Key The following example will describe how to generate a key file, which will be used as part of the Master Key for the encrypted provider mounted under /private. The key file will provide some random data used to encrypt the Master Key.
The Master Key will be protected by a passphrase as well. The provider's sector size will be 4 kB. Furthermore, the discussion will describe how to attach the geli provider, create a file system on it, mount it, work with it, and finally detach it. It is recommended to use a larger sector size (such as 4 kB) for better performance. The Master Key will be protected with a passphrase, and the data source for the key file will be /dev/random. The sector size of /dev/da2.eli, which we call the provider, will be 4 kB. &prompt.root; dd if=/dev/random of=/root/da2.key bs=64 count=1 &prompt.root; geli init -s 4096 -K /root/da2.key /dev/da2 Enter new passphrase: Reenter new passphrase: It is not mandatory that both a passphrase and a key file are used; either method of securing the Master Key can be used in isolation. If the key file is given as -, standard input will be used. This example shows how more than one key file can be used. &prompt.root; cat keyfile1 keyfile2 keyfile3 | geli init -K - /dev/da2 Attaching the Provider with the generated Key &prompt.root; geli attach -k /root/da2.key /dev/da2 Enter passphrase: The new plaintext device will be named /dev/da2.eli. &prompt.root; ls /dev/da2* /dev/da2 /dev/da2.eli Creating the new File System &prompt.root; dd if=/dev/random of=/dev/da2.eli bs=1m &prompt.root; newfs /dev/da2.eli &prompt.root; mount /dev/da2.eli /private The encrypted file system should now be visible to &man.df.1; and be available for use. &prompt.root; df -H Filesystem Size Used Avail Capacity Mounted on /dev/ad0s1a 248M 89M 139M 38% / /devfs 1.0K 1.0K 0B 100% /dev /dev/ad0s1f 7.7G 2.3G 4.9G 32% /usr /dev/ad0s1d 989M 1.5M 909M 0% /tmp /dev/ad0s1e 3.9G 1.3G 2.3G 35% /var /dev/da2.eli 150G 4.1K 138G 0% /private Unmounting and Detaching the Provider Once the work on the encrypted partition is done, and the /private partition is no longer needed, it is prudent to consider unmounting and detaching the geli encrypted partition from the kernel. &prompt.root; umount /private &prompt.root; geli detach da2.eli More information about the use of &man.geli.8; can be found in the manual page. Encrypting a Swap Partition The following example demonstrates how to create a geli encrypted swap partition. &prompt.root; dd if=/dev/random of=/dev/ad0s1b bs=1m &prompt.root; geli onetime -d -a 3des ad0s1b &prompt.root; swapon /dev/ad0s1b.eli Using the <filename>geli</filename> rcNG Script geli comes with an rcNG script which can be used to simplify the usage of geli. An example of configuring geli through &man.rc.conf.5; follows: geli_devices="da2" geli_da2_flags="-p -k /root/da2.key" This will configure /dev/da2 as a geli provider whose Master Key file is located in /root/da2.key, and geli will not use a passphrase when attaching the provider (note that this can only be used if -P was given during the geli init phase). The system will detach the geli provider from the kernel before the system shuts down. More information about configuring rcNG is provided in the rcNG section of the Handbook.
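Since the feature list above mentions backup and restore of Master Keys, it is worth sketching how that looks in practice. The following is a minimal sketch that assumes the /dev/da2 provider from the examples above and a hypothetical backup file name; see the &man.geli.8; manual page for the authoritative description of the backup and restore subcommands:
&prompt.root; geli backup /dev/da2 /root/da2_meta.backup
&prompt.root; geli restore /root/da2_meta.backup /dev/da2
Keep in mind that restoring old metadata also restores the old Master Key protection, so any passphrase or key file changes made after the backup was taken will be undone.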
diff --git a/en_US.ISO8859-1/books/handbook/firewalls/chapter.sgml b/en_US.ISO8859-1/books/handbook/firewalls/chapter.sgml index d3f07d269e..ab436548fb 100644 --- a/en_US.ISO8859-1/books/handbook/firewalls/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/firewalls/chapter.sgml @@ -1,3342 +1,3342 @@ Joseph J. Barbish Contributed by Brad Davis Converted to SGML and updated by Firewalls firewall security firewalls Introduction Firewalls make it possible to filter incoming and outgoing traffic that flows through your system. A firewall can use one or more sets of rules to inspect the network packets as they come in or go out of your network connections and either allow the traffic through or block it. The rules of a firewall can inspect one or more characteristics of the packets, including but not limited to the protocol type, the source or destination host address, and the source or destination port. Firewalls can greatly enhance the security of a host or a network. They can be used to do one or more of the following things: To protect and insulate the applications, services and machines of your internal network from unwanted traffic coming in from the public Internet. To limit or disable access from hosts of the internal network to services of the public Internet. To support network address translation (NAT), which allows your internal network to use private IP addresses and share a single connection to the public Internet (either with a single IP address or by a shared pool of automatically assigned public addresses). After reading this chapter, you will know: How to properly define packet filtering rules. The differences between the firewalls built into &os;. How to use and configure the OpenBSD PF firewall. How to use and configure IPFILTER. How to use and configure IPFW. Before reading this chapter, you should: Understand basic &os; and Internet concepts. Firewall Concepts firewall rulesets There are two basic ways to create firewall rulesets: inclusive or exclusive. An exclusive firewall allows all traffic through except for the traffic matching the ruleset. An inclusive firewall does the reverse. It only allows traffic matching the rules through and blocks everything else. Inclusive firewalls are generally safer than exclusive firewalls because they significantly reduce the risk of allowing unwanted traffic to pass through the firewall. Security can be tightened further using a stateful firewall. With a stateful firewall, the firewall keeps track of which connections are opened through it and will only allow traffic through which either matches an existing connection or opens a new one. The disadvantage of a stateful firewall is that it can be vulnerable to Denial of Service (DoS) attacks if a lot of new connections are opened very quickly. With most firewalls it is possible to use a combination of stateful and non-stateful behavior to make an optimal firewall for the site. Firewall Packages &os; has three different firewall packages built into the base system. They are: IPFILTER (also known as IPF), IPFIREWALL (also known as IPFW), and OpenBSD's PacketFilter (also known as PF). &os; also has two built in packages for traffic shaping (basically controlling bandwidth usage): &man.altq.4; and &man.dummynet.4;. Dummynet has traditionally been closely tied with IPFW, and ALTQ with IPF/PF. IPF, IPFW, and PF all use rules to control the access of packets to and from your system, although they go about it in different ways and have different rule syntaxes.
The reason that &os; has multiple built in firewall packages is that different people have different requirements and preferences. No single firewall package is the best. The author prefers IPFILTER because its stateful rules are much less complicated to use in a NAT environment and it has a built in ftp proxy that simplifies the rules to allow secure outbound FTP usage. Since all firewalls are based on inspecting the values of selected packet control fields, the creator of the firewall rulesets must have an understanding of how TCP/IP works, what the different values in the packet control fields are, and how these values are used in a normal session conversation. For a good explanation go to: . The OpenBSD Packet Filter (PF) and <acronym>ALTQ</acronym> firewall PF As of July 2003 the OpenBSD firewall software application known as PF was ported to &os; and was made available in the &os; Ports Collection; the first release that contained PF as an integrated part of the base system was &os; 5.3 in November 2004. PF is a complete, fully featured firewall that has optional support for ALTQ (Alternate Queuing). ALTQ provides Quality of Service (QoS) bandwidth shaping that allows guaranteeing bandwidth to different services based on filtering rules. The OpenBSD Project does such an outstanding job of maintaining the PF User's Guide that it will not be made part of this handbook's firewall section, as that would just be duplicated effort. The availability of PF for the various &os; releases and versions is summarized below: &os; Version PF Availability Pre-4.X versions PF is not available for any release of &os; older than the 4.X branch. All versions of the 4.X branch PF is available as part of KAME. 5.X releases before 5.3-RELEASE The security/pf port can be used to install PF on these versions of &os;. These releases were targeted to developers and people who wanted a preview of early 5.X versions. Upgrading to 5.3-RELEASE or newer versions of &os; is strongly recommended. 5.3-RELEASE and later versions PF is part of the base system. Do not use the security/pf port on these versions of &os;. It will not work. Use the &man.pf.4; support of the base system instead. More info can be found at the PF for &os; web site: . The OpenBSD PF user's guide is here: . PF in &os; 5.X is at the level of OpenBSD version 3.5. The port from the &os; Ports Collection is at the level of OpenBSD version 3.4. Keep that in mind when browsing the user's guide. Enabling PF PF is included in the basic &os; install for 5.3 and later versions as a separate run time loadable module. The system will dynamically load the PF kernel loadable module when the rc.conf statement pf_enable="YES" is used. The loadable module was created with &man.pflog.4; logging enabled. The module assumes the presence of options INET and device bpf. Unless NOINET6 (for example in &man.make.conf.5;) was defined during the build, it also requires options INET6. Kernel options kernel options device pf kernel options device pflog kernel options device pfsync It is not a mandatory requirement that you enable PF by compiling the following options into the &os; kernel. It is only presented here as background information. Compiling PF into the kernel causes the loadable module to never be used. Sample kernel config PF option statements are in the /usr/src/sys/conf/NOTES kernel source and are reproduced here: device pf device pflog device pfsync device pf enables support for the Packet Filter firewall.
device pflog enables the optional &man.pflog.4; pseudo network device which can be used to log traffic to a &man.bpf.4; descriptor. The &man.pflogd.8; daemon can be used to store the logging information to disk. device pfsync enables the optional &man.pfsync.4; pseudo network device that is used to monitor state changes. As this is not part of the loadable module, one has to build a custom kernel to use it. These settings will take effect only after you have built and installed a kernel with them set. Available rc.conf Options You need the following statements in /etc/rc.conf to activate PF at boot time: pf_enable="YES" # Enable PF (load module if required) pf_rules="/etc/pf.conf" # rules definition file for pf pf_flags="" # additional flags for pfctl startup pflog_enable="YES" # start pflogd(8) pflog_logfile="/var/log/pflog" # where pflogd should store the logfile pflog_flags="" # additional flags for pflogd startup If you have a LAN behind this firewall and have to forward packets for the computers in the LAN or want to do NAT, you have to enable the following option as well: gateway_enable="YES" # Enable as LAN gateway Enabling <acronym>ALTQ</acronym> ALTQ is only available by compiling the options into the &os; kernel. ALTQ is not supported by all of the available network card drivers. Please see the &man.altq.4; manual page for a list of drivers that are supported in your release of &os;. The following options will enable ALTQ and add additional functionality. options ALTQ options ALTQ_CBQ # Class Based Queuing (CBQ) options ALTQ_RED # Random Early Detection (RED) options ALTQ_RIO # RED In/Out options ALTQ_HFSC # Hierarchical Packet Scheduler (HFSC) options ALTQ_PRIQ # Priority Queuing (PRIQ) options ALTQ_NOPCC # Required for SMP build options ALTQ enables the ALTQ framework. options ALTQ_CBQ enables Class Based Queuing (CBQ). CBQ allows you to divide a connection's bandwidth into different classes or queues to prioritize traffic based on filter rules. options ALTQ_RED enables Random Early Detection (RED). RED is used to avoid network congestion. RED does this by measuring the length of the queue and comparing it to the minimum and maximum thresholds for the queue. If the queue is over the maximum, all new packets will be dropped. True to its name, RED drops packets from different connections randomly. options ALTQ_RIO enables Random Early Detection In and Out. options ALTQ_HFSC enables the Hierarchical Fair Service Curve Packet Scheduler. For more information about HFSC see: . options ALTQ_PRIQ enables Priority Queuing (PRIQ). PRIQ will always pass traffic that is in a higher queue first. options ALTQ_NOPCC enables SMP support for ALTQ. This option is required on SMP systems. The IPFILTER (IPF) Firewall firewall IPFILTER This section is a work in progress. The contents might not be accurate at all times. The author of IPFILTER is Darren Reed. IPFILTER is not operating system dependent: it is an open source application and has been ported to the &os;, NetBSD, OpenBSD, &sunos;, HP/UX, and &solaris; operating systems. IPFILTER is actively being supported and maintained, with updated versions being released regularly. IPFILTER is based on a kernel-side firewall and NAT mechanism that can be controlled and monitored by userland interface programs. The firewall rules can be set or deleted with the &man.ipf.8; utility. The NAT rules can be set or deleted with the &man.ipnat.1; utility. The &man.ipfstat.8; utility can print run-time statistics for the kernel parts of IPFILTER.
The &man.ipmon.8; program can log IPFILTER actions to the system log files. IPF was originally written using a rule processing logic of the last matching rule wins and used only stateless rules. Over time IPF has been enhanced to include a quick option and a stateful keep state option which drastically modernized the rules processing logic. IPF's official documentation covers the legacy rule coding parameters and the legacy rule file processing logic. The modernized functions are only included as additional options, which understates their benefits in producing a far more secure firewall. The instructions contained in this section are based on using rules that contain the quick option and the stateful keep state option. This is the basic framework for coding an inclusive firewall rule set. An inclusive firewall only allows packets matching the rules to pass through. This way you can control what services can originate behind the firewall destined for the public Internet and also control the services which can originate from the public Internet accessing your private network. Everything else is blocked and logged by default design. Inclusive firewalls are much, much more secure than exclusive firewall rule sets and are the only rule set type covered herein. For a detailed explanation of the legacy rules processing method see: and . The IPF FAQ is at . Enabling IPF IPFILTER enabling IPF is included in the basic &os; install as a separate run time loadable module. The system will dynamically load the IPF kernel loadable module when the rc.conf statement ipfilter_enable="YES" is used. The loadable module was created with logging enabled and the default pass all options. You do not need to compile IPF into the &os; kernel just to change the default to block all; you can do that by just coding a block all rule at the end of your rule set. Kernel options kernel options IPFILTER kernel options IPFILTER_LOG kernel options IPFILTER_DEFAULT_BLOCK IPFILTER kernel options It is not a mandatory requirement that you enable IPF by compiling the following options into the &os; kernel. It is only presented here as background information. Compiling IPF into the kernel causes the loadable module to never be used. Sample kernel config IPF option statements are in the /usr/src/sys/conf/NOTES kernel source (/usr/src/sys/arch/conf/LINT for &os; 4.X) and are reproduced here: options IPFILTER options IPFILTER_LOG options IPFILTER_DEFAULT_BLOCK options IPFILTER enables support for the IPFILTER firewall. options IPFILTER_LOG enables the option to have IPF log traffic by writing to the ipl packet logging pseudo-device for every rule that has the log keyword. options IPFILTER_DEFAULT_BLOCK changes the default behavior so any packet not matching a firewall pass rule gets blocked. These settings will take effect only after you have built and installed a kernel with them set.
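If you stay with the loadable module instead, it can also be loaded and verified by hand rather than waiting for a reboot. The following is a minimal sketch; ipl is the name the IPFILTER module has traditionally carried in &os;, so verify it against your release before relying on it:
&prompt.root; kldload ipl
&prompt.root; kldstat | grep ipl
Once the module shows up in the &man.kldstat.8; listing, the ipf and ipnat utilities described below can talk to the kernel side of IPFILTER.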
Available rc.conf Options You need the following statements in /etc/rc.conf to activate IPF at boot time: ipfilter_enable="YES" # Start ipf firewall ipfilter_rules="/etc/ipf.rules" # loads rules definition text file ipmon_enable="YES" # Start IP monitor log ipmon_flags="-Ds" # D = start as daemon # s = log to syslog # v = log tcp window, ack, seq - # n = map IP & port to names + # n = map IP & port to names If you have a LAN behind this firewall that uses the reserved private IP address ranges, then you need to add the following to enable NAT functionality: gateway_enable="YES" # Enable as LAN gateway ipnat_enable="YES" # Start ipnat function ipnat_rules="/etc/ipnat.rules" # rules definition file for ipnat IPF ipf The ipf command is used to load your rules file. Normally you create a file containing your custom rules and use this command to replace en masse the currently running firewall internal rules: &prompt.root; ipf -Fa -f /etc/ipf.rules -Fa means flush all internal rules tables. -f means this is the file to read for the rules to load. This gives you the ability to make changes to your custom rules file, run the above IPF command, and thus update the running firewall with a fresh copy of all the rules without having to reboot the system. This method is very convenient for testing new rules as the procedure can be executed as many times as needed. See the &man.ipf.8; manual page for details on the other flags available with this command. The &man.ipf.8; command expects the rules file to be a standard text file. It will not accept a rules file written as a script with symbolic substitution. There is a way to build IPF rules that utilizes the power of script symbolic substitution. For more information, see . IPFSTAT ipfstat IPFILTER statistics The default behavior of &man.ipfstat.8; is to retrieve and display the totals of the accumulated statistics gathered as a result of applying the user coded rules against packets going in and out of the firewall since it was last started, or since the last time the accumulators were reset to zero by the ipf -Z command. See the &man.ipfstat.8; manual page for details. The default &man.ipfstat.8; command output will look something like this: input packets: blocked 99286 passed 1255609 nomatch 14686 counted 0 output packets: blocked 4200 passed 1284345 nomatch 14687 counted 0 input packets logged: blocked 99286 passed 0 output packets logged: blocked 0 passed 0 packets logged: input 0 output 0 log failures: input 3898 output 0 fragment state(in): kept 0 lost 0 fragment state(out): kept 0 lost 0 packet state(in): kept 169364 lost 0 packet state(out): kept 431395 lost 0 ICMP replies: 0 TCP RSTs sent: 0 Result cache hits(in): 1215208 (out): 1098963 IN Pullups succeeded: 2 failed: 0 OUT Pullups succeeded: 0 failed: 0 Fastroute successes: 0 failures: 0 TCP cksum fails(in): 0 (out): 0 Packet log flags set: (0) When supplied with either -i for inbound or -o for outbound, it will retrieve and display the appropriate list of filter rules currently installed and in use by the kernel. ipfstat -in displays the inbound internal rules table with rule numbers. ipfstat -on displays the outbound internal rules table with the rule numbers. The output will look something like this: @1 pass out on xl0 from any to any @2 block out on dc0 from any to any @3 pass out quick on dc0 proto tcp/udp from any to any keep state ipfstat -ih displays the inbound internal rules table, prefixing each rule with a count of how many times the rule was matched.
ipfstat -oh displays the outbound internal rules table, prefixing each rule with a count of how many times the rule was matched. The output will look something like this: 2451423 pass out on xl0 from any to any 354727 block out on dc0 from any to any 430918 pass out quick on dc0 proto tcp/udp from any to any keep state One of the most important functions of the ipfstat command is the -t flag, which displays the state table in a way similar to the way &man.top.1; shows the &os; running process table. When your firewall is under attack this function gives you the ability to identify, drill down to, and see the attacking packets. The optional sub-flags give the ability to select the destination or source IP, port, or protocol that you want to monitor in real time. See the &man.ipfstat.8; manual page for details. IPMON ipmon IPFILTER logging In order for ipmon to work properly, the kernel option IPFILTER_LOG must be turned on. This command has two different modes that it can be used in. Native mode is the default mode when you type the command on the command line without the -D flag. Daemon mode is for when you want to have a continuous system log file available so that you can review logging of past events. This is how &os; and IPFILTER are configured to work together. &os; has a built in facility to automatically rotate system logs. That is why outputting the log information to syslogd is better than the default of outputting to a regular file. In the default rc.conf file you see the ipmon_flags statement uses the -Ds flags: ipmon_flags="-Ds" # D = start as daemon # s = log to syslog # v = log tcp window, ack, seq - # n = map IP & port to names + # n = map IP & port to names The benefits of logging are obvious. It provides the ability to review, after the fact, information such as which packets had been dropped, what addresses they came from and where they were going. These all give you a significant edge in tracking down attackers. Even with the logging facility enabled, IPF will not generate any rule logging on its own. The firewall administrator decides what rules in the rule set he wants to log and adds the log keyword to those rules. Normally only deny rules are logged. It is very customary to include a default deny everything rule with the log keyword included as your last rule in the rule set. This way you get to see all the packets that did not match any of the rules in the rule set. IPMON Logging Syslogd uses its own special method for segregation of log data. It uses special groupings called facility and level. IPMON in -Ds mode uses security (local0 in 4.X) as the facility name. All IPMON logged data goes to security (local0 in 4.X). The following levels can be used to further segregate the logged data if desired: LOG_INFO - packets logged using the "log" keyword as the action rather than pass or block. LOG_NOTICE - packets logged which are also passed LOG_WARNING - packets logged which are also blocked LOG_ERR - packets which have been logged and which can be considered short To set up IPFILTER to log all data to /var/log/ipfilter.log, you will need to create the file. The following command will do that: &prompt.root; touch /var/log/ipfilter.log The syslog function is controlled by definition statements in the /etc/syslog.conf file. The syslog.conf file offers considerable flexibility in how syslog will deal with system messages issued by software applications like IPF.
Add the following statement to /etc/syslog.conf for &os; 5.X and later: security.* /var/log/ipfilter.log Or add the following statement to /etc/syslog.conf for &os; 4.X: local0.* /var/log/ipfilter.log The security.* (local0 for 4.X) means to write all the logged messages to the coded file location. To activate the changes to /etc/syslog.conf you can reboot or bump the syslog task into re-reading /etc/syslog.conf by running /etc/rc.d/syslogd reload (killall -HUP syslogd in &os; 4.X). Do not forget to change /etc/newsyslog.conf to rotate the new log you just created above. The Format of Logged Messages Messages generated by ipmon consist of data fields separated by white space. Fields common to all messages are: The date of packet receipt. The time of packet receipt. This is in the form HH:MM:SS.F, for hours, minutes, seconds, and fractions of a second (which can be several digits long). The name of the interface the packet was processed on, e.g. dc0. The group and rule number of the rule, e.g. @0:17. These can be viewed with ipfstat -in. The action: p for passed, b for blocked, S for a short packet, n for did not match any rules, L for a log rule. The order of precedence in showing flags is: S, p, b, n, L. A capital P or B means that the packet has been logged due to a global logging setting, not a particular rule. The addresses. This is actually three fields: the - source address and port (separated by a comma), the -> + source address and port (separated by a comma), the -> symbol, and the destination address and port. - 209.53.17.22,80 -> 198.73.220.17,1722. + 209.53.17.22,80 -> 198.73.220.17,1722. PR followed by the protocol name or number, e.g. PR tcp. len followed by the header length and total length of the packet, e.g. len 20 40. If the packet is a TCP packet, there will be an additional field starting with a hyphen followed by letters corresponding to any flags that were set. See the &man.ipmon.8; manual page for a list of letters and their flags. If the packet is an ICMP packet, there will be two fields at the end, the first always being ICMP, and the next being the ICMP message and sub-message type, separated by a slash, e.g. ICMP 3/3 for a port unreachable message. Building the Rule Script with Symbolic Substitution Some experienced IPF users create a file containing the rules and code them in a manner compatible with running them as a script with symbolic substitution. The major benefit of doing this is that you only have to change the value associated with the symbolic name, and when the script is run all the rules containing the symbolic name will have the value substituted in. Being a script, you can use symbolic substitution to code frequently used values and substitute them in multiple rules. You will see this in the following example. The script syntax used here is compatible with the sh, csh, and tcsh shells. References to symbolic substitution fields are prefixed with a dollar sign: $. Symbolic field definitions do not have the $ prefix. The value to populate the symbolic field must be enclosed with double quotes ("). Start your rule file with something like this: ############# Start of IPF rules script ######################## oif="dc0" # name of the outbound interface odns="192.0.2.11" # ISP's DNS server IP address myip="192.0.2.7" # my static IP address from ISP ks="keep state" fks="flags S keep state" # You can choose between building /etc/ipf.rules file # from this script or running this script "as is". # # Uncomment only one line and comment out another.
# # 1) This can be used for building /etc/ipf.rules: #cat > /etc/ipf.rules << EOF # # 2) This can be used to run script "as is": /sbin/ipf -Fa -f - << EOF # Allow out access to my ISP's Domain name server. pass out quick on $oif proto tcp from any to $odns port = 53 $fks pass out quick on $oif proto udp from any to $odns port = 53 $ks # Allow out non-secure standard www function pass out quick on $oif proto tcp from $myip to any port = 80 $fks # Allow out secure www function https over TLS SSL pass out quick on $oif proto tcp from $myip to any port = 443 $fks EOF ################## End of IPF rules script ######################## That is all there is to it. The rules are not important in this example; what matters is how the symbolic substitution fields are populated and used. If the above example was in a file named /etc/ipf.rules.script, you could reload these rules by entering the following command: &prompt.root; sh /etc/ipf.rules.script There is one problem with using a rules file with embedded symbolics: IPF does not understand symbolic substitution, and cannot read such scripts directly. This script can be used in one of two ways: Uncomment the line that begins with cat, and comment out the line that begins with /sbin/ipf. Place ipfilter_enable="YES" into /etc/rc.conf as usual, and run the script once after each modification to create or update /etc/ipf.rules. Disable IPFILTER in system startup scripts by adding ipfilter_enable="NO" (this is the default value) to the /etc/rc.conf file. Add a script like the following to your /usr/local/etc/rc.d/ startup directory. The script should have an obvious name like ipf.loadrules.sh. The .sh extension is mandatory. #!/bin/sh sh /etc/ipf.rules.script The permissions on this script file must be read, write, execute for owner root. &prompt.root; chmod 700 /usr/local/etc/rc.d/ipf.loadrules.sh Now, when your system boots, your IPF rules will be loaded. IPF Rule Sets A rule set is a group of ipf rules coded to pass or block packets based on the values contained in the packet. The bi-directional exchange of packets between hosts comprises a session conversation. The firewall rule set processes the packet two times: once on its arrival from the public Internet host and again as it leaves for its return trip back to the public Internet host. Each TCP/IP service (i.e. telnet, www, mail, etc.) is predefined by its protocol, source and destination IP address, or the source and destination port number. This is the basic selection criteria used to create rules which will pass or block services. IPFILTER rule processing order IPF was originally written using a rules processing logic of the last matching rule wins and used only stateless rules. Over time IPF has been enhanced to include a quick option and a stateful keep state option which drastically modernized the rule processing logic. The instructions contained in this section are based on using rules that contain the quick option and the stateful keep state option. This is the basic framework for coding an inclusive firewall rule set. An inclusive firewall only allows services matching the rules through. This way you can control what services can originate behind the firewall destined for the public Internet and also control the services which can originate from the public Internet accessing your private network. Everything else is blocked and logged by default design. Inclusive firewalls are much, much more secure than exclusive firewall rule sets and are the only rule set type covered herein.
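As a tiny sketch of the modernized rule style just described (dc0 is a placeholder interface name; substitute your own), one rule statefully passes outbound web sessions and a final rule enforces the block all by default logic:
pass out quick on dc0 proto tcp from any to any port = 80 flags S keep state
block out log first quick on dc0 all
Each keyword used here is defined in the Rule Syntax section that follows, and a complete ruleset built in this style appears later in this section.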
When working with the firewall rules, be very careful. Some configurations will lock you out of the server. To be on the safe side, you may wish to consider performing the initial firewall configuration from the local console rather than doing it remotely, e.g. via ssh. Rule Syntax IPFILTER rule syntax The rule syntax presented here has been simplified to only address the modern stateful rule context and first matching rule wins logic. For the complete legacy rule syntax description see the &man.ipf.8; manual page. A # character is used to mark the start of a comment and may appear at the end of a rule line or on its own line. Blank lines are ignored. Rules contain keywords. These keywords have to be coded in a specific order from left to right on the line. Keywords are identified in bold type. Some keywords have sub-options which may be keywords themselves and also include more sub-options. Each of the headings in the below syntax has a bold section header which expands on the content. ACTION IN-OUT OPTIONS SELECTION STATEFUL PROTO SRC_ADDR,DST_ADDR OBJECT PORT_NUM TCP_FLAG STATEFUL ACTION = block | pass IN-OUT = in | out OPTIONS = log | quick | on interface-name SELECTION = proto value | source/destination IP | port = number | flags flag-value PROTO = tcp/udp | udp | tcp | icmp SRC_ADDR,DST_ADDR = all | from object to object OBJECT = IP address | any PORT_NUM = port number TCP_FLAG = S STATEFUL = keep state ACTION The action indicates what to do with the packet if it matches the rest of the filter rule. Each rule must have an action. The following actions are recognized: block indicates that the packet should be dropped if the selection parameters match the packet. pass indicates that the packet should exit the firewall if the selection parameters match the packet. IN-OUT A mandatory requirement is that each filter rule explicitly state which side of the I/O it is to be used on. The next keyword must be either in or out, and one or the other has to be coded or the rule will not pass syntax checks. in means this rule is being applied against an inbound packet which has just been received on the interface facing the public Internet. out means this rule is being applied against an outbound packet destined for the interface facing the public Internet. OPTIONS These options must be used in the order shown here. log indicates that the packet header will be written to the ipl log (as described in the LOGGING section below) if the selection parameters match the packet. quick indicates that if the selection parameters match the packet, this rule will be the last rule checked, allowing a short-circuit path to avoid processing any following rules for this packet. This option is a mandatory requirement for the modernized rules processing logic. on indicates the interface name to be incorporated into the selection parameters. Interface names are as displayed by &man.ifconfig.8;. Using this option, the rule will only match if the packet is going through that interface in the specified direction (in/out). This option is a mandatory requirement for the modernized rules processing logic. When a packet is logged, the headers of the packet are written to the IPL packet logging pseudo-device. Immediately following the log keyword, the following qualifiers may be used (in this order): body indicates that the first 128 bytes of the packet contents will be logged after the headers.
first If the log keyword is being used in conjunction with a keep state option, it is recommended that this option is also applied so that only the triggering packet is logged and not every packet which thereafter matches the keep state information. SELECTION The keywords described in this section are used to describe attributes of the packet to be interrogated when determining whether rules match or not. There is a keyword subject, and it has sub-option keywords, one of which has to be selected. The following general-purpose attributes are provided for matching, and must be used in this order: PROTO proto is the subject keyword and must be coded along with one of its corresponding keyword sub-option values. The value allows a specific protocol to be matched against. This option is a mandatory requirement for the modernized rules processing logic. tcp/udp | udp | tcp | icmp or any protocol names found in /etc/protocols are recognized and may be used. The special protocol keyword tcp/udp may be used to match either a TCP or a UDP packet, and has been added as a convenience to save duplication of otherwise identical rules. SRC_ADDR/DST_ADDR The all keyword is essentially a synonym for from any to any with no other match parameters. from src to dst: the from and to keywords are used to match against IP addresses. Rules must specify BOTH source and destination parameters. any is a special keyword that matches any IP address. Examples of use: from any to any or from 0.0.0.0/0 to any or from any to 0.0.0.0/0 or from 0.0.0.0 to any or from any to 0.0.0.0. IP addresses may be specified as a dotted IP address numeric form/mask-length, or as a single dotted IP address numeric form. There is no way to match ranges of IP addresses which do not express themselves easily as mask-length. See this web page for help on writing mask-length: . PORT If a port match is included, for either or both of source and destination, then it is only applied to TCP and UDP packets. When composing port comparisons, either the service name from /etc/services or an integer port number may be used. When the port appears as part of the from object, it matches the source port number; when it appears as part of the to object, it matches the destination port number. The use of the port option with the to object is a mandatory requirement for the modernized rules processing logic. Example of use: from any to any port = 80 Port comparisons may be done in a number of forms, with a number of comparison operators, or port ranges may be specified. port "=" | "!=" | "<" | ">" | "<=" | ">=" | "eq" | "ne" | "lt" | "gt" | "le" | "ge". To specify port ranges, port "<>" | "><" Following the source and destination matching parameters, the following two parameters are mandatory requirements for the modernized rules processing logic. <acronym>TCP</acronym>_FLAG Flags are only effective for TCP filtering. Each letter represents one of the possible flags that can be interrogated in the TCP packet header. The modernized rules processing logic uses the flags S parameter to identify the TCP session start request. STATEFUL keep state indicates that on a pass rule, any packets that match the rule's selection parameters should activate the stateful filtering facility. This option is a mandatory requirement for the modernized rules processing logic. Stateful Filtering IPFILTER stateful filtering Stateful filtering treats traffic as a bi-directional exchange of packets comprising a session conversation.
When activated, keep-state dynamically generates internal rules for each anticipated packet being exchanged during the bi-directional session conversation. It has the interrogation abilities to determine if the session conversation between the originating sender and the destination is following the valid procedure of bi-directional packet exchange. Any packets that do not properly fit the session conversation template are automatically rejected as impostors. Keep state will also allow ICMP packets related to a TCP or UDP session through. So if you get ICMP type 3 code 4 in response to some web surfing allowed out by a keep state rule, it will be automatically allowed in. Any packet that IPF can be certain is part of an active session, even if it is a different protocol, will be let in. What happens is: Packets destined to go out the interface connected to the public Internet are first checked against the dynamic state table. If the packet matches the next expected packet comprising an active session conversation, then it exits the firewall and the state of the session conversation flow is updated in the dynamic state table; the remaining packets get checked against the outbound rule set. Packets coming in to the interface connected to the public Internet are first checked against the dynamic state table. If the packet matches the next expected packet comprising an active session conversation, then it is passed through the firewall and the state of the session conversation flow is updated in the dynamic state table; the remaining packets get checked against the inbound rule set. When the conversation completes, it is removed from the dynamic state table. Stateful filtering allows you to focus on blocking/passing new sessions. If the new session is passed, all its subsequent packets will be allowed through automatically and any impostors automatically rejected. If a new session is blocked, none of its subsequent packets will be allowed through. Stateful filtering has technically advanced interrogation abilities capable of defending against the flood of different attack methods currently employed by attackers. Inclusive Rule Set Example The following rule set is an example of how to code a very secure inclusive type of firewall. An inclusive firewall only allows services matching pass rules through and blocks all others by default. All firewalls have at the minimum two interfaces which have to have rules to allow the firewall to function. All &unix; flavored systems, including &os;, are designed to use interface lo0 and IP address 127.0.0.1 for internal communication within the operating system. The firewall rules must contain rules to allow free unmolested movement of these special internally used packets. The interface which faces the public Internet is the one where you place your rules to authorize and control access out to the public Internet and access requests arriving from the public Internet. This can be your user PPP tun0 interface or your NIC that is connected to your DSL or cable modem. In cases where one or more NICs are cabled to private LANs behind the firewall, those interfaces must have a rule coded to allow free unmolested movement of packets originating from those LAN interfaces. The rules should be first organized into three major sections: all the free unmolested interfaces, the public interface outbound, and the public interface inbound, as sketched below.
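As a sketch of that three-section organization (the interface names and rules are placeholders in the style used throughout this section; the complete ruleset appears below):
# Section 1: free unmolested interfaces
pass in quick on lo0 all
pass out quick on lo0 all
# Section 2: public interface, outbound
pass out quick on dc0 proto tcp from any to any port = 80 flags S keep state
block out log first quick on dc0 all
# Section 3: public interface, inbound
pass in quick on dc0 proto tcp from any to any port = 22 flags S keep state
block in log first quick on dc0 all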
The rules in each of the public interface sections should have the most frequently matched rules placed before less commonly matched rules, with the last rule in the section blocking and logging all packets on that interface and direction. The Outbound section in the following rule set only contains 'pass' rules which contain selection values that uniquely identify the service that is authorized for public Internet access. All the rules have the 'quick', 'on', 'proto', 'port', and 'keep state' options coded. The 'proto tcp' rules have the 'flag' option included to identify the session start request as the triggering packet to activate the stateful facility. The Inbound section has all the blocking of undesirable packets first, for two different reasons. The first is that these things being blocked may be part of an otherwise valid packet which may be allowed in by the later authorized service rules. The second reason is that by having a rule that explicitly blocks selected packets that I receive on an infrequent basis and that I do not want to see in the log, they will not be caught by the last rule in the section, which blocks and logs all packets that have fallen through the rules. The last rule in the section, which blocks and logs all packets, is how you create the legal evidence needed to prosecute the people who are attacking your system. Another thing you should take note of is that there is no response returned for any of the undesirable stuff; their packets just get dropped and vanish. This way the attacker has no knowledge if his packets have reached your system. The less the attackers can learn about your system, the more time they must invest before actually doing something bad. For the inbound 'nmap OS fingerprint' attempts rule, I log the first occurrence because this is something an attacker would do. Any time you see log messages on a rule with 'log first', you should run an ipfstat -hio command to see the number of times the rule has been matched, so you know if you are being flooded, i.e. under attack. When you log packets with port numbers you do not recognize, look it up in /etc/services or go to and do a port number lookup to find what the purpose of that port number is. Check out this link for port numbers used by Trojans . The following rule set is a complete very secure 'inclusive' type of firewall rule set that I have used on my system. You can not go wrong using this rule set for your own. Just comment out any pass rules for services that you do not want to authorize. If you see messages in your log that you want to stop seeing, just add a block rule in the inbound section. You have to change the dc0 interface name in every rule to the interface name of the NIC card that connects your system to the public Internet. For user PPP it would be tun0.
Add the following statements to /etc/ipf.rules: ################################################################# # No restrictions on Inside LAN Interface for private network # Not needed unless you have LAN ################################################################# #pass out quick on xl0 all #pass in quick on xl0 all ################################################################# # No restrictions on Loopback Interface ################################################################# pass in quick on lo0 all pass out quick on lo0 all ################################################################# # Interface facing Public Internet (Outbound Section) # Interrogate session start requests originating from behind the # firewall on the private network # or from this gateway server destined for the public Internet. ################################################################# # Allow out access to my ISP's Domain name server. # xxx must be the IP address of your ISP's DNS. # Dup these lines if your ISP has more than one DNS server # Get the IP addresses from /etc/resolv.conf file pass out quick on dc0 proto tcp from any to xxx port = 53 flags S keep state pass out quick on dc0 proto udp from any to xxx port = 53 keep state # Allow out access to my ISP's DHCP server for cable or DSL networks. # This rule is not needed for 'user ppp' type connection to the # public Internet, so you can delete this whole group. # Use the following rule and check log for IP address. -# Then put IP address in commented out rule & delete first rule +# Then put IP address in commented out rule & delete first rule pass out log quick on dc0 proto udp from any to any port = 67 keep state #pass out quick on dc0 proto udp from any to z.z.z.z port = 67 keep state # Allow out non-secure standard www function pass out quick on dc0 proto tcp from any to any port = 80 flags S keep state # Allow out secure www function https over TLS SSL pass out quick on dc0 proto tcp from any to any port = 443 flags S keep state -# Allow out send & get email function +# Allow out send & get email function pass out quick on dc0 proto tcp from any to any port = 110 flags S keep state pass out quick on dc0 proto tcp from any to any port = 25 flags S keep state # Allow out Time pass out quick on dc0 proto tcp from any to any port = 37 flags S keep state # Allow out nntp news pass out quick on dc0 proto tcp from any to any port = 119 flags S keep state -# Allow out gateway & LAN users non-secure FTP ( both passive & active modes) +# Allow out gateway & LAN users non-secure FTP ( both passive & active modes) # This function uses the IPNAT built in FTP proxy function coded in # the nat rules file to make this single rule function correctly. # If you want to use the pkg_add command to install application packages # on your gateway system you need this rule.
pass out quick on dc0 proto tcp from any to any port = 21 flags S keep state
# Allow out secure FTP, Telnet, and SCP
# This function is using SSH (secure shell)
pass out quick on dc0 proto tcp from any to any port = 22 flags S keep state
# Allow out non-secure Telnet
pass out quick on dc0 proto tcp from any to any port = 23 flags S keep state
# Allow out FBSD CVSUP function
pass out quick on dc0 proto tcp from any to any port = 5999 flags S keep state
# Allow out ping to public Internet
pass out quick on dc0 proto icmp from any to any icmp-type 8 keep state
# Allow out whois for LAN PC to public Internet
pass out quick on dc0 proto tcp from any to any port = 43 flags S keep state
# Block and log only the first occurrence of everything
# else that's trying to get out.
# This rule enforces the block all by default logic.
block out log first quick on dc0 all
#################################################################
# Interface facing Public Internet (Inbound Section)
# Interrogate packets originating from the public Internet
# destined for this gateway server or the private network.
#################################################################
# Block all inbound traffic from non-routable or reserved address spaces
block in quick on dc0 from 192.168.0.0/16 to any    #RFC 1918 private IP
block in quick on dc0 from 172.16.0.0/12 to any     #RFC 1918 private IP
block in quick on dc0 from 10.0.0.0/8 to any        #RFC 1918 private IP
block in quick on dc0 from 127.0.0.0/8 to any       #loopback
block in quick on dc0 from 0.0.0.0/8 to any         #loopback
block in quick on dc0 from 169.254.0.0/16 to any    #DHCP auto-config
block in quick on dc0 from 192.0.2.0/24 to any      #reserved for docs
block in quick on dc0 from 204.152.64.0/23 to any   #Sun cluster interconnect
block in quick on dc0 from 224.0.0.0/3 to any       #Class D & E multicast
##### Block a bunch of different nasty things. ############
# That I do not want to see in the log
# Block frags
block in quick on dc0 all with frags
# Block short tcp packets
block in quick on dc0 proto tcp all with short
# block source routed packets
block in quick on dc0 all with opt lsrr
block in quick on dc0 all with opt ssrr
# Block nmap OS fingerprint attempts
# Log first occurrence of these so I can get their IP address
block in log first quick on dc0 proto tcp from any to any flags FUP
# Block anything with special options
block in quick on dc0 all with ipopts
# Block public pings
block in quick on dc0 proto icmp all icmp-type 8
# Block ident
block in quick on dc0 proto tcp from any to any port = 113
# Block all Netbios service. 137=name, 138=datagram, 139=session
# Netbios is MS/Windows sharing services.
# Block MS/Windows hosts2 name server requests 81
block in log first quick on dc0 proto tcp/udp from any to any port = 137
block in log first quick on dc0 proto tcp/udp from any to any port = 138
block in log first quick on dc0 proto tcp/udp from any to any port = 139
block in log first quick on dc0 proto tcp/udp from any to any port = 81
# Allow traffic in from ISP's DHCP server. This rule must contain
# the IP address of your ISP's DHCP server as it's the only
# authorized source to send this packet type. Only necessary for
# cable or DSL configurations. This rule is not needed for
# 'user ppp' type connection to the public Internet.
# This is the same IP address you captured and
# used in the outbound section.
pass in quick on dc0 proto udp from z.z.z.z to any port = 68 keep state
# Allow in standard www function because I have apache server
pass in quick on dc0 proto tcp from any to any port = 80 flags S keep state
# Allow in non-secure Telnet session from public Internet
# labeled non-secure because ID/PW passed over public Internet as clear text.
# Delete this sample group if you do not have telnet server enabled.
#pass in quick on dc0 proto tcp from any to any port = 23 flags S keep state
# Allow in secure FTP, Telnet, and SCP from public Internet
# This function is using SSH (secure shell)
pass in quick on dc0 proto tcp from any to any port = 22 flags S keep state
# Block and log only first occurrence of all remaining traffic
# coming into the firewall. The logging of only the first
# occurrence stops a 'denial of service' attack targeted
# at filling up your log file space.
# This rule enforces the block all by default logic.
block in log first quick on dc0 all
################### End of rules file #####################################

<acronym>NAT</acronym> NAT IP masquerading NAT network address translation NAT NAT stands for Network Address Translation. To those familiar with &linux;, this concept is called IP Masquerading; NAT and IP Masquerading are the same thing. One of the many things the IPF NAT function enables is the ability to have a private Local Area Network (LAN) behind the firewall sharing a single ISP-assigned IP address on the public Internet. You may ask why someone would want to do this. ISPs normally assign a dynamic IP address to their non-commercial users. Dynamic means that the IP address can be different each time you dial in and log on to your ISP; for cable and DSL modem users, powering the modem off and on can likewise get you assigned a different IP address. This IP address is how you are known to the public Internet. Now let's say you have five PCs at home and each one needs Internet access. You would have to pay your ISP for an individual Internet account for each PC and have five phone lines. With NAT you only need a single account with your ISP: cable your other four PCs to a switch, and the switch to the NIC in your &os; system, which is going to service your LAN as a gateway. NAT will automatically translate the private LAN IP address for each separate PC on the LAN to the single public IP address as it exits the firewall bound for the public Internet. It also does the reverse translation for returning packets. NAT is most often accomplished without the approval, or knowledge, of your ISP and in most cases is grounds for your ISP terminating your account if found out. Commercial users pay a lot more for their Internet connection and usually get assigned a block of static IP addresses which never change. The ISP also expects and consents to its commercial customers using NAT for their internal private LANs. There is a special range of IP addresses reserved for NATed private LAN IP addresses. According to RFC 1918, you can use the following IP ranges for private nets which will never be routed directly to the public Internet:

Start IP 10.0.0.0   - Ending IP 10.255.255.255
Start IP 172.16.0.0 - Ending IP 172.31.255.255
Start IP 192.168.0.0 - Ending IP 192.168.255.255

IP<acronym>NAT</acronym> NAT and IPFILTER ipnat NAT rules are loaded by using the ipnat command. Typically the NAT rules are stored in /etc/ipnat.rules. See &man.ipnat.1; for details.
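For illustration only (a minimal sketch: the dc0 interface and the 192.168.1.0/24 range are assumptions that must be replaced with your public interface name and your private LAN range), the simplest possible /etc/ipnat.rules for the single-public-address setup described above holds just one map rule:

map dc0 192.168.1.0/24 -> 0/32

The map rule syntax itself is described in the section that follows.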
When changing the NAT rules after NAT has been started, make your changes to the file containing the NAT rules, then run the ipnat command with the -CF flags (shown in the command just below) to delete the in-use internal NAT rules and flush the translation table of all active entries. To reload the NAT rules issue a command like this:

&prompt.root; ipnat -CF -f /etc/ipnat.rules

To display some statistics about your NAT, use this command:

&prompt.root; ipnat -s

To list the NAT table's current mappings, use this command:

&prompt.root; ipnat -l

To turn verbose mode on, and display information relating to rule processing and active rules/table entries:

&prompt.root; ipnat -v

IP<acronym>NAT</acronym> Rules NAT rules are very flexible and can accomplish many different things to fit the needs of commercial and home users. The rule syntax presented here has been simplified to what is most commonly used in a non-commercial environment. For a complete rule syntax description see the &man.ipnat.5; manual page. The syntax for a NAT rule looks something like this:

map IF LAN_IP_RANGE -> PUBLIC_ADDRESS

The keyword map starts the rule. Replace IF with the external interface. The LAN_IP_RANGE is what your internal clients use for IP addressing; usually this is something like 192.168.1.0/24. The PUBLIC_ADDRESS can either be the external IP address or the special keyword 0/32, which means to use the IP address assigned to IF. How <acronym>NAT</acronym> works A packet arrives at the firewall from the LAN with a public destination. It passes through the outbound filter rules; then NAT gets its turn at the packet and applies its rules top down, where the first matching rule wins. NAT tests each of its rules against the packet's interface name and source IP address. When a packet's interface name matches a NAT rule, the source IP address (i.e. the private LAN IP address) of the packet is checked to see whether it falls within the IP address range specified to the left of the arrow symbol in the NAT rule. On a match, the packet has its source IP address rewritten with the public IP address obtained by the 0/32 keyword. NAT posts an entry in its internal NAT table so that when the packet returns from the public Internet it can be mapped back to its original private IP address and then passed to the filter rules for processing. Enabling IP<acronym>NAT</acronym> To enable IPNAT add these statements to /etc/rc.conf. To enable your machine to route traffic between interfaces:

gateway_enable="YES"

To start IPNAT automatically each time the system boots:

ipnat_enable="YES"

To specify where to load the IPNAT rules from:

ipnat_rules="/etc/ipnat.rules"

<acronym>NAT</acronym> for a very large LAN For networks that have large numbers of PCs on the LAN, or networks with more than a single LAN, the process of funneling all those private IP addresses into a single public IP address becomes a resource problem: the same port numbers may end up being used many times across many NATed LAN PCs, causing collisions. There are two ways to relieve this resource problem. Assigning Ports to Use A normal NAT rule would look like:

map dc0 192.168.1.0/24 -> 0/32

In the above rule the packet's source port is unchanged as the packet passes through IPNAT. By adding the portmap keyword you can tell IPNAT to only use source ports in a range.
For example, the following rule will tell IPNAT to modify the source port to be within the given range:

map dc0 192.168.1.0/24 -> 0/32 portmap tcp/udp 20000:60000

Additionally, we can make things even easier by using the auto keyword to tell IPNAT to determine by itself which ports are available to use:

map dc0 192.168.1.0/24 -> 0/32 portmap tcp/udp auto

Using a pool of public addresses In very large LANs there comes a point where there are just too many LAN addresses to fit into a single public address. Consider the following rule:

map dc0 192.168.1.0/24 -> 204.134.75.1

Currently this rule maps all connections through 204.134.75.1. This can be changed to specify a range:

map dc0 192.168.1.0/24 -> 204.134.75.1-10

Or a subnet using CIDR notation such as:

map dc0 192.168.1.0/24 -> 204.134.75.0/24

Port Redirection A very common practice is to have a web server, email server, database server and DNS server each segregated to a different PC on the LAN. In this case the traffic from these servers still has to be NATed, but there has to be some way to direct the inbound traffic to the correct LAN PCs. IPNAT has redirection facilities to solve this problem. Let's say you have your web server on LAN address 10.0.10.25 and your single public IP address is 20.20.20.5; you would code the rule like this:

rdr dc0 20.20.20.5/32 port 80 -> 10.0.10.25 port 80

or:

rdr dc0 0/32 port 80 -> 10.0.10.25 port 80

or for a LAN DNS server on LAN address 10.0.10.33 that needs to receive public DNS requests:

rdr dc0 20.20.20.5/32 port 53 -> 10.0.10.33 port 53 udp

FTP and <acronym>NAT</acronym> FTP is a dinosaur left over from the time before the Internet as it is known today, when research universities were connected by leased lines and FTP was used to share files among research scientists. This was a time when data security was not a consideration. Over the years the FTP protocol became buried in the backbone of the emerging Internet, and its practice of sending the username and password in clear text was never changed to address new security concerns. FTP has two flavors: it can run in active mode or passive mode. The difference is in how the data channel is acquired. Passive mode is more secure, as the data channel is acquired by the original FTP session requester. For a really good explanation of FTP and the different modes, see . IP<acronym>NAT</acronym> Rules IPNAT has a special built-in FTP proxy option which can be specified on the NAT map rule. It can monitor all outbound packet traffic for FTP active or passive session start requests and dynamically create temporary filter rules containing only the port number really in use for the data channel. This eliminates the security risk FTP normally exposes the firewall to, from having large ranges of high-order port numbers open.
This rule will handle all the traffic for the internal LAN:

map dc0 10.0.10.0/29 -> 0/32 proxy port 21 ftp/tcp

This rule handles the FTP traffic from the gateway:

map dc0 0.0.0.0/0 -> 0/32 proxy port 21 ftp/tcp

This rule handles all non-FTP traffic from the internal LAN:

map dc0 10.0.10.0/29 -> 0/32

The FTP map rule goes before our regular map rule. All packets are tested against the first rule from the top: the packet is matched on interface name, then on private LAN source IP address, and then on whether it is an FTP packet. If all that matches, the special FTP proxy creates temporary filter rules to let the FTP session packets pass in and out, in addition to NATing the FTP packets. All LAN packets that are not FTP do not match the first rule; they fall through to the third rule and are tested, matching on interface and source IP, and are then NATed. IP<acronym>NAT</acronym> FTP Filter Rules Only one filter rule is needed for FTP if the NAT FTP proxy is used. Without the FTP proxy you will need the following three rules:

# Allow out LAN PC client FTP to public Internet
# Active and passive modes
pass out quick on rl0 proto tcp from any to any port = 21 flags S keep state
# Allow out passive mode data channel high order port numbers
pass out quick on rl0 proto tcp from any to any port > 1024 flags S keep state
# Active mode let data channel in from FTP server
pass in quick on rl0 proto tcp from any to any port = 20 flags S keep state

FTP <acronym>NAT</acronym> Proxy Bug As of &os; 4.9, which includes IPFILTER version 3.4.31, the FTP proxy works as documented during the FTP session until the session is told to close. When the close happens, packets returning from the remote FTP server are blocked and logged coming in on port 21. The NAT FTP proxy appears to remove its temporary rules prematurely, before receiving the response from the remote FTP server acknowledging the close. A problem report was posted to the IPF mailing list. The solution is to add a filter rule to get rid of these unwanted log messages, or to do nothing and ignore FTP inbound error messages in your log. Most people do not use outbound FTP very often.

block in quick on rl0 proto tcp from any to any port = 21

IPFW firewall IPFW This section is work in progress. The contents might not be accurate at all times. The IPFIREWALL (IPFW) is a &os;-sponsored firewall software application authored and maintained by &os; volunteer staff members. It uses the legacy stateless rules and a legacy rule coding technique to achieve what is referred to as Simple Stateful logic. The IPFW sample rule set (found in /etc/rc.firewall) in the standard &os; install is rather simple, and it is not expected that it be used directly without modifications. The example does not use stateful filtering, which is beneficial in most setups, so it will not be used as a base for this section. The IPFW stateless rule syntax is empowered with technically sophisticated selection capabilities which far surpass the knowledge level of the customary firewall installer. IPFW is targeted at the professional user or the advanced technical computer hobbyist who has advanced packet selection requirements. A high degree of detailed knowledge of how different protocols use and create their unique packet header information is necessary before the power of the IPFW rules can be unleashed.
Providing that level of explanation is out of the scope of this section of the handbook. IPFW is composed of seven components: the primary component is the kernel firewall filter rule processor with its integrated packet accounting facility; then there are the logging facility, the 'divert' rule which triggers the NAT facility, and the advanced special-purpose facilities: the dummynet traffic shaper, the 'fwd rule' forward facility, the bridge facility, and the ipstealth facility. Enabling IPFW IPFW enabling IPFW is included in the basic &os; install as a separate run-time loadable module. The system will dynamically load the kernel module when the rc.conf statement firewall_enable="YES" is used. You do not need to compile IPFW into the &os; kernel unless you want the NAT function enabled. After rebooting your system with firewall_enable="YES" in rc.conf, the following white highlighted message is displayed on the screen as part of the boot process:

ipfw2 initialized, divert disabled, rule-based forwarding disabled, default to deny, logging disabled

The loadable module does have logging ability compiled in. To enable logging and set the verbose logging limit, there is a knob you can set in /etc/sysctl.conf by adding these statements; logging will be enabled on future reboots:

net.inet.ip.fw.verbose=1
net.inet.ip.fw.verbose_limit=5

Kernel Options kernel options IPFIREWALL kernel options IPFIREWALL_VERBOSE kernel options IPFIREWALL_VERBOSE_LIMIT IPFW kernel options It is not mandatory that you enable IPFW by compiling the following options into the &os; kernel, unless you need the NAT function. It is presented here as background information.

options IPFIREWALL

This option enables IPFW as part of the kernel.

options IPFIREWALL_VERBOSE

Enables logging of packets that pass through IPFW and have the 'log' keyword specified in the rule set.

options IPFIREWALL_VERBOSE_LIMIT=5

Limits the number of packets logged through &man.syslogd.8; on a per-entry basis. You may wish to use this option in hostile environments in which you want to log firewall activity. This guards against a possible denial of service attack via syslog flooding. kernel options IPFIREWALL_DEFAULT_TO_ACCEPT

options IPFIREWALL_DEFAULT_TO_ACCEPT

This option will allow everything to pass through the firewall by default, which is a good idea when you are first setting up your firewall.

options IPV6FIREWALL
options IPV6FIREWALL_VERBOSE
options IPV6FIREWALL_VERBOSE_LIMIT
options IPV6FIREWALL_DEFAULT_TO_ACCEPT

These options are exactly the same as the IPv4 options, but they are for IPv6. If you do not use IPv6 you might want to use IPV6FIREWALL without any rules to block all IPv6. kernel options IPDIVERT

options IPDIVERT

This enables the use of NAT functionality. If you do not include IPFIREWALL_DEFAULT_TO_ACCEPT or set your rules to allow incoming packets, you will block all packets going to and from this machine. <filename>/etc/rc.conf</filename> Options If you do not have IPFW compiled into your kernel you will need to load it with the following statement in your /etc/rc.conf:

firewall_enable="YES"

Set the script to run to activate your rules:

firewall_script="/etc/ipfw.rules"

Enable logging:

firewall_logging="YES"

The only thing that the firewall_logging variable will do is set the net.inet.ip.fw.verbose sysctl variable to the value of 1 (see ).
There is no rc.conf variable to set log limitations, but it can be set via a sysctl variable, manually or from the /etc/sysctl.conf file:

net.inet.ip.fw.verbose_limit=5

If your machine is acting as a gateway, i.e. providing Network Address Translation (NAT) via &man.natd.8;, please refer to for information regarding the required /etc/rc.conf options. The IPFW Command ipfw The ipfw command is the normal vehicle for making manual single rule additions or deletions to the firewall's active internal rules while it is running. The problem with this method is that once your system is shut down or halted, all the rules you added, changed, or deleted are lost. Writing all your rules in a file and using that file to load the rules at boot time, or to replace the currently running firewall rules en masse with changes you made to the file's content, is the recommended method used here. The ipfw command is still very useful for displaying the running firewall rules on the console screen. The IPFW accounting facility dynamically creates a counter for each rule that counts each packet that matches the rule. During the process of testing a rule, listing the rule with its counter is one of the ways of determining whether the rule is functioning. To list all the rules in sequence:

&prompt.root; ipfw list

To list all the rules with a time stamp of the last time each rule was matched:

&prompt.root; ipfw -t list

To list the accounting information and packet counts for matched rules, along with the rules themselves (the first column is the rule number, followed by the number of outgoing matched packets, the number of incoming matched packets, and then the rule itself):

&prompt.root; ipfw -a list

List the dynamic rules in addition to the static rules:

&prompt.root; ipfw -d list

Also show the expired dynamic rules:

&prompt.root; ipfw -d -e list

Zero the counters:

&prompt.root; ipfw zero

Zero the counters for just rule NUM:

&prompt.root; ipfw zero NUM

IPFW Rule Sets A rule set is a group of ipfw rules coded to allow or deny packets based on the values contained in the packet. The bi-directional exchange of packets between hosts comprises a session conversation. The firewall rule set processes the packet twice: once on its arrival from the public Internet host and again as it leaves for its return trip back to the public Internet host. Each tcp/ip service (i.e. telnet, www, mail, etc.) is predefined by its protocol and port number. These are the basic selection criteria used to create rules which will allow or deny services. IPFW rule processing order When a packet enters the firewall it is compared against the first rule in the rule set and progresses one rule at a time, moving from top to bottom of the set in ascending rule number sequence. When the packet matches a rule's selection parameters, the rule's action field value is executed and the search of the rule set terminates for that packet. This is referred to as the first match wins search method. If the packet does not match any of the rules, it gets caught by the mandatory ipfw default rule, number 65535, which denies all packets and discards them without any reply back to the originator. The search continues after count, skipto and tee rules. The instructions contained here are based on using rules that contain the stateful 'keep state', 'limit', 'in'/'out', and via options. This is the basic framework for coding an inclusive type firewall rule set. An inclusive firewall only allows services matching the rules through.
This way you can control what services can originate behind the firewall destined for the public Internet, and also control the services which can originate from the public Internet accessing your private network. Everything else is denied by default design. Inclusive firewalls are much, much more secure than exclusive firewall rule sets and are the only rule set type covered herein. When working with the firewall rules, be careful: you can end up locking yourself out. Rule Syntax IPFW rule syntax The rule syntax presented here has been simplified to what is necessary to create a standard inclusive type firewall rule set. For a complete rule syntax description see the &man.ipfw.8; manual page. Rules contain keywords: these keywords have to be coded in a specific order from left to right on the line. Keywords are identified in bold type. Some keywords have sub-options which may be keywords themselves and may also include more sub-options. # is used to mark the start of a comment and may appear at the end of a rule line or on its own line. Blank lines are ignored.

CMD RULE_NUMBER ACTION LOGGING SELECTION STATEFUL

CMD Each new rule has to be prefixed with add to add the rule to the internal table. RULE_NUMBER Each rule has to have a rule number to go with it. ACTION A rule can be associated with one of the following actions, which will be executed when the packet matches the selection criteria of the rule. allow | accept | pass | permit These all mean the same thing, which is to allow packets that match the rule to exit the firewall rule processing. The search terminates at this rule. check-state Checks the packet against the dynamic rules table. If a match is found, execute the action associated with the rule which generated this dynamic rule; otherwise move to the next rule. The check-state rule does not have selection criteria. If no check-state rule is present in the rule set, the dynamic rules table is checked at the first keep-state or limit rule. deny | drop Both words mean the same thing, which is to discard packets that match this rule. The search terminates. Logging log or logamount When a packet matches a rule with the log keyword, a message will be logged to syslogd with a facility name of SECURITY. The logging only occurs if the number of packets logged so far for that particular rule does not exceed the logamount parameter. If no logamount is specified, the limit is taken from the sysctl variable net.inet.ip.fw.verbose_limit. In both cases, a value of zero removes the logging limit. Once the limit is reached, logging can be re-enabled by clearing the logging counter or the packet counter for that rule; see the ipfw resetlog command. Logging is done after all other packet matching conditions have been successfully verified, and before performing the final action (accept, deny) on the packet. It is up to you to decide which rules you want to enable logging on. Selection The keywords described in this section are used to describe attributes of the packet to be interrogated when determining whether rules match the packet or not. The following general-purpose attributes are provided for matching, and must be used in this order: udp | tcp | icmp or any protocol name found in /etc/protocols is recognized and may be used. The value specified is the protocol to be matched against. This is a mandatory requirement. from src to dst The from and to keywords are used to match against IP addresses. Rules must specify BOTH source and destination parameters.
any is a special keyword that matches any IP address. me is a special keyword that matches any IP address configured on an interface in your &os; system, representing the PC the firewall is running on (i.e. this box), as in 'from me to any' or 'from any to me' or 'from 0.0.0.0/0 to any' or 'from any to 0.0.0.0/0' or 'from 0.0.0.0 to any' or 'from any to 0.0.0.0' or 'from me to 0.0.0.0'. IP addresses are specified as a dotted IP address numeric form/mask-length, or as a single dotted IP address numeric form. This is a mandatory requirement. See this link for help on writing mask-lengths. port number For protocols which support port numbers (such as TCP and UDP), it is mandatory that you code the port number of the service you want to match on. Service names (from /etc/services) may be used instead of numeric port values. in | out Matches incoming or outgoing packets, respectively. in and out are keywords, and it is mandatory that you code one or the other as part of your rule matching criteria. via IF Matches packets going through the interface specified by exact name. The via keyword causes the interface to always be checked as part of the match process. setup This is a mandatory keyword that identifies the session start request for TCP packets. keep-state This is a mandatory keyword. Upon a match, the firewall will create a dynamic rule whose default behavior is to match bidirectional traffic between source and destination IP/port using the same protocol. limit {src-addr | src-port | dst-addr | dst-port} The firewall will only allow N connections with the same set of parameters as specified in the rule. One or more of the source and destination addresses and ports can be specified. 'limit' and 'keep-state' can not be used on the same rule; limit provides the same stateful function as 'keep-state' plus its own functions. Stateful Rule Option IPFW stateful filtering Stateful filtering treats traffic as a bi-directional exchange of packets comprising a session conversation. It has the interrogation abilities to determine whether the session conversation between the originating sender and the destination is following the valid procedure of bi-directional packet exchange. Any packets that do not properly fit the session conversation template are automatically rejected as impostors. 'check-state' is used to identify where in the IPFW rule set the packet is to be tested against the dynamic rules facility. On a match the packet exits the firewall to continue on its way, and a new rule is dynamically created for the next anticipated packet being exchanged during this bi-directional session conversation. On no match the packet advances to the next rule in the rule set for testing. The dynamic rules facility is vulnerable to resource depletion from a SYN-flood attack which would open a huge number of dynamic rules. To counter this attack, &os; version 4.5 added another new option named limit. This option limits the number of simultaneous session conversations: it interrogates the rule's source or destination field as directed by the limit option, uses the packet's IP address found there in a search of the open dynamic rules, and counts the number of times this rule and IP address combination has occurred; if the count is greater than the value specified on the limit option, the packet is discarded.
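Putting the keyword order together (a sketch only; the dc0 interface name and the rule numbers are placeholders, and the rules mirror ones used in the full example later in this section), a stateful outbound rule and an inbound rule using limit look like this:

# outbound www: the setup TCP packet creates the dynamic rule
ipfw add 00600 allow tcp from any to any 80 out via dc0 setup keep-state
# inbound www: at most 2 simultaneous sessions per source address
ipfw add 00400 allow tcp from any to me 80 in via dc0 setup limit src-addr 2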
Logging Firewall Messages IPFW logging The benefits of logging are obvious: it provides the ability to review, after the fact, the rules you activated logging on, telling you what packets had been dropped, what addresses they came from, and where they were going, giving you a significant edge in tracking down attackers. Even with the logging facility enabled, IPFW will not generate any rule logging on its own. The firewall administrator decides which rules in the rule set he wants to log, and adds the log verb to those rules. Normally only deny rules are logged, like the deny rule for incoming ICMP pings. It is very customary to duplicate the ipfw default deny everything rule with the log verb included as your last rule in the rule set. This way you get to see all the packets that did not match any of the rules in the rule set. Logging is a two-edged sword: if you are not careful, you can lose yourself in the overabundance of log data and fill your disk up with growing log files. DoS attacks that fill up disk drives are among the oldest attacks around. These log messages are not only written to syslogd, but also are displayed on the root console screen, and soon become very annoying. The IPFIREWALL_VERBOSE_LIMIT=5 kernel option limits the number of consecutive messages sent to the system logger syslogd concerning the packet matching of a given rule. When this option is enabled in the kernel, the number of consecutive messages concerning a particular rule is capped at the number specified. There is nothing to be gained from 200 log messages saying the same thing. For instance, five consecutive messages concerning a particular rule would be logged to syslogd; the remaining identical consecutive messages would be counted and posted to syslogd with a phrase like this:

last message repeated 45 times

All logged packet messages are written by default to the /var/log/security file, which is defined in the /etc/syslog.conf file. Building a Rule Script Most experienced IPFW users create a file containing the rules and code them in a manner compatible with running them as a script. The major benefit of doing this is that the firewall rules can be refreshed en masse without the need of rebooting the system to activate the new rules. This method is very convenient in testing new rules, as the procedure can be executed as many times as needed. Being a script, you can use symbolic substitution to code frequently used values and substitute them in multiple rules. You will see this in the following example. The script syntax used here is compatible with the 'sh', 'csh', and 'tcsh' shells. Symbolic substitution fields are prefixed with a dollar sign ($); the field definitions themselves do not carry the $ prefix. The value to populate a symbolic field must be enclosed in double quotes. Start your rules file like this:

############### start of example ipfw rules script #############
#
ipfw -q -f flush       # Delete all rules
# Set defaults
oif="tun0"             # out interface
odns="192.0.2.11"      # ISP's DNS server IP address
cmd="ipfw -q add "     # build rule prefix
ks="keep-state"        # just too lazy to key this each time
$cmd 00500 check-state
$cmd 00502 deny all from any to any frag
$cmd 00501 deny tcp from any to any established
$cmd 00600 allow tcp from any to any 80 out via $oif setup $ks
$cmd 00610 allow tcp from any to $odns 53 out via $oif setup $ks
$cmd 00611 allow udp from any to $odns 53 out via $oif $ks
################### End of example ipfw rules script ############

That is all there is to it.
The rules themselves are not important in this example; what matters is how the symbolic substitution fields are populated and used. If the above example was in the /etc/ipfw.rules file, you could reload the rules by entering the following on the command line:

&prompt.root; sh /etc/ipfw.rules

The /etc/ipfw.rules file could be located anywhere you want, and the file could be named anything you would like. The same thing could also be accomplished by running these commands by hand:

&prompt.root; ipfw -q -f flush
&prompt.root; ipfw -q add check-state
&prompt.root; ipfw -q add deny all from any to any frag
&prompt.root; ipfw -q add deny tcp from any to any established
&prompt.root; ipfw -q add allow tcp from any to any 80 out via tun0 setup keep-state
&prompt.root; ipfw -q add allow tcp from any to 192.0.2.11 53 out via tun0 setup keep-state
&prompt.root; ipfw -q add 00611 allow udp from any to 192.0.2.11 53 out via tun0 keep-state

Stateful Ruleset The following non-NATed rule set is an example of how to code a very secure 'inclusive' type of firewall. An inclusive firewall only allows services matching its pass rules through and blocks everything else by default. All firewalls have at a minimum two interfaces, which have to have rules to allow the firewall to function. All &unix;-flavored operating systems, &os; included, are designed to use interface lo0 and IP address 127.0.0.1 for internal communication within the operating system. The firewall rules must contain rules to allow free unmolested movement of these special internally used packets. The interface which faces the public Internet is the one on which you code your rules to authorize and control access out to the public Internet and access requests arriving from the public Internet. This can be your ppp tun0 interface or your NIC that is connected to your DSL or cable modem. In cases where one or more NICs are connected to private LANs behind the firewall, those interfaces must have rules coded to allow free unmolested movement of packets originating from those LAN interfaces. The rules should first be organized into three major sections: all the free unmolested interfaces, the public interface outbound, and the public interface inbound. The rules in each of the public interface sections should place the most used rules before less often used rules, with the last rule in the section blocking and logging all packets on that interface and direction. The Outbound section in the following rule set only contains 'allow' rules, whose selection values uniquely identify the service that is authorized for public Internet access. All the rules have the proto, port, in/out, via and keep-state options coded. The 'proto tcp' rules have the 'setup' option included to identify the start session request as the trigger packet to be posted to the keep-state stateful table. The Inbound section has all the blocking of undesirable packets first, for two different reasons. The first is that these things being blocked may be part of an otherwise valid packet which may be allowed in by the later authorized service rules. The second is that by having a rule that explicitly blocks selected packets that I receive on an infrequent basis and do not want to see in the log, they are kept from being caught by the last rule in the section, which blocks and logs all packets which have fallen through the rules. The last rule in the section, which blocks and logs all packets, is how you create the legal evidence needed to prosecute the people who are attacking your system.
Another thing you should take note of is that there is no response returned for any of the undesirable stuff; those packets just get dropped and vanish. This way the attacker has no knowledge of whether his packets have reached your system. The less the attackers can learn about your system, the more secure it is. When you log packets with port numbers you do not recognize, look the numbers up in /etc/services or go to and do a port number lookup to find what the purpose of that port number is. Check out this link for port numbers used by Trojans: . An Example Inclusive Ruleset The following non-NATed rule set is a complete inclusive type ruleset. You can not go wrong using this rule set for your own. Just comment out any pass rules for services you do not want. If you see messages in your log that you want to stop seeing, just add a deny rule in the inbound section. You have to change the 'dc0' interface name in every rule to the interface name of the NIC that connects your system to the public Internet. For user ppp it would be 'tun0'. You will see a pattern in the usage of these rules. All statements that are a request to start a session to the public Internet use keep-state. All the authorized services that originate from the public Internet have the limit option to stop flooding. All rules use in or out to clarify direction. All rules use via interface name to specify the interface the packet is traveling over. The following rules go into /etc/ipfw.rules.

################ Start of IPFW rules file ###############################
# Flush out the list before we begin.
ipfw -q -f flush
# Set rules command prefix
cmd="ipfw -q add"
pif="dc0"     # public interface name of NIC
              # facing the public Internet
#################################################################
# No restrictions on Inside LAN Interface for private network
# Not needed unless you have LAN.
# Change xl0 to your LAN NIC interface name
#################################################################
#$cmd 00005 allow all from any to any via xl0
#################################################################
# No restrictions on Loopback Interface
#################################################################
$cmd 00010 allow all from any to any via lo0
#################################################################
# Allow the packet through if it has previously been added to the
# "dynamic" rules table by an allow keep-state statement.
#################################################################
$cmd 00015 check-state
#################################################################
# Interface facing Public Internet (Outbound Section)
# Interrogate session start requests originating from behind the
# firewall on the private network or from this gateway server
# destined for the public Internet.
#################################################################
# Allow out access to my ISP's Domain name server.
# x.x.x.x must be the IP address of your ISP's DNS
# Dup these lines if your ISP has more than one DNS server
# Get the IP addresses from /etc/resolv.conf file
$cmd 00110 allow tcp from any to x.x.x.x 53 out via $pif setup keep-state
$cmd 00111 allow udp from any to x.x.x.x 53 out via $pif keep-state
# Allow out access to my ISP's DHCP server for cable/DSL configurations.
# This rule is not needed for 'user ppp' connection to the public Internet,
# so you can delete this whole group.
# Use the following rule and check log for IP address.
# Then put IP address in commented out rule & delete first rule
$cmd 00120 allow log udp from any to any 67 out via $pif keep-state
#$cmd 00120 allow udp from any to x.x.x.x 67 out via $pif keep-state
# Allow out non-secure standard www function
$cmd 00200 allow tcp from any to any 80 out via $pif setup keep-state
# Allow out secure www function https over TLS SSL
$cmd 00220 allow tcp from any to any 443 out via $pif setup keep-state
# Allow out send & get email function
$cmd 00230 allow tcp from any to any 25 out via $pif setup keep-state
$cmd 00231 allow tcp from any to any 110 out via $pif setup keep-state
# Allow out FBSD (make install & CVSUP) functions
# Basically give user root "GOD" privileges.
$cmd 00240 allow tcp from me to any out via $pif setup keep-state uid root
# Allow out ping
$cmd 00250 allow icmp from any to any out via $pif keep-state
# Allow out Time
$cmd 00260 allow tcp from any to any 37 out via $pif setup keep-state
# Allow out nntp news (i.e. news groups)
$cmd 00270 allow tcp from any to any 119 out via $pif setup keep-state
# Allow out secure FTP, Telnet, and SCP
# This function is using SSH (secure shell)
$cmd 00280 allow tcp from any to any 22 out via $pif setup keep-state
# Allow out whois
$cmd 00290 allow tcp from any to any 43 out via $pif setup keep-state
# deny and log everything else that's trying to get out.
# This rule enforces the block all by default logic.
$cmd 00299 deny log all from any to any out via $pif
#################################################################
# Interface facing Public Internet (Inbound Section)
# Interrogate packets originating from the public Internet
# destined for this gateway server or the private network.
#################################################################
# Deny all inbound traffic from non-routable reserved address spaces
$cmd 00300 deny all from 192.168.0.0/16 to any in via $pif  #RFC 1918 private IP
$cmd 00301 deny all from 172.16.0.0/12 to any in via $pif   #RFC 1918 private IP
$cmd 00302 deny all from 10.0.0.0/8 to any in via $pif      #RFC 1918 private IP
$cmd 00303 deny all from 127.0.0.0/8 to any in via $pif     #loopback
$cmd 00304 deny all from 0.0.0.0/8 to any in via $pif       #loopback
$cmd 00305 deny all from 169.254.0.0/16 to any in via $pif  #DHCP auto-config
$cmd 00306 deny all from 192.0.2.0/24 to any in via $pif    #reserved for docs
$cmd 00307 deny all from 204.152.64.0/23 to any in via $pif #Sun cluster interconnect
$cmd 00308 deny all from 224.0.0.0/3 to any in via $pif     #Class D & E multicast
# Deny public pings
$cmd 00310 deny icmp from any to any in via $pif
# Deny ident
$cmd 00315 deny tcp from any to any 113 in via $pif
# Deny all Netbios service. 137=name, 138=datagram, 139=session
# Netbios is MS/Windows sharing services.
# Block MS/Windows hosts2 name server requests 81
$cmd 00320 deny tcp from any to any 137 in via $pif
$cmd 00321 deny tcp from any to any 138 in via $pif
$cmd 00322 deny tcp from any to any 139 in via $pif
$cmd 00323 deny tcp from any to any 81 in via $pif
# Deny any late arriving packets
$cmd 00330 deny all from any to any frag in via $pif
# Deny ACK packets that did not match the dynamic rule table
$cmd 00332 deny tcp from any to any established in via $pif
# Allow traffic in from ISP's DHCP server.
# This rule must contain the IP address of your ISP's DHCP server,
# as it's the only authorized source to send this packet type.
# Only necessary for cable or DSL configurations.
# This rule is not needed for 'user ppp' type connection to
# the public Internet. This is the same IP address you captured
# and used in the outbound section.
#$cmd 00360 allow udp from any to x.x.x.x 67 in via $pif keep-state
# Allow in standard www function because I have apache server
$cmd 00400 allow tcp from any to me 80 in via $pif setup limit src-addr 2
# Allow in secure FTP, Telnet, and SCP from public Internet
$cmd 00410 allow tcp from any to me 22 in via $pif setup limit src-addr 2
# Allow in non-secure Telnet session from public Internet
# labeled non-secure because ID & PW are passed over public
# Internet as clear text.
# Delete this sample group if you do not have telnet server enabled.
$cmd 00420 allow tcp from any to me 23 in via $pif setup limit src-addr 2
# Reject & Log all incoming connections from the outside
$cmd 00499 deny log all from any to any in via $pif
# Everything else is denied by default
# deny and log all packets that fell through to see what they are
$cmd 00999 deny log all from any to any
################ End of IPFW rules file ###############################

An Example <acronym>NAT</acronym> and Stateful Ruleset NAT and IPFW There are some additional configuration statements that need to be enabled to activate the NAT function of IPFW. The kernel source needs the 'options IPDIVERT' statement added to the other IPFIREWALL statements compiled into a custom kernel. In addition to the normal IPFW options in /etc/rc.conf, the following are needed:

natd_enable="YES"                   # Enable NATD function
natd_interface="rl0"                # interface name of public Internet NIC
natd_flags="-dynamic -m"            # -m = preserve port numbers if possible

Utilizing stateful rules with a 'divert natd' rule (network address translation) greatly complicates the rule set coding logic. The positioning of the check-state and 'divert natd' rules in the rule set becomes very critical. This is no longer a simple fall-through logic flow. A new action type is used, called 'skipto'. To use the skipto command it is mandatory that you number each rule, so you know exactly where the skipto target rule you are jumping to really is. The following is an uncommented example of one coding method, selected here to explain the sequence of the packet flow through the rule sets. The processing flow starts with the first rule at the top of the rule file and progresses one rule at a time, deeper into the file, until the end is reached or the packet being tested matches a rule's selection criteria and is released out of the firewall. It is important to take notice of the location of rule numbers 100, 101, 450, 500, and 510. These rules control the translation of the outbound and inbound packets so their entries in the keep-state dynamic table always register the private LAN IP address. Next notice that all the allow and deny rules specify the direction the packet is going (i.e. outbound or inbound) and the interface. Also notice that all the outbound session start requests skipto rule 500 for the network address translation. Let's say a LAN user uses their web browser to get a web page. Web pages use port 80 to communicate over. So the packet enters the firewall. It does not match rule 100 because it is headed out, not in.
It passes rule 101 because this is the first packet, so it has not been posted to the keep-state dynamic table yet. The packet finally comes to rule 125 and matches. It is outbound through the NIC facing the public Internet. The packet still has its source IP address as a private LAN IP address. On the match to this rule, two actions take place. The keep-state option will post this rule into the keep-state dynamic rules table, and the specified action is executed. The action is part of the info posted to the dynamic table. In this case it is "skipto rule 500". Rule 500 NATs the packet IP address and out it goes. Remember this; it is very important. This packet makes its way to the destination, returns, and enters the top of the rule set. This time it does match rule 100 and has its destination IP address mapped back to its corresponding LAN IP address. It is then processed by the check-state rule, found in the table as an existing session conversation, and released to the LAN. It goes to the LAN PC that sent it, and a new packet is sent requesting another segment of the data from the remote server. This time it gets checked by the check-state rule and its outbound entry is found; the associated action, 'skipto 500', is executed. The packet jumps to rule 500, gets NATed, and is released on its way out. On the inbound side, everything coming in that is part of an existing session conversation is automatically handled by the check-state rule and the properly placed divert natd rules. All we have to address is denying all the bad packets and only allowing in the authorized services. Let's say there is an Apache server running on the firewall box and we want people on the public Internet to be able to access the local web site. The new inbound start request packet matches rule 100 and its IP address is mapped to the LAN IP of the firewall box. The packet is then matched against all the nasty things we want to check for, and finally matches against rule 420. On a match two things occur. The rule is posted to the keep-state dynamic table, but this time any new session requests originating from that source IP address are limited to the number given on the limit option (1 in the example). This defends against DoS attacks on the service running on the specified port number. The action is allow, so the packet is released to the LAN. On return, the check-state rule recognizes the packet as belonging to an existing session conversation, sends it to rule 500 for NATing, and releases it to the outbound interface.
Example Ruleset #1:

#!/bin/sh
cmd="ipfw -q add"
skip="skipto 500"
pif=rl0
ks="keep-state"
good_tcpo="22,25,37,43,53,80,443,110,119"
ipfw -q -f flush
$cmd 002 allow all from any to any via xl0  # exclude LAN traffic
$cmd 003 allow all from any to any via lo0  # exclude loopback traffic
$cmd 100 divert natd ip from any to any in via $pif
$cmd 101 check-state
# Authorized outbound packets
$cmd 120 $skip udp from any to xx.168.240.2 53 out via $pif $ks
$cmd 121 $skip udp from any to xx.168.240.5 53 out via $pif $ks
$cmd 125 $skip tcp from any to any $good_tcpo out via $pif setup $ks
$cmd 130 $skip icmp from any to any out via $pif $ks
$cmd 135 $skip udp from any to any 123 out via $pif $ks
# Deny all inbound traffic from non-routable reserved address spaces
$cmd 300 deny all from 192.168.0.0/16 to any in via $pif  #RFC 1918 private IP
$cmd 301 deny all from 172.16.0.0/12 to any in via $pif   #RFC 1918 private IP
$cmd 302 deny all from 10.0.0.0/8 to any in via $pif      #RFC 1918 private IP
$cmd 303 deny all from 127.0.0.0/8 to any in via $pif     #loopback
$cmd 304 deny all from 0.0.0.0/8 to any in via $pif       #loopback
$cmd 305 deny all from 169.254.0.0/16 to any in via $pif  #DHCP auto-config
$cmd 306 deny all from 192.0.2.0/24 to any in via $pif    #reserved for docs
$cmd 307 deny all from 204.152.64.0/23 to any in via $pif #Sun cluster
$cmd 308 deny all from 224.0.0.0/3 to any in via $pif     #Class D & E multicast
# Authorized inbound packets
$cmd 400 allow udp from xx.70.207.54 to any 68 in $ks
$cmd 420 allow tcp from any to me 80 in via $pif setup limit src-addr 1
$cmd 450 deny log ip from any to any
# This is skipto location for outbound stateful rules
$cmd 500 divert natd ip from any to any out via $pif
$cmd 510 allow ip from any to any
######################## end of rules ##################

The following is pretty much the same as above, but uses a self-documenting coding style full of descriptive comments to help the inexperienced IPFW rule writer better understand what the rules are doing. Example Ruleset #2:

#!/bin/sh
################ Start of IPFW rules file ###############################
# Flush out the list before we begin.
ipfw -q -f flush
# Set rules command prefix
cmd="ipfw -q add"
skip="skipto 800"
pif="rl0"     # public interface name of NIC
              # facing the public Internet
#################################################################
# No restrictions on Inside LAN Interface for private network
# Change xl0 to your LAN NIC interface name
#################################################################
$cmd 005 allow all from any to any via xl0
#################################################################
# No restrictions on Loopback Interface
#################################################################
$cmd 010 allow all from any to any via lo0
#################################################################
# check if packet is inbound and nat address if it is
#################################################################
$cmd 014 divert natd ip from any to any in via $pif
#################################################################
# Allow the packet through if it has previously been added to the
# "dynamic" rules table by an allow keep-state statement.
#################################################################
$cmd 015 check-state
#################################################################
# Interface facing Public Internet (Outbound Section)
# Interrogate session start requests originating from behind the
# firewall on the private network or from this gateway server
# destined for the public Internet.
#################################################################
# Allow out access to my ISP's Domain name server.
# x.x.x.x must be the IP address of your ISP's DNS
# Dup these lines if your ISP has more than one DNS server
# Get the IP addresses from /etc/resolv.conf file
$cmd 020 $skip tcp from any to x.x.x.x 53 out via $pif setup keep-state
# Allow out access to my ISP's DHCP server for cable/DSL configurations.
$cmd 030 $skip udp from any to x.x.x.x 67 out via $pif keep-state
# Allow out non-secure standard www function
$cmd 040 $skip tcp from any to any 80 out via $pif setup keep-state
# Allow out secure www function https over TLS SSL
$cmd 050 $skip tcp from any to any 443 out via $pif setup keep-state
# Allow out send & get email function
$cmd 060 $skip tcp from any to any 25 out via $pif setup keep-state
$cmd 061 $skip tcp from any to any 110 out via $pif setup keep-state
# Allow out FreeBSD (make install & CVSUP) functions
# Basically give user root "GOD" privileges.
$cmd 070 $skip tcp from me to any out via $pif setup keep-state uid root
# Allow out ping
$cmd 080 $skip icmp from any to any out via $pif keep-state
# Allow out Time
$cmd 090 $skip tcp from any to any 37 out via $pif setup keep-state
# Allow out nntp news (i.e. news groups)
$cmd 100 $skip tcp from any to any 119 out via $pif setup keep-state
# Allow out secure FTP, Telnet, and SCP
# This function is using SSH (secure shell)
$cmd 110 $skip tcp from any to any 22 out via $pif setup keep-state
# Allow out whois
$cmd 120 $skip tcp from any to any 43 out via $pif setup keep-state
# Allow ntp time server
$cmd 130 $skip udp from any to any 123 out via $pif keep-state
#################################################################
# Interface facing Public Internet (Inbound Section)
# Interrogate packets originating from the public Internet
# destined for this gateway server or the private network.
#################################################################
# Deny all inbound traffic from non-routable reserved address spaces
$cmd 300 deny all from 192.168.0.0/16 to any in via $pif  #RFC 1918 private IP
$cmd 301 deny all from 172.16.0.0/12 to any in via $pif   #RFC 1918 private IP
$cmd 302 deny all from 10.0.0.0/8 to any in via $pif      #RFC 1918 private IP
$cmd 303 deny all from 127.0.0.0/8 to any in via $pif     #loopback
$cmd 304 deny all from 0.0.0.0/8 to any in via $pif       #loopback
$cmd 305 deny all from 169.254.0.0/16 to any in via $pif  #DHCP auto-config
$cmd 306 deny all from 192.0.2.0/24 to any in via $pif    #reserved for docs
$cmd 307 deny all from 204.152.64.0/23 to any in via $pif #Sun cluster
$cmd 308 deny all from 224.0.0.0/3 to any in via $pif     #Class D & E multicast
# Deny ident
$cmd 315 deny tcp from any to any 113 in via $pif
# Deny all Netbios service. 137=name, 138=datagram, 139=session
# Netbios is MS/Windows sharing services.
# Block MS/Windows hosts2 name server requests 81 $cmd 320 deny tcp from any to any 137 in via $pif $cmd 321 deny tcp from any to any 138 in via $pif $cmd 322 deny tcp from any to any 139 in via $pif $cmd 323 deny tcp from any to any 81 in via $pif # Deny any late arriving packets $cmd 330 deny all from any to any frag in via $pif # Deny ACK packets that did not match the dynamic rule table $cmd 332 deny tcp from any to any established in via $pif # Allow traffic in from ISP's DHCP server. This rule must contain # the IP address of your ISP's DHCP server as it's the only # authorized source to send this packet type. # Only necessary for cable or DSL configurations. # This rule is not needed for 'user ppp' type connection to # the public Internet. This is the same IP address you captured # and used in the outbound section. $cmd 360 allow udp from x.x.x.x to any 68 in via $pif keep-state # Allow in standard www function because I have Apache server $cmd 370 allow tcp from any to me 80 in via $pif setup limit src-addr 2 # Allow in secure FTP, Telnet, and SCP from public Internet $cmd 380 allow tcp from any to me 22 in via $pif setup limit src-addr 2 # Allow in non-secure Telnet session from public Internet -# labeled non-secure because ID & PW are passed over public +# labeled non-secure because ID & PW are passed over public # Internet as clear text. # Delete this sample group if you do not have telnet server enabled. $cmd 390 allow tcp from any to me 23 in via $pif setup limit src-addr 2 -# Reject & Log all unauthorized incoming connections from the public Internet +# Reject & Log all unauthorized incoming connections from the public Internet $cmd 400 deny log all from any to any in via $pif -# Reject & Log all unauthorized outgoing connections to the public Internet +# Reject & Log all unauthorized outgoing connections to the public Internet $cmd 450 deny log all from any to any out via $pif # This is the skipto location for outbound stateful rules $cmd 800 divert natd ip from any to any out via $pif $cmd 801 allow ip from any to any # Everything else is denied by default # Deny and log all packets that fell through to see what they are $cmd 999 deny log all from any to any ################ End of IPFW rules file ############################### diff --git a/en_US.ISO8859-1/books/handbook/geom/chapter.sgml b/en_US.ISO8859-1/books/handbook/geom/chapter.sgml index 71b55b8061..8037a83a3c 100644 --- a/en_US.ISO8859-1/books/handbook/geom/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/geom/chapter.sgml @@ -1,420 +1,420 @@ Tom Rhodes Written by GEOM: Modular Disk Transformation Framework Synopsis GEOM GEOM Disk Framework GEOM This chapter covers the use of disks under the GEOM framework in &os;. This includes the major RAID control utilities which use the framework for configuration. This chapter will not go into an in-depth discussion of how GEOM handles or controls I/O, the underlying subsystem, or code. This information is provided through the &man.geom.4; manual page and its various SEE ALSO references. This chapter is also not a definitive guide to RAID configurations. Only GEOM-supported RAID classifications will be discussed. After reading this chapter, you will know: What type of RAID support is available through GEOM. How to use the base utilities to configure, maintain, and manipulate the various RAID levels. How to mirror, stripe, encrypt, and remotely connect disk devices through GEOM. How to troubleshoot disks attached to the GEOM framework.
Before reading this chapter, you should: Understand how &os; treats disk devices (). Know how to configure and install a new &os; kernel (). GEOM Introduction GEOM permits access and control to classes — Master Boot Records, BSD labels, etc. — through the use of providers, or the special files in /dev. Supporting various software RAID configurations, GEOM will transparently provide access to the operating system and operating system utilities. Tom Rhodes Written by Murray Stokely RAID0 - Striping GEOM Striping Striping is a method used to combine several disk drives into a single volume. In many cases, this is done through the use of hardware controllers. The GEOM disk subsystem provides software support for RAID0, also known as disk striping. In a RAID0 system, data is split into blocks that are written across all the drives in the array. Instead of having to wait on the system to write 256k to one disk, a RAID0 system can simultaneously write 64k to each of four different disks, offering superior I/O performance. This performance can be enhanced further by using multiple disk controllers. Each disk in a RAID0 stripe must be of the same size, since I/O requests are interleaved to read or write to multiple disks in parallel. Disk Striping Illustration Creating a stripe of unformatted ATA disks Load the geom_stripe module: &prompt.root; kldload geom_stripe.ko Ensure that a suitable mount point exists. If this volume will become a root partition, then temporarily use another mount point such as /mnt: &prompt.root; mkdir /mnt Determine the device names for the disks which will be striped, and create the new stripe device. For example, the following command could be used to stripe two unused, unpartitioned ATA disks: /dev/ad2 and /dev/ad3. &prompt.root; gstripe label -v st0 /dev/ad2 /dev/ad3 A partition table must be created on the new volume with the following command: &prompt.root; bsdlabel -wB /dev/stripe/st0 This process should have created two other devices in the /dev/stripe directory in addition to the st0 device. Those include st0a and st0c. A file system must now be created on the st0a device using the following newfs command: &prompt.root; newfs -U /dev/stripe/st0a Many numbers will glide across the screen, and after a few seconds, the process will be complete. The volume has been created and is ready to be mounted. The following command can be used to manually mount a newly created disk stripe: &prompt.root; mount /dev/stripe/st0a /mnt To mount this striped file system automatically during the boot process, place the volume information in the /etc/fstab file: &prompt.root; echo "/dev/stripe/st0a /mnt ufs rw 2 2" \ >> /etc/fstab The geom_stripe module must also be loaded automatically during system initialization, by adding a line to /boot/loader.conf: &prompt.root; echo 'geom_stripe_load="YES"' >> /boot/loader.conf RAID1 - Mirroring GEOM Disk Mirroring Mirroring is a technology used by many corporations and home users to back up data without interruption. When a mirror exists, it simply means that diskB replicates diskA. Or, perhaps diskC+D replicates diskA+B. Regardless of the disk configuration, the important aspect is that information on one disk or partition is being replicated. Later, that information could be more easily restored, backed up without causing service or access interruption, and even be physically stored in a data safe. To begin, ensure the system has two disk drives of equal size; this exercise assumes they are direct access (&man.da.4;) SCSI disks.
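It is worth confirming up front that the two drives really are the same size. On &os; 5.x and later this can be checked with &man.diskinfo.8;; a minimal sketch, assuming the da0 and da1 disks used in this example: &prompt.root; diskinfo -v da0 da1 The mediasize figures reported for the two drives should match.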
Begin by installing &os; on the first disk with only two partitions. One should be a swap partition of double the RAM size, with all remaining space devoted to the root (/) file system. It is possible to have separate partitions for other mount points; however, this will increase the difficulty level tenfold due to manual alteration of the &man.bsdlabel.8; and &man.fdisk.8; settings. Reboot and wait for the system to fully initialize. Once this process has completed, log in as the root user. Create the /dev/mirror/gm device and link it with /dev/da1: &prompt.root; gmirror label -vnb round-robin gm0 /dev/da1 The system should respond with: Metadata value stored on /dev/da1. Done. Initialize GEOM; this will load the /boot/kernel/geom_mirror.ko kernel module: &prompt.root; gmirror load This command should have created the gm0 device node under the /dev/mirror directory. Install a generic fdisk label and boot code to the newly created gm0 device: &prompt.root; fdisk -vBI /dev/mirror/gm0 Now install generic bsdlabel information: &prompt.root; bsdlabel -wB /dev/mirror/gm0s1 If multiple slices and partitions exist, the flags for the previous two commands will require alteration. They must match the slice and partition size of the other disk. Use the &man.newfs.8; utility to create a default file system on the gm0s1a device node: &prompt.root; newfs -U /dev/mirror/gm0s1a This should have caused the system to spit out some information and a bunch of numbers. This is good. Examine the screen for any error messages and mount the device to the /mnt mount point: &prompt.root; mount /dev/mirror/gm0s1a /mnt Now move all data from the boot disk over to this new file system. This example uses the &man.dump.8; and &man.restore.8; commands; however, &man.dd.1; would also work with this scenario. - &prompt.root; dump -L -0 -f- / |(cd /mnt && restore -r -v -f-) + &prompt.root; dump -L -0 -f- / |(cd /mnt && restore -r -v -f-) This must be done for each file system. Simply place the appropriate file system in the correct location when running the aforementioned command. Now edit the replicated /mnt/etc/fstab file and remove or comment out the swap file It should be noted that commenting out the swap file entry in fstab will most likely require you to re-establish a different way of enabling swap space. Please refer to for more information. . Change the other file system information to use the new disk. See the following example: # Device Mountpoint FStype Options Dump Pass# #/dev/da0s2b none swap sw 0 0 /dev/mirror/gm0s1a / ufs rw 1 1 Now create a boot.config file on both the current and new root partitions. This file will help the system BIOS boot the correct drive: &prompt.root; echo "1:da(1,a)/boot/loader" > /boot.config &prompt.root; echo "1:da(1,a)/boot/loader" > /mnt/boot.config We have placed it on both root partitions to ensure proper boot up. If for some reason the system cannot read from the new root partition, a failsafe is available. Now add the following line to the new /boot/loader.conf: &prompt.root; echo 'geom_mirror_load="YES"' >> /mnt/boot/loader.conf This will instruct the &man.loader.8; utility to load the geom_mirror.ko module during system initialization. Reboot the system: &prompt.root; shutdown -r now If all has gone well, the system should have booted from the gm0s1a device and a login prompt should be waiting. If something went wrong, review the forthcoming troubleshooting section.
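Before the original disk is touched, it can be reassuring to confirm that the mirror is active. &man.gmirror.8; can report the state of its components; a minimal sketch, with illustrative output: &prompt.root; gmirror status Name Status Components mirror/gm0 COMPLETE da1 Once the second disk is inserted in the next step, it will appear here as an additional component while it synchronizes.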
Now add the da0 disk to the gm0 device: &prompt.root; gmirror configure -a gm0 &prompt.root; gmirror insert gm0 /dev/da0 The -a flag tells &man.gmirror.8; to use automatic synchronization; i.e., mirror the disk writes automatically. The manual page explains how to rebuild and replace disks, although it uses data in place of gm0. Troubleshooting System refuses to boot If the system boots up to a prompt similar to: ffs_mountroot: can't find rootvp Root mount failed: 6 mountroot> Reboot the machine using the power or reset button. At the boot menu, select option six (6). This will drop the system to a &man.loader.8; prompt. Load the kernel module manually: OK? load geom_mirror.ko OK? boot If this works, then for whatever reason the module was not being loaded properly. Place: options GEOM_MIRROR in the kernel configuration file, rebuild and reinstall. That should remedy this issue. diff --git a/en_US.ISO8859-1/books/handbook/install/chapter.sgml b/en_US.ISO8859-1/books/handbook/install/chapter.sgml index ad1a3d82dd..70d77a1c3a 100644 --- a/en_US.ISO8859-1/books/handbook/install/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/install/chapter.sgml @@ -1,5686 +1,5686 @@ Jim Mock Restructured, reorganized, and parts rewritten by Randy Pratt The sysinstall walkthrough, screenshots, and general copy by Installing FreeBSD Synopsis installation FreeBSD is provided with a text-based, easy-to-use installation program called sysinstall. This is the default installation program for FreeBSD, although vendors are free to provide their own installation suite if they wish. This chapter describes how to use sysinstall to install FreeBSD. After reading this chapter, you will know: How to create the FreeBSD installation disks. How FreeBSD refers to, and subdivides, your hard disks. How to start sysinstall. The questions sysinstall will ask you, what they mean, and how to answer them. Before reading this chapter, you should: Read the supported hardware list that shipped with the version of FreeBSD you are installing, and verify that your hardware is supported. In general, these installation instructions are written for &i386; (PC compatible) architecture computers. Where applicable, instructions specific to other platforms (for example, Alpha) will be listed. Although this guide is kept as up to date as possible, you may find minor differences between the installer and what is shown here. It is suggested that you use this chapter as a general guide rather than a literal installation manual. Pre-installation Tasks Inventory Your Computer Before installing FreeBSD you should attempt to inventory the components in your computer. The FreeBSD installation routines will show you the components (hard disks, network cards, CDROM drives, and so forth) with their model number and manufacturer. FreeBSD will also attempt to determine the correct configuration for these devices, which includes information about IRQ and IO port usage. Due to the vagaries of PC hardware this process is not always completely successful, and you may need to correct FreeBSD's determination of your configuration. If you already have another operating system installed, such as &windows; or Linux, it is a good idea to use the facilities provided by those operating systems to see how your hardware is already configured. If you are not sure what settings an expansion card is using, you may find it printed on the card itself. Popular IRQ numbers are 3, 5, and 7, and IO port addresses are normally written as hexadecimal numbers, such as 0x330.
We recommend you print or write down this information before installing FreeBSD. It may help to use a table, like this: Sample Device Inventory

Device Name           IRQ   IO port(s)   Notes
First hard disk       N/A   N/A          40 GB, made by Seagate, first IDE master
CDROM                 N/A   N/A          First IDE slave
Second hard disk      N/A   N/A          20 GB, made by IBM, second IDE master
First IDE controller  14    0x1f0
Network card          N/A   N/A          &intel; 10/100
Modem                 N/A   N/A          &tm.3com; 56K faxmodem, on COM1
Backup Your Data If the computer you will be installing FreeBSD on contains valuable data, then ensure you have it backed up, and that you have tested the backups before installing FreeBSD. The FreeBSD installation routine will prompt you before writing any data to your disk, but once that process has started it cannot be undone. Decide Where to Install FreeBSD If you want FreeBSD to use your entire hard disk, then there is nothing more to concern yourself with at this point — you can skip this section. However, if you need FreeBSD to co-exist with other operating systems then you need to have a rough understanding of how data is laid out on the disk, and how this affects you. Disk Layouts for the &i386; A PC disk can be divided into discrete chunks. These chunks are called partitions. By design, the PC only supports four partitions per disk. These partitions are called primary partitions. To work around this limitation and allow more than four partitions, a new partition type was created, the extended partition. A disk may contain only one extended partition. Special partitions, called logical partitions, can be created inside this extended partition. Each partition has a partition ID, which is a number used to identify the type of data on the partition. FreeBSD partitions have the partition ID of 165. In general, each operating system that you use will identify partitions in a particular way. For example, DOS, and its descendants, like &windows;, assign each primary and logical partition a drive letter, starting with C:. FreeBSD must be installed into a primary partition. FreeBSD can keep all its data, including any files that you create, on this one partition. However, if you have multiple disks, then you can create a FreeBSD partition on all, or some, of them. When you install FreeBSD, you must have one partition available. This might be a blank partition that you have prepared, or it might be an existing partition that contains data that you no longer care about. If you are already using all the partitions on all your disks, then you will have to free one of them for FreeBSD using the tools provided by the other operating systems you use (e.g., fdisk on DOS or &windows;). If you have a spare partition then you can use that. However, you may need to shrink one or more of your existing partitions first. A minimal installation of FreeBSD takes as little as 100 MB of disk space. However, that is a very minimal install, leaving almost no space for your own files. A more realistic minimum is 250 MB without a graphical environment, and 350 MB or more if you want a graphical user interface. If you intend to install a lot of third party software as well, then you will need more space. You can use a commercial tool such as &partitionmagic; to resize your partitions to make space for FreeBSD. The tools directory on the CDROM contains two free software tools which can carry out this task, namely FIPS and PResizer. Documentation for both of these is available in the same directory. FIPS, PResizer, and &partitionmagic; can resize FAT16 and FAT32 partitions — used in &ms-dos; through &windows; ME. &partitionmagic; is the only one of the above applications that can resize NTFS partitions. Incorrect use of these tools can delete the data on your disk. Be sure that you have recent, working backups before using them. 
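Partition IDs can be inspected from a running &os; system with &man.fdisk.8;; a FreeBSD slice reports a sysid of 165. A trimmed, illustrative sketch (the disk name and sizes are placeholders): &prompt.root; fdisk ad0 ... The data for partition 1 is: sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD) start 63, size 4194288 (2047 Meg), flag 80 (active)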
Using an Existing Partition Unchanged Suppose that you have a computer with a single 4 GB disk that already has a version of &windows; installed, and you have split the disk into two drive letters, C: and D:, each of which is 2 GB in size. You have 1 GB of data on C:, and 0.5 GB of data on D:. This means that your disk has two partitions on it, one per drive letter. You can copy all your existing data from D: to C:, which will free up the second partition, ready for FreeBSD. Shrinking an Existing Partition Suppose that you have a computer with a single 4 GB disk that already has a version of &windows; installed. When you installed &windows; you created one large partition, giving you a C: drive that is 4 GB in size. You are currently using 1.5 GB of space, and want FreeBSD to have 2 GB of space. In order to install FreeBSD you will need to either: Backup your &windows; data, and then reinstall &windows;, asking for a 2 GB partition at install time. Use one of the tools such as &partitionmagic;, described above, to shrink your &windows; partition. Disk Layouts for the Alpha Alpha You will need a dedicated disk for FreeBSD on the Alpha. It is not possible to share a disk with another operating system at this time. Depending on the specific Alpha machine you have, this disk can either be a SCSI disk or an IDE disk, as long as your machine is capable of booting from it. Following the conventions of the Digital / Compaq manuals all SRM input is shown in uppercase. SRM is case insensitive. To find the names and types of disks in your machine, use the SHOW DEVICE command from the SRM console prompt: >>>SHOW DEVICE dka0.0.0.4.0 DKA0 TOSHIBA CD-ROM XM-57 3476 dkc0.0.0.1009.0 DKC0 RZ1BB-BS 0658 dkc100.1.0.1009.0 DKC100 SEAGATE ST34501W 0015 dva0.0.0.0.1 DVA0 ewa0.0.0.3.0 EWA0 00-00-F8-75-6D-01 pkc0.7.0.1009.0 PKC0 SCSI Bus ID 7 5.27 pqa0.0.0.4.0 PQA0 PCI EIDE pqb0.0.1.4.0 PQB0 PCI EIDE This example is from a Digital Personal Workstation 433au and shows three disks attached to the machine. The first is a CDROM drive called DKA0 and the other two are disks and are called DKC0 and DKC100 respectively. Disks with names of the form DKx are SCSI disks. For example DKA100 refers to a SCSI disk with SCSI target ID 1 on the first SCSI bus (A), whereas DKC300 refers to a SCSI disk with SCSI ID 3 on the third SCSI bus (C). Devicename PKx refers to the SCSI host bus adapter. As seen in the SHOW DEVICE output SCSI CDROM drives are treated as any other SCSI hard disk drive. IDE disks have names similar to DQx, while PQx is the associated IDE controller. Collect Your Network Configuration Details If you intend to connect to a network as part of your FreeBSD installation (for example, if you will be installing from an FTP site or an NFS server), then you need to know your network configuration. You will be prompted for this information during the installation so that FreeBSD can connect to the network to complete the install. Connecting to an Ethernet Network or Cable/DSL Modem If you connect to an Ethernet network, or you have an Internet connection using an Ethernet adapter via cable or DSL, then you will need the following information: IP address IP address of the default gateway Hostname DNS server IP addresses Subnet Mask If you do not know this information, then ask your system administrator or service provider. They may say that this information is assigned automatically, using DHCP. If so, make a note of this. 
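If the machine currently runs &windows;, one convenient way to collect these values is the ipconfig utility; a trimmed, illustrative sketch (all addresses shown are placeholders): C:\> ipconfig /all ... IP Address. . . . . . . . . . . . : 192.168.0.10 Subnet Mask . . . . . . . . . . . : 255.255.255.0 Default Gateway . . . . . . . . . : 192.168.0.1 DNS Servers . . . . . . . . . . . : 192.168.0.1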
Connecting Using a Modem If you dial up to an ISP using a regular modem then you can still install FreeBSD over the Internet; it will just take a very long time. You will need to know: The phone number to dial for your ISP The COM: port your modem is connected to The username and password for your ISP account Check for FreeBSD Errata Although the FreeBSD project strives to ensure that each release of FreeBSD is as stable as possible, bugs do occasionally creep into the process. On very rare occasions those bugs affect the installation process. As these problems are discovered and fixed, they are noted in the FreeBSD Errata, which is found on the FreeBSD web site. You should check the errata before installing to make sure that there are no late-breaking problems which you should be aware of. Information about all the releases, including the errata for each release, can be found on the release information section of the FreeBSD web site. Obtain the FreeBSD Installation Files The FreeBSD installation process can install FreeBSD from files located in any of the following places: Local Media A CDROM or DVD A DOS partition on the same computer A SCSI or QIC tape Floppy disks Network An FTP site, going through a firewall, or using an HTTP proxy, as necessary An NFS server A dedicated parallel or serial connection If you have purchased FreeBSD on CD or DVD then you already have everything you need, and should proceed to the next section (). If you have not obtained the FreeBSD installation files you should skip ahead to which explains how to prepare to install FreeBSD from any of the above. After reading that section, you should come back here, and read on to . Prepare the Boot Media The FreeBSD installation process is started by booting your computer into the FreeBSD installer—it is not a program you run within another operating system. Your computer normally boots using the operating system installed on your hard disk, but it can also be configured to use a bootable floppy disk. Most modern computers can also boot from a CDROM in the CDROM drive. If you have FreeBSD on CDROM or DVD (either one you purchased or you prepared yourself), and your computer allows you to boot from the CDROM or DVD (typically a BIOS option called Boot Order or similar), then you can skip this section. The FreeBSD CDROM and DVD images are bootable and can be used to install FreeBSD without any other special preparation. To create boot floppy images, follow these steps: Acquire the Boot Floppy Images The boot disks are available on your installation media in the floppies/ directory, and can also be downloaded from the floppies directory, ftp://ftp.FreeBSD.org/pub/FreeBSD/releases/<arch>/<version>-RELEASE/floppies/. Replace <arch> and <version> with the architecture and the version number which you want to install, respectively. For example, the boot floppy images for &os; &rel.current;-RELEASE for &i386; are available from . The floppy images have a .flp extension. The floppies/ directory contains a number of different images, and the ones you will need to use depend on the version of FreeBSD you are installing, and in some cases, the hardware you are installing to. If you are installing FreeBSD 4.X in most cases you will just need two files, kern.flp and mfsroot.flp. If you are installing FreeBSD 5.X in most cases you will need three floppies, boot.flp, kern1.flp, and kern2.flp. Check README.TXT in the same directory for the most up-to-date information about these floppy images.
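On an existing &os; system, the images can be downloaded from the directory described above with &man.fetch.1;; a minimal sketch, keeping the architecture and version placeholders (substitute the release you are installing): &prompt.root; fetch ftp://ftp.FreeBSD.org/pub/FreeBSD/releases/<arch>/<version>-RELEASE/floppies/kern.flp &prompt.root; fetch ftp://ftp.FreeBSD.org/pub/FreeBSD/releases/<arch>/<version>-RELEASE/floppies/mfsroot.flp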
Additional device drivers may be necessary for 5.X systems older than &os; 5.3. These drivers are provided on the drivers.flp image. Your FTP program must use binary mode to download these disk images. Some web browsers have been known to use text (or ASCII) mode, which will be apparent if you cannot boot from the disks. Prepare the Floppy Disks You must prepare one floppy disk per image file you had to download. It is imperative that these disks are free from defects. The easiest way to test this is to format the disks for yourself. Do not trust pre-formatted floppies. The format utility in &windows; will not tell you about the presence of bad blocks; it simply marks them as bad and ignores them. It is advised that you use brand-new floppies if choosing this installation route. If you try to install FreeBSD and the installation program crashes, freezes, or otherwise misbehaves, one of the first things to suspect is the floppies. Try writing the floppy image files to new disks and try again. Write the Image Files to the Floppy Disks The .flp files are not regular files you copy to the disk. They are images of the complete contents of the disk. This means that you cannot simply copy files from one disk to another. Instead, you must use specific tools to write the images directly to the disk. DOS If you are creating the floppies on a computer running &ms-dos;/&windows;, then we provide a tool to do this called fdimage. If you are using the floppies from the CDROM, and your CDROM is the E: drive, then you would run this: E:\> tools\fdimage floppies\kern.flp A: Repeat this command for each .flp file, replacing the floppy disk each time, being sure to label the disks with the name of the file that you copied to them. Adjust the command line as necessary, depending on where you have placed the .flp files. If you do not have the CDROM, then fdimage can be downloaded from the tools directory on the FreeBSD FTP site. If you are writing the floppies on a &unix; system (such as another FreeBSD system) you can use the &man.dd.1; command to write the image files directly to disk. On FreeBSD, you would run: &prompt.root; dd if=kern.flp of=/dev/fd0 On FreeBSD, /dev/fd0 refers to the first floppy disk (the A: drive). /dev/fd1 would be the B: drive, and so on. Other &unix; variants might have different names for the floppy disk devices, and you will need to check the documentation for the system as necessary. You are now ready to start installing FreeBSD.
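As a final check before booting from the floppies, each written disk can be verified by reading it back and comparing checksums with &man.md5.1;; a minimal sketch, assuming a 1.44 MB image (2880 blocks of 512 bytes): &prompt.root; md5 kern.flp &prompt.root; dd if=/dev/fd0 bs=512 count=2880 | md5 The two digests should be identical; if they differ, the floppy is suspect and should be replaced.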
Starting the Installation By default, the installation will not make any changes to your disk(s) until you see the following message: Last Chance: Are you SURE you want to continue the installation? If you're running this on a disk with data you wish to save then WE STRONGLY ENCOURAGE YOU TO MAKE PROPER BACKUPS before proceeding! We can take no responsibility for lost disk contents! The install can be exited at any time prior to the final warning without changing the contents of the hard drive. If you are concerned that you have configured something incorrectly you can just turn the computer off before this point, and no damage will be done. Booting Booting for the &i386; Start with your computer turned off. Turn on the computer. As it starts it should display an option to enter the system setup menu, or BIOS, commonly reached by keys like F2, F10, Del, or Alt S. Use whichever keystroke is indicated on screen. In some cases your computer may display a graphic while it starts. Typically, pressing Esc will dismiss the graphic and allow you to see the necessary messages. Find the setting that controls which devices the system boots from. This is usually labeled as the Boot Order and commonly shown as a list of devices, such as Floppy, CDROM, First Hard Disk, and so on. If you needed to prepare boot floppies, then make sure that the floppy disk is selected. If you are booting from the CDROM then make sure that that is selected instead. In case of doubt, you should consult the manual that came with your computer, and/or its motherboard. Make the change, then save and exit. The computer should now restart. If you needed to prepare boot floppies, as described in , then one of them will be the first boot disc, probably the one containing kern.flp. Put this disc in your floppy drive. If you are booting from CDROM, then you will need to turn on the computer, and insert the CDROM at the first opportunity. If your computer starts up as normal and loads your existing operating system, then either: The disks were not inserted early enough in the boot process. Leave them in, and try restarting your computer. The BIOS changes earlier did not work correctly. You should redo that step until you get the right option. Your particular BIOS does not support booting from the desired media. FreeBSD will start to boot. If you are booting from CDROM you will see a display similar to this (version information omitted): Verifying DMI Pool Data ........ Boot from ATAPI CD-ROM : 1. FD 2.88MB System Type-(00) Uncompressing ... done BTX loader 1.00 BTX version is 1.01 Console: internal video/keyboard BIOS drive A: is disk0 BIOS drive B: is disk1 BIOS drive C: is disk2 BIOS drive D: is disk3 BIOS 639kB/261120kB available memory FreeBSD/i386 bootstrap loader, Revision 0.8 /kernel text=0x277391 data=0x3268c+0x332a8 | | Hit [Enter] to boot immediately, or any other key for command prompt. Booting [kernel] in 9 seconds... _ If you are booting from floppy disc, you will see a display similar to this (version information omitted): Verifying DMI Pool Data ........ BTX loader 1.00 BTX version is 1.01 Console: internal video/keyboard BIOS drive A: is disk0 BIOS drive C: is disk1 BIOS 639kB/261120kB available memory FreeBSD/i386 bootstrap loader, Revision 0.8 /kernel text=0x277391 data=0x3268c+0x332a8 | Please insert MFS root floppy and press enter: Follow these instructions by removing the kern.flp disc, inserting the mfsroot.flp disc, and pressing Enter.
&os; 5.3 and above provide a different set of floppy disks, as described in the previous section. Boot from the first floppy; when prompted, insert the other disks as required. Whether you booted from floppy or CDROM, the boot process will then get to this point: Hit [Enter] to boot immediately, or any other key for command prompt. Booting [kernel] in 9 seconds... _ Either wait ten seconds, or press Enter (for &os; 4.X this will then launch the kernel configuration menu). Booting for the Alpha Alpha Start with your computer turned off. Turn on the computer and wait for a boot monitor prompt. If you needed to prepare boot floppies, as described in then one of them will be the first boot disc, probably the one containing kern.flp. Put this disc in your floppy drive and type the following command to boot the disk (substituting the name of your floppy drive if necessary): >>>BOOT DVA0 -FLAGS '' -FILE '' If you are booting from CDROM, insert the CDROM into the drive and type the following command to start the installation (substituting the name of the appropriate CDROM drive if necessary): >>>BOOT DKA0 -FLAGS '' -FILE '' FreeBSD will start to boot. If you are booting from a floppy disc, at some point you will see the message: Please insert MFS root floppy and press enter: Follow these instructions by removing the kern.flp disc, inserting the mfsroot.flp disc, and pressing Enter. Whether you booted from floppy or CDROM, the boot process will then get to this point: Hit [Enter] to boot immediately, or any other key for command prompt. Booting [kernel] in 9 seconds... _ Either wait ten seconds, or press Enter. This will then launch the kernel configuration menu. Kernel Configuration In FreeBSD 5.0 and later, userconfig has been deprecated in favor of the new &man.device.hints.5; method. For more information on &man.device.hints.5; please visit The kernel is the core of the operating system. It is responsible for many things, including access to all the devices you may have on your system, such as hard disks, network cards, sound cards, and so on. Each piece of hardware supported by the FreeBSD kernel has a driver associated with it. Each driver has a two- or three-letter name, such as sa for the SCSI sequential access driver, or sio for the Serial I/O driver (which manages COM ports). When the kernel starts, each driver checks the system to see whether or not the hardware it supports exists on your system. If it does, then the driver configures the hardware and makes it available to the rest of the kernel. This checking is commonly referred to as device probing. Unfortunately, it is not always possible to do this in a safe way. Some hardware drivers do not co-exist well, and probing for one piece of hardware can sometimes leave another in an inconsistent state. This is a basic limitation of the PC design. Many older devices are called ISA devices—as opposed to PCI devices. The ISA specification requires each device to have some information hard-coded into it, typically the Interrupt Request Line number (IRQ) and IO port address that the driver uses. This information is commonly set by using physical jumpers on the card, or by using a DOS-based utility. This was often a source of problems, because it was not possible to have two devices that shared the same IRQ or port address. Newer devices follow the PCI specification, which does not require this, as the devices are supposed to cooperate with the BIOS, and are told which IRQ and IO port addresses to use.
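Under the &man.device.hints.5; scheme mentioned above, these ISA resource settings are supplied as plain text in /boot/device.hints instead of through userconfig. As an illustrative sketch, hints for an ISA ed network card might look like the following (the port and IRQ values are examples only and must match the card's actual jumper settings): hint.ed.0.at="isa" hint.ed.0.port="0x280" hint.ed.0.irq="10"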
If you have any ISA devices in your computer then FreeBSD's driver for that device will need to be configured with the IRQ and port address that you have set the card to. This is why carrying out an inventory of your hardware (see ) can be useful. Unfortunately, the default IRQs and memory ports used by some drivers clash. This is because some ISA devices are shipped with IRQs or memory ports that clash. The defaults in FreeBSD's drivers are deliberately set to mirror the manufacturer's defaults, so that, out of the box, as many devices as possible will work. This is almost never an issue when running FreeBSD day-to-day. Your computer will not normally contain two pieces of hardware that clash, because one of them would not work (irrespective of the operating system you are using). It becomes an issue when you are installing FreeBSD for the first time because the kernel used to carry out the install has to contain as many drivers as possible, so that many different hardware configurations can be supported. This means that some of those drivers will have conflicting configurations. The devices are probed in a strict order, and if you own a device that is probed late in the process, but conflicted with an earlier probe, then your hardware might not function or be probed correctly when you install FreeBSD. Because of this, the first thing you have the opportunity to do when installing FreeBSD is look at the list of drivers that are configured into the kernel, and either disable some of them, if you do not own that device, or confirm (and alter) the driver's configuration if you do own the device but the defaults are wrong. This probably sounds much more complicated than it actually is. shows the first kernel configuration menu. We recommend that you choose the Start kernel configuration in full-screen visual mode option, as it presents the easiest interface for the new user.
Kernel Configuration Menu &txt.install.userconfig;
The kernel configuration screen () is then divided into four sections: A collapsible list of all the drivers that are currently marked as active, subdivided into groups such as Storage, and Network. Each driver is shown as a description, its two or three letter driver name, and the IRQ and memory port used by that driver. In addition, if an active driver conflicts with another active driver then CONF is shown next to the driver name. This section also shows the total number of conflicting drivers that are currently active. Drivers that have been marked inactive. They remain in the kernel, but they will not probe for their device when the kernel starts. These are subdivided into groups in the same way as the active driver list. More detail about the currently selected driver, including its IRQ and memory port address. Information about the keystrokes that are valid at this point in time.
The Kernel Device Configuration Visual Interface &txt.install.userconfig2;
Do not worry if any conflicts are listed; it is to be expected: all the drivers are enabled, and as has already been explained, some of them will conflict with one another. You now have to work through the list of drivers, resolving the conflicts. Resolving Driver Conflicts Press X. This will completely expand the list of drivers, so you can see all of them. You will need to use the arrow keys to scroll back and forth through the active driver list. shows the result of pressing X.
Expanded Driver List
Disable all the drivers for devices that you do not have. To disable a driver, highlight it with the arrow keys and press Del. The driver will be moved to the Inactive Drivers list. If you inadvertently disable a device that you need then press Tab to switch to the Inactive Drivers list, select the driver that you disabled, and press Enter to move it back to the active list. Do not disable sc0. This controls the screen, and you will need this unless you are installing over a serial cable. Only disable atkbd0 if you are using a USB keyboard. If you have a normal keyboard then you must keep atkbd0. If there are no conflicts listed then you can skip this step. Otherwise, the remaining conflicts need to be examined. If they do not have the indication of an allowed conflict in the message area, then either the IRQ/address for device probe will need to be changed, or the IRQ/address on the hardware will need to be changed. To change the driver's configuration for IRQ and IO port address, select the device and press Enter. The cursor will move to the third section of the screen, and you can change the values. You should enter the values for IRQ and port address that you discovered when you made your hardware inventory. Press Q to finish editing the device's configuration and return to the active driver list. If you are not sure what these figures should be then you can try using -1. Some FreeBSD drivers can safely probe the hardware to discover what the correct value should be, and a value of -1 configures them to do this. The procedure for changing the address on the hardware varies from device to device. For some devices you may need to physically remove the card from your computer and adjust jumper settings or DIP switches. Other cards may have come with a DOS floppy that contains the programs used to reconfigure the card. In any case, you should refer to the documentation that came with the device. This will obviously entail restarting your computer, so you will need to boot back into the FreeBSD installation routine when you have reconfigured the card. When all the conflicts have been resolved the screen will look similar to .
Driver Configuration With No Conflicts
As you can see, the active driver list is now much smaller, with only drivers for the hardware that actually exists being listed. You can now save these changes, and move on to the next step of the install. Press Q to quit the device configuration interface. This message will appear: Save these parameters before exiting? ([Y]es/[N]o/[C]ancel) Answer Y to save the parameters to memory (it will be saved to disk if you finish the install) and the probing will start. After displaying the probe results in white on black text sysinstall will start and display its main menu ().
Sysinstall Main Menu
Reviewing the Device Probe Results The last few hundred lines that have been displayed on screen are stored and can be reviewed. To review the buffer, press Scroll Lock. This turns on scrolling in the display. You can then use the arrow keys, or PageUp and PageDown to view the results. Press Scroll Lock again to stop scrolling. Do this now, to review the text that scrolled off the screen when the kernel was carrying out the device probes. You will see text similar to , although the precise text will differ depending on the devices that you have in your computer.
Typical Device Probe Results avail memory = 253050880 (247120K bytes) Preloaded elf kernel "kernel" at 0xc0817000. Preloaded mfs_root "/mfsroot" at 0xc0817084. md0: Preloaded image </mfsroot> 4423680 bytes at 0xc03ddcd4 md1: Malloc disk Using $PIR table, 4 entries at 0xc00fde60 npx0: <math processor> on motherboard npx0: INT 16 interface pcib0: <Host to PCI bridge> on motherboard pci0: <PCI bus> on pcib0 pcib1:<VIA 82C598MVP (Apollo MVP3) PCI-PCI (AGP) bridge> at device 1.0 on pci0 pci1: <PCI bus> on pcib1 pci1: <Matrox MGA G200 AGP graphics accelerator> at 0.0 irq 11 isab0: <VIA 82C586 PCI-ISA bridge> at device 7.0 on pci0 isa0: <iSA bus> on isab0 atapci0: <VIA 82C586 ATA33 controller> port 0xe000-0xe00f at device 7.1 on pci0 ata0: at 0x1f0 irq 14 on atapci0 ata1: at 0x170 irq 15 on atapci0 uhci0 <VIA 83C572 USB controller> port 0xe400-0xe41f irq 10 at device 7.2 on pci 0 usb0: <VIA 83572 USB controller> on uhci0 usb0: USB revision 1.0 uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr1 uhub0: 2 ports with 2 removable, self powered pci0: <unknown card> (vendor=0x1106, dev=0x3040) at 7.3 dc0: <ADMtek AN985 10/100BaseTX> port 0xe800-0xe8ff mem 0xdb000000-0xeb0003ff ir q 11 at device 8.0 on pci0 dc0: Ethernet address: 00:04:5a:74:6b:b5 miibus0: <MII bus> on dc0 ukphy0: <Generic IEEE 802.3u media interface> on miibus0 ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto ed0: <NE2000 PCI Ethernet (RealTek 8029)> port 0xec00-0xec1f irq 9 at device 10. 0 on pci0 ed0 address 52:54:05:de:73:1b, type NE2000 (16 bit) isa0: too many dependant configs (8) isa0: unexpected small tag 14 orm0: <Option ROM> at iomem 0xc0000-0xc7fff on isa0 fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq2 on isa0 fdc0: FIFO enabled, 8 bytes threshold fd0: <1440-KB 3.5" drive> on fdc0 drive 0 atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0 atkbd0: <AT Keyboard> flags 0x1 irq1 on atkbdc0 kbd0 at atkbd0 psm0: <PS/2 Mouse> irq 12 on atkbdc0 psm0: model Generic PS/@ mouse, device ID 0 vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 sc0: <System console> at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 sio0: type 16550A sio1 at port 0x2f8-0x2ff irq 3 on isa0 sio1: type 16550A ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0 pppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode ppc0: FIFO with 16/16/15 bytes threshold plip0: <PLIP network interface> on ppbus0 ad0: 8063MB <IBM-DHEA-38451> [16383/16/63] at ata0-master UDMA33 acd0: CD-RW <LITE-ON LTR-1210B> at ata1-slave PIO4 Mounting root from ufs:/dev/md0c /stand/sysinstall running as init on vty0
Check the probe results carefully to make sure that FreeBSD found all the devices you expected. If a device was not found, then it will not be listed. If the device's driver required configuring with the IRQ and port address then you should check that you entered them correctly. If you need to make changes to the UserConfig device probing, it is easy to exit the sysinstall program and start over again. It is also a good way to become more familiar with the process.
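The probe messages are not lost after installation: the kernel keeps them in a message buffer, and on a running &os; system a copy of the boot output is also saved to disk, so they can be reviewed at leisure: &prompt.root; dmesg | more &prompt.root; more /var/run/dmesg.boot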
Select Sysinstall Exit
Use the arrow keys to select Exit Install from the Main Install Screen menu. The following message will display: User Confirmation Requested Are you sure you wish to exit? The system will reboot (be sure to remove any floppies from the drives). [ Yes ] No The install program will start again if the CDROM is left in the drive and &gui.yes; is selected. If you are booting from floppies it will be necessary to remove the mfsroot.flp floppy and replace it with kern.flp before rebooting.
Introducing Sysinstall The sysinstall utility is the installation application provided by the FreeBSD Project. It is console based and is divided into a number of menus and screens that you can use to configure and control the installation process. The sysinstall menu system is controlled by the arrow keys, Enter, Space, and other keys. A detailed description of these keys and what they do is contained in sysinstall's usage information. To review this information, ensure that the Usage entry is highlighted and that the [Select] button is selected, as shown in , then press Enter. The instructions for using the menu system will be displayed. After reviewing them, press Enter to return to the Main Menu.
Selecting Usage from Sysinstall Main Menu
Selecting the Documentation Menu From the Main Menu, select Doc with the arrow keys and press Enter.
Selecting Documentation Menu
This will display the Documentation Menu.
Sysinstall Documentation Menu
It is important to read the documents provided. To view a document, select it with the arrow keys and press Enter. When finished reading a document, pressing Enter will return to the Documentation Menu. To return to the Main Installation Menu, select Exit with the arrow keys and press Enter.
Selecting the Keymap Menu To change the keyboard mapping, use the arrow keys to select Keymap from the menu and press Enter. This is only required if you are using a non-standard or non-US keyboard.
Sysinstall Main Menu
A different keyboard mapping may be chosen by selecting the menu item using up/down arrow keys and pressing Space. Pressing Space again will unselect the item. When finished, choose the &gui.ok; using the arrow keys and press Enter. Only a partial list is shown in this screen representation. Selecting &gui.cancel; by pressing Tab will use the default keymap and return to the Main Install Menu.
Sysinstall Keymap Menu
Installation Options Screen Select Options and press Enter.
Sysinstall Main Menu
Sysinstall Options
The default values are usually fine for most users and do not need to be changed. The release name will vary according to the version being installed. The description of the selected item will appear at the bottom of the screen highlighted in blue. Notice that one of the options is Use Defaults to reset all values to startup defaults. Press F1 to read the help screen about the various options. Pressing Q will return to the Main Install menu.
Begin a Standard Installation The Standard installation is the option recommended for those new to &unix; or FreeBSD. Use the arrow keys to select Standard and then press Enter to start the installation.
Begin Standard Installation
Allocating Disk Space Your first task is to allocate disk space for FreeBSD, and label that space so that sysinstall can prepare it. In order to do this you need to know how FreeBSD expects to find information on the disk. BIOS Drive Numbering Before you install and configure FreeBSD on your system, there is an important subject that you should be aware of, especially if you have multiple hard drives. DOS Microsoft Windows In a PC running a BIOS-dependent operating system such as &ms-dos; or µsoft.windows;, the BIOS is able to abstract the normal disk drive order, and the operating system goes along with the change. This allows the user to boot from a disk drive other than the so-called primary master. This is especially convenient for some users who have found that the simplest and cheapest way to keep a system backup is to buy an identical second hard drive, and perform routine copies of the first drive to the second drive using Ghost or XCOPY . Then, if the first drive fails, or is attacked by a virus, or is scribbled upon by an operating system defect, he can easily recover by instructing the BIOS to logically swap the drives. It is like switching the cables on the drives, but without having to open the case. SCSI BIOS More expensive systems with SCSI controllers often include BIOS extensions which allow the SCSI drives to be re-ordered in a similar fashion for up to seven drives. A user who is accustomed to taking advantage of these features may become surprised when the results with FreeBSD are not as expected. FreeBSD does not use the BIOS, and does not know the logical BIOS drive mapping. This can lead to very perplexing situations, especially when drives are physically identical in geometry, and have also been made as data clones of one another. When using FreeBSD, always restore the BIOS to natural drive numbering before installing FreeBSD, and then leave it that way. If you need to switch drives around, then do so, but do it the hard way, and open the case and move the jumpers and cables. An Illustration from the Files of Bill and Fred's Exceptional Adventures: Bill breaks-down an older Wintel box to make another FreeBSD box for Fred. Bill installs a single SCSI drive as SCSI unit zero and installs FreeBSD on it. Fred begins using the system, but after several days notices that the older SCSI drive is reporting numerous soft errors and reports this fact to Bill. After several more days, Bill decides it is time to address the situation, so he grabs an identical SCSI drive from the disk drive archive in the back room. An initial surface scan indicates that this drive is functioning well, so Bill installs this drive as SCSI unit four and makes an image copy from drive zero to drive four. Now that the new drive is installed and functioning nicely, Bill decides that it is a good idea to start using it, so he uses features in the SCSI BIOS to re-order the disk drives so that the system boots from SCSI unit four. FreeBSD boots and runs just fine. Fred continues his work for several days, and soon Bill and Fred decide that it is time for a new adventure — time to upgrade to a newer version of FreeBSD. Bill removes SCSI unit zero because it was a bit flaky and replaces it with another identical disk drive from the archive. Bill then installs the new version of FreeBSD onto the new SCSI unit zero using Fred's magic Internet FTP floppies. The installation goes well. 
Fred uses the new version of FreeBSD for a few days, and certifies that it is good enough for use in the engineering department. It is time to copy all of his work from the old version. So Fred mounts SCSI unit four (the latest copy of the older FreeBSD version). Fred is dismayed to find that none of his precious work is present on SCSI unit four. Where did the data go? When Bill made an image copy of the original SCSI unit zero onto SCSI unit four, unit four became the new clone. When Bill re-ordered the SCSI BIOS so that he could boot from SCSI unit four, he was only fooling himself. FreeBSD was still running on SCSI unit zero. Making this kind of BIOS change will cause some or all of the Boot and Loader code to be fetched from the selected BIOS drive, but when the FreeBSD kernel drivers take-over, the BIOS drive numbering will be ignored, and FreeBSD will transition back to normal drive numbering. In the illustration at hand, the system continued to operate on the original SCSI unit zero, and all of Fred's data was there, not on SCSI unit four. The fact that the system appeared to be running on SCSI unit four was simply an artifact of human expectations. We are delighted to mention that no data bytes were killed or harmed in any way by our discovery of this phenomenon. The older SCSI unit zero was retrieved from the bone pile, and all of Fred's work was returned to him, (and now Bill knows that he can count as high as zero). Although SCSI drives were used in this illustration, the concepts apply equally to IDE drives. Creating Slices Using FDisk No changes you make at this point will be written to the disk. If you think you have made a mistake and want to start again you can use the menus to exit sysinstall and try again or press U to use the Undo option. If you get confused and can not see how to exit you can always turn your computer off. After choosing to begin a standard installation in sysinstall you will be shown this message: Message In the next menu, you will need to set up a DOS-style ("fdisk") partitioning scheme for your hard disk. If you simply wish to devote all disk space to FreeBSD (overwriting anything else that might be on the disk(s) selected) then use the (A)ll command to select the default partitioning scheme followed by a (Q)uit. If you wish to allocate only free space to FreeBSD, move to a partition marked "unused" and use the (C)reate command. [ OK ] [ Press enter or space ] Press Enter as instructed. You will then be shown a list of all the hard drives that the kernel found when it carried out the device probes. shows an example from a system with two IDE disks. They have been called ad0 and ad2.
Select Drive for FDisk
You might be wondering why ad1 is not listed here. Why has it been missed? Consider what would happen if you had two IDE hard disks, one as the master on the first IDE controller, and one as the master on the second IDE controller. If FreeBSD numbered these as it found them, as ad0 and ad1 then everything would work. But if you then added a third disk, as the slave device on the first IDE controller, it would now be ad1, and the previous ad1 would become ad2. Because device names (such as ad1s1a) are used to find filesystems, you may suddenly discover that some of your filesystems no longer appear correctly, and you would need to change your FreeBSD configuration. To work around this, the kernel can be configured to name IDE disks based on where they are, and not the order in which they were found. With this scheme the master disk on the second IDE controller will always be ad2, even if there are no ad0 or ad1 devices. This configuration is the default for the FreeBSD kernel, which is why this display shows ad0 and ad2. The machine on which this screenshot was taken had IDE disks on both master channels of the IDE controllers, and no disks on the slave channels. You should select the disk on which you want to install FreeBSD, and then press &gui.ok;. FDisk will start, with a display similar to that shown in . The FDisk display is broken into three sections. The first section, covering the first two lines of the display, shows details about the currently selected disk, including its FreeBSD name, the disk geometry, and the total size of the disk. The second section shows the slices that are currently on the disk, where they start and end, how large they are, the name FreeBSD gives them, and their description and sub-type. This example shows two small unused slices, which are artifacts of disk layout schemes on the PC. It also shows one large FAT slice, which almost certainly appears as C: in &ms-dos; / &windows;, and an extended slice, which may contain other drive letters for &ms-dos; / &windows;. The third section shows the commands that are available in FDisk.
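As an aside to the ad0/ad2 naming discussed above: the wired numbering is controlled by a kernel option, which the default GENERIC kernel enables. The relevant line in the kernel configuration file looks like the following; removing it makes the kernel number ATA disks in the order they are found: options ATA_STATIC_ID # Static device numbering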
Typical Fdisk Partitions before Editing
What you do now will depend on how you want to slice up your disk. If you want to use FreeBSD for the entire disk (which will delete all the other data on this disk when you confirm that you want sysinstall to continue later in the installation process) then you can press A, which corresponds to the Use Entire Disk option. The existing slices will be removed, and replaced with a small area flagged as unused (again, an artifact of PC disk layout), and then one large slice for FreeBSD. If you do this, then you should select the newly created FreeBSD slice using the arrow keys, and press S to mark the slice as being bootable. The screen will then look very similar to . Note the A in the Flags column, which indicates that this slice is active, and will be booted from. If you will be deleting an existing slice to make space for FreeBSD then you should select the slice using the arrow keys, and then press D. You can then press C, and be prompted for size of slice you want to create. Enter the appropriate figure and press Enter. The default value in this box represents the largest possible slice you can make, which could be the largest contiguous block of unallocated space or the size of the entire hard disk. If you have already made space for FreeBSD (perhaps by using a tool such as &partitionmagic;) then you can press C to create a new slice. Again, you will be prompted for the size of slice you would like to create.
Fdisk Partition Using Entire Disk
When finished, press Q. Your changes will be saved in sysinstall, but will not yet be written to disk.
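If you want to double-check the slice table from a running FreeBSD system later, &man.fdisk.8; can print a summary of it. The session below is purely illustrative (the device name, geometry, and sizes are made up for the example); 0xa5 is the FreeBSD slice type, and the 0x80 flag marks the active slice:

# fdisk -s ad0
/dev/ad0: 16383 cyl 16 hd 63 sec
Part        Start        Size Type Flags
   1:          63     4095441 0xa5 0x80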
Install a Boot Manager You now have the option to install a boot manager. In general, you should choose to install the FreeBSD boot manager if: You have more than one drive, and have installed FreeBSD onto a drive other than the first one. You have installed FreeBSD alongside another operating system on the same disk, and you want to choose whether to start FreeBSD or the other operating system when you start the computer. If FreeBSD is going to be the only operating system on this machine, installed on the first hard disk, then the Standard boot manager will suffice. Choose None if you are using a third-party boot manager capable of booting FreeBSD. Make your choice and press Enter.
Sysinstall Boot Manager Menu
The help screen, reached by pressing F1, discusses the problems that can be encountered when trying to share the hard disk between operating systems.
Creating Slices on Another Drive If there is more than one drive, sysinstall will return to the Select Drives screen after the boot manager selection. If you wish to install FreeBSD onto more than one disk, then you can select another disk here and repeat the slice process using FDisk. If you are installing FreeBSD on a drive other than your first, then the FreeBSD boot manager needs to be installed on both drives.
Exit Select Drive
The Tab key toggles between the last drive selected, &gui.ok;, and &gui.cancel;. Press Tab once to toggle to &gui.ok;, then press Enter to continue with the installation.
Creating Partitions Using <application>Disklabel</application> You must now create some partitions inside each slice that you have just created. Remember that each partition is lettered, from a through to h, and that partitions b, c, and d have conventional meanings that you should adhere to. Certain applications can benefit from particular partition schemes, especially if you are laying out partitions across more than one disk. However, for this, your first FreeBSD installation, you do not need to give too much thought to how you partition the disk. It is more important that you install FreeBSD and start learning how to use it. You can always re-install FreeBSD to change your partition scheme when you are more familiar with the operating system. This scheme features four partitions: one for swap space and three for filesystems.

Partition Layout for First Disk

Partition  Filesystem  Size
a          /           100 MB
    This is the root filesystem. Every other filesystem will be mounted somewhere under this one. 100 MB is a reasonable size for this filesystem. You will not be storing too much data on it, as a regular FreeBSD install will put about 40 MB of data here. The remaining space is for temporary data, and also leaves expansion space if future versions of FreeBSD need more space in /.
b          N/A         2-3 x RAM
    The system's swap space is kept on this partition. Choosing the right amount of swap space can be a bit of an art. A good rule of thumb is that your swap space should be two or three times as much as the available physical memory (RAM). You should also have at least 64 MB of swap, so if you have less than 32 MB of RAM in your computer then set the swap amount to 64 MB. If you have more than one disk then you can put swap space on each disk. FreeBSD will then use each disk for swap, which effectively speeds up the act of swapping. In this case, calculate the total amount of swap you need (e.g., 128 MB), and then divide this by the number of disks you have (e.g., two disks) to give the amount of swap you should put on each disk; in this example, 64 MB of swap per disk.
e          /var        50 MB
    The /var directory contains files that are constantly varying: log files and other administrative files. Many of these files are read from or written to extensively during FreeBSD's day-to-day running. Putting these files on another filesystem allows FreeBSD to optimize the access of these files without affecting other files in other directories that do not have the same access pattern.
f          /usr        Rest of disk
    All your other files will typically be stored in /usr and its subdirectories.
If you will be installing FreeBSD onto more than one disk then you must also create partitions in the other slices that you configured. The easiest way to do this is to create two partitions on each disk, one for the swap space, and one for a filesystem.

Partition Layout for Subsequent Disks

Partition  Filesystem  Size
b          N/A         See description
    As already discussed, you can split swap space across each disk. Even though the a partition is free, convention dictates that swap space stays on the b partition.
e          /diskn      Rest of disk
    The rest of the disk is taken up with one big partition. This could easily be put on the a partition, instead of the e partition. However, convention says that the a partition on a slice is reserved for the filesystem that will be the root (/) filesystem. You do not have to follow this convention, but sysinstall does, so following it yourself makes the installation slightly cleaner. You can choose to mount this filesystem anywhere; this example suggests that you mount them as directories /diskn, where n is a number that changes for each disk. But you can use another scheme if you prefer.
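Once the installation is finished, these partitions appear as ordinary entries in /etc/fstab. As a sketch only, assuming the FreeBSD slice is the first slice (s1) on each of two disks laid out as above, the resulting file would look roughly like this:

# Device        Mountpoint    FStype    Options    Dump    Pass#
/dev/ad0s1b     none          swap      sw         0       0
/dev/ad2s1b     none          swap      sw         0       0
/dev/ad0s1a     /             ufs       rw         1       1
/dev/ad0s1e     /var          ufs       rw         2       2
/dev/ad0s1f     /usr          ufs       rw         2       2
/dev/ad2s1e     /disk2        ufs       rw         2       2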
Having chosen your partition layout you can now create it using sysinstall. You will see this message: Message Now, you need to create BSD partitions inside of the fdisk partition(s) just created. If you have a reasonable amount of disk space (200MB or more) and don't have any special requirements, simply use the (A)uto command to allocate space automatically. If you have more specific needs or just don't care for the layout chosen by (A)uto, press F1 for more information on manual layout. [ OK ] [ Press enter or space ] Press Enter to start the FreeBSD partition editor, called Disklabel. The figure below shows the display when you first start Disklabel. The display is divided into three sections. The first few lines show the name of the disk you are currently working on, and the slice that contains the partitions you are creating (at this point Disklabel calls this the Partition name rather than slice name). This display also shows the amount of free space within the slice; that is, space that was set aside in the slice, but that has not yet been assigned to a partition. The middle of the display shows the partitions that have been created, the name of the filesystem that each partition contains, their size, and some options pertaining to the creation of the filesystem. The bottom third of the screen shows the keystrokes that are valid in Disklabel.
Sysinstall Disklabel Editor
Disklabel can automatically create partitions for you and assign them default sizes. Try this now by pressing A. You will see a display similar to that shown in the next figure. Depending on the size of the disk you are using, the defaults may or may not be appropriate. This does not matter, as you do not have to accept the defaults. Beginning with FreeBSD 4.5, the default partitioning assigns the /tmp directory its own partition instead of being part of the / partition. This helps avoid filling the / partition with temporary files.
Sysinstall Disklabel Editor with Auto Defaults
If you choose not to use the default partitions and wish to replace them with your own, use the arrow keys to select the first partition, and press D to delete it. Repeat this to delete all the suggested partitions. To create the first partition (a, mounted as / — root), make sure the proper disk slice at the top of the screen is selected and press C. A dialog box will appear prompting you for the size of the new partition (as shown in the figure below). You can enter the size as the number of disk blocks you want to use, or as a number followed by either M for megabytes, G for gigabytes, or C for cylinders. Beginning with FreeBSD 5.X, users can: select UFS2 (the default on &os; 5.1 and above) using the Custom Newfs (Z) option, create labels with Auto Defaults and modify them with the Custom Newfs option, or add -U during the regular creation period. Do not forget to add -U for SoftUpdates if you use the Custom Newfs option!
Free Space for Root Partition
The default size shown will create a partition that takes up the rest of the slice. If you are using the partition sizes described in the earlier example, then delete the existing figure using Backspace, and then type in 64M, as shown in the next figure. Then press &gui.ok;.
Edit Root Partition Size
Having chosen the partition's size you will then be asked whether this partition will contain a filesystem or swap space. The dialog box is shown in the figure below. This first partition will contain a filesystem, so check that FS is selected and press Enter.
Choose the Root Partition Type
Finally, because you are creating a filesystem, you must tell Disklabel where the filesystem is to be mounted. The dialog box is shown in the figure below. The root filesystem's mount point is /, so type /, and then press Enter.
Choose the Root Mount Point
The display will then update to show you the newly created partition. You should repeat this procedure for the other partitions. When you create the swap partition, you will not be prompted for the filesystem mount point, as swap partitions are never mounted. When you create the final partition, /usr, you can leave the suggested size as is, to use the rest of the slice. Your final FreeBSD Disklabel Editor screen will appear similar to the figure below, although the values you chose may differ. Press Q to finish.
Sysinstall Disklabel Editor
Choosing What to Install Select the Distribution Set Deciding which distribution set to install will depend largely on the intended use of the system and the amount of disk space available. The predefined options range from installing the smallest possible configuration to everything. Those who are new to &unix; and/or FreeBSD should almost certainly select one of these canned options. Customizing a distribution set is typically for the more experienced user. Press F1 for more information on the distribution set options and what they contain. When finished reviewing the help, pressing Enter will return to the Select Distributions Menu. If a graphical user interface is desired then a distribution set that is preceded by an X should be chosen. The configuration of the X server and selection of a default desktop must be done after the installation of &os;. More information regarding the configuration of an X server can be found in the X Window System chapter of this book. The default version of X11 that is installed depends on the version of FreeBSD that you are installing. For FreeBSD versions prior to 5.3, &xfree86; 4.X is installed. For &os; 5.3 and later, &xorg; is the default. If compiling a custom kernel is anticipated, select an option which includes the source code. For more information on why a custom kernel should be built or how to build a custom kernel, see the chapter on configuring the FreeBSD kernel. Obviously, the most versatile system is one that includes everything. If there is adequate disk space, select All, as shown in the figure below, by using the arrow keys and press Enter. If there is a concern about disk space, consider using an option that is more suitable for the situation. Do not fret over the perfect choice, as other distributions can be added after installation.
Choose Distributions
Installing the Ports Collection After selecting the desired distribution, an opportunity to install the FreeBSD Ports Collection is presented. The ports collection is an easy and convenient way to install software. The Ports Collection does not contain the source code necessary to compile the software. Instead, it is a collection of files which automates the downloading, compiling and installation of third-party software packages. The Ports chapter of this book discusses how to use the ports collection. The installation program does not check to see if you have adequate space. Select this option only if you have adequate hard disk space. As of FreeBSD &rel.current;, the FreeBSD Ports Collection takes up about &ports.size; of disk space. You can safely assume a larger value for more recent versions of FreeBSD. User Confirmation Requested Would you like to install the FreeBSD ports collection? This will give you ready access to over &os.numports; ported software packages, at a cost of around &ports.size; of disk space when "clean" and possibly much more than that if a lot of the distribution tarballs are loaded (unless you have the extra CDs from a FreeBSD CD/DVD distribution available and can mount it on /cdrom, in which case this is far less of a problem). The Ports Collection is a very valuable resource and well worth having on your /usr partition, so it is advisable to say Yes to this option. For more information on the Ports Collection & the latest ports, visit: http://www.FreeBSD.org/ports [ Yes ] No Select &gui.yes; with the arrow keys to install the Ports Collection or &gui.no; to skip this option. Press Enter to continue. The Choose Distributions menu will redisplay.
Confirm Distributions
If satisfied with the options, select Exit with the arrow keys, ensure that &gui.ok; is highlighted, and press Enter to continue.
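If you did install the Ports Collection, building a port later is a matter of changing into its directory and running the standard make targets. For example, to build and install the bash shell from the collection (assuming the default /usr/ports location):

# cd /usr/ports/shells/bash
# make install clean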
Choosing Your Installation Media If installing from a CDROM or DVD, use the arrow keys to highlight Install from a FreeBSD CD/DVD. Ensure that &gui.ok; is highlighted, then press Enter to proceed with the installation. For other methods of installation, select the appropriate option and follow the instructions. Press F1 to display the Online Help for installation media. Press Enter to return to the media selection menu.
Choose Installation Media
FTP Installation Modes installation network FTP There are three FTP installation modes you can choose from: active FTP, passive FTP, or via an HTTP proxy. FTP Active: Install from an FTP server This option will make all FTP transfers use Active mode. This will not work through firewalls, but will often work with older FTP servers that do not support passive mode. If your connection hangs with passive mode (the default), try active! FTP Passive: Install from an FTP server through a firewall FTP passive mode This option instructs sysinstall to use Passive mode for all FTP operations. This allows the user to pass through firewalls that do not allow incoming connections on random TCP ports. FTP via a HTTP proxy: Install from an FTP server through a http proxy FTP via a HTTP proxy This option instructs sysinstall to use the HTTP protocol (like a web browser) to connect to a proxy for all FTP operations. The proxy will translate the requests and send them to the FTP server. This allows the user to pass through firewalls that do not allow FTP at all, but offer an HTTP proxy. In this case, you have to specify the proxy in addition to the FTP server. For a proxy FTP server, you should usually give the name of the server you really want as a part of the username, after an @ sign. The proxy server then impersonates the real server. For example, assume you want to install from ftp.FreeBSD.org, using the proxy FTP server foo.example.com, which listens on port 1234. In this case, you go to the options menu, set the FTP username to ftp@ftp.FreeBSD.org, and the password to your email address. As your installation media, you specify FTP (or passive FTP, if the proxy supports it), and the URL ftp://foo.example.com:1234/pub/FreeBSD. Since /pub/FreeBSD from ftp.FreeBSD.org is proxied under foo.example.com, you are able to install from that machine (which will fetch the files from ftp.FreeBSD.org as your installation requests them).
Committing to the Installation The installation can now proceed if desired. This is also the last chance for aborting the installation to prevent changes to the hard drive. User Confirmation Requested Last Chance! Are you SURE you want to continue the installation? If you're running this on a disk with data you wish to save then WE STRONGLY ENCOURAGE YOU TO MAKE PROPER BACKUPS before proceeding! We can take no responsibility for lost disk contents! [ Yes ] No Select &gui.yes; and press Enter to proceed. The installation time will vary according to the distribution chosen, installation media, and the speed of the computer. There will be a series of messages displayed indicating the status. The installation is complete when the following message is displayed: Message Congratulations! You now have FreeBSD installed on your system. We will now move on to the final configuration questions. For any option you do not wish to configure, simply select No. If you wish to re-enter this utility after the system is up, you may do so by typing: /stand/sysinstall . [ OK ] [ Press enter to continue ] Press Enter to proceed with post-installation configurations. Selecting &gui.no; and pressing Enter will abort the installation, so no changes will be made to your system. The following message will appear: Message Installation complete with some errors. You may wish to scroll through the debugging messages on VTY1 with the scroll-lock feature. You can also choose "No" at the next prompt and go back into the installation menus to retry whichever operations have failed. [ OK ] This message is generated because nothing was installed. Pressing Enter will return to the Main Installation Menu to exit the installation. Post-installation Configuration of various options follows the successful installation. An option can be configured by re-entering the configuration options before booting the new FreeBSD system or after installation using sysinstall (/stand/sysinstall in &os; versions older than 5.2) and selecting Configure. Network Device Configuration If you previously configured PPP for an FTP install, this screen will not display; the interface can be configured later as described above. For detailed information on Local Area Networks and configuring FreeBSD as a gateway/router refer to the Advanced Networking chapter. User Confirmation Requested Would you like to configure any Ethernet or SLIP/PPP network devices? [ Yes ] No To configure a network device, select &gui.yes; and press Enter. Otherwise, select &gui.no; to continue.
Selecting an Ethernet Device
Select the interface to be configured with the arrow keys and press Enter. User Confirmation Requested Do you want to try IPv6 configuration of the interface? Yes [ No ] In this private local area network, the current Internet type protocol (IPv4) was sufficient, so &gui.no; was selected with the arrow keys and Enter was pressed. If you are connected to an existing IPv6 network with an RA server, then choose &gui.yes; and press Enter. It will take several seconds to scan for RA servers. User Confirmation Requested Do you want to try DHCP configuration of the interface? Yes [ No ] If DHCP (Dynamic Host Configuration Protocol) is not required, select &gui.no; with the arrow keys and press Enter. Selecting &gui.yes; will execute dhclient, and if successful, will fill in the network configuration information automatically. Refer to the DHCP section of this book for more information. The following Network Configuration screen shows the configuration of the Ethernet device for a system that will act as the gateway for a Local Area Network.
Set Network Configuration for ed0
Use Tab to select the information fields and fill in appropriate information:

Host: The fully-qualified hostname, such as k6-2.example.com in this case.
Domain: The name of the domain that your machine is in, such as example.com for this case.
IPv4 Gateway: IP address of the host forwarding packets to non-local destinations. You must fill this in if the machine is a node on the network. Leave this field blank if the machine is the gateway to the Internet for the network. The IPv4 Gateway is also known as the default gateway or default route.
Name server: IP address of your local DNS server. There is no local DNS server on this private local area network, so the IP address of the provider's DNS server (208.163.10.2) was used.
IPv4 address: The IP address to be used for this interface was 192.168.0.1.
Netmask: The address block being used for this local area network is a Class C block (192.168.0.0 - 192.168.255.255). The default netmask is for a Class C network (255.255.255.0).
Extra options to ifconfig: Any interface-specific options to ifconfig you would like to add. There were none in this case.

Use Tab to select &gui.ok; when finished and press Enter. User Confirmation Requested Would you like to Bring Up the ed0 interface right now? [ Yes ] No Choosing &gui.yes; and pressing Enter will bring the machine up on the network, ready for use. However, this does not accomplish much during installation, since the machine still needs to be rebooted.
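The values entered on this screen end up as ordinary configuration-file entries on the installed system. For the example above, sysinstall would write something close to the following (sketched from the values shown, not copied from a real system; since the IPv4 Gateway field was left blank, no defaultrouter entry is written):

/etc/rc.conf:
hostname="k6-2.example.com"
ifconfig_ed0="inet 192.168.0.1 netmask 255.255.255.0"

/etc/resolv.conf:
domain example.com
nameserver 208.163.10.2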
Configure Gateway User Confirmation Requested Do you want this machine to function as a network gateway? [ Yes ] No If the machine will be acting as the gateway for a local area network and forwarding packets between other machines then select &gui.yes; and press Enter. If the machine is a node on a network then select &gui.no; and press Enter to continue. Configure Internet Services User Confirmation Requested Do you want to configure inetd and the network services that it provides? Yes [ No ] If &gui.no; is selected, various services such as telnetd will not be enabled. This means that remote users will not be able to telnet into this machine. Local users will still be able to access remote machines with telnet. These services can be enabled after installation by editing /etc/inetd.conf with your favorite text editor. See the inetd section of this book for more information. Select &gui.yes; if you wish to configure these services during install. An additional confirmation will display: User Confirmation Requested The Internet Super Server (inetd) allows a number of simple Internet services to be enabled, including finger, ftp and telnetd. Enabling these services may increase risk of security problems by increasing the exposure of your system. With this in mind, do you wish to enable inetd? [ Yes ] No Select &gui.yes; to continue. User Confirmation Requested inetd(8) relies on its configuration file, /etc/inetd.conf, to determine which of its Internet services will be available. The default FreeBSD inetd.conf(5) leaves all services disabled by default, so they must be specifically enabled in the configuration file before they will function, even once inetd(8) is enabled. Note that services for IPv6 must be separately enabled from IPv4 services. Select [Yes] now to invoke an editor on /etc/inetd.conf, or [No] to use the current settings. [ Yes ] No Selecting &gui.yes; will allow adding services by deleting the # at the beginning of a line.
Editing <filename>inetd.conf</filename>
After adding the desired services, pressing Esc will display a menu which will allow exiting and saving the changes.
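Enabling a service really is just a matter of removing the comment character. For example, the stock /etc/inetd.conf of this era ships the FTP service commented out (the exact line below is quoted from memory, so check your own file):

#ftp    stream  tcp     nowait  root    /usr/libexec/ftpd       ftpd -l

Deleting the leading # and restarting inetd makes the service available:

ftp     stream  tcp     nowait  root    /usr/libexec/ftpd       ftpd -l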
Anonymous FTP FTP anonymous User Confirmation Requested Do you want to have anonymous FTP access to this machine? Yes [ No ] Deny Anonymous FTP Selecting the default &gui.no; and pressing Enter will still allow users who have accounts with passwords to use FTP to access the machine. Allow Anonymous FTP Anyone can access your machine if you elect to allow anonymous FTP connections. The security implications should be considered before enabling this option. For more information about security see . To allow anonymous FTP, use the arrow keys to select &gui.yes; and press Enter. The following screen (or similar) will display:
Default Anonymous FTP Configuration
Pressing F1 will display the help: This screen allows you to configure the anonymous FTP user. The following configuration values are editable:

UID: The user ID you wish to assign to the anonymous FTP user. All files uploaded will be owned by this ID.
Group: Which group you wish the anonymous FTP user to be in.
Comment: String describing this user in /etc/passwd
FTP Root Directory: Where files available for anonymous FTP will be kept.
Upload subdirectory: Where files uploaded by anonymous FTP users will go.

The ftp root directory will be put in /var by default. If you do not have enough room there for the anticipated FTP needs, the /usr directory could be used by setting the FTP Root Directory to /usr/ftp. When you are satisfied with the values, press Enter to continue. User Confirmation Requested Create a welcome message file for anonymous FTP users? [ Yes ] No If you select &gui.yes; and press Enter, an editor will automatically start allowing you to edit the message.
Edit the FTP Welcome Message
This is a text editor called ee. Use the instructions to change the message now, or change the message later using a text editor of your choice. Note the file name/location at the bottom of the editor screen. Press Esc and a pop-up menu will default to a) leave editor. Press Enter to exit and continue. Press Enter again to save changes if you made any.
Configure Network File System Network File System (NFS) allows sharing of files across a network. A machine can be configured as a server, a client, or both. Refer to the NFS section of this book for more information. NFS Server User Confirmation Requested Do you want to configure this machine as an NFS server? Yes [ No ] If there is no need for a Network File System server, select &gui.no; and press Enter. If &gui.yes; is chosen, a message will pop up indicating that the exports file must be created. Message Operating as an NFS server means that you must first configure an /etc/exports file to indicate which hosts are allowed certain kinds of access to your local filesystems. Press [Enter] now to invoke an editor on /etc/exports [ OK ] Press Enter to continue. A text editor will start allowing the exports file to be created and edited.
Editing <filename>exports</filename>
Use the instructions to add the actual exported filesystems now, or add them later using a text editor of your choice. Note the file name/location at the bottom of the editor screen. Press Esc and a pop-up menu will default to a) leave editor. Press Enter to exit and continue.
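The format of /etc/exports is one local filesystem per line, followed by export options and the hosts permitted to mount it. A small hypothetical example (the hostnames and address are placeholders):

/usr/src    -ro         client1.example.com client2.example.com
/home       -alldirs    192.168.0.2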
NFS Client The NFS client allows your machine to access NFS servers. User Confirmation Requested Do you want to configure this machine as an NFS client? Yes [ No ] With the arrow keys, select &gui.yes; or &gui.no; as appropriate and press Enter.
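Enabling the client simply sets nfs_client_enable="YES" in /etc/rc.conf; once the system is up, a remote filesystem can be mounted with a single command (the server name and paths below are placeholders):

# mount server.example.com:/home /mnt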
Security Profile A security profile is a set of configuration options that attempts to achieve the desired ratio of security to convenience by enabling and disabling certain programs and other settings. The more severe the security profile, the fewer programs will be enabled by default. This is one of the basic principles of security: do not run anything except what you must. Please note that the security profile is just a default setting. All programs can be enabled and disabled after you have installed FreeBSD by editing or adding the appropriate line(s) to /etc/rc.conf. For more information, please see the &man.rc.conf.5; manual page. The following table describes what each of the security profiles does. The columns are the choices you have for a security profile, and the rows are the program or feature that the profile enables or disables.

Possible Security Profiles

                      Extreme   Moderate
&man.sendmail.8;      NO        YES
&man.sshd.8;          NO        YES
&man.portmap.8;       NO        MAYBE
NFS server            NO        YES
&man.securelevel.8;   YES       NO

The portmapper is enabled if the machine has been configured as an NFS client or server earlier in the installation. If you choose a security profile that sets the securelevel to Extreme or High, you must be aware of the implications. Please read the &man.init.8; manual page and pay particular attention to the meanings of the security levels, or you may have significant trouble later!
User Confirmation Requested Do you want to select a default security profile for this host (select No for "medium" security)? [ Yes ] No Selecting &gui.no; and pressing Enter will set the security profile to medium. Selecting &gui.yes; and pressing Enter will allow selecting a different security profile.
Security Profile Options
Press F1 to display the help. Press Enter to return to the selection menu. Use the arrow keys to choose Medium unless you are sure that another level is required for your needs. With &gui.ok; highlighted, press Enter. An appropriate confirmation message will display depending on which security setting was chosen. Message Moderate security settings have been selected. Sendmail and SSHd have been enabled, securelevels are disabled, and NFS server setting have been left intact. PLEASE NOTE that this still does not save you from having to properly secure your system in other ways or exercise due diligence in your administration, this simply picks a standard set of out-of-box defaults to start with. To change any of these settings later, edit /etc/rc.conf [OK] Message Extreme security settings have been selected. Sendmail, SSHd, and NFS services have been disabled, and securelevels have been enabled. PLEASE NOTE that this still does not save you from having to properly secure your system in other ways or exercise due diligence in your administration, this simply picks a more secure set of out-of-box defaults to start with. To change any of these settings later, edit /etc/rc.conf [OK] Press Enter to continue with the post-installation configuration. The security profile is not a silver bullet! Even if you use the extreme setting, you need to keep up with security issues by reading an appropriate mailing list, using good passwords and passphrases, and generally adhering to good security practices. It simply sets up the desired security to convenience ratio out of the box.
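The profiles are nothing more than bundles of &man.rc.conf.5; settings, so you can reproduce or fine-tune them by hand. As a rough sketch (the variable names are the standard rc.conf knobs; the exact set written by sysinstall may differ), the moderate profile corresponds approximately to:

sendmail_enable="YES"
sshd_enable="YES"
nfs_server_enable="YES"

while the extreme profile corresponds approximately to:

sendmail_enable="NO"
sshd_enable="NO"
nfs_server_enable="NO"
kern_securelevel_enable="YES"
kern_securelevel="2"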
System Console Settings There are several options available to customize the system console. User Confirmation Requested Would you like to customize your system console settings? [ Yes ] No To view and configure the options, select &gui.yes; and press Enter.
System Console Configuration Options
A commonly used option is the screen saver. Use the arrow keys to select Saver and then press Enter.
Screen Saver Options
Select the desired screen saver using the arrow keys and then press Enter. The System Console Configuration menu will redisplay. The default time interval is 300 seconds. To change the time interval, select Saver again. At the Screen Saver Options menu, select Timeout using the arrow keys and press Enter. A pop-up menu will appear:
Screen Saver Timeout
The value can be changed, then select &gui.ok; and press Enter to return to the System Console Configuration menu.
System Console Configuration Exit
Selecting Exit and pressing Enter will continue with the post-installation configurations.
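The console settings are likewise stored in /etc/rc.conf, so the screen saver can also be configured by hand after installation. For example, entries along these lines select the daemon saver module with a 300 second timeout (the saver value names one of the bundled screen saver modules):

saver="daemon"
blanktime="300"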
Setting the Time Zone Setting the time zone for your machine will allow it to automatically correct for any regional time changes and perform other time zone related functions properly. The example shown is for a machine located in the Eastern time zone of the United States. Your selections will vary according to your geographical location. User Confirmation Requested Would you like to set this machine's time zone now? [ Yes ] No Select &gui.yes; and press Enter to set the time zone. User Confirmation Requested Is this machine's CMOS clock set to UTC? If it is set to local time or you don't know, please choose NO here! Yes [ No ] Select &gui.yes; or &gui.no; according to how the machine's clock is configured and press Enter.
Select Your Region
The appropriate region is selected using the arrow keys and then pressing Enter.
Select Your Country
Select the appropriate country using the arrow keys and press Enter.
Select Your Time Zone
The appropriate time zone is selected using the arrow keys and pressing Enter. Confirmation Does the abbreviation 'EDT' look reasonable? [ Yes ] No Confirm the abbreviation for the time zone is correct. If it looks okay, press Enter to continue with the post-installation configuration.
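The same dialogs can be rerun at any time after installation with the &man.tzsetup.8; utility, which installs the selected zone file as /etc/localtime:

# tzsetup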
Linux Compatibility User Confirmation Requested Would you like to enable Linux binary compatibility? [ Yes ] No Selecting &gui.yes; and pressing Enter will allow running Linux software on FreeBSD. The install will add the appropriate packages for Linux compatibility. If installing by FTP, the machine will need to be connected to the Internet. Sometimes a remote FTP site will not have all of the distributions, such as the Linux binary compatibility distribution; it can be installed later if necessary. Mouse Settings This option will allow you to cut and paste text in the console and user programs with a 3-button mouse. If using a 2-button mouse, refer to the &man.moused.8; manual page after installation for details on emulating the 3-button style. This example depicts a non-USB mouse configuration (such as a PS/2 or COM port mouse): User Confirmation Requested Does this system have a non-USB mouse attached to it? [ Yes ] No Select &gui.yes; for a non-USB mouse or &gui.no; for a USB mouse and press Enter.
Select Mouse Protocol Type
Use the arrow keys to select Type and press Enter.
Set Mouse Protocol
The mouse used in this example is a PS/2 type, so the default Auto was appropriate. To change protocol, use the arrow keys to select another option. Ensure that &gui.ok; is highlighted and press Enter to exit this menu.
Configure Mouse Port
Use the arrow keys to select Port and press Enter.
Setting the Mouse Port
This system had a PS/2 mouse, so the default PS/2 was appropriate. To change the port, use the arrow keys and then press Enter.
Enable the Mouse Daemon
Last, use the arrow keys to select Enable, and press Enter to enable and test the mouse daemon.
Test the Mouse Daemon
Move the mouse around the screen and verify the cursor shown responds properly. If it does, select &gui.yes; and press Enter. If not, the mouse has not been configured correctly — select &gui.no; and try using different configuration options. Select Exit with the arrow keys and press Enter to return to continue with the post-installation configuration.
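The choices made in these menus are saved as moused entries in /etc/rc.conf. For the PS/2 example above they would look roughly like this (values sketched from the menu selections shown):

moused_enable="YES"
moused_type="auto"
moused_port="/dev/psm0"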
Tom Rhodes Contributed by Configure Additional Network Services Configuring network services can be a daunting task for new users if they lack previous knowledge in this area. Networking, including the Internet, is critical to all modern operating systems including &os;; as a result, it is very useful to have some understanding of &os;'s extensive networking capabilities. Doing this during the installation will ensure users have some understanding of the various services available to them. Network services are programs that accept input from anywhere on the network. Every effort is made to make sure these programs will not do anything harmful. Unfortunately, programmers are not perfect and through time there have been cases where bugs in network services have been exploited by attackers to do bad things. It is important that you enable only the network services you know you need. If in doubt, it is best not to enable a network service until you find out that you do need it. You can always enable it later by re-running sysinstall or by using the features provided by the /etc/rc.conf file. Selecting the Networking option will display a menu similar to the one below:
Network Configuration Upper-level
The first option, Interfaces, was previously covered during the network device configuration above, so this option can safely be ignored. Selecting the AMD option adds support for the BSD automatic mount utility. This is usually used in conjunction with the NFS protocol (see below) for automatically mounting remote file systems. No special configuration is required here. Next in line is the AMD Flags option. When selected, a menu will pop up for you to enter specific AMD flags. The menu already contains a set of default options: -a /.amd_mnt -l syslog /host /etc/amd.map /net /etc/amd.map The -a option sets the default mount location, which is specified here as /.amd_mnt. The -l option specifies the default log file; however, when syslogd is used, all log activity will be sent to the system log daemon. The /host directory is used to mount an exported file system from a remote host, while the /net directory is used to mount an exported file system from an IP address. The /etc/amd.map file defines the default options for AMD exports. FTP anonymous The Anon FTP option permits anonymous FTP connections. Select this option to make this machine an anonymous FTP server. Be aware of the security risks involved with this option. Another menu will be displayed to explain the security risks and configuration in depth. The Gateway configuration menu will set the machine up to be a gateway as explained previously. This can be used to unset the Gateway option if you accidentally selected it during the installation process. The Inetd option can be used to configure or completely disable the &man.inetd.8; daemon as discussed above. The Mail option is used to configure the system's default MTA or Mail Transfer Agent. Selecting this option will bring up the following menu:
Select a default MTA
Here you are offered a choice as to which MTA to install and set as the default. An MTA is nothing more than a mail server which delivers email to users on the system or the Internet. Selecting Sendmail will install the popular sendmail server which is the &os; default. The Sendmail local option will set sendmail to be the default MTA, but disable its ability to receive incoming email from the Internet. The other options here, Postfix and Exim, act similarly to Sendmail. They both deliver email; however, some users prefer these alternatives to the sendmail MTA. After selecting an MTA, or choosing not to select an MTA, the network configuration menu will appear with the next option being NFS client. The NFS client option will configure the system to communicate with a server via NFS. An NFS server makes file systems available to other machines on the network via the NFS protocol. If this is a stand-alone machine, this option can remain unselected. The system may require more configuration later; see the NFS section of this book for more information about client and server configuration. Below that option is the NFS server option, permitting you to set the system up as an NFS server. This adds the required information to start up the RPC (remote procedure call) services. RPC is used to coordinate connections between hosts and programs. Next in line is the Ntpdate option, which deals with time synchronization. When selected, a menu like the one below shows up:
Ntpdate Configuration
From this menu, select the server which is the closest to your location. Selecting a close one will make the time synchronization more accurate as a server further from your location may have more connection latency. The next option is the PCNFSD selection. This option will install the net/pcnfsd package from the Ports Collection. This is a useful utility which provides NFS authentication services for systems which are unable to provide their own, such as Microsoft's &ms-dos; operating system. Now you must scroll down a bit to see the other options:
Network Configuration Lower-level
The &man.rpcbind.8;, &man.rpc.statd.8;, and &man.rpc.lockd.8; utilities are all used for Remote Procedure Calls (RPC). The rpcbind utility manages communication between NFS servers and clients, and is required for NFS servers to operate correctly. The rpc.statd daemon interacts with the rpc.statd daemon on other hosts to provide status monitoring. The reported status is usually held in the /var/db/statd.status file. The next option listed here is the rpc.lockd option, which, when selected, will provide file locking services. This is usually used with rpc.statd to monitor what hosts are requesting locks and how frequently they request them. While these last two options are marvelous for debugging, they are not required for NFS servers and clients to operate correctly. As you progress down the list, the next item is Routed, which is the routing daemon. The &man.routed.8; utility manages network routing tables, discovers multicast routers, and supplies a copy of the routing tables to any physically connected host on the network upon request. This is mainly used for machines which act as a gateway for the local network. When selected, a menu will be presented requesting the default location of the utility. The default location is already defined for you and can be selected with the Enter key. You will then be presented with yet another menu, this time asking for the flags you wish to pass on to routed. The default is -q and it should already appear on the screen. Next in line is the Rwhod option which, when selected, will start the &man.rwhod.8; daemon during system initialization. The rwhod utility broadcasts system messages across the network periodically, or collects them when in consumer mode. More information can be found in the &man.ruptime.1; and &man.rwho.1; manual pages. The next-to-last option in the list is for the &man.sshd.8; daemon. This is the secure shell server for OpenSSH and it is highly recommended over the standard telnet and FTP servers. The sshd server is used to create a secure connection from one host to another by using encrypted connections. Finally, there is the TCP Extensions option. This enables the TCP Extensions defined in RFC 1323 and RFC 1644. While on many hosts this can speed up connections, it can also cause some connections to be dropped. It is not recommended for servers, but may be beneficial for stand-alone machines. Now that you have configured the network services, you can scroll up to the very top item which is Exit and continue on to the next configuration section.
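All of the services above map onto &man.rc.conf.5; variables, so a choice made in this menu can be reviewed or changed later by editing that file. The following is a hedged sketch of entries matching the options just discussed (variable names are the standard rc.conf knobs of this era; the time server is a placeholder):

ntpdate_enable="YES"
ntpdate_flags="ntp.example.com"
router_enable="YES"
router_flags="-q"
rwhod_enable="YES"
sshd_enable="YES"
tcp_extensions="YES"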
Configure X Server As of &os; 5.3-RELEASE, the X server configuration facility has been removed from sysinstall; you must install and configure the X server after &os; has been installed. More information regarding the installation and the configuration of an X server can be found in the X Window System chapter of this book. You can skip this section unless you are installing a &os; version prior to 5.3-RELEASE. In order to use a graphical user interface such as KDE, GNOME, or others, the X server will need to be configured. In order to run &xfree86; as a non-root user you will need to have x11/wrapper installed. This is installed by default beginning with FreeBSD 4.7. For earlier versions this can be added from the Package Selection menu. To see whether your video card is supported, check the &xfree86; web site. User Confirmation Requested Would you like to configure your X server at this time? [ Yes ] No It is necessary to know your monitor specifications and video card information. Equipment damage can occur if settings are incorrect. If you do not have this information, select &gui.no; and perform the configuration after installation, when you have the information, by running sysinstall (/stand/sysinstall in &os; versions older than 5.2), selecting Configure and then XFree86. Improper configuration of the X server at this time can leave the machine in a frozen state. It is often advised to configure the X server once the installation has completed. If you have graphics card and monitor information, select &gui.yes; and press Enter to proceed with configuring the X server.
Select Configuration Method Menu
There are several ways to configure the X server. Use the arrow keys to select one of the methods and press Enter. Be sure to read all instructions carefully. The xf86cfg and xf86cfg -textmode methods may make the screen go dark and take a few seconds to start. Be patient. The following will illustrate the use of the xf86config configuration tool. The configuration choices you make will depend on the hardware in the system so your choices will probably be different than those shown: Message You have configured and been running the mouse daemon. Choose "/dev/sysmouse" as the mouse port and "SysMouse" or "MouseSystems" as the mouse protocol in the X configuration utility. [ OK ] [ Press enter to continue ] This indicates that the mouse daemon previously configured has been detected. Press Enter to continue. Starting xf86config will display a brief introduction: This program will create a basic XF86Config file, based on menu selections you make. The XF86Config file usually resides in /usr/X11R6/etc/X11 or /etc/X11. A sample XF86Config file is supplied with XFree86; it is configured for a standard VGA card and monitor with 640x480 resolution. This program will ask for a pathname when it is ready to write the file. You can either take the sample XF86Config as a base and edit it for your configuration, or let this program produce a base XF86Config file for your configuration and fine-tune it. Before continuing with this program, make sure you know what video card you have, and preferably also the chipset it uses and the amount of video memory on your video card. SuperProbe may be able to help with this. Press enter to continue, or ctrl-c to abort. Pressing Enter will start the mouse configuration. Be sure to follow the instructions and use Mouse Systems as the mouse protocol and /dev/sysmouse as the mouse port even if using a PS/2 mouse is shown as an illustration. First specify a mouse protocol type. Choose one from the following list: 1. Microsoft compatible (2-button protocol) - 2. Mouse Systems (3-button protocol) & FreeBSD moused protocol + 2. Mouse Systems (3-button protocol) & FreeBSD moused protocol 3. Bus Mouse 4. PS/2 Mouse 5. Logitech Mouse (serial, old type, Logitech protocol) 6. Logitech MouseMan (Microsoft compatible) 7. MM Series 8. MM HitTablet 9. Microsoft IntelliMouse If you have a two-button mouse, it is most likely of type 1, and if you have a three-button mouse, it can probably support both protocol 1 and 2. There are two main varieties of the latter type: mice with a switch to select the protocol, and mice that default to 1 and require a button to be held at boot-time to select protocol 2. Some mice can be convinced to do 2 by sending a special sequence to the serial port (see the ClearDTR/ClearRTS options). Enter a protocol number: 2 You have selected a Mouse Systems protocol mouse. If your mouse is normally in Microsoft-compatible mode, enabling the ClearDTR and ClearRTS options may cause it to switch to Mouse Systems mode when the server starts. Please answer the following question with either 'y' or 'n'. Do you want to enable ClearDTR and ClearRTS? n You have selected a three-button mouse protocol. It is recommended that you do not enable Emulate3Buttons, unless the third button doesn't work. Please answer the following question with either 'y' or 'n'. Do you want to enable Emulate3Buttons? y Now give the full device name that the mouse is connected to, for example /dev/tty00. Just pressing enter will use the default, /dev/mouse. 
On FreeBSD, the default is /dev/sysmouse. Mouse device: /dev/sysmouse The keyboard is the next item to be configured. A generic 101-key model is shown for illustration. Any name may be used for the variant or simply press Enter to accept the default value. Please select one of the following keyboard types that is the better description of your keyboard. If nothing really matches, choose 1 (Generic 101-key PC) 1 Generic 101-key PC 2 Generic 102-key (Intl) PC 3 Generic 104-key PC 4 Generic 105-key (Intl) PC 5 Dell 101-key PC 6 Everex STEPnote 7 Keytronic FlexPro 8 Microsoft Natural 9 Northgate OmniKey 101 10 Winbook Model XP5 11 Japanese 106-key 12 PC-98xx Series 13 Brazilian ABNT2 14 HP Internet 15 Logitech iTouch 16 Logitech Cordless Desktop Pro 17 Logitech Internet Keyboard 18 Logitech Internet Navigator Keyboard 19 Compaq Internet 20 Microsoft Natural Pro 21 Genius Comfy KB-16M 22 IBM Rapid Access 23 IBM Rapid Access II 24 Chicony Internet Keyboard 25 Dell Internet Keyboard Enter a number to choose the keyboard. 1 Please select the layout corresponding to your keyboard 1 U.S. English 2 U.S. English w/ ISO9995-3 3 U.S. English w/ deadkeys 4 Albanian 5 Arabic 6 Armenian 7 Azerbaidjani 8 Belarusian 9 Belgian 10 Bengali 11 Brazilian 12 Bulgarian 13 Burmese 14 Canadian 15 Croatian 16 Czech 17 Czech (qwerty) 18 Danish Enter a number to choose the country. Press enter for the next page 1 Please enter a variant name for 'us' layout. Or just press enter for default variant us Please answer the following question with either 'y' or 'n'. Do you want to select additional XKB options (group switcher, group indicator, etc.)? n Next, we proceed to the configuration for the monitor. Do not exceed the ratings of your monitor. Damage could occur. If you have any doubts, do the configuration after you have the information. Now we want to set the specifications of the monitor. The two critical parameters are the vertical refresh rate, which is the rate at which the whole screen is refreshed, and most importantly the horizontal sync rate, which is the rate at which scanlines are displayed. The valid range for horizontal sync and vertical sync should be documented in the manual of your monitor. If in doubt, check the monitor database /usr/X11R6/lib/X11/doc/Monitors to see if your monitor is there. Press enter to continue, or ctrl-c to abort. You must indicate the horizontal sync range of your monitor. You can either select one of the predefined ranges below that correspond to industry- standard monitor types, or give a specific range. It is VERY IMPORTANT that you do not specify a monitor type with a horizontal sync range that is beyond the capabilities of your monitor. If in doubt, choose a conservative setting. hsync in kHz; monitor type with characteristic modes 1 31.5; Standard VGA, 640x480 @ 60 Hz 2 31.5 - 35.1; Super VGA, 800x600 @ 56 Hz 3 31.5, 35.5; 8514 Compatible, 1024x768 @ 87 Hz interlaced (no 800x600) 4 31.5, 35.15, 35.5; Super VGA, 1024x768 @ 87 Hz interlaced, 800x600 @ 56 Hz 5 31.5 - 37.9; Extended Super VGA, 800x600 @ 60 Hz, 640x480 @ 72 Hz 6 31.5 - 48.5; Non-Interlaced SVGA, 1024x768 @ 60 Hz, 800x600 @ 72 Hz 7 31.5 - 57.0; High Frequency SVGA, 1024x768 @ 70 Hz 8 31.5 - 64.3; Monitor that can do 1280x1024 @ 60 Hz 9 31.5 - 79.0; Monitor that can do 1280x1024 @ 74 Hz 10 31.5 - 82.0; Monitor that can do 1280x1024 @ 76 Hz 11 Enter your own horizontal sync range Enter your choice (1-11): 6 You must indicate the vertical sync range of your monitor. 
You can either select one of the predefined ranges below that correspond to industry- standard monitor types, or give a specific range. For interlaced modes, the number that counts is the high one (e.g. 87 Hz rather than 43 Hz). 1 50-70 2 50-90 3 50-100 4 40-150 5 Enter your own vertical sync range Enter your choice: 2 You must now enter a few identification/description strings, namely an identifier, a vendor name, and a model name. Just pressing enter will fill in default names. The strings are free-form, spaces are allowed. Enter an identifier for your monitor definition: Hitachi The selection of a video card driver from a list is next. If you pass your card on the list, continue to press Enter and the list will repeat. Only an excerpt from the list is shown: Now we must configure video card specific settings. At this point you can choose to make a selection out of a database of video card definitions. Because there can be variation in Ramdacs and clock generators even between cards of the same model, it is not sensible to blindly copy the settings (e.g. a Device section). For this reason, after you make a selection, you will still be asked about the components of the card, with the settings from the chosen database entry presented as a strong hint. The database entries include information about the chipset, what driver to run, the Ramdac and ClockChip, and comments that will be included in the Device section. However, a lot of definitions only hint about what driver to run (based on the chipset the card uses) and are untested. If you can't find your card in the database, there's nothing to worry about. You should only choose a database entry that is exactly the same model as your card; choosing one that looks similar is just a bad idea (e.g. a GemStone Snail 64 may be as different from a GemStone Snail 64+ in terms of hardware as can be). Do you want to look at the card database? y 288 Matrox Millennium G200 8MB mgag200 289 Matrox Millennium G200 SD 16MB mgag200 290 Matrox Millennium G200 SD 4MB mgag200 291 Matrox Millennium G200 SD 8MB mgag200 292 Matrox Millennium G400 mgag400 293 Matrox Millennium II 16MB mga2164w 294 Matrox Millennium II 4MB mga2164w 295 Matrox Millennium II 8MB mga2164w 296 Matrox Mystique mga1064sg 297 Matrox Mystique G200 16MB mgag200 298 Matrox Mystique G200 4MB mgag200 299 Matrox Mystique G200 8MB mgag200 300 Matrox Productiva G100 4MB mgag100 301 Matrox Productiva G100 8MB mgag100 302 MediaGX mediagx 303 MediaVision Proaxcel 128 ET6000 304 Mirage Z-128 ET6000 305 Miro CRYSTAL VRX Verite 1000 Enter a number to choose the corresponding card definition. Press enter for the next page, q to continue configuration. 288 Your selected card definition: Identifier: Matrox Millennium G200 8MB Chipset: mgag200 Driver: mga Do NOT probe clocks or use any Clocks line. Press enter to continue, or ctrl-c to abort. Now you must give information about your video card. This will be used for the "Device" section of your video card in XF86Config. You must indicate how much video memory you have. It is probably a good idea to use the same approximate amount as that detected by the server you intend to use. If you encounter problems that are due to the used server not supporting the amount memory you have (e.g. ATI Mach64 is limited to 1024K with the SVGA server), specify the maximum amount supported by the server. 
How much video memory do you have on your video card: 1 256K 2 512K 3 1024K 4 2048K 5 4096K 6 Other Enter your choice: 6 Amount of video memory in Kbytes: 8192 You must now enter a few identification/description strings, namely an identifier, a vendor name, and a model name. Just pressing enter will fill in default names (possibly from a card definition). Your card definition is Matrox Millennium G200 8MB. The strings are free-form, spaces are allowed. Enter an identifier for your video card definition: Next, the video modes are set for the resolutions desired. Typically, useful ranges are 640x480, 800x600, and 1024x768 but those are a function of video card capability, monitor size, and eye comfort. When selecting a color depth, select the highest mode that your card will support. For each depth, a list of modes (resolutions) is defined. The default resolution that the server will start-up with will be the first listed mode that can be supported by the monitor and card. Currently it is set to: "640x480" "800x600" "1024x768" "1280x1024" for 8-bit "640x480" "800x600" "1024x768" "1280x1024" for 16-bit "640x480" "800x600" "1024x768" "1280x1024" for 24-bit Modes that cannot be supported due to monitor or clock constraints will be automatically skipped by the server. 1 Change the modes for 8-bit (256 colors) 2 Change the modes for 16-bit (32K/64K colors) 3 Change the modes for 24-bit (24-bit color) 4 The modes are OK, continue. Enter your choice: 2 Select modes from the following list: 1 "640x400" 2 "640x480" 3 "800x600" 4 "1024x768" 5 "1280x1024" 6 "320x200" 7 "320x240" 8 "400x300" 9 "1152x864" a "1600x1200" b "1800x1400" c "512x384" Please type the digits corresponding to the modes that you want to select. For example, 432 selects "1024x768" "800x600" "640x480", with a default mode of 1024x768. Which modes? 432 You can have a virtual screen (desktop), which is screen area that is larger than the physical screen and which is panned by moving the mouse to the edge of the screen. If you don't want virtual desktop at a certain resolution, you cannot have modes listed that are larger. Each color depth can have a differently-sized virtual screen Please answer the following question with either 'y' or 'n'. Do you want a virtual screen that is larger than the physical screen? n For each depth, a list of modes (resolutions) is defined. The default resolution that the server will start-up with will be the first listed mode that can be supported by the monitor and card. Currently it is set to: "640x480" "800x600" "1024x768" "1280x1024" for 8-bit "1024x768" "800x600" "640x480" for 16-bit "640x480" "800x600" "1024x768" "1280x1024" for 24-bit Modes that cannot be supported due to monitor or clock constraints will be automatically skipped by the server. 1 Change the modes for 8-bit (256 colors) 2 Change the modes for 16-bit (32K/64K colors) 3 Change the modes for 24-bit (24-bit color) 4 The modes are OK, continue. Enter your choice: 4 Please specify which color depth you want to use by default: 1 1 bit (monochrome) 2 4 bits (16 colors) 3 8 bits (256 colors) 4 16 bits (65536 colors) 5 24 bits (16 million colors) Enter a number to choose the default depth. 4 Finally, the configuration needs to be saved. Be sure to enter /etc/X11/XF86Config as the location for saving the configuration. I am going to write the XF86Config file now. Make sure you don't accidently overwrite a previously configured one. Shall I write it to /etc/X11/XF86Config? 
y If the configuration fails, you can try the configuration again by selecting &gui.yes; when the following message appears: User Confirmation Requested The XFree86 configuration process seems to have failed. Would you like to try again? [ Yes ] No If you have trouble configuring &xfree86;, select &gui.no;, press Enter, and continue with the installation process. After installation you can use xf86cfg -textmode or xf86config to access the command line configuration utilities as root. There is an additional method for configuring &xfree86; described in the X Window System chapter of this book. If you choose not to configure &xfree86; at this time, the next menu will be for package selection. The default setting which allows the server to be killed is the hotkey sequence Ctrl+Alt+Backspace. This can be executed if something is wrong with the server settings, and can prevent hardware damage. The default setting that allows video mode switching will permit changing of the mode while running X with the hotkey sequence Ctrl+Alt+plus or Ctrl+Alt+minus. After you have &xfree86; running, the display can be adjusted for height, width, or centering by using xvidtune. There are warnings that improper settings can damage your equipment. Heed them. If in doubt, do not do it. Instead, use the monitor controls to adjust the display for the X Window System. There may be some display differences when switching back to text mode, but it is better than damaging equipment. Read the &man.xvidtune.1; manual page before making any adjustments. Following a successful &xfree86; configuration, it will proceed to the selection of a default desktop.
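For reference, the answers given in the walkthrough above would produce XF86Config sections along these lines. This is an abbreviated, illustrative excerpt, not a complete working file:

Section "Monitor"
    Identifier  "Hitachi"
    HorizSync   31.5 - 48.5
    VertRefresh 50 - 90
EndSection

Section "Device"
    Identifier  "Matrox Millennium G200 8MB"
    Driver      "mga"
    VideoRam    8192
EndSection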
Select Default X Desktop As of &os; 5.3-RELEASE, the X desktop selection facility has been removed from sysinstall; you have to configure the X desktop after the installation of &os;. More information regarding the installation and the configuration of an X desktop can be found in . You can skip this section if you are installing &os; 5.3-RELEASE or later. There are a variety of window managers available. They range from very basic environments to full desktop environments with a large suite of software. Some require only minimal disk space and little memory, while others with more features require much more. The best way to determine which is most suitable for you is to try a few different ones. They are available from the Ports Collection or as packages and can be added after installation. You can select one of the popular desktops to be installed and configured as the default desktop. This will allow you to start it right after installation.
Select Default Desktop
Use the arrow keys to select a desktop and press Enter. Installation of the selected desktop will proceed.
Install Packages Packages are pre-compiled binaries and are a convenient way to install software. Installation of one package is shown for purposes of illustration. Additional packages can also be added at this time if desired. After installation sysinstall (/stand/sysinstall in &os; versions older than 5.2) can be used to add additional packages. User Confirmation Requested The FreeBSD package collection is a collection of hundreds of ready-to-run applications, from text editors to games to WEB servers and more. Would you like to browse the collection now? [ Yes ] No Selecting &gui.yes; and pressing Enter will be followed by the Package Selection screens:
Select Package Category
Only packages on the current installation media are available for installation at any given time. All available packages will be displayed if All is selected, or you can select a particular category. Highlight your selection with the arrow keys and press Enter. A menu will appear showing all the packages available for the selection made:
Select Packages
The bash shell is shown selected. Select as many packages as desired by highlighting each package and pressing the Space key. A short description of each package will appear in the lower left corner of the screen. Pressing the Tab key will toggle between the last selected package, &gui.ok;, and &gui.cancel;. When you have finished marking the packages for installation, press Tab once to toggle to &gui.ok; and press Enter to return to the Package Selection menu. The left and right arrow keys will also toggle between &gui.ok; and &gui.cancel;, and can likewise be used to select &gui.ok; before pressing Enter to return to the Package Selection menu.
Install Packages
Use the Tab and arrow keys to select [ Install ] and press Enter. You will then need to confirm that you want to install the packages:
Confirm Package Installation
Selecting &gui.ok; and pressing Enter will start the package installation. Installation messages will appear until it completes; make note of any error messages. The final configuration continues after the packages are installed. If you end up not selecting any packages and wish to return to the final configuration, select Install anyway.
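If you prefer the command line, packages can also be added over the network after installation with &man.pkg.add.1;. A minimal sketch, assuming a working Internet connection (bash is just an example package name):
&prompt.root; pkg_add -r bash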
Add Users/Groups You should add at least one user during the installation so that you can use the system without being logged in as root. The root partition is generally small and running applications as root can quickly fill it. A bigger danger is noted below: User Confirmation Requested Would you like to add any initial user accounts to the system? Adding at least one account for yourself at this stage is suggested since working as the "root" user is dangerous (it is easy to do things which adversely affect the entire system). [ Yes ] No Select &gui.yes; and press Enter to continue with adding a user.
Select User
Select User with the arrow keys and press Enter.
Add User Information
The following descriptions will appear in the lower part of the screen as the items are selected with Tab to assist with entering the required information: Login ID The login name of the new user (mandatory). UID The numerical ID for this user (leave blank for automatic choice). Group The login group name for this user (leave blank for automatic choice). Password The password for this user (enter this field with care!). Full name The user's full name (comment). Member groups The groups this user belongs to (i.e. gets access rights for). Home directory The user's home directory (leave blank for default). Login shell The user's login shell (leave blank for default, e.g. /bin/sh). The login shell was changed from /bin/sh to /usr/local/bin/bash to use the bash shell that was previously installed as a package. Do not try to use a shell that does not exist or you will not be able to log in. The most common shell used in the BSD world is the C shell, which can be indicated as /bin/tcsh. The user was also added to the wheel group to be able to become a superuser with root privileges. When you are satisfied, press &gui.ok; and the User and Group Management menu will redisplay:
Exit User and Group Management
Groups can also be added at this time if specific needs are known. Otherwise, this may be done using sysinstall (/stand/sysinstall in &os; versions older than 5.2) after the installation is completed. When you are finished adding users, select Exit with the arrow keys and press Enter to continue the installation.
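An account like the one created above can also be added from the command line after installation with &man.pw.8;. A minimal sketch (the username, group, and shell here are examples only):
&prompt.root; pw useradd rpratt -m -G wheel -s /usr/local/bin/bash
The -m flag creates the home directory, and -G wheel puts the user in the wheel group so that su will work.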
Set the <username>root</username> Password Message Now you must set the system manager's password. This is the password you'll use to log in as "root". [ OK ] [ Press enter to continue ] Press Enter to set the root password. The password will need to be typed in correctly twice. Needless to say, make sure you have a way of recovering the password if you forget it. Notice that the password you type in is not echoed, nor are asterisks displayed. Changing local password for root. New password : Retype new password : The installation will continue after the password is successfully entered. Exiting Install If you need to configure additional network devices or perform any other configuration, you can do so at this point or after installation with sysinstall (/stand/sysinstall in &os; versions older than 5.2). User Confirmation Requested Visit the general configuration menu for a chance to set any last options? Yes [ No ] Select &gui.no; with the arrow keys and press Enter to return to the Main Installation Menu.
Exit Install
Select [X Exit Install] with the arrow keys and press Enter. You will be asked to confirm exiting the installation: User Confirmation Requested Are you sure you wish to exit? The system will reboot (be sure to remove any floppies from the drives). [ Yes ] No Select &gui.yes; and, if you booted from the floppy, remove it. The CDROM drive is locked until the machine starts to reboot; it is then unlocked and the disc can be removed from the drive (quickly). The system will reboot, so watch for any error messages that may appear.
FreeBSD Bootup FreeBSD Bootup on the &i386; If everything went well, you will see messages scroll off the screen and you will arrive at a login prompt. You can view the content of the messages by pressing Scroll-Lock and using PgUp and PgDn. Pressing Scroll-Lock again will return to the prompt. The entire message may not display (buffer limitation) but it can be viewed from the command line after logging in by typing dmesg at the prompt. Login using the username/password you set during installation (rpratt, in this example). Avoid logging in as root except when necessary. Typical boot messages (version information omitted): Copyright (c) 1992-2002 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. Timecounter "i8254" frequency 1193182 Hz CPU: AMD-K6(tm) 3D processor (300.68-MHz 586-class CPU) Origin = "AuthenticAMD" Id = 0x580 Stepping = 0 Features=0x8001bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,MMX> AMD Features=0x80000800<SYSCALL,3DNow!> real memory = 268435456 (262144K bytes) config> di sn0 config> di lnc0 config> di le0 config> di ie0 config> di fe0 config> di cs0 config> di bt0 config> di aic0 config> di aha0 config> di adv0 config> q avail memory = 256311296 (250304K bytes) Preloaded elf kernel "kernel" at 0xc0491000. Preloaded userconfig_script "/boot/kernel.conf" at 0xc049109c. md0: Malloc disk Using $PIR table, 4 entries at 0xc00fde60 npx0: <math processor> on motherboard npx0: INT 16 interface pcib0: <Host to PCI bridge> on motherboard pci0: <PCI bus> on pcib0 pcib1: <VIA 82C598MVP (Apollo MVP3) PCI-PCI (AGP) bridge> at device 1.0 on pci0 pci1: <PCI bus> on pcib1 pci1: <Matrox MGA G200 AGP graphics accelerator> at 0.0 irq 11 isab0: <VIA 82C586 PCI-ISA bridge> at device 7.0 on pci0 isa0: <ISA bus> on isab0 atapci0: <VIA 82C586 ATA33 controller> port 0xe000-0xe00f at device 7.1 on pci0 ata0: at 0x1f0 irq 14 on atapci0 ata1: at 0x170 irq 15 on atapci0 uhci0: <VIA 83C572 USB controller> port 0xe400-0xe41f irq 10 at device 7.2 on pci0 usb0: <VIA 83C572 USB controller> on uhci0 usb0: USB revision 1.0 uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 2 ports with 2 removable, self powered chip1: <VIA 82C586B ACPI interface> at device 7.3 on pci0 ed0: <NE2000 PCI Ethernet (RealTek 8029)> port 0xe800-0xe81f irq 9 at device 10.0 on pci0 ed0: address 52:54:05:de:73:1b, type NE2000 (16 bit) isa0: too many dependant configs (8) isa0: unexpected small tag 14 fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0 fdc0: FIFO enabled, 8 bytes threshold fd0: <1440-KB 3.5" drive> on fdc0 drive 0 atkbdc0: <keyboard controller (i8042)> at port 0x60-0x64 on isa0 atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0 kbd0 at atkbd0 psm0: <PS/2 Mouse> irq 12 on atkbdc0 psm0: model Generic PS/2 mouse, device ID 0 vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 sc0: <System console> at flags 0x1 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 sio0: type 16550A sio1 at port 0x2f8-0x2ff irq 3 on isa0 sio1: type 16550A ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0 ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode ppc0: FIFO with 16/16/15 bytes threshold ppbus0: IEEE1284 device found /NIBBLE Probing for PnP devices on ppbus0: plip0: <PLIP network interface> on ppbus0 lpt0: <Printer> on ppbus0 lpt0: Interrupt-driven port ppi0: <Parallel I/O> on ppbus0 ad0: 8063MB <IBM-DHEA-38451> 
[16383/16/63] at ata0-master using UDMA33 ad2: 8063MB <IBM-DHEA-38451> [16383/16/63] at ata1-master using UDMA33 acd0: CDROM <DELTA OTC-H101/ST3 F/W by OIPD> at ata0-slave using PIO4 Mounting root from ufs:/dev/ad0s1a swapon: adding /dev/ad0s1b as swap device Automatic boot in progress... /dev/ad0s1a: FILESYSTEM CLEAN; SKIPPING CHECKS /dev/ad0s1a: clean, 48752 free (552 frags, 6025 blocks, 0.9% fragmentation) /dev/ad0s1f: FILESYSTEM CLEAN; SKIPPING CHECKS /dev/ad0s1f: clean, 128997 free (21 frags, 16122 blocks, 0.0% fragmentation) /dev/ad0s1g: FILESYSTEM CLEAN; SKIPPING CHECKS /dev/ad0s1g: clean, 3036299 free (43175 frags, 374073 blocks, 1.3% fragmentation) /dev/ad0s1e: filesystem CLEAN; SKIPPING CHECKS /dev/ad0s1e: clean, 128193 free (17 frags, 16022 blocks, 0.0% fragmentation) Doing initial network setup: hostname. ed0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500 inet 192.168.0.1 netmask 0xffffff00 broadcast 192.168.0.255 inet6 fe80::5054::5ff::fede:731b%ed0 prefixlen 64 tentative scopeid 0x1 ether 52:54:05:de:73:1b lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x8 inet6 ::1 prefixlen 128 inet 127.0.0.1 netmask 0xff000000 Additional routing options: IP gateway=YES TCP keepalive=YES routing daemons:. additional daemons: syslogd. Doing additional network setup:. Starting final network daemons: creating ssh RSA host key Generating public/private rsa1 key pair. Your identification has been saved in /etc/ssh/ssh_host_key. Your public key has been saved in /etc/ssh/ssh_host_key.pub. The key fingerprint is: cd:76:89:16:69:0e:d0:6e:f8:66:d0:07:26:3c:7e:2d root@k6-2.example.com creating ssh DSA host key Generating public/private dsa key pair. Your identification has been saved in /etc/ssh/ssh_host_dsa_key. Your public key has been saved in /etc/ssh/ssh_host_dsa_key.pub. The key fingerprint is: f9:a1:a9:47:c4:ad:f9:8d:52:b8:b8:ff:8c:ad:2d:e6 root@k6-2.example.com. setting ELF ldconfig path: /usr/lib /usr/lib/compat /usr/X11R6/lib /usr/local/lib a.out ldconfig path: /usr/lib/aout /usr/lib/compat/aout /usr/X11R6/lib/aout starting standard daemons: inetd cron sshd usbd sendmail. Initial rc.i386 initialization:. rc.i386 configuring syscons: blank_time screensaver moused. Additional ABI support: linux. Local package initialization:. Additional TCP options:. FreeBSD/i386 (k6-2.example.com) (ttyv0) login: rpratt Password: Generating the RSA and DSA keys may take some time on slower machines. This happens only on the initial boot-up of a new installation. Subsequent boots will be faster. If the X server has been configured and a Default Desktop chosen, it can be started by typing startx at the command line. Bootup of FreeBSD on the Alpha Alpha Once the install procedure has finished, you will be able to start FreeBSD by typing something like this at the SRM prompt: >>>BOOT DKC0 This instructs the firmware to boot the specified disk. To make FreeBSD boot automatically in the future, use these commands: >>> SET BOOT_OSFLAGS A >>> SET BOOT_FILE '' >>> SET BOOTDEF_DEV DKC0 >>> SET AUTO_ACTION BOOT The boot messages will be similar (but not identical) to those produced by FreeBSD booting on the &i386;. FreeBSD Shutdown It is important to shut down the operating system properly. Do not just turn off the power. First, become a superuser by typing su at the command line and entering the root password. This will work only if the user is a member of the wheel group. Otherwise, log in as root and use shutdown -h now. The operating system has halted.
Please press any key to reboot. It is safe to turn off the power after the shutdown command has been issued and the message Please press any key to reboot appears. If any key is pressed instead of turning off the power switch, the system will reboot. You could also use the Ctrl Alt Del key combination to reboot the system; however, this is not recommended during normal operation.
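For example, a typical shutdown session from a user account in the wheel group looks something like this (a sketch; output abbreviated):
&prompt.user; su
Password:
&prompt.root; shutdown -h now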
Supported Hardware hardware FreeBSD currently runs on a wide variety of ISA, VLB, EISA, and PCI bus-based PCs with Intel, AMD, Cyrix, or NexGen x86 processors, as well as a number of machines based on the Compaq Alpha processor. Support for generic IDE or ESDI drive configurations, various SCSI controllers, PCMCIA cards, USB devices, and network and serial cards is also provided. FreeBSD also supports IBM's MicroChannel (MCA) bus. A list of supported hardware is provided with each FreeBSD release in the FreeBSD Hardware Notes. This document can usually be found in a file named HARDWARE.TXT, in the top-level directory of a CDROM or FTP distribution or in sysinstall's documentation menu. It lists, for a given architecture, what hardware devices are known to be supported by each release of FreeBSD. Copies of the supported hardware list for various releases and architectures can also be found on the Release Information page of the FreeBSD Web site. Troubleshooting installation troubleshooting The following section covers basic installation troubleshooting, such as common problems people have reported. There are also a few questions and answers for people wishing to dual-boot FreeBSD with &ms-dos;. What to Do If Something Goes Wrong Due to various limitations of the PC architecture, it is impossible for probing to be 100% reliable; however, there are a few things you can do if it fails. Check the Hardware Notes document for your version of FreeBSD to make sure your hardware is supported. If your hardware is supported and you still experience lock-ups or other problems, reset your computer, and when the visual kernel configuration option is given, choose it. This will allow you to go through your hardware and supply information to the system about it. The kernel on the boot disks is configured assuming that most hardware devices are in their factory default configuration in terms of IRQs, IO addresses, and DMA channels. If your hardware has been reconfigured, you will most likely need to use the configuration editor to tell FreeBSD where to find things. It is also possible that a probe for a device not present will cause a later probe for another device that is present to fail. In that case, the probes for the conflicting driver(s) should be disabled. Some installation problems can be avoided or alleviated by updating the firmware on various hardware components, most notably the motherboard. The motherboard firmware may also be referred to as the BIOS, and most motherboard or computer manufacturers have a website where the upgrades and upgrade information may be located. Most manufacturers strongly advise against upgrading the motherboard BIOS unless there is a good reason for doing so, such as a critical update. The upgrade process can go wrong, causing permanent damage to the BIOS chip. Do not disable any drivers you will need during the installation, such as your screen (sc0). If the installation wedges or fails mysteriously after leaving the configuration editor, you have probably removed or changed something you should not have. Reboot and try again. In configuration mode, you can: List the device drivers installed in the kernel. Disable device drivers for hardware that is not present in your system. Change IRQs, DRQs, and IO port addresses used by a device driver. After adjusting the kernel to match your hardware configuration, type Q to boot with the new settings.
Once the installation has completed, any changes you made in the configuration mode will be permanent, so you do not have to reconfigure every time you boot. It is still highly likely that you will eventually want to build a custom kernel. Dealing with Existing &ms-dos; Partitions DOS Many users wish to install &os; on PCs inhabited by &microsoft;-based operating systems. For those instances, &os; has a utility known as FIPS. This utility can be found in the tools directory on the install CD-ROM, or downloaded from one of various &os; mirrors. The FIPS utility allows you to split an existing &ms-dos; partition into two pieces, preserving the original partition and allowing you to install onto the second free piece. You first need to defragment your &ms-dos; partition using the &windows; Disk Defragmenter utility (go into Explorer, right-click on the hard drive, and choose to defrag your hard drive), or use Norton Disk Tools. Now you can run the FIPS utility. It will prompt you for the rest of the information; just follow the on-screen instructions. Afterwards, you can reboot and install &os; on the new free slice. See the Distributions menu for an estimate of how much free space you will need for the kind of installation you want. There is also a very useful product from PowerQuest (http://www.powerquest.com) called &partitionmagic;. This application has far more functionality than FIPS, and is highly recommended if you plan to add/remove operating systems often. It does cost money, so if you plan to install &os; and keep it installed, FIPS will probably be fine for you. Using &ms-dos; and &windows; File Systems At this time, &os; does not support file systems compressed with the Double Space™ application. Therefore the file system will need to be uncompressed before &os; can access the data. This can be done by running the Compression Agent located in the Start > Programs > System Tools menu. &os; can support &ms-dos; based file systems. This requires that you use the &man.mount.msdos.8; command (in &os; 5.X, the command is &man.mount.msdosfs.8;) with the required parameters. The utility's most common usage is: &prompt.root; mount_msdos /dev/ad0s1 /mnt In this example, the &ms-dos; file system is located on the first partition of the primary hard disk. Your situation may be different; check the output from the dmesg and mount commands. They should produce enough information to give an idea of the partition layout. Extended &ms-dos; file systems are usually mapped after the &os; partitions. In other words, the slice number may be higher than the ones &os; is using. For instance, the first &ms-dos; partition may be /dev/ad0s1, the &os; partition may be /dev/ad0s2, with the extended &ms-dos; partition being located on /dev/ad0s3. To some, this can be confusing at first. NTFS partitions can also be mounted in a similar manner using the &man.mount.ntfs.8; command. Alpha User's Questions and Answers Alpha This section answers some commonly asked questions about installing FreeBSD on Alpha systems. Can I boot from the ARC or Alpha BIOS Console? ARC Alpha BIOS SRM No. &os;, like Compaq Tru64 and VMS, will only boot from the SRM console. Help, I have no space! Do I need to delete everything first? Unfortunately, yes. Can I mount my Compaq Tru64 or VMS filesystems? No, not at this time. Valentino Vaschetto Contributed by Advanced Installation Guide This section describes how to install FreeBSD in exceptional cases.
Installing FreeBSD on a System without a Monitor or Keyboard installation headless (serial console) serial console This type of installation is called a headless install, because the machine that you are trying to install FreeBSD on either does not have a monitor attached to it, or does not even have a VGA output. How is this possible, you ask? Using a serial console. A serial console is basically using another machine to act as the main display and keyboard for a system. To do this, just follow the steps to create installation floppies, explained in . To modify these floppies to boot into a serial console, follow these steps: Enabling the Boot Floppies to Boot into a Serial Console mount If you were to boot into the floppies that you just made, FreeBSD would boot into its normal install mode. We want FreeBSD to boot into a serial console for our install. To do this, you have to mount the kern.flp floppy onto your FreeBSD system using the &man.mount.8; command. &prompt.root; mount /dev/fd0 /mnt Now that you have the floppy mounted, you must change into the /mnt directory: &prompt.root; cd /mnt Here is where you must set the floppy to boot into a serial console. You have to make a file called boot.config containing /boot/loader -h. All this does is pass a flag to the bootloader to boot into a serial console. &prompt.root; echo "/boot/loader -h" > boot.config Now that you have your floppy configured correctly, you must unmount the floppy using the &man.umount.8; command: &prompt.root; cd / &prompt.root; umount /mnt Now you can remove the floppy from the floppy drive. Connecting Your Null-modem Cable null-modem cable You now need to connect a null-modem cable between the two machines. Just connect the cable to the serial ports of the two machines. A normal serial cable will not work here; you need a null-modem cable, which has some of its wires crossed over. Booting Up for the Install It is now time to go ahead and start the install. Put the kern.flp floppy in the floppy drive of the machine you are doing the headless install on, and power on the machine. Connecting to Your Headless Machine cu Now you have to connect to that machine with &man.cu.1;: &prompt.root; cu -l /dev/cuaa0 That's it! You should now be able to control the headless machine through your cu session. It will ask you to put in the mfsroot.flp, and then it will come up with a selection of what kind of terminal to use. Select the FreeBSD color console and proceed with your install! Preparing Your Own Installation Media To prevent repetition, FreeBSD disc in this context means a FreeBSD CDROM or DVD that you have purchased or produced yourself. There may be some situations in which you need to create your own FreeBSD installation media and/or source. This might be physical media, such as a tape, or a source that sysinstall can use to retrieve the files, such as a local FTP site, or an &ms-dos; partition. For example: You have many machines connected to your local network, and one FreeBSD disc. You want to create a local FTP site using the contents of the FreeBSD disc, and then have your machines use this local FTP site instead of needing to connect to the Internet. You have a FreeBSD disc, and FreeBSD does not recognize your CD/DVD drive, but &ms-dos;/&windows; does. You want to copy the FreeBSD installation files to a DOS partition on the same computer, and then install FreeBSD using those files.
The computer you want to install on does not have a CD/DVD drive or a network card, but you can connect a Laplink-style serial or parallel cable to a computer that does. You want to create a tape that can be used to install FreeBSD. Creating an Installation CDROM As part of each release, the FreeBSD project makes available two CDROM images (ISO images). These images can be written (burned) to CDs if you have a CD writer, and then used to install FreeBSD. If you have a CD writer, and bandwidth is cheap, then this is the easiest way to install FreeBSD. Download the Correct ISO Images The ISO images for each release can be downloaded from ftp://ftp.FreeBSD.org/pub/FreeBSD/ISO-IMAGES-arch/version or the closest mirror. Substitute arch and version as appropriate. That directory will normally contain the following images: FreeBSD 4.<replaceable>X</replaceable> ISO Image Names and Meanings Filename Contains version-RELEASE-arch-miniinst.iso Everything you need to install FreeBSD. version-RELEASE-arch-disc1.iso Everything you need to install FreeBSD, and as many additional third party packages as would fit on the disc. version-RELEASE-arch-disc2.iso A live filesystem, which is used in conjunction with the Repair facility in sysinstall. A copy of the FreeBSD CVS tree. As many additional third party packages as would fit on the disc.
FreeBSD 5.<replaceable>X</replaceable> ISO Image Names and Meanings Filename Contains version-RELEASE-arch-bootonly.iso Everything you need to boot into a FreeBSD kernel and start the installation interface. The installable files have to be pulled over FTP or some other supported source. version-RELEASE-arch-miniinst.iso Everything you need to install FreeBSD. version-RELEASE-arch-disc1.iso Everything you need to install &os; and a live filesystem, which is used in conjunction with the Repair facility in sysinstall. version-RELEASE-arch-disc2.iso &os; documentation and as many third party packages as would fit on the disc.
You must download either the miniinst ISO image or the image of disc one. Do not download both of them, since the disc one image contains everything that the miniinst ISO image contains. The miniinst ISO image is only available for releases prior to 5.4-RELEASE. Use the miniinst ISO if Internet access is cheap for you. It will let you install FreeBSD, and you can then install third party packages by downloading them using the ports/packages system (see ) as necessary. Use the image of disc one if you want to install a &os; 4.X release and want a reasonable selection of third party packages on the disc as well. The additional disc images are useful, but not essential, especially if you have high-speed access to the Internet.
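After downloading, it is a good idea to verify that the image is intact before burning it. On a &os; system, the checksum can be compared against the CHECKSUM.MD5 file usually published in the same download directory (the filename below is only an example):
&prompt.root; md5 5.4-RELEASE-i386-disc1.iso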
Write the CDs You must then write the CD images to disc. If you will be doing this on another FreeBSD system then see for more information (in particular, and ). If you will be doing this on another platform then you will need to use whatever utilities exist to control your CD writer on that platform. The images provided are in the standard ISO format, which many CD writing applications support.
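For instance, on a &os; system with an ATAPI CD writer, the image can be burned with &man.burncd.8; along these lines (a sketch; the device and filename are examples):
&prompt.root; burncd -f /dev/acd0 data 5.4-RELEASE-i386-disc1.iso fixate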
If you are interested in building a customized release of FreeBSD, please see the Release Engineering Article.
Creating a Local FTP Site with a FreeBSD Disc installation network FTP FreeBSD discs are laid out in the same way as the FTP site. This makes it very easy for you to create a local FTP site that can be used by other machines on your network when installing FreeBSD. On the FreeBSD computer that will host the FTP site, ensure that the CDROM is in the drive, and mounted on /cdrom. &prompt.root; mount /cdrom Create an account for anonymous FTP in /etc/passwd. Do this by editing /etc/passwd using &man.vipw.8; and adding this line: ftp:*:99:99::0:0:FTP:/cdrom:/nonexistent Ensure that the FTP service is enabled in /etc/inetd.conf. Anyone with network connectivity to your machine can now choose a media type of FTP and type in ftp://your machine after picking Other in the FTP sites menu during the install. If the boot media (floppy disks, usually) for your FTP clients is not precisely the same version as that provided by the local FTP site, then sysinstall will not let you complete the installation. If the versions are not similar and you want to override this, you must go into the Options menu and change distribution name to any. This approach is OK for a machine that is on your local network, and that is protected by your firewall. Offering up FTP services to other machines over the Internet (and not your local network) exposes your computer to the attention of crackers and other undesirables. We strongly recommend that you follow good security practices if you do this. Creating Installation Floppies installation floppies If you must install from floppy disk (which we suggest you do not do), either due to unsupported hardware or simply because you insist on doing things the hard way, you must first prepare some floppies for the installation. At a minimum, you will need as many 1.44 MB or 1.2 MB floppies as it takes to hold all the files in the bin (binary distribution) directory. If you are preparing the floppies from DOS, then they must be formatted using the &ms-dos; FORMAT command. If you are using &windows;, use Explorer to format the disks (right-click on the A: drive, and select Format). Do not trust factory pre-formatted floppies. Format them again yourself, just to be sure. Many problems reported by our users in the past have resulted from the use of improperly formatted media, which is why we are making a point of it now. If you are creating the floppies on another FreeBSD machine, a format is still not a bad idea, though you do not need to put a DOS filesystem on each floppy. You can use the disklabel and newfs commands to put a UFS filesystem on them instead, as the following sequence of commands (for a 3.5" 1.44 MB floppy) illustrates: &prompt.root; fdformat -f 1440 fd0.1440 &prompt.root; disklabel -w -r fd0.1440 floppy3 &prompt.root; newfs -t 2 -u 18 -l 1 -i 65536 /dev/fd0 Use fd0.1200 and floppy5 for 5.25" 1.2 MB disks. Then you can mount and write to them like any other filesystem. After you have formatted the floppies, you will need to copy the files to them. The distribution files are split into chunks conveniently sized so that five of them will fit on a conventional 1.44 MB floppy. Go through all your floppies, packing as many files as will fit on each one, until you have all of the distributions you want packed up in this fashion. Each distribution should go into a subdirectory on the floppy, e.g.: a:\bin\bin.aa, a:\bin\bin.ab, and so on. Once you come to the Media screen during the install process, select Floppy and you will be prompted for the rest.
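As an illustration, once a floppy has been given a UFS filesystem as shown above, the distribution chunks can be copied onto it like this (a sketch; the actual filenames depend on how the distribution was split):
&prompt.root; mount /dev/fd0 /mnt
&prompt.root; mkdir /mnt/bin
&prompt.root; cp bin.aa bin.ab bin.ac bin.ad bin.ae /mnt/bin
&prompt.root; umount /mnt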
Installing from an &ms-dos; Partition installation from MS-DOS To prepare for an installation from an &ms-dos; partition, copy the files from the distribution into a directory called freebsd in the root directory of the partition. For example, c:\freebsd. The directory structure of the CDROM or FTP site must be partially reproduced within this directory, so we suggest using the DOS xcopy command if you are copying it from a CD. For example, to prepare for a minimal installation of FreeBSD: C:\> md c:\freebsd C:\> xcopy e:\bin c:\freebsd\bin\ /s C:\> xcopy e:\manpages c:\freebsd\manpages\ /s Assuming that C: is where you have free space and E: is where your CDROM is mounted. If you do not have a CDROM drive, you can download the distribution from ftp.FreeBSD.org. Each distribution is in its own directory; for example, the base distribution can be found in the &rel.current;/base/ directory. In the 4.X and older releases of &os; the base distribution is called bin. Adjust the sample commands and URLs above accordingly, if you are using one of these versions. For as many distributions as you wish to install from an &ms-dos; partition (and have the free space for), install each one under c:\freebsd — the BIN distribution is the only one required for a minimum installation. Creating an Installation Tape installation from QIC/SCSI Tape Installing from tape is probably the easiest method, short of an online FTP install or CDROM install. The installation program expects the files to be simply tarred onto the tape. After getting all of the distribution files you are interested in, simply tar them onto the tape: &prompt.root; cd /freebsd/distdir &prompt.root; tar cvf /dev/rwt0 dist1 ... dist2 When you perform the installation, you should make sure that you leave enough room in some temporary directory (which you will be allowed to choose) to accommodate the full contents of the tape you have created. Due to the non-random access nature of tapes, this method of installation requires quite a bit of temporary storage. When starting the installation, the tape must be in the drive before booting from the boot floppy. The installation probe may otherwise fail to find it. Before Installing over a Network installation network serial (SLIP or PPP) installation network parallel (PLIP) installation network Ethernet There are three types of network installations available: serial port (SLIP or PPP), parallel port (PLIP, using a laplink cable), or Ethernet (a standard Ethernet controller, including some PCMCIA cards). The SLIP support is rather primitive, and limited primarily to hard-wired links, such as a serial cable running between a laptop computer and another computer. The link should be hard-wired as the SLIP installation does not currently offer a dialing capability; that facility is provided with the PPP utility, which should be used in preference to SLIP whenever possible. If you are using a modem, then PPP is almost certainly your only choice. Make sure that you have your service provider's information handy as you will need to know it fairly early in the installation process. If you use PAP or CHAP to connect to your ISP (in other words, if you can connect to the ISP in &windows; without using a script), then all you will need to do is type in dial at the ppp prompt. Otherwise, you will need to know how to dial your ISP using the AT commands specific to your modem, as the PPP dialer provides only a very simple terminal emulator. Please refer to the user-ppp handbook and FAQ entries for further information.
If you have problems, logging can be directed to the screen using the command set log local .... If a hard-wired connection to another FreeBSD (2.0-R or later) machine is available, you might also consider installing over a laplink parallel port cable. The data rate over the parallel port is much higher than what is typically possible over a serial line (up to 50 kbytes/sec), thus resulting in a quicker installation. Finally, for the fastest possible network installation, an Ethernet adapter is always a good choice! FreeBSD supports most common PC Ethernet cards; a table of supported cards (and their required settings) is provided in the Hardware Notes for each release of FreeBSD. If you are using one of the supported PCMCIA Ethernet cards, also be sure that it is plugged in before the laptop is powered on! FreeBSD does not, unfortunately, currently support hot insertion of PCMCIA cards during installation. You will also need to know your IP address on the network, the netmask value for your address class, and the name of your machine. If you are installing over a PPP connection and do not have a static IP address, fear not: the IP address can be dynamically assigned by your ISP. Your system administrator can tell you which values to use for your particular network setup. If you will be referring to other hosts by name rather than IP address, you will also need a name server and possibly the address of a gateway (if you are using PPP, it is your provider's IP address) to use in talking to it. If you want to install by FTP via an HTTP proxy, you will also need the proxy's address. If you do not know the answers to all or most of these questions, then you should probably talk to your system administrator or ISP before trying this type of installation. Before Installing via NFS installation network NFS The NFS installation is fairly straightforward. Simply copy the FreeBSD distribution files you want onto an NFS server and then point the NFS media selection at it. If this server supports only privileged ports (as is generally the default for Sun workstations), you will need to set the option NFS Secure in the Options menu before installation can proceed. If you have a poor quality Ethernet card which suffers from very slow transfer rates, you may also wish to toggle the NFS Slow flag. In order for NFS installation to work, the server must support subdir mounts; for example, if your FreeBSD &rel.current; distribution directory lives on ziggy:/usr/archive/stuff/FreeBSD, then ziggy will have to allow the direct mounting of /usr/archive/stuff/FreeBSD, not just /usr or /usr/archive/stuff. In FreeBSD's /etc/exports file, this is controlled by the -alldirs option. Other NFS servers may have different conventions. If you are getting permission denied messages from the server, then it is likely that you do not have this enabled properly.
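Continuing the example above, the relevant line in ziggy's /etc/exports might look something like this (a sketch; adjust the path and options for your site):
/usr/archive/stuff/FreeBSD -alldirs -ro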
diff --git a/en_US.ISO8859-1/books/handbook/kernelconfig/chapter.sgml b/en_US.ISO8859-1/books/handbook/kernelconfig/chapter.sgml index 9ca0f8489e..11d5beec13 100644 --- a/en_US.ISO8859-1/books/handbook/kernelconfig/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/kernelconfig/chapter.sgml @@ -1,1686 +1,1686 @@ Jim Mock Updated and restructured by Jake Hamby Originally contributed by Configuring the FreeBSD Kernel Synopsis kernel building a custom kernel The kernel is the core of the &os; operating system. It is responsible for managing memory, enforcing security controls, networking, disk access, and much more. While more and more of &os; becomes dynamically configurable, it is still occasionally necessary to reconfigure and recompile your kernel. After reading this chapter, you will know: Why you might need to build a custom kernel. How to write a kernel configuration file, or alter an existing configuration file. How to use the kernel configuration file to create and build a new kernel. How to install the new kernel. How to create any entries in /dev that may be required. How to troubleshoot if things go wrong. All of the commands listed within this chapter by way of example should be executed as root in order to succeed. Why Build a Custom Kernel? Traditionally, &os; has had what is called a monolithic kernel. This means that the kernel was one large program, supported a fixed list of devices, and if you wanted to change the kernel's behavior then you had to compile a new kernel, and then reboot your computer with the new kernel. Today, &os; is rapidly moving to a model where much of the kernel's functionality is contained in modules which can be dynamically loaded and unloaded from the kernel as necessary. This allows the kernel to adapt to new hardware suddenly becoming available (such as PCMCIA cards in a laptop), or for new functionality to be brought into the kernel that was not necessary when the kernel was originally compiled. This is known as a modular kernel. Despite this, it is still necessary to carry out some static kernel configuration. In some cases this is because the functionality is so tied to the kernel that it cannot be made dynamically loadable. In others it may simply be because no one has yet taken the time to write a dynamically loadable kernel module for that functionality. Building a custom kernel is one of the most important rites of passage nearly every BSD user must endure. This process, while time-consuming, will provide many benefits to your &os; system. Unlike the GENERIC kernel, which must support a wide range of hardware, a custom kernel only contains support for your PC's hardware. This has a number of benefits, such as: Faster boot time. Since the kernel will only probe the hardware you have on your system, the time it takes your system to boot can decrease dramatically. Lower memory usage. A custom kernel often uses less memory than the GENERIC kernel, which is important because the kernel must always be present in real memory. For this reason, a custom kernel is especially useful on a system with a small amount of RAM. Additional hardware support. A custom kernel allows you to add in support for devices which are not present in the GENERIC kernel, such as sound cards. Building and Installing a Custom Kernel kernel building / installing First, let us take a quick tour of the kernel build directory. All directories mentioned will be relative to the main /usr/src/sys directory, which is also accessible through the path name /sys.
There are a number of subdirectories here representing different parts of the kernel, but the most important for our purposes are arch/conf, where you will edit your custom kernel configuration, and compile, which is the staging area where your kernel will be built. arch represents one of i386, alpha, amd64, ia64, powerpc, sparc64, or pc98 (an alternative development branch of PC hardware, popular in Japan). Everything inside a particular architecture's directory deals with that architecture only; the rest of the code is machine independent code common to all platforms to which &os; could potentially be ported. Notice the logical organization of the directory structure, with each supported device, file system, and option in its own subdirectory. Versions of &os; prior to 5.X support only the i386, alpha and pc98 architectures. This chapter assumes that you are using the i386 architecture in the examples. If this is not the case for your situation, make appropriate adjustments to the path names for your system's architecture. If there is not a /usr/src/sys directory on your system, then the kernel source has not been installed. The easiest way to do this is by running sysinstall (/stand/sysinstall in &os; versions older than 5.2) as root, choosing Configure, then Distributions, then src, then sys. If you have an aversion to sysinstall and you have access to an official &os; CDROM, then you can also install the source from the command line: &prompt.root; mount /cdrom &prompt.root; mkdir -p /usr/src/sys &prompt.root; ln -s /usr/src/sys /sys &prompt.root; cat /cdrom/src/ssys.[a-d]* | tar -xzvf - Next, move to the arch/conf directory and copy the GENERIC configuration file to the name you want to give your kernel. For example: &prompt.root; cd /usr/src/sys/i386/conf &prompt.root; cp GENERIC MYKERNEL Traditionally, this name is in all capital letters and, if you are maintaining multiple &os; machines with different hardware, it is a good idea to name it after your machine's hostname. We will call it MYKERNEL for the purpose of this example. Storing your kernel configuration file directly under /usr/src can be a bad idea. If you are experiencing problems it can be tempting to just delete /usr/src and start again. After doing this, it usually only takes a few seconds for you to realize that you have deleted your custom kernel configuration file. Also, do not edit GENERIC directly, as it may get overwritten the next time you update your source tree, and your kernel modifications will be lost. You might want to keep your kernel configuration file elsewhere, and then create a symbolic link to the file in the i386 directory. For example: &prompt.root; cd /usr/src/sys/i386/conf &prompt.root; mkdir /root/kernels &prompt.root; cp GENERIC /root/kernels/MYKERNEL &prompt.root; ln -s /root/kernels/MYKERNEL Now, edit MYKERNEL with your favorite text editor. If you are just starting out, the only editor available will probably be vi, which is too complex to explain here, but is covered well in many books in the bibliography. However, &os; does offer an easier editor called ee which, if you are a beginner, should be your editor of choice. Feel free to change the comment lines at the top to reflect your configuration or the changes you have made to differentiate it from GENERIC. SunOS If you have built a kernel under &sunos; or some other BSD operating system, much of this file will be very familiar to you. 
If you are coming from some other operating system such as DOS, on the other hand, the GENERIC configuration file might seem overwhelming to you, so follow the descriptions in the Configuration File section slowly and carefully. If you sync your source tree with the latest sources of the &os; project, be sure to always check the file /usr/src/UPDATING before you perform any update steps. This file describes any important issues or areas requiring special attention within the updated source code. /usr/src/UPDATING always matches your version of the &os; source, and is therefore more up to date with new information than this handbook. You must now compile the source code for the kernel. There are two procedures you can use to do this, and the one you will use depends on why you are rebuilding the kernel and the version of &os; that you are running. If you have installed only the kernel source code, use procedure 1. If you are running a &os; version prior to 4.0, and you are not upgrading to &os; 4.0 or higher using the make buildworld procedure, use procedure 1. If you are building a new kernel without updating the source code (perhaps just to add a new option, such as IPFIREWALL) you can use either procedure. If you are rebuilding the kernel as part of a make buildworld process, use procedure 2. cvsup CTM CVS anonymous If you have not upgraded your source tree in any way since the last time you successfully completed a buildworld-installworld cycle (you have not run CVSup, CTM, or used anoncvs), then it is safe to use the config, make depend, make, make install sequence. Procedure 1. Building a Kernel the <quote>Traditional</quote> Way Run &man.config.8; to generate the kernel source code. &prompt.root; /usr/sbin/config MYKERNEL Change into the build directory. &man.config.8; will print the name of this directory after being run as above. &prompt.root; cd ../compile/MYKERNEL For &os; versions prior to 5.0, use the following form instead: &prompt.root; cd ../../compile/MYKERNEL Compile the kernel. &prompt.root; make depend &prompt.root; make Install the new kernel. &prompt.root; make install Procedure 2. Building a Kernel the <quote>New</quote> Way Change to the /usr/src directory. &prompt.root; cd /usr/src Compile the kernel. &prompt.root; make buildkernel KERNCONF=MYKERNEL Install the new kernel. &prompt.root; make installkernel KERNCONF=MYKERNEL This method of kernel building requires full source files. If you only installed the kernel source, use the traditional method, as described above. By default, when you build a custom kernel, all kernel modules will also be rebuilt. To update the kernel faster, or to build only custom modules, you should edit /etc/make.conf before starting to build the kernel: MODULES_OVERRIDE = linux acpi sound/sound sound/driver/ds1 ntfs This variable sets up a list of modules to build instead of all of them. For other variables which you may find useful in the process of building a kernel, refer to the &man.make.conf.5; manual page. /boot/kernel.old The new kernel will be copied to the /boot/kernel directory as /boot/kernel/kernel and the old kernel will be moved to /boot/kernel.old/kernel. Now, shut down the system and reboot to use your new kernel. If something goes wrong, there are some troubleshooting instructions at the end of this chapter that you may find useful. Be sure to read the section which explains how to recover in case your new kernel does not boot.
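For example, to reboot into the newly installed kernel:
&prompt.root; shutdown -r now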
In &os; 4.X and earlier, kernels are installed in /kernel, modules in /modules, and old kernels are backed up in /kernel.old. Other files relating to the boot process, such as the boot &man.loader.8; and configuration, are stored in /boot. Third party or custom modules can be placed in /modules, although users should be aware that keeping modules in sync with the compiled kernel is very important. Modules not intended to run with the compiled kernel may result in instability or incorrectness. If you have added any new devices (such as sound cards) and you are running &os; 4.X or previous versions, you may have to add some device nodes to your /dev directory before you can use them. For more information, take a look at the Making Device Nodes section later on in this chapter. Joel Dahl Updated for &os; 5.X by The Configuration File kernel NOTES kernel LINT NOTES LINT kernel configuration file The general format of a configuration file is quite simple. Each line contains a keyword and one or more arguments. For simplicity, most lines only contain one argument. Anything following a # is considered a comment and ignored. The following sections describe each keyword, in the order they are listed in GENERIC. For an exhaustive list of architecture dependent options and devices, see the NOTES file in the same directory as GENERIC. For architecture independent options, see /usr/src/sys/conf/NOTES. NOTES does not exist in &os; 4.X. Instead, see the LINT file for detailed explanations of options and devices in GENERIC. LINT served two purposes in 4.X: to provide a reference for choosing kernel options when building a custom kernel, and to provide a kernel configuration with as many tweakable options tweaked to non-default values as possible. The reason behind this was that such a configuration helped (and still does) a lot when testing new code and changes to existing code that may cause conflicts with other parts of the kernel. However, the kernel configuration framework went through some heavy changes in 5.X; one example of this is that the driver configuration options were moved to a hints file so that they could be changed and loaded dynamically at boot time, and LINT could not contain those hints anymore. For this and other reasons, the LINT file was renamed to NOTES and retained mostly the first reason for its existence: documenting the available options for user convenience. In &os; 5.X and later versions you can still generate a buildable LINT file by typing: &prompt.root; cd /usr/src/sys/i386/conf && make LINT kernel configuration file The following is an example of the GENERIC kernel configuration file with various additional comments where needed for clarity. This example should match your copy in /usr/src/sys/i386/conf/GENERIC fairly closely. kernel options machine machine i386 This is the machine architecture. It must be either alpha, amd64, i386, ia64, pc98, powerpc, or sparc64. kernel options cpu cpu I486_CPU cpu I586_CPU cpu I686_CPU The above option specifies the type of CPU you have in your system. You may have multiple instances of the CPU line (if, for example, you are not sure whether you should use I586_CPU or I686_CPU), but for a custom kernel it is best to specify only the CPU you have. If you are unsure of your CPU type, you can check the /var/run/dmesg.boot file to view your boot messages.
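For example, a quick way to pull the CPU lines out of the boot messages (assuming a standard installation; the output shown is just a sample):
&prompt.root; grep CPU: /var/run/dmesg.boot
CPU: AMD-K6(tm) 3D processor (300.68-MHz 586-class CPU)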
kernel options cpu type Support for I386_CPU is still provided in the source of &os;, but it is disabled by default in both -STABLE and -CURRENT. This means that to install &os; with a 386-class CPU, you now have the following options: Install an older &os; release and rebuild from source as described in . Build the userland and kernel on a newer machine and install on the 386 using the precompiled /usr/obj files (see for details). Roll your own release of &os; which includes I386_CPU support in the kernels of the installation CD-ROM. The first of these options is probably the easiest of all, but you will need a lot of disk space which, on a 386-class machine, may be difficult to find. kernel options ident ident GENERIC This is the identification of the kernel. You should change this to whatever you named your kernel, i.e. MYKERNEL if you have followed the instructions of the previous examples. The value you put in the ident string will be printed when you boot up the kernel, so it is useful to give the new kernel a different name if you want to keep it separate from your usual kernel (e.g., you want to build an experimental kernel). #To statically compile in device wiring instead of /boot/device.hints #hints "GENERIC.hints" # Default places to look for devices. In &os; 5.X and newer versions the &man.device.hints.5; file is used to configure options of the device drivers. The default location that &man.loader.8; will check at boot time is /boot/device.hints. Using the hints option you can compile these hints statically into your kernel. Then there is no need to create a device.hints file in /boot. #makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols The normal build process of &os; does not include debugging information when building the kernel and strips most symbols after the resulting kernel is linked, to save some space at the install location. If you are going to do tests of kernels in the -CURRENT branch or develop changes of your own for the &os; kernel, you might want to uncomment this line. It will enable the use of the -g option, which enables debugging information when passed to &man.gcc.1;. The same can be accomplished by the &man.config.8; -g option, if you are using the traditional way of building your kernels (see for more information). options SCHED_4BSD # 4BSD scheduler The traditional scheduler for &os;. Depending on your system's workload, you may gain performance by using the new ULE scheduler for &os;, which has been designed specially for SMP but works just fine on UP systems too. If you wish to try it out, replace SCHED_4BSD with SCHED_ULE in your configuration file. options INET # InterNETworking Networking support. Leave this in, even if you do not plan to be connected to a network. Most programs require at least loopback networking (i.e., making network connections within your PC), so this is essentially mandatory. options INET6 # IPv6 communications protocols This enables the IPv6 communication protocols. options FFS # Berkeley Fast Filesystem This is the basic hard drive file system. Leave it in if you boot from the hard disk. options SOFTUPDATES # Enable FFS Soft Updates support This option enables Soft Updates in the kernel; this will help speed up write access on the disks. Even when this functionality is provided by the kernel, it must be turned on for specific disks. Review the output from &man.mount.8; to see if Soft Updates is enabled for your system disks.
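For example, a file system with Soft Updates enabled appears in the &man.mount.8; output like this (sample output; device names will differ):
&prompt.root; mount
/dev/ad0s1a on / (ufs, local, soft-updates)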
If you do not see the soft-updates option then you will need to activate it using the &man.tunefs.8; (for existing file systems) or &man.newfs.8; (for new file systems) commands. options UFS_ACL # Support for access control lists This option, present only in &os; 5.X, enables kernel support for access control lists. This relies on the use of extended attributes and UFS2, and the feature is described in detail in . ACLs are enabled by default and should not be disabled in the kernel if they have been used previously on a file system, as this will remove the access control lists, changing the way files are protected in unpredictable ways. options UFS_DIRHASH # Improve performance on big directories This option includes functionality to speed up disk operations on large directories, at the expense of using additional memory. You would normally keep this for a large server, or interactive workstation, and remove it if you are using &os; on a smaller system where memory is at a premium and disk access speed is less important, such as a firewall. options MD_ROOT # MD is a potential root device This option enables support for a memory backed virtual disk used as a root device. kernel options NFS kernel options NFS_ROOT options NFSCLIENT # Network Filesystem Client options NFSSERVER # Network Filesystem Server options NFS_ROOT # NFS usable as /, requires NFSCLIENT The network file system. Unless you plan to mount partitions from a &unix; file server over TCP/IP, you can comment these out. kernel options MSDOSFS options MSDOSFS # MSDOS Filesystem The &ms-dos; file system. Unless you plan to mount a DOS formatted hard drive partition at boot time, you can safely comment this out. It will be automatically loaded the first time you mount a DOS partition, as described above. Also, the excellent emulators/mtools software allows you to access DOS floppies without having to mount and unmount them (and does not require MSDOSFS at all). options CD9660 # ISO 9660 Filesystem The ISO 9660 file system for CDROMs. Comment it out if you do not have a CDROM drive or only mount data CDs occasionally (since it will be dynamically loaded the first time you mount a data CD). Audio CDs do not need this file system. options PROCFS # Process filesystem The process file system. This is a pretend file system mounted on /proc which allows programs like &man.ps.1; to give you more information on what processes are running. In &os; 5.X and above, use of PROCFS is not required under most circumstances, as most debugging and monitoring tools have been adapted to run without PROCFS: unlike in &os; 4.X, new installations of &os; 5.X will not mount the process file system by default. In addition, 6.X-CURRENT kernels making use of PROCFS must now also include support for PSEUDOFS: options PSEUDOFS # Pseudo-filesystem framework PSEUDOFS is not available in &os; 4.X. options GEOM_GPT # GUID Partition Tables. This option brings the ability to have a large number of partitions on a single disk. options COMPAT_43 # Compatible with BSD 4.3 [KEEP THIS!] Compatibility with 4.3BSD. Leave this in; some programs will act strangely if you comment this out. options COMPAT_FREEBSD4 # Compatible with &os;4 This option is required on &os; 5.X &i386; and Alpha systems to support applications compiled on older versions of &os; that use older system call interfaces. 
It is recommended that this option be used on all &i386; and Alpha systems that may run older applications; platforms that gained support only in 5.X, such as ia64 and &sparc64;, do not require this option.

options SCSI_DELAY=15000 # Delay (in ms) before probing SCSI
This causes the kernel to pause for 15 seconds before probing each SCSI device in your system. If you only have IDE hard drives, you can ignore this; otherwise, you will probably want to lower this number, perhaps to 5 seconds, to speed up booting. Of course, if you do this and &os; has trouble recognizing your SCSI devices, you will have to raise it again.

options KTRACE # ktrace(1) support
This enables kernel process tracing, which is useful in debugging.

options SYSVSHM # SYSV-style shared memory
This option provides for System V shared memory. The most common use of this is the XSHM extension in X, which many graphics-intensive programs will automatically take advantage of for extra speed. If you use X, you will definitely want to include this.

options SYSVMSG # SYSV-style message queues
Support for System V messages. This option only adds a few hundred bytes to the kernel.

options SYSVSEM # SYSV-style semaphores
Support for System V semaphores. Less commonly used but only adds a few hundred bytes to the kernel. The &man.ipcs.1; command will list any processes using each of these System V facilities.

options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
Real-time extensions added in the 1993 &posix; standard. Certain applications in the Ports Collection use these (such as &staroffice;).

options KBD_INSTALL_CDEV # install a CDEV entry in /dev
This option is related to the keyboard. It installs a CDEV entry in /dev.

options AHC_REG_PRETTY_PRINT # Print register bitfields in debug # output. Adds ~128k to driver.
options AHD_REG_PRETTY_PRINT # Print register bitfields in debug # output. Adds ~215k to driver.
These options help debugging by printing register definitions in an easier-to-read form.

options ADAPTIVE_GIANT # Giant mutex is adaptive.
Giant is the name of a mutual exclusion mechanism (a sleep mutex) that protects a large set of kernel resources. Today, this is an unacceptable performance bottleneck which is actively being replaced with locks that protect individual resources. The ADAPTIVE_GIANT option causes Giant to be included in the set of mutexes adaptively spun on. That is, when a thread wants to lock the Giant mutex, but it is already locked by a thread on another CPU, the first thread will keep running and wait for the lock to be released. Normally, the thread would instead go back to sleep and wait for its next chance to run. If you are not sure, leave this in.

kernel options SMP
device apic # I/O APIC
The apic device enables the use of the I/O APIC for interrupt delivery. The apic device can be used in both UP and SMP kernels, but is required for SMP kernels. Add options SMP to include support for multiple processors.

device isa
All PCs supported by &os; have one of these. Do not remove this, even if you have no ISA slots. If you have an IBM PS/2 (Micro Channel Architecture) system, &os; provides only limited support at this time. For more information about the MCA support, see /usr/src/sys/i386/conf/NOTES.

device eisa
Include this if you have an EISA motherboard. This enables auto-detection and configuration support for all devices on the EISA bus.

device pci
Include this if you have a PCI motherboard. This enables auto-detection of PCI cards and gatewaying from the PCI to ISA bus.
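If you are not sure which PCI devices are present in your machine, the &man.pciconf.8; utility can list them together with the driver attached to each, which can help you decide which of the device entries below to keep:

&prompt.root; pciconf -lv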
# Floppy drives
device fdc
This is the floppy drive controller.

# ATA and ATAPI devices
device ata
This driver supports all ATA and ATAPI devices. You only need one device ata line for the kernel to detect all PCI ATA/ATAPI devices on modern machines.

device atadisk # ATA disk drives
This is needed along with device ata for ATA disk drives.

device ataraid # ATA RAID drives
This is needed along with device ata for ATA RAID drives.

device atapicd # ATAPI CDROM drives
This is needed along with device ata for ATAPI CDROM drives.

device atapifd # ATAPI floppy drives
This is needed along with device ata for ATAPI floppy drives.

device atapist # ATAPI tape drives
This is needed along with device ata for ATAPI tape drives.

options ATA_STATIC_ID # Static device numbering
This makes the controller number static; without this, the device numbers are dynamically allocated.

# SCSI Controllers
device ahb # EISA AHA1742 family
device ahc # AHA2940 and onboard AIC7xxx devices
device ahd # AHA39320/29320 and onboard AIC79xx devices
device amd # AMD 53C974 (Tekram DC-390(T))
device isp # Qlogic family
device mpt # LSI-Logic MPT-Fusion
#device ncr # NCR/Symbios Logic
device sym # NCR/Symbios Logic (newer chipsets)
device trm # Tekram DC395U/UW/F DC315U adapters
device adv # Advansys SCSI adapters
device adw # Advansys wide SCSI adapters
device aha # Adaptec 154x SCSI adapters
device aic # Adaptec 15[012]x SCSI adapters, AIC-6[23]60.
device bt # Buslogic/Mylex MultiMaster SCSI adapters
device ncv # NCR 53C500
device nsp # Workbit Ninja SCSI-3
device stg # TMC 18C30/18C50
SCSI controllers. Comment out any you do not have in your system. If you have an IDE-only system, you can remove these altogether.

# SCSI peripherals
device scbus # SCSI bus (required for SCSI)
device ch # SCSI media changers
device da # Direct Access (disks)
device sa # Sequential Access (tape etc)
device cd # CD
device pass # Passthrough device (direct SCSI access)
device ses # SCSI Environmental Services (and SAF-TE)
SCSI peripherals. Again, comment out any you do not have, or if you have only IDE hardware, you can remove them completely.

The USB &man.umass.4; driver and a few other drivers use the SCSI subsystem even though they are not real SCSI devices. Therefore, make sure not to remove SCSI support if any such drivers are included in the kernel configuration.

# RAID controllers interfaced to the SCSI subsystem
device amr # AMI MegaRAID
device arcmsr # Areca SATA II RAID
device asr # DPT SmartRAID V, VI and Adaptec SCSI RAID
device ciss # Compaq Smart RAID 5*
device dpt # DPT Smartcache III, IV - See NOTES for options
device hptmv # Highpoint RocketRAID 182x
device iir # Intel Integrated RAID
device ips # IBM (Adaptec) ServeRAID
device mly # Mylex AcceleRAID/eXtremeRAID
device twa # 3ware 9000 series PATA/SATA RAID

# RAID controllers
device aac # Adaptec FSA RAID
device aacp # SCSI passthrough for aac (requires CAM)
device ida # Compaq Smart RAID
device mlx # Mylex DAC960 family
device pst # Promise Supertrak SX6000
device twe # 3ware ATA RAID
Supported RAID controllers. If you do not have any of these, you can comment them out or remove them.

# atkbdc0 controls both the keyboard and the PS/2 mouse
device atkbdc # AT keyboard controller
The keyboard controller (atkbdc) provides I/O services for the AT keyboard and PS/2 style pointing devices. This controller is required by the keyboard driver (atkbd) and the PS/2 pointing device driver (psm).
device atkbd # AT keyboard
The atkbd driver, together with the atkbdc controller, provides access to the AT 84 keyboard or the AT enhanced keyboard which is connected to the AT keyboard controller.

device psm # PS/2 mouse
Use this device if your mouse plugs into the PS/2 mouse port.

device vga # VGA video card driver
The video card driver.

# splash screen/screen saver
device splash # Splash screen and screen saver support
Splash screen at startup! Screen savers require this too. Use the line pseudo-device splash with &os; 4.X.

# syscons is the default console driver, resembling an SCO console
device sc
sc is the default console driver and resembles a SCO console. Since most full-screen programs access the console through a terminal database library like termcap, it should not matter whether you use this or vt, the VT220-compatible console driver. When you log in, set your TERM variable to scoansi if full-screen programs have trouble running under this console.

# Enable this for the pcvt (VT220 compatible) console driver
#device vt
#options XSERVER # support for X server on a vt console
#options FAT_CURSOR # start with block cursor
This is a VT220-compatible console driver, backward compatible to VT100/102. It works well on some laptops which have hardware incompatibilities with sc. Also set your TERM variable to vt100 or vt220 when you log in. This driver might also prove useful when connecting to a large number of different machines over the network, where termcap or terminfo entries for the sc device are often not available — vt100 should be available on virtually any platform.

device agp
Include this if you have an AGP card in the system. This will enable support for AGP, and AGP GART for boards which have these features.

# Floating point support - do not disable.
device npx
npx is the interface to the floating-point math unit in &os;, which is either the hardware co-processor or the software math emulator. This is not optional.

APM
# Power management support (see NOTES for more options)
#device apm
Advanced Power Management support. Useful for laptops, although in &os; 5.X and above this is disabled in GENERIC by default.

# Add suspend/resume support for the i8254.
device pmtimer
Timer device driver for power management events, such as APM and ACPI.

# PCCARD (PCMCIA) support
# PCMCIA and cardbus bridge support
device cbb # cardbus (yenta) bridge
device pccard # PC Card (16-bit) bus
device cardbus # CardBus (32-bit) bus
PCMCIA support. You want this if you are using a laptop.

# Serial (COM) ports
device sio # 8250, 16[45]50 based serial ports
These are the serial ports referred to as COM ports in the &ms-dos;/&windows; world. If you have an internal modem on COM4 and a serial port at COM2, you will have to change the IRQ of the modem to 2 (for obscure technical reasons, IRQ2 = IRQ 9) in order to access it from &os;. If you have a multiport serial card, check the manual page for &man.sio.4; for more information on the proper values to add to your /boot/device.hints. Some video cards (notably those based on S3 chips) use IO addresses in the form of 0x*2e8, and since many cheap serial cards do not fully decode the 16-bit IO address space, they clash with these cards, making the COM4 port practically unavailable. Each serial port is required to have a unique IRQ (unless you are using one of the multiport cards where shared interrupts are supported), so the default IRQs for COM3 and COM4 cannot be used.

# Parallel port
device ppc
This is the ISA-bus parallel port interface.
device ppbus # Parallel port bus (required)
Provides support for the parallel port bus.

device lpt # Printer
Support for parallel port printers. All three of the above are required to enable parallel printer support.

device plip # TCP/IP over parallel
This is the driver for the parallel network interface.

device ppi # Parallel port interface device
The general-purpose I/O (geek port) + IEEE1284 I/O.

#device vpo # Requires scbus and da
zip drive This is for an Iomega Zip drive. It requires scbus and da support. Best performance is achieved with ports in EPP 1.9 mode.

#device puc
Uncomment this device if you have a dumb serial or parallel PCI card that is supported by the &man.puc.4; glue driver.

# PCI Ethernet NICs.
device de # DEC/Intel DC21x4x (Tulip)
device em # Intel PRO/1000 adapter Gigabit Ethernet Card
device ixgb # Intel PRO/10GbE Ethernet Card
device txp # 3Com 3cR990 (Typhoon)
device vx # 3Com 3c590, 3c595 (Vortex)
Various PCI network card drivers. Comment out or remove any of these not present in your system.

# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device miibus # MII bus support
MII bus support is required for some PCI 10/100 Ethernet NICs, namely those which use MII-compliant transceivers or implement transceiver control interfaces that operate like an MII. Adding device miibus to the kernel config pulls in support for the generic miibus API and all of the PHY drivers, including a generic one for PHYs that are not specifically handled by an individual driver.

device bfe # Broadcom BCM440x 10/100 Ethernet
device bge # Broadcom BCM570xx Gigabit Ethernet
device dc # DEC/Intel 21143 and various workalikes
device fxp # Intel EtherExpress PRO/100B (82557, 82558)
device lge # Level 1 LXT1001 gigabit ethernet
device nge # NatSemi DP83820 gigabit ethernet
device pcn # AMD Am79C97x PCI 10/100 (precedence over 'lnc')
device re # RealTek 8139C+/8169/8169S/8110S
device rl # RealTek 8129/8139
device sf # Adaptec AIC-6915 (Starfire)
device sis # Silicon Integrated Systems SiS 900/SiS 7016
device sk # SysKonnect SK-984x & SK-982x gigabit Ethernet
device ste # Sundance ST201 (D-Link DFE-550TX)
device ti # Alteon Networks Tigon I/II gigabit Ethernet
device tl # Texas Instruments ThunderLAN
device tx # SMC EtherPower II (83c170 EPIC)
device vge # VIA VT612x gigabit ethernet
device vr # VIA Rhine, Rhine II
device wb # Winbond W89C840F
device xl # 3Com 3c90x (Boomerang, Cyclone)
Drivers that use the MII bus controller code.

# ISA Ethernet NICs. pccard NICs included.
device cs # Crystal Semiconductor CS89x0 NIC
# 'device ed' requires 'device miibus'
device ed # NE[12]000, SMC Ultra, 3c503, DS8390 cards
device ex # Intel EtherExpress Pro/10 and Pro/10+
device ep # Etherlink III based cards
device fe # Fujitsu MB8696x based cards
device ie # EtherExpress 8/16, 3C507, StarLAN 10 etc.
device lnc # NE2100, NE32-VL Lance Ethernet cards
device sn # SMC's 9000 series of Ethernet chips
device xe # Xircom pccard Ethernet
# ISA devices that use the old ISA shims
#device le
ISA Ethernet drivers. See /usr/src/sys/i386/conf/NOTES for details of which cards are supported by which driver.

# Wireless NIC cards
device wlan # 802.11 support
device an # Aironet 4500/4800 802.11 wireless NICs.
device awi # BayStack 660 and others
device wi # WaveLAN/Intersil/Symbol 802.11 wireless NICs.
#device wl # Older non 802.11 Wavelan wireless NIC.
Support for various wireless cards.

# Pseudo devices
device loop # Network loopback
This is the generic loopback device for TCP/IP. If you telnet or FTP to localhost (a.k.a. 127.0.0.1), it will come back at you through this device. This is mandatory. Under &os; 4.X you have to use the line pseudo-device loop.

device mem # Memory and kernel memory devices
The system memory devices.

device io # I/O device
This option allows a process to gain I/O privileges. This is useful for writing userland programs that can handle hardware directly. This is required to run the X Window system.

device random # Entropy device
Cryptographically secure random number generator.

device ether # Ethernet support
ether is only needed if you have an Ethernet card. It includes generic Ethernet protocol code. Under &os; 4.X use the line pseudo-device ether.

device sl # Kernel SLIP
sl is for SLIP support. This has been almost entirely supplanted by PPP, which is easier to set up, better suited for modem-to-modem connection, and more powerful. With &os; 4.X use the line pseudo-device sl.

device ppp # Kernel PPP
This is for kernel PPP support for dial-up connections. There is also a version of PPP implemented as a userland application that uses tun and offers more flexibility and features such as demand dialing. With &os; 4.X use the line pseudo-device ppp.

device tun # Packet tunnel.
This is used by the userland PPP software. See the PPP section of this book for more information. With &os; 4.X use the line pseudo-device tun.

device pty # Pseudo-ttys (telnet etc)
This is a pseudo-terminal or simulated login port. It is used by incoming telnet and rlogin sessions, xterm, and some other applications such as Emacs. Under &os; 4.X, you have to use the line pseudo-device pty number. The number after pty indicates the number of ptys to create. If you need more than the default of 16 simultaneous xterm windows and/or remote logins, be sure to increase this number accordingly, up to a maximum of 256.

device md # Memory disks
Memory disk pseudo-devices. With &os; 4.X use the line pseudo-device md.

device gif # IPv6 and IPv4 tunneling
This implements IPv6 over IPv4 tunneling, IPv4 over IPv6 tunneling, IPv4 over IPv4 tunneling, and IPv6 over IPv6 tunneling. Beginning with &os; 4.4 the gif device is auto-cloning, and you should use the line pseudo-device gif. Earlier versions of &os; 4.X require a number, for example pseudo-device gif 4.

device faith # IPv6-to-IPv4 relaying (translation)
This pseudo-device captures packets that are sent to it and diverts them to the IPv4/IPv6 translation daemon. With &os; 4.X use the line pseudo-device faith 1.

# The `bpf' device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
# Note that 'bpf' is required for DHCP.
device bpf # Berkeley packet filter
This is the Berkeley Packet Filter. This pseudo-device allows network interfaces to be placed in promiscuous mode, capturing every packet on a broadcast network (e.g., an Ethernet). These packets can be captured to disk and/or examined with the &man.tcpdump.1; program. With &os; 4.X use the line pseudo-device bpf. The &man.bpf.4; device is also used by &man.dhclient.8; to obtain the IP address of the default router (gateway) and so on. If you use DHCP, leave this uncommented.
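For example, with bpf in the kernel you can watch traffic on an interface using &man.tcpdump.1;; the interface name fxp0 below is just an example, substitute one from your own system:

&prompt.root; tcpdump -i fxp0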
# USB support
device uhci # UHCI PCI->USB interface
device ohci # OHCI PCI->USB interface
#device ehci # EHCI PCI->USB interface (USB 2.0)
device usb # USB Bus (required)
#device udbp # USB Double Bulk Pipe devices
device ugen # Generic
device uhid # Human Interface Devices
device ukbd # Keyboard
device ulpt # Printer
device umass # Disks/Mass storage - Requires scbus and da
device ums # Mouse
device urio # Diamond Rio 500 MP3 player
device uscanner # Scanners
# USB Ethernet, requires mii
device aue # ADMtek USB Ethernet
device axe # ASIX Electronics USB Ethernet
device cdce # Generic USB over Ethernet
device cue # CATC USB Ethernet
device kue # Kawasaki LSI USB Ethernet
device rue # RealTek RTL8150 USB Ethernet
Support for various USB devices.

# FireWire support
device firewire # FireWire bus code
device sbp # SCSI over FireWire (Requires scbus and da)
device fwe # Ethernet over FireWire (non-standard!)
Support for various FireWire devices.

For more information and additional devices supported by &os;, see /usr/src/sys/i386/conf/NOTES.

Large Memory Configurations (PAE)
Physical Address Extensions (PAE) large memory
Large memory configuration machines require access to more than the 4 gigabyte limit on User+Kernel Virtual Address (KVA) space. Due to this limitation, Intel added support for 36-bit physical address space access in the &pentium; Pro and later line of CPUs. The Physical Address Extension (PAE) capability of the &intel; &pentium; Pro and later CPUs allows memory configurations of up to 64 gigabytes. &os; provides support for this capability via the kernel configuration option, available in the 4.X series of &os; beginning with 4.9-RELEASE and in the 5.X series of &os; beginning with 5.1-RELEASE. Due to the limitations of the Intel memory architecture, no distinction is made for memory above or below 4 gigabytes. Memory allocated above 4 gigabytes is simply added to the pool of available memory.

To enable PAE support in the kernel, simply add the following line to your kernel configuration file:

options PAE

The PAE support in &os; is only available for &intel; IA-32 processors. It should also be noted that the PAE support in &os; has not received wide testing, and should be considered beta quality compared to other stable features of &os;. PAE support in &os; has a few limitations:

A process is not able to access more than 4 gigabytes of VM space.

KLD modules cannot be loaded into a PAE-enabled kernel, due to the differences in the build framework of a module and the kernel.

Device drivers that do not use the &man.bus.dma.9; interface will cause data corruption in a PAE-enabled kernel and are not recommended for use. For this reason, the PAE kernel configuration file is provided in &os; 5.X, which excludes all drivers not known to work in a PAE-enabled kernel.

Some system tunables determine memory resource usage by the amount of available physical memory. Such tunables can unnecessarily over-allocate due to the large-memory nature of a PAE system. One such example is the sysctl that controls the maximum number of vnodes allowed in the kernel. It is advised to adjust this and other such tunables to a reasonable value.

It might be necessary to increase the kernel virtual address (KVA) space or to reduce the amount of a specific kernel resource that is heavily used (see above) in order to avoid KVA exhaustion. The kernel option can be used for increasing the KVA space.
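To illustrate adjusting such a tunable with &man.sysctl.8;: the name kern.maxvnodes is assumed here to be the vnode limit in question, and the value shown is arbitrary; inspect the current value first and choose a figure suited to your system:

&prompt.root; sysctl kern.maxvnodes
&prompt.root; sysctl kern.maxvnodes=100000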
For performance and stability concerns, it is advised to consult the &man.tuning.7; manual page. The &man.pae.4; manual page contains up-to-date information on &os;'s PAE support.

Making Device Nodes
device nodes MAKEDEV
If you are running &os; 5.0 or later you can safely skip this section. These versions use &man.devfs.5; to allocate device nodes transparently for the user.

Almost every device in the kernel has a corresponding node entry in the /dev directory. These nodes look like regular files, but are actually special entries into the kernel which programs use to access the device. The shell script /dev/MAKEDEV, which is executed when you first install the operating system, creates nearly all of the device nodes supported. However, it does not create all of them, so when you add support for a new device, it pays to make sure that the appropriate entries are in this directory, and if not, add them. Here is a simple example:

Suppose you add IDE CD-ROM support to the kernel. The line to add is:

device acd0

This means that you should look for some entries that start with acd0 in the /dev directory, possibly followed by a letter, such as c, or preceded by the letter r, which means a raw device. It turns out that those files are not there, so you must change to the /dev directory and type:

MAKEDEV
&prompt.root; sh MAKEDEV acd0

When this script finishes, you will find that there are now acd0c and racd0c entries in /dev, so you know that it executed correctly.

For sound cards, the following command creates the appropriate entries:

&prompt.root; sh MAKEDEV snd0

When creating device nodes for devices such as sound cards, if other people have access to your machine, it may be desirable to protect the devices from outside access by adding them to the /etc/fbtab file. See &man.fbtab.5; for more information.

Follow this simple procedure for any other non-GENERIC devices which do not have entries.

All SCSI controllers use the same set of /dev entries, so you do not need to create these. Also, network cards and SLIP/PPP pseudo-devices do not have entries in /dev at all, so you do not have to worry about these either.

If Something Goes Wrong
There are five categories of trouble that can occur when building a custom kernel. They are:

config fails: If the &man.config.8; command fails when you give it your kernel description, you have probably made a simple error somewhere. Fortunately, &man.config.8; will print the line number that it had trouble with, so that you can quickly locate the line containing the error. For example, if you see:

config: line 17: syntax error

Make sure the keyword is typed correctly by comparing it to the GENERIC kernel or another reference.

make fails: If the make command fails, it usually signals an error in your kernel description which is not severe enough for &man.config.8; to catch. Again, look over your configuration, and if you still cannot resolve the problem, send mail to the &a.questions; with your kernel configuration, and it should be diagnosed quickly.

Installing the new kernel fails: If the kernel compiled fine, but failed to install (the make install or make installkernel command failed), the first thing to check is if your system is running at securelevel 1 or higher (see &man.init.8;). The kernel installation tries to remove the immutable flag from your kernel and set the immutable flag on the new one.
Since securelevel 1 or higher prevents unsetting the immutable flag for any files on the system, the kernel installation needs to be performed at securelevel 0 or lower.

The above only applies to &os; 4.X and earlier versions. &os; 5.X, along with later versions, does not set the immutable flag on the kernel, and a failure to install a kernel probably indicates a more fundamental problem.

The kernel does not boot: If your new kernel does not boot, or fails to recognize your devices, do not panic! Fortunately, &os; has an excellent mechanism for recovering from incompatible kernels. Simply choose the kernel you want to boot from at the &os; boot loader. You can access this when the system counts down from 10 at the boot menu. Hit any key except for the Enter key, type unload and then type boot /boot/kernel.old/kernel, or the filename of any other kernel that will boot properly. When reconfiguring a kernel, it is always a good idea to keep a kernel that is known to work on hand.

After booting with a good kernel you can check over your configuration file and try to build it again. One helpful resource is the /var/log/messages file which records, among other things, all of the kernel messages from every successful boot. Also, the &man.dmesg.8; command will print the kernel messages from the current boot.

If you are having trouble building a kernel, make sure to keep a GENERIC kernel, or some other kernel that is known to work, on hand under a different name so that it will not get erased on the next build. You cannot rely on kernel.old because when installing a new kernel, kernel.old is overwritten with the last installed kernel, which may be non-functional. Also, as soon as possible, move the working kernel to the proper /boot/kernel location or commands such as &man.ps.1; may not work properly. To do this, simply rename the directory containing the good kernel:

&prompt.root; mv /boot/kernel /boot/kernel.bad
&prompt.root; mv /boot/kernel.good /boot/kernel

For versions of &os; prior to 5.X, the proper command to unlock the kernel file that make installs (in order to move another kernel back permanently) is:

&prompt.root; chflags noschg /kernel

If you find you cannot do this, you are probably running at a &man.securelevel.8; greater than zero. Edit kern_securelevel in /etc/rc.conf and set it to -1, then reboot. You can change it back to its previous setting when you are happy with your new kernel.

And, if you want to lock your new kernel into place, or any file for that matter, so that it cannot be moved or tampered with:

&prompt.root; chflags schg /kernel

The kernel works, but &man.ps.1; does not work any more: If you have installed a different version of the kernel from the one that the system utilities have been built with, for example, a 5.X kernel on a 4.X system, many system-status commands like &man.ps.1; and &man.vmstat.8; will not work any more. You should recompile and install a world built with the same version of the source tree as your kernel. This is one reason it is not normally a good idea to use a different version of the kernel from the rest of the operating system.

diff --git a/en_US.ISO8859-1/books/handbook/linuxemu/chapter.sgml b/en_US.ISO8859-1/books/handbook/linuxemu/chapter.sgml
index a64a35ff6e..a3b8cdbfeb 100644
--- a/en_US.ISO8859-1/books/handbook/linuxemu/chapter.sgml
+++ b/en_US.ISO8859-1/books/handbook/linuxemu/chapter.sgml
@@ -1,3348 +1,3348 @@
Jim Mock Restructured and parts updated by Brian N.
Handy Originally contributed by Rich Murphey

Linux Binary Compatibility
Synopsis
Linux binary compatibility binary compatibility Linux
FreeBSD provides binary compatibility with several other &unix; like operating systems, including Linux. At this point, you may be asking yourself why exactly FreeBSD needs to be able to run Linux binaries. The answer to that question is quite simple. Many companies and developers develop only for Linux, since it is the latest hot thing in the computing world. That leaves the rest of us FreeBSD users bugging these same companies and developers to put out native FreeBSD versions of their applications. The problem is that most of these companies do not really realize how many people would use their product if there were FreeBSD versions too, and most continue to only develop for Linux. So what is a FreeBSD user to do? This is where the Linux binary compatibility of FreeBSD comes into play.

In a nutshell, the compatibility allows FreeBSD users to run about 90% of all Linux applications without modification. This includes applications such as &staroffice;, the Linux version of &netscape;, &adobe; &acrobat;, RealPlayer, VMware, &oracle;, WordPerfect, Doom, Quake, and more. It is also reported that in some situations, Linux binaries perform better on FreeBSD than they do under Linux.

There are, however, some Linux-specific operating system features that are not supported under FreeBSD. Linux binaries will not work on FreeBSD if they make heavy use of &i386;-specific calls, such as enabling virtual 8086 mode.

After reading this chapter, you will know:

How to enable Linux binary compatibility on your system.
How to install additional Linux shared libraries.
How to install Linux applications on your FreeBSD system.
The implementation details of Linux compatibility in FreeBSD.

Before reading this chapter, you should:

Know how to install additional third-party software ().

Installation
KLD (kernel loadable object)
Linux binary compatibility is not turned on by default. The easiest way to enable this functionality is to load the linux KLD object (Kernel LoaDable object). You can load this module by typing the following as root:

&prompt.root; kldload linux

If you would like Linux compatibility to always be enabled, then you should add the following line to /etc/rc.conf:

linux_enable="YES"

The &man.kldstat.8; command can be used to verify that the KLD is loaded:

&prompt.user; kldstat
Id Refs Address Size Name
1 2 0xc0100000 16bdb8 kernel
7 1 0xc24db000 d000 linux.ko

kernel options LINUX
If for some reason you do not want to or cannot load the KLD, then you may statically link Linux binary compatibility into the kernel by adding options COMPAT_LINUX to your kernel configuration file. Then install your new kernel as described in .

Installing Linux Runtime Libraries
Linux installing Linux libraries
This can be done one of two ways, either by using the linux_base port, or by installing them manually.

Installing Using the linux_base Port
Ports Collection
This is by far the easiest method to use when installing the runtime libraries. It is just like installing any other port from the Ports Collection. Simply do the following:

&prompt.root; cd /usr/ports/emulators/linux_base
&prompt.root; make install distclean

You should now have working Linux binary compatibility. Some programs may complain about incorrect minor versions of the system libraries. In general, however, this does not seem to be a problem.
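As a quick, informal smoke test, you can run a Linux binary directly; this assumes the linux_base port installed a Linux shell at /compat/linux/bin/sh, as referenced later in this chapter:

&prompt.user; /compat/linux/bin/sh -c 'echo Linux compatibility is working'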
There may be multiple versions of the emulators/linux_base port available, corresponding to different versions of various Linux distributions. You should install the port most closely resembling the requirements of the Linux applications you would like to install.

Installing Libraries Manually
If you do not have the Ports Collection installed, you can install the libraries by hand instead. You will need the Linux shared libraries that the program depends on and the runtime linker. Also, you will need to create a shadow root directory, /compat/linux, for Linux libraries on your FreeBSD system. Any shared libraries opened by Linux programs run under FreeBSD will look in this tree first. So, if a Linux program loads, for example, /lib/libc.so, FreeBSD will first try to open /compat/linux/lib/libc.so, and if that does not exist, it will then try /lib/libc.so. Shared libraries should be installed in the shadow tree /compat/linux/lib rather than the paths that the Linux ld.so reports.

Generally, you will need to look for the shared libraries that Linux binaries depend on only the first few times that you install a Linux program on your FreeBSD system. After a while, you will have a sufficient set of Linux shared libraries on your system to be able to run newly imported Linux binaries without any extra work.

How to Install Additional Shared Libraries
shared libraries
What if you install the linux_base port and your application still complains about missing shared libraries? How do you know which shared libraries Linux binaries need, and where to get them? Basically, there are two possibilities (when following these instructions you will need to be root on your FreeBSD system).

If you have access to a Linux system, see what shared libraries the application needs, and copy them to your FreeBSD system. Look at the following example: Let us assume you used FTP to get the Linux binary of Doom, and put it on a Linux system you have access to. You then can check which shared libraries it needs by running ldd linuxdoom, like so:

&prompt.user; ldd linuxdoom
libXt.so.3 (DLL Jump 3.1) => /usr/X11/lib/libXt.so.3.1.0
libX11.so.3 (DLL Jump 3.1) => /usr/X11/lib/libX11.so.3.1.0
libc.so.4 (DLL Jump 4.5pl26) => /lib/libc.so.4.6.29

symbolic links
You would need to get all the files from the last column, and put them under /compat/linux, with the names in the first column as symbolic links pointing to them. This means you eventually have these files on your FreeBSD system:

/compat/linux/usr/X11/lib/libXt.so.3.1.0
/compat/linux/usr/X11/lib/libXt.so.3 -> libXt.so.3.1.0
/compat/linux/usr/X11/lib/libX11.so.3.1.0
/compat/linux/usr/X11/lib/libX11.so.3 -> libX11.so.3.1.0
/compat/linux/lib/libc.so.4.6.29
/compat/linux/lib/libc.so.4 -> libc.so.4.6.29
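A minimal sketch of doing this for one of the libraries above; the source path is hypothetical, and the same pattern applies to the X11 libraries:

&prompt.root; mkdir -p /compat/linux/lib
&prompt.root; cp /path/from/linux/system/libc.so.4.6.29 /compat/linux/lib/
&prompt.root; ln -s libc.so.4.6.29 /compat/linux/lib/libc.so.4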
Note that if you already have a Linux shared library with a matching major revision number to the first column of the ldd output, you will not need to copy the file named in the last column to your system; the one you already have should work. It is advisable to copy the shared library anyway if it is a newer version, though. You can remove the old one, as long as you make the symbolic link point to the new one. So, if you have these libraries on your system:

/compat/linux/lib/libc.so.4.6.27
/compat/linux/lib/libc.so.4 -> libc.so.4.6.27

and you find a new binary that claims to require a later version according to the output of ldd:

libc.so.4 (DLL Jump 4.5pl26) -> libc.so.4.6.29

If it is only one or two versions out of date in the trailing digit then do not worry about copying /lib/libc.so.4.6.29 too, because the program should work fine with the slightly older version. However, if you like, you can decide to replace the libc.so anyway, and that should leave you with:

/compat/linux/lib/libc.so.4.6.29
/compat/linux/lib/libc.so.4 -> libc.so.4.6.29
The symbolic link mechanism is only needed for Linux binaries. The FreeBSD runtime linker takes care of looking for matching major revision numbers itself and you do not need to worry about it.
Installing Linux ELF Binaries
Linux ELF binaries
ELF binaries sometimes require an extra step of branding. If you attempt to run an unbranded ELF binary, you will get an error message like the following:

&prompt.user; ./my-linux-elf-binary
ELF binary type not known
Abort

To help the FreeBSD kernel distinguish a FreeBSD ELF binary from a Linux binary, use the &man.brandelf.1; utility.

&prompt.user; brandelf -t Linux my-linux-elf-binary

GNU toolchain
The GNU toolchain now places the appropriate branding information into ELF binaries automatically, so this step should become increasingly unnecessary in the future.

Configuring the Hostname Resolver
If DNS does not work or you get this message:

resolv+: "bind" is an invalid keyword
resolv+: "hosts" is an invalid keyword

You will need to configure a /compat/linux/etc/host.conf file containing:

order hosts, bind
multi on

The order here specifies that /etc/hosts is searched first and DNS is searched second. When /compat/linux/etc/host.conf is not installed, Linux applications find FreeBSD's /etc/host.conf and complain about the incompatible FreeBSD syntax. You should remove bind if you have not configured a name server using the /etc/resolv.conf file.
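One way to create that file from the shell (a sketch; any editor works just as well):

&prompt.root; mkdir -p /compat/linux/etc
&prompt.root; printf 'order hosts, bind\nmulti on\n' > /compat/linux/etc/host.conf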
Murray Stokely Updated for Mathematica 4.X by Bojan Bistrovic Merged with work by

Installing &mathematica;
applications Mathematica
This document describes the process of installing the Linux version of &mathematica; 4.X onto a FreeBSD system. The Linux version of &mathematica; runs perfectly under FreeBSD; however, the binaries shipped by Wolfram need to be branded so that FreeBSD knows to use the Linux ABI to execute them. The Linux version of &mathematica; or &mathematica; for Students can be ordered directly from Wolfram at .

Branding the Linux Binaries
The Linux binaries are located in the Unix directory of the &mathematica; CDROM distributed by Wolfram. You need to copy this directory tree to your local hard drive so that you can brand the Linux binaries with &man.brandelf.1; before running the installer:

&prompt.root; mount /cdrom
&prompt.root; cp -rp /cdrom/Unix/ /localdir/
&prompt.root; brandelf -t Linux /localdir/Files/SystemFiles/Kernel/Binaries/Linux/*
&prompt.root; brandelf -t Linux /localdir/Files/SystemFiles/FrontEnd/Binaries/Linux/*
&prompt.root; brandelf -t Linux /localdir/Files/SystemFiles/Installation/Binaries/Linux/*
&prompt.root; brandelf -t Linux /localdir/Files/SystemFiles/Graphics/Binaries/Linux/*
&prompt.root; brandelf -t Linux /localdir/Files/SystemFiles/Converters/Binaries/Linux/*
&prompt.root; brandelf -t Linux /localdir/Files/SystemFiles/LicenseManager/Binaries/Linux/mathlm
&prompt.root; cd /localdir/Installers/Linux/
&prompt.root; ./MathInstaller

Alternatively, you can simply set the default ELF brand to Linux for all unbranded binaries with the command:

&prompt.root; sysctl kern.fallback_elf_brand=3

This will make FreeBSD assume that unbranded ELF binaries use the Linux ABI, and so you should be able to run the installer straight from the CDROM.

Obtaining Your &mathematica; Password
Before you can run &mathematica; you will have to obtain a password from Wolfram that corresponds to your machine ID.
Ethernet MAC address
Once you have installed the Linux compatibility runtime libraries and unpacked &mathematica;, you can obtain the machine ID by running the program mathinfo in the installation directory. This machine ID is based solely on the MAC address of your first Ethernet card.

&prompt.root; cd /localdir/Files/SystemFiles/Installation/Binaries/Linux
&prompt.root; mathinfo
disco.example.com 7115-70839-20412

When you register with Wolfram, either by email, phone or fax, you will give them the machine ID and they will respond with a corresponding password consisting of groups of numbers. You can then enter this information when you attempt to run &mathematica; for the first time exactly as you would for any other &mathematica; platform.

Running the &mathematica; Frontend over a Network
&mathematica; uses some special fonts to display characters not present in any of the standard font sets (integrals, sums, Greek letters, etc.). The X protocol requires these fonts to be installed locally. This means you will have to copy these fonts from the CDROM or from a host with &mathematica; installed to your local machine. These fonts are normally stored in /cdrom/Unix/Files/SystemFiles/Fonts on the CDROM, or /usr/local/mathematica/SystemFiles/Fonts on your hard drive. The actual fonts are in the subdirectories Type1 and X. There are several ways to use them, as described below.

The first way is to copy them into one of the existing font directories in /usr/X11R6/lib/X11/fonts.
This will require editing the fonts.dir file, adding the font names to it, and changing the number of fonts on the first line. Alternatively, you should just be able to run &man.mkfontdir.1; in the directory you have copied them to.

The second way to do this is to copy the directories to /usr/X11R6/lib/X11/fonts:

&prompt.root; cd /usr/X11R6/lib/X11/fonts
&prompt.root; mkdir X
&prompt.root; mkdir MathType1
&prompt.root; cd /cdrom/Unix/Files/SystemFiles/Fonts
&prompt.root; cp X/* /usr/X11R6/lib/X11/fonts/X
&prompt.root; cp Type1/* /usr/X11R6/lib/X11/fonts/MathType1
&prompt.root; cd /usr/X11R6/lib/X11/fonts/X
&prompt.root; mkfontdir
&prompt.root; cd ../MathType1
&prompt.root; mkfontdir

Now add the new font directories to your font path:

&prompt.root; xset fp+ /usr/X11R6/lib/X11/fonts/X
&prompt.root; xset fp+ /usr/X11R6/lib/X11/fonts/MathType1
&prompt.root; xset fp rehash

If you are using the &xorg; server, you can have these font directories loaded automatically by adding them to your xorg.conf file. For &xfree86; servers, the configuration file is XF86Config.

fonts
If you do not already have a directory called /usr/X11R6/lib/X11/fonts/Type1, you can change the name of the MathType1 directory in the example above to Type1.

Aaron Kaplan Contributed by Robert Getschmann Thanks to

Installing &maple;
applications Maple
&maple; is a commercial mathematics program similar to &mathematica;. You must purchase this software from and then register there for a license file. To install this software on FreeBSD, please follow these simple steps.

Execute the INSTALL shell script from the product distribution. Choose the RedHat option when prompted by the installation program. A typical installation directory might be /usr/local/maple.

If you have not done so, order a license for &maple; from Maple Waterloo Software () and copy it to /usr/local/maple/license/license.dat.

Install the FLEXlm license manager by running the INSTALL_LIC install shell script that comes with &maple;. Specify the primary hostname for your machine for the license server.

Patch the /usr/local/maple/bin/maple.system.type file with the following:

----- snip ------------------
*** maple.system.type.orig Sun Jul 8 16:35:33 2001
--- maple.system.type Sun Jul 8 16:35:51 2001
***************
*** 72,77 ****
--- 72,78 ----
# the IBM RS/6000 AIX case
MAPLE_BIN="bin.IBM_RISC_UNIX"
;;
+ "FreeBSD"|\
"Linux")
# the Linux/x86 case
# We have two Linux implementations, one for Red Hat and
----- snip end of patch -----

Please note that after the "FreeBSD"|\ no other whitespace should be present. This patch instructs &maple; to recognize FreeBSD as a type of Linux system. The bin/maple shell script calls the bin/maple.system.type shell script which in turn calls uname -a to find out the operating system name. Depending on the OS name, it will find out which binaries to use.

Start the license server. The following script, installed as /usr/local/etc/rc.d/lmgrd.sh, is a convenient way to start up lmgrd:

----- snip ------------
#! /bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin
PATH=${PATH}:/usr/local/maple/bin:/usr/local/maple/FLEXlm/UNIX/LINUX
export PATH
LICENSE_FILE=/usr/local/maple/license/license.dat
LOG=/var/log/lmgrd.log

case "$1" in
start)
lmgrd -c ${LICENSE_FILE} 2>> ${LOG} 1>&2
echo -n " lmgrd"
;;
stop)
lmgrd -c ${LICENSE_FILE} -x lmdown 2>> ${LOG} 1>&2
;;
*)
echo "Usage: `basename $0` {start|stop}" 1>&2
exit 64
;;
esac
exit 0
----- snip ------------

Test-start &maple;:

&prompt.user; cd /usr/local/maple/bin
&prompt.user; ./xmaple

You should be up and running. Make sure to write Maplesoft to let them know you would like a native FreeBSD version!

Common Pitfalls
The FLEXlm license manager can be a difficult tool to work with. Additional documentation on the subject can be found at .

lmgrd is known to be very picky about the license file and to core dump if there are any problems. A correct license file should look like this:

# =======================================================
# License File for UNIX Installations ("Pointer File")
# =======================================================
SERVER chillig ANY
#USE_SERVER
VENDOR maplelmg
FEATURE Maple maplelmg 2000.0831 permanent 1 XXXXXXXXXXXX \
PLATFORMS=i86_r ISSUER="Waterloo Maple Inc." \
ISSUED=11-may-2000 NOTICE=" Technische Universitat Wien" \
SN=XXXXXXXXX

Serial number and key 'X'ed out. chillig is a hostname. Editing the license file works as long as you do not touch the FEATURE line (which is protected by the license key).

Dan Pelleg Contributed by

Installing &matlab;
applications MATLAB
This document describes the process of installing the Linux version of &matlab; version 6.5 onto a &os; system. It works quite well, with the exception of the &java.virtual.machine; (see ).

The Linux version of &matlab; can be ordered directly from The MathWorks at . Make sure you also get the license file or instructions on how to create it. While you are there, let them know you would like a native &os; version of their software.

Installing &matlab;
To install &matlab;, do the following:

Insert the installation CD and mount it. Become root, as recommended by the installation script. To start the installation script type:

&prompt.root; /compat/linux/bin/sh /cdrom/install

The installer is graphical. If you get errors about not being able to open a display, type setenv HOME ~USER, where USER is the user you did a &man.su.1; as.

When asked for the &matlab; root directory, type: /compat/linux/usr/local/matlab.

For easier typing on the rest of the installation process, type this at your shell prompt: set MATLAB=/compat/linux/usr/local/matlab

Edit the license file as instructed when obtaining the &matlab; license. You can prepare this file in advance using your favorite editor, and copy it to $MATLAB/license.dat before the installer asks you to edit it.

Complete the installation process. At this point your &matlab; installation is complete. The following steps apply glue to connect it to your &os; system.

License Manager Startup
Create symlinks for the license manager scripts:

&prompt.root; ln -s $MATLAB/etc/lmboot /usr/local/etc/lmboot_TMW
&prompt.root; ln -s $MATLAB/etc/lmdown /usr/local/etc/lmdown_TMW

Create a startup file at /usr/local/etc/rc.d/flexlm.sh. The example below is a modified version of the distributed $MATLAB/etc/rc.lm.glnx86.
The changes are file locations, and startup of the license manager under Linux emulation.

#!/bin/sh
case "$1" in
start)
if [ -f /usr/local/etc/lmboot_TMW ]; then
/compat/linux/bin/sh /usr/local/etc/lmboot_TMW -u username && echo 'MATLAB_lmgrd'
fi
;;
stop)
if [ -f /usr/local/etc/lmdown_TMW ]; then
/compat/linux/bin/sh /usr/local/etc/lmdown_TMW > /dev/null 2>&1
fi
;;
*)
echo "Usage: $0 {start|stop}"
exit 1
;;
esac
exit 0

The file must be made executable:

&prompt.root; chmod +x /usr/local/etc/rc.d/flexlm.sh

You must also replace username above with the name of a valid user on your system (and not root).

Start the license manager with the command:

&prompt.root; /usr/local/etc/rc.d/flexlm.sh start

Linking the &java; Runtime Environment
Change the &java; Runtime Environment (JRE) link to one working under &os;:

&prompt.root; cd $MATLAB/sys/java/jre/glnx86/
&prompt.root; unlink jre; ln -s ./jre1.1.8 ./jre

Creating a &matlab; Startup Script
Place the following startup script in /usr/local/bin/matlab:

#!/bin/sh
/compat/linux/bin/sh /compat/linux/usr/local/matlab/bin/matlab "$@"

Then type the command chmod +x /usr/local/bin/matlab.

Depending on your version of emulators/linux_base, you may run into errors when running this script. To avoid that, edit the file /compat/linux/usr/local/matlab/bin/matlab, and change the line that says:

if [ `expr "$lscmd" : '.*->.*'` -ne 0 ]; then

(in version 13.0.1 it is on line 410) to this line:

if test -L $newbase; then

Creating a &matlab; Shutdown Script
The following is needed to solve a problem with &matlab; not exiting correctly.

Create a file $MATLAB/toolbox/local/finish.m, and in it put the single line:

! $MATLAB/bin/finish.sh

The $MATLAB is literal.

In the same directory, you will find the files finishsav.m and finishdlg.m, which let you save your workspace before quitting. If you use either of them, insert the line above immediately after the save command.

Create a file $MATLAB/bin/finish.sh, which will contain the following:

#!/usr/compat/linux/bin/sh
(sleep 5; killall -1 matlab_helper) &
exit 0

Make the file executable:

&prompt.root; chmod +x $MATLAB/bin/finish.sh

Using &matlab;
At this point you are ready to type matlab and start using it.

Marcel Moolenaar Contributed by

Installing &oracle;
applications Oracle
Preface
This document describes the process of installing &oracle; 8.0.5 and &oracle; 8.0.5.1 Enterprise Edition for Linux onto a FreeBSD machine.

Installing the Linux Environment
Make sure you have both emulators/linux_base and devel/linux_devtools from the Ports Collection installed. If you run into difficulties with these ports, you may have to use the packages or older versions available in the Ports Collection.

If you want to run the intelligent agent, you will also need to install the Red Hat Tcl package: tcl-8.0.3-20.i386.rpm. The general command for installing packages with the official RPM port (archivers/rpm) is:

&prompt.root; rpm -i --ignoreos --root /compat/linux --dbpath /var/lib/rpm package

Installation of the package should not generate any errors.

Creating the &oracle; Environment
Before you can install &oracle;, you need to set up a proper environment. This document describes only the FreeBSD-specific steps needed to run &oracle; for Linux, not what is already covered in the &oracle; installation guide.

Kernel Tuning
kernel tuning
As described in the &oracle; installation guide, you need to set the maximum size of shared memory. Do not use SHMMAX under FreeBSD. SHMMAX is merely calculated out of SHMMAXPGS and PGSIZE.
Therefore define SHMMAXPGS. All other options can be used as described in the guide. For example:

options SHMMAXPGS=10000
options SHMMNI=100
options SHMSEG=10
options SEMMNS=200
options SEMMNI=70
options SEMMSL=61

Set these options to suit your intended use of &oracle;. Also, make sure you have the following options in your kernel configuration file:

options SYSVSHM #SysV shared memory
options SYSVSEM #SysV semaphores
options SYSVMSG #SysV interprocess communication

&oracle; Account
Create an oracle account just as you would create any other account. The oracle account is special only in that you need to give it a Linux shell. Add /compat/linux/bin/bash to /etc/shells and set the shell for the oracle account to /compat/linux/bin/bash.

Environment
Besides the normal &oracle; variables, such as ORACLE_HOME and ORACLE_SID, you must set the following environment variables:

Variable Value
LD_LIBRARY_PATH $ORACLE_HOME/lib
CLASSPATH $ORACLE_HOME/jdbc/lib/classes111.zip
PATH /compat/linux/bin /compat/linux/sbin /compat/linux/usr/bin /compat/linux/usr/sbin /bin /sbin /usr/bin /usr/sbin /usr/local/bin $ORACLE_HOME/bin

It is advised to set all the environment variables in .profile. A complete example is:

ORACLE_BASE=/oracle; export ORACLE_BASE
ORACLE_HOME=/oracle; export ORACLE_HOME
LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH
ORACLE_SID=ORCL; export ORACLE_SID
ORACLE_TERM=386x; export ORACLE_TERM
CLASSPATH=$ORACLE_HOME/jdbc/lib/classes111.zip
export CLASSPATH
PATH=/compat/linux/bin:/compat/linux/sbin:/compat/linux/usr/bin
PATH=$PATH:/compat/linux/usr/sbin:/bin:/sbin:/usr/bin:/usr/sbin
PATH=$PATH:/usr/local/bin:$ORACLE_HOME/bin
export PATH

Installing &oracle;
Due to a slight inconsistency in the Linux emulator, you need to create a directory named .oracle in /var/tmp before you start the installer. Let it be owned by the oracle user. You should be able to install &oracle; without any problems. If you have problems, check your &oracle; distribution and/or configuration first! After you have installed &oracle;, apply the patches described in the next two subsections.

A frequent problem is that the TCP protocol adapter is not installed right. As a consequence, you cannot start any TCP listeners. The following actions help solve this problem:

&prompt.root; cd $ORACLE_HOME/network/lib
&prompt.root; make -f ins_network.mk ntcontab.o
&prompt.root; cd $ORACLE_HOME/lib
&prompt.root; ar r libnetwork.a ntcontab.o
&prompt.root; cd $ORACLE_HOME/network/lib
&prompt.root; make -f ins_network.mk install

Do not forget to run root.sh again!

Patching root.sh
When installing &oracle;, some actions, which need to be performed as root, are recorded in a shell script called root.sh. This script is written to the orainst directory. Apply the following patch to root.sh, to have it use the proper location of chown, or alternatively run the script under a Linux native shell.

*** orainst/root.sh.orig Tue Oct 6 21:57:33 1998
--- orainst/root.sh Mon Dec 28 15:58:53 1998
***************
*** 31,37 ****
# This is the default value for CHOWN
# It will redefined later in this script for those ports
# which have it conditionally defined in ss_install.h
! CHOWN=/bin/chown
#
# Define variables to be used in this script
--- 31,37 ----
# This is the default value for CHOWN
# It will redefined later in this script for those ports
# which have it conditionally defined in ss_install.h
! CHOWN=/usr/sbin/chown
#
# Define variables to be used in this script

When you do not install &oracle; from CD, you can patch the source for root.sh.
It is called rthd.sh and is located in the orainst directory in the source tree.

Patching genclntsh
The script genclntsh is used to create a single shared client library. It is used when building the demos. Apply the following patch to comment out the definition of PATH:

*** bin/genclntsh.orig Wed Sep 30 07:37:19 1998
--- bin/genclntsh Tue Dec 22 15:36:49 1998
***************
*** 32,38 ****
#
# Explicit path to ensure that we're using the correct commands
#PATH=/usr/bin:/usr/ccs/bin export PATH
! PATH=/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin export PATH
#
# each product MUST provide a $PRODUCT/admin/shrept.lst
--- 32,38 ----
#
# Explicit path to ensure that we're using the correct commands
#PATH=/usr/bin:/usr/ccs/bin export PATH
! #PATH=/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin export PATH
#
# each product MUST provide a $PRODUCT/admin/shrept.lst

Running &oracle;
When you have followed the instructions, you should be able to run &oracle; as if it were running on Linux itself.

Holger Kipp Contributed by Valentino Vaschetto Original version converted to SGML by

Installing &sap.r3;
applications SAP R/3
Installations of &sap; Systems using FreeBSD will not be supported by the &sap; support team — they only offer support for certified platforms.

Preface
This document describes a possible way of installing a &sap.r3; System with &oracle; Database for Linux onto a FreeBSD machine, including the installation of FreeBSD and &oracle;. Two different configurations will be described:

&sap.r3; 4.6B (IDES) with &oracle; 8.0.5 on FreeBSD 4.3-STABLE
&sap.r3; 4.6C with &oracle; 8.1.7 on FreeBSD 4.5-STABLE

Even though this document tries to describe all important steps in greater detail, it is not intended as a replacement for the &oracle; and &sap.r3; installation guides. Please see the documentation that comes with the &sap.r3; Linux edition for &sap;- and &oracle;-specific questions, as well as resources from &oracle; and &sap; OSS.

Software
The following CD-ROMs have been used for &sap; installations:

&sap.r3; 4.6B, &oracle; 8.0.5
Name Number Description
KERNEL 51009113 SAP Kernel Oracle / Installation / AIX, Linux, Solaris
RDBMS 51007558 Oracle / RDBMS 8.0.5.X / Linux
EXPORT1 51010208 IDES / DB-Export / Disc 1 of 6
EXPORT2 51010209 IDES / DB-Export / Disc 2 of 6
EXPORT3 51010210 IDES / DB-Export / Disc 3 of 6
EXPORT4 51010211 IDES / DB-Export / Disc 4 of 6
EXPORT5 51010212 IDES / DB-Export / Disc 5 of 6
EXPORT6 51010213 IDES / DB-Export / Disc 6 of 6

Additionally, we used the &oracle; 8 Server (Pre-production version 8.0.5 for Linux, Kernel Version 2.0.33) CD, which is not really necessary, and FreeBSD 4.3-STABLE (it was only a few days past 4.3 RELEASE).

&sap.r3; 4.6C SR2, &oracle; 8.1.7
Name Number Description
KERNEL 51014004 SAP Kernel Oracle / SAP Kernel Version 4.6D / DEC, Linux
RDBMS 51012930 Oracle 8.1.7/ RDBMS / Linux
EXPORT1 51013953 Release 4.6C SR2 / Export / Disc 1 of 4
EXPORT2 51013953 Release 4.6C SR2 / Export / Disc 2 of 4
EXPORT3 51013953 Release 4.6C SR2 / Export / Disc 3 of 4
EXPORT4 51013953 Release 4.6C SR2 / Export / Disc 4 of 4
LANG1 51013954 Release 4.6C SR2 / Language / DE, EN, FR / Disc 1 of 3

Depending on the languages you would like to install, additional language CDs might be necessary. Here we are just using DE and EN, so the first language CD is the only one needed. As a little note, the numbers for all four EXPORT CDs are identical. All three language CDs also have the same number (this is different from the 4.6B IDES release CD numbering).
At the time of writing, this installation was running on FreeBSD 4.5-STABLE (20.03.2002). &sap; Notes The following notes should be read before installing &sap.r3;; they proved to be useful during installation: &sap.r3; 4.6B, &oracle; 8.0.5 Number Title 0171356 SAP Software on Linux: Essential Comments 0201147 INST: 4.6C R/3 Inst. on UNIX - Oracle 0373203 Update / Migration Oracle 8.0.5 --> 8.0.6/8.1.6 LINUX 0072984 Release of Digital UNIX 4.0B for Oracle 0130581 R3SETUP step DIPGNTAB terminates 0144978 Your system has not been installed correctly 0162266 Questions and tips for R3SETUP on Windows NT / W2K &sap.r3; 4.6C, &oracle; 8.1.7 Number Title 0015023 Initializing table TCPDB (RSXP0004) (EBCDIC) 0045619 R/3 with several languages or typefaces 0171356 SAP Software on Linux: Essential Comments 0195603 RedHat 6.1 Enterprise version: Known problems 0212876 The new archiving tool SAPCAR 0300900 Linux: Released DELL Hardware 0377187 RedHat 6.2: important remarks 0387074 INST: R/3 4.6C SR2 Installation on UNIX 0387077 INST: R/3 4.6C SR2 Inst. on UNIX - Oracle 0387078 SAP Software on UNIX: OS Dependencies 4.6C SR2 Hardware Requirements The following equipment is sufficient for the installation of a &sap.r3; System. For production use, a more exact sizing is of course needed: Component 4.6B 4.6C Processor 2 x 800MHz &pentium; III 2 x 800MHz &pentium; III Memory 1GB ECC 2GB ECC Hard Disk Space 50-60GB (IDES) 50-60GB (IDES) For use in production, &xeon; Processors with large cache, high-speed disk access (SCSI, RAID hardware controller), UPS and ECC-RAM are recommended. The large amount of hard disk space is due to the preconfigured IDES System, which creates 27 GB of database files during installation. This space is also sufficient for initial production systems and application data. &sap.r3; 4.6B, &oracle; 8.0.5 The following off-the-shelf hardware was used: a dual processor board with two 800 MHz &pentium; III processors, an &adaptec; 29160 Ultra160 SCSI adapter (for accessing a 40/80 GB DLT tape drive and CDROM), and a &mylex; &acceleraid; (2 channels, firmware 6.00-1-00 with 32 MB RAM). To the &mylex; RAID controller are attached two 17 GB hard disks (mirrored) and four 36 GB hard disks (RAID level 5). &sap.r3; 4.6C, &oracle; 8.1.7 For this installation a &dell; &poweredge; 2500 was used: a dual processor board with two 1000 MHz &pentium; III processors (256 kB Cache), 2 GB PC133 ECC SDRAM, a PERC/3 DC PCI RAID Controller with 128 MB, and an EIDE DVD-ROM drive. To the RAID controller are attached two 18 GB hard disks (mirrored) and four 36 GB hard disks (RAID level 5). Installation of FreeBSD First you have to install FreeBSD. There are several ways to do this (FreeBSD 4.3 was installed via FTP, FreeBSD 4.5 directly from the RELEASE CD); for more information, read the . Disk Layout To keep it simple, the same disk layout was used for both the &sap.r3; 46B and &sap.r3; 46C SR2 installations. Only the device names changed, as the installations were on different hardware (/dev/da and /dev/amr respectively; so if using an AMI &megaraid;, one will see /dev/amr0s1a instead of /dev/da0s1a): File system Size (1k-blocks) Size (GB) Mounted on /dev/da0s1a 1.016.303 1 / /dev/da0s1b 6 swap /dev/da0s1e 2.032.623 2 /var /dev/da0s1f 8.205.339 8 /usr /dev/da1s1e 45.734.361 45 /compat/linux/oracle /dev/da1s1f 2.032.623 2 /compat/linux/sapmnt /dev/da1s1g 2.032.623 2 /compat/linux/usr/sap Configure and initialize the two logical drives with the &mylex; or PERC/3 RAID software beforehand.
The software can be started during the BIOS boot phase. Please note that this disk layout differs slightly from the &sap; recommendations, as &sap; suggests mounting the &oracle; subdirectories (and some others) separately — we decided to just create them as real subdirectories for simplicity. <command>make world</command> and a New Kernel Download the latest -STABLE sources. Rebuild world and your custom kernel after editing your kernel configuration file. Here you should also include the kernel parameters which are required for both &sap.r3; and &oracle;. Installing the Linux Environment Installing the Linux Base System First the linux_base port needs to be installed (as root): &prompt.root; cd /usr/ports/emulators/linux_base &prompt.root; make install distclean Installing Linux Development Environment The Linux development environment is needed if you want to install &oracle; on FreeBSD according to the : &prompt.root; cd /usr/ports/devel/linux_devtools &prompt.root; make install distclean The Linux development environment has only been installed for the &sap.r3; 46B IDES installation. It is not needed if the &oracle; DB is not relinked on the FreeBSD system. This is the case if you are using the &oracle; tarball from a Linux system. Installing the Necessary RPMs RPMs To start the R3SETUP program, PAM support is needed. During the first &sap; installation on FreeBSD 4.3-STABLE we tried to install PAM with all the required packages and finally forced the installation of the PAM package, which worked. For &sap.r3; 4.6C SR2 we directly forced the installation of the PAM RPM, which also works, so it seems the dependent packages are not needed: &prompt.root; rpm -i --ignoreos --nodeps --root /compat/linux --dbpath /var/lib/rpm \ pam-0.68-7.i386.rpm For &oracle; 8.0.5 to run the intelligent agent, we also had to install the RedHat Tcl package tcl-8.0.5-30.i386.rpm (otherwise the relinking during &oracle; installation will not work). There are some other issues regarding relinking of &oracle;, but that is an &oracle;/Linux issue, not a FreeBSD-specific one. Some Additional Hints It might also be a good idea to add linprocfs to /etc/fstab; for more information, see the &man.linprocfs.5; manual page. Another parameter to set is kern.fallback_elf_brand=3, which is done in the file /etc/sysctl.conf. Creating the &sap.r3; Environment Creating the Necessary File Systems and Mountpoints For a simple installation, it is sufficient to create the following file systems: mount point size in GB /compat/linux/oracle 45 GB /compat/linux/sapmnt 2 GB /compat/linux/usr/sap 2 GB It is also necessary to create some links. Otherwise the &sap; Installer will complain, as it is checking the created links: &prompt.root; ln -s /compat/linux/oracle /oracle &prompt.root; ln -s /compat/linux/sapmnt /sapmnt &prompt.root; ln -s /compat/linux/usr/sap /usr/sap Possible error message during installation (here with System PRD and the &sap.r3; 4.6C SR2 installation): INFO 2002-03-19 16:45:36 R3LINKS_IND_IND SyLinkCreate:200 Checking existence of symbolic link /usr/sap/PRD/SYS/exe/dbg to /sapmnt/PRD/exe. Creating if it does not exist... WARNING 2002-03-19 16:45:36 R3LINKS_IND_IND SyLinkCreate:400 Link /usr/sap/PRD/SYS/exe/dbg exists but it points to file /compat/linux/sapmnt/PRD/exe instead of /sapmnt/PRD/exe. The program cannot go on as long as this link exists at this location. Move the link to another location.
ERROR 2002-03-19 16:45:36 R3LINKS_IND_IND Ins_SetupLinks:0 can not setup link '/usr/sap/PRD/SYS/exe/dbg' with content '/sapmnt/PRD/exe' Creating Users and Directories &sap.r3; needs two users and three groups. The user names depend on the &sap; system ID (SID) which consists of three letters. Some of these SIDs are reserved by &sap; (for example SAP and NIX; for a complete list, please see the &sap; documentation). For the IDES installation we used IDS, for the 4.6C SR2 installation PRD, as that system is intended for production use. We therefore have the following groups (group IDs might differ; these are just the values we used with our installation): group ID group name description 100 dba Data Base Administrator 101 sapsys &sap; System 102 oper Data Base Operator For a default &oracle; installation, only group dba is used. As oper group, one also uses group dba (see &oracle; and &sap; documentation for further information). We also need the following users: user ID user name generic name group additional groups description 1000 idsadm/prdadm sidadm sapsys oper &sap; Administrator 1002 oraids/oraprd orasid dba oper &oracle; Administrator Adding the users with &man.adduser.8; requires the following (please note shell and home directory) entries for the &sap; Administrator: Name: sidadm Password: ****** Fullname: SAP Administrator SID Uid: 1000 Gid: 101 (sapsys) Class: Groups: sapsys dba HOME: /home/sidadm Shell: bash (/compat/linux/bin/bash) and for the &oracle; Administrator: Name: orasid Password: ****** Fullname: Oracle Administrator SID Uid: 1002 Gid: 100 (dba) Class: Groups: dba HOME: /oracle/sid Shell: bash (/compat/linux/bin/bash) This should also include group oper in case you are using both groups dba and oper. Creating Directories These directories are usually created as separate file systems. This depends entirely on your requirements. We chose to create them as simple directories, as they are all located on the same RAID 5 anyway: First we will set owners and rights of some directories (as user root): &prompt.root; chmod 775 /oracle &prompt.root; chmod 777 /sapmnt &prompt.root; chown root:dba /oracle &prompt.root; chown sidadm:sapsys /compat/linux/usr/sap &prompt.root; chmod 775 /compat/linux/usr/sap Second we will create directories as user orasid. These will all be subdirectories of /oracle/SID: &prompt.root; su - orasid &prompt.root; cd /oracle/SID &prompt.root; mkdir mirrlogA mirrlogB origlogA origlogB &prompt.root; mkdir sapdata1 sapdata2 sapdata3 sapdata4 sapdata5 sapdata6 &prompt.root; mkdir saparch sapreorg &prompt.root; exit For the &oracle; 8.1.7 installation some additional directories are needed: &prompt.root; su - orasid &prompt.root; cd /oracle &prompt.root; mkdir 805_32 &prompt.root; mkdir client stage &prompt.root; mkdir client/80x_32 &prompt.root; mkdir stage/817_32 &prompt.root; cd /oracle/SID &prompt.root; mkdir 817_32 The directory client/80x_32 is used with exactly this name. Do not replace the x with a number or anything else. In the third step we create directories as user sidadm: &prompt.root; su - sidadm &prompt.root; cd /usr/sap &prompt.root; mkdir SID &prompt.root; mkdir trans &prompt.root; exit Entries in <filename>/etc/services</filename> &sap.r3; requires some entries in the file /etc/services, which will not be set correctly during installation under FreeBSD. Please add the following entries (you need at least those entries corresponding to the instance number — in this case, 00; it will do no harm adding all entries from 00 to 99 for dp, gw, sp and ms).
If you are going to use a SAProuter or need to access &sap; OSS, you also need 99, as port 3299 is usually used for the SAProuter process on the target system: sapdp00 3200/tcp # SAP Dispatcher. 3200 + Instance-Number sapgw00 3300/tcp # SAP Gateway. 3300 + Instance-Number sapsp00 3400/tcp # 3400 + Instance-Number sapms00 3500/tcp # 3500 + Instance-Number sapmsSID 3600/tcp # SAP Message Server. 3600 + Instance-Number sapgw00s 4800/tcp # SAP Secure Gateway 4800 + Instance-Number Necessary Locales locale &sap; requires at least two locales that are not part of the default RedHat installation. &sap; offers the required RPMs as a download from its FTP server (which is only accessible if you are a customer with OSS access). See Note 0171356 for a list of RPMs you need. It is also possible to just create appropriate links (for example from de_DE and en_US), but we would not recommend this for a production system (so far it worked with the IDES system without any problems, though). The following locales are needed: de_DE.ISO-8859-1 en_US.ISO-8859-1 Create the links like this: &prompt.root; cd /compat/linux/usr/share/locale &prompt.root; ln -s de_DE de_DE.ISO-8859-1 &prompt.root; ln -s en_US en_US.ISO-8859-1 If they are not present, there will be some problems during the installation. If these are then subsequently ignored (by setting the STATUS of the offending steps to OK in file CENTRDB.R3S), it will be impossible to log onto the &sap; system without some additional effort. Kernel Tuning kernel tuning &sap.r3; systems need a lot of resources. We therefore added the following parameters to the kernel configuration file: # Set these for memory pigs (SAP and Oracle): options MAXDSIZ="(1024*1024*1024)" options DFLDSIZ="(1024*1024*1024)" # System V options needed. options SYSVSHM #SYSV-style shared memory options SHMMAXPGS=262144 #max amount of shared mem. pages #options SHMMAXPGS=393216 #use this for the 46C inst.parameters options SHMMNI=256 #max number of shared memory identifiers options SHMSEG=100 #max shared mem.segs per process options SYSVMSG #SYSV-style message queues options MSGSEG=32767 #max num. of mes.segments in system options MSGSSZ=32 #size of msg-seg. MUST be power of 2 options MSGMNB=65535 #max char. per message queue options MSGTQL=2046 #max amount of msgs in system options SYSVSEM #SYSV-style semaphores options SEMMNU=256 #number of semaphore UNDO structures options SEMMNS=1024 #number of semaphores in system options SEMMNI=520 #number of semaphore identifiers options SEMUME=100 #number of UNDO keys The minimum values are specified in the documentation that comes from &sap;. As there is no description for Linux, see the HP-UX section (32-bit) for further information. As the system for the 4.6C SR2 installation has more main memory, the shared segments can be larger both for &sap; and &oracle;; therefore, choose a larger number of shared memory pages. With the default installation of FreeBSD 4.5 on &i386;, leave MAXDSIZ and DFLDSIZ at 1 GB maximum. Otherwise, strange errors like ORA-27102: out of memory and Linux Error: 12: Cannot allocate memory might happen. Installing &sap.r3; Preparing &sap; CDROMs There are many CDROMs to mount and unmount during the installation. Assuming you have enough CDROM drives, you can just mount them all.
We decided to copy the CDROMs' contents to corresponding directories: /oracle/SID/sapreorg/cd-name where cd-name was one of KERNEL, RDBMS, EXPORT1, EXPORT2, EXPORT3, EXPORT4, EXPORT5 and EXPORT6 for the 4.6B/IDES installation, and KERNEL, RDBMS, DISK1, DISK2, DISK3, DISK4 and LANG for the 4.6C SR2 installation. All the filenames on the mounted CDs should be in capital letters; otherwise, use the option for mounting. So use the following commands: &prompt.root; mount_cd9660 -g /dev/cd0a /mnt &prompt.root; cp -R /mnt/* /oracle/SID/sapreorg/cd-name &prompt.root; umount /mnt Running the Installation Script First you have to prepare an install directory: &prompt.root; cd /oracle/SID/sapreorg &prompt.root; mkdir install &prompt.root; cd install Then the installation script is started, which will copy nearly all the relevant files into the install directory: &prompt.root; /oracle/SID/sapreorg/KERNEL/UNIX/INSTTOOL.SH The IDES installation (4.6B) comes with a fully customized &sap.r3; demonstration system, so there are six instead of just three EXPORT CDs. At this point the installation template CENTRDB.R3S is for installing a standard central instance (&r3; and database), not the IDES central instance, so one needs to copy the corresponding CENTRDB.R3S from the EXPORT1 directory; otherwise R3SETUP will only ask for three EXPORT CDs. The newer &sap; 4.6C SR2 release comes with four EXPORT CDs. The parameter file that controls the installation steps is CENTRAL.R3S. Contrary to earlier releases, there are no separate installation templates for a central instance with or without database; &sap; uses a separate template for database installation. To restart the installation later, it is however sufficient to restart with the original file. During and after installation, &sap; requires hostname to return the computer name only, not the fully qualified domain name. So either set the hostname accordingly, or set an alias with alias hostname='hostname -s' for both orasid and sidadm (and for root, at least during installation steps performed as root). It is also possible to adjust the installed .profile and .login files of both users that are installed during &sap; installation.
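For example, a minimal sketch of this alias as it could be appended to the .profile of both users (the comment is illustrative and not part of the &sap; documentation):

# Have hostname return the short host name only, as R3SETUP expects
alias hostname='hostname -s'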
Start <command>R3SETUP</command> 4.6B Make sure LD_LIBRARY_PATH is set correctly: &prompt.root; export LD_LIBRARY_PATH=/oracle/IDS/lib:/sapmnt/IDS/exe:/oracle/805_32/lib Start R3SETUP as root from the installation directory: &prompt.root; cd /oracle/IDS/sapreorg/install &prompt.root; ./R3SETUP -f CENTRDB.R3S The script then asks some questions (defaults in brackets, followed by actual input): Question Default Input Enter SAP System ID [C11] IDSEnter Enter SAP Instance Number [00] Enter Enter SAPMOUNT Directory [/sapmnt] Enter Enter name of SAP central host [troubadix.domain.de] Enter Enter name of SAP db host [troubadix] Enter Select character set [1] (WE8DEC) Enter Enter Oracle server version (1) Oracle 8.0.5, (2) Oracle 8.0.6, (3) Oracle 8.1.5, (4) Oracle 8.1.6 1Enter Extract Oracle Client archive [1] (Yes, extract) Enter Enter path to KERNEL CD [/sapcd] /oracle/IDS/sapreorg/KERNEL Enter path to RDBMS CD [/sapcd] /oracle/IDS/sapreorg/RDBMS Enter path to EXPORT1 CD [/sapcd] /oracle/IDS/sapreorg/EXPORT1 Directory to copy EXPORT1 CD [/oracle/IDS/sapreorg/CD4_DIR] Enter Enter path to EXPORT2 CD [/sapcd] /oracle/IDS/sapreorg/EXPORT2 Directory to copy EXPORT2 CD [/oracle/IDS/sapreorg/CD5_DIR] Enter Enter path to EXPORT3 CD [/sapcd] /oracle/IDS/sapreorg/EXPORT3 Directory to copy EXPORT3 CD [/oracle/IDS/sapreorg/CD6_DIR] Enter Enter path to EXPORT4 CD [/sapcd] /oracle/IDS/sapreorg/EXPORT4 Directory to copy EXPORT4 CD [/oracle/IDS/sapreorg/CD7_DIR] Enter Enter path to EXPORT5 CD [/sapcd] /oracle/IDS/sapreorg/EXPORT5 Directory to copy EXPORT5 CD [/oracle/IDS/sapreorg/CD8_DIR] Enter Enter path to EXPORT6 CD [/sapcd] /oracle/IDS/sapreorg/EXPORT6 Directory to copy EXPORT6 CD [/oracle/IDS/sapreorg/CD9_DIR] Enter Enter amount of RAM for SAP + DB 850Enter (in Megabytes) Service Entry Message Server [3600] Enter Enter Group-ID of sapsys [101] Enter Enter Group-ID of oper [102] Enter Enter Group-ID of dba [100] Enter Enter User-ID of sidadm [1000] Enter Enter User-ID of orasid [1002] Enter Number of parallel procs [2] Enter If you have not copied the CDs to the different locations, the &sap; installer cannot find the CD needed (identified by the LABEL.ASC file on the CD) and will then ask you to insert and mount the CD and to confirm or enter the mount path. The CENTRDB.R3S might not be error free. In our case, it requested the EXPORT4 CD again but indicated the correct key (6_LOCATION, then 7_LOCATION etc.), so one can just continue with entering the correct values. Apart from some problems mentioned below, everything should go straight through up to the point where the &oracle; database software needs to be installed. Start <command>R3SETUP</command> 4.6C SR2 Make sure LD_LIBRARY_PATH is set correctly.
This is a different value from the 4.6B installation with &oracle; 8.0.5: &prompt.root; export LD_LIBRARY_PATH=/sapmnt/PRD/exe:/oracle/PRD/817_32/lib Start R3SETUP as user root from the installation directory: &prompt.root; cd /oracle/PRD/sapreorg/install &prompt.root; ./R3SETUP -f CENTRAL.R3S The script then asks some questions (defaults in brackets, followed by actual input): Question Default Input Enter SAP System ID [C11] PRDEnter Enter SAP Instance Number [00] Enter Enter SAPMOUNT Directory [/sapmnt] Enter Enter name of SAP central host [majestix] Enter Enter Database System ID [PRD] PRDEnter Enter name of SAP db host [majestix] Enter Select character set [1] (WE8DEC) Enter Enter Oracle server version (2) Oracle 8.1.7 2Enter Extract Oracle Client archive [1] (Yes, extract) Enter Enter path to KERNEL CD [/sapcd] /oracle/PRD/sapreorg/KERNEL Enter amount of RAM for SAP + DB 2044 1800Enter (in Megabytes) Service Entry Message Server [3600] Enter Enter Group-ID of sapsys [100] Enter Enter Group-ID of oper [101] Enter Enter Group-ID of dba [102] Enter Enter User-ID of oraprd [1002] Enter Enter User-ID of prdadm [1000] Enter LDAP support 3Enter (no support) Installation step completed [1] (continue) Enter Choose installation service [1] (DB inst,file) Enter So far, the creation of users gives an error during installation in the phases OSUSERDBSID_IND_ORA (for creating user orasid) and OSUSERSIDADM_IND_ORA (creating user sidadm). Apart from some problems mentioned below, everything should go straight through up to the point where the &oracle; database software needs to be installed. Installing &oracle; 8.0.5 Please see the corresponding &sap; Notes and &oracle; Readmes regarding Linux and &oracle; DB for possible problems. Most if not all problems stem from incompatible libraries. For more information on installing &oracle;, refer to the Installing &oracle; chapter. Installing &oracle; 8.0.5 with <command>orainst</command> If &oracle; 8.0.5 is to be used, some additional libraries are needed for successful relinking, as &oracle; 8.0.5 was linked with an old glibc (RedHat 6.0), but RedHat 6.1 already uses a new glibc. So you have to install the following additional packages to ensure that linking will work: compat-libs-5.2-2.i386.rpm compat-glibc-5.2-2.0.7.2.i386.rpm compat-egcs-5.2-1.0.3a.1.i386.rpm compat-egcs-c++-5.2-1.0.3a.1.i386.rpm compat-binutils-5.2-2.9.1.0.23.1.i386.rpm See the corresponding &sap; Notes or &oracle; Readmes for further information. If this is not an option (at the time of installation we did not have enough time to check this), one could use the original binaries, or use the relinked binaries from an original RedHat system. For compiling the intelligent agent, the RedHat Tcl package must be installed. If you cannot get tcl-8.0.3-20.i386.rpm, a newer one like tcl-8.0.5-30.i386.rpm for RedHat 6.1 should also do. Apart from relinking, the installation is straightforward: &prompt.root; su - oraids &prompt.root; export TERM=xterm &prompt.root; export ORACLE_TERM=xterm &prompt.root; export ORACLE_HOME=/oracle/IDS &prompt.root; cd $ORACLE_HOME/orainst_sap &prompt.root; ./orainst Confirm all screens with Enter until the software is installed, except that one has to deselect the &oracle; On-Line Text Viewer, as this is not currently available for Linux. &oracle; then wants to relink with i386-glibc20-linux-gcc instead of the available gcc, egcs or i386-redhat-linux-gcc.
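A possible but untested workaround for this compiler name mismatch (the path and link below are assumptions, not part of the original procedure) would be to make the expected name resolve to the compiler that is actually available:

&prompt.root; cd /compat/linux/usr/bin
&prompt.root; ln -s gcc i386-glibc20-linux-gcc

Whether the relink then succeeds still depends on the compatibility packages listed above.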
Due to time constraints, we decided to use the binaries from an &oracle; 8.0.5 pre-production release, after the first attempt at getting the version from the RDBMS CD working failed; finding and accessing the correct RPMs was a nightmare at that time. Installing the &oracle; 8.0.5 Pre-production Release for Linux (Kernel 2.0.33) This installation is quite easy. Mount the CD and start the installer. It will then ask for the location of the &oracle; home directory, and copy all binaries there. We did not delete the remains of our previous RDBMS installation tries, though. Afterwards, the &oracle; Database could be started with no problems. Installing the &oracle; 8.1.7 Linux Tarball Take the tarball oracle81732.tgz you produced from the installation directory on a Linux system and untar it to /oracle/SID/817_32/. Continue with &sap.r3; Installation First check the environment settings of users idsadm (sidadm) and oraids (orasid). They should now both have the files .profile, .login and .cshrc which are all using hostname. In case the system's hostname is the fully qualified name, you need to change hostname to hostname -s within all three files. Database Load Afterwards, R3SETUP can either be restarted or continued (depending on whether exit was chosen or not). R3SETUP then creates the tablespaces and loads the data (for 46B IDES, from EXPORT1 to EXPORT6, for 46C from DISK1 to DISK4) with R3load into the database. When the database load is finished (it might take a few hours), some passwords are requested. For test installations, one can use the well known default passwords (use different ones if security is an issue!): Question Input Enter Password for sapr3 sapEnter Confirm Password for sapr3 sapEnter Enter Password for sys change_on_installEnter Confirm Password for sys change_on_installEnter Enter Password for system managerEnter Confirm Password for system managerEnter At this point, we had a few problems with dipgntab during the 4.6B installation. Listener Start the &oracle; Listener as user orasid as follows: &prompt.user; umask 0; lsnrctl start Otherwise you might get the error ORA-12546 as the sockets will not have the correct permissions. See &sap; Note 0072984. Updating MNLS Tables If you plan to import non-Latin-1 languages into the &sap; system, you have to update the Multi National Language Support tables. This is described in the &sap; OSS Notes 0015023 and 0045619. Otherwise, you can skip this question during &sap; installation. If you do not need MNLS, it is still necessary to check the table TCPDB and initialize it if this has not been done. See &sap; Notes 0015023 and 0045619 for further information. Post-installation Steps Request &sap.r3; License Key You have to request your &sap.r3; License Key. This is needed, as the temporary license that was installed during installation is only valid for four weeks. First get the hardware key. Log on as user idsadm and call saplicense: &prompt.root; /sapmnt/IDS/exe/saplicense -get Calling saplicense without parameters gives a list of options. Upon receiving the license key, install it using: &prompt.root; /sapmnt/IDS/exe/saplicense -install You are then required to enter the following values: SAP SYSTEM ID = SID, 3 chars CUSTOMER KEY = hardware key, 11 chars INSTALLATION NO = installation, 10 digits EXPIRATION DATE = yyyymmdd, usually "99991231" LICENSE KEY = license key, 24 chars Creating Users Create a user within client 000 (for some tasks required to be done within client 000, but with a user different from users sap* and ddic).
As a user name, we usually choose wartung (or service in English). Profiles required are sap_new and sap_all. For additional safety, the passwords of default users within all clients should be changed (this includes users sap* and ddic). Configure Transport System, Profile, Operation Modes, Etc. Within client 000, as a user different from ddic and sap*, do at least the following: Task Transaction Configure Transport System, e.g. as Stand-Alone Transport Domain Entity STMS Create / Edit Profile for System RZ10 Maintain Operation Modes and Instances RZ04 These and all the other post-installation steps are thoroughly described in the &sap; installation guides. Edit <filename>init<replaceable>sid</replaceable>.sap</filename> (<filename>initIDS.sap</filename>) The file /oracle/IDS/dbs/initIDS.sap contains the &sap; backup profile. Here the size of the tape to be used, the type of compression and so on need to be defined. To get this running with sapdba / brbackup, we changed the following values: compress = hardware archive_function = copy_delete_save cpio_flags = "-ov --format=newc --block-size=128 --quiet" cpio_in_flags = "-iuv --block-size=128 --quiet" tape_size = 38000M tape_address = /dev/nsa0 tape_address_rew = /dev/sa0 Explanations: compress: The tape we use is an HP DLT1 which does hardware compression. archive_function: This defines the default behavior for saving &oracle; archive logs: new logfiles are saved to tape, already saved logfiles are saved again and are then deleted. This prevents lots of trouble if you need to recover the database and one of the archive tapes has gone bad. cpio_flags: Default is to use which sets block size to 5120 Bytes. For DLT Tapes, HP recommends at least 32 K block size, so we used for 64 K. is needed because we have inode numbers greater than 65535. The last option is needed as otherwise brbackup complains as soon as cpio outputs the numbers of blocks saved. cpio_in_flags: Flags needed for loading data back from tape. Format is recognized automatically. tape_size: This usually gives the raw storage capability of the tape. For safety reasons (we use hardware compression), the value is slightly lower than the actual capacity. tape_address: The non-rewindable device to be used with cpio. tape_address_rew: The rewindable device to be used with cpio. Configuration Issues after Installation The following &sap; parameters should be tuned after installation (examples for IDES 46B, 1 GB memory): Name Value ztta/roll_extension 250000000 abap/heap_area_dia 300000000 abap/heap_area_nondia 400000000 em/initial_size_MB 256 em/blocksize_kB 1024 ipc/shm_psize_40 70000000 &sap; Note 0013026: Name Value ztta/dynpro_area 2500000 &sap; Note 0157246: Name Value rdisp/ROLL_MAXFS 16000 rdisp/PG_MAXFS 30000 With the above parameters, on a system with 1 gigabyte of memory, one may find memory consumption similar to: Mem: 547M Active, 305M Inact, 109M Wired, 40M Cache, 112M Buf, 3492K Free Problems during Installation Restart <command>R3SETUP</command> after Fixing a Problem R3SETUP stops if it encounters an error. If you have looked at the corresponding logfiles and fixed the error, you have to start R3SETUP again, usually selecting REPEAT as the option for the last step R3SETUP complained about. To restart R3SETUP, just start it with the corresponding R3S file: &prompt.root; ./R3SETUP -f CENTRDB.R3S for 4.6B, or with &prompt.root; ./R3SETUP -f CENTRAL.R3S for 4.6C, no matter whether the error occurred with CENTRAL.R3S or DATABASE.R3S.
At some stages, R3SETUP assumes that both database and &sap; processes are up and running (as those were steps it already completed). If errors occur and, for example, the database cannot be started, you have to start both the database and &sap; by hand after fixing the errors and before starting R3SETUP again. Do not forget to also start the &oracle; listener again (as orasid with umask 0; lsnrctl start) if it was also stopped (for example due to a necessary reboot of the system). OSUSERSIDADM_IND_ORA during <command>R3SETUP</command> If R3SETUP complains at this stage, edit the template file R3SETUP used at that time (CENTRDB.R3S (4.6B) or either CENTRAL.R3S or DATABASE.R3S (4.6C)). Locate [OSUSERSIDADM_IND_ORA] or search for the only STATUS=ERROR entry and edit the following values: HOME=/home/sidadm (was empty) STATUS=OK (had status ERROR) Then you can restart R3SETUP. OSUSERDBSID_IND_ORA during <command>R3SETUP</command> Possibly R3SETUP also complains at this stage. The error here is similar to the one in phase OSUSERSIDADM_IND_ORA. Just edit the template file R3SETUP used at that time (CENTRDB.R3S (4.6B) or either CENTRAL.R3S or DATABASE.R3S (4.6C)). Locate [OSUSERDBSID_IND_ORA] or search for the only STATUS=ERROR entry and edit the following value in that section: STATUS=OK Then restart R3SETUP. <errorname>oraview.vrf FILE NOT FOUND</errorname> during &oracle; Installation This error occurs if you did not deselect the &oracle; On-Line Text Viewer before starting the installation. It is marked for installation even though this option is currently not available for Linux. Deselect this product inside the &oracle; installation menu and restart the installation. <errorname>TEXTENV_INVALID</errorname> during <command>R3SETUP</command>, RFC or SAPgui Start If this error is encountered, the correct locale is missing. &sap; Note 0171356 lists the necessary RPMs that need to be installed (e.g. saplocales-1.0-3, saposcheck-1.0-1 for RedHat 6.1). In case you ignored all the related errors and set the corresponding STATUS from ERROR to OK (in CENTRDB.R3S) every time R3SETUP complained and just restarted R3SETUP, the &sap; system will not be properly configured and you will then not be able to connect to the system with a SAPgui, even though the system can be started. Trying to connect with the old Linux SAPgui gave the following messages: Sat May 5 14:23:14 2001 *** ERROR => no valid userarea given [trgmsgo. 0401] Sat May 5 14:23:22 2001 *** ERROR => ERROR NR 24 occured [trgmsgi. 0410] *** ERROR => Error when generating text environment. [trgmsgi. 0435] *** ERROR => function failed [trgmsgi. 0447] *** ERROR => no socket operation allowed [trxio.c 3363] Speicherzugriffsfehler This behavior is due to &sap.r3; being unable to correctly assign a locale and also not being properly configured itself (missing entries in some database tables). To be able to connect to &sap;, add the following entries to the file DEFAULT.PFL (see Note 0043288): abap/set_etct_env_at_new_mode = 0 install/collate/active = 0 rscp/TCP0B = TCP0B Restart the &sap; system. Now you can connect to the system, even though country-specific language settings might not work as expected. After correcting the country settings (and providing the correct locales), these entries can be removed from DEFAULT.PFL and the &sap; system can be restarted. <errorcode>ORA-00001</errorcode> This error only happened with &oracle; 8.1.7 on FreeBSD 4.5.
The reason was that the &oracle; database could not initialize itself properly and crashed, leaving semaphores and shared memory on the system. The next try to start the database then returned ORA-00001. Find the stale semaphores and shared memory segments with ipcs -a and remove them with ipcrm. <errorcode>ORA-00445</errorcode> (Background Process PMON Did Not Start) This error happened with &oracle; 8.1.7. It is reported if the database is started with the usual startsap script (for example startsap_majestix_00) as user prdadm. A possible workaround is to start the database as user oraprd instead with svrmgrl: &prompt.user; svrmgrl SVRMGR> connect internal; SVRMGR> startup; SVRMGR> exit <errorcode>ORA-12546</errorcode> (Start Listener with Correct Permissions) Start the &oracle; listener as user oraids with the following commands: &prompt.root; umask 0; lsnrctl start Otherwise you might get ORA-12546 as the sockets will not have the correct permissions. See &sap; Note 0072984. <errorcode>ORA-27102</errorcode> (Out of Memory) This error happened whilst trying to use values for MAXDSIZ and DFLDSIZ greater than 1 GB (1024x1024x1024). Additionally, we got Linux Error 12: Cannot allocate memory. [DIPGNTAB_IND_IND] during <command>R3SETUP</command> In general, see &sap; Note 0130581 (R3SETUP step DIPGNTAB terminates). During the IDES-specific installation, for some reason the installation process was not using the proper &sap; system name IDS, but the empty string "" instead. This leads to some minor problems with accessing directories, as the paths are generated dynamically using SID (in this case IDS). So instead of accessing: /usr/sap/IDS/SYS/... /usr/sap/IDS/DVMGS00 the following paths were used: /usr/sap//SYS/... /usr/sap/D00 To continue with the installation, we created a link and an additional directory: &prompt.root; pwd /compat/linux/usr/sap &prompt.root; ls -l total 4 drwxr-xr-x 3 idsadm sapsys 512 May 5 11:20 D00 drwxr-x--x 5 idsadm sapsys 512 May 5 11:35 IDS -lrwxr-xr-x 1 root sapsys 7 May 5 11:35 SYS -> IDS/SYS +lrwxr-xr-x 1 root sapsys 7 May 5 11:35 SYS -> IDS/SYS drwxrwxr-x 2 idsadm sapsys 512 May 5 13:00 tmp drwxrwxr-x 11 idsadm sapsys 512 May 4 14:20 trans We also found &sap; Notes (0029227 and 0008401) describing this behavior. We did not encounter any of these problems with the &sap; 4.6C installation. [RFCRSWBOINI_IND_IND] during <command>R3SETUP</command> During installation of &sap; 4.6C, this error was just the result of another error happening earlier during installation. In this case, you have to look through the corresponding logfiles and correct the real problem. If after looking through the logfiles this error is indeed the correct one (check the &sap; Notes), you can set the STATUS of the offending step from ERROR to OK (file CENTRDB.R3S) and restart R3SETUP. After installation, you have to execute the report RSWBOINS from transaction SE38. See &sap; Note 0162266 for additional information about the phases RFCRSWBOINI and RFCRADDBDIF. [RFCRADDBDIF_IND_IND] during <command>R3SETUP</command> Here the same restrictions apply: make sure by looking through the logfiles that this error is not caused by some previous problem. If you can confirm that &sap; Note 0162266 applies, just set the STATUS of the offending step from ERROR to OK (file CENTRDB.R3S) and restart R3SETUP. After installation, you have to execute the report RADDBDIF from transaction SE38. <errorcode>sigaction sig31: File size limit exceeded</errorcode> This error occurred during the start of the &sap; processes disp+work.
If &sap; is started with the startsap script, subprocesses are started which detach and do the dirty work of starting all other &sap; processes. As a result, the script itself will not notice if something goes wrong. To check whether the &sap; processes did start properly, have a look at the process status with ps ax | grep SID, which will give you a list of all &oracle; and &sap; processes. If it looks like some processes are missing or if you cannot connect to the &sap; system, look at the corresponding logfiles which can be found at /usr/sap/SID/DVEBMGSnr/work/. The files to look at are dev_ms and dev_disp. Signal 31 happens here if the amount of shared memory used by &oracle; and &sap; exceeds the amount defined within the kernel configuration file; it can be resolved by using a larger value: # larger value for 46C production systems: options SHMMAXPGS=393216 # smaller value sufficient for 46B: #options SHMMAXPGS=262144 Start of <command>saposcol</command> Failed There are some problems with the program saposcol (version 4.6D). The &sap; system is using saposcol to collect data about the system performance. This program is not needed to use the &sap; system, so this problem can be considered a minor one. The older version (4.6B) does work, but does not collect all the data (many calls will just return 0, for example for CPU usage). Advanced Topics If you are curious as to how the Linux binary compatibility works, this is the section you want to read. Most of what follows is based heavily on an email written to &a.chat; by Terry Lambert tlambert@primenet.com (Message ID: <199906020108.SAA07001@usr09.primenet.com>). How Does It Work? execution class loader FreeBSD has an abstraction called an execution class loader. This is a wedge into the &man.execve.2; system call. What happens is that FreeBSD has a list of loaders, instead of a single loader with a fallback to the #! loader for running any shell interpreters or shell scripts. Historically, the only loader on the &unix; platform examined the magic number (generally the first 4 or 8 bytes of the file) to see if it was a binary known to the system, and if so, invoked the binary loader. If it was not the binary type for the system, the &man.execve.2; call returned a failure, and the shell attempted to start executing it as shell commands. The assumption was a default of whatever the current shell is. Later, a hack was made for &man.sh.1; to examine the first two characters, and if they were :\n, then it invoked the &man.csh.1; shell instead (we believe SCO first made this hack). What FreeBSD does now is go through a list of loaders, with a generic #! loader that knows about interpreters as the characters which follow to the next whitespace next to last, followed by a fallback to /bin/sh. ELF For the Linux ABI support, FreeBSD sees the magic number as an ELF binary (it makes no distinction between FreeBSD, &solaris;, Linux, or any other OS which has an ELF image type, at this point). Solaris The ELF loader looks for a specialized brand, which is a comment section in the ELF image, and which is not present on SVR4/&solaris; ELF binaries. For Linux binaries to function, they must be branded as type Linux with &man.brandelf.1;: &prompt.root; brandelf -t Linux file When this is done, the ELF loader will see the Linux brand on the file. ELF branding When the ELF loader sees the Linux brand, the loader replaces a pointer in the proc structure.
All system calls are indexed through this pointer (in a traditional &unix; system, this would be the sysent[] structure array, containing the system calls). In addition, the process is flagged for special handling of the trap vector for the signal trampoline code, and several other (minor) fix-ups that are handled by the Linux kernel module. The Linux system call vector contains, among other things, a list of sysent[] entries whose addresses reside in the kernel module. When a system call is called by the Linux binary, the trap code dereferences the system call function pointer off the proc structure, and gets the Linux, not the FreeBSD, system call entry points. In addition, the Linux mode dynamically reroots lookups; this is, in effect, what the option to file system mounts (not the unionfs file system type!) does. First, an attempt is made to look up the file in the /compat/linux/original-path directory; only if that fails is the lookup done in the /original-path directory. This makes sure that binaries that require other binaries can run (e.g., the Linux toolchain can all run under Linux ABI support). It also means that the Linux binaries can load and execute FreeBSD binaries, if there are no corresponding Linux binaries present, and that you could place a &man.uname.1; command in the /compat/linux directory tree to ensure that the Linux binaries could not tell they were not running on Linux. In effect, there is a Linux kernel in the FreeBSD kernel; the various underlying functions that implement all of the services provided by the kernel are identical to both the FreeBSD system call table entries, and the Linux system call table entries: file system operations, virtual memory operations, signal delivery, System V IPC, etc… The only difference is that FreeBSD binaries get the FreeBSD glue functions, and Linux binaries get the Linux glue functions (most older OS's only had their own glue functions: addresses of functions in a static global sysent[] structure array, instead of addresses of functions dereferenced off a dynamically initialized pointer in the proc structure of the process making the call). Which one is the native FreeBSD ABI? It does not matter. Basically the only difference is that (currently; this could easily be changed in a future release, and probably will be after this) the FreeBSD glue functions are statically linked into the kernel, and the Linux glue functions can be statically linked, or they can be accessed via a kernel module. Yeah, but is this really emulation? No. It is an ABI implementation, not an emulation. There is no emulator (or simulator, to cut off the next question) involved. So why is it sometimes called Linux emulation? To make it hard to sell FreeBSD! Really, it is because the historical implementation was done at a time when there was really no word other than that to describe what was going on; saying that FreeBSD ran Linux binaries was not true, if you did not compile the code in or load a module, and there needed to be a word to describe what was being loaded—hence the Linux emulator.
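As a quick illustration of that last point, the Linux glue functions can be made available at run time rather than compiled into the kernel by loading the module shipped with the base system (a minimal sketch; output not shown):

&prompt.root; kldload linux
&prompt.root; kldstat

&man.kldstat.8; lists the currently loaded modules; linux.ko should appear among them once the module is loaded.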
diff --git a/en_US.ISO8859-1/books/handbook/mac/chapter.sgml b/en_US.ISO8859-1/books/handbook/mac/chapter.sgml index f1ffb8d56f..c6b5f06e62 100644 --- a/en_US.ISO8859-1/books/handbook/mac/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/mac/chapter.sgml @@ -1,2288 +1,2288 @@ Tom Rhodes Written by Mandatory Access Control Synopsis MAC Mandatory Access Control MAC &os; 5.X introduced new security extensions from the TrustedBSD project based on the &posix;.1e draft. Two of the most significant new security mechanisms are file system Access Control Lists (ACLs) and Mandatory Access Control (MAC) facilities. Mandatory Access Control allows new access control modules to be loaded, implementing new security policies. Some provide protections of a narrow subset of the system, hardening a particular service, while others provide comprehensive labeled security across all subjects and objects. The mandatory part of the definition comes from the fact that the enforcement of the controls is done by administrators and the system, and is not left up to the discretion of users as is done with discretionary access control (DAC, the standard file and System V IPC permissions on &os;). This chapter will focus on the Mandatory Access Control Framework (MAC Framework), and a set of pluggable security policy modules enabling various security mechanisms. After reading this chapter, you will know: What MAC security policy modules are currently included in &os; and their associated mechanisms. What MAC security policy modules implement as well as the difference between a labeled and non-labeled policy. How to efficiently configure a system to use the MAC framework. How to configure the different security policy modules included with the MAC framework. How to implement a more secure environment using the MAC framework and the examples shown. How to test the MAC configuration to ensure the framework has been properly implemented. Before reading this chapter, you should: Understand &unix; and &os; basics (). Be familiar with the basics of kernel configuration/compilation (). Have some familiarity with security and how it pertains to &os; (). The improper use of the information in this chapter may cause loss of system access, aggravation of users, or inability to access the features provided by X11. More importantly, MAC should not be relied upon to completely secure a system. The MAC framework only augments existing security policy; without sound security practices and regular security checks, the system will never be completely secure. It should also be noted that the examples contained within this chapter are just that, examples. It is not recommended that these particular settings be rolled out on a production system. Implementing the various security policy modules takes a good deal of thought. One who does not fully understand exactly how everything works may find him or herself going back through the entire system and reconfiguring many files or directories. What Will Not Be Covered This chapter covers a broad range of security issues relating to the MAC framework; however, the development of new MAC security policy modules will not be covered. A number of security policy modules included with the MAC framework have specific characteristics which are provided for both testing and new module development. These include the &man.mac.test.4;, &man.mac.stub.4; and &man.mac.none.4;. For more information on these security policy modules and the various mechanisms they provide, please review the manual pages. 
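For example, on a non-production system one of these development-oriented modules could be loaded at run time for experimentation (a minimal sketch; &man.mac.none.4; enforces no policy, so loading it should have no visible effect):

&prompt.root; kldload mac_none
&prompt.root; kldstat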
Key Terms in this Chapter Before reading this chapter, a few key terms must be explained. This will hopefully clear up any confusion that may occur and avoid the abrupt introduction of new terms and information. compartment: A compartment is a set of programs and data to be partitioned or separated, where users are given explicit access to specific components of a system. Also, a compartment represents a grouping, such as a work group, department, project, or topic. Using compartments, it is possible to implement a need-to-know security policy. integrity: Integrity, as a key concept, is the level of trust which can be placed on data. As the integrity of the data is elevated, so is the ability to trust that data. label: A label is a security attribute which can be applied to files, directories, or other items in the system. It could be considered a confidentiality stamp; when a label is placed on a file it describes the security properties for that specific file and will only permit access by files, users, resources, etc. with a similar security setting. The meaning and interpretation of label values depends on the policy configuration: while some policies might treat a label as representing the integrity or secrecy of an object, other policies might use labels to hold rules for access. level: The increased or decreased setting of a security attribute. As the level increases, its security is considered to elevate as well. multilabel: The property is a file system option which can be set in single user mode using the &man.tunefs.8; utility, during the boot operation using the &man.fstab.5; file, or during the creation of a new file system. This option will permit an administrator to apply different MAC labels on different objects. This option only applies to security policy modules which support labeling. object: An object or system object is an entity through which information flows under the direction of a subject. This includes directories, files, fields, screens, keyboards, memory, magnetic storage, printers or any other data storage/moving device. Basically, an object is a data container or a system resource; access to an object effectively means access to the data. policy: A collection of rules which defines how objectives are to be achieved. A policy usually documents how certain items are to be handled. This chapter will consider the term policy in this context as a security policy; i.e. a collection of rules which will control the flow of data and information and define who will have access to that data and information. sensitivity: Usually used when discussing MLS. A sensitivity level is a term used to describe how important or secret the data should be. As the sensitivity level increases, so does the importance of the secrecy, or confidentiality, of the data. single label: A single label is when the entire file system uses one label to enforce access control over the flow of data. When a file system has this set, which is any time when the option is not set, all files will conform to the same label setting. subject: A subject is any active entity that causes information to flow between objects; e.g. a user, user process, system process, etc. On &os;, this is almost always a thread acting in a process on behalf of a user. Explanation of MAC With all of these new terms in mind, consider how the MAC framework augments the security of the system as a whole.
The various security policy modules provided by the MAC framework could be used to protect the network and file systems, block users from accessing certain ports and sockets, and more. Perhaps the best use of the policy modules is to blend them together, by loading several security policy modules at a time for a multi-layered security environment. In a multi-layered security environment, multiple policy modules are in effect to keep security in check. This is different from a hardening policy, which typically hardens elements of a system that are used only for specific purposes. The only downside is administrative overhead in cases of multiple file system labels, setting network access control user by user, etc. These downsides are minimal when compared to the lasting effect of the framework; for instance, the ability to pick and choose which policies are required for a specific configuration keeps performance overhead down. The reduction of support for unneeded policies can increase the overall performance of the system as well as offer flexibility of choice. A good implementation would consider the overall security requirements and effectively implement the various security policy modules offered by the framework. Thus a system utilizing MAC features should at least guarantee that a user will not be permitted to change security attributes at will; that all user utilities, programs and scripts must work within the constraints of the access rules provided by the selected security policy modules; and that total control of the MAC access rules is in the hands of the system administrator. It is the sole duty of the system administrator to carefully select the correct security policy modules. Some environments may need to limit access control over the network; in these cases, the &man.mac.portacl.4;, &man.mac.ifoff.4; and even &man.mac.biba.4; policy modules might make good starting points. In other cases, strict confidentiality of file system objects might be required. Policy modules such as &man.mac.bsdextended.4; and &man.mac.mls.4; exist for this purpose. Policy decisions could be made based on network configuration. Perhaps only certain users should be permitted access to facilities provided by &man.ssh.1; to access the network or the Internet. The &man.mac.portacl.4; would be the policy module of choice for these situations. But what should be done in the case of file systems? Should all access to certain directories be severed from other groups or specific users? Or should we limit user or utility access to specific files by setting certain objects as classified? In the file system case, access to objects might be considered confidential to some users, but not to others. For example, a large development team might be broken up into smaller groups of individuals. Developers in project A might not be permitted to access objects written by developers in project B. Yet they might need to access objects created by developers in project C; that is quite a situation indeed. Using the different security policy modules provided by the MAC framework, users could be divided into these groups and then given access to the appropriate areas without fear of information leakage. Thus, each security policy module has a unique way of dealing with the overall security of a system. Module selection should be based on a well thought out security policy. In many cases, the overall policy may need to be revised and reimplemented on the system.
Understanding the different security policy modules offered by the MAC framework will help administrators choose the best policies for their situations. The default &os; kernel does not include the option for the MAC framework; thus the following kernel option must be added before trying any of the examples or information in this chapter: options MAC The kernel will then require a rebuild and a reinstall. While the various manual pages for MAC policy modules state that they may be built into the kernel, it is possible to lock the system out of the network and more. Implementing MAC is much like implementing a firewall; care must be taken to prevent being completely locked out of the system. The ability to revert back to a previous configuration should be considered, and implementing MAC remotely should be done with extreme caution. Understanding MAC Labels A MAC label is a security attribute which may be applied to subjects and objects throughout the system. When setting a label, the user must be able to comprehend what it is, exactly, that is being done. The attributes available on an object depend on the policy module loaded, and different policy modules interpret their attributes in different ways. If improperly configured due to lack of comprehension, or the inability to understand the implications, the result will be unexpected and perhaps undesired behavior of the system. The security label on an object is used as a part of a security access control decision by a policy. With some policies, the label by itself contains all information necessary to make a decision; in other models, the labels may be processed as part of a larger rule set, etc. For instance, setting the label of biba/low on a file will represent a label maintained by the Biba security policy module, with a value of low. A few policy modules which support the labeling feature in &os; offer three specific predefined labels. These are the low, high, and equal labels. Although they enforce access control in a different manner with each policy module, you can be sure that the low label will be the lowest setting, the equal label will set the subject or object to be disabled or unaffected, and the high label will enforce the highest setting available in the Biba and MLS policy modules. Within single label file system environments, only one label may be used on objects. This will enforce one set of access permissions across the entire system and in many environments may be all that is required. There are a few cases where multiple labels may be set on objects or subjects in the file system. For those cases, the option may be passed to &man.tunefs.8;. In the case of Biba and MLS, a numeric label may be set to indicate the precise level of hierarchical control. This numeric level is used to partition or sort information into different groups of, say, classification, only permitting access to that group or a higher group level. In most cases the administrator will only be setting up a single label to use throughout the file system. Hey wait, this is similar to DAC! I thought MAC gave control strictly to the administrator. That statement still holds true, to some extent, as root is the one in control and who configures the policies so that users are placed in the appropriate categories/access levels. Alas, many policy modules can restrict the root user as well. Basic control over objects will then be released to the group, but root may revoke or modify the settings at any time.
This is the hierarchical/clearance model covered by policies such as Biba and MLS. Label Configuration Virtually all aspects of label policy module configuration will be performed using the base system utilities. These commands provide a simple interface for object or subject configuration or the manipulation and verification of the configuration. All configuration may be done by use of the &man.setfmac.8; and &man.setpmac.8; utilities. The setfmac command is used to set MAC labels on system objects while the setpmac command is used to set the labels on system subjects. Observe: &prompt.root; setfmac biba/high test If no errors occurred with the command above, a prompt will be returned. The only time these commands are not quiescent is when an error occurred, similar to the &man.chmod.1; and &man.chown.8; commands. In some cases this error may be a Permission denied and is usually obtained when the label is being set or modified on an object which is restricted. Other conditions may produce different failures. For instance, the file may not be owned by the user attempting to relabel the object, the object may not exist or may be read only. A mandatory policy will not allow the process to relabel the file, maybe because of a property of the file, a property of the process, or a property of the proposed new label value. For example: a user running at low integrity tries to change the label of a high integrity file. Or perhaps a user running at low integrity tries to change the label of a low integrity file to a high integrity label. The system administrator may use the following commands to overcome this: &prompt.root; setfmac biba/high test Permission denied &prompt.root; setpmac biba/low setfmac biba/high test &prompt.root; getfmac test test: biba/high As we see above, setpmac can be used to override the policy module's settings by assigning a different label to the invoked process. The getpmac utility is usually used with currently running processes, such as sendmail: although it takes a process ID in place of a command, the logic is extremely similar. If users attempt to manipulate a file not in their access, subject to the rules of the loaded policy modules, the Operation not permitted error will be displayed by the mac_set_link function. Common Label Types For the &man.mac.biba.4;, &man.mac.mls.4; and &man.mac.lomac.4; policy modules, the ability to assign simple labels is provided. These take the form of high, equal and low; what follows is a brief description of what these labels provide: The low label is considered the lowest label setting an object or subject may have. Setting this on objects or subjects will block their access to objects or subjects marked high. The equal label should only be placed on objects considered to be exempt from the policy. The high label grants an object or subject the highest possible setting. With respect to each policy module, each of those settings will instate a different information flow directive. Reading the proper manual pages will further explain the traits of these generic label configurations. Advanced Label Configuration Numeric labels take the form grade:compartment+compartment; thus the following: biba/10:2+3+6(5:2+3-20:2+3+4+5+6) may be interpreted as: Biba Policy Label/Grade 10/Compartments 2, 3 and 6: (grade 5 ...) In this example, the first grade would be considered the effective grade with effective compartments, the second grade is the low grade and the last one is the high grade.
In most configurations these settings will not be used; indeed, they are offered for more advanced configurations. When applied to system objects, they will only have a current grade/compartment, as opposed to system subjects, which reflect the range of available rights in the system, and network interfaces, where they are used for access control. The grade and compartments in a subject and object pair are used to construct a relationship referred to as dominance, in which a subject dominates an object, the object dominates the subject, neither dominates the other, or both dominate each other. The both dominate case occurs when the two labels are equal. Due to the information flow nature of Biba, you have rights to a set of compartments (need to know) that might correspond to projects, but objects also have a set of compartments. Users may have to subset their rights using su or setpmac in order to access objects in a compartment from which they are not restricted. Users and Label Settings Users themselves are required to have labels so that their files and processes may properly interact with the security policy defined on the system. This is configured through the login.conf file by use of login classes. Every policy module that uses labels will implement the user class setting. An example entry containing every policy module setting is displayed below: default:\ :copyright=/etc/COPYRIGHT:\ :welcome=/etc/motd:\ :setenv=MAIL=/var/mail/$,BLOCKSIZE=K:\ :path=~/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:\ :manpath=/usr/share/man /usr/local/man:\ :nologin=/usr/sbin/nologin:\ :cputime=1h30m:\ :datasize=8M:\ :vmemoryuse=100M:\ :stacksize=2M:\ :memorylocked=4M:\ :memoryuse=8M:\ :filesize=8M:\ :coredumpsize=8M:\ :openfiles=24:\ :maxproc=32:\ :priority=0:\ :requirehome:\ :passwordtime=91d:\ :umask=022:\ :ignoretime@:\ :label=partition/13,mls/5,biba/10(5-15),lomac/10[2]: The label option is used to set the user class default label which will be enforced by MAC. Users will never be permitted to modify this value, thus it can be considered mandatory in the user case. In a real configuration, however, the administrator will never wish to enable every policy module. It is recommended that the rest of this chapter be reviewed before any of this configuration is implemented. Users may change their label after the initial login; however, this change is subject to the constraints of the policy. The example above tells the Biba policy that a process's minimum integrity is 5, its maximum is 15, but the default effective label is 10. The process will run at 10 until it chooses to change label, perhaps due to the user using the setpmac command, which will be constrained by Biba to the range set at login. In all cases, after a change to login.conf, the login class capability database must be rebuilt using cap_mkdb; this requirement holds throughout every forthcoming example or discussion. It is useful to note that many sites may have a particularly large number of users requiring several different user classes. In depth planning is required as this may get extremely difficult to manage. Future versions of &os; will include a new way to deal with mapping users to labels; however, this will not be available until some time after &os; 5.3. Network Interfaces and Label Settings Labels may also be set on network interfaces to help control the flow of data across the network. In all cases they function in the same way the policies function with respect to objects.
Users at high settings in biba, for example, will not be permitted to access network interfaces with a label of low. The maclabel keyword may be passed to ifconfig when setting the MAC label on network interfaces. For example: &prompt.root; ifconfig bge0 maclabel biba/equal will set the MAC label of biba/equal on the &man.bge.4; interface. When using a setting similar to biba/high(low-high) the entire label should be quoted; otherwise an error will be returned. Each policy module which supports labeling has a tunable which may be used to disable the MAC label on network interfaces. Setting the label to equal will have a similar effect. Review the output from sysctl, the policy manual pages, or even the information found later in this chapter for those tunables. Singlelabel or Multilabel? By default the system will use the singlelabel option. But what does this mean to the administrator? There are several differences which, in their own right, offer pros and cons to the flexibility in the system's security model. The singlelabel option only permits one label, for instance biba/high, to be used for each subject or object. It provides for lower administration overhead but decreases the flexibility of policies which support labeling. Many administrators may want to use the multilabel option in their security policy. The multilabel option will permit each subject or object to have its own independent MAC label in place of the standard singlelabel option which will allow only one label throughout the partition. The multilabel and singlelabel options are only required for the policies which implement the labeling feature, including the Biba, Lomac, MLS and SEBSD policies. In many cases, multilabel may not need to be set at all. Consider the following situation and security model: &os; web-server using the MAC framework and a mix of the various policies. This machine only requires one label, biba/high, for everything in the system. Here the file system would not require the multilabel option as a single label will always be in effect. But, this machine will be a web server and should have the web server run at biba/low to prevent write up capabilities. The Biba policy and how it works will be discussed later, so if the previous comment was difficult to interpret just continue reading and return. The server could use a separate partition set at biba/low for most if not all of its runtime state. Much is lacking from this example, for instance the restrictions on data, configuration and user settings; however, this is just a quick example to prove the aforementioned point. If any of the non-labeling policies are to be used, then the multilabel option would never be required. These include the seeotheruids, portacl and partition policies. It should also be noted that using multilabel with a partition and establishing a security model based on functionality could open the doors for higher administrative overhead as everything in the file system would have a label. This includes directories, files, and even device nodes. The following command will set the multilabel flag on a file system which is to have multiple labels. This may only be done in single user mode: &prompt.root; tunefs -l enable / This is not a requirement for the swap file system. Some users have experienced problems with setting the multilabel flag on the root partition. If this is the case, please review the Troubleshooting section of this chapter. Controlling MAC with Tunables Without any modules loaded, there are still some parts of MAC which may be configured using the sysctl interface.
These tunables are described below and in all cases the number one (1) means enabled while the number zero (0) means disabled: security.mac.enforce_fs defaults to one (1) and enforces MAC file system policies on the file systems. security.mac.enforce_kld defaults to one (1) and enforces MAC kernel linking policies on the dynamic kernel linker (see &man.kld.4;). security.mac.enforce_network defaults to one (1) and enforces MAC network policies. security.mac.enforce_pipe defaults to one (1) and enforces MAC policies on pipes. security.mac.enforce_process defaults to one (1) and enforces MAC policies on processes which utilize inter-process communication. security.mac.enforce_socket defaults to one (1) and enforces MAC policies on sockets (see the &man.socket.2; manual page). security.mac.enforce_system defaults to one (1) and enforces MAC policies on system activities such as accounting and rebooting. security.mac.enforce_vm defaults to one (1) and enforces MAC policies on the virtual memory system. Every policy or MAC option supports tunables. These usually hang off of the security.mac.<policyname> tree. To view all of the tunables from MAC use the following command: &prompt.root; sysctl -da | grep mac This should be interpreted as: all of the basic MAC policies are enforced by default. If the modules were built into the kernel the system would be extremely locked down and most likely unable to communicate with the local network or connect to the Internet, etc. This is why building the modules into the kernel is not completely recommended: keeping the policies as loadable modules permits the administrator to instantly switch the policies of a system without the requirement of rebuilding and reinstalling a new kernel, in addition to the ability to disable features on the fly with sysctl. Module Configuration Every module included with the MAC framework may be either compiled into the kernel as noted above or loaded as a run-time kernel module. The recommended method is to add the module name to the /boot/loader.conf file so that it will load during the initial boot operation. The following sections will discuss the various MAC modules and cover their features. Implementing them into a specific environment will also be a consideration of this chapter. Some modules support the use of labeling, which controls access by enforcing rules such as this is allowed and this is not. A label configuration file may control how files may be accessed, how network communication can be exchanged, and more. The previous section showed how the multilabel flag could be set on file systems to enable per-file or per-partition access control. A single label configuration would enforce only one label across the system; that is why the tunefs option is called multilabel. The MAC seeotheruids Module MAC See Other UIDs Policy Module name: mac_seeotheruids.ko Kernel configuration line: options MAC_SEEOTHERUIDS Boot option: mac_seeotheruids_load="YES" The &man.mac.seeotheruids.4; module mimics and extends the security.bsd.see_other_uids and security.bsd.see_other_gids sysctl tunables. This option does not require any labels to be set before configuration and can operate transparently with the other modules. After loading the module, the following sysctl tunables may be used to control the features: security.mac.seeotheruids.enabled will enable the module's features and use the default settings. These default settings will deny users the ability to view processes and sockets owned by other users.
security.mac.seeotheruids.specificgid_enabled will allow a certain group to be exempt from this policy. To exempt specific groups from this policy, use the security.mac.seeotheruids.specificgid=XXX sysctl tunable. In the above example, the XXX should be replaced with the numeric group ID to be exempted. security.mac.seeotheruids.primarygroup_enabled is used to exempt specific primary groups from this policy. When using this tunable, the security.mac.seeotheruids.specificgid_enabled may not be set. The MAC bsdextended Module MAC File System Firewall Policy Module name: mac_bsdextended.ko Kernel configuration line: options MAC_BSDEXTENDED Boot option: mac_bsdextended_load="YES" The &man.mac.bsdextended.4; module enforces the file system firewall. This module's policy provides an extension to the standard file system permissions model, permitting an administrator to create a firewall-like ruleset to protect files, utilities, and directories in the file system hierarchy. The policy may be created using a utility, &man.ugidfw.8;, that has a syntax similar to that of &man.ipfw.8;. More tools can be written by using the functions in the &man.libugidfw.3; library. Extreme caution should be taken when working with this module; incorrect use could block access to certain parts of the file system. Examples After the &man.mac.bsdextended.4; module has been loaded, the following command may be used to list the current rule configuration: &prompt.root; ugidfw list 0 slots, 0 rules As expected, there are no rules defined. This means that everything is still completely accessible. To create a rule which will block all access by users but leave root unaffected, simply run the following command: &prompt.root; ugidfw add subject not uid root new object not uid root mode n In releases prior to &os; 5.3, the add parameter did not exist. In those cases the set parameter should be used instead. See below for a command example. This is a very bad idea as it will block all users from issuing even the most simple commands, such as ls. A more pragmatic list of rules might be: &prompt.root; ugidfw set 2 subject uid user1 object uid user2 mode n &prompt.root; ugidfw set 3 subject uid user1 object gid user2 mode n This will block any and all access, including directory listings, to user2's home directory from the username user1. In place of user1, the not uid user2 option could be passed. This will enforce the same access restrictions above for all users in place of just one user. The root user will be unaffected by these changes. This should give a general idea of how the &man.mac.bsdextended.4; module may be used to help fortify a file system. For more information, see the &man.mac.bsdextended.4; and the &man.ugidfw.8; manual pages. The MAC ifoff Module MAC Interface Silencing Policy Module name: mac_ifoff.ko Kernel configuration line: options MAC_IFOFF Boot option: mac_ifoff_load="YES" The &man.mac.ifoff.4; module exists solely to disable network interfaces on the fly and keep network interfaces from being brought up during the initial system boot. It does not require any labels to be set up on the system, nor does it have a dependency on other MAC modules. Most of the control is done through the sysctl tunables listed below. security.mac.ifoff.lo_enabled will enable/disable all traffic on the loopback (&man.lo.4;) interface. security.mac.ifoff.bpfrecv_enabled will enable/disable all traffic on the Berkeley Packet Filter interface (&man.bpf.4;). security.mac.ifoff.other_enabled will enable/disable traffic on all other interfaces.
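For example, assuming the module has been loaded, the following pair of commands would leave only loopback traffic enabled (a sketch; the values shown follow the enable/disable semantics described above): &prompt.root; sysctl security.mac.ifoff.lo_enabled=1 &prompt.root; sysctl security.mac.ifoff.other_enabled=0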
One of the most common uses of &man.mac.ifoff.4; is network monitoring in an environment where network traffic should not be permitted during the boot sequence. Another suggested use would be to write a script which uses security/aide to automatically block network traffic if it finds new or altered files in protected directories. The MAC portacl Module MAC Port Access Control List Policy Module name: mac_portacl.ko Kernel configuration line: options MAC_PORTACL Boot option: mac_portacl_load="YES" The &man.mac.portacl.4; module is used to limit binding to local TCP and UDP ports using a variety of sysctl variables. In essence &man.mac.portacl.4; makes it possible to allow non-root users to bind to specified privileged ports, i.e. ports numbered lower than 1024. Once loaded, this module will enable the MAC policy on all sockets. The following tunables are available: security.mac.portacl.enabled will enable/disable the policy completely. Due to a bug the security.mac.portacl.enabled sysctl variable will not work on &os; 5.2.1 or previous releases. security.mac.portacl.port_high will set the highest port number that &man.mac.portacl.4; will enable protection for. security.mac.portacl.suser_exempt will, when set to a non-zero value, exempt the root user from this policy. security.mac.portacl.rules will specify the actual mac_portacl policy; see below. The actual mac_portacl policy, as specified in the security.mac.portacl.rules sysctl, is a text string of the form: rule[,rule,...] with as many rules as needed. Each rule is of the form: idtype:id:protocol:port. The idtype parameter can be uid or gid and is used to interpret the id parameter as either a user id or group id, respectively. The protocol parameter is used to determine if the rule should apply to TCP or UDP by setting the parameter to tcp or udp. The final port parameter is the port number to allow the specified user or group to bind to. Since the ruleset is interpreted directly by the kernel, only numeric values can be used for the user ID, group ID, and port parameters. That is, user, group, and port service names cannot be used. By default, on &unix;-like systems, ports numbered lower than 1024 can only be used by, and bound to, privileged processes, i.e. those run as root. For &man.mac.portacl.4; to allow non-privileged processes to bind to ports below 1024 this standard &unix; restriction has to be disabled. This can be accomplished by setting the &man.sysctl.8; variables net.inet.ip.portrange.reservedlow and net.inet.ip.portrange.reservedhigh to zero. See the examples below or review the &man.mac.portacl.4; manual page for further information. Examples The following examples should illuminate the above discussion a little better: &prompt.root; sysctl security.mac.portacl.port_high=1023 &prompt.root; sysctl net.inet.ip.portrange.reservedlow=0 net.inet.ip.portrange.reservedhigh=0 First we set &man.mac.portacl.4; to cover the standard privileged ports and disable the normal &unix; bind restrictions. &prompt.root; sysctl security.mac.portacl.suser_exempt=1 The root user should not be crippled by this policy, thus set the security.mac.portacl.suser_exempt to a non-zero value. The &man.mac.portacl.4; module has now been set up to behave the same way &unix;-like systems behave by default. &prompt.root; sysctl security.mac.portacl.rules=uid:80:tcp:80 Allow the user with UID 80 (normally the www user) to bind to port 80. This can be used to allow the www user to run a web server without ever having root privilege.
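The active ruleset may be read back at any time to confirm the change: &prompt.root; sysctl security.mac.portacl.rules security.mac.portacl.rules: uid:80:tcp:80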
&prompt.root; sysctl security.mac.portacl.rules=uid:1001:tcp:110,uid:1001:tcp:995 Permit the user with the UID of 1001 to bind to the TCP ports 110 (pop3) and 995 (pop3s). This will permit this user to start a server that accepts connections on ports 110 and 995. MAC Policies with Labeling Features The next few sections will discuss MAC policies which use labels. From here on this chapter will focus on the features of &man.mac.biba.4;, &man.mac.lomac.4;, &man.mac.partition.4;, and &man.mac.mls.4;. This is an example configuration only and should not be considered for a production implementation. The goal is to document and show the syntax as well as examples for implementation and testing. For these policies to work correctly several preparations must be made. Preparation for Labeling Policies The following changes are required in the login.conf file: An insecure class, or another class of similar type, must be added. The login class of insecure is not required and just used as an example here; different configurations may use another class name. The insecure class should have the following settings and definitions. Several of these can be altered but the line which defines the default label is a requirement and must remain. insecure:\ :copyright=/etc/COPYRIGHT:\ :welcome=/etc/motd:\ :setenv=MAIL=/var/mail/$,BLOCKSIZE=K:\ :path=~/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:\ :manpath=/usr/share/man /usr/local/man:\ :nologin=/usr/sbin/nologin:\ :cputime=1h30m:\ :datasize=8M:\ :vmemoryuse=100M:\ :stacksize=2M:\ :memorylocked=4M:\ :memoryuse=8M:\ :filesize=8M:\ :coredumpsize=8M:\ :openfiles=24:\ :maxproc=32:\ :priority=0:\ :requirehome:\ :passwordtime=91d:\ :umask=022:\ :ignoretime@:\ :label=partition/13,mls/5,biba/low: The &man.cap.mkdb.1; command needs to be run on &man.login.conf.5; before any of the users can be switched over to the new class. The root username should also be placed into a login class; otherwise, almost every command executed by root will require the use of setpmac. Rebuilding the login.conf database may cause some errors later with the daemon class. Simply uncommenting the daemon account and rebuilding the database should alleviate these issues. Ensure that all partitions on which MAC labeling will be implemented support the multilabel option. We must do this because many of the examples here contain different labels for testing purposes. Review the output from the mount command as a precautionary measure. Switch any users who will have the higher security mechanisms enforced over to the new user class. A quick run of &man.pw.8; or &man.vipw.8; should do the trick. The MAC partition Module MAC Process Partition Policy Module name: mac_partition.ko Kernel configuration line: options MAC_PARTITION Boot option: mac_partition_load="YES" The &man.mac.partition.4; policy will drop processes into specific partitions based on their MAC label. Think of it as a special type of &man.jail.8;, though that is hardly a worthy comparison. This is one module that should be added to the &man.loader.conf.5; file so that it loads and enables the policy during the boot process. Most configuration for this policy is done using the &man.setpmac.8; utility which will be explained below. The following sysctl tunable is available for this policy: security.mac.partition.enabled will enable the enforcement of MAC process partitions. When this policy is enabled, users will only be permitted to see their processes but will not be permitted to work with certain utilities.
For instance, a user in the insecure class above will not be permitted to access the top command, as well as many other commands that must spawn a process. To set or drop utilities into a partition label, use the setpmac utility: &prompt.root; setpmac partition/13 top This will add the top command to the label set on users in the insecure class. Note that all processes spawned by users in the insecure class will stay in the partition/13 label. Examples The following command will show you the partition label and the process list: &prompt.root; ps Zax This next command will allow the viewing of another user's process partition label and that user's currently running processes: &prompt.root; ps -ZU trhodes Users can see processes in root's label unless the &man.mac.seeotheruids.4; policy is loaded. A really crafty implementation could have all of the services disabled in /etc/rc.conf and started by a script that starts them with the proper labeling set. The following policies support integer settings in place of the three default labels offered. These options, including their limitations, are further explained in the module manual pages. The MAC Multi-Level Security Module MAC Multi-Level Security Policy Module name: mac_mls.ko Kernel configuration line: options MAC_MLS Boot option: mac_mls_load="YES" The &man.mac.mls.4; policy controls access between subjects and objects in the system by enforcing a strict information flow policy. In MLS environments, a clearance level is set in each subject's or object's label, along with compartments. Since these clearance or sensitivity levels can reach numbers greater than six thousand, it would be a daunting task for any system administrator to thoroughly configure each subject or object. Thankfully, three predefined labels are already included in this policy. These labels are mls/low, mls/equal and mls/high. Since these labels are described in depth in the manual page, they will only get a brief description here: The mls/low label contains a low configuration which permits it to be dominated by all other objects. Anything labeled with mls/low will have a low clearance level and not be permitted to access information of a higher level. In addition, this label will prevent objects of a higher clearance level from writing or passing information on to them. The mls/equal label should be placed on objects considered to be exempt from the policy. The mls/high label is the highest level of clearance possible. Objects assigned this label will hold dominance over all other objects in the system; however, they will not permit the leaking of information to objects of a lower class. MLS provides for: A hierarchical security level with a set of non-hierarchical categories; Fixed rules: no read up, no write down (a subject can have read access to objects on its own level or below, but not above. Similarly, a subject can have write access to objects on its own level or above but not beneath.); Secrecy (preventing inappropriate disclosure of data); Basis for the design of systems that concurrently handle data at multiple sensitivity levels (without leaking information between secret and confidential). The following sysctl tunables are available for the configuration of special services and interfaces: security.mac.mls.enabled is used to enable/disable the MLS policy. security.mac.mls.ptys_equal will label all &man.pty.4; devices as mls/equal during creation.
security.mac.mls.revocation_enabled is used to revoke access to objects after their label changes to a label of a lower grade. security.mac.mls.max_compartments is used to set the maximum number of compartment levels on objects; basically, the maximum compartment number allowed on a system. To manipulate the MLS labels, the &man.setfmac.8; command has been provided. To assign a label to an object, issue the following command: &prompt.root; setfmac mls/5 test To get the MLS label for the file test issue the following command: &prompt.root; getfmac test This is a summary of the MLS policy's features. Another approach is to create a master policy file in /etc which specifies the MLS policy information and to feed that file into the setfmac command. This method will be explained after all policies are covered. Observations: an object with lower clearance is unable to observe higher clearance processes. A basic policy would be to enforce mls/high on everything not to be read, even if it needs to be written. Enforce mls/low on everything not to be written, even if it needs to be read. And finally enforce mls/equal on the rest. All users marked insecure should be set at mls/low. The MAC Biba Module MAC Biba Integrity Policy Module name: mac_biba.ko Kernel configuration line: options MAC_BIBA Boot option: mac_biba_load="YES" The &man.mac.biba.4; module loads the MAC Biba policy. This policy works much like that of the MLS policy with the exception that the rules for information flow are slightly reversed. This is said to prevent the downward flow of sensitive information whereas the MLS policy prevents the upward flow of sensitive information; thus, much of this section can apply to both policies. In Biba environments, an integrity label is set on each subject or object. These labels are made up of hierarchical grades and non-hierarchical components. As an object's or subject's grade ascends, so does its integrity. Supported labels are biba/low, biba/equal, and biba/high, as explained below: The biba/low label is considered the lowest integrity an object or subject may have. Setting this on objects or subjects will block their write access to objects or subjects marked high. They still have read access though. The biba/equal label should only be placed on objects considered to be exempt from the policy. The biba/high label will permit writing to objects set at a lower label, but not permit reading that object. It is recommended that this label be placed on objects that affect the integrity of the entire system. Biba provides for: Hierarchical integrity level with a set of non-hierarchical integrity categories; Fixed rules: no write up, no read down (opposite of MLS). A subject can have write access to objects on its own level or below, but not above. Similarly, a subject can have read access to objects on its own level or above, but not below; Integrity (preventing inappropriate modification of data); Integrity levels (instead of MLS sensitivity levels). The following sysctl tunables can be used to manipulate the Biba policy. security.mac.biba.enabled may be used to enable/disable enforcement of the Biba policy on the target machine. security.mac.biba.ptys_equal may be used to disable the Biba policy on &man.pty.4; devices. security.mac.biba.revocation_enabled will force the revocation of access to objects if the label is changed to dominate the subject.
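For example, a system where Biba should not interfere with pseudo-terminal allocation might set the second tunable by hand (an illustration only; the defaults are normally left alone): &prompt.root; sysctl security.mac.biba.ptys_equal=1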
To access the Biba policy setting on system objects, use the setfmac and getfmac commands: &prompt.root; setfmac biba/low test &prompt.root; getfmac test test: biba/low Observations: a lower integrity subject is unable to write to a higher integrity object; a higher integrity subject cannot observe or read a lower integrity object. The MAC LOMAC Module MAC LOMAC Module name: mac_lomac.ko Kernel configuration line: options MAC_LOMAC Boot option: mac_lomac_load="YES" Unlike the MAC Biba policy, the &man.mac.lomac.4; policy permits access to lower integrity objects only after decreasing the subject's integrity level, so as not to disrupt any integrity rules. The MAC version of the Low-watermark integrity policy, not to be confused with the older &man.lomac.4; implementation, works almost identically to Biba, with the exception of using floating labels to support subject demotion via an auxiliary grade compartment. This secondary compartment takes the form of [auxgrade]. When assigning a lomac policy with an auxiliary grade, it should look a little bit like: lomac/10[2] where the number two (2) is the auxiliary grade. The MAC LOMAC policy relies on the ubiquitous labeling of all system objects with integrity labels, permitting subjects to read from low integrity objects and then downgrading the label on the subject to prevent future writes to high integrity objects. This is the [auxgrade] option discussed above, thus the policy may provide for greater compatibility and require less initial configuration than Biba. Examples Like the Biba and MLS policies, the setfmac and setpmac utilities may be used to place labels on system objects: &prompt.root; setfmac /usr/home/trhodes lomac/high[low] &prompt.root; getfmac /usr/home/trhodes lomac/high[low] Notice the auxiliary grade here is low; this is a feature provided only by the MAC LOMAC policy. Implementing a Secure Environment with MAC MAC Example Implementation The following demonstration will implement a secure environment using various MAC modules with properly configured policies. This is only a test and should not be considered the complete answer to everyone's security woes. Just implementing a policy and ignoring it never works and could be disastrous in a production environment. Before beginning this process, the multilabel option must be set on each file system as stated at the beginning of this chapter. Not doing so will result in errors.
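As a reminder, the flag is enabled per file system from single user mode; the command used earlier for the root file system was: &prompt.root; tunefs -l enable /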
Create an insecure User Class Begin the procedure by adding the following user class to the /etc/login.conf file: insecure:\ :copyright=/etc/COPYRIGHT:\ :welcome=/etc/motd:\ :setenv=MAIL=/var/mail/$,BLOCKSIZE=K:\ :path=~/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:\ :manpath=/usr/share/man /usr/local/man:\ :nologin=/usr/sbin/nologin:\ :cputime=1h30m:\ :datasize=8M:\ :vmemoryuse=100M:\ :stacksize=2M:\ :memorylocked=4M:\ :memoryuse=8M:\ :filesize=8M:\ :coredumpsize=8M:\ :openfiles=24:\ :maxproc=32:\ :priority=0:\ :requirehome:\ :passwordtime=91d:\ :umask=022:\ :ignoretime@:\ :label=partition/13,mls/5: And adding the following line to the default user class: :label=mls/equal,biba/equal,partition/15: Once this is completed, the following command must be issued to rebuild the database: &prompt.root; cap_mkdb /etc/login.conf Boot with the Correct Modules Add the following lines to /boot/loader.conf so the required modules will load during system initialization: mac_biba_load="YES" mac_mls_load="YES" mac_seeotheruids_load="YES" mac_partition_load="YES" Set All Users to Insecure All user accounts that are not root or system users will now require a login class. The login class is required; otherwise users will be refused access to common commands such as &man.vi.1;. The following sh script should do the trick: - &prompt.root; for x in `awk -F: '($3 >= 1001) && ($3 != 65534) { print $1 }' \ + &prompt.root; for x in `awk -F: '($3 >= 1001) && ($3 != 65534) { print $1 }' \ /etc/passwd`; do pw usermod $x -L insecure; done; The &man.pwd.mkdb.8; command will need to be run on /etc/master.passwd after this change. Complete the Configuration A contexts file should now be created; the following example was taken from Robert Watson's example policy and should be placed in /etc/policy.contexts. # This is the default BIBA/MLS policy for this system. .* biba/high,mls/high /sbin/dhclient biba/high(low),mls/high(low) /dev(/.*)? biba/equal,mls/equal # This is not an exhaustive list of all "privileged" devices. /dev/mdctl biba/high,mls/high /dev/pci biba/high,mls/high /dev/k?mem biba/high,mls/high /dev/io biba/high,mls/high /dev/agp.* biba/high,mls/high (/var)?/tmp(/.*)? biba/equal,mls/equal /tmp/\.X11-unix biba/high(equal),mls/high(equal) /tmp/\.X11-unix/.* biba/equal,mls/equal /proc(/.*)? biba/equal,mls/equal /mnt.* biba/low,mls/low (/usr)?/home biba/high(low),mls/high(low) (/usr)?/home/.* biba/low,mls/low /var/mail(/.*)? biba/low,mls/low /var/spool/mqueue(/.*)? biba/low,mls/low (/mnt)?/cdrom(/.*)? biba/high,mls/high (/usr)?/home/(ftp|samba)(/.*)? biba/high,mls/high /var/log/sendmail\.st biba/low,mls/low /var/run/utmp biba/equal,mls/equal /var/log/(lastlog|wtmp) biba/equal,mls/equal This policy will enforce security by setting restrictions on both the downward and upward flow of information with regards to the directories and utilities listed on the left. This can now be read into our system by issuing the following commands: &prompt.root; setfsmac -ef /etc/policy.contexts / &prompt.root; setfsmac -ef /etc/policy.contexts /usr The above file system layout may be different depending on environment. The /etc/mac.conf file requires the following modifications in the main section: default_labels file ?biba,?mls default_labels ifnet ?biba,?mls default_labels process ?biba,?mls,?partition default_labels socket ?biba,?mls Testing the Configuration MAC Configuration Testing Add a user with the adduser command and place that user in the insecure class for these tests.
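A non-interactive equivalent using &man.pw.8; might look like the following (a sketch; testuser is the account used in the tests below): &prompt.root; pw useradd testuser -m -L insecure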
The examples below will show a mix of root and regular user tests; use the prompt to distinguish between the two. Basic Labeling Tests &prompt.user; getpmac biba/15(15-15),mls/15(15-15),partition/15 &prompt.root; setpmac partition/15,mls/equal top This top process will be killed before we start another top process. MAC Seeotheruids Tests &prompt.user; ps Zax biba/15(15-15),mls/15(15-15),partition/15 1096 #C: S 0:00.03 -su (bash) biba/15(15-15),mls/15(15-15),partition/15 1101 #C: R+ 0:00.01 ps Zax We should not be permitted to see any processes owned by other users. MAC Partition Test Disable the MAC seeotheruids policy for the rest of these tests: &prompt.root; sysctl security.mac.seeotheruids.enabled=0 &prompt.user; ps Zax LABEL PID TT STAT TIME COMMAND biba/equal(low-high),mls/equal(low-high),partition/15 1122 #C: S+ 0:00.02 top biba/15(15-15),mls/15(15-15),partition/15 1096 #C: S 0:00.05 -su (bash) biba/15(15-15),mls/15(15-15),partition/15 1123 #C: R+ 0:00.01 ps Zax All users should be permitted to see every process in their partition. Testing Biba and MLS Labels &prompt.root; setpmac partition/15,mls/equal,biba/high\(high-high\) top &prompt.user; ps Zax LABEL PID TT STAT TIME COMMAND biba/high(high-high),mls/equal(low-high),partition/15 1251 #C: S+ 0:00.02 top biba/15(15-15),mls/15(15-15),partition/15 1096 #C: S 0:00.06 -su (bash) biba/15(15-15),mls/15(15-15),partition/15 1157 #C: R+ 0:00.00 ps Zax The Biba policy allows us to read higher-labeled objects. &prompt.root; setpmac partition/15,mls/equal,biba/low top &prompt.user; ps Zax LABEL PID TT STAT TIME COMMAND biba/15(15-15),mls/15(15-15),partition/15 1096 #C: S 0:00.07 -su (bash) biba/15(15-15),mls/15(15-15),partition/15 1226 #C: R+ 0:00.01 ps Zax The Biba policy does not allow lower-labeled objects to be read; however, MLS does. &prompt.user; ifconfig bge0 | grep maclabel maclabel biba/low(low-low),mls/low(low-low) &prompt.user; ping -c 1 192.0.34.166 PING 192.0.34.166 (192.0.34.166): 56 data bytes ping: sendto: Permission denied Users are unable to ping example.com, or any domain for that matter. To prevent this error from occurring, run the following command: &prompt.root; sysctl security.mac.biba.trust_all_interfaces=1 This sets the default interface label to insecure mode, so the default Biba policy label will not be enforced. &prompt.root; ifconfig bge0 maclabel biba/equal\(low-high\),mls/equal\(low-high\) &prompt.user; ping -c 1 192.0.34.166 PING 192.0.34.166 (192.0.34.166): 56 data bytes 64 bytes from 192.0.34.166: icmp_seq=0 ttl=50 time=204.455 ms --- 192.0.34.166 ping statistics --- 1 packets transmitted, 1 packets received, 0% packet loss round-trip min/avg/max/stddev = 204.455/204.455/204.455/0.000 ms By setting a more correct label, we can issue ping requests. Now to create a few files for some read and write testing procedures: &prompt.root; touch test1 test2 test3 test4 test5 &prompt.root; getfmac test1 test1: biba/equal,mls/equal &prompt.root; setfmac biba/low test1 test2; setfmac biba/high test4 test5; \ setfmac mls/low test1 test3; setfmac mls/high test2 test4 -&prompt.root; setfmac mls/equal,biba/equal test3 && getfmac test? +&prompt.root; setfmac mls/equal,biba/equal test3 && getfmac test? test1: biba/low,mls/low test2: biba/low,mls/high test3: biba/equal,mls/equal test4: biba/high,mls/high test5: biba/high,mls/equal &prompt.root; chown testuser:testuser test? All of these files should now be owned by our testuser user.
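Before reading the files back, the labels and ownership may be double-checked with the -Z flag to &man.ls.1;, which prints each file's MAC label (shown as a sketch; the exact output columns vary): &prompt.root; ls -lZ test?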
And now for some read tests: &prompt.user; ls test1 test2 test3 test4 test5 &prompt.user; ls test? ls: test1: Permission denied ls: test2: Permission denied ls: test4: Permission denied test3 test5 We should not be permitted to observe the label pairs (biba/low,mls/low), (biba/low,mls/high) and (biba/high,mls/high); and of course, read access to those files is denied. Now for some write tests: - &prompt.user; for i in `echo test*`; do echo 1 > $i; done + &prompt.user; for i in `echo test*`; do echo 1 > $i; done -su: test1: Permission denied -su: test4: Permission denied -su: test5: Permission denied As with the read tests, write access is not permitted for every label pair; here only the pairs (biba/low,mls/high) and (biba/equal,mls/equal) accept the writes. &prompt.user; cat test? cat: test1: Permission denied cat: test2: Permission denied 1 cat: test4: Permission denied And now as root: &prompt.root; cat test2 1 Another Example: Using MAC to Constrain a Web Server A separate location for the web data, which users must be capable of accessing, will be designated. This will permit biba/high processes access rights to the web data. Begin by creating a directory to store the web data in: &prompt.root; mkdir /usr/home/cvs Now initialize it with cvs: &prompt.root; cvs -d /usr/home/cvs init The first goal is to enable the biba policy, thus mac_biba_load="YES" should be placed in /boot/loader.conf. This assumes that support for MAC has been enabled in the kernel. From this point on everything in the system should be set at biba/high by default. The following modification must be made to the login.conf file, under the default user class: :ignoretime@:\ :umask=022:\ :label=biba/high: Every user should now be placed in the default class; a command such as: - &prompt.root; for x in `awk -F: '($3 >= 1001) && ($3 != 65534) { print $1 }' \ + &prompt.root; for x in `awk -F: '($3 >= 1001) && ($3 != 65534) { print $1 }' \ /etc/passwd`; do pw usermod $x -L default; done; will accomplish this task in a few moments. Now create another class, web, a copy of default, with the label setting of biba/low. Create a user who will be used to work with the main web data stored in a cvs repository. This user must be placed in our new login class, web. Since the default is biba/high everywhere, the repository will be the same. The web data must also be the same for users to have read/write access to it; however, since our web server will be serving data that biba/high users must access, we will need to downgrade the data as a whole. The perfect tools for this are &man.sh.1; and &man.cron.8; and are already provided in &os;. The following script should do everything we want: PATH=/bin:/usr/bin:/usr/local/bin; export PATH; CVSROOT=/usr/home/cvs; export CVSROOT; cd /home/web; cvs -qR checkout -P htdocs; exit; In many cases the cvs Id tags must be placed into the web site data files. This script may now be placed into web's home directory and the following &man.crontab.1; entry added: # Check out the web data as biba/low every twelve hours: 0 */12 * * * web /home/web/checkout.sh This will check out the HTML sources every twelve hours on the machine. The default startup method for the web server must also be modified to start the process as biba/low. This can be done by making the following modification to the /usr/local/etc/rc.d/apache.sh script: command="setpmac biba/low /usr/local/sbin/httpd" The Apache configuration must be altered to work with the biba/low policy.
In this case the software must be configured to append to the log files in a directory set at biba/low or else access denied errors will be returned. Following this example requires that the DocumentRoot directive be set to /home/web/htdocs; otherwise, Apache will fail when trying to locate the directory to serve documents from. Other configuration variables must be altered as well, including the PID file, ScoreBoardFile, log file locations, or any other variable which requires write access. When using biba, all write access will be denied to the server in areas not set at biba/low. Troubleshooting the MAC Framework MAC Troubleshooting During the development stage, a few users reported problems with normal configuration. Some of these problems are listed below: The <option>multilabel</option> option cannot be enabled on <filename>/</filename> The multilabel flag does not stay enabled on my root (/) partition! It seems that one out of every fifty users has this problem; indeed, we had this problem during our initial configuration. Further observation of this so-called bug has led me to believe that it is a result of either incorrect documentation or misinterpretation of the documentation. Regardless of why it happened, the following steps may be taken to resolve it: Edit /etc/fstab and set the root partition to ro for read-only. Reboot into single user mode. Run tunefs -l enable on /. Reboot the system into normal mode. Run mount / and change the ro back to rw in /etc/fstab and reboot the system again. Double-check the output from mount to ensure that multilabel has been properly set on the root file system. Cannot start an X11 server after <acronym>MAC</acronym> After establishing a secure environment with MAC, I am no longer able to start X! This could be caused by the MAC partition policy or by a mislabeling in one of the MAC labeling policies. To debug, try the following: Check the error message; if the user is in the insecure class, the partition policy may be the culprit. Try setting the user's class back to the default class and rebuild the database with the cap_mkdb command. If this does not alleviate the problem, go to step two. Double-check the label policies. Ensure that the policies are set correctly for the user in question, the X11 application, and the /dev entries. If neither of these resolve the problem, send the error message and a description of your environment to the TrustedBSD discussion lists located at the TrustedBSD website or to the &a.questions; mailing list. Error: &man..secure.path.3; cannot stat <filename>.login_conf</filename> When I attempt to switch from the root to another user in the system, the error message _secure_path: unable to stat .login_conf is displayed. This message is usually shown when the user has a higher label setting than that of the user whom they are attempting to become. For instance a user on the system, joe, has a default label of biba/low. The root user, who has a label of biba/high, cannot view joe's home directory. This will happen regardless of whether root has used the su command to become joe or not. In this scenario, the Biba integrity model will not permit root to view objects set at a lower integrity level. The <username>root</username> username is broken! In normal or even single user mode, the root is not recognized. The whoami command returns 0 (zero) and su returns who are you?. What could be going on? This can happen if a labeling policy has been disabled, either by a &man.sysctl.8; tunable or because the policy module was unloaded.
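To check whether this is the cause, the policy's enforcement knob can be inspected (Biba shown as one example; a value of zero here means the policy is not being enforced): &prompt.root; sysctl security.mac.biba.enabled security.mac.biba.enabled: 0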
If the policy is being disabled or has been temporarily disabled, then the login capabilities database needs to be reconfigured with the label option removed. Double check the login.conf file to ensure that all label options have been removed and rebuild the database with the cap_mkdb command. diff --git a/en_US.ISO8859-1/books/handbook/mail/chapter.sgml b/en_US.ISO8859-1/books/handbook/mail/chapter.sgml index c85ed1895b..b6a04bf11d 100644 --- a/en_US.ISO8859-1/books/handbook/mail/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/mail/chapter.sgml @@ -1,2323 +1,2323 @@ Bill Lloyd Original work by Jim Mock Rewritten by Electronic Mail Synopsis email Electronic Mail, better known as email, is one of the most widely used forms of communication today. This chapter provides a basic introduction to running a mail server on &os;, as well as an introduction to sending and receiving email using &os;; however, it is not a complete reference and in fact many important considerations are omitted. For more complete coverage of the subject, the reader is referred to the many excellent books listed in . After reading this chapter, you will know: What software components are involved in sending and receiving electronic mail. Where basic sendmail configuration files are located in FreeBSD. The difference between remote and local mailboxes. How to block spammers from illegally using your mail server as a relay. How to install and configure an alternate Mail Transfer Agent on your system, replacing sendmail. How to troubleshoot common mail server problems. How to use SMTP with UUCP. How to set up the system to send mail only. How to use mail with a dialup connection. How to configure SMTP Authentication for added security. How to install and use a Mail User Agent, such as mutt to send and receive email. How to download your mail from a remote POP or IMAP server. How to automatically apply filters and rules to incoming email. Before reading this chapter, you should: Properly set up your network connection (). Properly set up the DNS information for your mail host (). Know how to install additional third-party software (). Using Electronic Mail POP IMAP DNS There are five major parts involved in an email exchange. They are: the user program, the server daemon, DNS, a remote or local mailbox, and of course, the mailhost itself. The User Program This includes command line programs such as mutt, pine, elm, and mail, and GUI programs such as balsa and xfmail, to name a few, and something more sophisticated like a WWW browser. These programs simply pass off the email transactions to the local mailhost, either by calling one of the server daemons available, or delivering it over TCP. Mailhost Server Daemon mail server daemons sendmail mail server daemons postfix mail server daemons qmail mail server daemons exim &os; ships with sendmail by default, but also supports numerous other mail server daemons, just some of which include: exim; postfix; qmail. The server daemon usually has two functions—it is responsible for receiving incoming mail as well as delivering outgoing mail. It is not responsible for the collection of mail using protocols such as POP or IMAP to read your email, nor does it allow connecting to local mbox or Maildir mailboxes. You may require an additional daemon for that. Older versions of sendmail have some serious security issues which may result in an attacker gaining local and/or remote access to your machine. Make sure that you are running a current version to avoid these problems.
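One quick way to check which version of sendmail is installed is to ask the binary directly; the -d0.1 debug flag prints the version and compilation information before entering address test mode: &prompt.root; sendmail -d0.1 -bt < /dev/null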
Optionally, install an alternative MTA from the &os; Ports Collection. Email and DNS The Domain Name System (DNS) and its daemon, named, play a large role in the delivery of email. In order to deliver mail from your site to another, the server daemon will look up the remote site in the DNS to determine the host that will receive mail for the destination. This process also occurs when mail is sent from a remote host to your mail server. MX record DNS is responsible for mapping hostnames to IP addresses, as well as for storing information specific to mail delivery, known as MX records. The MX (Mail eXchanger) record specifies which host, or hosts, will receive mail for a particular domain. If you do not have an MX record for your hostname or domain, the mail will be delivered directly to your host provided you have an A record pointing your hostname to your IP address. You may view the MX records for any domain by using the &man.host.1; command, as seen in the example below: &prompt.user; host -t mx FreeBSD.org FreeBSD.org mail is handled (pri=10) by mx1.FreeBSD.org Receiving Mail email receiving Receiving mail for your domain is done by the mail host. It will collect all mail sent to your domain and store it either in mbox (the default method for storing mail) or Maildir format, depending on your configuration. Once mail has been stored, it may either be read locally using applications such as &man.mail.1; or mutt, or remotely accessed and collected using protocols such as POP or IMAP. This means that should you only wish to read mail locally, you are not required to install a POP or IMAP server. Accessing remote mailboxes using <acronym>POP</acronym> and <acronym>IMAP</acronym> POP IMAP In order to access mailboxes remotely, you are required to have access to a POP or IMAP server. These protocols allow users to connect to their mailboxes from remote locations with ease. Though both POP and IMAP allow users to remotely access mailboxes, IMAP offers many advantages, some of which are: IMAP can store messages on a remote server as well as fetch them. IMAP supports concurrent updates. IMAP can be extremely useful over low-speed links as it allows users to fetch the structure of messages without downloading them; it can also perform tasks such as searching on the server in order to minimize data transfer between clients and servers. In order to install a POP or IMAP server, the following steps should be performed: Choose an IMAP or POP server that best suits your needs. The following POP and IMAP servers are well known and serve as some good examples: qpopper; teapop; imap-uw; courier-imap; Install the POP or IMAP daemon of your choosing from the ports collection. Where required, modify /etc/inetd.conf to load the POP or IMAP server. It should be noted that both POP and IMAP transmit information, including username and password credentials, in clear-text. This means that if you wish to secure the transmission of information across these protocols, you should consider tunneling sessions over &man.ssh.1;. Tunneling sessions is described in . Accessing local mailboxes Mailboxes may be accessed locally by directly utilizing MUAs on the server on which the mailbox resides. This can be done using applications such as mutt or &man.mail.1;. The Mail Host mail host The mail host is the name given to a server that is responsible for delivering and receiving mail for your host, and possibly your network.
Christopher Shumway Contributed by <application>sendmail</application> Configuration sendmail &man.sendmail.8; is the default Mail Transfer Agent (MTA) in FreeBSD. sendmail's job is to accept mail from Mail User Agents (MUA) and deliver it to the appropriate mailer as defined by its configuration file. sendmail can also accept network connections and deliver mail to local mailboxes or deliver it to another program. sendmail uses the following configuration files: /etc/mail/access /etc/mail/aliases /etc/mail/local-host-names /etc/mail/mailer.conf /etc/mail/mailertable /etc/mail/sendmail.cf /etc/mail/virtusertable Filename Function /etc/mail/access sendmail access database file /etc/mail/aliases Mailbox aliases /etc/mail/local-host-names Lists of hosts sendmail accepts mail for /etc/mail/mailer.conf Mailer program configuration /etc/mail/mailertable Mailer delivery table /etc/mail/sendmail.cf sendmail master configuration file /etc/mail/virtusertable Virtual users and domain tables <filename>/etc/mail/access</filename> The access database defines what host(s) or IP addresses have access to the local mail server and what kind of access they have. Hosts can be listed as OK, REJECT, RELAY or simply passed to sendmail's error handling routine with a given mailer error. Hosts that are listed as OK, which is the default, are allowed to send mail to this host as long as the mail's final destination is the local machine. Hosts that are listed as REJECT are rejected for all mail connections. Hosts that have the RELAY option for their hostname are allowed to send mail for any destination through this mail server. Configuring the <application>sendmail</application> Access Database cyberspammer.com 550 We do not accept mail from spammers FREE.STEALTH.MAILER@ 550 We do not accept mail from spammers another.source.of.spam REJECT okay.cyberspammer.com OK 128.32 RELAY In this example we have five entries. Mail senders that match the left hand side of the table are affected by the action on the right side of the table. The first two examples give an error code to sendmail's error handling routine. The message is printed to the remote host when a mail matches the left hand side of the table. The next entry rejects mail from a specific host on the Internet, another.source.of.spam. The next entry accepts mail connections from the host okay.cyberspammer.com, which is more exact than the cyberspammer.com line above. More specific matches override less exact matches. The last entry allows relaying of electronic mail from hosts with an IP address that begins with 128.32. These hosts would be able to send mail destined for other mail servers through this mail server. When this file is updated, you need to run make in /etc/mail/ to update the database. <filename>/etc/mail/aliases</filename> The aliases database contains a list of virtual mailboxes that are expanded to other user(s), files, programs or other aliases. Here are a few examples that can be used in /etc/mail/aliases: Mail Aliases root: localuser ftp-bugs: joe,eric,paul bit.bucket: /dev/null procmail: "|/usr/local/bin/procmail" The file format is simple; the mailbox name on the left side of the colon is expanded to the target(s) on the right. The first example simply expands the mailbox root to the mailbox localuser, which is then looked up again in the aliases database. If no match is found, then the message is delivered to the local user localuser. The next example shows a mailing list. Mail to the mailbox ftp-bugs is expanded to the three local mailboxes joe, eric, and paul.
Note that a remote mailbox could be specified as user@example.com. The next example shows writing mail to a file, in this case /dev/null. The last example shows sending mail to a program, in this case the mail message is written to the standard input of /usr/local/bin/procmail through a &unix; pipe. When this file is updated, you need to run make in /etc/mail/ to update the database. <filename>/etc/mail/local-host-names</filename> This is a list of hostnames &man.sendmail.8; is to accept as the local host name. Place in it any domains or hosts for which sendmail is to receive mail. For example, if this mail server was to accept mail for the domain example.com and the host mail.example.com, its local-host-names might look something like this: example.com mail.example.com When this file is updated, &man.sendmail.8; needs to be restarted to read the changes. <filename>/etc/mail/sendmail.cf</filename> sendmail's master configuration file, sendmail.cf, controls the overall behavior of sendmail, including everything from rewriting e-mail addresses to printing rejection messages to remote mail servers. Naturally, with such a diverse role, this configuration file is quite complex and its details are a bit outside the scope of this section. Fortunately, this file rarely needs to be changed for standard mail servers. The master sendmail configuration file can be built from &man.m4.1; macros that define the features and behavior of sendmail. Please see /usr/src/contrib/sendmail/cf/README for some of the details. When changes to this file are made, sendmail needs to be restarted for the changes to take effect. <filename>/etc/mail/virtusertable</filename> The virtusertable maps mail addresses for virtual domains and mailboxes to real mailboxes. These mailboxes can be local, remote, aliases defined in /etc/mail/aliases or files. Example Virtual Domain Mail Map root@example.com root postmaster@example.com postmaster@noc.example.net @example.com joe In the above example, we have a mapping for the domain example.com. This file is processed in first-match order down the file. The first item maps root@example.com to the local mailbox root. The next entry maps postmaster@example.com to the mailbox postmaster on the host noc.example.net. Finally, if nothing from example.com has matched so far, it will match the last mapping, which matches every other mail message addressed to someone at example.com. This will be mapped to the local mailbox joe. Andrew Boothman Written by Gregory Neil Shapiro Information taken from e-mails written by Changing Your Mail Transfer Agent email change mta As already mentioned, FreeBSD comes with sendmail already installed as your MTA (Mail Transfer Agent). Therefore by default it is in charge of your outgoing and incoming mail. However, for a variety of reasons, some system administrators want to change their system's MTA. These reasons range from simply wanting to try out another MTA to needing a specific feature or package which relies on another mailer. Fortunately, whatever the reason, FreeBSD makes it easy to make the change. Install a New MTA You have a wide choice of MTAs available. A good starting point is the FreeBSD Ports Collection where you will be able to find many. Of course you are free to use any MTA you want from any location, as long as you can make it run under FreeBSD. Start by installing your new MTA.
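For example, postfix (one of the alternative daemons named earlier) can be built and installed from the Ports Collection; mail/postfix is its conventional location there: &prompt.root; cd /usr/ports/mail/postfix &prompt.root; make install clean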
Once it is installed it gives you a chance to decide if it really fulfills your needs, and also gives you the opportunity to configure your new software before getting it to take over from sendmail. When doing this, you should be sure that installing the new software will not attempt to overwrite system binaries such as /usr/bin/sendmail. Otherwise, your new mail software has essentially been put into service before you have configured it. Please refer to your chosen MTA's documentation for information on how to configure the software you have chosen. Disable <application>sendmail</application> The procedure used to start sendmail changed significantly between 4.5-RELEASE, 4.6-RELEASE, and later releases. Therefore, the procedure used to disable it is subtly different. If you disable sendmail's outgoing mail service, it is important that you replace it with an alternative mail delivery system. If you choose not to, system functions such as &man.periodic.8; will be unable to deliver their results by e-mail as they would normally expect to. Many parts of your system may expect to have a functional sendmail-compatible system. If applications continue to use sendmail's binaries to try to send e-mail after you have disabled them, mail could go into an inactive sendmail queue, and never be delivered. FreeBSD 4.5-STABLE before 2002/4/4 and Earlier (Including 4.5-RELEASE and Earlier) Enter: sendmail_enable="NO" into /etc/rc.conf. This will disable sendmail's incoming mail service, but if /etc/mail/mailer.conf (see below) is not changed, sendmail will still be used to send e-mail. FreeBSD 4.5-STABLE after 2002/4/4 (Including 4.6-RELEASE and Later) In order to completely disable sendmail, including the outgoing mail service, you must use sendmail_enable="NONE" in /etc/rc.conf. If you only want to disable sendmail's incoming mail service, you should set sendmail_enable="NO" in /etc/rc.conf. However, if incoming mail is disabled, local delivery will still function. More information on sendmail's startup options is available from the &man.rc.sendmail.8; manual page. FreeBSD 5.0-STABLE and Later In order to completely disable sendmail, including the outgoing mail service, you must use sendmail_enable="NO" sendmail_submit_enable="NO" sendmail_outbound_enable="NO" sendmail_msp_queue_enable="NO" in /etc/rc.conf. If you only want to disable sendmail's incoming mail service, you should set sendmail_enable="NO" in /etc/rc.conf. More information on sendmail's startup options is available from the &man.rc.sendmail.8; manual page. Running Your New MTA on Boot You may have a choice of two methods for running your new MTA on boot, again depending on what version of FreeBSD you are running. FreeBSD 4.5-STABLE before 2002/4/11 (Including 4.5-RELEASE and Earlier) Add a script to /usr/local/etc/rc.d/ that ends in .sh and is executable by root. The script should accept start and stop parameters. At startup time the system scripts will execute the command /usr/local/etc/rc.d/supermailer.sh start which you can also use to manually start the server. At shutdown time, the system scripts will use the stop option, running the command /usr/local/etc/rc.d/supermailer.sh stop which you can also use to manually stop the server while the system is running. 
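As a sketch only, a minimal /usr/local/etc/rc.d/supermailer.sh might look like the following; the supermailer program and pid file paths are placeholders that must be adjusted to whatever your MTA actually installs:
#!/bin/sh
# Minimal start/stop glue for a hypothetical MTA.
case "$1" in
start)
        /usr/local/sbin/supermailer-daemon && echo -n ' supermailer'
        ;;
stop)
        kill `cat /var/run/supermailer.pid`
        ;;
*)
        echo "Usage: `basename $0` {start|stop}" >&2
        exit 64
        ;;
esac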
FreeBSD 4.5-STABLE after 2002/4/11 (Including 4.6-RELEASE and Later) With later versions of FreeBSD, you can use the above method or you can set mta_start_script="filename" in /etc/rc.conf, where filename is the name of some script that you want executed at boot to start your MTA. Replacing <application>sendmail</application> as the System's Default Mailer The program sendmail is so ubiquitous as standard software on &unix; systems that some software just assumes it is already installed and configured. For this reason, many alternative MTA's provide their own compatible implementations of the sendmail command-line interface; this facilitates using them as drop-in replacements for sendmail. Therefore, if you are using an alternative mailer, you will need to make sure that software trying to execute standard sendmail binaries such as /usr/bin/sendmail actually executes your chosen mailer instead. Fortunately, FreeBSD provides a system called &man.mailwrapper.8; that does this job for you. When sendmail is operating as installed, you will find something like the following in /etc/mail/mailer.conf: sendmail /usr/libexec/sendmail/sendmail send-mail /usr/libexec/sendmail/sendmail mailq /usr/libexec/sendmail/sendmail newaliases /usr/libexec/sendmail/sendmail hoststat /usr/libexec/sendmail/sendmail purgestat /usr/libexec/sendmail/sendmail This means that when any of these common commands (such as sendmail itself) are run, the system actually invokes a copy of mailwrapper named sendmail, which checks mailer.conf and executes /usr/libexec/sendmail/sendmail instead. This system makes it easy to change what binaries are actually executed when these default sendmail functions are invoked. Therefore if you wanted /usr/local/supermailer/bin/sendmail-compat to be run instead of sendmail, you could change /etc/mail/mailer.conf to read: sendmail /usr/local/supermailer/bin/sendmail-compat send-mail /usr/local/supermailer/bin/sendmail-compat mailq /usr/local/supermailer/bin/mailq-compat newaliases /usr/local/supermailer/bin/newaliases-compat hoststat /usr/local/supermailer/bin/hoststat-compat purgestat /usr/local/supermailer/bin/purgestat-compat Finishing Once you have everything configured the way you want it, you should either kill the sendmail processes that you no longer need and start the processes belonging to your new software, or simply reboot. Rebooting will also give you the opportunity to ensure that you have correctly configured your system to start your new MTA automatically on boot. Troubleshooting email troubleshooting Why do I have to use the FQDN for hosts on my site? You will probably find that the host is actually in a different domain; for example, if you are in foo.bar.edu and you wish to reach a host called mumble in the bar.edu domain, you will have to refer to it by the fully-qualified domain name, mumble.bar.edu, instead of just mumble. BIND Traditionally, this was allowed by BSD BIND resolvers. However the current version of BIND that ships with FreeBSD no longer provides default abbreviations for non-fully qualified domain names other than the domain you are in. So an unqualified host mumble must either be found as mumble.foo.bar.edu, or it will be searched for in the root domain. This is different from the previous behavior, where the search continued across mumble.bar.edu, and mumble.edu. Have a look at RFC 1535 for why this was considered bad practice, or even a security hole. 
As a good workaround, you can place the line: search foo.bar.edu bar.edu instead of the previous: domain foo.bar.edu into your /etc/resolv.conf. However, make sure that the search order does not go beyond the boundary between local and public administration, as RFC 1535 calls it. MX record sendmail says mail loops back to myself This is answered in the sendmail FAQ as follows: I'm getting these error messages: 553 MX list for domain.net points back to relay.domain.net 554 <user@domain.net>... Local configuration error How can I solve this problem? You have asked mail to the domain (e.g., domain.net) to be forwarded to a specific host (in this case, relay.domain.net) by using an MX record, but the relay machine does not recognize itself as domain.net. Add domain.net to /etc/mail/local-host-names [known as /etc/sendmail.cw prior to version 8.10] (if you are using FEATURE(use_cw_file)) or add Cw domain.net to /etc/mail/sendmail.cf. The sendmail FAQ can be found at and is recommended reading if you want to do any tweaking of your mail setup. PPP How can I run a mail server on a dial-up PPP host? You want to connect a FreeBSD box on a LAN to the Internet. The FreeBSD box will be a mail gateway for the LAN. The PPP connection is non-dedicated. UUCP MX record There are at least two ways to do this. One way is to use UUCP. Another way is to get a full-time Internet server to provide secondary MX services for your domain. For example, if your company's domain is example.com and your Internet service provider has set example.net up to provide secondary MX services to your domain: example.com. MX 10 example.com. MX 20 example.net. Only one host should be specified as the final recipient (add Cw example.com in /etc/mail/sendmail.cf on example.com). When the sending sendmail is trying to deliver the mail it will try to connect to you (example.com) over the modem link. It will most likely time out because you are not online. The program sendmail will automatically deliver it to the secondary MX site, i.e. your Internet provider (example.net). The secondary MX site will then periodically try to connect to your host and deliver the mail to the primary MX host (example.com). You might want to use something like this as a login script: #!/bin/sh # Put me in /usr/local/bin/pppmyisp ( sleep 60 ; /usr/sbin/sendmail -q ) & /usr/sbin/ppp -direct pppmyisp If you are going to create a separate login script for a user you could use sendmail -qRexample.com instead in the script above. This will force all mail in your queue for example.com to be processed immediately. A further refinement of the situation is as follows: Message stolen from the &a.isp;. > we provide the secondary MX for a customer. The customer connects to > our services several times a day automatically to get the mails to > his primary MX (We do not call his site when a mail for his domains > arrived). Our sendmail sends the mailqueue every 30 minutes. At the > moment he has to stay 30 minutes online to be sure that all mail is > gone to the primary MX. > > Is there a command that would initiate sendmail to send all the mails > now? The user has not root-privileges on our machine of course. In the privacy flags section of sendmail.cf, there is a definition Opgoaway,restrictqrun Remove restrictqrun to allow non-root users to start the queue processing. You might also like to rearrange the MXs. 
We are the 1st MX for our customers like this, and we have defined: # If we are the best MX for a host, try directly instead of generating # local config error. OwTrue That way a remote site will deliver straight to you, without trying the customer connection. You then send to your customer. Only works for hosts, so you need to get your customer to name their mail machine customer.com as well as hostname.customer.com in the DNS. Just put an A record in the DNS for customer.com. Why do I keep getting Relaying Denied errors when sending mail from other hosts? In default FreeBSD installations, sendmail is configured to only send mail from the host it is running on. For example, if a POP server is available, then users will be able to check mail from school, work, or other remote locations but they still will not be able to send outgoing emails from outside locations. Typically, a few moments after the attempt, an email will be sent from MAILER-DAEMON with a 5.7 Relaying Denied error message. There are several ways to get around this. The most straightforward solution is to put your ISP's address in a relay-domains file at /etc/mail/relay-domains. A quick way to do this would be: &prompt.root; echo "your.isp.example.com" > /etc/mail/relay-domains After creating or editing this file you must restart sendmail. This works great if you are a server administrator and do not wish to send mail locally, or would like to use a point and click client/system on another machine or even another ISP. It is also very useful if you only have one or two email accounts set up. If there is a large number of addresses to add, you can simply open this file in your favorite text editor and then add the domains, one per line: your.isp.example.com other.isp.example.net users-isp.example.org www.example.org Now any mail sent through your system, by any host in this list (provided the user has an account on your system), will succeed. This is a very nice way to allow users to send mail from your system remotely without allowing people to send SPAM through your system. Advanced Topics The following section covers more involved topics such as mail configuration and setting up mail for your entire domain. Basic Configuration email configuration Out of the box, you should be able to send email to external hosts as long as you have set up /etc/resolv.conf or are running your own name server. If you would like to have mail for your host delivered to the MTA (e.g., sendmail) on your own FreeBSD host, there are two methods: Run your own name server and have your own domain. For example, FreeBSD.org Get mail delivered directly to your host. This is done by delivering mail directly to the current DNS name for your machine. For example, example.FreeBSD.org. SMTP Regardless of which of the above you choose, in order to have mail delivered directly to your host, it must have a permanent static IP address (not a dynamic address, as with most PPP dial-up configurations). If you are behind a firewall, it must pass SMTP traffic on to you. If you want to receive mail directly at your host, you need to be sure of either of two things: MX record Make sure that the (lowest-numbered) MX record in your DNS points to your host's IP address. Make sure there is no MX entry in your DNS for your host. Either of the above will allow you to receive mail directly at your host. 
Try this: &prompt.root; hostname example.FreeBSD.org &prompt.root; host example.FreeBSD.org example.FreeBSD.org has address 204.216.27.XX If that is what you see, mail sent directly to yourlogin@example.FreeBSD.org should work without problems (assuming sendmail is running correctly on example.FreeBSD.org). If instead you see something like this: &prompt.root; host example.FreeBSD.org example.FreeBSD.org has address 204.216.27.XX example.FreeBSD.org mail is handled (pri=10) by hub.FreeBSD.org then all mail sent to your host (example.FreeBSD.org) will end up being collected on hub under the same username instead of being sent directly to your host. The above information is handled by your DNS server. The DNS record that carries mail routing information is the Mail eXchange (MX) entry. If no MX record exists, mail will be delivered directly to the host by way of its IP address. The MX entry for freefall.FreeBSD.org at one time looked like this: freefall MX 30 mail.crl.net freefall MX 40 agora.rdrop.com freefall MX 10 freefall.FreeBSD.org freefall MX 20 who.cdrom.com As you can see, freefall had many MX entries. The host with the lowest MX number receives mail directly, if it is available; if it is not accessible for some reason, the others (sometimes called backup MXes) accept messages temporarily and pass them along when a lower-numbered host becomes available, eventually reaching the lowest-numbered host. Alternate MX sites should have separate Internet connections from your own in order to be most useful. Your ISP or another friendly site should have no problem providing this service for you. Mail for Your Domain In order to set up a mailhost (a.k.a. mail server), you need any mail addressed to the individual workstations directed to it. Basically, you want to claim any mail for any hostname in your domain (in this case *.FreeBSD.org) and divert it to your mail server, so your users can receive their mail on the master mail server. DNS To make life easiest, a user account with the same username should exist on both machines. Use &man.adduser.8; to do this. The mailhost you will be using must be the designated mail exchanger for each workstation on the network. This is done in your DNS configuration like so: example.FreeBSD.org A 204.216.27.XX ; Workstation MX 10 hub.FreeBSD.org ; Mailhost This will redirect mail for the workstation to the mailhost no matter where the A record points; the mail is sent to the MX host. You cannot do this yourself unless you are running a DNS server. If you are not, or cannot run your own DNS server, talk to your ISP or whoever provides your DNS. If you are doing virtual email hosting, the following information will come in handy. For this example, we will assume you have a customer with his own domain, in this case customer1.org, and you want all the mail for customer1.org sent to your mailhost, mail.myhost.com. The entry in your DNS should look like this: customer1.org MX 10 mail.myhost.com You do not need an A record for customer1.org if you only want to handle email for that domain. Be aware that pinging customer1.org will not work unless an A record exists for it. The last thing that you must do is tell sendmail on your mailhost what domains and/or hostnames it should be accepting mail for. There are a few different ways this can be done. Either of the following will work: Add the hosts to your /etc/mail/local-host-names file if you are using FEATURE(use_cw_file). If you are using a version of sendmail earlier than 8.10, the file is /etc/sendmail.cw.
Add a Cwyour.host.com line to your /etc/sendmail.cf or /etc/mail/sendmail.cf if you are using sendmail 8.10 or higher. SMTP with UUCP The sendmail configuration that ships with FreeBSD is designed for sites that connect directly to the Internet. Sites that wish to exchange their mail via UUCP must install another sendmail configuration file. Tweaking /etc/mail/sendmail.cf manually is an advanced topic. sendmail version 8 generates config files via &man.m4.1; preprocessing, where the actual configuration occurs on a higher abstraction level. The &man.m4.1; configuration files can be found under /usr/src/usr.sbin/sendmail/cf. If you did not install your system with full sources, the sendmail configuration set has been broken out into a separate source distribution tarball. Assuming you have your FreeBSD source code CDROM mounted, do: &prompt.root; cd /cdrom/src &prompt.root; cat scontrib.?? | tar xzf - -C /usr/src/contrib/sendmail This extracts to only a few hundred kilobytes. The file README in the cf directory can serve as a basic introduction to &man.m4.1; configuration. The best way to support UUCP delivery is to use the mailertable feature. This creates a database that sendmail can use to make routing decisions. First, you have to create your .mc file. The directory /usr/src/usr.sbin/sendmail/cf/cf contains a few examples. Assuming you have named your file foo.mc, all you need to do in order to convert it into a valid sendmail.cf is: &prompt.root; cd /usr/src/usr.sbin/sendmail/cf/cf &prompt.root; make foo.cf &prompt.root; cp foo.cf /etc/mail/sendmail.cf A typical .mc file might look like: VERSIONID(`Your version number') OSTYPE(bsd4.4) FEATURE(accept_unresolvable_domains) FEATURE(nocanonify) FEATURE(mailertable, `hash -o /etc/mail/mailertable') define(`UUCP_RELAY', your.uucp.relay) define(`UUCP_MAX_SIZE', 200000) define(`confDONT_PROBE_INTERFACES') MAILER(local) MAILER(smtp) MAILER(uucp) Cw your.alias.host.name Cw youruucpnodename.UUCP The lines containing accept_unresolvable_domains, nocanonify, and confDONT_PROBE_INTERFACES features will prevent any usage of the DNS during mail delivery. The UUCP_RELAY clause is needed to support UUCP delivery. Simply put an Internet hostname there that is able to handle .UUCP pseudo-domain addresses; most likely, you will enter the mail relay of your ISP there. Once you have this, you need an /etc/mail/mailertable file. If you have only one link to the outside that is used for all your mails, the following file will suffice: # # makemap hash /etc/mail/mailertable.db < /etc/mail/mailertable . uucp-dom:your.uucp.relay A more complex example might look like this: # # makemap hash /etc/mail/mailertable.db < /etc/mail/mailertable # horus.interface-business.de uucp-dom:horus .interface-business.de uucp-dom:if-bus interface-business.de uucp-dom:if-bus .heep.sax.de smtp8:%1 horus.UUCP uucp-dom:horus if-bus.UUCP uucp-dom:if-bus . uucp-dom: The first three lines handle special cases where domain-addressed mail should not be sent out to the default route, but instead to some UUCP neighbor in order to shortcut the delivery path. The next line handles mail to the local Ethernet domain that can be delivered using SMTP. Finally, the UUCP neighbors are mentioned in the .UUCP pseudo-domain notation, to allow for a uucp-neighbor !recipient override of the default rules. The last line is always a single dot, matching everything else, with UUCP delivery to a UUCP neighbor that serves as your universal mail gateway to the world. 
All of the node names behind the uucp-dom: keyword must be valid UUCP neighbors, as you can verify using the command uuname. As noted above, this file needs to be converted into a DBM database file before use; keeping the command line as a comment at the top of the mailertable file means you will always have it at hand, and you must run it each time you change the file. Final hint: if you are uncertain whether some particular mail routing would work, remember the -bt option to sendmail. It starts sendmail in address test mode; simply enter 3,0, followed by the address you wish to test for the mail routing. The last line tells you the internal mail agent used, the destination host this agent will be called with, and the (possibly translated) address. Leave this mode by typing Ctrl+D. &prompt.user; sendmail -bt ADDRESS TEST MODE (ruleset 3 NOT automatically invoked) Enter <ruleset> <address> > 3,0 foo@example.com canonify input: foo @ example . com ... parse returns: $# uucp-dom $@ your.uucp.relay $: foo < @ example . com . > > ^D Bill Moran Contributed by Setting Up to Send Only There are many instances where you may only want to send mail through a relay. Some examples are: Your computer is a desktop machine, but you want to use programs such as &man.send-pr.1;. To do so, you should use your ISP's mail relay. The computer is a server that does not handle mail locally, but needs to pass off all mail to a relay for processing. Just about any MTA is capable of filling this particular niche. Unfortunately, it can be very difficult to properly configure a full-featured MTA just to handle offloading mail. Programs such as sendmail and postfix are largely overkill for this use. Additionally, if you are using a typical Internet access service, your agreement may forbid you from running a mail server. The easiest way to fulfill those needs is to install the mail/ssmtp port. Execute the following commands as root: &prompt.root; cd /usr/ports/mail/ssmtp &prompt.root; make install replace clean Once installed, mail/ssmtp can be configured with a four-line file located at /usr/local/etc/ssmtp/ssmtp.conf: root=yourrealemail@example.com mailhub=mail.example.com rewriteDomain=example.com hostname=_HOSTNAME_ Make sure you use your real email address for root. Enter your ISP's outgoing mail relay in place of mail.example.com (some ISPs call this the outgoing mail server or SMTP server). Make sure you disable sendmail, including the outgoing mail service; see the section on disabling sendmail above for details. mail/ssmtp has some other options available. See the example configuration file in /usr/local/etc/ssmtp or the manual page of ssmtp for some examples and more information. Setting up ssmtp in this manner will allow any software on your computer that needs to send mail to function properly, while not violating your ISP's usage policy or allowing your computer to be hijacked for spamming. Using Mail with a Dialup Connection If you have a static IP address, you should not need to adjust anything from the defaults. Set your host name to your assigned Internet name and sendmail will do the rest. If you have a dynamically assigned IP number and use a dialup PPP connection to the Internet, you will probably have a mailbox on your ISP's mail server. Let's assume your ISP's domain is example.net, that your user name is user, that you have called your machine bsd.home, and that your ISP has told you that you may use relay.example.net as a mail relay. In order to retrieve mail from your mailbox, you must install a retrieval agent.
The fetchmail utility is a good choice as it supports many different protocols. This program is available as a package or from the Ports Collection (mail/fetchmail). Usually, your ISP will provide POP. If you are using user PPP, you can automatically fetch your mail when an Internet connection is established with the following entry in /etc/ppp/ppp.linkup: MYADDR: !bg su user -c fetchmail If you are using sendmail (as shown below) to deliver mail to non-local accounts, you probably want to have sendmail process your mailqueue as soon as your Internet connection is established. To do this, put this command after the fetchmail command in /etc/ppp/ppp.linkup: !bg su user -c "sendmail -q" Assume that you have an account for user on bsd.home. In the home directory of user on bsd.home, create a .fetchmailrc file: poll example.net protocol pop3 fetchall pass MySecret This file should not be readable by anyone except user as it contains the password MySecret. In order to send mail with the correct from: header, you must tell sendmail to use user@example.net rather than user@bsd.home. You may also wish to tell sendmail to send all mail via relay.example.net, allowing quicker mail transmission. The following .mc file should suffice: VERSIONID(`bsd.home.mc version 1.0') OSTYPE(bsd4.4)dnl FEATURE(nouucp)dnl MAILER(local)dnl MAILER(smtp)dnl Cwlocalhost Cwbsd.home MASQUERADE_AS(`example.net')dnl FEATURE(allmasquerade)dnl FEATURE(masquerade_envelope)dnl FEATURE(nocanonify)dnl FEATURE(nodns)dnl define(`SMART_HOST', `relay.example.net') Dmbsd.home define(`confDOMAIN_NAME',`bsd.home')dnl define(`confDELIVERY_MODE',`deferred')dnl Refer to the previous section for details of how to turn this .mc file into a sendmail.cf file. Also, do not forget to restart sendmail after updating sendmail.cf. James Gorham Written by SMTP Authentication Having SMTP Authentication in place on your mail server has a number of benefits. SMTP Authentication can add another layer of security to sendmail, and has the benefit of giving mobile users who switch hosts the ability to use the same mail server without the need to reconfigure their mail client settings each time. Install security/cyrus-sasl from the Ports Collection. security/cyrus-sasl has a number of compile time options to choose from and, for the method we will be using here, make sure to select the pwcheck option. After installing security/cyrus-sasl, edit /usr/local/lib/sasl/Sendmail.conf (or create it if it does not exist) and add the following line: pwcheck_method: passwd This method will enable sendmail to authenticate against your FreeBSD passwd database. This saves the trouble of creating a new set of usernames and passwords for each user that needs to use SMTP authentication, and keeps the login and mail password the same. Now edit /etc/make.conf and add the following lines: SENDMAIL_CFLAGS=-I/usr/local/include/sasl1 -DSASL SENDMAIL_LDFLAGS=-L/usr/local/lib SENDMAIL_LDADD=-lsasl These lines will give sendmail the proper configuration options for linking to cyrus-sasl at compile time. Make sure that cyrus-sasl has been installed before recompiling sendmail. Recompile sendmail by executing the following commands: &prompt.root; cd /usr/src/usr.sbin/sendmail &prompt.root; make cleandir &prompt.root; make obj &prompt.root; make &prompt.root; make install The compile of sendmail should not have any problems if /usr/src has not been changed extensively and the shared libraries it needs are available.
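Before moving on, it may be worth confirming that the new binary was actually built with SASL support; inspecting the compiled-in options is one way to do that (the grep is merely a convenience): &prompt.root; /usr/sbin/sendmail -d0.1 -bv root | grep SASL If SASL appears among the compiled options, the rebuild picked up the flags from /etc/make.conf.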
After sendmail has been compiled and reinstalled, edit your /etc/mail/freebsd.mc file (or whichever file you use as your .mc file; many administrators choose to use the output from &man.hostname.1; as the name of the .mc file for uniqueness). Add these lines to it: dnl set SASL options TRUST_AUTH_MECH(`GSSAPI DIGEST-MD5 CRAM-MD5 LOGIN')dnl define(`confAUTH_MECHANISMS', `GSSAPI DIGEST-MD5 CRAM-MD5 LOGIN')dnl define(`confDEF_AUTH_INFO', `/etc/mail/auth-info')dnl These options configure the different methods available to sendmail for authenticating users. If you would like to use a method other than pwcheck, please see the included documentation. Finally, run &man.make.1; while in /etc/mail. That will run your new .mc file and create a .cf file named freebsd.cf (or whatever name you have used for your .mc file). Then use the command make install restart, which will copy the file to sendmail.cf, and will properly restart sendmail. For more information about this process, you should refer to /etc/mail/Makefile. If all has gone correctly, you should be able to enter your login information into the mail client and send a test message. For further investigation, set the LogLevel option of sendmail to 13 and watch /var/log/maillog for any errors. You may wish to add the following line to /etc/rc.conf so this service will be available after every system boot: cyrus_pwcheck_enable="YES" This will ensure the initialization of SMTP_AUTH upon system boot. For more information, please see the sendmail page regarding SMTP authentication. Marc Silver Contributed by Mail User Agents Mail User Agents A Mail User Agent (MUA) is an application that is used to send and receive email. Furthermore, as email evolves and becomes more complex, MUAs are becoming increasingly powerful in the way they interact with email; this gives users increased functionality and flexibility. &os; contains support for numerous mail user agents, all of which can be easily installed using the FreeBSD Ports Collection. Users may choose between graphical email clients such as evolution or balsa, console based clients such as mutt, pine or mail, or the web interfaces used by some large organizations. mail &man.mail.1; is the default Mail User Agent (MUA) in &os;. It is a console based MUA that offers all the basic functionality required to send and receive text-based email, though it is limited in its handling of attachments and supports only local mailboxes. Although mail does not natively support interaction with POP or IMAP servers, these mailboxes may be downloaded to a local mbox file using an application such as fetchmail, which is discussed later in this chapter. In order to send and receive email, simply invoke the mail command as per the following example: &prompt.user; mail The contents of the user mailbox in /var/mail are automatically read by the mail utility. Should the mailbox be empty, the utility exits with a message indicating that no mail could be found. Once the mailbox has been read, the application interface is started, and a list of messages will be displayed. Messages are automatically numbered, as can be seen in the following example: Mail version 8.1 6/6/93. Type ? for help. "/var/mail/marcs": 3 messages 3 new >N 1 root@localhost Mon Mar 8 14:05 14/510 "test" N 2 root@localhost Mon Mar 8 14:05 14/509 "user account" N 3 root@localhost Mon Mar 8 14:05 14/509 "sample" Messages can now be read by using the t mail command, suffixed by the message number that should be displayed.
In this example, we will read the first email: & t 1 Message 1: From root@localhost Mon Mar 8 14:05:52 2004 X-Original-To: marcs@localhost Delivered-To: marcs@localhost To: marcs@localhost Subject: test Date: Mon, 8 Mar 2004 14:05:52 +0200 (SAST) From: root@localhost (Charlie Root) This is a test message, please reply if you receive it. As can be seen in the example above, the t key will cause the message to be displayed with full headers. To display the list of messages again, the h key should be used. If the email requires a response, you may use mail to reply, by using either the R or r mail keys. The R key instructs mail to reply only to the sender of the email, while r replies not only to the sender, but also to other recipients of the message. You may also suffix these commands with the number of the mail you would like to reply to. Once this has been done, the response should be entered, and the end of the message should be marked by a single . on a new line. An example can be seen below: & R 1 To: root@localhost Subject: Re: test Thank you, I did get your email. . EOT In order to send new email, the m key should be used, followed by the recipient email address. Multiple recipients may also be specified by separating each address with the , delimiter. The subject of the message may then be entered, followed by the message contents. The end of the message should be specified by putting a single . on a new line. & mail root@localhost Subject: I mastered mail Now I can send and receive email using mail ... :) . EOT While inside the mail utility, the ? command may be used to display help at any time; the &man.mail.1; manual page should also be consulted for more help with mail. As previously mentioned, the &man.mail.1; command was not originally designed to handle attachments, and thus deals with them very poorly. Newer MUAs such as mutt handle attachments in a much more intelligent way. But should you still wish to use the mail command, the converters/mpack port may be of considerable use. mutt mutt is a small yet very powerful Mail User Agent, with excellent features, some of which include: The ability to thread messages; PGP support for digital signing and encryption of email; MIME Support; Maildir Support; Highly customizable. All of these features help to make mutt one of the most advanced mail user agents available. See the mutt home page for more information on mutt. The stable version of mutt may be installed using the mail/mutt port, while the current development version may be installed via the mail/mutt-devel port. After the port has been installed, mutt can be started by issuing the following command: &prompt.user; mutt mutt will automatically read the contents of the user mailbox in /var/mail and display the contents if applicable. If no mails are found in the user mailbox, then mutt will wait for commands from the user. The example below shows mutt displaying a list of messages: In order to read an email, simply select it using the cursor keys, and press the Enter key. An example of mutt displaying email can be seen below: As with the &man.mail.1; command, mutt allows users to reply only to the sender of the message as well as to all recipients. To reply only to the sender of the email, use the r keyboard shortcut. To send a group reply, which will be sent to the original sender as well as all the message recipients, use the g shortcut. mutt makes use of the &man.vi.1; command as an editor for creating and replying to emails.
This may be customized by the user by creating or editing their own .muttrc file in their home directory and setting the editor variable. In order to compose a new mail message, press m. After a valid subject has been given, mutt will start &man.vi.1; and the mail can be written. Once the contents of the mail are complete, save and quit from vi and mutt will resume, displaying a summary screen of the mail that is to be delivered. In order to send the mail, press y. An example of the summary screen can be seen below: mutt also contains extensive help, which can be accessed from most of the menus by pressing the ? key. The top line also displays the keyboard shortcuts where appropriate. pine pine is aimed at beginner users, but also includes some advanced features. The pine software has had several remote vulnerabilities discovered in the past, which allowed remote attackers to execute arbitrary code as users on the local system, by the action of sending a specially-prepared email. All such known problems have been fixed, but the pine code is written in a very insecure style and the &os; Security Officer believes there are likely to be other undiscovered vulnerabilities. You install pine at your own risk. The current version of pine may be installed using the mail/pine4 port. Once the port has been installed, pine can be started by issuing the following command: &prompt.user; pine The first time that pine is run it displays a greeting page with a brief introduction, as well as a request from the pine development team to send an anonymous email message allowing them to judge how many users are using their client. To send this anonymous message, press Enter, or alternatively press E to exit the greeting without sending an anonymous message. An example of the greeting page can be seen below: Users are then presented with the main menu, which can be easily navigated using the cursor keys. This main menu provides shortcuts for composing new mail, browsing mail directories, and even administering address book entries. Below the main menu, relevant keyboard shortcuts to perform functions specific to the task at hand are shown. The default directory opened by pine is the inbox. To view the message index, press I, or select the MESSAGE INDEX option as seen below: The message index shows messages in the current directory, and can be navigated by using the cursor keys. Highlighted messages can be read by pressing the Enter key. In the screenshot below, a sample message is displayed by pine. Keyboard shortcuts are displayed as a reference at the bottom of the screen. An example of one of these shortcuts is the r key, which tells the MUA to reply to the current message being displayed. Replying to an email in pine is done using the pico editor, which is installed by default with pine. The pico utility makes it easy to navigate around the message and is slightly more forgiving on novice users than &man.vi.1; or &man.mail.1;. Once the reply is complete, the message can be sent by pressing Ctrl+X. The pine application will ask for confirmation. The pine application can be customized using the SETUP option from the main menu. Consult the pine documentation for more information. Marc Silver Contributed by Using fetchmail fetchmail fetchmail is a full-featured IMAP and POP client which allows users to automatically download mail from remote IMAP and POP servers and save it into local mailboxes; there it can be accessed more easily.
fetchmail can be installed using the mail/fetchmail port, and offers various features, some of which include: Support of POP3, APOP, KPOP, IMAP, ETRN and ODMR protocols. Ability to forward mail using SMTP, which allows filtering, forwarding, and aliasing to function normally. May be run in daemon mode to check periodically for new messages. Can retrieve multiple mailboxes and forward them, based on configuration, to different local users. While it is outside the scope of this document to explain all of fetchmail's features, some basic features will be explained. The fetchmail utility requires a configuration file known as .fetchmailrc in order to run correctly. This file includes server information as well as login credentials. Due to the sensitive nature of the contents of this file, it is advisable to make it readable only by the owner, with the following command: &prompt.user; chmod 600 .fetchmailrc The following .fetchmailrc serves as an example for downloading a single user mailbox using POP. It tells fetchmail to connect to example.com using a username of joesoap and a password of XXX. This example assumes that the user joesoap is also a user on the local system. poll example.com protocol pop3 username "joesoap" password "XXX" The next example connects to multiple POP and IMAP servers and redirects to different local usernames where applicable: poll example.com proto pop3: user "joesoap", with password "XXX", is "jsoap" here; user "andrea", with password "XXXX"; poll example2.net proto imap: user "john", with password "XXXXX", is "myth" here; The fetchmail utility can be run in daemon mode by running it with the -d flag, followed by the interval (in seconds) at which fetchmail should poll servers listed in the .fetchmailrc file. The following example would cause fetchmail to poll every 600 seconds: &prompt.user; fetchmail -d 600 More information on fetchmail can be found on the fetchmail web site. Marc Silver Contributed by Using procmail procmail The procmail utility is an incredibly powerful application used to filter incoming mail. It allows users to define rules which can be matched to incoming mails to perform specific functions or to reroute mail to alternative mailboxes and/or email addresses. procmail can be installed using the mail/procmail port. Once installed, it can be directly integrated into most MTAs; consult your MTA documentation for more information. Alternatively, procmail can be integrated by adding the following line to a .forward file in the home directory of the user utilizing procmail features: "|exec /usr/local/bin/procmail || exit 75" The following section will display some basic procmail rules, as well as brief descriptions of what they do. These rules, and others like them, must be inserted into a .procmailrc file, which must reside in the user's home directory. The majority of these rules can also be found in the &man.procmailex.5; manual page. Forward all mail from user@example.com to an external address of goodmail@example2.com: :0 * ^From.*user@example.com ! goodmail@example2.com Forward all mail shorter than 1000 bytes to an external address of goodmail@example2.com: :0 * < 1000 !
goodmail@example2.com Send all mail sent to alternate@example.com into a mailbox called alternate: :0 * ^TOalternate@example.com alternate Send all mail with a subject of Spam to /dev/null: :0 * ^Subject:.*Spam /dev/null A useful recipe that parses incoming &os;.org mailing lists and places each list in its own mailbox: :0 * ^Sender:.owner-freebsd-\/[^@]+@FreeBSD.ORG { LISTNAME=${MATCH} :0 * LISTNAME??^\/[^@]+ FreeBSD-${MATCH} } diff --git a/en_US.ISO8859-1/books/handbook/mirrors/chapter.sgml b/en_US.ISO8859-1/books/handbook/mirrors/chapter.sgml index 31726da40b..c27f3e40a9 100644 --- a/en_US.ISO8859-1/books/handbook/mirrors/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/mirrors/chapter.sgml @@ -1,3169 +1,3169 @@ Obtaining FreeBSD CDROM and DVD Publishers Retail Boxed Products FreeBSD is available as a boxed product (FreeBSD CDs, additional software, and printed documentation) from several retailers:
CompUSA WWW:
Frys Electronics WWW:
CD and DVD Sets FreeBSD CD and DVD sets are available from many online retailers:
BSD Mall by Daemon News PO Box 161 Nauvoo, IL 62354 USA Phone: +1 866 273-6255 Fax: +1 217 453-9956 Email: sales@bsdmall.com WWW:
BSD-Systems Email: info@bsd-systems.co.uk WWW:
fastdiscs.com 6 Eltham Close Leeds, LS6 2TY United Kingdom Phone: +44 870 1995 171 Email: sales@fastdiscs.com WWW:
FreeBSD Mall, Inc. 3623 Sanford Street Concord, CA 94520-1405 USA Phone: +1 925 674-0783 Fax: +1 925 674-0821 Email: info@freebsdmall.com WWW:
Hinner EDV St. Augustinus-Str. 10 D-81825 München Germany Phone: (089) 428 419 WWW:
Ikarios 22-24 rue Voltaire 92000 Nanterre France WWW:
JMC Software Ireland Phone: 353 1 6291282 WWW:
Linux CD Mall Private Bag MBE N348 Auckland 1030 New Zealand Phone: +64 21 866529 WWW:
The Linux Emporium Hilliard House, Lester Way Wallingford OX10 9TA United Kingdom Phone: +44 1491 837010 Fax: +44 1491 837016 WWW:
Linux+ DVD Magazine Lewartowskiego 6 Warsaw 00-190 Poland Phone: +48 22 860 18 18 Email: editors@lpmagazine.org WWW:
Linux System Labs Australia 21 Ray Drive Balwyn North VIC - 3104 Australia Phone: +61 3 9857 5918 Fax: +61 3 9857 8974 WWW:
LinuxCenter.Ru Galernaya Street, 55 Saint-Petersburg 190000 Russia Phone: +7-812-3125208 Email: info@linuxcenter.ru WWW:
Distributors If you are a reseller and want to carry FreeBSD CDROM products, please contact a distributor:
Cylogistics 809B Cuesta Dr., #2149 Mountain View, CA 94040 USA Phone: +1 650 694-4949 Fax: +1 650 694-4953 Email: sales@cylogistics.com WWW:
Ingram Micro 1600 E. St. Andrew Place Santa Ana, CA 92705-4926 USA Phone: 1 (800) 456-8000 WWW:
Kudzu, LLC 7375 Washington Ave. S. Edina, MN 55439 USA Phone: +1 952 947-0822 Fax: +1 952 947-0876 Email: sales@kudzuenterprises.com
LinuxCenter.Ru Galernaya Street, 55 Saint-Petersburg 190000 Russia Phone: +7-812-3125208 Email: info@linuxcenter.ru WWW:
Navarre Corp 7400 49th Ave South New Hope, MN 55428 USA Phone: +1 763 535-8333 Fax: +1 763 535-0341 WWW:
FTP Sites The official sources for FreeBSD are available via anonymous FTP from a worldwide set of mirror sites. The site ftp.FreeBSD.org is well connected and allows a large number of connections to it, but you are probably better off finding a closer mirror site (especially if you decide to set up some sort of mirror site). The FreeBSD mirror sites database is more accurate than the mirror listing in the Handbook, as it gets its information from the DNS rather than relying on static lists of hosts. Additionally, FreeBSD is available via anonymous FTP from the following mirror sites. If you choose to obtain FreeBSD via anonymous FTP, please try to use a site near you. The mirror sites listed as Primary Mirror Sites typically have the entire FreeBSD archive (all the currently available versions for each of the architectures) but you will probably have faster download times from a site that is in your country or region. The regional sites carry the most recent versions for the most popular architecture(s) but might not carry the entire FreeBSD archive. All sites provide access via anonymous FTP but some sites also provide access via other methods. The access methods available for each site are provided in parentheses after the hostname. &chap.mirrors.ftp.inc; Anonymous CVS <anchor id="anoncvs-intro">Introduction CVS anonymous Anonymous CVS (or, as it is otherwise known, anoncvs) is a feature provided by the CVS utilities bundled with FreeBSD for synchronizing with a remote CVS repository. Among other things, it allows users of FreeBSD to perform, with no special privileges, read-only CVS operations against one of the FreeBSD project's official anoncvs servers. To use it, one simply sets the CVSROOT environment variable to point at the appropriate anoncvs server, provides the well-known password anoncvs with the cvs login command, and then uses the &man.cvs.1; command to access it like any local repository. The cvs login command stores the passwords that are used for authenticating to the CVS server in a file called .cvspass in your HOME directory. If this file does not exist, you might get an error when trying to use cvs login for the first time. Just create an empty .cvspass file and retry the login. Although the CVSup and anoncvs services perform essentially the same function, there are various trade-offs which can influence the user's choice of synchronization methods. In a nutshell, CVSup is much more efficient in its usage of network resources and is by far the more technically sophisticated of the two, but at a price. To use CVSup, a special client must first be installed and configured before any bits can be grabbed, and then only in the fairly large chunks which CVSup calls collections. Anoncvs, by contrast, can be used to examine anything from an individual file to a specific program (like ls or grep) by referencing the CVS module name. Of course, anoncvs is also only good for read-only operations on the CVS repository, so if it is your intention to support local development in one repository shared with the FreeBSD project bits then CVSup is really your only option. <anchor id="anoncvs-usage">Using Anonymous CVS Configuring &man.cvs.1; to use an Anonymous CVS repository is a simple matter of setting the CVSROOT environment variable to point to one of the FreeBSD project's anoncvs servers. At the time of this writing, the following servers are available: Austria: :pserver:anoncvs@anoncvs.at.FreeBSD.org:/home/ncvs (Use cvs login and enter any password when prompted.)
France: :pserver:anoncvs@anoncvs.fr.FreeBSD.org:/home/ncvs (pserver (password anoncvs), ssh (no password)) Germany: :pserver:anoncvs@anoncvs.de.FreeBSD.org:/home/ncvs (Use cvs login and enter the password anoncvs when prompted.) Germany: :pserver:anoncvs@anoncvs2.de.FreeBSD.org:/home/ncvs (rsh, pserver, ssh, ssh/2022) Japan: :pserver:anoncvs@anoncvs.jp.FreeBSD.org:/home/ncvs (Use cvs login and enter the password anoncvs when prompted.) Sweden: freebsdanoncvs@anoncvs.se.FreeBSD.org:/home/ncvs (ssh only - no password) SSH HostKey: 1024 a7:34:15:ee:0e:c6:65:cf:40:78:2d:f3:cd:87:bd:a6 root@apelsin.fruitsalad.org SSH2 HostKey: 1024 21:df:04:03:c7:26:3e:e8:36:1a:50:2d:c7:ae:b8:5f ssh_host_dsa_key.pub USA: freebsdanoncvs@anoncvs.FreeBSD.org:/home/ncvs (ssh only - no password) SSH HostKey: 1024 a1:e7:46:de:fb:56:ef:05:bc:73:aa:91:09:da:f7:f4 root@sanmateo.ecn.purdue.edu SSH2 HostKey: 1024 52:02:38:1a:2f:a8:71:d3:f5:83:93:8d:aa:00:6f:65 ssh_host_dsa_key.pub USA: anoncvs@anoncvs1.FreeBSD.org:/home/ncvs (ssh only - no password) SSH HostKey: 1024 4b:83:b6:c5:70:75:6c:5b:18:8e:3a:7a:88:a0:43:bb root@ender.liquidneon.com SSH2 HostKey: 1024 80:a7:87:fa:61:d9:25:5c:33:d5:48:51:aa:8f:b6:12 ssh_host_dsa_key.pub Since CVS allows one to check out virtually any version of the FreeBSD sources that ever existed (or, in some cases, will exist), you need to be familiar with the revision (-r) flag to &man.cvs.1; and what some of the permissible values for it in the FreeBSD Project repository are. There are two kinds of tags, revision tags and branch tags. A revision tag refers to a specific revision. Its meaning stays the same from day to day. A branch tag, on the other hand, refers to the latest revision on a given line of development, at any given time. Because a branch tag does not refer to a specific revision, it may mean something different tomorrow than it means today. A table of revision tags that users might be interested in appears elsewhere in this book. Again, none of these are valid for the Ports Collection since the Ports Collection does not have multiple revisions. When you specify a branch tag, you normally receive the latest versions of the files on that line of development. If you wish to receive some past version, you can do so by specifying a date with the -D date flag. See the &man.cvs.1; manual page for more details. Examples While it really is recommended that you read the manual page for &man.cvs.1; thoroughly before doing anything, here are some quick examples which essentially show how to use Anonymous CVS: Checking Out Something from -CURRENT (&man.ls.1;): &prompt.user; setenv CVSROOT :pserver:anoncvs@anoncvs.jp.FreeBSD.org:/home/ncvs &prompt.user; cvs login At the prompt, enter the password anoncvs. &prompt.user; cvs co ls Using SSH to check out the <filename>src/</filename> tree: &prompt.user; cvs -d freebsdanoncvs@anoncvs.FreeBSD.org:/home/ncvs co src The authenticity of host 'anoncvs.freebsd.org (128.46.156.46)' can't be established. DSA key fingerprint is 52:02:38:1a:2f:a8:71:d3:f5:83:93:8d:aa:00:6f:65. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'anoncvs.freebsd.org' (DSA) to the list of known hosts. Checking Out the Version of &man.ls.1; in the 6-STABLE Branch: &prompt.user; setenv CVSROOT :pserver:anoncvs@anoncvs.jp.FreeBSD.org:/home/ncvs &prompt.user; cvs login At the prompt, enter the password anoncvs.
&prompt.user; cvs co -rRELENG_6 ls Creating a List of Changes (as Unified Diffs) to &man.ls.1; &prompt.user; setenv CVSROOT :pserver:anoncvs@anoncvs.jp.FreeBSD.org:/home/ncvs &prompt.user; cvs login At the prompt, enter the password anoncvs. &prompt.user; cvs rdiff -u -rRELENG_5_3_0_RELEASE -rRELENG_5_4_0_RELEASE ls Finding Out What Other Module Names Can Be Used: &prompt.user; setenv CVSROOT :pserver:anoncvs@anoncvs.jp.FreeBSD.org:/home/ncvs &prompt.user; cvs login At the prompt, enter the password anoncvs. &prompt.user; cvs co modules &prompt.user; more modules/modules Other Resources The following additional resources may be helpful in learning CVS: CVS Tutorial from Cal Poly. CVS Home, the CVS development and support community. CVSweb is the FreeBSD Project web interface for CVS. Using CTM CTM CTM is a method for keeping a remote directory tree in sync with a central one. It has been developed for use with FreeBSD's source trees, though other people may find it useful for other purposes as time goes by. Little, if any, documentation currently exists on the process of creating deltas, so contact the &a.ctm-users.name; mailing list for more information and if you wish to use CTM for other things. Why Should I Use <application>CTM</application>? CTM will give you a local copy of the FreeBSD source trees. There are a number of flavors of the tree available. Whether you wish to track the entire CVS tree or just one of the branches, CTM can provide you the information. If you are an active developer on FreeBSD, but have lousy or non-existent TCP/IP connectivity, or simply wish to have the changes automatically sent to you, CTM was made for you. You will need to obtain up to three deltas per day for the most active branches. However, you should consider having them sent by automatic email. The sizes of the updates are always kept as small as possible. This is typically less than 5K, with an occasional (one in ten) being 10-50K and every now and then a large 100K+ or more coming around. You will also need to make yourself aware of the various caveats related to working directly from the development sources rather than a pre-packaged release. This is particularly true if you choose the current sources. It is recommended that you read Staying current with FreeBSD. What Do I Need to Use <application>CTM</application>? You will need two things: the CTM program, and the initial deltas to feed it (to get up to current levels). The CTM program has been part of FreeBSD ever since version 2.0 was released, and lives in /usr/src/usr.sbin/ctm if you have a copy of the source available. The deltas you feed CTM can be had two ways, FTP or email. If you have general FTP access to the Internet, the FTP sites listed in the CTM Mirrors section below support access to CTM. FTP the relevant directory and fetch the README file, starting from there. If you wish to get your deltas via email: Subscribe to one of the CTM distribution lists. &a.ctm-cvs-cur.name; supports the entire CVS tree. &a.ctm-src-cur.name; supports the head of the development branch. &a.ctm-src-4.name; supports the 4.X release branch, etc. (If you do not know how to subscribe yourself to a list, click on the list name above or go to &a.mailman.lists.link; and click on the list that you wish to subscribe to. The list page should contain all of the necessary subscription instructions.) When you begin receiving your CTM updates in the mail, you may use the ctm_rmail program to unpack and apply them.
You can actually use the ctm_rmail program directly from an entry in /etc/aliases if you want to have the process run in a fully automated fashion. Check the ctm_rmail manual page for more details. No matter what method you use to get the CTM deltas, you should subscribe to the &a.ctm-announce.name; mailing list. In the future, this will be the only place where announcements concerning the operations of the CTM system will be posted. Click on the list name above and follow the instructions to subscribe to the list. Using <application>CTM</application> for the First Time Before you can start using CTM deltas, you will need a starting point: a base from which the subsequently produced deltas can be applied. First you should determine what you already have. Everyone can start from an empty directory. You must use an initial Empty delta to start off your CTM supported tree. At some point it is intended that one of these starter deltas be distributed on the CD for your convenience; however, this does not currently happen. Since the trees are many tens of megabytes, you should prefer to start from something already at hand. If you have a -RELEASE CD, you can copy or extract an initial source from it. This will save a significant transfer of data. You can recognize these starter deltas by the X appended to the number (src-cur.3210XEmpty.gz for instance). The designation following the X corresponds to the origin of your initial seed. Empty is an empty directory. As a rule a base transition from Empty is produced every 100 deltas. By the way, they are large! 70 to 80 Megabytes of gzip'd data is common for the XEmpty deltas. Once you have picked a base delta to start from, you will also need all deltas with higher numbers following it. Using <application>CTM</application> in Your Daily Life To apply the deltas, simply say: &prompt.root; cd /where/ever/you/want/the/stuff &prompt.root; ctm -v -v /where/you/store/your/deltas/src-xxx.* CTM understands deltas which have been put through gzip, so you do not need to gunzip them first; this saves disk space. Unless it feels very secure about the entire process, CTM will not touch your tree. To verify a delta you can also use the -c flag and CTM will not actually touch your tree; it will merely verify the integrity of the delta and see if it would apply cleanly to your current tree. There are other options to CTM as well; see the manual pages or look in the sources for more information. That is really all there is to it. Every time you get a new delta, just run it through CTM to keep your sources up to date. Do not remove the deltas if they are hard to download again. You just might want to keep them around in case something bad happens. Even if you only have floppy disks, consider using fdwrite to make a copy. Keeping Your Local Changes As a developer one would like to experiment with and change files in the source tree. CTM supports local modifications in a limited way: before checking for the presence of a file foo, it first looks for foo.ctm. If this file exists, CTM will operate on it instead of foo. This behavior gives us a simple way to maintain local changes: simply copy the files you plan to modify to the corresponding file names with a .ctm suffix. Then you can freely hack the code, while CTM keeps the .ctm file up-to-date. Other Interesting <application>CTM</application> Options Finding Out Exactly What Would Be Touched by an Update You can determine the list of changes that CTM will make on your source repository using the -l option to CTM.
This is useful if you would like to keep logs of the changes, pre- or post-process the modified files in any manner, or are just feeling a tad paranoid. Making Backups Before Updating Sometimes you may want to back up all the files that would be changed by a CTM update. Specifying the -B backup-file option causes CTM to back up all files that would be touched by a given CTM delta to backup-file. Restricting the Files Touched by an Update Sometimes you may be interested in restricting the scope of a given CTM update, or in extracting just a few files from a sequence of deltas. You can control the list of files that CTM would operate on by specifying filtering regular expressions using the -e and -x options. For example, to extract an up-to-date copy of lib/libc/Makefile from your collection of saved CTM deltas, run the commands: &prompt.root; cd /where/ever/you/want/to/extract/it/ &prompt.root; ctm -e '^lib/libc/Makefile' ~ctm/src-xxx.* For every file specified in a CTM delta, the -e and -x options are applied in the order given on the command line. The file is processed by CTM only if it is marked as eligible after all the -e and -x options are applied to it. Future Plans for <application>CTM</application> Tons of them: Add some kind of authentication to the CTM system, so as to allow detection of spoofed CTM updates. Clean up the options to CTM; they have become confusing and counterintuitive. Miscellaneous Stuff There is a sequence of deltas for the ports collection too, but interest has not been all that high yet. CTM Mirrors CTM/FreeBSD is available via anonymous FTP from the following mirror sites. If you choose to obtain CTM via anonymous FTP, please try to use a site near you. In case of problems, please contact the &a.ctm-users.name; mailing list. California, Bay Area, official source South Africa, backup server for old deltas Taiwan/R.O.C. If you did not find a mirror near you or the mirror is incomplete, try to use a search engine such as alltheweb. Using CVSup Introduction CVSup is a software package for distributing and updating source trees from a master CVS repository on a remote server host. The FreeBSD sources are maintained in a CVS repository on a central development machine in California. With CVSup, FreeBSD users can easily keep their own source trees up to date. CVSup uses the so-called pull model of updating. Under the pull model, each client asks the server for updates, if and when they are wanted. The server waits passively for update requests from its clients. Thus all updates are instigated by the client. The server never sends unsolicited updates. Users must either run the CVSup client manually to get an update, or they must set up a cron job to run it automatically on a regular basis. The term CVSup, capitalized just so, refers to the entire software package. Its main components are the client cvsup which runs on each user's machine, and the server cvsupd which runs at each of the FreeBSD mirror sites. As you read the FreeBSD documentation and mailing lists, you may see references to sup. Sup was the predecessor of CVSup, and it served a similar purpose. CVSup is used in much the same way as sup and, in fact, uses configuration files which are backward-compatible with sup's. Sup is no longer used in the FreeBSD project, because CVSup is both faster and more flexible. Installation The easiest way to install CVSup is to use the precompiled net/cvsup package from the FreeBSD packages collection. If you prefer to build CVSup from source, you can use the net/cvsup port instead.
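As an illustration, the package route is usually a one-line operation (a sketch only; the exact package name available on your mirror may vary): &prompt.root; pkg_add -r cvsup-without-gui Here the -r flag asks &man.pkg.add.1; to fetch the package from a remote site automatically.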
But be forewarned: the net/cvsup port depends on the Modula-3 system, which takes a substantial amount of time and disk space to download and build. If you are going to be using CVSup on a machine which will not have &xfree86; or &xorg; installed, such as a server, be sure to use the port which does not include the CVSup GUI, net/cvsup-without-gui. CVSup Configuration CVSup's operation is controlled by a configuration file called the supfile. There are some sample supfiles in the directory /usr/share/examples/cvsup/. The information in a supfile answers the following questions for CVSup: Which files do you want to receive? Which versions of them do you want? Where do you want to get them from? Where do you want to put them on your own machine? Where do you want to put your status files? In the following sections, we will construct a typical supfile by answering each of these questions in turn. First, we describe the overall structure of a supfile. A supfile is a text file. Comments begin with # and extend to the end of the line. Lines that are blank and lines that contain only comments are ignored. Each remaining line describes a set of files that the user wishes to receive. The line begins with the name of a collection, a logical grouping of files defined by the server. The name of the collection tells the server which files you want. After the collection name come zero or more fields, separated by white space. These fields answer the questions listed above. There are two types of fields: flag fields and value fields. A flag field consists of a keyword standing alone, e.g., delete or compress. A value field also begins with a keyword, but the keyword is followed without intervening white space by = and a second word. For example, release=cvs is a value field. A supfile typically specifies more than one collection to receive. One way to structure a supfile is to specify all of the relevant fields explicitly for each collection. However, that tends to make the supfile lines quite long, and it is inconvenient because most fields are the same for all of the collections in a supfile. CVSup provides a defaulting mechanism to avoid these problems. Lines beginning with the special pseudo-collection name *default can be used to set flags and values which will be used as defaults for the subsequent collections in the supfile. A default value can be overridden for an individual collection by specifying a different value with the collection itself. Defaults can also be changed or augmented in mid-supfile by additional *default lines. With this background, we will now proceed to construct a supfile for receiving and updating the main source tree of FreeBSD-CURRENT. Which files do you want to receive? The files available via CVSup are organized into named groups called collections. The collections that are available are described in the following section. In this example, we wish to receive the entire main source tree for the FreeBSD system. There is a single large collection src-all which will give us all of that. As a first step toward constructing our supfile, we simply list the collections, one per line (in this case, only one line): src-all Which version(s) of them do you want? With CVSup, you can receive virtually any version of the sources that ever existed. That is possible because the cvsupd server works directly from the CVS repository, which contains all of the versions. You specify which one of them you want using the tag= and date= value fields.
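To make the defaulting mechanism concrete before going on, here is a hedged sketch of a supfile fragment in which a *default line sets a branch tag and a single collection overrides it (the choices shown are purely illustrative): *default tag=RELENG_6 src-all doc-all tag=. In this sketch, src-all follows the 6-STABLE branch via the default, while doc-all overrides it with tag=., which is appropriate because the doc tree is not branched (see the CVS Tags section below).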
Be very careful to specify any tag= fields correctly. Some tags are valid only for certain collections of files. If you specify an incorrect or misspelled tag, CVSup will delete files which you probably do not want deleted. In particular, use only tag=. for the ports-* collections. The tag= field names a symbolic tag in the repository. There are two kinds of tags, revision tags and branch tags. A revision tag refers to a specific revision. Its meaning stays the same from day to day. A branch tag, on the other hand, refers to the latest revision on a given line of development, at any given time. Because a branch tag does not refer to a specific revision, it may mean something different tomorrow than it means today. The CVS Tags section below contains branch tags that users might be interested in. When specifying a tag in CVSup's configuration file, it must be preceded with tag= (RELENG_4 will become tag=RELENG_4). Keep in mind that only the tag=. is relevant for the Ports Collection. Be very careful to type the tag name exactly as shown. CVSup cannot distinguish between valid and invalid tags. If you misspell the tag, CVSup will behave as though you had specified a valid tag which happens to refer to no files at all. It will delete your existing sources in that case. When you specify a branch tag, you normally receive the latest versions of the files on that line of development. If you wish to receive some past version, you can do so by specifying a date with the date= value field. The &man.cvsup.1; manual page explains how to do that. For our example, we wish to receive FreeBSD-CURRENT. We add this line at the beginning of our supfile: *default tag=. There is an important special case that comes into play if you specify neither a tag= field nor a date= field. In that case, you receive the actual RCS files directly from the server's CVS repository, rather than receiving a particular version. Developers generally prefer this mode of operation. By maintaining a copy of the repository itself on their systems, they gain the ability to browse the revision histories and examine past versions of files. This gain is achieved at a large cost in terms of disk space, however. Where do you want to get them from? We use the host= field to tell cvsup where to obtain its updates. Any of the CVSup mirror sites will do, though you should try to select one that is close to you in cyberspace. In this example we will use a fictional FreeBSD distribution site, cvsup99.FreeBSD.org: *default host=cvsup99.FreeBSD.org You will need to change the host to one that actually exists before running CVSup. On any particular run of cvsup, you can override the host setting on the command line with the -h hostname option. Where do you want to put them on your own machine? The prefix= field tells cvsup where to put the files it receives. In this example, we will put the source files directly into our main source tree, /usr/src. The src directory is already implicit in the collections we have chosen to receive, so this is the correct specification: *default prefix=/usr Where should cvsup maintain its status files? The CVSup client maintains certain status files in what is called the base directory. These files help CVSup to work more efficiently by keeping track of which updates you have already received. We will use the standard base directory, /var/db: *default base=/var/db If your base directory does not already exist, now would be a good time to create it. The cvsup client will refuse to run if the base directory does not exist.
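The standard /var/db directory is already present on a &os; system, so nothing needs to be created in that case; only a non-standard choice needs a manual step. A minimal sketch, assuming a hypothetical base directory declared with *default base=/usr/local/mycvsup in the supfile: &prompt.root; mkdir -p /usr/local/mycvsup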
Miscellaneous supfile settings: There is one more line of boilerplate that normally needs to be present in the supfile: *default release=cvs delete use-rel-suffix compress release=cvs indicates that the server should get its information out of the main FreeBSD CVS repository. This is virtually always the case, but there are other possibilities which are beyond the scope of this discussion. delete gives CVSup permission to delete files. You should always specify this, so that CVSup can keep your source tree fully up-to-date. CVSup is careful to delete only those files for which it is responsible. Any extra files you happen to have will be left strictly alone. use-rel-suffix is ... arcane. If you really want to know about it, see the &man.cvsup.1; manual page. Otherwise, just specify it and do not worry about it. compress enables the use of gzip-style compression on the communication channel. If your network link is T1 speed or faster, you probably should not use compression. Otherwise, it helps substantially. Putting it all together: Here is the entire supfile for our example: *default tag=. *default host=cvsup99.FreeBSD.org *default prefix=/usr *default base=/var/db *default release=cvs delete use-rel-suffix compress src-all The <filename>refuse</filename> File As mentioned above, CVSup uses a pull method. Basically, this means that you connect to the CVSup server, and it says, Here is what you can download from me..., and your client responds OK, I will take this, this, this, and this. In the default configuration, the CVSup client will take every file associated with the collection and tag you chose in the configuration file. However, this is not always what you want, especially if you are synching the doc, ports, or www trees — most people cannot read four or five languages, and therefore they do not need to download the language-specific files. If you are CVSuping the Ports Collection, you can get around this by specifying each collection individually (e.g., ports-astrology, ports-biology, etc. instead of simply saying ports-all). However, since the doc and www trees do not have language-specific collections, you must use one of CVSup's many nifty features: the refuse file. The refuse file essentially tells CVSup that it should not take every single file from a collection; in other words, it tells the client to refuse certain files from the server. The refuse file can be found (or, if you do not yet have one, should be placed) in base/sup/. base is defined in your supfile; our defined base is /var/db, which means that by default the refuse file is /var/db/sup/refuse. The refuse file has a very simple format; it simply contains the names of files or directories that you do not wish to download. For example, if you cannot speak any languages other than English and some German, and you do not feel the need to read the German translation of documentation, you can put the following in your refuse file: doc/bn_* doc/da_* doc/de_* doc/el_* doc/es_* doc/fr_* doc/it_* doc/ja_* doc/nl_* doc/no_* doc/pl_* doc/pt_* doc/ru_* doc/sr_* doc/tr_* doc/zh_* and so forth for the other languages (you can find the full list by browsing the FreeBSD CVS repository). With this very useful feature, those users who are on slow links or pay by the minute for their Internet connection will be able to save valuable time as they will no longer need to download files that they will never use. For more information on refuse files and other neat features of CVSup, please view its manual page.
Running <application>CVSup</application> You are now ready to try an update. The command line for doing this is quite simple: &prompt.root; cvsup supfile where supfile is of course the name of the supfile you have just created. Assuming you are running under X11, cvsup will display a GUI window with some buttons to do the usual things. Press the go button, and watch it run. Since you are updating your actual /usr/src tree in this example, you will need to run the program as root so that cvsup has the permissions it needs to update your files. Having just created your configuration file, and never having used this program before, you might understandably be nervous. There is an easy way to do a trial run without touching your precious files. Just create an empty directory somewhere convenient, and name it as an extra argument on the command line: &prompt.root; mkdir /var/tmp/dest &prompt.root; cvsup supfile /var/tmp/dest The directory you specify will be used as the destination directory for all file updates. CVSup will examine your usual files in /usr/src, but it will not modify or delete any of them. Any file updates will instead land in /var/tmp/dest/usr/src. CVSup will also leave its base directory status files untouched when run this way. The new versions of those files will be written into the specified directory. As long as you have read access to /usr/src, you do not even need to be root to perform this kind of trial run. If you are not running X11 or if you just do not like GUIs, you should add a couple of options to the command line when you run cvsup: &prompt.root; cvsup -g -L 2 supfile The -g option tells CVSup not to use its GUI. This is automatic if you are not running X11, but otherwise you have to specify it. The -L 2 option tells CVSup to print out the details of all the file updates it is doing. There are three levels of verbosity, from -L 0 to -L 2. The default is 0, which means total silence except for error messages. There are plenty of other options available. For a brief list of them, type cvsup -H. For more detailed descriptions, see the manual page. Once you are satisfied with the way updates are working, you can arrange for regular runs of CVSup using &man.cron.8;. Obviously, you should not let CVSup use its GUI when running it from &man.cron.8;. <application>CVSup</application> File Collections The file collections available via CVSup are organized hierarchically. There are a few large collections, and they are divided into smaller sub-collections. Receiving a large collection is equivalent to receiving each of its sub-collections. The hierarchical relationships among collections are reflected by the use of indentation in the list below. The most commonly used collections are src-all and ports-all. The other collections are used only by small groups of people for specialized purposes, and some mirror sites may not carry all of them. cvs-all release=cvs The main FreeBSD CVS repository, including the cryptography code. distrib release=cvs Files related to the distribution and mirroring of FreeBSD. doc-all release=cvs Sources for the FreeBSD Handbook and other documentation. This does not include files for the FreeBSD web site. ports-all release=cvs The FreeBSD Ports Collection. If you do not want to update the whole of ports-all (the whole ports tree), but use one of the subcollections listed below, make sure that you always update the ports-base subcollection!
Whenever something changes in the ports build infrastructure represented by ports-base, it is virtually certain that those changes will be used by real ports real soon. Thus, if you only update the real ports and they use some of the new features, there is a very high chance that their build will fail with some mysterious error message. The very first thing to do in this case is to make sure that your ports-base subcollection is up to date. If you are going to be building your own local copy of ports/INDEX, you must accept ports-all (the whole ports tree). Building ports/INDEX with a partial tree is not supported. See the FAQ. ports-accessibility release=cvs Software to help disabled users. ports-arabic release=cvs Arabic language support. ports-archivers release=cvs Archiving tools. ports-astro release=cvs Astronomical ports. ports-audio release=cvs Sound support. ports-base release=cvs The Ports Collection build infrastructure - various files located in the Mk/ and Tools/ subdirectories of /usr/ports. Please see the important warning above: you should always update this subcollection, whenever you update any part of the FreeBSD Ports Collection! ports-benchmarks release=cvs Benchmarks. ports-biology release=cvs Biology. ports-cad release=cvs Computer aided design tools. ports-chinese release=cvs Chinese language support. ports-comms release=cvs Communication software. ports-converters release=cvs Character code converters. ports-databases release=cvs Databases. ports-deskutils release=cvs Things that used to be on the desktop before computers were invented. ports-devel release=cvs Development utilities. ports-dns release=cvs DNS related software. ports-editors release=cvs Editors. ports-emulators release=cvs Emulators for other operating systems. ports-finance release=cvs Monetary, financial and related applications. ports-ftp release=cvs FTP client and server utilities. ports-games release=cvs Games. ports-german release=cvs German language support. ports-graphics release=cvs Graphics utilities. ports-hebrew release=cvs Hebrew language support. ports-hungarian release=cvs Hungarian language support. ports-irc release=cvs Internet Relay Chat utilities. ports-japanese release=cvs Japanese language support. ports-java release=cvs &java; utilities. ports-korean release=cvs Korean language support. ports-lang release=cvs Programming languages. ports-mail release=cvs Mail software. ports-math release=cvs Numerical computation software. ports-mbone release=cvs MBone applications. ports-misc release=cvs Miscellaneous utilities. ports-multimedia release=cvs Multimedia software. ports-net release=cvs Networking software. ports-net-im release=cvs Instant messaging software. ports-net-mgmt release=cvs Network management software. ports-news release=cvs USENET news software. ports-palm release=cvs Software support for the Palm series. ports-polish release=cvs Polish language support. ports-portuguese release=cvs Portuguese language support. ports-print release=cvs Printing software. ports-russian release=cvs Russian language support. ports-science release=cvs Science. ports-security release=cvs Security utilities. ports-shells release=cvs Command line shells. ports-sysutils release=cvs System utilities. ports-textproc release=cvs Text processing utilities (does not include desktop publishing). ports-ukrainian release=cvs Ukrainian language support. ports-vietnamese release=cvs Vietnamese language support. ports-www release=cvs Software related to the World Wide Web.
ports-x11 release=cvs Ports to support the X Window System. ports-x11-clocks release=cvs X11 clocks. ports-x11-fm release=cvs X11 file managers. ports-x11-fonts release=cvs X11 fonts and font utilities. ports-x11-toolkits release=cvs X11 toolkits. ports-x11-servers release=cvs X11 servers. ports-x11-themes release=cvs X11 themes. ports-x11-wm release=cvs X11 window managers. src-all release=cvs The main FreeBSD sources, including the cryptography code. src-base release=cvs Miscellaneous files at the top of /usr/src. src-bin release=cvs User utilities that may be needed in single-user mode (/usr/src/bin). src-contrib release=cvs Utilities and libraries from outside the FreeBSD project, used relatively unmodified (/usr/src/contrib). src-crypto release=cvs Cryptography utilities and libraries from outside the FreeBSD project, used relatively unmodified (/usr/src/crypto). src-eBones release=cvs Kerberos and DES (/usr/src/eBones). Not used in current releases of FreeBSD. src-etc release=cvs System configuration files (/usr/src/etc). src-games release=cvs Games (/usr/src/games). src-gnu release=cvs Utilities covered by the GNU General Public License (/usr/src/gnu). src-include release=cvs Header files (/usr/src/include). src-kerberos5 release=cvs Kerberos5 security package (/usr/src/kerberos5). src-kerberosIV release=cvs KerberosIV security package (/usr/src/kerberosIV). src-lib release=cvs Libraries (/usr/src/lib). src-libexec release=cvs System programs normally executed by other programs (/usr/src/libexec). src-release release=cvs Files required to produce a FreeBSD release (/usr/src/release). src-sbin release=cvs System utilities for single-user mode (/usr/src/sbin). src-secure release=cvs Cryptographic libraries and commands (/usr/src/secure). src-share release=cvs Files that can be shared across multiple systems (/usr/src/share). src-sys release=cvs The kernel (/usr/src/sys). src-sys-crypto release=cvs Kernel cryptography code (/usr/src/sys/crypto). src-tools release=cvs Various tools for the maintenance of FreeBSD (/usr/src/tools). src-usrbin release=cvs User utilities (/usr/src/usr.bin). src-usrsbin release=cvs System utilities (/usr/src/usr.sbin). www release=cvs The sources for the FreeBSD WWW site. distrib release=self The CVSup server's own configuration files. Used by CVSup mirror sites. gnats release=current The GNATS bug-tracking database. mail-archive release=current FreeBSD mailing list archive. www release=current The pre-processed FreeBSD WWW site files (not the source files). Used by WWW mirror sites. For More Information For the CVSup FAQ and other information about CVSup, see The CVSup Home Page. Most FreeBSD-related discussion of CVSup takes place on the &a.hackers;. New versions of the software are announced there, as well as on the &a.announce;. Questions and bug reports should be addressed to the author of the program at cvsup-bugs@polstra.com. CVSup Sites CVSup servers for FreeBSD are running at the following sites: &chap.mirrors.cvsup.inc; Using Portsnap Introduction Portsnap is a system for securely distributing the &os; ports tree. Approximately once an hour, a snapshot of the ports tree is generated, repackaged, and cryptographically signed. The resulting files are then distributed via HTTP. Like CVSup, Portsnap uses a pull model of updating: the packaged and signed ports trees are placed on a web server which waits passively for clients to request files.
Users must either run &man.portsnap.8; manually to download updates or set up a &man.cron.8; job to download updates automatically on a regular basis. For technical reasons, Portsnap does not update the live ports tree in /usr/ports/ directly; instead, it works via a compressed copy of the ports tree stored in /var/db/portsnap/ by default. This compressed copy is then used to update the live ports tree. If Portsnap is installed from the &os; Ports Collection, then the default location for its compressed snapshot will be /usr/local/portsnap/ instead of /var/db/portsnap/. Installation On &os; 6.0 and more recent versions, Portsnap is contained in the &os; base system. On older versions of &os;, it can be installed using the sysutils/portsnap port. Portsnap Configuration Portsnap's operation is controlled by the /etc/portsnap.conf configuration file. For most users, the default configuration file will suffice; for more details, consult the &man.portsnap.conf.5; manual page. If Portsnap is installed from the &os; Ports Collection, it will use the configuration file /usr/local/etc/portsnap.conf instead of /etc/portsnap.conf. This configuration file is not created when the port is installed, but a sample configuration file is distributed; to copy it into place, run the following command: &prompt.root; cd /usr/local/etc && cp portsnap.conf.sample portsnap.conf Running <application>Portsnap</application> for the First Time The first time &man.portsnap.8; is run, it will need to download a compressed snapshot of the entire ports tree into /var/db/portsnap/ (or /usr/local/portsnap/ if Portsnap was installed from the Ports Collection). This is approximately a 36 MB download. &prompt.root; portsnap fetch Once the compressed snapshot has been downloaded, a live copy of the ports tree can be extracted into /usr/ports/. This is necessary even if a ports tree has already been created in that directory (e.g., by using CVSup), since it establishes a baseline from which portsnap can determine which parts of the ports tree need to be updated later. In the default installation /usr/ports is not created. It should be created before portsnap is used. &prompt.root; mkdir /usr/ports &prompt.root; portsnap extract Updating the Ports Tree After an initial compressed snapshot of the ports tree has been downloaded and extracted into /usr/ports/, updating the ports tree consists of two steps: fetching updates to the compressed snapshot, and using them to update the live ports tree. These two steps can be specified to portsnap as a single command: &prompt.root; portsnap fetch update Some older versions of portsnap do not support this syntax; if it fails, try instead the following: &prompt.root; portsnap fetch &prompt.root; portsnap update Running Portsnap from cron In order to avoid problems with flash crowds accessing the Portsnap servers, portsnap fetch will not run from a &man.cron.8; job. Instead, a special portsnap cron command exists, which waits for a random duration up to 3600 seconds before fetching updates. In addition, it is strongly recommended that portsnap update not be run from a cron job, since it is liable to cause major problems if it happens to run at the same time as a port is being built or installed. However, it is safe to update the ports INDEX files, and this can be done by passing the -I flag to portsnap.
(Obviously, if portsnap -I update is run from cron, then it will be necessary to run portsnap update without the -I flag at a later time in order to update the rest of the tree.) Adding the following line to /etc/crontab will cause portsnap to update its compressed snapshot and the INDEX files in /usr/ports/, and will send an email if any installed ports are out of date: 0 3 * * * root portsnap -I cron update && pkg_version -vIL= If the system clock is not set to the local time zone, please replace 3 with a random value between 0 and 23, in order to spread the load on the Portsnap servers more evenly. Some older versions of portsnap do not support listing multiple commands (e.g., cron update) in the same invocation of portsnap. If the line above fails, try replacing portsnap -I cron update with portsnap cron && portsnap -I update. CVS Tags When obtaining or updating sources using cvs or CVSup, a revision tag must be specified. A revision tag refers to either a particular line of &os; development, or a specific point in time. Tags of the first type are called branch tags, and tags of the second type are called release tags. Branch Tags All of these, with the exception of HEAD (which is always a valid tag), only apply to the src/ tree. The ports/, doc/, and www/ trees are not branched. HEAD Symbolic name for the main line, or FreeBSD-CURRENT. Also the default when no revision is specified. In CVSup, this tag is represented by a . (not punctuation, but a literal . character). In CVS, this is the default when no revision tag is specified. It is usually not a good idea to check out or update to CURRENT sources on a STABLE machine, unless that is your intent. RELENG_6 The line of development for FreeBSD-6.X, also known as FreeBSD 6-STABLE. RELENG_6_0 The release branch for FreeBSD-6.0, used only for security advisories and other critical fixes. RELENG_5 The line of development for FreeBSD-5.X, also known as FreeBSD 5-STABLE. RELENG_5_4 The release branch for FreeBSD-5.4, used only for security advisories and other critical fixes. RELENG_5_3 The release branch for FreeBSD-5.3, used only for security advisories and other critical fixes. RELENG_5_2 The release branch for FreeBSD-5.2 and FreeBSD-5.2.1, used only for security advisories and other critical fixes. RELENG_5_1 The release branch for FreeBSD-5.1, used only for security advisories and other critical fixes. RELENG_5_0 The release branch for FreeBSD-5.0, used only for security advisories and other critical fixes. RELENG_4 The line of development for FreeBSD-4.X, also known as FreeBSD 4-STABLE. RELENG_4_11 The release branch for FreeBSD-4.11, used only for security advisories and other critical fixes. RELENG_4_10 The release branch for FreeBSD-4.10, used only for security advisories and other critical fixes. RELENG_4_9 The release branch for FreeBSD-4.9, used only for security advisories and other critical fixes. RELENG_4_8 The release branch for FreeBSD-4.8, used only for security advisories and other critical fixes. RELENG_4_7 The release branch for FreeBSD-4.7, used only for security advisories and other critical fixes. RELENG_4_6 The release branch for FreeBSD-4.6 and FreeBSD-4.6.2, used only for security advisories and other critical fixes. RELENG_4_5 The release branch for FreeBSD-4.5, used only for security advisories and other critical fixes. RELENG_4_4 The release branch for FreeBSD-4.4, used only for security advisories and other critical fixes.
RELENG_4_3 The release branch for FreeBSD-4.3, used only for security advisories and other critical fixes. RELENG_3 The line of development for FreeBSD-3.X, also known as 3.X-STABLE. RELENG_2_2 The line of development for FreeBSD-2.2.X, also known as 2.2-STABLE. This branch is mostly obsolete. Release Tags These tags refer to a specific point in time when a particular version of &os; was released. The release engineering process is documented in more detail by the Release Engineering Information and Release Process documents. The src tree uses tag names that start with RELENG_. The ports and doc trees use tags whose names begin with RELEASE. Finally, the www tree is not tagged with any special name for releases. RELENG_5_4_0_RELEASE FreeBSD 5.4 RELENG_4_11_0_RELEASE FreeBSD 4.11 RELENG_5_3_0_RELEASE FreeBSD 5.3 RELENG_4_10_0_RELEASE FreeBSD 4.10 RELENG_5_2_1_RELEASE FreeBSD 5.2.1 RELENG_5_2_0_RELEASE FreeBSD 5.2 RELENG_4_9_0_RELEASE FreeBSD 4.9 RELENG_5_1_0_RELEASE FreeBSD 5.1 RELENG_4_8_0_RELEASE FreeBSD 4.8 RELENG_5_0_0_RELEASE FreeBSD 5.0 RELENG_4_7_0_RELEASE FreeBSD 4.7 RELENG_4_6_2_RELEASE FreeBSD 4.6.2 RELENG_4_6_1_RELEASE FreeBSD 4.6.1 RELENG_4_6_0_RELEASE FreeBSD 4.6 RELENG_4_5_0_RELEASE FreeBSD 4.5 RELENG_4_4_0_RELEASE FreeBSD 4.4 RELENG_4_3_0_RELEASE FreeBSD 4.3 RELENG_4_2_0_RELEASE FreeBSD 4.2 RELENG_4_1_1_RELEASE FreeBSD 4.1.1 RELENG_4_1_0_RELEASE FreeBSD 4.1 RELENG_4_0_0_RELEASE FreeBSD 4.0 RELENG_3_5_0_RELEASE FreeBSD 3.5 RELENG_3_4_0_RELEASE FreeBSD 3.4 RELENG_3_3_0_RELEASE FreeBSD 3.3 RELENG_3_2_0_RELEASE FreeBSD 3.2 RELENG_3_1_0_RELEASE FreeBSD 3.1 RELENG_3_0_0_RELEASE FreeBSD 3.0 RELENG_2_2_8_RELEASE FreeBSD 2.2.8 RELENG_2_2_7_RELEASE FreeBSD 2.2.7 RELENG_2_2_6_RELEASE FreeBSD 2.2.6 RELENG_2_2_5_RELEASE FreeBSD 2.2.5 RELENG_2_2_2_RELEASE FreeBSD 2.2.2 RELENG_2_2_1_RELEASE FreeBSD 2.2.1 RELENG_2_2_0_RELEASE FreeBSD 2.2.0 AFS Sites AFS servers for FreeBSD are running at the following sites: Sweden The path to the files is: /afs/stacken.kth.se/ftp/pub/FreeBSD/ stacken.kth.se # Stacken Computer Club, KTH, Sweden 130.237.234.43 #hot.stacken.kth.se 130.237.237.230 #fishburger.stacken.kth.se 130.237.234.3 #milko.stacken.kth.se Maintainer ftp@stacken.kth.se rsync Sites The following sites make FreeBSD available through the rsync protocol. The rsync utility works in much the same way as the &man.rcp.1; command, but has more options and uses the rsync remote-update protocol which transfers only the differences between two sets of files, thus greatly speeding up the synchronization over the network. This is most useful if you are a mirror site for the FreeBSD FTP server, or the CVS repository. The rsync suite is available for many operating systems; on FreeBSD, see the net/rsync port or use the package. Czech Republic rsync://ftp.cz.FreeBSD.org/ Available collections: ftp: A partial mirror of the FreeBSD FTP server. FreeBSD: A full mirror of the FreeBSD FTP server. Germany rsync://grappa.unix-ag.uni-kl.de/ Available collections: freebsd-cvs: The full FreeBSD CVS repository. This machine also mirrors the CVS repositories of the NetBSD and the OpenBSD projects, among others. Netherlands rsync://ftp.nl.FreeBSD.org/ Available collections: vol/4/freebsd-core: A full mirror of the FreeBSD FTP server. United Kingdom rsync://rsync.mirror.ac.uk/ Available collections: ftp.FreeBSD.org: A full mirror of the FreeBSD FTP server. United States of America rsync://ftp-master.FreeBSD.org/ This server may only be used by FreeBSD primary mirror sites.
Available collections: FreeBSD: The master archive of the FreeBSD FTP server. acl: The FreeBSD master ACL list. rsync://ftp13.FreeBSD.org/ Available collections: FreeBSD: A full mirror of the FreeBSD FTP server.
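For illustration, a mirror operator could pull the FreeBSD collection from one of the sites above with a standard rsync invocation (the local destination directory is hypothetical): &prompt.user; rsync -avz rsync://ftp13.FreeBSD.org/FreeBSD/ /usr/local/mirror/FreeBSD/ Here -a preserves permissions, ownership, and timestamps, -v reports progress, and -z compresses the data in transit, in keeping with the remote-update protocol described above.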
diff --git a/en_US.ISO8859-1/books/handbook/network-servers/chapter.sgml b/en_US.ISO8859-1/books/handbook/network-servers/chapter.sgml index b9d302fda8..1e31c5de85 100644 --- a/en_US.ISO8859-1/books/handbook/network-servers/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/network-servers/chapter.sgml @@ -1,5184 +1,5184 @@ Murray Stokely Reorganized by Network Servers Synopsis This chapter will cover some of the more frequently used network services on &unix; systems. We will cover how to install, configure, test, and maintain many different types of network services. Example configuration files are included throughout this chapter for your benefit. After reading this chapter, you will know: How to manage the inetd daemon. How to set up a network file system. How to set up a network information server for sharing user accounts. How to set up automatic network settings using DHCP. How to set up a domain name server. How to set up the Apache HTTP Server. How to set up a File Transfer Protocol (FTP) Server. How to set up a file and print server for &windows; clients using Samba. How to synchronize the time and date, and set up a time server, with the NTP protocol. Before reading this chapter, you should: Understand the basics of the /etc/rc scripts. Be familiar with basic network terminology. Know how to install additional third-party software (). Chern Lee Contributed by The <application>inetd</application> <quote>Super-Server</quote> Overview &man.inetd.8; is referred to as the Internet Super-Server because it manages connections for several services. When a connection is received by inetd, it determines which program the connection is destined for, spawns the particular process and delegates the socket to it (the program is invoked with the service socket as its standard input, output and error descriptors). Running one instance of inetd reduces the overall system load as compared to running each daemon individually in stand-alone mode. Primarily, inetd is used to spawn other daemons, but several trivial protocols are handled directly, such as chargen, auth, and daytime. This section will cover the basics of configuring inetd through its command-line options and its configuration file, /etc/inetd.conf. Settings inetd is initialized through the /etc/rc.conf system. The inetd_enable option is set to NO by default, but is often turned on by sysinstall with the medium security profile. Placing: inetd_enable="YES" or inetd_enable="NO" into /etc/rc.conf can enable or disable inetd starting at boot time. Additionally, different command-line options can be passed to inetd via the inetd_flags option. Command-Line Options inetd synopsis: -d Turn on debugging. -l Turn on logging of successful connections. -w Turn on TCP Wrapping for external services (on by default). -W Turn on TCP Wrapping for internal services which are built into inetd (on by default). -c maximum Specify the default maximum number of simultaneous invocations of each service; the default is unlimited. May be overridden on a per-service basis with the max-child parameter. -C rate Specify the default maximum number of times a service can be invoked from a single IP address in one minute; the default is unlimited. May be overridden on a per-service basis with the max-connections-per-ip-per-minute parameter. -R rate Specify the maximum number of times a service can be invoked in one minute; the default is 256. A rate of 0 allows an unlimited number of invocations. -a Specify one specific IP address to bind to.
Alternatively, a hostname can be specified, in which case the IPv4 or IPv6 address which corresponds to that hostname is used. Usually a hostname is specified when inetd is run inside a &man.jail.8;, in which case the hostname corresponds to the &man.jail.8; environment. When hostname specification is used and both IPv4 and IPv6 bindings are desired, one entry with the appropriate protocol type for each binding is required for each service in /etc/inetd.conf. For example, a TCP-based service would need two entries, one using tcp4 for the protocol and the other using tcp6. -p Specify an alternate file in which to store the process ID. These options can be passed to inetd using the inetd_flags option in /etc/rc.conf. By default, inetd_flags is set to -wW, which turns on TCP wrapping for inetd's internal and external services. For novice users, these parameters usually do not need to be modified or even entered in /etc/rc.conf. An external service is a daemon outside of inetd, which is invoked when a connection is received for it. On the other hand, an internal service is one that inetd has the facility of offering within itself. <filename>inetd.conf</filename> Configuration of inetd is controlled through the /etc/inetd.conf file. When a modification is made to /etc/inetd.conf, inetd can be forced to re-read its configuration file by sending a HangUP signal to the inetd process as shown: Sending <application>inetd</application> a HangUP Signal &prompt.root; kill -HUP `cat /var/run/inetd.pid` Each line of the configuration file specifies an individual daemon. Comments in the file are preceded by a #. The format of /etc/inetd.conf is as follows: service-name socket-type protocol {wait|nowait}[/max-child[/max-connections-per-ip-per-minute]] user[:group][/login-class] server-program server-program-arguments An example entry for the ftpd daemon using IPv4: ftp stream tcp nowait root /usr/libexec/ftpd ftpd -l service-name This is the service name of the particular daemon. It must correspond to a service listed in /etc/services. This determines which port inetd must listen to. If a new service is being created, it must be placed in /etc/services first. socket-type Either stream, dgram, raw, or seqpacket. stream must be used for connection-based, TCP daemons, while dgram is used for daemons utilizing the UDP transport protocol. protocol One of the following: Protocol Explanation tcp, tcp4 TCP IPv4 udp, udp4 UDP IPv4 tcp6 TCP IPv6 udp6 UDP IPv6 tcp46 Both TCP IPv4 and v6 udp46 Both UDP IPv4 and v6 {wait|nowait}[/max-child[/max-connections-per-ip-per-minute]] indicates whether the daemon invoked from inetd is able to handle its own socket or not. dgram socket types must use the wait option, while stream socket daemons, which are usually multi-threaded, should use nowait. wait usually hands off multiple sockets to a single daemon, while nowait spawns a child daemon for each new socket. The maximum number of child daemons inetd may spawn can be set using the max-child option. If a limit of ten instances of a particular daemon is needed, a /10 would be placed after nowait. In addition to max-child, another option limiting the maximum connections from a single place to a particular daemon can be enabled. max-connections-per-ip-per-minute does just this. A value of ten here would limit any particular IP address connecting to a particular service to ten attempts per minute. This is useful to prevent intentional or unintentional resource consumption and Denial of Service (DoS) attacks to a machine. In this field, wait or nowait is mandatory; max-child and max-connections-per-ip-per-minute are optional.
A stream-type multi-threaded daemon without any max-child or max-connections-per-ip-per-minute limits would simply be: nowait. The same daemon with a maximum limit of ten daemons would read: nowait/10. Additionally, the same setup with a limit of twenty connections per IP address per minute and a maximum total limit of ten child daemons would read: nowait/10/20. These options are all utilized by the default settings of the fingerd daemon, as seen here: finger stream tcp nowait/3/10 nobody /usr/libexec/fingerd fingerd -s user This is the username that the particular daemon should run as. Most commonly, daemons run as the root user. For security purposes, it is common to find some servers running as the daemon user, or the least privileged nobody user. server-program The full path of the daemon to be executed when a connection is received. If the daemon is a service provided by inetd internally, then internal should be used. server-program-arguments This works in conjunction with server-program by specifying the arguments, starting with argv[0], passed to the daemon on invocation. If mydaemon -d is the command line, mydaemon -d would be the value of server-program-arguments. Again, if the daemon is an internal service, use internal here. Security Depending on the security profile chosen at install, many of inetd's daemons may be enabled by default. If there is no apparent need for a particular daemon, disable it! Place a # in front of the daemon in question in /etc/inetd.conf, and then send a hangup signal to inetd. Some daemons, such as fingerd, may not be desired at all because they provide an attacker with too much information. Some daemons are not security-conscious and have long or non-existent timeouts for connection attempts. This allows an attacker to slowly send connections to a particular daemon, thus saturating available resources. It may be a good idea to place max-child and max-connections-per-ip-per-minute limitations on certain daemons. By default, TCP wrapping is turned on. Consult the &man.hosts.access.5; manual page for more information on placing TCP restrictions on various inetd invoked daemons. Miscellaneous daytime, time, echo, discard, chargen, and auth are all internally provided services of inetd. The auth service provides identity (ident, identd) network services, and is configurable to a certain degree. Consult the &man.inetd.8; manual page for more in-depth information. Tom Rhodes Reorganized and enhanced by Bill Swingle Written by Network File System (NFS) NFS Among the many different file systems that FreeBSD supports is the Network File System, also known as NFS. NFS allows a system to share directories and files with others over a network. By using NFS, users and programs can access files on remote systems almost as if they were local files. Some of the most notable benefits that NFS can provide are: Local workstations use less disk space because commonly used data can be stored on a single machine and still remain accessible to others over the network. There is no need for users to have separate home directories on every network machine. Home directories could be set up on the NFS server and made available throughout the network. Storage devices such as floppy disks, CDROM drives, and &iomegazip; drives can be used by other machines on the network. This may reduce the number of removable media drives throughout the network. How <acronym>NFS</acronym> Works NFS consists of at least two main parts: a server and one or more clients. The client remotely accesses the data that is stored on the server machine. In order for this to function properly a few processes have to be configured and running.
Under &os; 4.X, the portmap utility is used in place of the rpcbind utility. Thus, in &os; 4.X the user is required to replace every instance of rpcbind with portmap in the forthcoming examples. The server has to be running the following daemons: NFS server file server UNIX clients rpcbind portmap mountd nfsd Daemon Description nfsd The NFS daemon which services requests from the NFS clients. mountd The NFS mount daemon which carries out the requests that &man.nfsd.8; passes on to it. rpcbind This daemon allows NFS clients to discover which port the NFS server is using. The client can also run a daemon, known as nfsiod. The nfsiod daemon services the requests from the NFS server. This is optional, and improves performance, but is not required for normal and correct operation. See the &man.nfsiod.8; manual page for more information. Configuring <acronym>NFS</acronym> NFS configuration NFS configuration is a relatively straightforward process. The processes that need to be running can all start at boot time with a few modifications to your /etc/rc.conf file. On the NFS server, make sure that the following options are configured in the /etc/rc.conf file: rpcbind_enable="YES" nfs_server_enable="YES" mountd_flags="-r" mountd runs automatically whenever the NFS server is enabled. On the client, make sure this option is present in /etc/rc.conf: nfs_client_enable="YES" The /etc/exports file specifies which file systems NFS should export (sometimes referred to as share). Each line in /etc/exports specifies a file system to be exported and which machines have access to that file system. Along with what machines have access to that file system, access options may also be specified. There are many such options that can be used in this file but only a few will be mentioned here. You can easily discover other options by reading over the &man.exports.5; manual page. Here are a few example /etc/exports entries: NFS export examples The following examples give an idea of how to export file systems, although the settings may be different depending on your environment and network configuration. For instance, the following line exports the /cdrom directory to three example machines that have the same domain name as the server (hence the lack of a domain name for each) or have entries in your /etc/hosts file. The -ro flag makes the exported file system read-only. With this flag, the remote system will not be able to write any changes to the exported file system. /cdrom -ro host1 host2 host3 The following line exports /home to three hosts by IP address. This is a useful setup if you have a private network without a DNS server configured. Optionally the /etc/hosts file could be configured for internal hostnames; please review &man.hosts.5; for more information. The -alldirs flag allows the subdirectories to be mount points. In other words, it will not mount the subdirectories but permit the client to mount only the directories that are required or needed. /home -alldirs 10.0.0.2 10.0.0.3 10.0.0.4 The following line exports /a so that two clients from different domains may access the file system. The -maproot=root flag allows the root user on the remote system to write data on the exported file system as root. If the -maproot=root flag is not specified, then even if a user has root access on the remote system, he will not be able to modify files on the exported file system. /a -maproot=root host.example.com box.example.org In order for a client to access an exported file system, the client must have permission to do so.
Make sure the client is listed in your /etc/exports file. In /etc/exports, each line represents the export information for one file system to one host. A remote host can only be specified once per file system, and may only have one default entry. For example, assume that /usr is a single file system. The following /etc/exports would be invalid: # Invalid when /usr is one file system /usr/src client /usr/ports client One file system, /usr, has two lines specifying exports to the same host, client. The correct format for this situation is: /usr/src /usr/ports client The properties of one file system exported to a given host must all occur on one line. Lines without a client specified are treated as a single host. This limits how you can export file systems, but for most people this is not an issue. The following is an example of a valid export list, where /usr and /exports are local file systems: # Export src and ports to client01 and client02, but only # client01 has root privileges on it /usr/src /usr/ports -maproot=root client01 /usr/src /usr/ports client02 # The client machines have root and can mount anywhere # on /exports. Anyone in the world can mount /exports/obj read-only /exports -alldirs -maproot=root client01 client02 /exports/obj -ro You must restart mountd whenever you modify /etc/exports so the changes can take effect. This can be accomplished by sending the HUP signal to the mountd process: &prompt.root; kill -HUP `cat /var/run/mountd.pid` Alternatively, a reboot will make FreeBSD set everything up properly. A reboot is not necessary though. Executing the following commands as root should start everything up. On the NFS server: &prompt.root; rpcbind &prompt.root; nfsd -u -t -n 4 &prompt.root; mountd -r On the NFS client: &prompt.root; nfsiod -n 4 Now everything should be ready to actually mount a remote file system. In these examples the server's name will be server and the client's name will be client. If you only want to temporarily mount a remote file system or would rather test the configuration, just execute a command like this as root on the client: NFS mounting &prompt.root; mount server:/home /mnt This will mount the /home directory on the server at /mnt on the client. If everything is set up correctly you should be able to enter /mnt on the client and see all the files that are on the server. If you want to automatically mount a remote file system each time the computer boots, add the file system to the /etc/fstab file. Here is an example: server:/home /mnt nfs rw 0 0 The &man.fstab.5; manual page lists all the available options. Practical Uses NFS has many practical uses. Some of the more common ones are listed below: NFS uses Set several machines to share a CDROM or other media among them. This is a cheaper and often more convenient way to install software on multiple machines. On large networks, it might be more convenient to configure a central NFS server in which to store all the user home directories. These home directories can then be exported to the network so that users would always have the same home directory, regardless of which workstation they log in to. Several machines could have a common /usr/ports/distfiles directory. That way, when you need to install a port on several machines, you can quickly access the source without downloading it on each machine; a sample /etc/fstab entry for this setup is shown below.
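For example, the shared distfiles setup could be implemented with an /etc/fstab entry like the following on each client, mirroring the fstab format shown earlier (the server name is hypothetical, and the server would need to export /usr/ports/distfiles in its /etc/exports): server:/usr/ports/distfiles /usr/ports/distfiles nfs rw 0 0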
Wylie Stilwell Contributed by Chern Lee Rewritten by Automatic Mounts with <application>amd</application> amd automatic mounter daemon &man.amd.8; (the automatic mounter daemon) automatically mounts a remote file system whenever a file or directory within that file system is accessed. File systems that are inactive for a period of time will also be automatically unmounted by amd. Using amd provides a simple alternative to permanent mounts, as permanent mounts are usually listed in /etc/fstab. amd operates by attaching itself as an NFS server to the /host and /net directories. When a file is accessed within one of these directories, amd looks up the corresponding remote mount and automatically mounts it. /net is used to mount an exported file system from an IP address, while /host is used to mount an export from a remote hostname. An access to a file within /host/foobar/usr would tell amd to attempt to mount the /usr export on the host foobar. Mounting an Export with <application>amd</application> You can view the available mounts of a remote host with the showmount command. For example, to view the mounts of a host named foobar, you can use: &prompt.user; showmount -e foobar Exports list on foobar: /usr 10.10.10.0 /a 10.10.10.0 &prompt.user; cd /host/foobar/usr As seen in the example, the showmount command shows /usr as an export. When changing directories to /host/foobar/usr, amd attempts to resolve the hostname foobar and automatically mount the desired export. amd can be started by the startup scripts by placing the following line in /etc/rc.conf: amd_enable="YES" Additionally, custom flags can be passed to amd from the amd_flags option. By default, amd_flags is set to: amd_flags="-a /.amd_mnt -l syslog /host /etc/amd.map /net /etc/amd.map" The /etc/amd.map file defines the default options that exports are mounted with. The /etc/amd.conf file defines some of the more advanced features of amd. Consult the &man.amd.8; and &man.amd.conf.5; manual pages for more information. John Lind Contributed by Problems Integrating with Other Systems Certain Ethernet adapters for ISA PC systems have limitations which can lead to serious network problems, particularly with NFS. This difficulty is not specific to FreeBSD, but FreeBSD systems are affected by it. The problem nearly always occurs when (FreeBSD) PC systems are networked with high-performance workstations, such as those made by Silicon Graphics, Inc., and Sun Microsystems, Inc. The NFS mount will work fine, and some operations may succeed, but suddenly the server will seem to become unresponsive to the client, even though requests to and from other systems continue to be processed. This happens to the client system, whether the client is the FreeBSD system or the workstation. On many systems, there is no way to shut down the client gracefully once this problem has manifested itself. The only solution is often to reset the client, because the NFS situation cannot be resolved. Though the correct solution is to get a higher performance and capacity Ethernet adapter for the FreeBSD system, there is a simple workaround that will allow satisfactory operation. If the FreeBSD system is the server, include the option -w=1024 on the mount from the client. If the FreeBSD system is the client, then mount the NFS file system with the option -r=1024. These options may be specified using the fourth field of the fstab entry on the client for automatic mounts, or by using the -o parameter of the &man.mount.8; command for manual mounts.
It should be noted that there is a different problem, sometimes mistaken for this one, when the NFS servers and clients are on different networks. If that is the case, make certain that your routers are routing the necessary UDP information, or you will not get anywhere, no matter what else you are doing. In the following examples, fastws is the host (interface) name of a high-performance workstation, and freebox is the host (interface) name of a FreeBSD system with a lower-performance Ethernet adapter. Also, /sharedfs will be the exported NFS file system (see &man.exports.5;), and /project will be the mount point on the client for the exported file system. In all cases, note that additional options, such as hard or soft and bg, may be desirable in your application. Examples for the FreeBSD system (freebox) as the client in /etc/fstab on freebox: fastws:/sharedfs /project nfs rw,-r=1024 0 0 As a manual mount command on freebox: &prompt.root; mount -t nfs -o -r=1024 fastws:/sharedfs /project Examples for the FreeBSD system as the server in /etc/fstab on fastws: freebox:/sharedfs /project nfs rw,-w=1024 0 0 As a manual mount command on fastws: &prompt.root; mount -t nfs -o -w=1024 freebox:/sharedfs /project Nearly any 16-bit Ethernet adapter will allow operation without the above restrictions on the read or write size. For anyone who cares, here is what happens when the failure occurs, which also explains why it is unrecoverable. NFS typically works with a block size of 8 K (though it may do fragments of smaller sizes). Since the maximum Ethernet packet is around 1500 bytes, the NFS block gets split into multiple Ethernet packets, even though it is still a single unit to the upper-level code, and must be received, assembled, and acknowledged as a unit. The high-performance workstations can pump out the packets which comprise the NFS unit one right after the other, just as close together as the standard allows. On the smaller, lower capacity cards, the later packets overrun the earlier packets of the same unit before they can be transferred to the host and the unit as a whole cannot be reconstructed or acknowledged. As a result, the workstation will time out and try again, but it will try again with the entire 8 K unit, and the process will be repeated, ad infinitum. By keeping the unit size below the Ethernet packet size limitation, we ensure that any complete Ethernet packet received can be acknowledged individually, avoiding the deadlock situation. Overruns may still occur when a high-performance workstation is slamming data out to a PC system, but with the better cards, such overruns are not guaranteed on NFS units. When an overrun occurs, the units affected will be retransmitted, and there will be a fair chance that they will be received, assembled, and acknowledged. Bill Swingle Written by Eric Ogren Enhanced by Udo Erdelhoff Network Information System (NIS/YP) What Is It? NIS Solaris HP-UX AIX Linux NetBSD OpenBSD NIS, which stands for Network Information Services, was developed by Sun Microsystems to centralize administration of &unix; (originally &sunos;) systems. It has now essentially become an industry standard; all major &unix;-like systems (&solaris;, HP-UX, &aix;, Linux, NetBSD, OpenBSD, FreeBSD, etc.) support NIS. yellow pagesNIS NIS was formerly known as Yellow Pages, but because of trademark issues, Sun changed the name. The old term (and yp) is still often seen and used.
NIS domains It is an RPC-based client/server system that allows a group of machines within an NIS domain to share a common set of configuration files. This permits a system administrator to set up NIS client systems with only minimal configuration data and add, remove or modify configuration data from a single location. Windows NT It is similar to the &windowsnt; domain system; although the internal implementations of the two are not at all similar, the basic functionality can be compared. Terms/Processes You Should Know There are several terms and several important user processes that you will come across when attempting to implement NIS on FreeBSD, whether you are trying to create an NIS server or act as an NIS client: rpcbind portmap Term Description NIS domainname An NIS master server and all of its clients (including its slave servers) have a NIS domainname. Similar to a &windowsnt; domain name, the NIS domainname does not have anything to do with DNS. rpcbind Must be running in order to enable RPC (Remote Procedure Call, a network protocol used by NIS). If rpcbind is not running, it will be impossible to run an NIS server, or to act as an NIS client. (Under &os; 4.X, portmap is used in place of rpcbind.) ypbind Binds an NIS client to its NIS server. It will take the NIS domainname from the system, and using RPC, connect to the server. ypbind is the core of client-server communication in an NIS environment; if ypbind dies on a client machine, the client will not be able to access the NIS server. ypserv Should only be running on NIS servers; this is the NIS server process itself. If &man.ypserv.8; dies, then the server will no longer be able to respond to NIS requests (hopefully, there is a slave server to take over for it). There are some implementations of NIS (though not the FreeBSD one) that do not try to reconnect to another server if the one they were using dies. Often, the only thing that helps in this case is to restart the server process (or even the whole server) or the ypbind process on the client. rpc.yppasswdd Another process that should only be running on NIS master servers; this is a daemon that will allow NIS clients to change their NIS passwords. If this daemon is not running, users will have to log in to the NIS master server and change their passwords there. How Does It Work? There are three types of hosts in an NIS environment: master servers, slave servers, and clients. Servers act as a central repository for host configuration information. Master servers hold the authoritative copy of this information, while slave servers mirror this information for redundancy. Clients rely on the servers to provide this information to them. Information in many files can be shared in this manner. The master.passwd, group, and hosts files are commonly shared via NIS. Whenever a process on a client needs information that would normally be found in these files locally, it makes a query to the NIS server that it is bound to instead. Machine Types NIS master server A NIS master server. This server, analogous to a &windowsnt; primary domain controller, maintains the files used by all of the NIS clients. The passwd, group, and other various files used by the NIS clients live on the master server. It is possible for one machine to be an NIS master server for more than one NIS domain. However, this will not be covered in this introduction, which assumes a relatively small-scale NIS environment. NIS slave server NIS slave servers.
Similar to the &windowsnt; backup domain controllers, NIS slave servers maintain copies of the NIS master's data files. NIS slave servers provide the redundancy which is needed in important environments. They also help to balance the load of the master server: NIS clients always attach to the NIS server whose response they get first, including replies from slave servers. NIS client NIS clients. NIS clients, like most &windowsnt; workstations, authenticate against the NIS server (or the &windowsnt; domain controller in the case of &windowsnt; workstations) to log on. Using NIS/YP This section will deal with setting up a sample NIS environment. This section assumes that you are running FreeBSD 3.3 or later. The instructions given here will probably work for any version of FreeBSD greater than 3.0, but there are no guarantees that this is true. Planning Let us assume that you are the administrator of a small university lab. This lab, which consists of 15 FreeBSD machines, currently has no centralized point of administration; each machine has its own /etc/passwd and /etc/master.passwd. These files are kept in sync with each other only through manual intervention; currently, when you add a user to the lab, you must run adduser on all 15 machines. Clearly, this has to change, so you have decided to convert the lab to use NIS, using two of the machines as servers. Therefore, the configuration of the lab now looks something like: Machine name IP address Machine role ellington 10.0.0.2 NIS master coltrane 10.0.0.3 NIS slave basie 10.0.0.4 Faculty workstation bird 10.0.0.5 Client machine cli[1-11] 10.0.0.[6-16] Other client machines If you are setting up a NIS scheme for the first time, it is a good idea to think through how you want to go about it. No matter what the size of your network, there are a few decisions that need to be made. Choosing a NIS Domain Name NIS domainname This might not be the domainname that you are used to. It is more accurately called the NIS domainname. When a client broadcasts its requests for information, it includes the name of the NIS domain that it is part of. This is how multiple servers on one network can tell which server should answer which request. Think of the NIS domainname as the name for a group of hosts that are related in some way. Some organizations choose to use their Internet domainname for their NIS domainname. This is not recommended as it can cause confusion when trying to debug network problems. The NIS domainname should be unique within your network and it is helpful if it describes the group of machines it represents. For example, the Art department at Acme Inc. might be in the acme-art NIS domain. For this example, assume you have chosen the name test-domain. SunOS However, some operating systems (notably &sunos;) use their NIS domain name as their Internet domain name. If one or more machines on your network have this restriction, you must use the Internet domain name as your NIS domain name. Physical Server Requirements There are several things to keep in mind when choosing a machine to use as a NIS server. One of the unfortunate things about NIS is the level of dependency the clients have on the server. If a client cannot contact the server for its NIS domain, very often the machine becomes unusable. The lack of user and group information causes most systems to temporarily freeze up. With this in mind, you should make sure to choose a machine that is not prone to being rebooted regularly and is not used for development.
The NIS server should ideally be a standalone machine whose sole purpose in life is to be an NIS server. If you have a network that is not very heavily used, it is acceptable to put the NIS server on a machine running other services; just keep in mind that if the NIS server becomes unavailable, it will affect all of your NIS clients adversely. NIS Servers The canonical copies of all NIS information are stored on a single machine called the NIS master server. The databases used to store the information are called NIS maps. In FreeBSD, these maps are stored in /var/yp/[domainname] where [domainname] is the name of the NIS domain being served. A single NIS server can support several domains at once, therefore it is possible to have several such directories, one for each supported domain. Each domain will have its own independent set of maps. NIS master and slave servers handle all NIS requests with the ypserv daemon. ypserv is responsible for receiving incoming requests from NIS clients, translating the requested domain and map name to a path to the corresponding database file and transmitting data from the database back to the client. Setting Up a NIS Master Server NIS server configuration Setting up a master NIS server can be relatively straightforward, depending on your needs. FreeBSD comes with support for NIS out-of-the-box. All you need is to add the following lines to /etc/rc.conf, and FreeBSD will do the rest for you. nisdomainname="test-domain" This line will set the NIS domainname to test-domain upon network setup (e.g. after reboot). nis_server_enable="YES" This will tell FreeBSD to start up the NIS server processes when the networking is next brought up. nis_yppasswdd_enable="YES" This will enable the rpc.yppasswdd daemon which, as mentioned above, will allow users to change their NIS password from a client machine. Depending on your NIS setup, you may need to add further entries. See the section about NIS servers that are also NIS clients, below, for details. Now, all you have to do is run the command /etc/netstart as superuser. It will set up everything for you, using the values you defined in /etc/rc.conf. Initializing the NIS Maps NIS maps The NIS maps are database files that are kept in the /var/yp directory. They are generated from configuration files in the /etc directory of the NIS master, with one exception: the /etc/master.passwd file. This is for a good reason: you do not want to propagate the passwords of root and other administrative accounts to all the servers in the NIS domain. Therefore, before initializing the NIS maps, you should: &prompt.root; cp /etc/master.passwd /var/yp/master.passwd &prompt.root; cd /var/yp &prompt.root; vi master.passwd You should remove all entries regarding system accounts (bin, tty, kmem, games, etc), as well as any accounts that you do not want to be propagated to the NIS clients (for example root and any other UID 0 (superuser) accounts). Make sure that /var/yp/master.passwd is neither group- nor world-readable (mode 600)! Use the chmod command, if appropriate. Tru64 UNIX When you have finished, it is time to initialize the NIS maps! FreeBSD includes a script named ypinit to do this for you (see its manual page for more information). Note that this script is available on most &unix; operating systems, but not on all. On Digital UNIX/Compaq Tru64 UNIX it is called ypsetup. Because we are generating maps for an NIS master, we are going to pass the -m option to ypinit.
To generate the NIS maps, assuming you already performed the steps above, run: ellington&prompt.root; ypinit -m test-domain Server Type: MASTER Domain: test-domain Creating an YP server will require that you answer a few questions. Questions will all be asked at the beginning of the procedure. Do you want this procedure to quit on non-fatal errors? [y/n: n] n Ok, please remember to go back and redo manually whatever fails. If you don't, something might not work. At this point, we have to construct a list of this domain's YP servers. ellington is already known as master server. Please continue to add any slave servers, one per line. When you are done with the list, type a <control D>. master server : ellington next host to add: coltrane next host to add: ^D The current list of NIS servers looks like this: ellington coltrane Is this correct? [y/n: y] y [..output from map generation..] NIS Map update completed. ellington has been setup as an YP master server without any errors. ypinit should have created /var/yp/Makefile from /var/yp/Makefile.dist. When created, this file assumes that you are operating in a single-server NIS environment with only FreeBSD machines. Since test-domain has a slave server as well, you must edit /var/yp/Makefile: ellington&prompt.root; vi /var/yp/Makefile You should comment out the line that says NOPUSH = "True" (if it is not commented out already). Setting up a NIS Slave Server NIS slave server Setting up an NIS slave server is even simpler than setting up the master. Log on to the slave server and edit the file /etc/rc.conf as you did before. The only difference is that we must now use the -s option when running ypinit. The -s option requires the name of the NIS master to be passed to it as well, so our command line looks like: coltrane&prompt.root; ypinit -s ellington test-domain Server Type: SLAVE Domain: test-domain Master: ellington Creating an YP server will require that you answer a few questions. Questions will all be asked at the beginning of the procedure. Do you want this procedure to quit on non-fatal errors? [y/n: n] n Ok, please remember to go back and redo manually whatever fails. If you don't, something might not work. There will be no further questions. The remainder of the procedure should take a few minutes to copy the databases from ellington. Transferring netgroup... ypxfr: Exiting: Map successfully transferred Transferring netgroup.byuser... ypxfr: Exiting: Map successfully transferred Transferring netgroup.byhost... ypxfr: Exiting: Map successfully transferred Transferring master.passwd.byuid... ypxfr: Exiting: Map successfully transferred Transferring passwd.byuid... ypxfr: Exiting: Map successfully transferred Transferring passwd.byname... ypxfr: Exiting: Map successfully transferred Transferring group.bygid... ypxfr: Exiting: Map successfully transferred Transferring group.byname... ypxfr: Exiting: Map successfully transferred Transferring services.byname... ypxfr: Exiting: Map successfully transferred Transferring rpc.bynumber... ypxfr: Exiting: Map successfully transferred Transferring rpc.byname... ypxfr: Exiting: Map successfully transferred Transferring protocols.byname... ypxfr: Exiting: Map successfully transferred Transferring master.passwd.byname... ypxfr: Exiting: Map successfully transferred Transferring networks.byname... ypxfr: Exiting: Map successfully transferred Transferring networks.byaddr... ypxfr: Exiting: Map successfully transferred Transferring netid.byname...
ypxfr: Exiting: Map successfully transferred Transferring hosts.byaddr... ypxfr: Exiting: Map successfully transferred Transferring protocols.bynumber... ypxfr: Exiting: Map successfully transferred Transferring ypservers... ypxfr: Exiting: Map successfully transferred Transferring hosts.byname... ypxfr: Exiting: Map successfully transferred coltrane has been setup as an YP slave server without any errors. Don't forget to update map ypservers on ellington. You should now have a directory called /var/yp/test-domain. Copies of the NIS master server's maps should be in this directory. You will need to make sure that these stay updated. The following /etc/crontab entries on your slave servers should do the job: 20 * * * * root /usr/libexec/ypxfr passwd.byname 21 * * * * root /usr/libexec/ypxfr passwd.byuid These two lines force the slave to sync its maps with the maps on the master server. The entries are not mandatory, since the master server attempts to push any changes to its NIS maps out to its slaves, but because password information is vital to systems depending on the server, it is a good idea to force the updates. This is more important on busy networks where map updates might not always complete. Now, run the command /etc/netstart on the slave server as well, which again starts the NIS server. NIS Clients An NIS client establishes what is called a binding to a particular NIS server using the ypbind daemon. ypbind checks the system's default domain (as set by the domainname command), and begins broadcasting RPC requests on the local network. These requests specify the name of the domain for which ypbind is attempting to establish a binding. If a server that has been configured to serve the requested domain receives one of the broadcasts, it will respond to ypbind, which will record the server's address. If there are several servers available (a master and several slaves, for example), ypbind will use the address of the first one to respond. From that point on, the client system will direct all of its NIS requests to that server. ypbind will occasionally ping the server to make sure it is still up and running. If it fails to receive a reply to one of its pings within a reasonable amount of time, ypbind will mark the domain as unbound and begin broadcasting again in the hopes of locating another server. Setting Up a NIS Client NIS client configuration Setting up a FreeBSD machine to be a NIS client is fairly straightforward. Edit the file /etc/rc.conf and add the following lines in order to set the NIS domainname and start ypbind upon network startup: nisdomainname="test-domain" nis_client_enable="YES" To import all possible password entries from the NIS server, remove all user accounts from your /etc/master.passwd file and use vipw to add the following line to the end of the file: +::::::::: This line gives anyone with a valid account in the NIS server's password maps an account on the client. There are many ways to configure your NIS client by changing this line. See the netgroups section below for more information. For more detailed reading see O'Reilly's book on Managing NFS and NIS. You should keep at least one local account (i.e. not imported via NIS) in your /etc/master.passwd and this account should also be a member of the group wheel. If there is something wrong with NIS, this account can be used to log in remotely, become root, and fix things.
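Incidentally, once ypbind is running you can check which server the client has bound to with &man.ypwhich.1;, a standard NIS utility shipped in the base system. A minimal check, assuming the example hosts used throughout this section, might look like: &prompt.user; ypwhich ellington If ypwhich reports an error instead of a server name, the client is not bound; in that case, verify the nisdomainname setting in /etc/rc.conf and make sure a server for the domain is reachable on the local network.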
To import all possible group entries from the NIS server, add this line to your /etc/group file: +:*:: After completing these steps, you should be able to run ypcat passwd and see the NIS server's passwd map. NIS Security In general, any remote user can issue an RPC to &man.ypserv.8; and retrieve the contents of your NIS maps, provided the remote user knows your domainname. To prevent such unauthorized transactions, &man.ypserv.8; supports a feature called securenets which can be used to restrict access to a given set of hosts. At startup, &man.ypserv.8; will attempt to load the securenets information from a file called /var/yp/securenets. This path varies depending on the path specified with the -p option. This file contains entries that consist of a network specification and a network mask separated by white space. Lines starting with # are considered to be comments. A sample securenets file might look like this: # allow connections from local host -- mandatory 127.0.0.1 255.255.255.255 # allow connections from any host # on the 192.168.128.0 network 192.168.128.0 255.255.255.0 # allow connections from any host # between 10.0.0.0 and 10.0.15.255 # this includes the machines in the testlab 10.0.0.0 255.255.240.0 If &man.ypserv.8; receives a request from an address that matches one of these rules, it will process the request normally. If the address fails to match a rule, the request will be ignored and a warning message will be logged. If the /var/yp/securenets file does not exist, ypserv will allow connections from any host. The ypserv program also has support for Wietse Venema's TCP Wrapper package. This allows the administrator to use the TCP Wrapper configuration files for access control instead of /var/yp/securenets. While both of these access control mechanisms provide some security, they, like the privileged port test, are vulnerable to IP spoofing attacks. All NIS-related traffic should be blocked at your firewall. Servers using /var/yp/securenets may fail to serve legitimate NIS clients with archaic TCP/IP implementations. Some of these implementations set all host bits to zero when doing broadcasts and/or fail to observe the subnet mask when calculating the broadcast address. While some of these problems can be fixed by changing the client configuration, other problems may force the retirement of the client systems in question or the abandonment of /var/yp/securenets. Using /var/yp/securenets on a server with such an archaic implementation of TCP/IP is a really bad idea and will lead to loss of NIS functionality for large parts of your network. TCP Wrappers The use of the TCP Wrapper package increases the latency of your NIS server. The additional delay may be long enough to cause timeouts in client programs, especially in busy networks or with slow NIS servers. If one or more of your client systems suffers from these symptoms, you should convert the client systems in question into NIS slave servers and force them to bind to themselves. Barring Some Users from Logging On In our lab, there is a machine basie that is supposed to be a faculty-only workstation. We do not want to take this machine out of the NIS domain, yet the passwd file on the master NIS server contains accounts for both faculty and students. What can we do? There is a way to bar specific users from logging on to a machine, even if they are present in the NIS database.
To do this, all you must do is add -username to the end of the /etc/master.passwd file on the client machine, where username is the username of the user you wish to bar from logging in. This should preferably be done using vipw, since vipw will sanity check your changes to /etc/master.passwd, as well as automatically rebuild the password database when you finish editing. For example, if we wanted to bar user bill from logging on to basie we would: basie&prompt.root; vipw [add -bill to the end, exit] vipw: rebuilding the database... vipw: done basie&prompt.root; cat /etc/master.passwd root:[password]:0:0::0:0:The super-user:/root:/bin/csh toor:[password]:0:0::0:0:The other super-user:/root:/bin/sh daemon:*:1:1::0:0:Owner of many system processes:/root:/sbin/nologin operator:*:2:5::0:0:System &:/:/sbin/nologin bin:*:3:7::0:0:Binaries Commands and Source,,,:/:/sbin/nologin tty:*:4:65533::0:0:Tty Sandbox:/:/sbin/nologin kmem:*:5:65533::0:0:KMem Sandbox:/:/sbin/nologin games:*:7:13::0:0:Games pseudo-user:/usr/games:/sbin/nologin news:*:8:8::0:0:News Subsystem:/:/sbin/nologin man:*:9:9::0:0:Mister Man Pages:/usr/share/man:/sbin/nologin bind:*:53:53::0:0:Bind Sandbox:/:/sbin/nologin uucp:*:66:66::0:0:UUCP pseudo-user:/var/spool/uucppublic:/usr/libexec/uucp/uucico xten:*:67:67::0:0:X-10 daemon:/usr/local/xten:/sbin/nologin pop:*:68:6::0:0:Post Office Owner:/nonexistent:/sbin/nologin nobody:*:65534:65534::0:0:Unprivileged user:/nonexistent:/sbin/nologin +::::::::: -bill basie&prompt.root; Udo Erdelhoff Contributed by Using Netgroups netgroups The method shown in the previous section works reasonably well if you need special rules for a very small number of users and/or machines. On larger networks, you will forget to bar some users from logging onto sensitive machines, or you may even have to modify each machine separately, thus losing the main benefit of NIS: centralized administration. The NIS developers' solution for this problem is called netgroups. Their purpose and semantics can be compared to the normal groups used by &unix; file systems. The main differences are the lack of a numeric ID and the ability to define a netgroup by including both user accounts and other netgroups. Netgroups were developed to handle large, complex networks with hundreds of users and machines. On one hand, this is a Good Thing if you are forced to deal with such a situation. On the other hand, this complexity makes it almost impossible to explain netgroups with really simple examples. The example used in the remainder of this section demonstrates this problem. Let us assume that your successful introduction of NIS in your laboratory caught your superiors' interest. Your next job is to extend your NIS domain to cover some of the other machines on campus. The two tables contain the names of the new users and new machines as well as brief descriptions of them. User Name(s) Description alpha, beta Normal employees of the IT department charlie, delta The new apprentices of the IT department echo, foxtrott, golf, ... Ordinary employees able, baker, ... The current interns Machine Name(s) Description war, death, famine, pollution Your most important servers. Only the IT employees are allowed to log onto these machines. pride, greed, envy, wrath, lust, sloth Less important servers. All members of the IT department are allowed to login onto these machines. one, two, three, four, ... Ordinary workstations. Only the real employees are allowed to use these machines. trashcan A very old machine without any critical data. 
Even the intern is allowed to use this box. If you tried to implement these restrictions by separately blocking each user, you would have to add one -user line to each system's passwd for each user who is not allowed to log on to that system. If you forget just one entry, you could be in trouble. It may be feasible to do this correctly during the initial setup; however, you will eventually forget to add the lines for new users during day-to-day operations. After all, Murphy was an optimist. Handling this situation with netgroups offers several advantages. Each user need not be handled separately; you assign a user to one or more netgroups and allow or forbid logins for all members of the netgroup. If you add a new machine, you will only have to define login restrictions for netgroups. If a new user is added, you will only have to add the user to one or more netgroups. Those changes are independent of each other: no more for each combination of user and machine do... If your NIS setup is planned carefully, you will only have to modify exactly one central configuration file to grant or deny access to machines. The first step is the initialization of the NIS map netgroup. FreeBSD's &man.ypinit.8; does not create this map by default, but its NIS implementation will support it once it has been created. To create an empty map, simply type ellington&prompt.root; vi /var/yp/netgroup and start adding content. For our example, we need at least four netgroups: IT employees, IT apprentices, normal employees and interns. IT_EMP (,alpha,test-domain) (,beta,test-domain) IT_APP (,charlie,test-domain) (,delta,test-domain) USERS (,echo,test-domain) (,foxtrott,test-domain) \ (,golf,test-domain) INTERNS (,able,test-domain) (,baker,test-domain) IT_EMP, IT_APP etc. are the names of the netgroups. Each bracketed group adds one or more user accounts to it. The three fields inside a group are: The name of the host(s) where the following items are valid. If you do not specify a hostname, the entry is valid on all hosts. If you do specify a hostname, you will enter a realm of darkness, horror and utter confusion. The name of the account that belongs to this netgroup. The NIS domain for the account. You can import accounts from other NIS domains into your netgroup if you are one of the unlucky fellows with more than one NIS domain. Each of these fields can contain wildcards. See &man.netgroup.5; for details. netgroups Netgroup names longer than 8 characters should not be used, especially if you have machines running other operating systems within your NIS domain. The names are case sensitive; using capital letters for your netgroup names is an easy way to distinguish between user, machine and netgroup names. Some NIS clients (other than FreeBSD) cannot handle netgroups with a large number of entries. For example, some older versions of &sunos; start to cause trouble if a netgroup contains more than 15 entries. You can circumvent this limit by creating several sub-netgroups with 15 or fewer users each and a real netgroup that consists of the sub-netgroups: BIGGRP1 (,joe1,domain) (,joe2,domain) (,joe3,domain) [...] BIGGRP2 (,joe16,domain) (,joe17,domain) [...] BIGGRP3 (,joe31,domain) (,joe32,domain) BIGGROUP BIGGRP1 BIGGRP2 BIGGRP3 You can repeat this process if you need more than 225 users within a single netgroup. Activating and distributing your new NIS map is easy: ellington&prompt.root; cd /var/yp ellington&prompt.root; make This will generate the three NIS maps netgroup, netgroup.byhost and netgroup.byuser.
Use &man.ypcat.1; to check if your new NIS maps are available: ellington&prompt.user; ypcat -k netgroup ellington&prompt.user; ypcat -k netgroup.byhost ellington&prompt.user; ypcat -k netgroup.byuser The output of the first command should resemble the contents of /var/yp/netgroup. The second command will not produce output if you have not specified host-specific netgroups. The third command can be used to get the list of netgroups for a user. The client setup is quite simple. To configure the server war, you only have to start &man.vipw.8; and replace the line +::::::::: with +@IT_EMP::::::::: Now, only the data for the users defined in the netgroup IT_EMP is imported into war's password database and only these users are allowed to log in. Unfortunately, this limitation also applies to the ~ function of the shell and all routines converting between user names and numerical user IDs. In other words, cd ~user will not work, ls -l will show the numerical ID instead of the username and find . -user joe -print will fail with No such user. To fix this, you will have to import all user entries without allowing them to log in to your servers. This can be achieved by adding another line to /etc/master.passwd. This line should contain: +:::::::::/sbin/nologin, meaning Import all entries but replace the shell with /sbin/nologin in the imported entries. You can replace any field in the passwd entry by placing a default value in your /etc/master.passwd. Make sure that the line +:::::::::/sbin/nologin is placed after +@IT_EMP:::::::::. Otherwise, all user accounts imported from NIS will have /sbin/nologin as their login shell. After this change, you will only have to change one NIS map if a new employee joins the IT department. You could use a similar approach for the less important servers by replacing the old +::::::::: in their local version of /etc/master.passwd with something like this: +@IT_EMP::::::::: +@IT_APP::::::::: +:::::::::/sbin/nologin The corresponding lines for the normal workstations could be: +@IT_EMP::::::::: +@USERS::::::::: +:::::::::/sbin/nologin And everything would be fine until there is a policy change a few weeks later: The IT department starts hiring interns. The IT interns are allowed to use the normal workstations and the less important servers; and the IT apprentices are allowed to log in to the main servers. You add a new netgroup ITINTERN, add the new IT interns to this netgroup and start to change the configuration on each and every machine... As the old saying goes: Errors in centralized planning lead to global mess. NIS' ability to create netgroups from other netgroups can be used to prevent situations like these. One possibility is the creation of role-based netgroups. For example, you could create a netgroup called BIGSRV to define the login restrictions for the important servers, another netgroup called SMALLSRV for the less important servers and a third netgroup called USERBOX for the normal workstations. Each of these netgroups contains the netgroups that are allowed to log in to these machines. The new entries for your NIS map netgroup should look like this: BIGSRV IT_EMP IT_APP SMALLSRV IT_EMP IT_APP ITINTERN USERBOX IT_EMP ITINTERN USERS This method of defining login restrictions works reasonably well if you can define groups of machines with identical restrictions. Unfortunately, this is the exception and not the rule. Most of the time, you will need the ability to define login restrictions on a per-machine basis.
Machine-specific netgroup definitions are another way to deal with the policy change outlined above. In this scenario, the /etc/master.passwd of each box contains two lines starting with +. The first of them adds a netgroup with the accounts allowed to log in to this machine; the second adds all other accounts with /sbin/nologin as their shell. It is a good idea to use the ALL-CAPS version of the machine name as the name of the netgroup. In other words, the lines should look like this: +@BOXNAME::::::::: +:::::::::/sbin/nologin Once you have completed this task for all your machines, you will not have to modify the local versions of /etc/master.passwd ever again. All further changes can be handled by modifying the NIS map. Here is an example of a possible netgroup map for this scenario with some additional goodies: # Define groups of users first IT_EMP (,alpha,test-domain) (,beta,test-domain) IT_APP (,charlie,test-domain) (,delta,test-domain) DEPT1 (,echo,test-domain) (,foxtrott,test-domain) DEPT2 (,golf,test-domain) (,hotel,test-domain) DEPT3 (,india,test-domain) (,juliet,test-domain) ITINTERN (,kilo,test-domain) (,lima,test-domain) D_INTERNS (,able,test-domain) (,baker,test-domain) # # Now, define some groups based on roles USERS DEPT1 DEPT2 DEPT3 BIGSRV IT_EMP IT_APP SMALLSRV IT_EMP IT_APP ITINTERN USERBOX IT_EMP ITINTERN USERS # # And a group for special tasks # Allow echo and golf to access our anti-virus-machine SECURITY IT_EMP (,echo,test-domain) (,golf,test-domain) # # machine-based netgroups # Our main servers WAR BIGSRV FAMINE BIGSRV # User india needs access to this server POLLUTION BIGSRV (,india,test-domain) # # This one is really important and needs more access restrictions DEATH IT_EMP # # The anti-virus-machine mentioned above ONE SECURITY # # Restrict a machine to a single user TWO (,hotel,test-domain) # [...more groups to follow] If you are using some kind of database to manage your user accounts, you should be able to create the first part of the map with your database's report tools. This way, new users will automatically have access to the boxes. One last word of caution: It may not always be advisable to use machine-based netgroups. If you are deploying a couple of dozen or even hundreds of identical machines for student labs, you should use role-based netgroups instead of machine-based netgroups to keep the size of the NIS map within reasonable limits. Important Things to Remember There are still a couple of things that you will need to do differently now that you are in an NIS environment. Every time you wish to add a user to the lab, you must add the user to the master NIS server only, and you must remember to rebuild the NIS maps. If you forget to do this, the new user will not be able to log in anywhere except on the NIS master. For example, if we needed to add a new user jsmith to the lab, we would: &prompt.root; pw useradd jsmith &prompt.root; cd /var/yp &prompt.root; make test-domain You could also run adduser jsmith instead of pw useradd jsmith. Keep the administration accounts out of the NIS maps. You do not want to be propagating administrative accounts and passwords to machines that will have users that should not have access to those accounts. Keep the NIS master and slave secure, and minimize their downtime. If somebody either hacks or simply turns off these machines, they have effectively left many people unable to log in to the lab. This is the chief weakness of any centralized administration system.
If you do not protect your NIS servers, you will have a lot of angry users! NIS v1 Compatibility FreeBSD's ypserv has some support for serving NIS v1 clients. FreeBSD's NIS implementation only uses the NIS v2 protocol; however, other implementations include support for the v1 protocol for backwards compatibility with older systems. The ypbind daemons supplied with these systems will try to establish a binding to an NIS v1 server even though they may never actually need it (and they may persist in broadcasting in search of one even after they receive a response from a v2 server). Note that while support for normal client calls is provided, this version of ypserv does not handle v1 map transfer requests; consequently, it cannot be used as a master or slave in conjunction with older NIS servers that only support the v1 protocol. Fortunately, there probably are not any such servers still in use today. NIS Servers That Are Also NIS Clients Care must be taken when running ypserv in a multi-server domain where the server machines are also NIS clients. It is generally a good idea to force the servers to bind to themselves rather than allowing them to broadcast bind requests and possibly become bound to each other. Strange failure modes can result if one server goes down and others are dependent upon it. Eventually all the clients will time out and attempt to bind to other servers, but the delay involved can be considerable and the failure mode is still present since the servers might bind to each other all over again. You can force a host to bind to a particular server by running ypbind with the -S flag. If you do not want to do this manually each time you reboot your NIS server, you can add the following lines to your /etc/rc.conf: nis_client_enable="YES" # run client stuff as well nis_client_flags="-S NIS domain,server" See &man.ypbind.8; for further information. Password Formats NIS password formats One of the most common issues that people run into when trying to implement NIS is password format compatibility. If your NIS server is using DES encrypted passwords, it will only support clients that are also using DES. For example, if you have &solaris; NIS clients in your network, then you will almost certainly need to use DES encrypted passwords. To check which format your servers and clients are using, look at /etc/login.conf. If the host is configured to use DES encrypted passwords, then the default class will contain an entry like this: default:\ :passwd_format=des:\ :copyright=/etc/COPYRIGHT:\ [Further entries elided] Other possible values for the passwd_format capability include blf and md5 (for Blowfish and MD5 encrypted passwords, respectively). If you have made changes to /etc/login.conf, you will also need to rebuild the login capability database, which is achieved by running the following command as root: &prompt.root; cap_mkdb /etc/login.conf The format of passwords already in /etc/master.passwd will not be updated until a user changes his password for the first time after the login capability database is rebuilt. Next, in order to ensure that passwords are encrypted with the format that you have chosen, you should also check that the crypt_default in /etc/auth.conf gives precedence to your chosen password format. To do this, place the format that you have chosen first in the list.
For example, when using DES encrypted passwords, the entry would be: crypt_default = des blf md5 Having followed the above steps on each of the &os; based NIS servers and clients, you can be sure that they all agree on which password format is used within your network. If you have trouble authenticating on an NIS client, this is a pretty good place to start looking for possible problems. Remember: if you want to deploy an NIS server for a heterogeneous network, you will probably have to use DES on all systems because it is the lowest common standard. Greg Sutter Written by Automatic Network Configuration (DHCP) What Is DHCP? Dynamic Host Configuration Protocol DHCP Internet Software Consortium (ISC) DHCP, the Dynamic Host Configuration Protocol, describes the means by which a system can connect to a network and obtain the necessary information for communication upon that network. FreeBSD versions prior to 6.0 use the ISC (Internet Software Consortium) DHCP client (&man.dhclient.8;) implementation. Later versions use the OpenBSD dhclient taken from OpenBSD 3.7. All information here regarding dhclient is for use with either of the ISC or OpenBSD DHCP clients. The DHCP server is the one included in the ISC distribution. What This Section Covers This section describes both the client-side components of the ISC and OpenBSD DHCP client and server-side components of the ISC DHCP system. The client-side program, dhclient, comes integrated within FreeBSD, and the server-side portion is available from the net/isc-dhcp3-server port. The &man.dhclient.8;, &man.dhcp-options.5;, and &man.dhclient.conf.5; manual pages, in addition to the references below, are useful resources. How It Works UDP When dhclient, the DHCP client, is executed on the client machine, it begins broadcasting requests for configuration information. By default, the client sends these requests from UDP port 68 to the server's UDP port 67, and the server replies to the client on UDP port 68, giving the client an IP address and other relevant network information such as netmask, router, and DNS servers. All of this information comes in the form of a DHCP lease and is only valid for a certain time (configured by the DHCP server maintainer). In this manner, stale IP addresses for clients no longer connected to the network can be automatically reclaimed. DHCP clients can obtain a great deal of information from the server. An exhaustive list may be found in &man.dhcp-options.5;. FreeBSD Integration &os; fully integrates the ISC or OpenBSD DHCP client, dhclient (according to the &os; version you run). DHCP client support is provided within both the installer and the base system, obviating the need for detailed knowledge of network configurations on any network that runs a DHCP server. dhclient has been included in all FreeBSD distributions since 3.2. sysinstall DHCP is supported by sysinstall. When configuring a network interface within sysinstall, the second question asked is: Do you want to try DHCP configuration of the interface? Answering affirmatively will execute dhclient, and if successful, will fill in the network configuration information automatically. There are two things you must do to have your system use DHCP upon startup: DHCP requirements Make sure that the bpf device is compiled into your kernel. To do this, add device bpf (pseudo-device bpf under &os; 4.X) to your kernel configuration file, and rebuild the kernel. For more information about building kernels, see .
The bpf device is already part of the GENERIC kernel that is supplied with FreeBSD, so if you do not have a custom kernel, you should not need to create one in order to get DHCP working. For those who are particularly security conscious, you should be warned that bpf is also the device that allows packet sniffers to work correctly (although they still have to be run as root). bpf is required to use DHCP, but if you are very sensitive about security, you probably should not add bpf to your kernel in the expectation that at some point in the future you will be using DHCP. Edit your /etc/rc.conf to include the following: ifconfig_fxp0="DHCP" Be sure to replace fxp0 with the designation for the interface that you wish to dynamically configure, as described in . If you are using a different location for dhclient, or if you wish to pass additional flags to dhclient, also include the following (editing as necessary): dhcp_program="/sbin/dhclient" dhcp_flags="" DHCP server The DHCP server, dhcpd, is included as part of the net/isc-dhcp3-server port in the ports collection. This port contains the ISC DHCP server and documentation. Files DHCP configuration files /etc/dhclient.conf dhclient requires a configuration file, /etc/dhclient.conf. Typically the file contains only comments, the defaults being reasonably sane. This configuration file is described by the &man.dhclient.conf.5; manual page. /sbin/dhclient dhclient is statically linked and resides in /sbin. The &man.dhclient.8; manual page gives more information about dhclient. /sbin/dhclient-script dhclient-script is the FreeBSD-specific DHCP client configuration script. It is described in &man.dhclient-script.8;, but should not need any user modification to function properly. /var/db/dhclient.leases The DHCP client keeps a database of valid leases in this file, which is written as a log. &man.dhclient.leases.5; gives a slightly longer description. Further Reading The DHCP protocol is fully described in RFC 2131. An informational resource has also been set up at . Installing and Configuring a DHCP Server What This Section Covers This section provides information on how to configure a FreeBSD system to act as a DHCP server using the ISC (Internet Software Consortium) implementation of the DHCP suite. The server portion of the suite is not provided as part of FreeBSD, and so you will need to install the net/isc-dhcp3-server port to provide this service. See for more information on using the Ports Collection. DHCP Server Installation DHCP installation In order to configure your FreeBSD system as a DHCP server, you will need to ensure that the &man.bpf.4; device is compiled into your kernel. To do this, add device bpf (pseudo-device bpf under &os; 4.X) to your kernel configuration file, and rebuild the kernel. For more information about building kernels, see . The bpf device is already part of the GENERIC kernel that is supplied with FreeBSD, so you do not need to create a custom kernel in order to get DHCP working. Those who are particularly security conscious should note that bpf is also the device that allows packet sniffers to work correctly (although such programs still need privileged access). bpf is required to use DHCP, but if you are very sensitive about security, you probably should not include bpf in your kernel purely because you expect to use DHCP at some point in the future. The next thing that you will need to do is edit the sample dhcpd.conf which was installed by the net/isc-dhcp3-server port. 
By default, this will be /usr/local/etc/dhcpd.conf.sample, and you should copy this to /usr/local/etc/dhcpd.conf before proceeding to make changes. Configuring the DHCP Server DHCP dhcpd.conf dhcpd.conf is made up of declarations regarding subnets and hosts, and is perhaps most easily explained using an example: option domain-name "example.com"; option domain-name-servers 192.168.4.100; option subnet-mask 255.255.255.0; default-lease-time 3600; max-lease-time 86400; ddns-update-style none; subnet 192.168.4.0 netmask 255.255.255.0 { range 192.168.4.129 192.168.4.254; option routers 192.168.4.1; } host mailhost { hardware ethernet 02:03:04:05:06:07; fixed-address mailhost.example.com; } This option specifies the domain that will be provided to clients as the default search domain. See &man.resolv.conf.5; for more information on what this means. This option specifies a comma-separated list of DNS servers that the client should use. The netmask that will be provided to clients. A client may request a specific length of time that a lease will be valid. Otherwise the server will assign a lease with this expiry value (in seconds). This is the maximum length of time that the server will lease for. Should a client request a longer lease, a lease will be issued, although it will only be valid for max-lease-time seconds. This option specifies whether the DHCP server should attempt to update DNS when a lease is accepted or released. In the ISC implementation, this option is required. This denotes which IP addresses should be used in the pool reserved for allocating to clients. IP addresses between, and including, the ones stated are handed out to clients. Declares the default gateway that will be provided to clients. The hardware MAC address of a host (so that the DHCP server can recognize a host when it makes a request). Specifies that the host should always be given the same IP address. Note that using a hostname is correct here, since the DHCP server will resolve the hostname itself before returning the lease information. Once you have finished writing your dhcpd.conf, you can proceed to start the server by issuing the following command: &prompt.root; /usr/local/etc/rc.d/isc-dhcpd.sh start Should you need to make changes to the configuration of your server in the future, it is important to note that sending a SIGHUP signal to dhcpd does not result in the configuration being reloaded, as it does with most daemons. You will need to send a SIGTERM signal to stop the process, and then restart it using the command above. Files DHCP configuration files /usr/local/sbin/dhcpd dhcpd is statically linked and resides in /usr/local/sbin. The &man.dhcpd.8; manual page installed with the port gives more information about dhcpd. /usr/local/etc/dhcpd.conf dhcpd requires a configuration file, /usr/local/etc/dhcpd.conf, before it will start providing service to clients. This file needs to contain all the information that should be provided to clients that are being serviced, along with information regarding the operation of the server. This configuration file is described by the &man.dhcpd.conf.5; manual page installed by the port. /var/db/dhcpd.leases The DHCP server keeps a database of leases it has issued in this file, which is written as a log. The manual page &man.dhcpd.leases.5;, installed by the port, gives a slightly longer description. /usr/local/sbin/dhcrelay dhcrelay is used in advanced environments where one DHCP server forwards a request from a client to another DHCP server on a separate network.
If you require this functionality, then install the net/isc-dhcp3-relay port. The &man.dhcrelay.8; manual page provided with the port contains more detail. Chern Lee Contributed by Domain Name System (DNS) Overview BIND FreeBSD utilizes, by default, a version of BIND (Berkeley Internet Name Domain), which is the most common implementation of the DNS protocol. DNS is the protocol through which names are mapped to IP addresses, and vice versa. For example, a query for www.FreeBSD.org will receive a reply with the IP address of The FreeBSD Project's web server, whereas a query for ftp.FreeBSD.org will return the IP address of the corresponding FTP machine. Likewise, the opposite can happen. A query for an IP address can resolve its hostname. It is not necessary to run a name server to perform DNS lookups on a system. DNS DNS is coordinated across the Internet through a somewhat complex system of authoritative root name servers, and other smaller-scale name servers that host and cache individual domain information. This document refers to BIND 8.x, as it is the stable version used in &os;. Versions of &os; 5.3 and beyond include BIND9, and the configuration instructions may be found later in this chapter. Users of &os; 5.2 and other previous versions may install BIND9 from the net/bind9 port. RFC1034 and RFC1035 dictate the DNS protocol. Currently, BIND is maintained by the Internet Software Consortium. Terminology To understand this document, some terms related to DNS must be understood. resolver reverse DNS root zone Term Definition Forward DNS Mapping of hostnames to IP addresses Origin Refers to the domain covered in a particular zone file named, BIND, name server Common names for the BIND name server package within FreeBSD Resolver A system process through which a machine queries a name server for zone information Reverse DNS The opposite of forward DNS; mapping of IP addresses to hostnames Root zone The beginning of the Internet zone hierarchy. All zones fall under the root zone, similar to how all files in a file system fall under the root directory. Zone An individual domain, subdomain, or portion of the DNS administered by the same authority zones examples Examples of zones: . is the root zone org. is a zone under the root zone example.org. is a zone under the org. zone foo.example.org. is a subdomain, a zone under the example.org. zone 1.2.3.in-addr.arpa is a zone referencing all IP addresses which fall under the 3.2.1.* IP space. As one can see, the more specific part of a hostname appears to its left. For example, example.org. is more specific than org., as org. is more specific than the root zone. The layout of each part of a hostname is much like a file system: the /dev directory falls within the root, and so on. Reasons to Run a Name Server Name servers usually come in two forms: an authoritative name server, and a caching name server. An authoritative name server is needed when: one wants to serve DNS information to the world, replying authoritatively to queries. a domain, such as example.org, is registered and IP addresses need to be assigned to hostnames under it. an IP address block requires reverse DNS entries (IP to hostname). a backup name server, called a slave, must reply to queries when the primary is down or inaccessible. A caching name server is needed when: a local DNS server may cache and respond more quickly than querying an outside name server.
a reduction in overall network traffic is desired (DNS traffic has been measured to account for 5% or more of total Internet traffic). When one queries for www.FreeBSD.org, the resolver usually queries the uplink ISP's name server, and retrieves the reply. With a local, caching DNS server, the query only has to be made once to the outside world by the caching DNS server. Every additional query will not have to look to the outside of the local network, since the information is cached locally. How It Works In FreeBSD, the BIND daemon is called named for obvious reasons. File Description named the BIND daemon ndc name daemon control program /etc/namedb directory where BIND zone information resides /etc/namedb/named.conf daemon configuration file Zone files are usually contained within the /etc/namedb directory, and contain the DNS zone information served by the name server. Starting BIND BIND starting Since BIND is installed by default, configuring it all is relatively simple. To ensure the named daemon is started at boot, put the following line in /etc/rc.conf: named_enable="YES" To start the daemon manually (after configuring it): &prompt.root; ndc start Configuration Files BIND configuration files Using <command>make-localhost</command> Be sure to: &prompt.root; cd /etc/namedb &prompt.root; sh make-localhost to properly create the local reverse DNS zone file in /etc/namedb/master/localhost.rev. <filename>/etc/namedb/named.conf</filename> // $FreeBSD$ // // Refer to the named(8) manual page for details. If you are ever going // to setup a primary server, make sure you've understood the hairy // details of how DNS is working. Even with simple mistakes, you can // break connectivity for affected parties, or cause huge amount of // useless Internet traffic. options { directory "/etc/namedb"; // In addition to the "forwarders" clause, you can force your name // server to never initiate queries of its own, but always ask its // forwarders only, by enabling the following line: // // forward only; // If you've got a DNS server around at your upstream provider, enter // its IP address here, and enable the line below. This will make you // benefit from its cache, thus reduce overall DNS traffic in the Internet. /* forwarders { 127.0.0.1; }; */ Just as the comment says, to benefit from an uplink's cache, forwarders can be enabled here. Under normal circumstances, a name server will recursively query the Internet looking at certain name servers until it finds the answer it is looking for. Having this enabled will have it query the uplink's name server (or name server provided) first, taking advantage of its cache. If the uplink name server in question is a heavily trafficked, fast name server, enabling this may be worthwhile. 127.0.0.1 will not work here. Change this IP address to a name server at your uplink. /* * If there is a firewall between you and name servers you want * to talk to, you might need to uncomment the query-source * directive below. Previous versions of BIND always asked * questions using port 53, but BIND 8.1 uses an unprivileged * port by default. */ // query-source address * port 53; /* * If running in a sandbox, you may have to specify a different * location for the dumpfile. */ // dump-file "s/named_dump.db"; }; // Note: the following will be supported in a future release. /* host { any; } { topology { 127.0.0.0/8; }; }; */ // Setting up secondaries is way easier and the rough picture for this // is explained below. 
// // If you enable a local name server, don't forget to enter 127.0.0.1 // into your /etc/resolv.conf so this server will be queried first. // Also, make sure to enable it in /etc/rc.conf. zone "." { type hint; file "named.root"; }; zone "0.0.127.IN-ADDR.ARPA" { type master; file "localhost.rev"; }; // NB: Do not use the IP addresses below, they are faked, and only // serve demonstration/documentation purposes! // // Example secondary config entries. It can be convenient to become // a secondary at least for the zone where your own domain is in. Ask // your network administrator for the IP address of the responsible // primary. // // Never forget to include the reverse lookup (IN-ADDR.ARPA) zone! // (This is the first bytes of the respective IP address, in reverse // order, with ".IN-ADDR.ARPA" appended.) // // Before starting to setup a primary zone, better make sure you fully // understand how DNS and BIND works, however. There are sometimes // unobvious pitfalls. Setting up a secondary is comparably simpler. // // NB: Don't blindly enable the examples below. :-) Use actual names // and addresses instead. // // NOTE!!! FreeBSD runs BIND in a sandbox (see named_flags in rc.conf). // The directory containing the secondary zones must be write accessible // to BIND. The following sequence is suggested: // // mkdir /etc/namedb/s // chown bind:bind /etc/namedb/s // chmod 750 /etc/namedb/s For more information on running BIND in a sandbox, see Running named in a sandbox. /* zone "example.com" { type slave; file "s/example.com.bak"; masters { 192.168.1.1; }; }; zone "0.168.192.in-addr.arpa" { type slave; file "s/0.168.192.in-addr.arpa.bak"; masters { 192.168.1.1; }; }; */ In named.conf, these are examples of slave entries for a forward and reverse zone. For each new zone served, a new zone entry must be added to named.conf. For example, the simplest zone entry for example.org can look like: zone "example.org" { type master; file "example.org"; }; The zone is a master, as indicated by the type statement, holding its zone information in /etc/namedb/example.org, as indicated by the file statement. zone "example.org" { type slave; file "example.org"; }; In the slave case, the zone information is transferred from the master name server for the particular zone, and saved in the file specified. If and when the master server dies or is unreachable, the slave name server will have the transferred zone information and will be able to serve it. Zone Files An example master zone file for example.org (existing within /etc/namedb/example.org) is as follows: $TTL 3600 example.org. IN SOA ns1.example.org. admin.example.org. ( 5 ; Serial 10800 ; Refresh 3600 ; Retry 604800 ; Expire 86400 ) ; Minimum TTL ; DNS Servers @ IN NS ns1.example.org. @ IN NS ns2.example.org. ; Machine Names localhost IN A 127.0.0.1 ns1 IN A 3.2.1.2 ns2 IN A 3.2.1.3 mail IN A 3.2.1.10 @ IN A 3.2.1.30 ; Aliases www IN CNAME @ ; MX Record @ IN MX 10 mail.example.org. Note that every hostname ending in a . is an exact hostname, whereas everything without a trailing . is referenced to the origin. For example, www is translated into www.origin. In our fictitious zone file, our origin is example.org., so www would translate to www.example.org. The format of a zone file follows: recordname IN recordtype value DNS records The most commonly used DNS records: SOA start of zone authority NS an authoritative name server A a host address CNAME the canonical name for an alias MX mail exchanger PTR a domain name pointer (used in reverse DNS) example.org.
IN SOA ns1.example.org. admin.example.org. (
                5               ; Serial
                10800           ; Refresh after 3 hours
                3600            ; Retry after 1 hour
                604800          ; Expire after 1 week
                86400 )         ; Minimum TTL of 1 day

example.org.

the domain name, also the origin for this zone file.

ns1.example.org.

the primary/authoritative name server for this zone.

admin.example.org.

the responsible person for this zone, an email address with the @ replaced. (admin@example.org becomes admin.example.org)

5

the serial number of the file. This must be incremented each time the zone file is modified. Nowadays, many admins prefer a yyyymmddrr format for the serial number. 2001041002 would mean last modified 04/10/2001, the latter 02 being the second time the zone file was modified that day. The serial number is important as it alerts slave name servers for a zone that it has been updated.

@ IN NS ns1.example.org.

This is an NS entry. Every name server that is going to reply authoritatively for the zone must have one of these entries. The @ as seen here could have been example.org. The @ translates to the origin.

localhost       IN A    127.0.0.1
ns1             IN A    3.2.1.2
ns2             IN A    3.2.1.3
mail            IN A    3.2.1.10
@               IN A    3.2.1.30

The A record indicates machine names. As seen above, ns1.example.org would resolve to 3.2.1.2. Again, the origin symbol, @, is used here, meaning that example.org would resolve to 3.2.1.30.

www IN CNAME @

The canonical name record is usually used for giving aliases to a machine. In the example, www is aliased to the machine addressed at the origin, example.org (3.2.1.30). CNAMEs can be used to provide alias hostnames, or to round robin one hostname among multiple machines.

MX record

@ IN MX 10 mail.example.org.

The MX record indicates which mail servers are responsible for handling incoming mail for the zone. mail.example.org is the hostname of the mail server, and 10 is the preference value of that mail server. One can have several mail servers with different preference values, such as 10, 20, and 30. A mail server attempting to deliver to example.org will first try the MX with the lowest preference value (the most preferred server), then the one with the next lowest value, and so on, until the mail can be properly delivered.

For in-addr.arpa zone files (reverse DNS), the same format is used, except with PTR entries instead of A or CNAME.

$TTL 3600
1.2.3.in-addr.arpa. IN SOA ns1.example.org. admin.example.org. (
                5               ; Serial
                10800           ; Refresh
                3600            ; Retry
                604800          ; Expire
                3600 )          ; Minimum

@  IN NS  ns1.example.org.
@  IN NS  ns2.example.org.

2  IN PTR ns1.example.org.
3  IN PTR ns2.example.org.
10 IN PTR mail.example.org.
30 IN PTR example.org.

This file gives the proper IP address to hostname mappings for our fictitious domain above.

Caching Name Server

BIND caching name server

A caching name server is a name server that is not authoritative for any zones. It simply performs queries of its own and remembers the answers for later use. To set one up, just configure the name server as usual, omitting any inclusions of zones.

Running <application>named</application> in a Sandbox

BIND running in a sandbox chroot

For added security you may want to run &man.named.8; as an unprivileged user, and configure it to &man.chroot.8; into a sandbox directory. This makes everything outside of the sandbox inaccessible to the named daemon. Should named be compromised, this will help to reduce the damage that can be caused. By default, FreeBSD has a user and a group called bind, intended for this use.

Some people recommend that instead of configuring named to chroot, you should run named inside a &man.jail.8;. This section does not attempt to cover that situation.
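Before working through the checklist that follows, you may wish to confirm that the bind user and group mentioned above actually exist on your system. This quick check with &man.id.1; is merely a suggestion, not a required step:

&prompt.root; id bind

If the user is missing, it will need to be created before named can drop its privileges.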
Since named will not be able to access anything outside of the sandbox (such as shared libraries, log sockets, and so on), there are a number of steps that need to be followed in order to allow named to function correctly. In the following checklist, it is assumed that the path to the sandbox is /etc/namedb and that you have made no prior modifications to the contents of this directory. Perform the following steps as root: Create all directories that named expects to see: &prompt.root; cd /etc/namedb &prompt.root; mkdir -p bin dev etc var/tmp var/run master slave &prompt.root; chown bind:bind slave var/* named only needs write access to these directories, so that is all we give it. Rearrange and create basic zone and configuration files: &prompt.root; cp /etc/localtime etc -&prompt.root; mv named.conf etc && ln -sf etc/named.conf +&prompt.root; mv named.conf etc && ln -sf etc/named.conf &prompt.root; mv named.root master &prompt.root; sh make-localhost -&prompt.root; cat > master/named.localhost +&prompt.root; cat > master/named.localhost $ORIGIN localhost. $TTL 6h @ IN SOA localhost. postmaster.localhost. ( 1 ; serial 3600 ; refresh 1800 ; retry 604800 ; expiration 3600 ) ; minimum IN NS localhost. IN A 127.0.0.1 ^D This allows named to log the correct time to &man.syslogd.8;. syslog log files named If you are running a version of &os; prior to 4.9-RELEASE, build a statically linked copy of named-xfer, and copy it into the sandbox: &prompt.root; cd /usr/src/lib/libisc -&prompt.root; make cleandir && make cleandir && make depend && make all +&prompt.root; make cleandir && make cleandir && make depend && make all &prompt.root; cd /usr/src/lib/libbind -&prompt.root; make cleandir && make cleandir && make depend && make all +&prompt.root; make cleandir && make cleandir && make depend && make all &prompt.root; cd /usr/src/libexec/named-xfer -&prompt.root; make cleandir && make cleandir && make depend && make NOSHARED=yes all -&prompt.root; cp named-xfer /etc/namedb/bin && chmod 555 /etc/namedb/bin/named-xfer +&prompt.root; make cleandir && make cleandir && make depend && make NOSHARED=yes all +&prompt.root; cp named-xfer /etc/namedb/bin && chmod 555 /etc/namedb/bin/named-xfer After your statically linked named-xfer is installed some cleaning up is required, to avoid leaving stale copies of libraries or programs in your source tree: &prompt.root; cd /usr/src/lib/libisc &prompt.root; make cleandir &prompt.root; cd /usr/src/lib/libbind &prompt.root; make cleandir &prompt.root; cd /usr/src/libexec/named-xfer &prompt.root; make cleandir This step has been reported to fail occasionally. If this happens to you, then issue the command: - &prompt.root; cd /usr/src && make cleandir && make cleandir + &prompt.root; cd /usr/src && make cleandir && make cleandir and delete your /usr/obj tree: - &prompt.root; rm -fr /usr/obj && mkdir /usr/obj + &prompt.root; rm -fr /usr/obj && mkdir /usr/obj This will clean out any cruft from your source tree, and retrying the steps above should then work. If you are running &os; version 4.9-RELEASE or later, then the copy of named-xfer in /usr/libexec is statically linked by default, and you can simply use &man.cp.1; to copy it into your sandbox. 
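Whichever method you used to obtain it, it is worth confirming that the named-xfer copied into the sandbox really is statically linked, since a dynamically linked binary will not be able to find its libraries from inside the sandbox. The following check with &man.file.1; is merely a suggested sanity test, not a required step:

&prompt.root; file /etc/namedb/bin/named-xfer

The output should include the words statically linked.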
Make a dev/null that named can see and write to: - &prompt.root; cd /etc/namedb/dev && mknod null c 2 2 + &prompt.root; cd /etc/namedb/dev && mknod null c 2 2 &prompt.root; chmod 666 null Symlink /var/run/ndc to /etc/namedb/var/run/ndc: &prompt.root; ln -sf /etc/namedb/var/run/ndc /var/run/ndc This simply avoids having to specify the option to &man.ndc.8; every time you run it. Since the contents of /var/run are deleted on boot, it may be useful to add this command to root's &man.crontab.5;, using the option. syslog log files named Configure &man.syslogd.8; to create an extra log socket that named can write to. To do this, add -l /etc/namedb/dev/log to the syslogd_flags variable in /etc/rc.conf. chroot Arrange to have named start and chroot itself to the sandbox by adding the following to /etc/rc.conf: named_enable="YES" named_flags="-u bind -g bind -t /etc/namedb /etc/named.conf" Note that the configuration file /etc/named.conf is denoted by a full pathname relative to the sandbox, i.e. in the line above, the file referred to is actually /etc/namedb/etc/named.conf. The next step is to edit /etc/namedb/etc/named.conf so that named knows which zones to load and where to find them on the disk. There follows a commented example (anything not specifically commented here is no different from the setup for a DNS server not running in a sandbox): options { directory "/"; named-xfer "/bin/named-xfer"; version ""; // Don't reveal BIND version query-source address * port 53; }; // ndc control socket controls { unix "/var/run/ndc" perm 0600 owner 0 group 0; }; // Zones follow: zone "localhost" IN { type master; file "master/named.localhost"; allow-transfer { localhost; }; notify no; }; zone "0.0.127.in-addr.arpa" IN { type master; file "master/localhost.rev"; allow-transfer { localhost; }; notify no; }; zone "." IN { type hint; file "master/named.root"; }; zone "private.example.net" in { type master; file "master/private.example.net.db"; allow-transfer { 192.168.10.0/24; }; }; zone "10.168.192.in-addr.arpa" in { type slave; masters { 192.168.10.2; }; file "slave/192.168.10.db"; }; The directory statement is specified as /, since all files that named needs are within this directory (recall that this is equivalent to a normal user's /etc/namedb). Specifies the full path to the named-xfer binary (from named's frame of reference). This is necessary since named is compiled to look for named-xfer in /usr/libexec by default. Specifies the filename (relative to the directory statement above) where named can find the zone file for this zone. Specifies the filename (relative to the directory statement above) where named should write a copy of the zone file for this zone after successfully transferring it from the master server. This is why we needed to change the ownership of the directory slave to bind in the setup stages above. After completing the steps above, either reboot your server or restart &man.syslogd.8; and start &man.named.8;, making sure to use the new options specified in syslogd_flags and named_flags. You should now be running a sandboxed copy of named! Security Although BIND is the most common implementation of DNS, there is always the issue of security. Possible and exploitable security holes are sometimes found. It is a good idea to read CERT's security advisories and to subscribe to the &a.security-notifications; to stay up to date with the current Internet and FreeBSD security issues. If a problem arises, keeping sources up to date and having a fresh build of named would not hurt. 
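When deciding whether an advisory applies to your server, it helps to know exactly which version of named is running. Assuming the ndc control socket has been set up as described above, the running server can be asked directly (a suggested check, not a required step):

&prompt.root; ndc status

The status output should include the server's version string, among other details.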
Further Reading

BIND/named manual pages: &man.ndc.8; &man.named.8; &man.named.conf.5;

Official ISC BIND Page

BIND FAQ

O'Reilly DNS and BIND 4th Edition

RFC1034 - Domain Names - Concepts and Facilities

RFC1035 - Domain Names - Implementation and Specification

Tom Rhodes Written by

<acronym>BIND</acronym>9 and &os;

bind9 setting up

The release of &os; 5.3 brought the BIND9 DNS server software into the distribution. New security features, a new file system layout and automated &man.chroot.8; configuration came with the import. This section has been written in two parts: the first discusses the new features and their configuration; the latter covers upgrading, to aid in the move to &os; 5.3. From this moment on, the server will be referred to simply as &man.named.8; in place of BIND. This section skips over the terminology described in the previous section as well as some of the theoretical discussions; thus, it is recommended that the previous section be consulted before reading any further here.

Configuration files for named currently reside in /var/named/etc/namedb/ and will need modification before use. This is where most of the configuration will be performed.

Configuration of a Master Zone

To configure a master zone, visit /var/named/etc/namedb/ and run the following command:

&prompt.root; sh make-localhost

If all went well, new files should exist in the master directory: localhost.rev for the local domain name and localhost-v6.rev for IPv6 configurations. Since these are the default zone files, configuration for their use will already be present in the named.conf file.

Configuration of a Slave Zone

Extra domains or subdomains may be served by setting them up as slave zones. In most cases, the master/localhost.rev file could just be copied over into the slave directory and modified. Once completed, the files need to be properly added to named.conf, as in the following configuration for example.com:

zone "example.com" {
	type slave;
	file "slave/example.com";
	masters {
		10.0.0.1;
	};
};

zone "0.168.192.in-addr.arpa" {
	type slave;
	file "slave/0.168.192.in-addr.arpa";
	masters {
		10.0.0.1;
	};
};

Note well that in this example, the master IP address is that of the primary domain server from which the zones are transferred; it does not necessarily serve as a DNS server itself.

System Initialization Configuration

In order for the named daemon to start when the system is booted, the following option must be present in the rc.conf file:

named_enable="YES"

While other options exist, this is the bare minimum requirement. Consult the &man.rc.conf.5; manual page for a list of the other options. If nothing is entered in the rc.conf file, then named may be started on the command line by invoking:

&prompt.root; /etc/rc.d/named start

<acronym>BIND</acronym>9 Security

While &os; automatically drops named into a &man.chroot.8; environment, there are several other security mechanisms in place which could help to fend off possible DNS service attacks.

Query Access Control Lists

A query access control list can be used to restrict queries against the zones. The configuration works by defining a network inside of the acl token and then referencing it in the zone configuration.
To permit hosts on the example network to query the host, just define the zones like this:

acl "example.com" {
	192.168.0.0/24;
};

zone "example.com" {
	type slave;
	file "slave/example.com";
	masters {
		10.0.0.1;
	};
	allow-query { example.com; };
};

zone "0.168.192.in-addr.arpa" {
	type slave;
	file "slave/0.168.192.in-addr.arpa";
	masters {
		10.0.0.1;
	};
	allow-query { example.com; };
};

Restrict Version

Permitting version lookups on the DNS server could open the door for an attacker. A malicious user may use this information to track down known exploits or bugs to utilize against the host.

Setting a false version will not protect the server from exploits. Only upgrading to a version that is not vulnerable will protect your server.

A false version string can be placed in the options section of named.conf:

options {
	directory "/etc/namedb";
	pid-file "/var/run/named/pid";
	dump-file "/var/dump/named_dump.db";
	statistics-file "/var/stats/named.stats";
	version "None of your business";
};

Murray Stokely Contributed by

Apache HTTP Server

web servers setting up Apache

Overview

&os; is used to run some of the busiest web sites in the world. The majority of web servers on the Internet use the Apache HTTP Server. Apache software packages should be included on your FreeBSD installation media. If you did not install Apache when you first installed FreeBSD, then you can install it from the www/apache13 or www/apache2 port.

Once Apache has been installed successfully, it must be configured.

This section covers version 1.3.X of the Apache HTTP Server as that is the most widely used version for &os;. Apache 2.X introduces many new technologies but they are not discussed here. For more information about Apache 2.X, please see .

Configuration

Apache configuration file

The main Apache HTTP Server configuration file is installed as /usr/local/etc/apache/httpd.conf on &os;. This file is a typical &unix; text configuration file with comment lines beginning with the # character. A comprehensive description of all possible configuration options is outside the scope of this book, so only the most frequently modified directives will be described here.

ServerRoot "/usr/local"

This specifies the default directory hierarchy for the Apache installation. Binaries are stored in the bin and sbin subdirectories of the server root, and configuration files are stored in etc/apache.

ServerAdmin you@your.address

The address to which problems with the server should be emailed. This address appears on some server-generated pages, such as error documents.

ServerName www.example.com

ServerName allows you to set a host name which is sent back to clients for your server if it is different from the one that the host is configured with (i.e., use www instead of the host's real name).

DocumentRoot "/usr/local/www/data"

DocumentRoot: The directory out of which you will serve your documents. By default, all requests are taken from this directory, but symbolic links and aliases may be used to point to other locations.

It is always a good idea to make backup copies of your Apache configuration file before making changes. Once you are satisfied with your initial configuration, you are ready to start running Apache.

Running <application>Apache</application>

Apache starting or stopping

Apache does not run from the inetd super server as many other network servers do. It is configured to run standalone for better performance for incoming HTTP requests from client web browsers.
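Before starting the server for the first time, or after any configuration change, it is a good idea to verify that httpd.conf contains no syntax errors. The apachectl wrapper script described below can do this (a suggested check; the path assumes the default port installation):

&prompt.root; /usr/local/sbin/apachectl configtest

If the configuration is sound, this should report Syntax OK.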
A shell script wrapper is included to make starting, stopping, and restarting the server as simple as possible. To start up Apache for the first time, just run: &prompt.root; /usr/local/sbin/apachectl start You can stop the server at any time by typing: &prompt.root; /usr/local/sbin/apachectl stop After making changes to the configuration file for any reason, you will need to restart the server: &prompt.root; /usr/local/sbin/apachectl restart To restart Apache without aborting current connections, run: &prompt.root; /usr/local/sbin/apachectl graceful Additional information available at &man.apachectl.8; manual page. To launch Apache at system startup, add the following line to /etc/rc.conf: apache_enable="YES" If you would like to supply additional command line options for the Apache httpd program started at system boot, you may specify them with an additional line in rc.conf: apache_flags="" Now that the web server is running, you can view your web site by pointing a web browser to http://localhost/. The default web page that is displayed is /usr/local/www/data/index.html. Virtual Hosting Apache supports two different types of Virtual Hosting. The first method is Name-based Virtual Hosting. Name-based virtual hosting uses the clients HTTP/1.1 headers to figure out the hostname. This allows many different domains to share the same IP address. To setup Apache to use Name-based Virtual Hosting add an entry like the following to your httpd.conf: NameVirtualHost * If your webserver was named www.domain.tld and you wanted to setup a virtual domain for www.someotherdomain.tld then you would add the following entries to httpd.conf: <VirtualHost *> ServerName www.domain.tld DocumentRoot /www/domain.tld </VirtualHost> <VirtualHost *> ServerName www.someotherdomain.tld DocumentRoot /www/someotherdomain.tld </VirtualHost> Replace the addresses with the addresses you want to use and the path to the documents with what you are using. For more information about setting up virtual hosts, please consult the official Apache documentation at: . Apache Modules Apache modules There are many different Apache modules available to add functionality to the basic server. The FreeBSD Ports Collection provides an easy way to install Apache together with some of the more popular add-on modules. mod_ssl web servers secure SSL cryptography The mod_ssl module uses the OpenSSL library to provide strong cryptography via the Secure Sockets Layer (SSL v2/v3) and Transport Layer Security (TLS v1) protocols. This module provides everything necessary to request a signed certificate from a trusted certificate signing authority so that you can run a secure web server on &os;. If you have not yet installed Apache, then a version of Apache 1.3.X that includes mod_ssl may be installed with the www/apache13-modssl port. SSL support is also available for Apache 2.X in the www/apache2 port, where it is enabled by default. mod_perl Perl The Apache/Perl integration project brings together the full power of the Perl programming language and the Apache HTTP Server. With the mod_perl module it is possible to write Apache modules entirely in Perl. In addition, the persistent interpreter embedded in the server avoids the overhead of starting an external interpreter and the penalty of Perl start-up time. If you have not yet installed Apache, then a version of Apache that includes mod_perl may be installed with the www/apache13-modperl port. 
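Whichever add-on modules you choose, one quick way to see what was compiled into the server itself is to ask httpd to list its modules (a suggested check; the path assumes the default port installation):

&prompt.root; /usr/local/sbin/httpd -l

Statically compiled modules appear in this list; modules loaded dynamically through LoadModule directives in httpd.conf will not.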
Tom Rhodes Written by

PHP

PHP

In the past few years, more businesses have turned to the Internet in order to enhance their revenue and increase exposure. This has also increased the need for interactive web content. While some companies, such as µsoft;, have introduced solutions into their proprietary products, the open source community answered the call. One answer, widely used, is known as PHP.

PHP, also known as the Hypertext Preprocessor, is a general-purpose scripting language that is especially suited for Web development. Capable of being embedded into HTML, its syntax draws upon C, &java;, and Perl with the intention of allowing web developers to write dynamically generated webpages quickly.

To gain support for PHP5 for the Apache web server, begin by installing the www/mod_php5 port. This will install and configure the modules required to support dynamic web applications. Check to ensure the following lines have been added to /usr/local/etc/apache/httpd.conf:

LoadModule php5_module libexec/apache/libphp5.so
AddModule mod_php5.c

<IfModule mod_php5.c>
	DirectoryIndex index.php index.html
</IfModule>

<IfModule mod_php5.c>
	AddType application/x-httpd-php .php
	AddType application/x-httpd-php-source .phps
</IfModule>

Once completed, a simple call to the apachectl command performs a graceful restart:

&prompt.root; apachectl graceful

The PHP support in &os; is extremely modular. If support for any extension is required, an administrator only needs to install the appropriate port and restart Apache as recommended above. For instance, to add support for the MySQL database server to PHP5, simply install the databases/php5-mysql port and issue the following command:

&prompt.root; apachectl graceful

This will enable MySQL support in PHP.

Murray Stokely Contributed by

File Transfer Protocol (FTP)

FTP servers

Overview

The File Transfer Protocol (FTP) provides users with a simple way to transfer files to and from an FTP server. &os; includes FTP server software, ftpd, in the base system. This makes setting up and administering an FTP server on FreeBSD very straightforward.

Configuration

The most important configuration step is deciding which accounts will be allowed access to the FTP server. A normal FreeBSD system has a number of system accounts used for various daemons, but unknown users should not be allowed to log in with these accounts. The /etc/ftpusers file is a list of users disallowed any FTP access. By default, it includes the aforementioned system accounts, but it is possible to add specific users here that should not be allowed access to FTP.

You may want to restrict the access of some users without preventing them completely from using FTP. This can be accomplished with the /etc/ftpchroot file. This file lists users and groups subject to FTP access restrictions. The &man.ftpchroot.5; manual page has all of the details, so it will not be described here.

FTP anonymous

If you would like to enable anonymous FTP access to your server, then you must create a user named ftp on your &os; system. Users will then be able to log on to your FTP server with a username of ftp or anonymous and with any password (by convention, an email address for the user should be used as the password). The FTP server will call &man.chroot.2; when an anonymous user logs in, to restrict access to only the home directory of the ftp user.

There are two text files that specify welcome messages to be displayed to FTP clients.
The contents of the file /etc/ftpwelcome will be displayed to users before they reach the login prompt. After a successful login, the contents of the file /etc/ftpmotd will be displayed. Note that the path to this file is relative to the login environment, so the file ~ftp/etc/ftpmotd would be displayed for anonymous users. Once the FTP server has been configured properly, it must be enabled in /etc/inetd.conf. All that is required here is to remove the comment symbol # from in front of the existing ftpd line : ftp stream tcp nowait root /usr/libexec/ftpd ftpd -l As explained in , a HangUP Signal must be sent to inetd after this configuration file is changed. You can now log on to your FTP server by typing: &prompt.user; ftp localhost Maintaining syslog log files FTP The ftpd daemon uses &man.syslog.3; to log messages. By default, the system log daemon will put messages related to FTP in the /var/log/xferlog file. The location of the FTP log can be modified by changing the following line in /etc/syslog.conf: ftp.info /var/log/xferlog FTP anonymous Be aware of the potential problems involved with running an anonymous FTP server. In particular, you should think twice about allowing anonymous users to upload files. You may find that your FTP site becomes a forum for the trade of unlicensed commercial software or worse. If you do need to allow anonymous FTP uploads, then you should set up the permissions so that these files can not be read by other anonymous users until they have been reviewed. Murray Stokely Contributed by File and Print Services for µsoft.windows; clients (Samba) Samba server Microsoft Windows file server Windows clients print server Windows clients Overview Samba is a popular open source software package that provides file and print services for µsoft.windows; clients. Such clients can connect to and use FreeBSD filespace as if it was a local disk drive, or FreeBSD printers as if they were local printers. Samba software packages should be included on your FreeBSD installation media. If you did not install Samba when you first installed FreeBSD, then you can install it from the net/samba3 port or package. Configuration A default Samba configuration file is installed as /usr/local/etc/smb.conf.default. This file must be copied to /usr/local/etc/smb.conf and customized before Samba can be used. The smb.conf file contains runtime configuration information for Samba, such as definitions of the printers and file system shares that you would like to share with &windows; clients. The Samba package includes a web based tool called swat which provides a simple way of configuring the smb.conf file. Using the Samba Web Administration Tool (SWAT) The Samba Web Administration Tool (SWAT) runs as a daemon from inetd. Therefore, the following line in /etc/inetd.conf should be uncommented before swat can be used to configure Samba: swat stream tcp nowait/400 root /usr/local/sbin/swat As explained in , a HangUP Signal must be sent to inetd after this configuration file is changed. Once swat has been enabled in inetd.conf, you can use a browser to connect to . You will first have to log on with the system root account. Once you have successfully logged on to the main Samba configuration page, you can browse the system documentation, or begin by clicking on the Globals tab. The Globals section corresponds to the variables that are set in the [global] section of /usr/local/etc/smb.conf. 
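As a rough illustration before the individual directives are discussed, a minimal [global] section might look like the following (the values shown are placeholders to be adapted to your network):

[global]
   workgroup = MYGROUP
   server string = Samba Server
   security = user

Each of these directives is explained below.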
Global Settings Whether you are using swat or editing /usr/local/etc/smb.conf directly, the first directives you are likely to encounter when configuring Samba are: workgroup NT Domain-Name or Workgroup-Name for the computers that will be accessing this server. netbios name NetBIOS This sets the NetBIOS name by which a Samba server is known. By default it is the same as the first component of the host's DNS name. server string This sets the string that will be displayed with the net view command and some other networking tools that seek to display descriptive text about the server. Security Settings Two of the most important settings in /usr/local/etc/smb.conf are the security model chosen, and the backend password format for client users. The following directives control these options: security The two most common options here are security = share and security = user. If your clients use usernames that are the same as their usernames on your &os; machine then you will want to use user level security. This is the default security policy and it requires clients to first log on before they can access shared resources. In share level security, client do not need to log onto the server with a valid username and password before attempting to connect to a shared resource. This was the default security model for older versions of Samba. passdb backend NIS+ LDAP SQL database Samba has several different backend authentication models. You can authenticate clients with LDAP, NIS+, a SQL database, or a modified password file. The default authentication method is smbpasswd, and that is all that will be covered here. Assuming that the default smbpasswd backend is used, the /usr/local/private/smbpasswd file must be created to allow Samba to authenticate clients. If you would like to give all of your &unix; user accounts access from &windows; clients, use the following command: &prompt.root; grep -v "^#" /etc/passwd | make_smbpasswd > /usr/local/private/smbpasswd &prompt.root; chmod 600 /usr/local/private/smbpasswd Please see the Samba documentation for additional information about configuration options. With the basics outlined here, you should have everything you need to start running Samba. Starting <application>Samba</application> To enable Samba when your system boots, add the following line to /etc/rc.conf: samba_enable="YES" You can then start Samba at any time by typing: &prompt.root; /usr/local/etc/rc.d/samba.sh start Starting SAMBA: removing stale tdbs : Starting nmbd. Starting smbd. Samba actually consists of three separate daemons. You should see that both the nmbd and smbd daemons are started by the samba.sh script. If you enabled winbind name resolution services in smb.conf, then you will also see that the winbindd daemon is started. You can stop Samba at any time by typing : &prompt.root; /usr/local/etc/rc.d/samba.sh stop Samba is a complex software suite with functionality that allows broad integration with µsoft.windows; networks. For more information about functionality beyond the basic installation described here, please see . Tom Hukins Contributed by Clock Synchronization with NTP NTP Overview Over time, a computer's clock is prone to drift. The Network Time Protocol (NTP) is one way to ensure your clock stays accurate. Many Internet services rely on, or greatly benefit from, computers' clocks being accurate. For example, a web server may receive requests to send a file if it has been modified since a certain time. 
In a local area network environment, it is essential that computers sharing files from the same file server have synchronized clocks so that file timestamps stay consistent. Services such as &man.cron.8; also rely on an accurate system clock to run commands at the specified times.

NTP ntpd

FreeBSD ships with the &man.ntpd.8; NTP server which can be used to query other NTP servers to set the clock on your machine or provide time services to others.

Choosing Appropriate NTP Servers

NTP choosing servers

In order to synchronize your clock, you will need to find one or more NTP servers to use. Your network administrator or ISP may have set up an NTP server for this purpose—check their documentation to see if this is the case. There is an online list of publicly accessible NTP servers which you can use to find an NTP server near you. Make sure you are aware of the policy for any servers you choose, and ask for permission if required.

Choosing several unconnected NTP servers is a good idea in case one of the servers you are using becomes unreachable or its clock is unreliable. &man.ntpd.8; uses the responses it receives from other servers intelligently—it will favor reliable servers over unreliable ones.

Configuring Your Machine

NTP configuration

Basic Configuration

ntpdate

If you only wish to synchronize your clock when the machine boots up, you can use &man.ntpdate.8;. This may be appropriate for some desktop machines which are frequently rebooted and only require infrequent synchronization, but most machines should run &man.ntpd.8;.

Using &man.ntpdate.8; at boot time is also a good idea for machines that run &man.ntpd.8;. The &man.ntpd.8; program changes the clock gradually, whereas &man.ntpdate.8; sets the clock outright, no matter how great the difference between the machine's current clock setting and the correct time.

To enable &man.ntpdate.8; at boot time, add ntpdate_enable="YES" to /etc/rc.conf. You will also need to specify all servers you wish to synchronize with and any flags to be passed to &man.ntpdate.8; in ntpdate_flags.

NTP ntp.conf

General Configuration

NTP is configured by the /etc/ntp.conf file in the format described in &man.ntp.conf.5;. Here is a simple example:

server ntplocal.example.com prefer
server timeserver.example.org
server ntp2a.example.net

driftfile /var/db/ntp.drift

The server option specifies which servers are to be used, with one server listed on each line. If a server is specified with the prefer argument, as with ntplocal.example.com, that server is preferred over other servers. A response from a preferred server will be discarded if it differs significantly from other servers' responses; otherwise it will be used without any consideration of other responses. The prefer argument is normally used for NTP servers that are known to be highly accurate, such as those with special time monitoring hardware.

The driftfile option specifies which file is used to store the system clock's frequency offset. The &man.ntpd.8; program uses this to automatically compensate for the clock's natural drift, allowing it to maintain a reasonably correct setting even if it is cut off from all external time sources for a period of time. The file also stores information about previous responses from the NTP servers you are using; it contains internal state for NTP and should not be modified by any other process.

Controlling Access to Your Server

By default, your NTP server will be accessible to all hosts on the Internet.
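You can see this for yourself: once ntpd is running, any host on the Internet can list your server's peers with &man.ntpq.8;. The following is a suggested check; replace the hostname with that of your own server:

&prompt.user; ntpq -p ntp.example.net

If this openness is not what you want, read on.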
The restrict option in /etc/ntp.conf allows you to control which machines can access your server. If you want to deny all machines from accessing your NTP server, add the following line to /etc/ntp.conf: restrict default ignore If you only want to allow machines within your own network to synchronize their clocks with your server, but ensure they are not allowed to configure the server or used as peers to synchronize against, add restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap instead, where 192.168.1.0 is an IP address on your network and 255.255.255.0 is your network's netmask. /etc/ntp.conf can contain multiple restrict options. For more details, see the Access Control Support subsection of &man.ntp.conf.5;. Running the NTP Server To ensure the NTP server is started at boot time, add the line ntpd_enable="YES" to /etc/rc.conf. If you wish to pass additional flags to &man.ntpd.8;, edit the ntpd_flags parameter in /etc/rc.conf. To start the server without rebooting your machine, run ntpd being sure to specify any additional parameters from ntpd_flags in /etc/rc.conf. For example: &prompt.root; ntpd -p /var/run/ntpd.pid Under &os; 4.X, you have to replace every instance of ntpd with xntpd in the options above. Using ntpd with a Temporary Internet Connection The &man.ntpd.8; program does not need a permanent connection to the Internet to function properly. However, if you have a temporary connection that is configured to dial out on demand, it is a good idea to prevent NTP traffic from triggering a dial out or keeping the connection alive. If you are using user PPP, you can use filter directives in /etc/ppp/ppp.conf. For example: set filter dial 0 deny udp src eq 123 # Prevent NTP traffic from initiating dial out set filter dial 1 permit 0 0 set filter alive 0 deny udp src eq 123 # Prevent incoming NTP traffic from keeping the connection open set filter alive 1 deny udp dst eq 123 # Prevent outgoing NTP traffic from keeping the connection open set filter alive 2 permit 0/0 0/0 For more details see the PACKET FILTERING section in &man.ppp.8; and the examples in /usr/share/examples/ppp/. Some Internet access providers block low-numbered ports, preventing NTP from functioning since replies never reach your machine. Further Information Documentation for the NTP server can be found in /usr/share/doc/ntp/ in HTML format. diff --git a/en_US.ISO8859-1/books/handbook/ppp-and-slip/chapter.sgml b/en_US.ISO8859-1/books/handbook/ppp-and-slip/chapter.sgml index 6b0fe2eafe..ddd229fc29 100644 --- a/en_US.ISO8859-1/books/handbook/ppp-and-slip/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/ppp-and-slip/chapter.sgml @@ -1,3234 +1,3234 @@ Jim Mock Restructured, reorganized, and updated by PPP and SLIP Synopsis PPP SLIP FreeBSD has a number of ways to link one computer to another. To establish a network or Internet connection through a dial-up modem, or to allow others to do so through you, requires the use of PPP or SLIP. This chapter describes setting up these modem-based communication services in detail. After reading this chapter, you will know: How to set up user PPP. How to set up kernel PPP. How to set up PPPoE (PPP over Ethernet). How to set up PPPoA (PPP over ATM). How to configure and set up a SLIP client and server. PPP user PPP PPP kernel PPP PPP over Ethernet Before reading this chapter, you should: Be familiar with basic network terminology. Understand the basics and purpose of a dialup connection and PPP and/or SLIP. 
You may be wondering what the main difference is between user PPP and kernel PPP. The answer is simple: user PPP processes the inbound and outbound data in userland rather than in the kernel. This is expensive in terms of copying the data between the kernel and userland, but allows a far more feature-rich PPP implementation. User PPP uses the tun device to communicate with the outside world whereas kernel PPP uses the ppp device.

Throughout this chapter, user PPP will simply be referred to as ppp unless a distinction needs to be made between it and any other PPP software such as pppd. Unless otherwise stated, all of the commands explained in this chapter should be executed as root.

Tom Rhodes Updated and enhanced by Brian Somers Originally contributed by Nik Clayton With input from Dirk Frömberg Peter Childs

Using User PPP

User PPP

Assumptions

This document assumes you have the following:

ISP PPP

An account with an Internet Service Provider (ISP) which you connect to using PPP.

You have a modem or other device connected to your system and configured correctly which allows you to connect to your ISP.

The dial-up number(s) of your ISP.

PAP CHAP UNIX login name password

Your login name and password. (Either a regular &unix; style login and password pair, or a PAP or CHAP login and password pair.)

nameserver

The IP address of one or more name servers. Normally, you will be given two IP addresses by your ISP to use for this. If they have not given you at least one, then you can use the enable dns command in ppp.conf and ppp will set the name servers for you. This feature depends on your ISP's PPP implementation supporting DNS negotiation.

The following information may be supplied by your ISP, but is not completely necessary:

The IP address of your ISP's gateway. The gateway is the machine to which you will connect and will be set up as your default route. If you do not have this information, we can make one up and your ISP's PPP server will tell us the correct value when we connect. This IP number is referred to as HISADDR by ppp.

The netmask you should use. If your ISP has not provided you with one, you can safely use 255.255.255.255.

static IP address

If your ISP provides you with a static IP address and hostname, you can enter them. Otherwise, we simply let the peer assign whatever IP address it sees fit.

If you do not have any of the required information, contact your ISP.

Throughout this section, many of the examples showing the contents of configuration files are numbered by line. These numbers serve to aid in the presentation and discussion only and are not meant to be placed in the actual file. Proper indentation with tab and space characters is also important.

Creating PPP Device Nodes

PPPcreating device nodes

Under normal circumstances, most users will only need one tun device (/dev/tun0). References to tun0 below may be changed to tunN where N is any unit number corresponding to your system.

For FreeBSD installations that do not have &man.devfs.5; enabled (FreeBSD 4.X and earlier), the existence of the tun0 device should be verified (this is not necessary if &man.devfs.5; is enabled, as device nodes will be created on demand). The easiest way to make sure that the tun0 device is configured correctly is to remake the device. To remake the device, do the following:

&prompt.root; cd /dev
&prompt.root; sh MAKEDEV tun0

If you need 16 tunnel devices in your kernel, you will need to create them.
This can be done by executing the following commands: &prompt.root; cd /dev &prompt.root; sh MAKEDEV tun15 Automatic <application>PPP</application> Configuration PPPconfiguration Both ppp and pppd (the kernel level implementation of PPP) use the configuration files located in the /etc/ppp directory. Examples for user ppp can be found in /usr/share/examples/ppp/. Configuring ppp requires that you edit a number of files, depending on your requirements. What you put in them depends to some extent on whether your ISP allocates IP addresses statically (i.e., you get given one IP address, and always use that one) or dynamically (i.e., your IP address changes each time you connect to your ISP). PPP and Static IP Addresses PPPwith static IP addresses You will need to edit the /etc/ppp/ppp.conf configuration file. It should look similar to the example below. Lines that end in a : start in the first column (beginning of the line)— all other lines should be indented as shown using spaces or tabs. 1 default: 2 set log Phase Chat LCP IPCP CCP tun command 3 ident user-ppp VERSION (built COMPILATIONDATE) 4 set device /dev/cuaa0 5 set speed 115200 6 set dial "ABORT BUSY ABORT NO\\sCARRIER TIMEOUT 5 \ 7 \"\" AT OK-AT-OK ATE1Q0 OK \\dATDT\\T TIMEOUT 40 CONNECT" 8 set timeout 180 9 enable dns 10 11 provider: 12 set phone "(123) 456 7890" 13 set authname foo 14 set authkey bar 15 set login "TIMEOUT 10 \"\" \"\" gin:--gin: \\U word: \\P col: ppp" 16 set timeout 300 17 set ifaddr x.x.x.x y.y.y.y 255.255.255.255 0.0.0.0 18 add default HISADDR Line 1: Identifies the default entry. Commands in this entry are executed automatically when ppp is run. Line 2: Enables logging parameters. When the configuration is working satisfactorily, this line should be reduced to saying set log phase tun in order to avoid excessive log file sizes. Line 3: Tells PPP how to identify itself to the peer. PPP identifies itself to the peer if it has any trouble negotiating and setting up the link, providing information that the peers administrator may find useful when investigating such problems. Line 4: Identifies the device to which the modem is connected. COM1 is /dev/cuaa0 and COM2 is /dev/cuaa1. Line 5: Sets the speed you want to connect at. If 115200 does not work (it should with any reasonably new modem), try 38400 instead. - Line 6 & 7: + Line 6 & 7: PPPuser PPP The dial string. User PPP uses an expect-send syntax similar to the &man.chat.8; program. Refer to the manual page for information on the features of this language. Note that this command continues onto the next line for readability. Any command in ppp.conf may do this if the last character on the line is a ``\'' character. Line 8: Sets the idle timeout for the link. 180 seconds is the default, so this line is purely cosmetic. Line 9: Tells PPP to ask the peer to confirm the local resolver settings. If you run a local name server, this line should be commented out or removed. Line 10: A blank line for readability. Blank lines are ignored by PPP. Line 11: Identifies an entry for a provider called provider. This could be changed to the name of your ISP so that later you can use the to start the connection. Line 12: Sets the phone number for this provider. Multiple phone numbers may be specified using the colon (:) or pipe character (|)as a separator. The difference between the two separators is described in &man.ppp.8;. To summarize, if you want to rotate through the numbers, use a colon. 
If you want to always attempt to dial the first number first and only use the other numbers if the first number fails, use the pipe character. Always quote the entire set of phone numbers as shown. You must enclose the phone number in quotation marks (") if there is any intention of using spaces in the phone number. Failing to do so can cause a simple, yet subtle error.

- Line 13 & 14:
+ Line 13 & 14:

Identifies the user name and password. When connecting using a &unix; style login prompt, these values are referred to by the set login command using the \U and \P variables. When connecting using PAP or CHAP, these values are used at authentication time.

Line 15:

PAP CHAP

If you are using PAP or CHAP, there will be no login at this point, and this line should be commented out or removed. See PAP and CHAP authentication for further details.

The login string is of the same chat-like syntax as the dial string. In this example, the string works for a service whose login session looks like this:

J. Random Provider
login: foo
password: bar
protocol: ppp

You will need to alter this script to suit your own needs. When you write this script for the first time, you should ensure that you have enabled chat logging so you can determine if the conversation is going as expected.

Line 16:

timeout

Sets the default idle timeout (in seconds) for the connection. Here, the connection will be closed automatically after 300 seconds of inactivity. If you never want the connection to time out, set this value to zero or use the appropriate command line switch.

Line 17:

ISP

Sets the interface addresses. The string x.x.x.x should be replaced by the IP address that your provider has allocated to you. The string y.y.y.y should be replaced by the IP address that your ISP indicated for their gateway (the machine to which you connect). If your ISP has not given you a gateway address, use 10.0.0.2/0. If you need to use a guessed address, make sure that you create an entry in /etc/ppp/ppp.linkup as per the instructions for PPP and Dynamic IP addresses. If this line is omitted, ppp cannot run in -auto mode.

Line 18:

Adds a default route to your ISP's gateway. The special word HISADDR is replaced with the gateway address specified on line 17. It is important that this line appears after line 17, otherwise HISADDR will not yet be initialized.

If you do not wish to run ppp in -auto mode, this line should be moved to the ppp.linkup file.

It is not necessary to add an entry to ppp.linkup when you have a static IP address and are running ppp in -auto mode, as your routing table entries are already correct before you connect. You may however wish to create an entry to invoke programs after connection. This is explained later with the sendmail example.

Example configuration files can be found in the /usr/share/examples/ppp/ directory.

PPP and Dynamic IP Addresses

PPPwith dynamic IP addresses IPCP

If your service provider does not assign static IP addresses, ppp can be configured to negotiate the local and remote addresses. This is done by guessing an IP address and allowing ppp to set it up correctly using the IP Configuration Protocol (IPCP) after connecting. The ppp.conf configuration is the same as PPP and Static IP Addresses, with the following change:

17 set ifaddr 10.0.0.1/0 10.0.0.2/0 255.255.255.255

Again, do not include the line number; it is just for reference. Indentation of at least one space is required.

Line 17:

The number after the / character is the number of bits of the address that ppp will insist on.
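As a sketch of what the /bits notation means (using the guessed addresses above): with

set ifaddr 10.0.0.1/0 10.0.0.2/0

ppp insists on none of the bits and will accept whatever addresses the peer proposes, whereas a suffix of /32 would insist on every bit, i.e. on exactly the given addresses.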
You may wish to use IP numbers more appropriate to your circumstances, but the above example will always work. The last argument (0.0.0.0) tells PPP to start negotiations using address 0.0.0.0 rather than 10.0.0.1 and is necessary for some ISPs. Do not use 0.0.0.0 as the first argument to set ifaddr as it prevents PPP from setting up an initial route in mode. If you are not running in mode, you will need to create an entry in /etc/ppp/ppp.linkup. ppp.linkup is used after a connection has been established. At this point, ppp will have assigned the interface addresses and it will now be possible to add the routing table entries: 1 provider: 2 add default HISADDR Line 1: On establishing a connection, ppp will look for an entry in ppp.linkup according to the following rules: First, try to match the same label as we used in ppp.conf. If that fails, look for an entry for the IP address of our gateway. This entry is a four-octet IP style label. If we still have not found an entry, look for the MYADDR entry. Line 2: This line tells ppp to add a default route that points to HISADDR. HISADDR will be replaced with the IP number of the gateway as negotiated by the IPCP. See the pmdemand entry in the files /usr/share/examples/ppp/ppp.conf.sample and /usr/share/examples/ppp/ppp.linkup.sample for a detailed example. Receiving Incoming Calls PPPreceiving incoming calls When you configure ppp to receive incoming calls on a machine connected to a LAN, you must decide if you wish to forward packets to the LAN. If you do, you should allocate the peer an IP number from your LAN's subnet, and use the command enable proxy in your /etc/ppp/ppp.conf file. You should also confirm that the /etc/rc.conf file contains the following: gateway_enable="YES" Which getty? Configuring FreeBSD for Dial-up Services provides a good description on enabling dial-up services using &man.getty.8;. An alternative to getty is mgetty, a smarter version of getty designed with dial-up lines in mind. The advantages of using mgetty is that it actively talks to modems, meaning if port is turned off in /etc/ttys then your modem will not answer the phone. Later versions of mgetty (from 0.99beta onwards) also support the automatic detection of PPP streams, allowing your clients script-less access to your server. Refer to Mgetty and AutoPPP for more information on mgetty. <application>PPP</application> Permissions The ppp command must normally be run as the root user. If however, you wish to allow ppp to run in server mode as a normal user by executing ppp as described below, that user must be given permission to run ppp by adding them to the network group in /etc/group. You will also need to give them access to one or more sections of the configuration file using the allow command: allow users fred mary If this command is used in the default section, it gives the specified users access to everything. PPP Shells for Dynamic-IP Users PPP shells Create a file called /etc/ppp/ppp-shell containing the following: #!/bin/sh IDENT=`echo $0 | sed -e 's/^.*-\(.*\)$/\1/'` CALLEDAS="$IDENT" TTY=`tty` if [ x$IDENT = xdialup ]; then IDENT=`basename $TTY` fi echo "PPP for $CALLEDAS on $TTY" echo "Starting PPP for $IDENT" exec /usr/sbin/ppp -direct $IDENT This script should be executable. Now make a symbolic link called ppp-dialup to this script using the following commands: &prompt.root; ln -s ppp-shell /etc/ppp/ppp-dialup You should use this script as the shell for all of your dialup users. 
This is an example from /etc/passwd for a dialup PPP user with username pchilds (remember do not directly edit the password file, use &man.vipw.8;). pchilds:*:1011:300:Peter Childs PPP:/home/ppp:/etc/ppp/ppp-dialup Create a /home/ppp directory that is world readable containing the following 0 byte files: -r--r--r-- 1 root wheel 0 May 27 02:23 .hushlogin -r--r--r-- 1 root wheel 0 May 27 02:22 .rhosts which prevents /etc/motd from being displayed. PPP Shells for Static-IP Users PPP shells Create the ppp-shell file as above, and for each account with statically assigned IPs create a symbolic link to ppp-shell. For example, if you have three dialup customers, fred, sam, and mary, that you route class C networks for, you would type the following: &prompt.root; ln -s /etc/ppp/ppp-shell /etc/ppp/ppp-fred &prompt.root; ln -s /etc/ppp/ppp-shell /etc/ppp/ppp-sam &prompt.root; ln -s /etc/ppp/ppp-shell /etc/ppp/ppp-mary Each of these users dialup accounts should have their shell set to the symbolic link created above (for example, mary's shell should be /etc/ppp/ppp-mary). Setting Up <filename>ppp.conf</filename> for Dynamic-IP Users The /etc/ppp/ppp.conf file should contain something along the lines of: default: set debug phase lcp chat set timeout 0 ttyd0: set ifaddr 203.14.100.1 203.14.100.20 255.255.255.255 enable proxy ttyd1: set ifaddr 203.14.100.1 203.14.100.21 255.255.255.255 enable proxy The indenting is important. The default: section is loaded for each session. For each dialup line enabled in /etc/ttys create an entry similar to the one for ttyd0: above. Each line should get a unique IP address from your pool of IP addresses for dynamic users. Setting Up <filename>ppp.conf</filename> for Static-IP Users Along with the contents of the sample /usr/share/examples/ppp/ppp.conf above you should add a section for each of the statically assigned dialup users. We will continue with our fred, sam, and mary example. fred: set ifaddr 203.14.100.1 203.14.101.1 255.255.255.255 sam: set ifaddr 203.14.100.1 203.14.102.1 255.255.255.255 mary: set ifaddr 203.14.100.1 203.14.103.1 255.255.255.255 The file /etc/ppp/ppp.linkup should also contain routing information for each static IP user if required. The line below would add a route for the 203.14.101.0 class C via the client's ppp link. fred: add 203.14.101.0 netmask 255.255.255.0 HISADDR sam: add 203.14.102.0 netmask 255.255.255.0 HISADDR mary: add 203.14.103.0 netmask 255.255.255.0 HISADDR <command>mgetty</command> and AutoPPP mgetty AutoPPP LCP Configuring and compiling mgetty with the AUTO_PPP option enabled allows mgetty to detect the LCP phase of PPP connections and automatically spawn off a ppp shell. However, since the default login/password sequence does not occur it is necessary to authenticate users using either PAP or CHAP. This section assumes the user has successfully configured, compiled, and installed a version of mgetty with the AUTO_PPP option (v0.99beta or later). Make sure your /usr/local/etc/mgetty+sendfax/login.config file has the following in it: /AutoPPP/ - - /etc/ppp/ppp-pap-dialup This will tell mgetty to run the ppp-pap-dialup script for detected PPP connections. Create a file called /etc/ppp/ppp-pap-dialup containing the following (the file should be executable): #!/bin/sh exec /usr/sbin/ppp -direct pap$IDENT For each dialup line enabled in /etc/ttys, create a corresponding entry in /etc/ppp/ppp.conf. This will happily co-exist with the definitions we created above. 
pap:
 enable pap
 set ifaddr 203.14.100.1 203.14.100.20-203.14.100.40
 enable proxy

Each user logging in with this method will need to have a username/password in the /etc/ppp/ppp.secret file; alternatively, add the following option to authenticate users via PAP from the /etc/passwd file.

enable passwdauth

If you wish to assign some users a static IP number, you can specify the number as the third argument in /etc/ppp/ppp.secret. See /usr/share/examples/ppp/ppp.secret.sample for examples.

MS Extensions

DNS NetBIOS PPPMicrosoft extensions

It is possible to configure PPP to supply DNS and NetBIOS nameserver addresses on demand.

To enable these extensions with PPP version 1.x, the following lines might be added to the relevant section of /etc/ppp/ppp.conf.

enable msext
set ns 203.14.100.1 203.14.100.2
set nbns 203.14.100.5

And for PPP version 2 and above:

accept dns
set dns 203.14.100.1 203.14.100.2
set nbns 203.14.100.5

This will tell the clients the primary and secondary name server addresses, and a NetBIOS nameserver host.

In version 2 and above, if the set dns line is omitted, PPP will use the values found in /etc/resolv.conf.

PAP and CHAP Authentication

PAP CHAP

Some ISPs set their system up so that the authentication part of your connection is done using either the PAP or CHAP authentication mechanism. If this is the case, your ISP will not give a login: prompt when you connect, but will start talking PPP immediately.

PAP is less secure than CHAP, but security is not normally an issue here as passwords, although sent as plain text with PAP, are transmitted down a serial line only. There is not much room for crackers to eavesdrop.

Referring back to the PPP and Static IP addresses or PPP and Dynamic IP addresses sections, the following alterations must be made:

13 set authname MyUserName
14 set authkey MyPassword
15 set login

Line 13:

This line specifies your PAP/CHAP user name. You will need to insert the correct value for MyUserName.

Line 14:

password

This line specifies your PAP/CHAP password. You will need to insert the correct value for MyPassword. You may want to add an additional line, such as:

16 accept PAP

or

16 accept CHAP

to make it obvious that this is the intention, but PAP and CHAP are both accepted by default.

Line 15:

Your ISP will not normally require that you log into the server if you are using PAP or CHAP. You must therefore disable your set login string.

Changing Your <command>ppp</command> Configuration on the Fly

It is possible to talk to the ppp program while it is running in the background, but only if a suitable diagnostic port has been set up. To do this, add the following line to your configuration:

set server /var/run/ppp-tun%d DiagnosticPassword 0177

This will tell PPP to listen to the specified &unix; domain socket, asking clients for the specified password before allowing access. The %d in the name is replaced with the tun device number that is in use.

Once a socket has been set up, the &man.pppctl.8; program may be used in scripts that wish to manipulate the running program.

Using PPP Network Address Translation Capability

PPPNAT

PPP has the ability to perform NAT internally, without relying on the kernel's packet diverting capabilities. This functionality may be enabled by the following line in /etc/ppp/ppp.conf:

nat enable yes

Alternatively, PPP NAT may be enabled with the -nat command-line option. There is also an /etc/rc.conf knob named ppp_nat, which is enabled by default.
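Should you want ppp to run without NAT, this knob can be turned off with a line like the following in /etc/rc.conf (a minimal sketch; see &man.rc.conf.5; for the related ppp_* variables):

ppp_nat="NO"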
If you use this feature, you may also find the following /etc/ppp/ppp.conf options useful for forwarding incoming connections: nat port tcp 10.0.0.2:ftp ftp nat port tcp 10.0.0.2:http http Or, if you do not trust the outside world at all: nat deny_incoming yes Final System Configuration PPPconfiguration You now have ppp configured, but there are a few more things to do before it is ready to work. They all involve editing the /etc/rc.conf file. Working from the top down in this file, make sure the hostname= line is set, e.g.: hostname="foo.example.com" If your ISP has supplied you with a static IP address and name, it is probably best that you use this name as your host name. Look for the network_interfaces variable. If you want to configure your system to dial your ISP on demand, make sure the tun0 device is added to the list; otherwise, remove it. network_interfaces="lo0 tun0" ifconfig_tun0= The ifconfig_tun0 variable should be empty, and a file called /etc/start_if.tun0 should be created. This file should contain the line: ppp -auto mysystem This script is executed at network configuration time, starting your ppp daemon in automatic mode. If you have a LAN for which this machine is a gateway, you may also wish to use the switch. Refer to the manual page for further details. Make sure that the router program is set to NO with the following line in your /etc/rc.conf: router_enable="NO" routed It is important that the routed daemon is not started, as routed tends to delete the default routing table entries created by ppp. It is probably worth your while ensuring that the sendmail_flags line does not include the option, otherwise sendmail will attempt to do a network lookup every now and then, possibly causing your machine to dial out. You may try: sendmail_flags="-bd" sendmail The downside of this is that you must force sendmail to re-examine the mail queue whenever the ppp link is up by typing: &prompt.root; /usr/sbin/sendmail -q You may wish to use the !bg command in ppp.linkup to do this automatically: 1 provider: 2 delete ALL 3 add 0 0 HISADDR 4 !bg sendmail -bd -q30m SMTP If you do not like this, it is possible to set up a dfilter to block SMTP traffic. Refer to the sample files for further details. All that is left is to reboot the machine. After rebooting, you can now either type: &prompt.root; ppp and then dial provider to start the PPP session, or, if you want ppp to establish sessions automatically when there is outbound traffic (and you have not created the start_if.tun0 script), type: &prompt.root; ppp -auto provider Summary To recap, the following steps are necessary when setting up ppp for the first time: Client side: Ensure that the tun device is built into your kernel. Ensure that the tunN device file is available in the /dev directory. Create an entry in /etc/ppp/ppp.conf. The pmdemand example should suffice for most ISPs. If you have a dynamic IP address, create an entry in /etc/ppp/ppp.linkup. Update your /etc/rc.conf file. Create a start_if.tun0 script if you require demand dialing. Server side: Ensure that the tun device is built into your kernel. Ensure that the tunN device file is available in the /dev directory. Create an entry in /etc/passwd (using the &man.vipw.8; program). Create a profile in this user's home directory that runs ppp -direct direct-server or similar. Create an entry in /etc/ppp/ppp.conf. The direct-server example should suffice. Create an entry in /etc/ppp/ppp.linkup. Update your /etc/rc.conf file. Gennady B.
Sorokopud Parts originally contributed by Robert Huff Using Kernel PPP Setting Up Kernel PPP PPPkernel PPP Before you start setting up PPP on your machine, make sure that pppd is located in /usr/sbin and the directory /etc/ppp exists. pppd can work in two modes: As a client — you want to connect your machine to the outside world via a PPP serial connection or modem line. PPPserver As a server — your machine is located on the network, and is used to connect other computers using PPP. In both cases you will need to set up an options file (/etc/ppp/options or ~/.ppprc if you have more than one user on your machine that uses PPP). You will also need some modem/serial software (preferably comms/kermit), so you can dial and establish a connection with the remote host. Trev Roydhouse Based on information provided by Using <command>pppd</command> as a Client PPPclient Cisco The following /etc/ppp/options might be used to connect to a Cisco terminal server PPP line. crtscts # enable hardware flow control modem # modem control line noipdefault # remote PPP server must supply your IP address # if the remote host does not send your IP during IPCP # negotiation, remove this option passive # wait for LCP packets domain ppp.foo.com # put your domain name here :<remote_ip> # put the IP of remote PPP host here # it will be used to route packets via PPP link # if you did not specify the noipdefault option # change this line to <local_ip>:<remote_ip> defaultroute # use this if you want the PPP server to be your # default router To connect: Kermit modem Dial the remote host using Kermit (or some other modem program), and enter your user name and password (or whatever is needed to enable PPP on the remote host). Exit Kermit (without hanging up the line). Enter the following: &prompt.root; /usr/src/usr.sbin/pppd.new/pppd /dev/tty01 19200 Be sure to use the appropriate speed and device name. Now your computer is connected with PPP. If the connection fails, you can add the option to the /etc/ppp/options file, and check console messages to track the problem. The following /etc/ppp/pppup script will make all three stages automatic: #!/bin/sh ps ax |grep pppd |grep -v grep pid=`ps ax |grep pppd |grep -v grep|awk '{print $1;}'` if [ "X${pid}" != "X" ] ; then echo 'killing pppd, PID=' ${pid} kill ${pid} fi ps ax |grep kermit |grep -v grep pid=`ps ax |grep kermit |grep -v grep|awk '{print $1;}'` if [ "X${pid}" != "X" ] ; then echo 'killing kermit, PID=' ${pid} kill -9 ${pid} fi ifconfig ppp0 down ifconfig ppp0 delete kermit -y /etc/ppp/kermit.dial pppd /dev/tty01 19200 Kermit /etc/ppp/kermit.dial is a Kermit script that dials and performs all the necessary authorization on the remote host (an example of such a script is included at the end of this section). Use the following /etc/ppp/pppdown script to disconnect the PPP line: #!/bin/sh pid=`ps ax |grep pppd |grep -v grep|awk '{print $1;}'` if [ "X${pid}" != "X" ] ; then echo 'killing pppd, PID=' ${pid} kill -TERM ${pid} fi ps ax |grep kermit |grep -v grep pid=`ps ax |grep kermit |grep -v grep|awk '{print $1;}'` if [ "X${pid}" != "X" ] ; then echo 'killing kermit, PID=' ${pid} kill -9 ${pid} fi /sbin/ifconfig ppp0 down /sbin/ifconfig ppp0 delete kermit -y /etc/ppp/kermit.hup /etc/ppp/ppptest Check to see if pppd is still running by executing /etc/ppp/ppptest, which should look like this: #!/bin/sh pid=`ps ax| grep pppd |grep -v grep|awk '{print $1;}'` if [ "X${pid}" != "X" ] ; then echo 'pppd running: PID=' ${pid-NONE} else echo 'No pppd running.'
fi set -x netstat -n -I ppp0 ifconfig ppp0 To hang up the modem, execute /etc/ppp/kermit.hup, which should contain: set line /dev/tty01 ; put your modem device here set speed 19200 set file type binary set file names literal set win 8 set rec pack 1024 set send pack 1024 set block 3 set term bytesize 8 set command bytesize 8 set flow none pau 1 out +++ inp 5 OK out ATH0\13 echo \13 exit Here is an alternate method using chat instead of kermit: The following two files are sufficient to accomplish a pppd connection. /etc/ppp/options: /dev/cuaa1 115200 crtscts # enable hardware flow control modem # modem control line connect "/usr/bin/chat -f /etc/ppp/login.chat.script" noipdefault # remote PPP server must supply your IP address # if the remote host does not send your IP during # IPCP negotiation, remove this option passive # wait for LCP packets domain <your.domain> # put your domain name here : # put the IP of remote PPP host here # it will be used to route packets via PPP link # if you did not specify the noipdefault option # change this line to <local_ip>:<remote_ip> defaultroute # use this if you want the PPP server to be # your default router /etc/ppp/login.chat.script: The following should go on a single line. ABORT BUSY ABORT 'NO CARRIER' "" AT OK ATDT<phone.number> CONNECT "" TIMEOUT 10 ogin:-\\r-ogin: <login-id> TIMEOUT 5 sword: <password> Once these are installed and modified correctly, all you need to do is run pppd, like so: &prompt.root; pppd Using <command>pppd</command> as a Server /etc/ppp/options should contain something similar to the following: crtscts # Hardware flow control netmask 255.255.255.0 # netmask (not required) 192.114.208.20:192.114.208.165 # IPs of the local and remote hosts # the local IP must be different from the one # you assigned to the Ethernet (or other) # interface on your machine. # the remote IP is the IP address that will be # assigned to the remote machine domain ppp.foo.com # your domain passive # wait for LCP modem # modem line The following /etc/ppp/pppserv script will tell pppd to behave as a server: #!/bin/sh ps ax |grep pppd |grep -v grep pid=`ps ax |grep pppd |grep -v grep|awk '{print $1;}'` if [ "X${pid}" != "X" ] ; then echo 'killing pppd, PID=' ${pid} kill ${pid} fi ps ax |grep kermit |grep -v grep pid=`ps ax |grep kermit |grep -v grep|awk '{print $1;}'` if [ "X${pid}" != "X" ] ; then echo 'killing kermit, PID=' ${pid} kill -9 ${pid} fi # reset ppp interface ifconfig ppp0 down ifconfig ppp0 delete # enable autoanswer mode kermit -y /etc/ppp/kermit.ans # run ppp pppd /dev/tty01 19200 Use this /etc/ppp/pppservdown script to stop the server: #!/bin/sh ps ax |grep pppd |grep -v grep pid=`ps ax |grep pppd |grep -v grep|awk '{print $1;}'` if [ "X${pid}" != "X" ] ; then echo 'killing pppd, PID=' ${pid} kill ${pid} fi ps ax |grep kermit |grep -v grep pid=`ps ax |grep kermit |grep -v grep|awk '{print $1;}'` if [ "X${pid}" != "X" ] ; then echo 'killing kermit, PID=' ${pid} kill -9 ${pid} fi ifconfig ppp0 down ifconfig ppp0 delete kermit -y /etc/ppp/kermit.noans The following Kermit script (/etc/ppp/kermit.ans) will enable/disable autoanswer mode on your modem.
It should look like this: set line /dev/tty01 set speed 19200 set file type binary set file names literal set win 8 set rec pack 1024 set send pack 1024 set block 3 set term bytesize 8 set command bytesize 8 set flow none pau 1 out +++ inp 5 OK out ATH0\13 inp 5 OK echo \13 out ATS0=1\13 ; change this to out ATS0=0\13 if you want to disable ; autoanswer mode inp 5 OK echo \13 exit A script named /etc/ppp/kermit.dial is used for dialing and authenticating on the remote host. You will need to customize it for your needs. Put your login and password in this script; you will also need to change the input statement depending on responses from your modem and remote host. ; ; put the com line attached to the modem here: ; set line /dev/tty01 ; ; put the modem speed here: ; set speed 19200 set file type binary ; full 8 bit file xfer set file names literal set win 8 set rec pack 1024 set send pack 1024 set block 3 set term bytesize 8 set command bytesize 8 set flow none set modem hayes set dial hangup off set carrier auto ; Then SET CARRIER if necessary, set dial display on ; Then SET DIAL if necessary, set input echo on set input timeout proceed set input case ignore def \%x 0 ; login prompt counter goto slhup :slcmd ; put the modem in command mode echo Put the modem in command mode. clear ; Clear unread characters from input buffer pause 1 output +++ ; hayes escape sequence input 1 OK\13\10 ; wait for OK if success goto slhup output \13 pause 1 output at\13 input 1 OK\13\10 if fail goto slcmd ; if modem doesn't answer OK, try again :slhup ; hang up the phone clear ; Clear unread characters from input buffer pause 1 echo Hanging up the phone. output ath0\13 ; hayes command for on hook input 2 OK\13\10 if fail goto slcmd ; if no OK answer, put modem in command mode :sldial ; dial the number pause 1 echo Dialing. output atdt9,550311\13\10 ; put phone number here assign \%x 0 ; zero the time counter :look clear ; Clear unread characters from input buffer increment \%x ; Count the seconds input 1 {CONNECT } if success goto sllogin reinput 1 {NO CARRIER\13\10} if success goto sldial reinput 1 {NO DIALTONE\13\10} if success goto slnodial reinput 1 {\255} if success goto slhup reinput 1 {\127} if success goto slhup -if < \%x 60 goto look +if < \%x 60 goto look else goto slhup :sllogin ; login assign \%x 0 ; zero the time counter pause 1 echo Looking for login prompt. :slloop increment \%x ; Count the seconds clear ; Clear unread characters from input buffer output \13 ; ; put your expected login prompt here: ; input 1 {Username: } if success goto sluid reinput 1 {\255} if success goto slhup reinput 1 {\127} if success goto slhup -if < \%x 10 goto slloop ; try 10 times to get a login prompt +if < \%x 10 goto slloop ; try 10 times to get a login prompt else goto slhup ; hang up and start again if 10 failures :sluid ; ; put your userid here: ; output ppp-login\13 input 1 {Password: } ; ; put your password here: ; output ppp-password\13 input 1 {Entering SLIP mode.} echo quit :slnodial echo \7No dialtone. Check the telephone line!\7 exit 1 ; local variables: ; mode: csh ; comment-start: "; " ; comment-start-skip: "; " ; end: Tom Rhodes Contributed by Troubleshooting <acronym>PPP</acronym> Connections PPPtroubleshooting This section covers a few issues which may arise when using PPP over a modem connection. For instance, perhaps you need to know exactly what prompts the system you are dialing into will present. 
Some ISPs present the ssword prompt, and others will present password; if the ppp script is not written accordingly, the login attempt will fail. The most common way to debug ppp connections is by connecting manually. The following information will walk you through a manual connection step by step. Check the Device Nodes If you reconfigured your kernel, you should already know about the sio device. If you did not configure your kernel, there is no reason to worry. Just check the dmesg output for the modem device with: &prompt.root; dmesg | grep sio You should get some pertinent output about the sio devices. These are the COM ports we need. If your modem acts like a standard serial port then you should see it listed on sio1, or COM2. If so, you are not required to rebuild the kernel; you just need to make the serial device. You can do this by changing your directory to /dev and running the MAKEDEV script. Create the serial devices with: &prompt.root; sh MAKEDEV cuaa0 cuaa1 cuaa2 cuaa3 which will create the serial devices for your system. When matching up device names, remember that if the modem is on sio1 (COM2 in DOS), then your modem device would be /dev/cuaa1. Connecting Manually Connecting to the Internet by manually controlling ppp is quick, easy, and a great way to debug a connection or just get information on how your ISP treats ppp client connections. Let's start PPP from the command line. Note that in all of our examples we will use example as the hostname of the machine running PPP. You start ppp by just typing ppp: &prompt.root; ppp We have now started ppp. ppp ON example> set device /dev/cuaa1 We set our modem device; in this case it is cuaa1. ppp ON example> set speed 115200 Set the connection speed; in this case we are using 115,200 bps. ppp ON example> enable dns Tell ppp to configure our resolver and add the nameserver lines to /etc/resolv.conf. If ppp cannot determine our hostname, we can set one manually later. ppp ON example> term Switch to terminal mode so that we can manually control the modem. deflink: Entering terminal mode on /dev/cuaa1 type '~h' for help at OK atdt123456789 Use at to initialize the modem, then use atdt and the number for your ISP to begin the dial-in process. CONNECT Confirmation of the connection; if we are going to have any connection problems unrelated to hardware, here is where we will attempt to resolve them. ISP Login:myusername Here you are prompted for a username; answer the prompt with the username that was provided by the ISP. ISP Pass:mypassword This time we are prompted for a password; just reply with the password that was provided by the ISP. Just like logging into &os;, the password will not echo. Shell or PPP:ppp Depending on your ISP this prompt may never appear. Here we are being asked if we wish to use a shell on the provider, or to start ppp. In this example, we have chosen to use ppp as we want an Internet connection. Ppp ON example> Notice that in this example the first p has been capitalized. This shows that we have successfully connected to the ISP. PPp ON example> We have successfully authenticated with our ISP and are waiting for the assigned IP address. PPP ON example> We have made an agreement on an IP address and successfully completed our connection. PPP ON example>add default HISADDR Here we add our default route; we need to do this before we can talk to the outside world, as currently the only established connection is with the peer. If this fails due to existing routes, you can put a bang character ! in front of the add.
Alternatively, you can set this before making the actual connection and it will negotiate a new route accordingly. If everything went well, we should now have an active connection to the Internet, which can be thrown into the background using CTRL z If you notice the PPP prompt revert to ppp, then we have lost our connection. This is good to know, because it shows our connection status. Capital P's show that we have a connection to the ISP, and lowercase p's show that the connection has been lost for whatever reason. ppp has only these two states. Debugging If you have a direct line and cannot seem to make a connection, then turn CTS/RTS hardware flow control off with the . This is mainly the case if you are connected to some PPP-capable terminal servers, where PPP hangs when it tries to write data to your communication link, waiting for a CTS (Clear To Send) signal which may never come. If you use this option, however, you should also use the option, which may be required to defeat hardware dependent on passing certain characters from end to end, most of the time XON/XOFF. See the &man.ppp.8; manual page for more information on this option, and how it is used. If you have an older modem, you may need to use the . Parity is set to none by default, but is used for error checking (with a large increase in traffic) on older modems and by some ISPs. You may need this option for the Compuserve ISP. PPP may not return to the command mode, which is usually a negotiation error where the ISP is waiting for your side to start negotiating. At this point, using the ~p command will force ppp to start sending the configuration information. If you never obtain a login prompt, then most likely you need to use PAP or CHAP authentication instead of the &unix;-style login in the example above. To use PAP or CHAP, just add the following options to PPP before going into terminal mode: ppp ON example> set authname myusername Where myusername should be replaced with the username that was assigned by the ISP. ppp ON example> set authkey mypassword Where mypassword should be replaced with the password that was assigned by the ISP. If you connect fine, but cannot seem to find any domain name, try to use &man.ping.8; with an IP address and see if you can get any return information. If you experience 100 percent (100%) packet loss, then it is most likely that you were not assigned a default route. Double check that the option was set during the connection. If you can connect to a remote IP address then it is possible that a resolver address has not been added to /etc/resolv.conf. This file should look like: domain example.com nameserver x.x.x.x nameserver y.y.y.y Where x.x.x.x and y.y.y.y should be replaced with the IP addresses of your ISP's DNS servers. This information may or may not have been provided when you signed up, but a quick call to your ISP should remedy that. You could also have &man.syslog.3; provide a logging function for your PPP connection. Just add: !ppp *.* /var/log/ppp.log to /etc/syslog.conf. In most cases, this functionality already exists. Jim Mock Contributed (from http://node.to/freebsd/how-tos/how-to-freebsd-pppoe.html) by Using PPP over Ethernet (PPPoE) PPPover Ethernet PPPoE PPP, over Ethernet This section describes how to set up PPP over Ethernet (PPPoE). Configuring the Kernel No kernel configuration is necessary for PPPoE any longer. If the necessary netgraph support is not built into the kernel, it will be dynamically loaded by ppp.
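If you are curious whether the netgraph support has in fact been loaded, a quick check with &man.kldstat.8; once ppp is running should show it; this is only a diagnostic sketch, and the exact module names may vary between releases:

&prompt.root; kldstat | grep ng

Entries such as netgraph.ko and ng_pppoe.ko in the output indicate that the support is present.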
Setting Up <filename>ppp.conf</filename> Here is an example of a working ppp.conf: default: set log Phase tun command # you can add more detailed logging if you wish set ifaddr 10.0.0.1/0 10.0.0.2/0 name_of_service_provider: set device PPPoE:xl1 # replace xl1 with your Ethernet device set authname YOURLOGINNAME set authkey YOURPASSWORD set dial set login add default HISADDR Running <application>ppp</application> As root, you can run: &prompt.root; ppp -ddial name_of_service_provider Starting <application>ppp</application> at Boot Add the following to your /etc/rc.conf file: ppp_enable="YES" ppp_mode="ddial" ppp_nat="YES" # if you want to enable nat for your local network, otherwise NO ppp_profile="name_of_service_provider" Using a PPPoE Service Tag Sometimes it will be necessary to use a service tag to establish your connection. Service tags are used to distinguish between different PPPoE servers attached to a given network. You should have been given any required service tag information in the documentation provided by your ISP. If you cannot locate it there, ask your ISP's tech support personnel. As a last resort, you could try the method suggested by the Roaring Penguin PPPoE program, which can be found in the Ports Collection. Bear in mind, however, that this may de-program your modem and render it useless, so think twice before doing it. Simply install the program shipped with the modem by your provider. Then, access the System menu from the program. The name of your profile should be listed there. It is usually ISP. The profile name (service tag) will be used in the PPPoE configuration entry in ppp.conf as the provider part of the set device command (see the &man.ppp.8; manual page for full details). It should look like this: set device PPPoE:xl1:ISP Do not forget to change xl1 to the proper device for your Ethernet card. Do not forget to change ISP to the profile you have just found above. For additional information, see: Cheaper Broadband with FreeBSD on DSL by Renaud Waldura. Nutzung von T-DSL und T-Online mit FreeBSD by Udo Erdelhoff (in German). PPPoE with a &tm.3com; <trademark class="registered">HomeConnect</trademark> ADSL Modem Dual Link This modem does not follow RFC 2516 (A Method for transmitting PPP over Ethernet (PPPoE), written by L. Mamakos, K. Lidl, J. Evarts, D. Carrel, D. Simone, and R. Wheeler). Instead, different packet type codes have been used for the Ethernet frames. Please complain to 3Com if you think it should comply with the PPPoE specification. In order to make FreeBSD capable of communicating with this device, a sysctl must be set. This can be done automatically at boot time by updating /etc/sysctl.conf: net.graph.nonstandard_pppoe=1 or can be done immediately with the command: &prompt.root; sysctl net.graph.nonstandard_pppoe=1 Unfortunately, because this is a system-wide setting, it is not possible to talk to a normal PPPoE client or server and a &tm.3com; HomeConnect ADSL Modem at the same time. Using <application>PPP</application> over ATM (PPPoA) PPPover ATM PPPoA PPP, over ATM The following describes how to set up PPP over ATM (PPPoA). PPPoA is a popular choice among European DSL providers. Using PPPoA with the Alcatel &speedtouch; USB PPPoA support for this device is supplied as a port in FreeBSD because the firmware is distributed under Alcatel's license agreement and cannot be redistributed freely with the base system of FreeBSD. To install the software, simply use the Ports Collection.
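If your ports tree is installed in the standard location, the build and installation boil down to something like the following (a sketch; the exact steps may vary with your ports setup):

&prompt.root; cd /usr/ports/net/pppoa
&prompt.root; make install clean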
Once the net/pppoa port is installed, follow the instructions provided with it. Like many USB devices, the Alcatel &speedtouch; USB needs to download firmware from the host computer to operate properly. It is possible to automate this process in &os; so that this transfer takes place whenever the device is plugged into a USB port. The following information can be added to the /etc/usbd.conf file to enable this automatic firmware transfer. This file must be edited as the root user. device "Alcatel SpeedTouch USB" devname "ugen[0-9]+" vendor 0x06b9 product 0x4061 attach "/usr/local/sbin/modem_run -f /usr/local/libdata/mgmt.o" To enable the USB daemon, usbd, put the following line into /etc/rc.conf: usbd_enable="YES" It is also possible to set up ppp to dial up at startup. To do this, add the following lines to /etc/rc.conf. Again, for this procedure you will need to be logged in as the root user. ppp_enable="YES" ppp_mode="ddial" ppp_profile="adsl" For this to work correctly you will need to have used the sample ppp.conf which is supplied with the net/pppoa port. Using mpd You can use mpd to connect to a variety of services, in particular PPTP services. You can find mpd in the Ports Collection, net/mpd. Many ADSL modems require that a PPTP tunnel is created between the modem and computer; one such modem is the Alcatel &speedtouch; Home. First you must install the port, and then you can configure mpd to suit your requirements and provider settings. The port places a set of sample configuration files which are well documented in PREFIX/etc/mpd/. Note here that PREFIX means the directory into which your ports are installed; this defaults to /usr/local/. A complete guide to configuring mpd is available in HTML format once the port has been installed. It is placed in PREFIX/share/doc/mpd/. Here is a sample configuration for connecting to an ADSL service with mpd. The configuration is spread over two files, first the mpd.conf: default: load adsl adsl: new -i ng0 adsl adsl set bundle authname username set bundle password password set bundle disable multilink set link no pap acfcomp protocomp set link disable chap set link accept chap set link keep-alive 30 10 set ipcp no vjcomp set ipcp ranges 0.0.0.0/0 0.0.0.0/0 set iface route default set iface disable on-demand set iface enable proxy-arp set iface idle 0 open The username used to authenticate with your ISP. The password used to authenticate with your ISP. The mpd.links file contains information about the link, or links, you wish to establish. An example mpd.links to accompany the above example is given below: adsl: set link type pptp set pptp mode active set pptp enable originate outcall set pptp self 10.0.0.1 set pptp peer 10.0.0.138 The IP address of the &os; computer that you will be running mpd from. The IP address of your ADSL modem. For the Alcatel &speedtouch; Home this address defaults to 10.0.0.138. It is possible to initialize the connection easily by issuing the following command as root: &prompt.root; mpd -b adsl You can see the status of the connection with the following command: &prompt.user; ifconfig ng0 ng0: flags=88d1<UP,POINTOPOINT,RUNNING,NOARP,SIMPLEX,MULTICAST> mtu 1500 inet 216.136.204.117 --> 204.152.186.171 netmask 0xffffffff Using mpd is the recommended way to connect to an ADSL service with &os;. Using pptpclient It is also possible to use FreeBSD to connect to other PPPoA services using net/pptpclient. To use net/pptpclient to connect to a DSL service, install the port or package and edit your /etc/ppp/ppp.conf.
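For instance, the package version can be installed with a one-liner along these lines (a sketch; the package name is assumed to match the port and may differ between releases):

&prompt.root; pkg_add -r pptpclient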
You will need to be root to perform both of these operations. An example section of ppp.conf is given below. For further information on ppp.conf options consult the ppp manual page, &man.ppp.8;. adsl: set log phase chat lcp ipcp ccp tun command set timeout 0 enable dns set authname username set authkey password set ifaddr 0 0 add default HISADDR The username of your account with the DSL provider. The password for your account. Because you must put your account's password in the ppp.conf file in plain text form, you should make sure that nobody can read the contents of this file. The following series of commands will make sure the file is only readable by the root account. Refer to the manual pages for &man.chmod.1; and &man.chown.8; for further information. &prompt.root; chown root:wheel /etc/ppp/ppp.conf &prompt.root; chmod 600 /etc/ppp/ppp.conf This will open a tunnel for a PPP session to your DSL router. Ethernet DSL modems have a preconfigured LAN IP address which you connect to. In the case of the Alcatel &speedtouch; Home this address is 10.0.0.138. Your router documentation should tell you which address your device uses. To open the tunnel and start a PPP session, execute the following command: &prompt.root; pptp address adsl You may wish to add an ampersand (&) to the end of the previous command because pptp will not return your prompt to you otherwise. A tun virtual tunnel device will be created for interaction between the pptp and ppp processes. Once you have been returned to your prompt, or the pptp process has confirmed a connection, you can examine the tunnel like so: &prompt.user; ifconfig tun0 tun0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1500 inet 216.136.204.21 --> 204.152.186.171 netmask 0xffffff00 Opened by PID 918 If you are unable to connect, check the configuration of your router, which is usually accessible via telnet or with a web browser. If you still cannot connect, you should examine the output of the pptp command and the contents of the ppp log file, /var/log/ppp.log, for clues. Satoshi Asami Originally contributed by Guy Helmer With input from Piero Serini Using SLIP SLIP Setting Up a SLIP Client SLIPclient The following is one way to set up a FreeBSD machine for SLIP on a static host network. For dynamic hostname assignments (your address changes each time you dial up), you probably need to have a more complex setup. First, determine which serial port your modem is connected to. Many people set up a symbolic link, such as /dev/modem, to point to the real device name, /dev/cuaaN (or /dev/cuadN under &os; 6.X). This allows you to abstract the actual device name should you ever need to move the modem to a different port. It can become quite cumbersome when you need to fix a bunch of files in /etc and .kermrc files all over the system! /dev/cuaa0 (or /dev/cuad0 under &os; 6.X) is COM1, cuaa1 (or /dev/cuad1) is COM2, etc. Make sure you have the following in your kernel configuration file: device sl Under &os; 4.X, use instead the following line: pseudo-device sl 1 It is included in the GENERIC kernel, so this should not be a problem unless you have deleted it. Things You Have to Do Only Once Add your home machine, the gateway and nameservers to your /etc/hosts file.
Ours looks like this: 127.0.0.1 localhost loghost 136.152.64.181 water.CS.Example.EDU water.CS water 136.152.64.1 inr-3.CS.Example.EDU inr-3 slip-gateway 128.32.136.9 ns1.Example.EDU ns1 128.32.136.12 ns2.Example.EDU ns2 Make sure you have hosts before bind in your /etc/host.conf on FreeBSD versions prior to 5.0. Since FreeBSD 5.0, the system uses the file /etc/nsswitch.conf instead; make sure you have files before dns in the hosts line of this file. Without these parameters, funny things may happen. Edit the /etc/rc.conf file. Set your hostname by editing the line that says: hostname="myname.my.domain" Your machine's full Internet hostname should be placed here. default route Designate the default router by changing the line: defaultrouter="NO" to: defaultrouter="slip-gateway" Make a file /etc/resolv.conf which contains: domain CS.Example.EDU nameserver 128.32.136.9 nameserver 128.32.136.12 nameserver domain name As you can see, these set up the nameserver hosts. Of course, the actual domain names and addresses depend on your environment. Set the password for root and toor (and any other accounts that do not have a password). Reboot your machine and make sure it comes up with the correct hostname. Making a SLIP Connection SLIPconnecting with Dial up, type slip at the prompt, enter your machine name and password. What is required to be entered depends on your environment. If you use Kermit, you can try a script like this: # kermit setup set modem hayes set line /dev/modem set speed 115200 set parity none set flow rts/cts set terminal bytesize 8 set file type binary # The next macro will dial up and login define slip dial 643-9600, input 10 =>, if failure stop, - output slip\x0d, input 10 Username:, if failure stop, - output silvia\x0d, input 10 Password:, if failure stop, - output ***\x0d, echo \x0aCONNECTED\x0a Of course, you have to change the username and password to fit yours. After doing so, you can just type slip from the Kermit prompt to connect. Leaving your password in plain text anywhere in the filesystem is generally a bad idea. Do it at your own risk. Leave Kermit there (you can suspend it by Ctrl z ) and, as root, type: &prompt.root; slattach -h -c -s 115200 /dev/modem If you are able to ping hosts on the other side of the router, you are connected! If it does not work, you might want to try instead of as an argument to slattach. How to Shutdown the Connection Do the following: &prompt.root; kill -INT `cat /var/run/slattach.modem.pid` to kill slattach. Keep in mind you must be root to do the above. Then go back to kermit (by running fg if you suspended it) and exit from it (q). The &man.slattach.8; manual page says you have to use ifconfig sl0 down to mark the interface down, but this does not seem to make any difference. (ifconfig sl0 reports the same thing.) Sometimes, your modem might refuse to drop the carrier. In that case, simply start kermit and quit it again. It usually goes out on the second try. Troubleshooting If it does not work, feel free to ask on the &a.net.name; mailing list. Things that people have tripped over so far: Not using or in slattach (This should not be fatal, but some users have reported that this solves their problems.) Using instead of (might be hard to see the difference on some fonts). Try ifconfig sl0 to see your interface status.
For example, you might get: &prompt.root; ifconfig sl0 sl0: flags=10<POINTOPOINT> inet 136.152.64.181 --> 136.152.64.1 netmask ffffff00 If you get no route to host messages from &man.ping.8;, there may be a problem with your routing table. You can use the netstat -r command to display the current routes: &prompt.root; netstat -r Routing tables Destination Gateway Flags Refs Use Iface MTU Rtt Netmasks: (root node) (root node) Route Tree for Protocol Family inet: (root node) => default inr-3.Example.EDU UG 8 224515 sl0 - - localhost.Exampl localhost.Example. UH 5 42127 lo0 - 0.438 inr-3.Example.ED water.CS.Example.E UH 1 0 sl0 - - water.CS.Example localhost.Example. UGH 34 47641234 lo0 - 0.438 (root node) The preceding examples are from a relatively busy system. The numbers on your system will vary depending on network activity. Setting Up a SLIP Server SLIPserver This document provides suggestions for setting up SLIP Server services on a FreeBSD system, which typically means configuring your system to automatically start up connections upon login for remote SLIP clients. Prerequisites TCP/IP networking This section is very technical in nature, so background knowledge is required. It is assumed that you are familiar with the TCP/IP network protocol, and in particular, network and node addressing, network address masks, subnetting, routing, and routing protocols, such as RIP. Configuring SLIP services on a dial-up server requires a knowledge of these concepts, and if you are not familiar with them, please read a copy of either Craig Hunt's TCP/IP Network Administration published by O'Reilly & Associates, Inc. (ISBN Number 0-937175-82-X), or Douglas Comer's books on the TCP/IP protocol. modem It is further assumed that you have already set up your modem(s) and configured the appropriate system files to allow logins through your modems. If you have not prepared your system for this yet, please see for details on dialup services configuration. You may also want to check the manual pages for &man.sio.4; for information on the serial port device driver and &man.ttys.5;, &man.gettytab.5;, &man.getty.8;, and &man.init.8; for information relevant to configuring the system to accept logins on modems, and perhaps &man.stty.1; for information on setting serial port parameters (such as clocal for directly-connected serial interfaces). Quick Overview In its typical configuration, using FreeBSD as a SLIP server works as follows: a SLIP user dials up your FreeBSD SLIP Server system and logs in with a special SLIP login ID that uses /usr/sbin/sliplogin as the special user's shell. The sliplogin program browses the file /etc/sliphome/slip.hosts to find a matching line for the special user, and if it finds a match, connects the serial line to an available SLIP interface and then runs the shell script /etc/sliphome/slip.login to configure the SLIP interface.
An Example of a SLIP Server Login For example, if a SLIP user ID were Shelmerg, Shelmerg's entry in /etc/master.passwd would look something like this: Shelmerg:password:1964:89::0:0:Guy Helmer - SLIP:/usr/users/Shelmerg:/usr/sbin/sliplogin When Shelmerg logs in, sliplogin will search /etc/sliphome/slip.hosts for a line that has a matching user ID; for example, there may be a line in /etc/sliphome/slip.hosts that reads: Shelmerg dc-slip sl-helmer 0xfffffc00 autocomp sliplogin will find that matching line, hook the serial line into the next available SLIP interface, and then execute /etc/sliphome/slip.login like this: /etc/sliphome/slip.login 0 19200 Shelmerg dc-slip sl-helmer 0xfffffc00 autocomp If all goes well, /etc/sliphome/slip.login will issue an ifconfig for the SLIP interface to which sliplogin attached itself (SLIP interface 0, in the above example, which was the first parameter in the list given to slip.login) to set the local IP address (dc-slip), remote IP address (sl-helmer), network mask for the SLIP interface (0xfffffc00), and any additional flags (autocomp). If something goes wrong, sliplogin usually logs good informational messages via the syslogd daemon facility, which usually logs to /var/log/messages (see the manual pages for &man.syslogd.8; and &man.syslog.conf.5;, and perhaps check /etc/syslog.conf to see what syslogd is logging and where it is logging to). Kernel Configuration kernelconfiguration SLIP &os;'s default kernel (GENERIC) comes with SLIP (&man.sl.4;) support; in the case of a custom kernel, you have to add the following line to your kernel configuration file: device sl Under &os; 4.X, use instead the following line: pseudo-device sl 2 The number at the end of the line is the maximum number of SLIP connections that may be operating simultaneously. Since &os; 5.0, the &man.sl.4; driver is auto-cloning. By default, your &os; machine will not forward packets. If you want your FreeBSD SLIP Server to act as a router, you will have to edit the /etc/rc.conf file and change the setting of the gateway_enable variable to YES. You will then need to reboot for the new settings to take effect. Please refer to on Configuring the FreeBSD Kernel for help in reconfiguring your kernel. Sliplogin Configuration As mentioned earlier, there are three files in the /etc/sliphome directory that are part of the configuration for /usr/sbin/sliplogin (see &man.sliplogin.8; for the actual manual page for sliplogin): slip.hosts, which defines the SLIP users and their associated IP addresses; slip.login, which usually just configures the SLIP interface; and (optionally) slip.logout, which undoes slip.login's effects when the serial connection is terminated. <filename>slip.hosts</filename> Configuration /etc/sliphome/slip.hosts contains lines which have at least four items separated by whitespace: SLIP user's login ID Local address (local to the SLIP server) of the SLIP link Remote address of the SLIP link Network mask The local and remote addresses may be host names (resolved to IP addresses by /etc/hosts or by the domain name service, depending on your specifications in the file /etc/nsswitch.conf, or in /etc/host.conf if you use FreeBSD 4.X), and the network mask may be a name that can be resolved by a lookup into /etc/networks.
On a sample system, /etc/sliphome/slip.hosts looks like this: # # login local-addr remote-addr mask opt1 opt2 # (normal,compress,noicmp) # Shelmerg dc-slip sl-helmerg 0xfffffc00 autocomp At the end of the line is one or more of the options: normal — no header compression compress — compress headers autocomp — compress headers if the remote end allows it noicmp — disable ICMP packets (so any ping packets will be dropped instead of using up your bandwidth) SLIP TCP/IP networking Your choice of local and remote addresses for your SLIP links depends on whether you are going to dedicate a TCP/IP subnet or if you are going to use proxy ARP on your SLIP server (it is not true proxy ARP, but that is the terminology used in this section to describe it). If you are not sure which method to select or how to assign IP addresses, please refer to the TCP/IP books referenced in the SLIP Prerequisites () and/or consult your IP network manager. If you are going to use a separate subnet for your SLIP clients, you will need to allocate the subnet number out of your assigned IP network number and assign each of your SLIP clients' IP numbers out of that subnet. Then, you will probably need to configure a static route to the SLIP subnet via your SLIP server on your nearest IP router. Ethernet Otherwise, if you will use the proxy ARP method, you will need to assign your SLIP clients' IP addresses out of your SLIP server's Ethernet subnet, and you will also need to adjust your /etc/sliphome/slip.login and /etc/sliphome/slip.logout scripts to use &man.arp.8; to manage the proxy-ARP entries in the SLIP server's ARP table. <filename>slip.login</filename> Configuration The typical /etc/sliphome/slip.login file looks like this: #!/bin/sh - # # @(#)slip.login 5.1 (Berkeley) 7/1/90 # # generic login file for a slip line. sliplogin invokes this with # the parameters: # 1 2 3 4 5 6 7-n # slipunit ttyspeed loginname local-addr remote-addr mask opt-args # /sbin/ifconfig sl$1 inet $4 $5 netmask $6 This slip.login file merely runs ifconfig for the appropriate SLIP interface with the local and remote addresses and network mask of the SLIP interface. If you have decided to use the proxy ARP method (instead of using a separate subnet for your SLIP clients), your /etc/sliphome/slip.login file will need to look something like this: #!/bin/sh - # # @(#)slip.login 5.1 (Berkeley) 7/1/90 # # generic login file for a slip line. sliplogin invokes this with # the parameters: # 1 2 3 4 5 6 7-n # slipunit ttyspeed loginname local-addr remote-addr mask opt-args # /sbin/ifconfig sl$1 inet $4 $5 netmask $6 # Answer ARP requests for the SLIP client with our Ethernet addr /usr/sbin/arp -s $5 00:11:22:33:44:55 pub The additional line in this slip.login, arp -s $5 00:11:22:33:44:55 pub, creates an ARP entry in the SLIP server's ARP table. This ARP entry causes the SLIP server to respond with the SLIP server's Ethernet MAC address whenever another IP node on the Ethernet asks to speak to the SLIP client's IP address. EthernetMAC address When using the example above, be sure to replace the Ethernet MAC address (00:11:22:33:44:55) with the MAC address of your system's Ethernet card, or your proxy ARP will definitely not work!
You can discover your SLIP server's Ethernet MAC address by looking at the results of running netstat -i; the second line of the output should look something like: ed0 1500 <Link>0.2.c1.28.5f.4a 191923 0 129457 0 116 This indicates that this particular system's Ethernet MAC address is 00:02:c1:28:5f:4a — the periods in the Ethernet MAC address given by netstat -i must be changed to colons and leading zeros should be added to each single-digit hexadecimal number to convert the address into the form that &man.arp.8; desires; see the manual page on &man.arp.8; for complete information on usage. When you create /etc/sliphome/slip.login and /etc/sliphome/slip.logout, the execute bit (i.e., chmod 755 /etc/sliphome/slip.login /etc/sliphome/slip.logout) must be set, or sliplogin will be unable to execute them. <filename>slip.logout</filename> Configuration /etc/sliphome/slip.logout is not strictly needed (unless you are implementing proxy ARP), but if you decide to create it, this is an example of a basic slip.logout script: #!/bin/sh - # # slip.logout # # logout file for a slip line. sliplogin invokes this with # the parameters: # 1 2 3 4 5 6 7-n # slipunit ttyspeed loginname local-addr remote-addr mask opt-args # /sbin/ifconfig sl$1 down If you are using proxy ARP, you will want to have /etc/sliphome/slip.logout remove the ARP entry for the SLIP client: #!/bin/sh - # # @(#)slip.logout # # logout file for a slip line. sliplogin invokes this with # the parameters: # 1 2 3 4 5 6 7-n # slipunit ttyspeed loginname local-addr remote-addr mask opt-args # /sbin/ifconfig sl$1 down # Quit answering ARP requests for the SLIP client /usr/sbin/arp -d $5 The arp -d $5 removes the ARP entry that the proxy ARP slip.login added when the SLIP client logged in. It bears repeating: make sure /etc/sliphome/slip.logout has the execute bit set after you create it (i.e., chmod 755 /etc/sliphome/slip.logout). Routing Considerations SLIP routing If you are not using the proxy ARP method for routing packets between your SLIP clients and the rest of your network (and perhaps the Internet), you will probably have to add static routes to your closest default router(s) to route your SLIP clients' subnet via your SLIP server. Static Routes static routes Adding static routes to your nearest default routers can be troublesome (or impossible if you do not have authority to do so...). If you have a multiple-router network in your organization, some routers, such as those made by Cisco and Proteon, may not only need to be configured with the static route to the SLIP subnet, but also need to be told which static routes to tell other routers about, so some expertise and troubleshooting/tweaking may be necessary to get static-route-based routing to work. Running <application>&gated;</application> &gated; &gated; is proprietary software now and will not be available as source code to the public anymore (more info on the &gated; website). This section only exists to ensure backwards compatibility for those that are still using an older version. An alternative to the headaches of static routes is to install &gated; on your FreeBSD SLIP server and configure it to use the appropriate routing protocols (RIP/OSPF/BGP/EGP) to tell other routers about your SLIP subnet.
You will need to write a /etc/gated.conf file to configure your &gated;; here is a sample, similar to what the author used on a FreeBSD SLIP server: # # gated configuration file for dc.dsu.edu; for gated version 3.5alpha5 # Only broadcast RIP information for xxx.xxx.yy out the ed Ethernet interface # # # tracing options # traceoptions "/var/tmp/gated.output" replace size 100k files 2 general ; rip yes { interface sl noripout noripin ; interface ed ripin ripout version 1 ; traceoptions route ; } ; # # Turn on a bunch of tracing info for the interface to the kernel: kernel { traceoptions remnants request routes info interface ; } ; # # Propagate the route to xxx.xxx.yy out the Ethernet interface via RIP # export proto rip interface ed { proto direct { xxx.xxx.yy mask 255.255.252.0 metric 1; # SLIP connections } ; } ; # # Accept routes from RIP via ed Ethernet interfaces import proto rip interface ed { all ; } ; RIP The above sample gated.conf file broadcasts routing information regarding the SLIP subnet xxx.xxx.yy via RIP onto the Ethernet; if you are using a different Ethernet driver than the ed driver, you will need to change the references to the ed interface appropriately. This sample file also sets up tracing to /var/tmp/gated.output for debugging &gated;'s activity; you can certainly turn off the tracing options if &gated; works correctly for you. You will need to change the xxx.xxx.yy's into the network address of your own SLIP subnet (be sure to change the net mask in the proto direct clause as well). Once you have installed and configured &gated; on your system, you will need to tell the FreeBSD startup scripts to run &gated; in place of routed. The easiest way to accomplish this is to set the router and router_flags variables in /etc/rc.conf. Please see the manual page for &gated; for information on command-line parameters. diff --git a/en_US.ISO8859-1/books/handbook/security/chapter.sgml b/en_US.ISO8859-1/books/handbook/security/chapter.sgml index 58d2b80de9..2fc0cfe13b 100644 --- a/en_US.ISO8859-1/books/handbook/security/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/security/chapter.sgml @@ -1,5123 +1,5123 @@ Matthew Dillon Much of this chapter has been taken from the security(7) manual page by Security security Synopsis This chapter will provide a basic introduction to system security concepts, some general good rules of thumb, and some advanced topics under &os;. A lot of the topics covered here can be applied to system and Internet security in general as well. The Internet is no longer a friendly place in which everyone wants to be your kind neighbor. Securing your system is imperative to protect your data, intellectual property, time, and much more from the hands of hackers and the like. &os; provides an array of utilities and mechanisms to ensure the integrity and security of your system and network. After reading this chapter, you will know: Basic system security concepts, in respect to &os;. About the various crypt mechanisms available in &os;, such as DES and MD5. How to set up one-time password authentication. How to configure TCP Wrappers for use with inetd. How to set up KerberosIV on &os; releases prior to 5.0. How to set up Kerberos5 on post &os; 5.0 releases. How to configure IPsec and create a VPN between &os;/&windows; machines. How to configure and use OpenSSH, &os;'s SSH implementation. What file system ACLs are and how to use them. How to use the Portaudit utility to audit third party software packages installed from the Ports Collection. 
How to make use of the &os; security advisory publications. Have an idea of what Process Accounting is and how to enable it on &os;. Before reading this chapter, you should: Understand basic &os; and Internet concepts. Additional security topics are covered throughout this book. For example, Mandatory Access Control is discussed in and Internet Firewalls are discussed in . Introduction Security is a function that begins and ends with the system administrator. While all BSD &unix; multi-user systems have some inherent security, the job of building and maintaining additional security mechanisms to keep those users honest is probably one of the single largest undertakings of the sysadmin. Machines are only as secure as you make them, and security concerns are ever competing with the human necessity for convenience. &unix; systems, in general, are capable of running a huge number of simultaneous processes and many of these processes operate as servers — meaning that external entities can connect and talk to them. As yesterday's mini-computers and mainframes become today's desktops, and as computers become networked and internetworked, security becomes an even bigger issue. Security is best implemented through a layered onion approach. In a nutshell, what you want to do is to create as many layers of security as are convenient and then carefully monitor the system for intrusions. You do not want to overbuild your security or you will interfere with the detection side, and detection is one of the single most important aspects of any security mechanism. For example, it makes little sense to set the schg flag (see &man.chflags.1;) on every system binary because while this may temporarily protect the binaries, it prevents an attacker who has broken in from making an easily detectable change that may result in your security mechanisms not detecting the attacker at all. System security also pertains to dealing with various forms of attack, including attacks that attempt to crash, or otherwise make a system unusable, but do not attempt to compromise the root account (break root). Security concerns can be split up into several categories: Denial of service attacks. User account compromises. Root compromise through accessible servers. Root compromise via user accounts. Backdoor creation. DoS attacks Denial of Service (DoS) security DoS attacks Denial of Service (DoS) Denial of Service (DoS) A denial of service attack is an action that deprives the machine of needed resources. Typically, DoS attacks are brute-force mechanisms that attempt to crash or otherwise make a machine unusable by overwhelming its servers or network stack. Some DoS attacks try to take advantage of bugs in the networking stack to crash a machine with a single packet. The latter can only be fixed by applying a bug fix to the kernel. Attacks on servers can often be fixed by properly specifying options to limit the load the servers incur on the system under adverse conditions. Brute-force network attacks are harder to deal with. A spoofed-packet attack, for example, is nearly impossible to stop, short of cutting your system off from the Internet. It may not be able to take your machine down, but it can saturate your Internet connection. security account compromises A user account compromise is even more common than a DoS attack. Many sysadmins still run standard telnetd, rlogind, rshd, and ftpd servers on their machines. These servers, by default, do not operate over encrypted connections.
The result is that if you have any moderate-sized user base, one or more of your users logging into your system from a remote location (which is the most common and convenient way to login to a system) will have his or her password sniffed. The attentive system admin will analyze his remote access logs looking for suspicious source addresses even for successful logins. One must always assume that once an attacker has access to a user account, the attacker can break root. However, the reality is that in a well secured and maintained system, access to a user account does not necessarily give the attacker access to root. The distinction is important because without access to root the attacker cannot generally hide his tracks and may, at best, be able to do nothing more than mess with the user's files, or crash the machine. User account compromises are very common because users tend not to take the precautions that sysadmins take. security backdoors System administrators must keep in mind that there are potentially many ways to break root on a machine. The attacker may know the root password, the attacker may find a bug in a root-run server and be able to break root over a network connection to that server, or the attacker may know of a bug in a suid-root program that allows the attacker to break root once he has broken into a user's account. If an attacker has found a way to break root on a machine, the attacker may not have a need to install a backdoor. Many of the root holes found and closed to date involve a considerable amount of work by the attacker to clean up after himself, so most attackers install backdoors. A backdoor provides the attacker with a way to easily regain root access to the system, but it also gives the smart system administrator a convenient way to detect the intrusion. Making it impossible for an attacker to install a backdoor may actually be detrimental to your security, because it will not close off the hole the attacker originally used to break in. Security remedies should always be implemented with a multi-layered onion peel approach and can be categorized as follows: Securing root and staff accounts. Securing root–run servers and suid/sgid binaries. Securing user accounts. Securing the password file. Securing the kernel core, raw devices, and file systems. Quick detection of inappropriate changes made to the system. Paranoia. The next section of this chapter will cover the above bullet items in greater depth. Securing &os; security securing &os; Command vs. Protocol Throughout this document, we will use bold text to refer to an application, and a monospaced font to refer to specific commands. Protocols will use a normal font. This typographical distinction is useful for instances such as ssh, since it is a protocol as well as a command. The sections that follow will cover the methods of securing your &os; system that were mentioned in the last section of this chapter. Securing the <username>root</username> Account and Staff Accounts su First off, do not bother securing staff accounts if you have not secured the root account. Most systems have a password assigned to the root account. The first thing you do is assume that the password is always compromised. This does not mean that you should remove the password. The password is almost always necessary for console access to the machine. What it does mean is that you should not make it possible to use the password outside of the console or possibly even with the &man.su.1; command.
For example, make sure that your ptys are specified as being insecure in the /etc/ttys file so that direct root logins via telnet or rlogin are disallowed. If using other login services such as sshd, make sure that direct root logins are disabled there as well. You can do this by editing your /etc/ssh/sshd_config file, and making sure that PermitRootLogin is set to NO. Consider every access method — services such as FTP often fall through the cracks. Direct root logins should only be allowed via the system console. wheel Of course, as a sysadmin you have to be able to get to root, so we open up a few holes. But we make sure these holes require additional password verification to operate. One way to make root accessible is to add appropriate staff accounts to the wheel group (in /etc/group). The staff members placed in the wheel group are allowed to su to root. You should never give staff members native wheel access by putting them in the wheel group in their password entry. Staff accounts should be placed in a staff group, and then added to the wheel group via the /etc/group file. Only those staff members who actually need to have root access should be placed in the wheel group. It is also possible, when using an authentication method such as Kerberos, to use Kerberos' .k5login file in the root account to allow a &man.ksu.1; to root without having to place anyone at all in the wheel group. This may be the better solution since the wheel mechanism still allows an intruder to break root if the intruder has gotten hold of your password file and can break into a staff account. While having the wheel mechanism is better than having nothing at all, it is not necessarily the safest option. An indirect way to secure staff accounts, and ultimately root access, is to use an alternative login access method and do what is known as starring out the encrypted password for the staff accounts. Using the &man.vipw.8; command, one can replace each instance of an encrypted password with a single * character. This command will update the /etc/master.passwd file and user/password database to disable password-authenticated logins. A staff account entry such as: foobar:R9DT/Fa1/LV9U:1000:1000::0:0:Foo Bar:/home/foobar:/usr/local/bin/tcsh should be changed to this: foobar:*:1000:1000::0:0:Foo Bar:/home/foobar:/usr/local/bin/tcsh This change will prevent normal logins from occurring, since the encrypted password will never match *. With this done, staff members must use another mechanism to authenticate themselves such as &man.kerberos.1; or &man.ssh.1; using a public/private key pair. When using something like Kerberos, one generally must secure the machines which run the Kerberos servers and your desktop workstation. When using a public/private key pair with ssh, one must generally secure the machine used to login from (typically one's workstation). An additional layer of protection can be added to the key pair by password protecting the key pair when creating it with &man.ssh-keygen.1;.
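A minimal sketch of what that looks like; the exact prompts vary between OpenSSH versions, and the key type and file name shown are only illustrative:

&prompt.user; ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/foobar/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:

A key pair protected in this way is of no immediate use to an attacker who copies the private key file but does not know the passphrase.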
Being able to star out the passwords for staff accounts also guarantees that staff members can only log in through secure access methods that you have set up. This forces all staff members to use secure, encrypted connections for all of their sessions, which closes an important hole used by many intruders: sniffing the network from an unrelated, less secure machine. The more indirect security mechanisms also assume that you are logging in from a more restrictive server to a less restrictive server. For example, if your main box is running all sorts of servers, your workstation should not be running any. In order for your workstation to be reasonably secure you should run as few servers as possible, up to and including no servers at all, and you should run a password-protected screen blanker. Of course, given physical access to a workstation an attacker can break any sort of security you put on it. This is definitely a problem that you should consider, but you should also consider the fact that the vast majority of break-ins occur remotely, over a network, from people who do not have physical access to your workstation or servers. KerberosIV Using something like Kerberos also gives you the ability to disable or change the password for a staff account in one place, and have it immediately affect all the machines on which the staff member may have an account. If a staff member's account gets compromised, the ability to instantly change his password on all machines should not be underrated. With discrete passwords, changing a password on N machines can be a mess. You can also impose re-passwording restrictions with Kerberos: not only can a Kerberos ticket be made to time out after a while, but the Kerberos system can require that the user choose a new password after a certain period of time (say, once a month). Securing Root-run Servers and SUID/SGID Binaries ntalk comsat finger sandboxes sshd telnetd rshd rlogind The prudent sysadmin only runs the servers he needs to, no more, no less. Be aware that third party servers are often the most bug-prone. For example, running an old version of imapd or popper is like giving a universal root ticket out to the entire world. Never run a server that you have not checked out carefully. Many servers do not need to be run as root. For example, the ntalk, comsat, and finger daemons can be run in special user sandboxes. A sandbox is not perfect, unless you go through a large amount of trouble, but the onion approach to security still stands: If someone is able to break in through a server running in a sandbox, they still have to break out of the sandbox. The more layers the attacker must break through, the lower the likelihood of his success. Root holes have historically been found in virtually every server ever run as root, including basic system servers. If you are running a machine through which people only log in via sshd and never log in via telnetd or rshd or rlogind, then turn off those services! &os; now defaults to running ntalkd, comsat, and finger in a sandbox. Another program which may be a candidate for running in a sandbox is &man.named.8;. /etc/defaults/rc.conf includes the arguments necessary to run named in a sandbox in a commented-out form. Depending on whether you are installing a new system or upgrading an existing system, the special user accounts used by these sandboxes may not be installed. The prudent sysadmin would research and implement sandboxes for servers whenever possible. sendmail There are a number of other servers that typically do not run in sandboxes: sendmail, popper, imapd, ftpd, and others. There are alternatives to some of these, but installing them may require more work than you are willing to perform (the convenience factor strikes again). You may have to run these servers as root and rely on other mechanisms to detect break-ins that might occur through them. The other big potential root holes in a system are the suid-root and sgid binaries installed on the system.
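A system-wide inventory of such binaries can be taken with &man.find.1;. This is a sketch; add other filesystems to the search as appropriate:

&prompt.root; find / -type f \( -perm -4000 -o -perm -2000 \) -ls

Reviewing the resulting list periodically makes new, unexpected entries stand out.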
Most of these binaries, such as rlogin, reside in /bin, /sbin, /usr/bin, or /usr/sbin. While nothing is 100% safe, the system-default suid and sgid binaries can be considered reasonably safe. Still, root holes are occasionally found in these binaries. A root hole was found in Xlib in 1998 that made xterm (which is typically suid) vulnerable. It is better to be safe than sorry and the prudent sysadmin will restrict suid binaries that only staff should run to a special group that only staff can access, and get rid of (chmod 000) any suid binaries that nobody uses. A server with no display generally does not need an xterm binary. Sgid binaries can be almost as dangerous. If an intruder can break an sgid-kmem binary, the intruder might be able to read /dev/kmem and thus read the encrypted password file, potentially compromising any passworded account. Alternatively an intruder who breaks group kmem can monitor keystrokes sent through ptys, including ptys used by users who login through secure methods. An intruder that breaks the tty group can write to almost any user's tty. If a user is running a terminal program or emulator with a keyboard-simulation feature, the intruder can potentially generate a data stream that causes the user's terminal to echo a command, which is then run as that user. Securing User Accounts User accounts are usually the most difficult to secure. While you can impose Draconian access restrictions on your staff and star out their passwords, you may not be able to do so with any general user accounts you might have. If you do have sufficient control, then you may win out and be able to secure the user accounts properly. If not, you simply have to be more vigilant in your monitoring of those accounts. Use of ssh and Kerberos for user accounts is more problematic, due to the extra administration and technical support required, but still a very good solution compared to a crypted password file. Securing the Password File The only surefire way is to * out as many passwords as you can and use ssh or Kerberos for access to those accounts. Even though the encrypted password file (/etc/spwd.db) can only be read by root, it may be possible for an intruder to obtain read access to that file even if the attacker cannot obtain root-write access. Your security scripts should always check for and report changes to the password file (see the Checking file integrity section below). Securing the Kernel Core, Raw Devices, and File systems If an attacker breaks root he can do just about anything, but there are certain conveniences that you do not have to hand him. For example, most modern kernels have a packet sniffing device driver built in. Under &os; it is called the bpf device. An intruder will commonly attempt to run a packet sniffer on a compromised machine. You do not need to give the intruder that capability, and most systems have no need for the bpf device compiled in. sysctl But even if you turn off the bpf device, you still have /dev/mem and /dev/kmem to worry about. For that matter, the intruder can still write to raw disk devices. Also, there is another kernel feature called the module loader, &man.kldload.8;. An enterprising intruder can use a KLD module to install his own bpf device, or other sniffing device, on a running kernel. To avoid these problems you have to run the kernel at a higher secure level, at least securelevel 1. The securelevel can be set with a sysctl on the kern.securelevel variable.
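A minimal sketch follows; the rc.conf variable names assume a reasonably recent &os;, and note that a running system's securelevel can be raised but never lowered:

&prompt.root; sysctl kern.securelevel=1

To raise the securelevel automatically at every boot, the equivalent /etc/rc.conf entries are:

kern_securelevel_enable="YES"
kern_securelevel="1"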
Once you have set the securelevel to 1, write access to raw devices will be denied and special chflags flags, such as schg, will be enforced. You must also ensure that the schg flag is set on critical startup binaries, directories, and script files — everything that gets run up to the point where the securelevel is set. This might be overdoing it, and upgrading the system is much more difficult when you operate at a higher secure level. You may compromise and run the system at a higher secure level but not set the schg flag for every system file and directory under the sun. Another possibility is to simply mount / and /usr read-only. It should be noted that being too Draconian in what you attempt to protect may prevent the all-important detection of an intrusion. Checking File Integrity: Binaries, Configuration Files, Etc. When it comes right down to it, you can only protect your core system configuration and control files so much before the convenience factor rears its ugly head. For example, using chflags to set the schg bit on most of the files in / and /usr is probably counterproductive, because while it may protect the files, it also closes a detection window. The last layer of your security onion is perhaps the most important — detection. The rest of your security is pretty much useless (or, worse, presents you with a false sense of safety) if you cannot detect potential incursions. Half the job of the onion is to slow down the attacker, rather than stop him, in order to give the detection side of the equation a chance to catch him in the act. The best way to detect an incursion is to look for modified, missing, or unexpected files. The best way to look for modified files is from another (often centralized) limited-access system. Writing your security scripts on the extra-secure limited-access system makes them mostly invisible to potential attackers, and this is important. In order to take maximum advantage you generally have to give the limited-access box significant access to the other machines in the business, usually either by doing a read-only NFS export of the other machines to the limited-access box, or by setting up ssh key-pairs to allow the limited-access box to ssh to the other machines. Except for its network traffic, NFS is the least visible method — allowing you to monitor the file systems on each client box virtually undetected. If your limited-access server is connected to the client boxes through a switch, the NFS method is often the better choice. If your limited-access server is connected to the client boxes through a hub, or through several layers of routing, the NFS method may be too insecure (network-wise) and using ssh may be the better choice even with the audit-trail tracks that ssh lays. Once you give a limited-access box at least read access to the client systems it is supposed to monitor, you must write scripts to do the actual monitoring. Given an NFS mount, you can write scripts out of simple system utilities such as &man.find.1; and &man.md5.1;. It is best to physically md5 the client-box files at least once a day, and to test control files such as those found in /etc and /usr/local/etc even more often. When mismatches are found, relative to the base md5 information the limited-access machine knows is valid, it should scream at a sysadmin to go check it out. A good security script will also check for inappropriate suid binaries and for new or deleted files on system partitions such as / and /usr.
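A minimal sketch of such a script, assuming the client's filesystems are mounted read-only at the hypothetical /mnt/client and that a known-good baseline checksum list has already been recorded:

#!/bin/sh
# Compare MD5 checksums of a client's /etc, mounted read-only over NFS
# at /mnt/client, against a previously recorded baseline.
BASE=/var/db/integrity/client.md5
NOW=/tmp/client.now.md5

find /mnt/client/etc -type f -exec md5 -r {} \; | sort > "$NOW"
if ! cmp -s "$BASE" "$NOW"; then
    # Mail the differences to root so a human looks at them promptly.
    diff "$BASE" "$NOW" | mail -s "integrity mismatch on client" root
fi

The baseline itself would be generated the same way, once, while the client is known to be clean. A real script would cover more directories and also report new or deleted files.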
When using ssh rather than NFS, writing the security script is much more difficult. You essentially have to scp the scripts to the client box in order to run them, making them visible, and for safety you also need to scp the binaries (such as find) that those scripts use. The ssh client on the client box may already be compromised. All in all, using ssh may be necessary when running over insecure links, but it is also a lot harder to deal with. A good security script will also check for changes to user and staff members' access configuration files: .rhosts, .shosts, .ssh/authorized_keys and so forth… files that might fall outside the purview of the MD5 check. If you have a huge amount of user disk space, it may take too long to run through every file on those partitions. In this case, setting mount flags to disallow suid binaries and devices on those partitions is a good idea. The nodev and nosuid options (see &man.mount.8;) are what you want to look into. You should probably scan them anyway, at least once a week, since the object of this layer is to detect a break-in whether or not the break-in is effective. Process accounting (see &man.accton.8;) is a relatively low-overhead feature of the operating system which might help as a post-break-in evaluation mechanism. It is especially useful in tracking down how an intruder has actually broken into a system, assuming the file is still intact after the break-in occurs. Finally, security scripts should process the log files, and the logs themselves should be generated in as secure a manner as possible — remote syslog can be very useful. An intruder tries to cover his tracks, and log files are critical to the sysadmin trying to track down the time and method of the initial break-in. One way to keep a permanent record of the log files is to run the system console to a serial port and collect the information on a continuing basis through a secure machine monitoring the consoles. Paranoia A little paranoia never hurts. As a rule, a sysadmin can add any number of security features, as long as they do not affect convenience, and can add security features that do affect convenience with some added thought. Even more importantly, a security administrator should mix it up a bit — if you use recommendations such as those given by this document verbatim, you give away your methodologies to the prospective attacker who also has access to this document. Denial of Service Attacks Denial of Service (DoS) This section covers Denial of Service attacks. A DoS attack is typically a packet attack. While there is not much you can do about modern spoofed packet attacks that saturate your network, you can generally limit the damage by ensuring that the attacks cannot take down your servers. Limiting server forks. Limiting springboard attacks (ICMP response attacks, ping broadcast, etc.). Kernel Route Cache. A common DoS attack against a forking server attempts to cause the server to eat processes, file descriptors, and memory until the machine dies. inetd (see &man.inetd.8;) has several options to limit this sort of attack. It should be noted that while it is possible to prevent a machine from going down, it is not generally possible to prevent a service from being disrupted by the attack. Read the inetd manual page carefully and pay specific attention to the -c, -C, and -R options. Note that spoofed-IP attacks will circumvent the -C option to inetd, so typically a combination of options must be used.
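For example, such limits can be applied through inetd's startup flags in /etc/rc.conf. The values below are illustrative only; check &man.inetd.8; for the exact meaning of each flag on your release:

inetd_enable="YES"
# -c: maximum simultaneous invocations of each service
# -C: maximum connections per minute from any single IP address
# -R: maximum invocations of any single service per minute
inetd_flags="-c 50 -C 10 -R 256"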
Some standalone servers have self-fork-limitation parameters. Sendmail has its MaxDaemonChildren option, which tends to work much better than trying to use sendmail's load limiting options due to the load lag. You should specify a MaxDaemonChildren parameter, when you start sendmail, high enough to handle your expected load, but not so high that the computer cannot handle that number of sendmails without falling on its face. It is also prudent to run sendmail in queued mode (-ODeliveryMode=queued) and to run the daemon (sendmail -bd) separate from the queue-runs (sendmail -q15m). If you still want real-time delivery you can run the queue at a much lower interval, such as -q1m, but be sure to specify a reasonable MaxDaemonChildren option for that sendmail to prevent cascade failures. Syslogd can be attacked directly and it is strongly recommended that you use the -s option whenever possible, and the -a option otherwise. You should also be fairly careful with connect-back services such as TCP Wrapper's reverse-identd, which can be attacked directly. You generally do not want to use the reverse-ident feature of TCP Wrapper for this reason. It is a very good idea to protect internal services from external access by firewalling them off at your border routers. The idea here is to prevent saturation attacks from outside your LAN, not so much to protect internal services from network-based root compromise. Always configure an exclusive firewall, i.e., firewall everything except ports A, B, C, D, and M-Z. This way you can firewall off all of your low ports except for certain specific services such as named (if you are primary for a zone), ntalkd, sendmail, and other Internet-accessible services. If you try to configure the firewall the other way, as an inclusive or permissive firewall, there is a good chance that you will forget to close a couple of services, or that you will add a new internal service and forget to update the firewall. You can still open up the high-numbered port range on the firewall, to allow permissive-like operation, without compromising your low ports. Also take note that &os; allows you to control the range of port numbers used for dynamic binding, via the various net.inet.ip.portrange sysctl's (sysctl -a | fgrep portrange), which can also ease the complexity of your firewall's configuration. For example, you might use a normal first/last range of 4000 to 5000, and a hiport range of 49152 to 65535, then block off everything under 4000 in your firewall (except for certain specific Internet-accessible ports, of course).
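A sketch of that arrangement, repeating the example ranges above (the same assignments can be placed in /etc/sysctl.conf to make them permanent):

&prompt.root; sysctl net.inet.ip.portrange.first=4000
&prompt.root; sysctl net.inet.ip.portrange.last=5000
&prompt.root; sysctl net.inet.ip.portrange.hifirst=49152
&prompt.root; sysctl net.inet.ip.portrange.hilast=65535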
Another common DoS attack is called a springboard attack — attacking a server in a manner that causes the server to generate responses which overload the server, the local network, or some other machine. The most common attack of this nature is the ICMP ping broadcast attack. The attacker spoofs ping packets sent to your LAN's broadcast address with the source IP address set to the actual machine they wish to attack. If your border routers are not configured to stomp on pings to broadcast addresses, your LAN winds up generating sufficient responses to the spoofed source address to saturate the victim, especially when the attacker uses the same trick on several dozen broadcast addresses over several dozen different networks at once. Broadcast attacks of over a hundred and twenty megabits have been measured. A second common springboard attack is against the ICMP error reporting system. By constructing packets that generate ICMP error responses, an attacker can saturate a server's incoming network and cause the server to saturate its outgoing network with ICMP responses. This type of attack can also crash the server by running it out of mbuf's, especially if the server cannot drain the ICMP responses it generates fast enough. &os; 4.X kernels have a kernel compile option called ICMP_BANDLIM which limits the effectiveness of these sorts of attacks. Later kernels use the sysctl variable net.inet.icmp.icmplim. The last major class of springboard attacks is related to certain internal inetd services such as the udp echo service. An attacker simply spoofs a UDP packet with the source address being server A's echo port, and the destination address being server B's echo port, where server A and B are both on your LAN. The two servers then bounce this one packet back and forth between each other. The attacker can overload both servers and their LANs simply by injecting a few packets in this manner. Similar problems exist with the internal chargen port. A competent sysadmin will turn off all of these inetd-internal test services. Spoofed packet attacks may also be used to overload the kernel route cache. Refer to the net.inet.ip.rtexpire, rtminexpire, and rtmaxcache sysctl parameters. A spoofed packet attack that uses a random source IP will cause the kernel to generate a temporary cached route in the route table, viewable with netstat -rna | fgrep W3. These routes typically time out in 1600 seconds or so. If the kernel detects that the cached route table has gotten too big it will dynamically reduce the rtexpire but will never decrease it to less than rtminexpire. There are two problems: The kernel does not react quickly enough when a lightly loaded server is suddenly attacked. The rtminexpire is not low enough for the kernel to survive a sustained attack. If your servers are connected to the Internet via a T3 or better, it may be prudent to manually override both rtexpire and rtminexpire via &man.sysctl.8;. Never set either parameter to zero (unless you want to crash the machine). Setting both parameters to 2 seconds should be sufficient to protect the route table from attack.
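That override is a pair of sysctl assignments, using the 2-second values suggested above:

&prompt.root; sysctl net.inet.ip.rtexpire=2
&prompt.root; sysctl net.inet.ip.rtminexpire=2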
Access Issues with Kerberos and SSH ssh KerberosIV There are a few issues with both Kerberos and ssh that need to be addressed if you intend to use them. Kerberos V is an excellent authentication protocol, but there are bugs in the kerberized telnet and rlogin applications that make them unsuitable for dealing with binary streams. Also, by default Kerberos does not encrypt a session unless you use the -x option. ssh encrypts everything by default. ssh works quite well in every respect except that it forwards encryption keys by default. What this means is that if you have a secure workstation holding keys that give you access to the rest of the system, and you ssh to an insecure machine, your keys are usable. The actual keys themselves are not exposed, but ssh installs a forwarding port for the duration of your login, and if an attacker has broken root on the insecure machine he can utilize that port to use your keys to gain access to any other machine that your keys unlock. We recommend that you use ssh in combination with Kerberos whenever possible for staff logins. ssh can be compiled with Kerberos support. This reduces your reliance on potentially exposed ssh keys while at the same time protecting passwords via Kerberos. ssh keys should only be used for automated tasks from secure machines (something that Kerberos is unsuited to do). We also recommend that you either turn off key-forwarding in the ssh configuration, or that you make use of the from=IP/DOMAIN option that ssh allows in its authorized_keys file to make the key only usable to entities logging in from specific machines. Bill Swingle Parts rewritten and updated by DES, MD5, and Crypt security crypt crypt DES MD5 Every user on a &unix; system has a password associated with their account. It seems obvious that these passwords need to be known only to the user and the actual operating system. In order to keep these passwords secret, they are encrypted with what is known as a one-way hash, that is, the hash can easily be computed from the password, but the password cannot feasibly be recovered from the hash. In other words, what we told you a moment ago was obvious is not even true: the operating system itself does not really know the password. It only knows the encrypted form of the password. The only way to get the plain-text password is by a brute force search of the space of possible passwords. Unfortunately the only secure way to encrypt passwords when &unix; came into being was based on DES, the Data Encryption Standard. This was not such a problem for users resident in the US, but since the source code for DES could not be exported outside the US, &os; had to find a way to both comply with US law and retain compatibility with all the other &unix; variants that still used DES. The solution was to divide up the encryption libraries so that US users could install the DES libraries and use DES but international users still had an encryption method that could be exported abroad. This is how &os; came to use MD5 as its default encryption method. MD5 is believed to be more secure than DES, so installing DES is offered primarily for compatibility reasons. Recognizing Your Crypt Mechanism Before &os; 4.4 libcrypt.a was a symbolic link pointing to the library which was used for encryption. &os; 4.4 changed libcrypt.a to provide a configurable password authentication hash library. Currently the library supports DES, MD5 and Blowfish hash functions. By default &os; uses MD5 to encrypt passwords. It is pretty easy to identify which encryption method &os; is set up to use. Examining the encrypted passwords in the /etc/master.passwd file is one way. Passwords encrypted with the MD5 hash are longer than those encrypted with the DES hash and also begin with the characters $1$. Passwords starting with $2a$ are encrypted with the Blowfish hash function. DES password strings do not have any particular identifying characteristics, but they are shorter than MD5 passwords, and are coded in a 64-character alphabet which does not include the $ character, so a relatively short string which does not begin with a dollar sign is very likely a DES password. The password format used for new passwords is controlled by the passwd_format login capability in /etc/login.conf, which takes values of des, md5 or blf. See the &man.login.conf.5; manual page for more information about login capabilities.
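For instance, switching new passwords to the Blowfish hash is a one-line change to the default login class. This is a sketch; the real default entry carries many other capabilities, which are omitted here:

default:\
	:passwd_format=blf:

After editing /etc/login.conf, rebuild the login capability database:

&prompt.root; cap_mkdb /etc/login.conf

Existing password hashes are not converted; each account picks up the new format the next time its password is changed.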
One-time Passwords one-time passwords security one-time passwords S/Key is a one-time password scheme based on a one-way hash function. &os; uses the MD4 hash for compatibility but other systems have used MD5 and DES-MAC. S/Key has been part of the &os; base system since version 1.1.5 and is also used on a growing number of other operating systems. S/Key is a registered trademark of Bell Communications Research, Inc. From version 5.0 of &os;, S/Key has been replaced with the functionally equivalent OPIE (One-time Passwords In Everything). OPIE uses the MD5 hash by default. There are three different sorts of passwords which we will discuss below. The first is your usual &unix; style or Kerberos password; we will call this a &unix; password. The second sort is the one-time password which is generated by the S/Key key program or the OPIE &man.opiekey.1; program and accepted by the keyinit or &man.opiepasswd.1; programs and the login prompt; we will call this a one-time password. The final sort of password is the secret password which you give to the key/opiekey programs (and sometimes the keyinit/opiepasswd programs) which they use to generate one-time passwords; we will call it a secret password or just unqualified password. The secret password does not have anything to do with your &unix; password; they can be the same but this is not recommended. S/Key and OPIE secret passwords are not limited to 8 characters like old &unix; passwords (under &os; the standard login password may be up to 128 characters in length); they can be as long as you like. Passphrases of six or seven words are fairly common. For the most part, the S/Key or OPIE system operates completely independently of the &unix; password system. Besides the password, there are two other pieces of data that are important to S/Key and OPIE. One is what is known as the seed or key, consisting of two letters and five digits. The other is what is called the iteration count, a number between 1 and 100. S/Key creates the one-time password by concatenating the seed and the secret password, then applying the MD4/MD5 hash as many times as specified by the iteration count and turning the result into six short English words. These six English words are your one-time password. The authentication system (primarily PAM) keeps track of the last one-time password used, and the user is authenticated if the hash of the user-provided password is equal to the previous password. Because a one-way hash is used it is impossible to generate future one-time passwords if a successfully used password is captured; the iteration count is decremented after each successful login to keep the user and the login program in sync. When the iteration count gets down to 1, S/Key and OPIE must be reinitialized. There are three programs involved in each system which we will discuss below. The key and opiekey programs accept an iteration count, a seed, and a secret password, and generate a one-time password or a consecutive list of one-time passwords. The keyinit and opiepasswd programs are used to initialize S/Key and OPIE respectively, and to change passwords, iteration counts, or seeds; they take either a secret passphrase, or an iteration count, seed, and one-time password. The keyinfo and opieinfo programs examine the relevant credentials files (/etc/skeykeys or /etc/opiekeys) and print out the invoking user's current iteration count and seed. There are four different sorts of operations we will cover. The first is using keyinit or opiepasswd over a secure connection to set up one-time-passwords for the first time, or to change your password or seed. The second operation is using keyinit or opiepasswd over an insecure connection, in conjunction with key or opiekey over a secure connection, to do the same. The third is using key/opiekey to log in over an insecure connection.
The fourth is using key or opiekey to generate a number of keys which can be written down or printed out to carry with you when going to some location without secure connections to anywhere. Secure Connection Initialization To initialize S/Key for the first time, change your password, or change your seed while logged in over a secure connection (e.g. on the console of a machine or via ssh), use the keyinit command without any parameters while logged in as yourself: &prompt.user; keyinit Adding unfurl: Reminder - Only use this method if you are directly connected. If you are using telnet or rlogin exit with no password and use keyinit -s. Enter secret password: Again secret password: ID unfurl s/key is 99 to17757 DEFY CLUB PRO NASH LACE SOFT For OPIE, opiepasswd is used instead: &prompt.user; opiepasswd -c [grimreaper] ~ $ opiepasswd -f -c Adding unfurl: Only use this method from the console; NEVER from remote. If you are using telnet, xterm, or a dial-in, type ^C now or exit with no password. Then run opiepasswd without the -c parameter. Using MD5 to compute responses. Enter new secret pass phrase: Again new secret pass phrase: ID unfurl OTP key is 499 to4268 MOS MALL GOAT ARM AVID COED At the Enter new secret pass phrase: or Enter secret password: prompts, you should enter a password or phrase. Remember, this is not the password that you will use to log in with; it is used to generate your one-time login keys. The ID line gives the parameters of your particular instance: your login name, the iteration count, and seed. When logging in the system will remember these parameters and present them back to you so you do not have to remember them. The last line gives the particular one-time password which corresponds to those parameters and your secret password; if you were to re-login immediately, this one-time password is the one you would use. Insecure Connection Initialization To initialize or change your secret password over an insecure connection, you will need to already have a secure connection to some place where you can run key or opiekey; this might be in the form of a desk accessory on a &macintosh;, or a shell prompt on a machine you trust. You will also need to make up an iteration count (100 is probably a good value), and you may make up your own seed or use a randomly-generated one. Over on the insecure connection (to the machine you are initializing), use the keyinit -s command: &prompt.user; keyinit -s Updating unfurl: Old key: to17758 Reminder you need the 6 English words from the key command. Enter sequence count from 1 to 9999: 100 Enter new key [default to17759]: s/key 100 to 17759 s/key access password: s/key access password: CURE MIKE BANE HIM RACY GORE For OPIE, you need to use opiepasswd: &prompt.user; opiepasswd Updating unfurl: You need the response from an OTP generator. Old secret pass phrase: otp-md5 498 to4268 ext Response: GAME GAG WELT OUT DOWN CHAT New secret pass phrase: otp-md5 499 to4269 Response: LINE PAP MILK NELL BUOY TROY ID mark OTP key is 499 gr4269 LINE PAP MILK NELL BUOY TROY To accept the default seed (which the keyinit program confusingly calls a key), press Return. Then before entering an access password, move over to your secure connection or S/Key desk accessory, and give it the same parameters: &prompt.user; key 100 to17759 Reminder - Do not use this program while logged in via telnet or rlogin.
Enter secret password: <secret password> CURE MIKE BANE HIM RACY GORE Or for OPIE: &prompt.user; opiekey 498 to4268 Using the MD5 algorithm to compute response. Reminder: Don't use opiekey from telnet or dial-in sessions. Enter secret pass phrase: GAME GAG WELT OUT DOWN CHAT Now switch back over to the insecure connection, and copy the one-time password generated over to the relevant program. Generating a Single One-time Password Once you have initialized S/Key or OPIE, when you log in you will be presented with a prompt like this: &prompt.user; telnet example.com Trying 10.0.0.1... Connected to example.com Escape character is '^]'. FreeBSD/i386 (example.com) (ttypa) login: <username> s/key 97 fw13894 Password: Or for OPIE: &prompt.user; telnet example.com Trying 10.0.0.1... Connected to example.com Escape character is '^]'. FreeBSD/i386 (example.com) (ttypa) login: <username> otp-md5 498 gr4269 ext Password: As a side note, the S/Key and OPIE prompts have a useful feature (not shown here): if you press Return at the password prompt, the prompter will turn echo on, so you can see what you are typing. This can be extremely useful if you are attempting to type in a password by hand, such as from a printout. MS-DOS Windows MacOS At this point you need to generate your one-time password to answer this login prompt. This must be done on a trusted system that you can run key or opiekey on. (There are versions of these for DOS, &windows; and &macos; as well.) They need both the iteration count and the seed as command line options. You can cut-and-paste these right from the login prompt on the machine that you are logging in to. On the trusted system: &prompt.user; key 97 fw13894 Reminder - Do not use this program while logged in via telnet or rlogin. Enter secret password: WELD LIP ACTS ENDS ME HAAG For OPIE: &prompt.user; opiekey 498 to4268 Using the MD5 algorithm to compute response. Reminder: Don't use opiekey from telnet or dial-in sessions. Enter secret pass phrase: GAME GAG WELT OUT DOWN CHAT Now that you have your one-time password you can continue logging in: login: <username> s/key 97 fw13894 Password: <return to enable echo> s/key 97 fw13894 Password [echo on]: WELD LIP ACTS ENDS ME HAAG Last login: Tue Mar 21 11:56:41 from 10.0.0.2 ... Generating Multiple One-time Passwords Sometimes you have to go places where you do not have access to a trusted machine or secure connection. In this case, it is possible to use the key and opiekey commands to generate a number of one-time passwords beforehand to be printed out and taken with you. For example: &prompt.user; key -n 5 30 zz99999 Reminder - Do not use this program while logged in via telnet or rlogin. Enter secret password: <secret password> 26: SODA RUDE LEA LIND BUDD SILT 27: JILT SPY DUTY GLOW COWL ROT 28: THEM OW COLA RUNT BONG SCOT 29: COT MASH BARR BRIM NAN FLAG 30: CAN KNEE CAST NAME FOLK BILK Or for OPIE: &prompt.user; opiekey -n 5 30 zz99999 Using the MD5 algorithm to compute response. Reminder: Don't use opiekey from telnet or dial-in sessions. Enter secret pass phrase: <secret password> 26: JOAN BORE FOSS DES NAY QUIT 27: LATE BIAS SLAY FOLK MUCH TRIG 28: SALT TIN ANTI LOON NEAL USE 29: RIO ODIN GO BYE FURY TIC 30: GREW JIVE SAN GIRD BOIL PHI The -n 5 requests five keys in sequence, and the 30 specifies what the last iteration number should be. Note that these are printed out in reverse order of eventual use. If you are really paranoid, you might want to write the results down by hand; otherwise you can cut-and-paste into lpr.
Note that each line shows both the iteration count and the one-time password; you may still find it handy to scratch off passwords as you use them. Restricting Use of &unix; Passwords S/Key can place restrictions on the use of &unix; passwords based on the host name, user name, terminal port, or IP address of a login session. These restrictions can be found in the configuration file /etc/skey.access. The &man.skey.access.5; manual page has more information on the complete format of the file and also details some security cautions to be aware of before depending on this file for security. If there is no /etc/skey.access file (this is the default on &os; 4.X systems), then all users will be allowed to use &unix; passwords. If the file exists, however, then all users will be required to use S/Key unless explicitly permitted to do otherwise by configuration statements in the skey.access file. In all cases, &unix; passwords are permitted on the console. Here is a sample skey.access configuration file which illustrates the three most common sorts of configuration statements: permit internet 192.168.0.0 255.255.0.0 permit user fnord permit port ttyd0 The first line (permit internet) allows users whose IP source address (which is vulnerable to spoofing) matches the specified value and mask to use &unix; passwords. This should not be considered a security mechanism, but rather, a means to remind authorized users that they are using an insecure network and need to use S/Key for authentication. The second line (permit user) allows the specified username, in this case fnord, to use &unix; passwords at any time. Generally speaking, this should only be used for people who are either unable to use the key program, like those with dumb terminals, or those who are ineducable. The third line (permit port) allows all users logging in on the specified terminal line to use &unix; passwords; this would be used for dial-ups. OPIE can restrict the use of &unix; passwords based on the IP address of a login session just like S/Key does. The relevant file is /etc/opieaccess, which is present by default on &os; 5.0 and newer systems. Please check &man.opieaccess.5; for more information on this file and which security considerations you should be aware of when using it. Here is a sample opieaccess file: permit 192.168.0.0 255.255.0.0 This line allows users whose IP source address (which is vulnerable to spoofing) matches the specified value and mask to use &unix; passwords at any time. If no rules in opieaccess are matched, the default is to deny non-OPIE logins. Tom Rhodes Written by: TCP Wrappers TCP Wrappers Anyone familiar with &man.inetd.8; has probably heard of TCP Wrappers at some point. But few individuals seem to fully comprehend its usefulness in a network environment. It seems that everyone wants to install a firewall to handle network connections. While a firewall has a wide variety of uses, there are some things that a firewall cannot handle, such as sending text back to the connection originator. The TCP Wrappers software does this and much more. In the next few sections many of the TCP Wrappers features will be discussed, and, when applicable, example configuration lines will be provided. The TCP Wrappers software extends the abilities of inetd to provide support for every server daemon under its control. Using this method it is possible to provide logging support, return messages to connections, permit a daemon to only accept internal connections, etc.
While some of these features can be provided by a firewall, TCP Wrappers adds not only an extra layer of protection but also a degree of control beyond what a firewall can provide. The added functionality of TCP Wrappers should not be considered a replacement for a good firewall. TCP Wrappers can, however, be used in conjunction with a firewall or other security enhancements, and it serves nicely as an extra layer of protection for the system. Since this is an extension to the configuration of inetd, the reader is expected to have read the inetd configuration section. While programs run by &man.inetd.8; are not exactly daemons, they have traditionally been called daemons. This is the term we will use in this section too. Initial Configuration The only requirement of using TCP Wrappers in &os; is to ensure the inetd server is started from rc.conf with the -Ww option; this is the default setting. Of course, proper configuration of /etc/hosts.allow is also expected; &man.syslogd.8; will record messages in the system logs when it is not. Unlike other implementations of TCP Wrappers, the use of hosts.deny has been deprecated. All configuration options should be placed in /etc/hosts.allow. In the simplest configuration, daemon connection policies are set to either be permitted or blocked depending on the options in /etc/hosts.allow. The default configuration in &os; is to allow a connection to every daemon started with inetd. Changing this will be discussed only after the basic configuration is covered. Basic configuration usually takes the form of daemon : address : action, where daemon is the name of the daemon which inetd started. The address can be a valid hostname, an IP address or an IPv6 address enclosed in brackets ([ ]). The action field can be either allow or deny to grant or deny access appropriately. Keep in mind that configuration works off a first rule match semantic, meaning that the configuration file is scanned in ascending order for a matching rule. When a match is found the rule is applied and the search process will halt. Several other options exist but they will be explained in a later section. A simple configuration line may easily be constructed from that information alone. For example, to allow POP3 connections via the mail/qpopper daemon, the following lines should be appended to hosts.allow: # This line is required for POP3 connections: qpopper : ALL : allow After adding this line, inetd will need to be restarted. This can be accomplished with the &man.kill.1; command, or by running /etc/rc.d/inetd with the restart parameter. Advanced Configuration TCP Wrappers has advanced options too; they will allow for more control over the way connections are handled. In some cases it may be a good idea to return a comment to certain hosts or daemon connections. In other cases, perhaps a log entry should be recorded or an email sent to the administrator. Other situations may require the use of a service for local connections only. This is all possible through the use of configuration options known as wildcards, expansion characters and external command execution. The next two sections are written to cover these situations. External Commands Suppose that a situation occurs where a connection should be denied yet a reason should be sent to the individual who attempted to establish that connection. How could it be done? That action can be made possible by using the twist option. When a connection attempt is made, twist will be called to execute a shell command or script.
An example already exists in the hosts.allow file: # The rest of the daemons are protected. ALL : ALL \ : severity auth.info \ : twist /bin/echo "You are not welcome to use %d from %h." This example shows that the message You are not welcome to use daemon from hostname. will be returned for any daemon not previously configured in the access file. This is extremely useful for sending a reply back to the connection initiator right after the established connection is dropped. Note that any message returned must be wrapped in quote " characters; there are no exceptions to this rule. It may be possible to launch a denial of service attack on the server if an attacker, or group of attackers, could flood these daemons with connection requests. Another possibility is to use the spawn option in these cases. Like twist, spawn implicitly denies the connection and may be used to run external shell commands or scripts. Unlike twist, spawn will not send a reply back to the individual who established the connection. For an example, consider the following configuration line: # We do not allow connections from example.com: ALL : .example.com \ : spawn (/bin/echo %a from %h attempted to access %d >> \ /var/log/connections.log) \ : deny This will deny all connection attempts from the *.example.com domain, simultaneously logging the hostname, IP address, and the daemon which they attempted to access to the /var/log/connections.log file. Aside from the already explained substitution characters above, e.g. %a, a few others exist. See the &man.hosts.access.5; manual page for the complete list. Wildcard Options Thus far the ALL example has been used continuously throughout the examples. Other options exist which could extend the functionality a bit further. For instance, ALL may be used to match every instance of either a daemon, domain or an IP address. Another wildcard available is PARANOID which may be used to match any host which provides an IP address that may be forged. In other words, paranoid may be used to define an action to be taken whenever a connection is made from an IP address that differs from its hostname. The following example may shed some more light on this discussion: # Block possibly spoofed requests to sendmail: sendmail : PARANOID : deny In that example all connection requests to sendmail which have an IP address that varies from its hostname will be denied. Using the PARANOID wildcard may severely cripple servers if the client or server has a broken DNS setup. Administrator discretion is advised. To learn more about wildcards and their associated functionality, see the &man.hosts.access.5; manual page. Before any of the specific configuration lines above will work, the first configuration line should be commented out in hosts.allow. This was noted at the beginning of this section. Mark Murray Contributed by Mark Dapoz Based on a contribution by <application>KerberosIV</application> Kerberos is a network add-on system/protocol that allows users to authenticate themselves through the services of a secure server. Services such as remote login, remote copy, secure inter-system file copying and other high-risk tasks are made considerably safer and more controllable. The following instructions can be used as a guide on how to set up Kerberos as distributed for &os;. However, you should refer to the relevant manual pages for a complete description. Installing <application>KerberosIV</application> MIT KerberosIV installing Kerberos is an optional component of &os;.
The easiest way to install this software is by selecting the krb4 or krb5 distribution in sysinstall during the initial installation of &os;. This will install the eBones (KerberosIV) or Heimdal (Kerberos5) implementation of Kerberos. These implementations are included because they are developed outside the USA/Canada and were thus available to system owners outside those countries during the era of restrictive export controls on cryptographic code from the USA. Alternatively, the MIT implementation of Kerberos is available from the Ports Collection as security/krb5. Creating the Initial Database This is done on the Kerberos server only. First make sure that you do not have any old Kerberos databases around. You should change to the directory /etc/kerberosIV and check that only the following files are present: &prompt.root; cd /etc/kerberosIV &prompt.root; ls README krb.conf krb.realms If any additional files (such as principal.* or master_key) exist, then use the kdb_destroy command to destroy the old Kerberos database, or if Kerberos is not running, simply delete the extra files. You should now edit the krb.conf and krb.realms files to define your Kerberos realm. In this case the realm will be EXAMPLE.COM and the server is grunt.example.com. We edit or create the krb.conf file: &prompt.root; cat krb.conf EXAMPLE.COM EXAMPLE.COM grunt.example.com admin server CS.BERKELEY.EDU okeeffe.berkeley.edu ATHENA.MIT.EDU kerberos.mit.edu ATHENA.MIT.EDU kerberos-1.mit.edu ATHENA.MIT.EDU kerberos-2.mit.edu ATHENA.MIT.EDU kerberos-3.mit.edu LCS.MIT.EDU kerberos.lcs.mit.edu TELECOM.MIT.EDU bitsy.mit.edu ARC.NASA.GOV trident.arc.nasa.gov In this case, the other realms do not need to be there. They are here as an example of how a machine may be made aware of multiple realms. You may wish to leave them out for simplicity. The first line names the realm in which this system works. The other lines contain realm/host entries. The first item on a line is a realm, and the second is a host in that realm that is acting as a key distribution center. The words admin server following a host's name mean that the host also provides an administrative database server. For further explanation of these terms, please consult the Kerberos manual pages. Now we have to add grunt.example.com to the EXAMPLE.COM realm and also add an entry to put all hosts in the .example.com domain in the EXAMPLE.COM realm. The krb.realms file would be updated as follows: &prompt.root; cat krb.realms grunt.example.com EXAMPLE.COM .example.com EXAMPLE.COM .berkeley.edu CS.BERKELEY.EDU .MIT.EDU ATHENA.MIT.EDU .mit.edu ATHENA.MIT.EDU Again, the other realms do not need to be there. They are here as an example of how a machine may be made aware of multiple realms. You may wish to remove them to simplify things. The first line puts the specific system into the named realm. The rest of the lines show how to default systems of a particular subdomain to a named realm. Now we are ready to create the database. This only needs to run on the Kerberos server (or Key Distribution Center). Issue the kdb_init command to do this: &prompt.root; kdb_init Realm name [default ATHENA.MIT.EDU ]: EXAMPLE.COM You will be prompted for the database Master Password. It is important that you NOT FORGET this password. Enter Kerberos master key: Now we have to save the key so that servers on the local machine can pick it up. Use the kstash command to do this: &prompt.root; kstash Enter Kerberos master key: Current Kerberos master key version is 1. Master key entered. BEWARE!
This saves the encrypted master password in /etc/kerberosIV/master_key. Making It All Run KerberosIV initial startup Two principals need to be added to the database for each system that will be secured with Kerberos. Their names are kpasswd and rcmd. These two principals are made for each system, with the instance being the name of the individual system. These daemons, kpasswd and rcmd, allow other systems to change Kerberos passwords and run commands like &man.rcp.1;, &man.rlogin.1; and &man.rsh.1;. Now let us add these entries: &prompt.root; kdb_edit Opening database... Enter Kerberos master key: Current Kerberos master key version is 1. Master key entered. BEWARE! Previous or default values are in [brackets] , enter return to leave the same, or new value. Principal name: passwd Instance: grunt <Not found>, Create [y] ? y Principal: passwd, Instance: grunt, kdc_key_ver: 1 New Password: <---- enter RANDOM here Verifying password New Password: <---- enter RANDOM here Random password [y] ? y Principal's new key version = 1 Expiration date (enter yyyy-mm-dd) [ 2000-01-01 ] ? Max ticket lifetime (*5 minutes) [ 255 ] ? Attributes [ 0 ] ? Edit O.K. Principal name: rcmd Instance: grunt <Not found>, Create [y] ? Principal: rcmd, Instance: grunt, kdc_key_ver: 1 New Password: <---- enter RANDOM here Verifying password New Password: <---- enter RANDOM here Random password [y] ? Principal's new key version = 1 Expiration date (enter yyyy-mm-dd) [ 2000-01-01 ] ? Max ticket lifetime (*5 minutes) [ 255 ] ? Attributes [ 0 ] ? Edit O.K. Principal name: <---- null entry here will cause an exit Creating the Server File We now have to extract all the instances which define the services on each machine. For this we use the ext_srvtab command. This will create a file which must be copied or moved by secure means to each Kerberos client's /etc/kerberosIV directory. This file must be present on each server and client, and is crucial to the operation of Kerberos. &prompt.root; ext_srvtab grunt Enter Kerberos master key: Current Kerberos master key version is 1. Master key entered. BEWARE! Generating 'grunt-new-srvtab'.... Now, this command only generates a temporary file which must be renamed to srvtab so that all the servers can pick it up. Use the &man.mv.1; command to move it into place on the original system: &prompt.root; mv grunt-new-srvtab srvtab If the file is for a client system, and the network is not deemed safe, then copy the client-new-srvtab to removable media and transport it by secure physical means. Be sure to rename it to srvtab in the client's /etc/kerberosIV directory, and make sure it is mode 600: &prompt.root; mv grumble-new-srvtab srvtab &prompt.root; chmod 600 srvtab Populating the Database We now have to add some user entries into the database. First let us create an entry for the user jane. Use the kdb_edit command to do this: &prompt.root; kdb_edit Opening database... Enter Kerberos master key: Current Kerberos master key version is 1. Master key entered. BEWARE! Previous or default values are in [brackets] , enter return to leave the same, or new value. Principal name: jane Instance: <Not found>, Create [y] ? y Principal: jane, Instance: , kdc_key_ver: 1 New Password: <---- enter a secure password here Verifying password New Password: <---- re-enter the password here Principal's new key version = 1 Expiration date (enter yyyy-mm-dd) [ 2000-01-01 ] ? Max ticket lifetime (*5 minutes) [ 255 ] ? Attributes [ 0 ] ? Edit O.K.
Principal name: <---- null entry here will cause an exit Testing It All Out First we have to start the Kerberos daemons. Note that if you have correctly edited your /etc/rc.conf then this will happen automatically when you reboot. This is only necessary on the Kerberos server. Kerberos clients will automatically get what they need from the /etc/kerberosIV directory. &prompt.root; kerberos & Kerberos server starting Sleep forever on error Log file is /var/log/kerberos.log Current Kerberos master key version is 1. Master key entered. BEWARE! Current Kerberos master key version is 1 Local realm: EXAMPLE.COM &prompt.root; kadmind -n & KADM Server KADM0.0A initializing Please do not use 'kill -9' to kill this job, use a regular kill instead Current Kerberos master key version is 1. Master key entered. BEWARE! Now we can try using the kinit command to get a ticket for the ID jane that we created above: &prompt.user; kinit jane MIT Project Athena (grunt.example.com) Kerberos Initialization for "jane" Password: Try listing the tokens using klist to see if we really have them: &prompt.user; klist Ticket file: /tmp/tkt245 Principal: jane@EXAMPLE.COM Issued Expires Principal Apr 30 11:23:22 Apr 30 19:23:22 krbtgt.EXAMPLE.COM@EXAMPLE.COM Now try changing the password using &man.passwd.1; to check if the kpasswd daemon can get authorization to the Kerberos database: &prompt.user; passwd realm EXAMPLE.COM Old password for jane: New Password for jane: Verifying password New Password for jane: Password changed. Adding <command>su</command> Privileges Kerberos allows us to give each user who needs root privileges their own separate &man.su.1; password. We could now add an ID which is authorized to &man.su.1; to root. This is controlled by having an instance of root associated with a principal. Using kdb_edit we can create the entry jane.root in the Kerberos database: &prompt.root; kdb_edit Opening database... Enter Kerberos master key: Current Kerberos master key version is 1. Master key entered. BEWARE! Previous or default values are in [brackets] , enter return to leave the same, or new value. Principal name: jane Instance: root <Not found>, Create [y] ? y Principal: jane, Instance: root, kdc_key_ver: 1 New Password: <---- enter a SECURE password here Verifying password New Password: <---- re-enter the password here Principal's new key version = 1 Expiration date (enter yyyy-mm-dd) [ 2000-01-01 ] ? Max ticket lifetime (*5 minutes) [ 255 ] ? 12 <--- Keep this short! Attributes [ 0 ] ? Edit O.K. Principal name: <---- null entry here will cause an exit Now try getting tokens for it to make sure it works: &prompt.root; kinit jane.root MIT Project Athena (grunt.example.com) Kerberos Initialization for "jane.root" Password: Now we need to add the user to root's .klogin file: &prompt.root; cat /root/.klogin jane.root@EXAMPLE.COM Now try doing the &man.su.1;: &prompt.user; su Password: and take a look at what tokens we have: &prompt.root; klist Ticket file: /tmp/tkt_root_245 Principal: jane.root@EXAMPLE.COM Issued Expires Principal May 2 20:43:12 May 3 04:43:12 krbtgt.EXAMPLE.COM@EXAMPLE.COM Using Other Commands In an earlier example, we created a principal called jane with an instance root. 
This was based on a user with the same name as the principal, and this is a Kerberos default: that a <principal>.<instance> of the form <username>.root will allow that <username> to &man.su.1; to root if the necessary entries are in the .klogin file in root's home directory: &prompt.root; cat /root/.klogin jane.root@EXAMPLE.COM Likewise, if a user has in their own home directory lines of the form: &prompt.user; cat ~/.klogin jane@EXAMPLE.COM jack@EXAMPLE.COM then anyone in the EXAMPLE.COM realm who has authenticated themselves as jane or jack (via kinit, see above) may access jane's account or files on this system (grunt) via &man.rlogin.1;, &man.rsh.1; or &man.rcp.1;. For example, jane now logs into another system using Kerberos: &prompt.user; kinit MIT Project Athena (grunt.example.com) Password: &prompt.user; rlogin grunt Last login: Mon May 1 21:14:47 from grumble Copyright (c) 1980, 1983, 1986, 1988, 1990, 1991, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD BUILT-19950429 (GR386) #0: Sat Apr 29 17:50:09 SAT 1995 Or jack logs into jane's account on the same machine (jane having set up the .klogin file as above, and the person in charge of Kerberos having set up principal jack with a null instance): &prompt.user; kinit &prompt.user; rlogin grunt -l jane MIT Project Athena (grunt.example.com) Password: Last login: Mon May 1 21:16:55 from grumble Copyright (c) 1980, 1983, 1986, 1988, 1990, 1991, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD BUILT-19950429 (GR386) #0: Sat Apr 29 17:50:09 SAT 1995 Tillman Hodgson Contributed by Mark Murray Based on a contribution by Kerberos5 Every &os; release beyond &os;-5.1 includes support only for Kerberos5. Hence Kerberos5 is the only version included, and its configuration is similar in many aspects to that of KerberosIV. The following information only applies to Kerberos5 in post &os;-5.0 releases. Users who wish to use the KerberosIV package may install the security/krb4 port. Kerberos is a network add-on system/protocol that allows users to authenticate themselves through the services of a secure server. Services such as remote login, remote copy, secure inter-system file copying and other high-risk tasks are made considerably safer and more controllable. Kerberos can be described as an identity-verifying proxy system. It can also be described as a trusted third-party authentication system. Kerberos provides only one function — the secure authentication of users on the network. It does not provide authorization functions (what users are allowed to do) or auditing functions (what those users did). After a client and server have used Kerberos to prove their identity, they can also encrypt all of their communications to assure privacy and data integrity as they go about their business. Therefore, it is highly recommended that Kerberos be used with other security methods which provide authorization and audit services. The following instructions can be used as a guide on how to set up Kerberos as distributed for &os;. However, you should refer to the relevant manual pages for a complete description. For purposes of demonstrating a Kerberos installation, the various name spaces will be handled as follows: The DNS domain (zone) will be example.org. The Kerberos realm will be EXAMPLE.ORG. Please use real domain names when setting up Kerberos even if you intend to run it internally.
This avoids DNS problems and assures inter-operation with other Kerberos realms. History Kerberos5 history Kerberos was created by MIT as a solution to network security problems. The Kerberos protocol uses strong cryptography so that a client can prove its identity to a server (and vice versa) across an insecure network connection. Kerberos is both the name of a network authentication protocol and an adjective to describe programs that implement the protocol (Kerberos telnet, for example). The current version of the protocol is version 5, described in RFC 1510. Several free implementations of this protocol are available, covering a wide range of operating systems. The Massachusetts Institute of Technology (MIT), where Kerberos was originally developed, continues to develop their Kerberos package. It is commonly used in the US as a cryptography product; as such, it has historically been affected by US export regulations. The MIT Kerberos is available as a port (security/krb5). Heimdal Kerberos is another version 5 implementation, and was explicitly developed outside of the US to avoid export regulations (and is thus often included in non-commercial &unix; variants). The Heimdal Kerberos distribution is available as a port (security/heimdal), and a minimal installation of it is included in the base &os; install. In order to reach the widest audience, these instructions assume the use of the Heimdal distribution included in &os;. Setting up a Heimdal KDC Kerberos5 Key Distribution Center The Key Distribution Center (KDC) is the centralized authentication service that Kerberos provides — it is the computer that issues Kerberos tickets. The KDC is considered trusted by all other computers in the Kerberos realm, and thus has heightened security concerns. Note that while running the Kerberos server requires very few computing resources, a dedicated machine acting only as a KDC is recommended for security reasons. To begin setting up a KDC, ensure that your /etc/rc.conf file contains the correct settings to act as a KDC (you may need to adjust paths to reflect your own system): kerberos5_server_enable="YES" kadmind5_server_enable="YES" kerberos_stash="YES" The kerberos_stash option is only available in &os; 4.X. Next we will set up your Kerberos config file, /etc/krb5.conf: [libdefaults] default_realm = EXAMPLE.ORG [realms] EXAMPLE.ORG = { kdc = kerberos.example.org admin_server = kerberos.example.org } [domain_realm] .example.org = EXAMPLE.ORG Note that this /etc/krb5.conf file implies that your KDC will have the fully-qualified hostname of kerberos.example.org. You will need to add a CNAME (alias) entry to your zone file to accomplish this if your KDC has a different hostname. For large networks with a properly configured BIND DNS server, the above example could be trimmed to: [libdefaults] default_realm = EXAMPLE.ORG With the following lines being appended to the example.org zonefile: _kerberos._udp IN SRV 01 00 88 kerberos.example.org. _kerberos._tcp IN SRV 01 00 88 kerberos.example.org. _kpasswd._udp IN SRV 01 00 464 kerberos.example.org. _kerberos-adm._tcp IN SRV 01 00 749 kerberos.example.org. _kerberos IN TXT EXAMPLE.ORG For clients to be able to find the Kerberos services, you must have either a fully configured /etc/krb5.conf or a minimally configured /etc/krb5.conf and a properly configured DNS server. Next we will create the Kerberos database. This database contains the keys of all principals encrypted with a master password.
You are not required to remember this password; it will be stored in a file (/var/heimdal/m-key). To create the master key, run kstash and enter a password. Once the master key has been created, you can initialize the database using the kadmin program with the -l option (standing for local). This option instructs kadmin to modify the database files directly rather than going through the kadmind network service. This handles the chicken-and-egg problem of trying to connect to the database before it is created. Once you have the kadmin prompt, use the init command to create your realm's initial database. Lastly, while still in kadmin, create your first principal using the add command. Stick to the default options for the principal for now; you can always change them later with the modify command. Note that you can use the ? command at any prompt to see the available options. A sample database creation session is shown below: &prompt.root; kstash Master key: xxxxxxxx Verifying password - Master key: xxxxxxxx &prompt.root; kadmin -l kadmin> init EXAMPLE.ORG Realm max ticket life [unlimited]: kadmin> add tillman Max ticket life [unlimited]: Max renewable life [unlimited]: Attributes []: Password: xxxxxxxx Verifying password - Password: xxxxxxxx Now it is time to start up the KDC services. Run /etc/rc.d/kerberos start and /etc/rc.d/kadmind start to bring up the services. Note that you will not have any kerberized daemons running at this point, but you should be able to confirm that the KDC is functioning by obtaining and listing a ticket for the principal (user) that you just created from the command-line of the KDC itself: &prompt.user; k5init tillman tillman@EXAMPLE.ORG's Password: &prompt.user; k5list Credentials cache: FILE:/tmp/krb5cc_500 Principal: tillman@EXAMPLE.ORG Issued Expires Principal Aug 27 15:37:58 Aug 28 01:37:58 krbtgt/EXAMPLE.ORG@EXAMPLE.ORG Kerberos enabling a server with Heimdal services Kerberos5 enabling services First, we need a copy of the Kerberos configuration file, /etc/krb5.conf. To do so, simply copy it over to the client computer from the KDC in a secure fashion (using network utilities, such as &man.scp.1;, or physically via a floppy disk). Next you need a /etc/krb5.keytab file. This is the major difference between a server providing Kerberos enabled daemons and a workstation — the server must have a keytab file. This file contains the server's host key, which allows it and the KDC to verify each other's identity. It must be transmitted to the server in a secure fashion, as the security of the server can be broken if the key is made public. This explicitly means that transferring it via a clear text channel, such as FTP, is a very bad idea. Typically, you transfer the keytab to the server using the kadmin program. This is handy because you also need to create the host principal (the KDC end of the krb5.keytab) using kadmin. Note that you must have already obtained a ticket and that this ticket must be allowed to use the kadmin interface in the kadmind.acl. See the section titled Remote administration in the Heimdal info pages (info heimdal) for details on designing access control lists. If you do not want to enable remote kadmin access, you can simply securely connect to the KDC (via local console, &man.ssh.1; or Kerberos &man.telnet.1;) and perform administration locally using kadmin -l. After installing the /etc/krb5.conf file, you can use kadmin from the Kerberos server.
The add --random-key command will let you add the server's host principal, and the ext command will allow you to extract the server's host principal to its own keytab. For example: &prompt.root; kadmin kadmin> add --random-key host/myserver.example.org Max ticket life [unlimited]: Max renewable life [unlimited]: Attributes []: kadmin> ext host/myserver.example.org kadmin> exit Note that the ext command (short for extract) stores the extracted key in /etc/krb5.keytab by default. If you do not have kadmind running on the KDC (possibly for security reasons) and thus do not have access to kadmin remotely, you can add the host principal (host/myserver.example.org) directly on the KDC and then extract it to a temporary file (to avoid over-writing the /etc/krb5.keytab on the KDC) using something like this: &prompt.root; kadmin kadmin> ext --keytab=/tmp/example.keytab host/myserver.example.org kadmin> exit You can then securely copy the keytab to the server computer (using scp or a floppy, for example). Be sure to specify a non-default keytab name to avoid over-writing the keytab on the KDC. At this point your server can communicate with the KDC (due to its krb5.conf file) and it can prove its own identity (due to the krb5.keytab file). It is now ready for you to enable some Kerberos services. For this example we will enable the telnet service by putting a line like this into your /etc/inetd.conf and then restarting the &man.inetd.8; service with /etc/rc.d/inetd restart: telnet stream tcp nowait root /usr/libexec/telnetd telnetd -a user The critical bit is that the -a (for authentication) type is set to user. Consult the &man.telnetd.8; manual page for more details. Kerberos enabling a client with Heimdal Kerberos5 configure clients Setting up a client computer is almost trivially easy. As far as Kerberos configuration goes, you only need the Kerberos configuration file, located at /etc/krb5.conf. Simply securely copy it over to the client computer from the KDC. Test your client computer by attempting to use kinit, klist, and kdestroy from the client to obtain, show, and then delete a ticket for the principal you created above. You should also be able to use Kerberos applications to connect to Kerberos enabled servers, though if that does not work but obtaining a ticket does, the problem is likely with the server and not with the client or the KDC. When testing an application like telnet, try using a packet sniffer (such as &man.tcpdump.1;) to confirm that your password is not sent in the clear. Try using telnet with the -x option, which encrypts the entire data stream (similar to ssh). The core Kerberos client applications (traditionally named kinit, klist, kdestroy, and kpasswd) are installed in the base &os; install. Note that &os; versions prior to 5.0 renamed them to k5init, k5list, k5destroy, k5passwd, and k5stash (though k5stash is typically only used once). Various non-core Kerberos client applications are also installed by default. This is where the minimal nature of the base Heimdal installation is felt: telnet is the only Kerberos enabled service. The Heimdal port adds some of the missing client applications: Kerberos enabled versions of ftp, rsh, rcp, rlogin, and a few other less common programs. The MIT port also contains a full suite of Kerberos client applications.
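A minimal round-trip test of a client might look like the session below; the ticket cache path and timestamps are illustrative, and the tillman principal is the one created earlier:

&prompt.user; kinit tillman
tillman@EXAMPLE.ORG's Password:
&prompt.user; klist
Credentials cache: FILE:/tmp/krb5cc_1001
        Principal: tillman@EXAMPLE.ORG
  Issued           Expires          Principal
Aug 27 15:37:58  Aug 28 01:37:58  krbtgt/EXAMPLE.ORG@EXAMPLE.ORG
&prompt.user; kdestroy

If kinit succeeds here but a kerberized service still fails, that again points at the server rather than the client or the KDC.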
User configuration files: .k5login and .k5users .k5login .k5users Users within a realm typically have their Kerberos principal (such as tillman@EXAMPLE.ORG) mapped to a local user account (such as a local account named tillman). Client applications such as telnet usually do not require a user name or a principal. Occasionally, however, you want to grant access to a local user account to someone who does not have a matching Kerberos principal. For example, tillman@EXAMPLE.ORG may need access to the local user account webdevelopers. Other principals may also need access to that local account. The .k5login and .k5users files, placed in a user's home directory, can be used in a manner similar to a powerful combination of .hosts and .rhosts to solve this problem. For example, if a .k5login with the following contents: tillman@example.org jdoe@example.org were to be placed into the home directory of the local user webdevelopers, then both principals listed would have access to that account without requiring a shared password. Reading the manual pages for these commands is recommended. Note that the ksu manual page covers .k5users. Kerberos Tips, Tricks, and Troubleshooting Kerberos5 troubleshooting When using either the Heimdal or MIT Kerberos ports, ensure that your PATH environment variable lists the Kerberos versions of the client applications before the system versions. Do all the computers in your realm have synchronized time settings? If not, authentication may fail. The section of this Handbook on NTP describes how to synchronize clocks. MIT and Heimdal inter-operate nicely, except for kadmin, the protocol for which is not standardized. If you change your hostname, you also need to change your host/ principal and update your keytab. This also applies to special keytab entries like the www/ principal used for Apache's www/mod_auth_kerb. All hosts in your realm must be resolvable (both forwards and reverse) in DNS (or /etc/hosts as a minimum). CNAMEs will work, but the A and PTR records must be correct and in place. If they are not, the error message is not very intuitive: Kerberos5 refuses authentication because Read req failed: Key table entry not found. Some operating systems that may be acting as clients to your KDC do not set the permissions for ksu to be setuid root. This means that ksu does not work, which is a good security idea but annoying. This is not a KDC error. With MIT Kerberos, if you want to allow a principal to have a ticket life longer than the default ten hours, you must use modify_principal in kadmin to change the maxlife of both the principal in question and the krbtgt principal. Then the principal can use the -l option with kinit to request a ticket with a longer lifetime. If you run a packet sniffer on your KDC to aid in troubleshooting and then run kinit from a workstation, you will notice that your TGT is sent immediately upon running kinit — even before you type your password! The explanation is that the Kerberos server freely transmits a TGT (Ticket Granting Ticket) to any unauthorized request; however, every TGT is encrypted in a key derived from the user's password. Therefore, when a user types their password, it is not being sent to the KDC; it is being used to decrypt the TGT that kinit already obtained. If the decryption process results in a valid ticket with a valid time stamp, the user has valid Kerberos credentials.
These credentials include a session key for establishing secure communications with the Kerberos server in the future, as well as the actual ticket-granting ticket, which is encrypted with the Kerberos server's own key. This second layer of encryption is unknown to the user, but it is what allows the Kerberos server to verify the authenticity of each TGT. If you want to use long ticket lifetimes (a week, for example) and you are using OpenSSH to connect to the machine where your ticket is stored, make sure that KerberosTicketCleanup is set to no in your sshd_config, or else your tickets will be deleted when you log out. Remember that host principals can have a longer ticket lifetime as well. If your user principal has a lifetime of a week but the host you are connecting to has a lifetime of nine hours, you will have an expired host principal in your cache and the ticket cache will not work as expected. When setting up a krb5.dict file to prevent specific bad passwords from being used (the manual page for kadmind covers this briefly), remember that it only applies to principals that have a password policy assigned to them. The krb5.dict file's format is simple: one string per line. Creating a symbolic link to /usr/share/dict/words might be useful. Differences with the MIT port The major difference between the MIT and Heimdal installs relates to the kadmin program, which has a different (but equivalent) set of commands and uses a different protocol. This has large implications if your KDC is MIT, as you will not be able to use the Heimdal kadmin program to administer your KDC remotely (or vice versa, for that matter). The client applications may also take slightly different command line options to accomplish the same tasks. Following the instructions on the MIT Kerberos web site is recommended. Be careful of path issues: the MIT port installs into /usr/local/ by default, and the normal system applications may be run instead of MIT if your PATH environment variable lists the system directories first. With the MIT security/krb5 port that is provided by &os;, be sure to read the /usr/local/share/doc/krb5/README.FreeBSD file installed by the port if you want to understand why logins via telnetd and klogind behave somewhat oddly. Most importantly, correcting the incorrect permissions on the credentials cache file requires that the login.krb5 binary be used for authentication, so that it can properly change ownership of the forwarded credentials. Mitigating limitations found in Kerberos Kerberos5 limitations and shortcomings Kerberos is an all-or-nothing approach Every service enabled on the network must be modified to work with Kerberos (or be otherwise secured against network attacks) or else the user's credentials could be stolen and re-used. An example of this would be Kerberos enabling all remote shells (via rsh and telnet, for example) but not converting the POP3 mail server, which sends passwords in plain text. Kerberos is intended for single-user workstations In a multi-user environment, Kerberos is less secure. This is because it stores the tickets in the /tmp directory, which is readable by all users. If a user is sharing a computer with several other people simultaneously (i.e. multi-user), it is possible that the user's tickets can be stolen (copied) by another user.
This can be overcome with the -c filename command-line option or (preferably) the KRB5CCNAME environment variable, but this is rarely done. In principle, storing the ticket in the user's home directory and using simple file permissions can mitigate this problem. The KDC is a single point of failure By design, the KDC must be as secure as the master password database contained on it. The KDC should have absolutely no other services running on it and should be physically secured. The danger is high because Kerberos stores all passwords encrypted with the same key (the master key), which in turn is stored as a file on the KDC. As a side note, a compromised master key is not quite as bad as one might normally fear. The master key is only used to encrypt the Kerberos database and as a seed for the random number generator. As long as access to your KDC is secure, an attacker cannot do much with the master key. Additionally, if the KDC is unavailable (perhaps due to a denial of service attack or network problems) the network services are unusable as authentication cannot be performed — a recipe for a denial-of-service attack. This can be alleviated with multiple KDCs (a single master and one or more slaves) and with careful implementation of secondary or fall-back authentication (PAM is excellent for this). Kerberos Shortcomings Kerberos allows users, hosts and services to authenticate between themselves. It does not have a mechanism to authenticate the KDC to the users, hosts or services. This means that a trojanned kinit (for example) could record all user names and passwords. Something like security/tripwire or other file system integrity checking tools can alleviate this. Resources and further information Kerberos5 external resources The Kerberos FAQ Designing an Authentication System: a Dialog in Four Scenes RFC 1510, The Kerberos Network Authentication Service (V5) MIT Kerberos home page Heimdal Kerberos home page Tom Rhodes Written by: OpenSSL security OpenSSL One feature that many users overlook is the OpenSSL toolkit included in &os;. OpenSSL provides an encryption transport layer on top of the normal communications layer, thus allowing it to be intertwined with many network applications and services. Some uses of OpenSSL may include encrypted authentication of mail clients, web based transactions such as credit card payments and more. Many ports, such as www/apache13-ssl and mail/sylpheed-claws, will offer compilation support for building with OpenSSL. In most cases the Ports Collection will attempt to build the security/openssl port unless the WITH_OPENSSL_BASE make variable is explicitly set to yes. The version of OpenSSL included in &os; supports the Secure Sockets Layer v2/v3 (SSLv2/SSLv3) and Transport Layer Security v1 (TLSv1) network security protocols and can be used as a general cryptographic library. While OpenSSL supports the IDEA algorithm, it is disabled by default due to United States patents. To use it, the license should be reviewed and, if the restrictions are acceptable, the MAKE_IDEA variable must be set in make.conf. One of the most common uses of OpenSSL is to provide certificates for use with software applications. These certificates ensure that the credentials of the company or individual are valid and not fraudulent. If the certificate in question has not been verified by one of the several Certificate Authorities, or CAs, a warning is usually produced.
A Certificate Authority is a company, such as VeriSign, which will sign certificates in order to validate credentials of individuals or companies. This process has a cost associated with it and is definitely not a requirement for using certificates; however, it can put some of the more paranoid users at ease. Generating Certificates OpenSSL certificate generation To generate a certificate, the following command is available: &prompt.root; openssl req -new -nodes -out req.pem -keyout cert.pem Generating a 1024 bit RSA private key ................++++++ .......................................++++++ writing new private key to 'cert.pem' ----- You are about to be asked to enter information that will be incorporated into your certificate request. What you are about to enter is what is called a Distinguished Name or a DN. There are quite a few fields but you can leave some blank For some fields there will be a default value, If you enter '.', the field will be left blank. ----- Country Name (2 letter code) [AU]:US State or Province Name (full name) [Some-State]:PA Locality Name (eg, city) []:Pittsburgh Organization Name (eg, company) [Internet Widgits Pty Ltd]:My Company Organizational Unit Name (eg, section) []:Systems Administrator Common Name (eg, YOUR name) []:localhost.example.org Email Address []:trhodes@FreeBSD.org Please enter the following 'extra' attributes to be sent with your certificate request A challenge password []:SOME PASSWORD An optional company name []:Another Name Notice the response directly after the Common Name prompt shows a domain name. This prompt requires a server name to be entered for verification purposes; placing anything but a domain name would yield a useless certificate. Other options, for instance expiry time or alternate encryption algorithms, are available. A complete list may be obtained by viewing the &man.openssl.1; manual page. Two files should now exist in the directory in which the aforementioned command was issued. The certificate request, req.pem, may be sent to a certificate authority who will validate the credentials that you entered, sign the request and return the certificate to you. The second file created will be named cert.pem; it is the private key for the certificate and should be protected at all costs. If this falls into the hands of others, it can be used to impersonate you (or your server). In cases where a signature from a CA is not required, a self signed certificate can be created. First, generate the key parameters (note that, the myRSA.key filename notwithstanding, the dsaparam command produces DSA parameters and key material): &prompt.root; openssl dsaparam -rand -genkey -out myRSA.key 1024 Next, generate the CA key from them: &prompt.root; openssl gendsa -des3 -out myca.key myRSA.key Use this key to create the certificate: &prompt.root; openssl req -new -x509 -days 365 -key myca.key -out new.crt Two new files should appear in the directory: the certificate authority's key file, myca.key, and the certificate itself, new.crt. These should be placed in a directory, preferably under /etc, which is readable only by root. Permissions of 0700 should be fine for this and they can be set with the chmod utility. Using Certificates, an Example So what can these files do? A good use would be to encrypt connections to the Sendmail MTA. This would eliminate the use of clear text authentication for users who send mail via the local MTA. This is not the best use in the world, as some MUAs will present the user with an error if they have not installed the certificate locally. Refer to the documentation included with the software for more information on certificate installation.
The following lines should be placed inside the local .mc file: dnl SSL Options define(`confCACERT_PATH',`/etc/certs')dnl define(`confCACERT',`/etc/certs/new.crt')dnl define(`confSERVER_CERT',`/etc/certs/new.crt')dnl define(`confSERVER_KEY',`/etc/certs/myca.key')dnl define(`confTLS_SRV_OPTIONS', `V')dnl Where /etc/certs/ is the directory to be used for storing the certificate and key files locally. The last few requirements are a rebuild of the local .cf file. This is easily achieved by typing make install within the /etc/mail directory. Follow that up with make restart, which should start the Sendmail daemon. If all went well, there will be no error messages in the /var/log/maillog file and Sendmail will show up in the process list. For a simple test, connect to the mail server using the &man.telnet.1; utility: &prompt.root; telnet example.com 25 Trying 192.0.34.166... Connected to example.com. Escape character is '^]'. 220 example.com ESMTP Sendmail 8.12.10/8.12.10; Tue, 31 Aug 2004 03:41:22 -0400 (EDT) ehlo example.com 250-example.com Hello example.com [192.0.34.166], pleased to meet you 250-ENHANCEDSTATUSCODES 250-PIPELINING 250-8BITMIME 250-SIZE 250-DSN 250-ETRN 250-AUTH LOGIN PLAIN 250-STARTTLS 250-DELIVERBY 250 HELP quit 221 2.0.0 example.com closing connection Connection closed by foreign host. If the STARTTLS line appears in the output, then everything is working correctly. Nik Clayton
nik@FreeBSD.org
Written by
IPsec VPN over IPsec Creating a VPN between two networks, separated by the Internet, using FreeBSD gateways. Hiten M. Pandya
hmp@FreeBSD.org
Written by
Understanding IPsec This section will guide you through the process of setting up IPsec and using it in an environment which consists of FreeBSD and &microsoft; &windows; 2000/XP machines, to make them communicate securely. In order to set up IPsec, it is necessary that you are familiar with the concepts of building a custom kernel (see the chapter on configuring the FreeBSD kernel). IPsec is a protocol which sits on top of the Internet Protocol (IP) layer. It allows two or more hosts to communicate in a secure manner (hence the name). The FreeBSD IPsec network stack is based on the KAME implementation, which has support for both protocol families, IPv4 and IPv6. FreeBSD 5.X contains a hardware accelerated IPsec stack, known as Fast IPsec, that was obtained from OpenBSD. It employs cryptographic hardware (whenever possible) via the &man.crypto.4; subsystem to optimize the performance of IPsec. This subsystem is new, and does not support all the features that are available in the KAME version of IPsec. However, in order to enable hardware-accelerated IPsec, the following kernel option has to be added to your kernel configuration file: kernel options FAST_IPSEC options FAST_IPSEC # new IPsec (cannot define w/ IPSEC) Note that it is not currently possible to use the Fast IPsec subsystem in conjunction with the KAME implementation of IPsec. Consult the &man.fast.ipsec.4; manual page for more information. IPsec ESP IPsec AH IPsec consists of two sub-protocols: Encapsulated Security Payload (ESP) protects the IP packet data from third party interference by encrypting the contents using symmetric cryptography algorithms (like Blowfish, 3DES). Authentication Header (AH) protects the IP packet header from third party interference and spoofing by computing a cryptographic checksum and hashing the IP packet header fields with a secure hashing function. This is then followed by an additional header that contains the hash, to allow the information in the packet to be authenticated. ESP and AH can either be used together or separately, depending on the environment. VPN virtual private network VPN IPsec can either be used to directly encrypt the traffic between two hosts (known as Transport Mode), or to build virtual tunnels between two subnets, which could be used for secure communication between two corporate networks (known as Tunnel Mode). The latter is more commonly known as a Virtual Private Network (VPN). The &man.ipsec.4; manual page should be consulted for detailed information on the IPsec subsystem in FreeBSD. To add IPsec support to your kernel, add the following options to your kernel configuration file: kernel options IPSEC kernel options IPSEC_ESP options IPSEC #IP security options IPSEC_ESP #IP security (crypto; define w/ IPSEC) kernel options IPSEC_DEBUG If IPsec debugging support is desired, the following kernel option should also be added: options IPSEC_DEBUG #debug for IP security
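After adding the options above, the kernel must be rebuilt and the machine rebooted. A minimal sketch of one common procedure, assuming your kernel configuration file is named MYKERNEL:

&prompt.root; cd /usr/src
&prompt.root; make buildkernel KERNCONF=MYKERNEL
&prompt.root; make installkernel KERNCONF=MYKERNEL
&prompt.root; shutdown -r now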
The Problem There is no standard for what constitutes a VPN. VPNs can be implemented using a number of different technologies, each of which has its own strengths and weaknesses. This section presents a scenario, and the strategies used for implementing a VPN for this scenario. The Scenario: Two networks, connected to the Internet, to behave as one VPN creating The premise is as follows:

You have at least two sites.
Both sites are using IP internally.
Both sites are connected to the Internet, through a gateway that is running FreeBSD.
The gateway on each network has at least one public IP address.
The internal addresses of the two networks can be public or private IP addresses; it does not matter. You can be running NAT on the gateway machine if necessary.
The internal IP addresses of the two networks do not collide. While I expect it is theoretically possible to use a combination of VPN technology and NAT to get this to work, I expect it to be a configuration nightmare.

If you find that you are trying to connect two networks, both of which, internally, use the same private IP address range (e.g. both of them use 192.168.1.x), then one of the networks will have to be renumbered. The network topology might look something like this:

Network #1 [ Internal Hosts ]   Private Net, 192.168.1.2-254
           [ Win9x/NT/2K ]
           [ UNIX ]
                 |
                 |
           .---[fxp1]---.       Private IP, 192.168.1.1
           |   FreeBSD  |
           `---[fxp0]---'       Public IP, A.B.C.D
                 |
                 |
           -=-=- Internet -=-=-
                 |
                 |
           .---[fxp0]---.       Public IP, W.X.Y.Z
           |   FreeBSD  |
           `---[fxp1]---'       Private IP, 192.168.2.1
                 |
                 |
Network #2 [ Internal Hosts ]
           [ Win9x/NT/2K ]      Private Net, 192.168.2.2-254
           [ UNIX ]

Notice the two public IP addresses. I will use the letters to refer to them in the rest of this article. Anywhere you see those letters in this article, replace them with your own public IP addresses. Note also that internally, the two gateway machines have .1 IP addresses, and that the two networks have different private IP addresses (192.168.1.x and 192.168.2.x respectively). All the machines on the private networks have been configured to use the .1 machine as their default gateway. The intention is that, from a network point of view, each network should view the machines on the other network as though they were directly attached to the same router -- albeit a slightly slow router with an occasional tendency to drop packets. This means that (for example), machine 192.168.1.20 should be able to run ping 192.168.2.34 and have it work, transparently. &windows; machines should be able to see the machines on the other network, browse file shares, and so on, in exactly the same way that they can browse machines on the local network. And the whole thing has to be secure. This means that traffic between the two networks has to be encrypted. Creating a VPN between these two networks is a multi-step process. The stages are as follows:

Create a virtual network link between the two networks, across the Internet. Test it, using tools like &man.ping.8;, to make sure it works.
Apply security policies to ensure that traffic between the two networks is transparently encrypted and decrypted as necessary. Test this, using tools like &man.tcpdump.1;, to ensure that traffic is encrypted.
Configure additional software on the FreeBSD gateways, to allow &windows; machines to see one another across the VPN.
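One detail the scenario takes for granted is that each FreeBSD gateway is actually forwarding packets between its interfaces. If a gateway has not already been set up as a router, IP forwarding can be enabled at boot with the standard rc.conf knob:

gateway_enable="YES"

which sets the net.inet.ip.forwarding sysctl to 1.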
Step 1: Creating and testing a virtual network link Suppose that you were logged in to the gateway machine on network #1 (with public IP address A.B.C.D, private IP address 192.168.1.1), and you ran ping 192.168.2.1, which is the private address of the machine with IP address W.X.Y.Z. What needs to happen in order for this to work? The gateway machine needs to know how to reach 192.168.2.1. In other words, it needs to have a route to 192.168.2.1. Private IP addresses, such as those in the 192.168.x range, are not supposed to appear on the Internet at large. Instead, each packet you send to 192.168.2.1 will need to be wrapped up inside another packet. This packet will need to appear to be from A.B.C.D, and it will have to be sent to W.X.Y.Z. This process is called encapsulation. Once this packet arrives at W.X.Y.Z it will need to be unencapsulated, and delivered to 192.168.2.1. You can think of this as requiring a tunnel between the two networks. The two tunnel mouths are the IP addresses A.B.C.D and W.X.Y.Z, and the tunnel must be told the private IP addresses that will be allowed to pass through it. The tunnel is used to transfer traffic with private IP addresses across the public Internet. This tunnel is created by using the generic interface, or gif, device on FreeBSD. As you can imagine, the gif interface on each gateway host must be configured with four IP addresses: two for the public IP addresses, and two for the private IP addresses. Support for the gif device must be compiled into the &os; kernel on both machines. You can do this by adding the line: device gif to the kernel configuration files on both machines, and then compile, install, and reboot as normal. Configuring the tunnel is a two-step process. First, the tunnel must be told what the outside (or public) IP addresses are, using &man.gifconfig.8;. Then the private IP addresses must be configured using &man.ifconfig.8;. In &os; 5.X, the functionality provided by the &man.gifconfig.8; utility has been merged into &man.ifconfig.8;. On the gateway machine on network #1 you would run the following two commands to configure the tunnel. gifconfig gif0 A.B.C.D W.X.Y.Z ifconfig gif0 inet 192.168.1.1 192.168.2.1 netmask 0xffffffff On the other gateway machine you run the same commands, but with the order of the IP addresses reversed. gifconfig gif0 W.X.Y.Z A.B.C.D ifconfig gif0 inet 192.168.2.1 192.168.1.1 netmask 0xffffffff You can then run: gifconfig gif0 to see the configuration. For example, on the network #1 gateway, you would see this: &prompt.root; gifconfig gif0 gif0: flags=8011<UP,POINTTOPOINT,MULTICAST> mtu 1280 inet 192.168.1.1 --> 192.168.2.1 netmask 0xffffffff physical address inet A.B.C.D --> W.X.Y.Z As you can see, a tunnel has been created between the physical addresses A.B.C.D and W.X.Y.Z, and the traffic allowed through the tunnel is that between 192.168.1.1 and 192.168.2.1. This will also have added an entry to the routing table on both machines, which you can examine with the command netstat -rn. This output is from the gateway host on network #1. &prompt.root; netstat -rn Routing tables Internet: Destination Gateway Flags Refs Use Netif Expire ... 192.168.2.1 192.168.1.1 UH 0 0 gif0 ... As the Flags value indicates, this is a host route, which means that each gateway knows how to reach the other gateway, but they do not know how to reach the rest of their respective networks. That problem will be fixed shortly. It is likely that you are running a firewall on both machines.
This will need to be circumvented for your VPN traffic. You might want to allow all traffic between both networks, or you might want to include firewall rules that protect both ends of the VPN from one another. It greatly simplifies testing if you configure the firewall to allow all traffic through the VPN. You can always tighten things up later. If you are using &man.ipfw.8; on the gateway machines, then a command like ipfw add 1 allow ip from any to any via gif0 will allow all traffic between the two end points of the VPN, without affecting your other firewall rules. Obviously you will need to run this command on both gateway hosts. This is sufficient to allow each gateway machine to ping the other. On 192.168.1.1, you should be able to run ping 192.168.2.1 and get a response, and you should be able to do the same thing on the other gateway machine. However, you will not be able to reach internal machines on either network yet. This is because of the routing -- although the gateway machines know how to reach one another, they do not know how to reach the network behind each one. To solve this problem, you must add a static route on each gateway machine. The command to do this on the first gateway would be: route add 192.168.2.0 192.168.2.1 netmask 0xffffff00 This says In order to reach the hosts on the network 192.168.2.0, send the packets to the host 192.168.2.1. You will need to run a similar command on the other gateway, but with the 192.168.1.x addresses instead. IP traffic from hosts on one network will now be able to reach hosts on the other network. That has now created two thirds of a VPN between the two networks, in as much as it is virtual and it is a network. It is not private yet. You can test this using &man.ping.8; and &man.tcpdump.1;. Log in to the gateway host and run tcpdump dst host 192.168.2.1 In another login session on the same host run ping 192.168.2.1 You will see output that looks something like this: 16:10:24.018080 192.168.1.1 > 192.168.2.1: icmp: echo request 16:10:24.018109 192.168.1.1 > 192.168.2.1: icmp: echo reply 16:10:25.018814 192.168.1.1 > 192.168.2.1: icmp: echo request 16:10:25.018847 192.168.1.1 > 192.168.2.1: icmp: echo reply 16:10:26.028896 192.168.1.1 > 192.168.2.1: icmp: echo request 16:10:26.029112 192.168.1.1 > 192.168.2.1: icmp: echo reply As you can see, the ICMP messages are going back and forth unencrypted. If you had used the -s parameter to &man.tcpdump.1; to grab more bytes of data from the packets, you would see more information. Obviously this is unacceptable. The next section will discuss securing the link between the two networks so that all traffic is automatically encrypted. Summary: Configure both kernels with pseudo-device gif. Edit /etc/rc.conf on gateway host #1 and add the following lines (replacing IP addresses as necessary). gifconfig_gif0="A.B.C.D W.X.Y.Z" ifconfig_gif0="inet 192.168.1.1 192.168.2.1 netmask 0xffffffff" static_routes="vpn" route_vpn="192.168.2.0 192.168.2.1 netmask 0xffffff00" Edit your firewall script (/etc/rc.firewall, or similar) on both hosts, and add ipfw add 1 allow ip from any to any via gif0 Make similar changes to /etc/rc.conf on gateway host #2, reversing the order of IP addresses. Step 2: Securing the link To secure the link we will be using IPsec. IPsec provides a mechanism for two hosts to agree on an encryption key, and to then use this key in order to encrypt data between the two hosts. There are two areas of configuration to be considered here.
There must be a mechanism for two hosts to agree on the encryption mechanism to use. Once two hosts have agreed on this mechanism, there is said to be a security association between them. There must be a mechanism for specifying which traffic should be encrypted. Obviously, you do not want to encrypt all your outgoing traffic -- you only want to encrypt the traffic that is part of the VPN. The rules that you put in place to determine what traffic will be encrypted are called security policies. Security associations and security policies are both maintained by the kernel, and can be modified by userland programs. However, before you can do this, you must configure the kernel to support IPsec and the Encapsulated Security Payload (ESP) protocol. This is done by configuring a kernel with: kernel options IPSEC options IPSEC options IPSEC_ESP and recompiling, reinstalling, and rebooting. As before, you will need to do this to the kernels on both of the gateway hosts. IKE You have two choices when it comes to setting up security associations. You can configure them by hand between two hosts, which entails choosing the encryption algorithm, encryption keys, and so forth, or you can use daemons that implement the Internet Key Exchange protocol (IKE) to do this for you. I recommend the latter. Apart from anything else, it is easier to set up. IPsec security policies setkey Editing and displaying security policies is carried out using &man.setkey.8;. By analogy, setkey is to the kernel's security policy tables as &man.route.8; is to the kernel's routing tables. setkey can also display the current security associations, and to continue the analogy further, is akin to netstat -r in that respect. There are a number of choices for daemons to manage security associations with FreeBSD. This article will describe how to use one of these, racoon — which is available from security/ipsec-tools in the &os; Ports collection. racoon The racoon software must be run on both gateway hosts. On each host it is configured with the IP address of the other end of the VPN, and a secret key (which you choose, and must be the same on both gateways). The two daemons contact one another and confirm that they are who they say they are (by using the secret key that you configured). They then generate a new secret key, and use this to encrypt the traffic over the VPN. They periodically change this secret, so that even if an attacker were to crack one of the keys (which is about as close to unfeasible as it gets) it will not do them much good -- by the time they have cracked the key the two daemons have chosen another one. The configuration file for racoon is stored in ${PREFIX}/etc/racoon. You should find a configuration file there, which should not need to be changed too much. The other component of racoon's configuration, which you will need to change, is the pre-shared key. The default racoon configuration expects to find this in the file ${PREFIX}/etc/racoon/psk.txt. It is important to note that the pre-shared key is not the key that will be used to encrypt your traffic across the VPN link; it is simply a token that allows the key management daemons to trust one another. psk.txt contains a line for each remote site you are dealing with. In this example, where there are two sites, each psk.txt file will contain one line (because each end of the VPN is only dealing with one other end).
On gateway host #1 this line should look like this: W.X.Y.Z secret That is, the public IP address of the remote end, whitespace, and a text string that provides the secret. Obviously, you should not use secret as your key -- the normal rules for choosing a password apply. On gateway host #2 the line would look like this: A.B.C.D secret That is, the public IP address of the remote end, and the same secret key. psk.txt must be mode 0600 (i.e., read/write for root only) before racoon will run. You must run racoon on both gateway machines. You will also need to add some firewall rules to allow the IKE traffic, which is carried over UDP to the ISAKMP (Internet Security Association and Key Management Protocol) port. Again, this should be fairly early in your firewall ruleset. ipfw add 1 allow udp from A.B.C.D to W.X.Y.Z isakmp ipfw add 1 allow udp from W.X.Y.Z to A.B.C.D isakmp Once racoon is running you can try pinging one gateway host from the other. The connection is still not encrypted, but racoon will then set up the security associations between the two hosts -- this might take a moment, and you may see this as a short delay before the ping commands start responding. Once the security association has been set up you can view it using &man.setkey.8;. Run setkey -D on either host to view the security association information. That's one half of the problem. The other half is setting your security policies. To create a sensible security policy, let's review what's been set up so far. This discussion holds for both ends of the link. Each IP packet that you send out has a header that contains data about the packet. The header includes the IP addresses of both the source and destination. As we already know, private IP addresses, such as the 192.168.x.y range, are not supposed to appear on the public Internet. Instead, they must first be encapsulated inside another packet. This packet must have the public source and destination IP addresses substituted for the private addresses. So if your outgoing packet started looking like this:

.----------------------.
| Src: 192.168.1.1     |
| Dst: 192.168.2.1     |
| <other header info>  |
+----------------------+
| <packet data>        |
`----------------------'

Then it will be encapsulated inside another packet, looking something like this:

.--------------------------.
| Src: A.B.C.D             |
| Dst: W.X.Y.Z             |
| <other header info>      |
+--------------------------+
| .----------------------. |
| | Src: 192.168.1.1     | |
| | Dst: 192.168.2.1     | |
| | <other header info>  | |
| +----------------------+ |
| | <packet data>        | |
| `----------------------' |
`--------------------------'

This encapsulation is carried out by the gif device. As you can see, the packet now has real IP addresses on the outside, and our original packet has been wrapped up as data inside the packet that will be put out on the Internet. Obviously, we want all traffic between the VPNs to be encrypted. You might try putting this into words, as: If a packet leaves from A.B.C.D, and it is destined for W.X.Y.Z, then encrypt it, using the necessary security associations. If a packet arrives from W.X.Y.Z, and it is destined for A.B.C.D, then decrypt it, using the necessary security associations. That's close, but not quite right. If you did this, all traffic to and from W.X.Y.Z, even traffic that was not part of the VPN, would be encrypted. That's not quite what you want.
The correct policy is as follows: If a packet leaves from A.B.C.D, and that packet is encapsulating another packet, and it is destined for W.X.Y.Z, then encrypt it, using the necessary security associations. If a packet arrives from W.X.Y.Z, and that packet is encapsulating another packet, and it is destined for A.B.C.D, then decrypt it, using the necessary security associations. A subtle change, but a necessary one. Security policies are also set using &man.setkey.8;. &man.setkey.8; features a configuration language for defining the policy. You can either enter configuration instructions via stdin, or you can use the -f option to specify a filename that contains configuration instructions. The configuration on gateway host #1 (which has the public IP address A.B.C.D) to force all outbound traffic to W.X.Y.Z to be encrypted is: spdadd A.B.C.D/32 W.X.Y.Z/32 ipencap -P out ipsec esp/tunnel/A.B.C.D-W.X.Y.Z/require; Put these commands in a file (e.g. /etc/ipsec.conf) and then run &prompt.root; setkey -f /etc/ipsec.conf spdadd tells &man.setkey.8; that we want to add a rule to the secure policy database. The rest of this line specifies which packets will match this policy. A.B.C.D/32 and W.X.Y.Z/32 are the IP addresses and netmasks that identify the network or hosts that this policy will apply to. In this case, we want it to apply to traffic between these two hosts. ipencap tells the kernel that this policy should only apply to packets that encapsulate other packets. -P out says that this policy applies to outgoing packets, and ipsec says that the packet will be secured. The second line specifies how this packet will be encrypted. esp is the protocol that will be used, while tunnel indicates that the packet will be further encapsulated in an IPsec packet. The repeated use of A.B.C.D and W.X.Y.Z is used to select the security association to use, and the final require mandates that packets must be encrypted if they match this rule. This rule only matches outgoing packets. You will need a similar rule to match incoming packets. spdadd W.X.Y.Z/32 A.B.C.D/32 ipencap -P in ipsec esp/tunnel/W.X.Y.Z-A.B.C.D/require; Note the in instead of out in this case, and the necessary reversal of the IP addresses. The other gateway host (which has the public IP address W.X.Y.Z) will need similar rules. spdadd W.X.Y.Z/32 A.B.C.D/32 ipencap -P out ipsec esp/tunnel/W.X.Y.Z-A.B.C.D/require; spdadd A.B.C.D/32 W.X.Y.Z/32 ipencap -P in ipsec esp/tunnel/A.B.C.D-W.X.Y.Z/require; Finally, you need to add firewall rules to allow ESP and IPENCAP packets back and forth. These rules will need to be added to both hosts. ipfw add 1 allow esp from A.B.C.D to W.X.Y.Z ipfw add 1 allow esp from W.X.Y.Z to A.B.C.D ipfw add 1 allow ipencap from A.B.C.D to W.X.Y.Z ipfw add 1 allow ipencap from W.X.Y.Z to A.B.C.D Because the rules are symmetric, you can use the same rules on each gateway host. Outgoing packets will now look something like this:

.------------------------------.  --------------------------.
| Src: A.B.C.D                 |                             |
| Dst: W.X.Y.Z                 |                             |
| <other header info>          |                             |  Encrypted
+------------------------------+                             |  packet.
| .--------------------------. |  -------------.             |  contents
| | Src: A.B.C.D             | |               |             |  are
| | Dst: W.X.Y.Z             | |               |             |  completely
| | <other header info>      | |               |             |- secure
| +--------------------------+ |               |  Encap'd    |  from third
| | .----------------------. | |  -.           |  packet     |  party
| | | Src: 192.168.1.1     | | |   | Original  |  with real  |  snooping
| | | Dst: 192.168.2.1     | | |   | packet,   |  IP addr    |
| | | <other header info>  | | |   |- private  |             |
| | +----------------------+ | |   | IP addr   |             |
| | | <packet data>        | | |   |           |             |
| | `----------------------' | |  -'           |             |
| `--------------------------' |  -------------'             |
`------------------------------'  --------------------------'

When they are received by the far end of the VPN they will first be decrypted (using the security associations that have been negotiated by racoon). Then they will enter the gif interface, which will unwrap the second layer, until you are left with the innermost packet, which can then travel into the inner network. You can check the security using the same &man.ping.8; test from earlier. First, log in to the A.B.C.D gateway machine, and run: tcpdump dst host 192.168.2.1 In another login session on the same host run ping 192.168.2.1 This time you should see output like the following: XXX tcpdump output Now, as you can see, &man.tcpdump.1; shows the ESP packets. If you try to examine them with the -s option you will see (apparently) gibberish, because of the encryption. Congratulations. You have just set up a VPN between two remote sites. Summary Configure both kernels with: options IPSEC options IPSEC_ESP Install security/ipsec-tools. Edit ${PREFIX}/etc/racoon/psk.txt on both gateway hosts, adding an entry for the remote host's IP address and a secret key that they both know. Make sure this file is mode 0600. Add the following lines to /etc/rc.conf on each host: ipsec_enable="YES" ipsec_file="/etc/ipsec.conf" Create an /etc/ipsec.conf on each host that contains the necessary spdadd lines. On gateway host #1 this would be: spdadd A.B.C.D/32 W.X.Y.Z/32 ipencap -P out ipsec esp/tunnel/A.B.C.D-W.X.Y.Z/require; spdadd W.X.Y.Z/32 A.B.C.D/32 ipencap -P in ipsec esp/tunnel/W.X.Y.Z-A.B.C.D/require; On gateway host #2 this would be: spdadd W.X.Y.Z/32 A.B.C.D/32 ipencap -P out ipsec esp/tunnel/W.X.Y.Z-A.B.C.D/require; spdadd A.B.C.D/32 W.X.Y.Z/32 ipencap -P in ipsec esp/tunnel/A.B.C.D-W.X.Y.Z/require; Add firewall rules to allow IKE, ESP, and IPENCAP traffic to both hosts: ipfw add 1 allow udp from A.B.C.D to W.X.Y.Z isakmp ipfw add 1 allow udp from W.X.Y.Z to A.B.C.D isakmp ipfw add 1 allow esp from A.B.C.D to W.X.Y.Z ipfw add 1 allow esp from W.X.Y.Z to A.B.C.D ipfw add 1 allow ipencap from A.B.C.D to W.X.Y.Z ipfw add 1 allow ipencap from W.X.Y.Z to A.B.C.D The previous two steps should suffice to get the VPN up and running. Machines on each network will be able to refer to one another using IP addresses, and all traffic across the link will be automatically and securely encrypted.
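As a final sanity check that packets really leave the gateway encrypted, you can dump the negotiated security associations and watch the public interface for ESP (IP protocol 50) traffic while pinging across the tunnel. A sketch, where the interface name fxp0 is illustrative:

&prompt.root; setkey -D
&prompt.root; tcpdump -ni fxp0 ip proto 50

Unencrypted ICMP between the private addresses should no longer appear on the public interface.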
Chern Lee Contributed by OpenSSH OpenSSH security OpenSSH OpenSSH is a set of network connectivity tools used to access remote machines securely. It can be used as a direct replacement for rlogin, rsh, rcp, and telnet. Additionally, any other TCP/IP connections can be tunneled/forwarded securely through SSH. OpenSSH encrypts all traffic to effectively eliminate eavesdropping, connection hijacking, and other network-level attacks. OpenSSH is maintained by the OpenBSD project, and is based upon SSH v1.2.12 with all the recent bug fixes and updates. It is compatible with both SSH protocols 1 and 2. OpenSSH has been in the base system since FreeBSD 4.0. Advantages of Using OpenSSH Normally, when using &man.telnet.1; or &man.rlogin.1;, data is sent over the network in a clear, unencrypted form. Network sniffers anywhere in between the client and server can steal your user/password information or data transferred in your session. OpenSSH offers a variety of authentication and encryption methods to prevent this from happening. Enabling sshd OpenSSH enabling The sshd daemon is enabled by default on &os; 4.X; on &os; 5.X, the user chooses whether to enable it during installation. To see if it is enabled, check the rc.conf file for: sshd_enable="YES" This will load &man.sshd.8;, the daemon program for OpenSSH, the next time your system initializes. Alternatively, you can start the sshd daemon directly by typing sshd on the command line. SSH Client OpenSSH client The &man.ssh.1; utility works similarly to &man.rlogin.1;. &prompt.root; ssh user@example.com Host key not found from the list of known hosts. Are you sure you want to continue connecting (yes/no)? yes Host 'example.com' added to the list of known hosts. user@example.com's password: ******* The login will continue just as it would have if a session was created using rlogin or telnet. SSH utilizes a key fingerprint system for verifying the authenticity of the server when the client connects. The user is prompted to enter yes only when connecting for the first time. Future attempts to log in are all verified against the saved fingerprint key. The SSH client will alert you if the saved fingerprint differs from the received fingerprint on future login attempts. The fingerprints are saved in ~/.ssh/known_hosts, or ~/.ssh/known_hosts2 for SSH v2 fingerprints. By default, recent versions of the OpenSSH servers only accept SSH v2 connections. The client will use version 2 if possible and will fall back to version 1. The client can also be forced to use one or the other by passing it the -1 or -2 flag for version 1 or version 2, respectively. Version 1 compatibility is maintained in the client for backwards compatibility with older versions. Secure Copy OpenSSH secure copy scp The &man.scp.1; command works similarly to &man.rcp.1;; it copies a file to or from a remote machine, except in a secure fashion. &prompt.root; scp user@example.com:/COPYRIGHT COPYRIGHT user@example.com's password: ******* COPYRIGHT 100% |*****************************| 4735 00:00 &prompt.root; Since the fingerprint was already saved for this host in the previous example, it is verified when using &man.scp.1; here. The arguments passed to &man.scp.1; are similar to &man.cp.1;, with the file or files in the first argument, and the destination in the second. Since the file is fetched over the network, through SSH, one or more of the file arguments takes on the form user@host:<path_to_remote_file>.
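Copying in the other direction uses the same remote form as the destination argument; a minimal sketch, where the remote path is illustrative:

&prompt.user; scp COPYRIGHT user@example.com:/tmp/COPYRIGHT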
Configuration OpenSSH configuration

The system-wide configuration files for both the OpenSSH daemon and client reside within the /etc/ssh directory. ssh_config configures the client settings, while sshd_config configures the daemon. Additionally, the sshd_program (/usr/sbin/sshd by default) and sshd_flags rc.conf options can provide more levels of configuration.

ssh-keygen

Instead of using passwords, &man.ssh-keygen.1; can be used to generate DSA or RSA keys to authenticate a user:

&prompt.user; ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/user/.ssh/id_dsa):
Created directory '/home/user/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/user/.ssh/id_dsa.
Your public key has been saved in /home/user/.ssh/id_dsa.pub.
The key fingerprint is:
bb:48:db:f2:93:57:80:b6:aa:bc:f5:d5:ba:8f:79:17 user@host.example.com

&man.ssh-keygen.1; will create a public and private key pair for use in authentication. The private key is stored in ~/.ssh/id_dsa or ~/.ssh/id_rsa, whereas the public key is stored in ~/.ssh/id_dsa.pub or ~/.ssh/id_rsa.pub, respectively for DSA and RSA key types. The public key must be placed in ~/.ssh/authorized_keys of the remote machine in order for the setup to work. Similarly, RSA version 1 public keys should be placed in ~/.ssh/authorized_keys. This will allow connection to the remote machine based upon SSH keys instead of passwords. If a passphrase is used in &man.ssh-keygen.1;, the user will be prompted for the passphrase each time in order to use the private key. &man.ssh-agent.1; can alleviate the strain of repeatedly entering long passphrases, and is explored in the section below. The various options and files can be different according to the OpenSSH version you have on your system; to avoid problems you should consult the &man.ssh-keygen.1; manual page.

ssh-agent and ssh-add

The &man.ssh-agent.1; and &man.ssh-add.1; utilities provide methods for SSH keys to be loaded into memory for use, without needing to type the passphrase each time. The &man.ssh-agent.1; utility will handle the authentication using the private key(s) that are loaded into it. &man.ssh-agent.1; should be used to launch another application. At the most basic level, it could spawn a shell or, at a more advanced level, a window manager. To use &man.ssh-agent.1; in a shell, first it will need to be spawned with a shell as an argument. Second, the identity needs to be added by running &man.ssh-add.1; and providing it the passphrase for the private key. Once these steps have been completed the user will be able to &man.ssh.1; to any host that has the corresponding public key installed. For example:

&prompt.user; ssh-agent csh
&prompt.user; ssh-add
Enter passphrase for /home/user/.ssh/id_dsa:
Identity added: /home/user/.ssh/id_dsa (/home/user/.ssh/id_dsa)
&prompt.user;

To use &man.ssh-agent.1; in X11, a call to &man.ssh-agent.1; will need to be placed in ~/.xinitrc. This will provide the &man.ssh-agent.1; services to all programs launched in X11. An example ~/.xinitrc file might look like this:

exec ssh-agent startxfce4

This would launch &man.ssh-agent.1;, which would in turn launch XFCE, every time X11 starts. Once X11 has been restarted so that the changes can take effect, simply run &man.ssh-add.1; to load all of your SSH keys.

SSH Tunneling OpenSSH tunneling

OpenSSH has the ability to create a tunnel to encapsulate another protocol in an encrypted session.
The following command tells &man.ssh.1; to create a tunnel for telnet:

&prompt.user; ssh -2 -N -f -L 5023:localhost:23 user@foo.example.com
&prompt.user;

The ssh command is used with the following options:

-2 Forces ssh to use version 2 of the protocol. (Do not use if you are working with older SSH servers.)
-N Indicates no command, or tunnel only. If omitted, ssh would initiate a normal session.
-f Forces ssh to run in the background.
-L Indicates a local tunnel in localport:remotehost:remoteport fashion.
user@foo.example.com The remote SSH server.

An SSH tunnel works by creating a listen socket on localhost on the specified port. It then forwards any connection received on the local host/port via the SSH connection to the specified remote host and port. In the example, port 5023 on localhost is being forwarded to port 23 on localhost of the remote machine. Since 23 is telnet, this would create a secure telnet session through an SSH tunnel. This can be used to wrap any number of insecure TCP protocols such as SMTP, POP3, FTP, etc.

Using SSH to Create a Secure Tunnel for SMTP

&prompt.user; ssh -2 -N -f -L 5025:localhost:25 user@mailserver.example.com
user@mailserver.example.com's password: *****
&prompt.user; telnet localhost 5025
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 mailserver.example.com ESMTP

This can be used in conjunction with &man.ssh-keygen.1; and additional user accounts to create a more seamless/hassle-free SSH tunneling environment. Keys can be used in place of typing a password, and the tunnels can be run as a separate user.

Practical SSH Tunneling Examples

Secure Access of a POP3 Server

At work, there is an SSH server that accepts connections from the outside. On the same office network resides a mail server running a POP3 server. The network, or network path, between your home and office may or may not be completely trustworthy. Because of this, you need to check your e-mail in a secure manner. The solution is to create an SSH connection to your office's SSH server, and tunnel through to the mail server.

&prompt.user; ssh -2 -N -f -L 2110:mail.example.com:110 user@ssh-server.example.com
user@ssh-server.example.com's password: ******

When the tunnel is up and running, you can point your mail client to send POP3 requests to localhost port 2110. A connection here will be forwarded securely across the tunnel to mail.example.com.

Bypassing a Draconian Firewall

Some network administrators impose extremely draconian firewall rules, filtering not only incoming connections, but outgoing connections. You may only be given access to contact remote machines on ports 22 and 80 for SSH and web surfing. You may wish to access another (perhaps non-work related) service, such as an Ogg Vorbis server to stream music. If this Ogg Vorbis server is streaming on a port other than 22 or 80, you will not be able to access it. The solution is to create an SSH connection to a machine outside of your network's firewall, and use it to tunnel to the Ogg Vorbis server.

&prompt.user; ssh -2 -N -f -L 8888:music.example.com:8000 user@unfirewalled-system.example.org
user@unfirewalled-system.example.org's password: *******

Your streaming client can now be pointed to localhost port 8888, which will be forwarded over to music.example.com port 8000, successfully evading the firewall.

The <varname>AllowUsers</varname> Option

It is often a good idea to limit which users can log in and from where. The AllowUsers option is a good way to accomplish this.
For example, to only allow the root user to log in from 192.168.1.32, something like this would be appropriate in the /etc/ssh/sshd_config file:

AllowUsers root@192.168.1.32

To allow the user admin to log in from anywhere, just list the username by itself:

AllowUsers admin

Multiple users should be listed on the same line, like so:

AllowUsers root@192.168.1.32 admin

It is important that you list each user that needs to log in to this machine; otherwise they will be locked out. After making changes to /etc/ssh/sshd_config you must tell &man.sshd.8; to reload its config files, by running:

&prompt.root; /etc/rc.d/sshd reload

Further Reading OpenSSH &man.ssh.1; &man.scp.1; &man.ssh-keygen.1; &man.ssh-agent.1; &man.ssh-add.1; &man.ssh.config.5; &man.sshd.8; &man.sftp-server.8; &man.sshd.config.5;

Tom Rhodes Contributed by ACL File System Access Control Lists

In conjunction with file system enhancements like snapshots, FreeBSD 5.0 and later offers the security of File System Access Control Lists (ACLs). Access Control Lists extend the standard &unix; permission model in a highly compatible (&posix;.1e) way. This feature permits an administrator to take advantage of a more sophisticated security model.

To enable ACL support for UFS file systems, the following:

options UFS_ACL

must be compiled into the kernel. If this option has not been compiled in, a warning message will be displayed when attempting to mount a file system supporting ACLs. This option is included in the GENERIC kernel. ACLs rely on extended attributes being enabled on the file system. Extended attributes are natively supported in the next generation &unix; file system, UFS2. A higher level of administrative overhead is required to configure extended attributes on UFS1 than on UFS2. The performance of extended attributes on UFS2 is also substantially higher. As a result, UFS2 is generally recommended in preference to UFS1 for use with access control lists.

ACLs are enabled by the mount-time administrative flag, acls, which may be added to /etc/fstab. The mount-time flag can also be automatically set in a persistent manner using &man.tunefs.8; to modify a superblock ACLs flag in the file system header. In general, it is preferred to use the superblock flag for several reasons: The mount-time ACLs flag cannot be changed by a remount (&man.mount.8; -u), only by means of a complete &man.umount.8; and fresh &man.mount.8;. This means that ACLs cannot be enabled on the root file system after boot. It also means that you cannot change the disposition of a file system once it is in use. Setting the superblock flag will cause the file system to always be mounted with ACLs enabled even if there is not an fstab entry or if the devices re-order. This prevents accidental mounting of the file system without ACLs enabled, which can result in ACLs being improperly enforced, and hence security problems.

We may change the ACLs behavior to allow the flag to be enabled without a complete fresh &man.mount.8;, but we consider it desirable to discourage accidental mounting without ACLs enabled, because you can shoot your feet quite nastily if you enable ACLs, then disable them, then re-enable them without flushing the extended attributes.
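As a rough sketch of the two approaches (the device name /dev/ad0s1d and the /home mount point are placeholders for your own file system), the superblock flag can be set with &man.tunefs.8; while the file system is unmounted:

&prompt.root; umount /home
&prompt.root; tunefs -a enable /dev/ad0s1d
&prompt.root; mount /home

Alternatively, an equivalent /etc/fstab entry using the mount-time flag would look like:

/dev/ad0s1d    /home    ufs    rw,acls    2    2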
In general, once you have enabled ACLs on a file system, they should not be disabled, as the resulting file protections may not be compatible with those intended by the users of the system, and re-enabling ACLs may re-attach the previous ACLs to files that have since had their permissions changed, resulting in other unpredictable behavior. File systems with ACLs enabled will show a + (plus) sign in their permission settings when viewed. For example:

drwx------ 2 robert robert 512 Dec 27 11:54 private
drwxrwx---+ 2 robert robert 512 Dec 23 10:57 directory1
drwxrwx---+ 2 robert robert 512 Dec 22 10:20 directory2
drwxrwx---+ 2 robert robert 512 Dec 27 11:57 directory3
drwxr-xr-x 2 robert robert 512 Nov 10 11:54 public_html

Here we see that the directory1, directory2, and directory3 directories are all taking advantage of ACLs. The public_html directory is not.

Making Use of <acronym>ACL</acronym>s

The file system ACLs can be viewed by the &man.getfacl.1; utility. For instance, to view the ACL settings on the test file, one would use the command:

&prompt.user; getfacl test
#file:test
#owner:1001
#group:1001
user::rw-
group::r--
other::r--

To change the ACL settings on this file, invoke the &man.setfacl.1; utility. Observe:

&prompt.user; setfacl -k test

The -k flag will remove all of the currently defined ACLs from a file or file system. The more preferable method would be to use -b as it leaves the basic fields required for ACLs to work.

&prompt.user; setfacl -m u:trhodes:rwx,group:web:r--,o::--- test

In the aforementioned command, the -m option was used to modify the default ACL entries. Since there were no pre-defined entries, as they were removed by the previous command, this will restore the default options and assign the options listed. Take care to notice that if you add a user or group which does not exist on the system, an Invalid argument error will be printed to stdout.

Tom Rhodes Contributed by Portaudit Monitoring Third Party Security Issues

In recent years, the security world has made many improvements to how vulnerability assessment is handled. The threat of system intrusion increases as third party utilities are installed and configured for virtually any operating system available today. Vulnerability assessment is a key factor in security, and while &os; releases advisories for the base system, doing so for every third party utility is beyond the &os; Project's capability. There is a way to mitigate third party vulnerabilities and warn administrators of known security issues. A &os; add-on utility known as Portaudit exists solely for this purpose. The security/portaudit port polls a database, updated and maintained by the &os; Security Team and ports developers, for known security issues. To begin using Portaudit, one must install it from the Ports Collection:

- &prompt.root; cd /usr/ports/security/portaudit && make install clean
+ &prompt.root; cd /usr/ports/security/portaudit && make install clean

During the install process, the configuration files for &man.periodic.8; will be updated, permitting Portaudit output in the daily security runs. Ensure the daily security run emails, which are sent to root's email account, are being read. No more configuration will be required here. After installation, an administrator must update the database stored locally in /var/db/portaudit by invoking the following command:

&prompt.root; portaudit -F

The database will automatically be updated during the &man.periodic.8; run; thus, the previous command is completely optional.
It is only required for the following examples. To audit the third party utilities installed as part of the Ports Collection, an administrator need only run the following command:

&prompt.root; portaudit -a

An example of output is provided:

Affected package: cups-base-1.1.22.0_1
Type of problem: cups-base -- HPGL buffer overflow vulnerability.
Reference: <http://www.FreeBSD.org/ports/portaudit/40a3bca2-6809-11d9-a9e7-0001020eed82.html>

1 problem(s) in your installed packages found.

You are advised to update or deinstall the affected package(s) immediately.

By pointing a web browser to the URL shown, an administrator may obtain more information about the vulnerability in question. This includes the versions affected, by &os; port version, along with other web sites which may contain security advisories. In short, Portaudit is a powerful utility and extremely useful when coupled with the Portupgrade port.

Tom Rhodes Contributed by FreeBSD Security Advisories &os; Security Advisories

Like many production quality operating systems, &os; publishes Security Advisories. These advisories are usually mailed to the security lists and noted in the Errata only after the appropriate releases have been patched. This section explains what an advisory is, how to understand it, and what measures to take in order to patch a system.

What does an advisory look like?

The &os; security advisories look similar to the one below, taken from the &a.security-notifications.name; mailing list.

=============================================================================
&os;-SA-XX:XX.UTIL                                          Security Advisory
                                                            The &os; Project

Topic:          denial of service due to some problem

Category:       core
Module:         sys
Announced:      2003-09-23
Credits:        Person@EMAIL-ADDRESS
Affects:        All releases of &os;
                &os; 4-STABLE prior to the correction date
Corrected:      2003-09-23 16:42:59 UTC (RELENG_4, 4.9-PRERELEASE)
                2003-09-23 20:08:42 UTC (RELENG_5_1, 5.1-RELEASE-p6)
                2003-09-23 20:07:06 UTC (RELENG_5_0, 5.0-RELEASE-p15)
                2003-09-23 16:44:58 UTC (RELENG_4_8, 4.8-RELEASE-p8)
                2003-09-23 16:47:34 UTC (RELENG_4_7, 4.7-RELEASE-p18)
                2003-09-23 16:49:46 UTC (RELENG_4_6, 4.6-RELEASE-p21)
                2003-09-23 16:51:24 UTC (RELENG_4_5, 4.5-RELEASE-p33)
                2003-09-23 16:52:45 UTC (RELENG_4_4, 4.4-RELEASE-p43)
                2003-09-23 16:54:39 UTC (RELENG_4_3, 4.3-RELEASE-p39)
&os; only:      NO

For general information regarding FreeBSD Security Advisories,
including descriptions of the fields above, security branches, and the
following sections, please visit http://www.FreeBSD.org/security/.

I.   Background
II.  Problem Description
III. Impact
IV.  Workaround
V.   Solution
VI.  Correction details
VII. References

The Topic field indicates exactly what the problem is. It is basically an introduction to the current security advisory and notes the utility with the vulnerability. The Category refers to the affected part of the system which may be one of core, contrib, or ports. The core category means that the vulnerability affects a core component of the &os; operating system. The contrib category means that the vulnerability affects software contributed to the &os; Project, such as sendmail. Finally, the ports category indicates that the vulnerability affects add-on software available as part of the Ports Collection. The Module field refers to the component location, for instance sys. In this example, we see that the module, sys, is affected; therefore, this vulnerability affects a component used within the kernel.
The Announced field reflects the date said security advisory was published, or announced to the world. This means that the security team has verified that the problem does exist and that a patch has been committed to the &os; source code repository. The Credits field gives credit to the individual or organization who noticed the vulnerability and reported it. The Affects field explains which releases of &os; are affected by this vulnerability. For the kernel, a quick look over the output from ident on the affected files will help in determining the revision. For ports, the version number is listed after the port name in /var/db/pkg. If the system does not sync with the &os; CVS repository and rebuild daily, chances are that it is affected. The Corrected field indicates the date, time, time offset, and release that was corrected. The &os; only field indicates whether this vulnerability affects just &os;, or if it affects other operating systems as well.

The Background field gives information on exactly what the affected utility is. Most of the time this is why the utility exists in &os;, what it is used for, and a bit of information on how the utility came to be. The Problem Description field explains the security hole in depth. This can include information on flawed code, or even how the utility could be maliciously used to open a security hole. The Impact field describes what type of impact the problem could have on a system. For example, this could be anything from a denial of service attack, to extra privileges available to users, or even giving the attacker superuser access. The Workaround field offers a feasible workaround to system administrators who may be incapable of upgrading the system. This may be due to time constraints, network availability, or a slew of other reasons. Regardless, security should not be taken lightly, and an affected system should either be patched or the security hole workaround should be implemented. The Solution field offers instructions on patching the affected system. This is a step by step tested and verified method for getting a system patched and working securely. The Correction Details field displays the CVS branch or release name with the periods changed to underscore characters. It also shows the revision number of the affected files within each branch. The References field usually offers sources of other information. This can include web URLs, books, mailing lists, and newsgroups.

Tom Rhodes Contributed by Process Accounting Process Accounting

Process accounting is a security method in which an administrator may keep track of the system resources used and their allocation among users, provide for system monitoring, and minimally track a user's commands. This has both positive and negative points. One of the positives is that an intrusion may be narrowed down to the point of entry. A negative is the amount of logs generated by process accounting, and the disk space they may require. This section will walk an administrator through the basics of process accounting.

Enabling and Utilizing Process Accounting

Before making use of process accounting, it must be enabled. To do this, execute the following commands:

&prompt.root; touch /var/account/acct
&prompt.root; accton /var/account/acct
&prompt.root; echo 'accounting_enable="YES"' >> /etc/rc.conf

Once enabled, accounting will begin to track CPU stats, commands, etc. All accounting logs are in a non-human readable format and may be viewed using the &man.sa.8; utility.
If issued without any options, sa will print information relating to the number of per-user calls, the total elapsed time in minutes, total CPU and user time in minutes, average number of I/O operations, etc. To view information about commands being issued, one would use the &man.lastcomm.1; utility. The lastcomm command may be used to print out commands issued by users on specific &man.ttys.5;, for example:

&prompt.root; lastcomm ls trhodes ttyp1

This would print out all known usage of the ls command by trhodes on the ttyp1 terminal. Many other useful options exist and are explained in the &man.lastcomm.1;, &man.acct.5; and &man.sa.8; manual pages.
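As one further illustrative sketch, &man.sa.8; can also summarize accounting data on a per-user basis with the -m flag (the exact columns printed may vary between versions):

&prompt.root; sa -m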
diff --git a/en_US.ISO8859-1/books/handbook/serialcomms/chapter.sgml b/en_US.ISO8859-1/books/handbook/serialcomms/chapter.sgml index 284f345750..b38cdb5fe6 100644 --- a/en_US.ISO8859-1/books/handbook/serialcomms/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/serialcomms/chapter.sgml @@ -1,2892 +1,2892 @@

Serial Communications Synopsis serial communications

&unix; has always had support for serial communications. In fact, the very first &unix; machines relied on serial lines for user input and output. Things have changed a lot from the days when the average terminal consisted of a 10-character-per-second serial printer and a keyboard. This chapter will cover some of the ways in which FreeBSD uses serial communications. After reading this chapter, you will know: How to connect terminals to your FreeBSD system. How to use a modem to dial out to remote hosts. How to allow remote users to log in to your system with a modem. How to boot your system from a serial console. Before reading this chapter, you should: Know how to configure and install a new kernel (). Understand &unix; permissions and processes (). Have access to the technical manual for the serial hardware (modem or multi-port card) that you would like to use with FreeBSD.

Introduction Terminology

bits-per-second bps Bits per Second — the rate at which data is transmitted
DTE DTE Data Terminal Equipment — for example, your computer
DCE DCE Data Communications Equipment — your modem
RS-232 RS-232C cables EIA standard for hardware serial communications

When talking about communications data rates, this section does not use the term baud. Baud refers to the number of electrical state transitions that may be made in a period of time, while bps (bits per second) is the correct term to use (at least it does not seem to bother the curmudgeons quite as much).

Cables and Ports

To connect a modem or terminal to your FreeBSD system, you will need a serial port on your computer and the proper cable to connect to your serial device. If you are already familiar with your hardware and the cable it requires, you can safely skip this section.

Cables

There are several different kinds of serial cables. The two most common types for our purposes are null-modem cables and standard (straight) RS-232 cables. The documentation for your hardware should describe the type of cable required.

Null-modem Cables null-modem cable

A null-modem cable passes some signals, such as Signal Ground, straight through, but switches other signals. For example, the Transmitted Data pin on one end goes to the Received Data pin on the other end. You can also construct your own null-modem cable for use with terminals (e.g., for quality purposes). This table shows the RS-232C signals and the pin numbers on a DB-25 connector. Note that the standard also calls for a straight-through pin 1 to pin 1 Protective Ground line, but it is often omitted. Some terminals work OK using only pins 2, 3 and 7, while others require different configurations than the examples shown below.

DB-25 to DB-25 Null-Modem Cable

Signal   Pin #                 Pin #   Signal
SG         7    connects to      7    SG
TD         2    connects to      3    RD
RD         3    connects to      2    TD
RTS        4    connects to      5    CTS
CTS        5    connects to      4    RTS
DTR       20    connects to      6    DSR
DTR       20    connects to      8    DCD
DSR        6    connects to     20    DTR
DCD        8    connects to     20    DTR
Here are two other schemes more common nowadays.

DB-9 to DB-9 Null-Modem Cable

Signal   Pin #                 Pin #   Signal
RD         2    connects to      3    TD
TD         3    connects to      2    RD
DTR        4    connects to      6    DSR
DTR        4    connects to      1    DCD
SG         5    connects to      5    SG
DSR        6    connects to      4    DTR
DCD        1    connects to      4    DTR
RTS        7    connects to      8    CTS
CTS        8    connects to      7    RTS
DB-9 to DB-25 Null-Modem Cable

Signal   Pin #                 Pin #   Signal
RD         2    connects to      2    TD
TD         3    connects to      3    RD
DTR        4    connects to      6    DSR
DTR        4    connects to      8    DCD
SG         5    connects to      7    SG
DSR        6    connects to     20    DTR
DCD        1    connects to     20    DTR
RTS        7    connects to      5    CTS
CTS        8    connects to      4    RTS
When one pin at one end connects to a pair of pins at the other end, it is usually implemented with one short wire between the pair of pins in their connector and a long wire to the other single pin. The above designs seem to be the most popular. In another variation (explained in the book RS-232 Made Easy) SG connects to SG, TD connects to RD, RTS and CTS connect to DCD, DTR connects to DSR, and vice-versa.
Standard RS-232C Cables RS-232C cables A standard serial cable passes all of the RS-232C signals straight through. That is, the Transmitted Data pin on one end of the cable goes to the Transmitted Data pin on the other end. This is the type of cable to use to connect a modem to your FreeBSD system, and is also appropriate for some terminals.
Ports Serial ports are the devices through which data is transferred between the FreeBSD host computer and the terminal. This section describes the kinds of ports that exist and how they are addressed in FreeBSD. Kinds of Ports Several kinds of serial ports exist. Before you purchase or construct a cable, you need to make sure it will fit the ports on your terminal and on the FreeBSD system. Most terminals will have DB-25 ports. Personal computers, including PCs running FreeBSD, will have DB-25 or DB-9 ports. If you have a multiport serial card for your PC, you may have RJ-12 or RJ-45 ports. See the documentation that accompanied the hardware for specifications on the kind of port in use. A visual inspection of the port often works too. Port Names In FreeBSD, you access each serial port through an entry in the /dev directory. There are two different kinds of entries: Call-in ports are named /dev/ttydN where N is the port number, starting from zero. Generally, you use the call-in port for terminals. Call-in ports require that the serial line assert the data carrier detect (DCD) signal to work correctly. Call-out ports are named /dev/cuadN. You usually do not use the call-out port for terminals, just for modems. You may use the call-out port if the serial cable or the terminal does not support the carrier detect signal. Call-out ports are named /dev/cuaaN in &os; 5.X and older. If you have connected a terminal to the first serial port (COM1 in &ms-dos;), then you will use /dev/ttyd0 to refer to the terminal. If the terminal is on the second serial port (also known as COM2), use /dev/ttyd1, and so forth.
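As a quick sanity check of the naming scheme (the exact nodes present, and whether cuad0 or the older cuaa0 exists, depends on your &os; version and hardware), you can list the call-in and call-out entries for the first port:

&prompt.root; ls -l /dev/ttyd0 /dev/cuad0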
Kernel Configuration FreeBSD supports four serial ports by default. In the &ms-dos; world, these are known as COM1, COM2, COM3, and COM4. FreeBSD currently supports dumb multiport serial interface cards, such as the BocaBoard 1008 and 2016, as well as more intelligent multi-port cards such as those made by Digiboard and Stallion Technologies. However, the default kernel only looks for the standard COM ports. To see if your kernel recognizes any of your serial ports, watch for messages while the kernel is booting, or use the /sbin/dmesg command to replay the kernel's boot messages. In particular, look for messages that start with the characters sio. To view just the messages that have the word sio, use the command: &prompt.root; /sbin/dmesg | grep 'sio' For example, on a system with four serial ports, these are the serial-port specific kernel boot messages: sio0 at 0x3f8-0x3ff irq 4 on isa sio0: type 16550A sio1 at 0x2f8-0x2ff irq 3 on isa sio1: type 16550A sio2 at 0x3e8-0x3ef irq 5 on isa sio2: type 16550A sio3 at 0x2e8-0x2ef irq 9 on isa sio3: type 16550A If your kernel does not recognize all of your serial ports, you will probably need to configure your kernel in the /boot/device.hints file. You can also comment-out or completely remove lines for devices you do not have. On &os; 4.X you have to edit your kernel configuration file. For detailed information on configuring your kernel, please see . The relevant device lines would look like this: device sio0 at isa? port IO_COM1 irq 4 device sio1 at isa? port IO_COM2 irq 3 device sio2 at isa? port IO_COM3 irq 5 device sio3 at isa? port IO_COM4 irq 9 Please refer to the &man.sio.4; manual page for more information on serial ports and multiport boards configuration. Be careful if you are using a configuration file that was previously used for a different version of FreeBSD because the device flags and the syntax have changed between versions. port IO_COM1 is a substitution for port 0x3f8, IO_COM2 is 0x2f8, IO_COM3 is 0x3e8, and IO_COM4 is 0x2e8, which are fairly common port addresses for their respective serial ports; interrupts 4, 3, 5, and 9 are fairly common interrupt request lines. Also note that regular serial ports cannot share interrupts on ISA-bus PCs (multiport boards have on-board electronics that allow all the 16550A's on the board to share one or two interrupt request lines). Device Special Files Most devices in the kernel are accessed through device special files, which are located in the /dev directory. The sio devices are accessed through the /dev/ttydN (dial-in) and /dev/cuadN (call-out) devices. FreeBSD also provides initialization devices (/dev/ttydN.init and /dev/cuadN.init on &os; 6.X, /dev/ttyidN and /dev/cuaidN on &os; 5.X and older) and locking devices (/dev/ttydN.lock and /dev/cuadN.lock on &os; 6.X, /dev/ttyldN and /dev/cualdN on &os; 5.X and older). The initialization devices are used to initialize communications port parameters each time a port is opened, such as crtscts for modems which use RTS/CTS signaling for flow control. The locking devices are used to lock flags on ports to prevent users or programs changing certain parameters; see the manual pages &man.termios.4;, &man.sio.4;, and &man.stty.1; for information on the terminal settings, locking and initializing devices, and setting terminal options, respectively. Making Device Special Files FreeBSD 5.0 includes the &man.devfs.5; filesystem which automatically creates device nodes as needed. 
If you are running a version of FreeBSD with devfs enabled then you can safely skip this section. A shell script called MAKEDEV in the /dev directory manages the device special files. To use MAKEDEV to make dial-up device special files for COM1 (port 0), cd to /dev and issue the command MAKEDEV ttyd0. Likewise, to make dial-up device special files for COM2 (port 1), use MAKEDEV ttyd1. MAKEDEV not only creates the /dev/ttydN device special files, but also the /dev/cuaaN, /dev/cuaiaN, /dev/cualaN, /dev/ttyldN, and /dev/ttyidN nodes. After making new device special files, be sure to check the permissions on the files (especially the /dev/cua* files) to make sure that only users who should have access to those device special files can read and write on them — you probably do not want to allow your average user to use your modems to dial-out. The default permissions on the /dev/cua* files should be sufficient:

crw-rw---- 1 uucp dialer 28, 129 Feb 15 14:38 /dev/cuaa1
crw-rw---- 1 uucp dialer 28, 161 Feb 15 14:38 /dev/cuaia1
crw-rw---- 1 uucp dialer 28, 193 Feb 15 14:38 /dev/cuala1

These permissions allow the user uucp and users in the group dialer to use the call-out devices.

Serial Port Configuration ttyd cuad

The ttydN (or cuadN) device is the regular device you will want to open for your applications. When a process opens the device, it will have a default set of terminal I/O settings. You can see these settings with the command

&prompt.root; stty -a -f /dev/ttyd1

When you change the settings of this device, the settings are in effect until the device is closed. When it is reopened, it goes back to the default set. To make changes to the default set, you can open and adjust the settings of the initial state device. For example, to turn on clocal mode, 8-bit communication, and XON/XOFF flow control by default for ttyd5, type:

&prompt.root; stty -f /dev/ttyd5.init clocal cs8 ixon ixoff

rc files rc.serial

System-wide initialization of the serial devices is controlled in /etc/rc.d/serial. This file affects the default settings of serial devices. On &os; 4.X, system-wide initialization of the serial devices is controlled in /etc/rc.serial. To prevent certain settings from being changed by an application, make adjustments to the lock state device. For example, to lock the speed of ttyd5 to 57600 bps, type:

&prompt.root; stty -f /dev/ttyd5.lock 57600

Now, an application that opens ttyd5 and tries to change the speed of the port will be stuck with 57600 bps.

MAKEDEV Naturally, you should make the initial state and lock state devices writable only by the root account.
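A minimal sketch of doing so, reusing the ttyd5 devices from the examples above (note that on a devfs-based system such manual changes do not survive a reboot; persistent ownership and mode settings belong in /etc/devfs.conf):

&prompt.root; chown root:wheel /dev/ttyd5.init /dev/ttyd5.lock
&prompt.root; chmod 600 /dev/ttyd5.init /dev/ttyd5.lock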
Sean Kelly Contributed by Terminals terminals Terminals provide a convenient and low-cost way to access your FreeBSD system when you are not at the computer's console or on a connected network. This section describes how to use terminals with FreeBSD. Uses and Types of Terminals The original &unix; systems did not have consoles. Instead, people logged in and ran programs through terminals that were connected to the computer's serial ports. It is quite similar to using a modem and terminal software to dial into a remote system to do text-only work. Today's PCs have consoles capable of high quality graphics, but the ability to establish a login session on a serial port still exists in nearly every &unix; style operating system today; FreeBSD is no exception. By using a terminal attached to an unused serial port, you can log in and run any text program that you would normally run on the console or in an xterm window in the X Window System. For the business user, you can attach many terminals to a FreeBSD system and place them on your employees' desktops. For a home user, a spare computer such as an older IBM PC or a &macintosh; can be a terminal wired into a more powerful computer running FreeBSD. You can turn what might otherwise be a single-user computer into a powerful multiple user system. For FreeBSD, there are three kinds of terminals: Dumb terminals PCs acting as terminals X terminals The remaining subsections describe each kind. Dumb Terminals Dumb terminals are specialized pieces of hardware that let you connect to computers over serial lines. They are called dumb because they have only enough computational power to display, send, and receive text. You cannot run any programs on them. It is the computer to which you connect them that has all the power to run text editors, compilers, email, games, and so forth. There are hundreds of kinds of dumb terminals made by many manufacturers, including Digital Equipment Corporation's VT-100 and Wyse's WY-75. Just about any kind will work with FreeBSD. Some high-end terminals can even display graphics, but only certain software packages can take advantage of these advanced features. Dumb terminals are popular in work environments where workers do not need access to graphical applications such as those provided by the X Window System. PCs Acting as Terminals If a dumb terminal has just enough ability to display, send, and receive text, then certainly any spare personal computer can be a dumb terminal. All you need is the proper cable and some terminal emulation software to run on the computer. Such a configuration is popular in homes. For example, if your spouse is busy working on your FreeBSD system's console, you can do some text-only work at the same time from a less powerful personal computer hooked up as a terminal to the FreeBSD system. X Terminals X terminals are the most sophisticated kind of terminal available. Instead of connecting to a serial port, they usually connect to a network like Ethernet. Instead of being relegated to text-only applications, they can display any X application. We introduce X terminals just for the sake of completeness. However, this chapter does not cover setup, configuration, or use of X terminals. Configuration This section describes what you need to configure on your FreeBSD system to enable a login session on a terminal. It assumes you have already configured your kernel to support the serial port to which the terminal is connected—and that you have connected it. 
Recall from that the init process is responsible for all process control and initialization at system startup. One of the tasks performed by init is to read the /etc/ttys file and start a getty process on the available terminals. The getty process is responsible for reading a login name and starting the login program. Thus, to configure terminals for your FreeBSD system the following steps should be taken as root:

1. Add a line to /etc/ttys for the entry in the /dev directory for the serial port if it is not already there.
2. Specify that /usr/libexec/getty be run on the port, and specify the appropriate getty type from the /etc/gettytab file.
3. Specify the default terminal type.
4. Set the port to on.
5. Specify whether the port should be secure.
6. Force init to reread the /etc/ttys file.

As an optional step, you may wish to create a custom getty type for use in step 2 by making an entry in /etc/gettytab. This chapter does not explain how to do so; you are encouraged to see the &man.gettytab.5; and the &man.getty.8; manual pages for more information.

Adding an Entry to <filename>/etc/ttys</filename>

The /etc/ttys file lists all of the ports on your FreeBSD system where you want to allow logins. For example, the first virtual console ttyv0 has an entry in this file. You can log in on the console using this entry. This file also contains entries for the other virtual consoles, serial ports, and pseudo-ttys. For a hardwired terminal, just list the serial port's /dev entry without the /dev part (for example, /dev/ttyv0 would be listed as ttyv0). A default FreeBSD install includes an /etc/ttys file with support for the first four serial ports: ttyd0 through ttyd3. If you are attaching a terminal to one of those ports, you do not need to add another entry.

Adding Terminal Entries to <filename>/etc/ttys</filename>

Suppose we would like to connect two terminals to the system: a Wyse-50 and an old 286 IBM PC running Procomm terminal software emulating a VT-100 terminal. We connect the Wyse to the second serial port and the 286 to the sixth serial port (a port on a multiport serial card). The corresponding entries in the /etc/ttys file would look like this:

ttyd1 "/usr/libexec/getty std.38400" wy50 on insecure
ttyd5 "/usr/libexec/getty std.19200" vt100 on insecure

The first field normally specifies the name of the terminal special file as it is found in /dev. The second field is the command to execute for this line, which is usually &man.getty.8;. getty initializes and opens the line, sets the speed, prompts for a user name and then executes the &man.login.1; program. The getty program accepts one (optional) parameter on its command line, the getty type. A getty type configures characteristics on the terminal line, like bps rate and parity. The getty program reads these characteristics from the file /etc/gettytab. The file /etc/gettytab contains lots of entries for terminal lines both old and new. In almost all cases, the entries that start with the text std will work for hardwired terminals. These entries ignore parity. There is a std entry for each bps rate from 110 to 115200. Of course, you can add your own entries to this file. The &man.gettytab.5; manual page provides more information. When setting the getty type in the /etc/ttys file, make sure that the communications settings on the terminal match. For our example, the Wyse-50 uses no parity and connects at 38400 bps. The 286 PC uses no parity and connects at 19200 bps. The third field is the type of terminal usually connected to that tty line.
For dial-up ports, unknown or dialup is typically used in this field since users may dial up with practically any type of terminal or software. For hardwired terminals, the terminal type does not change, so you can put a real terminal type from the &man.termcap.5; database file in this field. For our example, the Wyse-50 uses the real terminal type while the 286 PC running Procomm will be set to emulate a VT-100. The fourth field specifies if the port should be enabled. Putting on here will have the init process start the program in the second field, getty. If you put off in this field, there will be no getty, and hence no logins on the port. The final field is used to specify whether the port is secure. Marking a port as secure means that you trust it enough to allow the root account (or any account with a user ID of 0) to log in from that port. Insecure ports do not allow root logins. On an insecure port, users must log in from unprivileged accounts and then use &man.su.1; or a similar mechanism to gain superuser privileges. It is highly recommended that you use insecure even for terminals that are behind locked doors. It is quite easy to log in and use su if you need superuser privileges.

Force <command>init</command> to Reread <filename>/etc/ttys</filename>

After making the necessary changes to the /etc/ttys file you should send a SIGHUP (hangup) signal to the init process to force it to re-read its configuration file. For example:

&prompt.root; kill -HUP 1

init is always the first process run on a system, therefore it will always have PID 1. If everything is set up correctly, all cables are in place, and the terminals are powered up, then a getty process should be running on each terminal and you should see login prompts on your terminals at this point.

Troubleshooting Your Connection

Even with the most meticulous attention to detail, something could still go wrong while setting up a terminal. Here is a list of symptoms and some suggested fixes.

No Login Prompt Appears

Make sure the terminal is plugged in and powered up. If it is a personal computer acting as a terminal, make sure it is running terminal emulation software on the correct serial port. Make sure the cable is connected firmly to both the terminal and the FreeBSD computer. Make sure it is the right kind of cable. Make sure the terminal and FreeBSD agree on the bps rate and parity settings. If you have a video display terminal, make sure the contrast and brightness controls are turned up. If it is a printing terminal, make sure paper and ink are in good supply. Make sure that a getty process is running and serving the terminal. For example, to get a list of running getty processes with ps, type:

&prompt.root; ps -axww|grep getty

You should see an entry for the terminal. For example, the following display shows that a getty is running on the second serial port ttyd1 and is using the std.38400 entry in /etc/gettytab:

22189 d1 Is+ 0:00.03 /usr/libexec/getty std.38400 ttyd1

If no getty process is running, make sure you have enabled the port in /etc/ttys. Also remember to run kill -HUP 1 after modifying the ttys file. If the getty process is running but the terminal still does not display a login prompt, or if it displays a prompt but will not allow you to type, your terminal or cable may not support hardware handshaking. Try changing the entry in /etc/ttys from std.38400 to 3wire.38400 (remember to run kill -HUP 1 after modifying /etc/ttys). The 3wire entry is similar to std, but ignores hardware handshaking.
You may need to reduce the baud rate or enable software flow control when using 3wire to prevent buffer overflows.

If Garbage Appears Instead of a Login Prompt

Make sure the terminal and FreeBSD agree on the bps rate and parity settings. Check the getty processes to make sure the correct getty type is in use. If not, edit /etc/ttys and run kill -HUP 1.

Characters Appear Doubled; the Password Appears When Typed

Switch the terminal (or the terminal emulation software) from half duplex or local echo to full duplex.

Guy Helmer Contributed by Sean Kelly Additions by Dial-in Service dial-in service

Configuring your FreeBSD system for dial-in service is very similar to connecting terminals except that you are dealing with modems instead of terminals.

External vs. Internal Modems

External modems seem to be more convenient for dial-up, because external modems often can be semi-permanently configured via parameters stored in non-volatile RAM and they usually provide lighted indicators that display the state of important RS-232 signals. Blinking lights impress visitors, but lights are also very useful to see whether a modem is operating properly. Internal modems usually lack non-volatile RAM, so their configuration may be limited only to setting DIP switches. If your internal modem has any signal indicator lights, it is probably difficult to view the lights when the system's cover is in place.

Modems and Cables modem

If you are using an external modem, then you will of course need the proper cable. A standard RS-232C serial cable should suffice as long as all of the normal signals are wired:

Signal Names

Acronym   Name
RD        Received Data
TD        Transmitted Data
DTR       Data Terminal Ready
DSR       Data Set Ready
DCD       Data Carrier Detect (RS-232's Received Line Signal Detector)
SG        Signal Ground
RTS       Request to Send
CTS       Clear to Send
FreeBSD needs the RTS and CTS signals for flow control at speeds above 2400 bps, the CD signal to detect when a call has been answered or the line has been hung up, and the DTR signal to reset the modem after a session is complete. Some cables are wired without all of the needed signals, so if you have problems, such as a login session not going away when the line hangs up, you may have a problem with your cable. Like other &unix; like operating systems, FreeBSD uses the hardware signals to find out when a call has been answered or a line has been hung up and to hang up and reset the modem after a call. FreeBSD avoids sending commands to the modem or watching for status reports from the modem. If you are familiar with connecting modems to PC-based bulletin board systems, this may seem awkward.
Serial Interface Considerations

FreeBSD supports NS8250-, NS16450-, NS16550-, and NS16550A-based EIA RS-232C (CCITT V.24) communications interfaces. The 8250 and 16450 devices have single-character buffers. The 16550 device provides a 16-character buffer, which allows for better system performance. (Bugs in plain 16550's prevent the use of the 16-character buffer, so use 16550A's if possible). Because single-character-buffer devices require more work by the operating system than the 16-character-buffer devices, 16550A-based serial interface cards are much preferred. If the system has many active serial ports or will have a heavy load, 16550A-based cards are better for low-error-rate communications.

Quick Overview getty

As with terminals, init spawns a getty process for each configured serial port for dial-in connections. For example, if a modem is attached to /dev/ttyd0, the command ps ax might show this:

4850 ?? I 0:00.09 /usr/libexec/getty V19200 ttyd0

When a user dials the modem's line and the modems connect, the CD (Carrier Detect) line is asserted by the modem. The kernel notices that carrier has been detected and completes getty's open of the port. getty sends a login: prompt at the specified initial line speed. getty watches to see if legitimate characters are received, and, in a typical configuration, if it finds junk (probably due to the modem's connection speed being different from getty's speed), getty tries adjusting the line speeds until it receives reasonable characters.

/usr/bin/login

After the user enters his/her login name, getty executes /usr/bin/login, which completes the login by asking for the user's password and then starting the user's shell.

Configuration Files

There are three system configuration files in the /etc directory that you will probably need to edit to allow dial-up access to your FreeBSD system. The first, /etc/gettytab, contains configuration information for the /usr/libexec/getty daemon. Second, /etc/ttys holds information that tells /sbin/init what tty devices should have getty processes running on them. Lastly, you can place port initialization commands in the /etc/rc.d/serial script. There are two schools of thought regarding dial-up modems on &unix;. One group likes to configure their modems and systems so that no matter at what speed a remote user dials in, the local computer-to-modem RS-232 interface runs at a locked speed. The benefit of this configuration is that the remote user always sees a system login prompt immediately. The downside is that the system does not know what a user's true data rate is, so full-screen programs like Emacs will not adjust their screen-painting methods to make their response better for slower connections. The other school configures their modems' RS-232 interface to vary its speed based on the remote user's connection speed. For example, V.32bis (14.4 Kbps) connections to the modem might make the modem run its RS-232 interface at 19.2 Kbps, while 2400 bps connections make the modem's RS-232 interface run at 2400 bps. Because getty does not understand any particular modem's connection speed reporting, getty gives a login: message at an initial speed and watches the characters that come back in response. If the user sees junk, it is assumed that they know they should press the Enter key until they see a recognizable prompt. If the data rates do not match, getty sees anything the user types as junk, tries going to the next speed and gives the login: prompt again.
This procedure can continue ad nauseam, but normally only takes a keystroke or two before the user sees a good prompt. Obviously, this login sequence does not look as clean as the former locked-speed method, but a user on a low-speed connection should receive better interactive response from full-screen programs. This section will try to give balanced configuration information, but is biased towards having the modem's data rate follow the connection rate.

<filename>/etc/gettytab</filename> /etc/gettytab

/etc/gettytab is a &man.termcap.5;-style file of configuration information for &man.getty.8;. Please see the &man.gettytab.5; manual page for complete information on the format of the file and the list of capabilities.

Locked-speed Config

If you are locking your modem's data communications rate at a particular speed, you probably will not need to make any changes to /etc/gettytab.

Matching-speed Config

You will need to set up an entry in /etc/gettytab to give getty information about the speeds you wish to use for your modem. If you have a 2400 bps modem, you can probably use the existing D2400 entry.

#
# Fast dialup terminals, 2400/1200/300 rotary (can start either way)
#
D2400|d2400|Fast-Dial-2400:\
        :nx=D1200:tc=2400-baud:
3|D1200|Fast-Dial-1200:\
        :nx=D300:tc=1200-baud:
5|D300|Fast-Dial-300:\
        :nx=D2400:tc=300-baud:

If you have a higher speed modem, you will probably need to add an entry in /etc/gettytab; here is an entry you could use for a 14.4 Kbps modem with a top interface speed of 19.2 Kbps:

#
# Additions for a V.32bis Modem
#
um|V300|High Speed Modem at 300,8-bit:\
        :nx=V19200:tc=std.300:
un|V1200|High Speed Modem at 1200,8-bit:\
        :nx=V300:tc=std.1200:
uo|V2400|High Speed Modem at 2400,8-bit:\
        :nx=V1200:tc=std.2400:
up|V9600|High Speed Modem at 9600,8-bit:\
        :nx=V2400:tc=std.9600:
uq|V19200|High Speed Modem at 19200,8-bit:\
        :nx=V9600:tc=std.19200:

This will result in 8-bit, no parity connections. The example above starts the communications rate at 19.2 Kbps (for a V.32bis connection), then cycles through 9600 bps (for V.32), 2400 bps, 1200 bps, 300 bps, and back to 19.2 Kbps. Communications rate cycling is implemented with the nx= (next table) capability. Each of the lines uses a tc= (table continuation) entry to pick up the rest of the standard settings for a particular data rate. If you have a 28.8 Kbps modem and/or you want to take advantage of compression on a 14.4 Kbps modem, you need to use a higher communications rate than 19.2 Kbps. Here is an example of a gettytab entry starting at 57.6 Kbps:

#
# Additions for a V.32bis or V.34 Modem
# Starting at 57.6 Kbps
#
vm|VH300|Very High Speed Modem at 300,8-bit:\
        :nx=VH57600:tc=std.300:
vn|VH1200|Very High Speed Modem at 1200,8-bit:\
        :nx=VH300:tc=std.1200:
vo|VH2400|Very High Speed Modem at 2400,8-bit:\
        :nx=VH1200:tc=std.2400:
vp|VH9600|Very High Speed Modem at 9600,8-bit:\
        :nx=VH2400:tc=std.9600:
vq|VH57600|Very High Speed Modem at 57600,8-bit:\
        :nx=VH9600:tc=std.57600:

If you have a slow CPU or a heavily loaded system and do not have 16550A-based serial ports, you may receive sio silo errors at 57.6 Kbps.

<filename>/etc/ttys</filename> /etc/ttys

Configuration of the /etc/ttys file was covered in . Configuration for modems is similar but we must pass a different argument to getty and specify a different terminal type.
The general format for both locked-speed and matching-speed configurations is: ttyd0 "/usr/libexec/getty xxx" dialup on The first item in the above line is the device special file for this entry — ttyd0 means /dev/ttyd0 is the file that this getty will be watching. The second item, "/usr/libexec/getty xxx" (xxx will be replaced by the initial gettytab capability) is the process init will run on the device. The third item, dialup, is the default terminal type. The fourth parameter, on, indicates to init that the line is operational. There can be a fifth parameter, secure, but it should only be used for terminals which are physically secure (such as the system console). The default terminal type (dialup in the example above) may depend on local preferences. dialup is the traditional default terminal type on dial-up lines so that users may customize their login scripts to notice when the terminal is dialup and automatically adjust their terminal type. However, the author finds it easier at his site to specify vt102 as the default terminal type, since the users just use VT102 emulation on their remote systems. After you have made changes to /etc/ttys, you may send the init process a HUP signal to re-read the file. You can use the command &prompt.root; kill -HUP 1 to send the signal. If this is your first time setting up the system, you may want to wait until your modem(s) are properly configured and connected before signaling init. Locked-speed Config For a locked-speed configuration, your ttys entry needs to have a fixed-speed entry provided to getty. For a modem whose port speed is locked at 19.2 Kbps, the ttys entry might look like this: ttyd0 "/usr/libexec/getty std.19200" dialup on If your modem is locked at a different data rate, substitute the appropriate value for std.speed instead of std.19200. Make sure that you use a valid type listed in /etc/gettytab. Matching-speed Config In a matching-speed configuration, your ttys entry needs to reference the appropriate beginning auto-baud (sic) entry in /etc/gettytab. For example, if you added the above suggested entry for a matching-speed modem that starts at 19.2 Kbps (the gettytab entry containing the V19200 starting point), your ttys entry might look like this: ttyd0 "/usr/libexec/getty V19200" dialup on <filename>/etc/rc.d/serial</filename> rc files rc.serial High-speed modems, like V.32, V.32bis, and V.34 modems, need to use hardware (RTS/CTS) flow control. You can add stty commands to /etc/rc.d/serial to set the hardware flow control flag in the FreeBSD kernel for the modem ports. For example to set the termios flag crtscts on serial port #1's (COM2) dial-in and dial-out initialization devices, the following lines could be added to /etc/rc.d/serial: # Serial port initial configuration stty -f /dev/ttyd1.init crtscts stty -f /dev/cuad1.init crtscts Modem Settings If you have a modem whose parameters may be permanently set in non-volatile RAM, you will need to use a terminal program (such as Telix under &ms-dos; or tip under FreeBSD) to set the parameters. 
Connect to the modem using the same communications speed as the initial speed getty will use and configure the modem's non-volatile RAM to match these requirements: CD asserted when connected DTR asserted for operation; dropping DTR hangs up line and resets modem CTS transmitted data flow control Disable XON/XOFF flow control RTS received data flow control Quiet mode (no result codes) No command echo Please read the documentation for your modem to find out what commands and/or DIP switch settings you need to give it. For example, to set the above parameters on a &usrobotics; &sportster; 14,400 external modem, one could give these commands to the modem: ATZ AT&C1&D2&H1&I0&R2&W You might also want to take this opportunity to adjust other settings in the modem, such as whether it will use V.42bis and/or MNP5 compression. The &usrobotics; &sportster; 14,400 external modem also has some DIP switches that need to be set; for other modems, perhaps you can use these settings as an example: Switch 1: UP — DTR Normal Switch 2: N/A (Verbal Result Codes/Numeric Result Codes) Switch 3: UP — Suppress Result Codes Switch 4: DOWN — No echo, offline commands Switch 5: UP — Auto Answer Switch 6: UP — Carrier Detect Normal Switch 7: UP — Load NVRAM Defaults Switch 8: N/A (Smart Mode/Dumb Mode) Result codes should be disabled/suppressed for dial-up modems to avoid problems that can occur if getty mistakenly gives a login: prompt to a modem that is in command mode and the modem echoes the command or returns a result code. This sequence can result in an extended, silly conversation between getty and the modem. Locked-speed Config For a locked-speed configuration, you will need to configure the modem to maintain a constant modem-to-computer data rate independent of the communications rate. On a &usrobotics; &sportster; 14,400 external modem, these commands will lock the modem-to-computer data rate at the speed used to issue the commands: ATZ AT&B1&W Matching-speed Config For a variable-speed configuration, you will need to configure your modem to adjust its serial port data rate to match the incoming call rate. On a &usrobotics; &sportster; 14,400 external modem, these commands will lock the modem's error-corrected data rate to the speed used to issue the commands, but allow the serial port rate to vary for non-error-corrected connections: ATZ AT&B2&W Checking the Modem's Configuration Most high-speed modems provide commands to view the modem's current operating parameters in a somewhat human-readable fashion. On the &usrobotics; &sportster; 14,400 external modems, the command ATI5 displays the settings that are stored in the non-volatile RAM. To see the true operating parameters of the modem (as influenced by the modem's DIP switch settings), use the commands ATZ and then ATI4. If you have a different brand of modem, check your modem's manual to see how to double-check your modem's configuration parameters. Troubleshooting Here are a few steps you can follow to check out the dial-up modem on your system. Checking Out the FreeBSD System Hook up your modem to your FreeBSD system, boot the system, and, if your modem has status indication lights, watch to see whether the modem's DTR indicator lights when the login: prompt appears on the system's console — if it lights up, that should mean that FreeBSD has started a getty process on the appropriate communications port and is waiting for the modem to accept a call. 
If the DTR indicator does not light, login to the FreeBSD system through the console and issue a ps ax to see if FreeBSD is trying to run a getty process on the correct port. You should see lines like these among the processes displayed: 114 ?? I 0:00.10 /usr/libexec/getty V19200 ttyd0 115 ?? I 0:00.10 /usr/libexec/getty V19200 ttyd1 If you see something different, like this: 114 d0 I 0:00.10 /usr/libexec/getty V19200 ttyd0 and the modem has not accepted a call yet, this means that getty has completed its open on the communications port. This could indicate a problem with the cabling or a mis-configured modem, because getty should not be able to open the communications port until CD (carrier detect) has been asserted by the modem. If you do not see any getty processes waiting to open the desired ttydN port, double-check your entries in /etc/ttys to see if there are any mistakes there. Also, check the log file /var/log/messages to see if there are any log messages from init or getty regarding any problems. If there are any messages, triple-check the configuration files /etc/ttys and /etc/gettytab, as well as the appropriate device special files /dev/ttydN, for any mistakes, missing entries, or missing device special files. Try Dialing In Try dialing into the system; be sure to use 8 bits, no parity, and 1 stop bit on the remote system. If you do not get a prompt right away, or get garbage, try pressing Enter about once per second. If you still do not see a login: prompt after a while, try sending a BREAK. If you are using a high-speed modem to do the dialing, try dialing again after locking the dialing modem's interface speed (via AT&B1 on a &usrobotics; &sportster; modem, for example). If you still cannot get a login: prompt, check /etc/gettytab again and double-check that The initial capability name specified in /etc/ttys for the line matches a name of a capability in /etc/gettytab Each nx= entry matches another gettytab capability name Each tc= entry matches another gettytab capability name If you dial but the modem on the FreeBSD system will not answer, make sure that the modem is configured to answer the phone when DTR is asserted. If the modem seems to be configured correctly, verify that the DTR line is asserted by checking the modem's indicator lights (if it has any). If you have gone over everything several times and it still does not work, take a break and come back to it later. If it still does not work, perhaps you can send an electronic mail message to the &a.questions; describing your modem and your problem, and the good folks on the list will try to help.
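One more convenience while repeating these checks: you can watch the system log in real time on another virtual console, so that any messages from init or getty appear the moment they are logged. This uses the same /var/log/messages file mentioned earlier: &prompt.root; tail -f /var/log/messages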
Dial-out Service dial-out service The following are tips for getting your host to be able to connect over the modem to another computer. This is appropriate for establishing a terminal session with a remote host. This is useful to log onto a BBS. This kind of connection can be extremely helpful to get a file on the Internet if you have problems with PPP. If you need to FTP something and PPP is broken, use the terminal session to FTP it. Then use zmodem to transfer it to your machine. My Stock Hayes Modem Is Not Supported, What Can I Do? Actually, the manual page for tip is out of date. There is a generic Hayes dialer already built in. Just use at=hayes in your /etc/remote file. The Hayes driver is not smart enough to recognize some of the advanced features of newer modems—messages like BUSY, NO DIALTONE, or CONNECT 115200 will just confuse it. You should turn those messages off when you use tip (using ATX0&W). Also, the dial timeout for tip is 60 seconds. Your modem should use something less, or else tip will think there is a communication problem. Try ATS7=45&W. As shipped, tip does not yet support Hayes modems fully. The solution is to edit the file tipconf.h in the directory /usr/src/usr.bin/tip/tip. Obviously you need the source distribution to do this. Edit the line #define HAYES 0 to #define HAYES 1. Then make and make install. Everything works nicely after that. How Am I Expected to Enter These AT Commands? /etc/remote Make what is called a direct entry in your /etc/remote file. For example, if your modem is hooked up to the first serial port, /dev/cuad0, then put in the following line: cuad0:dv=/dev/cuad0:br#19200:pa=none Use the highest bps rate your modem supports in the br capability. Then, type tip cuad0 and you will be connected to your modem. Or use cu as root with the following command: &prompt.root; cu -lline -sspeed line is the serial port (e.g. /dev/cuad0) and speed is the speed (e.g. 57600). When you are done entering the AT commands hit ~. to exit. The <literal>@</literal> Sign for the pn Capability Does Not Work! The @ sign in the phone number capability tells tip to look in /etc/phones for a phone number. But the @ sign is also a special character in capability files like /etc/remote. Escape it with a backslash: pn=\@ How Can I Dial a Phone Number on the Command Line? Put what is called a generic entry in your /etc/remote file. For example: tip115200|Dial any phone number at 115200 bps:\ :dv=/dev/cuad0:br#115200:at=hayes:pa=none:du: tip57600|Dial any phone number at 57600 bps:\ :dv=/dev/cuad0:br#57600:at=hayes:pa=none:du: Then you can do things like: &prompt.root; tip -115200 5551234 If you prefer cu over tip, use a generic cu entry: cu115200|Use cu to dial any number at 115200 bps:\ :dv=/dev/cuad1:br#115200:at=hayes:pa=none:du: and type: &prompt.root; cu 5551234 -s 115200 Do I Have to Type in the bps Rate Every Time I Do That? Put in an entry for tip1200 or cu1200, but go ahead and use whatever bps rate is appropriate with the br capability. tip thinks a good default is 1200 bps which is why it looks for a tip1200 entry. You do not have to use 1200 bps, though. I Access a Number of Hosts Through a Terminal Server Rather than waiting until you are connected and typing CONNECT <host> each time, use tip's cm capability. 
For example, these entries in /etc/remote: pain|pain.deep13.com|Forrester's machine:\ :cm=CONNECT pain\n:tc=deep13: muffin|muffin.deep13.com|Frank's machine:\ :cm=CONNECT muffin\n:tc=deep13: deep13:Gizmonics Institute terminal server:\ :dv=/dev/cuad2:br#38400:at=hayes:du:pa=none:pn=5551234: will let you type tip pain or tip muffin to connect to the hosts pain or muffin, and tip deep13 to get to the terminal server. Can Tip Try More Than One Line for Each Site? This is often a problem where a university has several modem lines and several thousand students trying to use them. Make an entry for your university in /etc/remote and use @ for the pn capability: big-university:\ :pn=\@:tc=dialout dialout:\ :dv=/dev/cuad3:br#9600:at=courier:du:pa=none: Then, list the phone numbers for the university in /etc/phones: big-university 5551111 big-university 5551112 big-university 5551113 big-university 5551114 tip will try each one in the listed order, then give up. If you want to keep retrying, run tip in a while loop. Why Do I Have to Hit <keycombo action="simul"> <keycap>Ctrl</keycap> <keycap>P</keycap> </keycombo> Twice to Send <keycombo action="simul"> <keycap>Ctrl</keycap> <keycap>P</keycap> </keycombo> Once? CtrlP is the default force character, used to tell tip that the next character is literal data. You can set the force character to any other character with the ~s escape, which means set a variable. Type ~sforce=single-char followed by a newline. single-char is any single character. If you leave out single-char, then the force character is the nul character, which you can get by typing Ctrl2 or CtrlSpace . A pretty good value for single-char is Shift Ctrl 6 , which is only used on some terminal servers. You can have the force character be whatever you want by specifying the following in your $HOME/.tiprc file: force=<single-char> Suddenly Everything I Type Is in Upper Case?? You must have pressed Ctrl A , tip's raise character, specially designed for people with broken caps-lock keys. Use ~s as above and set the variable raisechar to something reasonable. In fact, you can set it to the same as the force character, if you never expect to use either of these features. Here is a sample .tiprc file perfect for Emacs users who need to type Ctrl2 and CtrlA a lot: force=^^ raisechar=^^ The ^^ is ShiftCtrl6 . How Can I Do File Transfers with <command>tip</command>? If you are talking to another &unix; system, you can send and receive files with ~p (put) and ~t (take). These commands run cat and echo on the remote system to accept and send files. The syntax is: ~p local-file remote-file ~t remote-file local-file There is no error checking, so you probably should use another protocol, like zmodem. How Can I Run zmodem with <command>tip</command>? To receive files, start the sending program on the remote end. Then, type ~C rz to begin receiving them locally. To send files, start the receiving program on the remote end. Then, type ~C sz files to send them to the remote system. Kazutaka YOKOTA Contributed by Bill Paul Based on a document by Setting Up the Serial Console serial console Introduction FreeBSD has the ability to boot on a system with only a dumb terminal on a serial port as a console. Such a configuration should be useful for two classes of people: system administrators who wish to install FreeBSD on machines that have no keyboard or monitor attached, and developers who want to debug the kernel or device drivers. As described in , FreeBSD employs a three stage bootstrap. 
The first two stages are in the boot block code which is stored at the beginning of the FreeBSD slice on the boot disk. The boot block will then load and run the boot loader (/boot/loader) as the third stage code. In order to set up the serial console you must configure the boot block code, the boot loader code and the kernel. Serial Console Configuration, Terse Version This section assumes that you are using the default setup and just want a fast overview of setting up the serial console. Connect the serial cable to COM1 and the controlling terminal. To see all boot messages on the serial console, issue the following command while logged in as the superuser: &prompt.root; echo 'console="comconsole"' >> /boot/loader.conf Edit /etc/ttys and change off to on and dialup to vt100 for the ttyd0 entry. Otherwise a password will not be required to connect via the serial console, resulting in a potential security hole. Reboot the system to see if the changes took effect. If a different configuration is required, a more in-depth configuration explanation exists in . Serial Console Configuration Prepare a serial cable. null-modem cable You will need either a null-modem cable or a standard serial cable and a null-modem adapter. See for a discussion on serial cables. Unplug your keyboard. Most PC systems probe for the keyboard during the Power-On Self-Test (POST) and will generate an error if the keyboard is not detected. Some machines complain loudly about the lack of a keyboard and will not continue to boot until it is plugged in. If your computer complains about the error, but boots anyway, then you do not have to do anything special. (Some machines with Phoenix BIOS installed merely say Keyboard failed and continue to boot normally.) If your computer refuses to boot without a keyboard attached then you will have to configure the BIOS so that it ignores this error (if it can). Consult your motherboard's manual for details on how to do this. Set the keyboard to Not installed in the BIOS setup. You will still be able to use your keyboard. All this does is tell the BIOS not to probe for a keyboard at power-on. Your BIOS should not complain if the keyboard is absent. You can leave the keyboard plugged in even with this flag set to Not installed and the keyboard will still work. If your system has a &ps2; mouse, chances are very good that you may have to unplug your mouse as well as your keyboard. This is because &ps2; mice share some hardware with the keyboard and leaving the mouse plugged in can fool the keyboard probe into thinking the keyboard is still there. A Gateway 2000 Pentium 90 MHz system with an AMI BIOS is said to behave this way. In general, this is not a problem since the mouse is not much good without the keyboard anyway. Plug a dumb terminal into COM1 (sio0). If you do not have a dumb terminal, you can use an old PC/XT with a modem program, or the serial port on another &unix; box. If you do not have a COM1 (sio0), get one. At this time, there is no way to select a port other than COM1 for the boot blocks without recompiling the boot blocks. If you are already using COM1 for another device, you will have to temporarily remove that device and install a new boot block and kernel once you get FreeBSD up and running. 
(It is assumed that COM1 will be available on a file/compute/terminal server anyway; if you really need COM1 for something else (and you cannot switch that something else to COM2 (sio1)), then you probably should not even be bothering with all this in the first place.) Make sure the configuration file of your kernel has appropriate flags set for COM1 (sio0). Relevant flags are: 0x10 Enables console support for this unit. The other console flags are ignored unless this is set. Currently, at most one unit can have console support; the first one (in config file order) with this flag set is preferred. This option alone will not make the serial port the console. Set the following flag or use the -h option described below, together with this flag. 0x20 Forces this unit to be the console (unless there is another higher priority console), regardless of the -h option discussed below. The flag 0x20 must be used together with the flag 0x10. 0x40 Reserves this unit (in conjunction with 0x10) and makes the unit unavailable for normal access. You should not set this flag on the serial port unit which you want to use as the serial console. The only use of this flag is to designate the unit for kernel remote debugging. See The Developer's Handbook for more information on remote debugging. In FreeBSD 4.0 or later the semantics of the flag 0x40 are slightly different and there is another flag to specify a serial port for remote debugging. Example: device sio0 at isa? port IO_COM1 flags 0x10 irq 4 See the &man.sio.4; manual page for more details. If the flags were not set, you need to run UserConfig (on a different console) or recompile the kernel. Create boot.config in the root directory of the a partition on the boot drive. This file will instruct the boot block code how you would like to boot the system. In order to activate the serial console, you need one or more of the following options—if you want multiple options, include them all on the same line: -h Toggles internal and serial consoles. You can use this to switch console devices. For instance, if you boot from the internal (video) console, you can use -h to direct the boot loader and the kernel to use the serial port as its console device. Alternatively, if you boot from the serial port, you can use -h to tell the boot loader and the kernel to use the video display as the console instead. -D Toggles single and dual console configurations. In the single configuration the console will be either the internal console (video display) or the serial port, depending on the state of the -h option above. In the dual console configuration, both the video display and the serial port will become the console at the same time, regardless of the state of the -h option. However, note that the dual console configuration takes effect only while the boot block is running. Once the boot loader gets control, the console specified by the -h option becomes the only console. -P Makes the boot block probe the keyboard. If no keyboard is found, the -D and -h options are automatically set. Due to space constraints in the current version of the boot blocks, the -P option is capable of detecting extended keyboards only. Keyboards with fewer than 101 keys (and without F11 and F12 keys) may not be detected. Keyboards on some laptop computers may not be properly found because of this limitation. If this is the case with your system, you have to abandon using the -P option. Unfortunately there is no workaround for this problem. Use either the -P option to select the console automatically, or the -h option to activate the serial console. You may include other options described in &man.boot.8; as well. The options, except for -P, will be passed to the boot loader (/boot/loader). The boot loader will determine which of the internal video or the serial port should become the console by examining the state of the -h option alone. This means that if you specify the -D option but not the -h option in /boot.config, you can use the serial port as the console only during the boot block; the boot loader will use the internal video display as the console. Boot the machine. When you start your FreeBSD box, the boot blocks will echo the contents of /boot.config to the console. For example: /boot.config: -P Keyboard: no The second line appears only if you put -P in /boot.config and indicates presence/absence of the keyboard. These messages go to either serial or internal console, or both, depending on the options in /boot.config:

Options in /boot.config    Message goes to
none                       internal console
-h                         serial console
-D                         serial and internal consoles
-Dh                        serial and internal consoles
-P, keyboard present       internal console
-P, keyboard absent        serial console

After the above messages, there will be a small pause before the boot blocks continue loading the boot loader and before any further messages are printed to the console. Under normal circumstances, you do not need to interrupt the boot blocks, but you may want to do so in order to make sure things are set up correctly. Hit any key, other than Enter, at the console to interrupt the boot process. The boot blocks will then prompt you for further action. You should now see something like: - >> FreeBSD/i386 BOOT + >> FreeBSD/i386 BOOT Default: 0:ad(0,a)/boot/loader boot: Verify the above message appears on either the serial or internal console or both, according to the options you put in /boot.config. If the message appears in the correct console, hit Enter to continue the boot process. If you want the serial console but you do not see the prompt on the serial terminal, something is wrong with your settings. In the meantime, you enter -h and hit Enter/Return (if possible) to tell the boot block (and then the boot loader and the kernel) to choose the serial port for the console. Once the system is up, go back and check what went wrong. After the boot loader is loaded and you are in the third stage of the boot process you can still switch between the internal console and the serial console by setting appropriate environment variables in the boot loader. See . Summary Here is the summary of various settings discussed in this section and the console eventually selected. Case 1: You Set the Flags to 0x10 for <devicename>sio0</devicename> device sio0 at isa? port IO_COM1 flags 0x10 irq 4

Options in /boot.config    Console during boot blocks    Console during boot loader    Console in kernel
nothing                    internal                      internal                      internal
-h                         serial                        serial                        serial
-D                         serial and internal           internal                      internal
-Dh                        serial and internal           serial                        serial
-P, keyboard present       internal                      internal                      internal
-P, keyboard absent        serial and internal           serial                        serial

Case 2: You Set the Flags to 0x30 for sio0 device sio0 at isa? port IO_COM1 flags 0x30 irq 4

Options in /boot.config    Console during boot blocks    Console during boot loader    Console in kernel
nothing                    internal                      internal                      serial
-h                         serial                        serial                        serial
-D                         serial and internal           internal                      serial
-Dh                        serial and internal           serial                        serial
-P, keyboard present       internal                      internal                      serial
-P, keyboard absent        serial and internal           serial                        serial

Tips for the Serial Console Setting a Faster Serial Port Speed By default, the serial port settings are: 9600 baud, 8 bits, no parity, and 1 stop bit. If you wish to change the speed, you need to recompile at least the boot blocks. Add the following line to /etc/make.conf and compile new boot blocks: BOOT_COMCONSOLE_SPEED=19200 See for detailed instructions about building and installing new boot blocks. If the serial console is configured in some other way than by booting with -h, or if the serial console used by the kernel is different from the one used by the boot blocks, then you must also add the following option to the kernel configuration file and compile a new kernel: options CONSPEED=19200 Using a Serial Port Other Than <devicename>sio0</devicename> for the Console Using a port other than sio0 as the console requires some recompiling. If you want to use another serial port for whatever reason, recompile the boot blocks, the boot loader and the kernel as follows. Get the kernel source. (See ) Edit /etc/make.conf and set BOOT_COMCONSOLE_PORT to the address of the port you want to use (0x3F8, 0x2F8, 0x3E8 or 0x2E8). Only sio0 through sio3 (COM1 through COM4) can be used; multiport serial cards will not work. No interrupt setting is needed. Create a custom kernel configuration file and add appropriate flags for the serial port you want to use. For example, if you want to make sio1 (COM2) the console: device sio1 at isa? port IO_COM2 flags 0x10 irq 3 or device sio1 at isa? port IO_COM2 flags 0x30 irq 3 The console flags for the other serial ports should not be set. Recompile and install the boot blocks and the boot loader: &prompt.root; cd /sys/boot &prompt.root; make clean &prompt.root; make &prompt.root; make install Rebuild and install the kernel. Write the boot blocks to the boot disk with &man.disklabel.8; and boot from the new kernel. Entering the DDB Debugger from the Serial Line If you wish to drop into the kernel debugger from the serial console (useful for remote diagnostics, but also dangerous if you generate a spurious BREAK on the serial port!) then you should compile your kernel with the following options: options BREAK_TO_DEBUGGER options DDB Getting a Login Prompt on the Serial Console While this is not required, you may wish to get a login prompt over the serial line, now that you can see boot messages and can enter the kernel debugging session through the serial console. Here is how to do it. Open the file /etc/ttys with an editor and locate the lines: ttyd0 "/usr/libexec/getty std.9600" unknown off secure ttyd1 "/usr/libexec/getty std.9600" unknown off secure ttyd2 "/usr/libexec/getty std.9600" unknown off secure ttyd3 "/usr/libexec/getty std.9600" unknown off secure ttyd0 through ttyd3 correspond to COM1 through COM4. Change off to on for the desired port. If you have changed the speed of the serial port, you need to change std.9600 to match the current setting, e.g. std.19200. You may also want to change the terminal type from unknown to the actual type of your serial terminal. After editing the file, you must kill -HUP 1 to make this change take effect. 
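For example, assuming the 19.2 Kbps console speed configured above and a VT100-compatible serial terminal (both values are illustrative assumptions; substitute your own), the finished ttyd0 line would read: ttyd0 "/usr/libexec/getty std.19200" vt100 on secure followed by kill -HUP 1 as usual so that init picks up the change.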
Changing Console from the Boot Loader Previous sections described how to set up the serial console by tweaking the boot block. This section shows that you can specify the console by entering some commands and environment variables in the boot loader. As the boot loader is invoked at the third stage of the boot process, after the boot block, the settings in the boot loader will override the settings in the boot block. Setting Up the Serial Console You can easily specify the boot loader and the kernel to use the serial console by writing just one line in /boot/loader.rc: set console="comconsole" This will take effect regardless of the settings in the boot block discussed in the previous section. You had better put the above line as the first line of /boot/loader.rc so as to see boot messages on the serial console as early as possible. Likewise, you can specify the internal console as: set console="vidconsole" If you do not set the boot loader environment variable console, the boot loader, and subsequently the kernel, will use whichever console indicated by the option in the boot block. In versions 3.2 or later, you may specify the console in /boot/loader.conf.local or /boot/loader.conf, rather than in /boot/loader.rc. In this method your /boot/loader.rc should look like: include /boot/loader.4th start Then, create /boot/loader.conf.local and put the following line there. console=comconsole or console=vidconsole See &man.loader.conf.5; for more information. At the moment, the boot loader has no option equivalent to the option in the boot block, and there is no provision to automatically select the internal console and the serial console based on the presence of the keyboard. Using a Serial Port Other Than <devicename>sio0</devicename> for the Console You need to recompile the boot loader to use a serial port other than sio0 for the serial console. Follow the procedure described in . Caveats The idea here is to allow people to set up dedicated servers that require no graphics hardware or attached keyboards. Unfortunately, while most systems will let you boot without a keyboard, there are quite a few that will not let you boot without a graphics adapter. Machines with AMI BIOSes can be configured to boot with no graphics adapter installed simply by changing the graphics adapter setting in the CMOS configuration to Not installed. However, many machines do not support this option and will refuse to boot if you have no display hardware in the system. With these machines, you will have to leave some kind of graphics card plugged in, (even if it is just a junky mono board) although you will not have to attach a monitor. You might also try installing an AMI BIOS.
diff --git a/en_US.ISO8859-1/books/handbook/x11/chapter.sgml b/en_US.ISO8859-1/books/handbook/x11/chapter.sgml index c21eda03d2..28a48c2094 100644 --- a/en_US.ISO8859-1/books/handbook/x11/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/x11/chapter.sgml @@ -1,1822 +1,1822 @@ Ken Tom Updated for X.Org's X11 server by Marc Fonvieille The X Window System Synopsis FreeBSD uses X11 to provide users with a powerful graphical user interface. X11 is an open-source implementation of the X Window System that includes both &xorg; and &xfree86;. &os; versions up to and including &os; 4.11-RELEASE and &os; 5.2.1-RELEASE will find the default installation to be &xfree86;, the X11 server released by The &xfree86; Project, Inc. As of &os; 5.3-RELEASE, the default and official flavor of X11 was changed to &xorg;, the X11 server developed by the X.Org Foundation. This chapter will cover the installation and configuration of X11 with emphasis on &xorg;. For more information on the video hardware that X11 supports, check either the &xorg; or &xfree86; web sites. After reading this chapter, you will know: The various components of the X Window System, and how they interoperate. How to install and configure X11. How to install and use different window managers. How to use &truetype; fonts in X11. How to set up your system for graphical logins (XDM). Before reading this chapter, you should: Know how to install additional third-party software (). This chapter covers the installation and the configuration of both &xorg; and &xfree86; X11 servers. For the most part, configuration files, commands and syntaxes are identical. In the case where there are differences, both &xorg; and &xfree86; syntaxes will be shown. Understanding X Using X for the first time can be somewhat of a shock to someone familiar with other graphical environments, such as µsoft.windows; or &macos;. While it is not necessary to understand all of the details of various X components and how they interact, some basic knowledge makes it possible to take advantage of X's strengths. Why X? X is not the first window system written for &unix;, but it is the most popular of them. X's original development team had worked on another window system prior to writing X. That system's name was W (for Window). X was just the next letter in the Roman alphabet. X can be called X, X Window System, X11, and a number of other terms. You may find that using the term X Windows to describe X11 can be offensive to some people; for a bit more insight on this, see &man.X.7;. The X Client/Server Model X was designed from the beginning to be network-centric, and adopts a client-server model. In the X model, the X server runs on the computer that has the keyboard, monitor, and mouse attached. The server's responsibility includes tasks such as managing the display, handling input from the keyboard and mouse, and so on. Each X application (such as XTerm, or &netscape;) is a client. A client sends messages to the server such as Please draw a window at these coordinates, and the server sends back messages such as The user just clicked on the OK button. In a home or small office environment, the X server and the X clients commonly run on the same computer. However, it is perfectly possible to run the X server on a less powerful desktop computer, and run X applications (the clients) on, say, the powerful and expensive machine that serves the office. In this scenario the communication between the X client and server takes place over the network. 
This confuses some people, because the X terminology is exactly backward to what they expect. They expect the X server to be the big powerful machine down the hall, and the X client to be the machine on their desk. It is important to remember that the X server is the machine with the monitor and keyboard, and the X clients are the programs that display the windows. There is nothing in the protocol that forces the client and server machines to be running the same operating system, or even to be running on the same type of computer. It is certainly possible to run an X server on µsoft.windows; or Apple's &macos;, and there are various free and commercial applications available that do exactly that. Starting with &os; 5.3-RELEASE, the X server that installs with &os; is &xorg;, and is available for free, under a license very similar to the FreeBSD license. Commercial X servers for FreeBSD are also available. The Window Manager The X design philosophy is much like the &unix; design philosophy, tools, not policy. This means that X does not try to dictate how a task is to be accomplished. Instead, tools are provided to the user, and it is the user's responsibility to decide how to use those tools. This philosophy extends to X not dictating what windows should look like on screen, how to move them around with the mouse, what keystrokes should be used to move between windows (i.e., Alt Tab , in the case of µsoft.windows;), what the title bars on each window should look like, whether or not they have close buttons on them, and so on. Instead, X delegates this responsibility to an application called a Window Manager. There are dozens of window managers available for X: AfterStep, Blackbox, ctwm, Enlightenment, fvwm, Sawfish, twm, Window Maker, and more. Each of these window managers provides a different look and feel; some of them support virtual desktops; some of them allow customized keystrokes to manage the desktop; some have a Start button or similar device; some are themeable, allowing a complete change of look-and-feel by applying a new theme. These window managers, and many more, are available in the x11-wm category of the Ports Collection. In addition, the KDE and GNOME desktop environments both have their own window managers which integrate with the desktop. Each window manager also has a different configuration mechanism; some expect configuration file written by hand, others feature GUI tools for most of the configuration tasks; at least one (Sawfish) has a configuration file written in a dialect of the Lisp language. Focus Policy Another feature the window manager is responsible for is the mouse focus policy. Every windowing system needs some means of choosing a window to be actively receiving keystrokes, and should visibly indicate which window is active as well. A familiar focus policy is called click-to-focus. This is the model utilized by µsoft.windows;, in which a window becomes active upon receiving a mouse click. X does not support any particular focus policy. Instead, the window manager controls which window has the focus at any one time. Different window managers will support different focus methods. All of them support click to focus, and the majority of them support several others. The most popular focus policies are: focus-follows-mouse The window that is under the mouse pointer is the window that has the focus. This may not necessarily be the window that is on top of all the other windows. 
The focus is changed by pointing at another window; there is no need to click in it as well. sloppy-focus This policy is a small extension to focus-follows-mouse. With focus-follows-mouse, if the mouse is moved over the root window (or background) then no window has the focus, and keystrokes are simply lost. With sloppy-focus, focus is only changed when the cursor enters a new window, and not when exiting the current window. click-to-focus The active window is selected by mouse click. The window may then be raised, and appear in front of all other windows. All keystrokes will now be directed to this window, even if the cursor is moved to another window. Many window managers support other policies, as well as variations on these. Be sure to consult the documentation for the window manager itself. Widgets The X approach of providing tools and not policy extends to the widgets seen on screen in each application. Widget is a term for all the items in the user interface that can be clicked or manipulated in some way; buttons, check boxes, radio buttons, icons, lists, and so on. &microsoft.windows; calls these controls. &microsoft.windows; and Apple's &macos; both have a very rigid widget policy. Application developers are supposed to ensure that their applications share a common look and feel. With X, it was not considered sensible to mandate a particular graphical style, or set of widgets to adhere to. As a result, do not expect X applications to have a common look and feel. There are several popular widget sets and variations, including the original Athena widget set from MIT, &motif; (on which the widget set in &microsoft.windows; was modeled, all bevelled edges and three shades of grey), OpenLook, and others. Most newer X applications today will use a modern-looking widget set, either Qt, used by KDE, or GTK+, used by the GNOME project. In this respect, there is some convergence in look-and-feel of the &unix; desktop, which certainly makes things easier for the novice user. Installing X11 &xorg; or &xfree86; may be installed on &os;. Beginning with &os; 5.3-RELEASE, &xorg; is the default X11 implementation for &os;. &xorg; is the X server of the open source X Window System implementation released by the X.Org Foundation. &xorg; is based on the code of &xfree86; 4.4RC2 and X11R6.6. The X.Org Foundation released X11R6.7 in April 2004 and X11R6.8.2 in February 2005; the latter is the version currently available in the &os; Ports Collection. To build and install &xorg; from the Ports Collection: &prompt.root; cd /usr/ports/x11/xorg &prompt.root; make install clean To build &xorg; in its entirety, be sure to have at least 4 GB of free space available. To build and install &xfree86; from the Ports Collection: &prompt.root; cd /usr/ports/x11/XFree86-4 &prompt.root; make install clean Alternatively, X11 can be installed directly from packages. Binary packages to use with the &man.pkg.add.1; tool are also available for X11. When the remote fetching feature of &man.pkg.add.1; is used, the version number of the package must be removed. &man.pkg.add.1; will automatically fetch the latest version of the application. So to fetch and install the package of &xorg;, simply type: &prompt.root; pkg_add -r xorg The &xfree86; 4.X package can be installed by typing: &prompt.root; pkg_add -r XFree86 The examples above will install the complete X11 distribution including the servers, clients, fonts etc. Separate packages and ports of X11 are also available. 
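For instance, to install only the X server and the font collections from their individual ports, skipping the rest of the distribution, something like the following should work (the port names x11-servers/xorg-server and x11-fonts/xorg-fonts are given as an illustration; check the Ports Collection for the exact names in your tree): &prompt.root; cd /usr/ports/x11-servers/xorg-server &prompt.root; make install clean &prompt.root; cd /usr/ports/x11-fonts/xorg-fonts &prompt.root; make install clean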
The rest of this chapter will explain how to configure X11, and how to set up a productive desktop environment. Moving from <application>&xfree86;</application> to <application>&xorg;</application> As with any port, you should check the /usr/ports/UPDATING file for changes. Included in this file are instructions for converting your system from &xfree86; to &xorg;. Use CVSup to update your ports tree prior to attempting any conversion. You will also need to install sysutils/portupgrade prior to converting your X11 installation. In your /etc/make.conf you will need to add the variable X_WINDOW_SYSTEM=xorg. This ensures that your system knows which X11 is being used. The older XFREE86_VERSION variable has been deprecated and has been replaced with the X_WINDOW_SYSTEM variable. Then, use the following commands: &prompt.root; pkg_delete -f /var/db/pkg/imake-4* /var/db/pkg/XFree86-* &prompt.root; cd /usr/ports/x11/xorg &prompt.root; make install clean &prompt.root; pkgdb -F The &man.pkgdb.1; command is part of the portupgrade software and will update various package dependencies. To build &xorg; in its entirety, be sure to have at least 4 GB of free space available. Christopher Shumway Contributed by X11 Configuration &xfree86; 4.X &xfree86; &xorg; X11 Before Starting Before configuration of X11 the following information about the target system is needed: Monitor specifications Video Adapter chipset Video Adapter memory horizontal scan rate vertical scan rate The specifications for the monitor are used by X11 to determine the resolution and refresh rate to run at. These specifications can usually be obtained from the documentation that came with the monitor or from the manufacturer's website. There are two ranges of numbers that are needed, the horizontal scan rate and the vertical synchronization rate. The video adapter's chipset defines what driver module X11 uses to talk to the graphics hardware. With most chipsets, this can be automatically determined, but it is still useful to know in case the automatic detection does not work correctly. Video memory on the graphic adapter determines the resolution and color depth which the system can run at. This is important to know so the user knows the limitations of the system. Configuring X11 Configuration of X11 is a multi-step process. The first step is to build an initial configuration file. As the super user, simply run: &prompt.root; Xorg -configure In the case of &xfree86; type: &prompt.root; XFree86 -configure This will generate an X11 configuration skeleton file in the /root directory called xorg.conf.new (whether you &man.su.1; or do a direct login affects the inherited supervisor $HOME directory variable). For &xfree86;, this configuration file is called XF86Config.new. The X11 program will attempt to probe the graphics hardware on the system and write a configuration file to load the proper drivers for the detected hardware on the target system. The next step is to test the existing configuration to verify that &xorg; can work with the graphics hardware on the target system. To perform this task, type: &prompt.root; Xorg -config xorg.conf.new &xfree86; users will type: &prompt.root; XFree86 -xf86config XF86Config.new If a black and grey grid and an X mouse cursor appear, the configuration was successful. To exit the test, just press Ctrl Alt Backspace simultaneously. If the mouse does not work, you will need to first configure it before proceeding. See in the &os; install chapter. 
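As a sketch, a minimal mouse entry in xorg.conf.new could look like the following; the identifier is arbitrary, and the /dev/sysmouse device with the auto protocol assumes that &man.moused.8; is managing the mouse, so adjust both to your hardware:

Section "InputDevice"
        Identifier "Mouse0"
        Driver     "mouse"
        Option     "Protocol" "auto"
        Option     "Device"   "/dev/sysmouse"
EndSection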
X11 tuning Next, tune the xorg.conf.new (or XF86Config.new if you are running &xfree86;) configuration file to taste. Open the file in a text editor such as &man.emacs.1; or &man.ee.1;. First, add the frequencies for the target system's monitor. These are usually expressed as a horizontal and vertical synchronization rate. These values are added to the xorg.conf.new file under the "Monitor" section: Section "Monitor" Identifier "Monitor0" VendorName "Monitor Vendor" ModelName "Monitor Model" HorizSync 30-107 VertRefresh 48-120 EndSection The HorizSync and VertRefresh keywords may be missing in the configuration file. If they are, they need to be added, with the correct horizontal synchronization rate placed after the HorizSync keyword and the vertical synchronization rate after the VertRefresh keyword. In the example above, the target monitor's rates were entered. X allows DPMS (Energy Star) features to be used with capable monitors. The &man.xset.1; program controls the time-outs and can force standby, suspend, or off modes. If you wish to enable DPMS features for your monitor, you must add the following line to the monitor section: Option "DPMS" xorg.conf XF86Config While the xorg.conf.new (or XF86Config.new) configuration file is still open in an editor, select the default resolution and color depth desired. This is defined in the "Screen" section: Section "Screen" Identifier "Screen0" Device "Card0" Monitor "Monitor0" DefaultDepth 24 SubSection "Display" Viewport 0 0 Depth 24 Modes "1024x768" EndSubSection EndSection The DefaultDepth keyword describes the color depth to run at by default. This can be overridden with the command line switch to &man.Xorg.1; (or &man.XFree86.1;). The Modes keyword describes the resolution to run at for the given color depth. Note that only VESA standard modes are supported as defined by the target system's graphics hardware. In the example above, the default color depth is twenty-four bits per pixel. At this color depth, the accepted resolution is 1024 by 768 pixels. Finally, write the configuration file and test it using the test mode given above. One of the tools available to assist you during the troubleshooting process is the set of X11 log files, which contain information on each device that the X11 server attaches to. &xorg; log file names are in the format of /var/log/Xorg.0.log (&xfree86; log file names follow the format of XFree86.0.log). The exact name of the log can vary from Xorg.0.log to Xorg.8.log and so forth. If all is well, the configuration file needs to be installed in a common location where &man.Xorg.1; (or &man.XFree86.1;) can find it. This is typically /etc/X11/xorg.conf or /usr/X11R6/etc/X11/xorg.conf (for &xfree86; it is called /etc/X11/XF86Config or /usr/X11R6/etc/X11/XF86Config). &prompt.root; cp xorg.conf.new /etc/X11/xorg.conf For &xfree86;: &prompt.root; cp XF86Config.new /etc/X11/XF86Config The X11 configuration process is now complete. In order to start &xfree86; 4.X with &man.startx.1;, install the x11/wrapper port. &xorg; already includes the wrapper code and does not require the installation of the wrapper port. The X11 server may also be started with the use of &man.xdm.1;. There is also a graphical configuration tool, &man.xorgcfg.1; (&man.xf86cfg.1; for &xfree86;), that comes with the X11 distribution. It allows you to interactively define your configuration by choosing the appropriate drivers and settings. This program can be invoked from the console, by typing the command xorgcfg -textmode. 
For more details, refer to the &man.xorgcfg.1; and &man.xf86cfg.1; manual pages. Alternatively, there is also a tool called &man.xorgconfig.1; (&man.xf86config.1; for &xfree86;), this program is a console utility that is less user friendly, but it may work in situations where the other tools do not. Advanced Configuration Topics Configuration with &intel; i810 Graphics Chipsets Intel i810 graphic chipset Configuration with &intel; i810 integrated chipsets requires the agpgart AGP programming interface for X11 to drive the card. The &man.agp.4; driver is in the GENERIC kernel since releases 4.8-RELEASE and 5.0-RELEASE. On prior releases, you will have to add the following line: device agp in your kernel configuration file and rebuild a new kernel. Instead, you may want to load the agp.ko kernel module automatically with the &man.loader.8; at boot time. For that, simply add this line to /boot/loader.conf: agp_load="YES" Next, if you are running FreeBSD 4.X or earlier, a device node needs to be created for the programming interface. To create the AGP device node, run &man.MAKEDEV.8; in the /dev directory: &prompt.root; cd /dev &prompt.root; sh MAKEDEV agpgart FreeBSD 5.X or later will use &man.devfs.5; to allocate device nodes transparently, therefore the &man.MAKEDEV.8; step is not required. This will allow configuration of the hardware as any other graphics board. Note on systems without the &man.agp.4; driver compiled in the kernel, trying to load the module with &man.kldload.8; will not work. This driver has to be in the kernel at boot time through being compiled in or using /boot/loader.conf. If you are using &xfree86; 4.1.0 (or later) and messages about unresolved symbols like fbPictureInit appear, try adding the following line after Driver "i810" in the X11 configuration file: Option "NoDDC" Murray Stokely Contributed by Using Fonts in X11 Type1 Fonts The default fonts that ship with X11 are less than ideal for typical desktop publishing applications. Large presentation fonts show up jagged and unprofessional looking, and small fonts in &netscape; are almost completely unintelligible. However, there are several free, high quality Type1 (&postscript;) fonts available which can be readily used with X11. For instance, the URW font collection (x11-fonts/urwfonts) includes high quality versions of standard type1 fonts (Times Roman, Helvetica, Palatino and others). The Freefonts collection (x11-fonts/freefonts) includes many more fonts, but most of them are intended for use in graphics software such as the Gimp, and are not complete enough to serve as screen fonts. In addition, X11 can be configured to use &truetype; fonts with a minimum of effort. For more details on this, see the &man.X.7; manual page or the section on &truetype; fonts. To install the above Type1 font collections from the ports collection, run the following commands: &prompt.root; cd /usr/ports/x11-fonts/urwfonts &prompt.root; make install clean And likewise with the freefont or other collections. 
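For instance, the Freefonts collection mentioned above installs the same way: &prompt.root; cd /usr/ports/x11-fonts/freefonts &prompt.root; make install clean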
To have the X server detect these fonts, add an appropriate line to the X server configuration file in /etc/X11/ (xorg.conf for &xorg; and XF86Config for &xfree86;), which reads: FontPath "/usr/X11R6/lib/X11/fonts/URW/" Alternatively, at the command line in the X session run: &prompt.user; xset fp+ /usr/X11R6/lib/X11/fonts/URW &prompt.user; xset fp rehash This will work but will be lost when the X session is closed, unless it is added to the startup file (~/.xinitrc for a normal startx session, or ~/.xsession when logging in through a graphical login manager like XDM). A third way is to use the new /usr/X11R6/etc/fonts/local.conf file: see the section on anti-aliasing. &truetype; Fonts TrueType Fonts fonts TrueType Both &xfree86; 4.X and &xorg; have built in support for rendering &truetype; fonts. There are two different modules that can enable this functionality. The freetype module is used in this example because it is more consistent with the other font rendering back-ends. To enable the freetype module just add the following line to the "Module" section of the /etc/X11/xorg.conf or /etc/X11/XF86Config file. Load "freetype" For &xfree86; 3.3.X, a separate &truetype; font server is needed. Xfstt is commonly used for this purpose. To install Xfstt, simply install the port x11-servers/Xfstt. Now make a directory for the &truetype; fonts (for example, /usr/X11R6/lib/X11/fonts/TrueType) and copy all of the &truetype; fonts into this directory. Keep in mind that &truetype; fonts cannot be directly taken from a &macintosh;; they must be in &unix;/&ms-dos;/&windows; format for use by X11. Once the files have been copied into this directory, use ttmkfdir to create a fonts.dir file, so that the X font renderer knows that these new files have been installed. ttmkfdir is available from the FreeBSD Ports Collection as x11-fonts/ttmkfdir. &prompt.root; cd /usr/X11R6/lib/X11/fonts/TrueType &prompt.root; ttmkfdir > fonts.dir Now add the &truetype; directory to the font path. This is just the same as described above for Type1 fonts, that is, use &prompt.user; xset fp+ /usr/X11R6/lib/X11/fonts/TrueType &prompt.user; xset fp rehash or add a FontPath line to the xorg.conf (or XF86Config) file. That's it. Now &netscape;, Gimp, &staroffice;, and all of the other X applications should now recognize the installed &truetype; fonts. Extremely small fonts (as with text in a high resolution display on a web page) and extremely large fonts (within &staroffice;) will look much better now. Joe Marcus Clarke Updated by Anti-Aliased Fonts anti-aliased fonts fonts anti-aliased Anti-aliasing has been available in X11 since &xfree86; 4.0.2. However, font configuration was cumbersome before the introduction of &xfree86; 4.3.0. Beginning with &xfree86; 4.3.0, all fonts in X11 that are found in /usr/X11R6/lib/X11/fonts/ and ~/.fonts/ are automatically made available for anti-aliasing to Xft-aware applications. Not all applications are Xft-aware, but many have received Xft support. Examples of Xft-aware applications include Qt 2.3 and higher (the toolkit for the KDE desktop), GTK+ 2.0 and higher (the toolkit for the GNOME desktop), and Mozilla 1.2 and higher. In order to control which fonts are anti-aliased, or to configure anti-aliasing properties, create (or edit, if it already exists) the file /usr/X11R6/etc/fonts/local.conf. Several advanced features of the Xft font system can be tuned using this file; this section describes only some simple possibilities. For more details, please see &man.fonts-conf.5;. 
XML This file must be in XML format. Pay careful attention to case, and make sure all tags are properly closed. The file begins with the usual XML header followed by a DOCTYPE definition, and then the <fontconfig> tag: <?xml version="1.0"?> <!DOCTYPE fontconfig SYSTEM "fonts.dtd"> <fontconfig> As previously stated, all fonts in /usr/X11R6/lib/X11/fonts/ as well as ~/.fonts/ are already made available to Xft-aware applications. If you wish to add another directory outside of these two directory trees, add a line similar to the following to /usr/X11R6/etc/fonts/local.conf: <dir>/path/to/my/fonts</dir> After adding new fonts, and especially new font directories, you should run the following command to rebuild the font caches: &prompt.root; fc-cache -f Anti-aliasing makes borders slightly fuzzy, which makes very small text more readable and removes staircases from large text, but can cause eyestrain if applied to normal text. To exclude font sizes smaller than 14 point from anti-aliasing, include these lines: <match target="font"> <test name="size" compare="less"> <double>14</double> </test> <edit name="antialias" mode="assign"> <bool>false</bool> </edit> </match> <match target="font"> <test name="pixelsize" compare="less" qual="any"> <double>14</double> </test> <edit mode="assign" name="antialias"> <bool>false</bool> </edit> </match> fonts spacing Spacing for some monospaced fonts may also be inappropriate with anti-aliasing. This seems to be an issue with KDE, in particular. One possible fix for this is to force the spacing for such fonts to be 100. Add the following lines: <match target="pattern" name="family"> <test qual="any" name="family"> <string>fixed</string> </test> <edit name="family" mode="assign"> <string>mono</string> </edit> </match> <match target="pattern" name="family"> <test qual="any" name="family"> <string>console</string> </test> <edit name="family" mode="assign"> <string>mono</string> </edit> </match> (this aliases the other common names for fixed fonts as "mono"), and then add: <match target="pattern" name="family"> <test qual="any" name="family"> <string>mono</string> </test> <edit name="spacing" mode="assign"> <int>100</int> </edit> </match> Certain fonts, such as Helvetica, may have a problem when anti-aliased. Usually this manifests itself as a font that seems cut in half vertically. At worst, it may cause applications such as Mozilla to crash. To avoid this, consider adding the following to local.conf: <match target="pattern" name="family"> <test qual="any" name="family"> <string>Helvetica</string> </test> <edit name="family" mode="assign"> <string>sans-serif</string> </edit> </match> Once you have finished editing local.conf make sure you end the file with the </fontconfig> tag. Not doing this will cause your changes to be ignored. The default font set that comes with X11 is not very desirable when it comes to anti-aliasing. A much better set of default fonts can be found in the x11-fonts/bitstream-vera port. This port will install a /usr/X11R6/etc/fonts/local.conf file if one does not exist already. If the file does exist, the port will create a /usr/X11R6/etc/fonts/local.conf-vera file. Merge the contents of this file into /usr/X11R6/etc/fonts/local.conf, and the Bitstream fonts will automatically replace the default X11 Serif, Sans Serif, and Monospaced fonts. Finally, users can add their own settings via their personal .fonts.conf files. To do this, each user should simply create a ~/.fonts.conf. This file must also be in XML format. 
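As an illustration, a minimal ~/.fonts.conf that simply forces anti-aliasing on for every font might read as follows; any of the <match> rules shown above for local.conf can be used here in the same way:

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <match target="font">
    <edit name="antialias" mode="assign">
      <bool>true</bool>
    </edit>
  </match>
</fontconfig>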
LCD screen Fonts LCD screen One last point: with an LCD screen, sub-pixel sampling may be desired. This basically treats the (horizontally separated) red, green and blue components separately to improve the horizontal resolution; the results can be dramatic. To enable this, add the line somewhere in the local.conf file: <match target="font"> <test qual="all" name="rgba"> <const>unknown</const> </test> <edit name="rgba" mode="assign"> <const>rgb</const> </edit> </match> Depending on the sort of display, rgb may need to be changed to bgr, vrgb or vbgr: experiment and see which works best. Mozilla disabling anti-aliased fonts Anti-aliasing should be enabled the next time the X server is started. However, programs must know how to take advantage of it. At present, the Qt toolkit does, so the entire KDE environment can use anti-aliased fonts (see on KDE for details). GTK+ and GNOME can also be made to use anti-aliasing via the Font capplet (see for details). By default, Mozilla 1.2 and greater will automatically use anti-aliasing. To disable this, rebuild Mozilla with the -DWITHOUT_XFT flag. Seth Kingsley Contributed by The X Display Manager Overview X Display Manager The X Display Manager (XDM) is an optional part of the X Window System that is used for login session management. This is useful for several types of situations, including minimal X Terminals, desktops, and large network display servers. Since the X Window System is network and protocol independent, there are a wide variety of possible configurations for running X clients and servers on different machines connected by a network. XDM provides a graphical interface for choosing which display server to connect to, and entering authorization information such as a login and password combination. Think of XDM as providing the same functionality to the user as the &man.getty.8; utility (see for details). That is, it performs system logins to the display being connected to and then runs a session manager on behalf of the user (usually an X window manager). XDM then waits for this program to exit, signaling that the user is done and should be logged out of the display. At this point, XDM can display the login and display chooser screens for the next user to login. Using XDM The XDM daemon program is located in /usr/X11R6/bin/xdm. This program can be run at any time as root and it will start managing the X display on the local machine. If XDM is to be run every time the machine boots up, a convenient way to do this is by adding an entry to /etc/ttys. For more information about the format and usage of this file, see . There is a line in the default /etc/ttys file for running the XDM daemon on a virtual terminal: ttyv8 "/usr/X11R6/bin/xdm -nodaemon" xterm off secure By default this entry is disabled; in order to enable it change field 5 from off to on and restart &man.init.8; using the directions in . The first field, the name of the terminal this program will manage, is ttyv8. This means that XDM will start running on the 9th virtual terminal. Configuring XDM The XDM configuration directory is located in /usr/X11R6/lib/X11/xdm. In this directory there are several files used to change the behavior and appearance of XDM. Typically these files will be found: File Description Xaccess Client authorization ruleset. Xresources Default X resource values. Xservers List of remote and local displays to manage. Xsession Default session script for logins. Xsetup_* Script to launch applications before the login interface. 
xdm-config Global configuration for all displays running on this machine. xdm-errors Errors generated by the server program. xdm-pid The process ID of the currently running XDM. Also in this directory are a few scripts and programs used to set up the desktop when XDM is running. The purpose of each of these files will be briefly described. The exact syntax and usage of all of these files is described in &man.xdm.1;. The default configuration is a simple rectangular login window with the hostname of the machine displayed at the top in a large font and Login: and Password: prompts below. This is a good starting point for changing the look and feel of XDM screens. Xaccess The protocol for connecting to XDM controlled displays is called the X Display Manager Connection Protocol (XDMCP). This file is a ruleset for controlling XDMCP connections from remote machines. It's ignored unless the xdm-config is changed to listen for remote connections. By default, it does not allow any clients to connect. Xresources This is an application-defaults file for the display chooser and the login screens. This is where the appearance of the login program can be modified. The format is identical to the app-defaults file described in the X11 documentation. Xservers This is a list of the remote displays the chooser should provide as choices. Xsession This is the default session script for XDM to run after a user has logged in. Normally each user will have a customized session script in ~/.xsession that overrides this script. Xsetup_* These will be run automatically before displaying the chooser or login interfaces. There is a script for each display being used, named Xsetup_ followed by the local display number (for instance Xsetup_0). Typically these scripts will run one or two programs in the background such as xconsole. xdm-config This contains settings in the form of app-defaults that are applicable to every display that this installation manages. xdm-errors This contains the output of the X servers that XDM is trying to run. If a display that XDM is trying to start hangs for some reason, this is a good place to look for error messages. These messages are also written to the user's ~/.xsession-errors file on a per-session basis. Running a Network Display Server In order for other clients to connect to the display server, edit the access control rules, and enable the connection listener. By default these are set to conservative values. To make XDM listen for connections, first comment out a line in the xdm-config file: ! SECURITY: do not listen for XDMCP or Chooser requests ! Comment out this line if you want to manage X terminals with xdm DisplayManager.requestPort: 0 and then restart XDM. Remember that comments in app-defaults files begin with a ! character, not the usual #. More strict access controls may be desired. Look at the example entries in Xaccess, and refer to the &man.xdm.1; manual page. Replacements for XDM Several replacements for the default XDM program exist. One of them, kdm (bundled with KDE) is described later in this chapter. The kdm display manager offers many visual improvements and cosmetic frills, as well as the functionality to allow users to choose their window manager of choice at login time. Valentino Vaschetto Contributed by Desktop Environments This section describes the different desktop environments available for X on FreeBSD. A desktop environment can mean anything ranging from a simple window manager to a complete suite of desktop applications, such as KDE or GNOME. 
GNOME About GNOME GNOME GNOME is a user-friendly desktop environment that enables users to easily use and configure their computers. GNOME includes a panel (for starting applications and displaying status), a desktop (where data and applications can be placed), a set of standard desktop tools and applications, and a set of conventions that make it easy for applications to cooperate and be consistent with each other. Users of other operating systems or environments should feel right at home using the powerful graphics-driven environment that GNOME provides. More information regarding GNOME on FreeBSD can be found on the FreeBSD GNOME Project's web site. The web site also contains fairly comprehensive FAQs about installing, configuring, and managing GNOME. Installing GNOME The easiest way to install GNOME is through the Desktop Configuration menu during the FreeBSD installation process, as described in Chapter 2. It can also be easily installed from a package or the ports collection: To install the GNOME package from the network, simply type: &prompt.root; pkg_add -r gnome2 To build GNOME from source, use the ports tree: &prompt.root; cd /usr/ports/x11/gnome2 &prompt.root; make install clean Once GNOME is installed, the X server must be told to start GNOME instead of a default window manager. The easiest way to start GNOME is with GDM, the GNOME Display Manager. GDM, which is installed as a part of the GNOME desktop (but is disabled by default), can be enabled by adding gdm_enable="YES" to /etc/rc.conf. After a reboot, GNOME will start automatically when you log in; no further configuration is necessary. GNOME may also be started from the command line by properly configuring a file named .xinitrc. If a custom .xinitrc is already in place, simply replace the line that starts the current window manager with one that starts /usr/X11R6/bin/gnome-session instead. If nothing special has been done to the configuration file, then it is enough simply to type: &prompt.user; echo "/usr/X11R6/bin/gnome-session" > ~/.xinitrc Next, type startx, and the GNOME desktop environment will be started. If an older display manager, like XDM, is being used, this will not work. Instead, create an executable .xsession file with the same command in it. To do this, edit the file and replace the existing window manager command with /usr/X11R6/bin/gnome-session: - &prompt.user; echo "#!/bin/sh" > ~/.xsession -&prompt.user; echo "/usr/X11R6/bin/gnome-session" >> ~/.xsession + &prompt.user; echo "#!/bin/sh" > ~/.xsession +&prompt.user; echo "/usr/X11R6/bin/gnome-session" >> ~/.xsession &prompt.user; chmod +x ~/.xsession Yet another option is to configure the display manager to allow choosing the window manager at login time; the section on KDE explains how to do this for kdm, the display manager of KDE. Anti-aliased Fonts with GNOME GNOME anti-aliased fonts X11 supports anti-aliasing via its RENDER extension. GTK+ 2.0 and greater (the toolkit used by GNOME) can make use of this functionality. Configuring anti-aliasing is described earlier in this chapter. So, with up-to-date software, anti-aliasing is possible within the GNOME desktop. Just go to Applications Desktop Preferences Font, and select either Best shapes, Best contrast, or Subpixel smoothing (LCDs). For a GTK+ application that is not part of the GNOME desktop, set the environment variable GDK_USE_XFT to 1 before launching the program. KDE KDE About KDE KDE is an easy-to-use contemporary desktop environment.
Some of the things that KDE brings to the user are: A beautiful contemporary desktop A desktop exhibiting complete network transparency An integrated help system allowing for convenient, consistent access to help on the use of the KDE desktop and its applications Consistent look and feel of all KDE applications Standardized menus and toolbars, keybindings, color-schemes, etc. Internationalization: KDE is available in more than 40 languages Centralized, consistent, dialog-driven desktop configuration A great number of useful KDE applications KDE has an office application suite based on KDE's KParts technology, consisting of a spreadsheet, a presentation application, an organizer, a news client and more. KDE also comes with a web browser called Konqueror, which represents a solid competitor to other existing web browsers on &unix; systems. More information on KDE can be found on the KDE website. For FreeBSD-specific information and resources on KDE, consult the FreeBSD-KDE team's website. Installing KDE Just as with GNOME or any other desktop environment, the easiest way to install KDE is through the Desktop Configuration menu during the FreeBSD installation process, as described in Chapter 2. Once again, the software can be easily installed from a package or from the Ports Collection: To install the KDE package from the network, simply type: &prompt.root; pkg_add -r kde &man.pkg.add.1; will automatically fetch the latest version of the application. To build KDE from source, use the ports tree: &prompt.root; cd /usr/ports/x11/kde3 &prompt.root; make install clean After KDE has been installed, the X server must be told to launch this application instead of the default window manager. This is accomplished by editing the .xinitrc file: &prompt.user; echo "exec startkde" > ~/.xinitrc Now, whenever the X Window System is invoked with startx, KDE will be the desktop. If a display manager such as XDM is being used, the configuration is slightly different. Edit the .xsession file instead. Instructions for kdm are described later in this chapter. More Details on KDE Now that KDE is installed on the system, most things can be discovered through the help pages, or just by pointing and clicking at various menus. &windows; or &mac; users will feel quite at home. The best reference for KDE is the on-line documentation. KDE comes with its own web browser, Konqueror, dozens of useful applications, and extensive documentation. The remainder of this section discusses the technical items that are difficult to learn by random exploration. The KDE Display Manager KDE display manager An administrator of a multi-user system may wish to have a graphical login screen to welcome users. XDM can be used, as described earlier. However, KDE includes an alternative, kdm, which is designed to look more attractive and include more login-time options. In particular, users can easily choose (via a menu) which desktop environment (KDE, GNOME, or something else) to run after logging on. To begin with, run the KDE control panel, kcontrol, as root. It is generally considered unsafe to run the entire X environment as root. Instead, run the window manager as a normal user, open a terminal window (such as xterm or KDE's konsole), become root with su (the user must be in the wheel group in /etc/group for this), and then type kcontrol. Click on the icon on the left marked System, then on Login manager. On the right there are various configurable options, which the KDE manual will explain in greater detail.
Click on sessions on the right. Click New type to add various window managers and desktop environments. These are just labels, so they can say KDE and GNOME rather than startkde or gnome-session. Include a label failsafe. Play with the other menus as well; they are mainly cosmetic and self-explanatory. When you are done, click on Apply at the bottom, and quit the control center. To make sure kdm understands what the labels (KDE, GNOME, etc.) mean, edit the files used by XDM. In KDE 2.2 this has changed: kdm now uses its own configuration files. Please see the KDE 2.2 documentation for details. In a terminal window, as root, edit the file /usr/X11R6/lib/X11/xdm/Xsession. There is a section in the middle like this: case $# in 1) case $1 in failsafe) exec xterm -geometry 80x24-0-0 ;; esac esac A few lines need to be added to this section. Assuming the labels used were kde and GNOME, use the following: case $# in 1) case $1 in kde) exec /usr/local/bin/startkde ;; GNOME) exec /usr/X11R6/bin/gnome-session ;; failsafe) exec xterm -geometry 80x24-0-0 ;; esac esac For the KDE login-time desktop background to be honored, the following line needs to be added to /usr/X11R6/lib/X11/xdm/Xsetup_0: /usr/local/bin/krootimage Now, make sure kdm is listed in /etc/ttys to be started at the next bootup. To do this, simply follow the instructions from the previous section on XDM and replace references to the /usr/X11R6/bin/xdm program with /usr/local/bin/kdm. Anti-aliased Fonts KDE anti-aliased fonts X11 supports anti-aliasing via its RENDER extension, and starting with version 2.3, Qt (the toolkit used by KDE) supports this extension. Configuring this is described in the section on anti-aliasing X11 fonts. So, with up-to-date software, anti-aliasing is possible on a KDE desktop. Just go to the KDE menu, go to Preferences Look and Feel Fonts, and click on the check box Use Anti-Aliasing for Fonts and Icons. For a Qt application which is not part of KDE, the environment variable QT_XFT needs to be set to true before starting the program. XFce About XFce XFce is a desktop environment based on the GTK+ toolkit used by GNOME, but it is much more lightweight and meant for those who want a simple, efficient desktop which is nevertheless easy to use and configure. Visually, it looks very much like CDE, found on commercial &unix; systems. Some of XFce's features are: A simple, easy-to-handle desktop Fully configurable via mouse, with drag and drop, etc. Main panel similar to CDE, with menus, applets and application launchers Integrated window manager, file manager, sound manager, GNOME compliance module, and other things Themeable (since it uses GTK+) Fast, light and efficient: ideal for older/slower machines or machines with memory limitations More information on XFce can be found on the XFce website. Installing XFce A binary package for XFce exists (at the time of writing). To install, simply type: &prompt.root; pkg_add -r xfce4 Alternatively, to build from source, use the ports collection: &prompt.root; cd /usr/ports/x11-wm/xfce4 &prompt.root; make install clean Now, tell the X server to launch XFce the next time X is started. Simply type this: &prompt.user; echo "/usr/X11R6/bin/startxfce4" > ~/.xinitrc The next time X is started, XFce will be the desktop. As before, if a display manager like XDM is being used, create an .xsession, as described in the section on GNOME, but with the /usr/X11R6/bin/startxfce4 command; or, configure the display manager to allow choosing a desktop at login time, as explained in the section on kdm.
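For instance, following the same pattern shown earlier for GNOME, an executable .xsession for XFce can be created like this (a sketch assembled from the commands above):

&prompt.user; echo "#!/bin/sh" > ~/.xsession
&prompt.user; echo "/usr/X11R6/bin/startxfce4" >> ~/.xsession
&prompt.user; chmod +x ~/.xsession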
diff --git a/en_US.ISO8859-1/books/pmake/basics/chapter.sgml b/en_US.ISO8859-1/books/pmake/basics/chapter.sgml index 3bd63e2cb9..5e96c33a1b 100644 --- a/en_US.ISO8859-1/books/pmake/basics/chapter.sgml +++ b/en_US.ISO8859-1/books/pmake/basics/chapter.sgml @@ -1,1548 +1,1548 @@ The Basics of PMake PMake takes as input a file that tells which files depend on which other files to be complete and what to do about files that are out-of-date. This file is known as a makefile and is usually kept in the top-most directory of the system to be built. While you can call the makefile anything you want, PMake will look for Makefile and makefile (in that order) in the current directory if you do not tell it otherwise. To specify a different makefile, use the -f flag, e.g. &prompt.user; pmake -f program.mk A makefile has four different types of lines in it: File dependency specifications Creation commands Variable assignments Comments, include statements and conditional directives Any line may be continued over multiple lines by ending it with a backslash. The backslash, the following newline, and any initial whitespace on the following line are compressed into a single space before the input line is examined by PMake.
Dependency Lines As mentioned in the introduction, in any system, there are dependencies between the files that make up the system. For instance, in a program made up of several C source files and one header file, the C files will need to be re-compiled should the header file be changed. For a document of several chapters and one macro file, the chapters will need to be reprocessed if any of the macros changes. These are dependencies and are specified by means of dependency lines in the makefile. On a dependency line, there are targets and sources, separated by a one- or two-character operator. The targets depend on the sources and are usually created from them. Any number of targets and sources may be specified on a dependency line. All the targets in the line are made to depend on all the sources. Targets and sources need not be actual files, but every source must be either an actual file or another target in the makefile. If you run out of room, use a backslash at the end of the line to continue onto the next one. Any file may be a target and any file may be a source, but the relationship between the two (or however many) is determined by the operator that separates them. Three types of operators exist: one specifies that the datedness of a target is determined by the state of its sources, while another specifies other files (the sources) that need to be dealt with before the target can be re-created. The third operator is very similar to the first, with the additional condition that the target is out-of-date if it has no sources. These operations are represented by the colon, the exclamation point and the double-colon, respectively, and are mutually exclusive. Their exact semantics are as follows: : If a colon is used, a target on the line is considered to be out-of-date (and in need of creation) if any of the sources has been modified more recently than the target, or the target doesn't exist. Under this operation, steps will be taken to re-create the target only if it is found to be out-of-date by using these two rules. ! If an exclamation point is used, the target will always be re-created, but this will not happen until all of its sources have been examined and re-created, if necessary. :: If a double-colon is used, a target is out-of-date if any of the sources has been modified more recently than the target, or the target doesn't exist, or the target has no sources. If the target is out-of-date according to these rules, it will be re-created. This operator also does something else to the targets, but I will go into that in the next section (see ). Enough words, now for an example. Take that C program I mentioned earlier. Say there are three C files (a.c, b.c and c.c) each of which includes the file defs.h. The dependencies between the files could then be expressed as follows: program : a.o b.o c.o a.o b.o c.o : defs.h a.o : a.c b.o : b.c c.o : c.c You may be wondering at this point, where a.o, b.o and c.o came in and why they depend on defs.h and the C files do not. The reason is quite simple: program cannot be made by linking together .c files—it must be made from .o files. Likewise, if you change defs.h, it is not the .c files that need to be re-created, it is the .o files. If you think of dependencies in these terms—which files (targets) need to be created from which files (sources)—you should have no problems. An important thing to notice about the above example, is that all the .o files appear as targets on more than one line. 
This is perfectly all right: the target is made to depend on all the sources mentioned on all the dependency lines. For example, a.o depends on both defs.h and a.c. The order of the dependency lines in the makefile is important: the first target on the first dependency line in the makefile will be the one that gets made if you do not say otherwise. That is why program comes first in the example makefile, above. Both targets and sources may contain the standard C-Shell wildcard characters ({, }, *, ?, [, and ]), but the non-curly-brace ones may only appear in the final component (the file portion) of the target or source. The characters mean the following things: {} These enclose a comma-separated list of options and cause the pattern to be expanded once for each element of the list. Each expansion contains a different element. For example, src/{whiffle,beep,fish}.c expands to the three words src/whiffle.c, src/beep.c, and src/fish.c. These braces may be nested and, unlike the other wildcard characters, the resulting words need not be actual files. All other wildcard characters are expanded using the files that exist when PMake is started. * This matches zero or more characters of any sort. src/*.c will expand to the same three words as above as long as src contains those three files (and no other files that end in .c). ? Matches any single character. [] This is known as a character class and contains either a list of single characters, or a series of character ranges (a-z, for example, means all characters between a and z), or both. It matches any single character contained in the list. For example, [A-Za-z] will match all letters, while [0123456789] will match all numbers.
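To make the three operators more concrete, here is a small illustrative fragment (all target and file names are invented for the example):

# ':' -- prog is re-created only if an object file is newer than it
prog : a.o b.o
# '!' -- tags is always re-created, once its sources are up-to-date
tags ! a.c b.c
# '::' -- independent dependency lines; a target with no sources
#         on a '::' line is also considered out-of-date
backup :: a.c
backup :: b.c

Each line obeys exactly the rules given above; in particular, the two double-colon lines for backup are treated independently of each other.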
Shell Commands Is not that nice, you say to yourself, but how are files actually ``re-created'', as he likes to spell it? The re-creation is accomplished by commands you place in the makefile. These commands are passed to the Bourne shell (better known as /bin/sh) to be executed and are expected to do what is necessary to update the target file (PMake does not actually check to see if the target was created. It just assumes it is there). Shell commands in a makefile look a lot like shell commands you would type at a terminal, with one important exception: each command in a makefile must be preceded by at least one tab. Each target has associated with it a shell script made up of one or more of these shell commands. The creation script for a target should immediately follow the dependency line for that target. While any given target may appear on more than one dependency line, only one of these dependency lines may be followed by a creation script, unless the :: operator was used on the dependency line. If the double-colon was used, each dependency line for the target may be followed by a shell script. That script will only be executed if the target on the associated dependency line is out-of-date with respect to the sources on that line, according to the rules I gave earlier. I'll give you a good example of this later on. To expand on the earlier makefile, you might add commands as follows: program : a.o b.o c.o cc a.o b.o c.o -o program a.o b.o c.o : defs.h a.o : a.c cc -c a.c b.o : b.c cc -c b.c c.o : c.c cc -c c.c Something you should remember when writing a makefile is, the commands will be executed if the target on the dependency line is out-of-date, not the sources. In this example, the command cc -c a.c will be executed if a.o is out-of-date. Because of the : operator, this means that should a.c or defs.h have been modified more recently than a.o, the command will be executed (a.o will be considered out-of-date). Remember how I said the only difference between a makefile shell command and a regular shell command was the leading tab? I lied. There is another way in which makefile commands differ from regular ones. The first two characters after the initial whitespace are treated specially. If they are any combination of @ and -, they cause PMake to do different things. In most cases, shell commands are printed before they are actually executed. This is to keep you informed of what is going on. If an @ appears, however, this echoing is suppressed. In the case of an echo command, say echo Linking index it would be rather silly to see echo Linking index Linking index so PMake allows you to place an @ before the command to prevent the command from being printed: @echo Linking index The other special character is the -. In case you did not know, shell commands finish with a certain exit status. This status is made available by the operating system to whatever program invoked the command. Normally this status will be 0 if everything went ok and non-zero if something went wrong. For this reason, PMake will consider an error to have occurred if one of the shells it invokes returns a non-zero status. When it detects an error, PMake's usual action is to abort whatever it's doing and exit with a non-zero status itself (any other targets that were being created will continue being made, but nothing new will be started. PMake will exit after the last job finishes). This behavior can be altered, however, by placing a - at the front of a command (e.g. 
-mv index index.old), certain command-line arguments, or doing other things, to be detailed later. In such a case, the non-zero status is simply ignored and PMake keeps chugging along. Because all the commands are given to a single shell to execute, such things as setting shell variables, changing directories, etc., last beyond the command in which they are found. This also allows shell compound commands (like for loops) to be entered in a natural manner. Since this could cause problems for some makefiles that depend on each command being executed by a single shell, PMake has a -B flag (it stands for backwards-compatible) that forces each command to be given to a separate shell. It also does several other things, all of which I discourage since they are now old-fashioned. A target's shell script is fed to the shell on its (the shell's) input stream. This means that any commands, such as ci, that need to get input from the terminal will not work right – they will get the shell's input, something they probably will not find to their liking. A simple way around this is to give a command like this: ci $(SRCS) < /dev/tty This would force the program's input to come from the terminal. If you cannot do this for some reason, your only other alternative is to use PMake in its fullest compatibility mode. See the section on compatibility later in this book.
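To close out this section, here is a small sketch showing both special characters in one script (a hypothetical clean target):

clean :
	@echo Removing generated files
	-rm program *.o

The @ keeps the echo command itself from being printed (you would otherwise see it twice), and the - lets PMake keep going even though rm returns a non-zero status when there is nothing to remove.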
Variables PMake, like Make before it, has the ability to save text in variables to be recalled later at your convenience. Variables in PMake are used much like variables in the shell and, by tradition, consist of all upper-case letters (you do not have to use all upper-case letters. In fact there is nothing to stop you from calling a - variable @^&$%$. Just tradition). Variables + variable @^&$%$. Just tradition). Variables are assigned-to using lines of the form: VARIABLE = value appended-to by: VARIABLE += value conditionally assigned-to (if the variable is not already defined) by: VARIABLE ?= value and assigned-to with expansion (i.e. the value is expanded (see below) before being assigned to the variable—useful for placing a value at the beginning of a variable, or other things) by: VARIABLE := value Any whitespace before value is stripped off. When appending, a space is placed between the old value and the stuff being appended. The final way a variable may be assigned to is using: VARIABLE != shell-command In this case, shell-command has all its variables expanded (see below) and is passed off to a shell to execute. The output of the shell is then placed in the variable. Any newlines (other than the final one) are replaced by spaces before the assignment is made. This is typically used to find the current directory via a line like: CWD != pwd This is intended to be used to execute commands that produce small amounts of output (e.g. pwd). The implementation is less than intelligent and will likely freeze if you execute something that produces thousands of bytes of output (8 Kb is the limit on many &unix; systems). The value of a variable may be retrieved by enclosing the variable name in parentheses or curly braces and preceding the whole thing with a dollar sign. For example, to set the variable CFLAGS to the string -I/sprite/src/lib/libc -O, you would place a line: CFLAGS = -I/sprite/src/lib/libc -O in the makefile and use the word $(CFLAGS) wherever you would like the string -I/sprite/src/lib/libc -O to appear. This is called variable expansion. Unlike Make, PMake will not expand a variable unless it knows the variable exists. E.g. if you have a ${i} in a shell command and you have not assigned a value to the variable i (the empty string is considered a value, by the way), where Make would have substituted the empty string, PMake will leave the ${i} alone. To keep PMake from substituting for a variable it knows, precede the dollar sign with another dollar sign (e.g. to pass ${HOME} to the shell, use $${HOME}). This causes PMake, in effect, to expand the $ macro, which expands to a single $. For compatibility, Make's style of variable expansion will be used if you invoke PMake with any of the compatibility flags (, or . The flag alters just the variable expansion). There are two different times at which variable expansion occurs: when parsing a dependency line, the expansion occurs immediately upon reading the line. If any variable used on a dependency line is undefined, PMake will print a message and exit. Variables in shell commands are expanded when the command is executed. Variables used inside another variable are expanded whenever the outer variable is expanded (the expansion of an inner variable has no effect on the outer variable. 
For example, if the outer variable is used on a dependency line and in a shell command, and the inner variable changes value between when the dependency line is read and the shell command is executed, two different values will be substituted for the outer variable). Variables come in four flavors, though they are all expanded the same and all look about the same. They are (in order of expanding scope): Local variables. Command-line variables. Global variables. Environment variables. The classification of variables does not matter much, except that the classes are searched from the top (local) to the bottom (environment) when looking up a variable. The first one found wins.
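Before looking at each flavor in detail, here is a compact sketch of the five assignment forms described above (names and values are invented):

CFLAGS = -O              # plain assignment
CFLAGS += -g             # append: CFLAGS is now "-O -g"
DESTDIR ?= /usr/tmp      # assigned only if DESTDIR is not already defined
SAVED := $(CFLAGS)       # right-hand side expanded before the assignment
CWD != pwd               # output of a shell command; inner newlines become spaces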
Local Variables Each target can have as many as seven local variables. These are variables that are only visible within that target's shell script and contain such things as the target's name, all of its sources (from all its dependency lines), those sources that were out-of-date, etc. Four local variables are defined for all targets. They are: .TARGET The name of the target. .OODATE The list of the sources for the target that were considered out-of-date. The order in the list is not guaranteed to be the same as the order in which the dependencies were given. .ALLSRC The list of all sources for this target in the order in which they were given. .PREFIX The target without its suffix and without any leading path. E.g. for the target ../../lib/compat/fsRead.c, this variable would contain fsRead. Three other local variables are set only for certain targets under special circumstances. These are the .IMPSRC, .ARCHIVE, and .MEMBER variables. When they are set and how they are used is described later. Four of these variables may be used in sources as well as in shell scripts. These are .TARGET, .PREFIX, .ARCHIVE and .MEMBER. The variables in the sources are expanded once for each target on the dependency line, providing what is known as a dynamic source, allowing you to specify several dependency lines at once. For example: $(OBJS) : $(.PREFIX).c will create a dependency between each object file and its corresponding C source file.
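Spelled out, if OBJS contains a.o b.o c.o, that one dynamic-source line behaves as if you had written:

a.o : a.c
b.o : b.c
c.o : c.c

with $(.PREFIX) expanded separately for each target on the line.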
Command-line Variables Command-line variables are set when PMake is first invoked by giving a variable assignment as one of the arguments. For example: pmake "CFLAGS = -I/sprite/src/lib/libc -O" would make CFLAGS be a command-line variable with the given value. Any assignments to CFLAGS in the makefile will have no effect, because once it is set, there is (almost) nothing you can do to change a command-line variable (the search order, you see). Command-line variables may be set using any of the four assignment operators, though only = and ?= behave as you would expect them to, mostly because assignments to command-line variables are performed before the makefile is read, thus the values set in the makefile are unavailable at the time. += is the same as =, because the old value of the variable is sought only in the scope in which the assignment is taking place (for reasons of efficiency that I won't get into here). := and ?= will work if the only variables used are in the environment. != is sort of pointless to use from the command line, since the same effect can no doubt be accomplished using the shell's own command substitution mechanisms (backquotes and all that).
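For example, given this two-line makefile (invented):

CFLAGS = -g
show :
	@echo $(CFLAGS)

running pmake "CFLAGS = -O" show prints -O, not -g: the global assignment in the makefile is never consulted, because the command-line variable is found first in the search order described above.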
Global Variables Global variables are those set or appended-to in the makefile. There are two classes of global variables: those you set and those PMake sets. As I said before, the ones you set can have any name you want them to have, except they may not contain a colon or an exclamation point. The variables PMake sets (almost) always begin with a period and always contain upper-case letters, only. The variables are as follows: .PMAKE The name by which PMake was invoked is stored in this variable. For compatibility, the name is also stored in the MAKE variable. .MAKEFLAGS All the relevant flags with which PMake was invoked. This does not include such things as or variable assignments. Again for compatibility, this value is stored in the MFLAGS variable as well. Two other variables, .INCLUDES and .LIBS, are covered in the section on special targets in . Global variables may be deleted using lines of the form: #undef variable The # must be the first character on the line. Note that this may only be done on global variables.
Environment Variables Environment variables are passed by the shell that invoked PMake and are given by PMake to each shell it invokes. They are expanded like any other variable, but they cannot be altered in any way. One special environment variable, PMAKE, is examined by PMake for command-line flags, variable assignments, etc., it should always use. This variable is examined before the actual arguments to PMake are. In addition, all flags given to PMake, either through the PMAKE variable or on the command line, are placed in this environment variable and exported to each shell PMake executes. Thus recursive invocations of PMake automatically receive the same flags as the top-most one. Using all these variables, you can compress the sample makefile even more: OBJS = a.o b.o c.o program : $(OBJS) cc $(.ALLSRC) -o $(.TARGET) $(OBJS) : defs.h a.o : a.c cc -c a.c b.o : b.c cc -c b.c c.o : c.c cc -c c.c
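Returning to the PMAKE environment variable: as an illustration, a csh user could arrange for every invocation of PMake, including recursive ones, to keep going past errors like this (a sketch, assuming the conventional -k keep-going flag described in the section on invoking PMake):

&prompt.user; setenv PMAKE "-k"
&prompt.user; pmake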
Comments Comments in a makefile start with a # character and extend to the end of the line. They may appear anywhere you want them, except in a shell command (though the shell will treat it as a comment, too). If, for some reason, you need to use the # in a variable or on a dependency line, put a backslash in front of it. PMake will compress the two into a single # (this is not true if PMake is operating in full-compatibility mode).
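For example (an invented variable):

VERSION = \#42

PMake compresses the backslash and the # into a single #, so VERSION gets the value #42; without the backslash, everything from the # on would be taken as a comment and VERSION would end up empty.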
Parallelism PMake was specifically designed to re-create several targets at once, when possible. You do not have to do anything special to cause this to happen (unless PMake was configured to not act in parallel, in which case you will have to make use of the -J and -L flags (see below)), but you do have to be careful at times. There are several problems you are likely to encounter. One is that some makefiles (and programs) are written in such a way that it is impossible for two targets to be made at once. The program xstr, for example, always modifies the files strings and x.c. There is no way to change it. Thus you cannot run two of them at once without something being trashed. Similarly, if you have commands in the makefile that always send output to the same file, you will not be able to make more than one target at once unless you change the file you use. You can, for instance, add a $$$$ to the end of the file name to tack on the process ID of the shell executing the command (each $$ expands to a single $, thus giving you the shell variable $$). Since only one shell is used for all the commands, you will get the same file name for each command in the script. The other problem comes from improperly-specified dependencies that worked in Make because of its sequential, depth-first way of examining them. While I do not want to go into depth on how PMake works (that is covered later, if you are interested), I will warn you that files in two different levels of the dependency tree may be examined in a different order in PMake than they were in Make. For example, given the makefile: a : b c b : d PMake will examine the targets in the order c, d, b, a. If the makefile's author expected PMake to abort before making c if an error occurred while making b, or if b needed to exist before c was made, (s)he will be sorely disappointed. The dependencies are incomplete, since in both these cases, c would depend on b. So watch out. Another problem you may face is that, while PMake is set up to handle the output from multiple jobs in a graceful fashion, the same is not so for input. It has no way to regulate input to different jobs, so if you use the redirection from /dev/tty I mentioned earlier, you must be careful not to run two of the jobs at once.
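To illustrate the process-ID trick mentioned above, a script that would otherwise collide with itself when two targets are made at once might read (an invented fragment):

a.out b.out :
	./generate > /tmp/gen.$$$$
	mv /tmp/gen.$$$$ $(.TARGET)

Each $$ reaches the shell as a single $, so the shell expands $$ to its own process ID; since one shell runs the whole script, both commands see the same temporary file name, while two jobs running at once see different ones.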
Writing and Debugging a Makefile Now that you know most of what is in a makefile, what do you do next? There are two choices: use one of the uncommonly-available makefile generators or write your own makefile (I leave out the third choice of ignoring PMake and doing everything by hand as being beyond the bounds of common sense). When faced with the writing of a makefile, it is usually best to start from first principles: just what are you trying to do? What do you want the makefile finally to produce? To begin with a somewhat traditional example, let's say you need to write a makefile to create a program, expr, that takes standard infix expressions and converts them to prefix form (for no readily apparent reason). You have got three source files, in C, that make up the program: main.c, parse.c, and output.c. Harking back to my pithy advice about dependency lines, you write the first line of the file: expr : main.o parse.o output.o because you remember expr is made from .o files, not .c files. Similarly for the .o files you produce the lines: main.o : main.c parse.o : parse.c output.o : output.c main.o parse.o output.o : defs.h Great. You have now got the dependencies specified. What you need now is commands. These commands, remember, must produce the target on the dependency line, usually by using the sources you have listed. You remember about local variables? Good, so it should come to you as no surprise when you write: expr : main.o parse.o output.o cc -o $(.TARGET) $(.ALLSRC) Why use the variables? If your program grows to produce postfix expressions too (which, of course, requires a name change or two), there is one fewer place in the file you have to change. You cannot do this for the object files, however, because they depend on their corresponding source files and defs.h, thus if you said: cc -c $(.ALLSRC) you would get (for main.o): cc -c main.c defs.h which is wrong. So you round out the makefile with these lines: main.o : main.c cc -c main.c parse.o : parse.c cc -c parse.c output.o : output.c cc -c output.c The makefile is now complete and will, in fact, create the program you want it to without unnecessary compilations or excessive typing on your part. There are two things wrong with it, however (aside from it being altogether too long, something I will address later): The string main.o parse.o output.o is repeated twice, necessitating two changes when you add postfix (you were planning on that, were you not?). This is in direct violation of de Boor's First Rule of writing makefiles: Anything that needs to be written more than once should be placed in a variable. I cannot emphasize this enough as being very important to the maintenance of a makefile and its program. There is no way to alter the way compilations are performed short of editing the makefile and making the change in all places. This is evil and violates de Boor's Second Rule, which follows directly from the first: Any flags or programs used inside a makefile should be placed in a variable so they may be changed, temporarily or permanently, with the greatest ease.
The makefile should more properly read: OBJS = main.o parse.o output.o expr : $(OBJS) $(CC) $(CFLAGS) -o $(.TARGET) $(.ALLSRC) main.o : main.c $(CC) $(CFLAGS) -c main.c parse.o : parse.c $(CC) $(CFLAGS) -c parse.c output.o : output.c $(CC) $(CFLAGS) -c output.c $(OBJS) : defs.h Alternatively, if you like the idea of dynamic sources mentioned earlier, you could write it like this: OBJS = main.o parse.o output.o expr : $(OBJS) $(CC) $(CFLAGS) -o $(.TARGET) $(.ALLSRC) $(OBJS) : $(.PREFIX).c defs.h $(CC) $(CFLAGS) -c $(.PREFIX).c These two rules and examples lead to de Boor's First Corollary: Variables are your friends. Once you have written the makefile comes the sometimes-difficult task of making sure the darn thing works. Your most helpful tool to make sure the makefile is at least syntactically correct is the -n flag, which allows you to see if PMake will choke on the makefile. The second thing the -n flag lets you do is see what PMake would do without it actually doing it, thus you can make sure the right commands would be executed were you to give PMake its head. When you find your makefile is not behaving as you hoped, the first question that comes to mind (after What time is it, anyway?) is Why not? In answering this, two flags will serve you well: -d m and -p 2. The first causes PMake to tell you as it examines each target in the makefile and indicate why it is deciding whatever it is deciding. You can then use the information printed for other targets to see where you went wrong. The -p 2 flag makes PMake print out its internal state when it is done, allowing you to see that you forgot to make that one chapter depend on that file of macros you just got a new version of. The output from -p 2 is intended to resemble closely a real makefile, but with additional information provided and with variables expanded in those commands PMake actually printed or executed. Something to be especially careful about is circular dependencies. For example: a : b b : c d d : a In this case, because of how PMake works, c is the only thing PMake will examine, because d and a will effectively fall off the edge of the universe, making it impossible to examine b (or them, for that matter). PMake will tell you (if run in its normal mode) all the targets involved in any cycle it looked at (i.e. if you have two cycles in the graph (naughty, naughty), but only try to make a target in one of them, PMake will only tell you about that one. You will have to try to make the other to find the second cycle). When run as Make, it will only print the first target in the cycle.
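For example, a quick sanity check of the expr makefile with -n might look like this (hypothetical output; with parallelism the order of the commands may differ):

&prompt.user; pmake -n
cc -c main.c
cc -c parse.c
cc -c output.c
cc -o expr main.o parse.o output.o

No files are touched; PMake simply prints what it would have run.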
Invoking PMake PMake comes with a wide variety of flags to choose from. They may appear in any order, interspersed with command-line variable assignments and targets to create. The flags are as follows: This causes PMake to spew out debugging information that may prove useful to you. If you cannot figure out why PMake is doing what it is doing, you might try using this flag. The what parameter is a string of single characters that tell PMake what aspects you are interested in. Most of what I describe will make little sense to you, unless you have dealt with Make before. Just remember where this table is and come back to it as you read on. The characters and the information they produce are as follows: a Archive searching and caching. c Conditional evaluation. d The searching and caching of directories. j Various snippets of information related to the running of the multiple shells. Not particularly interesting. m The making of each target: what target is being examined; when it was last modified; whether it is out-of-date; etc. p Makefile parsing. r Remote execution. s The application of suffix-transformation rules. (See .) t The maintenance of the list of targets. v Variable assignment. Of these all, the m and s letters will be most useful to you. If the is the final argument or the argument from which it would get these key letters (see below for a note about which argument would be used) begins with a –, all of these debugging flags will be set, resulting in massive amounts of output. makefile Specify a makefile to read different from the standard makefiles (Makefile or makefile). If makefile is -, PMake uses the standard input. This is useful for making quick and dirty makefiles. Prints out a summary of the various flags PMake accepts. It can also be used to find out what level of concurrency was compiled into the version of PMake you are using (look at -J and -L) and various other information on how PMake was configured. If you give this flag, PMake will ignore non-zero status returned by any of its shells. It is like placing a - before all the commands in the makefile. This is similar to in that it allows PMake to continue when it sees an error, but unlike , where PMake continues blithely as if nothing went wrong, causes it to recognize the error and only continue work on those things that do not depend on the target, either directly or indirectly (through depending on something that depends on it), whose creation returned the error. The is for keep going. PMake has the ability to lock a directory against other people executing it in the same directory (by means of a file called LOCK.make that it creates and checks for in the directory). This is a Good Thing because two people doing the same thing in the same place can be disastrous for the final product (too many cooks and all that). Whether this locking is the default is up to your system administrator. If locking is on, will turn it off, and vice versa. Note that this locking will not prevent you from invoking PMake twice in the same place–if you own the lock file, PMake will warn you about it but continue to execute. Tells PMake another place to search for included makefiles via the <filename> style. Several -m options can be given to form a search path. If this construct is used the default system makefile search path is completely overridden. This flag tells PMake not to execute the commands needed to update the out-of-date targets in the makefile. Rather, PMake will simply print the commands it would have executed and exit. 
This is particularly useful for checking the correctness of a makefile. If PMake does not do what you expect it to, there is a good chance the makefile is wrong. This causes PMake to print its input in a reasonable form, though not necessarily one that would make immediate sense to anyone but me. The number is a bitwise OR of 1 and 2, where 1 means it should print the input before doing any processing and 2 says it should print it after everything has been re-created. Thus 3 would print it twice: once before processing and once after (you might find the difference between the two interesting). This is mostly useful to me, but you may find it informative in some bizarre circumstances. If you give PMake this flag, it will not try to re-create anything. It will just see if anything is out-of-date and exit non-zero if so. When PMake starts up, it reads a default makefile that tells it what sort of system it is on and gives it some idea of what to do if you do not tell it anything. I will tell you about it later. If you give this flag, PMake will not read the default makefile. This causes PMake to not print commands before they are executed. It is the equivalent of putting an @ before every command in the makefile. Rather than try to re-create a target, PMake will simply touch it so as to make it appear up-to-date. If the target did not exist before, it will when PMake finishes, but if the target did exist, it will appear to have been updated. Targets can still be created in parallel, however. This is the mode PMake will enter if it is invoked either as smake or vmake. This tells PMake it is ok to export jobs to other machines, if they are available. It is used when running in Make mode, as exporting in this mode tends to make things run slower than if the commands were just executed locally. Forces PMake to be as backwards-compatible with Make as possible while still being itself. This includes: Executing one shell per shell command Expanding anything that looks even vaguely like a variable, with the empty string replacing any variable PMake does not know. Refusing to allow you to escape a # with a backslash. Permitting undefined variables on dependency lines and conditionals (see below). Normally this causes PMake to abort. This nullifies any and all compatibility mode flags you may have given or implied up to the time this flag is encountered. It is useful mostly in a makefile that you wrote for PMake to avoid bad things happening when someone runs PMake as make or has things set in the environment that tell it to be compatible. This flag is not placed in the PMAKE environment variable or the .MAKEFLAGS or MFLAGS global variables. Allows you to define a variable to have 1 as its value. The variable is a global variable, not a command-line variable. This is useful mostly for people who are used to the C compiler arguments and those using conditionals, which I will get into later. Tells PMake another place to search for included makefiles. Yet another thing to be explained later. Gives the absolute maximum number of targets to create at once on both local and remote machines. This specifies the maximum number of targets to create on the local machine at once. This may be 0, though you should be wary of doing this, as PMake may hang until a remote machine becomes available, if one is not available when it is started. This is the flag that provides absolute, complete, full compatibility with Make. It still allows you to use all but a few of the features of PMake, but it is non-parallel.
This is the mode PMake enters if you call it make. When creating targets in parallel, several shells are executing at once, each wanting to write its own two cents'-worth to the screen. This output must be captured by PMake in some way in order to prevent the screen from being filled with garbage even more indecipherable than you usually see. PMake has two ways of doing this, one of which provides for much cleaner output and a clear separation between the output of different jobs, the other of which provides a more immediate response so one can tell what is really happening. The former is done by notifying you when the creation of a target starts, capturing the output and transferring it to the screen all at once when the job finishes. The latter is done by catching the output of the shell (and its children) and buffering it until an entire line is received, then printing that line preceded by an indication of which job produced the output. Since I prefer this second method, it is the one used by default. The first method will be used if you give the flag to PMake. As mentioned before, the flag tells PMake to use Make's style of expanding variables, substituting the empty string for any variable it does not know. There are several times when PMake will print a message at you that is only a warning, i.e. it can continue to work in spite of your having done something silly (such as forgotten a leading tab for a shell command). Sometimes you are well aware of silly things you have done and would like PMake to stop bothering you. This flag tells it to shut up about anything non-fatal. This flag causes PMake to not attempt to export any jobs to another machine. Several flags may follow a single -. Those flags that require arguments take them from successive parameters. For example: pmake -fDnI server.mk DEBUG /chip2/X/server/include will cause PMake to read server.mk as the input makefile, define the variable DEBUG as a global variable and look for included makefiles in the directory /chip2/X/server/include.
Summary A makefile is made of four types of lines: Dependency lines Creation commands Variable assignments Comments, include statements and conditional directives A dependency line is a list of one or more targets, an operator (:, ::, or !), and a list of zero or more sources. Sources may contain wildcards and certain local variables. A creation command is a regular shell command preceded by a tab. In addition, if the first two characters after the tab (and other whitespace) are a combination of @ or -, PMake will cause the command to not be printed (if the character is @) or errors from it to be ignored (if -). A blank line, dependency line or variable assignment terminates a creation script. There may be only one creation script for each target with a : or ! operator. Variables are places to store text. They may be unconditionally assigned-to using the = operator, appended-to using the += operator, conditionally (if the variable is undefined) assigned-to with the ?= operator, and assigned-to with variable expansion with the := operator. The output of a shell command may be assigned to a variable using the != operator. Variables may be expanded (their value inserted) by enclosing their name in parentheses or curly braces, preceded by a dollar sign. A dollar sign may be escaped with another dollar sign. Variables are not expanded if PMake does not know about them. There are seven local variables: .TARGET, .ALLSRC, .OODATE, .PREFIX, .IMPSRC, .ARCHIVE, and .MEMBER. Four of them (.TARGET, .PREFIX, .ARCHIVE, and .MEMBER) may be used to specify dynamic sources. Variables are good. Know them. Love them. Live them. Debugging of makefiles is best accomplished using the -n, -d m, and -p 2 flags.
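All four line types can be seen together in one tiny makefile (names invented):

# build the example program
OBJS = main.o
prog : $(OBJS)
	cc -o $(.TARGET) $(.ALLSRC)

The first line is a comment, the second a variable assignment, the third a dependency line, and the last (tab-indented) a creation command.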
diff --git a/en_US.ISO8859-1/books/pmake/gods/chapter.sgml b/en_US.ISO8859-1/books/pmake/gods/chapter.sgml index e2541e31fd..f5963d47e4 100644 --- a/en_US.ISO8859-1/books/pmake/gods/chapter.sgml +++ b/en_US.ISO8859-1/books/pmake/gods/chapter.sgml @@ -1,953 +1,953 @@ PMake for Gods This chapter is devoted to those facilities in PMake that allow you to do a great deal in a makefile with very little work, as well as do some things you could not do in Make without a great deal of work (and perhaps the use of other programs). The problem with these features, is they must be handled with care, or you will end up with a mess. Once more, I assume a greater familiarity with &unix; or Sprite than I did in the previous two chapters.
Search Paths PMake supports the dispersal of files into multiple directories by allowing you to specify places to look for sources with .PATH targets in the makefile. The directories you give as sources for these targets make up a search path. Only those files used exclusively as sources are actually sought on a search path, the assumption being that anything listed as a target in the makefile can be created by the makefile and thus should be in the current directory. There are two types of search paths in PMake: one is used for all types of files (including included makefiles) and is specified with a plain .PATH target (e.g. .PATH : RCS), while the other is specific to a certain type of file, as indicated by the file's suffix. A specific search path is indicated by immediately following the .PATH with the suffix of the file. For instance: .PATH.h : /sprite/lib/include /sprite/att/lib/include would tell PMake to look in the directories /sprite/lib/include and /sprite/att/lib/include for any files whose suffix is .h. The current directory is always consulted first to see if a file exists. Only if it cannot be found there are the directories in the specific search path, followed by those in the general search path, consulted. A search path is also used when expanding wildcard characters. If the pattern has a recognizable suffix on it, the path for that suffix will be used for the expansion. Otherwise the default search path is employed. When a file is found in some directory other than the current one, all local variables that would have contained the target's name (.ALLSRC, and .IMPSRC) will instead contain the path to the file, as found by PMake. Thus if you have a file ../lib/mumble.c and a makefile like this: .PATH.c : ../lib mumble : mumble.c $(CC) -o $(.TARGET) $(.ALLSRC) the command executed to create mumble would be cc -o mumble ../lib/mumble.c. (as an aside, the command in this case is not strictly necessary, since it will be found using transformation rules if it is not given. This is because .out is the null suffix by default and a transformation exists from .c to .out. Just thought I would throw that in). If a file exists in two directories on the same search path, the file in the first directory on the path will be the one PMake uses. So if you have a large system spread over many directories, it would behoove you to follow a naming convention that avoids such conflicts. Something you should know about the way search paths are implemented is that each directory is read, and its contents cached, exactly once – when it is first encountered – so any changes to the directories while PMake is running will not be noted when searching for implicit sources, nor will they be found when PMake attempts to discover when the file was last modified, unless the file was created in the current directory. While people have suggested that PMake should read the directories each time, my experience suggests that the caching seldom causes problems. In addition, not caching the directories slows things down enormously because of PMake's attempts to apply transformation rules through non-existent files – the number of extra file-system searches is truly staggering, especially if many files without suffixes are used and the null suffix is not changed from .out.
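For instance, a makefile whose sources are checked into RCS and whose headers live in shared include directories might say (using the directories from the examples above):

.PATH : RCS
.PATH.h : /sprite/lib/include /sprite/att/lib/include

A .h file is then sought first in the current directory, then on the specific .h path, and finally on the general path; any other file is sought in the current directory and then in RCS.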
Archives and Libraries &unix; and Sprite allow you to merge files into an archive using the ar command. Further, if the files are relocatable object files, you can run ranlib on the archive and get yourself a library that you can link into any program you want. The main problem with archives is they double the space you need to store the archived files, since there is one copy in the archive and one copy out by itself. The problem with libraries is you usually think of them as -lm rather than /usr/lib/libm.a and the linker thinks they are out-of-date if you so much as look at them. PMake solves the problem with archives by allowing you to tell it to examine the files in the archives (so you can remove the individual files without having to regenerate them later). To handle the problem with libraries, PMake adds an additional way of deciding if a library is out-of-date: if the table of contents is older than the library, or is missing, the library is out-of-date. A library is any target that looks like -lname or that ends in a suffix that was marked as a library using the .LIBS target. .a is so marked in the system makefile. Members of an archive are specified as archive(member[member...]). Thus libdix.a(window.o) specifies the file window.o in the archive libdix.a. You may also use wildcards to specify the members of the archive. Just remember that most of the wildcard characters will only find existing files. A file that is a member of an archive is treated specially. If the file does not exist, but it is in the archive, the modification time recorded in the archive is used for the file when determining if the file is out-of-date. When figuring out how to make an archived member target (not the file itself, but the file in the archive – the archive(member) target), special care is taken with the transformation rules, as follows: archive(member) is made to depend on member. The transformation from the member's suffix to the archive's suffix is applied to the archive(member) target. The archive(member)'s .TARGET variable is set to the name of the member if member is actually a target, or the path to the member file if member is only a source. The .ARCHIVE variable for the archive(member) target is set to the name of the archive. The .MEMBER variable is set to the actual string inside the parentheses. In most cases, this will be the same as the .TARGET variable. The archive(member)'s place in the local variables of the targets that depend on it is taken by the value of its .TARGET variable. Thus, a program library could be created with the following makefile: .o.a : ... rm -f $(.TARGET:T) OBJS = obj1.o obj2.o obj3.o libprog.a : libprog.a($(OBJS)) ar cru $(.TARGET) $(.OODATE) ranlib $(.TARGET) This will cause the three object files to be compiled (if the corresponding source files were modified after the object file or, if that does not exist, the archived object file), the out-of-date ones archived in libprog.a, a table of contents placed in the archive and the newly-archived object files to be removed. All this is used in the makelib.mk system makefile to create a single library with ease. This makefile looks like this: # # Rules for making libraries. The object files that make up the library # are removed once they are archived. # # To make several libraries in parallel, you should define the variable # "many_libraries". This will serialize the invocations of ranlib.
# # To use, do something like this: # # OBJECTS = <files in the library> # # fish.a: fish.a($(OBJECTS)) MAKELIB # # #ifndef _MAKELIB_MK _MAKELIB_MK = #include <po.mk> .po.a .o.a : ... rm -f $(.MEMBER) ARFLAGS ?= crl # # Re-archive the out-of-date members and recreate the library's table of # contents using ranlib. If many_libraries is defined, put the ranlib # off til the end so many libraries can be made at once. # MAKELIB : .USE .PRECIOUS ar $(ARFLAGS) $(.TARGET) $(.OODATE) #ifndef no_ranlib # ifdef many_libraries ... # endif many_libraries ranlib $(.TARGET) #endif no_ranlib #endif _MAKELIB_MK
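Following the recipe in the header comment, a makefile using this rule might look like the sketch below (the library and object names are hypothetical). PMake compiles whichever objects are out-of-date, MAKELIB re-archives them and runs ranlib, and the .o.a transformation removes the archived object files afterwards.

OBJECTS = fish1.o fish2.o fish3.o

libfish.a : libfish.a($(OBJECTS)) MAKELIB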
On the Condition... Like the C compiler before it, PMake allows you to configure the makefile, based on the current environment, using conditional statements. A conditional looks like this: #if boolean expression lines #elif another boolean expression more lines #else still more lines #endif They may be nested to a maximum depth of 30 and may occur anywhere (except in a comment, of course). The # must be the very first character on the line. Each boolean expression is made up of terms that look like function calls, the standard C boolean operators &&, ||, and !, and the standard relational operators ==, !=, >, >=, <, and <=, with == and != being overloaded to allow string comparisons as well. && represents logical AND; || is logical OR and ! is logical NOT. The arithmetic and string operators take precedence over all three of these operators, while NOT takes precedence over AND, which takes precedence over OR. This precedence may be overridden with parentheses, and an expression may be parenthesized to your heart's content. Each term looks like a call on one of four functions: make The syntax is make(target) where target is a target in the makefile. This is true if the given target was specified on the command line, or as the source for a .MAIN target (note that the sources for .MAIN are only used if no targets were given on the command line). defined The syntax is defined(variable) and is true if variable is defined. Certain variables are defined in the system makefile that identify the system on which PMake is being run. exists The syntax is exists(file) and is true if the file can be found on the global search path (i.e. that defined by .PATH targets, not by .PATHsuffix targets). empty This syntax is much like the others, except the string inside the parentheses is of the same form as you would put between parentheses when expanding a variable, complete with modifiers and everything. The function returns true if the resulting string is empty. An undefined variable in this context will cause at the very least a warning message about a malformed conditional, and at the worst will cause the process to stop once it has read the makefile. If you want to check for a variable being defined or empty, use the expression: !defined(var) || empty(var) as the definition of || will prevent the empty() from being evaluated and causing an error, if the variable is undefined. This can be used to see if a variable contains a given word, for example: #if !empty(var:Mword) The arithmetic and string operators may only be used to test the value of a variable. The lefthand side must contain the variable expansion, while the righthand side contains either a string, enclosed in double-quotes, or a number. The standard C numeric conventions (except for specifying an octal number) apply to both sides. E.g.: #if $(OS) == 4.3 #if $(MACHINE) == "sun3" #if $(LOAD_ADDR) > 0xc000 are all valid conditionals. In addition, the numeric value of a variable can be tested as a boolean as follows: #if $(LOAD) would see if LOAD contains a non-zero value and: #if !$(LOAD) would test if LOAD contains a zero value. In addition to the bare #if, there are other forms that apply one of the first two functions to each term. They are as follows: ifdef defined ifndef !defined ifmake make ifnmake !make There are also the else if forms: elif, elifdef, elifndef, elifmake, and elifnmake.
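To tie the pieces together, here is a small sketch (MACHINE is one of the system-identification variables mentioned under defined(); DEFS is a hypothetical variable). Note the defined() guard in front of empty(): just as with ||, the short-circuiting && keeps the empty() from being evaluated, and complaining, when DEFS is undefined.

#if $(MACHINE) == "sun3"
CFLAGS += -DSUN3
#elif defined(DEFS) && !empty(DEFS:Mprofile)
CFLAGS += -pg
#else
CFLAGS += -DGENERIC
#endif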
For instance, if you wish to create two versions of a program, one of which is optimized (the production version) and the other of which is for debugging (has symbols for dbx), you have two choices: you can create two makefiles, one of which uses the -O flag for the compilation, while the other uses the -g flag, or you can use another target (call it debug) to create the debug version. The construct below will take care of this for you. I have also made it so defining the variable DEBUG (say with pmake -D DEBUG) will also cause the debug version to be made. #if defined(DEBUG) || make(debug) CFLAGS += -g #else CFLAGS += -O #endif There are, of course, problems with this approach. The most glaring annoyance is that if you want to go from making a debug version to making a production version, you have to remove all the object files, or you will get some optimized and some debug versions in the same program. Another annoyance is you have to be careful not to make two targets that conflict because of some conditionals in the makefile. For instance: #if make(print) FORMATTER = ditroff -Plaser_printer #endif #if make(draft) FORMATTER = nroff -Pdot_matrix_printer #endif would wreak havoc if you tried pmake draft print since you would use the same formatter for each target. As I said, this all gets somewhat complicated.
A Shell is a Shell is a Shell In normal operation, the Bourne Shell (better known as sh) is used to execute the commands to re-create targets. PMake also allows you to specify a different shell for it to use when executing these commands. There are several things PMake must know about the shell you wish to use. These things are specified as the sources for the .SHELL target by keyword, as follows: path=path PMake needs to know where the shell actually resides, so it can execute it. If you specify this and nothing else, PMake will use the last component of the path and look in its table of the shells it knows and use the specification it finds, if any. Use this if you just want to use a different version of the Bourne or C Shell (yes, PMake knows how to use the C Shell too). name=name This is the name by which the shell is to be known. It is a single word and, if no other keywords are specified (other than path), it is the name by which PMake attempts to find a specification for it (as mentioned above). You can use this if you would just rather use the C Shell than the Bourne Shell (.SHELL: name=csh will do it). quiet=echo-off command As mentioned before, PMake actually controls whether commands are printed by introducing commands into the shell's input stream. This keyword, and the next two, control what those commands are. The quiet keyword is the command used to turn echoing off. Once it is turned off, echoing is expected to remain off until the echo-on command is given. echo=echo-on command The command PMake should give to turn echoing back on again. filter=printed echo-off command Many shells will echo the echo-off command when it is given. This keyword tells PMake in what format the shell actually prints the echo-off command. Wherever PMake sees this string in the shell's output, it will delete it and any following whitespace, up to and including the next newline. See the example at the end of this section for more details. echoFlag=flag to turn echoing on Unless a target has been marked .SILENT, PMake wants to start the shell running with echoing on. To do this, it passes this flag to the shell as one of its arguments. If either this or the next flag begins with a -, the flags will be passed to the shell as separate arguments. Otherwise, the two will be concatenated (if they are used at the same time, of course). errFlag=flag to turn error checking on Likewise, unless a target is marked .IGNORE, PMake wishes error-checking to be on from the very start. To this end, it will pass this flag to the shell as an argument. The same rules for an initial - apply as for the echoFlag. check=command to turn error checking on Just as for echo-control, error-control is achieved by inserting commands into the shell's input stream. This is the command to make the shell check for errors. It also serves another purpose if the shell does not have error-control as commands, but I will get into that in a minute. Again, once error checking has been turned on, it is expected to remain on until it is turned off again. ignore=command to turn error checking off This is the command PMake uses to turn error checking off. It has another use if the shell does not do error-control, but I will tell you about that...now. hasErrCtl=yes or no This takes a value that is either yes or no. Now you might think that the existence of the check and ignore keywords would be enough to tell PMake if the shell can do error-control, but you would be wrong.
If hasErrCtl is yes, PMake uses the check and ignore commands in a straight-forward manner. If this is no, however, their use is rather different. In this case, the check command is used as a template, in which the string %s is replaced by the command that is about to be executed, to produce a command for the shell that will echo the command to be executed. The ignore command is also used as a template, again with %s replaced by the command to be executed, to produce a command that will execute the command to be executed and ignore any error it returns. When these strings are used as templates, you must provide newline(s) (\n) in the appropriate place(s). The strings that follow these keywords may be enclosed in single or double quotes (the quotes will be stripped off) and may contain the usual C backslash-characters (\n is newline, \r is return, \b is backspace, \' escapes a single-quote inside single-quotes, \" escapes a double-quote inside double-quotes). Now for an example. This is actually the contents of the <shx.mk> system makefile, and causes PMake to use the Bourne Shell in such a way that each command is printed as it is executed. That is, if more than one command is given on a line, each will be printed separately. Similarly, each time the body of a loop is executed, the commands within that loop will be printed, etc. The specification runs like this: # # This is a shell specification to have the Bourne shell echo # the commands just before executing them, rather than when it reads # them. Useful if you want to see how variables are being expanded, etc. # .SHELL : path=/bin/sh \ quiet="set -" \ echo="set -x" \ filter="+ set - " \ echoFlag=x \ errFlag=e \ hasErrCtl=yes \ check="set -e" \ ignore="set +e" It tells PMake the following: The shell is located in the file /bin/sh. It need not tell PMake that the name of the shell is sh as PMake can figure that out for itself (it is the last component of the path). The command to stop echoing is set -. The command to start echoing is set -x. When the echo off command is executed, the shell will print + set - (The + comes from using the -x flag (rather than the -v flag PMake usually uses)). PMake will remove all occurrences of this string from the output, so you do not notice extra commands you did not put there. The flag the Bourne Shell will take to start echoing in this way is the -x flag. The Bourne Shell will only take its flag arguments concatenated as its first argument, so neither this nor the errFlag specification begins with a -. The flag to use to turn error-checking on from the start is -e. The shell can turn error-checking on and off, and the commands to do so are set -e and set +e, respectively. I should note that this specification is for Bourne Shells that are not part of Berkeley &unix;, as shells from Berkeley do not do error control. You can get a similar effect, however, by changing the last three lines to be: hasErrCtl=no \ check="echo \"+ %s\"\n" \ ignore="sh -c '%s || exit 0'\n" This will cause PMake to execute the two commands: echo "+ cmd" sh -c 'cmd || exit 0' for each command for which errors are to be ignored. (In case you are wondering, the thing for ignore tells the shell to execute another shell without error checking on and always exit 0, since the || causes the exit 0 to be executed only if the first command exited non-zero, and if the first command exited zero, the shell will also exit zero, since that is the last command it executed).
Compatibility There are three (well, 3 1/2) levels of backwards-compatibility built into PMake. Most makefiles will need none at all. Some may need a little bit of work to operate correctly when run in parallel. Each level encompasses the previous levels (e.g. DEFCON 2 (one shell per command) also implies the variable expansion of DEFCON 3). The three levels are described in the following three sections.
DEFCON 3 – Variable Expansion As noted before, PMake will not expand a variable unless it knows of a value for it. This can cause problems for makefiles that expect to leave variables undefined except in special circumstances (e.g. if more flags need to be passed to the C compiler or the output from a text processor should be sent to a different printer). If the variables are enclosed in curly braces (${PRINTER}), the shell will let them pass. If they are enclosed in parentheses, however, the shell will declare a syntax error and the make will come to a grinding halt. You have two choices: change the makefile to define the variables (their values can be overridden on the command line, since that is where they would have been set if you used Make, anyway) or always give PMake the flag that selects old-style variable expansion (this can be done with the .MAKEFLAGS target, if you want).
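A sketch of the failure mode (PRINTER is deliberately left undefined here):

doc.out : doc.ms
	ditroff -P${PRINTER} doc.ms

Because PMake knows no value for PRINTER, it passes ${PRINTER} through for the shell to expand; had the command been written with $(PRINTER), the shell would have raised the syntax error described above. Adding a default definition such as PRINTER = lw near the top of the makefile (and overriding it on the command line when needed) sidesteps the problem entirely.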
DEFCON 2 – The Number of the Beast Then there are the makefiles that expect certain commands, such as changing to a different directory, to not affect other commands in a target's creation script. You can solve this either by going back to executing one shell per command (which is what the -B flag forces PMake to do), which slows the process down a good bit and requires you to use semicolons and escaped newlines for shell constructs, or by changing the makefile to execute the offending command(s) in a subshell (by placing the line inside parentheses), like so: install :: .MAKE (cd src; $(.PMAKE) install) (cd lib; $(.PMAKE) install) (cd man; $(.PMAKE) install) This will always execute the three makes (even if the -n flag was given) because of the combination of the :: operator and the .MAKE attribute. Each command will change to the proper directory to perform the install, leaving the main shell in the directory in which it started.
DEFCON 1 – Imitation is Not the Highest Form of Flattery The final category of makefile is the one where every command requires input, the dependencies are incompletely specified, or you simply cannot create more than one target at a time, as mentioned earlier. In addition, you may not have the time or desire to upgrade the makefile to run smoothly with PMake. If you are the conservative sort, this is the compatibility mode for you. It is entered either by giving PMake the -M flag (for Make), or by executing PMake as make. In either case, PMake performs things exactly like Make (while still supporting most of the nice new features PMake provides). This includes: No parallel execution. Targets are made in the exact order specified by the makefile. The sources for each target are made in strict left-to-right order, etc. A single Bourne shell is used to execute each command, thus the shell's $$ variable is useless, changing directories does not work across command lines, etc. If no special characters exist in a command line, PMake will break the command into words itself and execute the command directly, without executing a shell first. The characters that cause PMake to execute a shell are: #, =, |, ^, (, ), {, }, ;, &, >, <, *, ?, [, ], :, $, `, and \. You should notice that these are all the characters that are given special meaning by the shell (except ' and ", which PMake deals with all by its lonesome). The use of the null suffix is turned off.
The Way Things Work When PMake reads the makefile, it parses sources and targets into nodes in a graph. The graph is directed only in the sense that PMake knows which way is up. Each node contains not only links to all its parents and children (the nodes that depend on it and those on which it depends, respectively), but also a count of the number of its children that have already been processed. The most important thing to know about how PMake uses this graph is that the traversal is breadth-first and occurs in two passes. After PMake has parsed the makefile, it begins with the nodes the user has told it to make (either on the command line, or via a .MAIN target, or by the target being the first in the file not labeled with the .NOTMAIN attribute) placed in a queue. It continues to take the node off the front of the queue, mark it as something that needs to be made, pass the node to Suff_FindDeps (mentioned earlier) to find any implicit sources for the node, and place all the node's children that have yet to be marked at the end of the queue. If any of the children is a .USE rule, its attributes are applied to the parent, then its commands are appended to the parent's list of commands and its children are linked to its parent. The parent's unmade children counter is then decremented (since the .USE node has been processed). You will note that this allows a .USE node to have children that are .USE nodes and the rules will be applied in sequence. If the node has no children, it is placed at the end of another queue to be examined in the second pass. This process continues until the first queue is empty. At this point, all the leaves of the graph are in the examination queue. PMake removes the node at the head of the queue and sees if it is out-of-date. If it is, it is passed to a function that will execute the commands for the node asynchronously. When the commands have completed, all the node's parents have their unmade children counter decremented and, if the counter is then 0, they are placed on the examination queue. The same happens if the node turns out to be up-to-date. Only those parents that were marked on the downward pass are processed in this way. Thus PMake traverses the graph back up to the nodes the user instructed it to create. When the examination queue is empty and no shells are running to create a target, PMake is finished. Once all targets have been processed, PMake executes the commands attached to the .END target, either explicitly or through the use of an ellipsis in a shell script. If there were no errors during the entire process but there are still some targets unmade (PMake keeps a running count of how many targets are left to be made), there is a cycle in the graph. PMake does a depth-first traversal of the graph to find all the targets that were not made and prints them out one by one.
diff --git a/en_US.ISO8859-1/books/pmake/shortcuts/chapter.sgml b/en_US.ISO8859-1/books/pmake/shortcuts/chapter.sgml index 2e14f7c052..097b5e9284 100644 --- a/en_US.ISO8859-1/books/pmake/shortcuts/chapter.sgml +++ b/en_US.ISO8859-1/books/pmake/shortcuts/chapter.sgml @@ -1,1229 +1,1229 @@ Short-cuts and Other Nice Things Based on what I have told you so far, you may have gotten the impression that PMake is just a way of storing away commands and making sure you do not forget to compile something. Good. That is just what it is. However, the ways I have described have been inelegant, at best, and painful, at worst. This chapter contains things that make the writing of makefiles easier and the makefiles themselves shorter and easier to modify (and, occasionally, simpler). In this chapter, I assume you are somewhat more familiar with Sprite (or &unix;, if that is what you are using) than I did in the previous chapter, just so you are on your toes. So without further ado…
Transformation Rules As you know, a file's name consists of two parts: a base name, which gives some hint as to the contents of the file, and a suffix, which usually indicates the format of the file. Over the years, as &unix; has developed, naming conventions, with regard to suffixes, have also developed that have become almost as incontrovertible as Law. E.g. a file ending in .c is assumed to contain C source code; one with a .o suffix is assumed to be a compiled, relocatable object file that may be linked into any program; a file with a .ms suffix is usually a text file to be processed by Troff with the -ms macro package, and so on. One of the best aspects of both Make and PMake comes from their understanding of how the suffix of a file pertains to its contents and their ability to do things with a file based solely on its suffix. This ability comes from something known as a transformation rule. A transformation rule specifies how to change a file with one suffix into a file with another suffix. A transformation rule looks much like a dependency line, except the target is made of two known suffixes stuck together. Suffixes are made known to PMake by placing them as sources on a dependency line whose target is the special target .SUFFIXES. E.g.: .SUFFIXES : .o .c .c.o : $(CC) $(CFLAGS) -c $(.IMPSRC) The creation script attached to the target is used to transform a file with the first suffix (in this case, .c) into a file with the second suffix (here, .o). In addition, the target inherits whatever attributes have been applied to the transformation rule. The simple rule given above says that to transform a C source file into an object file, you compile it using cc with the -c flag. This rule is taken straight from the system makefile. Many transformation rules (and suffixes) are defined there, and I refer you to it for more examples (type pmake -h to find out where it is). There are several things to note about the transformation rule given above: The .IMPSRC variable. This variable is set to the implied source (the file from which the target is being created; the one with the first suffix), which, in this case, is the .c file. The CFLAGS variable. Almost all of the transformation rules in the system makefile are set up using variables that you can alter in your makefile to tailor the rule to your needs. In this case, if you want all your C files to be compiled with the -g flag, to provide information for dbx, you would set the CFLAGS variable to contain -g (CFLAGS = -g) and PMake would take care of the rest. To give you a quick example, the makefile from the earlier example could be changed to this: OBJS = a.o b.o c.o program : $(OBJS) $(CC) -o $(.TARGET) $(.ALLSRC) $(OBJS) : defs.h The transformation rule I gave above takes the place of these 6 lines (this is also somewhat cleaner, I think, than the dynamic source solution presented earlier): a.o : a.c cc -c a.c b.o : b.c cc -c b.c c.o : c.c cc -c c.c Now you may be wondering about the dependency between the .o and .c files – it is not mentioned anywhere in the new makefile. This is because it is not needed: one of the effects of applying a transformation rule is the target comes to depend on the implied source. That's why it is called the implied source. Now for a more detailed example.
Say you have a makefile like this: a.out : a.o b.o $(CC) $(.ALLSRC) and a directory set up like this: total 4 -rw-rw-r-- 1 deboor 34 Sep 7 00:43 Makefile -rw-rw-r-- 1 deboor 119 Oct 3 19:39 a.c -rw-rw-r-- 1 deboor 201 Sep 7 00:43 a.o -rw-rw-r-- 1 deboor 69 Sep 7 00:43 b.c While just typing pmake will do the right thing, it is much more informative to type pmake -d s. This will show you what PMake is up to as it processes the files. In this case, PMake prints the following: Suff_FindDeps (a.out) using existing source a.o applying .o -> .out to "a.o" Suff_FindDeps (a.o) trying a.c...got it applying .c -> .o to "a.c" Suff_FindDeps (b.o) trying b.c...got it applying .c -> .o to "b.c" Suff_FindDeps (a.c) trying a.y...not there trying a.l...not there trying a.c,v...not there trying a.y,v...not there trying a.l,v...not there Suff_FindDeps (b.c) trying b.y...not there trying b.l...not there trying b.c,v...not there trying b.y,v...not there trying b.l,v...not there --- a.o --- cc -c a.c --- b.o --- cc -c b.c --- a.out --- cc a.o b.o Suff_FindDeps is the name of a function in PMake that is called to check for implied sources for a target using transformation rules. The transformations it tries are, naturally enough, limited to the ones that have been defined (a transformation may be defined multiple times, by the way, but only the most recent one will be used). You will notice, however, that there is a definite order to the suffixes that are tried. This order is set by the relative positions of the suffixes on the .SUFFIXES line – the earlier a suffix appears, the earlier it is checked as the source of a transformation. Once a suffix has been defined, the only way to change its position in the pecking order is to remove all the suffixes (by having a .SUFFIXES dependency line with no sources) and redefine them in the order you want. (Previously-defined transformation rules will be automatically redefined as the suffixes they involve are re-entered.) Another way to affect the search order is to make the dependency explicit. In the above example, a.out depends on a.o and b.o. Since a transformation exists from .o to .out, PMake uses that, as indicated by the using existing source a.o message. The search for a transformation starts from the suffix of the target and continues through all the defined transformations, in the order dictated by the suffix ranking, until an existing file with the same base (the target name minus the suffix and any leading directories) is found. At that point, one or more transformation rules will have been found to change the one existing file into the target. For example, ignoring what's in the system makefile for now, say you have a makefile like this: .SUFFIXES : .out .o .c .y .l .l.c : lex $(.IMPSRC) mv lex.yy.c $(.TARGET) .y.c : yacc $(.IMPSRC) mv y.tab.c $(.TARGET) .c.o : cc -c $(.IMPSRC) .o.out : cc -o $(.TARGET) $(.IMPSRC) and the single file jive.l.
If you were to type pmake -rd ms jive.out, you would get the following output for jive.out: Suff_FindDeps (jive.out) trying jive.o...not there trying jive.c...not there trying jive.y...not there trying jive.l...got it applying .l -> .c to "jive.l" applying .c -> .o to "jive.c" applying .o -> .out to "jive.o" and this is why: PMake starts with the target jive.out, figures out its suffix (.out) and looks for things it can transform to a .out file. In this case, it only finds .o, so it looks for the file jive.o. It fails to find it, so it looks for transformations into a .o file. Again it has only one choice: .c. So it looks for jive.c and, as you know, fails to find it. At this point it has two choices: it can create the .c file from either a .y file or a .l file. Since .y came first on the .SUFFIXES line, it checks for jive.y first, but can not find it, so it looks for jive.l and, lo and behold, there it is. At this point, it has defined a transformation path as follows: .l -> .c -> .o -> .out and applies the transformation rules accordingly. For completeness, and to give you a better idea of what PMake actually did with this three-step transformation, this is what PMake printed for the rest of the process: Suff_FindDeps (jive.o) using existing source jive.c applying .c -> .o to "jive.c" Suff_FindDeps (jive.c) using existing source jive.l applying .l -> .c to "jive.l" Suff_FindDeps (jive.l) Examining jive.l...modified 17:16:01 Oct 4, 1987...up-to-date Examining jive.c...non-existent...out-of-date --- jive.c --- lex jive.l ... meaningless lex output deleted ... mv lex.yy.c jive.c Examining jive.o...non-existent...out-of-date --- jive.o --- cc -c jive.c Examining jive.out...non-existent...out-of-date --- jive.out --- cc -o jive.out jive.o One final question remains: what does PMake do with targets that have no known suffix? PMake simply pretends it actually has a known suffix and searches for transformations accordingly. The suffix it chooses is the source for the .NULL target mentioned later. In the system makefile, .out is chosen as the null suffix because most people use PMake to create programs. You are, however, free and welcome to change it to a suffix of your own choosing. The null suffix is ignored, however, when PMake is in compatibility mode (see the section on compatibility).
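Changing the null suffix is a one-liner. For example, if most of your targets were executables named with a .exe suffix (a hypothetical setup), you might say:

.SUFFIXES : .exe
.NULL : .exe

after which suffix-less targets are treated as if they ended in .exe when transformations are searched for. (The suffix should already be known to PMake, hence the .SUFFIXES line.)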
Including Other Makefiles Just as for programs, it is often useful to extract certain parts of a makefile into another file and just include it in other makefiles somehow. Many compilers allow you to say something like: #include "defs.h" to include the contents of defs.h in the source file. PMake allows you to do the same thing for makefiles, with the added ability to use variables in the filenames. An include directive in a makefile looks either like this: #include <file> or this: #include "file" The difference between the two is where PMake searches for the file: the first way, PMake will look for the file only in the system makefile directory (or directories) (to find out what that directory is, give PMake the -h flag). The system makefile directory search path can be overridden via the -m option. For files in double-quotes, the search is more complex: The directory of the makefile that's including the file. The current directory (the one in which you invoked PMake). The directories given by you using -I flags, in the order in which you gave them. Directories given by .PATH dependency lines (see the section on search paths). The system makefile directory. in that order. You are free to use PMake variables in the filename – PMake will expand them before searching for the file. You must specify the searching method with either angle brackets or double-quotes outside of a variable expansion. I.e. the following: SYSTEM = <command.mk> #include $(SYSTEM) will not work.
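The expansion does work when it stays inside the quotes or brackets. A small sketch (the directory and file names are hypothetical):

MKDIR = ../mk

#include "$(MKDIR)/defs.mk"

Here PMake expands $(MKDIR) first and then performs the double-quote style search for ../mk/defs.mk.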
Saving Commands There may come a time when you will want to save certain commands to be executed when everything else is done. For instance: you are making several different libraries at one time and you want to create the members in parallel. Problem is, ranlib is another one of those programs that can not be run more than once in the same directory at the same time (each one creates a file called __.SYMDEF into which it stuffs information for the linker to use. Two of them running at once will overwrite each other's file and the result will be garbage for both parties). You might want a way to save the ranlib commands til the end so they can be run one after the other, thus keeping them from trashing each other's file. PMake allows you to do this by inserting an ellipsis (...) as a command between commands to be run at once and those to be run later. So for the ranlib case above, you might do this: lib1.a : $(LIB1OBJS) rm -f $(.TARGET) ar cr $(.TARGET) $(.ALLSRC) ... ranlib $(.TARGET) lib2.a : $(LIB2OBJS) rm -f $(.TARGET) ar cr $(.TARGET) $(.ALLSRC) ... ranlib $(.TARGET) This would save both ranlib $(.TARGET) commands until the end, when they would run one after the other (using the correct value for the .TARGET variable, of course). Commands saved in this manner are only executed if PMake manages to re-create everything without an error.
Target Attributes PMake allows you to give attributes to targets by means of special sources. Like everything else PMake uses, these sources begin with a period and are made up of all upper-case letters. There are various reasons for using them, and I will try to give examples for most of them. Others you will have to find uses for yourself. Think of it as an exercise for the reader. By placing one (or more) of these as a source on a dependency line, you are marking the target(s) with that attribute. That is just the way I phrase it, so you know. Any attributes given as sources for a transformation rule are applied to the target of the transformation rule when the rule is applied. .DONTCARE If a target is marked with this attribute and PMake can not figure out how to create it, it will ignore this fact and assume the file is not really needed or actually exists and PMake just can not find it. This may prove wrong, but the error will be noted later on, not when PMake tries to create the target so marked. This attribute also prevents PMake from attempting to touch the target if it is given the -t flag. .EXEC This attribute causes its shell script to be executed while having no effect on targets that depend on it. This makes the target into a sort of subroutine. An example. Say you have some LISP files that need to be compiled and loaded into a LISP process. To do this, you echo LISP commands into a file and execute a LISP with this file as its input when everything is done. Say also that you have to load other files from another system before you can compile your files and further, that you do not want to go through the loading and dumping unless one of your files has changed. Your makefile might look a little bit like this (remember, this is an educational example, and do not worry about the COMPILE rule, all will soon become clear, grasshopper): system : init a.fasl b.fasl c.fasl for i in $(.ALLSRC); do echo -n '(load "' >> input echo -n ${i} >> input echo '")' >> input done echo '(dump "$(.TARGET)")' >> input lisp < input a.fasl : a.l init COMPILE b.fasl : b.l init COMPILE c.fasl : c.l init COMPILE COMPILE : .USE echo '(compile "$(.ALLSRC)")' >> input init : .EXEC echo '(load-system)' > input .EXEC sources do not appear in the local variables of targets that depend on them (nor are they touched if PMake is given the -t flag). Note that all the rules, not just that for system, include init as a source. This is because none of the other targets can be made until init has been made, thus they depend on it. .EXPORT This is used to mark those targets whose creation should be sent to another machine if at all possible. This may be used by some exportation schemes if the exportation is expensive. You should ask your system administrator if it is necessary. .EXPORTSAME Tells the export system that the job should be exported to a machine of the same architecture as the current one. Certain operations (e.g. running text through nroff) can be performed the same on any architecture (CPU and operating system type), while others (e.g. compiling a program with cc) must be performed on a machine with the same architecture. Not all export systems will support this attribute.
.IGNORE Giving a target the .IGNORE attribute causes PMake to ignore errors from any of the target's commands, as if they all had - before them. .INVISIBLE This allows you to specify one target as a source for another without the one affecting the other's local variables. Useful if, say, you have a makefile that creates two programs, one of which is used to create the other, so it must exist before the other is created. You could say prog1 : $(PROG1OBJS) prog2 MAKEINSTALL prog2 : $(PROG2OBJS) .INVISIBLE MAKEINSTALL where MAKEINSTALL is some complex .USE rule (see below) that depends on the .ALLSRC variable containing the right things. Without the .INVISIBLE attribute for prog2, the MAKEINSTALL rule could not be applied. This is not as useful as it should be, and the semantics may change (or the whole thing go away) in the not-too-distant future. .JOIN This is another way to avoid performing some operations in parallel while permitting everything else to be done so. Specifically it forces the target's shell script to be executed only if one or more of the sources was out-of-date. In addition, the target's name, in both its .TARGET variable and all the local variables of any target that depends on it, is replaced by the value of its .ALLSRC variable. As an example, suppose you have a program that has four libraries that compile in the same directory along with, and at the same time as, the program. You again have the problem with ranlib that I mentioned earlier, only this time it's more severe: you can not just put the ranlib off to the end since the program will need those libraries before it can be re-created. You can do something like this: program : $(OBJS) libraries cc -o $(.TARGET) $(.ALLSRC) libraries : lib1.a lib2.a lib3.a lib4.a .JOIN ranlib $(.OODATE) In this case, PMake will re-create the $(OBJS) as necessary, along with lib1.a, lib2.a, lib3.a and lib4.a. It will then execute ranlib on any library that was changed and set program's .ALLSRC variable to contain what's in $(OBJS) followed by lib1.a lib2.a lib3.a lib4.a. In case you are wondering, it is called .JOIN because it joins together different threads of the input graph at the target marked with the attribute. Another aspect of the .JOIN attribute is it keeps the target from being created if the -t flag was given. .MAKE The .MAKE attribute marks its target as being a recursive invocation of PMake. This forces PMake to execute the script associated with the target (if it is out-of-date) even if you gave the -n or -t flag. By doing this, you can start at the top of a system and type pmake -n and have it descend the directory tree (if your makefiles are set up correctly), printing what it would have executed if you had not included the -n flag. .NOEXPORT If possible, PMake will attempt to export the creation of all targets to another machine (this depends on how PMake was configured). Sometimes, the creation is so simple, it is pointless to send it to another machine. If you give the target the .NOEXPORT attribute, it will be run locally, even if you have told PMake to export everything it can. .NOTMAIN Normally, if you do not specify a target to make in any other way, PMake will take the first target on the first dependency line of a makefile as the target to create. That target is known as the Main Target and is labeled as such if you have PMake print out its dependency graph. Giving a target this attribute tells PMake that the target is definitely not the Main Target.
This allows you to place targets in an included makefile and have PMake create something else by default. .PRECIOUS When PMake is interrupted (you type control-C at the keyboard), it will attempt to clean up after itself by removing any half-made targets. If a target has the .PRECIOUS attribute, however, PMake will leave it alone. An additional side effect of the :: operator is to mark the targets as .PRECIOUS. .SILENT Marking a target with this attribute keeps its commands from being printed when they are executed, just as if they had an @ in front of them. .USE By giving a target this attribute, you turn it into PMake's equivalent of a macro. When the target is used as a source for another target, the other target acquires the commands, sources and attributes (except .USE) of the source. If the target already has commands, the .USE target's commands are added to the end. If more than one .USE-marked source is given to a target, the rules are applied sequentially. The typical .USE rule (as I call them) will use the sources of the target to which it is applied (as stored in the .ALLSRC variable for the target) as its arguments, if you will. For example, you probably noticed that the commands for creating lib1.a and lib2.a in the example in the section on saving commands were exactly the same. You can use the .USE attribute to eliminate the repetition, like so: lib1.a : $(LIB1OBJS) MAKELIB lib2.a : $(LIB2OBJS) MAKELIB MAKELIB : .USE rm -f $(.TARGET) ar cr $(.TARGET) $(.ALLSRC) ... ranlib $(.TARGET) Several system makefiles (not to be confused with The System Makefile) make use of these .USE rules to make your life easier (they are in the default system makefile directory... take a look). Note that the .USE rule source itself (MAKELIB) does not appear in any of the targets' local variables. There is no limit to the number of times I could use the MAKELIB rule. If there were more libraries, I could continue with lib3.a : $(LIB3OBJS) MAKELIB and so on and so forth.
Special Targets As there were in Make, so there are certain targets that have special meaning to PMake. When you use one on a dependency line, it is the only target that may appear on the left-hand-side of the operator. As for the attributes and variables, all the special targets begin with a period and consist of upper-case letters only. I will not describe them all in detail because some of them are rather complex and I will describe them in more detail than you will want later on. The targets are as follows: .BEGIN Any commands attached to this target are executed before anything else is done. You can use it for any initialization that needs doing. .DEFAULT This is sort of a .USE rule for any target (that was used only as a source) that PMake can not figure out any other way to create. It is only sort of a .USE rule because only the shell script attached to the .DEFAULT target is used. The .IMPSRC variable of a target that inherits .DEFAULT's commands is set to the target's own name. .END This serves a function similar to .BEGIN, in that commands attached to it are executed once everything has been re-created (so long as no errors occurred). It also serves the extra function of being a place on which PMake can hang commands you put off to the end. Thus the script for this target will be executed before any of the commands you save with the ellipsis (...). .EXPORT The sources for this target are passed to the exportation system compiled into PMake. Some systems will use these sources to configure themselves. You should ask your system administrator about this. .IGNORE This target marks each of its sources with the .IGNORE attribute. If you do not give it any sources, then it is like giving the -i flag when you invoke PMake – errors are ignored for all commands. .INCLUDES The sources for this target are taken to be suffixes that indicate a file that can be included in a program source file. The suffix must have already been declared with .SUFFIXES (see below). Any suffix so marked will have the directories on its search path (see .PATH, below) placed in the .INCLUDES variable, each preceded by a -I flag. This variable can then be used as an argument for the compiler in the normal fashion. The .h suffix is already marked in this way in the system makefile. E.g. if you have .SUFFIXES : .bitmap .PATH.bitmap : /usr/local/X/lib/bitmaps .INCLUDES : .bitmap PMake will place -I/usr/local/X/lib/bitmaps in the .INCLUDES variable and you can then say cc $(.INCLUDES) -c xprogram.c (Note: the .INCLUDES variable is not actually filled in until the entire makefile has been read.) .INTERRUPT When PMake is interrupted, it will execute the commands in the script for this target, if it exists. .LIBS This does for libraries what .INCLUDES does for include files, except the flag used is -L, as required by those linkers that allow you to tell them where to find libraries. The variable used is .LIBS. Be forewarned that PMake may not have been compiled to do this if the linker on your system does not accept the -L flag, though the .LIBS variable will always be defined once the makefile has been read. .MAIN If you did not give a target (or targets) to create when you invoked PMake, it will take the sources of this target as the targets to create. .MAKEFLAGS This target provides a way for you to always specify flags for PMake when the makefile is used. The flags are just as they would be typed to the shell (except you can not use shell variables unless they are in the environment), though a few flags have no effect.
.NULL This allows you to specify what suffix PMake should pretend a file has if, in fact, it has no known suffix. Only one suffix may be so designated. The last source on the dependency line is the suffix that is used (you should, however, only give one suffix...). .PATH If you give sources for this target, PMake will take them as directories in which to search for files it cannot find in the current directory. If you give no sources, it will clear out any directories added to the search path before. Since the effects of this all get very complex, I will leave it until the chapter on search paths to give you a complete explanation. .PATHsuffix This does a similar thing to .PATH, but it does it only for files with the given suffix. The suffix must have been defined already. Look at the Search Paths section for more information. .PRECIOUS Similar to .IGNORE, this gives the .PRECIOUS attribute to each source on the dependency line, unless there are no sources, in which case the .PRECIOUS attribute is given to every target in the file. .RECURSIVE This target applies the .MAKE attribute to all its sources. It does nothing if you do not give it any sources. .SHELL PMake is not constrained to only using the Bourne shell to execute the commands you put in the makefile. You can tell it some other shell to use with this target. Check out the section on shells (A Shell is a Shell is a Shell) for more information. .SILENT When you use .SILENT as a target, it applies the .SILENT attribute to each of its sources. If there are no sources on the dependency line, then it is as if you gave PMake the -s flag and no commands will be echoed. .SUFFIXES This is used to give new file suffixes for PMake to handle. Each source is a suffix PMake should recognize. If you give a .SUFFIXES dependency line with no sources, PMake will forget about all the suffixes it knew (this also nukes the null suffix). For those targets that need to have suffixes defined, this is how you do it. In addition to these targets, a line of the form: attribute : sources applies the attribute to all the targets listed as sources.
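For instance, the following two lines (the target names are hypothetical) mark clean with the .SILENT attribute and subdirs with the .MAKE attribute, exactly as if the attributes had been given as sources on those targets' own dependency lines:

.SILENT : clean
.MAKE : subdirs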
Modifying Variable Expansion Variables need not always be expanded verbatim. PMake defines several modifiers that may be applied to a variable's value before it is expanded. You apply a modifier by placing it after the variable name with a colon between the two, like so: ${VARIABLE:modifier} Each modifier is a single character followed by something specific to the modifier itself. You may apply as many modifiers as you want – each one is applied to the result of the previous and is separated from the previous by another colon. There are seven ways to modify a variable's expansion, most of which come from the C shell variable modification characters: Mpattern This is used to select only those words (a word is a series of characters that are neither spaces nor tabs) that match the given pattern. The pattern is a wildcard pattern like that used by the shell, where * means 0 or more characters of any sort; ? is any single character; [abcd] matches any single character that is either a, b, c or d (there may be any number of characters between the brackets); [0-9] matches any single character that is between 0 and 9 (i.e. any digit. This form may be freely mixed with the other bracket form), and \ is used to escape any of the characters *, ?, [ or :, leaving them as regular characters to match themselves in a word. For example, the system makefile <makedepend.mk> uses $(CFLAGS:M-[ID]*) to extract all the -I and -D flags that would be passed to the C compiler. This allows it to properly locate include files and generate the correct dependencies. Npattern This is identical to :M except it substitutes all words that don't match the given pattern. S/search-string/replacement-string/[g] Causes the first occurrence of search-string in the variable to be replaced by replacement-string, unless the g flag is given at the end, in which case all occurrences of the string are replaced. The substitution is performed on each word in the variable in turn. If search-string begins with a ^, the string must match starting at the beginning of the word. If search-string ends with a $, the string must match to the end of the word (these two may be combined to force an exact match). If a backslash precedes these two characters, however, they lose their special meaning. Variable expansion also occurs in the normal fashion inside both the search-string and the replacement-string, except that a backslash is used to prevent the expansion of a $, not another dollar sign, as is usual. Note that search-string is just a string, not a pattern, so none of the usual regular-expression/wildcard characters have any special meaning save ^ and $. In the replacement string, the & character is replaced by the search-string unless it is preceded by a backslash. You are allowed to use any character except colon or exclamation point to separate the two strings. This so-called delimiter character may be placed in either string by preceding it with a backslash. T Replaces each word in the variable expansion by its last component (its tail). For example, given: OBJS = ../lib/a.o b /usr/lib/libm.a TAILS = $(OBJS:T) the variable TAILS would expand to a.o b libm.a. H This is similar to :T, except that every word is replaced by everything but the tail (the head). Using the same definition of OBJS, the string $(OBJS:H) would expand to ../lib /usr/lib. Note that the final slash on the heads is removed and anything without a head is replaced by the empty string. E :E replaces each word by its suffix (extension). So $(OBJS:E) would give you .o .a.
R This replaces each word by everything but the suffix (the root of the word). $(OBJS:R) expands to ../lib/a b /usr/lib/libm. In addition, the System V style of substitution is also supported. This looks like: $(VARIABLE:search-string=replacement) It must be the last modifier in the chain. The search is anchored at the end of each word, so only suffixes or whole words may be replaced.
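Putting a few of the modifiers together (SRCS here is a hypothetical variable):

SRCS = main.c parse.c ../compat/util.c

OBJS = $(SRCS:M*.c:S/.c$/.o/)
NAMES = $(SRCS:T:R)

OBJS first selects the words ending in .c, then substitutes .o for the .c at the end of each word (the $ anchors the substitution at the end, as described above), giving main.o parse.o ../compat/util.o. NAMES takes the tail of each word and then strips the suffix, giving main parse util. For this value of SRCS, the System V form $(SRCS:.c=.o) would produce the same words as OBJS.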
More Exercises Exercise 3.1 You have got a set of programs, each of which is created from its own assembly-language source file (suffix .asm). Each program can be assembled into two versions, one with error-checking code assembled in and one without. You could assemble them into files with different suffixes (.eobj and .obj, for instance), but your linker only understands files that end in .obj. To top it all off, the final executables must have the suffix .exe. How can you still use transformation rules to make your life easier (Hint: assume the error-checking versions have ec tacked onto their prefix)? Exercise 3.2 Assume, for a moment or two, you want to perform a sort of indirection by placing the name of a variable into another one, then you want to get the value of the first by expanding the second somehow. Unfortunately, PMake does not allow constructs like: $($(FOO)) What do you do? Hint: no further variable expansion is performed after modifiers are applied, thus if you cause a $ to occur in the expansion, that is what will be in the result.
diff --git a/en_US.ISO8859-1/books/porters-handbook/book.sgml b/en_US.ISO8859-1/books/porters-handbook/book.sgml index e1c8eb5f96..9fcbf5d390 100644 --- a/en_US.ISO8859-1/books/porters-handbook/book.sgml +++ b/en_US.ISO8859-1/books/porters-handbook/book.sgml @@ -1,9933 +1,9933 @@ %books.ent; ]> FreeBSD Porter's Handbook The FreeBSD Documentation Project April 2000 2000 2001 2002 2003 2004 2005 2006 The FreeBSD Documentation Project &bookinfo.trademarks; &bookinfo.legalnotice; Introduction The FreeBSD ports collection is the way almost everyone installs applications ("ports") on FreeBSD. Like everything else about FreeBSD, it is primarily a volunteer effort. It is important to keep this in mind when reading this document. In FreeBSD, anyone may submit a new port, or volunteer to maintain an existing port if it is unmaintained—you do not need any special commit privileges to do so. Making a port yourself So, you are interested in making your own port or upgrading an existing one? Great! What follows are some guidelines for creating a new port for FreeBSD. If you want to upgrade an existing port, you should read this and then read . When this document is not sufficiently detailed, you should refer to /usr/ports/Mk/bsd.port.mk, which all port Makefiles include. Even if you do not hack Makefiles daily, it is well commented, and you will still gain much knowledge from it. Additionally, you may send specific questions to the &a.ports;. Only a fraction of the variables (VAR) that can be overridden are mentioned in this document. Most (if not all) are documented at the start of /usr/ports/Mk/bsd.port.mk; the others probably ought to be. Note that this file uses a non-standard tab setting: Emacs and Vim should recognize the setting on loading the file. Both &man.vi.1; and &man.ex.1; can be set to use the correct value by typing :set tabstop=4 once the file has been loaded. Quick Porting This section tells you how to do a quick port. In many cases, it is not sufficient, so you will have to read further on into the document. First, get the original tarball and put it into DISTDIR, which defaults to /usr/ports/distfiles. The following assumes that the software compiled out-of-the-box, i.e., there was absolutely no change required for the port to work on your FreeBSD box. If you needed to change something, you will have to refer to the next section too. Writing the <filename>Makefile</filename> The minimal Makefile would look something like this: # New ports collection makefile for: oneko # Date created: 5 December 1994 # Whom: asami # # $FreeBSD$ # PORTNAME= oneko PORTVERSION= 1.1b CATEGORIES= games MASTER_SITES= ftp://ftp.cs.columbia.edu/archives/X11R5/contrib/ MAINTAINER= asami@FreeBSD.org COMMENT= A cat chasing a mouse all over the screen MAN1= oneko.1 MANCOMPRESSED= yes USE_IMAKE= yes .include <bsd.port.mk> See if you can figure it out. Do not worry about the contents of the $FreeBSD$ line, it will be filled in automatically by CVS when the port is imported to our main ports tree. You can find a more detailed example in the sample Makefile section. Writing the description files There are two description files that are required for any port, whether they actually package or not. They are pkg-descr and pkg-plist. Their pkg- prefix distinguishes them from other files. <filename>pkg-descr</filename> This is a longer description of the port. One to a few paragraphs concisely explaining what the port does is sufficient. This is not a manual or an in-depth description on how to use or compile the port! 
Please be careful if you are copying from the README or manpage; too often they are not a concise description of the port or are in an awkward format (e.g., manpages have justified spacing). If the ported software has an official WWW homepage, you should list it here. Prefix one of the websites with WWW: so that automated tools will work correctly. The following example shows how your pkg-descr should look:

This is a port of oneko, in which a cat chases a poor mouse all over
the screen.
:
(etc.)

WWW: http://www.oneko.org/

<filename>pkg-plist</filename>

This file lists all the files installed by the port. It is also called the packing list because the package is generated by packing the files listed here. The pathnames are relative to the installation prefix (usually /usr/local or /usr/X11R6). If you are using the MANn variables (as you should be), do not list any manpages here. If the port creates directories during installation, make sure to add @dirrm lines to remove them when the package is deleted. Here is a small example:

bin/oneko
lib/X11/app-defaults/Oneko
lib/X11/oneko/cat1.xpm
lib/X11/oneko/cat2.xpm
lib/X11/oneko/mouse.xpm
@dirrm lib/X11/oneko

Refer to the &man.pkg.create.1; manual page for details on the packing list. It is recommended that you keep all the filenames in this file sorted alphabetically. It will make verifying the changes when you upgrade the port much easier. Creating a packing list manually can be a very tedious task. If the port installs a large number of files, creating the packing list automatically might save time.

There is only one case when pkg-plist can be omitted from a port. If the port installs just a handful of files, and perhaps directories, the files and directories may be listed in the variables PLIST_FILES and PLIST_DIRS, respectively, within the port's Makefile. For instance, we could get along without pkg-plist in the above oneko port by adding the following lines to the Makefile:

PLIST_FILES=	bin/oneko \
		lib/X11/app-defaults/Oneko \
		lib/X11/oneko/cat1.xpm \
		lib/X11/oneko/cat2.xpm \
		lib/X11/oneko/mouse.xpm
PLIST_DIRS=	lib/X11/oneko

Of course, PLIST_DIRS should be left unset if a port installs no directories of its own. The price for this way of listing a port's files and directories is that you cannot use the command sequences described in &man.pkg.create.1;. Therefore, it is suitable only for simple ports and makes them even simpler. At the same time, it has the advantage of reducing the number of files in the ports collection. Please consider using this technique before you resort to pkg-plist. Later we will see how pkg-plist and PLIST_FILES can be used to fulfil more sophisticated tasks.

Creating the checksum file

Just type make makesum. The ports make rules will automatically generate the file distinfo. If a file fetched has its checksum changed regularly and you are certain the source is trusted (i.e. it comes from manufacturer CDs or documentation generated daily), you should specify these files in the IGNOREFILES variable. Then the checksum is not calculated for that file when you run make makesum, but set to IGNORE.

Testing the port

You should make sure that the port rules do exactly what you want them to do, including packaging up the port. These are the important points you need to verify:
- pkg-plist does not contain anything not installed by your port
- pkg-plist contains everything that is installed by your port
- Your port can be installed multiple times using the reinstall target
- Your port cleans up after itself upon deinstall

Recommended test ordering:

1. make install
2. make package
3. make deinstall
4. pkg_add package-name
5. make deinstall
6. make reinstall
7. make package

Make sure that no warnings are issued in any of the package and deinstall stages. After step 3, check to see if all the new directories are correctly deleted. Also, try using the software after step 4, to ensure that it works correctly when installed from a package.

Checking your port with <command>portlint</command>

Please use portlint to see if your port conforms to our guidelines. The devel/portlint program is part of the ports collection. In particular, you may want to check if the Makefile is in the right shape and the package is named appropriately.

Submitting the port

First, make sure you have read the DOs and DON'Ts section. Now that you are happy with your port, the only thing remaining is to put it in the main FreeBSD ports tree and make everybody else happy about it too. We do not need your work directory or the pkgname.tgz package, so delete them now. Next, simply include the output of shar `find port_dir` in a bug report and send it with the &man.send-pr.1; program (see Bug Reports and General Commentary for more information about &man.send-pr.1;). Be sure to classify the bug report as category ports and class change-request (Do not mark the report confidential!). Also add a short description of the program you ported to the Description field of the PR and the shar to the Fix field. You can make our work a lot easier if you use a good description in the synopsis of the problem report. We prefer something like New port: <category>/<portname> <short description of the port> for new ports and Update port: <category>/<portname> <short description of the update> for port updates. If you stick to this scheme, the chance that someone will take a look at your PR soon is much better. One more time, do not include the original source distfile, the work directory, or the package you built with make package.

After you have submitted your port, please be patient. Sometimes it can take a few months before a port is included in FreeBSD, although it might only take a few days. You can view the list of ports waiting to be committed to FreeBSD. Once we have looked at your port, we will get back to you if necessary, and put it in the tree. Your name will also appear in the list of Additional FreeBSD Contributors and other files. Isn't that great?!? :-)

Slow Porting

Ok, so it was not that simple, and the port required some modifications to get it to work. In this section, we will explain, step by step, how to modify it to get it to work with the ports paradigm.

How things work

First, this is the sequence of events which occurs when the user first types make in your port's directory. You may find that having bsd.port.mk in another window while you read this really helps to understand it. But do not worry if you do not really understand what bsd.port.mk is doing, not many people do... :->

The fetch target is run. The fetch target is responsible for making sure that the tarball exists locally in DISTDIR. If fetch cannot find the required files in DISTDIR it will look up the URLs in MASTER_SITES, which is set in the Makefile, as well as our main FTP site at ftp.FreeBSD.org, where we put sanctioned distfiles as backup.
It will then attempt to fetch the named distribution file with FETCH, assuming that the requesting site has direct access to the Internet. If that succeeds, it will save the file in DISTDIR for future use and proceed.

The extract target is run. It looks for your port's distribution file (typically a gzip'd tarball) in DISTDIR and unpacks it into a temporary subdirectory specified by WRKDIR (defaults to work).

The patch target is run. First, any patches defined in PATCHFILES are applied. Second, if any patch files named patch-* are found in PATCHDIR (defaults to the files subdirectory), they are applied at this time in alphabetical order.

The configure target is run. This can do any one of many different things. If it exists, scripts/configure is run. If HAS_CONFIGURE or GNU_CONFIGURE is set, WRKSRC/configure is run. If USE_IMAKE is set, XMKMF (default: xmkmf -a) is run.

The build target is run. This is responsible for descending into the port's private working directory (WRKSRC) and building it. If USE_GMAKE is set, GNU make will be used, otherwise the system make will be used.

The above are the default actions. In addition, you can define targets pre-something or post-something, or put scripts with those names in the scripts subdirectory, and they will be run before or after the default actions are done. For example, if you have a post-extract target defined in your Makefile, and a file pre-build in the scripts subdirectory, the post-extract target will be called after the regular extraction actions, and the pre-build script will be executed before the default build rules are done. It is recommended that you use Makefile targets if the actions are simple enough, because it will be easier for someone to figure out what kind of non-default action the port requires.

The default actions are done by the bsd.port.mk targets do-something. For example, the commands to extract a port are in the target do-extract. If you are not happy with the default target, you can fix it by redefining the do-something target in your Makefile. The main targets (e.g., extract, configure, etc.) do nothing more than make sure all the stages up to that one are completed and call the real targets or scripts, and they are not intended to be changed. If you want to fix the extraction, fix do-extract, but never ever change the way extract operates!

Now that you understand what goes on when the user types make, let us go through the recommended steps to create the perfect port.

Getting the original sources

Get the original sources (normally) as a compressed tarball (foo.tar.gz or foo.tar.Z) and copy it into DISTDIR. Always use mainstream sources when and where you can. You will need to set the variable MASTER_SITES to reflect where the original tarball resides. You will find convenient shorthand definitions for most mainstream sites in bsd.sites.mk. Please use these sites—and the associated definitions—if at all possible, to help avoid the problem of having the same information repeated many times over in the source base. As these sites tend to change over time, this becomes a maintenance nightmare for everyone involved. If you cannot find an FTP/HTTP site that is well-connected to the net, or can only find sites that have irritatingly non-standard formats, you might want to put a copy on a reliable FTP or HTTP server that you control (e.g., your home page).
If you cannot find somewhere convenient and reliable to put the distfile we can house it ourselves on ftp.FreeBSD.org; however, this is the least-preferred solution. The distfile must be placed into ~/public_distfiles/ of someone's freefall account. Ask the person who commits your port to do this. This person will also set MASTER_SITES to MASTER_SITE_LOCAL and MASTER_SITE_SUBDIR to their freefall username. If your port's distfile changes all the time without any kind of version update by the author, consider putting the distfile on your home page and listing it as the first MASTER_SITES. If you can, try to talk the port author out of doing this; it really does help to establish some kind of source code control. Hosting your own version will prevent users from getting checksum mismatch errors, and also reduce the workload of maintainers of our FTP site. Also, if there is only one master site for the port, it is recommended that you house a backup at your site and list it as the second MASTER_SITES.

If your port requires some additional patches that are available on the Internet, fetch them too and put them in DISTDIR. Do not worry if they come from a site other than where you got the main source tarball, we have a way to handle these situations (see the description of PATCHFILES below).

Modifying the port

Unpack a copy of the tarball in a private directory and make whatever changes are necessary to get the port to compile properly under the current version of FreeBSD. Keep careful track of everything you do, as you will be automating the process shortly. Everything, including the deletion, addition, or modification of files should be doable using an automated script or patch file when your port is finished. If your port requires significant user interaction/customization to compile or install, you should take a look at one of Larry Wall's classic Configure scripts and perhaps do something similar yourself. The goal of the new ports collection is to make each port as plug-and-play as possible for the end-user while using a minimum of disk space. Unless explicitly stated, patch files, scripts, and other files you have created and contributed to the FreeBSD ports collection are assumed to be covered by the standard BSD copyright conditions.

Patching

In the preparation of the port, files that have been added or changed can be picked up with a recursive &man.diff.1; for later feeding to &man.patch.1;. Each set of patches you wish to apply should be collected into a file named patch-* where * indicates the pathnames of the files that are patched, such as patch-Imakefile or patch-src-config.h. These files should be stored in PATCHDIR, from where they will be automatically applied. All patches must be relative to WRKSRC (generally the directory your port's tarball unpacks itself into, that being where the build is done). To make fixes and upgrades easier, you should avoid having more than one patch fix the same file (e.g., patch-file and patch-file2 both changing WRKSRC/foobar.c). Please use only the characters [-+._a-zA-Z0-9] for naming your patches. Do not use any other characters besides them. Do not name your patches patch-aa or patch-ab and so on; always mention the path and file name in patch names.

Do not put RCS strings in patches. CVS will mangle them when we put the files into the ports tree, and when we check them out again, they will come out different and the patch will fail. RCS strings are surrounded by dollar ($) signs, and typically start with $Id or $RCS.
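Incidentally, patches kept in the port's files directory do not all have to be applied unconditionally. As a minimal sketch (the patch name and the version threshold are invented for illustration), an extra patch needed only on older systems can be listed in EXTRA_PATCHES, which the standard patch target honors; note that testing OSVERSION in a conditional requires including bsd.port.pre.mk first and bsd.port.post.mk at the end, instead of the usual single bsd.port.mk include:

# Hypothetical conditional patch: the extra-patch-* name keeps it from
# being picked up automatically like a regular patch-* file.
.include <bsd.port.pre.mk>

.if ${OSVERSION} < 500000
EXTRA_PATCHES=	${FILESDIR}/extra-patch-src-compat.c
.endif

.include <bsd.port.post.mk>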
Using the recurse (-r) option to &man.diff.1; to generate patches is fine, but please take a look at the resulting patches to make sure you do not have any unnecessary junk in there. In particular, diffs between two backup files, Makefiles when the port uses Imake or GNU configure, etc., are unnecessary and should be deleted. If you had to edit configure.in and run autoconf to regenerate configure, do not take the diffs of configure (it often grows to a few thousand lines!); define USE_AUTOCONF_VER=213 and take the diffs of configure.in.

Quite often, there is a situation when the software being ported, especially if it is primarily developed on &windows;, uses the CR/LF convention for most of its source files. This may cause problems with further patching, compiler warnings, script execution (/bin/sh^M not found), etc. To quickly convert those files from CR/LF to just LF, you can do something like this:

USE_REINPLACE=	yes

post-extract:
	@${FIND} -E ${WRKDIR} -type f -iregex ".*\.(c|cpp|h|txt)" -print0 | \
		${XARGS} -0 ${REINPLACE_CMD} -e 's/[[:cntrl:]]*$$//'

Of course, if you need to process each and every file, the -iregex filter above can be omitted. Be aware that this piece of code will strip all trailing control characters from each line of every processed file (except \n). Also, if you had to delete a file, then you can do it in the post-extract target rather than as part of the patch. Once you are happy with the resulting diff, please split it up into one source file per patch file.

Configuring

Include any additional customization commands in your configure script and save it in the scripts subdirectory. As mentioned above, you can also do this with Makefile targets and/or scripts with the name pre-configure or post-configure.

Handling user input

If your port requires user input to build, configure, or install, you must set IS_INTERACTIVE in your Makefile. This will allow overnight builds to skip your port if the user sets the variable BATCH in his environment (and if the user sets the variable INTERACTIVE, then only those ports requiring interaction are built). This will save a lot of wasted time on the set of machines that continually build ports (see below). It is also recommended that if there are reasonable default answers to the questions, you check the PACKAGE_BUILDING variable and turn off the interactive script when it is set. This will allow us to build the packages for CDROMs and FTP.

Configuring the Makefile

Configuring the Makefile is pretty simple, and again we suggest that you look at existing examples before starting. Also, there is a sample Makefile in this handbook, so take a look and please follow the ordering of variables and sections in that template to make your port easier for others to read. Now, consider the following problems in sequence as you design your new Makefile:

The original source

Does it live in DISTDIR as a standard gzip'd tarball named something like foozolix-1.2.tar.gz? If so, you can go on to the next step. If not, you should look at overriding any of the DISTVERSION, DISTNAME, EXTRACT_CMD, EXTRACT_BEFORE_ARGS, EXTRACT_AFTER_ARGS, EXTRACT_SUFX, or DISTFILES variables, depending on how alien a format your port's distribution file is. (The most common case is EXTRACT_SUFX=.tar.Z, when the tarball is condensed by regular compress, not gzip.) In the worst case, you can simply create your own do-extract target to override the default, though this should be rarely, if ever, necessary.
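For what it is worth, here is a minimal sketch of such a do-extract override, assuming a hypothetical single gzip'd distfile that unpacks into the bare current directory instead of a proper subdirectory (${MKDIR}, ${GZIP_CMD}, and ${TAR} are among the command macros bsd.port.mk provides):

# Create ${WRKSRC} ourselves, then unpack the distfile inside it,
# mirroring what the default extraction would otherwise do.
do-extract:
	@${MKDIR} ${WRKSRC}
	@cd ${WRKSRC} && \
		${GZIP_CMD} -dc ${DISTDIR}/${DISTFILES} | ${TAR} -xf -

Remember that the main extract target still runs as usual; only the do-extract stage is replaced.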
Naming

The first part of the port's Makefile names the port, describes its version number, and lists it in the correct category.

<makevar>PORTNAME</makevar> and <makevar>PORTVERSION</makevar>

You should set PORTNAME to the base name of your port, and PORTVERSION to the version number of the port.

<makevar>PORTREVISION</makevar> and <makevar>PORTEPOCH</makevar>

<makevar>PORTREVISION</makevar>

The PORTREVISION variable is a monotonically increasing value which is reset to 0 with every increase of PORTVERSION (i.e. every time a new official vendor release is made), and appended to the package name if non-zero. Changes to PORTREVISION are used by automated tools (e.g. &man.pkg.version.1;) to highlight the fact that a new package is available. PORTREVISION should be increased each time a change is made to the port which significantly affects the content or structure of the derived package.

Examples of when PORTREVISION should be bumped:

- Addition of patches to correct security vulnerabilities, bugs, or to add new functionality to the port.
- Changes to the port Makefile to enable or disable compile-time options in the package.
- Changes in the packing list or the install-time behavior of the package (e.g. change to a script which generates initial data for the package, like ssh host keys).
- Version bump of a port's shared library dependency (in this case, someone trying to install the old package after installing a newer version of the dependency will fail since it will look for the old libfoo.x instead of libfoo.(x+1)).
- Silent changes to the port distfile which have significant functional differences, i.e. changes to the distfile requiring a correction to distinfo with no corresponding change to PORTVERSION, where a diff -ru of the old and new versions shows non-trivial changes to the code.

Examples of changes which do not require a PORTREVISION bump:

- Style changes to the port skeleton with no functional change to what appears in the resulting package.
- Changes to MASTER_SITES or other functional changes to the port which do not affect the resulting package.
- Trivial patches to the distfile such as correction of typos, which are not important enough that users of the package should go to the trouble of upgrading.
- Build fixes which cause a package to become compilable where it was previously failing (as long as the changes do not introduce any functional change on any other platforms on which the port did previously build). Since PORTREVISION reflects the content of the package, if the package was not previously buildable then there is no need to increase PORTREVISION to mark a change.

A rule of thumb is to ask yourself whether a change committed to a port is something which everyone would benefit from having (either because of an enhancement, fix, or by virtue that the new package will actually work at all), and weigh that against the fact that it will cause everyone who regularly updates their ports tree to be compelled to update. If yes, PORTREVISION should be bumped.

<makevar>PORTEPOCH</makevar>

From time to time a software vendor or FreeBSD porter will do something silly and release a version of their software which is actually numerically less than the previous version. An example of this is a port which goes from foo-20000801 to foo-1.0 (the former will be incorrectly treated as a newer version since 20000801 is a numerically greater value than 1). In situations such as this, PORTEPOCH should be increased.
If PORTEPOCH is nonzero it is appended to the package name as described above. PORTEPOCH must never be decreased or reset to zero, because that would cause comparison to a package from an earlier epoch to fail (i.e. the package would not be detected as out of date): the new version number (e.g. 1.0,1 in the above example) is still numerically less than the previous version (20000801), but the ,1 suffix is treated specially by automated tools and found to be greater than the implied suffix ,0 on the earlier package. Dropping or resetting PORTEPOCH incorrectly leads to no end of grief; if you do not understand the above discussion, please keep after it until you do, or ask questions on the mailing lists.

It is expected that PORTEPOCH will not be used for the majority of ports, and that sensible use of PORTVERSION can often pre-empt it becoming necessary if a future release of the software should change the version structure. However, care is needed by FreeBSD porters when a vendor release is made without an official version number — such as a code snapshot release. The temptation is to label the release with the release date, which will cause problems as in the example above when a new official release is made. For example, if a snapshot release is made on the date 20000917, and the previous version of the software was version 1.2, the snapshot release should be given a PORTVERSION of 1.2.20000917 or similar, not 20000917, so that the succeeding release, say 1.3, is still a numerically greater value.

Example of <makevar>PORTREVISION</makevar> and <makevar>PORTEPOCH</makevar> usage

The gtkmumble port, version 0.10, is committed to the ports collection:

PORTNAME=	gtkmumble
PORTVERSION=	0.10

PKGNAME becomes gtkmumble-0.10.

A security hole is discovered which requires a local FreeBSD patch. PORTREVISION is bumped accordingly.

PORTNAME=	gtkmumble
PORTVERSION=	0.10
PORTREVISION=	1

PKGNAME becomes gtkmumble-0.10_1.

A new version is released by the vendor, numbered 0.2 (it turns out the author intended 0.10 to mean 0.1.0, not what comes after 0.9 - oops, too late now). Since the new minor version 2 is numerically less than the previous version 10, the PORTEPOCH must be bumped to manually force the new package to be detected as newer. Since it is a new vendor release of the code, PORTREVISION is reset to 0 (or removed from the Makefile).

PORTNAME=	gtkmumble
PORTVERSION=	0.2
PORTEPOCH=	1

PKGNAME becomes gtkmumble-0.2,1.

The next release is 0.3. Since PORTEPOCH never decreases, the version variables are now:

PORTNAME=	gtkmumble
PORTVERSION=	0.3
PORTEPOCH=	1

PKGNAME becomes gtkmumble-0.3,1.

If PORTEPOCH were reset to 0 with this upgrade, someone who had installed the gtkmumble-0.10_1 package would not detect the gtkmumble-0.3 package as newer, since 3 is still numerically less than 10. Remember, this is the whole point of PORTEPOCH in the first place.

<makevar>PKGNAMEPREFIX</makevar> and <makevar>PKGNAMESUFFIX</makevar>

Two optional variables, PKGNAMEPREFIX and PKGNAMESUFFIX, are combined with PORTNAME and PORTVERSION to form PKGNAME as ${PKGNAMEPREFIX}${PORTNAME}${PKGNAMESUFFIX}-${PORTVERSION}. Make sure this conforms to our guidelines for a good package name. In particular, you are not allowed to use a hyphen (-) in PORTVERSION. Also, if the package name has the language- or the -compiled.specifics part (see below), use PKGNAMEPREFIX and PKGNAMESUFFIX, respectively. Do not make them part of PORTNAME.
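To illustrate with a small sketch (all names here are invented), a Japanese localization of a hypothetical program built with A4 paper size hardcoded might set:

PORTNAME=	foobar
PORTVERSION=	1.0
PKGNAMEPREFIX=	ja-
PKGNAMESUFFIX=	-a4

By the formula above, PKGNAME then becomes ja-foobar-a4-1.0.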
Package Naming Conventions

The following are the conventions you should follow in naming your packages. This is to keep our package directory easy to scan, as there are already thousands of packages and users are going to turn away if they hurt their eyes!

The package name should look like language_region-name-compiled.specifics-version.numbers. The package name is defined as ${PKGNAMEPREFIX}${PORTNAME}${PKGNAMESUFFIX}-${PORTVERSION}. Make sure to set the variables to conform to that format.

FreeBSD strives to support the native language of its users. The language- part should be a two letter abbreviation of the natural language defined by ISO-639 if the port is specific to a certain language. Examples are ja for Japanese, ru for Russian, vi for Vietnamese, zh for Chinese, ko for Korean and de for German. If the port is specific to a certain region within the language area, add the two letter country code as well. Examples are en_US for US English and fr_CH for Swiss French. The language- part should be set in the PKGNAMEPREFIX variable.

The first letter of the name part should be lowercase. (The rest of the name can contain capital letters, so use your own discretion when you are converting a software name that has some capital letters in it.) There is a tradition of naming perl 5 modules by prepending p5- and converting the double-colon separator to a hyphen; for example, the Data::Dumper module becomes p5-Data-Dumper. If the software in question has numbers, hyphens, or underscores in its name, you may include them as well (like kinput2).

If the port can be built with different hardcoded defaults (usually part of the directory name in a family of ports), the -compiled.specifics part should state the compiled-in defaults (the hyphen is optional). Examples are papersize and font units. The -compiled.specifics part should be set in the PKGNAMESUFFIX variable.

The version string should follow a dash (-) and be a period-separated list of integers and single lowercase alphabetics. In particular, it is not permissible to have another dash inside the version string. The only exception is the string pl (meaning patchlevel), which can be used only when there are no major and minor version numbers in the software. If the software version has strings like alpha, beta, rc, or pre, take the first letter and put it immediately after a period. If the version string continues after those names, the numbers should follow the single alphabet without an extra period between them. The idea is to make it easier to sort ports by looking at the version string. In particular, make sure version number components are always delimited by a period, and if the date is part of the string, use the yyyy.mm.dd format, not dd.mm.yyyy or the non-Y2K compliant yy.mm.dd format.

Here are some (real) examples on how to convert the name as called by the software authors to a suitable package name:

Distribution Name | PKGNAMEPREFIX | PORTNAME | PKGNAMESUFFIX | PORTVERSION | Reason
mule-2.2.2 | (empty) | mule | (empty) | 2.2.2 | No changes required
XFree86-3.3.6 | (empty) | XFree86 | (empty) | 3.3.6 | No changes required
EmiClock-1.0.2 | (empty) | emiclock | (empty) | 1.0.2 | No uppercase names for single programs
rdist-1.3alpha | (empty) | rdist | (empty) | 1.3.a | No strings like alpha allowed
es-0.9-beta1 | (empty) | es | (empty) | 0.9.b1 | No strings like beta allowed
mailman-2.0rc3 | (empty) | mailman | (empty) | 2.0.r3 | No strings like rc allowed
v3.3beta021.src | (empty) | tiff | (empty) | 3.3 | What the heck was that anyway?
tvtwm | (empty) | tvtwm | (empty) | pl11 | Version string always required
piewm | (empty) | piewm | (empty) | 1.0 | Version string always required
xvgr-2.10pl1 | (empty) | xvgr | (empty) | 2.10.1 | pl allowed only when no major/minor version numbers
gawk-2.15.6 | ja- | gawk | (empty) | 2.15.6 | Japanese language version
psutils-1.13 | (empty) | psutils | -letter | 1.13 | Papersize hardcoded at package build time
pkfonts | (empty) | pkfonts | 300 | 1.0 | Package for 300dpi fonts

If there is absolutely no trace of version information in the original source and it is unlikely that the original author will ever release another version, just set the version string to 1.0 (like the piewm example above). Otherwise, ask the original author or use the date string (yyyy.mm.dd) as the version.

Categorization

<makevar>CATEGORIES</makevar>

When a package is created, it is put under /usr/ports/packages/All and links are made from one or more subdirectories of /usr/ports/packages. The names of these subdirectories are specified by the variable CATEGORIES. It is intended to make life easier for the user when he is wading through the pile of packages on the FTP site or the CDROM. Please take a look at the current list of categories and pick the ones that are suitable for your port. This list also determines where in the ports tree the port is imported. If you put more than one category here, it is assumed that the port files will be put in the subdirectory with the name in the first category. See below for more discussion about how to pick the right categories.

Current list of categories

Here is the current list of port categories. Those marked with an asterisk (*) are virtual categories—those that do not have a corresponding subdirectory in the ports tree. They are only used as secondary categories, and only for search purposes. For non-virtual categories, you will find a one-line description in the COMMENT in that subdirectory's Makefile.

Category Description Notes accessibility Ports to help disabled users. afterstep* Ports to support the AfterStep window manager. arabic Arabic language support. archivers Archiving tools. astro Astronomical ports. audio Sound support. benchmarks Benchmarking utilities. biology Biology-related software. cad Computer aided design tools. chinese Chinese language support. comms Communication software. Mostly software to talk to your serial port. converters Character code converters. databases Databases. deskutils Things that used to be on the desktop before computers were invented. devel Development utilities. Do not put libraries here just because they are libraries—unless they truly do not belong anywhere else, they should not be in this category. dns DNS-related software. editors General editors. Specialized editors go in the section for those tools (e.g., a mathematical-formula editor will go in math). elisp* Emacs-lisp ports. emulators Emulators for other operating systems. Terminal emulators do not belong here—X-based ones should go to x11 and text-based ones to either comms or misc, depending on the exact functionality. finance Monetary, financial and related applications. french French language support. ftp FTP client and server utilities. If your port speaks both FTP and HTTP, put it in ftp with a secondary category of www. games Games. german German language support. gnome* Ports from the GNOME Project. graphics Graphics utilities. haskell* Software related to the Haskell language. hebrew Hebrew language support. hungarian Hungarian language support. ipv6* IPv6 related software. irc Internet Relay Chat utilities.
japanese Japanese language support. java Software related to the Java language. The java category shall not be the only one for a port. Save for ports directly related to the Java language, porters are also encouraged not to use java as the main category of a port. kde* Ports from the K Desktop Environment (KDE) Project. korean Korean language support. lang Programming languages. linux* Linux applications and support utilities. lisp* Software related to the Lisp language. mail Mail software. math Numerical computation software and other utilities for mathematics. mbone MBone applications. misc Miscellaneous utilities Basically things that do not belong anywhere else. If at all possible, try to find a better category for your port than misc, as ports tend to get overlooked in here. multimedia Multimedia software. net Miscellaneous networking software. net-im Instant messaging software. net-mgmt Networking management software. news USENET news software. offix* Ports from the OffiX suite. palm Software support for the Palm™ series. parallel* Applications dealing with parallelism in computing. pear* Ports related to the Pear PHP framework. perl5* Ports that require Perl version 5 to run. plan9* Various programs from Plan9. polish Polish language support. portuguese Portuguese language support. print Printing software. Desktop publishing tools (previewers, etc.) belong here too. python* Software related to the Python language. ruby* Software related to the Ruby language. russian Russian language support. scheme* Software related to the Scheme language. science Scientific ports that do not fit into other categories such as astro, biology and math. security Security utilities. shells Command line shells. sysutils System utilities. tcl80* Ports that use Tcl version 8.0 to run. tcl81* Ports that use Tcl version 8.1 to run. tcl82* Ports that use Tcl version 8.2 to run. tcl83* Ports that use Tcl version 8.3 to run. tcl84* Ports that use Tcl version 8.4 to run. textproc Text processing utilities. It does not include desktop publishing tools, which go to print. tk80* Ports that use Tk version 8.0 to run. tk82* Ports that use Tk version 8.2 to run. tk83* Ports that use Tk version 8.3 to run. tk84* Ports that use Tk version 8.4 to run. tkstep80* Ports that use TkSTEP version 8.0 to run. ukrainian Ukrainian language support. vietnamese Vietnamese language support. windowmaker* Ports to support the WindowMaker window manager. www Software related to the World Wide Web. HTML language support belongs here too. x11 The X Window System and friends. This category is only for software that directly supports the window system. Do not put regular X applications here; most of them should go into other x11-* categories (see below). If your port is an X application, define USE_XLIB (implied by USE_IMAKE) and put it in the appropriate category. x11-clocks X11 clocks. x11-fm X11 file managers. x11-fonts X11 fonts and font utilities. x11-servers X11 servers. x11-themes X11 themes. x11-toolkits X11 toolkits. x11-wm X11 window managers. xfce* Ports relating to the Xfce desktop environment. zope* Zope support. Choosing the right category As many of the categories overlap, you often have to choose which of the categories should be the primary category of your port. There are several rules that govern this issue. Here is the list of priorities, in decreasing order of precedence: The first category must be a physical category (see above). This is necessary to make the packaging work. 
Virtual categories and physical categories may be intermixed after that. Language specific categories always come first. For example, if your port installs Japanese X11 fonts, then your CATEGORIES line would read japanese x11-fonts. Specific categories are listed before less-specific ones. For instance, an HTML editor should be listed as www editors, not the other way around. Also, you should not list net when the port belongs to any of irc, mail, mbone, news, security, or www, as net is included implicitly. x11 is used as a secondary category only when the primary category is a natural language. In particular, you should not put x11 in the category line for X applications. Emacs modes should be placed in the same ports category as the application supported by the mode, not in editors. For example, an Emacs mode to edit source files of some programming language should go into lang. misc should not appear with any other non-virtual category. If you have misc with something else in your CATEGORIES line, that means you can safely delete misc and just put the port in that other subdirectory! If your port truly does not belong anywhere else, put it in misc. If you are not sure about the category, please put a comment to that effect in your &man.send-pr.1; submission so we can discuss it before we import it. If you are a committer, send a note to the &a.ports; so we can discuss it first. Too often, new ports are imported to the wrong category only to be moved right away. This causes unnecessary and undesirable bloat in the master source repository. Proposing a new category As the Ports Collection has grown over time, various new categories have been introduced. New categories can either be virtual categories—those that do not have a corresponding subdirectory in the ports tree— or physical categories—those that do. The following text discusses the issues involved in creating a new physical category so that you can understand them before you propose one. Our existing practice has been to avoid creating a new physical category unless either a large number of ports would logically belong to it, or the ports that would belong to it are a logically distinct group that is of limited general interest (for instance, categories related to spoken human languages), or preferably both. The rationale for this is that such a change creates a fair amount of work for both the committers and also for all users who track changes to the Ports Collection. In addition, proposed category changes just naturally seem to attract controversy. (Perhaps this is because there is no clear consensus on when a category is too big, nor whether categories should lend themselves to browsing (and thus what number of categories would be an ideal number), and so forth.) Here is the procedure: Propose the new category on &a.ports;. You should include a detailed rationale for the new category, including why you feel the existing categories are not sufficient, and the list of existing ports proposed to move. (If there are new ports pending in GNATS that would fit this category, list them too.) If you are the maintainer and/or submitter, respectively, mention that as it may help you to make your case. Participate in the discussion. If it seems that there is support for your idea, file a PR which includes both the rationale and the list of existing ports that need to be moved. 
Ideally, this PR should also include patches for the following:

- Makefiles for the new ports once they are repocopied
- Makefile for the new category
- Makefile for the old ports' categories
- Makefiles for ports that depend on the old ports

(For extra credit, you can include the other files that have to change, as per the procedure in the Committer's Guide.)

Since it affects the ports infrastructure and involves not only performing repo-copies but also possibly running regression tests on the build cluster, the PR should be assigned to the &a.portmgr;. If that PR is approved, a committer will need to follow the rest of the procedure that is outlined in the Committer's Guide. Proposing a new virtual category should be similar to the above but much less involved, since no ports will actually have to move. In this case, the only patches to include in the PR would be those to add the new category to the CATEGORIES lines of the affected ports.

Proposing reorganizing all the categories

Occasionally someone proposes reorganizing the categories with either a 2-level structure, or some other kind of keyword structure. To date, nothing has come of any of these proposals because, while they are very easy to make, the effort involved to retrofit the entire existing ports collection with any kind of reorganization is daunting to say the very least. Please read the history of these proposals in the mailing list archives before you post this idea; furthermore, you should be prepared to be challenged to offer a working prototype.

The distribution files

The second part of the Makefile describes the files that must be downloaded in order to build the port, and where they can be downloaded from.

<makevar>DISTVERSION/DISTNAME</makevar>

DISTNAME is the name of the port as called by the authors of the software. DISTNAME defaults to ${PORTNAME}-${PORTVERSION}, so override it only if necessary. DISTNAME is only used in two places. First, the distribution file list (DISTFILES) defaults to ${DISTNAME}${EXTRACT_SUFX}. Second, the distribution file is expected to extract into a subdirectory named WRKSRC, which defaults to work/${DISTNAME}. Some vendors' distribution names do not fit into the ${PORTNAME}-${PORTVERSION} scheme; these can be handled automatically by setting DISTVERSION. PORTVERSION and DISTNAME will be derived automatically, but can of course be overridden. The following table lists some examples:

DISTVERSION | PORTVERSION
0.7.1d | 0.7.1.d
10Alpha3 | 10.a3
3Beta7-pre2 | 3.b7.p2
8:f_17 | 8f.17

PKGNAMEPREFIX and PKGNAMESUFFIX do not affect DISTNAME. Also note that if WRKSRC is equal to work/${PORTNAME}-${PORTVERSION} while the original source archive is named something other than ${PORTNAME}-${PORTVERSION}${EXTRACT_SUFX}, you should probably leave DISTNAME alone—you are better off defining DISTFILES than having to set both DISTNAME and WRKSRC (and possibly EXTRACT_SUFX).

<makevar>MASTER_SITES</makevar>

Record the directory part of the FTP/HTTP-URL pointing at the original tarball in MASTER_SITES. Do not forget the trailing slash (/)! The make macros will try to use this specification for grabbing the distribution file with FETCH if they cannot find it already on the system. It is recommended that you put multiple sites on this list, preferably from different continents. This will safeguard against wide-area network problems. We are even planning to add support for automatically determining the closest master site and fetching from there; having multiple sites will go a long way towards helping this effort.
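As a sketch of that advice (the hostnames are invented for illustration), a port might list mirrors on several continents, each ending in the mandatory slash:

MASTER_SITES=	ftp://ftp.example.com/pub/foo/ \
		ftp://ftp.jp.example.com/pub/mirrors/foo/ \
		http://www.example.de/software/foo/

The sites are tried in turn (subject to the user's MASTER_SORT_AWK preferences, described later), so a reasonable habit is to list the most reliable site first.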
If the original tarball is part of one of the popular archives such as X-contrib, GNU, or Perl CPAN, you may be able to refer to those sites in an easy compact form using MASTER_SITE_* (e.g., MASTER_SITE_XCONTRIB and MASTER_SITE_PERL_GNU). Simply set MASTER_SITES to one of these variables and MASTER_SITE_SUBDIR to the path within the archive. Here is an example:

MASTER_SITES=	${MASTER_SITE_XCONTRIB}
MASTER_SITE_SUBDIR=	applications

These variables are defined in /usr/ports/Mk/bsd.sites.mk. There are new entries added all the time, so make sure to check the latest version of this file before submitting a port. The user can also set the MASTER_SITE_* variables in /etc/make.conf to override our choices, and use their favorite mirrors of these popular archives instead.

<makevar>EXTRACT_SUFX</makevar>

If you have one distribution file, and it uses an odd suffix to indicate the compression mechanism, set EXTRACT_SUFX. For example, if the distribution file was named foo.tgz instead of the more normal foo.tar.gz, you would write:

DISTNAME=	foo
EXTRACT_SUFX=	.tgz

The USE_BZIP2 and USE_ZIP variables automatically set EXTRACT_SUFX to .tar.bz2 or .zip as necessary. If neither of these are set then EXTRACT_SUFX defaults to .tar.gz. You never need to set both EXTRACT_SUFX and DISTFILES.

<makevar>DISTFILES</makevar>

Sometimes the names of the files to be downloaded have no resemblance to the name of the port. For example, it might be called source.tar.gz or similar. In other cases the application's source code might be in several different archives, all of which must be downloaded. If this is the case, set DISTFILES to be a space separated list of all the files that must be downloaded.

DISTFILES=	source1.tar.gz source2.tar.gz

If not explicitly set, DISTFILES defaults to ${DISTNAME}${EXTRACT_SUFX}.

<makevar>EXTRACT_ONLY</makevar>

If only some of the DISTFILES must be extracted—for example, one of them is the source code, while another is an uncompressed document—list the filenames that must be extracted in EXTRACT_ONLY.

DISTFILES=	source.tar.gz manual.html
EXTRACT_ONLY=	source.tar.gz

If none of the DISTFILES should be uncompressed then set EXTRACT_ONLY to the empty string.

EXTRACT_ONLY=

<makevar>PATCHFILES</makevar>

If your port requires some additional patches that are available by FTP or HTTP, set PATCHFILES to the names of the files and PATCH_SITES to the URL of the directory that contains them (the format is the same as MASTER_SITES). If the patch is not relative to the top of the source tree (i.e., WRKSRC) because it contains some extra pathnames, set PATCH_DIST_STRIP accordingly. For instance, if all the pathnames in the patch have an extra foozolix-1.0/ in front of the filenames, then set PATCH_DIST_STRIP=-p1. Do not worry if the patches are compressed; they will be decompressed automatically if the filenames end with .gz or .Z.

If the patch is distributed with some other files, such as documentation, in a gzip'd tarball, you cannot just use PATCHFILES. If that is the case, add the name and the location of the patch tarball to DISTFILES and MASTER_SITES. Then, use the EXTRA_PATCHES variable to point to those files and bsd.port.mk will automatically apply them for you. In particular, do not copy patch files into the PATCHDIR directory—that directory may not be writable. The tarball will have been extracted alongside the regular source by then, so there is no need to explicitly extract it if it is a regular gzip'd or compress'd tarball.
If you do the latter, take extra care not to overwrite something that already exists in that directory. Also, do not forget to add a command to remove the copied patch in the pre-clean target.

Multiple distribution files or patches from different sites and subdirectories (<literal>MASTER_SITES:n</literal>)

(Consider this to be a somewhat advanced topic; those new to this document may wish to skip this section at first). This section has information on the fetching mechanism known as both MASTER_SITES:n and MASTER_SITES_NN. We will refer to this mechanism as MASTER_SITES:n hereafter.

A little background first. OpenBSD has a neat feature inside both the DISTFILES and PATCHFILES variables: files and patches can be postfixed with :n identifiers, where n can be [0-9] and denotes a group designation. For example:

DISTFILES=	alpha:0 beta:1

In OpenBSD, distribution file alpha will be associated with variable MASTER_SITES0 instead of our common MASTER_SITES and beta with MASTER_SITES1. This is a very interesting feature which can decrease that endless search for the correct download site. Just picture 2 files in DISTFILES and 20 sites in MASTER_SITES, the sites slow as hell where beta is carried by all sites in MASTER_SITES, and alpha can only be found in the 20th site. It would be such a waste to check all of them if the maintainer knew this beforehand, would it not? Not a good start for that lovely weekend! Now that you have the idea, just imagine more DISTFILES and more MASTER_SITES. Surely our distfiles survey meister would appreciate the relief to network strain that this would bring. In the next sections, information will follow on the FreeBSD implementation of this idea. We improved a bit on OpenBSD's concept.

Simplified information

This section tells you how to quickly prepare fine grained fetching of multiple distribution files and patches from different sites and subdirectories. We describe here a case of simplified MASTER_SITES:n usage. This will be sufficient for most scenarios. However, if you need further information, you will have to refer to the next section.

Some applications consist of multiple distribution files that must be downloaded from a number of different sites. For example, Ghostscript consists of the core of the program, and then a large number of driver files that are used depending on the user's printer. Some of these driver files are supplied with the core, but many others must be downloaded from a variety of different sites. To support this, each entry in DISTFILES may be followed by a colon and a tag name. Each site listed in MASTER_SITES is then followed by a colon, and the tag that indicates which distribution files should be downloaded from this site. For example, consider an application with the source split in two parts, source1.tar.gz and source2.tar.gz, which must be downloaded from two different sites. The port's Makefile would include lines like the following example:

Simplified use of <literal>MASTER_SITES:n</literal> with 1 file per site

MASTER_SITES=	ftp://ftp.example1.com/:source1 \
		ftp://ftp.example2.com/:source2
DISTFILES=	source1.tar.gz:source1 \
		source2.tar.gz:source2

Multiple distribution files can have the same tag. Continuing the previous example, suppose that there was a third distfile, source3.tar.gz, that should be downloaded from ftp.example2.com. The Makefile would then be written like the next example:
Simplified use of <literal>MASTER_SITES:n</literal> with more than 1 file per site

MASTER_SITES=	ftp://ftp.example1.com/:source1 \
		ftp://ftp.example2.com/:source2
DISTFILES=	source1.tar.gz:source1 \
		source2.tar.gz:source2 \
		source3.tar.gz:source2

Detailed information

Okay, so the previous section's example did not reflect your needs? In this section we will explain in detail how the fine grained fetching mechanism MASTER_SITES:n works and how you can modify your ports to use it.

Elements can be postfixed with :n where n is [^:,]+, i.e., n could conceptually be any alphanumeric string but we will limit it to [a-zA-Z_][0-9a-zA-Z_]+ for now. Moreover, string matching is case sensitive; i.e., n is different from N. However, the following words cannot be used for postfixing purposes since they yield special meaning: default, all and ALL (they are used internally, as described below). Furthermore, DEFAULT is a special purpose word (see the next item).

Elements postfixed with :n belong to the group n, :m belong to group m and so forth.

Elements without a postfix are groupless, i.e., they all belong to the special group DEFAULT. If you postfix any elements with DEFAULT, you are just being redundant unless you want to have an element belonging to both DEFAULT and other groups at the same time (see the description of the comma operator below). The following examples are equivalent but the first one is preferred:

MASTER_SITES=	alpha

MASTER_SITES=	alpha:DEFAULT

Groups are not exclusive: an element may belong to several different groups at the same time, and a group can have either several different elements or none at all. Repeated elements within the same group will be simply that, repeated elements.

When you want an element to belong to several groups at the same time, you can use the comma operator (,). Instead of repeating it several times, each time with a different postfix, we can list several groups at once in a single postfix. For instance, :m,n,o marks an element that belongs to groups m, n and o. All the following examples are equivalent but the last one is preferred:

MASTER_SITES=	alpha alpha:SOME_SITE

MASTER_SITES=	alpha:DEFAULT alpha:SOME_SITE

MASTER_SITES=	alpha:SOME_SITE,DEFAULT

MASTER_SITES=	alpha:DEFAULT,SOME_SITE

All sites within a given group are sorted according to MASTER_SORT_AWK. All groups within MASTER_SITES and PATCH_SITES are sorted as well.

Group semantics can be used in any of the following variables: MASTER_SITES, PATCH_SITES, MASTER_SITE_SUBDIR, PATCH_SITE_SUBDIR, DISTFILES, and PATCHFILES, according to the following syntax:

All MASTER_SITES, PATCH_SITES, MASTER_SITE_SUBDIR and PATCH_SITE_SUBDIR elements must be terminated with the forward slash / character. If any elements belong to any groups, the group postfix :n must come right after the terminator /. The MASTER_SITES:n mechanism relies on the existence of the terminator / to avoid confusing elements where a :n is a valid part of the element with occurrences where :n denotes group n. For compatibility purposes, since the / terminator was not required before in both MASTER_SITE_SUBDIR and PATCH_SITE_SUBDIR elements, if the character immediately preceding the postfix is not a / then :n will be considered a valid part of the element instead of a group postfix, even if an element is postfixed with :n. See the following two examples:
Detailed use of <literal>MASTER_SITES:n</literal> in <makevar>MASTER_SITE_SUBDIR</makevar>

MASTER_SITE_SUBDIR=	old:n new/:NEW

Directories within group DEFAULT -> old:n
Directories within group NEW -> new

Detailed use of <literal>MASTER_SITES:n</literal> with comma operator, multiple files, multiple sites and multiple subdirectories

MASTER_SITES=	http://site1/%SUBDIR%/ http://site2/:DEFAULT \
		http://site3/:group3 http://site4/:group4 \
		http://site5/:group5 http://site6/:group6 \
		http://site7/:DEFAULT,group6 \
		http://site8/%SUBDIR%/:group6,group7 \
		http://site9/:group8
DISTFILES=	file1 file2:DEFAULT file3:group3 \
		file4:group4,group5,group6 file5:grouping \
		file6:group7
MASTER_SITE_SUBDIR=	directory-trial:1 directory-n/:groupn \
		directory-one/:group6,DEFAULT \
		directory

The previous example results in the following fine grained fetching. Sites are listed in the exact order they will be used.

file1 will be fetched from
	MASTER_SITE_OVERRIDE
	http://site1/directory-trial:1/
	http://site1/directory-one/
	http://site1/directory/
	http://site2/
	http://site7/
	MASTER_SITE_BACKUP

file2 will be fetched exactly as file1 since they both belong to the same group
	MASTER_SITE_OVERRIDE
	http://site1/directory-trial:1/
	http://site1/directory-one/
	http://site1/directory/
	http://site2/
	http://site7/
	MASTER_SITE_BACKUP

file3 will be fetched from
	MASTER_SITE_OVERRIDE
	http://site3/
	MASTER_SITE_BACKUP

file4 will be fetched from
	MASTER_SITE_OVERRIDE
	http://site4/
	http://site5/
	http://site6/
	http://site7/
	http://site8/directory-one/
	MASTER_SITE_BACKUP

file5 will be fetched from
	MASTER_SITE_OVERRIDE
	MASTER_SITE_BACKUP

file6 will be fetched from
	MASTER_SITE_OVERRIDE
	http://site8/
	MASTER_SITE_BACKUP

How do I group one of the special variables from bsd.sites.mk, e.g., MASTER_SITE_SOURCEFORGE? See the following example.

Detailed use of <literal>MASTER_SITES:n</literal> with <makevar>MASTER_SITE_SOURCEFORGE</makevar>

MASTER_SITES=	http://site1/ ${MASTER_SITE_SOURCEFORGE:S/$/:sourceforge,TEST/}
DISTFILES=	something.tar.gz:sourceforge

something.tar.gz will be fetched from all sites within MASTER_SITE_SOURCEFORGE.

How do I use this with the PATCH* variables? All examples were done with MASTER* variables but they work exactly the same for PATCH* ones, as can be seen in the following example.

Simplified use of <literal>MASTER_SITES:n</literal> with <makevar>PATCH_SITES</makevar>

PATCH_SITES=	http://site1/ http://site2/:test
PATCHFILES=	patch1:test

What does change for ports? What does not?

All current ports remain the same. The MASTER_SITES:n feature code is only activated if there are elements postfixed with :n according to the aforementioned syntax rules.

The port targets remain the same: checksum, makesum, patch, configure, build, etc. With the obvious exceptions of do-fetch, fetch-list, master-sites and patch-sites.

do-fetch: deploys the new grouping postfixed DISTFILES and PATCHFILES with their matching group elements within both MASTER_SITES and PATCH_SITES, which use matching group elements within both MASTER_SITE_SUBDIR and PATCH_SITE_SUBDIR (see the examples above).

fetch-list: works like the old fetch-list with the exception that it groups just like do-fetch.

master-sites and patch-sites: (incompatible with older versions) only return the elements of group DEFAULT; in fact, they execute the targets master-sites-default and patch-sites-default respectively. Furthermore, using either the master-sites-all or patch-sites-all target is preferred to directly checking either MASTER_SITES or PATCH_SITES.
Also, directly checking is not guaranteed to work in any future versions. See the next section for more information on these new port targets.

New port targets

There are master-sites-n and patch-sites-n targets which will list the elements of the respective group n within MASTER_SITES and PATCH_SITES respectively. For instance, both master-sites-DEFAULT and patch-sites-DEFAULT will return the elements of group DEFAULT, master-sites-test and patch-sites-test those of group test, and so on.

There are new targets master-sites-all and patch-sites-all which do the work of the old master-sites and patch-sites ones. They return the elements of all groups as if they all belonged to the same group, with the caveat that they list as many MASTER_SITE_BACKUP and MASTER_SITE_OVERRIDE entries as there are groups defined within either DISTFILES or PATCHFILES; respectively for master-sites-all and patch-sites-all.

<makevar>DIST_SUBDIR</makevar>

Do not let your port clutter /usr/ports/distfiles. If your port requires a lot of files to be fetched, or contains a file that has a name that might conflict with other ports (e.g., Makefile), set DIST_SUBDIR to the name of the port (${PORTNAME} or ${PKGNAMEPREFIX}${PORTNAME} should work fine). This will change DISTDIR from the default /usr/ports/distfiles to /usr/ports/distfiles/DIST_SUBDIR, and in effect puts everything that is required for your port into that subdirectory. It will also look at the subdirectory with the same name on the backup master site at ftp.FreeBSD.org. (Setting DISTDIR explicitly in your Makefile will not accomplish this, so please use DIST_SUBDIR.) This does not affect the MASTER_SITES you define in your Makefile.

<makevar>MAINTAINER</makevar>

Set your mail address here. Please. :-) Note that only a single address without the comment part is allowed as a MAINTAINER value. The format used should be user@hostname.domain. Please do not include any descriptive text such as your real name in this entry—that merely confuses bsd.port.mk. For a detailed description of the responsibilities of maintainers, refer to the MAINTAINER on Makefiles section.

If the maintainer of a port does not respond to an update request from a user after two weeks (excluding major public holidays), then that is considered a maintainer timeout, and the update may be made without explicit maintainer approval. If the maintainer does not respond within three months, then that maintainer is considered absent without leave, and can be replaced as the maintainer of the particular port in question. Exceptions to this are anything maintained by the &a.portmgr;, or the &a.security-officer;. No unauthorized commits may ever be made to ports maintained by those groups. The &a.portmgr; reserves the right to revoke or override anyone's maintainership for any reason, and the &a.security-officer; reserves the right to revoke or override maintainership for security reasons.

<makevar>COMMENT</makevar>

This is a one-line description of the port. Please do not include the package name (or version number of the software) in the comment. The comment should begin with a capital and end without a period. Here is an example:

COMMENT=	A cat chasing a mouse all over the screen

The COMMENT variable should immediately follow the MAINTAINER variable in the Makefile. Please try to keep the COMMENT line less than 70 characters, as it is displayed to users as a one-line summary of the port.

Dependencies

Many ports depend on other ports.
There are seven variables that you can use to ensure that all the required bits will be on the user's machine. There are also some pre-supported dependency variables for common cases, plus a few more to control the behavior of dependencies. <makevar>LIB_DEPENDS</makevar> This variable specifies the shared libraries this port depends on. It is a list of lib:dir:target tuples where lib is the name of the shared library, dir is the directory in which to find it in case it is not available, and target is the target to call in that directory. For example, LIB_DEPENDS= jpeg.9:${PORTSDIR}/graphics/jpeg:install will check for a shared jpeg library with major version 9, and descend into the graphics/jpeg subdirectory of your ports tree to build and install it if it is not found. The target part can be omitted if it is equal to DEPENDS_TARGET (which defaults to install). The lib part is a regular expression which is looked up in the ldconfig -r output. Values such as intl.[5-7] and intl are allowed. The first pattern, intl.[5-7], will match any of: intl.5, intl.6 or intl.7. The second pattern, intl, will match any version of the intl library. The dependency is checked twice, once from within the extract target and then from within the install target. Also, the name of the dependency is put into the package so that &man.pkg.add.1; will automatically install it if it is not on the user's system. <makevar>RUN_DEPENDS</makevar> This variable specifies executables or files this port depends on during run-time. It is a list of path:dir:target tuples where path is the name of the executable or file, dir is the directory in which to find it in case it is not available, and target is the target to call in that directory. If path starts with a slash (/), it is treated as a file and its existence is tested with test -e; otherwise, it is assumed to be an executable, and which -s is used to determine if the program exists in the search path. For example, RUN_DEPENDS= ${LOCALBASE}/etc/innd:${PORTSDIR}/news/inn \ wish8.0:${PORTSDIR}/x11-toolkits/tk80 will check if the file or directory /usr/local/etc/innd exists, and build and install it from the news/inn subdirectory of the ports tree if it is not found. It will also see if an executable called wish8.0 is in the search path, and descend into the x11-toolkits/tk80 subdirectory of your ports tree to build and install it if it is not found. In this case, innd is actually an executable; if an executable is in a place that is not expected to be in the search path, you should use the full pathname. The official search PATH used on the ports build cluster is /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/X11R6/bin The dependency is checked from within the install target. Also, the name of the dependency is put into the package so that &man.pkg.add.1; will automatically install it if it is not on the user's system. The target part can be omitted if it is the same as DEPENDS_TARGET. <makevar>BUILD_DEPENDS</makevar> This variable specifies executables or files this port requires to build. Like RUN_DEPENDS, it is a list of path:dir:target tuples. For example, BUILD_DEPENDS= unzip:${PORTSDIR}/archivers/unzip will check for an executable called unzip, and descend into the archivers/unzip subdirectory of your ports tree to build and install it if it is not found. build here means everything from extraction to compilation. The dependency is checked from within the extract target.
The target part can be omitted if it is the same as DEPENDS_TARGET. <makevar>FETCH_DEPENDS</makevar> This variable specifies executables or files this port requires to fetch. Like the previous two, it is a list of path:dir:target tuples. For example, FETCH_DEPENDS= ncftp2:${PORTSDIR}/net/ncftp2 will check for an executable called ncftp2, and descend into the net/ncftp2 subdirectory of your ports tree to build and install it if it is not found. The dependency is checked from within the fetch target. The target part can be omitted if it is the same as DEPENDS_TARGET. <makevar>EXTRACT_DEPENDS</makevar> This variable specifies executables or files this port requires for extraction. Like the previous, it is a list of path:dir:target tuples. For example, EXTRACT_DEPENDS= unzip:${PORTSDIR}/archivers/unzip will check for an executable called unzip, and descend into the archivers/unzip subdirectory of your ports tree to build and install it if it is not found. The dependency is checked from within the extract target. The target part can be omitted if it is the same as DEPENDS_TARGET. Use this variable only if the extraction does not already work (the default assumes gzip) and cannot be made to work using USE_ZIP or USE_BZIP2 described in . <makevar>PATCH_DEPENDS</makevar> This variable specifies executables or files this port requires to patch. Like the previous, it is a list of path:dir:target tuples. For example, PATCH_DEPENDS= ${NONEXISTENT}:${PORTSDIR}/java/jfc:extract will descend into the java/jfc subdirectory of your ports tree to extract it. The dependency is checked from within the patch target. The target part can be omitted if it is the same as DEPENDS_TARGET. <makevar>DEPENDS</makevar> If there is a dependency that does not fall into any of the above categories, or your port requires having the source of the other port extracted in addition to having it installed, then use this variable. This is a list of dir:target pairs, as there is nothing to check, unlike the previous variables. The target part can be omitted if it is the same as DEPENDS_TARGET. <makevar>USE_<replaceable>*</replaceable></makevar> A number of variables exist in order to encapsulate common dependencies that many ports have. Although their use is optional, they can help to reduce the verbosity of the port Makefiles. Each of them is styled as USE_*. The usage of these variables is restricted to the port Makefiles and ports/Mk/bsd.*.mk and is not designed to encapsulate user-settable options — use WITH_* and WITHOUT_* for that purpose. It is always incorrect to set any USE_* in /etc/make.conf. For instance, setting USE_GCC=3.2 would add a dependency on gcc32 for every port, including gcc32 itself! The <makevar>USE_<replaceable>*</replaceable></makevar> variables Variable Means USE_BZIP2 The port's tarballs are compressed with bzip2. USE_ZIP The port's tarballs are compressed with zip. USE_BISON The port uses bison for building. USE_GCC The port requires a specific version of gcc to build. The exact version can be specified with a value such as 3.2. The minimal required version can be specified as 3.2+. The gcc from the base system is used when it satisfies the requested version, otherwise an appropriate gcc is compiled from ports and the CC and CXX variables are adjusted. USE_GCC cannot be used together with USE_LIBTOOL_VER.
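To illustrate, here is a minimal sketch of how a few of these variables might combine in a port Makefile (the particular requirements are assumed purely for the example): 

USE_BZIP2= yes	# distfile is a bzip2-compressed tarball
USE_BISON= yes	# build runs bison
USE_GCC= 3.2+	# build needs gcc 3.2 or newer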
Variables related to gmake and the configure script are described in , while autoconf, automake and libtool are described in . Perl related variables are described in . X11 variables are listed in . deals with GNOME and with KDE related variables. documents Java variables, while contains information on Apache, PHP and PEAR modules. Python is discussed in , while Ruby in . Finally, provides variables used for SDL applications.
Notes on dependencies As mentioned above, the default target to call when a dependency is required is DEPENDS_TARGET. It defaults to install. This is a user variable; it is never defined in a port's Makefile. If your port needs a special way to handle a dependency, use the :target part of the *_DEPENDS variables instead of redefining DEPENDS_TARGET. When you type make clean, the port's dependencies are automatically cleaned too. If you do not wish this to happen, define the variable NOCLEANDEPENDS in your environment. This may be particularly desirable if the port has something that takes a long time to rebuild in its dependency list, such as KDE, GNOME or Mozilla. To depend on another port unconditionally, use the variable ${NONEXISTENT} as the first field of BUILD_DEPENDS or RUN_DEPENDS. Use this only when you need to get the source of the other port. You can often save compilation time by specifying the target too. For instance, BUILD_DEPENDS= ${NONEXISTENT}:${PORTSDIR}/graphics/jpeg:extract will always descend to the jpeg port and extract it. Do not use DEPENDS unless there is no other way to accomplish the behavior you want. It will cause the other port to always be built (and installed, by default), and the dependency will go into the packages as well. If this is really what you need, you should probably write it as BUILD_DEPENDS and RUN_DEPENDS instead—at least the intention will be clear. Circular dependencies are fatal Do not introduce any circular dependencies into the ports tree! The ports building technology does not tolerate circular dependencies. If you introduce one, you will have someone, somewhere in the world, whose FreeBSD installation will break almost immediately, with many others quickly to follow. These can really be hard to detect; if in doubt, before you make that change, make sure you have done the following: cd /usr/ports; make index. That process can be quite slow on older machines, but you may be able to save a large number of people—including yourself—a lot of grief in the process.
<makevar>MASTERDIR</makevar> If your port needs to build slightly different versions of packages by having a variable (for instance, resolution, or paper size) take different values, create one subdirectory per package to make it easier for users to see what to do, but try to share as many files as possible between ports. Typically you only need a very short Makefile in all but one of the directories if you use variables cleverly. In the sole Makefile, you can use MASTERDIR to specify the directory where the rest of the files are. Also, use a variable as part of PKGNAMESUFFIX so the packages will have different names. This is best demonstrated by an example. This is part of japanese/xdvi300/Makefile: PORTNAME= xdvi PORTVERSION= 17 PKGNAMEPREFIX= ja- PKGNAMESUFFIX= ${RESOLUTION} : # default RESOLUTION?= 300 .if ${RESOLUTION} != 118 && ${RESOLUTION} != 240 && \ ${RESOLUTION} != 300 && ${RESOLUTION} != 400 @${ECHO} "Error: invalid value for RESOLUTION: \"${RESOLUTION}\"" @${ECHO} "Possible values are: 118, 240, 300 (default) and 400." @${FALSE} .endif japanese/xdvi300 also has all the regular patches, package files, etc. If you type make there, it will take the default value for the resolution (300) and build the port normally. As for other resolutions, this is the entire xdvi118/Makefile: RESOLUTION= 118 MASTERDIR= ${.CURDIR}/../xdvi300 .include "${MASTERDIR}/Makefile" (xdvi240/Makefile and xdvi400/Makefile are similar). The MASTERDIR definition tells bsd.port.mk that the regular set of subdirectories like FILESDIR and SCRIPTDIR are to be found under xdvi300. The RESOLUTION=118 line will override the RESOLUTION=300 line in xdvi300/Makefile and the port will be built with resolution set to 118. Manpages The MAN[1-9LN] variables will automatically add any manpages to pkg-plist (this means you must not list manpages in the pkg-plist—see generating PLIST for more). It also makes the install stage automatically compress or uncompress manpages depending on the setting of NOMANCOMPRESS in /etc/make.conf. If your port tries to install multiple names for manpages using symlinks or hardlinks, you must use the MLINKS variable to identify these. The link installed by your port will be destroyed and recreated by bsd.port.mk to make sure it points to the correct file. Any manpages listed in MLINKS must not be listed in the pkg-plist. To specify whether the manpages are compressed upon installation, use the MANCOMPRESSED variable. This variable can take three values, yes, no and maybe. yes means manpages are already installed compressed, no means they are not, and maybe means the software already respects the value of NOMANCOMPRESS so bsd.port.mk does not have to do anything special. MANCOMPRESSED is automatically set to yes if USE_IMAKE is set and NO_INSTALL_MANPAGES is not set, and to no otherwise. You do not have to explicitly define it unless the default is not suitable for your port. If your port anchors its man tree somewhere other than PREFIX, you can use the MANPREFIX variable to set it. Also, if only manpages in certain sections go in a non-standard place, such as some Perl module ports, you can set individual man paths using MANsectPREFIX (where sect is one of 1-9, L or N). If your manpages go to language-specific subdirectories, set the names of the languages in MANLANG. The value of this variable defaults to "" (i.e., English only). Here is an example that puts it all together.
MAN1= foo.1 MAN3= bar.3 MAN4= baz.4 MLINKS= foo.1 alt-name.8 MANLANG= "" ja MAN3PREFIX= ${PREFIX}/share/foobar MANCOMPRESSED= yes This states that six files are installed by this port: ${PREFIX}/man/man1/foo.1.gz ${PREFIX}/man/ja/man1/foo.1.gz ${PREFIX}/share/foobar/man/man3/bar.3.gz ${PREFIX}/share/foobar/man/ja/man3/bar.3.gz ${PREFIX}/man/man4/baz.4.gz ${PREFIX}/man/ja/man4/baz.4.gz Additionally, ${PREFIX}/man/man8/alt-name.8.gz may or may not be installed by your port. Regardless, a symlink will be made to join the foo(1) manpage and the alt-name(8) manpage. Info files If your package needs to install GNU info files, they should be listed in the INFO variable (without the trailing .info), and appropriate installation/de-installation code will be automatically added to the temporary pkg-plist before package registration. Makefile Options Some large applications can be built in a number of configurations, adding functionality if one of a number of libraries or applications is available. Examples include choice of natural (human) language, GUI versus command-line, or type of database to support. Since not all users want those libraries or applications, the ports system provides hooks that the port author can use to control which configuration should be built. Supporting these properly will make users happy, and effectively provide two or more ports for the price of one. <makevar>KNOBS</makevar> <makevar>WITH_<replaceable>*</replaceable></makevar> and <makevar>WITHOUT_<replaceable>*</replaceable></makevar> These variables are designed to be set by the system administrator. There are many that are standardized in ports/Mk/bsd.*.mk; others are not, which can be confusing. If you need to add such a configuration variable, please consider using one of the ones from the following list. You should not assume that a WITH_* necessarily has a corresponding WITHOUT_* variable and vice versa. In general, the default is simply assumed. Unless otherwise specified, these variables are only tested for being set or not set, rather than being set to some specific value such as YES or NO. The <makevar>WITH_<replaceable>*</replaceable></makevar> and <makevar>WITHOUT_<replaceable>*</replaceable></makevar> variables Variable Means WITH_APACHE2 If set, use www/apache2 instead of the default of www/apache. WITH_BERKELEY_DB Define this variable to specify the ability to use a variant of the Berkeley database package such as databases/db41. An associated variable, WITH_BDB_VER, may be set to values such as 2, 3, 4, 41 or 42. WITH_MYSQL Define this variable to specify the ability to use a variant of the MySQL database package such as databases/mysql40-server. An associated variable, WANT_MYSQL_VER, may be set to values such as 323, 40, 41, or 50. WITHOUT_NLS If set, says that internationalization is not needed, which can save compile time. By default, internationalization is used. WITH_OPENSSL_BASE Use the version of OpenSSL in the base system. WITH_OPENSSL_PORT Use the version of OpenSSL from security/openssl, overwriting the version that was originally installed in the base system. WITH_POSTGRESQL Define this variable to specify the ability to use a variant of the PostgreSQL database package such as databases/postgresql72. WITHOUT_X11 If the port can be built both with and without X support, then it should normally be built with X support. If this variable is defined, then the version that does not have X support should be built instead.
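As a sketch of how a port might honor one of these variables, the following fragment disables internationalization when WITHOUT_NLS is set. The --disable-nls switch is an assumption about the software's configure script, and the test must come after bsd.port.pre.mk is included:

.include <bsd.port.pre.mk>

.if defined(WITHOUT_NLS)
CONFIGURE_ARGS+= --disable-nls	# hypothetical configure switch
.else
USE_GETTEXT= yes
.endif

.include <bsd.port.post.mk>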
Knob naming It is recommended that porters use like-named knobs, for the benefit of end-users and to help keep the number of knob names down. A list of popular knob names can be found in the KNOBS file. Knob names should reflect what the knob is and does. When a port has a lib prefix in its PORTNAME, the lib prefix should be dropped in knob naming.
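For example, a port of a hypothetical libfoo library would name its knob WITH_FOO rather than WITH_LIBFOO.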
<makevar>OPTIONS</makevar> Background The OPTIONS variable gives the user who installs the port a dialog with the available options and saves them to /var/db/ports/portname/options. The next time the port has to be rebuilt, the options are reused. Never again will you have to remember all twenty of the WITH_* and WITHOUT_* options you used to build this port! When the user runs make config (or runs make build for the first time), the framework will check for /var/db/ports/portname/options. If that file does not exist, it will use the values of OPTIONS to create a dialog box where the options can be enabled or disabled. Then the options file is saved and the selected variables will be used when building the port. Use make showconfig to see the saved configuration. Use make rmconfig to remove the saved configuration. Syntax The syntax for the OPTIONS variable is: OPTIONS= OPTION "descriptive text" default ... The value for default is either ON or OFF. Multiple repetitions of these three fields are allowed. The OPTIONS definition must appear before the inclusion of bsd.port.pre.mk. The WITH_* and WITHOUT_* variables can only be tested after the inclusion of bsd.port.pre.mk. Due to a deficiency in the infrastructure, you can only test WITH_* variables for options that are OFF by default, and WITHOUT_* variables for options that default to ON. Example Simple use of <makevar>OPTIONS</makevar> OPTIONS= FOO "Enable option foo" On \ BAR "Support feature bar" Off .include <bsd.port.pre.mk> .if defined(WITHOUT_FOO) CONFIGURE_ARGS+= --without-foo .else CONFIGURE_ARGS+= --with-foo .endif .if defined(WITH_BAR) RUN_DEPENDS+= bar:${PORTSDIR}/bar/bar .endif .include <bsd.port.post.mk>
Specifying the working directory Each port is extracted into a working directory, which must be writable. The ports system defaults to having the DISTFILES unpack into a directory called ${DISTNAME}. In other words, if you have set: PORTNAME= foo PORTVERSION= 1.0 then the port's distribution files contain a top-level directory, foo-1.0, and the rest of the files are located under that directory. There are a number of variables you can override if that is not the case. <makevar>WRKSRC</makevar> This variable lists the name of the directory that is created when the application's distfiles are extracted. If our previous example extracted into a directory called foo (and not foo-1.0), you would write: WRKSRC= ${WRKDIR}/foo or possibly WRKSRC= ${WRKDIR}/${PORTNAME} <makevar>NO_WRKSUBDIR</makevar> If the port does not extract into a subdirectory at all, then you should set NO_WRKSUBDIR to indicate that. NO_WRKSUBDIR= yes <makevar>CONFLICTS</makevar> If your package cannot coexist with other packages (because of file conflicts, runtime incompatibility, etc.), list the other package names in the CONFLICTS variable. You can use shell globs like * and ? here. Package names should be enumerated the same way they appear in /var/db/pkg. Please make sure that CONFLICTS does not match this port's package itself, or else forcing its installation with FORCE_PKG_REGISTER will no longer work. CONFLICTS automatically sets IGNORE, which is more fully documented in .
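For instance, a port that cannot coexist with any installed version of a hypothetical mumble package, nor with the 1.x series of a hypothetical grumble package, might declare:

CONFLICTS= mumble-* grumble-1.*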
Special considerations There are some more things you have to take into account when you create a port. This section explains the most common of those. Shared Libraries If your port installs one or more shared libraries, define the INSTALLS_SHLIB make variable, which will instruct bsd.port.mk to run ${LDCONFIG} -m on the directory where the new library is installed (usually PREFIX/lib) during the post-install target to register it in the shared library cache. This variable, when defined, will also facilitate the addition of an appropriate @exec /sbin/ldconfig -m and @unexec /sbin/ldconfig -R pair into your pkg-plist file, so that a user who installed the package can start using the shared library immediately and de-installation will not cause the system to still believe the library is there. If needed, you can override the default location where the new library is installed by defining the LDCONFIG_DIRS make variable, which should contain a list of directories into which shared libraries are to be installed. For example, if your port installs shared libraries into the PREFIX/lib/foo and PREFIX/lib/bar directories, you could use the following in your Makefile: INSTALLS_SHLIB= yes LDCONFIG_DIRS= %%PREFIX%%/lib/foo %%PREFIX%%/lib/bar Remember that non-standard directories will not be passed to &man.ldconfig.8; on (re-)boot! If any port really needs this to work, install a startup script as x11/kdelibs3 does. Please double-check; often this is not necessary at all or can be avoided through -rpath or setting LD_RUN_PATH during linking (see lang/moscow_ml for an example), or through a shell wrapper which sets LD_LIBRARY_PATH before invoking the binary, like www/mozilla does. Note that the content of LDCONFIG_DIRS is passed through &man.sed.1; just like the rest of pkg-plist, so PLIST_SUB substitutions also apply here. It is recommended that you use %%PREFIX%% for PREFIX, %%LOCALBASE%% for LOCALBASE and %%X11BASE%% for X11BASE. Try to keep shared library version numbers in the libfoo.so.0 format. Our runtime linker only cares about the major (first) number. When the major library version number increments in the update to the new port version, all other ports that link to the affected library should have their PORTREVISION incremented, to force recompilation with the new library version. Ports with distribution restrictions Licenses vary, and some of them place restrictions on how the application can be packaged, whether it can be sold for profit, and so on. It is your responsibility as a porter to read the licensing terms of the software and make sure that the FreeBSD project will not be held accountable for violating them by redistributing the source or compiled binaries either via FTP/HTTP or CD-ROM. If in doubt, please contact the &a.ports;. In situations like this, the variables described in the following sections can be set. <makevar>NO_PACKAGE</makevar> This variable indicates that we may not generate a binary package of the application. For instance, the license may disallow binary redistribution, or it may prohibit distribution of packages created from patched sources. However, the port's DISTFILES may be freely mirrored on FTP/HTTP. They may also be distributed on a CD-ROM (or similar media) unless NO_CDROM is set as well. NO_PACKAGE should also be used if the binary package is not generally useful, and the application should always be compiled from the source code. For example, if the application has site-specific configuration information hard-coded into it at compile time, set NO_PACKAGE.
NO_PACKAGE should be set to a string describing the reason why the package should not be generated. <makevar>NO_CDROM</makevar> This variable alone indicates that, although we are allowed to generate binary packages, we may put neither those packages nor the port's DISTFILES onto a CD-ROM (or similar media) for resale. However, the binary packages and the port's DISTFILES will still be available via FTP/HTTP. If this variable is set along with NO_PACKAGE, then only the port's DISTFILES will be available, and only via FTP/HTTP. NO_CDROM should be set to a string describing the reason why the port cannot be redistributed on CD-ROM. For instance, this should be used if the port's license is for non-commercial use only. <makevar>RESTRICTED</makevar> Set this variable alone if the application's license permits neither mirroring the application's DISTFILES nor distributing the binary package in any way. NO_CDROM or NO_PACKAGE should not be set along with RESTRICTED since the latter variable implies the former ones. RESTRICTED should be set to a string describing the reason why the port cannot be redistributed. Typically, this indicates that the port contains proprietary software and that the user will need to manually download the DISTFILES, possibly after registering for the software or agreeing to accept the terms of an EULA. <makevar>RESTRICTED_FILES</makevar> When RESTRICTED or NO_CDROM is set, this variable defaults to ${DISTFILES} ${PATCHFILES}, otherwise it is empty. If only some of the distribution files are restricted, then set this variable to list them. Note that the port committer should add an entry to /usr/ports/LEGAL for every listed distribution file, describing exactly what the restriction entails. Building mechanisms <command>make</command>, <command>gmake</command>, and <command>imake</command> If your port uses GNU make, set USE_GMAKE=yes. Variables for ports related to gmake Variable Means USE_GMAKE The port requires gmake to build. GMAKE The full path for gmake if it is not in the PATH.
If your port is an X application that creates Makefile files from Imakefile files using imake, then set USE_IMAKE=yes. This will cause the configure stage to automatically do an xmkmf -a. If the flag is a problem for your port, set XMKMF=xmkmf. If the port uses imake but does not understand the install.man target, NO_INSTALL_MANPAGES=yes should be set. If your port's source Makefile has something other than all as the main build target, set ALL_TARGET accordingly. The same goes for install and INSTALL_TARGET.
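For example, if the software's own Makefile builds everything via a prog target and installs with install-all (names assumed here purely for illustration), the port would set:

ALL_TARGET= prog
INSTALL_TARGET= install-all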
<command>configure</command> script If your port uses the configure script to generate Makefile files from Makefile.in files, set GNU_CONFIGURE=yes. If you want to give extra arguments to the configure script (the default argument list is --prefix=${PREFIX} ${CONFIGURE_TARGET}), set those extra arguments in CONFIGURE_ARGS. Extra environment variables can be passed using the CONFIGURE_ENV variable. If your package uses GNU configure, and the resulting executable file has a strange name like i386-portbld-freebsd4.7-appname, you will need to additionally override the CONFIGURE_TARGET variable to specify the target in the way required by scripts generated by recent versions of autoconf. Add the following line immediately after the GNU_CONFIGURE=yes line in your Makefile: CONFIGURE_TARGET=--build=${MACHINE_ARCH}-portbld-freebsd${OSREL} Variables for ports that use configure Variable Means GNU_CONFIGURE The port uses the configure script to prepare the build. HAS_CONFIGURE Same as GNU_CONFIGURE, except the default configure target is not added to CONFIGURE_ARGS. CONFIGURE_ARGS Additional arguments passed to the configure script. CONFIGURE_ENV Additional environment variables to be set for the configure script run. CONFIGURE_TARGET Override the default configure target. Default value is ${MACHINE_ARCH}-portbld-freebsd${OSREL}.
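As a sketch, a GNU configure based port that needs one extra switch and some extra environment might contain the following; the --with-foo switch is hypothetical, standing in for whatever the software actually accepts:

GNU_CONFIGURE= yes
CONFIGURE_ARGS+= --with-foo=${LOCALBASE}
CONFIGURE_ENV+= CPPFLAGS="-I${LOCALBASE}/include" \
		LDFLAGS="-L${LOCALBASE}/lib"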
Using GNU autotools Introduction The various GNU autotools provide an abstraction mechanism for building a piece of software over a wide variety of operating systems and machine architectures. Within the Ports Collection, an individual port can make use of these tools via a simple construct: USE_AUTOTOOLS= tool:version[:operation] ... At the time of writing, tool can be one of libtool, libltdl, autoconf, autoheader, automake or aclocal. version specifies the particular tool revision to be used (see devel/{automake,autoconf,libtool}[0-9]+ for valid versions). operation is an optional extension to modify how the tool is used. Multiple tools can be specified at once, either by including them all on a single line, or using the += Makefile construct. Before proceeding any further, it cannot be stressed highly enough that the constructs discussed here are for use ONLY in building other ports. For cross-development work, the devel/gnu-{automake,autoconf,libtool} ports should be used, such as within an IDE. devel/anjuta and devel/kdevelop (GNOME and KDE respectively) are good examples of how to achieve this. <command>libtool</command> Shared libraries using the GNU building framework usually use libtool to adjust the compilation and installation of shared libraries to match the specifics of the underlying operating system. The Ports Collection provides a number of versions of libtool modified for use by &os;. USE_AUTOTOOLS= libtool:version[:inc|:env] With no additional operations, libtool:version tells the building framework that the port uses libtool, implying GNU_CONFIGURE. The configure script will be patched with the system-installed copy of libtool. Further, a number of make and shell variables will be assigned for onward use by the port. See bsd.autotools.mk for details. With the :inc operation, the environment will be set up, and a slightly different set of patching will be performed. With the :env operation, only the environment will be set up. Previously USE_AUTOTOOLS construct USE_LIBTOOL_VER=13 libtool:13 USE_INC_LIBTOOL_VER=15 libtool:15:inc WANT_LIBTOOL_VER=15 libtool:15:env Finally, LIBTOOLFLAGS and LIBTOOLFILES can be optionally set to override the most likely arguments to, and files patched by, libtool. Most ports are unlikely to need this. See bsd.autotools.mk for further details. <command>libltdl</command> Some ports make use of the libltdl library package, which is part of the libtool suite. Use of this library does not automatically necessitate the use of libtool itself, so a separate construct is provided. USE_AUTOTOOLS= libltdl:version Currently, all this does is to bring in a LIB_DEPENDS on the appropriate libltdl port, and is provided as a convenience function to help eliminate any dependencies on the autotools ports outside of the USE_AUTOTOOLS framework. There are no optional operations for this tool. Previously USE_AUTOTOOLS construct USE_LIBLTDL=YES libltdl:15 <command>autoconf</command> and <command>autoheader</command> Some ports do not contain a configure script, but do contain an autoconf template in the configure.ac file. You can use the following assignments to let autoconf create the configure script, and also have autoheader create template headers for use by the configure script. USE_AUTOTOOLS= autoconf:version[:env] and USE_AUTOTOOLS= autoheader:version which also implies the use of autoconf:version. Similarly to libtool, the inclusion of the optional :env operation simply sets up the environment for further use. 
Without it, patching and reconfiguration of the port is carried out. Previously USE_AUTOTOOLS construct USE_AUTOCONF_VER=213 autoconf:213 WANT_AUTOCONF_VER=259 autoconf:259:env USE_AUTOHEADER_VER=253 autoheader:253 (implies autoconf:253) The additional optional variables AUTOCONF_ARGS and AUTOHEADER_ARGS can be overridden by the port Makefile if specifically requested. As with the libtool equivalents, most ports are unlikely to need this. <command>automake</command> and <command>aclocal</command> Some packages only contain Makefile.am files. These have to be converted into Makefile.in files using automake, and then further processed by configure to generate an actual Makefile. Similarly, packages occasionally do not ship with included aclocal.m4 files, again required to build the software. This can be achieved with aclocal, which scans configure.ac or configure.in. aclocal has a similar relationship to automake as autoheader does to autoconf, described in the previous section. aclocal implies the use of automake, thus we have: USE_AUTOTOOLS= automake:version[:env] and USE_AUTOTOOLS= aclocal:version which also implies the use of automake:version. Similarly to libtool and autoconf, the inclusion of the optional :env operation simply sets up the environment for further use. Without it, reconfiguration of the port is carried out. Previously USE_AUTOTOOLS construct USE_AUTOMAKE_VER=14 automake:14 WANT_AUTOMAKE_VER=15 automake:15:env USE_ACLOCAL_VER=19 aclocal:19 (implies automake:19) As with autoconf and autoheader, both automake and aclocal have optional argument variables, AUTOMAKE_ARGS and ACLOCAL_ARGS respectively, which may be overridden by the port Makefile if required. Using <literal>perl</literal> Variables for ports that use <literal>perl</literal> Variable Means USE_PERL5 Says that the port uses perl 5 to build and run. USE_PERL5_BUILD Says that the port uses perl 5 to build. USE_PERL5_RUN Says that the port uses perl 5 to run. PERL The full path of perl 5, either in the system or installed from a port, but without the version number. Use this if you need to replace #! lines in scripts. PERL_CONFIGURE Configure using Perl's MakeMaker. It implies USE_PERL5. PERL_MODBUILD Configure, build and install using Module::Build. It implies PERL_CONFIGURE. Read-only variables PERL_VERSION The full version of perl installed (e.g., 5.00503). PERL_VER The short version of perl installed (e.g., 5.005). PERL_LEVEL The installed perl version as an integer of the form MNNNPP (e.g., 500503). PERL_ARCH Where perl stores architecture dependent libraries. Defaults to ${ARCH}-freebsd. PERL_PORT Name of the perl port that is installed (e.g., perl5). SITE_PERL Directory name where site specific perl packages go. This value is added to PLIST_SUB.
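Putting this together, a minimal sketch of a CPAN module port might read as follows; the module name Foo-Bar and its subdirectory are hypothetical:

PORTNAME= Foo-Bar
PORTVERSION= 1.00
CATEGORIES= devel perl5
MASTER_SITES= ${MASTER_SITE_PERL_CPAN}
MASTER_SITE_SUBDIR= Foo
PKGNAMEPREFIX= p5-

PERL_CONFIGURE= yes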
Ports of Perl modules that do not have an official website should link to cpan.org in the WWW line of the pkg-descr file. The suggested URL scheme is http://search.cpan.org/dist/Module-Name.
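For example, the pkg-descr of the hypothetical p5-Foo-Bar port above would end with:

WWW: http://search.cpan.org/dist/Foo-Bar/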
Using X11 Variable definitions Variables for ports that use X USE_X_PREFIX The port installs in X11BASE, not PREFIX. USE_XLIB The port uses the X libraries. USE_MOTIF The port uses the Motif toolkit. Implies USE_XPM. USE_IMAKE The port uses imake. Implies USE_X_PREFIX. XMKMF Set to the path of xmkmf if not in the PATH. Defaults to xmkmf -a.
Variables for depending on individual parts of X11 X_IMAKE_PORT Port providing imake and several other utilities used to build X11. X_LIBRARIES_PORT Port providing X11 libraries. X_CLIENTS_PORT Port providing X clients. X_SERVER_PORT Port providing X server. X_FONTSERVER_PORT Port providing font server. X_PRINTSERVER_PORT Port providing print server. X_VFBSERVER_PORT Port providing virtual framebuffer server. X_NESTSERVER_PORT Port providing a nested X server. X_FONTS_ENCODINGS_PORT Port providing encodings for fonts. X_FONTS_MISC_PORT Port providing miscellaneous bitmap fonts. X_FONTS_100DPI_PORT Port providing 100dpi bitmap fonts. X_FONTS_75DPI_PORT Port providing 75dpi bitmap fonts. X_FONTS_CYRILLIC_PORT Port providing Cyrillic bitmap fonts. X_FONTS_TTF_PORT Port providing &truetype; fonts. X_FONTS_TYPE1_PORT Port providing Type1 fonts. X_MANUALS_PORT Port providing developer-oriented manual pages.
Using X11-related variables in a port # Use X11 libraries and depend on # a font server as well as Cyrillic fonts. RUN_DEPENDS= ${X11BASE}/bin/xfs:${X_FONTSERVER_PORT} \ ${X11BASE}/lib/X11/fonts/cyrillic/crox1c.pcf.gz:${X_FONTS_CYRILLIC_PORT} USE_XLIB= yes
Ports that require Motif If your port requires a Motif library, define USE_MOTIF in the Makefile. The default Motif implementation is x11-toolkits/open-motif. Users can choose x11-toolkits/lesstif instead by setting the WANT_LESSTIF variable. The MOTIFLIB variable will be set by bsd.port.mk to reference the appropriate Motif library. Please patch the source of your port to use ${MOTIFLIB} wherever the Motif library is referenced in the original Makefile or Imakefile. There are two common cases: If the port refers to the Motif library as -lXm in its Makefile or Imakefile, simply substitute ${MOTIFLIB} for it. If the port uses XmClientLibs in its Imakefile, change it to ${MOTIFLIB} ${XTOOLLIB} ${XLIB}. Note that MOTIFLIB (usually) expands to -L/usr/X11R6/lib -lXm or /usr/X11R6/lib/libXm.a, so there is no need to add -L or -l in front. X11 fonts If your port installs fonts for the X Window System, put them in X11BASE/lib/X11/fonts/local. Getting a fake <envar>DISPLAY</envar> using Xvfb Some applications require a working X11 display for compilation to succeed. This poses a problem for the FreeBSD package building cluster, which operates headless. When the following canonical hack is used, the package cluster will start the virtual framebuffer X server. The working DISPLAY is then passed to the build. .if defined(PACKAGE_BUILDING) BUILD_DEPENDS+= Xvfb:${X_VFBSERVER_PORT} \ ${X11BASE}/lib/X11/fonts/misc/8x13O.pcf.gz:${X_FONTS_MISC_PORT} .endif
Using GNOME The FreeBSD/GNOME project uses its own set of variables to define which GNOME components a particular port uses. A comprehensive list of these variables exists within the FreeBSD/GNOME project's homepage. Your port does not need to depend on GNOME if it merely installs pkg-config metadata files to PREFIX/libdata/pkgconfig. As usual, your port should be prepared to clean up after itself and remove that directory if it becomes empty. Assuming that your port installs a file named gtkmumble.pc to the said location, just add the following lines to pkg-plist: libdata/pkgconfig/gtkmumble.pc @unexec rmdir %B 2>/dev/null || true The latter line must appear immediately after the former one so that %B expands correctly. Please refer to &man.pkg.create.1; for a detailed description of the syntax used in pkg-plist. Using KDE Variables for ports that use KDE USE_QT_VER The port uses the Qt toolkit. Possible values are 1 and 3; each specifies the major version of Qt to use. Sets both MOC and QTCPPFLAGS to appropriate default values. USE_KDELIBS_VER The port uses KDE libraries. The only possible value is 3, which specifies the major version of KDE to use. Implies USE_QT_VER of the appropriate version. USE_KDEBASE_VER The port uses KDE base. The only possible value is 3, which specifies the major version of KDE to use. Implies USE_KDELIBS_VER of the appropriate version. MOC Set to the path of moc. Default set according to USE_QT_VER value. QTCPPFLAGS Set the CPPFLAGS to use when processing Qt code. Default set according to USE_QT_VER value.
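For instance, a KDE 3 application port would simply set:

USE_KDEBASE_VER= 3

which, per the table above, pulls in the matching KDE libraries and Qt version as well.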
Using Java Variable definitions If your port needs a Java™ Development Kit (JDK) to either build, run or even extract the distfile, then it should define USE_JAVA. There are several JDKs in the ports collection, from various vendors, and in several versions. If your port must use one of these versions, you can define which one. The most current version is java/jdk14. Variables that may be set by ports that use Java Variable Means USE_JAVA Should be defined for the remaining variables to have any effect. JAVA_VERSION List of space-separated suitable Java versions for the port. An optional "+" allows you to specify a range of versions (allowed values: 1.1[+] 1.2[+] 1.3[+] 1.4[+]). JAVA_OS List of space-separated suitable JDK port operating systems for the port (allowed values: native linux). JAVA_VENDOR List of space-separated suitable JDK port vendors for the port (allowed values: freebsd bsdjava sun ibm blackdown). JAVA_BUILD When set, it means that the selected JDK port should be added to the build dependencies of the port. JAVA_RUN When set, it means that the selected JDK port should be added to the run dependencies of the port. JAVA_EXTRACT When set, it means that the selected JDK port should be added to the extract dependencies of the port. USE_JIKES Whether the port should or should not use the jikes bytecode compiler to build. When no value is set for this variable, the port will use jikes to build if it is available. You may also explicitly forbid or enforce the use of jikes (by setting 'no' or 'yes'). In the latter case, devel/jikes will be added to the build dependencies of the port. Whenever jikes is actually used in place of javac, the HAVE_JIKES variable is defined by bsd.java.mk.
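As a sketch, a port that needs a native JDK 1.3 or newer both to build and to run might set:

USE_JAVA= yes
JAVA_VERSION= 1.3+
JAVA_OS= native
JAVA_BUILD= yes
JAVA_RUN= yes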
Below is the list of all settings a port will receive after setting USE_JAVA: Variables provided to ports that use Java Variable Value JAVA_PORT The name of the JDK port (e.g. 'java/jdk14'). JAVA_PORT_VERSION The full version of the JDK port (e.g. '1.4.2'). If you only need the first two digits of this version number, use ${JAVA_PORT_VERSION:C/^([0-9])\.([0-9])(.*)$/\1.\2/}. JAVA_PORT_OS The operating system used by the JDK port (e.g. 'linux'). JAVA_PORT_VENDOR The vendor of the JDK port (e.g. 'sun'). JAVA_PORT_OS_DESCRIPTION Description of the operating system used by the JDK port (e.g. 'Linux'). JAVA_PORT_VENDOR_DESCRIPTION Description of the vendor of the JDK port (e.g. 'FreeBSD Foundation'). JAVA_HOME Path to the installation directory of the JDK (e.g. '/usr/local/jdk1.3.1'). JAVAC Path to the Java compiler to use (e.g. '/usr/local/jdk1.1.8/bin/javac' or '/usr/local/bin/jikes'). JAR Path to the jar tool to use (e.g. '/usr/local/jdk1.2.2/bin/jar' or '/usr/local/bin/fastjar'). APPLETVIEWER Path to the appletviewer utility (e.g. '/usr/local/linux-jdk1.2.2/bin/appletviewer'). JAVA Path to the java executable. Use this for executing Java programs (e.g. '/usr/local/jdk1.3.1/bin/java'). JAVADOC Path to the javadoc utility program. JAVAH Path to the javah program. JAVAP Path to the javap program. JAVA_KEYTOOL Path to the keytool utility program. This variable is available only if the JDK is Java 1.2 or higher. JAVA_N2A Path to the native2ascii tool. JAVA_POLICYTOOL Path to the policytool program. This variable is available only if the JDK is Java 1.2 or higher. JAVA_SERIALVER Path to the serialver utility program. RMIC Path to the RMI stub/skeleton generator, rmic. RMIREGISTRY Path to the RMI registry program, rmiregistry. RMID Path to the RMI daemon program rmid. This variable is only available if the JDK is Java 1.2 or higher. JAVA_CLASSES Path to the archive that contains the JDK class files. On JDK 1.2 or later, this is ${JAVA_HOME}/jre/lib/rt.jar. Earlier JDKs used ${JAVA_HOME}/lib/classes.zip. HAVE_JIKES Defined whenever jikes is used by the port (see USE_JIKES above).
You may use the java-debug make target to get information for debugging your port. It will display the value of many of the aforementioned variables. Additionally, the following constants are defined so all Java ports may be installed in a consistent way: Constants defined for ports that use Java Constant Value JAVASHAREDIR The base directory for everything related to Java. Default: ${PREFIX}/share/java. JAVAJARDIR The directory where JAR files should be installed. Default: ${JAVASHAREDIR}/classes. JAVALIBDIR The directory where JAR files installed by other ports are located. Default: ${LOCALBASE}/share/java/classes.
The related entries are defined in both PLIST_SUB (documented in ) and SUB_LIST.
Building with Ant When the port is to be built using Apache Ant, it has to define USE_ANT. Ant is thus considered to be the sub-make command. When no do-build target is defined by the port, a default one will be set that simply runs Ant according to MAKE_ENV, MAKE_ARGS and ALL_TARGET. This is similar to the USE_GMAKE mechanism, which is documented in . If jikes is used in place of javac (see USE_JIKES in ), then Ant will automatically use it to build the port. Best practices When porting a Java library, your port should install the JAR file(s) in ${JAVAJARDIR}, and everything else under ${JAVASHAREDIR}/${PORTNAME} (except for the documentation, see below). To reduce the size of the packing list file, you may reference the JAR file(s) directly in the Makefile. Just use the following statement (where myport.jar is the name of the JAR file installed as part of the port): PLIST_FILES+= %%JAVAJARDIR%%/myport.jar When porting a Java application, the port usually installs everything under a single directory (including its JAR dependencies). The use of ${JAVASHAREDIR}/${PORTNAME} is strongly encouraged in this regard. It is up to the porter to decide whether the port should install the additional JAR dependencies under this directory or directly use the already installed ones (from ${JAVAJARDIR}). Regardless of the type of your port (library or application), the additional documentation should be installed in the same location as for any other port. The JavaDoc tool is known to produce a different set of files depending on the version of the JDK that is used. For ports that do not enforce the use of a particular JDK, it is therefore a complex task to specify the packing list (pkg-plist). This is one reason why porters are strongly encouraged to use the PORTDOCS macro. Moreover, even if you can predict the set of files that will be generated by javadoc, the size of the resulting pkg-plist advocates for the use of PORTDOCS. The default value for DATADIR is ${PREFIX}/share/${PORTNAME}. It is a good idea to override DATADIR to ${JAVASHAREDIR}/${PORTNAME} for Java ports. Indeed, DATADIR is automatically added to PLIST_SUB (documented in ) so you may use %%DATADIR%% directly in pkg-plist. As for the choice of building Java ports from source or directly installing them from a binary distribution, there is no defined policy at the time of writing. However, people from the &os; Java Project encourage porters to have their ports built from source whenever it is a trivial task. All the features that have been presented in this section are implemented in bsd.java.mk. If you ever think that your port needs more sophisticated Java support, please first have a look at the bsd.java.mk CVS log as it usually takes some time to document the latest features. Then, if you think the support you are lacking would be beneficial to many other Java ports, feel free to discuss it on the &a.java;. Although there is a java category for PRs, it refers to the JDK porting effort from the &os; Java project. Therefore, you should submit your Java port in the ports category as for any other port, unless the issue you are trying to resolve is related to either a JDK implementation or bsd.java.mk. Similarly, there is a defined policy regarding the CATEGORIES of a Java port, which is detailed in .
Using Apache and PHP Apache Variables for ports that use Apache USE_APACHE The port requires Apache. WITH_APACHE2 The port requires Apache 2.0. Without this variable, the port will depend on Apache 1.3. APXS Full path to the apxs binary (read-only variable).
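As a sketch, an Apache module port might combine USE_APACHE with the read-only APXS variable to install and activate the module; the module and file names here are hypothetical:

USE_APACHE= yes

do-install:
	${APXS} -i -a -n mymod ${WRKSRC}/mod_mymod.so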
PHP Variables for ports that use PHP USE_PHP The port requires PHP. The value yes adds a dependency on PHP. The list of required PHP extensions can be specified instead. Example: pcre xml gettext DEFAULT_PHP_VER Selects which major version of PHP will be installed as a dependency when no PHP is installed yet. Default is 4. Possible values: 4, 5 BROKEN_WITH_PHP The port does not work with PHP of the given version. Possible values: 4, 5 USE_PHPIZE The port will be built as a PHP extension. USE_PHPEXT The port will be treated as a PHP extension, including installation and registration in the extension registry. USE_PHP_BUILD Set PHP as a build dependency. WANT_PHP_CLI Want the CLI (command line) version of PHP. WANT_PHP_CGI Want the CGI version of PHP. WANT_PHP_MOD Want the Apache module version of PHP. WANT_PHP_SCR Want the CLI or the CGI version of PHP. WANT_PHP_WEB Want the Apache module or the CGI version of PHP. WANT_PHP_PEAR Want the PEAR framework.
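For instance, a web application that needs the pcre and xml extensions and either the Apache module or the CGI version of PHP might set:

USE_PHP= pcre xml
WANT_PHP_WEB= yes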
PEAR modules Porting PEAR modules is a very simple process. Use the variables FILES, TESTS, DATA, SQLS, SCRIPTFILES, DOCS and EXAMPLES to list the files you want to install. All listed files will be automatically installed into the appropriate locations and added to pkg-plist. Include ${PORTSDIR}/devel/pear-PEAR/Makefile.common on the last line of the Makefile. Example Makefile for PEAR class PORTNAME= Date PORTVERSION= 1.4.3 CATEGORIES= devel www pear MAINTAINER= example@domain.com COMMENT= PEAR Date and Time Zone Classes BUILD_DEPENDS= ${PEARDIR}/PEAR.php:${PORTSDIR}/devel/pear-PEAR RUN_DEPENDS= ${BUILD_DEPENDS} FILES= Date.php Date/Calc.php Date/Human.php Date/Span.php \ Date/TimeZone.php TESTS= test_calc.php test_date_methods_span.php testunit.php \ testunit_date.php testunit_date_span.php wknotest.txt \ bug674.php bug727_1.php bug727_2.php bug727_3.php \ bug727_4.php bug967.php weeksinmonth_4_monday.txt \ weeksinmonth_4_sunday.txt weeksinmonth_rdm_monday.txt \ weeksinmonth_rdm_sunday.txt DOCS= TODO _DOCSDIR= . .include <bsd.port.pre.mk> .include "${PORTSDIR}/devel/pear-PEAR/Makefile.common" .include <bsd.port.post.mk>
Using Python Most useful variables for ports that use Python USE_PYTHON The port needs Python. The minimal required version can be specified with values such as 2.3+. Version ranges can also be specified, by separating two version numbers with a dash, e.g.: 2.1-2.3 USE_PYDISTUTILS Use Python distutils for configuring, compiling and installing. This is required when the port comes with setup.py. This overrides the do-build and do-install targets and may also override do-configure if GNU_CONFIGURE is not defined. PYTHON_PKGNAMEPREFIX Used as a PKGNAMEPREFIX to distinguish packages for different Python versions. Example: py24- PYTHON_SITELIBDIR Location of the site-packages tree; it contains the installation path of Python (usually LOCALBASE). The PYTHON_SITELIBDIR variable can be very useful when installing Python modules. PYTHONPREFIX_SITELIBDIR The PREFIX-clean variant of PYTHON_SITELIBDIR. Always use %%PYTHON_SITELIBDIR%% in pkg-plist when possible. The default value of %%PYTHON_SITELIBDIR%% is lib/python%%PYTHON_VERSION%%/site-packages PYTHON_CMD Python interpreter command line, including the version number. PYNUMERIC Dependency line for the numeric extension. PYXML Dependency line for the XML extension (not needed for Python 2.0 and higher as it is also in the base distribution). USE_TWISTED Add a dependency on twistedCore. The list of required components can be specified as a value of this variable. Example: web lore pair flow USE_ZOPE Add a dependency on Zope, a web application platform. Changes the Python dependency to Python 2.3. Sets ZOPEBASEDIR to the directory with the Zope installation.
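A distutils based module port might therefore contain the following sketch, with PYTHON_PKGNAMEPREFIX providing the version-specific package name prefix:

PKGNAMEPREFIX= ${PYTHON_PKGNAMEPREFIX}

USE_PYTHON= 2.3+
USE_PYDISTUTILS= yes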
A complete list of available variables can be found in /usr/ports/Mk/bsd.python.mk.
Using Emacs This section is yet to be written. Using Ruby Useful variables for ports that use Ruby Variable Description USE_RUBY The port requires Ruby. USE_RUBY_EXTCONF The port uses extconf.rb to configure. USE_RUBY_SETUP The port uses setup.rb to configure. RUBY_SETUP Set to the alternative name of setup.rb. A common value is install.rb.
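For example, a port of a Ruby library that ships an install.rb style installer might set (a sketch):

USE_RUBY= yes
USE_RUBY_SETUP= yes
RUBY_SETUP= install.rb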
The following table shows the selected variables available to port authors via the ports infrastructure. These variables should be used to install files into their proper locations. Use them in pkg-plist as much as possible. These variables should not be redefined in the port. Selected read-only variables for ports that use Ruby Variable Description Example value RUBY_PKGNAMEPREFIX Used as a PKGNAMEPREFIX to distinguish packages for different Ruby versions. ruby18- RUBY_VERSION Full version of Ruby in the form of x.y.z. 1.8.2 RUBY_SITELIBDIR Architecture independent libraries installation path. /usr/local/lib/ruby/site_ruby/1.8 RUBY_SITEARCHILIBDIR Architecture dependent libraries installation path. /usr/local/lib/ruby/site_ruby/1.8/amd64-freebsd6 RUBY_MODDOCDIR Module documentation installation path. /usr/local/share/doc/ruby18/patsy RUBY_MODEXAMPLESDIR Module examples installation path. /usr/local/share/examples/ruby18/patsy
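In pkg-plist these variables appear in their %%VAR%% form, assuming the usual PLIST_SUB expansion; for example, a library file of a hypothetical module could be listed as:

%%RUBY_SITELIBDIR%%/patsy.rb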
A complete list of available variables can be found in /usr/ports/Mk/bsd.ruby.mk.
Using SDL The USE_SDL variable is used to autoconfigure the dependencies for ports which use an SDL based library like devel/sdl12 and x11-toolkits/sdl_gui. The following SDL libraries are recognized at the moment: sdl: devel/sdl12 gfx: graphics/sdl_gfx gui: x11-toolkits/sdl_gui image: graphics/sdl_image ldbad: devel/sdl_ldbad mixer: audio/sdl_mixer mm: devel/sdlmm net: net/sdl_net sound: audio/sdl_sound ttf: graphics/sdl_ttf Therefore, if a port has a dependency on net/sdl_net and audio/sdl_mixer, the syntax will be: USE_SDL= net mixer The dependency devel/sdl12, which is required by net/sdl_net and audio/sdl_mixer, is automatically added as well. If you use USE_SDL, it will automatically: Add a dependency on sdl12-config to BUILD_DEPENDS Add the variable SDL_CONFIG to CONFIGURE_ENV Add the dependencies of the selected libraries to LIB_DEPENDS To check whether an SDL library is available, you can do so with the WANT_SDL variable: WANT_SDL=yes .include <bsd.port.pre.mk> .if ${HAVE_SDL:Mmixer}!="" USE_SDL+= mixer .endif .include <bsd.port.post.mk> Starting and stopping services (rc scripts) Startup scripts are used to start services on system startup, and to give the administrator a standard way of stopping, starting and restarting the service. Ports integrate into the system rc.d framework. Details on its usage can be found in the Handbook chapter. A detailed explanation of the available commands is in &man.rc.subr.8;. One or more rc scripts can be installed: USE_RC_SUBR= doorman.sh Scripts must be placed in the files subdirectory and a .in suffix must be added to their filename. The only difference from a base system rc script is that the . /etc/rc.subr line must be replaced with . %%RC_SUBR%%, because older versions of &os; do not have an /etc/rc.subr file. Standard SUB_LIST expansions are used too. Using %%PREFIX%% is especially advised. More on SUB_LIST in the relevant section. Integration with &man.rcorder.8; is available by using USE_RCORDER instead of USE_RC_SUBR. An example of a simple rc script: #!/bin/sh # PROVIDE: doorman # REQUIRE: LOGIN # KEYWORD: FreeBSD # # Add the following lines to /etc/rc.conf to enable doorman: # doorman_enable (bool): Set to "NO" by default. # Set it to "YES" to enable doorman # doorman_config (path): Set to "%%PREFIX%%/etc/doormand/doormand.cf" by default. # . %%RC_SUBR%% name="doorman" rcvar=`set_rcvar` load_rc_config $name : ${doorman_enable="NO"} : ${doorman_config="%%PREFIX%%/etc/doormand/doormand.cf"} command=%%PREFIX%%/sbin/doormand pidfile=/var/run/doormand.pid command_args="-p $pidfile -f $doorman_config" run_rc_command "$1"
Advanced <filename>pkg-plist</filename> practices Changing <filename>pkg-plist</filename> based on make variables Some ports, particularly the p5- ports, need to change their pkg-plist depending on what options they are configured with (or the version of perl, in the case of p5- ports). To make this easy, any instances in the pkg-plist of %%OSREL%%, %%PERL_VER%%, and %%PERL_VERSION%% will be substituted appropriately. The value of %%OSREL%% is the numeric revision of the operating system (e.g., 4.9). %%PERL_VERSION%% is the full version number of perl (e.g., 5.00502) and %%PERL_VER%% is the perl version number minus the patchlevel (e.g., 5.005). Several other %%VARS%% related to the port's documentation files are described in the relevant section. If you need to make other substitutions, you can set the PLIST_SUB variable with a list of VAR=VALUE pairs and instances of %%VAR%% will be substituted with VALUE in the pkg-plist. For instance, if you have a port that installs many files in a version-specific subdirectory, you can put something like OCTAVE_VERSION= 2.0.13 PLIST_SUB= OCTAVE_VERSION=${OCTAVE_VERSION} in the Makefile and use %%OCTAVE_VERSION%% wherever the version shows up in pkg-plist. That way, when you upgrade the port, you will not have to change dozens (or in some cases, hundreds) of lines in the pkg-plist. This substitution (as well as the addition of any manual pages) will be done between the pre-install and do-install targets, by reading from PLIST and writing to TMPPLIST (default: WRKDIR/.PLIST.mktmp). So if your port builds PLIST on the fly, do so in or before pre-install. Also, if your port needs to edit the resulting file, do so in post-install to a file named TMPPLIST. Another way to modify a port's packing list is to set the variables PLIST_FILES and PLIST_DIRS. The value of each variable is regarded as a list of pathnames to write to TMPPLIST along with the PLIST contents. Names listed in PLIST_FILES and PLIST_DIRS are subject to %%VAR%% substitution, as described above. Except for that, names from PLIST_FILES will appear in the final packing list unchanged, while @dirrm will be prepended to names from PLIST_DIRS. To take effect, PLIST_FILES and PLIST_DIRS must be set before TMPPLIST is written, i.e. in pre-install or earlier. Empty directories Cleaning up empty directories Do make your ports remove empty directories when they are de-installed. This is usually accomplished by adding @dirrm lines for all directories that are specifically created by the port. You need to delete subdirectories before you can delete parent directories. : lib/X11/oneko/pixmaps/cat.xpm lib/X11/oneko/sounds/cat.au : @dirrm lib/X11/oneko/pixmaps @dirrm lib/X11/oneko/sounds @dirrm lib/X11/oneko However, sometimes @dirrm will give you errors because other ports share the same directory. You can call rmdir from @unexec to remove only empty directories without warning. @unexec rmdir %D/share/doc/gimp 2>/dev/null || true This will neither print any error messages nor cause &man.pkg.delete.1; to exit abnormally even if PREFIX/share/doc/gimp is not empty due to other ports installing some files in there. Creating empty directories Empty directories created during port installation need special attention. They will not get created when installing the package, because packages only store the files, and &man.pkg.add.1; creates directories for them as needed.
To make sure the empty directory is created when installing the package, add this line to pkg-plist above the corresponding @dirrm line:

@exec mkdir -p %D/share/foo/templates

Configuration files

If your port requires some configuration files in PREFIX/etc, do not just install them and list them in pkg-plist. That will cause &man.pkg.delete.1; to delete files carefully edited by the user, and a new installation to wipe them out.

Instead, install sample files with a suffix (filename.sample will work well) and copy the sample file to the real configuration file name if it does not exist. On deinstall, delete the configuration file, but only if it was not modified by the user. You need to handle this both in the port Makefile and in the pkg-plist (for installation from the package).

Example of the Makefile part:

post-install:
	@if [ ! -f ${PREFIX}/etc/orbit.conf ]; then \
		${CP} -p ${PREFIX}/etc/orbit.conf.sample ${PREFIX}/etc/orbit.conf ; \
	fi

Example of the pkg-plist part:

@unexec if cmp -s %D/etc/orbit.conf.sample %D/etc/orbit.conf; then rm -f %D/etc/orbit.conf; fi
etc/orbit.conf.sample
@exec if [ ! -f %D/etc/orbit.conf ] ; then cp -p %D/%F %B/orbit.conf; fi

Alternatively, print out a message pointing out that the user has to copy and edit the file before the software can be made to work.

Dynamic vs. static package list

A static package list is a package list which is available in the Ports Collection either as a pkg-plist file (with or without variable substitution), or embedded into the Makefile via PLIST_FILES and PLIST_DIRS. Even if the contents are auto-generated by a tool or a target in the Makefile before the inclusion into the Ports Collection by a committer, this is still considered a static list, since it is possible to examine it without having to download or compile the distfile.

A dynamic package list is a package list which is generated at the time the port is compiled, based upon the files and directories which are installed. It is not possible to examine it before the source code of the ported application is downloaded and compiled, or after running a make clean.

While the use of dynamic package lists is not forbidden, maintainers should use static package lists wherever possible, as it enables users to &man.grep.1; through available ports to discover, for example, which port installs a certain file. Dynamic lists should be primarily used for complex ports where the package list changes drastically based upon optional features of the port (and thus maintaining a static package list is infeasible), or ports which change the package list based upon the version of dependent software used (e.g. ports which generate docs with Javadoc). Maintainers who prefer dynamic package lists are encouraged to add a new target to their port which generates the pkg-plist file, so that users may examine the contents.

Automated package list creation

First, make sure your port is almost complete, with only pkg-plist missing. Next, create a temporary directory tree into which your port can be installed, and install any dependencies. port-type should be local for non-X ports, and x11-4 or x11 for ports which install into the directory hierarchy of XFree86 4 or an earlier XFree86 release, respectively.

&prompt.root; mkdir /var/tmp/port-name
&prompt.root; mtree -U -f /etc/mtree/BSD.port-type.dist -d -e -p /var/tmp/port-name
&prompt.root; make depends PREFIX=/var/tmp/port-name

Store the directory structure in a new file.
&prompt.root; (cd /var/tmp/port-name && find -d * -type d) | sort > OLD-DIRS

Create an empty pkg-plist file:

&prompt.root; touch pkg-plist

If your port honors PREFIX (which it should), you can then install the port and create the package list.

&prompt.root; make install PREFIX=/var/tmp/port-name
&prompt.root; (cd /var/tmp/port-name && find -d * \! -type d) | sort > pkg-plist

You must also add any newly created directories to the packing list.

&prompt.root; (cd /var/tmp/port-name && find -d * -type d) | sort | comm -13 OLD-DIRS - | sort -r | sed -e 's#^#@dirrm #' >> pkg-plist

Finally, you need to tidy up the packing list by hand; it is not all automated. Manual pages should be listed in the port's Makefile under MANn, and not in the package list. User configuration files should be removed, or installed as filename.sample. The info/dir file should not be listed, and appropriate install-info lines should be added as noted in the info files section. Any libraries installed by the port should be listed as specified in the shared libraries section.

Alternatively, use the plist script in /usr/ports/Tools/scripts/ to build the package list automatically. The first step is the same as above: take the first three lines, that is, mkdir, mtree and make depends. Then build and install the port:

&prompt.root; make install PREFIX=/var/tmp/port-name

And let plist create the pkg-plist file:

&prompt.root; /usr/ports/Tools/scripts/plist -Md -m /etc/mtree/BSD.port-type.dist /var/tmp/port-name > pkg-plist

The packing list still has to be tidied up by hand as stated above.

The <filename>pkg-<replaceable>*</replaceable></filename> files

There are some tricks we have not mentioned yet about the pkg-* files that come in handy sometimes.

<filename>pkg-message</filename>

If you need to display a message to the installer, you may place the message in pkg-message. This capability is often useful to display additional installation steps to be taken after a &man.pkg.add.1;, or to display licensing information. The pkg-message file does not need to be added to pkg-plist. Also, it will not get automatically printed if the user is using the port, not the package, so you should probably display it from the post-install target yourself.

<filename>pkg-install</filename>

If your port needs to execute commands when the binary package is installed with &man.pkg.add.1;, you can do this via the pkg-install script. This script will automatically be added to the package, and will be run twice by &man.pkg.add.1;: the first time as ${SH} pkg-install ${PKGNAME} PRE-INSTALL and the second time as ${SH} pkg-install ${PKGNAME} POST-INSTALL. $2 can be tested to determine which mode the script is being run in. The PKG_PREFIX environment variable will be set to the package installation directory. See &man.pkg.add.1; for additional information.

This script is not run automatically if you install the port with make install. If you are depending on it being run, you will have to explicitly call it from your port's Makefile, with a line like PKG_PREFIX=${PREFIX} ${SH} ${PKGINSTALL} ${PKGNAME} PRE-INSTALL.

<filename>pkg-deinstall</filename>

This script executes when a package is removed. It will be run twice by &man.pkg.delete.1;: the first time as ${SH} pkg-deinstall ${PKGNAME} DEINSTALL and the second time as ${SH} pkg-deinstall ${PKGNAME} POST-DEINSTALL.

<filename>pkg-req</filename>

If your port needs to determine whether it should install or not, you can create a pkg-req requirements script.
It will be invoked automatically at installation/de-installation time to determine whether or not installation/de-installation should proceed. The script will be run at installation time by &man.pkg.add.1; as pkg-req ${PKGNAME} INSTALL. At de-installation time it will be run by &man.pkg.delete.1; as pkg-req ${PKGNAME} DEINSTALL.

Changing the names of <filename>pkg-<replaceable>*</replaceable></filename> files

All the names of pkg-* files are defined using variables, so you can change them in your Makefile if need be. This is especially useful when you are sharing the same pkg-* files among several ports or have to write to one of the above files (see writing to places other than WRKDIR for why it is a bad idea to write directly into the pkg-* subdirectory). Here is a list of variable names and their default values. (PKGDIR defaults to ${MASTERDIR}.)

Variable	Default value
DESCR	${PKGDIR}/pkg-descr
PLIST	${PKGDIR}/pkg-plist
PKGINSTALL	${PKGDIR}/pkg-install
PKGDEINSTALL	${PKGDIR}/pkg-deinstall
PKGREQ	${PKGDIR}/pkg-req
PKGMESSAGE	${PKGDIR}/pkg-message

Please change these variables rather than overriding PKG_ARGS. If you change PKG_ARGS, those files will not correctly be installed in /var/db/pkg upon install from a port.

Making use of <makevar>SUB_FILES</makevar> and <makevar>SUB_LIST</makevar>

The SUB_FILES and SUB_LIST variables are useful for dynamic values in port files, such as the installation PREFIX in pkg-message.

The SUB_FILES variable specifies a list of files to be automatically modified. Each file in the SUB_FILES list must have a corresponding file.in present in FILESDIR. A modified version will be created in WRKDIR. Files defined as a value of USE_RC_SUBR and USE_RCORDER are automatically added to SUB_FILES. For the files pkg-message, pkg-install, pkg-deinstall and pkg-req, the corresponding Makefile variable is automatically set to point to the processed version.

The SUB_LIST variable is a list of VAR=VALUE pairs. For each pair, %%VAR%% will get replaced with VALUE in each file listed in SUB_FILES. Several common pairs are automatically defined: PREFIX, LOCALBASE, X11BASE, DATADIR, DOCSDIR, EXAMPLESDIR. Any line beginning with @comment will be deleted from the resulting files after the variable substitution.

The following example will replace %%ARCH%% with the system architecture in a pkg-message:

SUB_FILES=	pkg-message
SUB_LIST=	ARCH=${ARCH}

Note that for this example, the pkg-message.in file must exist in FILESDIR. Example of a good pkg-message.in:

Now it's time to configure this package.
Copy %%PREFIX%%/share/examples/putsy/%%ARCH%%.conf into your home directory
as .putsy.conf and edit it.

Testing your port

Running <command>make describe</command>

Several of the &os; port maintenance tools, such as &man.portupgrade.1;, rely on a database called /usr/ports/INDEX which keeps track of such items as port dependencies. INDEX is created by the top-level ports/Makefile via make index, which descends into each port subdirectory and executes make describe there. Thus, if make describe fails in any port, no one can generate INDEX, and many people will quickly become unhappy.

It is important to be able to generate this file no matter what options are present in make.conf, so please avoid doing things such as using .error statements when (for instance) a dependency is not satisfied. If make describe produces a string rather than an error message, you are probably safe. See bsd.port.mk for the meaning of the string produced.
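For instance, you can run the check by hand from the port's own directory before submitting it (a quick sketch; category/portname is a placeholder for the port being tested):

&prompt.user; cd /usr/ports/category/portname
&prompt.user; make describe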
Also note that running a recent version of portlint (as specified in the next section) will cause make describe to be run automatically.

Portlint

Do check your work with portlint before you submit or commit it. portlint warns you about many common errors, both functional and stylistic. For a new (or repocopied) port, portlint -A is the most thorough; for an existing port, portlint -C is sufficient.

Since portlint uses heuristics to try to figure out errors, it can produce false positive warnings. In addition, occasionally something that is flagged as a problem really cannot be done in any other way due to limitations in the ports framework. When in doubt, the best thing to do is ask on &a.ports;.

<makevar>PREFIX</makevar>

Do try to make your port install relative to PREFIX. The value of this variable will be set to LOCALBASE (default /usr/local). If USE_X_PREFIX or USE_IMAKE is set, PREFIX will be X11BASE (default /usr/X11R6). If USE_LINUX_PREFIX is set, PREFIX will be LINUXBASE (default /compat/linux).

Avoiding the hard-coding of /usr/local or /usr/X11R6 anywhere in the source will make the port much more flexible and able to cater to the needs of other sites. For X ports that use imake, this is automatic; otherwise, this can often be done by simply replacing the occurrences of /usr/local (or /usr/X11R6 for X ports that do not use imake) in the various scripts/Makefiles in the port to read ${PREFIX}, as this variable is automatically passed down to every stage of the build and install processes.

Make sure your application is not installing things in /usr/local instead of PREFIX. A quick test for this is:

&prompt.root; make clean; make package PREFIX=/var/tmp/port-name

If anything is installed outside of PREFIX, the package creation process will complain that it cannot find the files. This does not test for the existence of internal references, or correct use of LOCALBASE for references to files from other ports; testing the installation in /var/tmp/port-name while you have the port installed there would catch those.

Do not set USE_X_PREFIX unless your port truly requires it (i.e., it links against X libs or it needs to reference files in X11BASE).

The variable PREFIX can be reassigned in your Makefile or in the user's environment. However, it is strongly discouraged for individual ports to set this variable explicitly in the Makefiles.

Also, refer to programs/files from other ports with the variables mentioned above, not explicit pathnames. For instance, if your port requires a macro PAGER to be the full pathname of less, use the compiler flag:

-DPAGER=\"${LOCALBASE}/bin/less\"

instead of -DPAGER=\"/usr/local/bin/less\". This way it will have a better chance of working if the system administrator has moved the whole /usr/local tree somewhere else.

Upgrading

When you notice that a port is out of date compared to the latest version from the original authors, you should first ensure that you have the latest port. You can find it in the ports/ports-current directory of the &os; FTP mirror sites. However, if you are working with more than a few ports, you will probably find it easier to use CVSup to keep your whole ports collection up-to-date, as described in the Handbook. This will have the added benefit of tracking all the ports' dependencies.

The next step is to see if there is an update already pending. To do this, you have two options. There is a searchable interface to the FreeBSD Problem Report (PR) database (also known as GNATS).
Select ports in the dropdown, and enter the name of the port. However, sometimes people forget to put the name of the port into the Synopsis field in an unambiguous fashion. In that case, you can try the FreeBSD Ports Monitoring System (also known as portsmon). This system attempts to classify port PRs by portname. To search for PRs about a particular port, use the Overview of One Port.

If there is no pending PR, the next step is to send an email to the port's maintainer, as shown by make maintainer. That person may already be working on an upgrade, or have a reason to not upgrade the port right now (because of, for example, stability problems of the new version); you would not want to duplicate their work. Note that unmaintained ports are listed with a maintainer of ports@FreeBSD.org, which is just the general ports mailing list, so sending mail there probably will not help in this case.

If the maintainer asks you to do the upgrade, or there is no maintainer, then you have a chance to help out &os; by preparing the update yourself! Please make the changes and save the result of the recursive diff of the new and old ports directories (e.g., if your modified port directory is called superedit and the original is in our tree as superedit.bak, then save the result of diff -ruN superedit.bak superedit). Either unified or context diff is fine, but port committers generally prefer unified diffs. Note the use of the -N option; this is the accepted way to force diff to properly deal with the case of new files being added or old files being deleted. Before sending us the diff, please examine the output to make sure all the changes make sense.

To simplify common operations with patch files, you can use /usr/ports/Tools/scripts/patchtool.py. Before using it, please read /usr/ports/Tools/scripts/README.patchtool.

If the port is unmaintained, and you are actively using it yourself, please consider volunteering to become its maintainer. &os; has over 2000 ports without maintainers, and this is an area where more volunteers are always needed. (For a detailed description of the responsibilities of maintainers, refer to the MAINTAINER on Makefiles section.)

The best way to send us the diff is by including it via &man.send-pr.1; (category ports). If you are volunteering to maintain the port, be sure to put [maintainer update] at the beginning of your synopsis line and set the Class of your PR to maintainer-update. Otherwise, the Class of your PR should be change-request. Please mention any added or deleted files in the message, as they have to be explicitly specified to &man.cvs.1; when doing a commit. If the diff is more than about 20KB, please compress and uuencode it; otherwise, just include it in the PR as is.

Before you &man.send-pr.1;, you should review the Writing the problem report section in the Problem Reports article; it contains far more information about how to write useful problem reports.

If your upgrade is motivated by security concerns or a serious fault in the currently committed port, please notify the &a.portmgr; to request immediate rebuilding and redistribution of your port's package. Unsuspecting users of &man.pkg.add.1; will otherwise continue to install the old version via pkg_add -r for several weeks.

Once again, please use &man.diff.1; and not &man.shar.1; to send updates to existing ports! Now that you have done all that, you will want to read about how to keep up-to-date.

Ports security

Why security is so important

Bugs are occasionally introduced into software.
Arguably, the most dangerous of them are those opening security vulnerabilities. From the technical viewpoint, such vulnerabilities are to be closed by exterminating the bugs that caused them. However, the policies for handling mere bugs and security vulnerabilities are very different.

A typical small bug affects only those users who have enabled some combination of options triggering the bug. The developer will eventually release a patch followed by a new version of the software, free of the bug, but the majority of users will not take the trouble of upgrading immediately because the bug has never vexed them. A critical bug that may cause data loss represents a graver issue. Nevertheless, prudent users know that a lot of possible accidents, besides software bugs, are likely to lead to data loss, and so they make backups of important data; in addition, a critical bug will be discovered really soon.

A security vulnerability is altogether different. First, it may remain unnoticed for years because often it does not cause software malfunction. Second, a malicious party can use it to gain unauthorized access to a vulnerable system, to destroy or alter sensitive data; and in the worst case the user will not even notice the harm caused. Third, exposing a vulnerable system often assists attackers to break into other systems that could not be compromised otherwise. Therefore closing a vulnerability alone is not enough: the audience should be notified of it in the most clear and comprehensive manner, which will allow them to evaluate the danger and take appropriate action.

Fixing security vulnerabilities

While on the subject of ports and packages, a security vulnerability may initially appear in the original distribution or in the port files. In the former case, the original software developer is likely to release a patch or a new version instantly, and you will only need to update the port promptly with respect to the author's fix. If the fix is delayed for some reason, you should either mark the port as FORBIDDEN or introduce a patch file of your own to the port. In the case of a vulnerable port, just fix the port as soon as possible. In either case, the standard procedure for submitting your change should be followed, unless you have rights to commit it directly to the ports tree.

Being a ports committer is not enough to commit to an arbitrary port. Remember that ports usually have maintainers, whom you should respect.

Please make sure that the port's revision is bumped as soon as the vulnerability has been closed. That is how the users who upgrade installed packages on a regular basis will see they need to run an update. Besides, a new package will be built and distributed over FTP and WWW mirrors, replacing the vulnerable one. PORTREVISION should be bumped unless PORTVERSION has changed in the course of correcting the vulnerability. That is, you should bump PORTREVISION if you have added a patch file to the port, but you should not if you have updated the port to the latest software version and thus already touched PORTVERSION. Please refer to the corresponding section for more information.

Keeping the community informed

The VuXML database

A very important and urgent step to take as early as a security vulnerability is discovered is to notify the community of port users about the jeopardy. Such notification serves two purposes.
First, should the danger be really severe, it will be wise to apply an instant workaround, e.g., stop the affected network service or even deinstall the port completely, until the vulnerability is closed. Second, a lot of users tend to upgrade installed packages just occasionally. They will know from the notification that they must update the package without delay as soon as a corrected version is available.

Given the huge number of ports in the tree, a security advisory cannot be issued on each incident without creating a flood and losing the attention of the audience by the time it comes to really serious matters. Therefore security vulnerabilities found in ports are recorded in the FreeBSD VuXML database. The Security Officer Team members monitor it for issues requiring their intervention.

If you have committer rights, you can update the VuXML database by yourself. That way you will both help the Security Officer Team and deliver the crucial information to the community earlier. However, if you are not a committer, or you believe you have found an exceptionally severe vulnerability, please do not hesitate to contact the Security Officer Team directly, as described on the FreeBSD Security Information page.

All right, you elected the hard way. As may be obvious from its title, the VuXML database is essentially an XML document. Its source file vuln.xml is kept right inside the port security/vuxml. Therefore the file's full pathname will be PORTSDIR/security/vuxml/vuln.xml. Each time you discover a security vulnerability in a port, please add an entry for it to that file. Until you are familiar with VuXML, the best thing you can do is to find an existing entry fitting your case, then copy it and use it as a template.

A short introduction to VuXML

The full-blown XML is complex, and far beyond the scope of this book. However, to gain basic insight on the structure of a VuXML entry, you need only the notion of tags. XML tag names are enclosed in angle brackets. Each opening <tag> must have a matching closing </tag>. Tags may be nested. If nesting, the inner tags must be closed before the outer ones. There is a hierarchy of tags, i.e. more complex rules of nesting them. Sounds very similar to HTML, doesn't it? The major difference is that XML is eXtensible, i.e. based on defining custom tags. Due to its intrinsic structure, XML puts otherwise amorphous data into shape. VuXML is particularly tailored to mark up descriptions of security vulnerabilities.

Now let's consider a realistic VuXML entry:

<vuln vid="f4bc80f4-da62-11d8-90ea-0004ac98a7b9">
  <topic>Several vulnerabilities found in Foo</topic>
  <affects>
    <package>
      <name>foo</name>
      <name>foo-devel</name>
      <name>ja-foo</name>
      <range><ge>1.6</ge><lt>1.9</lt></range>
      <range><ge>2.*</ge><lt>2.4_1</lt></range>
      <range><eq>3.0b1</eq></range>
    </package>
    <package>
      <name>openfoo</name>
      <range><lt>1.10_7</lt></range>
      <range><ge>1.2,1</ge><lt>1.3_1,1</lt></range>
    </package>
  </affects>
  <description>
    <body xmlns="http://www.w3.org/1999/xhtml">
      <p>J. Random Hacker reports:</p>
      <blockquote cite="http://j.r.hacker.com/advisories/1">
        <p>Several issues in the Foo software may be exploited
          via carefully crafted QUUX requests.
          These requests will permit the injection of Bar code,
          mumble theft, and the readability of the Foo
          administrator account.</p>
      </blockquote>
    </body>
  </description>
  <references>
    <freebsdsa>SA-10:75.foo</freebsdsa>
    <freebsdpr>ports/987654</freebsdpr>
    <cvename>CAN-2010-0201</cvename>
    <cvename>CAN-2010-0466</cvename>
    <bid>96298</bid>
    <certsa>CA-2010-99</certsa>
    <certvu>740169</certvu>
    <uscertsa>SA10-99A</uscertsa>
    <uscertta>SA10-99A</uscertta>
    <mlist msgid="201075606@hacker.com">http://marc.theaimsgroup.com/?l=bugtraq&amp;m=203886607825605</mlist>
    <url>http://j.r.hacker.com/advisories/1</url>
  </references>
  <dates>
    <discovery>2010-05-25</discovery>
    <entry>2010-07-13</entry>
    <modified>2010-09-17</modified>
  </dates>
</vuln>

The tag names are supposed to be self-descriptive, so we shall take a closer look only at fields you will need to fill in by yourself:

<vuln>: This is the top-level tag of a VuXML entry. It has a mandatory attribute, vid, specifying a universally unique identifier (UUID) for this entry (in quotes). You should generate a UUID for each new VuXML entry (and do not forget to substitute it for the template UUID unless you are writing the entry from scratch). You can use &man.uuidgen.1; in FreeBSD 5.x, or you may install the port devel/p5-Data-UUID and issue the following command:

perl -MData::UUID -le 'print lc new Data::UUID->create_str'

<topic>: This is a one-line description of the issue found.

<name>: The names of the packages affected are listed there. Multiple names can be given, since several packages may be based on a single master port or software product. This may include stable and development branches, localized versions, and slave ports featuring different choices of important build-time configuration options. It is your responsibility to find all such related packages when writing a VuXML entry. Keep in mind that make search name=foo is your friend. The primary points to look for are: the foo-devel variant for a foo port; other variants with a suffix like -a4 (for print-related packages), -without-gui (for packages with X support disabled), or similar; and jp-, ru-, zh-, and other possible localized variants in the corresponding national categories of the ports collection.

<range>: Affected versions of the package(s) are specified there as one or more ranges using a combination of <lt>, <le>, <eq>, <ge>, and <gt> elements. The version ranges given should not overlap. In a range specification, * (asterisk) denotes the smallest version number. In particular, 2.* is less than 2.a. Therefore an asterisk may be used for a range to match all possible alpha, beta, and RC versions. For instance, <ge>2.*</ge><lt>3.*</lt> will selectively match every 2.x version, while <ge>2.0</ge><lt>3.0</lt> will obviously not, since the latter misses 2.r3 and matches 3.b. The above example specifies that affected are versions from 1.6 to 1.9 inclusive, versions 2.x before 2.4_1, and version 3.0b1.

<package>: Several related package groups (essentially, ports) can be listed in the <affects> section. This can be used if several software products (say FooBar, FreeBar and OpenBar) grow from the same code base and still share its bugs and vulnerabilities. Note the difference from listing multiple names within a single <package> section. The version ranges should allow for PORTEPOCH and PORTREVISION if applicable. Please remember that according to the collation rules, a version with a non-zero PORTEPOCH is greater than any version without PORTEPOCH, e.g., 3.0,1 is greater than 3.1 or even than 8.9.

<description>: This is a summary of the issue.
XHTML is used in this field. At least enclosing <p> and </p> should appear. More complex mark-up may be used, but only for the sake of accuracy and clarity: no eye candy, please.

<references>: This section contains references to relevant documents. As many references as apply are encouraged.

<freebsdsa>: This is a FreeBSD security advisory.
<freebsdpr>: This is a FreeBSD problem report.
<cvename>: This is a Mitre CVE identifier.
<bid>: This is a SecurityFocus Bug ID.
<certsa>: This is a US-CERT security advisory.
<certvu>: This is a US-CERT vulnerability note.
<uscertsa>: This is a US-CERT Cyber Security Alert.
<uscertta>: This is a US-CERT Technical Cyber Security Alert.
<mlist>: This is a URL to an archived posting in a mailing list. The attribute msgid is optional and may specify the message ID of the posting.
<url>: This is a generic URL. It should be used only if none of the other reference categories apply.

<discovery>: This is the date when the issue was disclosed (YYYY-MM-DD).
<entry>: This is the date when the entry was added (YYYY-MM-DD).
<modified>: This is the date when any information in the entry was last modified (YYYY-MM-DD). New entries must not include this field. It should be added upon editing an existing entry.

Testing your changes to the VuXML database

Assume you just wrote or filled in an entry for a vulnerability in the package clamav that has been fixed in version 0.65_7. As a prerequisite, you need to install fresh versions of the ports security/portaudit and security/portaudit-db.

First, check whether there already is an entry for this vulnerability. If there were such an entry, it would match the previous version of the package, 0.65_6:

&prompt.user; packaudit
&prompt.user; portaudit clamav-0.65_6

To run packaudit, you must have permission to write to its DATABASEDIR, typically /var/db/portaudit.

If none is found, you get the green light to add a new entry for this vulnerability. Now you can generate a brand-new UUID (assume it is 74a9541d-5d6c-11d8-80e3-0020ed76ef5a) and add your new entry to the VuXML database. Please verify its syntax after that as follows:

&prompt.user; cd ${PORTSDIR}/security/vuxml && make validate

You will need at least one of the following packages installed: textproc/libxml2, textproc/jade.

Now rebuild the portaudit database from the VuXML file:

&prompt.user; packaudit

To verify that the <affects> section of your entry will match the correct package(s), issue the following command:

&prompt.user; portaudit -f /usr/ports/INDEX -r 74a9541d-5d6c-11d8-80e3-0020ed76ef5a

Please refer to &man.portaudit.1; for a better understanding of the command syntax. Make sure that your entry produces no spurious matches in the output.

Now check whether the right package versions are matched by your entry:

&prompt.user; portaudit clamav-0.65_6 clamav-0.65_7
Affected package: clamav-0.65_6 (matched by clamav<0.65_7)
Type of problem: clamav remote denial-of-service.
Reference: <http://www.freebsd.org/ports/portaudit/74a9541d-5d6c-11d8-80e3-0020ed76ef5a.html>

1 problem(s) found.

Obviously, the former version should match while the latter one should not. Finally, verify whether the web page generated from the VuXML database looks as expected:

&prompt.user; mkdir -p ~/public_html/portaudit
&prompt.user; packaudit
&prompt.user; lynx ~/public_html/portaudit/74a9541d-5d6c-11d8-80e3-0020ed76ef5a.html

If VuXML still scares you...
As an easy alternative to writing VuXML, you may opt to add a single line to a different file with a much simpler syntax, PORTSDIR/security/portaudit-db/database/portaudit.txt, which resides within the port security/portaudit-db, and send a request for review to the Security Officer Team as described on the FreeBSD Security Information page.

A line in that file consists of four fields separated by |, a pipe character. The first field is a &man.pkg.version.1; pattern expression matching the vulnerable packages. The second field contains URLs to relevant information, separated by space characters. The third field is a one-line description of the issue. The fourth and last field is the entry's UUID. You may want to take a closer look at existing entries in portaudit.txt before adding your first line to that file.

Dos and Don'ts

Introduction

Here is a list of common dos and don'ts that you encounter during the porting process. You should check your own port against this list, but you can also check ports in the PR database that others have submitted. Submit any comments on ports you check as described in Bug Reports and General Commentary. Checking ports in the PR database will both make it faster for us to commit them, and prove that you know what you are doing.

Stripping Binaries

Do not strip binaries manually unless you have to. All binaries should be stripped, but the INSTALL_PROGRAM macro will install and strip a binary at the same time (see the next section). If you need to strip a file, but do not wish to use the INSTALL_PROGRAM macro, ${STRIP_CMD} will strip your program. This is typically done within the post-install target. For example:

post-install:
	${STRIP_CMD} ${PREFIX}/bin/xdl

Use the &man.file.1; command on the installed executable to check whether the binary is stripped or not. If it does not say not stripped, it is stripped. Additionally, &man.strip.1; will not strip a previously stripped program; it will instead exit cleanly.

INSTALL_* macros

Do use the macros provided in bsd.port.mk to ensure correct modes and ownership of files in your own *-install targets:

INSTALL_PROGRAM is a command to install binary executables.
INSTALL_SCRIPT is a command to install executable scripts.
INSTALL_DATA is a command to install sharable data.
INSTALL_MAN is a command to install manpages and other documentation (it does not compress anything).

These are basically the install command with all the appropriate flags. See below for an example on how to use them.

<makevar>WRKDIR</makevar>

Do not write anything to files outside WRKDIR. WRKDIR is the only place that is guaranteed to be writable during the port build (see installing ports from a CDROM for an example of building ports from a read-only tree). If you need to modify one of the pkg-* files, do so by redefining a variable, not by writing over it.

<makevar>WRKDIRPREFIX</makevar>

Make sure your port honors WRKDIRPREFIX. Most ports do not have to worry about this. In particular, if you are referring to a WRKDIR of another port, note that the correct location is ${WRKDIRPREFIX}${PORTSDIR}/subdir/name/work, not ${PORTSDIR}/subdir/name/work or ${.CURDIR}/../../subdir/name/work or some such. Also, if you are defining WRKDIR yourself, make sure you prepend ${WRKDIRPREFIX}${.CURDIR}.

Differentiating operating systems and OS versions

You may come across code that needs modifications or conditional compilation based upon what version of Unix it is running under.
If you need to make such changes to the code for conditional compilation, make sure you make the changes as general as possible, so that we can back-port code to older FreeBSD systems and cross-port to other BSD systems such as 4.4BSD from CSRG, BSD/386, 386BSD, NetBSD, and OpenBSD.

The preferred way to tell 4.3BSD/Reno (1990) and newer versions of the BSD code apart is by using the BSD macro defined in sys/param.h. Hopefully that file is already included; if not, add the code:

#if (defined(__unix__) || defined(unix)) && !defined(USG)
#include <sys/param.h>
#endif

to the proper place in the .c file. We believe that every system that defines these two symbols has sys/param.h. If you find a system that does not, we would like to know. Please send mail to the &a.ports;.

Another way is to use the GNU Autoconf style of doing this:

#ifdef HAVE_SYS_PARAM_H
#include <sys/param.h>
#endif

Do not forget to add -DHAVE_SYS_PARAM_H to the CFLAGS in the Makefile for this method.

Once you have sys/param.h included, you may use:

#if (defined(BSD) && (BSD >= 199103))

to detect if the code is being compiled on a 4.3 Net2 code base or newer (e.g. FreeBSD 1.x, 4.3/Reno, NetBSD 0.9, 386BSD, BSD/386 1.1 and below). Use:

#if (defined(BSD) && (BSD >= 199306))

to detect if the code is being compiled on a 4.4 code base or newer (e.g. FreeBSD 2.x, 4.4, NetBSD 1.0, BSD/386 2.0 or above).

The value of the BSD macro is 199506 for the 4.4BSD-Lite2 code base. This is stated for informational purposes only. It should not be used to distinguish between versions of FreeBSD based only on 4.4-Lite vs. versions that have merged in changes from 4.4-Lite2. The __FreeBSD__ macro should be used instead.

Use sparingly:

__FreeBSD__ is defined in all versions of FreeBSD. Use it if the change you are making only affects FreeBSD. Porting gotchas like the use of sys_errlist[] vs strerror() are Berkeley-isms, not FreeBSD changes.

In FreeBSD 2.x, __FreeBSD__ is defined to be 2. In earlier versions, it is 1. Later versions always bump it to match their major version number.

If you need to tell the difference between a FreeBSD 1.x system and a FreeBSD 2.x or above system, usually the right answer is to use the BSD macros described above. If there actually is a FreeBSD specific change (such as special shared library options when using ld), then it is OK to use __FreeBSD__ and #if __FreeBSD__ > 1 to detect a FreeBSD 2.x and later system. If you need more granularity in detecting FreeBSD systems since 2.0-RELEASE, you can use the following:

#if __FreeBSD__ >= 2
#include <osreldate.h>
#  if __FreeBSD_version >= 199504
	/* 2.0.5+ release specific code here */
#  endif
#endif

In the hundreds of ports that have been done, there have only been one or two cases where __FreeBSD__ should have been used. Just because an earlier port screwed up and used it in the wrong place does not mean you should do so too.
__FreeBSD_version values

Here is a convenient list of __FreeBSD_version values as defined in sys/param.h:

Release  __FreeBSD_version

2.0-RELEASE  119411
2.1-CURRENT  199501, 199503
2.0.5-RELEASE  199504
2.2-CURRENT before 2.1  199508
2.1.0-RELEASE  199511
2.2-CURRENT before 2.1.5  199512
2.1.5-RELEASE  199607
2.2-CURRENT before 2.1.6  199608
2.1.6-RELEASE  199612
2.1.7-RELEASE  199612
2.2-RELEASE  220000
2.2.1-RELEASE  220000 (no change)
2.2-STABLE after 2.2.1-RELEASE  220000 (no change)
2.2-STABLE after texinfo-3.9  221001
2.2-STABLE after top  221002
2.2.2-RELEASE  222000
2.2-STABLE after 2.2.2-RELEASE  222001
2.2.5-RELEASE  225000
2.2-STABLE after 2.2.5-RELEASE  225001
2.2-STABLE after ldconfig -R merge  225002
2.2.6-RELEASE  226000
2.2.7-RELEASE  227000
2.2-STABLE after 2.2.7-RELEASE  227001
2.2-STABLE after &man.semctl.2; change  227002
2.2.8-RELEASE  228000
2.2-STABLE after 2.2.8-RELEASE  228001
3.0-CURRENT before &man.mount.2; change  300000
3.0-CURRENT after &man.mount.2; change  300001
3.0-CURRENT after &man.semctl.2; change  300002
3.0-CURRENT after ioctl arg changes  300003
3.0-CURRENT after ELF conversion  300004
3.0-RELEASE  300005
3.0-CURRENT after 3.0-RELEASE  300006
3.0-STABLE after 3/4 branch  300007
3.1-RELEASE  310000
3.1-STABLE after 3.1-RELEASE  310001
3.1-STABLE after C++ constructor/destructor order change  310002
3.2-RELEASE  320000
3.2-STABLE  320001
3.2-STABLE after binary-incompatible IPFW and socket changes  320002
3.3-RELEASE  330000
3.3-STABLE  330001
3.3-STABLE after adding &man.mkstemp.3; to libc  330002
3.4-RELEASE  340000
3.4-STABLE  340001
3.5-RELEASE  350000
3.5-STABLE  350001
4.0-CURRENT after 3.4 branch  400000
4.0-CURRENT after change in dynamic linker handling  400001
4.0-CURRENT after C++ constructor/destructor order change  400002
4.0-CURRENT after functioning &man.dladdr.3;  400003
4.0-CURRENT after __deregister_frame_info dynamic linker bug fix (also 4.0-CURRENT after EGCS 1.1.2 integration)  400004
4.0-CURRENT after &man.suser.9; API change (also 4.0-CURRENT after newbus)  400005
4.0-CURRENT after cdevsw registration change  400006
4.0-CURRENT after the addition of so_cred for socket level credentials  400007
4.0-CURRENT after the addition of a poll syscall wrapper to libc_r  400008
4.0-CURRENT after the change of the kernel's dev_t type to struct specinfo pointer  400009
4.0-CURRENT after fixing a hole in &man.jail.2;  400010
4.0-CURRENT after the sigset_t datatype change  400011
4.0-CURRENT after the cutover to the GCC 2.95.2 compiler  400012
4.0-CURRENT after adding pluggable linux-mode ioctl handlers  400013
4.0-CURRENT after importing OpenSSL  400014
4.0-CURRENT after the C++ ABI change in GCC 2.95.2 from -fvtable-thunks to -fno-vtable-thunks by default  400015
4.0-CURRENT after importing OpenSSH  400016
4.0-RELEASE  400017
4.0-STABLE after 4.0-RELEASE  400018
4.0-STABLE after the introduction of delayed checksums.  400019
4.0-STABLE after merging libxpg4 code into libc.  400020
4.0-STABLE after upgrading Binutils to 2.10.0, ELF branding changes, and tcsh in the base system.  400021
4.1-RELEASE  410000
4.1-STABLE after 4.1-RELEASE  410001
4.1-STABLE after &man.setproctitle.3; moved from libutil to libc.  410002
4.1.1-RELEASE  411000
4.1.1-STABLE after 4.1.1-RELEASE  411001
4.2-RELEASE  420000
4.2-STABLE after combining libgcc.a and libgcc_r.a, and associated GCC linkage changes.  420001
4.3-RELEASE  430000
4.3-STABLE after wint_t introduction.  430001
4.3-STABLE after PCI powerstate API merge.  430002
4.4-RELEASE  440000
4.4-STABLE after d_thread_t introduction.  440001
4.4-STABLE after mount structure changes (affects filesystem klds).  440002
4.4-STABLE after the userland components of smbfs were imported.  440003
4.5-RELEASE  450000
4.5-STABLE after the usb structure element rename.  450001
4.5-STABLE after the sendmail_enable &man.rc.conf.5; variable was made to take the value NONE.  450004
4.5-STABLE after moving to XFree86 4 by default for package builds.  450005
4.5-STABLE after accept filtering was fixed so that it is no longer susceptible to an easy DoS.  450006
4.6-RELEASE  460000
4.6-STABLE &man.sendfile.2; fixed to comply with documentation, not to count any headers sent against the amount of data to be sent from the file.  460001
4.6.2-RELEASE  460002
4.6-STABLE  460100
4.6-STABLE after MFC of `sed -i'.  460101
4.6-STABLE after MFC of many new pkg_install features from the HEAD.  460102
4.7-RELEASE  470000
4.7-STABLE  470100
Start generated __std{in,out,err}p references rather than __sF. This changes std{in,out,err} from a compile time expression to a runtime one.  470101
4.7-STABLE after MFC of mbuf changes to replace m_aux mbufs by m_tag's  470102
4.7-STABLE gets OpenSSL 0.9.7  470103
4.8-RELEASE  480000
4.8-STABLE  480100
4.8-STABLE after &man.realpath.3; has been made thread-safe  480101
4.8-STABLE 3ware API changes to twe.  480102
4.9-RELEASE  490000
4.9-STABLE  490100
4.9-STABLE after e_sid was added to struct kinfo_eproc.  490101
4.9-STABLE after MFC of libmap functionality for rtld.  490102
4.10-RELEASE  491000
4.10-STABLE  491100
4.10-STABLE after MFC of revision 20040629 of the package tools  491101
4.10-STABLE after VM fix dealing with unwiring of fictitious pages  491102
4.11-RELEASE  492000
4.11-STABLE  492100
5.0-CURRENT  500000
5.0-CURRENT after adding additional ELF header fields, and changing our ELF binary branding method.  500001
5.0-CURRENT after kld metadata changes.  500002
5.0-CURRENT after buf/bio changes.  500003
5.0-CURRENT after binutils upgrade.  500004
5.0-CURRENT after merging libxpg4 code into libc and after TASKQ interface introduction.  500005
5.0-CURRENT after the addition of AGP interfaces.  500006
5.0-CURRENT after Perl upgrade to 5.6.0  500007
5.0-CURRENT after the update of KAME code to 2000/07 sources.  500008
5.0-CURRENT after ether_ifattach() and ether_ifdetach() changes.  500009
5.0-CURRENT after changing mtree defaults back to original variant, adding -L to follow symlinks.  500010
5.0-CURRENT after kqueue API changed.  500011
5.0-CURRENT after &man.setproctitle.3; moved from libutil to libc.  500012
5.0-CURRENT after the first SMPng commit.  500013
5.0-CURRENT after <sys/select.h> moved to <sys/selinfo.h>.  500014
5.0-CURRENT after combining libgcc.a and libgcc_r.a, and associated GCC linkage changes.  500015
5.0-CURRENT after change allowing libc and libc_r to be linked together, deprecating -pthread option.  500016
5.0-CURRENT after switch from struct ucred to struct xucred to stabilize kernel-exported API for mountd et al.  500017
5.0-CURRENT after addition of CPUTYPE make variable for controlling CPU-specific optimizations.  500018
5.0-CURRENT after moving machine/ioctl_fd.h to sys/fdcio.h  500019
5.0-CURRENT after locale names renaming.  500020
5.0-CURRENT after Bzip2 import. Also signifies removal of S/Key.  500021
5.0-CURRENT after SSE support.  500022
5.0-CURRENT after KSE Milestone 2.  500023
5.0-CURRENT after d_thread_t, and moving UUCP to ports.  500024
5.0-CURRENT after ABI change for descriptor and creds passing on 64 bit platforms.  500025
5.0-CURRENT after moving to XFree86 4 by default for package builds, and after the new libc strnstr() function was added.  500026
5.0-CURRENT after the new libc strcasestr() function was added.  500027
5.0-CURRENT after the userland components of smbfs were imported.  500028
5.0-CURRENT after the new C99 specific-width integer types were added.  (Not incremented.)
5.0-CURRENT after a change was made in the return value of &man.sendfile.2;.  500029
5.0-CURRENT after the introduction of the type fflags_t, which is the appropriate size for file flags.  500030
5.0-CURRENT after the usb structure element rename.  500031
5.0-CURRENT after the introduction of Perl 5.6.1.  500032
5.0-CURRENT after the sendmail_enable &man.rc.conf.5; variable was made to take the value NONE.  500033
5.0-CURRENT after mtx_init() grew a third argument.  500034
5.0-CURRENT with Gcc 3.1.  500035
5.0-CURRENT without Perl in /usr/src  500036
5.0-CURRENT after the addition of &man.dlfunc.3;  500037
5.0-CURRENT after the types of some struct sockbuf members were changed and the structure was reordered.  500038
5.0-CURRENT after GCC 3.2.1 import. Also after headers stopped using _BSD_FOO_T_ and started using _FOO_T_DECLARED. This value can also be used as a conservative estimate of the start of &man.bzip2.1; package support.  500039
5.0-CURRENT after various changes to disk functions were made in the name of removing dependency on disklabel structure internals.  500040
5.0-CURRENT after the addition of &man.getopt.long.3; to libc.  500041
5.0-CURRENT after Binutils 2.13 upgrade, which included new FreeBSD emulation, vec, and output format.  500042
5.0-CURRENT after adding weak pthread_XXX stubs to libc, obsoleting libXThrStub.so. 5.0-RELEASE.  500043
5.0-CURRENT after branching for RELENG_5_0  500100
<sys/dkstat.h> is empty and should not be included.  500101
5.0-CURRENT after the d_mmap_t interface change.  500102
5.0-CURRENT after taskqueue_swi changed to run without Giant, and taskqueue_swi_giant added to run with Giant.  500103
cdevsw_add() and cdevsw_remove() no longer exist. Appearance of MAJOR_AUTO allocation facility.  500104
5.0-CURRENT after new cdevsw initialization method.  500105
devstat_add_entry() has been replaced by devstat_new_entry()  500106
Devstat interface change; see sys/sys/param.h 1.149  500107
Token-Ring interface changes.  500108
Addition of vm_paddr_t.  500109
5.0-CURRENT after &man.realpath.3; has been made thread-safe  500110
5.0-CURRENT after &man.usbhid.3; has been synced with NetBSD  500111
5.0-CURRENT after new NSS implementation and addition of POSIX.1 getpw*_r, getgr*_r functions  500112
5.0-CURRENT after removal of the old rc system.  500113
5.1-RELEASE.  501000
5.1-CURRENT after branching for RELENG_5_1.  501100
5.1-CURRENT after correcting the semantics of sigtimedwait(2) and sigwaitinfo(2).  501101
5.1-CURRENT after adding the lockfunc and lockfuncarg fields to &man.bus.dma.tag.create.9;.  501102
5.1-CURRENT after GCC 3.3.1-pre 20030711 snapshot integration.  501103
5.1-CURRENT 3ware API changes to twe.  501104
5.1-CURRENT dynamically-linked /bin and /sbin support and movement of libraries to /lib.  501105
5.1-CURRENT after adding kernel support for Coda 6.x.  501106
5.1-CURRENT after 16550 UART constants moved from <dev/sio/sioreg.h> to <dev/ic/ns16550.h>. Also when libmap functionality was unconditionally supported by rtld.  501107
5.1-CURRENT after PFIL_HOOKS API update  501108
5.1-CURRENT after adding kiconv(3)  501109
5.1-CURRENT after changing default operations for open and close in cdevsw  501110
5.1-CURRENT after changed layout of cdevsw  501111
5.1-CURRENT after adding kobj multiple inheritance  501112
5.1-CURRENT after the if_xname change in struct ifnet  501113
5.1-CURRENT after changing /bin and /sbin to be dynamically linked  501114
5.2-RELEASE  502000
5.2.1-RELEASE  502010
5.2-CURRENT after branching for RELENG_5_2  502100
5.2-CURRENT after __cxa_atexit/__cxa_finalize functions were added to libc.  502101
5.2-CURRENT after change of default thread library from libc_r to libpthread.  502102
5.2-CURRENT after device driver API megapatch.  502103
5.2-CURRENT after getopt_long_only() addition.  502104
5.2-CURRENT after NULL is made into ((void *)0) for C, creating more warnings.  502105
5.2-CURRENT after pf is linked to the build and install.  502106
5.2-CURRENT after time_t is changed to a 64-bit value on sparc64.  502107
5.2-CURRENT after Intel C/C++ compiler support in some headers and execve(2) changes to be more strictly conforming to POSIX.  502108
5.2-CURRENT after the introduction of the bus_alloc_resource_any API  502109
5.2-CURRENT after the addition of UTF-8 locales  502110
5.2-CURRENT after the removal of the getvfsent(3) API  502111
5.2-CURRENT after the addition of the .warning directive for make.  502112
5.2-CURRENT after ttyioctl() was made mandatory for serial drivers.  502113
5.2-CURRENT after import of the ALTQ framework.  502114
5.2-CURRENT after changing sema_timedwait(9) to return 0 on success and a non-zero error code on failure.  502115
5.2-CURRENT after changing kernel dev_t to be pointer to struct cdev *.  502116
5.2-CURRENT after changing kernel udev_t to dev_t.  502117
5.2-CURRENT after adding support for CLOCK_VIRTUAL and CLOCK_PROF to clock_gettime(2) and clock_getres(2).  502118
5.2-CURRENT after changing network interface cloning overhaul.  502119
5.2-CURRENT after the update of the package tools to revision 20040629.  502120
5.2-CURRENT after marking Bluetooth code as non-i386 specific.  502121
5.2-CURRENT after the introduction of the KDB debugger framework, the conversion of DDB into a backend and the introduction of the GDB backend.  502122
5.2-CURRENT after change to make VFS_ROOT take a struct thread argument as does vflush. Struct kinfo_proc now has a user data pointer. The switch of the default X implementation to xorg was also made at this time.  502123
5.2-CURRENT after the change to separate the way ports rc.d and legacy scripts are started.  502124
5.2-CURRENT after the backout of the previous change.  502125
5.2-CURRENT after the removal of kmem_alloc_pageable() and the import of gcc 3.4.2.  502126
5.2-CURRENT after changing the UMA kernel API to allow ctors/inits to fail.  502127
5.2-CURRENT after the change of the vfs_mount signature as well as global replacement of PRISON_ROOT with SUSER_ALLOWJAIL for the suser(9) API.  502128
5.3-BETA/RC before the pfil API change  503000
5.3-RELEASE  503001
5.3-STABLE after branching for RELENG_5_3  503100
5.3-STABLE after addition of glibc style &man.strftime.3; padding options.  503101
5.3-STABLE after OpenBSD's nc(1) import MFC.  503102
5.4-PRERELEASE after the MFC of the fixes in <src/include/stdbool.h> and <src/sys/i386/include/_types.h> for using the GCC-compatibility of the Intel C/C++ compiler.  503103
5.4-PRERELEASE after the MFC of the change of ifi_epoch from wall clock time to uptime.  503104
5.4-PRERELEASE after the MFC of the fix of EOVERFLOW check in vswprintf(3).  503105
5.4-RELEASE.  504000
5.4-STABLE after branching for RELENG_5_4  504100
5.4-STABLE after increasing the default thread stacksizes  504101
5.4-STABLE after the addition of sha256  504102
5.4-STABLE after the MFC of if_bridge  504103
5.4-STABLE after the MFC of bsdiff and portsnap  504104
6.0-CURRENT  600000
6.0-CURRENT after permanently enabling PFIL_HOOKS in the kernel.  600001
6.0-CURRENT after initial addition of ifi_epoch to struct if_data. Backed out after a few days. Do not use this value.  600002
6.0-CURRENT after the re-addition of the ifi_epoch member of struct if_data.  600003
6.0-CURRENT after addition of the struct inpcb argument to the pfil API.  600004
6.0-CURRENT after addition of the "-d DESTDIR" argument to newsyslog.  600005
6.0-CURRENT after addition of glibc style &man.strftime.3; padding options.  600006
6.0-CURRENT after addition of 802.11 framework updates.  600007
6.0-CURRENT after changes to VOP_*VOBJECT() functions and introduction of MNTK_MPSAFE flag for Giantfree filesystems.  600008
6.0-CURRENT after addition of the cpufreq framework and drivers.  600009
6.0-CURRENT after importing OpenBSD's nc(1).  600010
6.0-CURRENT after removing semblance of SVID2 matherr() support.  600011
6.0-CURRENT after increase of default thread stacks' size.  600012
6.0-CURRENT after fixes in <src/include/stdbool.h> and <src/sys/i386/include/_types.h> for using the GCC-compatibility of the Intel C/C++ compiler.  600013
6.0-CURRENT after EOVERFLOW checks in vswprintf(3) fixed.  600014
6.0-CURRENT after changing the struct if_data member, ifi_epoch, from wall clock time to uptime.  600015
6.0-CURRENT after LC_CTYPE disk format changed.  600016
6.0-CURRENT after NLS catalogs disk format changed.  600017
6.0-CURRENT after LC_COLLATE disk format changed.  600018
Installation of acpica includes into /usr/include.  600019
Addition of MSG_NOSIGNAL flag to send(2) API.  600020
Addition of fields to cdevsw  600021
Removed gtar from base system.  600022
LOCAL_CREDS, LOCAL_CONNWAIT socket options added to unix(4).  600023
&man.hwpmc.4; and related tools added to 6.0-CURRENT.  600024
struct icmphdr added to 6.0-CURRENT.  600025
pf updated to 3.7.  600026
Kernel libalias and ng_nat introduced.  600027
POSIX ttyname_r(3) made available through unistd.h and libc.  600028
6.0-CURRENT after libpcap updated to v0.9.1 alpha 096.  600029
6.0-CURRENT after importing NetBSD's if_bridge(4).  600030
6.0-CURRENT after struct ifnet was broken out of the driver softcs.  600031
6.0-CURRENT after the import of libpcap v0.9.1.  600032
6.0-STABLE after bump of all shared library versions that had not been changed since RELENG_5.  600033
6.0-STABLE after credential argument is added to dev_clone event handler. 6.0-RELEASE.  600034
6.0-STABLE after 6.0-RELEASE  600100
6.0-STABLE after incorporating scripts from the local_startup directories into the base rcorder.  600101
6.0-STABLE after updating the ELF types and constants.  600102
7.0-CURRENT.  700000
7.0-CURRENT after bump of all shared library versions that had not been changed since RELENG_5.  700001
7.0-CURRENT after credential argument is added to dev_clone event handler.  700002
7.0-CURRENT after memmem(3) is added to libc.  700003
7.0-CURRENT after solisten(9) kernel arguments are modified to accept a backlog parameter.  700004
7.0-CURRENT after IFP2ENADDR() was changed to return a pointer to IF_LLADDR().  700005
7.0-CURRENT after addition of if_addr member to struct ifnet and IFP2ENADDR() removal.  700006
7.0-CURRENT after incorporating scripts from the local_startup directories into the base rcorder.  700007
7.0-CURRENT after removal of MNT_NODEV mount option.  700008
7.0-CURRENT after ELF-64 type changes and symbol versioning.  700009
7.0-CURRENT after addition of hostb and vgapci drivers, addition of pci_find_extcap(), and changing the AGP drivers to no longer map the aperture.  700010
7.0-CURRENT after tv_sec was made time_t on all platforms but Alpha.  700011
Note that 2.2-STABLE sometimes identifies itself as 2.2.5-STABLE after the 2.2.5-RELEASE. The pattern used to be year followed by the month, but we decided to change it to a more straightforward major/minor system starting from 2.2. This is because the parallel development on several branches made it infeasible to classify the releases simply by their real release dates. If you are making a port now, you do not have to worry about old -CURRENTs; they are listed here just for your reference.
Writing something after <filename>bsd.port.mk</filename>

Do not write anything after the .include <bsd.port.mk> line. It usually can be avoided by including bsd.port.pre.mk somewhere in the middle of your Makefile and bsd.port.post.mk at the end.

You need to include either the bsd.port.pre.mk/bsd.port.post.mk pair or bsd.port.mk only; do not mix these two usages.

bsd.port.pre.mk only defines a few variables, which can be used in tests in the Makefile; bsd.port.post.mk defines the rest.

Here are some important variables defined in bsd.port.pre.mk (this is not the complete list; please read bsd.port.mk for the complete list):

Variable	Description
ARCH	The architecture as returned by uname -m (e.g., i386)
OPSYS	The operating system type, as returned by uname -s (e.g., FreeBSD)
OSREL	The release version of the operating system (e.g., 2.1.5 or 2.2.7)
OSVERSION	The numeric version of the operating system; the same as __FreeBSD_version
PORTOBJFORMAT	The object format of the system (elf or aout; note that for modern versions of FreeBSD, aout is deprecated)
LOCALBASE	The base of the local tree (e.g., /usr/local/)
X11BASE	The base of the X11 tree (e.g., /usr/X11R6)
PREFIX	Where the port installs itself (see more on PREFIX)

If you have to define the variables USE_IMAKE, USE_X_PREFIX, or MASTERDIR, do so before including bsd.port.pre.mk.

Here are some examples of things you can write after bsd.port.pre.mk:

# no need to compile lang/perl5 if perl5 is already in system
.if ${OSVERSION} > 300003
BROKEN=	perl is in system
.endif

# only one shlib version number for ELF
.if ${PORTOBJFORMAT} == "elf"
TCL_LIB_FILE=	${TCL_LIB}.${SHLIB_MAJOR}
.else
TCL_LIB_FILE=	${TCL_LIB}.${SHLIB_MAJOR}.${SHLIB_MINOR}
.endif

# software already makes link for ELF, but not for a.out
post-install:
.if ${PORTOBJFORMAT} == "aout"
	${LN} -sf liblinpack.so.1.0 ${PREFIX}/lib/liblinpack.so
.endif

You did remember to use tab instead of spaces after BROKEN= and TCL_LIB_FILE=, did you not? :-)

Install additional documentation

If your software has some documentation other than the standard man and info pages that you think is useful for the user, install it under PREFIX/share/doc. This can be done, like the previous item, in the post-install target.

Create a new directory for your port. The directory name should reflect what the port is. This usually means PORTNAME. However, if you think the user might want different versions of the port to be installed at the same time, you can use the whole PKGNAME.

Make the installation dependent on the variable NOPORTDOCS so that users can disable it in /etc/make.conf, like this:

post-install:
.if !defined(NOPORTDOCS)
	${MKDIR} ${DOCSDIR}
	${INSTALL_MAN} ${WRKSRC}/docs/xvdocs.ps ${DOCSDIR}
.endif

Here are some handy variables and how they are expanded by default when used in the Makefile:

DATADIR gets expanded to PREFIX/share/PORTNAME.
DOCSDIR gets expanded to PREFIX/share/doc/PORTNAME.
EXAMPLESDIR gets expanded to PREFIX/share/examples/PORTNAME.

These variables are exported to PLIST_SUB. Their values will appear there as pathnames relative to PREFIX if possible. That is, share/doc/PORTNAME will be substituted for %%DOCSDIR%% in the packing list by default, and so on. (See more on pkg-plist substitution here.)
Install additional documentation

If your software has some documentation other than the standard man and info pages that you think is useful for the user, install it under PREFIX/share/doc. This can be done, like the previous item, in the post-install target. Create a new directory for your port. The directory name should reflect what the port is; this usually means PORTNAME. However, if you think the user might want different versions of the port to be installed at the same time, you can use the whole PKGNAME.

Make the installation dependent on the variable NOPORTDOCS so that users can disable it in /etc/make.conf, like this:

post-install:
.if !defined(NOPORTDOCS)
	${MKDIR} ${DOCSDIR}
	${INSTALL_MAN} ${WRKSRC}/docs/xvdocs.ps ${DOCSDIR}
.endif

Here are some handy variables and how they are expanded by default when used in the Makefile:

DATADIR gets expanded to PREFIX/share/PORTNAME.
DOCSDIR gets expanded to PREFIX/share/doc/PORTNAME.
EXAMPLESDIR gets expanded to PREFIX/share/examples/PORTNAME.

These variables are exported to PLIST_SUB. Their values will appear there as pathnames relative to PREFIX if possible. That is, share/doc/PORTNAME will be substituted for %%DOCSDIR%% in the packing list by default, and so on. (See more on pkg-plist substitution here.)

All documentation files and directories installed should be included in pkg-plist with the %%PORTDOCS%% prefix, for example:

%%PORTDOCS%%%%DOCSDIR%%/AUTHORS
%%PORTDOCS%%%%DOCSDIR%%/CONTACT
%%PORTDOCS%%@dirrm %%DOCSDIR%%

As an alternative to enumerating the documentation files in pkg-plist, a port can set the variable PORTDOCS to a list of file names and shell glob patterns to add to the final packing list. The names will be relative to DOCSDIR; therefore, a port that utilizes PORTDOCS and uses a non-default location for its documentation should set DOCSDIR accordingly. If a directory is listed in PORTDOCS or matched by a glob pattern from this variable, the entire subtree of contained files and directories will be registered in the final packing list. If NOPORTDOCS is defined, the files and directories listed in PORTDOCS will neither be installed nor added to the port's packing list. Actually installing the documentation in the locations listed in PORTDOCS remains up to the port itself. A typical example of utilizing PORTDOCS looks as follows:

PORTDOCS=	README.* ChangeLog docs/*

You can also use the pkg-message file to display messages upon installation. See the section on using pkg-message for details. The pkg-message file does not need to be added to pkg-plist.

Subdirectories

Try to let the port put things in the right subdirectories of PREFIX. Some ports lump everything together and put it in the subdirectory with the port's name, which is incorrect. Also, many ports put everything except binaries, header files, and manual pages in a subdirectory of lib, which does not work well with the BSD paradigm. Many of the files should be moved to one of the following: etc (setup/configuration files), libexec (executables started internally), sbin (executables for superusers/managers), info (documentation for the info browser), or share (architecture-independent files). See &man.hier.7; for details; the rules governing /usr pretty much apply to /usr/local too. The exception is ports dealing with USENET news; they may use PREFIX/news as a destination for their files.

UIDs and GIDs

If your port requires a certain user to be on the installed system, let the pkg-install script call pw to create it automatically (a sketch of the idea follows below). Look at net/cvsup-mirror for an example. If your port must use the same user/group ID number when it is installed as a binary package as when it was compiled, then you must choose a free UID from 50 to 999 and register it below. Look at japanese/Wnn6 for an example. Make sure you do not use a UID already used by the system or other ports.
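As a sketch of the idea only (real ports normally put this logic in the pkg-install script rather than in the Makefile, and the user name, UID, and GECOS field below are purely hypothetical), the user-creation step might look like this:

pre-install:
	@if ! pw usershow exampled >/dev/null 2>&1; then \
		pw useradd exampled -u 950 -c "Example daemon" \
			-d /nonexistent -s /sbin/nologin; \
	fi

Remember that the UID (950 here is just a placeholder) must be a free one, chosen and registered as described above.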
This is the current list of UIDs between 50 and 999.

bind:*:53:53:Bind Sandbox:/:/sbin/nologin
majordom:*:54:54:Majordomo Pseudo User:/usr/local/majordomo:/nonexistent
rdfdb:*:55:55:rdfDB Daemon:/var/db/rdfdb:/bin/sh
spamd:*:58:58:SpamAssassin user:/var/spool/spamd:/sbin/nologin
cyrus:*:60:60:the cyrus mail server:/nonexistent:/nonexistent
gnats:*:61:1:GNATS database owner:/usr/local/share/gnats/gnats-db:/bin/sh
proxy:*:62:62:Packet Filter pseudo-user:/nonexistent:/nonexistent
uucp:*:66:66:UUCP pseudo-user:/var/spool/uucppublic:/usr/libexec/uucp/uucico
xten:*:67:67:X-10 daemon:/usr/local/xten:/nonexistent
pop:*:68:6:Post Office Owner (popper):/nonexistent:/sbin/nologin
wnn:*:69:7:Wnn:/nonexistent:/nonexistent
pgsql:*:70:70:PostgreSQL pseudo-user:/usr/local/pgsql:/bin/sh
oracle:*:71:71::0:0:Oracle:/usr/local/oracle7:/sbin/nologin
ircd:*:72:72:IRC daemon:/nonexistent:/nonexistent
ircservices:*:73:73:IRC services:/nonexistent:/nonexistent
simscan:*:74:74:Simscan User:/nonexistent:/sbin/nologin
ifmail:*:75:66:Ifmail user:/nonexistent:/nonexistent
www:*:80:80:World Wide Web Owner:/nonexistent:/sbin/nologin
alias:*:81:81:QMail user:/var/qmail/alias:/nonexistent
qmaild:*:82:81:QMail user:/var/qmail:/nonexistent
qmaill:*:83:81:QMail user:/var/qmail:/nonexistent
qmailp:*:84:81:QMail user:/var/qmail:/nonexistent
qmailq:*:85:82:QMail user:/var/qmail:/nonexistent
qmailr:*:86:82:QMail user:/var/qmail:/nonexistent
qmails:*:87:82:QMail user:/var/qmail:/nonexistent
mysql:*:88:88:MySQL Daemon:/var/db/mysql:/sbin/nologin
vpopmail:*:89:89:VPop Mail User:/usr/local/vpopmail:/nonexistent
firebird:*:90:90:Firebird Database Administrator:/usr/local/firebird:/bin/sh
mailman:*:91:91:Mailman User:/usr/local/mailman:/sbin/nologin
gdm:*:92:92:GDM Sandbox:/:/sbin/nologin
jabber:*:93:93:Jabber Daemon:/nonexistent:/nonexistent
p4admin:*:94:94:Perforce admin:/usr/local/perforce:/sbin/nologin
interch:*:95:95:Interchange user:/usr/local/interchange:/sbin/nologin
squeuer:*:96:96:SQueuer Owner:/nonexistent:/bin/sh
mud:*:97:97:MUD Owner:/nonexistent:/bin/sh
msql:*:98:98:mSQL-2 pseudo-user:/var/db/msqldb:/bin/sh
rscsi:*:99:99:Remote SCSI:/usr/local/rscsi:/usr/local/sbin/rscsi
squid:*:100:100:squid caching-proxy pseudo user:/usr/local/squid:/sbin/nologin
quagga:*:101:101:Quagga route daemon pseudo user:/usr/local/etc/quagga:/sbin/nologin
ganglia:*:102:102:Ganglia User:/nonexistent:/sbin/nologin
sgeadmin:*:103:103:Sun Grid Engine Admin:/nonexistent:/sbin/nologin
slimserv:*:104:104:Slim Devices SlimServer pseudo-user:/nonexistent:/sbin/nologin
dnetc:*:105:105:distributed.net client and proxy pseudo-user:/nonexistent:/sbin/nologin
clamav:*:106:106:Clamav Antivirus:/nonexistent:/sbin/nologin
cacti:*:107:107:Cacti Sandbox:/nonexistent:/sbin/nologin
webkit:*:108:108:WebKit Default User:/usr/local/www/webkit:/bin/sh
quickml:*:109:109:quickml Server:/nonexistent:/sbin/nologin
vscan:*:110:110:Scanning Virus Account:/var/amavis:/bin/sh
fido:*:111:111:Fido System:/usr/local/fido:/bin/sh
dcc:*:112:112:Distributed Checksum Clearinghouse:/nonexistent:/sbin/nologin
amavis:*:113:113:Amavis-stats Account:/nonexistent:/sbin/nologin
dhis:*:114:114:DHIS Daemon:/nonexistent:/sbin/nologin
_symon:*:115:115:Symon Account:/var/empty:/sbin/nologin
postfix:*:125:125:Postfix Mail System:/var/spool/postfix:/sbin/nologin
rbldns:*:153:153:rbldnsd pseudo-user:/nonexistent:/sbin/nologin
sfs:*:171:171:Self-Certifying File System:/nonexistent:/sbin/nologin
agk:*:172:172:AquaGateKeeper:/nonexistent:/nonexistent
polipo:*:173:173:polipo web cache:/nonexistent:/sbin/nologin
bogomilter:*:174:174:milter-bogom:/nonexistent:/sbin/nologin
moinmoin:*:192:192:MoinMoin User:/nonexistent:/sbin/nologin
sympa:*:200:200:Sympa Owner:/nonexistent:/sbin/nologin
privoxy:*:201:201:Privoxy proxy user:/nonexistent:/sbin/nologin
dspam:*:202:202:Dspam:/nonexistent:/sbin/nologin
shoutcast:*:210:210:Shoutcast sandbox:/nonexistent:/bin/sh
_tor:*:256:256:Tor anonymising router:/var/db/tor:/bin/sh
smxs:*:260:260:Sendmail X SMTPS:/nonexistent:/sbin/nologin
smxq:*:261:261:Sendmail X QMGR:/nonexistent:/sbin/nologin
smxc:*:262:262:Sendmail X SMTPC:/nonexistent:/sbin/nologin
smxm:*:263:263:Sendmail X misc:/nonexistent:/sbin/nologin
smx:*:264:264:Sendmail X other:/nonexistent:/sbin/nologin
ldap:*:389:389:OpenLDAP Server:/nonexistent:/sbin/nologin
drweb:*:426:426:Dr.Web Mail Scanner:/nonexistent:/sbin/nologin
courier:*:465:465:Courier Mail Server:/nonexistent:/sbin/nologin
_bbstored:*:505:505::0:0:BoxBackup Store Daemon:/nonexistent:/bin/sh
qtss:*:554:554:Darwin Streaming Server:/nonexistent:/sbin/nologin
ircdru:*:555:555:Russian hybrid IRC server:/nonexistent:/bin/sh
messagebus:*:556:556:D-BUS Daemon User:/nonexistent:/sbin/nologin
avahi:*:558:558:Avahi Daemon User:/nonexistent:/sbin/nologin
bnetd:*:700:700:Bnetd user:/nonexistent:/sbin/nologin
bopm:*:717:717:Blitzed Open Proxy Monitor:/nonexistent:/bin/sh
bacula:*:910:910:Bacula Daemon:/var/db/bacula:/sbin/nologin

This is the current list of reserved GIDs.

bind:*:53:
rdfdb:*:55:
spamd:*:58:
cyrus:*:60:
proxy:*:62:
authpf:*:63:
uucp:*:66:
xten:*:67:
dialer:*:68:
network:*:69:
pgsql:*:70:
simscan:*:74:
www:*:80:
qnofiles:*:81:
qmail:*:82:
mysql:*:88:
vpopmail:*:89:
firebird:*:90:
mailman:*:91:
gdm:*:92:
jabber:*:93:
p4admin:*:94:
interch:*:95:
squeuer:*:96:
mud:*:97:
msql:*:98:
rscsi:*:99:
squid:*:100:
quagga:*:101:
ganglia:*:102:
sgeadmin:*:103:
slimserv:*:104:
dnetc:*:105:
clamav:*:106:
cacti:*:107:
webkit:*:108:
quickml:*:109:
vscan:*:110:
fido:*:111:
dcc:*:112:
amavis:*:113:
dhis:*:114:
_symon:*:115:
postfix:*:125:
maildrop:*:126:
rbldns:*:153:
sfs:*:171:
agk:*:172:
polipo:*:173:
moinmoin:*:192:
sympa:*:200:
dspam:*:202:
_tor:*:256:
smxs:*:260:
smxq:*:261:
smxc:*:262:
smxm:*:263:
smx:*:264:
ldap:*:389:
drweb:*:426:
courier:*:465:
_bbstored:*:505:
qtss:*:554:
ircdru:*:555:
messagebus:*:556:
realtime:*:557:
avahi:*:558:
bnetd:*:700:
bopm:*:717:
bacula:*:910:

Please include a notice when you submit a port (or an upgrade) that reserves a new UID or GID in this range. This allows us to keep the list of reserved IDs up to date.

Do things rationally

The Makefile should do things simply and reasonably. If you can make it a couple of lines shorter or more readable, then do so. Examples include using a make .if construct instead of a shell if construct (see the sketch below), not redefining do-extract if you can redefine EXTRACT* instead, and using GNU_CONFIGURE instead of CONFIGURE_ARGS += --prefix=${PREFIX}. If you find yourself having to write a lot of new code to try to do something, please go back and review bsd.port.mk to see if it contains an existing implementation of what you are trying to do. While hard to read, there are a great many seemingly-hard problems for which bsd.port.mk already provides a shorthand solution.
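To illustrate the first of those suggestions, compare the two fragments below (the message is invented for illustration; OSVERSION is only defined after bsd.port.pre.mk has been included). Both do the same thing, but the make .if version is shorter and is evaluated once by make itself rather than spawned in a shell every time the target runs:

# shell if construct, run by the shell each time the target executes:
post-configure:
	@if [ ${OSVERSION} -gt 500000 ]; then \
		${ECHO_MSG} "using new-style configuration"; \
	fi

# equivalent make .if construct, evaluated when the Makefile is parsed:
.if ${OSVERSION} > 500000
post-configure:
	@${ECHO_MSG} "using new-style configuration"
.endif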
Respect both <makevar>CC</makevar> and <makevar>CXX</makevar>

The port should respect both the CC and CXX variables. What we mean by this is that the port should not set the values of these variables absolutely, overriding existing values; instead, it should provide a default that the user can override, or append to the existing values. This is so that build options that affect all ports can be set globally. If the port does not respect these variables, please add NO_PACKAGE=ignores either cc or cxx to the Makefile.

An example of a Makefile respecting both the CC and CXX variables follows. Note the ?=:

CC?=	gcc
CXX?=	g++

Here is an example which respects neither the CC nor the CXX variable:

CC=	gcc
CXX=	g++

Both the CC and CXX variables can be defined on FreeBSD systems in /etc/make.conf. The first example defines a value only if it was not previously set in /etc/make.conf, preserving any system-wide definitions. The second example clobbers anything previously defined.

Respect <makevar>CFLAGS</makevar>

The port should respect the CFLAGS variable. What we mean by this is that the port should not set the value of this variable absolutely, overriding the existing value; instead, it should append whatever values it needs to the existing value. This is so that build options that affect all ports can be set globally. If it does not, please add NO_PACKAGE=ignores cflags to the Makefile.

An example of a Makefile respecting the CFLAGS variable follows. Note the +=:

CFLAGS+=	-Wall -Werror

Here is an example which does not respect the CFLAGS variable:

CFLAGS=	-Wall -Werror

The CFLAGS variable is defined on FreeBSD systems in /etc/make.conf. The first example appends additional flags to the CFLAGS variable, preserving any system-wide definitions. The second example clobbers anything previously defined.

You should remove optimization flags from third-party Makefiles, since the system CFLAGS already contains the system-wide optimization flags. An example from an unmodified Makefile:

CFLAGS=	-O3 -funroll-loops -DHAVE_SOUND

Using the system optimization flags, the Makefile would look similar to the following example:

CFLAGS+=	-DHAVE_SOUND

Feedback

Do send applicable changes and patches to the original author or maintainer for inclusion in the next release of the code. This will only make your job that much easier for the next release.

<filename>README.html</filename>

Do not include the README.html file. This file is not part of the CVS collection but is generated using the make readme command.

Marking a port not installable with <makevar>BROKEN</makevar>, <makevar>FORBIDDEN</makevar>, or <makevar>IGNORE</makevar>

In certain cases users should be prevented from installing a port. To tell a user that a port should not be installed, there are several make variables that can be used in a port's Makefile. The value of these make variables will be the reason that is given back to users for why the port refuses to install itself. Please use the correct make variable: each one conveys a radically different meaning, both to users and to automated systems that depend on the Makefiles, such as the ports build cluster, FreshPorts, and portsmon.

Variables

BROKEN is reserved for ports that currently do not compile, install, or deinstall correctly. It should be used for ports where the problem is believed to be temporary. The build cluster will still attempt to build them, to see if the underlying problem has been resolved.
For instance, use BROKEN when a port:

does not compile
fails its configuration or installation process
installs files outside of ${LOCALBASE} and ${X11BASE}
does not remove all its files cleanly upon deinstall (however, it may be acceptable, and desirable, for the port to leave user-modified files behind)

FORBIDDEN is used for ports that contain a security vulnerability or otherwise induce grave concern regarding the security of a FreeBSD system with the port installed (for example, a reputedly insecure program or a program that provides easily exploitable services). Ports should be marked as FORBIDDEN as soon as a particular piece of software has a vulnerability and there is no released upgrade. Ideally ports should be upgraded as soon as possible when a security vulnerability is discovered, so as to reduce the number of vulnerable FreeBSD hosts (we like being known for being secure); however, sometimes there is a noticeable time gap between the disclosure of a vulnerability and an updated release of the vulnerable software. Do not mark a port FORBIDDEN for any reason other than security.

IGNORE is reserved for ports that should not be built for some other reason. It should be used for ports where the problem is believed to be structural. The build cluster will not, under any circumstances, build ports marked as IGNORE. For instance, use IGNORE when a port:

compiles but does not run properly
does not work on the installed version of &os;
requires &os; kernel sources to build, but the user does not have them installed
has a distfile which may not be automatically fetched due to licensing restrictions
does not work with some other currently installed port (for instance, the port depends on www/apache21 but www/apache13 is installed)

If a port would conflict with a currently installed port (for example, if they install a file in the same place that performs a different function), use CONFLICTS instead. CONFLICTS will set IGNORE by itself.

If a port should be marked IGNORE only on certain architectures, there are two other convenience variables that will automatically set IGNORE for you: ONLY_FOR_ARCHS and NOT_FOR_ARCHS. Examples:

ONLY_FOR_ARCHS=	i386 amd64

NOT_FOR_ARCHS=	alpha ia64 sparc64

Implementation Notes

Due to vagaries in the usage of IGNORECMD in bsd.port.mk among other places, the value of BROKEN should be enclosed in quotes, and the value of IGNORE should not be enclosed in quotes. Also, the wording of the string should be somewhat different due to the way the information is shown to the user. Examples:

BROKEN=	"this port is unsupported on FreeBSD 5.x"
IGNORE=	is unsupported on FreeBSD 5.x

resulting in the following output from make describe:

===> foobar-0.1 is marked as broken: this port is unsupported on FreeBSD 5.x.
===> foobar-0.1 is unsupported on FreeBSD 5.x.

Marking a port for removal with <makevar>DEPRECATED</makevar> or <makevar>EXPIRATION_DATE</makevar>

Do remember that BROKEN and FORBIDDEN are to be used as a temporary resort if a port is not working. Permanently broken ports should be removed from the tree entirely. When it makes sense to do so, users can be warned about a pending port removal with DEPRECATED and EXPIRATION_DATE. The former is simply a string stating why the port is scheduled for removal; the latter is a string in ISO 8601 format (YYYY-MM-DD). Both will be shown to the user.
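For example, a port scheduled for removal might carry lines like the following (the message, port name, and date here are purely hypothetical):

DEPRECATED=	superseded by www/newerport
EXPIRATION_DATE=	2007-06-30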
It is possible to set DEPRECATED without an EXPIRATION_DATE (for instance, when recommending a newer version of the port), but the converse does not make any sense. There is no set policy on how much notice to give. Current practice seems to be one month for security-related issues and two months for build issues. This also gives any interested committers a little time to fix the problems.

Avoid use of the <literal>.error</literal> construct

The correct way for a Makefile to signal that the port cannot be installed due to some external factor (for instance, the user has specified an illegal combination of build options) is to set IGNORE to a nonblank value. This value will be formatted and shown to the user by make install.

It is a common mistake to use .error for this purpose. The problem with this is that many automated tools that work with the ports tree will fail in this situation. The most common occurrence of this is seen when trying to build /usr/ports/INDEX. However, even more trivial commands such as make -V maintainer also fail in this scenario. This is not acceptable.

How to avoid using <literal>.error</literal>

Assume that someone has the line USE_POINTYHAT=yes in make.conf. The first of the next two Makefile snippets will cause make index to fail, while the second one will not:

.if USE_POINTYHAT
.error "POINTYHAT is not supported"
.endif

.if USE_POINTYHAT
IGNORE=	POINTYHAT is not supported
.endif

Necessary workarounds

Sometimes it is necessary to work around bugs in software included with older versions of &os;. Some versions of &man.make.1;, on at least 4.8 and 5.0, were broken with respect to handling comparisons based on OSVERSION. This would often lead to failures during make describe (and thus, the overall ports make index). The workaround is to enclose the conditional comparison in spaces, e.g.:

.if ( ${OSVERSION} > 500023 )

Be aware that test-installing a port on 4.9 or 5.2 will not detect this problem.

Miscellanea

The files pkg-descr and pkg-plist should each be double-checked. If you are reviewing a port and feel they can be worded better, do so.

Do not copy more copies of the GNU General Public License into our system, please.

Please be careful to note any legal issues! Do not let us illegally distribute software!
A Sample <filename>Makefile</filename>

Here is a sample Makefile that you can use to create a new port. Make sure you remove all the extra comments (the ones between brackets)! It is recommended that you follow this format (ordering of variables, empty lines between sections, and so on). This format is designed so that the most important information is easy to locate. We recommend that you use portlint to check the Makefile (see the note after the sample).

[the header...just to make it easier for us to identify the ports.]
# New ports collection makefile for:	xdvi
[the "version required" line is only needed when the PORTVERSION variable is not specific enough to describe the port.]
# Date created:		26 May 1995
[this is the person who did the original port to FreeBSD, in particular, the person who wrote the first version of this Makefile. Remember, this should not be changed when upgrading the port later.]
# Whom:			Satoshi Asami <asami@FreeBSD.org>
#
# $FreeBSD$
[ ^^^^^^^^^ This will be automatically replaced with the RCS ID string by CVS when it is committed to our repository. If upgrading a port, do not alter this line back to "$FreeBSD$". CVS deals with it automatically.]
#

[section to describe the port itself and the master site - PORTNAME and PORTVERSION are always first, followed by CATEGORIES, and then MASTER_SITES, which can be followed by MASTER_SITE_SUBDIR. PKGNAMEPREFIX and PKGNAMESUFFIX, if needed, will be after that. Then comes DISTNAME, EXTRACT_SUFX and/or DISTFILES, and then EXTRACT_ONLY, as necessary.]
PORTNAME=	xdvi
PORTVERSION=	18.2
CATEGORIES=	print
[do not forget the trailing slash ("/")! if you are not using MASTER_SITE_* macros]
MASTER_SITES=	${MASTER_SITE_XCONTRIB}
MASTER_SITE_SUBDIR=	applications
PKGNAMEPREFIX=	ja-
DISTNAME=	xdvi-pl18
[set this if the source is not in the standard ".tar.gz" form]
EXTRACT_SUFX=	.tar.Z

[section for distributed patches -- can be empty]
PATCH_SITES=	ftp://ftp.sra.co.jp/pub/X11/japanese/
PATCHFILES=	xdvi-18.patch1.gz xdvi-18.patch2.gz

[maintainer; *mandatory*! This is the person who is volunteering to handle port updates, build breakages, and to whom users can direct questions and bug reports. To keep the quality of the Ports Collection as high as possible, we no longer accept new ports that are assigned to "ports@FreeBSD.org".]
MAINTAINER=	asami@FreeBSD.org
COMMENT=	A DVI Previewer for the X Window System

[dependencies -- can be empty]
RUN_DEPENDS=	gs:${PORTSDIR}/print/ghostscript
LIB_DEPENDS=	Xpm.5:${PORTSDIR}/graphics/xpm

[this section is for other standard bsd.port.mk variables that do not belong to any of the above]
[If it asks questions during configure, build, install...]
IS_INTERACTIVE=	yes
[If it extracts to a directory other than ${DISTNAME}...]
WRKSRC=		${WRKDIR}/xdvi-new
[If the distributed patches were not made relative to ${WRKSRC}, you may need to tweak this]
PATCH_DIST_STRIP=	-p1
[If it requires a "configure" script generated by GNU autoconf to be run]
GNU_CONFIGURE=	yes
[If it requires GNU make, not /usr/bin/make, to build...]
USE_GMAKE=	yes
[If it is an X application and requires "xmkmf -a" to be run...]
USE_IMAKE=	yes
[et cetera.]

[non-standard variables to be used in the rules below]
MY_FAVORITE_RESPONSE=	"yeah, right"

[then the special rules, in the order they are called]
pre-fetch:
	i go fetch something, yeah

post-patch:
	i need to do something after patch, great

pre-install:
	and then some more stuff before installing, wow

[and then the epilogue]
.include <bsd.port.mk>
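Once the Makefile is filled in, running portlint from the port's directory will point out stylistic and substantive problems before you submit. A minimal sketch of a session (the path is hypothetical; consult portlint's documentation for the flags your version supports):

% cd /usr/ports/print/ja-xdvi
% portlint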
Keeping Up

The &os; Ports Collection is constantly changing. Here is some information on how to keep up.

FreshPorts

One of the easiest ways to learn about updates that have already been committed is by subscribing to FreshPorts. You can select multiple ports to monitor. Maintainers are strongly encouraged to subscribe, because they will receive notification not only of their own changes, but also of any changes that any other &os; committer has made. (These are often necessary to keep up with changes in the underlying ports framework; although it would be most polite to receive an advance heads-up from those committing such changes, sometimes this is overlooked or simply impractical. Also, in some cases, the changes are very minor in nature. We expect everyone to use their best judgement in these cases.)

If you wish to use FreshPorts, all you need is an account. If your registered email address is an @FreeBSD.org address, you will see the opt-in link on the right-hand side of the web pages. For those of you who already have a FreshPorts account but are not using your @FreeBSD.org email address, just change your email to @FreeBSD.org, subscribe, then change it back again.

FreshPorts also has a sanity test feature which automatically tests each commit to the FreeBSD ports tree. If subscribed to this service, you will be notified of any errors which FreshPorts detects during sanity testing of your commits.

The Web Interface to the Source Repository

It is possible to browse the files in the source repository by using a web interface. Changes that affect the entire port system are now documented in the CHANGES file. Changes that affect individual ports are now documented in the UPDATING file. However, the definitive answer to any question is undoubtedly to read the source code of bsd.port.mk and associated files.

The &os; Ports Mailing List

If you maintain ports, you should consider following the &a.ports;. Important changes to the way ports work will be announced there, and then committed to CHANGES.

The &os; Port Building Cluster on <hostid role="hostname">pointyhat.FreeBSD.org</hostid>

One of the least-publicized strengths of &os; is that an entire cluster of machines is dedicated to continually building the Ports Collection, for each of the major OS releases and for each Tier-1 architecture. You can find the results of these builds at package building logs and errors.

Individual ports are built unless they are specifically marked with IGNORE. Ports that are marked with BROKEN will still be attempted, to see if the underlying problem has been resolved. (This is done by passing TRYBROKEN to the port's Makefile.)

The &os; Port Distfile Survey

The build cluster is dedicated to building the latest release of each port, with distfiles that have already been fetched. However, as the Internet continually changes, distfiles can quickly go missing. The FreeBSD Ports distfiles survey attempts to query every download site for every port to find out if each distfile is still currently available. Maintainers are asked to check this report periodically, not only to speed up the building process for users, but also to help avoid wasting the bandwidth of the sites that volunteer to host all these distfiles.

The &os; Ports Monitoring System

Another handy resource is the FreeBSD Ports Monitoring System (also known as portsmon). This system comprises a database that processes information from several sources and allows it to be browsed via a web interface.
Currently, the ports Problem Reports (PRs), the error logs from the build cluster, and individual files from the ports collection are used. In the future, this will be expanded to include the distfile survey, as well as other sources. To get started, you can view all information about a particular port by using the Overview of One Port. As of this writing, this is the only resource available that maps GNATS PR entries to portnames. (PR submitters do not always include the portname in their Synopsis, although we would prefer that they did.) So, portsmon is a good place to start if you want to find out whether an existing port has any PRs filed against it or any build errors, or to find out whether a new port that you may be thinking about creating has already been submitted.
diff --git a/en_US.ISO8859-1/share/sgml/freebsd.dsl b/en_US.ISO8859-1/share/sgml/freebsd.dsl
index c6569f7342..9ac6a611ae 100644
--- a/en_US.ISO8859-1/share/sgml/freebsd.dsl
+++ b/en_US.ISO8859-1/share/sgml/freebsd.dsl
@@ -1,251 +1,251 @@

%freebsd.l10n; ]>

(define %refentry-xref-link% #t)

(define ($email-footer$)
  (make sequence
    (make element gi: "p"
          attributes: (list (list "align" "center"))
          (make element gi: "small"
                (literal "This, and other documents, can be downloaded from ")
                (create-link (list (list "HREF" "ftp://ftp.FreeBSD.org/pub/FreeBSD/doc/"))
                             (literal "ftp://ftp.FreeBSD.org/pub/FreeBSD/doc/"))
                (literal ".")))
    (make element gi: "p"
          attributes: (list (list "align" "center"))
          (make element gi: "small"
                (literal "For questions about FreeBSD, read the ")
                (create-link (list (list "HREF" "http://www.FreeBSD.org/docs.html"))
                             (literal "documentation"))
                (literal " before contacting <")
                (create-link (list (list "HREF" "mailto:questions@FreeBSD.org"))
                             (literal "questions@FreeBSD.org"))
                (literal ">.")
                (make empty-element gi: "br")
                (literal "For questions about this documentation, e-mail <")
                (create-link (list (list "HREF" "mailto:doc@FreeBSD.org"))
                             (literal "doc@FreeBSD.org"))
                (literal ">.")))))
]]>

(string->number ;; then get the apparent level
  (substring renderas 4 5)) ;; from "renderas",
  (SECTLEVEL))) ;; else use the real level
(hs (HSIZE (- 4 hlevel))))
(make sequence
  (make paragraph
    font-family-name: %title-font-family%
    font-weight: (if (< hlevel 5) 'bold 'medium)
    font-posture: (if (< hlevel 5) 'upright 'italic)
    font-size: hs
    line-spacing: (* hs %line-spacing-factor%)
    space-before: (* hs %head-before-factor%)
    space-after: (if (node-list-empty? subtitles)
                     (* hs %head-after-factor%)
                     0pt)
    start-indent: (if (or (>= hlevel 3)
                          (member (gi) (list (normalize "refsynopsisdiv")
                                             (normalize "refsect1")
                                             (normalize "refsect2")
                                             (normalize "refsect3"))))
                      %body-start-indent%
                      0pt)
    first-line-start-indent: 0pt
    quadding: %section-title-quadding%
    keep-with-next?: #t
    heading-level: (if %generate-heading-level% (+ hlevel 1) 0)
    ;; SimpleSects are never AUTO numbered...they aren't hierarchical
    (if (> hlevel (- max-section-level-labels 1))
        (empty-sosofo)
        (if (string=? (element-label (current-node)) "")
            (empty-sosofo)
            (literal (element-label (current-node))
                     (gentext-label-title-sep (gi sect)))))
    (element-title-sosofo (current-node)))
  (with-mode section-title-mode
    (process-node-list subtitles))
  ($section-info$ info))))
]]>

(define (local-en-label-title-sep)
  (list
   (list (normalize "warning") ": ")
   (list (normalize "caution") ": ")
   (list (normalize "chapter") " ")
   (list (normalize "sect1") " ")
   (list (normalize "sect2") " ")
   (list (normalize "sect3") " ")
   (list (normalize "sect4") " ")
   (list (normalize "sect5") " ")
   ))