Index: head/sbin/init/init.8 =================================================================== --- head/sbin/init/init.8 (revision 337739) +++ head/sbin/init/init.8 (revision 337740) @@ -1,430 +1,440 @@ .\" Copyright (c) 1980, 1991, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" This code is derived from software contributed to Berkeley by .\" Donn Seeley at Berkeley Software Design, Inc. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" @(#)init.8 8.3 (Berkeley) 4/18/94 .\" $FreeBSD$ .\" -.Dd August 7, 2018 +.Dd August 14, 2018 .Dt INIT 8 .Os .Sh NAME .Nm init .Nd process control initialization .Sh SYNOPSIS .Nm .Nm .Oo .Cm 0 | 1 | 6 | .Cm c | q .Oc .Sh DESCRIPTION The .Nm utility is the last stage of the boot process. It normally runs the automatic reboot sequence as described in .Xr rc 8 , and if this succeeds, begins multi-user operation. If the reboot scripts fail, .Nm commences single-user operation by giving the super-user a shell on the console. The .Nm utility may be passed parameters from the boot program to prevent the system from going multi-user and to instead execute a single-user shell without starting the normal daemons. The system is then quiescent for maintenance work and may later be made to go to multi-user by exiting the single-user shell (with ^D). This causes .Nm to run the .Pa /etc/rc start up command file in fastboot mode (skipping disk checks). .Pp If the .Em console entry in the .Xr ttys 5 file is marked .Dq insecure , then .Nm will require that the super-user password be entered before the system will start a single-user shell. The password check is skipped if the .Em console is marked as .Dq secure . .Pp If the system security level (see .Xr security 7 ) is initially nonzero, then .Nm leaves it unchanged. Otherwise, .Nm raises the level to 1 before going multi-user for the first time. Since the level cannot be reduced, it will be at least 1 for subsequent operation, even on return to single-user. If a level higher than 1 is desired while running multi-user, it can be set before going multi-user, e.g., by the startup script .Xr rc 8 , using .Xr sysctl 8 to set the .Va kern.securelevel variable to the required security level. .Pp If .Nm is run in a jail, the security level of the .Dq host system will not be affected. Part of the information set up in the kernel to support a jail is a per-jail security level. This allows running a higher security level inside of a jail than that of the host system. See .Xr jail 8 for more information about jails. .Pp In multi-user operation, .Nm maintains processes for the terminal ports found in the file .Xr ttys 5 . The .Nm utility reads this file and executes the command found in the second field, unless the first field refers to a device in .Pa /dev which is not configured. The first field is supplied as the final argument to the command. This command is usually .Xr getty 8 ; .Nm getty opens and initializes the tty line and executes the .Xr login 1 program. The .Nm login program, when a valid user logs in, executes a shell for that user. When this shell dies, either because the user logged out or an abnormal termination occurred (a signal), the cycle is restarted by executing a new .Nm getty for the line. .Pp The .Nm utility can also be used to keep arbitrary daemons running, automatically restarting them if they die. In this case, the first field in the .Xr ttys 5 file must not reference the path to a configured device node and will be passed to the daemon as the final argument on its command line. This is similar to the facility offered in the .At V .Pa /etc/inittab . .Pp Line status (on, off, secure, getty, or window information) may be changed in the .Xr ttys 5 file without a reboot by sending the signal .Dv SIGHUP to .Nm with the command .Dq Li "kill -HUP 1" . On receipt of this signal, .Nm re-reads the .Xr ttys 5 file. When a line is turned off in .Xr ttys 5 , .Nm will send a SIGHUP signal to the controlling process for the session associated with the line. For any lines that were previously turned off in the .Xr ttys 5 file and are now on, .Nm executes the command specified in the second field. If the command or window field for a line is changed, the change takes effect at the end of the current login session (e.g., the next time .Nm starts a process on the line). If a line is commented out or deleted from .Xr ttys 5 , .Nm will not do anything at all to that line. .Pp The .Nm utility will terminate multi-user operations and resume single-user mode if sent a terminate .Pq Dv TERM signal, for example, .Dq Li "kill \-TERM 1" . If there are processes outstanding that are deadlocked (because of hardware or software failure), .Nm will not wait for them all to die (which might take forever), but will time out after 30 seconds and print a warning message. .Pp The .Nm utility will cease creating new processes and allow the system to slowly die away, if it is sent a terminal stop .Pq Dv TSTP signal, i.e.\& .Dq Li "kill \-TSTP 1" . A later hangup will resume full multi-user operations, or a terminate will start a single-user shell. This hook is used by .Xr reboot 8 and .Xr halt 8 . .Pp The .Nm utility will terminate all possible processes (again, it will not wait for deadlocked processes) and reboot the machine if sent the interrupt .Pq Dv INT signal, i.e.\& .Dq Li "kill \-INT 1". This is useful for shutting the machine down cleanly from inside the kernel or from X when the machine appears to be hung. .Pp The .Nm utility will do the same, except it will halt the machine if sent the user defined signal 1 .Pq Dv USR1 , or will halt and turn the power off (if hardware permits) if sent the user defined signal 2 .Pq Dv USR2 . .Pp When shutting down the machine, .Nm will try to run the .Pa /etc/rc.shutdown script. This script can be used to cleanly terminate specific programs such as .Nm innd (the InterNetNews server). If this script does not terminate within 120 seconds, .Nm will terminate it. The timeout can be configured via the .Xr sysctl 8 variable .Va kern.init_shutdown_timeout . .Pp The role of .Nm is so critical that if it dies, the system will reboot itself automatically. If, at bootstrap time, the .Nm process cannot be located, the system will panic with the message .Dq "panic: init died (signal %d, exit %d)" . .Pp If run as a user process as shown in the second synopsis line, .Nm will emulate .At V behavior, i.e., super-user can specify the desired .Em run-level on a command line, and .Nm will signal the original (PID 1) .Nm as follows: .Bl -column Run-level SIGTERM .It Sy "Run-level Signal Action" .It Cm 0 Ta Dv SIGUSR1 Ta "Halt" .It Cm 0 Ta Dv SIGUSR2 Ta "Halt and turn the power off" .It Cm 0 Ta Dv SIGWINCH Ta "Halt and turn the power off and then back on" .It Cm 1 Ta Dv SIGTERM Ta "Go to single-user mode" .It Cm 6 Ta Dv SIGINT Ta "Reboot the machine" .It Cm c Ta Dv SIGTSTP Ta "Block further logins" .It Cm q Ta Dv SIGHUP Ta Rescan the .Xr ttys 5 file .El .Sh KERNEL ENVIRONMENT VARIABLES The following .Xr kenv 2 variables are available as .Xr loader 8 tunables: .Bl -tag -width indent .It Va init_chroot If set to a valid directory in the root file system, it causes .Nm to perform a .Xr chroot 2 operation on that directory, making it the new root directory. That happens before entering single-user mode or multi-user mode (but after executing the .Va init_script if enabled). This functionality has generally been eclipsed by rerooting. See .Xr reboot 8 .Fl r for details. +.It Va init_exec +If set to a valid file name in the root file system, +instructs +.Nm +to directly execute that file as the very first action, +replacing +.Nm +as PID 1. .It Va init_script If set to a valid file name in the root file system, instructs .Nm to run that script as the very first action, before doing anything else. Signal handling and exit code interpretation is similar to running the .Pa /etc/rc script. In particular, single-user operation is enforced if the script terminates with a non-zero exit code, or if a SIGTERM is delivered to the .Nm process (PID 1). This functionality has generally been eclipsed by rerooting. See .Xr reboot 8 .Fl r for details. .It Va init_shell Defines the shell binary to be used for executing the various shell scripts. The default is .Dq Li /bin/sh . It is used for running the +.Va init_exec +or .Va init_script if set, as well as for the .Pa /etc/rc and .Pa /etc/rc.shutdown scripts. The value of the corresponding .Xr kenv 2 variable is evaluated every time .Nm calls a shell script, so it can be changed later on using the .Xr kenv 1 utility. In particular, if a non-default shell is used for running an .Va init_script , it might be desirable to have that script reset the value of .Va init_shell back to the default, so that the .Pa /etc/rc script is executed with the standard shell .Pa /bin/sh . .Sh FILES .Bl -tag -width /var/log/init.log -compact .It Pa /dev/console system console device .It Pa /dev/tty* terminal ports found in .Xr ttys 5 .It Pa /etc/ttys the terminal initialization information file .It Pa /etc/rc system startup commands .It Pa /etc/rc.shutdown system shutdown commands .It Pa /var/log/init.log log of .Xr rc 8 output if the system console device is not available .El .Sh DIAGNOSTICS .Bl -diag .It "getty repeating too quickly on port %s, sleeping." A process being started to service a line is exiting quickly each time it is started. This is often caused by a ringing or noisy terminal line. .Bf -emphasis Init will sleep for 30 seconds, then continue trying to start the process. .Ef .It "some processes would not die; ps axl advised." A process is hung and could not be killed when the system was shutting down. This condition is usually caused by a process that is stuck in a device driver because of a persistent device error condition. .El .Sh SEE ALSO .Xr kill 1 , .Xr login 1 , .Xr sh 1 , .Xr ttys 5 , .Xr security 7 , .Xr getty 8 , .Xr halt 8 , .Xr jail 8 , .Xr rc 8 , .Xr reboot 8 , .Xr shutdown 8 , .Xr sysctl 8 .Sh HISTORY An .Nm utility appeared in .At v1 . .Sh CAVEATS Systems without .Xr sysctl 8 behave as though they have security level \-1. .Pp Setting the security level above 1 too early in the boot sequence can prevent .Xr fsck 8 from repairing inconsistent file systems. The preferred location to set the security level is at the end of .Pa /etc/rc after all multi-user startup actions are complete. Index: head/sbin/init/init.c =================================================================== --- head/sbin/init/init.c (revision 337739) +++ head/sbin/init/init.c (revision 337740) @@ -1,2037 +1,2059 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1991, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Donn Seeley at Berkeley Software Design, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef lint static const char copyright[] = "@(#) Copyright (c) 1991, 1993\n\ The Regents of the University of California. All rights reserved.\n"; #endif /* not lint */ #ifndef lint #if 0 static char sccsid[] = "@(#)init.c 8.1 (Berkeley) 7/15/93"; #endif static const char rcsid[] = "$FreeBSD$"; #endif /* not lint */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef SECURE #include #endif #ifdef LOGIN_CAP #include #endif #include "mntopts.h" #include "pathnames.h" /* * Sleep times; used to prevent thrashing. */ #define GETTY_SPACING 5 /* N secs minimum getty spacing */ #define GETTY_SLEEP 30 /* sleep N secs after spacing problem */ #define GETTY_NSPACE 3 /* max. spacing count to bring reaction */ #define WINDOW_WAIT 3 /* wait N secs after starting window */ #define STALL_TIMEOUT 30 /* wait N secs after warning */ #define DEATH_WATCH 10 /* wait N secs for procs to die */ #define DEATH_SCRIPT 120 /* wait for 2min for /etc/rc.shutdown */ #define RESOURCE_RC "daemon" #define RESOURCE_WINDOW "default" #define RESOURCE_GETTY "default" static void handle(sig_t, ...); static void delset(sigset_t *, ...); static void stall(const char *, ...) __printflike(1, 2); static void warning(const char *, ...) __printflike(1, 2); static void emergency(const char *, ...) __printflike(1, 2); static void disaster(int); static void badsys(int); static void revoke_ttys(void); static int runshutdown(void); static char *strk(char *); /* * We really need a recursive typedef... * The following at least guarantees that the return type of (*state_t)() * is sufficiently wide to hold a function pointer. */ typedef long (*state_func_t)(void); typedef state_func_t (*state_t)(void); static state_func_t single_user(void); static state_func_t runcom(void); static state_func_t read_ttys(void); static state_func_t multi_user(void); static state_func_t clean_ttys(void); static state_func_t catatonia(void); static state_func_t death(void); static state_func_t death_single(void); static state_func_t reroot(void); static state_func_t reroot_phase_two(void); static state_func_t run_script(const char *); static enum { AUTOBOOT, FASTBOOT } runcom_mode = AUTOBOOT; #define FALSE 0 #define TRUE 1 static int Reboot = FALSE; static int howto = RB_AUTOBOOT; static int devfs; static char *init_path_argv0; static void transition(state_t); static state_t requested_transition; static state_t current_state = death_single; static void execute_script(char *argv[]); static void open_console(void); static const char *get_shell(void); +static void replace_init(char *path); static void write_stderr(const char *message); typedef struct init_session { pid_t se_process; /* controlling process */ time_t se_started; /* used to avoid thrashing */ int se_flags; /* status of session */ #define SE_SHUTDOWN 0x1 /* session won't be restarted */ #define SE_PRESENT 0x2 /* session is in /etc/ttys */ #define SE_IFEXISTS 0x4 /* session defined as "onifexists" */ #define SE_IFCONSOLE 0x8 /* session defined as "onifconsole" */ int se_nspace; /* spacing count */ char *se_device; /* filename of port */ char *se_getty; /* what to run on that port */ char *se_getty_argv_space; /* pre-parsed argument array space */ char **se_getty_argv; /* pre-parsed argument array */ char *se_window; /* window system (started only once) */ char *se_window_argv_space; /* pre-parsed argument array space */ char **se_window_argv; /* pre-parsed argument array */ char *se_type; /* default terminal type */ struct init_session *se_prev; struct init_session *se_next; } session_t; static void free_session(session_t *); static session_t *new_session(session_t *, struct ttyent *); static session_t *sessions; static char **construct_argv(char *); static void start_window_system(session_t *); static void collect_child(pid_t); static pid_t start_getty(session_t *); static void transition_handler(int); static void alrm_handler(int); static void setsecuritylevel(int); static int getsecuritylevel(void); static int setupargv(session_t *, struct ttyent *); #ifdef LOGIN_CAP static void setprocresources(const char *); #endif static int clang; static int start_session_db(void); static void add_session(session_t *); static void del_session(session_t *); static session_t *find_session(pid_t); static DB *session_db; /* * The mother of all processes. */ int main(int argc, char *argv[]) { state_t initial_transition = runcom; char kenv_value[PATH_MAX]; int c, error; struct sigaction sa; sigset_t mask; /* Dispose of random users. */ if (getuid() != 0) errx(1, "%s", strerror(EPERM)); /* System V users like to reexec init. */ if (getpid() != 1) { #ifdef COMPAT_SYSV_INIT /* So give them what they want */ if (argc > 1) { if (strlen(argv[1]) == 1) { char runlevel = *argv[1]; int sig; switch (runlevel) { case '0': /* halt + poweroff */ sig = SIGUSR2; break; case '1': /* single-user */ sig = SIGTERM; break; case '6': /* reboot */ sig = SIGINT; break; case 'c': /* block further logins */ sig = SIGTSTP; break; case 'q': /* rescan /etc/ttys */ sig = SIGHUP; break; case 'r': /* remount root */ sig = SIGEMT; break; default: goto invalid; } kill(1, sig); _exit(0); } else invalid: errx(1, "invalid run-level ``%s''", argv[1]); } else #endif errx(1, "already running"); } init_path_argv0 = strdup(argv[0]); if (init_path_argv0 == NULL) err(1, "strdup"); /* * Note that this does NOT open a file... * Does 'init' deserve its own facility number? */ openlog("init", LOG_CONS, LOG_AUTH); /* * Create an initial session. */ if (setsid() < 0 && (errno != EPERM || getsid(0) != 1)) warning("initial setsid() failed: %m"); /* * Establish an initial user so that programs running * single user do not freak out and die (like passwd). */ if (setlogin("root") < 0) warning("setlogin() failed: %m"); /* * This code assumes that we always get arguments through flags, * never through bits set in some random machine register. */ while ((c = getopt(argc, argv, "dsfr")) != -1) switch (c) { case 'd': devfs = 1; break; case 's': initial_transition = single_user; break; case 'f': runcom_mode = FASTBOOT; break; case 'r': initial_transition = reroot_phase_two; break; default: warning("unrecognized flag '-%c'", c); break; } if (optind != argc) warning("ignoring excess arguments"); /* * We catch or block signals rather than ignore them, * so that they get reset on exec. */ handle(badsys, SIGSYS, 0); handle(disaster, SIGABRT, SIGFPE, SIGILL, SIGSEGV, SIGBUS, SIGXCPU, SIGXFSZ, 0); handle(transition_handler, SIGHUP, SIGINT, SIGEMT, SIGTERM, SIGTSTP, SIGUSR1, SIGUSR2, SIGWINCH, 0); handle(alrm_handler, SIGALRM, 0); sigfillset(&mask); delset(&mask, SIGABRT, SIGFPE, SIGILL, SIGSEGV, SIGBUS, SIGSYS, SIGXCPU, SIGXFSZ, SIGHUP, SIGINT, SIGEMT, SIGTERM, SIGTSTP, SIGALRM, SIGUSR1, SIGUSR2, SIGWINCH, 0); sigprocmask(SIG_SETMASK, &mask, NULL); sigemptyset(&sa.sa_mask); sa.sa_flags = 0; sa.sa_handler = SIG_IGN; sigaction(SIGTTIN, &sa, NULL); sigaction(SIGTTOU, &sa, NULL); /* * Paranoia. */ close(0); close(1); close(2); + if (kenv(KENV_GET, "init_exec", kenv_value, sizeof(kenv_value)) > 0) { + replace_init(kenv_value); + _exit(0); /* reboot */ + } + if (kenv(KENV_GET, "init_script", kenv_value, sizeof(kenv_value)) > 0) { state_func_t next_transition; if ((next_transition = run_script(kenv_value)) != NULL) initial_transition = (state_t) next_transition; } if (kenv(KENV_GET, "init_chroot", kenv_value, sizeof(kenv_value)) > 0) { if (chdir(kenv_value) != 0 || chroot(".") != 0) warning("Can't chroot to %s: %m", kenv_value); } /* * Additional check if devfs needs to be mounted: * If "/" and "/dev" have the same device number, * then it hasn't been mounted yet. */ if (!devfs) { struct stat stst; dev_t root_devno; stat("/", &stst); root_devno = stst.st_dev; if (stat("/dev", &stst) != 0) warning("Can't stat /dev: %m"); else if (stst.st_dev == root_devno) devfs++; } if (devfs) { struct iovec iov[4]; char *s; int i; char _fstype[] = "fstype"; char _devfs[] = "devfs"; char _fspath[] = "fspath"; char _path_dev[]= _PATH_DEV; iov[0].iov_base = _fstype; iov[0].iov_len = sizeof(_fstype); iov[1].iov_base = _devfs; iov[1].iov_len = sizeof(_devfs); iov[2].iov_base = _fspath; iov[2].iov_len = sizeof(_fspath); /* * Try to avoid the trailing slash in _PATH_DEV. * Be *very* defensive. */ s = strdup(_PATH_DEV); if (s != NULL) { i = strlen(s); if (i > 0 && s[i - 1] == '/') s[i - 1] = '\0'; iov[3].iov_base = s; iov[3].iov_len = strlen(s) + 1; } else { iov[3].iov_base = _path_dev; iov[3].iov_len = sizeof(_path_dev); } nmount(iov, 4, 0); if (s != NULL) free(s); } if (initial_transition != reroot_phase_two) { /* * Unmount reroot leftovers. This runs after init(8) * gets reexecuted after reroot_phase_two() is done. */ error = unmount(_PATH_REROOT, MNT_FORCE); if (error != 0 && errno != EINVAL) warning("Cannot unmount %s: %m", _PATH_REROOT); } /* * Start the state machine. */ transition(initial_transition); /* * Should never reach here. */ return 1; } /* * Associate a function with a signal handler. */ static void handle(sig_t handler, ...) { int sig; struct sigaction sa; sigset_t mask_everything; va_list ap; va_start(ap, handler); sa.sa_handler = handler; sigfillset(&mask_everything); while ((sig = va_arg(ap, int)) != 0) { sa.sa_mask = mask_everything; /* XXX SA_RESTART? */ sa.sa_flags = sig == SIGCHLD ? SA_NOCLDSTOP : 0; sigaction(sig, &sa, NULL); } va_end(ap); } /* * Delete a set of signals from a mask. */ static void delset(sigset_t *maskp, ...) { int sig; va_list ap; va_start(ap, maskp); while ((sig = va_arg(ap, int)) != 0) sigdelset(maskp, sig); va_end(ap); } /* * Log a message and sleep for a while (to give someone an opportunity * to read it and to save log or hardcopy output if the problem is chronic). * NB: should send a message to the session logger to avoid blocking. */ static void stall(const char *message, ...) { va_list ap; va_start(ap, message); vsyslog(LOG_ALERT, message, ap); va_end(ap); sleep(STALL_TIMEOUT); } /* * Like stall(), but doesn't sleep. * If cpp had variadic macros, the two functions could be #defines for another. * NB: should send a message to the session logger to avoid blocking. */ static void warning(const char *message, ...) { va_list ap; va_start(ap, message); vsyslog(LOG_ALERT, message, ap); va_end(ap); } /* * Log an emergency message. * NB: should send a message to the session logger to avoid blocking. */ static void emergency(const char *message, ...) { va_list ap; va_start(ap, message); vsyslog(LOG_EMERG, message, ap); va_end(ap); } /* * Catch a SIGSYS signal. * * These may arise if a system does not support sysctl. * We tolerate up to 25 of these, then throw in the towel. */ static void badsys(int sig) { static int badcount = 0; if (badcount++ < 25) return; disaster(sig); } /* * Catch an unexpected signal. */ static void disaster(int sig) { emergency("fatal signal: %s", (unsigned)sig < NSIG ? sys_siglist[sig] : "unknown signal"); sleep(STALL_TIMEOUT); _exit(sig); /* reboot */ } /* * Get the security level of the kernel. */ static int getsecuritylevel(void) { #ifdef KERN_SECURELVL int name[2], curlevel; size_t len; name[0] = CTL_KERN; name[1] = KERN_SECURELVL; len = sizeof curlevel; if (sysctl(name, 2, &curlevel, &len, NULL, 0) == -1) { emergency("cannot get kernel security level: %s", strerror(errno)); return (-1); } return (curlevel); #else return (-1); #endif } /* * Set the security level of the kernel. */ static void setsecuritylevel(int newlevel) { #ifdef KERN_SECURELVL int name[2], curlevel; curlevel = getsecuritylevel(); if (newlevel == curlevel) return; name[0] = CTL_KERN; name[1] = KERN_SECURELVL; if (sysctl(name, 2, NULL, NULL, &newlevel, sizeof newlevel) == -1) { emergency( "cannot change kernel security level from %d to %d: %s", curlevel, newlevel, strerror(errno)); return; } #ifdef SECURE warning("kernel security level changed from %d to %d", curlevel, newlevel); #endif #endif } /* * Change states in the finite state machine. * The initial state is passed as an argument. */ static void transition(state_t s) { current_state = s; for (;;) current_state = (state_t) (*current_state)(); } /* * Start a session and allocate a controlling terminal. * Only called by children of init after forking. */ static void open_console(void) { int fd; /* * Try to open /dev/console. Open the device with O_NONBLOCK to * prevent potential blocking on a carrier. */ revoke(_PATH_CONSOLE); if ((fd = open(_PATH_CONSOLE, O_RDWR | O_NONBLOCK)) != -1) { (void)fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) & ~O_NONBLOCK); if (login_tty(fd) == 0) return; close(fd); } /* No luck. Log output to file if possible. */ if ((fd = open(_PATH_DEVNULL, O_RDWR)) == -1) { stall("cannot open null device."); _exit(1); } if (fd != STDIN_FILENO) { dup2(fd, STDIN_FILENO); close(fd); } fd = open(_PATH_INITLOG, O_WRONLY | O_APPEND | O_CREAT, 0644); if (fd == -1) dup2(STDIN_FILENO, STDOUT_FILENO); else if (fd != STDOUT_FILENO) { dup2(fd, STDOUT_FILENO); close(fd); } dup2(STDOUT_FILENO, STDERR_FILENO); } static const char * get_shell(void) { static char kenv_value[PATH_MAX]; if (kenv(KENV_GET, "init_shell", kenv_value, sizeof(kenv_value)) > 0) return kenv_value; else return _PATH_BSHELL; } static void write_stderr(const char *message) { write(STDERR_FILENO, message, strlen(message)); } static int read_file(const char *path, void **bufp, size_t *bufsizep) { struct stat sb; size_t bufsize; void *buf; ssize_t nbytes; int error, fd; fd = open(path, O_RDONLY); if (fd < 0) { emergency("%s: %s", path, strerror(errno)); return (-1); } error = fstat(fd, &sb); if (error != 0) { emergency("fstat: %s", strerror(errno)); close(fd); return (error); } bufsize = sb.st_size; buf = malloc(bufsize); if (buf == NULL) { emergency("malloc: %s", strerror(errno)); close(fd); return (error); } nbytes = read(fd, buf, bufsize); if (nbytes != (ssize_t)bufsize) { emergency("read: %s", strerror(errno)); close(fd); free(buf); return (error); } error = close(fd); if (error != 0) { emergency("close: %s", strerror(errno)); free(buf); return (error); } *bufp = buf; *bufsizep = bufsize; return (0); } static int create_file(const char *path, const void *buf, size_t bufsize) { ssize_t nbytes; int error, fd; fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0700); if (fd < 0) { emergency("%s: %s", path, strerror(errno)); return (-1); } nbytes = write(fd, buf, bufsize); if (nbytes != (ssize_t)bufsize) { emergency("write: %s", strerror(errno)); close(fd); return (-1); } error = close(fd); if (error != 0) { emergency("close: %s", strerror(errno)); return (-1); } return (0); } static int mount_tmpfs(const char *fspath) { struct iovec *iov; char errmsg[255]; int error, iovlen; iov = NULL; iovlen = 0; memset(errmsg, 0, sizeof(errmsg)); build_iovec(&iov, &iovlen, "fstype", __DECONST(void *, "tmpfs"), (size_t)-1); build_iovec(&iov, &iovlen, "fspath", __DECONST(void *, fspath), (size_t)-1); build_iovec(&iov, &iovlen, "errmsg", errmsg, sizeof(errmsg)); error = nmount(iov, iovlen, 0); if (error != 0) { if (*errmsg != '\0') { emergency("cannot mount tmpfs on %s: %s: %s", fspath, errmsg, strerror(errno)); } else { emergency("cannot mount tmpfs on %s: %s", fspath, strerror(errno)); } return (error); } return (0); } static state_func_t reroot(void) { void *buf; size_t bufsize; int error; buf = NULL; bufsize = 0; revoke_ttys(); runshutdown(); /* * Make sure nobody can interfere with our scheme. * Ignore ESRCH, which can apparently happen when * there are no processes to kill. */ error = kill(-1, SIGKILL); if (error != 0 && errno != ESRCH) { emergency("kill(2) failed: %s", strerror(errno)); goto out; } /* * Copy the init binary into tmpfs, so that we can unmount * the old rootfs without committing suicide. */ error = read_file(init_path_argv0, &buf, &bufsize); if (error != 0) goto out; error = mount_tmpfs(_PATH_REROOT); if (error != 0) goto out; error = create_file(_PATH_REROOT_INIT, buf, bufsize); if (error != 0) goto out; /* * Execute the temporary init. */ execl(_PATH_REROOT_INIT, _PATH_REROOT_INIT, "-r", NULL); emergency("cannot exec %s: %s", _PATH_REROOT_INIT, strerror(errno)); out: emergency("reroot failed; going to single user mode"); free(buf); return (state_func_t) single_user; } static state_func_t reroot_phase_two(void) { char init_path[PATH_MAX], *path, *path_component; size_t init_path_len; int nbytes, error; /* * Ask the kernel to mount the new rootfs. */ error = reboot(RB_REROOT); if (error != 0) { emergency("RB_REBOOT failed: %s", strerror(errno)); goto out; } /* * Figure out where the destination init(8) binary is. Note that * the path could be different than what we've started with. Use * the value from kenv, if set, or the one from sysctl otherwise. * The latter defaults to a hardcoded value, but can be overridden * by a build time option. */ nbytes = kenv(KENV_GET, "init_path", init_path, sizeof(init_path)); if (nbytes <= 0) { init_path_len = sizeof(init_path); error = sysctlbyname("kern.init_path", init_path, &init_path_len, NULL, 0); if (error != 0) { emergency("failed to retrieve kern.init_path: %s", strerror(errno)); goto out; } } /* * Repeat the init search logic from sys/kern/init_path.c */ path_component = init_path; while ((path = strsep(&path_component, ":")) != NULL) { /* * Execute init(8) from the new rootfs. */ execl(path, path, NULL); } emergency("cannot exec init from %s: %s", init_path, strerror(errno)); out: emergency("reroot failed; going to single user mode"); return (state_func_t) single_user; } /* * Bring the system up single user. */ static state_func_t single_user(void) { pid_t pid, wpid; int status; sigset_t mask; const char *shell; char *argv[2]; struct timeval tv, tn; #ifdef SECURE struct ttyent *typ; struct passwd *pp; static const char banner[] = "Enter root password, or ^D to go multi-user\n"; char *clear, *password; #endif #ifdef DEBUGSHELL char altshell[128]; #endif if (Reboot) { /* Instead of going single user, let's reboot the machine */ sync(); if (reboot(howto) == -1) { emergency("reboot(%#x) failed, %s", howto, strerror(errno)); _exit(1); /* panic and reboot */ } warning("reboot(%#x) returned", howto); _exit(0); /* panic as well */ } shell = get_shell(); if ((pid = fork()) == 0) { /* * Start the single user session. */ open_console(); #ifdef SECURE /* * Check the root password. * We don't care if the console is 'on' by default; * it's the only tty that can be 'off' and 'secure'. */ typ = getttynam("console"); pp = getpwnam("root"); if (typ && (typ->ty_status & TTY_SECURE) == 0 && pp && *pp->pw_passwd) { write_stderr(banner); for (;;) { clear = getpass("Password:"); if (clear == NULL || *clear == '\0') _exit(0); password = crypt(clear, pp->pw_passwd); bzero(clear, _PASSWORD_LEN); if (password != NULL && strcmp(password, pp->pw_passwd) == 0) break; warning("single-user login failed\n"); } } endttyent(); endpwent(); #endif /* SECURE */ #ifdef DEBUGSHELL { char *cp = altshell; int num; #define SHREQUEST "Enter full pathname of shell or RETURN for " write_stderr(SHREQUEST); write_stderr(shell); write_stderr(": "); while ((num = read(STDIN_FILENO, cp, 1)) != -1 && num != 0 && *cp != '\n' && cp < &altshell[127]) cp++; *cp = '\0'; if (altshell[0] != '\0') shell = altshell; } #endif /* DEBUGSHELL */ /* * Unblock signals. * We catch all the interesting ones, * and those are reset to SIG_DFL on exec. */ sigemptyset(&mask); sigprocmask(SIG_SETMASK, &mask, NULL); /* * Fire off a shell. * If the default one doesn't work, try the Bourne shell. */ char name[] = "-sh"; argv[0] = name; argv[1] = 0; execv(shell, argv); emergency("can't exec %s for single user: %m", shell); execv(_PATH_BSHELL, argv); emergency("can't exec %s for single user: %m", _PATH_BSHELL); sleep(STALL_TIMEOUT); _exit(1); } if (pid == -1) { /* * We are seriously hosed. Do our best. */ emergency("can't fork single-user shell, trying again"); while (waitpid(-1, (int *) 0, WNOHANG) > 0) continue; return (state_func_t) single_user; } requested_transition = 0; do { if ((wpid = waitpid(-1, &status, WUNTRACED)) != -1) collect_child(wpid); if (wpid == -1) { if (errno == EINTR) continue; warning("wait for single-user shell failed: %m; restarting"); return (state_func_t) single_user; } if (wpid == pid && WIFSTOPPED(status)) { warning("init: shell stopped, restarting\n"); kill(pid, SIGCONT); wpid = -1; } } while (wpid != pid && !requested_transition); if (requested_transition) return (state_func_t) requested_transition; if (!WIFEXITED(status)) { if (WTERMSIG(status) == SIGKILL) { /* * reboot(8) killed shell? */ warning("single user shell terminated."); gettimeofday(&tv, NULL); tn = tv; tv.tv_sec += STALL_TIMEOUT; while (tv.tv_sec > tn.tv_sec || (tv.tv_sec == tn.tv_sec && tv.tv_usec > tn.tv_usec)) { sleep(1); gettimeofday(&tn, NULL); } _exit(0); } else { warning("single user shell terminated, restarting"); return (state_func_t) single_user; } } runcom_mode = FASTBOOT; return (state_func_t) runcom; } /* * Run the system startup script. */ static state_func_t runcom(void) { state_func_t next_transition; if ((next_transition = run_script(_PATH_RUNCOM)) != NULL) return next_transition; runcom_mode = AUTOBOOT; /* the default */ return (state_func_t) read_ttys; } static void execute_script(char *argv[]) { struct sigaction sa; const char *shell, *script; int error; bzero(&sa, sizeof(sa)); sigemptyset(&sa.sa_mask); sa.sa_handler = SIG_IGN; sigaction(SIGTSTP, &sa, NULL); sigaction(SIGHUP, &sa, NULL); open_console(); sigprocmask(SIG_SETMASK, &sa.sa_mask, NULL); #ifdef LOGIN_CAP setprocresources(RESOURCE_RC); #endif /* * Try to directly execute the script first. If it * fails, try the old method of passing the script path * to sh(1). Don't complain if it fails because of * the missing execute bit. */ script = argv[1]; error = access(script, X_OK); if (error == 0) { execv(script, argv + 1); warning("can't exec %s: %m", script); } else if (errno != EACCES) { warning("can't access %s: %m", script); } shell = get_shell(); execv(shell, argv); stall("can't exec %s for %s: %m", shell, script); +} + +/* + * Execute binary, replacing init(8) as PID 1. + */ +static void +replace_init(char *path) +{ + char *argv[3]; + char sh[] = "sh"; + + argv[0] = sh; + argv[1] = path; + argv[2] = NULL; + + execute_script(argv); } /* * Run a shell script. * Returns 0 on success, otherwise the next transition to enter: * - single_user if fork/execv/waitpid failed, or if the script * terminated with a signal or exit code != 0. * - death_single if a SIGTERM was delivered to init(8). */ static state_func_t run_script(const char *script) { pid_t pid, wpid; int status; char *argv[4]; const char *shell; shell = get_shell(); if ((pid = fork()) == 0) { char _sh[] = "sh"; char _autoboot[] = "autoboot"; argv[0] = _sh; argv[1] = __DECONST(char *, script); argv[2] = runcom_mode == AUTOBOOT ? _autoboot : 0; argv[3] = 0; execute_script(argv); sleep(STALL_TIMEOUT); _exit(1); /* force single user mode */ } if (pid == -1) { emergency("can't fork for %s on %s: %m", shell, script); while (waitpid(-1, (int *) 0, WNOHANG) > 0) continue; sleep(STALL_TIMEOUT); return (state_func_t) single_user; } /* * Copied from single_user(). This is a bit paranoid. */ requested_transition = 0; do { if ((wpid = waitpid(-1, &status, WUNTRACED)) != -1) collect_child(wpid); if (wpid == -1) { if (requested_transition == death_single || requested_transition == reroot) return (state_func_t) requested_transition; if (errno == EINTR) continue; warning("wait for %s on %s failed: %m; going to " "single user mode", shell, script); return (state_func_t) single_user; } if (wpid == pid && WIFSTOPPED(status)) { warning("init: %s on %s stopped, restarting\n", shell, script); kill(pid, SIGCONT); wpid = -1; } } while (wpid != pid); if (WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM && requested_transition == catatonia) { /* /etc/rc executed /sbin/reboot; wait for the end quietly */ sigset_t s; sigfillset(&s); for (;;) sigsuspend(&s); } if (!WIFEXITED(status)) { warning("%s on %s terminated abnormally, going to single " "user mode", shell, script); return (state_func_t) single_user; } if (WEXITSTATUS(status)) return (state_func_t) single_user; return (state_func_t) 0; } /* * Open the session database. * * NB: We could pass in the size here; is it necessary? */ static int start_session_db(void) { if (session_db && (*session_db->close)(session_db)) emergency("session database close: %s", strerror(errno)); if ((session_db = dbopen(NULL, O_RDWR, 0, DB_HASH, NULL)) == NULL) { emergency("session database open: %s", strerror(errno)); return (1); } return (0); } /* * Add a new login session. */ static void add_session(session_t *sp) { DBT key; DBT data; key.data = &sp->se_process; key.size = sizeof sp->se_process; data.data = &sp; data.size = sizeof sp; if ((*session_db->put)(session_db, &key, &data, 0)) emergency("insert %d: %s", sp->se_process, strerror(errno)); } /* * Delete an old login session. */ static void del_session(session_t *sp) { DBT key; key.data = &sp->se_process; key.size = sizeof sp->se_process; if ((*session_db->del)(session_db, &key, 0)) emergency("delete %d: %s", sp->se_process, strerror(errno)); } /* * Look up a login session by pid. */ static session_t * find_session(pid_t pid) { DBT key; DBT data; session_t *ret; key.data = &pid; key.size = sizeof pid; if ((*session_db->get)(session_db, &key, &data, 0) != 0) return 0; bcopy(data.data, (char *)&ret, sizeof(ret)); return ret; } /* * Construct an argument vector from a command line. */ static char ** construct_argv(char *command) { int argc = 0; char **argv = (char **) malloc(((strlen(command) + 1) / 2 + 1) * sizeof (char *)); if ((argv[argc++] = strk(command)) == NULL) { free(argv); return (NULL); } while ((argv[argc++] = strk((char *) 0)) != NULL) continue; return argv; } /* * Deallocate a session descriptor. */ static void free_session(session_t *sp) { free(sp->se_device); if (sp->se_getty) { free(sp->se_getty); free(sp->se_getty_argv_space); free(sp->se_getty_argv); } if (sp->se_window) { free(sp->se_window); free(sp->se_window_argv_space); free(sp->se_window_argv); } if (sp->se_type) free(sp->se_type); free(sp); } /* * Allocate a new session descriptor. * Mark it SE_PRESENT. */ static session_t * new_session(session_t *sprev, struct ttyent *typ) { session_t *sp; if ((typ->ty_status & TTY_ON) == 0 || typ->ty_name == 0 || typ->ty_getty == 0) return 0; sp = (session_t *) calloc(1, sizeof (session_t)); sp->se_flags |= SE_PRESENT; if ((typ->ty_status & TTY_IFEXISTS) != 0) sp->se_flags |= SE_IFEXISTS; if ((typ->ty_status & TTY_IFCONSOLE) != 0) sp->se_flags |= SE_IFCONSOLE; if (asprintf(&sp->se_device, "%s%s", _PATH_DEV, typ->ty_name) < 0) err(1, "asprintf"); if (setupargv(sp, typ) == 0) { free_session(sp); return (0); } sp->se_next = 0; if (sprev == NULL) { sessions = sp; sp->se_prev = 0; } else { sprev->se_next = sp; sp->se_prev = sprev; } return sp; } /* * Calculate getty and if useful window argv vectors. */ static int setupargv(session_t *sp, struct ttyent *typ) { if (sp->se_getty) { free(sp->se_getty); free(sp->se_getty_argv_space); free(sp->se_getty_argv); } if (asprintf(&sp->se_getty, "%s %s", typ->ty_getty, typ->ty_name) < 0) err(1, "asprintf"); sp->se_getty_argv_space = strdup(sp->se_getty); sp->se_getty_argv = construct_argv(sp->se_getty_argv_space); if (sp->se_getty_argv == NULL) { warning("can't parse getty for port %s", sp->se_device); free(sp->se_getty); free(sp->se_getty_argv_space); sp->se_getty = sp->se_getty_argv_space = 0; return (0); } if (sp->se_window) { free(sp->se_window); free(sp->se_window_argv_space); free(sp->se_window_argv); } sp->se_window = sp->se_window_argv_space = 0; sp->se_window_argv = 0; if (typ->ty_window) { sp->se_window = strdup(typ->ty_window); sp->se_window_argv_space = strdup(sp->se_window); sp->se_window_argv = construct_argv(sp->se_window_argv_space); if (sp->se_window_argv == NULL) { warning("can't parse window for port %s", sp->se_device); free(sp->se_window_argv_space); free(sp->se_window); sp->se_window = sp->se_window_argv_space = 0; return (0); } } if (sp->se_type) free(sp->se_type); sp->se_type = typ->ty_type ? strdup(typ->ty_type) : 0; return (1); } /* * Walk the list of ttys and create sessions for each active line. */ static state_func_t read_ttys(void) { session_t *sp, *snext; struct ttyent *typ; /* * Destroy any previous session state. * There shouldn't be any, but just in case... */ for (sp = sessions; sp; sp = snext) { snext = sp->se_next; free_session(sp); } sessions = 0; if (start_session_db()) return (state_func_t) single_user; /* * Allocate a session entry for each active port. * Note that sp starts at 0. */ while ((typ = getttyent()) != NULL) if ((snext = new_session(sp, typ)) != NULL) sp = snext; endttyent(); return (state_func_t) multi_user; } /* * Start a window system running. */ static void start_window_system(session_t *sp) { pid_t pid; sigset_t mask; char term[64], *env[2]; int status; if ((pid = fork()) == -1) { emergency("can't fork for window system on port %s: %m", sp->se_device); /* hope that getty fails and we can try again */ return; } if (pid) { waitpid(-1, &status, 0); return; } /* reparent window process to the init to not make a zombie on exit */ if ((pid = fork()) == -1) { emergency("can't fork for window system on port %s: %m", sp->se_device); _exit(1); } if (pid) _exit(0); sigemptyset(&mask); sigprocmask(SIG_SETMASK, &mask, NULL); if (setsid() < 0) emergency("setsid failed (window) %m"); #ifdef LOGIN_CAP setprocresources(RESOURCE_WINDOW); #endif if (sp->se_type) { /* Don't use malloc after fork */ strcpy(term, "TERM="); strlcat(term, sp->se_type, sizeof(term)); env[0] = term; env[1] = 0; } else env[0] = 0; execve(sp->se_window_argv[0], sp->se_window_argv, env); stall("can't exec window system '%s' for port %s: %m", sp->se_window_argv[0], sp->se_device); _exit(1); } /* * Start a login session running. */ static pid_t start_getty(session_t *sp) { pid_t pid; sigset_t mask; time_t current_time = time((time_t *) 0); int too_quick = 0; char term[64], *env[2]; if (current_time >= sp->se_started && current_time - sp->se_started < GETTY_SPACING) { if (++sp->se_nspace > GETTY_NSPACE) { sp->se_nspace = 0; too_quick = 1; } } else sp->se_nspace = 0; /* * fork(), not vfork() -- we can't afford to block. */ if ((pid = fork()) == -1) { emergency("can't fork for getty on port %s: %m", sp->se_device); return -1; } if (pid) return pid; if (too_quick) { warning("getty repeating too quickly on port %s, sleeping %d secs", sp->se_device, GETTY_SLEEP); sleep((unsigned) GETTY_SLEEP); } if (sp->se_window) { start_window_system(sp); sleep(WINDOW_WAIT); } sigemptyset(&mask); sigprocmask(SIG_SETMASK, &mask, NULL); #ifdef LOGIN_CAP setprocresources(RESOURCE_GETTY); #endif if (sp->se_type) { /* Don't use malloc after fork */ strcpy(term, "TERM="); strlcat(term, sp->se_type, sizeof(term)); env[0] = term; env[1] = 0; } else env[0] = 0; execve(sp->se_getty_argv[0], sp->se_getty_argv, env); stall("can't exec getty '%s' for port %s: %m", sp->se_getty_argv[0], sp->se_device); _exit(1); } /* * Return 1 if the session is defined as "onifexists" * or "onifconsole" and the device node does not exist. */ static int session_has_no_tty(session_t *sp) { int fd; if ((sp->se_flags & SE_IFEXISTS) == 0 && (sp->se_flags & SE_IFCONSOLE) == 0) return (0); fd = open(sp->se_device, O_RDONLY | O_NONBLOCK, 0); if (fd < 0) { if (errno == ENOENT) return (1); return (0); } close(fd); return (0); } /* * Collect exit status for a child. * If an exiting login, start a new login running. */ static void collect_child(pid_t pid) { session_t *sp, *sprev, *snext; if (! sessions) return; if (! (sp = find_session(pid))) return; del_session(sp); sp->se_process = 0; if (sp->se_flags & SE_SHUTDOWN || session_has_no_tty(sp)) { if ((sprev = sp->se_prev) != NULL) sprev->se_next = sp->se_next; else sessions = sp->se_next; if ((snext = sp->se_next) != NULL) snext->se_prev = sp->se_prev; free_session(sp); return; } if ((pid = start_getty(sp)) == -1) { /* serious trouble */ requested_transition = clean_ttys; return; } sp->se_process = pid; sp->se_started = time((time_t *) 0); add_session(sp); } /* * Catch a signal and request a state transition. */ static void transition_handler(int sig) { switch (sig) { case SIGHUP: if (current_state == read_ttys || current_state == multi_user || current_state == clean_ttys || current_state == catatonia) requested_transition = clean_ttys; break; case SIGWINCH: case SIGUSR2: howto = sig == SIGUSR2 ? RB_POWEROFF : RB_POWERCYCLE; case SIGUSR1: howto |= RB_HALT; case SIGINT: Reboot = TRUE; case SIGTERM: if (current_state == read_ttys || current_state == multi_user || current_state == clean_ttys || current_state == catatonia) requested_transition = death; else requested_transition = death_single; break; case SIGTSTP: if (current_state == runcom || current_state == read_ttys || current_state == clean_ttys || current_state == multi_user || current_state == catatonia) requested_transition = catatonia; break; case SIGEMT: requested_transition = reroot; break; default: requested_transition = 0; break; } } /* * Take the system multiuser. */ static state_func_t multi_user(void) { pid_t pid; session_t *sp; requested_transition = 0; /* * If the administrator has not set the security level to -1 * to indicate that the kernel should not run multiuser in secure * mode, and the run script has not set a higher level of security * than level 1, then put the kernel into secure mode. */ if (getsecuritylevel() == 0) setsecuritylevel(1); for (sp = sessions; sp; sp = sp->se_next) { if (sp->se_process) continue; if (session_has_no_tty(sp)) continue; if ((pid = start_getty(sp)) == -1) { /* serious trouble */ requested_transition = clean_ttys; break; } sp->se_process = pid; sp->se_started = time((time_t *) 0); add_session(sp); } while (!requested_transition) if ((pid = waitpid(-1, (int *) 0, 0)) != -1) collect_child(pid); return (state_func_t) requested_transition; } /* * This is an (n*2)+(n^2) algorithm. We hope it isn't run often... */ static state_func_t clean_ttys(void) { session_t *sp, *sprev; struct ttyent *typ; int devlen; char *old_getty, *old_window, *old_type; /* * mark all sessions for death, (!SE_PRESENT) * as we find or create new ones they'll be marked as keepers, * we'll later nuke all the ones not found in /etc/ttys */ for (sp = sessions; sp != NULL; sp = sp->se_next) sp->se_flags &= ~SE_PRESENT; devlen = sizeof(_PATH_DEV) - 1; while ((typ = getttyent()) != NULL) { for (sprev = 0, sp = sessions; sp; sprev = sp, sp = sp->se_next) if (strcmp(typ->ty_name, sp->se_device + devlen) == 0) break; if (sp) { /* we want this one to live */ sp->se_flags |= SE_PRESENT; if ((typ->ty_status & TTY_ON) == 0 || typ->ty_getty == 0) { sp->se_flags |= SE_SHUTDOWN; kill(sp->se_process, SIGHUP); continue; } sp->se_flags &= ~SE_SHUTDOWN; old_getty = sp->se_getty ? strdup(sp->se_getty) : 0; old_window = sp->se_window ? strdup(sp->se_window) : 0; old_type = sp->se_type ? strdup(sp->se_type) : 0; if (setupargv(sp, typ) == 0) { warning("can't parse getty for port %s", sp->se_device); sp->se_flags |= SE_SHUTDOWN; kill(sp->se_process, SIGHUP); } else if ( !old_getty || (!old_type && sp->se_type) || (old_type && !sp->se_type) || (!old_window && sp->se_window) || (old_window && !sp->se_window) || (strcmp(old_getty, sp->se_getty) != 0) || (old_window && strcmp(old_window, sp->se_window) != 0) || (old_type && strcmp(old_type, sp->se_type) != 0) ) { /* Don't set SE_SHUTDOWN here */ sp->se_nspace = 0; sp->se_started = 0; kill(sp->se_process, SIGHUP); } if (old_getty) free(old_getty); if (old_window) free(old_window); if (old_type) free(old_type); continue; } new_session(sprev, typ); } endttyent(); /* * sweep through and kill all deleted sessions * ones who's /etc/ttys line was deleted (SE_PRESENT unset) */ for (sp = sessions; sp != NULL; sp = sp->se_next) { if ((sp->se_flags & SE_PRESENT) == 0) { sp->se_flags |= SE_SHUTDOWN; kill(sp->se_process, SIGHUP); } } return (state_func_t) multi_user; } /* * Block further logins. */ static state_func_t catatonia(void) { session_t *sp; for (sp = sessions; sp; sp = sp->se_next) sp->se_flags |= SE_SHUTDOWN; return (state_func_t) multi_user; } /* * Note SIGALRM. */ static void alrm_handler(int sig) { (void)sig; clang = 1; } /* * Bring the system down to single user. */ static state_func_t death(void) { int block, blocked; size_t len; /* Temporarily block suspend. */ len = sizeof(blocked); block = 1; if (sysctlbyname("kern.suspend_blocked", &blocked, &len, &block, sizeof(block)) == -1) blocked = 0; /* * Also revoke the TTY here. Because runshutdown() may reopen * the TTY whose getty we're killing here, there is no guarantee * runshutdown() will perform the initial open() call, causing * the terminal attributes to be misconfigured. */ revoke_ttys(); /* Try to run the rc.shutdown script within a period of time */ runshutdown(); /* Unblock suspend if we blocked it. */ if (!blocked) sysctlbyname("kern.suspend_blocked", NULL, NULL, &blocked, sizeof(blocked)); return (state_func_t) death_single; } /* * Do what is necessary to reinitialize single user mode or reboot * from an incomplete state. */ static state_func_t death_single(void) { int i; pid_t pid; static const int death_sigs[2] = { SIGTERM, SIGKILL }; revoke(_PATH_CONSOLE); for (i = 0; i < 2; ++i) { if (kill(-1, death_sigs[i]) == -1 && errno == ESRCH) return (state_func_t) single_user; clang = 0; alarm(DEATH_WATCH); do if ((pid = waitpid(-1, (int *)0, 0)) != -1) collect_child(pid); while (clang == 0 && errno != ECHILD); if (errno == ECHILD) return (state_func_t) single_user; } warning("some processes would not die; ps axl advised"); return (state_func_t) single_user; } static void revoke_ttys(void) { session_t *sp; for (sp = sessions; sp; sp = sp->se_next) { sp->se_flags |= SE_SHUTDOWN; kill(sp->se_process, SIGHUP); revoke(sp->se_device); } } /* * Run the system shutdown script. * * Exit codes: XXX I should document more * -2 shutdown script terminated abnormally * -1 fatal error - can't run script * 0 good. * >0 some error (exit code) */ static int runshutdown(void) { pid_t pid, wpid; int status; int shutdowntimeout; size_t len; char *argv[4]; struct stat sb; /* * rc.shutdown is optional, so to prevent any unnecessary * complaints from the shell we simply don't run it if the * file does not exist. If the stat() here fails for other * reasons, we'll let the shell complain. */ if (stat(_PATH_RUNDOWN, &sb) == -1 && errno == ENOENT) return 0; if ((pid = fork()) == 0) { char _sh[] = "sh"; char _reboot[] = "reboot"; char _single[] = "single"; char _path_rundown[] = _PATH_RUNDOWN; argv[0] = _sh; argv[1] = _path_rundown; argv[2] = Reboot ? _reboot : _single; argv[3] = 0; execute_script(argv); _exit(1); /* force single user mode */ } if (pid == -1) { emergency("can't fork for %s: %m", _PATH_RUNDOWN); while (waitpid(-1, (int *) 0, WNOHANG) > 0) continue; sleep(STALL_TIMEOUT); return -1; } len = sizeof(shutdowntimeout); if (sysctlbyname("kern.init_shutdown_timeout", &shutdowntimeout, &len, NULL, 0) == -1 || shutdowntimeout < 2) shutdowntimeout = DEATH_SCRIPT; alarm(shutdowntimeout); clang = 0; /* * Copied from single_user(). This is a bit paranoid. * Use the same ALRM handler. */ do { if ((wpid = waitpid(-1, &status, WUNTRACED)) != -1) collect_child(wpid); if (clang == 1) { /* we were waiting for the sub-shell */ kill(wpid, SIGTERM); warning("timeout expired for %s: %m; going to " "single user mode", _PATH_RUNDOWN); return -1; } if (wpid == -1) { if (errno == EINTR) continue; warning("wait for %s failed: %m; going to " "single user mode", _PATH_RUNDOWN); return -1; } if (wpid == pid && WIFSTOPPED(status)) { warning("init: %s stopped, restarting\n", _PATH_RUNDOWN); kill(pid, SIGCONT); wpid = -1; } } while (wpid != pid && !clang); /* Turn off the alarm */ alarm(0); if (WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM && requested_transition == catatonia) { /* * /etc/rc.shutdown executed /sbin/reboot; * wait for the end quietly */ sigset_t s; sigfillset(&s); for (;;) sigsuspend(&s); } if (!WIFEXITED(status)) { warning("%s terminated abnormally, going to " "single user mode", _PATH_RUNDOWN); return -2; } if ((status = WEXITSTATUS(status)) != 0) warning("%s returned status %d", _PATH_RUNDOWN, status); return status; } static char * strk(char *p) { static char *t; char *q; int c; if (p) t = p; if (!t) return 0; c = *t; while (c == ' ' || c == '\t' ) c = *++t; if (!c) { t = 0; return 0; } q = t; if (c == '\'') { c = *++t; q = t; while (c && c != '\'') c = *++t; if (!c) /* unterminated string */ q = t = 0; else *t++ = 0; } else { while (c && c != ' ' && c != '\t' ) c = *++t; *t++ = 0; if (!c) t = 0; } return q; } #ifdef LOGIN_CAP static void setprocresources(const char *cname) { login_cap_t *lc; if ((lc = login_getclassbyname(cname, NULL)) != NULL) { setusercontext(lc, (struct passwd*)NULL, 0, LOGIN_SETPRIORITY | LOGIN_SETRESOURCES | LOGIN_SETLOGINCLASS | LOGIN_SETCPUMASK); login_close(lc); } } #endif Index: head/stand/man/loader.8 =================================================================== --- head/stand/man/loader.8 (revision 337739) +++ head/stand/man/loader.8 (revision 337740) @@ -1,1089 +1,1092 @@ .\" Copyright (c) 1999 Daniel C. Sobral .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd August 13, 2018 +.Dd August 14, 2018 .Dt LOADER 8 .Os .Sh NAME .Nm loader .Nd kernel bootstrapping final stage .Sh DESCRIPTION The program called .Nm is the final stage of .Fx Ns 's kernel bootstrapping process. On IA32 (i386) architectures, it is a .Pa BTX client. It is linked statically to .Xr libstand 3 and usually located in the directory .Pa /boot . .Pp It provides a scripting language that can be used to automate tasks, do pre-configuration or assist in recovery procedures. This scripting language is roughly divided in two main components. The smaller one is a set of commands designed for direct use by the casual user, called "builtin commands" for historical reasons. The main drive behind these commands is user-friendliness. The bigger component is an .Tn ANS Forth compatible Forth interpreter based on FICL, by .An John Sadler . .Pp During initialization, .Nm will probe for a console and set the .Va console variable, or set it to serial console .Pq Dq Li comconsole if the previous boot stage used that. If multiple consoles are selected, they will be listed separated by spaces. Then, devices are probed, .Va currdev and .Va loaddev are set, and .Va LINES is set to 24. Next, .Tn FICL is initialized, the builtin words are added to its vocabulary, and .Pa /boot/boot.4th is processed if it exists. No disk switching is possible while that file is being read. The inner interpreter .Nm will use with .Tn FICL is then set to .Ic interpret , which is .Tn FICL Ns 's default. After that, .Pa /boot/loader.rc is processed if available. These files are processed through the .Ic include command, which reads all of them into memory before processing them, making disk changes possible. .Pp At this point, if an .Ic autoboot has not been tried, and if .Va autoboot_delay is not set to .Dq Li NO (not case sensitive), then an .Ic autoboot will be tried. If the system gets past this point, .Va prompt will be set and .Nm will engage interactive mode. Please note that historically even when .Va autoboot_delay is set to .Dq Li 0 user will be able to interrupt autoboot process by pressing some key on the console while kernel and modules are being loaded. In some cases such behaviour may be undesirable, to prevent it set .Va autoboot_delay to .Dq Li -1 , in this case .Nm will engage interactive mode only if .Ic autoboot has failed. .Sh BUILTIN COMMANDS In .Nm , builtin commands take parameters from the command line. Presently, the only way to call them from a script is by using .Pa evaluate on a string. If an error condition occurs, an exception will be generated, which can be intercepted using .Tn ANS Forth exception handling words. If not intercepted, an error message will be displayed and the interpreter's state will be reset, emptying the stack and restoring interpreting mode. .Pp The builtin commands available are: .Pp .Bl -tag -width Ds -compact .It Ic autoboot Op Ar seconds Op Ar prompt Proceeds to bootstrap the system after a number of seconds, if not interrupted by the user. Displays a countdown prompt warning the user the system is about to be booted, unless interrupted by a key press. The kernel will be loaded first if necessary. Defaults to 10 seconds. .Pp .It Ic bcachestat Displays statistics about disk cache usage. For debugging only. .Pp .It Ic boot .It Ic boot Ar kernelname Op Cm ... .It Ic boot Fl flag Cm ... Immediately proceeds to bootstrap the system, loading the kernel if necessary. Any flags or arguments are passed to the kernel, but they must precede the kernel name, if a kernel name is provided. .Pp .Em WARNING : The behavior of this builtin is changed if .Xr loader.4th 8 is loaded. .Pp .It Ic echo Xo .Op Fl n .Op Aq message .Xc Displays text on the screen. A new line will be printed unless .Fl n is specified. .Pp .It Ic heap Displays memory usage statistics. For debugging purposes only. .Pp .It Ic help Op topic Op subtopic Shows help messages read from .Pa /boot/loader.help . The special topic .Em index will list the topics available. .Pp .It Ic include Ar file Op Ar Process script files. Each file, in turn, is completely read into memory, and then each of its lines is passed to the command line interpreter. If any error is returned by the interpreter, the include command aborts immediately, without reading any other files, and returns an error itself (see .Sx ERRORS ) . .Pp .It Ic load Xo .Op Fl t Ar type .Ar file Cm ... .Xc Loads a kernel, kernel loadable module (kld), disk image, or file of opaque contents tagged as being of the type .Ar type . Kernel and modules can be either in a.out or ELF format. Any arguments passed after the name of the file to be loaded will be passed as arguments to that file. Use the .Li md_image type to make the kernel create a file-backed .Xr md 4 disk. This is useful for booting from a temporary rootfs. Currently, argument passing does not work for the kernel. .Pp .It Ic load_geli Xo .Op Fl n Ar keyno .Ar prov Ar file .Xc Loads a .Xr geli 8 encryption keyfile for the given provider name. The key index can be specified via .Ar keyno or will default to zero. .Pp .It Ic ls Xo .Op Fl l .Op Ar path .Xc Displays a listing of files in the directory .Ar path , or the root directory if .Ar path is not specified. If .Fl l is specified, file sizes will be shown too. .Pp .It Ic lsdev Op Fl v Lists all of the devices from which it may be possible to load modules, as well as ZFS pools. If .Fl v is specified, more details are printed, including ZFS pool information in a format that resembles .Nm zpool Cm status output. .Pp .It Ic lsmod Op Fl v Displays loaded modules. If .Fl v is specified, more details are shown. .Pp .It Ic lszfs Ar filesystem A ZFS extended command that can be used to explore the ZFS filesystem hierarchy in a pool. Lists the immediate children of the .Ar filesystem . The filesystem hierarchy is rooted at a filesystem with the same name as the pool. .Pp .It Ic more Ar file Op Ar Display the files specified, with a pause at each .Va LINES displayed. .Pp .It Ic pnpscan Op Fl v Scans for Plug-and-Play devices. This is not functional at present. .Pp .It Ic read Xo .Op Fl t Ar seconds .Op Fl p Ar prompt .Op Va variable .Xc Reads a line of input from the terminal, storing it in .Va variable if specified. A timeout can be specified with .Fl t , though it will be canceled at the first key pressed. A prompt may also be displayed through the .Fl p flag. .Pp .It Ic reboot Immediately reboots the system. .Pp .It Ic set Ar variable .It Ic set Ar variable Ns = Ns Ar value Set loader's environment variables. .Pp .It Ic show Op Va variable Displays the specified variable's value, or all variables and their values if .Va variable is not specified. .Pp .It Ic unload Remove all modules from memory. .Pp .It Ic unset Va variable Removes .Va variable from the environment. .Pp .It Ic \&? Lists available commands. .El .Ss BUILTIN ENVIRONMENT VARIABLES The .Nm has actually two different kinds of .Sq environment variables. There are ANS Forth's .Em environmental queries , and a separate space of environment variables used by builtins, which are not directly available to Forth words. It is the latter type that this section covers. .Pp Environment variables can be set and unset through the .Ic set and .Ic unset builtins, and can have their values interactively examined through the use of the .Ic show builtin. Their values can also be accessed as described in .Sx BUILTIN PARSER . .Pp Notice that these environment variables are not inherited by any shell after the system has been booted. .Pp A few variables are set automatically by .Nm . Others can affect the behavior of either .Nm or the kernel at boot. Some options may require a value, while others define behavior just by being set. Both types of builtin variables are described below. .Bl -tag -width bootfile .It Va autoboot_delay Number of seconds .Ic autoboot will wait before booting. If this variable is not defined, .Ic autoboot will default to 10 seconds. .Pp If set to .Dq Li NO , no .Ic autoboot will be automatically attempted after processing .Pa /boot/loader.rc , though explicit .Ic autoboot Ns 's will be processed normally, defaulting to 10 seconds delay. .Pp If set to .Dq Li 0 , no delay will be inserted, but user still will be able to interrupt .Ic autoboot process and escape into the interactive mode by pressing some key on the console while kernel and modules are being loaded. .Pp If set to .Dq Li -1 , no delay will be inserted and .Nm will engage interactive mode only if .Ic autoboot has failed for some reason. .It Va boot_askname Instructs the kernel to prompt the user for the name of the root device when the kernel is booted. .It Va boot_cdrom Instructs the kernel to try to mount the root file system from CD-ROM. .It Va boot_ddb Instructs the kernel to start in the DDB debugger, rather than proceeding to initialize when booted. .It Va boot_dfltroot Instructs the kernel to mount the statically compiled-in root file system. .It Va boot_gdb Selects gdb-remote mode for the kernel debugger by default. .It Va boot_multicons Enables multiple console support in the kernel early on boot. In a running system, console configuration can be manipulated by the .Xr conscontrol 8 utility. .It Va boot_mute All kernel console output is suppressed when console is muted. In a running system, the state of console muting can be manipulated by the .Xr conscontrol 8 utility. .It Va boot_pause During the device probe, pause after each line is printed. .It Va boot_serial Force the use of a serial console even when an internal console is present. .It Va boot_single Prevents the kernel from initiating a multi-user startup; instead, a single-user mode will be entered when the kernel has finished device probing. .It Va boot_verbose Setting this variable causes extra debugging information to be printed by the kernel during the boot phase. .It Va bootfile List of semicolon-separated search path for bootable kernels. The default is .Dq Li kernel . .It Va comconsole_speed Defines the speed of the serial console (i386 and amd64 only). If the previous boot stage indicated that a serial console is in use then this variable is initialized to the current speed of the console serial port. Otherwise it is set to 9600 unless this was overridden using the .Va BOOT_COMCONSOLE_SPEED variable when .Nm was compiled. Changes to the .Va comconsole_speed variable take effect immediately. .It Va comconsole_port Defines the base i/o port used to access console UART (i386 and amd64 only). If the variable is not set, its assumed value is 0x3F8, which corresponds to PC port COM1, unless overridden by .Va BOOT_COMCONSOLE_PORT variable during the compilation of .Nm . Setting the .Va comconsole_port variable automatically set .Va hw.uart.console environment variable to provide a hint to kernel for location of the console. Loader console is changed immediately after variable .Va comconsole_port is set. .It Va comconsole_pcidev Defines the location of a PCI device of the 'simple communication' class to be used as the serial console UART (i386 and amd64 only). The syntax of the variable is .Li 'bus:device:function[:bar]' , where all members must be numeric, with possible .Li 0x prefix to indicate a hexadecimal value. The .Va bar member is optional and assumed to be 0x10 if omitted. The bar must decode i/o space. Setting the variable .Va comconsole_pcidev automatically sets the variable .Va comconsole_port to the base of the selected bar, and hint .Va hw.uart.console . Loader console is changed immediately after variable .Va comconsole_pcidev is set. .It Va console Defines the current console or consoles. Multiple consoles may be specified. In that case, the first listed console will become the default console for userland output (e.g.\& from .Xr init 8 ) . .It Va currdev Selects the default device. Syntax for devices is odd. .It Va dumpdev Sets the device for kernel dumps. This can be used to ensure that a device is configured before the corresponding .Va dumpdev directive from .Xr rc.conf 5 has been processed, allowing kernel panics that happen during the early stages of boot to be captured. .It Va init_chroot +See +.Xr init 8 . +.It Va init_exec See .Xr init 8 . .It Va init_path Sets the list of binaries which the kernel will try to run as the initial process. The first matching binary is used. The default list is .Dq Li /sbin/init:/sbin/oinit:/sbin/init.bak:\:/rescue/init . .It Va init_script See .Xr init 8 . .It Va init_shell See .Xr init 8 . .It Va interpret Has the value .Dq Li OK if the Forth's current state is interpreting. .It Va LINES Define the number of lines on the screen, to be used by the pager. .It Va module_path Sets the list of directories which will be searched for modules named in a load command or implicitly required by a dependency. The default value for this variable is .Dq Li /boot/kernel;/boot/modules . .It Va num_ide_disks Sets the number of IDE disks as a workaround for some problems in finding the root disk at boot. This has been deprecated in favor of .Va root_disk_unit . .It Va prompt Value of .Nm Ns 's prompt. Defaults to .Dq Li "${interpret}" . If variable .Va prompt is unset, the default prompt is .Ql > . .It Va root_disk_unit If the code which detects the disk unit number for the root disk is confused, e.g.\& by a mix of SCSI and IDE disks, or IDE disks with gaps in the sequence (e.g.\& no primary slave), the unit number can be forced by setting this variable. .It Va rootdev By default the value of .Va currdev is used to set the root file system when the kernel is booted. This can be overridden by setting .Va rootdev explicitly. .El .Pp Other variables are used to override kernel tunable parameters. The following tunables are available: .Bl -tag -width Va .It Va efi.rt.disabled Disable UEFI runtime services in the kernel, if applicable. Runtime services are only available and used if the kernel is booted in a UEFI environment. .It Va hw.physmem Limit the amount of physical memory the system will use. By default the size is in bytes, but the .Cm k , K , m , M , g and .Cm G suffixes are also accepted and indicate kilobytes, megabytes and gigabytes respectively. An invalid suffix will result in the variable being ignored by the kernel. .It Va hw.pci.host_start_mem , hw.acpi.host_start_mem When not otherwise constrained, this limits the memory start address. The default is 0x80000000 and should be set to at least size of the memory and not conflict with other resources. Typically, only systems without PCI bridges need to set this variable since PCI bridges typically constrain the memory starting address (and the variable is only used when bridges do not constrain this address). .It Va hw.pci.enable_io_modes Enable PCI resources which are left off by some BIOSes or are not enabled correctly by the device driver. Tunable value set to ON (1) by default, but this may cause problems with some peripherals. .It Va kern.maxusers Set the size of a number of statically allocated system tables; see .Xr tuning 7 for a description of how to select an appropriate value for this tunable. When set, this tunable replaces the value declared in the kernel compile-time configuration file. .It Va kern.ipc.nmbclusters Set the number of mbuf clusters to be allocated. The value cannot be set below the default determined when the kernel was compiled. .It Va kern.ipc.nsfbufs Set the number of .Xr sendfile 2 buffers to be allocated. Overrides .Dv NSFBUFS . Not all architectures use such buffers; see .Xr sendfile 2 for details. .It Va kern.maxswzone Limits the amount of KVM to be used to hold swap metadata, which directly governs the maximum amount of swap the system can support, at the rate of approximately 200 MB of swap space per 1 MB of metadata. This value is specified in bytes of KVA space. If no value is provided, the system allocates enough memory to handle an amount of swap that corresponds to eight times the amount of physical memory present in the system. .Pp Note that swap metadata can be fragmented, which means that the system can run out of space before it reaches the theoretical limit. Therefore, care should be taken to not configure more swap than approximately half of the theoretical maximum. .Pp Running out of space for swap metadata can leave the system in an unrecoverable state. Therefore, you should only change this parameter if you need to greatly extend the KVM reservation for other resources such as the buffer cache or .Va kern.ipc.nmbclusters . Modifies kernel option .Dv VM_SWZONE_SIZE_MAX . .It Va kern.maxbcache Limits the amount of KVM reserved for use by the buffer cache, specified in bytes. The default maximum is 200MB on i386, and 400MB on amd64 and sparc64. This parameter is used to prevent the buffer cache from eating too much KVM in large-memory machine configurations. Only mess around with this parameter if you need to greatly extend the KVM reservation for other resources such as the swap zone or .Va kern.ipc.nmbclusters . Note that the NBUF parameter will override this limit. Modifies .Dv VM_BCACHE_SIZE_MAX . .It Va kern.msgbufsize Sets the size of the kernel message buffer. The default limit of 64KB is usually sufficient unless large amounts of trace data need to be collected between opportunities to examine the buffer or dump it to a file. Overrides kernel option .Dv MSGBUF_SIZE . .It Va machdep.disable_mtrrs Disable the use of i686 MTRRs (x86 only). .It Va net.inet.tcp.tcbhashsize Overrides the compile-time set value of .Dv TCBHASHSIZE or the preset default of 512. Must be a power of 2. .It Va twiddle_divisor Throttles the output of the .Sq twiddle I/O progress indicator displayed while loading the kernel and modules. This is useful on slow serial consoles where the time spent waiting for these characters to be written can add up to many seconds. The default is 1 (full speed); a value of 2 spins half as fast, and so on. .It Va vm.kmem_size Sets the size of kernel memory (bytes). This overrides the value determined when the kernel was compiled. Modifies .Dv VM_KMEM_SIZE . .It Va vm.kmem_size_min .It Va vm.kmem_size_max Sets the minimum and maximum (respectively) amount of kernel memory that will be automatically allocated by the kernel. These override the values determined when the kernel was compiled. Modifies .Dv VM_KMEM_SIZE_MIN and .Dv VM_KMEM_SIZE_MAX . .El .Ss ZFS FEATURES .Nm supports the following format for specifying ZFS filesystems which can be used wherever .Xr loader 8 refers to a device specification: .Pp .Ar zfs:pool/filesystem: .Pp where .Pa pool/filesystem is a ZFS filesystem name as described in .Xr zfs 8 . .Pp If .Pa /etc/fstab does not have an entry for the root filesystem and .Va vfs.root.mountfrom is not set, but .Va currdev refers to a ZFS filesystem, then .Nm will instruct kernel to use that filesystem as the root filesystem. .Ss BUILTIN PARSER When a builtin command is executed, the rest of the line is taken by it as arguments, and it is processed by a special parser which is not used for regular Forth commands. .Pp This special parser applies the following rules to the parsed text: .Bl -enum .It All backslash characters are preprocessed. .Bl -bullet .It \eb , \ef , \er , \en and \et are processed as in C. .It \es is converted to a space. .It \ev is converted to .Tn ASCII 11. .It \ez is just skipped. Useful for things like .Dq \e0xf\ez\e0xf . .It \e0xN and \e0xNN are replaced by the hex N or NN. .It \eNNN is replaced by the octal NNN .Tn ASCII character. .It \e" , \e' and \e$ will escape these characters, preventing them from receiving special treatment in Step 2, described below. .It \e\e will be replaced with a single \e . .It In any other occurrence, backslash will just be removed. .El .It Every string between non-escaped quotes or double-quotes will be treated as a single word for the purposes of the remaining steps. .It Replace any .Li $VARIABLE or .Li ${VARIABLE} with the value of the environment variable .Va VARIABLE . .It Space-delimited arguments are passed to the called builtin command. Spaces can also be escaped through the use of \e\e . .El .Pp An exception to this parsing rule exists, and is described in .Sx BUILTINS AND FORTH . .Ss BUILTINS AND FORTH All builtin words are state-smart, immediate words. If interpreted, they behave exactly as described previously. If they are compiled, though, they extract their arguments from the stack instead of the command line. .Pp If compiled, the builtin words expect to find, at execution time, the following parameters on the stack: .D1 Ar addrN lenN ... addr2 len2 addr1 len1 N where .Ar addrX lenX are strings which will compose the command line that will be parsed into the builtin's arguments. Internally, these strings are concatenated in from 1 to N, with a space put between each one. .Pp If no arguments are passed, a 0 .Em must be passed, even if the builtin accepts no arguments. .Pp While this behavior has benefits, it has its trade-offs. If the execution token of a builtin is acquired (through .Ic ' or .Ic ['] ) , and then passed to .Ic catch or .Ic execute , the builtin behavior will depend on the system state .Bf Em at the time .Ic catch or .Ic execute is processed! .Ef This is particularly annoying for programs that want or need to handle exceptions. In this case, the use of a proxy is recommended. For example: .Dl : (boot) boot ; .Sh FICL .Tn FICL is a Forth interpreter written in C, in the form of a forth virtual machine library that can be called by C functions and vice versa. .Pp In .Nm , each line read interactively is then fed to .Tn FICL , which may call .Nm back to execute the builtin words. The builtin .Ic include will also feed .Tn FICL , one line at a time. .Pp The words available to .Tn FICL can be classified into four groups. The .Tn ANS Forth standard words, extra .Tn FICL words, extra .Fx words, and the builtin commands; the latter were already described. The .Tn ANS Forth standard words are listed in the .Sx STANDARDS section. The words falling in the two other groups are described in the following subsections. .Ss FICL EXTRA WORDS .Bl -tag -width wid-set-super .It Ic .env .It Ic .ver .It Ic -roll .It Ic 2constant .It Ic >name .It Ic body> .It Ic compare This is the STRING word set's .Ic compare . .It Ic compile-only .It Ic endif .It Ic forget-wid .It Ic parse-word .It Ic sliteral This is the STRING word set's .Ic sliteral . .It Ic wid-set-super .It Ic w@ .It Ic w! .It Ic x. .It Ic empty .It Ic cell- .It Ic -rot .El .Ss FREEBSD EXTRA WORDS .Bl -tag -width XXXXXXXX .It Ic \&$ Pq -- Evaluates the remainder of the input buffer, after having printed it first. .It Ic \&% Pq -- Evaluates the remainder of the input buffer under a .Ic catch exception guard. .It Ic .# Works like .Ic "." but without outputting a trailing space. .It Ic fclose Pq Ar fd -- Closes a file. .It Ic fkey Pq Ar fd -- char Reads a single character from a file. .It Ic fload Pq Ar fd -- Processes a file .Em fd . .It Ic fopen Pq Ar addr len mode Li -- Ar fd Opens a file. Returns a file descriptor, or \-1 in case of failure. The .Ar mode parameter selects whether the file is to be opened for read access, write access, or both. The constants .Dv O_RDONLY , O_WRONLY , and .Dv O_RDWR are defined in .Pa /boot/support.4th , indicating read only, write only, and read-write access, respectively. .It Xo .Ic fread .Pq Ar fd addr len -- len' .Xc Tries to read .Em len bytes from file .Em fd into buffer .Em addr . Returns the actual number of bytes read, or -1 in case of error or end of file. .It Ic heap? Pq -- Ar cells Return the space remaining in the dictionary heap, in cells. This is not related to the heap used by dynamic memory allocation words. .It Ic inb Pq Ar port -- char Reads a byte from a port. .It Ic key Pq -- Ar char Reads a single character from the console. .It Ic key? Pq -- Ar flag Returns .Ic true if there is a character available to be read from the console. .It Ic ms Pq Ar u -- Waits .Em u microseconds. .It Ic outb Pq Ar port char -- Writes a byte to a port. .It Ic seconds Pq -- Ar u Returns the number of seconds since midnight. .It Ic tib> Pq -- Ar addr len Returns the remainder of the input buffer as a string on the stack. .It Ic trace! Pq Ar flag -- Activates or deactivates tracing. Does not work with .Ic catch . .El .Ss FREEBSD DEFINED ENVIRONMENTAL QUERIES .Bl -tag -width Ds .It arch-i386 .Ic TRUE if the architecture is IA32. .It FreeBSD_version .Fx version at compile time. .It loader_version .Nm version. .El .Sh FILES .Bl -tag -width /usr/share/examples/bootforth/ -compact .It Pa /boot/loader .Nm itself. .It Pa /boot/boot.4th Additional .Tn FICL initialization. .It Pa /boot/defaults/loader.conf .It Pa /boot/loader.4th Extra builtin-like words. .It Pa /boot/loader.conf .It Pa /boot/loader.conf.local .Nm configuration files, as described in .Xr loader.conf 5 . .It Pa /boot/loader.rc .Nm bootstrapping script. .It Pa /boot/loader.help Loaded by .Ic help . Contains the help messages. .It Pa /boot/support.4th .Pa loader.conf processing words. .It Pa /usr/share/examples/bootforth/ Assorted examples. .El .Sh EXAMPLES Boot in single user mode: .Pp .Dl boot -s .Pp Load the kernel, a splash screen, and then autoboot in five seconds. Notice that a kernel must be loaded before any other .Ic load command is attempted. .Bd -literal -offset indent load kernel load splash_bmp load -t splash_image_data /boot/chuckrulez.bmp autoboot 5 .Ed .Pp Set the disk unit of the root device to 2, and then boot. This would be needed in a system with two IDE disks, with the second IDE disk hardwired to ada2 instead of ada1. .Bd -literal -offset indent set root_disk_unit=2 boot /boot/kernel/kernel .Ed .Pp Set the default device used for loading a kernel from a ZFS filesystem: .Bd -literal -offset indent set currdev=zfs:tank/ROOT/knowngood: .Ed .Pp .Sh ERRORS The following values are thrown by .Nm : .Bl -tag -width XXXXX -offset indent .It 100 Any type of error in the processing of a builtin. .It -1 .Ic Abort executed. .It -2 .Ic Abort" executed. .It -56 .Ic Quit executed. .It -256 Out of interpreting text. .It -257 Need more text to succeed -- will finish on next run. .It -258 .Ic Bye executed. .It -259 Unspecified error. .El .Sh SEE ALSO .Xr libstand 3 , .Xr loader.conf 5 , .Xr tuning 7 , .Xr boot 8 , .Xr btxld 8 .Sh STANDARDS For the purposes of ANS Forth compliance, loader is an .Bf Em ANS Forth System with Environmental Restrictions, Providing .Ef .Bf Li .No .( , .No :noname , .No ?do , parse, pick, roll, refill, to, value, \e, false, true, .No <> , .No 0<> , compile\&, , erase, nip, tuck .Ef .Em and .Li marker .Bf Em from the Core Extensions word set, Providing the Exception Extensions word set, Providing the Locals Extensions word set, Providing the Memory-Allocation Extensions word set, Providing .Ef .Bf Li \&.s, bye, forget, see, words, \&[if], \&[else] .Ef .Em and .Li [then] .Bf Em from the Programming-Tools extension word set, Providing the Search-Order extensions word set. .Ef .Sh HISTORY The .Nm first appeared in .Fx 3.1 . .Sh AUTHORS .An -nosplit The .Nm was written by .An Michael Smith Aq msmith@FreeBSD.org . .Pp .Tn FICL was written by .An John Sadler Aq john_sadler@alum.mit.edu . .Sh BUGS The .Ic expect and .Ic accept words will read from the input buffer instead of the console. The latter will be fixed, but the former will not.