diff --git a/en/handbook/kerneldebug/chapter.sgml b/en/handbook/kerneldebug/chapter.sgml
index 1ac1153a73..8233ecf626 100644
--- a/en/handbook/kerneldebug/chapter.sgml
+++ b/en/handbook/kerneldebug/chapter.sgml
@@ -1,597 +1,596 @@
Kernel Debugging

Contributed by &a.paul; and &a.joerg;

Debugging a Kernel Crash Dump with kgdb

Here are some instructions for getting kernel debugging working on a
crash dump. They assume that you have enough swap space for a crash
dump. If you have multiple swap partitions and the first one is too
small to hold the dump, you can configure your kernel to use an
alternate dump device (in the config kernel line), or
you can specify an alternate using the
&man.dumpon.8; command. The best way to use &man.dumpon.8; is to set
the dumpdev variable in
/etc/rc.conf. Typically you want to specify one of
the swap devices specified in /etc/fstab. Dumps to
non-swap devices, tapes for example, are currently not supported. Configure
your kernel using config -g. See Kernel Configuration for details on
configuring the FreeBSD kernel.

Use the &man.dumpon.8; command to tell the kernel where to dump to
(note that this will have to be done after configuring the partition in
question as swap space via &man.swapon.8;). This is normally arranged
via /etc/rc.conf and /etc/rc.
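As a minimal sketch of that arrangement, assuming a swap partition named /dev/sd0b (a hypothetical device name; substitute one of the swap devices listed in your /etc/fstab), the /etc/rc.conf entry and the equivalent manual command might look like this:

```shell
# /etc/rc.conf fragment (sh syntax).
# /dev/sd0b is a placeholder, not a device name taken from this chapter.
dumpdev="/dev/sd0b"

# Equivalent manual step, run as root once the partition is active as swap:
#   dumpon /dev/sd0b
```

Setting the variable in /etc/rc.conf keeps the dump device configured across reboots, whereas the manual command lasts only until the next boot.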
Alternatively, you can hard-code the dump device via the
dump clause in the config line of
your kernel config file. This is deprecated and should be used only if
you want a crash dump from a kernel that crashes during booting.

In the following, the term kgdb refers to
gdb run in “kernel debug mode”. This
can be accomplished either by starting gdb with
the option -k, or by linking and starting it under
the name kgdb. This is not being done by default,
however, and the idea is basically deprecated since the GNU folks do
not like their tools to behave differently when called by another
name. This feature may well be discontinued in future
releases.

When the kernel has been built, make a copy of it, say
kernel.debug, and then run strip
-d on the original. Install the original as normal. You
may also install the unstripped kernel, but symbol table lookup time for
some programs will drastically increase, and since the whole kernel is
loaded entirely at boot time and cannot be swapped out later, several
megabytes of physical memory will be wasted.

If you are testing a new kernel, for example by typing the new
kernel's name at the boot prompt, but need to boot a different one in
order to get your system up and running again, boot it only into single
user state using the flag at the boot prompt, and
then perform the following steps:

&prompt.root; fsck -p
&prompt.root; mount -a -t ufs # so your file system for /var/crash is writable
&prompt.root; savecore -N /kernel.panicked /var/crash
&prompt.root; exit # ...to multi-user

This instructs &man.savecore.8; to use another kernel for symbol
name extraction. It would otherwise default to the currently running
kernel and most likely not do anything at all since the crash dump and
the kernel symbols differ.

Now, after a crash dump, go to
/sys/compile/WHATEVER and run
kgdb. From kgdb do:

symbol-file kernel.debug
exec-file /var/crash/kernel.0
core-file /var/crash/vmcore.0

and voila, you can debug the crash dump using the kernel sources just
like you can for any other program.

Here is a script log of a kgdb session
illustrating the procedure. Long lines have been folded to improve
readability, and the lines are numbered for reference. Despite this, it
is a real-world error trace taken during the development of the pcvt
console driver.
-
1:Script started on Fri Dec 30 23:15:22 1994
+ 1:Script started on Fri Dec 30 23:15:22 1994
2:&prompt.root; cd /sys/compile/URIAH
3:&prompt.root; kgdb kernel /var/crash/vmcore.1
4:Reading symbol data from /usr/src/sys/compile/URIAH/kernel
...done.
5:IdlePTD 1f3000
6:panic: because you said to!
7:current pcb at 1e3f70
8:Reading in symbols for ../../i386/i386/machdep.c...done.
9:(kgdb)where
10:#0 boot (arghowto=256) (../../i386/i386/machdep.c line 767)
11:#1 0xf0115159 in panic ()
12:#2 0xf01955bd in diediedie () (../../i386/i386/machdep.c line 698)
13:#3 0xf010185e in db_fncall ()
14:#4 0xf0101586 in db_command (-266509132, -266509516, -267381073)
15:#5 0xf0101711 in db_command_loop ()
16:#6 0xf01040a0 in db_trap ()
17:#7 0xf0192976 in kdb_trap (12, 0, -272630436, -266743723)
18:#8 0xf019d2eb in trap_fatal (...)
19:#9 0xf019ce60 in trap_pfault (...)
20:#10 0xf019cb2f in trap (...)
21:#11 0xf01932a1 in exception:calltrap ()
22:#12 0xf0191503 in cnopen (...)
23:#13 0xf0132c34 in spec_open ()
24:#14 0xf012d014 in vn_open ()
25:#15 0xf012a183 in open ()
26:#16 0xf019d4eb in syscall (...)
27:(kgdb)up 10
28:Reading in symbols for ../../i386/i386/trap.c...done.
29:#10 0xf019cb2f in trap (frame={tf_es = -260440048, tf_ds = 16, tf_\
30:edi = 3072, tf_esi = -266445372, tf_ebp = -272630356, tf_isp = -27\
31:2630396, tf_ebx = -266427884, tf_edx = 12, tf_ecx = -266427884, tf\
32:_eax = 64772224, tf_trapno = 12, tf_err = -272695296, tf_eip = -26\
33:6672343, tf_cs = -266469368, tf_eflags = 66066, tf_esp = 3072, tf_\
34:ss = -266427884}) (../../i386/i386/trap.c line 283)
35:283 (void) trap_pfault(&frame, FALSE);
36:(kgdb)frame frame->tf_ebp frame->tf_eip
37:Reading in symbols for ../../i386/isa/pcvt/pcvt_drv.c...done.
38:#0 0xf01ae729 in pcopen (dev=3072, flag=3, mode=8192, p=(struct p\
39:roc *) 0xf07c0c00) (../../i386/isa/pcvt/pcvt_drv.c line 403)
40:403 return ((*linesw[tp->t_line].l_open)(dev, tp));
41:(kgdb)list
42:398
43:399 tp->t_state |= TS_CARR_ON;
44:400 tp->t_cflag |= CLOCAL; /* cannot be a modem (:-) */
45:401
46:402 #if PCVT_NETBSD || (PCVT_FREEBSD >= 200)
47:403 return ((*linesw[tp->t_line].l_open)(dev, tp));
48:404 #else
49:405 return ((*linesw[tp->t_line].l_open)(dev, tp, flag));
50:406 #endif /* PCVT_NETBSD || (PCVT_FREEBSD >= 200) */
51:407 }
52:(kgdb)print tp
53:Reading in symbols for ../../i386/i386/cons.c...done.
54:$1 = (struct tty *) 0x1bae
55:(kgdb)print tp->t_line
56:$2 = 1767990816
57:(kgdb)up
58:#1 0xf0191503 in cnopen (dev=0x00000000, flag=3, mode=8192, p=(st\
59:ruct proc *) 0xf07c0c00) (../../i386/i386/cons.c line 126)
60: return ((*cdevsw[major(dev)].d_open)(dev, flag, mode, p));
61:(kgdb)up
62:#2 0xf0132c34 in spec_open ()
63:(kgdb)up
64:#3 0xf012d014 in vn_open ()
65:(kgdb)up
66:#4 0xf012a183 in open ()
67:(kgdb)up
68:#5 0xf019d4eb in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi =\
69: 2158592, tf_esi = 0, tf_ebp = -272638436, tf_isp = -272629788, tf\
70:_ebx = 7086, tf_edx = 1, tf_ecx = 0, tf_eax = 5, tf_trapno = 582, \
71:tf_err = 582, tf_eip = 75749, tf_cs = 31, tf_eflags = 582, tf_esp \
72:= -272638456, tf_ss = 39}) (../../i386/i386/trap.c line 673)
73:673 error = (*callp->sy_call)(p, args, rval);
74:(kgdb)up
75:Initial frame selected; you cannot go up.
76:(kgdb)quit
77:&prompt.root; exit
78:exit
79:
80:Script done on Fri Dec 30 23:18:04 1994
-
Comments to the above script:

line 6: This is a dump taken from within DDB (see below), hence the
panic comment “because you said to!”, and a rather
long stack trace; the initial reason for going into DDB was a
page fault trap, though.

line 20: This is the location of function trap()
in the stack trace.

line 36: Force usage of a new stack frame; this is no longer necessary
now. The stack frames are supposed to point to the right
locations now, even in case of a trap. (I do not have a new core
dump handy <g>, my kernel has not panicked for a rather long
time.) From looking at the code in source line 403, there is a
high probability that either the pointer access for
“tp” was messed up, or the array access was out of
bounds.

line 52: The pointer looks suspicious, but happens to be a valid
address.

line 56: However, it obviously points to garbage, so we have found our
error! (For those unfamiliar with that particular piece of code:
tp->t_line refers to the line discipline of
the console device here, which must be a rather small integer
number.)

Debugging a crash dump with DDD

Examining a kernel crash dump with a graphical debugger like
ddd is also possible. Add the -k
option to the ddd command line you would use
normally. For example:

&prompt.root; ddd -k /var/crash/kernel.0 /var/crash/vmcore.0

You should then be able to go about looking at the crash dump using
ddd's graphical interface.

Post-mortem Analysis of a Dump

What do you do if a kernel dumped core but you did not expect it,
and it is therefore not compiled using config -g? Not
everything is lost here. Do not panic!

Of course, you still need to enable crash dumps. See above on the
options you have to specify in order to do this.

Go to your kernel compile directory, and edit the line containing
COPTFLAGS?=-O. Add the -g option
there (but do not change anything on the level of
optimization). If you already know roughly the probable location of
the failing piece of code (e.g., the pcvt
driver in the example above), remove all the object files for this code.
Rebuild the kernel. Due to the time stamp change on the Makefile,
some other object files will be rebuilt as well, for example
trap.o. With a bit of luck, the added
-g option will not change anything for the generated
code, so you will finally get a new kernel with similar code to the
faulting one but some debugging symbols. You should at least verify the
old and new sizes with the
&man.size.1; command. If there is a mismatch, you probably need to
give up here.

Go and examine the dump as described above. The debugging symbols
might be incomplete for some places, as can be seen in the stack trace
in the example above where some functions are displayed without line
numbers and argument lists. If you need more debugging symbols, remove
the appropriate object files and repeat the kgdb
session until you know enough.

All this is not guaranteed to work, but it works fine in most
cases.

On-line Kernel Debugging Using DDB

While kgdb as an offline debugger provides a very
high level of user interface, there are some things it cannot do. The
most important ones are breakpointing and single-stepping kernel
code.

If you need to do low-level debugging on your kernel, there is an
on-line debugger available called DDB. It allows setting
breakpoints, single-stepping kernel functions, examining and changing
kernel variables, etc. However, it cannot access kernel source files,
and only has access to the global and static symbols, not to the full
debug information like kgdb.

To configure your kernel to include DDB, add the option line
options DDB
to your config file, and rebuild. (See Kernel Configuration for details on
configuring the FreeBSD kernel.)

Note that if you have an older version of the boot blocks, your
debugger symbols might not be loaded at all. Update the boot blocks;
the recent ones load the DDB symbols automagically.

Once your DDB kernel is running, there are several ways to enter
DDB. The first, and earliest way is to type the boot flag
right at the boot prompt. The kernel will start up
in debug mode and enter DDB prior to any device probing. Hence you can
even debug the device probe/attach functions.

The second scenario is a hot-key on the keyboard, usually
Ctrl-Alt-ESC. For syscons, this can be remapped; some of the
distributed maps do this, so watch out. There is an option available
for serial consoles that allows the use of a serial line BREAK on the
console line to enter DDB (options BREAK_TO_DEBUGGER
in the kernel config file). It is not the default since there are a lot
of crappy serial adapters around that gratuitously generate a BREAK
condition, for example when pulling the cable.

The third way is that any panic condition will branch to DDB if the
kernel is configured to use it. For this reason, it is not wise to
configure a kernel with DDB for a machine running unattended.

The DDB commands roughly resemble some gdb
commands. The first thing you probably need to do is to set a
breakpoint:

b function-name
b address

Numbers are taken hexadecimal by default, but to make them distinct
from symbol names, hexadecimal numbers starting with the letters
a-f need to be preceded with 0x
(this is optional for other numbers). Simple expressions are allowed,
for example: function-name + 0x103.

To continue the operation of an interrupted kernel, simply
type:

c

To get a stack trace, use:

trace

Note that when entering DDB via a hot-key, the kernel is currently
servicing an interrupt, so the stack trace might not be of much use
to you.

If you want to remove a breakpoint, use

del
del address-expression

The first form will be accepted immediately after a breakpoint hit,
and deletes the current breakpoint. The second form can remove any
breakpoint, but you need to specify the exact address; this can be
obtained from:

show b

To single-step the kernel, try:

s

This will step into functions, but you can make DDB trace them until
the matching return statement is reached by:

n

This is different from gdb's
next statement; it is like gdb's
finish.

To examine data from memory, use (for example):

x/wx 0xf0133fe0,40
x/hd db_symtab_space
x/bc termbuf,10
x/s stringbuf

for word/halfword/byte access, and hexadecimal/decimal/character/string
display. The number after the comma is the object count. To display
the next 0x10 items, simply use:

x ,10

Similarly, use

x/ia foofunc,10

to disassemble the first 0x10 instructions of
foofunc, and display them along with their offset
from the beginning of foofunc.

To modify memory, use the write command:

w/b termbuf 0xa 0xb 0
w/w 0xf0010030 0 0

The command modifier
(b/h/w)
specifies the size of the data to be written, the first following
expression is the address to write to and the remainder is interpreted
as data to write to successive memory locations.

If you need to know the current registers, use:

show reg

Alternatively, you can display a single register value by e.g.

p $eax

and modify it by:

set $eax new-value

Should you need to call some kernel functions from DDB, simply
say:

call func(arg1, arg2, ...)

The return value will be printed.

For a &man.ps.1; style summary of all running processes, use:

ps

Now you have examined why your kernel failed, and you wish to
reboot. Remember that, depending on the severity of previous
malfunctioning, not all parts of the kernel might still be working as
expected. Perform one of the following actions to shut down and reboot
your system:

call diediedie()

This will cause your kernel to dump core and reboot, so you can
later analyze the core on a higher level with kgdb. This command
usually must be followed by another continue
statement. There is now an alias for this:
panic.

call boot(0)

Which might be a good way to cleanly shut down the running system,
sync() all disks, and finally reboot. As long as
the disk and file system interfaces of the kernel are not damaged, this
might be a good way for an almost clean shutdown.

call cpu_reset()

is the final way out of disaster and almost the same as hitting the
Big Red Button.

If you need a short command summary, simply type:

help

However, it is highly recommended to have a printed copy of the
&man.ddb.4; manual page ready for a debugging
session. Remember that it is hard to read the on-line manual while
single-stepping the kernel.

On-line Kernel Debugging Using Remote GDB

This feature has been supported since FreeBSD 2.2, and it's actually
a very neat one.

GDB has already supported remote debugging for
a long time. This is done using a very simple protocol along a serial
line. Unlike the other methods described above, you will need two
machines for doing this. One is the host providing the debugging
environment, including all the sources, and a copy of the kernel binary
with all the symbols in it, and the other one is the target machine that
simply runs a similar copy of the very same kernel (but stripped of the
debugging information).

You should configure the kernel in question with config
-g, include DDB into the configuration, and
compile it as usual. This gives a large blurb of a binary, due to the
debugging information. Copy this kernel to the target machine, strip
the debugging symbols off with strip -x, and boot it
using the boot option. Connect the first serial
line of the target machine to any serial line of the debugging host.
Now, on the debugging machine, go to the compile directory of the target
kernel, and start gdb:

&prompt.user; gdb -k kernel
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i386-unknown-freebsd),
Copyright 1996 Free Software Foundation, Inc...
(kgdb)

Initialize the remote debugging session (assuming the first serial
port is being used) by:

(kgdb)target remote /dev/cuaa0

Now, on the target host (the one that entered DDB right before even
starting the device probe), type:

Debugger("Boot flags requested debugger")
Stopped at Debugger+0x35: movb $0, edata+0x51bc
db>gdb

DDB will respond with:

Next trap will enter GDB remote protocol mode

Every time you type gdb, the mode will be toggled
between remote GDB and local DDB. In order to force a next trap
immediately, simply type s (step). Your hosting GDB
will now gain control over the target kernel:

Remote debugging using /dev/cuaa0
Debugger (msg=0xf01b0383 "Boot flags requested debugger")
at ../../i386/i386/db_interface.c:257
(kgdb)

You can use this session almost as any other GDB session, including
full access to the source, running it in gud-mode inside an Emacs window
(which gives you an automatic source code display in another Emacs
window) etc.

Remote GDB can also be used to debug LKMs. First build the LKM with
debugging symbols:

&prompt.root; cd /usr/src/lkm/linux
&prompt.root; make clean; make COPTS=-g

Then install this version of the module on the target machine, load
it and use modstat to find out where it was
loaded:

&prompt.root; linux
&prompt.root; modstat
Type Id Off Loadaddr Size Info Rev Module Name
EXEC 0 4 f5109000 001c f510f010 1 linux_mod

Take the load address of the module and add 0x20 (probably to
account for the a.out header). This is the address that the module code
was relocated to. Use the add-symbol-file command in
GDB to tell the debugger about the module:

(kgdb)add-symbol-file /usr/src/lkm/linux/linux_mod.o 0xf5109020
add symbol table from file "/usr/src/lkm/linux/linux_mod.o" at
text_addr = 0xf5109020? (y or n) y
(kgdb)

You now have access to all the symbols in the LKM.

Debugging a Console Driver

Since you need a console driver to run DDB on, things are more
complicated if the console driver itself is failing. You might remember
the use of a serial console (either with modified boot blocks, or by
specifying at the Boot: prompt),
and hook up a standard terminal onto your first serial port. DDB works
on any configured console driver, of course also on a serial
console.
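
For the LKM procedure described earlier in this chapter, the address handed to add-symbol-file is the modstat load address plus 0x20. The arithmetic can be sketched in shell as follows; lkm_symbol_cmd is a hypothetical helper name, and the 0x20 offset is simply the value quoted in the text, not a verified constant.

```shell
# Hypothetical helper: print the GDB command for a loaded module.
#   $1 = path to the module object file built with debugging symbols
#   $2 = "Loadaddr" column from modstat (hexadecimal, without 0x prefix)
# The 0x20 offset is the one given in the text (probably the a.out header).
lkm_symbol_cmd() {
    obj=$1
    loadaddr=$2
    printf 'add-symbol-file %s 0x%x\n' "$obj" $(( 0x$loadaddr + 0x20 ))
}

# Values taken from the modstat example in the text.
lkm_symbol_cmd /usr/src/lkm/linux/linux_mod.o f5109000
```

The printed line can be pasted at the (kgdb) prompt on the debugging host.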
diff --git a/en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml b/en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml
index 1ac1153a73..8233ecf626 100644
--- a/en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml
+++ b/en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml
@@ -1,597 +1,596 @@
Kernel Debugging

Contributed by &a.paul; and &a.joerg;

Debugging a Kernel Crash Dump with kgdb

Here are some instructions for getting kernel debugging working on a
crash dump. They assume that you have enough swap space for a crash
dump. If you have multiple swap partitions and the first one is too
small to hold the dump, you can configure your kernel to use an
alternate dump device (in the config kernel line), or
you can specify an alternate using the
&man.dumpon.8; command. The best way to use &man.dumpon.8; is to set
the dumpdev variable in
/etc/rc.conf. Typically you want to specify one of
the swap devices specified in /etc/fstab. Dumps to
non-swap devices, tapes for example, are currently not supported. Configure
your kernel using config -g. See Kernel Configuration for details on
configuring the FreeBSD kernel.

Use the &man.dumpon.8; command to tell the kernel where to dump to
(note that this will have to be done after configuring the partition in
question as swap space via &man.swapon.8;). This is normally arranged
via /etc/rc.conf and /etc/rc.
Alternatively, you can hard-code the dump device via the
dump clause in the config line of
your kernel config file. This is deprecated and should be used only if
you want a crash dump from a kernel that crashes during booting.

In the following, the term kgdb refers to
gdb run in “kernel debug mode”. This
can be accomplished either by starting gdb with
the option -k, or by linking and starting it under
the name kgdb. This is not being done by default,
however, and the idea is basically deprecated since the GNU folks do
not like their tools to behave differently when called by another
name. This feature may well be discontinued in future
releases.

When the kernel has been built, make a copy of it, say
kernel.debug, and then run strip
-d on the original. Install the original as normal. You
may also install the unstripped kernel, but symbol table lookup time for
some programs will drastically increase, and since the whole kernel is
loaded entirely at boot time and cannot be swapped out later, several
megabytes of physical memory will be wasted.

If you are testing a new kernel, for example by typing the new
kernel's name at the boot prompt, but need to boot a different one in
order to get your system up and running again, boot it only into single
user state using the flag at the boot prompt, and
then perform the following steps:

&prompt.root; fsck -p
&prompt.root; mount -a -t ufs # so your file system for /var/crash is writable
&prompt.root; savecore -N /kernel.panicked /var/crash
&prompt.root; exit # ...to multi-user

This instructs &man.savecore.8; to use another kernel for symbol
name extraction. It would otherwise default to the currently running
kernel and most likely not do anything at all since the crash dump and
the kernel symbols differ.

Now, after a crash dump, go to
/sys/compile/WHATEVER and run
kgdb. From kgdb do:

symbol-file kernel.debug
exec-file /var/crash/kernel.0
core-file /var/crash/vmcore.0

and voila, you can debug the crash dump using the kernel sources just
like you can for any other program.

Here is a script log of a kgdb session
illustrating the procedure. Long lines have been folded to improve
readability, and the lines are numbered for reference. Despite this, it
is a real-world error trace taken during the development of the pcvt
console driver.
-
1:Script started on Fri Dec 30 23:15:22 1994
+ 1:Script started on Fri Dec 30 23:15:22 1994
2:&prompt.root; cd /sys/compile/URIAH
3:&prompt.root; kgdb kernel /var/crash/vmcore.1
4:Reading symbol data from /usr/src/sys/compile/URIAH/kernel
...done.
5:IdlePTD 1f3000
6:panic: because you said to!
7:current pcb at 1e3f70
8:Reading in symbols for ../../i386/i386/machdep.c...done.
9:(kgdb)where
10:#0 boot (arghowto=256) (../../i386/i386/machdep.c line 767)
11:#1 0xf0115159 in panic ()
12:#2 0xf01955bd in diediedie () (../../i386/i386/machdep.c line 698)
13:#3 0xf010185e in db_fncall ()
14:#4 0xf0101586 in db_command (-266509132, -266509516, -267381073)
15:#5 0xf0101711 in db_command_loop ()
16:#6 0xf01040a0 in db_trap ()
17:#7 0xf0192976 in kdb_trap (12, 0, -272630436, -266743723)
18:#8 0xf019d2eb in trap_fatal (...)
19:#9 0xf019ce60 in trap_pfault (...)
20:#10 0xf019cb2f in trap (...)
21:#11 0xf01932a1 in exception:calltrap ()
22:#12 0xf0191503 in cnopen (...)
23:#13 0xf0132c34 in spec_open ()
24:#14 0xf012d014 in vn_open ()
25:#15 0xf012a183 in open ()
26:#16 0xf019d4eb in syscall (...)
27:(kgdb)up 10
28:Reading in symbols for ../../i386/i386/trap.c...done.
29:#10 0xf019cb2f in trap (frame={tf_es = -260440048, tf_ds = 16, tf_\
30:edi = 3072, tf_esi = -266445372, tf_ebp = -272630356, tf_isp = -27\
31:2630396, tf_ebx = -266427884, tf_edx = 12, tf_ecx = -266427884, tf\
32:_eax = 64772224, tf_trapno = 12, tf_err = -272695296, tf_eip = -26\
33:6672343, tf_cs = -266469368, tf_eflags = 66066, tf_esp = 3072, tf_\
34:ss = -266427884}) (../../i386/i386/trap.c line 283)
35:283 (void) trap_pfault(&frame, FALSE);
36:(kgdb)frame frame->tf_ebp frame->tf_eip
37:Reading in symbols for ../../i386/isa/pcvt/pcvt_drv.c...done.
38:#0 0xf01ae729 in pcopen (dev=3072, flag=3, mode=8192, p=(struct p\
39:roc *) 0xf07c0c00) (../../i386/isa/pcvt/pcvt_drv.c line 403)
40:403 return ((*linesw[tp->t_line].l_open)(dev, tp));
41:(kgdb)list
42:398
43:399 tp->t_state |= TS_CARR_ON;
44:400 tp->t_cflag |= CLOCAL; /* cannot be a modem (:-) */
45:401
46:402 #if PCVT_NETBSD || (PCVT_FREEBSD >= 200)
47:403 return ((*linesw[tp->t_line].l_open)(dev, tp));
48:404 #else
49:405 return ((*linesw[tp->t_line].l_open)(dev, tp, flag));
50:406 #endif /* PCVT_NETBSD || (PCVT_FREEBSD >= 200) */
51:407 }
52:(kgdb)print tp
53:Reading in symbols for ../../i386/i386/cons.c...done.
54:$1 = (struct tty *) 0x1bae
55:(kgdb)print tp->t_line
56:$2 = 1767990816
57:(kgdb)up
58:#1 0xf0191503 in cnopen (dev=0x00000000, flag=3, mode=8192, p=(st\
59:ruct proc *) 0xf07c0c00) (../../i386/i386/cons.c line 126)
60: return ((*cdevsw[major(dev)].d_open)(dev, flag, mode, p));
61:(kgdb)up
62:#2 0xf0132c34 in spec_open ()
63:(kgdb)up
64:#3 0xf012d014 in vn_open ()
65:(kgdb)up
66:#4 0xf012a183 in open ()
67:(kgdb)up
68:#5 0xf019d4eb in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi =\
69: 2158592, tf_esi = 0, tf_ebp = -272638436, tf_isp = -272629788, tf\
70:_ebx = 7086, tf_edx = 1, tf_ecx = 0, tf_eax = 5, tf_trapno = 582, \
71:tf_err = 582, tf_eip = 75749, tf_cs = 31, tf_eflags = 582, tf_esp \
72:= -272638456, tf_ss = 39}) (../../i386/i386/trap.c line 673)
73:673 error = (*callp->sy_call)(p, args, rval);
74:(kgdb)up
75:Initial frame selected; you cannot go up.
76:(kgdb)quit
77:&prompt.root; exit
78:exit
79:
80:Script done on Fri Dec 30 23:18:04 1994
-
Comments to the above script:

line 6: This is a dump taken from within DDB (see below), hence the
panic comment “because you said to!”, and a rather
long stack trace; the initial reason for going into DDB was a
page fault trap, though.

line 20: This is the location of function trap()
in the stack trace.

line 36: Force usage of a new stack frame; this is no longer necessary
now. The stack frames are supposed to point to the right
locations now, even in case of a trap. (I do not have a new core
dump handy <g>, my kernel has not panicked for a rather long
time.) From looking at the code in source line 403, there is a
high probability that either the pointer access for
“tp” was messed up, or the array access was out of
bounds.

line 52: The pointer looks suspicious, but happens to be a valid
address.

line 56: However, it obviously points to garbage, so we have found our
error! (For those unfamiliar with that particular piece of code:
tp->t_line refers to the line discipline of
the console device here, which must be a rather small integer
number.)

Debugging a crash dump with DDD

Examining a kernel crash dump with a graphical debugger like
ddd is also possible. Add the -k
option to the ddd command line you would use
normally. For example:

&prompt.root; ddd -k /var/crash/kernel.0 /var/crash/vmcore.0

You should then be able to go about looking at the crash dump using
ddd's graphical interface.

Post-mortem Analysis of a Dump

What do you do if a kernel dumped core but you did not expect it,
and it is therefore not compiled using config -g? Not
everything is lost here. Do not panic!

Of course, you still need to enable crash dumps. See above on the
options you have to specify in order to do this.

Go to your kernel compile directory, and edit the line containing
COPTFLAGS?=-O. Add the -g option
there (but do not change anything on the level of
optimization). If you already know roughly the probable location of
the failing piece of code (e.g., the pcvt
driver in the example above), remove all the object files for this code.
Rebuild the kernel. Due to the time stamp change on the Makefile,
some other object files will be rebuilt as well, for example
trap.o. With a bit of luck, the added
-g option will not change anything for the generated
code, so you will finally get a new kernel with similar code to the
faulting one but some debugging symbols. You should at least verify the
old and new sizes with the
&man.size.1; command. If there is a mismatch, you probably need to
give up here.

Go and examine the dump as described above. The debugging symbols
might be incomplete for some places, as can be seen in the stack trace
in the example above where some functions are displayed without line
numbers and argument lists. If you need more debugging symbols, remove
the appropriate object files and repeat the kgdb
session until you know enough.

All this is not guaranteed to work, but it works fine in most
cases.

On-line Kernel Debugging Using DDB

While kgdb as an offline debugger provides a very
high level of user interface, there are some things it cannot do. The
most important ones are breakpointing and single-stepping kernel
code.

If you need to do low-level debugging on your kernel, there is an
on-line debugger available called DDB. It allows setting
breakpoints, single-stepping kernel functions, examining and changing
kernel variables, etc. However, it cannot access kernel source files,
and only has access to the global and static symbols, not to the full
debug information like kgdb.

To configure your kernel to include DDB, add the option line
options DDB
to your config file, and rebuild. (See Kernel Configuration for details on
configuring the FreeBSD kernel.)

Note that if you have an older version of the boot blocks, your
debugger symbols might not be loaded at all. Update the boot blocks;
the recent ones load the DDB symbols automagically.

Once your DDB kernel is running, there are several ways to enter
DDB. The first, and earliest way is to type the boot flag
right at the boot prompt. The kernel will start up
in debug mode and enter DDB prior to any device probing. Hence you can
even debug the device probe/attach functions.

The second scenario is a hot-key on the keyboard, usually
Ctrl-Alt-ESC. For syscons, this can be remapped; some of the
distributed maps do this, so watch out. There is an option available
for serial consoles that allows the use of a serial line BREAK on the
console line to enter DDB (options BREAK_TO_DEBUGGER
in the kernel config file). It is not the default since there are a lot
of crappy serial adapters around that gratuitously generate a BREAK
condition, for example when pulling the cable.

The third way is that any panic condition will branch to DDB if the
kernel is configured to use it. For this reason, it is not wise to
configure a kernel with DDB for a machine running unattended.

The DDB commands roughly resemble some gdb
commands. The first thing you probably need to do is to set a
breakpoint:b function-nameb addressNumbers are taken hexadecimal by default, but to make them distinct
from symbol names; hexadecimal numbers starting with the letters
a-f need to be preceded with 0x
(this is optional for other numbers). Simple expressions are allowed,
for example: function-name + 0x103.To continue the operation of an interrupted kernel, simply
type:cTo get a stack trace, use:traceNote that when entering DDB via a hot-key, the kernel is currently
servicing an interrupt, so the stack trace might be not of much use
for you.If you want to remove a breakpoint, usedeldel address-expressionThe first form will be accepted immediately after a breakpoint hit,
and deletes the current breakpoint. The second form can remove any
breakpoint, but you need to specify the exact address; this can be
obtained from:show bTo single-step the kernel, try:sThis will step into functions, but you can make DDB trace them until
the matching return statement is reached by:nThis is different from gdb's
next statement; it is like gdb's
finish.To examine data from memory, use (for example):
x/wx 0xf0133fe0,40
x/hd db_symtab_space
x/bc termbuf,10
x/s stringbuf

for word/halfword/byte access, and hexadecimal/decimal/character/string
display. The number after the comma is the object count. To display
the next 0x10 items, simply use:

x ,10

Similarly, use

x/ia foofunc,10

to disassemble the first 0x10 instructions of
foofunc, and display them along with their offset
from the beginning of foofunc.

To modify memory, use the write command:

w/b termbuf 0xa 0xb 0
w/w 0xf0010030 0 0

The command modifier
(b/h/w)
specifies the size of the data to be written, the first following
expression is the address to write to and the remainder is interpreted
as data to write to successive memory locations.

If you need to know the current registers, use:

show reg

Alternatively, you can display a single register value by e.g.

p $eax

and modify it by:

set $eax new-value

Should you need to call some kernel functions from DDB, simply
say:

call func(arg1, arg2, ...)

The return value will be printed.

For a &man.ps.1; style summary of all running processes, use:

ps

Now that you have examined why your kernel failed, you will want to
reboot. Remember that, depending on the severity of previous
malfunctioning, not all parts of the kernel might still be working as
expected. Perform one of the following actions to shut down and reboot
your system:

call diediedie()

This will cause your kernel to dump core and reboot, so you can
later analyze the core on a higher level with kgdb. This command
usually must be followed by another continue
statement. There is now an alias for this:
panic.

call boot(0)

This might be a good way to cleanly shut down the running system,
sync() all disks, and finally reboot. As long as
the disk and file system interfaces of the kernel are not damaged, this
might give you an almost clean shutdown.

call cpu_reset()

This is the final way out of disaster and almost the same as hitting the
Big Red Button.

If you need a short command summary, simply type:

help

However, it is highly recommended to have a printed copy of the
&man.ddb.4; manual page ready for a debugging
session. Remember that it is hard to read the on-line manual while
single-stepping the kernel.

On-line Kernel Debugging Using Remote GDB

This feature has been supported since FreeBSD 2.2, and it's actually
a very neat one.

GDB has supported remote debugging for
a long time. This is done using a very simple protocol along a serial
line. Unlike the other methods described above, you will need two
machines for doing this. One is the host providing the debugging
environment, including all the sources, and a copy of the kernel binary
with all the symbols in it, and the other one is the target machine that
simply runs a similar copy of the very same kernel (but stripped of the
debugging information).

You should configure the kernel in question with config
-g, include DDB in the configuration, and
compile it as usual. This gives a large binary, due to the
debugging information. Copy this kernel to the target machine, strip
the debugging symbols off with strip -x, and boot it
using the boot option. Connect the first serial
line of the target machine to any serial line of the debugging host.
Now, on the debugging machine, go to the compile directory of the target
kernel, and start gdb:

&prompt.user; gdb -k kernel
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i386-unknown-freebsd),
Copyright 1996 Free Software Foundation, Inc...
(kgdb)

Initialize the remote debugging session (assuming the first serial
port is being used) by:

(kgdb) target remote /dev/cuaa0

Now, on the target host (the one that entered DDB right before even
starting the device probe), type:

Debugger("Boot flags requested debugger")
Stopped at Debugger+0x35: movb $0, edata+0x51bc
db> gdb

DDB will respond with:

Next trap will enter GDB remote protocol mode

Every time you type gdb, the mode will be toggled
between remote GDB and local DDB. In order to force a next trap
immediately, simply type s (step). Your hosting GDB
will now gain control over the target kernel:

Remote debugging using /dev/cuaa0
Debugger (msg=0xf01b0383 "Boot flags requested debugger")
at ../../i386/i386/db_interface.c:257
(kgdb)

You can use this session almost as any other GDB session, including
full access to the source, running it in gud-mode inside an Emacs window
(which gives you an automatic source code display in another Emacs
window) etc.

Remote GDB can also be used to debug LKMs. First build the LKM with
debugging symbols:

&prompt.root; cd /usr/src/lkm/linux
&prompt.root; make clean; make COPTS=-g

Then install this version of the module on the target machine, load
it and use modstat to find out where it was
loaded:

&prompt.root; linux
&prompt.root; modstat
Type Id Off Loadaddr Size Info Rev Module Name
EXEC 0 4 f5109000 001c f510f010 1 linux_mod

Take the load address of the module and add 0x20 (probably to
account for the a.out header). This is the address that the module code
was relocated to. Use the add-symbol-file command in
GDB to tell the debugger about the module:

(kgdb) add-symbol-file /usr/src/lkm/linux/linux_mod.o 0xf5109020
add symbol table from file "/usr/src/lkm/linux/linux_mod.o" at
text_addr = 0xf5109020? (y or n) y
(kgdb)

You now have access to all the symbols in the LKM.

Debugging a Console Driver

Since you need a console driver to run DDB on, things are more
complicated if the console driver itself is failing. You might remember
the use of a serial console (either with modified boot blocks, or by
specifying at the Boot: prompt),
and hook up a standard terminal onto your first serial port. DDB works
on any configured console driver, of course also on a serial
console.
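
The address arithmetic from the LKM procedure earlier in this chapter (take the Loadaddr column from modstat, add 0x20, pass the result to add-symbol-file) can be sketched in a few lines of Python. This is only an illustration and not part of any FreeBSD tool: the function names are hypothetical, the column layout is assumed to match the modstat output shown in the text, and the 0x20 offset is simply the value the text quotes.

```python
# Sketch: derive the add-symbol-file address from one row of modstat output.
# Assumption: columns are "Type Id Off Loadaddr Size Info Rev Module-Name",
# as in the sample output in the text. The helper names are hypothetical.

AOUT_HEADER_OFFSET = 0x20  # offset quoted in the text (probably the a.out header)


def lkm_text_address(modstat_row: str) -> int:
    """Return Loadaddr + 0x20 for a single modstat row."""
    fields = modstat_row.split()
    loadaddr = int(fields[3], 16)  # modstat prints the address in hex without 0x
    return loadaddr + AOUT_HEADER_OFFSET


def add_symbol_file_command(object_path: str, modstat_row: str) -> str:
    """Build the GDB command that loads the module's symbols."""
    return f"add-symbol-file {object_path} {lkm_text_address(modstat_row):#x}"


if __name__ == "__main__":
    row = "EXEC 0 4 f5109000 001c f510f010 1 linux_mod"
    print(add_symbol_file_command("/usr/src/lkm/linux/linux_mod.o", row))
```

For the sample row this yields the same 0xf5109020 address that the text passes to add-symbol-file by hand.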
diff --git a/en_US.ISO8859-1/books/handbook/kerneldebug/chapter.sgml b/en_US.ISO8859-1/books/handbook/kerneldebug/chapter.sgml
index 1ac1153a73..8233ecf626 100644
--- a/en_US.ISO8859-1/books/handbook/kerneldebug/chapter.sgml
+++ b/en_US.ISO8859-1/books/handbook/kerneldebug/chapter.sgml
@@ -1,597 +1,596 @@
Kernel Debugging

Contributed by &a.paul; and &a.joerg;

Debugging a Kernel Crash Dump with kgdb

Here are some instructions for getting kernel debugging working on a
crash dump. They assume that you have enough swap space for a crash
dump. If you have multiple swap partitions and the first one is too
small to hold the dump, you can configure your kernel to use an
alternate dump device (in the config kernel line), or
you can specify an alternate using the
&man.dumpon.8; command. The best way to use &man.dumpon.8; is to set
the dumpdev variable in
/etc/rc.conf. Typically you want to specify one of
the swap devices specified in /etc/fstab. Dumps to
non-swap devices, tapes for example, are currently not supported.
Configure your kernel using config -g. See Kernel Configuration for details on
configuring the FreeBSD kernel.

Use the &man.dumpon.8; command to tell the kernel where to dump to
(note that this will have to be done after configuring the partition in
question as swap space via &man.swapon.8;). This is normally arranged
via /etc/rc.conf and /etc/rc.
Alternatively, you can hard-code the dump device via the
dump clause in the config line of
your kernel config file. This is deprecated and should be used only if
you want a crash dump from a kernel that crashes during booting.

In the following, the term kgdb refers to
gdb run in “kernel debug mode”. This
can be accomplished by either starting gdb with
the option -k, or by linking and starting it under
the name kgdb. This is not being done by default,
however, and the idea is basically deprecated since the GNU folks do
not like their tools to behave differently when called by another
name. This feature may well be discontinued in future
releases.

When the kernel has been built, make a copy of it, say
kernel.debug, and then run strip
-d on the original. Install the original as normal. You
may also install the unstripped kernel, but symbol table lookup time for
some programs will drastically increase, and since the whole kernel is
loaded entirely at boot time and cannot be swapped out later, several
megabytes of physical memory will be wasted.

If you are testing a new kernel, for example by typing the new
kernel's name at the boot prompt, but need to boot a different one in
order to get your system up and running again, boot it only into single
user state using the flag at the boot prompt, and
then perform the following steps:

&prompt.root; fsck -p
&prompt.root; mount -a -t ufs # so your file system for /var/crash is writable
&prompt.root; savecore -N /kernel.panicked /var/crash
&prompt.root; exit # ...to multi-user

This instructs &man.savecore.8; to use another kernel for symbol
name extraction. It would otherwise default to the currently running
kernel and most likely not do anything at all since the crash dump and
the kernel symbols differ.

Now, after a crash dump, go to
/sys/compile/WHATEVER and run
kgdb. From kgdb do:

symbol-file kernel.debug
exec-file /var/crash/kernel.0
core-file /var/crash/vmcore.0

and voila, you can debug the crash dump using the kernel sources just
like you can for any other program.

Here is a script log of a kgdb session
illustrating the procedure. Long lines have been folded to improve
readability, and the lines are numbered for reference. Despite this, it
is a real-world error trace taken during the development of the pcvt
console driver.
-
1:Script started on Fri Dec 30 23:15:22 1994
+ 1:Script started on Fri Dec 30 23:15:22 1994
2:&prompt.root; cd /sys/compile/URIAH
3:&prompt.root; kgdb kernel /var/crash/vmcore.1
4:Reading symbol data from /usr/src/sys/compile/URIAH/kernel
...done.
5:IdlePTD 1f3000
6:panic: because you said to!
7:current pcb at 1e3f70
8:Reading in symbols for ../../i386/i386/machdep.c...done.
9:(kgdb)where
10:#0 boot (arghowto=256) (../../i386/i386/machdep.c line 767)
11:#1 0xf0115159 in panic ()
12:#2 0xf01955bd in diediedie () (../../i386/i386/machdep.c line 698)
13:#3 0xf010185e in db_fncall ()
14:#4 0xf0101586 in db_command (-266509132, -266509516, -267381073)
15:#5 0xf0101711 in db_command_loop ()
16:#6 0xf01040a0 in db_trap ()
17:#7 0xf0192976 in kdb_trap (12, 0, -272630436, -266743723)
18:#8 0xf019d2eb in trap_fatal (...)
19:#9 0xf019ce60 in trap_pfault (...)
20:#10 0xf019cb2f in trap (...)
21:#11 0xf01932a1 in exception:calltrap ()
22:#12 0xf0191503 in cnopen (...)
23:#13 0xf0132c34 in spec_open ()
24:#14 0xf012d014 in vn_open ()
25:#15 0xf012a183 in open ()
26:#16 0xf019d4eb in syscall (...)
27:(kgdb)up 10
28:Reading in symbols for ../../i386/i386/trap.c...done.
29:#10 0xf019cb2f in trap (frame={tf_es = -260440048, tf_ds = 16, tf_\
30:edi = 3072, tf_esi = -266445372, tf_ebp = -272630356, tf_isp = -27\
31:2630396, tf_ebx = -266427884, tf_edx = 12, tf_ecx = -266427884, tf\
32:_eax = 64772224, tf_trapno = 12, tf_err = -272695296, tf_eip = -26\
33:6672343, tf_cs = -266469368, tf_eflags = 66066, tf_esp = 3072, tf_\
34:ss = -266427884}) (../../i386/i386/trap.c line 283)
35:283 (void) trap_pfault(&frame, FALSE);
36:(kgdb)frame frame->tf_ebp frame->tf_eip
37:Reading in symbols for ../../i386/isa/pcvt/pcvt_drv.c...done.
38:#0 0xf01ae729 in pcopen (dev=3072, flag=3, mode=8192, p=(struct p\
39:roc *) 0xf07c0c00) (../../i386/isa/pcvt/pcvt_drv.c line 403)
40:403 return ((*linesw[tp->t_line].l_open)(dev, tp));
41:(kgdb)list
42:398
43:399 tp->t_state |= TS_CARR_ON;
44:400 tp->t_cflag |= CLOCAL; /* cannot be a modem (:-) */
45:401
46:402 #if PCVT_NETBSD || (PCVT_FREEBSD >= 200)
47:403 return ((*linesw[tp->t_line].l_open)(dev, tp));
48:404 #else
49:405 return ((*linesw[tp->t_line].l_open)(dev, tp, flag));
50:406 #endif /* PCVT_NETBSD || (PCVT_FREEBSD >= 200) */
51:407 }
52:(kgdb)print tp
53:Reading in symbols for ../../i386/i386/cons.c...done.
54:$1 = (struct tty *) 0x1bae
55:(kgdb)print tp->t_line
56:$2 = 1767990816
57:(kgdb)up
58:#1 0xf0191503 in cnopen (dev=0x00000000, flag=3, mode=8192, p=(st\
59:ruct proc *) 0xf07c0c00) (../../i386/i386/cons.c line 126)
60: return ((*cdevsw[major(dev)].d_open)(dev, flag, mode, p));
61:(kgdb)up
62:#2 0xf0132c34 in spec_open ()
63:(kgdb)up
64:#3 0xf012d014 in vn_open ()
65:(kgdb)up
66:#4 0xf012a183 in open ()
67:(kgdb)up
68:#5 0xf019d4eb in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi =\
69: 2158592, tf_esi = 0, tf_ebp = -272638436, tf_isp = -272629788, tf\
70:_ebx = 7086, tf_edx = 1, tf_ecx = 0, tf_eax = 5, tf_trapno = 582, \
71:tf_err = 582, tf_eip = 75749, tf_cs = 31, tf_eflags = 582, tf_esp \
72:= -272638456, tf_ss = 39}) (../../i386/i386/trap.c line 673)
73:673 error = (*callp->sy_call)(p, args, rval);
74:(kgdb)up
75:Initial frame selected; you cannot go up.
76:(kgdb)quit
77:&prompt.root; exit
78:exit
79:
80:Script done on Fri Dec 30 23:18:04 1994
-
Comments to the above script:

line 6:
This is a dump taken from within DDB (see below), hence the
panic comment “because you said to!”, and a rather
long stack trace; the initial reason for going into DDB has been a
page fault trap though.

line 20:
This is the location of function trap()
in the stack trace.

line 36:
Force usage of a new stack frame; this is no longer necessary
now. The stack frames are supposed to point to the right
locations now, even in case of a trap. (I do not have a new core
dump handy <g>, my kernel has not panicked for a rather long
time.) From looking at the code in source line 403, there is a
high probability that either the pointer access for
“tp” was messed up, or the array access was out of
bounds.

line 52:
The pointer looks suspicious, but happens to be a valid
address.

line 56:
However, it obviously points to garbage, so we have found our
error! (For those unfamiliar with that particular piece of code:
tp->t_line refers to the line discipline of
the console device here, which must be a rather small integer
number.)

Debugging a crash dump with DDD

Examining a kernel crash dump with a graphical debugger like
ddd is also possible. Add the -k
option to the ddd command line you would use
normally. For example:

&prompt.root; ddd -k /var/crash/kernel.0 /var/crash/vmcore.0

You should then be able to go about looking at the crash dump using
ddd's graphical interface.

Post-mortem Analysis of a Dump

What do you do if a kernel dumped core but you did not expect it,
and it is therefore not compiled using config -g? Not
everything is lost here. Do not panic!

Of course, you still need to enable crash dumps. See above on the
options you have to specify in order to do this.

Go to your kernel compile directory, and edit the line containing
COPTFLAGS?=-O. Add the option
there (but do not change anything on the level of
optimization). If you already know roughly the probable location of
the failing piece of code (e.g., the pcvt
driver in the example above), remove all the object files for this code.
Rebuild the kernel. Due to the time stamp change on the Makefile,
some other object files will be rebuilt as well, for example
trap.o. With a bit of luck, the added
option will not change anything for the generated
code, so you will finally get a new kernel with similar code to the
faulting one but with some debugging symbols. You should at least verify the
old and new sizes with the
&man.size.1; command. If there is a mismatch, you probably need to
give up here.

Go and examine the dump as described above. The debugging symbols
might be incomplete for some places, as can be seen in the stack trace
in the example above where some functions are displayed without line
numbers and argument lists. If you need more debugging symbols, remove
the appropriate object files and repeat the kgdb
session until you know enough.

All this is not guaranteed to work, but it will work fine in most
cases.

On-line Kernel Debugging Using DDB

While kgdb as an offline debugger provides a very
high level of user interface, there are some things it cannot do. The
most important ones are breakpointing and single-stepping kernel
code.

If you need to do low-level debugging on your kernel, there is an
on-line debugger available called DDB. It allows setting
breakpoints, single-stepping kernel functions, examining and changing
kernel variables, etc. However, it cannot access kernel source files,
and only has access to the global and static symbols, not to the full
debug information like kgdb.

To configure your kernel to include DDB, add the option line

options DDB

to your config file, and rebuild. (See Kernel Configuration for details on
configuring the FreeBSD kernel.)

Note that if you have an older version of the boot blocks, your
debugger symbols might not be loaded at all. Update the boot blocks;
the recent ones load the DDB symbols automagically.

Once your DDB kernel is running, there are several ways to enter
DDB. The first, and earliest way is to type the boot flag
right at the boot prompt. The kernel will start up
in debug mode and enter DDB prior to any device probing. Hence you can
even debug the device probe/attach functions.

The second scenario is a hot-key on the keyboard, usually
Ctrl-Alt-ESC. For syscons, this can be remapped; some of the
distributed maps do this, so watch out. There is an option available
for serial consoles that allows the use of a serial line BREAK on the
console line to enter DDB (options BREAK_TO_DEBUGGER
in the kernel config file). It is not the default since there are a lot
of crappy serial adapters around that gratuitously generate a BREAK
condition, for example when pulling the cable.

The third way is that any panic condition will branch to DDB if the
kernel is configured to use it. For this reason, it is not wise to
configure a kernel with DDB for a machine running unattended.

The DDB commands roughly resemble some gdb
commands. The first thing you probably need to do is to set a
breakpoint:

b function-name
b address

Numbers are taken as hexadecimal by default, but to make them distinct
from symbol names, hexadecimal numbers starting with the letters
a-f need to be preceded with 0x
(this is optional for other numbers). Simple expressions are allowed,
for example: function-name + 0x103.

To continue the operation of an interrupted kernel, simply
type:

c

To get a stack trace, use:

trace

Note that when entering DDB via a hot-key, the kernel is currently
servicing an interrupt, so the stack trace might not be of much use
to you.

If you want to remove a breakpoint, use:

del
del address-expression

The first form will be accepted immediately after a breakpoint hit,
and deletes the current breakpoint. The second form can remove any
breakpoint, but you need to specify the exact address; this can be
obtained from:

show b

To single-step the kernel, try:

s

This will step into functions, but you can make DDB trace them until
the matching return statement is reached by:

n

This is different from gdb's
next statement; it is like gdb's
finish.

To examine data from memory, use (for example):
x/wx 0xf0133fe0,40
x/hd db_symtab_space
x/bc termbuf,10
x/s stringbuf

for word/halfword/byte access, and hexadecimal/decimal/character/string
display. The number after the comma is the object count. To display
the next 0x10 items, simply use:

x ,10

Similarly, use

x/ia foofunc,10

to disassemble the first 0x10 instructions of
foofunc, and display them along with their offset
from the beginning of foofunc.

To modify memory, use the write command:

w/b termbuf 0xa 0xb 0
w/w 0xf0010030 0 0

The command modifier
(b/h/w)
specifies the size of the data to be written, the first following
expression is the address to write to and the remainder is interpreted
as data to write to successive memory locations.

If you need to know the current registers, use:

show reg

Alternatively, you can display a single register value by e.g.

p $eax

and modify it by:

set $eax new-value

Should you need to call some kernel functions from DDB, simply
say:

call func(arg1, arg2, ...)

The return value will be printed.

For a &man.ps.1; style summary of all running processes, use:

ps

Now that you have examined why your kernel failed, you will want to
reboot. Remember that, depending on the severity of previous
malfunctioning, not all parts of the kernel might still be working as
expected. Perform one of the following actions to shut down and reboot
your system:

call diediedie()

This will cause your kernel to dump core and reboot, so you can
later analyze the core on a higher level with kgdb. This command
usually must be followed by another continue
statement. There is now an alias for this:
panic.

call boot(0)

This might be a good way to cleanly shut down the running system,
sync() all disks, and finally reboot. As long as
the disk and file system interfaces of the kernel are not damaged, this
might give you an almost clean shutdown.

call cpu_reset()

This is the final way out of disaster and almost the same as hitting the
Big Red Button.

If you need a short command summary, simply type:

help

However, it is highly recommended to have a printed copy of the
&man.ddb.4; manual page ready for a debugging
session. Remember that it is hard to read the on-line manual while
single-stepping the kernel.

On-line Kernel Debugging Using Remote GDB

This feature has been supported since FreeBSD 2.2, and it's actually
a very neat one.

GDB has supported remote debugging for
a long time. This is done using a very simple protocol along a serial
line. Unlike the other methods described above, you will need two
machines for doing this. One is the host providing the debugging
environment, including all the sources, and a copy of the kernel binary
with all the symbols in it, and the other one is the target machine that
simply runs a similar copy of the very same kernel (but stripped of the
debugging information).

You should configure the kernel in question with config
-g, include DDB in the configuration, and
compile it as usual. This gives a large binary, due to the
debugging information. Copy this kernel to the target machine, strip
the debugging symbols off with strip -x, and boot it
using the boot option. Connect the first serial
line of the target machine to any serial line of the debugging host.
Now, on the debugging machine, go to the compile directory of the target
kernel, and start gdb:

&prompt.user; gdb -k kernel
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i386-unknown-freebsd),
Copyright 1996 Free Software Foundation, Inc...
(kgdb)

Initialize the remote debugging session (assuming the first serial
port is being used) by:

(kgdb) target remote /dev/cuaa0

Now, on the target host (the one that entered DDB right before even
starting the device probe), type:

Debugger("Boot flags requested debugger")
Stopped at Debugger+0x35: movb $0, edata+0x51bc
db> gdb

DDB will respond with:

Next trap will enter GDB remote protocol mode

Every time you type gdb, the mode will be toggled
between remote GDB and local DDB. In order to force a next trap
immediately, simply type s (step). Your hosting GDB
will now gain control over the target kernel:

Remote debugging using /dev/cuaa0
Debugger (msg=0xf01b0383 "Boot flags requested debugger")
at ../../i386/i386/db_interface.c:257
(kgdb)

You can use this session almost as any other GDB session, including
full access to the source, running it in gud-mode inside an Emacs window
(which gives you an automatic source code display in another Emacs
window) etc.

Remote GDB can also be used to debug LKMs. First build the LKM with
debugging symbols:

&prompt.root; cd /usr/src/lkm/linux
&prompt.root; make clean; make COPTS=-g

Then install this version of the module on the target machine, load
it and use modstat to find out where it was
loaded:

&prompt.root; linux
&prompt.root; modstat
Type Id Off Loadaddr Size Info Rev Module Name
EXEC 0 4 f5109000 001c f510f010 1 linux_mod

Take the load address of the module and add 0x20 (probably to
account for the a.out header). This is the address that the module code
was relocated to. Use the add-symbol-file command in
GDB to tell the debugger about the module:

(kgdb) add-symbol-file /usr/src/lkm/linux/linux_mod.o 0xf5109020
add symbol table from file "/usr/src/lkm/linux/linux_mod.o" at
text_addr = 0xf5109020? (y or n) y
(kgdb)

You now have access to all the symbols in the LKM.

Debugging a Console Driver

Since you need a console driver to run DDB on, things are more
complicated if the console driver itself is failing. You might remember
the use of a serial console (either with modified boot blocks, or by
specifying at the Boot: prompt),
and hook up a standard terminal onto your first serial port. DDB works
on any configured console driver, of course also on a serial
console.
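
As a recap of the DDB number syntax described earlier in this chapter (numbers are hexadecimal by default, but a value starting with the letters a-f must carry a 0x prefix so it is not taken for a symbol name), the following Python sketch classifies a single token the way that rule reads. It is an illustration of the rule as stated in the text, not DDB's actual lexer, and the function name is hypothetical.

```python
import re

# Sketch of the rule from the text, not DDB's real parser:
#  - a 0x prefix always marks a hexadecimal number;
#  - a bare token starting with a digit is read as hexadecimal;
#  - anything else (including tokens starting with a-f) is a symbol name.


def classify_ddb_token(token: str):
    """Return ("number", value) or ("symbol", name) for one token."""
    if token.lower().startswith("0x"):
        return ("number", int(token, 16))
    if re.fullmatch(r"[0-9][0-9a-fA-F]*", token):
        return ("number", int(token, 16))
    return ("symbol", token)


if __name__ == "__main__":
    for tok in ("40", "10", "f0133fe0", "0xf0133fe0", "termbuf"):
        print(tok, classify_ddb_token(tok))
```

Under this reading, the count 40 in x/wx 0xf0133fe0,40 means 0x40 objects, and dropping the 0x from the address would turn it into a symbol lookup.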
diff --git a/en_US.ISO_8859-1/books/handbook/kerneldebug/chapter.sgml b/en_US.ISO_8859-1/books/handbook/kerneldebug/chapter.sgml
index 1ac1153a73..8233ecf626 100644
--- a/en_US.ISO_8859-1/books/handbook/kerneldebug/chapter.sgml
+++ b/en_US.ISO_8859-1/books/handbook/kerneldebug/chapter.sgml
@@ -1,597 +1,596 @@
Kernel Debugging

Contributed by &a.paul; and &a.joerg;

Debugging a Kernel Crash Dump with kgdb

Here are some instructions for getting kernel debugging working on a
crash dump. They assume that you have enough swap space for a crash
dump. If you have multiple swap partitions and the first one is too
small to hold the dump, you can configure your kernel to use an
alternate dump device (in the config kernel line), or
you can specify an alternate using the
&man.dumpon.8; command. The best way to use &man.dumpon.8; is to set
the dumpdev variable in
/etc/rc.conf. Typically you want to specify one of
the swap devices specified in /etc/fstab. Dumps to
non-swap devices, tapes for example, are currently not supported.
Configure your kernel using config -g. See Kernel Configuration for details on
configuring the FreeBSD kernel.

Use the &man.dumpon.8; command to tell the kernel where to dump to
(note that this will have to be done after configuring the partition in
question as swap space via &man.swapon.8;). This is normally arranged
via /etc/rc.conf and /etc/rc.
Alternatively, you can hard-code the dump device via the
dump clause in the config line of
your kernel config file. This is deprecated and should be used only if
you want a crash dump from a kernel that crashes during booting.

In the following, the term kgdb refers to
gdb run in “kernel debug mode”. This
can be accomplished by either starting gdb with
the option -k, or by linking and starting it under
the name kgdb. This is not being done by default,
however, and the idea is basically deprecated since the GNU folks do
not like their tools to behave differently when called by another
name. This feature may well be discontinued in future
releases.

When the kernel has been built, make a copy of it, say
kernel.debug, and then run strip
-d on the original. Install the original as normal. You
may also install the unstripped kernel, but symbol table lookup time for
some programs will drastically increase, and since the whole kernel is
loaded entirely at boot time and cannot be swapped out later, several
megabytes of physical memory will be wasted.

If you are testing a new kernel, for example by typing the new
kernel's name at the boot prompt, but need to boot a different one in
order to get your system up and running again, boot it only into single
user state using the flag at the boot prompt, and
then perform the following steps:

&prompt.root; fsck -p
&prompt.root; mount -a -t ufs # so your file system for /var/crash is writable
&prompt.root; savecore -N /kernel.panicked /var/crash
&prompt.root; exit # ...to multi-user

This instructs &man.savecore.8; to use another kernel for symbol
name extraction. It would otherwise default to the currently running
kernel and most likely not do anything at all since the crash dump and
the kernel symbols differ.

Now, after a crash dump, go to
/sys/compile/WHATEVER and run
kgdb. From kgdb do:

symbol-file kernel.debug
exec-file /var/crash/kernel.0
core-file /var/crash/vmcore.0

and voila, you can debug the crash dump using the kernel sources just
like you can for any other program.

Here is a script log of a kgdb session
illustrating the procedure. Long lines have been folded to improve
readability, and the lines are numbered for reference. Despite this, it
is a real-world error trace taken during the development of the pcvt
console driver.
-
1:Script started on Fri Dec 30 23:15:22 1994
+ 1:Script started on Fri Dec 30 23:15:22 1994
2:&prompt.root; cd /sys/compile/URIAH
3:&prompt.root; kgdb kernel /var/crash/vmcore.1
4:Reading symbol data from /usr/src/sys/compile/URIAH/kernel
...done.
5:IdlePTD 1f3000
6:panic: because you said to!
7:current pcb at 1e3f70
8:Reading in symbols for ../../i386/i386/machdep.c...done.
9:(kgdb)where
10:#0 boot (arghowto=256) (../../i386/i386/machdep.c line 767)
11:#1 0xf0115159 in panic ()
12:#2 0xf01955bd in diediedie () (../../i386/i386/machdep.c line 698)
13:#3 0xf010185e in db_fncall ()
14:#4 0xf0101586 in db_command (-266509132, -266509516, -267381073)
15:#5 0xf0101711 in db_command_loop ()
16:#6 0xf01040a0 in db_trap ()
17:#7 0xf0192976 in kdb_trap (12, 0, -272630436, -266743723)
18:#8 0xf019d2eb in trap_fatal (...)
19:#9 0xf019ce60 in trap_pfault (...)
20:#10 0xf019cb2f in trap (...)
21:#11 0xf01932a1 in exception:calltrap ()
22:#12 0xf0191503 in cnopen (...)
23:#13 0xf0132c34 in spec_open ()
24:#14 0xf012d014 in vn_open ()
25:#15 0xf012a183 in open ()
26:#16 0xf019d4eb in syscall (...)
27:(kgdb)up 10
28:Reading in symbols for ../../i386/i386/trap.c...done.
29:#10 0xf019cb2f in trap (frame={tf_es = -260440048, tf_ds = 16, tf_\
30:edi = 3072, tf_esi = -266445372, tf_ebp = -272630356, tf_isp = -27\
31:2630396, tf_ebx = -266427884, tf_edx = 12, tf_ecx = -266427884, tf\
32:_eax = 64772224, tf_trapno = 12, tf_err = -272695296, tf_eip = -26\
33:6672343, tf_cs = -266469368, tf_eflags = 66066, tf_esp = 3072, tf_\
34:ss = -266427884}) (../../i386/i386/trap.c line 283)
35:283 (void) trap_pfault(&frame, FALSE);
36:(kgdb)frame frame->tf_ebp frame->tf_eip
37:Reading in symbols for ../../i386/isa/pcvt/pcvt_drv.c...done.
38:#0 0xf01ae729 in pcopen (dev=3072, flag=3, mode=8192, p=(struct p\
39:roc *) 0xf07c0c00) (../../i386/isa/pcvt/pcvt_drv.c line 403)
40:403 return ((*linesw[tp->t_line].l_open)(dev, tp));
41:(kgdb)list
42:398
43:399 tp->t_state |= TS_CARR_ON;
44:400 tp->t_cflag |= CLOCAL; /* cannot be a modem (:-) */
45:401
46:402 #if PCVT_NETBSD || (PCVT_FREEBSD >= 200)
47:403 return ((*linesw[tp->t_line].l_open)(dev, tp));
48:404 #else
49:405 return ((*linesw[tp->t_line].l_open)(dev, tp, flag));
50:406 #endif /* PCVT_NETBSD || (PCVT_FREEBSD >= 200) */
51:407 }
52:(kgdb)print tp
53:Reading in symbols for ../../i386/i386/cons.c...done.
54:$1 = (struct tty *) 0x1bae
55:(kgdb)print tp->t_line
56:$2 = 1767990816
57:(kgdb)up
58:#1 0xf0191503 in cnopen (dev=0x00000000, flag=3, mode=8192, p=(st\
59:ruct proc *) 0xf07c0c00) (../../i386/i386/cons.c line 126)
60: return ((*cdevsw[major(dev)].d_open)(dev, flag, mode, p));
61:(kgdb)up
62:#2 0xf0132c34 in spec_open ()
63:(kgdb)up
64:#3 0xf012d014 in vn_open ()
65:(kgdb)up
66:#4 0xf012a183 in open ()
67:(kgdb)up
68:#5 0xf019d4eb in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi =\
69: 2158592, tf_esi = 0, tf_ebp = -272638436, tf_isp = -272629788, tf\
70:_ebx = 7086, tf_edx = 1, tf_ecx = 0, tf_eax = 5, tf_trapno = 582, \
71:tf_err = 582, tf_eip = 75749, tf_cs = 31, tf_eflags = 582, tf_esp \
72:= -272638456, tf_ss = 39}) (../../i386/i386/trap.c line 673)
73:673 error = (*callp->sy_call)(p, args, rval);
74:(kgdb)up
75:Initial frame selected; you cannot go up.
76:(kgdb)quit
77:&prompt.root; exit
78:exit
79:
80:Script done on Fri Dec 30 23:18:04 1994
-
Comments to the above script:

line 6:
This is a dump taken from within DDB (see below), hence the
panic comment “because you said to!”, and a rather
long stack trace; the initial reason for going into DDB has been a
page fault trap though.

line 20:
This is the location of function trap()
in the stack trace.

line 36:
Force usage of a new stack frame; this is no longer necessary
now. The stack frames are supposed to point to the right
locations now, even in case of a trap. (I do not have a new core
dump handy <g>, my kernel has not panicked for a rather long
time.) From looking at the code in source line 403, there is a
high probability that either the pointer access for
“tp” was messed up, or the array access was out of
bounds.

line 52:
The pointer looks suspicious, but happens to be a valid
address.

line 56:
However, it obviously points to garbage, so we have found our
error! (For those unfamiliar with that particular piece of code:
tp->t_line refers to the line discipline of
the console device here, which must be a rather small integer
number.)Debugging a crash dump with DDDExamining a kernel crash dump with a graphical debugger like
ddd is also possible. Add the
option to the ddd command line you would use
normally. For example;&prompt.root; ddd -k /var/crash/kernel.0 /var/crash/vmcore.0You should then be able to go about looking at the crash dump using
ddd'd graphical interface.Post-mortem Analysis of a DumpWhat do you do if a kernel dumped core but you did not expect it,
and it is therefore not compiled using config -g? Not
everything is lost here. Do not panic!

Of course, you still need to enable crash dumps. See above on the
options you have to specify in order to do this.

Go to your kernel compile directory, and edit the line containing
COPTFLAGS?=-O. Add the -g option
there (but do not change anything on the level of
optimization). If you already know roughly the probable location of
the failing piece of code (e.g., the pcvt
driver in the example above), remove all the object files for this code.
Rebuild the kernel. Due to the time stamp change on the Makefile,
some other object files will be rebuilt as well, for example
trap.o. With a bit of luck, the added -g
option will not change anything for the generated
code, so you will finally get a new kernel with similar code to the
faulting one but with some debugging symbols. You should at least
compare the old and new sizes with the
&man.size.1; command. If there is a mismatch, you probably need to
give up here.

Go and examine the dump as described above. The debugging symbols
might be incomplete for some places, as can be seen in the stack trace
in the example above where some functions are displayed without line
numbers and argument lists. If you need more debugging symbols, remove
the appropriate object files and repeat the kgdb
session until you know enough.

All this is not guaranteed to work, but it works fine in most
cases.

On-line Kernel Debugging Using DDB

While kgdb as an offline debugger provides a very
high level of user interface, there are some things it cannot do. The
most important ones are breakpointing and single-stepping kernel
code.

If you need to do low-level debugging on your kernel, there is an
on-line debugger available called DDB. It allows setting
breakpoints, single-stepping kernel functions, examining and changing
kernel variables, etc. However, it cannot access kernel source files,
and only has access to the global and static symbols, not to the full
debug information like kgdb.

To configure your kernel to include DDB, add the option line
options DDB
to your config file, and rebuild. (See Kernel Configuration for details on
configuring the FreeBSD kernel.)

Note that if you have an older version of the boot blocks, your
debugger symbols might not be loaded at all. Update the boot blocks;
the recent ones load the DDB symbols automagically.

Once your DDB kernel is running, there are several ways to enter
DDB. The first, and earliest, way is to type the boot flag
right at the boot prompt. The kernel will start up
in debug mode and enter DDB prior to any device probing. Hence you can
even debug the device probe/attach functions.

The second scenario is a hot-key on the keyboard, usually
Ctrl-Alt-ESC. For syscons, this can be remapped; some of the
distributed maps do this, so watch out. There is an option available
for serial consoles that allows the use of a serial line BREAK on the
console line to enter DDB (options BREAK_TO_DEBUGGER
in the kernel config file). It is not the default since there are a lot
of crappy serial adapters around that gratuitously generate a BREAK
condition, for example when pulling the cable.

The third way is that any panic condition will branch to DDB if the
kernel is configured to use it. For this reason, it is not wise to
configure a kernel with DDB for a machine running unattended.

The DDB commands roughly resemble some gdb
commands. The first thing you probably need to do is to set a
breakpoint:

b function-name
b address

Numbers are taken as hexadecimal by default, but to make them distinct
from symbol names, hexadecimal numbers starting with the letters
a-f need to be preceded with 0x
(this is optional for other numbers). Simple expressions are allowed,
for example: function-name + 0x103.

To continue the operation of an interrupted kernel, simply
type:

c

To get a stack trace, use:

trace

Note that when entering DDB via a hot-key, the kernel is currently
servicing an interrupt, so the stack trace might not be of much use
to you.

If you want to remove a breakpoint, use

del
del address-expression

The first form will be accepted immediately after a breakpoint hit,
and deletes the current breakpoint. The second form can remove any
breakpoint, but you need to specify the exact address; this can be
obtained from:

show b

To single-step the kernel, try:

s

This will step into functions, but you can make DDB trace them until
the matching return statement is reached by:

n

This is different from gdb's
next statement; it is like gdb's
finish.

To examine data from memory, use (for example):
x/wx 0xf0133fe0,40
x/hd db_symtab_space
x/bc termbuf,10
x/s stringbuf

for word/halfword/byte access, and hexadecimal/decimal/character/string
display. The number after the comma is the object count. To display
the next 0x10 items, simply use:

x ,10

Similarly, use
x/ia foofunc,10
to disassemble the first 0x10 instructions of
foofunc, and display them along with their offset
from the beginning of foofunc.

To modify memory, use the write command:

w/b termbuf 0xa 0xb 0
w/w 0xf0010030 0 0

The command modifier
(b/h/w)
specifies the size of the data to be written, the first following
expression is the address to write to and the remainder is interpreted
as data to write to successive memory locations.

If you need to know the current registers, use:

show reg

Alternatively, you can display a single register value by e.g.
p $eax
and modify it by:

set $eax new-value

Should you need to call some kernel functions from DDB, simply
say:

call func(arg1, arg2, ...)

The return value will be printed.

For a &man.ps.1; style summary of all running processes, use:

ps

Now you have examined why your kernel failed, and you wish to
reboot. Remember that, depending on the severity of previous
malfunctioning, not all parts of the kernel might still be working as
expected. Perform one of the following actions to shut down and reboot
your system:

call diediedie()

This will cause your kernel to dump core and reboot, so you can
later analyze the core on a higher level with kgdb. This command
usually must be followed by another continue
statement. There is now an alias for this:
panic.

call boot(0)

This attempts to cleanly shut down the running system,
sync() all disks, and finally reboot. As long as
the disk and file system interfaces of the kernel are not damaged, this
might be a good way to get an almost clean shutdown.

call cpu_reset()

This is the final way out of disaster and almost the same as hitting the
Big Red Button.

If you need a short command summary, simply type:

help

However, it is highly recommended to have a printed copy of the
&man.ddb.4; manual page ready for a debugging
session. Remember that it is hard to read the on-line manual while
single-stepping the kernel.

On-line Kernel Debugging Using Remote GDB

This feature has been supported since FreeBSD 2.2, and it's actually
a very neat one.

GDB has supported remote debugging for
a long time. This is done using a very simple protocol along a serial
line. Unlike the other methods described above, you will need two
machines for doing this. One is the host providing the debugging
environment, including all the sources, and a copy of the kernel binary
with all the symbols in it, and the other one is the target machine that
simply runs a similar copy of the very same kernel (but stripped of the
debugging information).

You should configure the kernel in question with config
-g, include into the configuration, and
compile it as usual. This gives a large blurb of a binary, due to the
debugging information. Copy this kernel to the target machine, strip
the debugging symbols off with strip -x, and boot it
using the boot option. Connect the first serial
line of the target machine to any serial line of the debugging host.
Now, on the debugging machine, go to the compile directory of the target
kernel, and start gdb:

&prompt.user; gdb -k kernel
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i386-unknown-freebsd),
Copyright 1996 Free Software Foundation, Inc...
(kgdb)

Initialize the remote debugging session (assuming the first serial
port is being used) by:

(kgdb)target remote /dev/cuaa0

Now, on the target host (the one that entered DDB right before even
starting the device probe), type:

Debugger("Boot flags requested debugger")
Stopped at Debugger+0x35: movb $0, edata+0x51bc
db>gdb

DDB will respond with:

Next trap will enter GDB remote protocol mode

Every time you type gdb, the mode will be toggled
between remote GDB and local DDB. In order to force a next trap
immediately, simply type s (step). Your hosting GDB
will now gain control over the target kernel:

Remote debugging using /dev/cuaa0
Debugger (msg=0xf01b0383 "Boot flags requested debugger")
at ../../i386/i386/db_interface.c:257
(kgdb)

You can use this session almost like any other GDB session, including
full access to the source, running it in gud-mode inside an Emacs window
(which gives you an automatic source code display in another Emacs
window) etc.

Remote GDB can also be used to debug LKMs. First build the LKM with
debugging symbols:

&prompt.root; cd /usr/src/lkm/linux
&prompt.root; make clean; make COPTS=-g

Then install this version of the module on the target machine, load
it and use modstat to find out where it was
loaded:

&prompt.root; linux
&prompt.root; modstat
Type     Id Off Loadaddr Size Info     Rev Module Name
EXEC      0   4 f5109000 001c f510f010   1 linux_mod

Take the load address of the module and add 0x20 (probably to
account for the a.out header). This is the address that the module code
was relocated to. Use the add-symbol-file command in
GDB to tell the debugger about the module:

(kgdb)add-symbol-file /usr/src/lkm/linux/linux_mod.o 0xf5109020
add symbol table from file "/usr/src/lkm/linux/linux_mod.o" at
text_addr = 0xf5109020? (y or n) y
(kgdb)

You now have access to all the symbols in the LKM.

Debugging a Console Driver

Since you need a console driver to run DDB on, things are more
complicated if the console driver itself is failing. In that case,
remember that you can use a serial console (either with modified boot blocks, or by
specifying at the Boot: prompt),
and hook up a standard terminal to your first serial port. DDB works
on any configured console driver, including a serial
console.
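
Returning to the LKM procedure from the remote GDB section: the address arithmetic there is easy to get wrong by hand. The following is a minimal Python sketch, assuming the modstat column layout shown in that example and the 0x20 offset quoted in the text (which the text itself only calls "probably" the a.out header). The function names are hypothetical helpers, not part of any FreeBSD tool.

```python
# Offset added to the load address, as quoted in the text above.
# Assumption: it is the same for every module built this way.
AOUT_HEADER_SIZE = 0x20

def module_text_addr(modstat_line):
    """Compute the relocated text address from one modstat output line."""
    fields = modstat_line.split()
    # Assumed columns: Type Id Off Loadaddr Size Info Rev Module-Name
    load_addr = int(fields[3], 16)
    return load_addr + AOUT_HEADER_SIZE

def add_symbol_file_command(object_path, modstat_line):
    """Build the GDB command that loads the module's symbols."""
    return "add-symbol-file %s 0x%x" % (object_path,
                                        module_text_addr(modstat_line))

line = "EXEC      0   4 f5109000 001c f510f010   1 linux_mod"
print(add_symbol_file_command("/usr/src/lkm/linux/linux_mod.o", line))
# add-symbol-file /usr/src/lkm/linux/linux_mod.o 0xf5109020
```

The printed command matches the one typed at the (kgdb) prompt in the LKM example; for any other module, the path and the modstat line would have to be substituted.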