Page MenuHomeFreeBSD

kernel linker: do not read debug symbol tables for non-debug symbols
ClosedPublic

Authored by kib on Nov 7 2021, 9:16 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Jun 22, 11:10 AM
Unknown Object (File)
Fri, Jun 21, 11:45 AM
Unknown Object (File)
Fri, Jun 21, 11:25 AM
Unknown Object (File)
Fri, Jun 21, 11:08 AM
Unknown Object (File)
Fri, Jun 21, 10:42 AM
Unknown Object (File)
Wed, Jun 19, 11:50 AM
Unknown Object (File)
Sun, Jun 16, 6:34 PM
Unknown Object (File)
Sun, Jun 16, 6:33 PM
Subscribers

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

kib requested review of this revision.Nov 7 2021, 9:16 AM

I tried testing the patch to see if dtrace still works properly. Some KLDs fail to load:

link_elf_obj: symbol bstp_create undefined
linker_load_file: /boot/kernel.test/if_bridge.ko - unsupported file type
link_elf_obj: symbol solaris_cpu undefined
linker_load_file: /boot/kernel.test/dtrace.ko - unsupported file type

I am sure many EXPORT_SYMS definitions are out of date since nothing forces anyone to update them. Perhaps EXPORT_SYMS should default to yes, to reduce the amount of fallout?

sys/kern/link_elf.c
169 ↗(On Diff #98157)

Extra newline.

199 ↗(On Diff #98157)

Should it be under the debug MIB?

1658 ↗(On Diff #98157)

Did several passes over output from

cd /boot/kernel
for x in *.ko; do echo $x; kldload $x; done

This, in fact, revealed more issues with missed recorded dependencies between modules, and I did not tried to resolve that.

Moved sysctls to debug oid.

kib marked 3 inline comments as done.Nov 8 2021, 9:28 AM

Is this staged as a few commits? Module changes and kmod_syms.awk seem like they could be independent changes.

sys/kern/link_elf.c
1562 ↗(On Diff #98181)

link_elf_leak_locals is false here, to me it seems confusing to pass it?
I imagine we'll GC link_elf_leak_locals later on anyhow.

sys/kern/link_elf_obj.c
183

Allow local symbols to participate in kernel module symbol resolution or something like that?

Maybe even add (temporary) in the description if our intent is to remove this soon? Although it might not be apparent that the existence of the sysctl itself is what's temporary, not some aspect of symbol resolution.

kib marked 2 inline comments as done.Nov 8 2021, 1:50 PM

Is this staged as a few commits? Module changes and kmod_syms.awk seem like they could be independent changes.

https://kib.kiev.ua/git/gitweb.cgi?p=deviant3.git;a=shortlog;h=refs/heads/klink_static

sys/kern/link_elf_obj.c
183

Allow local symbols to participate in global module symbol resolution

kib marked an inline comment as done.

Add doc for sysctls.
Use false instead of false var.

LGTM. Only other comment is that I think we should fix the modules before changing the behaviour, to avoid having a broken step in the sequence.

This revision is now accepted and ready to land.Nov 8 2021, 2:14 PM

modules/opensolaris/Makefile still needs to be adjusted, I needed this: https://reviews.freebsd.org/P526 . Some dtrace module makefiles need to export syms as well, I did not finish. Should I work on this further?

EXPORT_SYMS is not documented anywhere that I can find, so requiring it to be set in all modules with dependencies will likely cause some confusion. How can this be mitigated? Can the kernel linker print a message if a matching local symbol is present but ignored, to provide a breadcrumb pointing to debug.link_elf_leak_locals?

modules/opensolaris/Makefile

My original proposal was to introduce this change with the sysctl set to keep the current behaviour, and then toggle it once issues have been fixed. The fact that there are more modules that need to be fixed (and that the default should perhaps be toggled) suggests that maybe we should go this way.

I think it's reasonable to push the first bits of this change now though (style, .awk) at least.

I'm surprised by the fact that there's so much that needs to be fixed in modules though, I guess it's just that the vast majority of our users are amd64?

modules/opensolaris/Makefile still needs to be adjusted, I needed this: https://reviews.freebsd.org/P526 . Some dtrace module makefiles need to export syms as well, I did not finish. Should I work on this further?

It is reasonable to wait for your changes to be committed, I think. I indeed did not tried dtrace.

EXPORT_SYMS is not documented anywhere that I can find, so requiring it to be set in all modules with dependencies will likely cause some confusion. How can this be mitigated? Can the kernel linker print a message if a matching local symbol is present but ignored, to provide a breadcrumb pointing to debug.link_elf_leak_locals?

I do not like an attempts to make kernel issue lectures, it eats non-swappable memory for something that should be read elsewhere. I believe it would be enough to send some longer explanation to current@ before toggling the sysctls default values.

This revision now requires review to proceed.Nov 9 2021, 8:12 AM

I agree with doing this in stages where the default is steal leaky and then fixing modules as followups. Would be nice to flip the default for 14.0 I think as a target.

Would be nice to flip the default for 14.0 I think as a target.

Yes I think we should flip the default for 14.0, and to do so we need to flip it in main well before that.

I tried this patch and a (new, haven't noticed it before) LOR message shows up like this:

lock order reversal:                                                                                                                                                          
 1st 0xffffffff81e9d250 GEOM topology (GEOM topology, sx) @ /root/freebsd/sys/ufs/ffs/ffs_vfsops.c:1434                                                                       
 2nd 0xfffff80002418e70 mntfs (mntfs, lockmgr) @ /root/freebsd/sys/geom/geom_vfs.c:312                                                                                        
lock order mntfs -> GEOM topology established at:                                                                                                                             
#0 0xffffffff80c9ee0c at ??+0                                                                                                                                                 
#1 0xffffffff80c38b67 at ??+0                                                                                                                                                 
#2 0xffffffff80f5d516 at ??+0                                                                                                                                                 
#3 0xffffffff80d14750 at ??+0                                                                                                                                                 
#4 0xffffffff80d12ccb at ??+0                                                                                                                                                 
#5 0xffffffff80d17d17 at ??+0                                                                                                                                                 
#6 0xffffffff80d1a8f1 at ??+0                                                                                                                                                 
#7 0xffffffff80d18fb3 at ??+0                                                                                                                                                 
#8 0xffffffff80bb8913 at ??+0                                                                                                                                                 
#9 0xffffffff80be60c0 at ??+0                                                                                                                                                 
#10 0xffffffff810cbfde at ??+0                                                                                                                                                
lock order GEOM topology -> mntfs attempted at:                                                                                                                               
#0 0xffffffff80c9f712 at ??+0                                                                                                                                                 
#1 0xffffffff80bfcc5e at ??+0                                                                                                                                                 
#2 0xffffffff80d367c4 at ??+0                                                                                                                                                 
#3 0xffffffff80b6fa06 at ??+0                                                                                                                                                 
#4 0xffffffff80f5eaba at ??+0                                                                                                                                                 
#5 0xffffffff80d16299 at ??+0                                                                                                                                                 
#6 0xffffffff80d159b7 at ??+0                                                                                                                                                 
#7 0xffffffff810fa77e at ??+0                                                                                                                                                 
#8 0xffffffff810cb87b at ??+0

It should be

lock order reversal:                                                                                                                                                          
 1st 0xffffffff8399f5c0 GEOM topology (GEOM topology, sx) @ /root/freebsd/sys/ufs/ffs/ffs_vfsops.c:1434                                                                       
 2nd 0xfffffe00b6e93070 mntfs (mntfs, lockmgr) @ /root/freebsd/sys/geom/geom_vfs.c:312                                                                                        
lock order mntfs -> GEOM topology established at:                                                                                                                             
#0 0xffffffff8140a474 at witness_checkorder+0x804                                                                                                                             
#1 0xffffffff813347be at _sx_xlock+0xfe                                                                                                                                       
#2 0xffffffff81908bc2 at ffs_mount+0xe22                                                                                                                                      
#3 0xffffffff814d2e3b at vfs_domount+0x102b                                                                                                                                   
#4 0xffffffff814d006a at vfs_donmount+0xada                                                                                                                                   
#5 0xffffffff814d898a at kernel_mount+0x11a                                                                                                                                   
#6 0xffffffff814dca58 at parse_mount+0x568                                                                                                                                    
#7 0xffffffff814da5d6 at vfs_mountroot+0xc16                                                                                                                                  
#8 0xffffffff8125a574 at start_init+0xb4                                                                                                                                      
#9 0xffffffff812a9f51 at fork_exit+0xa1                                                                                                                                       
#10 0xffffffff81b8682e at fork_trampoline+0xe                                                                                                                                 
lock order GEOM topology -> mntfs attempted at:                                                                                                                               
#0 0xffffffff8140b022 at witness_checkorder+0x13b2                                                                                                                            
#1 0xffffffff812cfcf9 at lockmgr_lock_flags+0x259                                                                                                                             
#2 0xffffffff81510817 at _vn_lock+0x137                                                                                                                                       
#3 0xffffffff811d2ba4 at g_vfs_close+0xd4                                                                                                                                     
#4 0xffffffff8190b374 at ffs_unmount+0x374                                                                                                                                    
#5 0xffffffff814d5a5d at dounmount+0xead                                                                                                                                      
#6 0xffffffff814d4af1 at kern_unmount+0x431                                                                                                                                   
#7 0xffffffff81bcbf9a at amd64_syscall+0x2ea                                                                                                                                  
#8 0xffffffff81b860cb at fast_syscall_common+0xf8

I tried this patch and a (new, haven't noticed it before) LOR message shows up like this:

lock order reversal:                                                                                                                                                          
 1st 0xffffffff81e9d250 GEOM topology (GEOM topology, sx) @ /root/freebsd/sys/ufs/ffs/ffs_vfsops.c:1434                                                                       
 2nd 0xfffff80002418e70 mntfs (mntfs, lockmgr) @ /root/freebsd/sys/geom/geom_vfs.c:312                                                                                        
lock order mntfs -> GEOM topology established at:                                                                                                                             
#0 0xffffffff80c9ee0c at ??+0                                                                                                                                                 
#1 0xffffffff80c38b67 at ??+0                                                                                                                                                 
#2 0xffffffff80f5d516 at ??+0                                                                                                                                                 
#3 0xffffffff80d14750 at ??+0                                                                                                                                                 
#4 0xffffffff80d12ccb at ??+0                                                                                                                                                 
#5 0xffffffff80d17d17 at ??+0                                                                                                                                                 
#6 0xffffffff80d1a8f1 at ??+0                                                                                                                                                 
#7 0xffffffff80d18fb3 at ??+0                                                                                                                                                 
#8 0xffffffff80bb8913 at ??+0                                                                                                                                                 
#9 0xffffffff80be60c0 at ??+0                                                                                                                                                 
#10 0xffffffff810cbfde at ??+0                                                                                                                                                
lock order GEOM topology -> mntfs attempted at:                                                                                                                               
#0 0xffffffff80c9f712 at ??+0                                                                                                                                                 
#1 0xffffffff80bfcc5e at ??+0                                                                                                                                                 
#2 0xffffffff80d367c4 at ??+0                                                                                                                                                 
#3 0xffffffff80b6fa06 at ??+0                                                                                                                                                 
#4 0xffffffff80f5eaba at ??+0                                                                                                                                                 
#5 0xffffffff80d16299 at ??+0                                                                                                                                                 
#6 0xffffffff80d159b7 at ??+0                                                                                                                                                 
#7 0xffffffff810fa77e at ??+0                                                                                                                                                 
#8 0xffffffff810cb87b at ??+0

It should be

lock order reversal:                                                                                                                                                          
 1st 0xffffffff8399f5c0 GEOM topology (GEOM topology, sx) @ /root/freebsd/sys/ufs/ffs/ffs_vfsops.c:1434                                                                       
 2nd 0xfffffe00b6e93070 mntfs (mntfs, lockmgr) @ /root/freebsd/sys/geom/geom_vfs.c:312                                                                                        
lock order mntfs -> GEOM topology established at:                                                                                                                             
#0 0xffffffff8140a474 at witness_checkorder+0x804                                                                                                                             
#1 0xffffffff813347be at _sx_xlock+0xfe                                                                                                                                       
#2 0xffffffff81908bc2 at ffs_mount+0xe22                                                                                                                                      
#3 0xffffffff814d2e3b at vfs_domount+0x102b                                                                                                                                   
#4 0xffffffff814d006a at vfs_donmount+0xada                                                                                                                                   
#5 0xffffffff814d898a at kernel_mount+0x11a                                                                                                                                   
#6 0xffffffff814dca58 at parse_mount+0x568                                                                                                                                    
#7 0xffffffff814da5d6 at vfs_mountroot+0xc16                                                                                                                                  
#8 0xffffffff8125a574 at start_init+0xb4                                                                                                                                      
#9 0xffffffff812a9f51 at fork_exit+0xa1                                                                                                                                       
#10 0xffffffff81b8682e at fork_trampoline+0xe                                                                                                                                 
lock order GEOM topology -> mntfs attempted at:                                                                                                                               
#0 0xffffffff8140b022 at witness_checkorder+0x13b2                                                                                                                            
#1 0xffffffff812cfcf9 at lockmgr_lock_flags+0x259                                                                                                                             
#2 0xffffffff81510817 at _vn_lock+0x137                                                                                                                                       
#3 0xffffffff811d2ba4 at g_vfs_close+0xd4                                                                                                                                     
#4 0xffffffff8190b374 at ffs_unmount+0x374                                                                                                                                    
#5 0xffffffff814d5a5d at dounmount+0xead                                                                                                                                      
#6 0xffffffff814d4af1 at kern_unmount+0x431                                                                                                                                   
#7 0xffffffff81bcbf9a at amd64_syscall+0x2ea                                                                                                                                  
#8 0xffffffff81b860cb at fast_syscall_common+0xf8

This should be fixed by the following snippet:

diff --git a/sys/kern/kern_linker.c b/sys/kern/kern_linker.c
index 288f8928a429..71c3763b288d 100644
--- a/sys/kern/kern_linker.c
+++ b/sys/kern/kern_linker.c
@@ -955,7 +955,7 @@ linker_debug_symbol_values(c_linker_sym_t sym, linker_symval_t *symval)
 	linker_file_t lf;
 
 	TAILQ_FOREACH(lf, &linker_files, link) {
-		if (LINKER_SYMBOL_VALUES(lf, sym, symval) == 0)
+		if (LINKER_DEBUG_SYMBOL_VALUES(lf, sym, symval) == 0)
 			return (0);
 	}
 	return (ENOENT);

I added it to the patchset.

linker_debug_symbol_values(): use proper linker interface to get debug values

This revision was not accepted when it landed; it landed in state Needs Review.Nov 18 2021, 1:57 PM
This revision was automatically updated to reflect the committed changes.
kib updated this revision to Diff 98692.

Rebase after modules bits were committed.

Rebase.
Change defaults to still leak locals.

emaste added inline comments.
sys/kern/link_elf.c
197 ↗(On Diff #99649)

Should we leave a note to remove this again after some time (e.g., default false in FreeBSD 14 and remove in FreeBSD 15)?

This revision is now accepted and ready to land.Dec 7 2021, 2:26 PM
sys/kern/link_elf.c
197 ↗(On Diff #99649)

I believe we should flip it much earlier than 'for FreeBSD14'. I did flip it back now to get some more reviews and push the patch without also taking the stream of issues.

I believe that the plan could be to send the warning to current@ and set it to false in 2 weeks or 1 month, not sure. Meantime I want some more reviews and perhaps testing of the 'false' by recent reporters.

sys/kern/link_elf.c
197 ↗(On Diff #99649)

Yes, flip it soon. I would say 2 weeks is probably too early as holidays are starting, but a month should be fine.

I just mean that if it toggles soon then FreeBSD 14 would be the first release with it defaulting to false. If we're going to remove it again that could happen just after 14 branches.

link_elf_each_function_name() and link_elf_each_function_nameval() behave inconsistently. The latter obeys leak_local, while the former does not. Moreover, the loop in link_elf_each_function_nameval() aborts if a local symbol is encountered, but it should really just continue to the next one. That function is only used by dtrace, though, and it doesn't make sense to hide local symbols from it, so link_elf_each_function_nameval() should probably just call link_elf_debug_symbol_values(), or do something equivalent.

kib marked 2 inline comments as done.

Use link_elf_debug_symbol_values() in link_elf_each_function_nameval().
More style.

This revision now requires review to proceed.Dec 8 2021, 7:04 PM
This revision is now accepted and ready to land.Dec 8 2021, 9:25 PM