This feature is aim at preventing debugger to load incompatible kernel dump with corresponding module. The mechanism behind it is to expose the address of uuid to debugger and let the debugger check if the dump has the same uuid with the binary file. If they are same, it can be loaded successfully. If not, even though we can load it, the debug information may be wrong.
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 57661 Build 54549: arc lint + arc unit
Event Timeline
- Add uuid for module to prevent incompetable coredump loaded by debugger
- Expose offset of uuid_addr and uuid_size in linker_file for debugger
It probably makes sense to split the change into two: one for loader, one for kernel.
Don't we need to ensure that notes are preserved, in the modules linker scripts? (sys/conf/ldscript.kmod.ARCH)
I assume that the debugger changes as well as adding UUIDS to the built modules would happen somewhere else?
stand/common/load_elf.c | ||
---|---|---|
746 | Indent should be +4 spaces for the continuation line | |
sys/kern/link_elf.c | ||
1320 | Unindent by 4 spaces back | |
sys/kern/link_elf_obj.c | ||
608 | unindent by -4 spaces |
- Oh, I will separate it into two diffs.
- I think ldscript already guarantee that. Besides, I decide to skip this check in LLDB with waring printed as a fallback.
- Yes, I will write and send the patch in here as an extension of the previous work.
Thinking about it more, why does the loader change needed at all? Kernel parses/should parse the elf.
For preloaded relocatable kernel module, I think the modification of loader is necessary as I discover SHT_NOTE section was not loaded by boot1 originally when the kernel module is preloaded.
For shared object, actually we can get the address of uuid when kernel is loaded. I modify the loader because I discover that CTORS pass the information from loader and I think I should follow it?
This is most likely can/should be fixed by linker script.
For shared object, actually we can get the address of uuid when kernel is loaded. I modify the loader because I discover that CTORS pass the information from loader and I think I should follow it?
Let me reformulate my point: this functionality (uuid module identification) should not depend on updated loader. I do not see why do we need to establish such requirement.
After my investigation, I discover that shstrtab section is not loaded into memory (because no SHF_ALLOC). Thus I cannot compare the section name with ".note.gnu.build-id". Besides, in link_elf_link_preload, the function don't open the file as link_elf_load_file but use all of the memory and preload information from boot1. Thus if I want to compare the section name, I should load the file, create temp memory space for it, and read file. Does it worth to a file in kernel just for compare a section name?
@aokblast
I'm not familiar with debuggers, so this might be a silly question.
@emaste made a commit 74cd06b42ea6 which will write note.gnu.build-id section to kernel / kernel module files. Can that ( note.gnu.build-id ) serve the same purpose aim at preventing debugger to load incompatible kernel dump with corresponding module ?
An example of note.gnu.build-id on 13.3:
# freebsd-version -k 13.3-RELEASE-p1 # readelf -n /boot/kernel/if_bridge.ko Notes at offset 0x0000b798 with length 0x00000024: Owner Data size Description GNU 0x00000014 NT_GNU_BUILD_ID (Build id set by ld(1)) Build ID: ab49d4bfb1a1ed65193083dc7f7eb653a65fad0d # readelf -n /boot/kernel/if_lagg.ko Notes at offset 0x0001324c with length 0x00000024: Owner Data size Description GNU 0x00000014 NT_GNU_BUILD_ID (Build id set by ld(1)) Build ID: 90a012c146bc5a6e1dd3d04358f39fc947b521ea
And on 14.0
# freebsd-versioin -k 14.0-RELEASE-p6 # readelf -n /boot/kernel/if_lagg.ko Notes at offset 0x0001359c with length 0x00000024: Owner Data size Description GNU 0x00000014 NT_GNU_BUILD_ID (Build id set by ld(1)) Build ID: 934128ad28a2562d0c1dcedbaa067d9fb93d27de
Sorry about my late reply. The patch Ed commit is the pre-request to ensure my patch work. The logic of my patch is the loader would load the build-id from elf file in .gnu.note.build-id section into memory. When kernel crashed, it generate coredump and the build-id loaded from the loader will dump into the coredump file at the same time. After that, when debugger load kernel and it's coredump, the debugger will try to find the elf file of module and its debug file. We can compare the build-id from memory with elf file. If they are different, we shouldn't load the debug file (because they are incompatible).
I don't think we can do this for .o kernel modules. When loading .o files, the loader does not copy SHT_NOTE sections into memory. With a linker script we can push the contents of the .note.gnu.build-id section into .rodata, but then the input section is removed, so debuggers cannot find it. I cannot find a way to avoid this.
For .so kmods, it should be possible to use a linker script to ensure that the build-id section is covered by a PT_LOAD segment. Then, no loader changes are needed, but we need to write a linker script for kmods for each platform. I guess that's reasonable. It is easier (and more uniform) for the loader to handle this though.
Isn't it same as the modinit sections? We find them with the special linker-generated start/stop symbols, which should be done for build-id sections as well.
For .so kmods, it should be possible to use a linker script to ensure that the build-id section is covered by a PT_LOAD segment. Then, no loader changes are needed, but we need to write a linker script for kmods for each platform. I guess that's reasonable. It is easier (and more uniform) for the loader to handle this though.
We can add start/stop symbols but the section itself will not be loaded by the boot loader, since it has type SHT_NOTE. So far I do not see how it can work unless the loader is modified to load SHT_NOTE sections.
You said above, that the section can be moved into .rodata with the script, so I did not copied that part. I do not see why would the combination of these two (.note->.rodata + start/stop) would be not enough.
Because then the .note.gnu.build-id section is not present in the .ko file, so debuggers do not know how to find it (without some custom FreeBSD-specific code). The intent behind this patch is to make it possible for debuggers to verify the build-id of the target (I guess a vmcore or a live system) against on-disk files.
This is going in cicrles. The content of the section is moved into .rodata with the liner script. The start and end of the section are available by the values of the special start and end symbols. What else is needed for debuggers?
I don't think we can do this for .o kernel modules. When loading .o files, the loader does not copy SHT_NOTE sections into memory. With a linker script we can push the contents of the .note.gnu.build-id section into .rodata, but then the input section is removed, so debuggers cannot find it. I cannot find a way to avoid this.
Isn't it same as the modinit sections? We find them with the special linker-generated start/stop symbols, which should be done for build-id sections as well.
We can add start/stop symbols but the section itself will not be loaded by the boot loader, since it has type SHT_NOTE. So far I do not see how it can work unless the loader is modified to load SHT_NOTE sections.
You said above, that the section can be moved into .rodata with the script, so I did not copied that part. I do not see why would the combination of these two (.note->.rodata + start/stop) would be not enough.
Because then the .note.gnu.build-id section is not present in the .ko file, so debuggers do not know how to find it (without some custom FreeBSD-specific code). The intent behind this patch is to make it possible for debuggers to verify the build-id of the target (I guess a vmcore or a live system) against on-disk files.
This is going in cicrles. The content of the section is moved into .rodata with the liner script. The start and end of the section are available by the values of the special start and end symbols. What else is needed for debuggers?
How does a debugger (gdb, lldb, ...) know about the special symbols? It needs custom code, instead of looking for the NT_GNU_BUILD_ID note. It seems preferable to avoid that.
Doesn't debugger need a special knowledge anyway, because .ko object files are not finally linked objects (.exe or dso), they do not have program headers and lot of other related stuff. In particular, they do not have PT_GNU_NOTE program header. On the other hand, correct debugger must not assume that sections are present at all in the finally linked objects. So .ko modules are quite special already.
lldb and (k)gdb do not. I can run kgdb against a live amd64 kernel and it already knows how to fetch build IDs from the .kos on my filesystem:
(kgdb) info files Symbols from "/boot/kernel.GENERIC-NODEBUG/kernel". Kernel core dump file: `/dev/mem', file type FreeBSD kernel vmcore. Local exec file: ... 0xffffffff82162428 - 0xffffffff8216244c is .note.gnu.build-id in /boot/kernel.GENERIC-NODEBUG/cryptodev.ko
So if we change the linker script to insert the build-id into .rodata, gdb will presumably need some modification to handle this. Perhaps @jhb can comment on this.
lldb already supports object files; there, it simply looks for the .note.gnu.build-id SHT_NOTE section.
So should we divided our patch into amd64/arm64 for further discussion about two cases. Or divided the patch into kernel/loader?
We should do this via normal ELF and not invent new ways to store the build-id. The build-id should already be memory-resident for the link_elf.c case. What we really should be doing is storing the build-id for the kernel as a .note in the crash dump. We should also probably just be treating the .note as memory-resident for .o files (if not already, Mark suggested they are already memory-resident), and kgdb can look for the build-id that way rather than digging around in the linker_files TAILQ.
But we cannot just rely on the file to tell use where the memory is because it maybe wrong after you compile the module. For relocatable file, you don't even have VMA to tell you where is the memory. We should use uuid_addr from kernel memory. This is why I think we should have build-id feature (The file loaded is different from the one statically resides in your file system)
Expose uuid for kernel module in kernel to prevent incompetable coredump loaded by debugger for relocatable file
It is far worse though, we can't find the linker_files linked list in a crash dump without already having the correct kernel to find the linker_files and related symbols, so this change doesn't really add anything useful that I can see. At least, I don't believe we are currently including the kernel's build-id in the minidump header (and it doesn't end up in the /var/crash/info.X file). That's probably the first thing to address in some way.
Long term I plan to rework kernel crash dumps to be a true ELF core dump, and then we can have a regular PT_NOTE in the core that includes the build-id of the running kernel. That will let you more reliably map the raw kernel modules so you can then inspect them to find their embedded build-id. In the near term it may be easier to remember the build-id from the main kernel explicitly and store that in a new version of the minidump header (it's just a royal PITA to extend minidump headers which is why I want to switch to ELF which is more easily extensible)
I think you are right, I don't think about kernel problem. And in my opinion, it is the chicken-and-egg conundrum if we implemented it in a normal ELF way (rely on VMA of section) for kernel. But it is still a mitigation for driver developer when developing kernel module. Besides, we still need uuid_addr for relocatable kernel module. I have extract my patch to contain relocatable file handling only. So I think we should preserve this patch not merge it until finding a way to verify who is the correct kernel? (by extended the kernel dump header as you mentioned or something else?)
My point is that existing debuggers already know to extract build-id from PT_NOTE sections in cores, etc. and that it will likely be much simpler to implement support for build-id matching in kgdb (which I maintain) if we just ensure the note is memory-resident vs adding a custom entry in the linker_files linked list. Probably it would be worth while building the debugger side out in either lldb or gdb to validate the kernel changes.
So do you means we only aware of the incompatible kernel and don't implement similar check for kernel module?