Page MenuHomeFreeBSD

kern linker: do not allow more than one kldload and kldunload syscalls simultaneously
ClosedPublic

Authored by kib on May 25 2021, 5:10 PM.

Details

Summary

kld_sx is dropped e.g. for executing sysinits, which allows user to initiate kldunload while module is not yet fully initialized.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

kib requested review of this revision.May 25 2021, 5:10 PM
kib created this revision.

I suspect it should be a linker file flag, similar to LINKER_FILE_LINKED. linker_file_load() may be called in other paths, mainly through linker_reference_module() when a driver requests a firmware image.

I suspect it should be a linker file flag, similar to LINKER_FILE_LINKED. linker_file_load() may be called in other paths, mainly through linker_reference_module() when a driver requests a firmware image.

It does not play well with ordering with kld_sx. I just wrapped firmware load task function with busy.

In D30456#684512, @kib wrote:

I suspect it should be a linker file flag, similar to LINKER_FILE_LINKED. linker_file_load() may be called in other paths, mainly through linker_reference_module() when a driver requests a firmware image.

It does not play well with ordering with kld_sx. I just wrapped firmware load task function with busy.

But isn't the scope too wide now? Suppose a driver requests a firmware image from its DEVICE_ATTACH routine. It creates a module with DRIVER_MODULE, which adds an entry to the module metadata linker set and adds a sysinit call to module_register_init(). module_register_init() is called when the linker is busied, and it calls the MOD_LOAD handler for the driver, which calls driver_module_handler(), which typically results in DEVICE_PROBE and _ATTACH being called. Now the firmware_get() call will block forever because the linker is already busied. Am I missing something there? I did not test it.

Maybe we could store the busying thread and permit recursion, but I think this won't work easily since firmware_get() uses a taskqueue thread to actually load the module.

In D30456#684512, @kib wrote:

I suspect it should be a linker file flag, similar to LINKER_FILE_LINKED. linker_file_load() may be called in other paths, mainly through linker_reference_module() when a driver requests a firmware image.

It does not play well with ordering with kld_sx. I just wrapped firmware load task function with busy.

But isn't the scope too wide now? Suppose a driver requests a firmware image from its DEVICE_ATTACH routine. It creates a module with DRIVER_MODULE, which adds an entry to the module metadata linker set and adds a sysinit call to module_register_init(). module_register_init() is called when the linker is busied, and it calls the MOD_LOAD handler for the driver, which calls driver_module_handler(), which typically results in DEVICE_PROBE and _ATTACH being called. Now the firmware_get() call will block forever because the linker is already busied. Am I missing something there? I did not test it.

Maybe we could store the busying thread and permit recursion, but I think this won't work easily since firmware_get() uses a taskqueue thread to actually load the module.

First, more I think about it, more I do not see why would firmware load need to use this busy mechanism. It is not a user initiated action, so it is fine for kernel to load a firmware blob module whenever it needs. So I tend to just remove the subr_firmware chunks.

Somewhat more fine solution is to allow parallel kldloads and firmware loads, and similarly allow parallel kldunloads. It would still achieve my goal that it is impossible to unload a module that is executing sysinits. The busy state then converted from bool to counter, with e.g. > 0 meaning kldloads are in flight, <0 kldunloads.

Which approach do you prefer?

It does not play well with ordering with kld_sx. I just wrapped firmware load task function with busy.

But isn't the scope too wide now? Suppose a driver requests a firmware image from its DEVICE_ATTACH routine. It creates a module with DRIVER_MODULE, which adds an entry to the module metadata linker set and adds a sysinit call to module_register_init(). module_register_init() is called when the linker is busied, and it calls the MOD_LOAD handler for the driver, which calls driver_module_handler(), which typically results in DEVICE_PROBE and _ATTACH being called. Now the firmware_get() call will block forever because the linker is already busied. Am I missing something there? I did not test it.

Maybe we could store the busying thread and permit recursion, but I think this won't work easily since firmware_get() uses a taskqueue thread to actually load the module.

First, more I think about it, more I do not see why would firmware load need to use this busy mechanism. It is not a user initiated action, so it is fine for kernel to load a firmware blob module whenever it needs. So I tend to just remove the subr_firmware chunks.

Somewhat more fine solution is to allow parallel kldloads and firmware loads, and similarly allow parallel kldunloads. It would still achieve my goal that it is impossible to unload a module that is executing sysinits. The busy state then converted from bool to counter, with e.g. > 0 meaning kldloads are in flight, <0 kldunloads.

Which approach do you prefer?

With the first approach, isn't it still technically possible for a user to trigger kldunload of a firmware module that is being loaded asynchronously by the kernel? I can imagine a scenario where the driver loads firmware only in response to some administrative action, like bringing an interface up. Maybe it is not significant enough to be a concern; I cannot think of any examples, but there are many drivers which use firmware(9).

I suspect the second approach still isn't enough, since drivers may also unload firmware images during DEVICE_ATTACH. For instance, due to an error later on, or because the driver makes a copy of the firmware image in memory and immediately drops the ref on the image file. So if the first approach meets your goal, then let's follow it and keep it simple for now.

sys/kern/kern_linker.c
1054

I think named flags would be better, especially since linker_set_unbusy() shares one of them.

We already frequently use "busy" and "unbusy" as verbs elsewhere in the kernel, so I'd be inclined to call these just linker_(un)busy() or maybe linker_user_(un)busy. Just a suggestion.

With the first approach, isn't it still technically possible for a user to trigger kldunload of a firmware module that is being loaded asynchronously by the kernel? I can imagine a scenario where the driver loads firmware only in response to some administrative action, like bringing an interface up. Maybe it is not significant enough to be a concern; I cannot think of any examples, but there are many drivers which use firmware(9).

Hm, in fact not, user cannot unload firmware module at all, I believe. There are two references to the linker file, normal reference, and kldload reference (userrefs). The later is incremented only by kldload, and kldunload checks for userrefs == 0 and refuses to unload.

So I think I will go with the first approach.

I suspect the second approach still isn't enough, since drivers may also unload firmware images during DEVICE_ATTACH. For instance, due to an error later on, or because the driver makes a copy of the firmware image in memory and immediately drops the ref on the image file. So if the first approach meets your goal, then let's follow it and keep it simple for now.

Even without split references, I do not think that firmware unload is important for my case. My scenario is the following: some automated tests do kldload to start the process, but some async event might trigger kldunload of the same driver. Firmware modules should be safe in this regard.

Stop busying firmware load (at least for now).
Rename functions to linker_kldload_(un)busy, and use flags instead of bool args.

For now I kept functions global.

markj added inline comments.
sys/sys/linker.h
199

Seems it is unused.

This revision is now accepted and ready to land.May 27 2021, 7:39 PM
sys/sys/linker.h
199

In the sense that there is no consumers? Yes but this comes together with keeping the functions global.

sys/sys/linker.h
199

Ok, fair enough.