Page MenuHomeFreeBSD

linuxkpi: Provide a non-NULL value for THIS_MODULE
Needs ReviewPublic

Authored by ashafer_badland.io on Mar 11 2024, 5:55 PM.
Tags
None
Referenced Files
F82327549: D44306.id.diff
Sat, Apr 27, 4:48 PM
Unknown Object (File)
Fri, Apr 26, 12:15 PM
Unknown Object (File)
Fri, Apr 26, 5:35 AM
Unknown Object (File)
Fri, Apr 26, 4:05 AM
Unknown Object (File)
Sun, Apr 21, 8:19 AM
Unknown Object (File)
Sat, Apr 20, 4:55 AM
Unknown Object (File)
Fri, Apr 12, 2:40 PM
Unknown Object (File)
Fri, Apr 5, 2:57 PM

Details

Reviewers
None
Group Reviewers
linuxkpi
Summary

THIS_MODULE is used to differentiate modules on Linux. We currently
completely stub out any Linux struct module usage, but THIS_MODULE
is still used to populate the "owner" fields of various drivers.
Even though we don't actually dereference these "owner" fields they
are still used by drivers to check if devices/dmabufs/etc come
from different modules. For example, during DRM GEM import some
drivers check if the dmabuf's owner matches the dev's owner. If
they match because they are both NULL drivers may incorrectly think
two resources come from the same module.

This defines THIS_MODULE to be the filename casted to a struct module.
This allows us to have differing pointers for every "owner", and
we can rely on the module stubs to never dereference these pointers.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 57278
Build 54166: arc lint + arc unit

Event Timeline

I'm open to other suggestions about how we can do this. We definitely want THIS_MODULE to be non-zero, but having it be an opaque pointer that isn't a struct module isn't that great. Barring fully implementing module in linuxkpi I think something like this could make sense to allow emulated code to check if devices/dmabufs/etc come from the same module?

User reported issue: https://github.com/amshafer/nvidia-driver/issues/21

That's for struct class though, right? I'm talking about struct module, and struct file_operations. Here's an example usage from nvidia-drm: https://github.com/amshafer/nvidia-driver/blob/535.98/nvidia/src/nvidia-drm/nvidia-drm-gem.c#L156. Iirc there's a couple similar ones in other drivers in base.

So based on that I think whatever you're doing to class is fine and doesn't affect this?

bz added inline comments.
sys/compat/linuxkpi/common/include/linux/module.h
70

FILE_NAME won't do much good, will it? I mean that means THIS_MODULE may have 20 different values for a driver with 20 files?

So up for discussion -- not sure if that'll math the Linux version any better...?

I wonder if it would make more sense to define this from a module build Makefile [*] so at least a driver has one value and only set it here if it is not define before?

  • we do have a set of things we define there from the framework, like LINUXKPI_GENSRCS and LINUXKPI_INCLUDES. We could probably add a generic LINUXKPI_CFLAGS as well?
sys/compat/linuxkpi/common/include/linux/module.h
70

I think using the file name does "improve" things in that there is at least a value of THIS_MODULE which differs between modules. But yes like you said it can have multiple mismatching values. Some of the modules use it once so to them this doesn't matter, but others have multiple usages.

So something like add LINUXKPI_CFLAGS (i.e. the existing LINUXKPI_C) in kmod.mk and then define a LINUXKPI_MODNAME macro with the value of $KMOD (the module name)? Then set THIS_MODULE to LINUXKPI_MODNAME. That does seem like it'd get rid of the duplication at the very least.

This comment has been deleted.
sys/compat/linuxkpi/common/include/linux/module.h
70

We have KBUILD_MODNAME macro already, but we can not just do #define THIS_MODULE KBUILD_MODNAME in header as it will still result in 20 different pointer values. THIS_MODULE has to be exported from the single C file or be a kernel linker hack like in Linux.
May be we should just allow THIS_MODULE to be overridden by build system? It seems that @bz advised something similar.

sys/compat/linuxkpi/common/include/linux/module.h
70

I'm still a bit confused about what you all are suggesting by "overridden by build system"? Does that mean injecting a C definition into LINUXKPI_GENSRCS or something? This would definitely be cleaner so if there's a good solution I'm happy to implement it.

KBUILD_MODNAME does seem to do what I suggested though, why would that still result in 20 differen pointer values? Wouldn't it just result in one value per module since I assume all files in that module would have the same KBUILD_MODNAME? That would still let us compare things belonging to different modules afaict. As long as the KBUILD_MODNAME matches it should all reference one global string literal so the pointer is identical for all usages in the module? Or is there something I'm missing.

sys/compat/linuxkpi/common/include/linux/module.h
70

@wulf hmm indeed. So my thinking won't work either. I wonder how much linker magic we would actually need...

sys/compat/linuxkpi/common/include/linux/module.h
70

I'm still a bit confused about what you all are suggesting by "overridden by build system"? Does that mean injecting a C definition into LINUXKPI_GENSRCS or something? This would definitely be cleaner so if there's a good solution I'm happy to implement it.

I meant adding of -DTHIS_MODULE=0xdeadbeef to CFLAGS on compile stage where 0xdeadbeef is unique along the modules. Of course our linux/module.h should handle this.

KBUILD_MODNAME does seem to do what I suggested though, why would that still result in 20 differen pointer values? Wouldn't it just result in one value per module since I assume all files in that module would have the same KBUILD_MODNAME? That would still let us compare things belonging to different modules afaict. As long as the KBUILD_MODNAME matches it should all reference one global string literal so the pointer is identical for all usages in the module? Or is there something I'm missing.

KBUILD_MODNAME is a preprocessor macro rather than a string literal. It is expanded every time it is referred by the code so each instance of KBUILD_MODNAME creates new string literal having a new pointer to it.

sys/compat/linuxkpi/common/include/linux/module.h
70

I meant adding of -DTHIS_MODULE=0xdeadbeef to CFLAGS on compile stage where 0xdeadbeef is unique along the modules. Of course our linux/module.h should handle this.

I'd be a little concerned about how we could pick values in this case though. It sounds like they would have to be pre-assigned or something which doesn't sound great?

I think another alternative might be a linker script. Iirc it's relatively easy to declare variables in linker scripts, the problem is having a valid linker script lying around that contains the definition, which might not be trivial in our build infrastructure? I would guess Linux does something like this.

KBUILD_MODNAME is a preprocessor macro rather than a string literal. It is expanded every time it is referred by the code so each instance of KBUILD_MODNAME creates new string literal having a new pointer to it.

Huh I've always seen multiple identical string literal usages point to the same read-only global string, but this prompted me to look it up and turns out that behavior is not guaranteed by the C spec and is instead implementation defined. Seems both GCC and LLVM usually do it, but if we can't depend on it we can't depend on it.

sys/compat/linuxkpi/common/include/linux/module.h
70

I meant adding of -DTHIS_MODULE=0xdeadbeef to CFLAGS on compile stage where 0xdeadbeef is unique along the modules. Of course our linux/module.h should handle this.

I'd be a little concerned about how we could pick values in this case though. It sounds like they would have to be pre-assigned or something which doesn't sound great?

It does not sound great to me too.

I think another alternative might be a linker script. Iirc it's relatively easy to declare variables in linker scripts, the problem is having a valid linker script lying around that contains the definition, which might not be trivial in our build infrastructure? I would guess Linux does something like this.

KBUILD_MODNAME is a preprocessor macro rather than a string literal. It is expanded every time it is referred by the code so each instance of KBUILD_MODNAME creates new string literal having a new pointer to it.

Huh I've always seen multiple identical string literal usages point to the same read-only global string, but this prompted me to look it up and turns out that behavior is not guaranteed by the C spec and is instead implementation defined. Seems both GCC and LLVM usually do it, but if we can't depend on it we can't depend on it.

Do GCC/LLVM able to detect identical strings across different source/object files at a link time?
Looks that we should ask @imp about adding of kernel linker features

Maybe an interesting solution would be to do something a bit more DRM specific using LKPI_DRIVER_MODULE. Afaict only the DRM drivers will run into the issue that motivated this change. There's no reason we couldn't do something like this in base or just move the bits I mentioned into base.

We could override THIS_MODULE from drm-kmod's linuxkpi/bsd/include/linux/module.h and have it refer to a global variable defined in LKPI_DRIVER_MODULE? Then only set THIS_MODULE to 0x0 in base if THIS_MODULE isn't defined. That would avoid linker nastiness and still prevent panics on DRM. Although it would require drm-kmod header changes it isn't that invasive of a change and we could easily replace it with something more thorough in the future if we wanted to.

Since there were no comments I've gone ahead with the strategy I
outlined in my last comment: Have drm-kmod create a global variable
during LKPI_DRIVER_MODULE and point THIS_MODULE at it.

drm-kmod PR: https://github.com/freebsd/drm-kmod/pull/297

Overall it's not too ugly, and I verified it does prevent the panic.

Bump __FreeBSD_version for drm-kmod

maybe we can do something like:
#define THIS_MODULE ((void *)string_hash(KBUILD_MODNAME))

There are couple of examples of build-time string hashing functions floating around like: https://stackoverflow.com/questions/28654707/how-to-calculate-the-hash-of-a-string-literal-using-only-the-c-preprocessor

Any thoughts?

That's a promising idea. I wonder if instead of having an unusual inline function we could instead make it part of the build setup. Something like setting KBUILD_MODNAME_HASH to whatever sha1 -s "$KBUILD_MODNAME" returns as part of the Makefile? That seems like it might be a little cleaner and easier to debug. Only downside is we have to update all of the makefiles that use KBUILD_MODNAME, I can't think of a way to automate it off the top of my head.

At this point I will ask -- has anyone considered checking how much linker work would be needed for this?

It sounds like linux has insmod set up the __this_module variable when a module is loaded. If you're asking about doing something equivalent then I don't have a good estimate for how hard that would be. I don't know the details though.

I did mention how this could be accomplished using linker scripts earlier: "I think another alternative might be a linker script. Iirc it's relatively easy to declare variables in linker scripts, the problem is having a valid linker script lying around that contains the definition, which might not be trivial in our build infrastructure?". I do know you can create global variables there so that's not a problem, but you have to have/ship the default linker script wit your global variable modifications for every architecture that you want to compile and that might not be trivial for the kernel module building infrastructure? I think this could be the "proper" way of doing it but the effort required would probably be quite high, and I'm not sure if that's something we want to do when all we need (for now) is a way to check if two modules have matching pointers.

I do think that GCC and clang (and really ld.bfd and lld) will collapse duplicate copies of the same anonymous string down to the same storage.

However, is insmod equivalent to just kldload and is __this_module only valid during then? If so, you can hang a suitable pointer to the current struct linker_file off of struct thread that gets set during kldload(2) and cleared before it exits (probably a bit further down in linker_load_file or the like). Then you can #define __this_module to something like curthread->td_current_linker_file with a suitable wrapper cast or whatever.

In D44306#1020400, @jhb wrote:

I do think that GCC and clang (and really ld.bfd and lld) will collapse duplicate copies of the same anonymous string down to the same storage.

However, is insmod equivalent to just kldload and is __this_module only valid during then? If so, you can hang a suitable pointer to the current struct linker_file off of struct thread that gets set during kldload(2) and cleared before it exits (probably a bit further down in linker_load_file or the like). Then you can #define __this_module to something like curthread->td_current_linker_file with a suitable wrapper cast or whatever.

@jhb my understanding is that it is valid for the lifetime of the module (until unloaded). Hence my thinking of how much it would actually need to set an (unresolved) symbol, and how much to add it during build... I think they put it in a special section but I haven't investigated or read up any further.

Oh, if it is just a global symbol that isn't a ton of work to handle. You just need to recognize that symbol name when resolving symbols and resolve the pointer to the right thing instead. Probably what we would want to do is having __this_module store the pointer to the current linker_file_t when you are resolving symbols and require it to be the same size as a pointer.

In D44306#1020417, @jhb wrote:

Oh, if it is just a global symbol that isn't a ton of work to handle. You just need to recognize that symbol name when resolving symbols and resolve the pointer to the right thing instead. Probably what we would want to do is having __this_module store the pointer to the current linker_file_t when you are resolving symbols and require it to be the same size as a pointer.

Well it is a per-module (.ko or possibly multiple modules in a single .ko) "global".
Assume for simplicity 1 .ko 1 this_module.
Load multiple LinuxKPI based modules, have multiple
this_module.
Is our kernel linker smart enough to know the "scope" and not to have a conflicting symbols then?
I.e. amdgpu.ko this_module != iwlwifi this_module?

Somehow we do similar things for DPCPU and VNET?
So basically add a GLOBL and possibly with section(FOO) used?
Something like this?

In D44306#1020434, @bz wrote:
In D44306#1020417, @jhb wrote:

Oh, if it is just a global symbol that isn't a ton of work to handle. You just need to recognize that symbol name when resolving symbols and resolve the pointer to the right thing instead. Probably what we would want to do is having __this_module store the pointer to the current linker_file_t when you are resolving symbols and require it to be the same size as a pointer.

Well it is a per-module (.ko or possibly multiple modules in a single .ko) "global".
Assume for simplicity 1 .ko 1 this_module.
Load multiple LinuxKPI based modules, have multiple
this_module.
Is our kernel linker smart enough to know the "scope" and not to have a conflicting symbols then?
I.e. amdgpu.ko this_module != iwlwifi this_module?

Somehow we do similar things for DPCPU and VNET?
So basically add a GLOBL and possibly with section(FOO) used?
Something like this?

It would have to be global for the .ko. You would basically just add a special case for the symbol name "__this_module" (or whatever you call it) and give that particular name special semantics.

That is, you would have extern linker_file_t __this_module; in a header so you end up with an undefined symbol, and you would need to add a special hack for that symbol name to resolve it. Possibly just in linker_file_lookup_symbol_internal you would add a special case before the call to LINKER_LOOKUP_SYMBOL:

if (strcmp(name, "__this_module") == 0)
    return (lf);

Thanks for all the suggestions! I've implemented it by doing the special
casing in linker_file_lookup_symbol_internal as suggested. It looks like
this does work, I can see different owner fields populated in nvidia-drm
and drm. Also confirmed multiple usages of THIS_MODULE in drm.ko point to
the same thing.

My only thought is do we want to make this work for stock FreeBSD as well and not have it be linuxkpi specific by just s/__lkpi_this_module/__this_module/? Perhaps @kib has some thoughts on if we might find this useful outside of Linux KPI?

Updated to be generic. I changed the name to __this_linker_file to better represent
what the variable points at. On linux __this_module points at a struct module, so
for us we point at a linker_file_t.

So what is the expected use of THIS_MODULE in Linux? Determine that current module is not that module, or something else?

Also note that linker script cannot work for amd64 .o modules.

sys/compat/linuxkpi/common/include/linux/module.h
72

The cast is too rude, isn't it? What is the module? Can we add the real thing to each module (?) metadata?

Hm, I think there already should be such thing 'struct module' for Linux modules, right? Then THIS_MODULE should resolve to it instead of the FreeBSD' internal structure.

sys/kern/kern_linker.c
915

You probably want to add the KLD_DPF tracing there, noting the very special case.

Updated with KLD_DPF debug statement.

So what is the expected use of THIS_MODULE in Linux? Determine that current module is not that module, or something else?

It's supposed to point to a struct module which describes some things about a loaded module, seems the most important of which is it contains the file operations. DRM code uses this to tell if things belong to the same kernel module and take action based on that: https://github.com/freebsd/drm-kmod/blob/master/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c#L204

Afaict this comparison is the only type of thing we use owner/THIS_MODULE for through linuxkpi.

sys/compat/linuxkpi/common/include/linux/module.h
72

I don't think there is a linuxkpi struct module, or at least I can't track one down. The things that would manipulate a Linux struct module (module_get, etc) are noops.

So it looks to me that struct module in linuxkpi is really the FreeBSD struct module, and THIS_MODULE just casts to it to have everything compile. The panic that this change fixes just needs "some" valid pointer value to be in THIS_MODULE so that different owner fields can be compared to know if two DRM objects come from the same kernel module.

As of right now we don't really need to add a real Linux struct module. I suppose we could do that but my focus with this was just to fix issues from "*.owner == *.owner" secretly being "0x0 == 0x0" and causing panics.

Based on all that I think this aggressive cast makes sense? Open to suggestions though.

% grep -r THIS_MODULE sys/contrib/dev | wc -l
      87
% grep -r THIS_MODULE sys/contrib/dev | grep -v '\.owner =' | wc -l
       9
% grep -r THIS_MODULE sys/contrib/dev | grep -v '\.owner =' 
sys/contrib/dev/acpica/changes.txt:any "_THIS_MODULE defined but not used" messages.
sys/contrib/dev/iwlwifi/iwl-drv.c:      return request_firmware_nowait(THIS_MODULE, 1, drv->firmware_name,
sys/contrib/dev/iwlwifi/mvm/ops.c:      module_put(THIS_MODULE);
sys/contrib/dev/iwlwifi/mvm/ops.c:              if (!try_module_get(THIS_MODULE)) {
sys/contrib/dev/iwlwifi/mvm/ops.c:                      module_put(THIS_MODULE);
sys/contrib/dev/iwlwifi/pcie/trans.c:   module_put(THIS_MODULE);
sys/contrib/dev/iwlwifi/pcie/trans.c:   if (!try_module_get(THIS_MODULE)) {
sys/contrib/dev/iwlwifi/pcie/trans.c:           module_put(THIS_MODULE);
sys/contrib/dev/rtw88/main.c:   ret = request_firmware_nowait(THIS_MODULE, true, fw_name, rtwdev->dev,

Taking into the account bz' grep output, I believe that the best route is to properly implement the struct module. For start it could be simply struct module {int dummy;}; if nothing is needed from its guts. But there should be an instance of the structure for each LKPI module, and THIS_MODULE underlying symbol should be magic indeed, but not in the sense of the current review. Instead, it should be only resolved locally, causing undefined symbol error if attempt is made to reference it from kld without the module symbol (section ?).

BTW, how it THIS_MODULE supposed to work when LKPI modules are statically linked into kernel?

Instead, it should be only resolved locally, causing undefined symbol error if attempt is made to reference it from kld without the module symbol (section ?).

Could you elaborate a little on what you mean by this?

There was an earlier version of this which used LKPI_DRIVER_MODULE to create a global variable for each lkpi module, do you mean something like that? It would be very easy to create the dummy struct module there.

https://github.com/freebsd/drm-kmod/pull/297

Instead, it should be only resolved locally, causing undefined symbol error if attempt is made to reference it from kld without the module symbol (section ?).

Could you elaborate a little on what you mean by this?

There was an earlier version of this which used LKPI_DRIVER_MODULE to create a global variable for each lkpi module, do you mean something like that? It would be very easy to create the dummy struct module there.

https://github.com/freebsd/drm-kmod/pull/297

Yes, very similar. But the magic from kernel linker must ensure that this symbol is local to linker file. It must not be resolved from other file. So if a file does not provide the symbol but references it, the result is undefined symbol instead of random and invalid binding.

And again, my question still stands, how would any of proposals handle LKPI modules statically compiled into the kernel?

how would any of proposals handle LKPI modules statically compiled into the kernel?

The previous solution I linked generated a unique global name per module that was then used. i.e. instead of __this_module it was __this_module_drm_ or something similar. The names were unique since they were derived from the sysctl node names. It was only done for the DRM lkpi modules in drm-kmod, but there's no reason we couldn't do the same approach for the in-tree ones.

For static linking that approach should work, since it's just a uniquely named global symbol. But if we are doing the "add logic in the linker to check that the usage is contained within the module" approach you listed then I'm not sure how to accomplish that? I assume the linker isn't even invoked for static kernel module, so any approach that uses it is going to have to accomplish the same through some other means? Do we have any existing variables that behave similar to what you're suggesting here that we could use as an example?

I think an important question is what does __THIS_MODULE mean in Linux when used in the kernel proper rather than in a kernel module? That is what we should be aiming for when a linuxkpi-using module is compiled statically into the kernel.

Note that having __THIS_MODULE resolve to the FreeBSD kernel's linker_file_t would work just fine with what @bz found, you just want module_put to know that the pointer is really a linker_file_t and then call a suitable function in kern_linker.c to bump the reference count. The important point is the semantics, and those may not require a wrapper structure if the value is an opaque cookie.

It seems the way it works is that THIS_MODULE is a NULL pointer during "builtin" (statically building modules into the kernel), and points to the struct module of the dynamically loaded module otherwise. I think we could do something similar where we get the current linuxkpi behavior (THIS_MODULE is null) if we aren't building a dynamically loaded module?

With that concern resolved I guess the question now is do we want to:

a) have __this_module magically resolve to the current linker_file_t, implementing module_put as jhb suggested (no wrapper struct, close to what the current revision is)
b) have some linuxkpi macro declare a struct module global, validate usage at module load time (similar to the previous version I linked)
c) maybe have the module loader allocate space at load time, placing __this_module there and filling it with a wrapper struct? Not sure how possible this is.
d) something else?

Of the options I think (a) is the simplest, but (c) would probably be the most robust. I'm not expert but happy to implement whatever you all suggest.

It seems the way it works is that THIS_MODULE is a NULL pointer during "builtin" (statically building modules into the kernel), and points to the struct module of the dynamically loaded module otherwise. I think we could do something similar where we get the current linuxkpi behavior (THIS_MODULE is null) if we aren't building a dynamically loaded module?

With that concern resolved I guess the question now is do we want to:

a) have __this_module magically resolve to the current linker_file_t, implementing module_put as jhb suggested (no wrapper struct, close to what the current revision is)
b) have some linuxkpi macro declare a struct module global, validate usage at module load time (similar to the previous version I linked)
c) maybe have the module loader allocate space at load time, placing __this_module there and filling it with a wrapper struct? Not sure how possible this is.
d) something else?

Of the options I think (a) is the simplest, but (c) would probably be the most robust. I'm not expert but happy to implement whatever you all suggest.

IMO b is the most correct (assuming it would detect that the lmodule is statically linked into kernel and not define the module 'global'). Then c is just a complication over b with the dynamic allocation.

Since jhb answered what should happen with the static linking into kernel proper, I still think that b is right, and you need to add something to the kernel linker to ensure that __this_module is local to the module.

I guess the last question is what linuxkpi macro should actually create the global? I had previously used LKPI_DRIVER_MODULE, but not everywhere uses that. I guess we could make our own DEFINE_THIS_MODULE() macro and then call it from LKPI_DRIVER_MODULE, and add calls elsewhere as needed?

Do we have a good way to detect if we are building for a dynamic module or not during build? I looked for a macro that signaled that but couldn't find anything in kmod.mk.

Do we have a good way to detect if we are building for a dynamic module or not during build? I looked for a macro that signaled that but couldn't find anything in kmod.mk.

The right way is to check for #ifdef KLD_MODULE. Look for the uses in the tree.