
libc: remove dlerror() contamination from NS caching
Accepted, Public

Authored by bjk on Aug 17 2025, 3:09 AM.

Details

Reviewers
kib
Summary
The NS caching functionality was brought in by commit
06a99fe36f0aac93e7689da6b3f07b727750691f, with the work initially done
as a GSoC 2005 project.  The implementation relies on a separate nscd
caching process, with libc's NSS module being able to use the cache as a
data source.  However, nscd itself is linked against libc and needs to
avoid recursing into itself to fulfil NSS requests, and so libc has a
check for whether a "nss_cache_cycle_prevention_function" symbol is
present in the main process's address space.  The nscd utility is
expected to be the only program that provides such a symbol, which means
that all other programs will have libc perform a dlsym() lookup for the
symbol, which fails.  This, in turn, leaves a non-NULL error string
ready to be returned by the next call to dlerror().  Encountering that
string is highly confusing, since it reflects the normal operation of
libc rather than an error seen by the program's logic.  It is in a
certain sense a namespace contamination: libc should not in normal
operation leave error strings visible to the application like this.
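
The effect described above can be reproduced outside libc.  The
following is a minimal sketch, not the code from nsdispatch.c: the
function name and the looked-up symbol name are hypothetical, and
dlopen(NULL, ...) is used only as a portable way to obtain a handle on
the main program.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Returns 1 if a failed dlsym() lookup leaves an error string pending
 * for dlerror(), 0 if it does not, and -1 if the main program could
 * not be opened at all.
 */
int
failed_lookup_leaves_error(void)
{
	void *handle, *sym;
	int pending;

	(void)dlerror();	/* start from a clean state */
	handle = dlopen(NULL, RTLD_LAZY);
	if (handle == NULL)
		return (-1);
	/* Hypothetical name that no program is expected to define. */
	sym = dlsym(handle, "hypothetical_symbol_nobody_defines");
	/* The application never asked for this lookup, yet a string is set. */
	pending = (sym == NULL && dlerror() != NULL);
	(void)dlclose(handle);
	return (pending);
}
```

An application that calls dlerror() after libc has performed such a
lookup internally would receive the string even though none of its own
calls failed.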

Fortunately, correct code calls dlerror() only after the
application has experienced an actual error in one of its calls to a
dynamic linker function; in this scenario the libc-triggered string
will be replaced by an error string corresponding to the application's
failed call.  Only poorly written code that (e.g.) uses dlsym() for
several symbol lookups and relies on dlerror() to return a non-NULL
value if any lookup failed would be impacted by this issue.
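
One way for application code to stay independent of any stale error
string is to clear dlerror() before each lookup and consult it only
when that lookup returns NULL.  The helper below is a sketch under
that assumption; its name and signature are hypothetical and are not
part of this revision.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Look up one symbol and report failure for this call only.
 * Returns 0 on success and -1 on failure.  If errp is non-NULL it
 * receives the error string on failure and NULL on success.
 */
int
lookup_checked(void *handle, const char *name, void **symp,
    const char **errp)
{
	const char *err;

	(void)dlerror();	/* discard any stale string */
	*symp = dlsym(handle, name);
	/* Only a NULL result with a pending string counts as an error. */
	err = (*symp == NULL) ? dlerror() : NULL;
	if (errp != NULL)
		*errp = err;
	return (err == NULL ? 0 : -1);
}
```

Code that performs several lookups would call the helper once per
symbol and stop at the first -1, rather than inferring failure from a
single dlerror() call at the end.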

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

bjk requested review of this revision. Aug 17 2025, 3:09 AM
lib/libc/net/nsdispatch.c
400–402

If handle == NULL, then you are guaranteed to have a dlerror() string set. Similarly, dlclose() might leave an error.

I think it is better to call dlerror() unconditionally right after the if (handle != NULL) block.

lib/libc/net/nsdispatch.c
400–402

I can handle dlopen() and dlclose() errors, sure.
I was a bit reluctant to put an unconditional dlerror() call in place because there is some (maybe theoretical) risk that it would consume a "real" error from the application code's logic (e.g., if the application tries a failing dlsym(), then does something that invokes nsdispatch(), and only later checks dlerror()). If we were confident this was running before main() I would be pretty amenable to an unconditional dlerror(), but the comments elsewhere in the file talk about "first call to nsdispatch()", which suggests that the above scenario is at least technically possible.

lib/libc/net/nsdispatch.c
400–402

If handle == NULL, then you are guaranteed to have dlerror set.

To be precise, you need to check the error from dlclose() as well.

In reality, there are silent internal rtld errors that might set the dlerror() string, mostly due to auto-loading of the filters.

All that means that I would not bother and would just call dlerror() to clear it.
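
The shape of that suggestion can be sketched as follows.  This is an
illustration only, not the committed change: the function name is
hypothetical, and plain dlopen() stands in for whatever internal
wrapper libc uses.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Probe the main program for a symbol without leaving an error string
 * behind.  Returns the symbol address, or NULL if it is not present.
 */
void *
probe_main_program_symbol(const char *name)
{
	void *handle, *sym;

	sym = NULL;
	handle = dlopen(NULL, RTLD_LAZY | RTLD_GLOBAL);
	if (handle != NULL) {
		sym = dlsym(handle, name);
		(void)dlclose(handle);
	}
	/*
	 * dlopen(), dlsym(), dlclose(), or rtld internals may each have
	 * set an error string; clear it unconditionally.
	 */
	(void)dlerror();
	return (sym);
}
```

The cost is the one raised earlier in this thread: the unconditional
call also discards any error string the application had pending before
the probe ran.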

This revision is now accepted and ready to land. Aug 20 2025, 9:02 AM