Page MenuHomeFreeBSD

fix nsdispatch(3) race and add an ATF test
ClosedPublic

Authored by markj on Oct 22 2014, 10:28 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Dec 18, 11:16 AM
Unknown Object (File)
Wed, Dec 11, 10:20 AM
Unknown Object (File)
Thu, Nov 28, 2:20 AM
Unknown Object (File)
Thu, Nov 28, 1:54 AM
Unknown Object (File)
Mon, Nov 25, 1:17 AM
Unknown Object (File)
Oct 29 2024, 2:56 AM
Unknown Object (File)
Oct 26 2024, 9:48 PM
Unknown Object (File)
Oct 21 2024, 10:48 PM
Subscribers
None

Details

Reviewers
jhb
ngie
Summary

nsdispatch(3) internally calls nss_configure() to see whether nsswitch.conf needs to be parsed. nss_configure() is called with the nss read lock held. It attempts to acquire the configuration mutex (i.e. trylock), and just returns 0 if it can't acquire it, so it continues with the lookup. In particular, this thread races with the thread holding the configuration mutex as it parses nsswitch.conf. This race is exacerbated by the fact that the thread holding the configuration mutex has to acquire the nss write lock before it can parse nsswitch.conf.

This can cause spurious lookup failures, since the first thread above may have to fall back to the default database sources instead of the sources specified in nsswitch.conf.

The fix is to unconditionally acquire the configuration mutex, dropping it immediately if another thread parsed nsswitch.conf while we were blocked.

Test Plan

The test case fails without the fix, and passes with the fix applied.

This change has been in Isilon's tree for a while. (The test case is new.)

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

markj retitled this revision from to fix nsdispatch(3) race and add an ATF test.
markj updated this object.
markj edited the test plan for this revision. (Show Details)

More generally, why does conf_lock even exist? Upgrading to a write lock on nss_lock should be sufficient on its own. I think you can just remove conf_lock altogether instead. I think you probably still want to re-check the stat results after grabbing the write lock possibly, so the code might be:

if (isthreaded) {
    pthread_rwlock_rdunlock();
    pthread_rwlock_wrlock();
    stat();
    re-check mtime
}
lib/libc/net/nsdispatch.c
358

This can just be 'return (result)' can't it? (And then you don't need the changes below?)

359

Should you reinvoke stat() since the file might have changed since you checked it? Maybe that doesn't really matter?

In D994#2, @jhb wrote:

More generally, why does conf_lock even exist? Upgrading to a write lock on nss_lock should be sufficient on its own. I think you can just remove conf_lock altogether instead.

That makes sense. I'll give that a try and upload a new patch once I've tested it. Thanks!

lib/libc/net/nsdispatch.c
358

Hm, I think you're right. It felt wrong to change the lock ordering here but not below, but I think it's harmless in this case.

359

It can't hurt I guess. It probably doesn't make much of a difference since st_mtime only has second granularity.

  • Get rid of conf_lock entirely - it's made redundant by the nss write lock.
  • Re-stat nsswitch.conf after we acquire the write lock.
  • Simplify the code a bit by returning directly if we fail to acquire the write lock.
jhb removed a subscriber: jhb.
lib/libc/net/nsdispatch.c
393

Only nit here is that this is functionally identical to the previous code at the fin: label. If you find this more readable this way, that is fine. If it isn't an intentional change it would be nice to minimize the diff further.

  • Restore the original code from the end of nss_configure() rather than gratuitously modifying it.
jhb edited edge metadata.

Looks good to me, thanks!

This revision is now accepted and ready to land.Oct 24 2014, 5:47 PM