Page MenuHomeFreeBSD

symlink support for lockless lookup
ClosedPublic

Authored by mjg on Dec 5 2020, 4:36 PM.

Details

Summary

Name cache smr zones are reused for symlink storage.

FS routines are expected to provide a stable snapshot of the symlink to cache_symlink_resolve routine. Calling into something namecache-specific avoids the problem seen in regular lookup where an additional buffer has to be allocated. This would be additionally problematic since the entire operation happens within an SMR section and freeing of the old buffer cannot be performed without exiting it.

This provides an immediate fit for tmpfs.

ZFS required some level of hackery to store an additional pointer, effectively duplicating symlink content. I don't think it's a big deal in this case.

UFS got a minimal patch to prevent it from regressing -- it always resorts to locked lookup (like stock kernel). I failed to find a good way of adding cached symlinks there and I don't intend to spend more time looking for one. This probably can be implemented by trylocking everything relevant needed to read the symlink.

With this out of the way I'm going to refactor part of lookup and namei.

Test Plan

tested by pho

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

mjg requested review of this revision.Dec 5 2020, 4:36 PM
mjg updated this revision to Diff 80366.
  • wrong patch
sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops.c
6594 ↗(On Diff #80366)

This line requires #if __FreeBSD_version check, again.

sys/fs/tmpfs/tmpfs_subr.c
345 ↗(On Diff #80366)

This store is release to ensure that symlink_smr update is visible if tn_link_target is non-NULL, right ?

437 ↗(On Diff #80366)

Why this store is release ?

sys/fs/tmpfs/tmpfs_vnops.c
1466 ↗(On Diff #80366)

... but then this load should be acquire and testing of tn_link_smr done only after load provided non-NULL value.

sys/kern/vfs_lookup.c
584

This refactoring can be (and should) committed in advance.

sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops.c
6594 ↗(On Diff #80366)

I'm going to add them when upstreaming.

sys/fs/tmpfs/tmpfs_subr.c
345 ↗(On Diff #80366)

No, it is to ensure the content of the buffer itself is fully populated before the pointer is set. The read side gets away with data dependency barrier (or nothing on all supported architectures).

tn_link_smr starts as false and in absolutely worst case we can get a false negative, which is harmless.

however, perhaps all tn_link_smr accesses should be gated behind atomic_* as well.

437 ↗(On Diff #80366)

copy pasto

sys/fs/tmpfs/tmpfs_vnops.c
1466 ↗(On Diff #80366)

tn_link_smr only goes one way (false -> true), so possible pairs are:

smr buffer + flag false ; fine transient failure
smr buffer + flag true; normal stable state past creation
non-smr buffer + flag false; normal stable state past creation

non-smr buffer + flag true can't happen.

Thus I don't think fences are needed here, but extra commentary probably would not hurt.

I'm going to convert tn_link_smr accesses to use atomic_* though.

sys/kern/vfs_lookup.c
584

ok

  • rebase
  • better tmpfs commentary
sys/fs/tmpfs/tmpfs_subr.c
367 ↗(On Diff #80376)

There is no 'data dep barrier' in the freebsd atomic(9) memory model. Also, and perhaps more important, store_rel() does not work unless load is load_acq().

You need to ensure and explain why symlink != NULL, symlink non-smr and symlink_smr == true is impossible to observe.

sys/fs/tmpfs/tmpfs_subr.c
367 ↗(On Diff #80376)

I thought the comment explains this. nnode itself will be made visible later, with modifications to both tn_link_target and tn_link_smr already stored anyone who deferences it sees both set one way or the other. acq barrier would be needed if nnode itself was already visible.

mjg added inline comments.
sys/fs/tmpfs/tmpfs_subr.c
367 ↗(On Diff #80376)

not mentioning data dependency barriers, if only as not needed on all supported architectures, sounds like a bug in the manpage.

sys/fs/tmpfs/tmpfs_subr.c
367 ↗(On Diff #80376)

FreeBSD as well as C11 memory models do not operate with barriers, they provide fences. Data barrier is subsumed by appropriate fence use.

That said, again, what is the acquire residual to the store_rel(tn_link_smr) ? rel does not have any effect unless paired with acq.

  • rebase
  • refer to data dep barrier as 'consume' in line with the c standard, use acq since there is no primitive for consume right now
  • add more commentary
  • assorted changes
kib added inline comments.
sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
4478 ↗(On Diff #82609)

Don't you need the same acq as in tmpfs?

sys/fs/tmpfs/tmpfs_subr.c
350 ↗(On Diff #82609)

'publishes publishes'

This revision is now accepted and ready to land.Jan 22 2021, 5:31 PM