Page MenuHomeFreeBSD

getblk: fail faster with GB_LOCK_NOWAIT
ClosedPublic

Authored by rlibby on May 19 2024, 1:44 AM.
Tags
None
Referenced Files
Unknown Object (File)
Dec 7 2024, 8:36 PM
Unknown Object (File)
Nov 26 2024, 5:11 AM
Unknown Object (File)
Nov 18 2024, 4:46 PM
Unknown Object (File)
Nov 18 2024, 3:55 PM
Unknown Object (File)
Nov 18 2024, 10:29 AM
Unknown Object (File)
Nov 15 2024, 9:43 PM
Unknown Object (File)
Nov 15 2024, 9:40 PM
Unknown Object (File)
Nov 9 2024, 4:03 PM
Subscribers

Details

Summary

If we asked not to wait on a lock, and then we failed to get a buf lock
because we would have had to wait, then just return the error. This
avoids taking the bufobj lock and a second trip to lockmgr.

Sponsored by: Dell EMC Isilon

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 57802
Build 54690: arc lint + arc unit

Event Timeline

kib added a subscriber: kib.

I think that this is technically correct but unfair. Intent was that failed unlocked buffer lookup should not change the behavior, in particular, allowing the normal lookup quirks to proceed.

I do not have rational arguments against the change.

This revision is now accepted and ready to land.May 20 2024, 2:55 AM
In D45245#1032333, @kib wrote:

I think that this is technically correct but unfair. Intent was that failed unlocked buffer lookup should not change the behavior, in particular, allowing the normal lookup quirks to proceed.

Can you say more about the behavior that might need to be preserved, or about what might be unfair?

To me, this doesn't seem to change anything logically. Here are two cases I thought about where we would now simply exit after getting an error from BUF_TIMELOCK:

  1. The buf lock is held by someone else, and it still has the same identity. Without the patch, we goto loop where we acquire the bufobj lock, and then try again. Most likely we then find the buf again and fail to lock it again. But maybe we find it unlocked or non-existent. It doesn't really matter because either could have happened anyway.
  2. The buf lock is held by someone else, and its identity has changed (b_bufobj or b_lblkno). Changing the buf's identity itself requires the buf lock, and we could have contended against that. So even if we might now exit after having contended against the "wrong" buf lock, it isn't wrong to return EBUSY because the only way this could happen is when we might have also contended against the "right" buf lock.

My motivation here is to avoid the bo lock more and reduce time under it. I have a few more ideas and in-progress patches along these lines, so I appreciate any discussion of what I might be missing or might need to consider here. Thanks.

I accepted the review, I said that I do not have rational arguments against it.

My only comment is that, reading the code, it's not immediately obvious why we give up right away when failing to acquire the buf lock, but not when we successfully acquire the buf lock and discover that the buffer identity has changed. A comment explaining that we want to avoid touching the interlock when the "locked lock" is probably going to fail anyway would help, I think.

Note that GB_LOCK_NOWAIT has somewhat subtle meaning, it does not request to not sleep, or even does not request to not sleep on a buffer lock. It only signals to avoid sleeping on buffer lock with the specified identity.

My only comment is that, reading the code, it's not immediately obvious why we give up right away when failing to acquire the buf lock, but not when we successfully acquire the buf lock and discover that the buffer identity has changed.

There's no reason not to, it just didn't seem likely enough to worry about. My main issue comes from encountering locked bufs with the expected identity. But maybe it's better to be consistent so that the intent is easier to understand. I'll add this to the next diff but I'll take guidance on whether we actually do this or not.

A comment explaining that we want to avoid touching the interlock when the "locked lock" is probably going to fail anyway would help, I think.

I'll work something up to address your and @kib's comments and try not to be too wordy.

My only comment is that, reading the code, it's not immediately obvious why we give up right away when failing to acquire the buf lock, but not when we successfully acquire the buf lock and discover that the buffer identity has changed.

There's no reason not to, it just didn't seem likely enough to worry about. My main issue comes from encountering locked bufs with the expected identity. But maybe it's better to be consistent so that the intent is easier to understand. I'll add this to the next diff but I'll take guidance on whether we actually do this or not.

I don't have strong feelings either way. I agree that it's probably not worth worrying about that case; a comment would just make it clear why we don't (in the current version of the patch, anyway).

  • getblk: fail faster with GB_LOCK_NOWAIT
  • GB_LOCK_NOWAIT markj feedback
This revision now requires review to proceed.May 20 2024, 7:34 PM
This revision is now accepted and ready to land.May 20 2024, 9:08 PM
sys/kern/vfs_bio.c
4011–4015

Maybe this is too aggressive? lockmgr can also return EDEADLK with LK_NOWAIT, when we already hold an exclusive lock and then request a shared lock. We don't have shared buf locks in tree though.

sys/kern/vfs_bio.c
4011–4015

We only use exclusive mode for buffer locks. Even if we started trying to use shared lock for bread()/brelse(), the situation you mention is invalid.

I have some memories that recursion for the buffer lock was needed for xfs, but it is long time gone.

I concur that this seems like a reasonable optimization.

This revision was automatically updated to reflect the committed changes.