Differential D19863

Rework vnode locking in the ELF loader
AbandonedPublic
Actions

Authored by trasz on Apr 9 2019, 4:43 PM.

Details

Reviewers

Summary

Since the exec_elf{32,64}() function needs to drop the vnode lock anyway, try to do it as early as possible. This just moves the unlock; the number of lock operations shouldn't change.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 23598
Build 22585: arc lint + arc unit

Event Timeline

trasz created this revision.Apr 9 2019, 4:43 PM

Herald added a subscriber: imp. · View Herald TranscriptApr 9 2019, 4:43 PM

Harbormaster completed remote builds in B23598: Diff 56011.Apr 9 2019, 4:43 PM

I do not understand why this patch is needed, and you did not bothered to explain.

My impression is that the change is for worse: previous approach was that the vnode is locked most of the time, and only unlocked when needed. Now it is locked only when needed, which should increase the number of atomics executed, and increase contention with other lock pretenders. Also, if going this route, the use of shared lock should be examined.

sys/kern/imgact_elf.c
661	There is ASSERT_VOP_LOCKED() which must be used for such asserts, and which prints useful information about the vnode.
945	You are locking vnode for each loadable segment, i.e. typically twice. This is arguably worse than keeping the vnode locked.

You are right regarding the shared lock, this is the logical next step. I wanted to avoid mixing that change with refactoring.

But I don't see how the patch increases the number of locks/unlocks. Could you explain?

In D19863#426441, @trasz wrote:

But I don't see how the patch increases the number of locks/unlocks. Could you explain?

I did not count exact number of locks you take, but you now lock per segment, and do this more than once. So the number of locks taken is O(num of segments) instead of O(1).

That said, even if we assume that executables all have one segment (in fact two), then there is still more locks after your patch. I did not checked this.

In D19863#426594, @kib wrote:

In D19863#426441, @trasz wrote:

But I don't see how the patch increases the number of locks/unlocks. Could you explain?

I did not count exact number of locks you take, but you now lock per segment, and do this more than once. So the number of locks taken is O(num of segments) instead of O(1).

And that's the part I don't get - where? From what I see, the lock is taken once, around load_sections().

That said, even if we assume that executables all have one segment (in fact two), then there is still more locks after your patch. I did not checked this.

There is one additional lock, at the end - we have to return with an exclusive lock held. I'll think about this some more.

sys/kern/imgact_elf.c
661	Thanks; committed separately.

trasz marked an inline comment as done.Apr 10 2019, 10:24 AM

In D19863#426603, @trasz wrote:

In D19863#426594, @kib wrote:

In D19863#426441, @trasz wrote:

But I don't see how the patch increases the number of locks/unlocks. Could you explain?

I did not count exact number of locks you take, but you now lock per segment, and do this more than once. So the number of locks taken is O(num of segments) instead of O(1).

And that's the part I don't get - where? From what I see, the lock is taken once, around load_sections().

That said, even if we assume that executables all have one segment (in fact two), then there is still more locks after your patch. I did not checked this.

There is one additional lock, at the end - we have to return with an exclusive lock held. I'll think about this some more.

Each vn_rdwr() without NODELOCKED takes the vnode lock.

In D19863#426612, @kib wrote:

In D19863#426603, @trasz wrote:

In D19863#426594, @kib wrote:

In D19863#426441, @trasz wrote:

But I don't see how the patch increases the number of locks/unlocks. Could you explain?

I did not count exact number of locks you take, but you now lock per segment, and do this more than once. So the number of locks taken is O(num of segments) instead of O(1).

And that's the part I don't get - where? From what I see, the lock is taken once, around load_sections().

That said, even if we assume that executables all have one segment (in fact two), then there is still more locks after your patch. I did not checked this.

There is one additional lock, at the end - we have to return with an exclusive lock held. I'll think about this some more.

Each vn_rdwr() without NODELOCKED takes the vnode lock.

But previously we were dropping and picking up the lock just before, to call malloc(9), so this should equal out?

In D19863#426613, @trasz wrote:

But previously we were dropping and picking up the lock just before, to call malloc(9), so this should equal out?

We only drop lock for the malloc, where the layout of the binary is unusual, i.e. either note of phdr segments are not in the first page. There, we can do the trick, if this is what bothers you:

 ptr = malloc(M_NOWAIT);
 if (ptr == NULL) {
    VOP_UNLOCK();
    ptr = malloc(M_WAITOK);
    vn_lock(LK_RETRY);
}

This would eliminate all unlocks except under the memory stress, even for the weird binaries.

In D19863#426660, @kib wrote:

In D19863#426613, @trasz wrote:

But previously we were dropping and picking up the lock just before, to call malloc(9), so this should equal out?

We only drop lock for the malloc, where the layout of the binary is unusual, i.e. either note of phdr segments are not in the first page.

But we also only call vn_rdwr in the exact same situations.

There, we can do the trick, if this is what bothers you:
 ptr = malloc(M_NOWAIT);
 if (ptr == NULL) {
    VOP_UNLOCK();
    ptr = malloc(M_WAITOK);
    vn_lock(LK_RETRY);
}
This would eliminate all unlocks except under the memory stress, even for the weird binaries.

But it would still leave the whole thing under exclusive lock.

Rework.

Harbormaster completed remote builds in B23633: Diff 56109.Apr 11 2019, 8:36 PM

trasz edited the summary of this revision. (Show Details)Apr 11 2019, 8:42 PM

I still fail to understand why the change is needed.

sys/kern/imgact_elf.c
1145	Now look at this (and previous) goto ret.

In D19863#427073, @kib wrote:

I still fail to understand why the change is needed.

It reduces lock coverage. I have another patch (on top of this one), which moves load_sections() to the very end, which reduces lock coverage even more and gets rid of one lock/unlock pair.

sys/kern/imgact_elf.c
1145	Yeah, just noticed that.

Slightly less obviously buggy version.

Harbormaster completed remote builds in B23636: Diff 56112.Apr 11 2019, 9:14 PM

Also note the updated description at the top.

kib added inline comments.Apr 12 2019, 9:36 AM

sys/kern/imgact_elf.c
1042	Do not add 'locked', it makes reasoning about the code impossible. Perhaps add 'ret_unlocked' at the very end, which would lock the vnode and goto ret.
1075	If I was forced to do this change, I would move unlock there, instead of the beginning of the function.

trasz added inline comments.Apr 12 2019, 12:13 PM

sys/kern/imgact_elf.c
1042	That would be much worse, imho. Look at error handling in do_execve(). How does the 'locked' variable make things harder to follow? Its meaning is quite obvious - iff it's true, the vp is locked.
1075	But why? It would make it less obvious that the lock is not really needed at all on entry to this function.

trasz added inline comments.Apr 12 2019, 1:26 PM

sys/kern/imgact_elf.c
1075	You do have a point, though - the first conditional is used in loop from kern_exec(). Let me fix it.

Don't drop the vnode lock until after we're sure it's an ELF file.

Harbormaster completed remote builds in B23644: Diff 56135.Apr 12 2019, 1:37 PM

kib added inline comments.Apr 14 2019, 9:03 AM

sys/kern/imgact_elf.c
1042	Because locked variable basically means that the code author (in this case, the code mangler) gave up on understanding which blocks of code have the vnode locked, and which are not. So in essence, any addition of the code would require two branches, one for the locked case, another for unlocked. I strongly disagree with all versions of the proposed rearrangement of the chairs that you proposed so far.
1068	No, I did not suggested that. Move the VOP_UNLOCK() after the if() that checks for alignment of phdr. So far this is the only good part.

trasz added inline comments.Apr 14 2019, 10:22 AM

sys/kern/imgact_elf.c
1042	Locked variable sometimes leads to that, yes. But not in this case - here it's very obvious that it's only used for error handling. An alternative would be to either add vn_lock() before each goto statement, or use a weird code at the end which looks like this: ret: ... return (error) ret_unlocked: vn_lock() goto ret; } Which is, IMHO, quite ugly. Also, this is not "rearrangement of chairs" - current approach to locking in ELF loader is hilariously bad. This, along with two other diffs, removes all the unnceccesary unlock/lock pairs and makes it use shared locking.
1068	But why? There's nothing that requires the vnode lock there.

I do not see that this could end in anything useful.

Start from the other end, lets talk about your supposed shared-locking patch for the exe vnode.

sys/kern/imgact_elf.c
1042	No, it is not ugly. No, the current approach of locking in ELF loader is not bad as you try to represent it. Your diff does make the mess from the code that is understandable.

Ok, let's do that. The larger patch is here: https://reviews.freebsd.org/D19909

sys/kern/imgact_elf.c
1042	Yes, it is ugly. And yes, current "approach of locking" in ELF loader, which is basically "slap an exclusive lock around the whole thing and then drop it in random places so INVARIANTS doesn't panic" is pretty bad.

trasz abandoned this revision.May 5 2019, 11:17 PM

Revision Contents
Changeset List

Path

Size

sys/

kern/

imgact_elf.c

57 lines

Diff 56011

View Options

Rework vnode locking in the ELF loaderAbandonedPublicActions