Page MenuHomeFreeBSD

Fix locking in vm_reserv_reclaim_contig().
ClosedPublic

Authored by markj on Nov 15 2019, 11:45 PM.
Tags
None
Referenced Files
F105939061: D22392.diff
Sun, Dec 22, 8:22 PM
Unknown Object (File)
Sat, Dec 14, 8:53 PM
Unknown Object (File)
Nov 18 2024, 9:26 PM
Unknown Object (File)
Nov 18 2024, 7:06 PM
Unknown Object (File)
Oct 30 2024, 10:24 PM
Unknown Object (File)
Oct 26 2024, 3:27 PM
Unknown Object (File)
Oct 26 2024, 3:26 PM
Unknown Object (File)
Oct 26 2024, 3:26 PM
Subscribers

Details

Summary

We were not properly handling the case where the trylock of the
reservaton fails, in which case we could leak the per-domain reservation
queue lock.

Introduce a marker reservation so that we can implement precise scanning
in vm_reserv_reclaim_contig(). (Ab)Use the marker's lock to serialize scans
of the partpop queue so that a global marker structure can be used. I
do not believe that this will significantly inhibit parallelism:
vm_reserv_reclaim_contig() is rarely called, and
vm_reserv_reclaim_inactive()'s hold time is short.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 27551
Build 25773: arc lint + arc unit

Event Timeline

sys/vm/vm_reserv.c
1184

If I can rewrite to eliminate a continue, and a break, and a duplicate statement, I've got to recommend it.

while (!ret &&((rv...))) {

 ...
ret = rv == TAILQ_FIRST(...);
if (ret)
  vm_reserv_reclaim(rv);

vm_reserv_unlock(rv);
}

kib added inline comments.
sys/vm/vm_reserv.c
1084

I think we want to initialize marker into some impossible state so that attempts to use it as normal reservation trigger asserts.

1172

This is true because you do not drop the scan lock in vm_reserv_reclaim_contig, right ? May be add a comment.

This revision is now accepted and ready to land.Nov 16 2019, 10:07 AM
markj added inline comments.
sys/vm/vm_reserv.c
1084

I guess it would be sufficient to make it look like a fully populated reservation. Such a reservation should never be present in the partpop queue.

1172

Indeed, ok.

1184

Thanks, that's much better.

markj marked 3 inline comments as done.

Address feedback.

This revision now requires review to proceed.Nov 16 2019, 5:41 PM

I don't see the lock leak? again or continue both expect the lock held.

In D22392#490198, @jeff wrote:

I don't see the lock leak? again or continue both expect the lock held.

We were failing to unlock the reservation after the trylock failed.

Change locking vm_reserv_reclaim_inactive() following a suggestion
from Jeff: instead of using the scan lock, take the domain lock
and reclaim from the first reservation where trylock succeeds.
Drop the domain lock as soon as the reservation as dequeued. This
way we avoid holding any domain-global reservation locks while the
reservation is being broken.

I like this much better. Thank you.

This revision is now accepted and ready to land.Nov 21 2019, 11:12 PM
alc added inline comments.
sys/vm/vm_reserv.c
1162

Since you are returning true/false below, instead of TRUE/FALSE, change this to "bool" here and in vm_reserv.h.

1271

Same here. (This case was already inconsistent, using boolean_t with true/false.)

This revision was automatically updated to reflect the committed changes.