Page MenuHomeFreeBSD

Fix locking in vm_reserv_reclaim_contig().
ClosedPublic

Authored by markj on Nov 15 2019, 11:45 PM.
Tags
None
Referenced Files
F107721915: D22392.diff
Fri, Jan 17, 7:18 PM
Unknown Object (File)
Wed, Jan 8, 10:49 PM
Unknown Object (File)
Fri, Jan 3, 4:29 PM
Unknown Object (File)
Fri, Jan 3, 3:51 PM
Unknown Object (File)
Thu, Dec 26, 9:48 AM
Unknown Object (File)
Mon, Dec 23, 2:08 AM
Unknown Object (File)
Sun, Dec 22, 8:22 PM
Unknown Object (File)
Dec 14 2024, 8:53 PM
Subscribers

Details

Summary

We were not properly handling the case where the trylock of the
reservaton fails, in which case we could leak the per-domain reservation
queue lock.

Introduce a marker reservation so that we can implement precise scanning
in vm_reserv_reclaim_contig(). (Ab)Use the marker's lock to serialize scans
of the partpop queue so that a global marker structure can be used. I
do not believe that this will significantly inhibit parallelism:
vm_reserv_reclaim_contig() is rarely called, and
vm_reserv_reclaim_inactive()'s hold time is short.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/vm/vm_reserv.c
1184 ↗(On Diff #64408)

If I can rewrite to eliminate a continue, and a break, and a duplicate statement, I've got to recommend it.

while (!ret &&((rv...))) {

 ...
ret = rv == TAILQ_FIRST(...);
if (ret)
  vm_reserv_reclaim(rv);

vm_reserv_unlock(rv);
}

kib added inline comments.
sys/vm/vm_reserv.c
1084 ↗(On Diff #64408)

I think we want to initialize marker into some impossible state so that attempts to use it as normal reservation trigger asserts.

1172 ↗(On Diff #64408)

This is true because you do not drop the scan lock in vm_reserv_reclaim_contig, right ? May be add a comment.

This revision is now accepted and ready to land.Nov 16 2019, 10:07 AM
markj added inline comments.
sys/vm/vm_reserv.c
1084 ↗(On Diff #64408)

I guess it would be sufficient to make it look like a fully populated reservation. Such a reservation should never be present in the partpop queue.

1172 ↗(On Diff #64408)

Indeed, ok.

1184 ↗(On Diff #64408)

Thanks, that's much better.

markj marked 3 inline comments as done.

Address feedback.

This revision now requires review to proceed.Nov 16 2019, 5:41 PM

I don't see the lock leak? again or continue both expect the lock held.

In D22392#490198, @jeff wrote:

I don't see the lock leak? again or continue both expect the lock held.

We were failing to unlock the reservation after the trylock failed.

Change locking vm_reserv_reclaim_inactive() following a suggestion
from Jeff: instead of using the scan lock, take the domain lock
and reclaim from the first reservation where trylock succeeds.
Drop the domain lock as soon as the reservation as dequeued. This
way we avoid holding any domain-global reservation locks while the
reservation is being broken.

I like this much better. Thank you.

This revision is now accepted and ready to land.Nov 21 2019, 11:12 PM
alc added inline comments.
sys/vm/vm_reserv.c
1176 ↗(On Diff #64695)

Since you are returning true/false below, instead of TRUE/FALSE, change this to "bool" here and in vm_reserv.h.

1284 ↗(On Diff #64695)

Same here. (This case was already inconsistent, using boolean_t with true/false.)

This revision was automatically updated to reflect the committed changes.