Page MenuHomeFreeBSD

Fix locking in vm_reserv_reclaim_contig().
ClosedPublic

Authored by markj on Fri, Nov 15, 11:45 PM.

Details

Summary

We were not properly handling the case where the trylock of the
reservaton fails, in which case we could leak the per-domain reservation
queue lock.

Introduce a marker reservation so that we can implement precise scanning
in vm_reserv_reclaim_contig(). (Ab)Use the marker's lock to serialize scans
of the partpop queue so that a global marker structure can be used. I
do not believe that this will significantly inhibit parallelism:
vm_reserv_reclaim_contig() is rarely called, and
vm_reserv_reclaim_inactive()'s hold time is short.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

markj created this revision.Fri, Nov 15, 11:45 PM
dougm added inline comments.Sat, Nov 16, 6:05 AM
sys/vm/vm_reserv.c
1184 ↗(On Diff #64408)

If I can rewrite to eliminate a continue, and a break, and a duplicate statement, I've got to recommend it.

while (!ret &&((rv...))) {

 ...
ret = rv == TAILQ_FIRST(...);
if (ret)
  vm_reserv_reclaim(rv);

vm_reserv_unlock(rv);
}

kib accepted this revision.Sat, Nov 16, 10:07 AM
kib added inline comments.
sys/vm/vm_reserv.c
1084 ↗(On Diff #64408)

I think we want to initialize marker into some impossible state so that attempts to use it as normal reservation trigger asserts.

1172 ↗(On Diff #64408)

This is true because you do not drop the scan lock in vm_reserv_reclaim_contig, right ? May be add a comment.

This revision is now accepted and ready to land.Sat, Nov 16, 10:07 AM
markj marked 3 inline comments as done.Sat, Nov 16, 5:41 PM
markj added inline comments.
sys/vm/vm_reserv.c
1084 ↗(On Diff #64408)

I guess it would be sufficient to make it look like a fully populated reservation. Such a reservation should never be present in the partpop queue.

1172 ↗(On Diff #64408)

Indeed, ok.

1184 ↗(On Diff #64408)

Thanks, that's much better.

markj updated this revision to Diff 64451.Sat, Nov 16, 5:41 PM
markj marked 3 inline comments as done.

Address feedback.

This revision now requires review to proceed.Sat, Nov 16, 5:41 PM
jeff added a comment.Sat, Nov 16, 11:58 PM

I don't see the lock leak? again or continue both expect the lock held.

markj added a comment.Mon, Nov 18, 6:23 PM
In D22392#490198, @jeff wrote:

I don't see the lock leak? again or continue both expect the lock held.

We were failing to unlock the reservation after the trylock failed.

markj updated this revision to Diff 64695.Thu, Nov 21, 9:17 PM

Change locking vm_reserv_reclaim_inactive() following a suggestion
from Jeff: instead of using the scan lock, take the domain lock
and reclaim from the first reservation where trylock succeeds.
Drop the domain lock as soon as the reservation as dequeued. This
way we avoid holding any domain-global reservation locks while the
reservation is being broken.

jeff accepted this revision.Thu, Nov 21, 11:12 PM

I like this much better. Thank you.

This revision is now accepted and ready to land.Thu, Nov 21, 11:12 PM
kib accepted this revision.Fri, Nov 22, 12:39 AM
alc accepted this revision.Fri, Nov 22, 5:26 AM
alc added inline comments.
sys/vm/vm_reserv.c
1176 ↗(On Diff #64695)

Since you are returning true/false below, instead of TRUE/FALSE, change this to "bool" here and in vm_reserv.h.

1284 ↗(On Diff #64695)

Same here. (This case was already inconsistent, using boolean_t with true/false.)

markj marked 2 inline comments as done.Fri, Nov 22, 3:55 PM
This revision was automatically updated to reflect the committed changes.