Page MenuHomeFreeBSD

convert EBR like barrier to epoch(9)
AbandonedPublic

Authored by mmacy on Jun 23 2018, 11:10 PM.
Tags
None
Referenced Files
Unknown Object (File)
Nov 3 2024, 6:47 AM
Unknown Object (File)
Nov 2 2024, 8:05 AM
Unknown Object (File)
Nov 2 2024, 8:05 AM
Unknown Object (File)
Nov 2 2024, 8:05 AM
Unknown Object (File)
Nov 2 2024, 8:05 AM
Unknown Object (File)
Nov 2 2024, 8:05 AM
Unknown Object (File)
Nov 2 2024, 8:05 AM
Unknown Object (File)
Nov 2 2024, 7:45 AM
Subscribers

Details

Reviewers
jeff
alc
kib
Summary

the invl_gen_mtx is currently an expensive global serialization point on pmap_remove this converts it to use epoch

  • passes pho's tests post epoch kpi change

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 17846

Event Timeline

sys/amd64/amd64/pmap.c
475

don't you still need this assert?

500โ€“505

I see you deleted the turnstile code, but not the turnstile comment.

507

And this one?

sys/amd64/amd64/pmap.c
475

No. epoch(9) can recurse.

500โ€“505

Thanks. Removing.

507

No. If you exit an epoch without having entered one it will trigger an assert.

  • fix comment
  • update epoch_alloc to allow locks to be held across wait

Portions of the network stack now execute within a preemptible epoch section. In particular, we're allocating from and freeing to UMA within these sections. This means that pmap_remove() might already be getting called within an epoch section for net_epoch. Because the td_epochq linkage is embedded in the thread struct rather than living on the stack (like we do for rmlocks), preemptible epochs don't compose, so I think this change as-is is fragile.

Portions of the network stack now execute within a preemptible epoch section. In particular, we're allocating from and freeing to UMA within these sections. This means that pmap_remove() might already be getting called within an epoch section for net_epoch. Because the td_epochq linkage is embedded in the thread struct rather than living on the stack (like we do for rmlocks), preemptible epochs don't compose, so I think this change as-is is fragile.

More than fragile. https://people.freebsd.org/~pho/stress/log/mmacy025.txt

Will fix when I get back.

  • Update to new preemptible epoch interface

@kib & @markj this now passes pho's tests after addressing Mark's concerns with nesting. What further needs to be done for approval?

@kib & @markj this now passes pho's tests after addressing Mark's concerns with nesting. What further needs to be done for approval?

I want to read it (globally). I currently have some very time-consuming problems, so expect around a week turn-around time from me.

@kib & @markj this now passes pho's tests after addressing Mark's concerns with nesting. What further needs to be done for approval?

How does this fix the problem in the report from pho that you linked earlier?

@kib & @markj this now passes pho's tests after addressing Mark's concerns with nesting. What further needs to be done for approval?

How does this fix the problem in the report from pho that you linked earlier?

As of r335924 different epochs can now compose.

@kib & @markj this now passes pho's tests after addressing Mark's concerns with nesting. What further needs to be done for approval?

How does this fix the problem in the report from pho that you linked earlier?

As of r335924 different epochs can now compose.

But AFAICS you still cannot call epoch_wait() in an epoch section, which is the problem I'm referring to.

@kib & @markj this now passes pho's tests after addressing Mark's concerns with nesting. What further needs to be done for approval?

How does this fix the problem in the report from pho that you linked earlier?

As of r335924 different epochs can now compose.

But AFAICS you still cannot call epoch_wait() in an epoch section, which is the problem I'm referring to.

You can't call epoch_wait(x) while in epoch x. In this case we're calling epoch_wait_preempt on pmap_epoch while inside of net_epoch_preempt. Except for a couple of teardown paths in in if.c epoch_wait is never called on net_epoch_preempt.

You can't call epoch_wait(x) while in epoch x. In this case we're calling epoch_wait_preempt on pmap_epoch while inside of net_epoch_preempt. Except for a couple of teardown paths in in if.c epoch_wait is never called on net_epoch_preempt.

epoch_wait_preempt() still asserts that td_epochnest == 0 though. And if you remove that constraint it becomes possible for concurrent epoch_wait()s to deadlock.

You can't call epoch_wait(x) while in epoch x. In this case we're calling epoch_wait_preempt on pmap_epoch while inside of net_epoch_preempt. Except for a couple of teardown paths in in if.c epoch_wait is never called on net_epoch_preempt.

epoch_wait_preempt() still asserts that td_epochnest == 0 though.

Thanks, meant to remove that.

And if you remove that constraint it becomes possible for concurrent epoch_wait()s to deadlock.

Possible in the general case. Not here. And yes, I have a WIP patch to add WITNESS support.

You can't call epoch_wait(x) while in epoch x. In this case we're calling epoch_wait_preempt on pmap_epoch while inside of net_epoch_preempt. Except for a couple of teardown paths in in if.c epoch_wait is never called on net_epoch_preempt.

epoch_wait_preempt() still asserts that td_epochnest == 0 though. And if you remove that constraint it becomes possible for concurrent epoch_wait()s to deadlock.

And to address this particular issue, the assert can just be changed to MPASS(!in_epoch(x)) now that in_epoch takes an argument.

sys/amd64/amd64/pmap.c
439โ€“440

Wny is this mutex left around ?

444

Blank line after '{', there and for all other functions without locals.

Please put spaces around binary op '|'.

500โ€“505

This is not yet handled. I see that the comment requires much more significant rewrite, e.g. there is no longer a global di gen number, am I right ?

511

Why this code is needed ? And why the pmap_invl_gen.link is needed ? Also, pmap_invl_gen.gen should be bool or int instead of long, if it can only have the values 0 and 1.

3068โ€“3069

Is this line too long ?

3111

Put this declaration before previous static one.