
Add support for sleepable EPOCH(9) type, based on prior work in the LinuxKPI.
Needs Review · Public

Authored by hselasky on May 21 2021, 9:31 AM.
Tags
None

Details

Reviewers
markj
glebius
mmacy
kib
Group Reviewers
manpages
Summary

This EPOCH type is not targeted at high-turnaround performance; rather, it is
meant to close configuration races in the kernel in a performant way, without
having to rely on sleepable mutexes, which have the disadvantage that all
CPUs share a single cache line for the lock.

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
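
For orientation, here is a minimal sketch of how a read section under such a
sleepable epoch might look. The EPOCH_SLEEPABLE flag and the
epoch_enter_sleepable()/epoch_exit_sleepable() names are assumptions modeled
on the existing EPOCH(9) API (epoch_alloc(), EPOCH_PREEMPT, struct
epoch_tracker); they are not necessarily the interface this diff introduces.

/*
 * Sketch only: a sleepable epoch read section.  EPOCH_SLEEPABLE and
 * the *_sleepable() calls are assumed names; the stock EPOCH(9) API
 * provides epoch_alloc(), EPOCH_PREEMPT, epoch_enter_preempt() and
 * epoch_exit_preempt().
 */
#include <sys/param.h>
#include <sys/epoch.h>

static epoch_t config_epoch;

static void
config_epoch_init(void)
{
	config_epoch = epoch_alloc("config", EPOCH_SLEEPABLE);
}

static void
config_read(void)
{
	struct epoch_tracker et;

	epoch_enter_sleepable(config_epoch, &et);
	/* Read shared configuration state; sleeping is permitted here. */
	epoch_exit_sleepable(config_epoch, &et);
}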

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

hselasky created this revision.

> This EPOCH type is not targeted at high-turnaround performance; rather, it is
> meant to close configuration races in the kernel in a performant way, without
> having to rely on sleepable mutexes, which have the disadvantage that all
> CPUs share a single cache line for the lock.

Why are sleepable rm locks not sufficient?
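
("Sleepable rm locks" here presumably refers to the rmslock variant from
sys/rmlock.h. A minimal sketch of that pattern, assuming the stock rms_*()
API; treat the details as illustrative:)

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/rmlock.h>

static struct rmslock cfg_lock;

static void
cfg_lock_init(void)
{
	rms_init(&cfg_lock, "cfg");
}

static void
cfg_read(void)
{
	rms_rlock(&cfg_lock);		/* per-CPU fast path; may sleep */
	/* ... read configuration state ... */
	rms_runlock(&cfg_lock);
}

static void
cfg_write(void)
{
	rms_wlock(&cfg_lock);		/* waits for all readers to drain */
	/* ... modify configuration state ... */
	rms_wunlock(&cfg_lock);
}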

@markj:

Reasons for not using an RM lock:

  1. Blocking can happen on the lock itself.
  2. Multiple CPUs can touch the same RM lock, causing memory contention when trying to update the lock.

This may also be a way to move the "RCU" support in the LinuxKPI into the kernel proper.

> Reasons for not using an RM lock:
>
>   1. Blocking can happen on the lock itself.

Ok, but by definition sleeping can happen anyway. What usages do you have in mind for this?

>   2. Multiple CPUs can touch the same RM lock, causing memory contention when trying to update the lock.

RM read locks touch only per-CPU data in the fast path.

> >   1. Blocking can happen on the lock itself.
>
> Ok, but by definition sleeping can happen anyway. What usages do you have in mind for this?

I want to resolve a race with if_ioctl. Yes, there are many ways to solve this, but EPOCH seems to be the more performant one: wait operations are very rare compared to the read sections.

https://reviews.freebsd.org/D28136

> >   2. Multiple CPUs can touch the same RM lock, causing memory contention when trying to update the lock.
>
> RM read locks touch only per-CPU data in the fast path.

I see, but they still block all callers when there is an exclusive lock?

--HPS
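
To make the race concrete, here is a sketch of the ifioctl() shape under
discussion. ifunit() and the if_ioctl method are the stock ifnet API; the
sleepable epoch calls and the net_epoch_sleepable variable are assumptions,
not code from this diff.

#include <sys/param.h>
#include <sys/epoch.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/if_var.h>

/* Assumed: a global sleepable epoch protecting the ifnet list. */
extern epoch_t net_epoch_sleepable;

static int
example_ifioctl(u_long cmd, caddr_t data, const char *name)
{
	struct epoch_tracker et;
	struct ifnet *ifp;
	int error;

	epoch_enter_sleepable(net_epoch_sleepable, &et);  /* assumed API */
	ifp = ifunit(name);		/* look up the ifnet by name */
	if (ifp == NULL) {
		epoch_exit_sleepable(net_epoch_sleepable, &et);
		return (ENXIO);
	}
	/*
	 * The handler may sleep; without the read section (or a
	 * reference), ifp could be detached and freed underneath us.
	 */
	error = ifp->if_ioctl(ifp, cmd, data);
	epoch_exit_sleepable(net_epoch_sleepable, &et);
	return (error);
}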

> I want to resolve a race with if_ioctl. Yes, there are many ways to solve this, but EPOCH seems to be the more performant one: wait operations are very rare compared to the read sections.

rm lock read locks should be cheaper than epoch read sections.

> I see, but they still block all callers when there is an exclusive lock?

Right, they enforce mutual exclusion. In general I think it's preferable to use mutual exclusion when possible, since that makes it much easier to reason about the correctness of the code, at the expense of large worst-case latency for readers. But why is this downside a problem for ioctl handlers?

> But why is this downside a problem for ioctl handlers?

The ifioctl() handling always starts by looking up the ifnet by name; there is no problem with correctness checking. Because visibility is controlled by whether the network device is listed or not, this is a perfect match for EPOCH(9). The only problem is that this code is sleepable, which is why I added sleepable EPOCH(9) support. It should not conflict with your patches to simplify preemptible epochs.

> It should not conflict with your patches to simplify preemptible epochs.

Sure, I am not worried about conflicts. I just don't see why we should extend our synchronization primitives when existing ones are sufficient. If there is some reason that we really want to avoid blocking in read-only ioctl handlers, then perhaps it is reasonable, but it's not clear why that is a problem.

> Sure, I am not worried about conflicts. I just don't see why we should extend our synchronization primitives when existing ones are sufficient.

I currently view EPOCH(9) as the best solution: sleeping happens entirely on the writer side, so to speak, so we don't have to worry about sleeping issues in the fast path.

I've also added setsockopt and getsockopt under the same sleepable EPOCH(9); see:

https://reviews.freebsd.org/D28136

I'd like to move forward with a solution for a few panics I see.

What do you think?

--HPS

gbe added a subscriber: gbe.

LGTM for the man page parts.

The misspelling can be fixed on commit if nothing else requires a re-review. Everything else is beyond me.

share/man/man9/epoch.9:136