Page MenuHomeFreeBSD

sysctl net.inet.tcp.ktlslist: allow snd_tag_status_str() to sleep
ClosedPublic

Authored by kib on Thu, Jul 3, 10:46 AM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Jul 22, 10:13 PM
Unknown Object (File)
Thu, Jul 10, 3:51 PM
Unknown Object (File)
Wed, Jul 9, 4:29 AM
Unknown Object (File)
Wed, Jul 9, 12:01 AM
Unknown Object (File)
Tue, Jul 8, 9:02 PM
Unknown Object (File)
Tue, Jul 8, 4:49 PM
Unknown Object (File)
Tue, Jul 8, 12:48 PM
Unknown Object (File)
Tue, Jul 8, 9:06 AM
Subscribers

Details

Summary
in_pcb: add in_pcbrele_rlock()

The helper the derefs and rlock the provided inp.  Returns false if inp
is still usable.


sysctl net.inet.tcp.ktlslist: allow snd_tag_status_str() to sleep

For this, unlock inp around the calls, taking the reference on it.  If
the inp appears to be freed or unlinked after the relock, restart the
loop.  In case we already copying out some data, the request oldidx is
rewind.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib requested review of this revision.Thu, Jul 3, 10:46 AM
kib edited the summary of this revision. (Show Details)

Instead of exposing INP_FREED, add in_pcbrele_rlock().

sys/netinet/in_pcb.c
1748
1749

Those semantics seem backwards. inp_trylock(), inp_smr_lock() both return true if the operation succeeded, false otherwise.

sys/netinet/tcp_subr.c
2728

I worry that this loop will not terminate on busy systems, e.g., if they are undergoing some DOS and inpcbs are being recycled frequently. An alternate approach would be to define some marker inpcb that lets you keep your place, but the implementation will be more challenging.

kib marked 2 inline comments as done.Thu, Jul 3, 2:26 PM
kib added inline comments.
sys/netinet/in_pcb.c
1748

I formulated it differently, mentioning that caller must own the reference.

1749

I copied the return values from in_pcbrele_locked().
I can change, but IMO it is more closely related function.

sys/netinet/tcp_subr.c
2728

Lets limit the number of retries for now.

kib marked 2 inline comments as done.

Mention the owned reference in the comment.
Limit the number of restarts for sysctl handler.

sys/netinet/in_pcb.c
1749

I see, fair enough.

sys/netinet/tcp_subr.c
2728

That sounds fine, but the latest diff doesn't implement it.

kib marked 3 inline comments as done.Thu, Jul 3, 6:17 PM
kib added inline comments.
sys/netinet/tcp_subr.c
2728

I forgot to commit it.
As result, I completely reworked the loop, adding the checks for potentially pending signals.

kib marked an inline comment as done.

Rework retry.
Relock sx between retries.
Check for pending signals.

I can't say that I am happy with this change, sorry. But looks like you need it, so let it be so.

I can't say that I am happy with this change, sorry. But looks like you need it, so let it be so.

What specifically you do not like?

All this hardware TLS assist is a crazy layer violation. But this is a wave a can't stand against.

That's why you need to sleep in the all inpcb iterator, which definitely was not designed for that. I can suggest that the iterator just returns to the userland the list of ktls enabled pcbs, and then a user level binary would iterate them one by one and request whatever it needs to be requested in a sleepable context, a syscall per pcb. I don't know if that will fit your needs or if you want to implement that.

All this hardware TLS assist is a crazy layer violation. But this is a wave a can't stand against.

https://kib.kiev.ua/kib/cats_Ln.jpg

That's why you need to sleep in the all inpcb iterator, which definitely was not designed for that. I can suggest that the iterator just returns to the userland the list of ktls enabled pcbs, and then a user level binary would iterate them one by one and request whatever it needs to be requested in a sleepable context, a syscall per pcb. I don't know if that will fit your needs or if you want to implement that.

And then kernel needs to validate passed inpcb address to be really inpcb, which involves re-walking the inpcb list. Basically, the linear scan is replaced by N^2.

This revision was not accepted when it landed; it landed in state Needs Review.Thu, Jul 10, 2:43 PM
This revision was automatically updated to reflect the committed changes.