Page MenuHomeFreeBSD

inpcb: immediately return matching pcb on lookup
ClosedPublic

Authored by glebius on Jan 10 2023, 2:02 AM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Jan 1, 4:47 AM
Unknown Object (File)
Sun, Dec 22, 2:37 PM
Unknown Object (File)
Sat, Dec 14, 9:53 PM
Unknown Object (File)
Tue, Dec 10, 7:01 PM
Unknown Object (File)
Mon, Dec 9, 3:45 AM
Unknown Object (File)
Oct 5 2024, 6:06 PM
Unknown Object (File)
Oct 5 2024, 4:52 PM
Unknown Object (File)
Oct 2 2024, 11:34 PM
Subscribers

Details

Summary

This saves a lot of CPU cycles if you got large connection table.

The code removed originates from 413628a7e3d, a very large changeset.
Discussed that with Bjoern, Jamie we can't recover why would we ever
have identical 4-tuples in the hash, even in the presence of jails.
Bjoern did a test that confirms that it is impossible to allocate an
identical connection from a jail to a host. Code review also confirms
that system shouldn't allow for such connections to exist.

With a lack of proper test suite we decided to take a risk and go
forward with removing that code.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 49044
Build 45933: arc lint + arc unit

Event Timeline

we can't recover why would we ever have identical 4-tuples in the hash, even in the presence of jails

Sorry if this is a stupid question. Isn't it possible for this table to contain bound and unconnected sockets, inserted by in_pcbbind()? If so, it should indeed be possible to contain identical 4-tuples in the hash, though the addresses may be wildcards. I'd also assume it is possible via SO_REUSEPORT or the undocumented IP_BINDMULTI option. Or are we somehow assuming that this lookup path can only match connected sockets?

we can't recover why would we ever have identical 4-tuples in the hash, even in the presence of jails

Sorry if this is a stupid question. Isn't it possible for this table to contain bound and unconnected sockets, inserted by in_pcbbind()? If so, it should indeed be possible to contain identical 4-tuples in the hash, though the addresses may be wildcards. I'd also assume it is possible via SO_REUSEPORT or the undocumented IP_BINDMULTI option. Or are we somehow assuming that this lookup path can only match connected sockets?

Not a stupid question at all! Here goes my understanding of state of things. All users of in_pcblookup_hash_locked() seach for 4-tuples, that would ignore sockets inserted by in_pcbbind() as they have zero faddr&fport. The wildcard lookup happens later and the patch doesn't modify it. The SO_REUSEPORT allows to kick away existing pcb, not to exist, AFAIU.

The IP_BINDMULTI is Adrian's unfinished idea, checked in as "first stage" in 0a100a6f1ee5. The "second stage" never happened. There is zero test cases and zero software that uses it (see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261398). Given that - I don't care about it. Rather I'd open dicussion on retiring it or bringing to a maintainable shape - e.g. actively used software or at least a test suite.

we can't recover why would we ever have identical 4-tuples in the hash

One example JFYI, IPv6 link-local addresses can be the same for different interfaces. I think this isn't the case for now, since we keep them in embedded form, but it would work in proper way in the future and we should be able to handle several the same 4-tuple but with different scope zone id.

In D38015#863418, @ae wrote:

we can't recover why would we ever have identical 4-tuples in the hash

One example JFYI, IPv6 link-local addresses can be the same for different interfaces. I think this isn't the case for now, since we keep them in embedded form, but it would work in proper way in the future and we should be able to handle several the same 4-tuple but with different scope zone id.

That's a good point. There indeed shouldn't be a problem with an embedded form. When we eliminate the embedded form, that shouldn't be a problem either: inpcbs assume unicast addresses and the only scope in unicast is link-local.
There is already the ifnet argument in the in6_pcblookup_hash_locked(), so converting it to the "normal" form will require and additional condition, checking inp scopeid with the ifp scopeid.

I am fine with this on "main" if other concerns are sorted/understood.
I would still very much love someone to test this on a FreeBSD 7.4 or 8.0 before tons of other changes happened to see if this was ever possible and invariants have changed or not. Not doing the due-diligens now would kind-of be lazy and leave the question open for ever (and the commit message would read differently now if it was never possible).

PS: and I did email a test script to the mailing list following up to gallatin's posting (at least one using VNETs) which can be easily converted to be included in our test-suite as such.

Sorry for the delay.

we can't recover why would we ever have identical 4-tuples in the hash, even in the presence of jails

Sorry if this is a stupid question. Isn't it possible for this table to contain bound and unconnected sockets, inserted by in_pcbbind()? If so, it should indeed be possible to contain identical 4-tuples in the hash, though the addresses may be wildcards. I'd also assume it is possible via SO_REUSEPORT or the undocumented IP_BINDMULTI option. Or are we somehow assuming that this lookup path can only match connected sockets?

Not a stupid question at all! Here goes my understanding of state of things. All users of in_pcblookup_hash_locked() seach for 4-tuples, that would ignore sockets inserted by in_pcbbind() as they have zero faddr&fport. The wildcard lookup happens later and the patch doesn't modify it.

I see, thank you. It looked a bit like in_pcb_lport_dest() could potentially pass a wildcard local address to in_pcblookup_hash_locked(), but seems to be impossible.

Should we therefore assert that faddr and laddr are not wildcard addresses?

The SO_REUSEPORT allows to kick away existing pcb, not to exist, AFAIU.

I think you are thinking of SO_REUSEADDR. (BTW, I think SO_REUSEADDR is mostly a no-op now with the removal of INP_TIMEWAIT...?)

SO_REUSEPORT indeed allows multiple sockets to bind to the same laddr:lport, useful for UDP multicast.

The IP_BINDMULTI is Adrian's unfinished idea, checked in as "first stage" in 0a100a6f1ee5. The "second stage" never happened. There is zero test cases and zero software that uses it (see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261398). Given that - I don't care about it. Rather I'd open dicussion on retiring it or bringing to a maintainable shape - e.g. actively used software or at least a test suite.

Sure, if we don't think it was ever used then it should be removed, if only to make in_pcbbind_setup() simpler. I'll happily write that patch.

I'm taking lack of approvals from subscribed parties as "no objections" and plan to commit this revision this week.

I do take the lack of anyone taking up the test script I emailed for the test suite or any results for 7.x/8.0 for that as no interest of actually trying to understand the historic implications or keeping the new status-quo documented and working. Are there any plans to still do either of these?
Apart from that I said I am fine with this on "main".

And I agree with clearing up IP_BINDMULTI if it hasn't happened yet.

This revision is now accepted and ready to land.Jan 31 2023, 6:43 PM