Page MenuHomeFreeBSD

inpcb: reoder inpcb destruction
ClosedPublic

Authored by glebius on Dec 19 2023, 5:45 PM.
Tags
None
Referenced Files
F108746544: D43122.id131583.diff
Mon, Jan 27, 4:50 PM
F108744091: D43122.id131911.diff
Mon, Jan 27, 4:33 PM
Unknown Object (File)
Sun, Jan 26, 6:09 PM
Unknown Object (File)
Fri, Jan 24, 10:43 PM
Unknown Object (File)
Fri, Jan 24, 10:37 PM
Unknown Object (File)
Fri, Jan 24, 9:49 PM
Unknown Object (File)
Thu, Jan 9, 11:53 AM
Unknown Object (File)
Sat, Jan 4, 9:30 AM

Details

Summary

First, merge in_pcbdetach() with in_pcbfree(). The comment for
in_pcbdetach() was no longer correct. Then, make sure we remove
the inpcb from the hash before we commit any destructive actions
on it. There are couple functions that rely on the hash lock
skipping SMR + inpcb lock to lookup an inpcb.

PR: 273890

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 55047
Build 51936: arc lint + arc unit

Event Timeline

jtl added inline comments.
sys/netinet/in_pcb.c
1723

Can you explain what race you see is possible and how placing this code here stops that race?

1756

If in_pcbrele_wlocked() returns false a few lines later, could this lead to a NULL dereference or double free? Should we instead set these to NULL if in_pcbrele_wlocked() returns false?

1766

If in_pcbrele_wlocked() returns false a few lines earlier, could this lead to a NULL dereference or double free? Should we instead set these to NULL if in_pcbrele_wlocked() returns false?

sys/netinet/in_pcb.c
1723

Let's imagine we are running the old code. We already cleared inp_socket pointer, we set the INP_FREED flag and entered in_pcbremhash(). We are blocked on the hash mutex in there. The hash mutex is owned by bind(2) syscall, that finds our inpcb (INP_FREED & INP_INHASHLIST, inp_socket == NULL) via in_pcblookup_local(). Holding the hash lock it expects inpcb to be valid, while it is already not.

This is what happens in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=273890

1756

To my knowledge any code that may put a reference on inpcb to prolong its lifetime over in_pcbfree() will not access in6p_outputopts or in6p_moptions. There are two more reasons to consider leaving dangling pointer a safe thing:

  • the existing logic to free the options predates the SMR conversion
  • the amount of subsystems that reference inpcb to prolong their lifetime has reduced in the last couple years
sys/netinet/in_pcb.c
1756

It might be worthwhile to use the new DEBUG_POISON_POINTER() macro here. In particular, setting the pointers to NULL might hide some bugs since the common pattern is to check whether inp_options etc. is NULL before doing anything with them.

I think you're right that this code is safe today, but there's no invariant preventing someone from adding code elsewhere which violates this assumption.

sys/netinet/in_pcb.c
1723

OK, I think I now understand the problem you are trying to address. A few comments:

a) I think it might be worth rephrasing the comment to make it more obvious what problem you are trying to solve and how this solves it.
b) While this solves the exact bug you've described, I question whether this is actually "safe" in the long term. in6_pcbbind() is violating the locking protocol for struct inpcb which is documented in in_pcb.h. It seems like we actually should fix this more fundamentally, either by making in6_pcbbind() (and any other consumers) follow the correct locking protocol or by changing the documented locking protocol to one which meets the needs of all structure users. Otherwise, any future developers may believe that the inpcb lock is sufficient to protect these fields when it is not.

sys/netinet/in_pcb.c
1723

Well, I can't 100% agree that in6_pcbbind() really violates the protocol. It finds inpcb via hash with an exclusive lock on hash. The hash shall not contain freed inpcbs in principle when you got a write lock on it. You may find a free one only if you use SMR, only then you are required to get inpcb lock. I'd rather say the destruction path before this change slightly violates protocol.

1756

Looks like that is going to be first consumer of the macro in tree! I'll create a separate revision.

Also move the global list removal before any destructive actions.
No known bugs with that, but for overall consistency.

sys/netinet/in_pcb.c
1723

The reason I believe in6_pcbbind() violates the protocol is that it accesses the inp_socket member without holding the inpcb lock. The inpcb structure lists the inpcb lock as the synchronization mechanism. Therefore, no function has any guarantee that field is stable unless the function acquires the inpcb lock. If that field should instead stay stable the entire time the inpcb is in the hash table, we should really update the locking notes for the inpcb structure to indicate this guarantee. (The same is probably true of inp_vflag and inp_flags, or at least particular flag values.)

The other fields in6_pcbbind() accesses (inp_inc and inp_cred) are either marked as constant or protected by both the inpcb lock and hash lock.

sys/netinet/in_pcb.c
1723

I see now your concerns. The comment in in_pcb.h comes from 2008 from Robert Watson. That was definitely true back then. At least due to existence of compressed time-wait state. I'm 90%+ positive that now the comment no longer applies. The inp_socket is valid through the entire lifetime of an inpcb, except the INP_FREED state, when it is waiting for SMR period. If you obtained an inpcb through normal lookup functions, and did not release either lock provided by the lookup function, the inpcb must have valid inp_socket pointer.

sys/netinet/in_pcb.c
1723

A correction: the INP_FREED state can be prolonged beyond SMR period due to a reference. Those subsystems that do that, should take care themselves.

This revision was not accepted when it landed; it landed in state Needs Review.Dec 27 2023, 4:35 PM
This revision was automatically updated to reflect the committed changes.