Page MenuHomeFreeBSD

inpcb: retire the inpcb global list
ClosedPublic

Authored by glebius on Thu, Mar 19, 7:20 PM.
Tags
None
Referenced Files
F152881116: D55967.id175363.diff
Fri, Apr 17, 5:55 PM
Unknown Object (File)
Tue, Apr 14, 8:25 AM
Unknown Object (File)
Tue, Apr 14, 6:32 AM
Unknown Object (File)
Mon, Apr 13, 6:44 PM
Unknown Object (File)
Sat, Apr 11, 8:03 PM
Unknown Object (File)
Sat, Apr 11, 6:20 AM
Unknown Object (File)
Sat, Apr 11, 12:36 AM
Unknown Object (File)
Fri, Apr 10, 9:48 AM

Details

Summary

The iteration over all pcbs is possible without the global list. The
newborn inpcbs are put on a global list of unconnected inpcbs, then after
connect(2) or bind(2) they move to respective hash slot list.

This adds a bit of complexity to inp_next(), but the storage scheme is
actually simplified.

One potential problem before this change was that a couple of pcbs fall
into the same hash slot and are linked A->B there, but they also sit next
to each other in the global list, linked as B->A. This can deadlock of
course. The problem was never observed in the wild, but I was able to
instrument it with lots of effort: just few pcbs in the system, hash size
reduced down to 2 and a lot of repetitive calls into two kinds of
iterators.

However the main motivation is not the above problem, but make a step
towards splitting the big hash lock into per-slot locks.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/netinet/in_pcb.h
330
sys/netinet/tcp_usrreq.c
3013

If we break here, we'll proceed to the next loop and print one inpcb before breaking again.

sys/netinet/in_pcb.h
326
327

"CK-nature" is not very clear. I'd write something like, "we can't share the linkage pointers because unlocked readers might be iterating over them."

sys/netinet/in_pcb.h
324–332
325

Really it's just unconnected pcbs.

326

The comment describes two lists--what is the third list?

glebius added inline comments.
sys/netinet/in_pcb.h
326

What I'm trying to say here is that a pcb is linked by either of inp_hash_exact, inp_hash_wild, inp_unconn_list. All three members are of the same list entry type. Then I try explain why we can't use just one list entry member, or a union.

sys/netinet/in_pcb.h
326

Ok, it's more clear to me now.

364–365

The cache line annotation is wrong now. Why group these together?

glebius added inline comments.
sys/netinet/in_pcb.h
364–365

I'd probably remove all the comments on cache lines. Several years back @gallatin with @rrs did some actual measures that proved that certain cache line alignment is beneficial. The structure has changed a lot since, and as of today member placement is only speculatively cache line improved. If @gallatin goes for another round of cache misses profiling, then members can be rearranged and comments resurrected. Do you agree with this plan?

This revision was not accepted when it landed; it landed in state Needs Review.Sun, Apr 12, 6:36 PM
This revision was automatically updated to reflect the committed changes.