Decompose the TCP INP_INFO lock to increase short-lived TCP connection scalability:
- The existing TCP INP_INFO lock continues to protect the global inpcb list stability during full list traversal (e.g. tcp_pcblist()).
- A new INP_LIST lock protects actual modifications of the inpcb list (inp allocation and free) and the inpcb counters.
This allows INP_INFO_RLOCK to be used in critical paths (e.g. tcp_input()),
with INP_INFO_WLOCK taken only in occasional operations that walk all connections.
The maximum rate of TCP connection setup and teardown
is increased from 60k/sec to 150k/sec.
Some notes:
- This patch is one more step in the short-lived TCP connection scalability effort that currently includes:
  - rS261242: Decrease lock contention within the TCP accept case by removing
  - rS264321: Currently, the TCP slow timer can starve TCP input processing while it
  - rS271119: In tcp_input(), don't acquire the pcbinfo global write lock for SYN
  - rS273850: Fix a race condition in TCP timewait between tcp_tw_2msl_reuse() and
  - rS281599: Fix an old and well-documented use-after-free race condition in
- This effort can also be seen as part of the larger effort started with rS222488: Decompose the current single inpcbinfo lock into two locks.