The TCP stack currently walks the list of mbufs twice: once to size
the send for TSO, and then again for the actual copy out to the new
mbuf chain being passed down to IP. This roughly doubles the cache
misses incurred walking the mbuf chains.
For the RACK stack a new tcp_m_copym() function was introduced that
optimizes this by doing the copy and the TSO size limiting in a
single pass.
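The one-pass idea can be sketched in a small user-space model. This is illustrative only: the `mbuf` struct below is a toy, and `copy_and_limit()` is a hypothetical stand-in for the approach, not the kernel's actual tcp_m_copym() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of an mbuf chain; the real struct mbuf is far richer. */
struct mbuf {
	struct mbuf *m_next;
	size_t m_len;
	char m_data[256];
};

/*
 * Single walk over the chain: copy bytes out while simultaneously
 * stopping at the TSO size limit, instead of one sizing walk
 * followed by a second copying walk.
 */
static size_t
copy_and_limit(const struct mbuf *m, size_t tso_limit, char *out)
{
	size_t copied = 0;

	for (; m != NULL && copied < tso_limit; m = m->m_next) {
		size_t take = m->m_len;

		if (copied + take > tso_limit)
			take = tso_limit - copied; /* clamp to TSO limit */
		memcpy(out + copied, m->m_data, take);
		copied += take;
	}
	return (copied);
}
```

Because sizing and copying share one traversal, each mbuf is pulled into cache once rather than twice, which is the saving the change above is after.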
This change brings the use of that function into the main stack. Note
that NF has been using this in both RACK and the main stack for a couple
of years now. It also paves the way for Drew Gallatin's new mbufs and
the TLS sendfile feature that uses them.