- Ask the firmware for the number of frames that can be stuffed in one work request.
- Modify mp_ring to increase the likelihood of tx coalescing when there are just one or two threads that are doing most of the tx. Add teeth to the abdication mechanism by pushing the consumer lock into mp_ring. This reduces the likelihood that a consumer will get stuck with all the work even though it is above its budget.
- Add support for coalesced tx WR to the VF driver. This, with the changes above, results in a 7x improvement in the tx pps of the VF driver for some common cases. The firmware vets the L2 headers submitted by the VF driver and it's a big win if the checks are performed for a batch of packets and not each one individually.