- User Since
- Jun 19 2014, 6:57 AM (283 w, 1 d)
Sep 20 2019
Excellent news, and it looks like there's still some performance left on the table there. I guess the question now is whether I consider the theory that getting rid of the mp ring improves performance proven and start gutting it, or finish separating the TX path from the reclaim path and clean this hack up more so further testing can be done.
So two things jump out at me here... first, a quick look suggests _task_fn_tx() -> iflib_txq_drain() is recursing on the mutex, and second the interrupt handler is contending with tx for no good reason.
Sep 17 2019
Create an iflib_txq_drain_bypass() function called only from
iflib_if_transmit_bypass() which doesn't have the loop,
volatile/devolatile casts, conditional prefetching, txq-cast-to-mbuf,
etc overhead and only locks/unlocks once.
I assume the two piles in the bypass flamegraphs which don't start at fork_exit() are just because the stack sample isn't deep enough to correlate them over and that the iflib_txq_drain() samples in the wide peak are actually the ones that are directly in iflib_txq_drain?
Sep 16 2019
For the bypass disabled results, is tx_abdicate set?
Sep 11 2019
Use a tunable instead of a sysctl to toggle mp_ring bypass.
Aug 29 2019
Wait... why is iflib_if_transmit() being called in both? It should be using iflib_if_transmit_bypass() instead.
Aug 14 2019
Jul 30 2019
Jul 22 2019
Jul 9 2019
Cannot reproduce with D20834 applied.
Jun 23 2019
Jun 19 2019
Jun 18 2019
Jun 12 2019
Jun 5 2019
May 13 2019
May 8 2019
May 6 2019
Ok, I didn't think these functions were hot enough to be worth optimizing, and I'm not sure anything can be optimized around E1000_WRITE_*
Phew, great job. I've always been a bit uneasy about the MSI/INTx stuff, but this seems to have cleared everything up.
May 2 2019
Apr 30 2019
Apr 29 2019
Use rtsdtr as the stty argument to set the default mode, and -rtsdtr to
disable automatically asserting them on open().
Apr 26 2019
Address feedback, add support to umcs.
Apr 25 2019
"Reference" count core offsets and free() when all drivers unloaded.
Apr 24 2019
Fix style(), clean up default sysctl values, log message on malloc() failure.
Apr 23 2019
Apr 22 2019
Apr 19 2019
I'd really like to see this part split out into a separate review and committed soon, because it shouldn't see any contention; it adds control that is desirable to the people that know they need it, and doesn't really affect anyone else.
Apr 17 2019
Apr 15 2019
Mar 27 2019
Looks good to me.
Mar 14 2019
And again now that I've joined iflib...
Feb 28 2019
Feb 22 2019
Feb 15 2019
Feb 14 2019
FILTER_SCHEDULE_THREAD is a flag, not a discrete value. If it's set, schedule the
gtask and return FILTER_HANDLED.
Return FILTER_HANDLED rather than FILTER_SCHEDULE_THREAD when gtask (potentially) scheduled
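A minimal user-space sketch of the flag handling described in the two notes above. The FILTER_* values are local stand-ins that mirror FreeBSD's sys/bus.h as far as I recall, and sketch_fast_intr() and sketch_gtask_enqueue() are hypothetical names, not the actual iflib functions:

```c
/* Local stand-ins; assumed to mirror the values in FreeBSD's <sys/bus.h>. */
#define FILTER_STRAY		0x01
#define FILTER_HANDLED		0x02
#define FILTER_SCHEDULE_THREAD	0x04

/* Counts how many times the (hypothetical) gtask was enqueued. */
static int gtask_enqueued;

/* Hypothetical stand-in for enqueueing the group task. */
static void
sketch_gtask_enqueue(void)
{
	gtask_enqueued++;
}

/*
 * Hypothetical filter wrapper: takes the driver filter's result and
 * decides what to hand back to the interrupt framework.
 */
static int
sketch_fast_intr(int driver_result)
{
	/*
	 * Test the bit rather than comparing for equality, since the
	 * driver may OR FILTER_SCHEDULE_THREAD with other flags.
	 */
	if ((driver_result & FILTER_SCHEDULE_THREAD) == 0)
		return (driver_result);

	/* The gtask does the deferred work, so report the interrupt as handled. */
	sketch_gtask_enqueue();
	return (FILTER_HANDLED);
}
```

With an equality test, a driver filter returning FILTER_HANDLED | FILTER_SCHEDULE_THREAD would never get its gtask scheduled; the bit test covers that case.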
Some bits of the upstream patch didn't apply cleanly... added manually.
Put the result variable back in. *facepalm*
Unpack the conditional for easier reading.
Remove unrelated stuff.
Looks good to me.
Feb 5 2019
Feb 4 2019
I love everything about this. If you plan to use the summary as the commit message though, you should modify the "change the M_NOWAIT from malloc(9) calls into M_NOWAIT." bit to mention M_WAITOK.
Jan 28 2019
Looks good. This may also fix where bar == -1 && pci_msix_count(dev) == 0 (or break it less... or something)
I think that the inability to map an MSI-X table should likely not be bootverbose... this is an allocation failure which will significantly impact device performance. While the user should know that MSI-X is disabled, I'm not sure the user should be expected to know that the allocation will fail.
Jan 25 2019
No objections, though I do agree with cem@ in principle regarding putting it in the GENERICs. When 3rd-party drivers start relying on it, it would violate POLA if removing Intel NICs from the kernel broke a 3rd-party driver. For now it's likely "fine".
Jan 18 2019
Jan 17 2019
So the basic goal is to have the test at sys/net/if_ethersubr.c:583 see a consistent state of the changes from the /* Change the interface type */ line to the lagg_proto_addport() line.
Jan 16 2019
RLOCK() before setting ifp->if_type
Jan 15 2019
Jan 14 2019
Jan 11 2019
Jan 8 2019
Jan 7 2019
My only concern here is making IFLIB_MAX_RX_SEGS available to the driver...
Looks good, only a tiny nitpick on the fail label.
Dec 27 2018
Dec 20 2018
In addition to trying to keep the TXQ full, use a mp_ring size that's half the
number of descriptors. Previously, the mp_ring was a fixed size which happened
to be twice the default size of the txq for my em devices.
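The sizing rule in that note, as a rough sketch only. sketch_mp_ring_size() is a hypothetical helper rather than an iflib function, and the floor of one entry is my assumption for the sketch, not something the change states:

```c
#include <stddef.h>

/*
 * Hypothetical helper: derive the mp_ring size from the number of TX
 * descriptors instead of using a fixed size.
 */
static size_t
sketch_mp_ring_size(size_t ntxd)
{
	size_t size = ntxd / 2;	/* half the number of descriptors */

	/* Assumption for this sketch only: never return an empty ring. */
	return (size > 0 ? size : 1);
}
```

For a queue with 1024 descriptors this gives a 512-entry ring, so the ring tracks the descriptor count instead of staying at a fixed size.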
Dec 18 2018
Dec 14 2018
Remove the drain limit completely and instead try to fill the TX queue on each drain.