Thu, Feb 14
I understand the missing FILTER_STRAY.
Fri, Feb 8
Updated to truncate the length passed to pfil for multi-segment jumbo frames to just the length of the first segment.
Thu, Feb 7
Any high performance interface will look similar to this and be similarly invasive.
The reason to have this in the driver is for performance. The plan for pfil is to eventually just use a pointer to a memory blob for filtering. The current fake mbuf stuff is a step in that direction.
Wed, Feb 6
- moved rv declaration to top of function
- entered CURVNET to prevent a panic from a NULL curvnet when VIMAGE is compiled in. Note that placing this at the top of the rx function is a trade-off: a tiny overhead in the common case (nothing hooked) and a slightly larger one when something is hooked, versus entering/restoring the VNET each time we call into the filter.
- Addressed all possible return values from pfil_run_hooks. Note that PFIL_REALLOCED was tested by hacking pfil.c to copy packets.
- Added a label to jump to when PFIL_REALLOCED is returned, and we need to send up an mbuf allocated by the filter.
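The bullets above can be sketched as a small userspace mock of the rx-path control flow. To be clear, this is illustration only: the enum values mirror pfil(9)'s verdict names, but the types, the helper, and the dispatch function are all stand-ins, not the actual driver or pfil code.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the pfil(9) verdicts discussed above; the real
 * definitions live in the kernel's net/pfil.h. */
enum pfil_ret { PFIL_PASS, PFIL_DROPPED, PFIL_CONSUMED, PFIL_REALLOCED };

struct mbuf { int len; };

/* Simulated filter hook: may replace the mbuf, as PFIL_REALLOCED does. */
static enum pfil_ret
mock_run_hooks(struct mbuf **mp, enum pfil_ret verdict, struct mbuf *repl)
{
	if (verdict == PFIL_REALLOCED)
		*mp = repl;		/* filter handed us a newly allocated mbuf */
	return (verdict);
}

/* Schematic rx-path dispatch: rv declared at the top, every verdict
 * handled, and a distinct path for the filter-allocated mbuf. */
static const char *
rx_dispatch(enum pfil_ret verdict)
{
	struct mbuf orig = { 1500 }, repl = { 1500 };
	struct mbuf *m = &orig;
	enum pfil_ret rv;

	/* CURVNET_SET(ifp->if_vnet) would be entered once, up here */
	rv = mock_run_hooks(&m, verdict, &repl);
	switch (rv) {
	case PFIL_DROPPED:
	case PFIL_CONSUMED:
		return ("skip");	/* filter freed or took ownership */
	case PFIL_REALLOCED:
		/* the "jump to a label" case: deliver the filter's mbuf */
		return (m == &repl ? "deliver-realloced" : "error");
	case PFIL_PASS:
	default:
		return ("deliver");	/* normal delivery of our own mbuf */
	}
	/* CURVNET_RESTORE() would pair with the SET before returning */
}
```

The point of entering the VNET once at the top is that the common (nothing hooked) pass stays cheap, while the hooked path avoids a SET/RESTORE per filter call.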
Jan 4 2019
Remove XXX as per wlosh
Dec 30 2018
I really like collapsing all those pointers, and making the NULL checks flags. Great work.
Dec 15 2018
I like this much better; hopefully the performance tests will show that it is an improvement.
Nov 16 2018
I was about to say that cxgbe still uses it, but it looks like that was just changed in r340465. So yes, let's get rid of it.
Nov 10 2018
- Fixed a pre-existing bug where, when receiving small packets, rx clusters were not unmapped, yet they were re-mapped when refilling the ring. This was fixed by keeping track of the bus addresses in ifsd_ba and simply re-using them, rather than re-doing the virt-to-bus mapping.
Nov 9 2018
Thanks Olivier. 5% is about what I had expected. I think I see a way to improve that, though. We seem to be doing repeated virt-to-phys translation on clusters where we have copied out a small mbuf on the rx side in iflib_rxd_pkt_get(). We call rxd_frag_to_sd() with a FALSE arg to prevent unmapping, yet in _iflib_fl_refill() we seem to always do the mapping, even when we do not re-allocate clusters. This is one bit that's going to be more expensive on low-end boxes, since it will now result in a busdma callback function being called. I think this is an actual bug that would impact systems w/IOMMUs.
Oct 11 2018
Stepping back for a second, are we going to get to a point where we can make malloc() and uma_zalloc() use the current thread's domainset policy, rather than one derived from the kernel object?
Oct 10 2018
FWIW, I have patched both of these changes (this + D17492) into the upstream ck git repo and run them through the CK regression tests for epoch in userspace on my 32-core/64-thread AMD box, and they passed.
Oct 8 2018
You mention kstacks. I have not gone through the rest of your reviews, but I was wondering if you were working on affinitizing kstacks. Did you plan to break the cache out by domain, or drop it for affinitized kstack allocations? The default is so undersized these days, we're mostly not getting the benefit of the cache anyway (and the cache might be pessimal, since it is protected by a global lock).
Oct 5 2018
Do we really need a new malloc variant? Couldn't this be done with a new malloc flag specific to malloc_domain()? Something like M_ALTOK? (Or maybe M_EXACT, if you want the default to be the other way around and have malloc_domain() fall back by default.)
Oct 1 2018
This was committed as r339043. Due to a cut and paste error, the wrong review was linked from the commit message, so the review was not auto-closed.
Sep 24 2018
- Fixed spelling errors pointed out by alc
- Updated to account for recent VM changes:
- added empty domain guard in kmem_back()
- fixed conflict in keg_fetch_slab() after r338755
Sep 17 2018
Added check for empty domains in uma_large_malloc_domain(), and removed added blank line, as pointed out by Mark
Sep 10 2018
FWIW, as I mentioned in private email, the testing went fine. I verified that there were no apparent mem leaks, and that in_pcblbgroup_free_deferred() was called when I bounced nginx several times on a heavily loaded (> 90Gb/s) box.
Sep 6 2018
We've had it in our config at Netflix for so long that I forgot that it was not in GENERIC.
Aug 29 2018
For what it is, Van's birthday might be a better choice. E.g., some non-valid pointer that would cause free or mfree to panic, rather than accept it and corrupt data.
Aug 28 2018
Sorry, but having the ring pointer in the ring is just so odd. Can you explain why it is there? Maybe it would be better not to have the ring pointer in the ring, and thus avoid all these special cases.
What keeps the normal, non-IFC_QFLUSH from hitting this condition? Is the txq pointer included in r->size?
Aug 23 2018
Thanks for the clarification on the non-offset range!
Thank you so much for the pointer to the docs!!! It is much nicer to write from real docs, rather than crib from undocumented changes in another project.
Jul 30 2018
In general, I looked around at ixlv.c, and it looks like there are many functions for the main PF which are identical, or nearly identical, to those for the VF. Can we please work on collapsing them down to common code?
Jul 17 2018
Thanks for doing this!
Jun 27 2018
I ran this on a Netflix 100g box. I observed no measurable difference in CPU time. So I think this patch is "neutral" from the perspective of our (mostly kernel) workload.
Jun 20 2018
I added Hans since it is mostly his code we are factoring out.
Jun 14 2018
If @jch cannot run his real workload, what about running a synthetic test that creates/tears down many connections per second? E.g., something like many copies of netperf -t TCP_CC?
You might want to go for some middle ground here. 1ms is a bit much (even for us).
Jun 4 2018
Sure. Those checks for TCP in the TSO case should probably turn into asserts eventually.
May 30 2018
If people want to use ethernet entropy harvesting, can we do something to make it more useful in a separate change? I was going to initially suggest the ether dst addr, but that's not very random either.
May 29 2018
This is the sparse detection I'm using. Note how it walks the header, and then walks the TSO payload packet-by-packet as it will appear on the wire.
Somehow my last comment got lost. With real Netflix traffic, the sparse detection logic is not sufficient. When we have kernel TLS, which inserts 13- and 16-byte mbufs around TLS records, we hit MDDs regularly. I think you need to walk the packet stream, not the DMA descriptors, so that you keep the same segmentation used by the hardware. I rewrote the sparse detection to do this, and our MDDs went away.
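A hypothetical userspace sketch of the walk-by-wire-packet idea: slice the chain of mbuf segment lengths into MSS-sized wire packets and count how many segments (descriptors) any one packet touches. The function names, MSS, and descriptor budget below are invented for illustration; they are not the driver's actual values or code.

```c
#include <assert.h>

#define WIRE_DESC_LIMIT	8	/* assumed per-wire-packet descriptor budget */

/* Walk the payload by on-wire packet (MSS at a time) and return the
 * maximum number of mbuf segments any single wire packet spans. */
static int
max_descs_per_wire_pkt(const int *seglen, int nsegs, int mss)
{
	int seg = 0, segoff = 0, maxdescs = 0;

	while (seg < nsegs) {
		int remaining = mss, descs = 0;

		/* consume one wire packet's worth of payload */
		while (remaining > 0 && seg < nsegs) {
			int avail = seglen[seg] - segoff;
			int take = avail < remaining ? avail : remaining;

			remaining -= take;
			segoff += take;
			descs++;			/* this segment contributes a descriptor */
			if (segoff == seglen[seg]) {	/* segment exhausted */
				seg++;
				segoff = 0;
			}
		}
		if (descs > maxdescs)
			maxdescs = descs;
	}
	return (maxdescs);
}

/* Sparse if any wire packet would exceed the hardware's descriptor budget. */
static int
is_sparse(const int *seglen, int nsegs, int mss)
{
	return (max_descs_per_wire_pkt(seglen, nsegs, mss) > WIRE_DESC_LIMIT);
}
```

The key property is that the count is taken against the hardware's MSS segmentation, so a run of tiny kTLS framing mbufs that all land inside one wire packet is charged to that packet, which is exactly what a per-descriptor walk misses.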