Advanced Search
Dec 8 2018
Dec 7 2018
Dec 5 2018
can you grab a flamegraph from such a test? also, can you compare this against https://reviews.freebsd.org/D17992 ?
Dec 1 2018
I did basic tests with changing the alignment of src and slowdowns were very small compared to similarly misaligned dst, at least on EPYC. I may take a closer look later.
Nov 30 2018
Nov 29 2018
Nov 28 2018
once more i don't have a full picture so can't give a proper review.
Nov 23 2018
Nov 22 2018
- remove now spurious cv_broadcast(&p->p_pwait);
In D17992#387922, @kristof wrote: Adding more rings won't really help any more than making this one ring larger. That merely increases the queue length between the multiple pf threads, and the pfsync processing code (which is still single-threaded) in pfsync_msg_intr() and pfsyncintr().
Nov 21 2018
- rebase
- fix fork
Nov 20 2018
So both the ring and the swi kicking code are significant players. I think a simple and probably good enough solution would be to just add more rings, perhaps based on the number of hardware threads. Assuming the traffic is hashed to distribute among them, the rings could mostly remain unshared with unrelated threads. Sending out the traffic would just combine data from all rings. Kicking can also be avoided in a simple manner. You can add a var signifying the frequency of wakeups. Then increase the frequency based on the traffic, and past a certain threshold you stop kicking swi. It has to decay so that if there is no traffic, the code goes back to wakeups once a second (or whatever).
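To make the kick-avoidance idea concrete, here is a rough sketch of what I mean. Everything in it is hypothetical (kick_state, KICK_THRESHOLD, RING_COUNT and the function names do not exist in pfsync), and it is single-threaded for illustration only; real code would need atomics or per-CPU state and would hook the decay into a callout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunables, not taken from pfsync. */
#define KICK_THRESHOLD	8	/* activity level past which producers stop kicking */
#define RING_COUNT	4	/* e.g. sized from the number of hardware threads */

struct kick_state {
	uint32_t rate;		/* recent enqueue activity, decays over time */
};

/*
 * Called by a producer after enqueueing. Returns true if the consumer
 * still needs an explicit wakeup (the swi kick), false once traffic is
 * high enough that the consumer is expected to poll on its own.
 */
static bool
kick_needed(struct kick_state *ks)
{
	if (ks->rate < UINT32_MAX)
		ks->rate++;
	return (ks->rate <= KICK_THRESHOLD);
}

/*
 * Called periodically (say once a second). Halving the counter means
 * that with no traffic the code drifts back to explicit wakeups.
 */
static void
kick_decay(struct kick_state *ks)
{
	ks->rate /= 2;
}

/* The consumer keeps polling only while activity is above the threshold. */
static bool
consumer_should_poll(const struct kick_state *ks)
{
	return (ks->rate > KICK_THRESHOLD);
}

/* Map a flow hash to a ring so a given flow always lands on the same ring. */
static unsigned
ring_for_hash(uint32_t flow_hash)
{
	return (flow_hash % RING_COUNT);
}
```

A producer would pick its ring with ring_for_hash(), enqueue, and call the swi only when kick_needed() returns true. The threshold and the halving decay are arbitrary picks here and would have to be tuned against real traffic.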
Nov 18 2018
So I retested with your change. A failing build is indeed fixed. Perhaps my original change had a typo or similar. Thanks.
Are you sure that on your box a *failing* -DNO_CLEAN starts building again with this change? Are you using meta-mode? I had a similar change locally and the build kept failing anyway, no meta-mode though.
Nov 16 2018
- address feedback
- drop killpg changes
Nov 15 2018
yes. these patches are stale and kind of crap. I have a WIP replacement which I'll probably post in a new review, we will see.
I have no doubt there is an improvement, just saying it is still slower than it can be and unless this uncovered a new major bottleneck, the ring manipulation is the new hotspot.
Nov 14 2018
I think the approach taken here is iffy. Basic problem with this is that even if there is no lock contention anymore, you are still suffering from bouncing cache lines. Also swi_sched probably does not appreciate being called very often.
Nov 13 2018
Nov 8 2018
Nov 6 2018
- address feedback
- regen against head
- i did not change the condition in proc_realparent as it goes way over the 80 char limit
Nov 4 2018
I don't think the same problem is a concern for ps/top, so it can be discussed further in a different review. I can simply drop killpg conversion from the patchset (and remove the spurious curly braces).
killpg is already unreliable. if the child is spotted as PRS_NEW it will be explicitly omitted, so I don't think this constitutes a regression in functionality
Nov 3 2018
Nov 2 2018
Nov 1 2018
Oct 31 2018
Oct 24 2018
Oct 23 2018
I believe the problem is roughly the same as before. Passed buffers are often already heavily misaligned, so movs here trip over the same words anyway.