Other than the fd not being used in the test code, I am good with this change.
Mon, Jul 7
Mon, Jun 30
Thanks for the elaboration. Looks good to me now.
It looks like I am missing some context. Given that the socket is in TCPS_TIME_WAIT when switching to the default stack, was this assert a day-one issue that was not covered by tests before? Or was this panic caused by recent changes?
Jun 16 2025
Jun 13 2025
Jun 11 2025
May 14 2025
May 1 2025
Apr 21 2025
Mar 27 2025
Mar 20 2025
This is for the fast path fix. I just want to confirm with you that a second patch may follow to improve both the fast and slow paths?
Mar 17 2025
Mar 5 2025
Feb 28 2025
Didn't see any surprising regression from my test result for this patch:
testD49047
Feb 21 2025
code update based on Richard's comments
re-base, add a missing part and update based on Richard's comment in the meeting
Feb 20 2025
Feb 19 2025
In D49047#1118719, @glebius wrote:
Question: why does this function return a signed value? And why are some of the sackhint counters signed?
There are several calls to tcp_compute_pipe() that do not check V_tcp_do_newsack: 2 in tcp_output and 1 in tcp_input. Their behavior will change when revised SACK is turned off. The commit message should explain why this is okay.
Feb 18 2025
Feb 14 2025
Feb 12 2025
Jan 31 2025
Jan 8 2025
Why not move the old_method: label above the stack variables' declaration? I think it may be cleaner to read.
Like this:
Jan 6 2025
Dec 18 2024
Dec 11 2024
Dec 10 2024
Additional comment:
Nov 25 2024
Nov 19 2024
update:
It looks like this patch significantly reduces the number of fragments, i.e. (data_size % MSS) > 0, among TSO data chunks: testD47474
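As a rough illustration of the metric mentioned here (a sketch only, not the script behind testD47474): a TSO data chunk counts as a fragment when its size is not a whole multiple of the MSS, i.e. (data_size % MSS) > 0. The function names, the sample chunk sizes, and the MSS of 1448 are all hypothetical.

```python
def has_sub_mss_tail(data_size, mss):
    """Return True when a chunk ends in a segment shorter than one MSS."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return data_size % mss > 0


def fragmented_share(chunk_sizes, mss):
    """Fraction of chunks whose size is not a whole multiple of the MSS."""
    if not chunk_sizes:
        return 0.0
    fragmented = sum(1 for size in chunk_sizes if has_sub_mss_tail(size, mss))
    return fragmented / len(chunk_sizes)


# Hypothetical chunk sizes in bytes, with an assumed MSS of 1448.
sample_chunks = [2896, 3000, 1448, 100]
share = fragmented_share(sample_chunks, 1448)
```

Comparing this share before and after a patch, on the same traffic, is one way to quantify a reduction of this kind.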
TSO not enabled:
Nov 14 2024
In D47474#1084985, @rscheff wrote:
Well, I had the same thought - the full MSS (including options) is less frequently used than the MSS without options...
So, it would certainly be more efficient to store the MSS excluding options in tcpcb, and calculate the MSS including option space only where that is needed...
But that should be a different Diff IMHO, as this one is more in the break/fix category.
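A minimal sketch of the idea quoted above, assuming a per-connection record that stores the MSS without option space and adds the option bytes back only where the full value is needed. ConnState, payload_mss, and full_mss are hypothetical names, not tcpcb fields, and the 12-byte figure assumes that only the TCP timestamp option (10 bytes, padded to 12) is in use.

```python
from dataclasses import dataclass

# Assumption: only the timestamp option is present (10 bytes padded to 12).
TIMESTAMP_OPTION_LEN = 12


@dataclass
class ConnState:
    # Bytes of data per segment, with option space already excluded.
    payload_mss: int

    def full_mss(self, option_len):
        """MSS including option space, computed only where it is needed."""
        return self.payload_mss + option_len


conn = ConnState(payload_mss=1448)
full = conn.full_mss(TIMESTAMP_OPTION_LEN)
```

With this layout the common case reads payload_mss directly, and only the rarer callers pay for the addition.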
Nov 4 2024
I think you meant the title to be:
tcp: consistently set CWND to MSS => tcp: consistently set CWND to 1
in case of SYN/SYN ACK retransmissions => in case of SYN retransmissions
Oct 28 2024
OK. I am approving it now as my test in https://wiki.freebsd.org/chengcui/testD43470 shows some improvement. Any bug related observations can be fixed later.
Oct 24 2024
Also, please correct the SUMMARY section:
Oct 23 2024
In D30155#1076291, @kbowling wrote:
In D30155#1075233, @cc wrote:
From my test result in testD30155, I didn't see any significant improvement:
- no significant difference in ping latency
- no significant iperf3 performance improvement due to bad performance (3.x Gbps) in FreeBSD 15-current vs. (9.x Gbps) in stock Linux kernel 5.15.
Thanks for the results @cc. Something seems very strange with the throughput there. The main system I am testing is a Xeon D that is much less than 1/4th as powerful and can run at line rate in both directions with no issues, and I also have an older 2x Xeon E5-2695 v2 (two NUMA domains) without throughput limitations. I will see if I can find my emulab credentials and take a look there. It seems like these might be 4-way NUMA machines, but I would not expect that to cause this magnitude of throughput issues, especially at the 10 Gbit data rate.
Oct 22 2024
Oct 21 2024
Oct 17 2024
update code based on discussion
From my test result in testD30155, I didn't see any significant improvement:
Oct 16 2024
Better now. But it can be cleaner.
Add the __inline keyword to avoid overhead when possible.
Oct 15 2024
My current concern is that the definitions and usage of the superset macros TH_FLAGS and TCPF_ALL are inconsistent. For example, TH_ECE is in TH_FLAGS, but TH_ECN is in TCPF_ALL.
In D30155#1073639, @kbowling wrote:
Ok, this is a bit messy code- and comment-wise, but I have the new algorithm working in what I believe to be the correct way, with some bug fixes versus the original, and would like some data to see how to proceed before tidying everything up.
@cc it looks like emulab has ix(4) on d820s nodes, would you be willing to take a look at these 3 options similar to the e1000 test?
- Default in HEAD/STABLE: sysctl dev.ix.<N>.enable_aim=0
- New algorithm (on by default with this patch): sysctl dev.ix.<N>.enable_aim=1
- Old algorithm (FreeBSD <10): sysctl dev.ix.<N>.enable_aim=2
Oct 14 2024
My current concern is that the new code for TH_AE should be in a separate patch, so that this patch can be a large, purely non-functional change.
Oct 11 2024
Need code update.
Because of commit 440f4ba18e3a, please re-base.
Oct 10 2024
By the way, based on my test, I didn't find the statement "In addition, cwnd used to be 1 MSS right after RTO, increasing to 2 MSS more recently." in your SUMMARY section to be true. Also, "Address this by setting up snd_recover just in cc_cong_signal." needs to be revised.
With the provided packetdrill scripts before/after the fix, my test result is in my wiki: testD43355.
Oct 9 2024
I have no problem with this patch after testing it in Emulab. The test result is in my above comment.
If I recall these machines are Pentium 4 era and pretty CPU constrained. You can try the tunable 'hw.em.unsupported_tso=1' and then enable TSO on the interface to get some more bulk bandwidth, they are stable with TSO.
Are you able to detect any improvements or regressions otherwise? ping-pong time at low packet rate between two systems both set with enable_aim=0,1,2 would be interesting.
Oct 2 2024
In D46768#1069027, @kbowling wrote:
In D46768#1069015, @cc wrote:
In D46768#1067607, @kbowling wrote:
@cc this code works well in my testing. There are now some quality of life improvements: at runtime you can now switch in the middle of a test. I run a tmux session with three splits: one of systat -vmstat, one of the benchmark (iperf3 or whatever), and one to toggle sysctl dev.{em,igb}.<interface number>.enable_aim=<N>, where the values of <N> are described below. You can also do something like sysctl dev.igb.0 | grep _rate to see the current queue values.
Existing static 8000 int/s behavior (how the driver is in main):
sysctl dev.igb.0.enable_aim=0
Suggested new default, you will boot in this mode with this patch:
sysctl dev.igb.0.enable_aim=1
Low latency option of above algorithm (up to 70k ints/s):
sysctl dev.igb.0.enable_aim=2
ixl(4) algorithm bodged in that would need to be cleaned up:
sysctl dev.igb.0.enable_aim=3
I would be curious to know what you find with these different options in an array of testing and I will use the results to ready this for actual use.
I didn't find any rate change by the sysctl. Please let me know if the hardware does not support this new change.
root@s1:~ # sysctl dev.em.2.enable_aim=0
dev.em.2.enable_aim: 0 -> 0
root@s1:~ # sysctl dev.em.2 | grep _rate
dev.em.2.queue_rx_0.interrupt_rate: 20032
dev.em.2.queue_tx_0.interrupt_rate: 20032
root@s1:~ # sysctl dev.em.2.enable_aim=1
dev.em.2.enable_aim: 0 -> 1
root@s1:~ # sysctl dev.em.2 | grep _rate
dev.em.2.queue_rx_0.interrupt_rate: 20032
dev.em.2.queue_tx_0.interrupt_rate: 20032
root@s1:~ # sysctl dev.em.2.enable_aim=2
dev.em.2.enable_aim: 1 -> 2
root@s1:~ # sysctl dev.em.2 | grep _rate
dev.em.2.queue_rx_0.interrupt_rate: 20032
dev.em.2.queue_tx_0.interrupt_rate: 20032
root@s1:~ #
This looks to me like it is working. The algorithm is dynamic, and 20k would be the latency-reducing rate for an idle queue. At enable_aim=0, you would see 8000. 20k looks right for an idle queue; what happens if you place a bulk load through it like iperf3? It should drop down to 4k.
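The check described in this exchange can be sketched as a small parser over the `sysctl dev.em.2 | grep _rate` output. This is an illustration only: parse_interrupt_rates is a hypothetical helper, the sample text is copied from the transcript in this thread, and the 8000 int/s figure is the static rate quoted for enable_aim=0.

```python
def parse_interrupt_rates(sysctl_output):
    """Map each '<oid>.interrupt_rate' line to its integer value."""
    rates = {}
    for line in sysctl_output.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().endswith(".interrupt_rate"):
            rates[name.strip()] = int(value.strip())
    return rates


# Sample lines copied from the transcript in this thread.
sample = """dev.em.2.queue_rx_0.interrupt_rate: 20032
dev.em.2.queue_tx_0.interrupt_rate: 20032
"""
rates = parse_interrupt_rates(sample)

# Per the discussion, enable_aim=0 should report the static 8000 int/s,
# so any other value suggests the dynamic algorithm is active.
STATIC_RATE = 8000
dynamic = all(rate != STATIC_RATE for rate in rates.values())
```

Sampling the same values again while an iperf3 bulk transfer is running would show whether the rate drops toward the 4k mentioned in the reply.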
In D46824#1068983, @tuexen wrote:
In D46824#1068981, @jhb wrote:
I can fix the type mismatch during commit. I have not looked to see if other stacks are affected.
Fixing the type mismatch would be good. I think other stacks are not affected, since I think they do not send a FIN before any outstanding data is ACKed and nothing is buffered anymore.
Oct 1 2024
I think this change also applies to the bbr and rack stacks.
Looks good to me. Thanks for removing the goto label skip_alloc, which improves readability.
Sep 27 2024
Does the summary section need to be updated? I didn't find the mentioned leaking part in code. Or am I missing something?
In D46768#1067199, @kbowling wrote:
Rebase on main and some small improvements and bug fixes. Upon more testing, the reimported algorithm is tuned for igb and less governed than intended on lem/em due to a different unit of measure on the ITR register. Need to think a little on how I would like to handle that.
Sep 24 2024
Thanks for adding me as one of the reviewers. I will look at this patch and will most likely test it on one of the machines in Emulab.
Sep 17 2024
re-base