gallatin (Andrew Gallatin)
User

Projects

User Details

User Since
Jun 22 2015, 5:21 PM (461 w, 2 d)

Recent Activity

Thu, Apr 18

gallatin updated the diff for D39150: hyperv: Fix compilation with larger page sizes.

I just tripped over this again when trying to use some of the 16K changes I have in my Netflix tree on a personal machine running a GENERIC kernel, so let's try this again in a different way.

Thu, Apr 18, 12:07 AM

Mon, Apr 15

gallatin accepted D44800: tcp bbr: improve code consistency.
Mon, Apr 15, 8:13 PM

Fri, Apr 5

gallatin accepted D43504: netinet: add a probe point for IP stats counters.

Thank you for adding that option.

Fri, Apr 5, 5:50 PM

Wed, Apr 3

gallatin added a comment to D43504: netinet: add a probe point for IP stats counters.

Below are the results from my testing. I'm sorry that it took so long. I had to redo testing from the start because the new machine was not exactly identical to the old one (different BIOS revision) and was giving slightly different results.
The results are from 92Gb/s of traffic over a one-hour period with 45-47K TCP connections established.

No SDT probes: 56.4%
normal SDT 57.5%
new IP SDTs 57.9%
new IP SDT+ 56.6%
zero-cost

Just to be clear, "SDT+" is with the patch I supplied to provide new asm goto-based SDT probes? I'm not sure what the "zero-cost" line means.

This is just measuring CPU usage as reported by the scheduler?

I made some progress on the hot-patching implementation last week. I hope to have it ready fairly soon.

Wed, Apr 3, 7:05 PM
gallatin added a comment to D43504: netinet: add a probe point for IP stats counters.
Wed, Apr 3, 6:47 PM
gallatin added a comment to D43504: netinet: add a probe point for IP stats counters.
Wed, Apr 3, 5:53 PM

Fri, Mar 29

gallatin added a comment to D43504: netinet: add a probe point for IP stats counters.

OK, starting with an unpatched kernel and working my way through the patches. I'll report percent busy for the unpatched kernel and the various patches on our original 100G server (based around a Xeon E5-2697A v4, which tends to be a poster child for cache misses, as it runs very close to the limits of its memory bandwidth). I'll be disabling powerd and using the TCP RACK stack's DGP pacing.

This will take several days, as it takes a while to load up a server, get a few hours of steady state, and unload it gently.

Fri, Mar 29, 4:18 PM

Tue, Mar 26

gallatin accepted D44420: Optimize HPTS so that little work is done until we have a hpts thread that is over the connection threshold.
Tue, Mar 26, 7:15 PM

Mar 25 2024

gallatin added a comment to D43504: netinet: add a probe point for IP stats counters.
Mar 25 2024, 3:43 PM

Mar 23 2024

gallatin added a comment to D43504: netinet: add a probe point for IP stats counters.

Regarding SDT hotpatching, the implementation[1] was written a long time ago, before we had "asm goto" in LLVM. It required a custom toolchain program[2].

Since then, "asm goto" support appeared in LLVM. It makes for a much simpler implementation. I hacked up part of it and posted a patch[3]. In particular, the patch makes use of asm goto to remove the branch and data access. (The probe site is moved to the end of the function in an unreachable block.) The actual hot-patching part isn't implemented and will take some more work, but this is enough to do some benchmarking to verify that the overhead really is minimal. @gallatin would you be able to verify this?

I would also appreciate any comments on the approach taken in the patch, keeping in mind that the MD bits are not yet implemented.

[1] https://people.freebsd.org/~markj/patches/sdt-zerocost/
[2] https://github.com/markjdb/sdtpatch
[3] https://reviews.freebsd.org/D44483
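
The difference between the two probe styles discussed above can be sketched in plain C. This is only an illustration under stated assumptions, not the code from the patch: every name here (`sdt_probes_enabled`, `probe_fire`, `input_classic`, `input_asm_goto`) is hypothetical, and the hot-patching step is left out, so the `nop` is never rewritten into a jump and the cold block never runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from the patch. */
static bool sdt_probes_enabled;   /* global flag tested by the classic scheme */
static unsigned long probe_hits;  /* accumulates the argument of each fired probe */

static void
probe_fire(int len)
{
    probe_hits += (unsigned long)len;
}

/* Classic probe site: every call loads and tests a global variable. */
static int
input_classic(int len)
{
    assert(len >= 0);
    if (__builtin_expect(sdt_probes_enabled, 0))
        probe_fire(len);
    return (len);
}

/*
 * asm goto probe site: the hot path contains a single nop and no load or
 * branch. The probe body sits in a block that is only reachable through
 * the "fire" label. A real implementation would patch the nop into a jump
 * to that label when the probe is enabled; this sketch never does.
 */
static int
input_asm_goto(int len)
{
    assert(len >= 0);
    __asm__ goto ("nop" : : : : fire);
    return (len);
fire:
    probe_fire(len);
    return (len);
}
```

Calling `input_asm_goto()` here always takes the fall-through path, which is the point of the benchmark request: it measures the cost of the disabled probe site alone.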

Mar 23 2024, 9:26 PM

Mar 21 2024

gallatin added a comment to D43504: netinet: add a probe point for IP stats counters.

I agree that introducing these probes should not cause a performance hit. Since they exist in Solaris, I was assuming that there is no substantial performance hit. I would really like to see zero-cost probes going into the tree.

Mar 21 2024, 9:22 PM
gallatin added a comment to D43504: netinet: add a probe point for IP stats counters.

Guys, this is crazy. Every SDT probe does a test on a global variable. If this lands, it will cause a noticeable performance impact, especially in high packet rate workloads. Can we shelve this until / unless SDT is modified to insert nops rather than do tests on a global variable? Or put this under its own options EXTRA_IP_PROBES or something?
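
A rough sketch of the second suggestion, compiling the extra probes in only under a kernel option, might look like the following. The option name EXTRA_IP_PROBES comes from the comment; the macro, counter, and function names are hypothetical, and the probe body is reduced to a counter so the example stays in userspace.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; only EXTRA_IP_PROBES appears in the discussion. */
static bool probes_enabled;        /* the global each enabled-build probe tests */
static unsigned long probe_fired;  /* stand-in for the probe body */
static unsigned long ips_total;    /* stand-in for an IP stats counter */

#ifdef EXTRA_IP_PROBES
/* Option set: each counter update also pays for a load and a test. */
#define IPSTAT_PROBE(n) do {                          \
    if (__builtin_expect(probes_enabled, 0))          \
        probe_fired += (n);                           \
} while (0)
#else
/* Option not set: the probe site compiles away entirely. */
#define IPSTAT_PROBE(n) do { (void)(n); } while (0)
#endif

static void
ipstat_add(unsigned long n)
{
    assert(n > 0);
    ips_total += n;
    IPSTAT_PROBE(n);
}
```

Built without `-DEXTRA_IP_PROBES`, `ipstat_add()` touches only the counter, so high packet rate workloads do not pay for the global test.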

Mar 21 2024, 5:56 PM
gallatin requested changes to D43504: netinet: add a probe point for IP stats counters.
Mar 21 2024, 5:46 PM
gallatin added inline comments to D44420: Optimize HPTS so that little work is done until we have a hpts thread that is over the connection threshold.
Mar 21 2024, 12:54 PM
gallatin added inline comments to D44420: Optimize HPTS so that little work is done until we have a hpts thread that is over the connection threshold.
Mar 21 2024, 12:47 PM

Mar 20 2024

gallatin committed rG530c2c30b0c7: ip6_output: Reduce cache misses on pktopts (authored by gallatin).
ip6_output: Reduce cache misses on pktopts
Mar 20 2024, 7:51 PM
gallatin closed D44204: ip6_output: Reduce cache misses on pktopts.
Mar 20 2024, 7:51 PM
gallatin added a comment to D44204: ip6_output: Reduce cache misses on pktopts.

Generally looks good to me.

Mar 20 2024, 7:45 PM
gallatin accepted D44420: Optimize HPTS so that little work is done until we have a hpts thread that is over the connection threshold.

This tests well. On my test system hw.model: Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.60GHz, I see a 50% reduction in syscall latency for the lmbench lat_syscall test (0.89us -> 0.44us) when the system is idle.

Mar 20 2024, 2:16 PM

Mar 19 2024

gallatin added a comment to D44420: Optimize HPTS so that little work is done until we have a hpts thread that is over the connection threshold.

Do you think you could also add a 'bool __read_mostly hpts_userrret_hook = true;' controlled by a sysctl to avoid calling into hpts at all from userret for folks that want to disable this entirely?
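
As a userspace illustration of that request, the knob could gate the call like this. It is a sketch only: `hpts_hook_enabled`, `hpts_work`, and `userret_sketch` are hypothetical names, and in the kernel the flag would presumably be marked `__read_mostly` and exported through a sysctl rather than set directly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the sysctl-controlled knob. */
static bool hpts_hook_enabled = true;
static unsigned int hpts_calls;    /* counts entries into the hpts stand-in */

static void
hpts_work(void)
{
    hpts_calls++;
}

/* Stand-in for the return-to-userspace path. */
static void
userret_sketch(void)
{
    /* With the knob off, the return path never enters hpts. */
    if (!hpts_hook_enabled)
        return;
    hpts_work();
}
```

The cost with the knob off is a single test of a rarely written variable, which is why the comment asks for it to be read-mostly.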

Mar 19 2024, 2:40 PM

Mar 5 2024

gallatin added a comment to D44204: ip6_output: Reduce cache misses on pktopts.

LGTM, q - would it be possible to introduce ‘ip6po_<set|clear>_<field>’ inline functions and use them so we don’t accidentally miss setting/clearing up the relevant bit?
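
One way to read this suggestion, sketched with made-up names: keep a summary bitmask next to the option pointers and only ever touch both through paired helpers, so a pointer and its bit cannot drift apart. The struct layout, flag values, and function names below are assumptions for illustration, not the fields in the actual diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits, one per optional field. */
#define OPT_VALID_PKTINFO  0x01u
#define OPT_VALID_NEXTHOP  0x02u

struct pktopts_sketch {
    uint32_t    valid;      /* summary bits, checked on the hot path */
    const void *pktinfo;    /* optional field, NULL when unset */
    const void *nexthop;    /* optional field, NULL when unset */
};

static inline void
opts_set_pktinfo(struct pktopts_sketch *o, const void *pi)
{
    assert(pi != NULL);
    o->pktinfo = pi;
    o->valid |= OPT_VALID_PKTINFO;
}

static inline void
opts_clear_pktinfo(struct pktopts_sketch *o)
{
    o->pktinfo = NULL;
    o->valid &= ~OPT_VALID_PKTINFO;
}

static inline void
opts_set_nexthop(struct pktopts_sketch *o, const void *nh)
{
    assert(nh != NULL);
    o->nexthop = nh;
    o->valid |= OPT_VALID_NEXTHOP;
}

static inline void
opts_clear_nexthop(struct pktopts_sketch *o)
{
    o->nexthop = NULL;
    o->valid &= ~OPT_VALID_NEXTHOP;
}

/* Hot-path check: one word tells the caller whether any option is set. */
static inline int
opts_any_valid(const struct pktopts_sketch *o)
{
    return (o->valid != 0);
}
```

With helpers like these, a caller that sets or frees a field cannot forget the matching bit, which is the failure mode the comment is worried about.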

Mar 5 2024, 10:22 PM
gallatin added a comment to D44204: ip6_output: Reduce cache misses on pktopts.
In D44204#1008766, @ae wrote:

Probably you can simplify some similar checks in in6_src.c too, e.g. IP6PO_VALID_PKTINFO and IP6PO_VALID_NHINFO. Not sure how it impacts your cache misses measurements.

Mar 5 2024, 10:20 PM
gallatin updated the diff for D44204: ip6_output: Reduce cache misses on pktopts.

Added a comment explaining why the flags exist, as suggested by @glebius

Mar 5 2024, 10:18 PM
gallatin added a comment to D44204: ip6_output: Reduce cache misses on pktopts.
In D44204#1008527, @bz wrote:

Initially I thought we should name some better but the original structs have the same names so all good.
I have not checked if you got all the places but it looks good.

Mar 5 2024, 1:03 AM
gallatin updated the diff for D44204: ip6_output: Reduce cache misses on pktopts.
  • added a blank line before ip6po_m, as requested by @bz
Mar 5 2024, 12:58 AM

Mar 4 2024

gallatin requested review of D44204: ip6_output: Reduce cache misses on pktopts.
Mar 4 2024, 10:19 PM

Feb 21 2024

gallatin accepted D44016: kboot: Implement write support for hostdisk.
Feb 21 2024, 11:18 PM
gallatin added inline comments to D41326: iflib: Add sysctl to request extra MSIX vectors on driver load.
Feb 21 2024, 6:40 PM
gallatin added inline comments to D41326: iflib: Add sysctl to request extra MSIX vectors on driver load.
Feb 21 2024, 6:29 PM

Feb 9 2024

gallatin created P630 imac usb issue.
Feb 9 2024, 4:50 PM

Jan 23 2024

gallatin added reviewers for D43504: netinet: add a probe point for IP stats counters: markj, christos.
Jan 23 2024, 6:47 PM
gallatin added a comment to D43504: netinet: add a probe point for IP stats counters.
In D43504#991958, @kp wrote:
Jan 23 2024, 6:46 PM
gallatin added a reviewer for D43504: netinet: add a probe point for IP stats counters: olivier.
Jan 23 2024, 6:41 PM

Jan 12 2024

gallatin added a comment to D43400: ktls: fix vnet-related panic in ktls_reset_receive_tag().
In D43400#989566, @jhb wrote:

For future reference, uploading diffs with context (e.g. with git-arc) makes reviewing easier. I don't think we need the vnet around if_rele() (in case it calls if_free), so I think this is correct.

Jan 12 2024, 5:05 PM

Jan 11 2024

gallatin accepted D43400: ktls: fix vnet-related panic in ktls_reset_receive_tag().
Jan 11 2024, 3:00 PM
gallatin created P625 zfs panic.
Jan 11 2024, 2:55 PM
gallatin created P624 zfs panics.
Jan 11 2024, 2:52 PM

Jan 10 2024

gallatin added a comment to D43385: apei: panic on uncorrectable memory errors.
In D43385#989059, @mav wrote:

There is already a panic in apei_ge_handler(), based on total status severity. Do you see it not enough?

Jan 10 2024, 3:33 PM

Jan 9 2024

gallatin updated the diff for D43385: apei: panic on uncorrectable memory errors.

Removed hunk that was Netflix specific

Jan 9 2024, 9:42 PM
gallatin requested review of D43385: apei: panic on uncorrectable memory errors.
Jan 9 2024, 9:38 PM
gallatin committed rG5cd08d9ecf52: apei: Mark ReadAckRegister resource as shareable (authored by gallatin).
apei: Mark ReadAckRegister resource as shareable
Jan 9 2024, 9:10 PM

Jan 3 2024

gallatin added a comment to D43166: tcp: bypass TSO when CWR bit is to be sent.

I pinged Nvidia/Mellanox last week, and I'm still waiting to hear back to see if they can support AccECN in their NICs.

Jan 3 2024, 8:31 PM

Dec 26 2023

gallatin added a comment to D43166: tcp: bypass TSO when CWR bit is to be sent.

OK, I'm sorry, I was not aware of AccECN and its desired behavior of setting CWR on all segments.

Dec 26 2023, 7:16 PM
gallatin added a comment to D43166: tcp: bypass TSO when CWR bit is to be sent.

Properly handling CWR is part of the NDIS spec... though the spec is broken, and says that "If the CWR bit in the TCP header of the large TCP packet is set, the miniport driver must set this bit in the TCP header of the first packet that it creates from the large TCP packet. The miniport driver may choose to set this bit in the TCP header of the last packet that it creates from the large TCP packet, although this is less desirable." [https://learn.microsoft.com/en-us/windows-hardware/drivers/network/offloading-the-segmentation-of-large-tcp-packets]

Dec 26 2023, 1:37 AM
gallatin requested changes to D43166: tcp: bypass TSO when CWR bit is to be sent.

I'd prefer that you add a feature flag so that NICs which do properly support CWR are able to use TSO and avoid being pessimized by this.
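
The decision being asked for could be sketched as below. The capability bits and the function name are hypothetical, invented for this example; only the CWR bit value (0x80 in the TCP flags byte) is standard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TCP_FLAG_CWR      0x80u   /* CWR bit in the TCP header flags byte */

/* Hypothetical NIC capability bits, for illustration only. */
#define NIC_CAP_TSO       0x1u    /* NIC can segment large TCP packets */
#define NIC_CAP_TSO_CWR   0x2u    /* NIC handles CWR correctly when segmenting */

/*
 * Decide whether a send may use TSO. Segments without CWR are always
 * eligible on a TSO-capable NIC; segments with CWR are eligible only if
 * the driver advertised that its hardware handles the bit properly.
 */
static bool
tso_allowed(uint8_t th_flags, uint32_t nic_caps)
{
    if ((nic_caps & NIC_CAP_TSO) == 0)
        return (false);
    if ((th_flags & TCP_FLAG_CWR) == 0)
        return (true);
    return ((nic_caps & NIC_CAP_TSO_CWR) != 0);
}
```

Under this scheme only NICs without the extra capability fall back to software segmentation when CWR is set.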

Dec 26 2023, 1:17 AM

Dec 11 2023

gallatin added a comment to D42988: inet6: Use IfAPI helper in in6_ifstat_inc.

The if_afdata[] array comes from the old BSD times when it was expected that there would be support for many, many address families (e.g. IPX, AppleTalk, etc). Right now it has only two entries: AF_INET and AF_INET6. It is very, very unlikely that it will ever get a third one. It is much more likely that the array will go away and we will have just ifp->if_inet and ifp->if_inet6. Or maybe something more complicated. Either way, the access to this data is going to change, so there is no point in overdesigning it right now. Any solution for the sake of IfAPI cleanness is acceptable.

I'll add a couple more people to confirm or dispute my statement.

Dec 11 2023, 3:55 PM

Nov 16 2023

gallatin committed rG5972ffde919a: ig4(4): Add an EMAG device type (authored by gallatin).
ig4(4): Add an EMAG device type
Nov 16 2023, 12:54 AM
gallatin closed D28746: ig4(4): Add an EMAG device type.
Nov 16 2023, 12:53 AM

Nov 15 2023

gallatin updated the diff for D28746: ig4(4): Add an EMAG device type.
  • Tried to address @wulf 's feedback
Nov 15 2023, 10:41 PM
gallatin commandeered D28746: ig4(4): Add an EMAG device type.
Nov 15 2023, 10:39 PM
gallatin closed D28741: acpi: Add workaround for Altra I2C memory resource.
Nov 15 2023, 9:28 PM
gallatin committed rG6f38d2e7b059: acpi: Add workaround for Altra I2C memory resource (authored by gallatin).
acpi: Add workaround for Altra I2C memory resource
Nov 15 2023, 9:28 PM
gallatin committed rGba0e4d7971e0: smbios: handle smbios3 for arm64 (authored by gallatin).
smbios: handle smbios3 for arm64
Nov 15 2023, 4:50 PM
gallatin closed D42592: smbios: handle smbios3 for arm64.
Nov 15 2023, 4:50 PM
gallatin updated the diff for D42592: smbios: handle smbios3 for arm64.
Nov 15 2023, 1:12 AM
gallatin updated the diff for D42592: smbios: handle smbios3 for arm64.

Test signature as suggested by @imp

Nov 15 2023, 1:11 AM

Nov 14 2023

gallatin added inline comments to D42592: smbios: handle smbios3 for arm64.
Nov 14 2023, 8:18 PM
gallatin added inline comments to D42592: smbios: handle smbios3 for arm64.
Nov 14 2023, 3:01 PM
gallatin added inline comments to D42592: smbios: handle smbios3 for arm64.
Nov 14 2023, 2:42 PM
gallatin committed rGab063ac4444e: ipmi_ssif: Fix typo in debug print (authored by gallatin).
ipmi_ssif: Fix typo in debug print
Nov 14 2023, 12:47 AM
gallatin requested review of D42592: smbios: handle smbios3 for arm64.
Nov 14 2023, 12:40 AM

Nov 11 2023

gallatin committed rGb2921fdc2330: arm64: Implement bus_get_resource and bus_delete_resource. (authored by gallatin).
arm64: Implement bus_get_resource and bus_delete_resource.
Nov 11 2023, 6:00 PM
gallatin accepted D42550: busdma: On systems that use subr_busdma_bounce, measure deferred time.
Nov 11 2023, 4:27 PM

Nov 9 2023

gallatin added a comment to D28737: stand: Support (and prefer) the SMBIOS 64-bit Entry Point Structure.

It seems like this was obsoleted by:

Nov 9 2023, 8:45 PM

Oct 26 2023

gallatin added a comment to D42368: mlx5: fix deadlock in mlx5_fs_tree.c.

Wouldn't it be better to toggle the IFCAP2_BIT(IFCAP2_RXTLS4) and IFCAP2_BIT(IFCAP2_RXTLS6) than to add the check in mlx5e_tls_rx_snd_tag_alloc()?

Oct 26 2023, 2:49 PM

Oct 12 2023

gallatin closed D42158: acpi_ged: Handle events directly.
Oct 12 2023, 3:32 PM
gallatin committed rGbe91b4797e2c: acpi_ged: Handle events directly (authored by gallatin).
acpi_ged: Handle events directly
Oct 12 2023, 3:32 PM
gallatin abandoned D42141: defer acpi_ged() interrupt setup to late in boot on some platforms.

I'm abandoning this in favor of handling events directly (https://reviews.freebsd.org/D42158)

Oct 12 2023, 3:14 PM

Oct 11 2023

gallatin added a comment to D42158: acpi_ged: Handle events directly.
In D42158#961862, @imp wrote:

I don't suppose that there's a way to know if the GPE handler sleeps so we can warn / avoid it?

Oct 11 2023, 5:16 PM
gallatin added a comment to D42158: acpi_ged: Handle events directly.

Would it be useful to add a tunable to revert to the current behaviour if we do find a machine that can't run the ACPI method from an ithread?

Oct 11 2023, 5:16 PM
gallatin updated the diff for D42158: acpi_ged: Handle events directly.

Update to add a tunable to run ged events in a deferred context, as suggested by @andrew

Oct 11 2023, 5:15 PM
gallatin added a comment to D42141: defer acpi_ged() interrupt setup to late in boot on some platforms.
In D42141#961666, @jhb wrote:

The duplicate events would be fixed by my other suggestion of using a dedicated struct task instead of calling AcpiOsExecute which allocates and schedules a new struct task each time.

My worry is that GED is too general a thing. It means "go run some random firmware-provided bit of AML that can do God knows what" when an interrupt occurs.

After digging in the spec for a bit, the description there for GED (5.6.9) is not very clear. One general requirement for interrupt event handling by the OS (OSPM) is to leave the interrupt source disabled (e.g. the GIC pin masked) until the ACPI control method has been executed (section 5.6.4, which talks about Generic Event Handling). The current acpi_ged driver definitely doesn't do that, and our interrupt model doesn't have a good way to cope with that (we re-enable the GIC pin after the ithread handler completes). We might just have to run the method synchronously and hope for the best. The spec doesn't mandate that these handlers are safe, but it does suggest that they should invoke Notify() from the AML for non-trivial event reporting, so it may be that these are safe. The _EVT handler is required by the spec to clear the interrupt so it doesn't keep firing.

Oct 11 2023, 3:11 PM
gallatin requested review of D42158: acpi_ged: Handle events directly.
Oct 11 2023, 3:10 PM

Oct 10 2023

gallatin added a comment to D42141: defer acpi_ged() interrupt setup to late in boot on some platforms.

Do we know what types of GED events might sleep? I'm not sure this will work when there's only a single CPU as it will be identical to the pre-AP startup case.

Oct 10 2023, 7:24 PM

Oct 9 2023

gallatin requested review of D42141: defer acpi_ged() interrupt setup to late in boot on some platforms.
Oct 9 2023, 9:45 PM

Oct 5 2023

gallatin accepted D42051: nvme: Eliminate RECOVERY_FAILED state.
Oct 5 2023, 11:45 AM
gallatin accepted D42065: nvme: Close a race in destroying qpair and timeouts.
Oct 5 2023, 11:43 AM
gallatin accepted D42049: nvme: Really remove NVME_2X_RESET.
Oct 5 2023, 11:39 AM
gallatin accepted D42048: nvme: gc nvme_ctrlr_post_failed_request and related task stuff.
Oct 5 2023, 11:39 AM

Sep 13 2023

gallatin accepted D41845: newvers.sh: Avoid picking up stray envars..

Thank you!

Sep 13 2023, 1:42 PM

Sep 6 2023

gallatin accepted D41749: Record completed SYSINITs.
Sep 6 2023, 4:46 AM
gallatin accepted D41748: init_main: Switch from SLIST to STAILQ, fix order.

Thank you; this fixes the weird performance problem that we've yet to fully root cause.

Sep 6 2023, 4:46 AM

Aug 22 2023

gallatin accepted D41555: gicv3: Support indirect ITS tables.
Aug 22 2023, 5:16 PM
gallatin accepted D41554: gicv3: Add checks for the device ID.
Aug 22 2023, 5:09 PM
gallatin accepted D41553: gicv3: Add a verbose message for unknown tables.
Aug 22 2023, 5:06 PM
gallatin accepted D41552: gicv3: Stop setting the esize field.
Aug 22 2023, 5:05 PM
gallatin accepted D41551: gicv3: Split out finding the page size.
Aug 22 2023, 5:05 PM

Aug 21 2023

gallatin accepted D41536: libcrypto: Don't embed $FreeBSD$ in generated assembly files.
Aug 21 2023, 11:20 PM
gallatin accepted D41538: libcrypto: Generate new files added in OpenSSL 3.0..
Aug 21 2023, 11:20 PM
gallatin accepted D41537: libcrypto: Add new assembly files added in OpenSSL 3.0..
Aug 21 2023, 11:20 PM
gallatin accepted D41539: libcrypto: Update assembly build glue for x86 for OpenSSL 3.0..
Aug 21 2023, 11:19 PM

Aug 17 2023

gallatin requested changes to D41480: sys/net: only panic on unset tx_csum_flags if cap is disabled.

I think I agree with Eric and Kevin after looking at this further, and withdraw my approval

Aug 17 2023, 12:01 AM

Aug 16 2023

gallatin accepted D41480: sys/net: only panic on unset tx_csum_flags if cap is disabled.
Aug 16 2023, 12:06 PM

Aug 14 2023

gallatin accepted D41452: nvme: Add exclusion for ISR.
Aug 14 2023, 8:44 PM
gallatin added a comment to D41452: nvme: Add exclusion for ISR.

Don't you need a mtx_init()? Or did I just miss it?

Aug 14 2023, 7:34 PM

Aug 13 2023

gallatin added a comment to D36924: nvme: provide mutual exclusion for interrupt handler.

Again, you've hand-rolled what is essentially a mutex_trylock(). Why not just use a mutex?

Aug 13 2023, 3:16 PM

Jul 17 2023

gallatin accepted D41059: tcp: improve layout of struct tcpcb.
Jul 17 2023, 10:53 PM
gallatin added inline comments to D41059: tcp: improve layout of struct tcpcb.
Jul 17 2023, 7:52 PM

Jun 16 2023

gallatin added a comment to D40580: ossl: Don't try to initialize the cipher for Chacha20+Poly1305..

This allows the ktls tests for chacha poly to pass when the ossl driver is attached.

Jun 16 2023, 6:16 PM

Jun 1 2023

gallatin accepted D40368: netlink: use netlink mbufs in the mbuf chains..
Jun 1 2023, 12:35 PM