Page MenuHomeFreeBSD

gallatin (Andrew Gallatin)
User

Projects

User Details

User Since
Jun 22 2015, 5:21 PM (570 w, 6 d)

Recent Activity

Fri, May 29

gallatin accepted D57316: src.opts.mk: enable OPENSSL_KTLS by default on riscv64.
Fri, May 29, 1:40 PM
gallatin committed rG3118f1b99f23: bnxt: Fix build / load error for bnxt(4) in kernels without PCI_IOV (authored by gallatin).
bnxt: Fix build / load error for bnxt(4) in kernels without PCI_IOV
Fri, May 29, 1:06 PM
gallatin closed D57300: bnxt: Fix build / load error for bnxt(4) in kernels without PCI_IOV.
Fri, May 29, 1:06 PM

Thu, May 28

gallatin requested review of D57300: bnxt: Fix build / load error for bnxt(4) in kernels without PCI_IOV.
Thu, May 28, 3:43 PM

Mon, May 25

gallatin committed rGb97ee5e9ce7b: hwpmc: Avoid panic on AMD cpus where IBS is not available (authored by gallatin).
hwpmc: Avoid panic on AMD cpus where IBS is not available
Mon, May 25, 8:35 PM

Fri, May 22

gallatin committed rGfd9af1e7084c: sendfile: Fix bug when using headers with SW KTLS offload (authored by gallatin).
sendfile: Fix bug when using headers with SW KTLS offload
Fri, May 22, 6:34 PM
gallatin closed D57134: sendfile: Fix bug when using headers with SW KTLS offload.
Fri, May 22, 6:34 PM

Thu, May 21

gallatin added a comment to D55203: svc_vc.c: Add support for an xp_extpg boolean.

For this particular patch, I'd suggest the following.

The _svc_vc_checkextpg() should be made a generic function living somewhere in sys/net/route. It should accept socket address and return a referenced struct nhop_object *. It should call fib[46]_lookup() with NHR_REF flag. It also should be passed a u_int * argument that it shall fill with current route table generation number obtained via rt_tables_get_gen(). Alternatively the generic function may assert the epoch, and then it would be the caller's choice to pull the route generation number together with the lookup or not.

The svc_vc code should call this generic function at setup and store returned struct nhop_object * and rtgen number in its xprt.

On packet generation it should first do rt_tables_get_gen() to validate that routing table did not change. If it did, then dereference the stored nhop and call the generic function again to get a new one. After that it can consult xprt->nh->nh_ifp->if_capenable to decide if to use extended mbufs. AFAIK, we are always in the net epoch at the packet generation.

The above is not 100% beautiful as the rt_* and nhop_* and fib_* KPIs are slightly decoupled, despite they describe the same backend. However, the above is exactly what the generic ip_output() does today.

P.S. Alexander had left a promising comment for bright future in nhop.h:

* TODO: subscribe for the interface notifications and update the nexthops
*  with NHF_INVALID flag.

So some day this is going to be improved.

Thu, May 21, 5:11 PM
gallatin added inline comments to D57134: sendfile: Fix bug when using headers with SW KTLS offload.
Thu, May 21, 4:14 PM
gallatin added inline comments to D57134: sendfile: Fix bug when using headers with SW KTLS offload.
Thu, May 21, 3:57 PM
gallatin added a comment to D55203: svc_vc.c: Add support for an xp_extpg boolean.
In D55203#1309629, @kib wrote:

I think it could be set to 'enabled' for machines with DMAP, i.e. amd64/arm64, and might be risc-v, if anybody ever uses nfs server on it. For other arches, the knob almost surely should be kept disabled.
mb_unmapped_to_ext() uses non-privately mapped sfbufs for all extents of mbufs. On DMAP systems, it is free. On other arches, allocating such sfbuf causes global IPI, and the whole chain of sfbufs is freed only after the mbufs are released by the network card. Besides the cost of allocating , this would make sfbufs scarce resource for other consumers and even for nfs server itself. There are around ~1K of sfbufs, and mb_unmapped_to_ext() seems to drop packet if an sfbuf cannot be allocated immediately.

Thu, May 21, 3:48 PM
gallatin added a reverting change for rG2fe37927d419: loopback: Clear hash unconditionally.: rGd7cde43f95bf: Revert "loopback: Clear hash unconditionally.".
Thu, May 21, 1:39 PM
gallatin committed rGd7cde43f95bf: Revert "loopback: Clear hash unconditionally." (authored by gallatin).
Revert "loopback: Clear hash unconditionally."
Thu, May 21, 1:38 PM
gallatin added a reverting change for D53090: loopback: Clear hash unconditionally.: rGd7cde43f95bf: Revert "loopback: Clear hash unconditionally.".
Thu, May 21, 1:38 PM
gallatin added a comment to D55203: svc_vc.c: Add support for an xp_extpg boolean.

I don't think this is a right fix. A route lookup now doesn't guarantee the same interface will be used in the future. There is dynamic routing, weighted routing, policy routing, etc etc etc.

There should be some generic gate that would convert mbufs otherwise, we will need to add a code like this every module that generates mbufs. Again, a code that is not correct when routing isn't static.

As Rick said in the description, this is just a hint, and the conversion routine in ip_output() will handle any case where the egress NIC changes. However, just using extpg mbufs all the time seems like a better solution. That's what SW ktls offload does. mb_unmapped_to_ext() is not super expensive, and most high-ish performance NICs are aware of extpgs these days (iflib + mlx5 + cxgbe covers most 10Gb or more NICs in practice).

Kostik preferred not enabling it all the time.
See thtis:
https://lists.freebsd.org/archives/freebsd-net/2026-May/008766.html

The case of concern is some corner case, where
the "always enabled" causes a regression. I asked on freebsd-net@
to try and determine if such a case exists?

Kostik didn't indicate if he thought such a case exists, but was
concerned that the user wouldn't understand why the regression
happened (and the NFS patch includes a sysctl that turns it off).
--> Right now, it is never enabled, so not enabling it does not

 introduce a regression.
Enabling it unconditionally makes a regression more likely
than only enabling it when this hint thinks the NIC can do it
and that hint is incorrect.

I doubt NFS servers have frequent routing changes, but I do not
know that for certain?

You guys can debate it. If there is no consensus, I'll just leave
it disabled as it is now.

Thu, May 21, 1:27 PM

Wed, May 20

gallatin added a comment to D55203: svc_vc.c: Add support for an xp_extpg boolean.

I don't think this is a right fix. A route lookup now doesn't guarantee the same interface will be used in the future. There is dynamic routing, weighted routing, policy routing, etc etc etc.

There should be some generic gate that would convert mbufs otherwise, we will need to add a code like this every module that generates mbufs. Again, a code that is not correct when routing isn't static.

Wed, May 20, 8:20 PM
gallatin requested review of D57134: sendfile: Fix bug when using headers with SW KTLS offload.
Wed, May 20, 6:59 PM

Tue, May 19

gallatin accepted D57083: if_media.h: Add 800GBase-X and 200Gbit/s per lane support.
Tue, May 19, 3:18 PM

Sat, May 16

gallatin updated the summary of D57026: lacp: simplify lacp_compose_key().
Sat, May 16, 2:01 AM
gallatin requested review of D57026: lacp: simplify lacp_compose_key().
Sat, May 16, 2:00 AM

Tue, May 12

gallatin accepted D56972: sys/time: add bintime2us() helper.

LGTM

Tue, May 12, 5:53 PM
gallatin added a reviewer for D56972: sys/time: add bintime2us() helper: imp.
Tue, May 12, 5:52 PM
gallatin added a reverting change for rGce33f96fcf2f: mlx5e: Ensure rx timestamps are monotonically increasing: rG23b263dfbf5e: Revert "mlx5e: Ensure rx timestamps are monotonically increasing".
Tue, May 12, 1:03 PM
gallatin added a reverting change for D56427: mlx5e: Ensure rx timestamps are monotonically increasing: rG23b263dfbf5e: Revert "mlx5e: Ensure rx timestamps are monotonically increasing".
Tue, May 12, 1:03 PM
gallatin committed rG23b263dfbf5e: Revert "mlx5e: Ensure rx timestamps are monotonically increasing" (authored by gallatin).
Revert "mlx5e: Ensure rx timestamps are monotonically increasing"
Tue, May 12, 1:03 PM
gallatin closed D56579: lacp: fix link state with multiple aggregators.
Tue, May 12, 1:02 PM
gallatin committed rG9f69446d4548: lacp: fix link state with multiple aggregators (authored by gallatin).
lacp: fix link state with multiple aggregators
Tue, May 12, 1:02 PM

Apr 29 2026

gallatin committed rG72e2ae505c4a: tcp: release nic ktls send tags when entering time wait (authored by gallatin).
tcp: release nic ktls send tags when entering time wait
Apr 29 2026, 11:37 PM
gallatin closed D56610: tcp: release nic ktls send tags before time wait.
Apr 29 2026, 11:36 PM

Apr 25 2026

gallatin updated the diff for D56610: tcp: release nic ktls send tags before time wait.

Address review feedback by moving this into tcp_twstart()

Apr 25 2026, 12:45 AM

Apr 24 2026

gallatin committed rG5923b363ca61: net: Fix collision between SIOCGI2CPB and IPSECGREQID (authored by gallatin).
net: Fix collision between SIOCGI2CPB and IPSECGREQID
Apr 24 2026, 8:31 PM
gallatin added inline comments to D56610: tcp: release nic ktls send tags before time wait.
Apr 24 2026, 4:05 PM

Apr 23 2026

gallatin requested review of D56610: tcp: release nic ktls send tags before time wait.
Apr 23 2026, 9:23 PM

Apr 22 2026

gallatin added inline comments to D56564: offload: Compute and insert checksums as late as possible.
Apr 22 2026, 4:02 PM
gallatin requested review of D56579: lacp: fix link state with multiple aggregators.
Apr 22 2026, 3:39 PM

Apr 21 2026

gallatin added inline comments to D56564: offload: Compute and insert checksums as late as possible.
Apr 21 2026, 10:30 PM
gallatin committed rG16de94eaf09b: audit: Fix logging of IPv6 addresses (authored by gallatin).
audit: Fix logging of IPv6 addresses
Apr 21 2026, 1:29 PM
gallatin closed D39633: audit: Fix logging of IPv6 addresses.
Apr 21 2026, 1:28 PM

Apr 20 2026

gallatin added a comment to D39633: audit: Fix logging of IPv6 addresses.

a few years late to the party here but have you tested this?

Apr 20 2026, 9:40 PM

Apr 18 2026

gallatin accepted D56509: iflib: fix book keeping.

Thank you for fixing this for me!

Apr 18 2026, 7:47 PM
gallatin committed rG29336f1c9d25: tcp: Allocate t_tcpreq_info on demand (authored by gallatin).
tcp: Allocate t_tcpreq_info on demand
Apr 18 2026, 7:10 PM
gallatin closed D56484: tcp: Allocate t_tcpreq_info on demand.
Apr 18 2026, 7:10 PM

Apr 17 2026

gallatin added a comment to D56484: tcp: Allocate t_tcpreq_info on demand.

Two options: 1) embed all of this in struct tcp_rack:

#ifdef TCP_REQUEST_TRK
        /* Response tracking addons. */
        uint8_t t_tcpreq_req;   /* Request count */
        uint8_t t_tcpreq_open;  /* Number of open range requests */
        uint8_t t_tcpreq_closed;        /* Number of closed range requests */
        uint32_t tcp_hybrid_start;      /* Num of times we started hybrid pacing */
        uint32_t tcp_hybrid_stop;       /* Num of times we stopped hybrid pacing */
        uint32_t tcp_hybrid_error;      /* Num of times we failed to start hybrid pacing */
        struct tcp_sendfile_track *t_tcpreq_info;
#endif

or 2) create a new structure that holds all of the above, lazily allocate it and have a pointer from struct tcp_rack to it.

Apr 17 2026, 8:31 PM
gallatin requested review of D56484: tcp: Allocate t_tcpreq_info on demand.
Apr 17 2026, 8:08 PM
gallatin committed rG3fade68cfdf9: iflib: accurately count bytes/segments for TSO (authored by gallatin).
iflib: accurately count bytes/segments for TSO
Apr 17 2026, 5:36 PM
gallatin committed rG52e7958702be: iflib: ignore reclaim coalescing when low on tx descriptors (authored by gallatin).
iflib: ignore reclaim coalescing when low on tx descriptors
Apr 17 2026, 5:36 PM
gallatin closed D56338: iflib: accurately account for TSO packets & transmitted.
Apr 17 2026, 5:35 PM
gallatin closed D56339: iflib: ignore reclaim coalescing when low on tx descriptors.
Apr 17 2026, 5:35 PM

Apr 16 2026

gallatin accepted D56385: so_splice: Fix the KTLS check for the sink socket in so_splice().
Apr 16 2026, 5:04 PM
gallatin closed D56427: mlx5e: Ensure rx timestamps are monotonically increasing.
Apr 16 2026, 4:31 PM
gallatin committed rGce33f96fcf2f: mlx5e: Ensure rx timestamps are monotonically increasing (authored by gallatin).
mlx5e: Ensure rx timestamps are monotonically increasing
Apr 16 2026, 4:31 PM
gallatin requested review of D56427: mlx5e: Ensure rx timestamps are monotonically increasing.
Apr 16 2026, 1:26 PM

Apr 14 2026

gallatin committed rG956acdce0505: loopback: Account for packet drops (authored by gallatin).
loopback: Account for packet drops
Apr 14 2026, 7:48 PM
gallatin closed D56356: loopback: Account for packet drops.
Apr 14 2026, 7:47 PM
gallatin committed rG963f1a5455c9: ifconfig: add CMIS support for 400GbE optics (authored by gallatin).
ifconfig: add CMIS support for 400GbE optics
Apr 14 2026, 6:51 PM
gallatin committed rGf707cc00ed12: lro: move pkt rejection checks to leafs to avoid queueing non-LRO'able pkts (authored by gallatin).
lro: move pkt rejection checks to leafs to avoid queueing non-LRO'able pkts
Apr 14 2026, 6:51 PM
gallatin closed D56265: ifconfig: Add CMIS support for 400GbE optics.
Apr 14 2026, 6:51 PM
gallatin closed D56337: lro: move packet rejection checks out of lro_rx_common to avoid queueing non-LRO'able packets.
Apr 14 2026, 6:51 PM

Apr 11 2026

gallatin updated the test plan for D56356: loopback: Account for packet drops.
Apr 11 2026, 9:13 PM
gallatin requested review of D56356: loopback: Account for packet drops.
Apr 11 2026, 9:13 PM

Apr 9 2026

gallatin added a comment to D56337: lro: move packet rejection checks out of lro_rx_common to avoid queueing non-LRO'able packets.
In D56337#1289068, @bz wrote:

I am wondering if you can see the V_ipforwarding/V_ip6_forwarding hits?
Does it make any differences if you do the mbuf pkthdr checks first?

Silly idea I don't like much as inet and inet6 should be distinct but would it help if we would make sure they are in the same cache line?

Apr 9 2026, 11:42 PM
gallatin requested review of D56339: iflib: ignore reclaim coalescing when low on tx descriptors.
Apr 9 2026, 11:36 PM
gallatin requested review of D56338: iflib: accurately account for TSO packets & transmitted.
Apr 9 2026, 11:27 PM
gallatin requested review of D56337: lro: move packet rejection checks out of lro_rx_common to avoid queueing non-LRO'able packets.
Apr 9 2026, 11:12 PM

Apr 5 2026

gallatin requested review of D56265: ifconfig: Add CMIS support for 400GbE optics.
Apr 5 2026, 10:33 PM
gallatin committed rGcf1f21572897: net: Add SIOCGI2CPB ioctl & add page/bank fields to ifi2creq (authored by gallatin).
net: Add SIOCGI2CPB ioctl & add page/bank fields to ifi2creq
Apr 5 2026, 9:08 PM
gallatin closed D55912: net: Add page/bank fields to ifi2cre; update mce(4) to use them.
Apr 5 2026, 9:08 PM

Apr 4 2026

gallatin committed rG3f79bc9ca336: Fix nooptions VIMAGE build (authored by gallatin).
Fix nooptions VIMAGE build
Apr 4 2026, 11:11 PM
gallatin added a comment to D55912: net: Add page/bank fields to ifi2cre; update mce(4) to use them.
In D55912#1287051, @kib wrote:

Sorry for the late reply.

I think this should work as is. See below.

Apr 4 2026, 10:22 PM
gallatin requested changes to D55988: ice(4): Fix traffic distribution on TX queues.

I hate that I'm saying this, but this isn't just an ice(4) problem... The same thing will happen for the queue selection in iflib itself for most other drivers, and for the queue selection in non-iflib drivers. So I think it should be solved in a more generic fashion.

Apr 4 2026, 9:57 PM

Mar 28 2026

gallatin updated the diff for D55912: net: Add page/bank fields to ifi2cre; update mce(4) to use them.

Updated diff to add a new ioctl for page-bank operations.
Changed the 2 drivers that might use page/bank (iflib, mce) to accept the new ioctl and to ensure they are not reading garbage page/bank fields when operating with the legacy SIOGI2C ioctl

Mar 28 2026, 1:38 AM
gallatin added a comment to D55912: net: Add page/bank fields to ifi2cre; update mce(4) to use them.
In D55912#1282432, @kib wrote:

I mean something like this (ignore naming), in the generic ioctl handler, not in the driver code;

case SIOCGI2C_OLD:
     i2c.page = i2c.bank = 0;
     ioctl(SIOCG2C, &i2c);
     break;

I.e. drivers only need to implement the new ioctl that knows about bank/page.

Or put it differently, your original patch for mlx5en then would be correct, with this addition to the net/, instead of requiring each driver to have code to handle both ioctls.

Mar 28 2026, 1:36 AM

Mar 23 2026

gallatin added a comment to D55912: net: Add page/bank fields to ifi2cre; update mce(4) to use them.

I think that I may not understand your proposal. When you say "at the ioctl level..", Do you mean intercept the ioctls in ifhwioctl()? I think that seems awkward because ifi2req lives in userspace, and individual drivers copy it in / out, rather than it being done in a single spot for all drivers. So to zero fields, ifhwioctl would need copy-in ifi2creq, fix it up, & copy it out again..

Mar 23 2026, 3:01 PM

Mar 19 2026

gallatin added a comment to D55912: net: Add page/bank fields to ifi2cre; update mce(4) to use them.
In D55912#1279758, @kib wrote:

Why not simply not allocate the new number for the new ioctl. Rename current ioctl to something like SIOCGI2C_NOBANK. Then, at the ioctl level, do two things:

  • for new SIOCGI2C check that reserved bytes in the structure are zero
  • for SIOCGI2C_NOBANK, zero bank/page/padding and call into SIOCGI2C
Mar 19 2026, 6:36 PM
gallatin added a comment to D55912: net: Add page/bank fields to ifi2cre; update mce(4) to use them.

@kib What if we reclaimed some of the spare fields in ifi2creq to a uint16_t magic field. User sets i2creq.magic=PAGE_BANK_REQ and driver sets magic=PAGE_BANK_ACK ? That prevents the confusion you're concerned about.

Mar 19 2026, 2:52 PM

Mar 18 2026

gallatin added inline comments to D55912: net: Add page/bank fields to ifi2cre; update mce(4) to use them.
Mar 18 2026, 11:09 PM
gallatin accepted D55904: hash(9): introduce hashalloc()/hashfree() KPI.
Mar 18 2026, 5:28 PM
gallatin updated the test plan for D55912: net: Add page/bank fields to ifi2cre; update mce(4) to use them.
Mar 18 2026, 12:42 PM
gallatin updated the test plan for D55912: net: Add page/bank fields to ifi2cre; update mce(4) to use them.
Mar 18 2026, 12:41 PM
gallatin requested review of D55912: net: Add page/bank fields to ifi2cre; update mce(4) to use them.
Mar 18 2026, 12:41 PM

Mar 17 2026

gallatin added a comment to D55904: hash(9): introduce hashalloc()/hashfree() KPI.

This is really clever. What about other lock types (rm / rms, sx, etc)?

Mar 17 2026, 10:19 PM

Mar 11 2026

gallatin accepted D55818: nvme: replace bus_space_[read|write]_4 with bus_[read|write]_4.
Mar 11 2026, 9:44 PM

Mar 6 2026

gallatin accepted D55541: nvme: Don't active memory space until all BARs are configured.
Mar 6 2026, 5:25 PM
gallatin committed rGf1e8b1aca543: splice: optionally limit worker queues (authored by gallatin).
splice: optionally limit worker queues
Mar 6 2026, 3:26 PM
gallatin closed D55579: splice: optionally limit worker queues.
Mar 6 2026, 3:26 PM
gallatin added inline comments to D55579: splice: optionally limit worker queues.
Mar 6 2026, 3:07 PM

Mar 5 2026

gallatin accepted D55541: nvme: Don't active memory space until all BARs are configured.
Mar 5 2026, 1:51 PM

Mar 4 2026

gallatin updated the diff for D55579: splice: optionally limit worker queues.

Re-worked thread setup to use mp_ncpus rather than mp_maxid, as suggested by @markj

Mar 4 2026, 2:56 PM
gallatin added inline comments to D55579: splice: optionally limit worker queues.
Mar 4 2026, 2:03 AM
gallatin updated the diff for D55579: splice: optionally limit worker queues.
Mar 4 2026, 12:37 AM

Mar 3 2026

gallatin added a comment to D55477: hwpstate_amd: Support writable cpufreq interface in CPPC mode.
Mar 3 2026, 5:43 PM

Feb 28 2026

gallatin added inline comments to D55579: splice: optionally limit worker queues.
Feb 28 2026, 1:49 AM
gallatin updated the diff for D55579: splice: optionally limit worker queues.
  • fixed typo in sysctl declaration
Feb 28 2026, 1:48 AM
gallatin requested review of D55579: splice: optionally limit worker queues.
Feb 28 2026, 12:27 AM

Feb 27 2026

gallatin added a comment to D55477: hwpstate_amd: Support writable cpufreq interface in CPPC mode.

I tried this on 3 EPYC generations at Netflix. The most recent (AMD EPYC 8434P, AMD EPYC 9535) behaved as expected. There was an dev.hwpstate_amd. node for each CPU, and a lot more freqs were exposed (from 3 to roughly a dozen).
On the oldest EPYC we have is EPYC 7502P, where it didn't change anything. We still only have 3 frequencies exposed. On this machine, we see just a single node from dev.hwpstate:
dev.hwpstate_amd.0.freq_settings: 2500/2750 2200/2200 1500/1350
dev.hwpstate_amd.0.%iommu:
dev.hwpstate_amd.0.%parent: cpu0
dev.hwpstate_amd.0.%pnpinfo:
dev.hwpstate_amd.0.%location:
dev.hwpstate_amd.0.%driver: hwpstate_amd
dev.hwpstate_amd.0.%desc: Cool`n'Quiet 2.0
dev.hwpstate_amd.%parent:

Feb 27 2026, 12:18 AM

Feb 26 2026

gallatin added a comment to D55541: nvme: Don't active memory space until all BARs are configured.

This looks OK to me, but I defer to Warner.

Feb 26 2026, 8:34 PM

Feb 10 2026

gallatin accepted D55196: tcp: restrict flowtype copying to specific RSS TCP types.

Looks good. Thank you!

Feb 10 2026, 12:14 AM

Feb 9 2026

gallatin added a comment to D55196: tcp: restrict flowtype copying to specific RSS TCP types.

I think it's driver bugs that have flown under the radar until the recent RSS change so I'm not sure this is the place to do this.

In particular to this review, opaque and opaque_hash should both be usable hashes if they are implemented correctly in the driver.

Thinking about this more, I think when we are using the hash for setting flowtype, we should check for valid toeplitz hash types, since we're counting on being able to calculate the same hash in software. So I think this patch is probably the correct place to fix this. I think it would be shorter/cleaner to use M_HASHTYPE_ISHASH(). See https://reviews.freebsd.org/D52989

Question: Is M_HASHTYPE_RSS_IPV4 or M_HASHTYPE_RSS_IPV6 (a hash only on the src/dst address) acceptable or not? M_HASHTYPE_ISHASH() is true for them.

Feb 9 2026, 8:24 PM
gallatin added a comment to D55196: tcp: restrict flowtype copying to specific RSS TCP types.

I think it's driver bugs that have flown under the radar until the recent RSS change so I'm not sure this is the place to do this.

In particular to this review, opaque and opaque_hash should both be usable hashes if they are implemented correctly in the driver.

Feb 9 2026, 5:47 PM
gallatin added a comment to D55196: tcp: restrict flowtype copying to specific RSS TCP types.

Slightly worried some NIC somwhere might not be setting this when it should. Is this solving a problem for you?

Feb 9 2026, 5:17 PM