gallatin (Andrew Gallatin)
User

User Details

User Since
Jun 22 2015, 5:21 PM (156 w, 4 d)

Recent Activity

Wed, Jun 20

gallatin added a comment to D15937: Optimize the TSO and copy paths to use the new tcp_m_copy routine.

I've deployed a variant of this widely at LLNW on the default stack as well. I think you can garbage collect sbsndptr() from the tree.

Wed, Jun 20, 9:30 PM
gallatin accepted D15937: Optimize the TSO and copy paths to use the new tcp_m_copy routine.

I added Hans since it is mostly his code we are factoring out.

Wed, Jun 20, 8:40 PM
gallatin added a reviewer for D15937: Optimize the TSO and copy paths to use the new tcp_m_copy routine: hselasky.
Wed, Jun 20, 8:39 PM

Thu, Jun 14

gallatin added a comment to D15686: convert inpcbhash rlock to epoch.

If @jch cannot run his real workload, what about running a synthetic test that creates and tears down many connections per second? E.g., something like many copies of netperf -t TCP_CC?

Thu, Jun 14, 4:23 PM
gallatin accepted D15577: Update ixl(4) to use iflib..
Thu, Jun 14, 2:53 AM
gallatin added a comment to D15577: Update ixl(4) to use iflib..

You might want to go for some middle ground here. 1ms is a bit much (even for us).

Thu, Jun 14, 2:52 AM

Mon, Jun 11

gallatin added inline comments to D15577: Update ixl(4) to use iflib..
Mon, Jun 11, 11:41 PM

Mon, Jun 4

gallatin accepted D15558: iflib: Record TCP checksum info in iflib for ixl(4).

Sure.. Those checks for TCP in the TSO case should probably turn into asserts eventually..

Mon, Jun 4, 8:13 PM · Intel Networking

Wed, May 30

gallatin accepted D15526: reduce overhead of entropy collection.

If people want to use ethernet entropy harvesting, can we do something to make it more useful in a separate change? I was going to initially suggest the ether dst addr, but that's not very random either.

Wed, May 30, 9:22 PM
gallatin added a reviewer for D15526: reduce overhead of entropy collection: gallatin.
Wed, May 30, 9:18 PM

Tue, May 29

gallatin added a comment to D15577: Update ixl(4) to use iflib..

This is the sparse detect I'm using. Note how it walks the header, and then the TSO by packet on the wire.

Tue, May 29, 9:28 PM
gallatin added a comment to D15577: Update ixl(4) to use iflib..

Somehow my last comment got lost. With real Netflix traffic, the sparse detection logic is not sufficient. When we have kernel TLS, which inserts 13 and 16 byte mbufs around TLS records, we hit MDDs regularly. I think you need to walk the packet stream, not the DMA descriptors, so that you keep the same segmentation used by the hardware. I re-wrote the sparse detection to do this, and our MDDs went away.
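
As an illustration only (this is not the code from the review, and not the ixl(4) or iflib implementation), the idea of walking the packet stream rather than the DMA descriptors can be sketched as below. The `struct buf` type, the `tso_is_sparse` name, and the `max_bufs` limit are all hypothetical stand-ins, and the sketch assumes the hardware limit applies to the number of buffers that supply payload to a single on-the-wire packet.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one buffer (e.g. one mbuf) in a packet chain. */
struct buf {
	size_t len;
};

/*
 * Return true if any on-the-wire TSO packet would draw its payload from
 * more than max_bufs buffers. The walk first skips hdr_len bytes of
 * headers, then consumes the payload mss bytes at a time, so it follows
 * the same segmentation the hardware would apply.
 */
static bool
tso_is_sparse(const struct buf *chain, size_t nbufs, size_t hdr_len,
    size_t mss, size_t max_bufs)
{
	size_t i = 0;	/* index of the current buffer */
	size_t avail;	/* unread bytes left in the current buffer */
	size_t n;

	if (nbufs == 0 || mss == 0)
		return (false);
	avail = chain[0].len;

	/* Walk the header bytes. */
	while (hdr_len > 0) {
		while (avail == 0) {
			if (++i == nbufs)
				return (false);
			avail = chain[i].len;
		}
		n = hdr_len < avail ? hdr_len : avail;
		hdr_len -= n;
		avail -= n;
	}

	/* Walk the payload, one wire packet (mss bytes) at a time. */
	for (;;) {
		size_t need = mss;
		size_t used = 0;	/* buffers touched by this wire packet */

		while (need > 0) {
			while (avail == 0) {
				if (++i == nbufs)
					return (used > max_bufs);
				avail = chain[i].len;
			}
			n = need < avail ? need : avail;
			need -= n;
			avail -= n;
			used++;
		}
		if (used > max_bufs)
			return (true);
	}
}
```

With a chain where small 13 and 16 byte buffers sit around larger ones, a single wire packet can touch many buffers even when the chain as a whole looks unremarkable, which is the case a per-packet walk is meant to catch.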

Tue, May 29, 9:27 PM
gallatin added inline comments to D15577: Update ixl(4) to use iflib..
Tue, May 29, 9:25 PM

Fri, May 25

gallatin requested changes to D15558: iflib: Record TCP checksum info in iflib for ixl(4).
Fri, May 25, 8:27 PM · Intel Networking
gallatin added a comment to D15558: iflib: Record TCP checksum info in iflib for ixl(4).

Was this the patch that we discussed on our call? Does this fix your TX checksum problems on IXL?

Yes and Yes.

Unfortunately, I've now proceeded to the "entire box hangs" portion of the game. Even serial console break-to-debugger is ignored. I'm going to try NMIs next..

Fri, May 25, 8:18 PM · Intel Networking
gallatin accepted D15575: iflib: Add new shared flag to iflib: IFLIB_ADMIN_ALWAYS_RUN.
Fri, May 25, 8:16 PM
gallatin added inline comments to D15575: iflib: Add new shared flag to iflib: IFLIB_ADMIN_ALWAYS_RUN.
Fri, May 25, 6:32 PM

Thu, May 24

gallatin added a comment to D15558: iflib: Record TCP checksum info in iflib for ixl(4).

Was this the patch that we discussed on our call? Does this fix your TX checksum problems on IXL?

Thu, May 24, 6:20 PM · Intel Networking

May 21 2018

gallatin accepted D15510: Defer inpcb deletion until after a grace period has elapsed.
May 21 2018, 10:04 PM

May 18 2018

gallatin accepted D15409: Protect global ifnet list and short lived ifaddr references with epoch.
May 18 2018, 11:14 PM
gallatin committed rS333793: Teach pmcannotate about $TMPDIR and _PATH_TMP.
May 18 2018, 2:14 PM

May 16 2018

gallatin accepted D15366: Replace if_addr_lock rwlock with epoch + mutex.
May 16 2018, 12:12 AM

May 15 2018

gallatin committed rS333655: Unhook DEBUG_BUFRING from INVARIANTS.
May 15 2018, 11:55 PM

May 13 2018

gallatin accepted D15419: epoch(9) man page.

Looks good for content to me.

May 13 2018, 11:04 PM

May 11 2018

gallatin added a comment to D15366: Replace if_addr_lock rwlock with epoch + mutex.

Under a heavy UDP packet flood this helps quite a bit.

May 11 2018, 6:36 PM

May 10 2018

gallatin committed rS333462: Fix a panic in the IPv6 multicast code..
May 10 2018, 4:19 PM
gallatin committed rS333459: Fix the build after r333457.
May 10 2018, 1:19 PM

May 9 2018

gallatin added a reviewer for D15366: Replace if_addr_lock rwlock with epoch + mutex: jtl.
May 9 2018, 3:02 PM
gallatin added a comment to D15366: Replace if_addr_lock rwlock with epoch + mutex.

Looks awesome. My only issue is that I'd strongly prefer that the STAILQ_HEAD / STAILQ_INIT / STAILQ_ENTRY macros be prefixed with CK_ so that readers realize that these lists are used with epochs, and cannot be safely used with the normal STAILQ macros. I tried to point out most of them, but I may have missed a few.

May 9 2018, 3:02 PM
gallatin added inline comments to D15365: simple preempt safe epoch API.
May 9 2018, 1:56 PM
gallatin accepted D15354: MFC iflib bugfixes.
May 9 2018, 1:45 PM

May 8 2018

gallatin added a comment to D15345: Add support for packet batching in ifnet.

It would be nice to have this patch together with at least one example of a caller and one example of a NIC driver implementing the API.

May 8 2018, 2:09 PM
gallatin added a reviewer for D15345: Add support for packet batching in ifnet: hselasky.
May 8 2018, 1:25 PM

May 7 2018

gallatin committed rS333329: Fix an off-by-one error when deciding to request a tx interrupt.
May 7 2018, 6:11 PM
gallatin committed rS333325: Boost thread priority while changing CPU frequency.
May 7 2018, 3:24 PM
gallatin closed D15246: Boost thread priority while changing CPU frequency.
May 7 2018, 3:24 PM

May 4 2018

gallatin accepted D15300: iflib: print message when iflib_tx_structures_setup fails.
May 4 2018, 11:33 PM
gallatin added a comment to D15300: iflib: print message when iflib_tx_structures_setup fails.

I'd just update this review.

May 4 2018, 10:14 PM
gallatin accepted D15300: iflib: print message when iflib_tx_structures_setup fails.

As mentioned on the other review, maybe remove the print in the caller, since it is now redundant..

May 4 2018, 7:59 PM
gallatin accepted D15299: iflib: cleanup queues when iflib_device_register fail.
May 4 2018, 7:57 PM
gallatin accepted D15285: iflib: fix invalid free during queue allocation failure.
May 4 2018, 12:35 PM
gallatin accepted D15284: iflib: remove unused brscp pointer from iflib_queues_alloc.
May 4 2018, 12:33 PM

May 2 2018

gallatin updated the diff for D15246: Boost thread priority while changing CPU frequency.

Changed to use sched_prio() rather than sched_lend_prio(), as suggested by @kib.

May 2 2018, 3:05 PM

May 1 2018

gallatin added a comment to D15246: Boost thread priority while changing CPU frequency.
In D15246#321533, @jhb wrote:

Adding kib@.

It's not clear to me why to prefer PI_NET - 1 over, say, PRI_MIN in this case. There's nothing about powerd that is network specific.

May 1 2018, 8:26 PM
gallatin updated the diff for D15246: Boost thread priority while changing CPU frequency.

I've replaced the arbitrary PI_NET - 1 with PRI_MIN, as suggested by John.

May 1 2018, 8:25 PM
gallatin committed rS333141: Optionally panic when cxgbe encounters a fatal error.
May 1 2018, 3:33 PM

Apr 30 2018

gallatin committed rS333131: Fix iflib_encap() EFBIG handling bugs.
Apr 30 2018, 11:53 PM
gallatin created D15246: Boost thread priority while changing CPU frequency.
Apr 30 2018, 11:31 PM

Apr 25 2018

gallatin added inline comments to D15199: Add possibility to disable or reduce amount of UMA debugging with INVARIANTS.
Apr 25 2018, 7:50 PM

Apr 17 2018

gallatin committed rS332653: Restore SIOCGI2C functionality to ixgbe.
Apr 17 2018, 4:51 PM
gallatin committed rS332645: Make lagg creation more fault tolerant.
Apr 17 2018, 12:55 PM
gallatin closed D15046: Make lagg creation more fault tolerant.
Apr 17 2018, 12:55 PM

Apr 11 2018

gallatin created D15046: Make lagg creation more fault tolerant.
Apr 11 2018, 5:49 PM

Apr 9 2018

gallatin accepted D14967: split out flag manipulation from general context manipulation in iflib.
Apr 9 2018, 7:40 PM

Apr 6 2018

gallatin accepted D14967: split out flag manipulation from general context manipulation in iflib.
Apr 6 2018, 11:30 PM

Apr 3 2018

gallatin accepted D14937: Fix LRO window comparison.
Apr 3 2018, 12:44 PM

Mar 8 2018

gallatin accepted D14540: Several LRO fixes.
Mar 8 2018, 7:11 PM · transport

Feb 22 2018

gallatin accepted D14470: Do not return out of bound pointers from intr_lookup_source()..
Feb 22 2018, 2:01 PM

Feb 16 2018

gallatin added a reviewer for D14402: PID Controlled page daemon: imp.
Feb 16 2018, 8:57 PM
gallatin added a comment to D14381: mxge(4) should pass unhandled ioctls to ether_ioctl().

Well, that's embarrassing. Thanks for the fix.

Feb 16 2018, 2:27 AM

Feb 5 2018

gallatin accepted D14210: Rationalize license test on Linuxolator files.
Feb 5 2018, 11:09 PM
gallatin accepted D14000: per-domain page queue free locking.
Feb 5 2018, 11:09 PM

Oct 31 2017

gallatin added a comment to D12101: swfw_sync DELAY -> sleep conversion.

I think you'll get a lot less pushback if you serialize the multicast stuff in the stack, rather than the driver framework. This will allow you to put warnings / asserts into all the ioctl entry points above the drivers, so as to lock in the "you can't hold a lock while calling into a driver" rule.

Oct 31 2017, 3:47 PM · network

Oct 23 2017

gallatin added a comment to D12101: swfw_sync DELAY -> sleep conversion.

So let me try to sum up in my own words what's going on here:

Oct 23 2017, 3:40 PM · network

Oct 17 2017

gallatin added a comment to D12638: mlx5(4) rx timestamps..
In D12638#263636, @kib wrote:
Oct 17 2017, 5:19 PM
gallatin added a comment to D12638: mlx5(4) rx timestamps..

I patched this into our Netflix tree. I confirmed that, even with rx timestamps enabled, the new functionality does not cause a significant performance loss in our case. There seems to be roughly one or two cache misses per mlx5e_poll_rx_cq() to check the p->priv->clbr_done > 2 condition, and roughly 8x as many for the timestamp handling itself. This pushes the cost of mlx5e_build_rx_mbuf() higher, but not horribly so.

Oct 17 2017, 1:05 PM

Oct 11 2017

gallatin accepted D12638: mlx5(4) rx timestamps..

I have not tested it yet, but this looks fantastic. Thank you for this!

Oct 11 2017, 8:56 PM

Oct 9 2017

gallatin accepted D12615: Mbuf external storage improvements..

The change doesn't affect sfbufs at all.

What is LGTM?

Oct 9 2017, 2:41 PM

Sep 25 2017

gallatin added a comment to D12487: Combine LROed mbufs for a single call to if_input().

I think this is intended to piggyback on the recent iflib change which claims a speedup from chaining the packets. However, I'm afraid that I don't understand where this speedup is coming from. The stated reason to allow chaining in ether_input() is to allow drivers to amortize the release/acquire of the rx lock. However, no decent driver even uses an rx lock anymore, certainly not iflib or mlx5. So is there a benefit? If yes, then can you explain where it is coming from?

Sep 25 2017, 1:46 PM

Sep 6 2017

gallatin accepted D12229: mlx4_en: Setup mbuf hash type properly.
Sep 6 2017, 6:54 PM

Aug 31 2017

gallatin accepted D12176: mlx4_en: Implement SIOCGIFRSS{KEY,HASH}.
Aug 31 2017, 1:15 PM
gallatin added inline comments to D12175: hyperv/hn: Implement SIOCGIFRSS{KEY,HASH}..
Aug 31 2017, 1:14 PM
gallatin accepted D12175: hyperv/hn: Implement SIOCGIFRSS{KEY,HASH}..
Aug 31 2017, 1:12 PM
gallatin accepted D12174: if: Add ioctls to get RSS key and hash type/function..
Aug 31 2017, 1:07 PM

Aug 28 2017

gallatin accepted D12137: Adaptively enable/disable entropy collection from packets.

We currently disable entropy collection (or hoarding, as you aptly describe it) for NET_ETHER. It looks like if the IFF_NO_ENTROPY flag were present for our 100G NICs, this might be slightly cheaper for us, as it would avoid the function call and the load of the entropy mask, since if_flags will already be hot in cache from the IFF_UP check. As it is, it will be no worse.

Aug 28 2017, 12:34 PM

Aug 27 2017

gallatin added a comment to D11525: Allow distinct setting and querying of interrupt and ithread affinities.

Added my probably-weak mdoc suggestions, and added brueffer as a reviewer.

Aug 27 2017, 8:44 PM

Aug 24 2017

gallatin added a comment to D11525: Allow distinct setting and querying of interrupt and ithread affinities.

This is a bit confusing. The new -I and -X options take the same irq argument as the old -x?

Aug 24 2017, 11:41 AM

Aug 11 2017

gallatin added a comment to D11969: refactoring in support of *future* change to cope with slow configuration path on INTC and BRCM drivers.

It has been years since I've had to think about this, but I remember that the problem that all drivers fight is that you wind up being called in your ioctl routine with potentially some lock held by something that is calling you, so you cannot sleep. Is that still the issue? Can you remind me what lock is held?

Aug 11 2017, 1:19 PM

Aug 3 2017

gallatin accepted D11683: Fix mlx4en(4) to properly call m_defrag..
Aug 3 2017, 3:29 PM
gallatin added inline comments to D11683: Fix mlx4en(4) to properly call m_defrag..
Aug 3 2017, 2:01 PM

Aug 1 2017

gallatin added a comment to D11525: Allow distinct setting and querying of interrupt and ithread affinities.

Thanks for the feedback. I'll reach out to wblock about the doc changes. I'm quite weak in man page fu, and this is a slightly odd one at that..

Aug 1 2017, 2:38 PM
gallatin updated the diff for D11525: Allow distinct setting and querying of interrupt and ithread affinities.
  • Feedback from jhb regarding wording
Aug 1 2017, 2:35 PM

Jul 31 2017

gallatin committed rS321790: Don't request CTLTYPE_OPAQUE if we can't print them..
Jul 31 2017, 2:57 PM
gallatin closed D11461: Don't request CTLTYPE_OPAQUE if we can't print them. by committing rS321790: Don't request CTLTYPE_OPAQUE if we can't print them..
Jul 31 2017, 2:56 PM

Jul 13 2017

gallatin added a comment to D10445: Speed up NVME crashdumps.

I finally tried this, and sadly it does not seem to speed things up at all.

Jul 13 2017, 5:18 PM

Jul 10 2017

gallatin accepted D11518: Add support for generic backpressure indicator for ratelimited transmit queues as well as non-ratelimited ones.
Jul 10 2017, 2:13 PM

Jul 7 2017

gallatin created D11525: Allow distinct setting and querying of interrupt and ithread affinities.
Jul 7 2017, 9:54 PM

Jul 6 2017

gallatin committed rS320738: Simplify UIO_SYSSPACE and UIO_NOCOPY paths in uiomove.
Jul 6 2017, 3:04 PM
gallatin closed D11489: Simplify UIO_SYSSPACE and UIO_NOCOPY paths in uiomove by committing rS320738: Simplify UIO_SYSSPACE and UIO_NOCOPY paths in uiomove.
Jul 6 2017, 3:04 PM

Jul 5 2017

gallatin added inline comments to D11489: Simplify UIO_SYSSPACE and UIO_NOCOPY paths in uiomove.
Jul 5 2017, 5:16 PM
gallatin added inline comments to D11489: Simplify UIO_SYSSPACE and UIO_NOCOPY paths in uiomove.
Jul 5 2017, 5:02 PM
gallatin updated the diff for D11489: Simplify UIO_SYSSPACE and UIO_NOCOPY paths in uiomove.

Address kib's feedback

Jul 5 2017, 5:00 PM
gallatin added inline comments to D11489: Simplify UIO_SYSSPACE and UIO_NOCOPY paths in uiomove.
Jul 5 2017, 4:42 PM
gallatin created D11489: Simplify UIO_SYSSPACE and UIO_NOCOPY paths in uiomove.
Jul 5 2017, 1:16 PM

Jul 4 2017

gallatin accepted D11475: Zero initialize all fields of socket structure.

This seems reasonable to me

Jul 4 2017, 1:31 PM

Jul 3 2017

gallatin created D11461: Don't request CTLTYPE_OPAQUE if we can't print them..
Jul 3 2017, 1:30 PM

May 18 2017

gallatin accepted D10681: bnxt: Enable HW LRO and Fix out-of-order updates to rxd's completely..

I really hate the idea of spreading the Linux KPI, but I totally understand why you used it. We should probably move a lot of that stuff to a native interface.

May 18 2017, 6:06 PM

May 15 2017

gallatin accepted D10645: Avoid use of contiguous memory allocations in busdma.
May 15 2017, 1:07 PM

May 11 2017

gallatin accepted D10645: Avoid use of contiguous memory allocations in busdma.

I added Scott, as he's been quite involved in the busdma code. I want to make sure this looks OK to him too.

May 11 2017, 1:39 PM
gallatin added a reviewer for D10645: Avoid use of contiguous memory allocations in busdma: scottl.
May 11 2017, 1:38 PM