iflib is a library to eliminate the need for frequently duplicated device independent logic propagated (poorly) across many network drivers.
Details
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
- Lint Skipped 
- Unit
- Tests Skipped 
Event Timeline
| sys/net/iflib.c | ||
|---|---|---|
| 2963 | Looks like this requires nextbsd 9230bff and a5090db as another dependency (the changes to add a register method to device_if.m). Are you compile testing this on -current? | |
When linking, the mp_ring here conflicts with the one in the cxgbe driver if both iflib and cxgbe are included in the kernel.
| sys/net/iflib.c | ||
|---|---|---|
| 425 | This (and the similar SYSINIT in the TASKQGROUP_DEFINE()) prevent building drivers using iflib into the kernel. The sysinits run well after the pci bus is walked, and you wind up tripping over uninitialized stuff everywhere. | |
Will rename.
| sys/net/iflib.c | ||
|---|---|---|
| 425 | Hrrm. Can one safely run them before the pci tree is walked? | |
| sys/net/iflib.c | ||
|---|---|---|
| 2680 | So, who is supposed to clear this? I don't see any code to clear this flag. I confess that I noticed this not because of brilliant code review skills, but because I have a box accepting only 1/N ssh sessions, and complaining constantly about 'syslogd: sendto: No buffer space available'. When poking around in kgdb, I see this flag set on several rings. | |
| sys/net/iflib.c | ||
|---|---|---|
| 277 | What is this in_use field for? Just debugging? | |
Update to m_init and kernel config to compile ixl/ixlv/ix/ixv in kernel with GENERIC
- does not address Gallatin's most recent issues
Address issues raised by Gallatin:
- don't mark a queue as closed without corresponding driver backpressure support
- rename mp_ring structures / types / functions to avoid namespace collisions when linking cxgbe in to the kernel
| sys/conf/files | ||
|---|---|---|
| 1896 ↗ | (On Diff #13459) | Probably. But it's more difficult managing it as part of that patchset. | 
| sys/conf/files.amd64 | ||
| 239 ↗ | (On Diff #13459) | Again. Probably, but since iflib is kind of package deal anyway if iflib goes in without one or both of those drivers this part of the change will just be dropped. | 
| sys/net/iflib.c | ||
|---|---|---|
| 426 | I'm trying to fix this but it's hard to diagnose with DDB being unusable at this stage in boot thanks to USB "modernization." If you know what values I should pass that would be great. It may well require jhb's SMP startup changes. | |
| sys/net/iflib.c | ||
|---|---|---|
| 1663 | Settiung of the hwassist bits seems to have been removed from the drivers (at least ixl), and does not seem to appear anywhere else in iflib. Should it also appear in iflib_init_locked()? | |
To follow up on my last comment, fixing this (patch appended) seems to result in the kernel actually sending down packets with csum offload, which then causes the ixl NIC to stop passing traffic, and to die occasionally with: "ixl0: Malicious Driver Detection event 0x02 on TX queue 5 pf number 0x00"
diff --git a/sys/net/iflib.c b/sys/net/iflib.c --- a/sys/net/iflib.c +++ b/sys/net/iflib.c @@ -1681,6 +1681,7 @@ static void iflib_init_locked(if_ctx_t ctx) { if_softc_ctx_t sctx = &ctx->ifc_softc_ctx; + if_t ifp = ctx->ifc_ifp; iflib_fl_t fl; iflib_txq_t txq; iflib_rxq_t rxq; @@ -1697,6 +1698,15 @@ iflib_init_locked(if_ctx_t ctx) iflib_netmap_rxq_init(ctx, rxq); } + + /* Set hardware offload abilities */ + if_clearhwassist(ifp); + if (if_getcapenable(ifp) & IFCAP_TXCSUM) + if_sethwassistbits(ifp, CSUM_TCP | CSUM_UDP, 0); + if (if_getcapenable(ifp) & IFCAP_TSO4) + if_sethwassistbits(ifp, CSUM_TSO, 0); + if (if_getcapenable(ifp) & IFCAP_TXCSUM_IPV6) + if_sethwassistbits(ifp, (CSUM_TCP_IPV6 | CSUM_UDP_IPV6), 0); IFDI_INIT(ctx); for (i = 0, rxq = ctx->ifc_rxqs; i < sctx->isc_nqsets; i++, rxq++) { for (j = 0, fl = rxq->ifr_fl; j < rxq->ifr_nfl; j++, fl++) {
We still seem to be missing the device_register api required by iflib. Eg:
diff --git a/sys/kern/device_if.m b/sys/kern/device_if.m index a5319c6..18d9fd7 100644 --- a/sys/kern/device_if.m +++ b/sys/kern/device_if.m @@ -62,6 +62,11 @@ CODE { { return 0; } + + static void *null_register(device_t dev) + { + return NULL; + } }; /** @@ -316,3 +321,7 @@ METHOD int resume { METHOD int quiesce { device_t dev; } DEFAULT null_quiesce; + +METHOD void * register { + device_t dev; +} DEFAULT null_register;
| sys/net/iflib.c | ||
|---|---|---|
| 427 | I did a fresh git clone of current and arc patch today (after walking the tree back prior to the last commit to sys/bus.h so that it would apply), and I'm still seeing panic on boot when ixl is built into the kernel. I noticed that the taskq change is still using SI_ORDER_SMP. | |
| 1504 | Actually, you DO indeed need INVARIANTS around this. W/o Invariants, "delta" triggers an unused warning, killing the build. Sorry. | |
Hmm.. Even when loading the module at boot on -current, I'm still seeing a panic:
ixl0: using 28 queues
ixl0: Using MSIX interrupts with 29 vectors
ixl0: allocated for 28 queues
Fatal trap 12: page fault while in kernel mode
cpuid = 18; apic id = 14
fault virtual address   = 0x1818
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff80ad711c
stack pointer           = 0x28:0xfffffe3fca9f3fb0
frame pointer           = 0x28:0xfffffe3fca9f4020
code segment            = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 123 (kldload)
[ thread pid 123 tid 100220 ]
Stopped at      taskqgroup_attach+0x4c: lock cmpxchgq   %rsi,0x1818(%r15)
db> bt
Tracing pid 123 tid 100220 td 0xfffff8013b123000
taskqgroup_attach() at taskqgroup_attach+0x4c/frame 0xfffffe3fca9f4020
iflib_irq_alloc_generic() at iflib_irq_alloc_generic+0x357/frame 0xfffffe3fca9f40b0
ixl_if_msix_intr_assign() at ixl_if_msix_intr_assign+0x66/frame 0xfffffe3fca9f4130iflib_device_register() at iflib_device_register+0x17a9/frame 0xfffffe3fca9f4470
iflib_device_attach() at iflib_device_attach+0xb7/frame 0xfffffe3fca9f44a0
device_attach() at device_attach+0x423/frame 0xfffffe3fca9f4500
pci_driver_added() at pci_driver_added+0xed/frame 0xfffffe3fca9f4540
devclass_driver_added() at devclass_driver_added+0x7d/frame 0xfffffe3fca9f4580
devclass_add_driver() at devclass_add_driver+0x11c/frame 0xfffffe3fca9f45c0
module_register_init() at module_register_init+0x177/frame 0xfffffe3fca9f45f0
linker_load_module() at linker_load_module+0xca8/frame 0xfffffe3fca9f4920
kern_kldload() at kern_kldload+0xea/frame 0xfffffe3fca9f4960
sys_kldload() at sys_kldload+0x5b/frame 0xfffffe3fca9f4990
amd64_syscall() at amd64_syscall+0x502/frame 0xfffffe3fca9f4ab0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe3fca9f4ab0
- syscall (304, FreeBSD ELF64, sys_kldload), rip = 0x80086b83a, rsp = 0x7fffffffe7e8, rbp = 0x7fffffffed00 ---
db>
I can confirm that moving back to SI_SUB_SMP in the DECLARE_MODULE macro fixes it enough to at least work with an ixl module loaded post-boot.
diff --git a/sys/net/iflib.c b/sys/net/iflib.c index ab5f74c..37d970e 100644 --- a/sys/net/iflib.c +++ b/sys/net/iflib.c @@ -421,7 +421,7 @@ static moduledata_t iflib_moduledata = { NULL }; -DECLARE_MODULE(iflib, iflib_moduledata, SI_SUB_INIT_IF, SI_ORDER_ANY); +DECLARE_MODULE(iflib, iflib_moduledata, SI_SUB_SMP, SI_ORDER_ANY); MODULE_VERSION(iflib, 1); MODULE_DEPEND(iflib, pci, 1, 1, 1);
Sorry, I'm sure it was in an earlier patch. Will put it back up.
Ok. I needed that change to make my system boot with ixgbe linked in to the kernel.
The flag setting appears to have gotten lost in a mismerge jumping between branches. The first place I'd look to fix that error would be at my changes to ixl when I removed packet parsing.
Uhm ... did you pull in the corresponding update to the task queue patch? That's probably the real problem.
Yes.
I think the problem is actually in that file. Specifically, the SYSINIT() in the TASKQGROUP_DEFINE() macro that uses SI_SUB_SMP pretty much makes certain that when the DRIVER_MODULE executes earlier, the taskq stuff will still be NULL. Then that stays cached, and the drivers trip over it.
No. The updated taskq patch doesn't use SI_SUB_SMP.
There are 2 patches in that review. The second patch updates 2 SYSINITS to use the IF order, but leaves one at SMP.
Right. Adjust updates the number of threads once SMP comes on line. Are you saying that that's still a problem?
It's odd that ixgbe works with the change. Although I admit I haven't tried ixgbe as a module with that change.
Right. Adjust updates the number of threads once SMP comes on line. Are you saying that that's still a problem?
Honestly, I haven't thought much about it. I'm just trying to get things to not crash so that I can try to get ixl working in a controlled environment before I test it for real. All I can say for sure is that the patch, as it stands today, goes "kaboom" with both ixl compiled as a module, and with it compiled into the kernel.
I'll try to diagnose this after I get ixl not to barf on the descriptors we're sending down to it when csum offload is enabled.
- include opt_inet.h & opt_inet6.h
- make enabled code compile
- fix !INVARIANTS warning
Please note that the complete set of changes against HEAD is maintained in the HEAD_MERGE/iflib branch at https://github.com/NextBSD/NextBSD.git
- TSO fixes
- serialize doorbell updates
- extend tx pkt_info format for bxe TSO
- heavily revise rx pkt info for bxe S/G across rings
- don't break out of transmit loop on busdma map failure
- report internal state and error statistics via sysctl
- also check if there are more credits available when deciding to drain
- re-merge against HEAD
- Mbuf free optimizations
- bus_get_cpus change
- fix memory corruptions caused by bxe and an probe routines
- support for architectures without 32-bit atomics
- taskqueue group fixes
- support for drivers that don't know what their msix-bar is and so map everything
- changes to private ioctl interface
- sysctl extensions
- lots of good stuff
The change to kern_mbuf allows iflib to avoid gratuitous memory touches on free. The other driver changes are_not_ optional as they fix use after free memory corruptions in those drivers' probe routines that have hit iflib on driver reload (unload and then reload a driver kmod).
You can leave out the mbuf changes in the initial commit I guess. They also rerequire adding defines to mbuf.h which I missed.
I've made the changes to if_an and committed that part of the patch to freebsd-head
| sys/dev/an/if_an.c | ||
|---|---|---|
| 322 ↗ | (On Diff #16239) | Prototype needs to be removed from the headers. | 
| sys/dev/an/if_an_pci.c | ||
| 152 ↗ | (On Diff #16239) | This line can be moved into an_probe_pci() instead of duplicating the id list search | 
| 159 ↗ | (On Diff #16239) | It is gratuitous, but it's also a safety pattern that everyone follows, | 
| sys/dev/bxe/bxe.c | ||
|---|---|---|
| 2495 ↗ | (On Diff #16263) | This should work, device_set_desc_copy() is intended for this usage pattern. Moving it out to bxe_attach() is the wrong thing, IMHO. | 
| sys/dev/bxe/bxe.c | ||
|---|---|---|
| 2495 ↗ | (On Diff #16263) | No. Unless you know with 100% certainty that the driver will subsequently attach (which you can't as you might have another driver that eats device nodes to facilitate passthrough) you should never ever touch the device node in probe. Probe is purely yay or nay. | 
That's not true. device_set_desc and friends are specifically designed to be operated on in device_probe.
In any event. I don't see any point in setting the description until attach. It may be the freeing of descbuf isn't permitted, but in that case you'd have a leak if it didn't subsequently attach. This is not a robust interface.
| sys/net/mp_ring.c | ||
|---|---|---|
| 45 | MIPS buildkernel failed with: Could you please add support for non-x86 platforms (arm, mips32)? Thanks! | |
There is a directive for platforms without 64-bit atomics. Such arches
should put that in their param.h, sorry for the oversight. A lot of balls
in the air.
-M
Thank you for quick reply!
Another issue is:
src/sys/net/iflib.c: In function '_iflib_fl_refill':
src/sys/net/iflib.c:1538: warning: comparison is always true due to limited range of data type
src/sys/net/iflib.c: In function 'iflib_fl_bufs_free':
src/sys/net/iflib.c:1667: warning: comparison is always true due to limited range of data type
I suppose it's due to unsigned types. But build is failed due to warnings.
Thanks!