- Queries
- All Stories
- Search
- Advanced Search
- Transactions
- Transaction Logs
Advanced Search
Jan 14 2024
Dec 28 2023
Nov 20 2023
Nov 19 2023
Jun 28 2023
In D40769#927934, @diizzy wrote:Looks like you submitted the wrong patch?
Jun 27 2023
Jun 6 2023
In D39614#919546, @melifaro wrote:In D39614#907626, @kp wrote:In D39614#907531, @melifaro wrote:In D39614#902547, @kp wrote:The fix to this particular issue depends on the answers of the above (or can be a bandaid as in diff if we want to kick the can down the road).
Those are all good questions, but I think in this case our issue is that the ifnet goes away between the task being enqueued and executed. There are a number of other places where we see things like that, and that's the 'general problem' I was referring to.
Ack! I was looking into it from a different angle. The mental model of "safe detach" to me is a) set the demarcation point that signals "no more data accepted", b) ensure no data is indeed accepted and c) clear the queued data. I'm a bit unsure if we can or should generalise the implementations for the different datapaths and control plane. The current code mostly have everything implemented - IFF_DYING flag is set, marking the end of the era and the taskq is cleaned from the link tasks matching this interface. The remaining part is rejecting new changes.
We can indeed do an MPASS() check in the enqueue code to reject such code patterns if that's what we agree with.
What do you folks think?
So that seems like a sane general approach, but I'm not clear on how it'd work here. We'd have to make sure the ifp sticks around until the task queue no longer has any if_linktasks remaining, and I don't think we have a mechanism for that currently.
I think we have something in if.c:if_detach_internal() for that ( https://cgit.freebsd.org/src/tree/sys/net/if.c#n1145 ):
/* * In any case (destroy or vmove) detach us from the groups * and remove/wait for pending events on the taskq. * XXX-BZ in theory an interface could still enqueue a taskq change? */ if_delgroups(ifp); taskqueue_drain(taskqueue_swi, &ifp->if_linktask); taskqueue_drain(taskqueue_swi, &ifp->if_addmultitask);So I still believe that the remaining part is to reject the new changes. Happy to discuss it more :-)
And anyway, @akiyano_amazon.com, I guess we need the driver to be updated to avoid setting the state when dying. Could you please look at that?
Apr 19 2023
In D39614#902533, @melifaro wrote:In D39614#901770, @zlei wrote:I do not quite understand this logic:
ena_detach() ether_ifdetach() if_detach() if_detach_internal() # here we drain taskqueue_swi from link events ena_destroy_device() if_link_state_change() # here we enqueue new link eventWhy is ether_ifdetach() invoked before if_link_state_change() ???
As far as I understand, ena_destroy_device() is called in multiple places, including an init error handler, so it incorporates setting the link down. I think that it would make sense for the driver to move the link state change elsewhere, but I'd prefer to address the stack problem first.
Apr 17 2023
I've tested this commit and it solves the issue.
Updated kernel and world to the latest main. => Issue reproduced.
Applied this fix to kernel sources, built and installed kernel => issue doesn't reproduce.
Apr 13 2023
In D39048#899642, @melifaro wrote:In D39048#899603, @zlei wrote:In D39048#899501, @akiyano_amazon.com wrote:Hi all,
I hope it is OK that I write this here, if not please tell me what will be a better place to do it.
This commit has made the ENA driver crash upon kldunload if_ena.ko. (I made sure by running buildworld on this commit and on the commit before it)Here is the full call stack that was dumped:
curthread () at /root/freebsd-src/sys/amd64/include/pcpu_aux.h:59
59 asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0 __curthread () at /root/freebsd-src/sys/amd64/include/pcpu_aux.h:59
#1 doadump (textdump=textdump@entry=1)
at /root/freebsd-src/sys/kern/kern_shutdown.c:407
#2 0xffffffff80bedc6c in kern_reboot (howto=260)
at /root/freebsd-src/sys/kern/kern_shutdown.c:528
#3 0xffffffff80bee18f in vpanic (fmt=<optimized out>,
ap=ap@entry=0xfffffe01fef62ae0)
at /root/freebsd-src/sys/kern/kern_shutdown.c:972
#4 0xffffffff80bedf13 in panic (fmt=<unavailable>)
at /root/freebsd-src/sys/kern/kern_shutdown.c:896
#5 0xffffffff810e2b39 in trap_fatal (frame=0xfffffe01fef62b70, eva=0)
at /root/freebsd-src/sys/amd64/amd64/trap.c:954
#6 <signal handler called>
#7 dump_sa (nw=nw@entry=0xfffffe01fef62d08, attr=attr@entry=1,
sa=0xdeadc0dedeadc0de) at /root/freebsd-src/sys/netlink/route/iface.c:210
#8 0xffffffff80e5659a in dump_iface (nw=nw@entry=0xfffffe01fef62d08,
ifp=ifp@entry=0xfffff80109bbe800, hdr=hdr@entry=0xfffffe01fef62d48,
if_flags_mask=if_flags_mask@entry=0)
at /root/freebsd-src/sys/netlink/route/iface.c:279
#9 0xffffffff80e55e7b in rtnl_handle_ifevent (ifp=0xfffff80109bbe800,
nlmsg_type=<optimized out>, if_flags_mask=0)
at /root/freebsd-src/sys/netlink/route/iface.c:943
#10 0xffffffff80d1fc1d in do_link_state_change (arg=0xfffff80109bbe800,
pending=1) at /root/freebsd-src/sys/net/if.c:2205
#11 0xffffffff80c5233a in taskqueue_run_locked (
queue=queue@entry=0xfffff80106ce7100)
at /root/freebsd-src/sys/kern/subr_taskqueue.c:514
#12 0xffffffff80c5224d in taskqueue_run (queue=0xfffff80106ce7100)
at /root/freebsd-src/sys/kern/subr_taskqueue.c:529
#13 0xffffffff80ba8126 in intr_event_execute_handlers (ie=0xfffff80106a9d300,
p=<optimized out>) at /root/freebsd-src/sys/kern/kern_intr.c:1207
#14 ithread_execute_handlers (ie=0xfffff80106a9d300, p=<optimized out>)
at /root/freebsd-src/sys/kern/kern_intr.c:1220
#15 ithread_loop (arg=arg@entry=0xfffff80106c951c0)
at /root/freebsd-src/sys/kern/kern_intr.c:1308
#16 0xffffffff80ba45c0 in fork_exit (
callout=0xffffffff80ba7eb0 <ithread_loop>, arg=0xfffff80106c951c0,
frame=0xfffffe01fef62f40) at /root/freebsd-src/sys/kern/kern_fork.c:1102
#17 <signal handler called>
(kgdb)I don't have much experience with CARP, and I was wondering if anything obvious jumps out from this call stack with regard to this commit?
Thank you for the report!
I agree with the assessment, it's not carp related.
When the ifnet departure hook is called, indeed some stuff from the ifnet may have been already disappeared. I'll update the callback handler to use only basic interface info, that should be present till the end.
Apr 11 2023
Hi all,
I hope it is OK that I write this here, if not please tell me what will be a better place to do it.
This commit has made the ENA driver crash upon kldunload if_ena.ko. (I made sure by running buildworld on this commit and on the commit before it)