Details

Reviewers

kib
jhb
• hselasky
imp
lstewart
gnn
sbruno
adrian
rwatson

Summary

The callout code had two specific bugs within it after the new CPU migration
feature was added.

The callout_active() call at times could lie to the KPI user, this is because during a migration the ACTIVE flag would have been removed so that the semantics were broken. This could cause problems for KPI users since the active flag is a key feature used by those declaring themselves MPSAFE and using there own lock (think TCP).
There was a bug in the callout code where if a migration was going on and two calls were made to callout_reset (the first starting the migration) the second call would possibly corrupt the linked list (if it was the active callout about to execute) and thus we would spin forever in soft clock and this would result in a panic, spin lock held to long.

Test Plan

I have added the kernel test framework and a callout_test module. This
allows us to pound on the callout code with multiple threads resetting, draining
and stopping callouts all at about the same time. This test without the fixes would
lock up and panic with a spin lock held to long vs now it does not.

I have also begun testing this on Netflix Caches to further validate that it does no harm
and instead gets rid of a panic or two ;-)

Diff Detail

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Ok I have stripped out the kernel test framework and moved this
to D1755.. jhb, you are a reviewer there as well ;-)

This boils things down to *just* the fixes to the callout.

This revision now requires review to proceed.Feb 2 2015, 5:23 PM

My only request here is that you commit the mechanical changes separate to the migration fixes. That way it's very clear what's supposed to be fixing the bug and what's improving the readability/consistency of the code.

(Edit: ah, two others have also commented along those lines. :)

Other than that, great work on nailing this down.

This revision is now accepted and ready to land.Feb 2 2015, 5:25 PM

Looks good. There are some non-related changes still, like the definition of access macros to improve readability.

rrs added reviewers: jhb, kib.Feb 2 2015, 6:32 PM

kib accepted this revision.Feb 2 2015, 6:35 PM

kib edited edge metadata.

For future diffs, please generate with full context, e.g. -U999999

sys/kern/kern_timeout.c
789 ↗	(On Diff #3591)	added EOL whitespace?
1067–1069 ↗	(On Diff #3591)	I'm a bit confused by this sentence.

rrs added inline comments.Feb 3 2015, 11:45 AM

sys/kern/kern_timeout.c
789 ↗	(On Diff #3591)	Opps I will fix that thanks Ed!
1067–1069 ↗	(On Diff #3591)	Let me see if I can edit this comment a bit and make it more clear.

Ed:

See if this comment clarification does not help a bit.

Hans:

I have no intention of *not* leaving the macro changes in here, this was a comment suggested by
imp and I think its a good one!

This revision now requires review to proceed.Feb 3 2015, 11:56 AM

Lets try this one more time to get line 789 right ;-)

• hselasky accepted this revision.Feb 3 2015, 12:07 PM

• hselasky edited edge metadata.

This revision is now accepted and ready to land.Feb 3 2015, 12:07 PM

adrian accepted this revision.Feb 3 2015, 4:46 PM

adrian edited edge metadata.

I think we're ready to land. It fixes all the issues discussed, and enhanced testing shows no additional errors.

kern/kern_timeout.c
1159	"When the callout wheel runs, it will ignore this callout" maybe is a better phrase to use here rather than have an ambiguous "callout" that refers to two different things in the same sentence.
1234	tiny nit: I think this is indented wrong, but Phabric is so lame I can't tell for sure. Please check and fix if it isn't indented 4 spaces.
sys/callout.h
67	I'd be tempted to create a callout_processed() macro too and use it in the last chunk of kern_timeout.c, but that can be done as a separate commit.

emaste added inline comments.Feb 3 2015, 5:44 PM

kern/kern_timeout.c
1159	Yeah, something like @imp's suggestion makes it more clear to me.

We're dubious over here in "actually running this on live systems" land.

We're seeing pretty chaotic panics, unsure what we should dig into here. Within minutes of running live on a machine, we see various panics in cam or the network stack running this against stable/10. Let me find a convenient place to drop the panic information.

Sanitized panic #1

Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 6451576832B (6152 MB)
  Blocksize: 512
  Dumptime: Sat Jan 31 21:30:37 2015
  Hostname: xxxxxxxxxxxxxxxxxxxx
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 10.1-STABLE-llnw10-D1711 #0 r277988M: Sat Jan 31 19:48:14 MST 2015
    root@xxxxxxxxxxxxxxxxx:/usr/obj/usr/src/sys/SIXFOUR
  Panic String: duplicate worklist: 0xfffff80023118b00

  Dump Parity: 3331327211
  Bounds: 0
  Dump Status: good

Backtrace:
Reading symbols from /boot/kernel/cc_cubic.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cubic.ko.symbols
Reading symbols from /boot/kernel/cc_cdg.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cdg.ko.symbols
Reading symbols from /boot/kernel/h_ertt.ko.symbols...done.
Loaded symbols for /boot/kernel/h_ertt.ko.symbols


#0  doadump (textdump=1) at pcpu.h:219
	in pcpu.h
(kgdb) #0  doadump (textdump=1) at pcpu.h:219
#1  0xffffffff8072d967 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:452
#2  0xffffffff8072dd44 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80928082 in softdep_disk_write_complete (
    bp=<value optimized out>) at /usr/src/sys/ufs/ffs/ffs_softdep.c:10944
#4  0xffffffff807bb3c3 in bufdone_finish (bp=0xfffffe1f2b0ade80) at buf.h:420
#5  0xffffffff807bb227 in bufdone (bp=<value optimized out>)
    at /usr/src/sys/kern/vfs_bio.c:3769
#6  0xffffffff806925b0 in g_io_deliver (bp=0xfffff803b80c74d8, 
    error=<value optimized out>) at /usr/src/sys/geom/geom_io.c:669
#7  0xffffffff806925b0 in g_io_deliver (bp=0xfffff803b80a1e88, 
    error=<value optimized out>) at /usr/src/sys/geom/geom_io.c:669
#8  0xffffffff806925b0 in g_io_deliver (bp=0xfffff803b80a1d90, 
    error=<value optimized out>) at /usr/src/sys/geom/geom_io.c:669
#9  0xffffffff806902ab in g_disk_done (bp=0xfffff803b80b41f0)
    at /usr/src/sys/geom/geom_disk.c:252
#10 0xffffffff802e065c in dadone (periph=<value optimized out>, 
    done_ccb=<value optimized out>) at /usr/src/sys/cam/scsi/scsi_da.c:3041
#11 0xffffffff802bcefd in xpt_done_process (ccb_h=0xfffff8021e3e1000)
    at /usr/src/sys/cam/cam_xpt.c:5258
#12 0xffffffff802c02d6 in xpt_done_td (arg=0xffffffff81248e80)
    at /usr/src/sys/cam/cam_xpt.c:5285
#13 0xffffffff806fc16a in fork_exit (
    callout=0xffffffff802c01b0 <xpt_done_td>, arg=0xffffffff81248e80, 
    frame=0xfffffe1f9e79eac0) at /usr/src/sys/kern/kern_fork.c:1017
#14 0xffffffff80acf40e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:611
#15 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb)

Sanitized panic #2

Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 6752395264B (6439 MB)
  Blocksize: 512
  Dumptime: Mon Feb  2 14:42:05 2015
  Hostname: xxxxxxxxxxxxxxxxxxxxxxxx
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 10.1-STABLE-llnw10-D1711 #0 r277988M: Mon Feb  2 11:45:34 MST 2015
    root@xxxxxxxxxxxxxxxxxxxxx:/usr/obj/usr/src/sys/SIXFOUR
  Panic String: general protection fault
  Dump Parity: 1989734895
  Bounds: 0
  Dump Status: good

Backtrace:
Reading symbols from /boot/kernel/cc_cubic.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cubic.ko.symbols
Reading symbols from /boot/kernel/cc_cdg.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cdg.ko.symbols
Reading symbols from /boot/kernel/h_ertt.ko.symbols...done.
Loaded symbols for /boot/kernel/h_ertt.ko.symbols

#0  doadump (textdump=1) at pcpu.h:219
	in pcpu.h
(kgdb) #0  doadump (textdump=1) at pcpu.h:219
#1  0xffffffff8072d967 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:452
#2  0xffffffff8072dd44 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80ae944f in trap_fatal (frame=<value optimized out>, 
    eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:859
#4  0xffffffff80ae90ac in trap (frame=<value optimized out>)
    at /usr/src/sys/amd64/amd64/trap.c:203
#5  0xffffffff80aceed2 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:236
#6  0xffffffff808b588c in in6_lltable_lookup (llt=0xfffff8001ffafe00, 
    flags=0, l3addr=0xfffff8026d300e20) at /usr/src/sys/netinet6/in6.c:2613
#7  0xffffffff808ce1f4 in nd6_output_lle (ifp=0xfffff8001682c800, 
    origifp=0xfffff8001682c800, m0=0xfffff8013f631500, 
    dst=0xfffff8026d300e20, rt0=0xfffff8001682c800, lle=0x0, chain=0x1)
    at if_llatbl.h:199
#8  0xffffffff808ce9b8 in nd6_output (ifp=0xffffffff80cdf7c5, 
    origifp=0xfffff8026d300e30, m0=0x10, dst=0x1, rt0=0xfffff8001682c800)
    at /usr/src/sys/netinet6/nd6.c:1839
#9  0xffffffff808c51c7 in ip6_output (m0=<value optimized out>, 
    opt=<value optimized out>, flags=<value optimized out>, 
    im6o=<value optimized out>, ifpp=<value optimized out>, 
    inp=<value optimized out>) at /usr/src/sys/netinet6/ip6_output.c:858
#10 0xffffffff808dbe7d in udp6_send (so=<value optimized out>, 
    flags=<value optimized out>, m=<value optimized out>, 
    addr=<value optimized out>, control=0x0, td=<value optimized out>)
    at /usr/src/sys/netinet6/udp6_usrreq.c:845
#11 0xffffffff807a63a5 in sosend_dgram (so=0xfffff805f5b592b8, addr=0x0, 
    uio=<value optimized out>, top=<value optimized out>, 
    control=0xfffff8001682c800, flags=<value optimized out>, 
    td=0xfffff80500000000) at /usr/src/sys/kern/uipc_socket.c:1102
#12 0xffffffff807aca65 in kern_sendit (td=0xfffff805f5ba3490, s=980, 
    mp=0xfffffe2021317948, flags=0, control=0x0, segflg=<value optimized out>)
    at /usr/src/sys/kern/uipc_syscalls.c:944
#13 0xffffffff807acd1c in sendit (td=0xffffffff80cdf7c5, 
    s=<value optimized out>, mp=0xfffffe2021317948, flags=1)
    at /usr/src/sys/kern/uipc_syscalls.c:871
#14 0xffffffff807acda1 in sys_sendmsg (td=0xfffff805f5ba3490, 
    uap=0xfffffe2021317a40) at /usr/src/sys/kern/uipc_syscalls.c:1073
#15 0xffffffff80ae9d4a in amd64_syscall (td=0xfffff805f5ba3490, traced=0)
    at subr_syscall.c:134
#16 0xffffffff80acf1bb in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:396
#17 0x0000000801b8802a in ?? ()
Current language:  auto; currently minimal
(kgdb)

Sanitized panic #3

Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 5393809408B (5143 MB)
  Blocksize: 512
  Dumptime: Tue Feb  3 13:21:19 2015
  Hostname: xxxxxxxxxxxxxxxxxxxxxxxx
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 10.1-STABLE-D1711 #0: Tue Feb  3 12:19:58 MST 2015
    root@xxxxxxxxxxxxxxxxxxxxxxxx:/usr/obj/usr/src/sys/SIXFOUR
  Panic String: page fault
  Dump Parity: 4197108606
  Bounds: 0
  Dump Status: good

Backtrace:
Reading symbols from /boot/kernel/cc_cubic.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cubic.ko.symbols
Reading symbols from /boot/kernel/cc_cdg.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cdg.ko.symbols
Reading symbols from /boot/kernel/h_ertt.ko.symbols...done.
Loaded symbols for /boot/kernel/h_ertt.ko.symbols
Reading symbols from /boot/kernel/if_lagg.ko.symbols...done.
Loaded symbols for /boot/kernel/if_lagg.ko.symbols
#0  doadump (textdump=1) at pcpu.h:219
	in pcpu.h
(kgdb) #0  doadump (textdump=1) at pcpu.h:219
#1  0xffffffff8072d0b7 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:452
#2  0xffffffff8072d494 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80ae703f in trap_fatal (frame=<value optimized out>, 
    eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:865
#4  0xffffffff80ae7358 in trap_pfault (frame=0xfffffe1f9e73f7b0, 
    usermode=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:676
#5  0xffffffff80ae69ba in trap (frame=0xfffffe1f9e73f7b0)
    at /usr/src/sys/amd64/amd64/trap.c:440
#6  0xffffffff80acca22 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:236
#7  0xffffffff8072b400 in __rw_wlock_hard (c=0xfffff8030a59aa28, 
    tid=18446735277970927616, file=0x0, line=173648384)
    at /usr/src/sys/kern/kern_rwlock.c:757
#8  0xffffffff80742915 in softclock_call_cc (c=0xfffff8030a59aa98, 
    cc=0xffffffff81342180, direct=0) at /usr/src/sys/kern/kern_timeout.c:637
#9  0xffffffff80742db4 in softclock (arg=0xffffffff81342180)
    at /usr/src/sys/kern/kern_timeout.c:801
#10 0xffffffff806fde4b in intr_event_execute_handlers (
    p=<value optimized out>, ie=0xfffff80015214c00)
    at /usr/src/sys/kern/kern_intr.c:1264
#11 0xffffffff806fe7e6 in ithread_loop (arg=0xfffff800151f6f00)
    at /usr/src/sys/kern/kern_intr.c:1277
#12 0xffffffff806fba6a in fork_exit (
    callout=0xffffffff806fe750 <ithread_loop>, arg=0xfffff800151f6f00, 
    frame=0xfffffe1f9e73fac0) at /usr/src/sys/kern/kern_fork.c:1017
#13 0xffffffff80accf5e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:611
#14 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb)

Sbruno: I have a hard time connecting the dots between the kernel panic's you are seeing
and the callout system (especially the changes I made above). You were getting a spin-lock held
too long right?

hiren:

This looks interesting to me, it is definitely something I would like to look at. I assume you
are on 10.stable like Sean?

Note please email me at rrs@netflix.com .. unfortunately my mail server is down and I am about a 3 hour
drive away from it ;-(

Hiren:

Ok looking at kern_timeout.c thats a call to
class->lc_lock(c_lock, lock_status);

If my 10.x matches yours.

And the call inside that kern_rwlock.c:757
is

v = rw->rw_lock;
owner = (struct thread *)RW_OWNER(v);

I would imagine v is probably a freed lock or some such.. not sure.
If you have a vmcore sending the registers would be helpful. And for that
matter if you have a vmcore if you could get in the frame of kern_timeout
and tell me what
c_lock
c_func
are that would be helpful. I have not tested this with my test framework for locks
that pass in a lock.. If the c_func is not some private thing but something in
BSD I can puzzle out what sub-system is using the callout this way and
try to reproduce a test that will blow up this way on me as well.

Assuming of course its not the caller that has freed the
lock ahead of the callout system running...

In D1711#58, @rrs wrote:

hiren:

This looks interesting to me, it is definitely something I would like to look at. I assume you
are on 10.stable like Sean?

Yes, its plain stable10+D1711.
Also, all 3 panics are from the same system.

Hi,

There is only one or two likely consumers of callout_init_rw() at the present moment, and one of them is:

./netinet6/nd6.c: canceled = callout_stop(&ln->ln_timer_ch);
./netinet6/nd6.c: canceled = callout_reset(&ln->ln_timer_ch, INT_MAX,
./netinet6/nd6.c: canceled = callout_reset(&ln->ln_timer_ch, tick,
./netinet6/in6.c: callout_init_rw(&lle->base.ln_timer_ch, &lle->base.lle_lock,

hiren: Is this box configured for IPv6 ?

static void
in_lltable_free(struct lltable *llt, struct llentry *lle)
{

LLE_WUNLOCK(lle);
LLE_LOCK_DESTROY(lle);
free(lle, M_LLTABLE);

}

ln_lltable_free() does not drain the callout associated with it and I am not sure if we have a sleeping context for that. Even if the refcount is zero, it doesn't mean that the callback is finished using the RW mutex.

This is another example where we really need a "callout_drain_async_function()".

In D1711#59, @rrs wrote:

Hiren:

Ok looking at kern_timeout.c thats a call to
class->lc_lock(c_lock, lock_status);

If my 10.x matches yours.

It's not :-(

Looks like what we have here is not stock stable10 really. I'll check all the details and get back first thing in the morning tomorrow.

Thanks for checking and sorry for the trouble.

In D1711#61, @hselasky wrote:

Hi,

There is only one or two likely consumers of callout_init_rw() at the present moment, and one of them is:

./netinet6/nd6.c: canceled = callout_stop(&ln->ln_timer_ch);
./netinet6/nd6.c: canceled = callout_reset(&ln->ln_timer_ch, INT_MAX,
./netinet6/nd6.c: canceled = callout_reset(&ln->ln_timer_ch, tick,
./netinet6/in6.c: callout_init_rw(&lle->base.ln_timer_ch, &lle->base.lle_lock,

hiren: Is this box configured for IPv6 ?

No, not for panic #3. But as I replied to rrs's comment, I need to first make sure what tree we are running and if we are missing critical fixes from stable-10.

static void
in_lltable_free(struct lltable *llt, struct llentry *lle)
{
LLE_WUNLOCK(lle);
LLE_LOCK_DESTROY(lle);
free(lle, M_LLTABLE);
}

ln_lltable_free() does not drain the callout associated with it and I am not sure if we have a sleeping context for that. Even if the refcount is zero, it doesn't mean that the callback is finished using the RW mutex.

This is another example where we really need a "callout_drain_async_function()".

In D1711#60, @hiren wrote:

In D1711#58, @rrs wrote:

hiren:

This looks interesting to me, it is definitely something I would like to look at. I assume you
are on 10.stable like Sean?

Yes, its plain stable10+D1711.
Also, all 3 panics are from the same system.

In D1711#62, @hiren wrote:

In D1711#59, @rrs wrote:

Hiren:

Ok looking at kern_timeout.c thats a call to
class->lc_lock(c_lock, lock_status);

If my 10.x matches yours.

It's not :-(

Looks like what we have here is not stock stable10 really. I'll check all the details and get back first thing in the morning tomorrow.

Thanks for checking and sorry for the trouble.

My bad. So yes, these machines are indeed stable10+D1711. I'll follow up separately on original comment.

In D1711#59, @rrs wrote:

Hiren:

Ok looking at kern_timeout.c thats a call to
class->lc_lock(c_lock, lock_status);

If my 10.x matches yours.

And the call inside that kern_rwlock.c:757
is

v = rw->rw_lock;
owner = (struct thread *)RW_OWNER(v);

I would imagine v is probably a freed lock or some such.. not sure.
If you have a vmcore sending the registers would be helpful. And for that
matter if you have a vmcore if you could get in the frame of kern_timeout
and tell me what
c_lock
c_func
are that would be helpful. I have not tested this with my test framework for locks
that pass in a lock.. If the c_func is not some private thing but something in
BSD I can puzzle out what sub-system is using the callout this way and
try to reproduce a test that will blow up this way on me as well.

Assuming of course its not the caller that has freed the
lock ahead of the callout system running...

panic #3 happened on stable-10+this patch.
I've setup a -head box with this patch to reproduce the problem.
In any case, I'll try to get vmcore and other details tomorrow.

Ok guys, I have puzzled out what that crash *may* be that was posted by Hiren. The same
issue exists in the timeout code rewrite that Han's has up on the board as well. Though
the callout_drain_async() may solve it if the user called that instead. Here is what is happening.

CPU1 is running the callout from the callout wheel it gets to softclock_call_cc() line 635, its all ready to run and unlocks the callout mutex. It then sees c_lock is populated and thus calls class->c_lock()...) to switch locks. It begins to spin in __rw_wlock_hard() in the while loop waiting to get the lock since its held.

lltable_free() has been called on CPU 2, it grabs its AFDATA lock and then goes through each entry and does LIST_FOREACH_SAFE(..) { LLE_WLOCK(lle); <-- this is the lock that CPU 1 is waiting on if(callout_stop(la->la_timer)) LLE_REMREF(lle) llentry_free(lle); }

The callout_stop() on CPU 2 does what it is supposed to and sets the cc_cancel bit to true and return 1. This causes callout_stop() to lower the reference count which means when llentry_free() is called the lle *is* freed. This calls into either in_lltable_free() or in6_lltable_free() which unlocks the lock (letting CPU 1 go forward) and then promptly destroys the lock and frees the memory.

Finally our spinning CPU 1 loops back around and finds the destroyed mutex and crashes.

Now, some interesting notes on this:

a) If the lle* code had used MPSAFE instead of passing the lock in, then zero would have been

returned since the callout "could not have been stopped". This would have left the reference
and *not* freed the memory.. the timer would have done that has it executed.

b) If the lle* code just did not stop the timer, and instead let it run off, it would have freed itself

on expiration.. of course what you really want here is "stop it if you can" but don't stop it
if its already running. Which is why <a> is true since thats what you get if you control the
lock. I

c) This little hole is also in Han's new code unless we change the user of the code

to use the async_drain routine. Which again is changing the KPI something I don't like to do
since its in *so* many places ;-) Of course we may want an async-drain I have not thought about that.

d) This race has always been there near as I can tell and really is not caused by the migration

code that was added like the other code that was fixed.

I don't really see an easy way to fix this accept to change the caller to use MPSAFE.. or maybe make
a async_drain routine.. but then we still end up changing the caller ;-)

kern/kern_timeout.c
1159	sounds good I will make it so.
1234	Imp: Yes, it looks wrong in the kern_timeout.c code.. let me fix it ;-)

In D1711#66, @rrs wrote:

The callout_stop() on CPU 2 does what it is supposed to and sets the cc_cancel bit to true and return 1. This causes callout_stop() to lower the reference count which means when llentry_free() is called the lle *is* freed. This calls into either in_lltable_free() or in6_lltable_free() which unlocks the lock (letting CPU 1 go forward) and then promptly destroys the lock and frees the memory.

Surely the answer is for the other thread to take out a reference so that callout_stop() doesn't get the value to 0

julian: Hence a lock is used, the callback won't be called when callout_stop() returns 1. Only the mutex will still be used. Maybe a callout_reset() having 1 tick as timeout will work instead of callout_stop(). If the callout_reset() returns 0 we add a ref instead of grabbing one. ....

Julian:

The point is *exactly* that, the callout *has* a reference.. and now that the table is being
flushed if the callout_stop returns 1 it thinks the callout has been stopped, which it has, which
means it will not run and release its reference. Thus the lowering of the reference count on
the return true from the stop. But the issue is the lock used is part of what the thing will
free.

I have looked through all callout_stops and only the lle code (in three forms) does this. I think
the right fix here is to just go ahead and change the callout_init used to one with the MPSAFE flag.

When the callout runs all it does is drop the lock and release the reference which causes the
free in theory... So I think with a simple change to the init function this problem does not happen.

let me see if this is correct..
I'm assuming that the reference count can not change if you hold the lock?
if so then before releasing the lock, we can check if the count is 1.. if it is then we are the only person who even knows where this thing is so there can't be any new references added..
having stopped the callout action from happening we inherit it's reference.

If it IS 1 then we can drop the lock because no one else can get it, and we can then drop the reference safely.
the question is what to do if it is NOT 1? can that happen? who else might hold a reference..?
we drop the lock, and wait a moment, and then retake the lock and retry?
if it can not be > 1 then maybe an assert to that affect may be a safe thing to add.

julian: What do you mean by "wait a moment". Spinning or sleeping?

This fixes the comment as imp suggested and the indent...

This revision now requires review to proceed.Feb 4 2015, 2:38 PM

Julian:

The simple fix is to change the callout_init_rw -> callout_init(c, 1);

That makes it so the callout_stop() in this instance would return 0, not 1. Since the
callout could not be stopped.

This would then cause the reference count to drop to 1 (it was 2 the original and the callout).
It would not free here but when the callout runs it would.

Now that all being said, I have put that in some netflix code and will test it.. but there is
something strange about this whole lle* path that I don't yet grok.

The arp code and nd6 code carefully do reference counting. There are comments in
in.c and in6.c that mention that the arp/nd6 code are the callers.. thats fine.. However
*if* either callout goes off in the then the in/in6 callout runs which does:

UNLOCK(lle)
free(la)

No reference counting, no nothing.. I don't grok how all this is connected and
need to study it more but it just feels wrong. If its doing reference counting I would
have thought the callout would have done so too.

I need to take a bigger and closer view at the lle/arp code and how it is all
supposed to inter-work.. it appears to me broken, but that could be just my
mis-understanding of the code.. I should get home tomorrow and besides fixing
my mail-server I wills see what Robert, George and Kirk have to say about
the lle code paths in there new book, and then dig in and try to puzzle it out.

I *think* the simple change callout_init_rw -> callout_init(c, 1) will fix this and of
course we have to remove the unlock in the actual callout since the callout
code won't be locking it any longer... but I am wondering if there is not a
deeper issue with this code. Doing the above will not change the behavior and
will get the right results on the flush, but this use/not-use references seems wrong
so there may be deeper bugs in the lle code!

For the lle* code, it looks like the reference count for the data structure improperly doesn't cover the implicit use of the mutex by the callout system. That seems to be the real bug here, no? Protecting a mutex with a reference count without holding a reference to that mutex until the callout code cannot use it again? Not sure how, however, we can properly drop the reference count to the mutex with the current callout API :(

So why doesn't the lle* code set mpsafe to 1? After re-reading the man page several times, I'm thinking that's the solution here. It already uses other locks and reference counts to synchronize things, so why get Giant involved at all?

Imp:

Ok I have spent a bit of time puzzling this out. First I was mistaken, the callouts being run
are either arptimer or nd6timer(function name not right). These are not using giant but
the passed in lle structure rw_lock.

We need to adjust these so that they check they:

get the lock (since the callout system would no longer lock for them the rw_lock)
Check the pending bit.. if its set some other place as restarted the callout (there are several places this can happen from).
Don't check the !callout_active() flag, since this would mean its not been rescheduled and it actually was cleared by the lle_table_flush (wrong function name here) function, in there its watching the return of the callout (its where the crash was from). So instead now since the "callout can't be stopped" it returns 0 (not lowering the reference) even though the callout can't be stopped, it will have removed the active bit which.. so if we returned here we would leak the memory leaving the reference up, so instead we go ahead and finish processing the callout doing the reference lowering and removal.

Its a bit odd from the normal way you do it but I think it will work fine.

I have created D1777 to address the nd6/arp crash separately. I am currently in the midst
of testing these.

Update from llnw world:

Things have been pretty stable here without any panics for 24+ hours with Stable10+D1711+D1777.

Thanks a lot, Randall!

Don't forget to add the "MFC after" tag.

@rrs:
Looks like we've come full circle back to the very first crash reported. We are on stable10 with all relevant fixes.

Sanitized panic #4

  Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 6764437504B (6451 MB)
  Blocksize: 512
  Dumptime: Mon Feb 16 02:54:11 2015
  Hostname: xxxxxxxxxxxxxxx
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 10.1-STABLE-llnw12 #0: Fri Feb 13 02:22:48 MST 2015
    jason@xxxxxxxxxxxxx:/usr/obj/usr/src/sys/SIXFOUR
  Panic String: spin lock held too long
  Dump Parity: 1861214463
  Bounds: 0
  Dump Status: good

Backtrace:
Reading symbols from /boot/kernel/cc_cubic.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cubic.ko.symbols
Reading symbols from /boot/kernel/cc_cdg.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cdg.ko.symbols
Reading symbols from /boot/kernel/h_ertt.ko.symbols...done.
Loaded symbols for /boot/kernel/h_ertt.ko.symbols
Reading symbols from /boot/kernel/ftcp.ko...done.
Loaded symbols for /boot/kernel/ftcp.ko
#0  doadump (textdump=1) at pcpu.h:219
	in pcpu.h
(kgdb) #0  doadump (textdump=1) at pcpu.h:219
#1  0xffffffff80749c17 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:452
#2  0xffffffff80749ff4 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80735683 in _mtx_lock_spin_cookie (c=<value optimized out>, 
    tid=<value optimized out>, opts=<value optimized out>, 
    file=<value optimized out>, line=<value optimized out>)
    at /usr/src/sys/kern/kern_mutex.c:561
#4  0xffffffff80760064 in callout_lock (c=0xfffff80870266e80)
    at /usr/src/sys/kern/kern_timeout.c:530
#5  0xffffffff8075fc62 in callout_reset_sbt_on (c=0xfffff80870266e80, 
    sbt=<value optimized out>, precision=<value optimized out>, 
    ftn=0xffffffff808bcfe0 <tcp_timer_rexmt>, arg=0xfffff80870266c00, 
    cpu=<value optimized out>, flags=<value optimized out>)
    at /usr/src/sys/kern/kern_timeout.c:975
#6  0xffffffff808bd807 in tcp_timer_activate (tp=0x0, 
    timer_type=<value optimized out>, delta=<value optimized out>)
    at /usr/src/sys/netinet/tcp_timer.c:883
#7  0xffffffff808b3ce0 in tcp_output (tp=0xfffff80870266c00)
    at /usr/src/sys/netinet/tcp_output.c:1579
#8  0xffffffff808bfa41 in tcp_usr_send (so=<value optimized out>, 
    flags=<value optimized out>, m=<value optimized out>, 
    nam=<value optimized out>, control=<value optimized out>, 
    td=<value optimized out>) at /usr/src/sys/netinet/tcp_usrreq.c:887
#9  0xffffffff807c2535 in sosend_generic (so=0xfffff80678731000, addr=0x0, 
    uio=0xfffffe2021072960, top=<value optimized out>, 
    control=<value optimized out>, flags=<value optimized out>, td=0x143)
    at /usr/src/sys/kern/uipc_socket.c:1284
#10 0xffffffff807a3fc3 in soo_write (fp=<value optimized out>, 
    uio=0xfffffe2021072960, active_cred=<value optimized out>, 
    flags=<value optimized out>, td=<value optimized out>)
    at /usr/src/sys/kern/sys_socket.c:103
#11 0xffffffff8079cb47 in dofilewrite (td=0xfffff8011c9c7920, fd=323, 
    fp=0xfffff804c4c17870, auio=0xfffffe2021072960, 
    offset=<value optimized out>, flags=0) at file.h:304
#12 0xffffffff8079c878 in kern_writev (td=0xfffff8011c9c7920, fd=323, 
    auio=0xfffffe2021072960) at /usr/src/sys/kern/sys_generic.c:481
#13 0xffffffff8079c803 in sys_write (td=<value optimized out>, 
    uap=<value optimized out>) at /usr/src/sys/kern/sys_generic.c:396
#14 0xffffffff80b059ca in amd64_syscall (td=0xfffff8011c9c7920, traced=0)
    at subr_syscall.c:134
#15 0xffffffff80aeae3b in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:396
#16 0x0000000801640b8a in ?? ()
Current language:  auto; currently minimal
(kgdb)

@rrs: One more

Sanitized panic #5

Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 1694281728B (1615 MB)
  Blocksize: 512
  Dumptime: Sun Feb 15 18:03:14 2015
  Hostname: xxxxxxxxxxxxxxxxx
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 10.1-STABLE-llnw12 #0: Fri Feb 13 02:22:48 MST 2015
    jason@xxxxxxxxxxxx:/usr/obj/usr/src/sys/SIXFOUR
  Panic String: spin lock held too long
  Dump Parity: 4219482370
  Bounds: 0
  Dump Status: good

Backtrace:
Reading symbols from /boot/kernel/cc_cubic.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cubic.ko.symbols
Reading symbols from /boot/kernel/cc_cdg.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cdg.ko.symbols
Reading symbols from /boot/kernel/h_ertt.ko.symbols...done.
Loaded symbols for /boot/kernel/h_ertt.ko.symbols
Reading symbols from /boot/kernel/if_gif.ko.symbols...done.
Loaded symbols for /boot/kernel/if_gif.ko.symbols
Reading symbols from /boot/kernel/ftcp.ko...done.
Loaded symbols for /boot/kernel/ftcp.ko
#0  doadump (textdump=1) at pcpu.h:219
	in pcpu.h
(kgdb) #0  doadump (textdump=1) at pcpu.h:219
#1  0xffffffff80749c17 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:452
#2  0xffffffff80749ff4 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80735683 in _mtx_lock_spin_cookie (c=<value optimized out>, 
    tid=<value optimized out>, opts=<value optimized out>, 
    file=<value optimized out>, line=<value optimized out>)
    at /usr/src/sys/kern/kern_mutex.c:561
#4  0xffffffff80760064 in callout_lock (c=0xfffffe000092f6d0)
    at /usr/src/sys/kern/kern_timeout.c:530
#5  0xffffffff8076019c in _callout_stop_safe (c=0xfffffe000092f6d0, safe=0)
    at /usr/src/sys/kern/kern_timeout.c:1119
#6  0xffffffff80557202 in mpt_scsi_reply_handler (mpt=0xfffffe0000900000, 
    req=0xfffffe000092f678, reply_desc=0, reply_frame=0x0)
    at /usr/src/sys/dev/mpt/mpt_cam.c:2599
#7  0xffffffff805509a7 in mpt_intr (arg=0xfffffe0000900000)
    at /usr/src/sys/dev/mpt/mpt.c:823
#8  0xffffffff8055d9d6 in mpt_pci_intr (arg=0xfffffe0000900000)
    at /usr/src/sys/dev/mpt/mpt_pci.c:802
#9  0xffffffff8071a7fb in intr_event_execute_handlers (
    p=<value optimized out>, ie=0xfffff80008532e00)
    at /usr/src/sys/kern/kern_intr.c:1264
#10 0xffffffff8071b196 in ithread_loop (arg=0xfffff8000857bc00)
    at /usr/src/sys/kern/kern_intr.c:1277
#11 0xffffffff8071841a in fork_exit (
    callout=0xffffffff8071b100 <ithread_loop>, arg=0xfffff8000857bc00, 
    frame=0xfffffe064b1edc00) at /usr/src/sys/kern/kern_fork.c:1017
#12 0xffffffff80aeb08e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:611
#13 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb)

@hps: cc_cpu[MAXCPU] info as you requested on IRC. Let me know if you need more info.

(kgdb) backtrace 
#0  doadump (textdump=1) at pcpu.h:219
#1  0xffffffff80749c17 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:452
#2  0xffffffff80749ff4 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80735683 in _mtx_lock_spin_cookie (c=<value optimized out>, tid=<value optimized out>, opts=<value optimized out>, file=<value optimized out>, 
    line=<value optimized out>) at /usr/src/sys/kern/kern_mutex.c:561
#4  0xffffffff80760064 in callout_lock (c=0xfffff8000d81dc98) at /usr/src/sys/kern/kern_timeout.c:530
#5  0xffffffff8075fc62 in callout_reset_sbt_on (c=0xfffff8000d81dc98, sbt=<value optimized out>, precision=<value optimized out>, 
    ftn=0xffffffff8082a610 <arptimer>, arg=0xfffff8000d81dc00, cpu=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/kern_timeout.c:975
#6  0xffffffff8082b878 in arpintr (m=<value optimized out>) at /usr/src/sys/netinet/if_ether.c:781
#7  0xffffffff808189d2 in netisr_dispatch_src (proto=<value optimized out>, source=<value optimized out>, m=0x0) at /usr/src/sys/net/netisr.c:972
#8  0xffffffff80811396 in ether_demux (ifp=<value optimized out>, m=0xfffff8005c1e8000) at /usr/src/sys/net/if_ethersubr.c:851
#9  0xffffffff80812029 in ether_nh_input (m=<value optimized out>) at /usr/src/sys/net/if_ethersubr.c:646
#10 0xffffffff808189d2 in netisr_dispatch_src (proto=<value optimized out>, source=<value optimized out>, m=0x0) at /usr/src/sys/net/netisr.c:972
#11 0xffffffff80425f9b in em_rxeof (count=99) at /usr/src/sys/dev/e1000/if_em.c:4532
#12 0xffffffff80426373 in em_msix_rx (arg=0xfffff8000c53a200) at /usr/src/sys/dev/e1000/if_em.c:1600
#13 0xffffffff8071a7fb in intr_event_execute_handlers (p=<value optimized out>, ie=0xfffff8000c4ac300) at /usr/src/sys/kern/kern_intr.c:1264
#14 0xffffffff8071b196 in ithread_loop (arg=0xfffff8000c5166e0) at /usr/src/sys/kern/kern_intr.c:1277
#15 0xffffffff8071841a in fork_exit (callout=0xffffffff8071b100 <ithread_loop>, arg=0xfffff8000c5166e0, frame=0xfffffe0c23fccc00)
    at /usr/src/sys/kern/kern_fork.c:1017
#16 0xffffffff80aeb08e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
#17 0x0000000000000000 in ?? ()
(kgdb) p * cc_cpu@8                                                                                                                                      [1/18608]
$2 = {{cc_lock = {lock_object = {lo_name = 0xffffffff80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{
        cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, 
        cc_waiting = false}, {cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, 
        cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0xfffffe00006a4000, cc_callwheel = 0xfffffe00007c6000, cc_expireq = {
      tqh_first = 0x0, tqh_last = 0xffffffff81364288}, cc_callfree = {slh_first = 0xfffffe00007c5240}, cc_firstevent = 899380454888656, 
    cc_lastscan = 899380454354416, cc_cookie = 0xfffff8000c34f100, cc_bucket = 31391, cc_ktr_event_name = "callwheel cpu 0\000\000\000\000"}, {cc_lock = {
      lock_object = {lo_name = 0xffffffff80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{cc_curr = 0x0, 
        ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, 
        cc_waiting = false}, {cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, 
        cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0x0, cc_callwheel = 0xfffffe000085a000, cc_expireq = {tqh_first = 0x0, 
      tqh_last = 0xffffffff81364408}, cc_callfree = {slh_first = 0x0}, cc_firstevent = 899620856539720, cc_lastscan = 899620209076539, 
    cc_cookie = 0xfffff8000c34f080, cc_bucket = 13092, cc_ktr_event_name = "callwheel cpu 1\000\000\000\000"}, {cc_lock = {lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{cc_curr = 0x0, 
        ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, 
        cc_waiting = false}, {cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, 
        cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0x0, cc_callwheel = 0xfffffe000089a000, cc_expireq = {tqh_first = 0x0, 
      tqh_last = 0xffffffff81364588}, cc_callfree = {slh_first = 0x0}, cc_firstevent = 899446753609881, cc_lastscan = 899445618670680, 
    cc_cookie = 0xfffff8000c34f000, cc_bucket = 2680, cc_ktr_event_name = "callwheel cpu 2\000\000\000\000"}, {cc_lock = {lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{cc_curr = 0x0, 
        ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, 
        cc_waiting = false}, {cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, 
        cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0x0, cc_callwheel = 0xfffffe00008da000, cc_expireq = {tqh_first = 0x0, 
      tqh_last = 0xffffffff81364708}, cc_callfree = {slh_first = 0x0}, cc_firstevent = 899622088212480, cc_lastscan = 899619941714270, 
    cc_cookie = 0xfffff8000c34ef00, cc_bucket = 13079, cc_ktr_event_name = "callwheel cpu 3\000\000\000\000"}, {cc_lock = {lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{cc_curr = 0x0, 
        ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, 
        cc_waiting = false}, {cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, 
        cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0x0, cc_callwheel = 0xfffffe000091a000, cc_expireq = {tqh_first = 0x0, 
      tqh_last = 0xffffffff81364888}, cc_callfree = {slh_first = 0x0}, cc_firstevent = 899622239207424, cc_lastscan = 899620104922681, 
    cc_cookie = 0xfffff8000c34ee80, cc_bucket = 10843, cc_ktr_event_name = "callwheel cpu 4\000\000\000\000"}, {cc_lock = {lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{cc_curr = 0x0, 
        ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, 
        cc_waiting = false}, {cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, 
        cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0x0, cc_callwheel = 0xfffffe000095a000, cc_expireq = {tqh_first = 0x0, 
      tqh_last = 0xffffffff81364a08}, cc_callfree = {slh_first = 0x0}, cc_firstevent = 899622004326400, cc_lastscan = 899619864406244, 
    cc_cookie = 0xfffff8000c34ee00, cc_bucket = 10380, cc_ktr_event_name = "callwheel cpu 5\000\000\000\000"}, {cc_lock = {lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{cc_curr = 0x0, 
        ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, 
        cc_waiting = false}, {cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, 
        cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0x0, cc_callwheel = 0xfffffe000099a000, cc_expireq = {tqh_first = 0x0, 
      tqh_last = 0xffffffff81364b88}, cc_callfree = {slh_first = 0x0}, cc_firstevent = 899709111631872, cc_lastscan = 899706966335378, 
    cc_cookie = 0xfffff8000c34ed80, cc_bucket = 17431, cc_ktr_event_name = "callwheel cpu 6\000\000\000\000"}, {cc_lock = {lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{cc_curr = 0x0, 
        ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, 
        cc_waiting = false}, {cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, 
        cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0x0, cc_callwheel = 0xfffffe00009da000, cc_expireq = {tqh_first = 0x0, 
      tqh_last = 0xffffffff81364d08}, cc_callfree = {slh_first = 0x0}, cc_firstevent = 899621360124557, cc_lastscan = 899620209077230, 
    cc_cookie = 0xfffff8000c34e400, cc_bucket = 13086, cc_ktr_event_name = "callwheel cpu 7\000\000\000\000"}}

Hi,

rrs + hiren:

I think the problem is this:

In "_callout_stop_safe()" we sometimes exit having "cc_migration_cpu(cc, direct) = CPUBLOCK;". Now if a second call to "_callout_stop_safe()" happens before the pending callback has returned, which is using a mutex, we are deadlocked, because "_callout_stop_safe()" is called having the same lock locked which the callback needs to aquire aswell. Because the callout subsystem cannot aquire the mutex during the callback function, it can neither reach the migration code which resets the cc_migration_cpu() variable.

hiren: Can you backtrace all the softclock processes in your dump?

--HPS

If you change how "cc_migration_cpu(cc, direct)" works, the "cc_cce_migrating()" checks become invalid. I think you need to introduce yet another callout flag ....

Hans:

I don't get your call sequence, I sent you an email on it..

Hiren:

Can you go up the call chain and dump the callout structure
c in
0xffffffff80760064 in callout_lock (c=0xfffff8000d81dc98) at /usr/src/sys/kern/kern_timeout.c:530

There is something funny here, because the lock's listed (which is what the spin lock panic was on)
are all mtx_lock=4.. which means they are un-locked.

In D1711#88, @rrs wrote:

Hans:

I don't get your call sequence, I sent you an email on it..

Hiren:

Can you go up the call chain and dump the callout structure
c in
0xffffffff80760064 in callout_lock (c=0xfffff8000d81dc98) at /usr/src/sys/kern/kern_timeout.c:530

(kgdb) frame 4 
#4  0xffffffff80760064 in callout_lock (c=0xfffff8000d81dc98) at /usr/src/sys/kern/kern_timeout.c:530
530                     CC_LOCK(cc);
(kgdb) p *c 
$1 = {c_links = {le = {le_next = 0x0, le_prev = 0xfffffe0000804db8}, sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0xfffffe0000804db8}}, 
  c_time = 903238992575048, c_precision = 241591893750, c_arg = 0xfffff8000d81dc00, c_func = 0xffffffff8082a610 <arptimer>, c_lock = 0x0, c_flags = 22, 
  c_cpu = 0}

There is something funny here, because the lock's listed (which is what the spin lock panic was on)
are all mtx_lock=4.. which means they are un-locked.

In D1711#86, @hselasky wrote:

Hi,

rrs + hiren:

I think the problem is this:

In "_callout_stop_safe()" we sometimes exit having "cc_migration_cpu(cc, direct) = CPUBLOCK;". Now if a second call to "_callout_stop_safe()" happens before the pending callback has returned, which is using a mutex, we are deadlocked, because "_callout_stop_safe()" is called having the same lock locked which the callback needs to aquire aswell. Because the callout subsystem cannot aquire the mutex during the callback function, it can neither reach the migration code which resets the cc_migration_cpu() variable.

hiren: Can you backtrace all the softclock processes in your dump?

How exactly do I do it? I do not see any explicit mention of softclock in the dump.

Hiren:

Thats helpful.. as I said this is strange. The callout you posted shows its associated with CPU 0, (c_cpu == 0), and yet
the mtx on that (which is what we are spinning on) is free (its owned == 4). So why would we have crashed
holding the spin lock too long? Unless just as we decided to panic the owner released it. Hmm there is
code in there to check that though.. td = mtx_owner() if (td == NULL) return...

The c_flags = 22 which is PENDING/ACTIVE and Return unlocked. That means it is *supposed* to be on the
callout wheel someplace. The linked list used is then the LLIST.. i.e. {le_next = 0x0, le_prev = 0xfffffe0000804db8}

Now if le_next is 0, its the end of the list.

Can you look back a the previous.. i.e. walk it back

print *(struct callout *)0xfffffe0000804db8

That should print a valid callout as well.. and we should be able to walk back to
the top of the wheel.. by keeping on moving back.

Hiren:

There also should have been a printf before the panic string
printf( "spin lock %p (%s) held by %p (tid %d) too long\n",

m, m->lock_object.lo_name, td, td->td_tid);

Can we see what that lovely printf has displayed?

In theory the lo_name should be "callout" and the %p should point to &cc_cpu[0].cc_lock

Can we validate that these align correctly too?

In D1711#91, @rrs wrote:

Hiren:

Thats helpful.. as I said this is strange. The callout you posted shows its associated with CPU 0, (c_cpu == 0), and yet
the mtx on that (which is what we are spinning on) is free (its owned == 4). So why would we have crashed
holding the spin lock too long? Unless just as we decided to panic the owner released it. Hmm there is
code in there to check that though.. td = mtx_owner() if (td == NULL) return...

The c_flags = 22 which is PENDING/ACTIVE and Return unlocked. That means it is *supposed* to be on the
callout wheel someplace. The linked list used is then the LLIST.. i.e. {le_next = 0x0, le_prev = 0xfffffe0000804db8}

Now if le_next is 0, its the end of the list.

Can you look back a the previous.. i.e. walk it back

print *(struct callout *)0xfffffe0000804db8

That should print a valid callout as well.. and we should be able to walk back to
the top of the wheel.. by keeping on moving back.

Hrm, are there only 2 entries here?

(kgdb) print *(struct callout *)0xfffffe0000804db8
$4 = {c_links = {le = {le_next = 0xfffff8000d81dc98, le_prev = 0x0}, sle = {sle_next = 0xfffff8000d81dc98}, tqe = {tqe_next = 0xfffff8000d81dc98, 
      tqe_prev = 0x0}}, c_time = 0, c_precision = 0, c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 0, c_cpu = 0}

le_next is back to 0xfffff8000d81dc98.

Anything else I should look at?

In D1711#92, @rrs wrote:
Hiren:

There also should have been a printf before the panic string
printf( "spin lock %p (%s) held by %p (tid %d) too long\n",
m, m->lock_object.lo_name, td, td->td_tid);
Can we see what that lovely printf has displayed?

Ah, my bad for not providing that earlier here:

spin lock 0xffffffff81364180 (callout) held by 0xfffff8000dc0e920 (tid 100111) too long
panic: spin lock held too long

Now,

(kgdb) print *(struct callout *)0xffffffff81364180
$8 = {c_links = {le = {le_next = 0xffffffff80d03d28, le_prev = 0xb0000}, sle = {sle_next = 0xffffffff80d03d28}, tqe = {tqe_next = 0xffffffff80d03d28, 
      tqe_prev = 0xb0000}}, c_time = 0, c_precision = 4, c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 0, c_cpu = 0}

if I print it's le_next, that node's le_prev is not pointing to 0xffffffff81364180

(kgdb) print *(struct callout *)0xffffffff80d03d28
  $9 = {c_links = {le = {le_next = 0x74756f6c6c6163, le_prev = 0x6f207265626d754e}, sle = {sle_next = 0x74756f6c6c6163}, tqe = {tqe_next = 0x74756f6c6c6163, 
        tqe_prev = 0x6f207265626d754e}}, c_time = 7307497714779234406, c_precision = 7809632219779637363, c_arg = 0x61206c656568776c, c_func = 0x20657a697320646e, 
    c_lock = 0x6f656d697420666f, c_flags = 690517109, c_cpu = 1701998624}

Also, trying to print le_next or le_prev is not working:

(kgdb) print *(struct callout *)0x6f207265626d754e
Cannot access memory at address 0x6f207265626d754e
(kgdb) print *(struct callout *)0x74756f6c6c6163  
Cannot access memory at address 0x74756f6c6c6163

Is something wrong here or I am failing to understand this. (The latter has a higher probability)

In theory the lo_name should be "callout" and the %p should point to &cc_cpu[0].cc_lock

Can we validate that these align correctly too?

How do I validate it?

Wow, but look at the flags here. They are cc_flags == 0.

That means its *not* on the wheel and yet the thing it points to
(our victim) *thinks* its on the wheel.

This is not good.. We are stuck in a lock
trying to reschedule the timeout (a lock that is
not locked by the way yet we kept spinning).. but PENDING/ACTIVE flag are
set. Those should be off if it was not on the callout wheel. Compound
that with the fact that the previous callout entry on the "wheel" is not on the
wheel having cc_flags == 0.. thats really strange..

This is really bizarre.

Let me think about this for a while..

Hiren:

You have the wrong structure type.

In the printf before panic it is giving you the lock that was spinning.. that
would be in the callout_cpu structure I bet.. I mis-told you in email.

So if you did

print *(struct callout_cpu *)0xffffffff81364180

It should show you our CPU structure .. and I believe the lock should be un-held owner = 4
Either that or 0xffffffff81364180 does not equal

&cc_cpu[0]

In D1711#96, @rrs wrote:

Hiren:

You have the wrong structure type.

In the printf before panic it is giving you the lock that was spinning.. that
would be in the callout_cpu structure I bet.. I mis-told you in email.

So if you did

print *(struct callout_cpu *)0xffffffff81364180

It should show you our CPU structure .. and I believe the lock should be un-held owner = 4
Either that or 0xffffffff81364180 does not equal

&cc_cpu[0]

bah , right.

(kgdb) print *(struct callout_cpu *)0xffffffff81364180
$5 = {cc_lock = {lock_object = {lo_name = 0xffffffff80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{
      cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, 
      cc_waiting = false}, {cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, 
      cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0xfffffe00006a4000, cc_callwheel = 0xfffffe00007c6000, cc_expireq = {tqh_first = 0x0, 
    tqh_last = 0xffffffff81364288}, cc_callfree = {slh_first = 0xfffffe00007c5240}, cc_firstevent = 899380454888656, cc_lastscan = 899380454354416, 
  cc_cookie = 0xfffff8000c34f100, cc_bucket = 31391, cc_ktr_event_name = "callwheel cpu 0\000\000\000\000"}

randall: Let me try to explain a bit slower:

Assume that a callout has been cancelled and is now migrating to another CPU. c->c_cpu = CPUBLOCK.

Upon calling _callout_stop_safe() we will enter the callout_lock() function which will wait for the condition "c->c_cpu == CPUBLOCK" to disappear.

Thread 1:
mtx_lock(xxx);

_callout_stop_safe(struct callout *c, int safe)
   callout_lock();
     while(c->c_cpu == CPUBLOCK)
          ;

mtx_unlock(xxx);

Thread 2:

 softclock_call_cc()
     if (c_lock != NULL)
         class->lc_lock(xxx);   /* stuck forever */


In softclock_call_cc() c->c_cpu is only updated _after_ that class->lc_lock() has been locked / unlocked. Especially this is critical if the callback function locks and unlocks the "xxx" lock multiple times.

Do you get it? Or do you want me to explain more. Sorry I am not so good communicating with you.

--HPS

Hans:

Let me explain to you how I think you are wrong, you are missing a
small subtle thing here....

When we do the callout_stop we set
cc_migration_cpu() = CPUBLOCK

*NOT*

c->c_cpu = CPUBLOCK;

You are confusing the two things. The CPUBLOCK is used
in two different places.

The loop in callout_lock() is
looking for

c->c_cpu != CPUBLOCK

This only gets set when callout_cpu_switch() is called *not* when callout_reset or callout_stop is
called.

So I think you have the wrong fish in the net.

What Hiren is seeing is something completely different. What is
strange is the cc_cpu[] structures he dumped has *no* locks present.. Look
back above and see the cc_lock structures of the cc_cpu's.. all of the
owners are 4.. i.e. MTX_UNLOCKED.

So how in the world are we getting a "thread holds spin lock too long".

Worse the callout that was being "inserted" is ACTIVE and PENDING, which
means it should be on the wheel, but it is *not* on the wheel .. at least
not that I can see... since Hiren has traced back on that callout and
the previous one on the list (which points to it) has a c_flags of 0..

This is very very odd.

randall: You are right I confused the two c_cpu values.

Let my try to shoot again:

static struct callout_cpu *
callout_cpu_switch(struct callout *c, struct callout_cpu *cc, int new_cpu)
{

struct callout_cpu *new_cc;

MPASS(c != NULL && cc != NULL);
CC_LOCK_ASSERT(cc);

/*
 * Avoid interrupts and preemption firing after the callout cpu
 * is blocked in order to avoid deadlocks as the new thread
 * may be willing to acquire the callout cpu lock.
 */
c->c_cpu = CPUBLOCK;
spinlock_enter();
CC_UNLOCK(cc);
new_cc = CC_CPU(new_cpu);
CC_LOCK(new_cc);
spinlock_exit();
c->c_cpu = new_cpu;
return (new_cc);

}

spinlock_enter() is not an SMP lock. It only protects the executing CPU from interrupts and task switching. Basically that means two threads can be stuck at:
new_cc = CC_CPU(new_cpu), meaning a single callout can be switched to multiple different CPU's at the same time? Or am I wrong?

--HPS

Hans:

I think your wrong here. The caller of callout_cpu_switch() is holding
the CC_LOCK(). Now there are only two callers of this function.

Either the actual callout code itself (softclock_call_cc()) or
the callout_reset_sbt_on(). In the case of callout_reset_sbt_on().

So lets look at these two cases

Case 1: soft clock_call_cc() - In this case the callout is finished

and is being migrated to a different CPU. This would have
happened only because we had the case of doing a
callout_reset_sbt_on() and finding the callout executing
on the wheel *is* the callout running now and we can't
do the migration, so the CPU is set and the PENDING/MIGRATION/ACTVIE
flags are also present. The callout lock is held from the perspective
of the original CPU, and the code goes through and releases the
lock only after setting in the c_lock = CPUBLOCK which will prevent any
other running callout_reset_sbt_on() from moving forward, they will
be stuck waiting for the c_lock to clear to a CPU number. When the migration
is complete the function restores the cpu and the other guy can proceed. It 
may be moving to yet another CPU, that would be fine too since it now has
the lock on the new CPU and moves forward out of callout_reset_sbt_on()

callout_reset_sbt_on() - here again we have the current CPU callout lock, and we will only do the switch if the callout is not the current guy executing, if it is we set the flags so <1> above occurs. If its not we move it.. again we set the CPUBLOCK on the callout so no there callout_reset_sbt_on() can get in until after we move it.

So there is no way we can have two guys moving at the same time. The CPUBLOCK on
the c->c_cpu is used to serialize the two guys.

This of course is all noise, since it is *not* related in any shape or form to what
Hiren is seeing. If it was you would have seen the panic not at line 530 but
above that..

So the question is what is happening.. I am still contemplating it.. and hopefully I can
figure out a way to recreate it.. without a vmcore/kernel.debug its pretty hard to
decipher these things.. Hiren has been a wonderful help but I am afraid I am about
out of question to ask for now.. Its very very strange.

Randall: Shooting again:

Thread 1 is executing in "softclock_call_cc()" in the "new_cc = callout_cpu_switch(c, cc, new_cpu)" it has set "c->c_cpu = CPUBLOCK;"
Thread 2 is now executing callout_reset(). As you can see in the implementation detail, it is reading "c_cpu" without CC_LOCK() locked and without checking if it was reading the c_cpu while it was equal to CPUBLOCK !

#define callout_reset(c, on_tick, fn, arg) \

callout_reset_on((c), (on_tick), (fn), (arg), (c)->c_cpu)

This results in migrating the callout to CPUBLOCK, and subsequent calls will hang in callout_lock() like in hirens panic.

--HPS

I have thought long and hard about this. I don't think its a bug.
But to know for sure I will need to add some instrumentation.

I suspect what is happening is a tremendous number of callouts
all come due at the same time. The three back traces trying to stop or reset
a callout are just unlucky in they don't get the lock as the callout code
works through doing its loops of
CC_LOCK(cc)
while there is more on the list

prepare callout
CC_UNLOCK(cc)
 call_callout_function
CC_LOCK(cc)

done
CC_UNLOCK(cc)

The spin-mtx has (from what I can see) no awareness of the fact that you might
have lost several bids to get the lock. It just crashes if it spins for too long
and cannot get the lock.

The previous problem is fixed, which I could reproduce, where the callout temp-list
was corrupt and pointing to itself.. thus the soft clock looped forever...

Hiren/Sbruno:

Let me make a special patch that includes some counts in the cpu_cc structure
that we can find out

For both callout loops how many the last call had
For both callout loops what was the max ever seen

This will give us a hint if I am correct.

I have also asked jhb on his thoughts for this in email.

Let me re-phrase if I was unclear:

I see nothing preventing the callout_reset() macro from reading (c)->c_cpu lock when it is equal to CPUBLOCK while another CPU is calling callout_cpu_switch() on the same callout.

Especially in the case of a migration case done by the callout_process(), because there is no common "c->c_lock" so-to-speak for the two pieces of code.

Ok after much discussion with Hans, we *could* have an issue where the
user sends in an invalid CPU. This is *not* what I think is happening with Hiren
since the cc_cpu and lock is all sane (it would be a invalid index to cc_cpu which
would not have an init'd lock). But I have created a new review since this one
is already committed

https://reviews.freebsd.org/D1894

To add verification and correction code for just in case the callers of the callout system
mix the callout_reset <and> callout_reset_on within the same callout structure. No system
that I can find currently does this, but it is best to have the protection.

Another panic from an almost *idle* box:

Sanitized panic #6

Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 6525980672B (6223 MB)
  Blocksize: 512
  Dumptime: Thu Feb 19 06:16:57 2015
  Hostname: xxxxxxxxxxxxxxxxxx
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 10.1-STABLE-llnw12 #0: Fri Feb 13 02:22:48 MST 2015
    jason@xxxxxxxxxxxx:/usr/obj/usr/src/sys/SIXFOUR
  Panic String: spin lock held too long
  Dump Parity: 1313546413
  Bounds: 0
  Dump Status: good

Backtrace:
Reading symbols from /boot/kernel/cc_cubic.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cubic.ko.symbols
Reading symbols from /boot/kernel/cc_cdg.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cdg.ko.symbols
Reading symbols from /boot/kernel/h_ertt.ko.symbols...done.
Loaded symbols for /boot/kernel/h_ertt.ko.symbols
Reading symbols from /boot/kernel/ftcp.ko...done.
Loaded symbols for /boot/kernel/ftcp.ko
#0  doadump (textdump=1) at pcpu.h:219
	in pcpu.h
(kgdb) #0  doadump (textdump=1) at pcpu.h:219
#1  0xffffffff80749c17 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:452
#2  0xffffffff80749ff4 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80735683 in _mtx_lock_spin_cookie (c=<value optimized out>, 
    tid=<value optimized out>, opts=<value optimized out>, 
    file=<value optimized out>, line=<value optimized out>)
    at /usr/src/sys/kern/kern_mutex.c:561
#4  0xffffffff80af3fc1 in smp_tlb_shootdown (vector=246, 
    pmap=0xffffffff81391ae0, addr1=18446742009410568192, 
    addr2=18446742009410572288) at /usr/src/sys/amd64/amd64/mp_machdep.c:1145
#5  0xffffffff80af5d3c in pmap_invalidate_range (pmap=<value optimized out>, 
    sva=<value optimized out>, eva=<value optimized out>)
    at /usr/src/sys/amd64/amd64/pmap.c:1480
#6  0xffffffff807d57ef in vfs_vmio_release (bp=0xfffffe1f298bd000)
    at /usr/src/sys/kern/vfs_bio.c:1861
#7  0xffffffff807d622b in getnewbuf (maxsize=<value optimized out>, 
    gbflags=<value optimized out>) at /usr/src/sys/kern/vfs_bio.c:2149
#8  0xffffffff807d3791 in getblk (vp=0xfffff802a158f3b0, blkno=0, size=4096, 
    slpflag=0, slptimeo=0, flags=<value optimized out>)
    at /usr/src/sys/kern/vfs_bio.c:3210
#9  0xffffffff807d41dd in breadn_flags (vp=0xfffff802a158f3b0, blkno=0, 
    size=0, rablkno=0x0, rabsize=0x0, cnt=0, cred=0xfffffe2020f6e670, 
    flags=0, bpp=0xfffffe2020f6e670) at /usr/src/sys/kern/vfs_bio.c:1127
#10 0xffffffff8095c84a in ffs_blkatoff (vp=0x0, offset=0, res=0x0, 
    bpp=0xfffffe2020f6e7f8) at /usr/src/sys/ufs/ffs/ffs_subr.c:86
#11 0xffffffff8096ef92 in ufs_readdir (ap=0xfffffe2020f6e900)
    at /usr/src/sys/ufs/ufs/ufs_vnops.c:2177
#12 0xffffffff80c2af07 in VOP_READDIR_APV (vop=<value optimized out>, 
    a=<value optimized out>) at vnode_if.c:1821
#13 0xffffffff807f9aaa in kern_getdirentries (td=0xfffff800230a8000, 
    fd=<value optimized out>, 
    buf=0x8022a9000 <Address 0x8022a9000 out of bounds>, 
    count=<value optimized out>, basep=0xfffffe2020f6e980, residp=0x0)
    at vnode_if.h:758
#14 0xffffffff807f9888 in sys_getdirentries (td=0x0, uap=0xfffffe2020f6ea40)
    at /usr/src/sys/kern/vfs_syscalls.c:4030
#15 0xffffffff80b059ca in amd64_syscall (td=0xfffff800230a8000, traced=0)
    at subr_syscall.c:134
#16 0xffffffff80aeae3b in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:396
#17 0x0000000801588efa in ?? ()
Current language:  auto; currently minimal
(kgdb)

@rrs This does not have your patch so we won't get any more interesting data but I wanted to show that we get the panic on almost idle boxes too. What does that tell us?

The panic string:

Unread portion of the kernel message buffer:
spin lock 0xffffffff8136a700 (smp rendezvous) held by 0xfffff8033745e920 (tid 100393) too long
panic: spin lock held too long
cpuid = 5

And cc_cpu data: (sysctl hw.ncpu = 6)

(kgdb) p * cc_cpu@6
$4 = {{
    cc_lock = {
      lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", 
        lo_flags = 720896, 
        lo_data = 0, 
        lo_witness = 0x0
      }, 
      mtx_lock = 4
    }, 
    cc_exec_entity = {{
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }, {
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }}, 
    cc_next = 0x0, 
    cc_callout = 0xfffffe00006a4000, 
    cc_callwheel = 0xfffffe00007c6000, 
    cc_expireq = {
      tqh_first = 0x0, 
      tqh_last = 0xffffffff81364288
    }, 
    cc_callfree = {
      slh_first = 0xfffffe00007c52c0
    }, 
    cc_firstevent = 631304712284972, 
    cc_lastscan = 631304712287589, 
    cc_cookie = 0xfffff800151f2d80, 
    cc_bucket = 11019, 
    cc_ktr_event_name = "callwheel cpu 0\000\000\000\000"
  }, {
    cc_lock = {
      lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", 
        lo_flags = 720896, 
        lo_data = 0, 
        lo_witness = 0x0
      }, 
      mtx_lock = 4
    }, 
    cc_exec_entity = {{
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }, {
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }}, 
    cc_next = 0x0, 
    cc_callout = 0x0, 
    cc_callwheel = 0xfffffe000092a000, 
    cc_expireq = {
      tqh_first = 0x0, 
      tqh_last = 0xffffffff81364408
    }, 
    cc_callfree = {
      slh_first = 0x0
    }, 
    cc_firstevent = 631305899306477, 
    cc_lastscan = 631304548574354, 
    cc_cookie = 0xfffff800151f2d00, 
    cc_bucket = 11019, 
    cc_ktr_event_name = "callwheel cpu 1\000\000\000\000"
  }, {
    cc_lock = {
      lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", 
        lo_flags = 720896, 
        lo_data = 0, 
        lo_witness = 0x0
      }, 
      mtx_lock = 4
    }, 
    cc_exec_entity = {{
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }, {
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }}, 
    cc_next = 0x0, 
    cc_callout = 0x0, 
    cc_callwheel = 0xfffffe000096a000, 
    cc_expireq = {
      tqh_first = 0x0, 
      tqh_last = 0xffffffff81364588
    }, 
    cc_callfree = {
      slh_first = 0x0
    }, 
    cc_firstevent = 631547828895744, 
    cc_lastscan = 631545698055244, 
    cc_cookie = 0xfffff800151f2c80, 
    cc_bucket = 15578, 
    cc_ktr_event_name = "callwheel cpu 2\000\000\000\000"
  }, {
    cc_lock = {
      lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", 
        lo_flags = 720896, 
        lo_data = 0, 
        lo_witness = 0x0
      }, 
      mtx_lock = 4
    }, 
    cc_exec_entity = {{
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }, {
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }}, 
    cc_next = 0x0, 
    cc_callout = 0x0, 
    cc_callwheel = 0xfffffe00009aa000, 
    cc_expireq = {
      tqh_first = 0x0, 
      tqh_last = 0xffffffff81364708
    }, 
    cc_callfree = {
      slh_first = 0x0
    }, 
    cc_firstevent = 631547577237504, 
    cc_lastscan = 631545436060144, 
    cc_cookie = 0xfffff800151f2c00, 
    cc_bucket = 13829, 
    cc_ktr_event_name = "callwheel cpu 3\000\000\000\000"
  }, {
    cc_lock = {
      lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", 
        lo_flags = 720896, 
        lo_data = 0, 
        lo_witness = 0x0
      }, 
      mtx_lock = 4
    }, 
    cc_exec_entity = {{
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }, {
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }}, 
    cc_next = 0x0, 
    cc_callout = 0x0, 
    cc_callwheel = 0xfffffe00009ea000, 
    cc_expireq = {
      tqh_first = 0x0, 
      tqh_last = 0xffffffff81364888
    }, 
    cc_callfree = {
      slh_first = 0x0
    }, 
    cc_firstevent = 631547325579264, 
    cc_lastscan = 631545178364344, 
    cc_cookie = 0xfffff800151f2b80, 
    cc_bucket = 13170, 
    cc_ktr_event_name = "callwheel cpu 4\000\000\000\000"
  }, {
    cc_lock = {
      lock_object = {
        lo_name = 0xffffffff80d03d28 "callout", 
        lo_flags = 720896, 
        lo_data = 0, 
        lo_witness = 0x0
      }, 
      mtx_lock = 4
    }, 
    cc_exec_entity = {{
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }, {
        cc_curr = 0x0, 
        ce_migration_func = 0, 
        ce_migration_arg = 0x0, 
        ce_migration_cpu = 64, 
        ce_migration_time = 0, 
        ce_migration_prec = 0, 
        cc_cancel = false, 
        cc_waiting = false
      }}, 
    cc_next = 0x0, 
    cc_callout = 0x0, 
    cc_callwheel = 0xfffffe0000a2a000, 
    cc_expireq = {
      tqh_first = 0x0, 
      tqh_last = 0xffffffff81364a08
    }, 
    cc_callfree = {
      slh_first = 0x0
    }, 
    cc_firstevent = 631549707943936, 
    cc_lastscan = 631547562069400, 
    cc_cookie = 0xfffff800151f2b00, 
    cc_bucket = 21496, 
    cc_ktr_event_name = "callwheel cpu 5\000\000\000\000"
  }}

Randall:

I think this needs to be manually closed as the svn commit hook didn't fire when r278469 hit the tree.

This revision is now accepted and ready to land.Mar 24 2015, 2:24 PM

rrs closed this revision.Mar 28 2015, 2:54 PM

Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests).
ClosedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 3631

kern/kern_timeout.c

sys/callout.h

Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests).ClosedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 3631

kern/kern_timeout.c

sys/callout.h

Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests).
ClosedPublic
Actions

Revision Contents
Changeset List