A spinlock disables interrupts. As rendering could take some time,
disabling interrupts while rendering can cause jitter on real time
systems.
Details
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Skipped - Unit
Tests Skipped - Build Status
Buildable 50653 Build 47544: arc lint + arc unit
Event Timeline
Hi,
I'm not sure if this is as easy as it sounds.
Did you make a callgraph?
I'm worried that some layers using VT are calling into various functions with spin locks held already.
Have a look at TTY at least.
Using serial consoles, COM-ports and USB modems.
Also you need to consider KDB, the kernel debugger, and how it prints stuff on the screen, and what may happen if this mutex is not disabling IRQ's.
There was a similar issue with USB keyboards and KDB, and that was solved by allocating on USB transfer for normal operation and another USB transfer structure for KDB operation.
Then "panic" mode operation still works almost as expected, w/o triggering double faults.
The idea is in general good to avoid disabling interrupts, inside VT.
Another approach is atomic operations and EPOCH. VT doesn't do that much really.
--HPS
Hi guys,
basically, I agree with Hans and not sure it will behave well w/o spinlock.
But that way VT will be faster and machine performance will be better.
/me vote to try
Same, that's why I said to Corvin to open this review to have a discussion about the approach we took in our downstream fork.
We have a ~2 months before stable/14 branch, I think that's enough time for people to complain about this.
@corvink when you commit this please add a note in UPDATING and maybe send a mail to current@ so people know about this in case they see some problems.
Thanks for your feedback.
We're using this patch for month now in our fork. So far, we haven't seen any issues for our use case. Additionally, latency spikes went away after applying this patch.
Will do.
I tried this patch on a hyper-v based FreeBSD 14-CURRENT and a panic happened directly after the efifb is loaded. I wasn't able to get a crash dump.
@zlei : Can you help here using your current knowledge about function tracing, to see if changing from spinlock to mutex can work, as described by this change. That you don't get invalid locking sequences like spinlock before mutex ...
Nobody here said anything about "waiting for others to fix them". I hope that you understand that we cannot test every configuration possible.
And, sure, since now we have a report that it doesn't work on hyper-v nothing will be commited before the problem is understood and fixed.
It would also help to get list of tested setups - as this issue turns out to be more complicated, it would be nice to get important variants covered with tests. I'm sure people will help there if needed.
@manu : You are right, that we cannot test everything before commit. That was not my point. I just meant when there are tools that easily trace functions calls in the whole FreeBSD kernel, and when changing central components like VT, we should just make function tracing a routine, for this kind of changes, like changing the mutex type.
:-)
I'm not familiar with VT.
This change is spinlock -> mutex. mutex is not direct replacement of spinlock from what I know. The unlock routine is OK, but not sure the lock routines.
I think it is better to check all call stack to VTBUF_LOCK, TERMINAL_LOCK, TERMINAL_LOCK_TTY and TERMINAL_LOCK_CONS, to see if some bad can happen.
@zlei : Can you educate the people here, what tools you use and how to check the call stack of those functions / macros? Do you need to make them static inline functions in order to trace them?
Currently by hand egrep -r function_name sys, a little tedious anyway.
I wonder we can get the call graph by analyze debug symbol files, maybe there're good tools for this.
A reverse call tree (should be graph) of VTBUF_LOCK. Might not complete but I tried my best.
VTBUF_LOCK() vtbuf_lock() vtterm_pre_input() termteken_pre_input() // .tc_pre_input = vtterm_pre_input, sys/kern/subr_terminal.c termteken_pre_input() // .tf_pre_input = termteken_pre_input, sys/kern/subr_terminal.c teken_funcs_pre_input() // sys/teken/teken.c teken_input() main() // sys/teken/demo/teken_demo.c main() // sys/teken/stress/teken_stress.c vt_mark_mouse_position_as_dirty() vt_mouse_state() vtterm_ioctl() vt_flush() vt_flush() vt_timer() //by callout_reset(&vd->vd_timer, hz / VT_TIMERFREQ, vt_timer, vd); vtterm_done() // .tc_done = vtterm_done termtty_outwakeup() // .tsw_outwakeup = termtty_outwakeup, sys/kern/subr_terminal.c termcn_cnputc() // .cn_putc = termcn_cnputc , sys/kern/subr_terminal.c cnputc() // sys/kern/kern_cons.c cngets() // cngetc(), cpu_spinwait parse_dir_ask() // sys/kern/vfs_mountroot.c parse_directive() vfs_mountroot_parse() vfs_mountroot() start_init() // TSENTER, sys/kern/init_main.c cnputsn() // mtx_lock_spin(&cnputs_mtx); sbuf_tty_drain() // sys/kern/tty_info.c, sbuf_set_drain(&sb, sbuf_tty_drain, tp); tty_info() cnputs() cngets() prf_putbuf() // sys/kern/subr_prf.c _vprintf() vlog() // lots of usage in kernel vprintf() // lots of usage in kernel putbuf() // lots of usage in kernel putchar() // lots of usage in kernel sbuf_putbuf() // lots of usage in kernel sbuf_printf_drain() // lots of usage in kernel constty_clear() // tty_assert_locked(tp); constty_timeout() // callout_reset_sbt(&conscallout, SBT_1S / constty_wakeups_per_second, 0, constty_timeout, tp, C_PREL(1)); constty_set() tty_generic_ioctl() ttydev_leave() // sys/kern/tty.c tty_generic_ioctl() vtterm_cngetc() // .tc_cngetc = vtterm_cngetc termcn_cngetc() // .cn_getc = termcn_cngetc cncheckc() // lots of usage in kernel cngetc() // lots of usage in kernel vtbuf_clearhistory() vtterm_ioctl() vtbuf_grow() vt_change_font() vt_resize() vt_upgrade() // SYSINIT(vt_early_cons, SI_SUB_INT_CONFIG_HOOKS, SI_ORDER_ANY, vt_upgrade, &vt_consdev); vt_replace_backend() vt_allocate() vt_fb_attach() // sys/dev/vt/hw/fb/vt_fb.c fbd_register() // sys/dev/fb/fbd.c register_fb_wrap() // EVENTHANDLER_REGISTER(register_framebuffer, register_fb_wrap, NULL, EVENTHANDLER_PRI_ANY); fbd_attach() // DEVMETHOD(device_attach, fbd_attach), vt_deallocate() vt_fb_detach() vtterm_ioctl() vtbuf_sethistory_size() vtterm_ioctl()() vt_init_logos() // SYSINIT(vt_logos, SI_SUB_TASKQ, SI_ORDER_ANY, vt_init_logos, NULL); vt_fini_logos() // TIMEOUT_TASK_INIT(taskqueue_thread, &vt_splash_cpu_fini_task, 0, vt_fini_logos, NULL); vtbuf_flush_mark() vtbuf_set_mark() vt_mouse_event() sysmouse_process_event() consolectl_ioctl() // .d_ioctl = consolectl_ioctl, sys/dev/vt/vt_sysmouse.c vtbuf_scroll_mode() vtterm_cngetc() vt_scrollmode_kbdevent() vt_processkey() vt_kbdevent() // kbd_allocate("kbdmux", -1, vd, vt_kbdevent, vd);
SYSINIT (vt_early_cons, ... and SYSINIT(vt_logos, SI_SUB_TASKQ, SI_ORDER_ANY, vt_init_logos, NULL); are suspicious.
Is mutex safe to use in early stage ?
My impression is all init code is run off a single thread, thread0, until multithreading and the timer subsystem is initialized. Basically mutexes() should never block, because there will not be any mutex congestion at this early stage of boot. Only recursive calls may happen, and that's technically a NOP.
I see one obvious problem, and that is cnputc() is called by printf() in the kernel.
If we call printf() under a spinlock() and the VT buffer is non-spinlocked, that will be a problem!
Does someone see a good solution for calling printf() within a critical section, basically?
Could we temporarily pin printf() callers to CPU-0 and let other realtime threads run on CPU-1+ ? Then if they disable interrupts, it's not that critical?
I have a patch to move Giant out of adaptive locking. I wonder if it can help?
Because VT/console is still all under Giant?
{F59885551}
--HPS