Page MenuHomeFreeBSD

vt: avoid using a spinlock
AcceptedPublic

Authored by corvink on Mar 29 2023, 2:14 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Dec 28, 3:52 AM
Unknown Object (File)
Wed, Dec 4, 11:11 PM
Unknown Object (File)
Nov 24 2024, 4:57 AM
Unknown Object (File)
Sep 24 2024, 12:18 AM
Unknown Object (File)
Sep 24 2024, 12:18 AM
Unknown Object (File)
Sep 23 2024, 5:42 PM
Unknown Object (File)
Sep 23 2024, 5:42 PM
Unknown Object (File)
Sep 14 2024, 9:19 PM

Details

Summary

A spinlock disables interrupts. As rendering could take some time,
disabling interrupts while rendering can cause jitter on real time
systems.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 50653
Build 47544: arc lint + arc unit

Event Timeline

corvink added subscribers: tsoome, jah, hselasky and 2 others.

@dumbbell @jah @hselasky @emaste @tsoome I'm adding you as reviewer because I found you in the git history of these files.

This revision is now accepted and ready to land.Mar 29 2023, 3:20 PM

Hi,

I'm not sure if this is as easy as it sounds.

Did you make a callgraph?

I'm worried that some layers using VT are calling into various functions with spin locks held already.

Have a look at TTY at least.

Using serial consoles, COM-ports and USB modems.

Also you need to consider KDB, the kernel debugger, and how it prints stuff on the screen, and what may happen if this mutex is not disabling IRQ's.

There was a similar issue with USB keyboards and KDB, and that was solved by allocating on USB transfer for normal operation and another USB transfer structure for KDB operation.

Then "panic" mode operation still works almost as expected, w/o triggering double faults.

The idea is in general good to avoid disabling interrupts, inside VT.

Another approach is atomic operations and EPOCH. VT doesn't do that much really.

--HPS

Hi guys,
basically, I agree with Hans and not sure it will behave well w/o spinlock.
But that way VT will be faster and machine performance will be better.
/me vote to try

In D39323#895452, @ray wrote:

Hi guys,
basically, I agree with Hans and not sure it will behave well w/o spinlock.
But that way VT will be faster and machine performance will be better.
/me vote to try

Same, that's why I said to Corvin to open this review to have a discussion about the approach we took in our downstream fork.
We have a ~2 months before stable/14 branch, I think that's enough time for people to complain about this.
@corvink when you commit this please add a note in UPDATING and maybe send a mail to current@ so people know about this in case they see some problems.

Thanks for your feedback.

Hi,

I'm not sure if this is as easy as it sounds.
Did you make a callgraph?
I'm worried that some layers using VT are calling into various functions with spin locks held already.

We're using this patch for month now in our fork. So far, we haven't seen any issues for our use case. Additionally, latency spikes went away after applying this patch.

In D39323#895514, @manu wrote:

@corvink when you commit this please add a note in UPDATING and maybe send a mail to current@ so people know about this in case they see some problems.

Will do.

I tried this patch on a hyper-v based FreeBSD 14-CURRENT and a panic happened directly after the efifb is loaded. I wasn't able to get a crash dump.

I think doing a functional function trace is mandatory before this commit, seeing what code is calling into what code, and what locks are in use.

If you need help ask, @zlei .

What @gbe reports clearly shows we can do better than just pushing changes and waiting for others to fix them.

--HPS

@zlei : Can you help here using your current knowledge about function tracing, to see if changing from spinlock to mutex can work, as described by this change. That you don't get invalid locking sequences like spinlock before mutex ...

I think doing a functional function trace is mandatory before this commit, seeing what code is calling into what code, and what locks are in use.

If you need help ask, @zlei .

What @gbe reports clearly shows we can do better than just pushing changes and waiting for others to fix them.

--HPS

Nobody here said anything about "waiting for others to fix them". I hope that you understand that we cannot test every configuration possible.
And, sure, since now we have a report that it doesn't work on hyper-v nothing will be commited before the problem is understood and fixed.

In D39323#895597, @manu wrote:

I think doing a functional function trace is mandatory before this commit, seeing what code is calling into what code, and what locks are in use.

If you need help ask, @zlei .

What @gbe reports clearly shows we can do better than just pushing changes and waiting for others to fix them.

--HPS

Nobody here said anything about "waiting for others to fix them". I hope that you understand that we cannot test every configuration possible.
And, sure, since now we have a report that it doesn't work on hyper-v nothing will be commited before the problem is understood and fixed.

It would also help to get list of tested setups - as this issue turns out to be more complicated, it would be nice to get important variants covered with tests. I'm sure people will help there if needed.

@manu : You are right, that we cannot test everything before commit. That was not my point. I just meant when there are tools that easily trace functions calls in the whole FreeBSD kernel, and when changing central components like VT, we should just make function tracing a routine, for this kind of changes, like changing the mutex type.

:-)

@zlei : Can you help here using your current knowledge about function tracing, to see if changing from spinlock to mutex can work, as described by this change. That you don't get invalid locking sequences like spinlock before mutex ...

I'm not familiar with VT.

This change is spinlock -> mutex. mutex is not direct replacement of spinlock from what I know. The unlock routine is OK, but not sure the lock routines.

I think it is better to check all call stack to VTBUF_LOCK, TERMINAL_LOCK, TERMINAL_LOCK_TTY and TERMINAL_LOCK_CONS, to see if some bad can happen.

@zlei : Can you educate the people here, what tools you use and how to check the call stack of those functions / macros? Do you need to make them static inline functions in order to trace them?

@zlei : Can you educate the people here, what tools you use and how to check the call stack of those functions / macros? Do you need to make them static inline functions in order to trace them?

Currently by hand egrep -r function_name sys, a little tedious anyway.

I wonder we can get the call graph by analyze debug symbol files, maybe there're good tools for this.

A reverse call tree (should be graph) of VTBUF_LOCK. Might not complete but I tried my best.

VTBUF_LOCK()
    vtbuf_lock()
        vtterm_pre_input() 
            termteken_pre_input() // .tc_pre_input = vtterm_pre_input, sys/kern/subr_terminal.c
                termteken_pre_input() // .tf_pre_input = termteken_pre_input, sys/kern/subr_terminal.c
                    teken_funcs_pre_input() // sys/teken/teken.c
                        teken_input()
                            main() // sys/teken/demo/teken_demo.c
                            main() // sys/teken/stress/teken_stress.c
        vt_mark_mouse_position_as_dirty()
            vt_mouse_state()
                vtterm_ioctl()
            vt_flush()
            
        vt_flush()
            vt_timer() //by callout_reset(&vd->vd_timer, hz / VT_TIMERFREQ, vt_timer, vd);
            vtterm_done() // .tc_done = vtterm_done
                termtty_outwakeup() // .tsw_outwakeup = termtty_outwakeup, sys/kern/subr_terminal.c
                termcn_cnputc()  // .cn_putc = termcn_cnputc , sys/kern/subr_terminal.c
                    cnputc() // sys/kern/kern_cons.c
                        cngets() // cngetc(), cpu_spinwait
                            parse_dir_ask() // sys/kern/vfs_mountroot.c
                                parse_directive()
                                    vfs_mountroot_parse()
                                        vfs_mountroot()
                                            start_init() // TSENTER, sys/kern/init_main.c
                        cnputsn() // mtx_lock_spin(&cnputs_mtx);
                            sbuf_tty_drain() // sys/kern/tty_info.c, sbuf_set_drain(&sb, sbuf_tty_drain, tp); tty_info()
                            cnputs()
                                cngets() 
                                prf_putbuf() // sys/kern/subr_prf.c
                                    _vprintf()
                                        vlog()        // lots of usage in kernel
                                        vprintf()    // lots of usage in kernel
                                    putbuf()        // lots of usage in kernel
                                        putchar()    // lots of usage in kernel
                                    sbuf_putbuf()        // lots of usage in kernel
                                    sbuf_printf_drain()    // lots of usage in kernel
                        constty_clear() // tty_assert_locked(tp);
                            constty_timeout() // callout_reset_sbt(&conscallout, SBT_1S / constty_wakeups_per_second, 0, constty_timeout, tp, C_PREL(1));
                                constty_set()
                                    tty_generic_ioctl()
                            ttydev_leave() // sys/kern/tty.c
                            tty_generic_ioctl()
            vtterm_cngetc() // .tc_cngetc = vtterm_cngetc
                termcn_cngetc() // .cn_getc = termcn_cngetc
                    cncheckc()        // lots of usage in kernel
                        cngetc()    // lots of usage in kernel
        
    vtbuf_clearhistory()
        vtterm_ioctl()
    vtbuf_grow()
        vt_change_font()
            vt_resize()
                vt_upgrade() // SYSINIT(vt_early_cons, SI_SUB_INT_CONFIG_HOOKS, SI_ORDER_ANY, vt_upgrade, &vt_consdev);
                    vt_replace_backend()
                        vt_allocate()
                            vt_fb_attach() // sys/dev/vt/hw/fb/vt_fb.c
                                fbd_register() // sys/dev/fb/fbd.c
                                    register_fb_wrap() // EVENTHANDLER_REGISTER(register_framebuffer, register_fb_wrap, NULL, EVENTHANDLER_PRI_ANY);
                                    fbd_attach() // DEVMETHOD(device_attach,        fbd_attach),
                        vt_deallocate()
                            vt_fb_detach()
            vtterm_ioctl()
        vtbuf_sethistory_size()
            vtterm_ioctl()()
        vt_init_logos()    // SYSINIT(vt_logos, SI_SUB_TASKQ, SI_ORDER_ANY, vt_init_logos, NULL);
        vt_fini_logos()    // TIMEOUT_TASK_INIT(taskqueue_thread, &vt_splash_cpu_fini_task, 0, vt_fini_logos, NULL);
    vtbuf_flush_mark()
        vtbuf_set_mark()
            vt_mouse_event()
                sysmouse_process_event()
                    consolectl_ioctl()    // .d_ioctl = consolectl_ioctl, sys/dev/vt/vt_sysmouse.c
    vtbuf_scroll_mode()
        vtterm_cngetc()
        vt_scrollmode_kbdevent()
        vt_processkey()
            vt_kbdevent()    // kbd_allocate("kbdmux", -1, vd, vt_kbdevent, vd);

SYSINIT (vt_early_cons, ... and SYSINIT(vt_logos, SI_SUB_TASKQ, SI_ORDER_ANY, vt_init_logos, NULL); are suspicious.
Is mutex safe to use in early stage ?

In D39323#895619, @zlei wrote:

SYSINIT (vt_early_cons, ... and SYSINIT(vt_logos, SI_SUB_TASKQ, SI_ORDER_ANY, vt_init_logos, NULL); are suspicious.
Is mutex safe to use in early stage ?

My impression is all init code is run off a single thread, thread0, until multithreading and the timer subsystem is initialized. Basically mutexes() should never block, because there will not be any mutex congestion at this early stage of boot. Only recursive calls may happen, and that's technically a NOP.

I see one obvious problem, and that is cnputc() is called by printf() in the kernel.

If we call printf() under a spinlock() and the VT buffer is non-spinlocked, that will be a problem!

Does someone see a good solution for calling printf() within a critical section, basically?

Could we temporarily pin printf() callers to CPU-0 and let other realtime threads run on CPU-1+ ? Then if they disable interrupts, it's not that critical?

I have a patch to move Giant out of adaptive locking. I wonder if it can help?

Because VT/console is still all under Giant?

{F59885551}

--HPS