Page MenuHomeFreeBSD

riscv: Permit spurious faults in kernel mode
AcceptedPublic

Authored by markj on Wed, Nov 20, 6:23 PM.

Details

Reviewers
mhorne
br
jhb
jrtc27
Group Reviewers
riscv
Summary

Right now, pmap_enter() does not issue an sfence.vma after overwriting
an invalid PTE, so the kernel can trigger a page fault when accessing a
freshly created mapping. In this case, pmap_fault() can handle the
exception, but we may panic before that. Move the check; this is
consistent with arm64 and serves to ensure that we don't call vm_fault()
etc. from a context where that's not permitted.

Also fix a related bug: don't enable interrupts if they were disabled in
the context where the exception occurred.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 60707
Build 57591: arc lint + arc unit

Event Timeline

markj requested review of this revision.Wed, Nov 20, 6:23 PM

Presumably in practice the td_critnest check meant the interrupt enabling didn't matter, as spinlock_enter increments it? I assume there aren't many places where we disable interrupts without also being inside a critical section, so it would only be a problem if that sequence of code itself took a spurious fault.

Also, perhaps worth pointing out arm64 does an unconditional intr_enable for the kernel too...

And amd64 has:

if ((frame->tf_rflags & PSL_I) == 0) {
        /*
         * Buggy application or kernel code has disabled
         * interrupts and then trapped.  Enabling interrupts
         * now is wrong, but it is better than running with
         * interrupts disabled until they are accidentally
         * enabled later.
         */
        if (TRAPF_USERMODE(frame)) {
                uprintf(
                    "pid %ld (%s): trap %d (%s) "
                    "with interrupts disabled\n",
                    (long)curproc->p_pid, curthread->td_name, type,
                    trap_msg[type]);
        } else {
                switch (type) {
                case T_NMI:
                case T_BPTFLT:
                case T_TRCTRAP:
                case T_PROTFLT:
                case T_SEGNPFLT:
                case T_STKFLT:
                        break;
                default:
                        printf(
                            "kernel trap %d with interrupts disabled\n",
                            type);

                        /*
                         * We shouldn't enable interrupts while holding a
                         * spin lock.
                         */
                        if (td->td_md.md_spinlock_count == 0)
                                enable_intr();
                }
        }
}

So I think the expectation is really that this shouldn't happen. Whether that's true or not I don't know :)

Presumably in practice the td_critnest check meant the interrupt enabling didn't matter, as spinlock_enter increments it? I assume there aren't many places where we disable interrupts without also being inside a critical section, so it would only be a problem if that sequence of code itself took a spurious fault.

spinlock_enter() increments td_critnest, but so does critical_enter(), which doesn't disable interrupts. The point of the critnest check is, I think, to make sure we don't do anything which might involve rescheduling the current thread. In particular, it's invalid to call vm_fault() (which acquires locks and might sleep) with critnest > 0 (which means, don't reschedule the current thread).

amd64 uses spinlock_count > 0 as a proxy for "are interrupts disabled?", which is close enough in practice, but I don't know why it doesn't just check IF.

Also, perhaps worth pointing out arm64 does an unconditional intr_enable for the kernel too...

Until recently yes, but not anymore.

Works to me on SiFive Premier P550 (no hacky TLB flush needed anymore). Thanks!

This revision is now accepted and ready to land.Thu, Nov 21, 10:08 AM