Page MenuHomeFreeBSD

AST: rework
ClosedPublic

Authored by kib on Jul 24 2022, 9:39 AM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Jan 21, 9:43 PM
Unknown Object (File)
Fri, Jan 17, 7:24 PM
Unknown Object (File)
Fri, Jan 17, 6:16 PM
Unknown Object (File)
Wed, Jan 15, 6:35 AM
Unknown Object (File)
Wed, Jan 15, 6:33 AM
Unknown Object (File)
Wed, Jan 15, 6:33 AM
Unknown Object (File)
Wed, Jan 15, 6:26 AM
Unknown Object (File)
Wed, Jan 15, 6:26 AM

Details

Summary
Make most AST handlers dynamically registered.  This allows to have
subsystem-specific handler source located in the subsystem files,
instead of making subr_trap.c aware of it.  For instance, signal
delivery code on return to userspace is now moved to kern_sig.c.

Also, it allows to have some handlers designated as the cleanup (kclear)
type, which are called both at AST and on thread/process exit.  For
instance, ast(), exit1(), and NFS server no longer need to be aware
about UFS softdep processing.

The dynamic registration also allows third-party modules to register AST
handlers if needed.  There is one caveat with loadable modules: the
code does not make any effort to ensure that the module is not unloaded
before all threads processed through AST handler in it.  In fact, this
is already present behavior for hwpmc.ko and ufs.ko.  I do not think it
is worth the efforts and the runtime overhead to try to fix it.

The patch got a short run for sanity check by Peter.
Per-commit view is available at https://kib.kiev.ua/git/gitweb.cgi?p=deviant3.git;a=shortlog;h=refs/heads/ast
Unfortunately, the main commit cannot be split due to inter-dependencies on the move of ast handlers to subsystems, but this is in fact the point of the change.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
kib requested review of this revision.Jul 24 2022, 9:39 AM

Only amd64 is handled ATM. i386 changes were not even compile-tested, and other arches should be handled after the review.

Cleanup headers in subr_trap.c

sys/amd64/vmm/vmm.c
1334 ↗(On Diff #108493)

Can we add something like

#define td_ast_pending(td, tda) ((td->td_ast & TDAI(tda)) != 0)

and use it instead?

sys/dev/hwpmc/hwpmc_soft.c
436 ↗(On Diff #108447)

Indeed, this appears to be unlocked. PMC_SOFT_CALL and PMC_SOFT_CALL_TF merely disable interrupts.

sys/geom/geom_event.c
104 ↗(On Diff #108493)
sys/kern/kern_exit.c
257 ↗(On Diff #108493)

The comment above is quite cryptic now.

It looks like the kclear mechanism is only used by SU. At the same time, it seems weird that SU cannot specifically request its own handler, i.e., ast_kclear() will run all pending handlers flagged with ASTR_KCLEAR. What if another ASTR_KCLEAR handler is pending but not prepared to run in this context?

Can we pass a mask of desired handlers to ast_kclear() instead, so it looks like ast_kclear(TDAI(TDA_UFS))? Then the caller can control the set of handlers that run, and the purpose of the ast_kclear() is clear to someone reading the code. It might be simpler too, e.g., ASTR_KCLEAR is not needed.

Hmm, now I see that GEOM uses it too, but I think ast_kclear(TDAI2(TDA_UFS, TDA_GEOM)) is still clearer.

sys/kern/subr_trap.c
310 ↗(On Diff #108493)

Why is "atomic" quoted? Seems like this read-and-clear is indeed atomic.

327 ↗(On Diff #108493)

Declare at the beginning of the scope?

kib marked 6 inline comments as done.Jul 25 2022, 4:00 PM
kib added inline comments.
sys/kern/kern_exit.c
257 ↗(On Diff #108493)

I decided to go with kclear() approach, because explicit bit set requires unrealted callers, in this case exit1(), to be aware of all existing ASTF_KCLEAR handlers.

I reformulated the comment.

kib marked an inline comment as done.

Handle Mark' comments.
Add td_ast_pending().
Fix locking in hwpmc_soft.
Comment updates.

sys/kern/kern_exit.c
257 ↗(On Diff #108493)

Maybe it makes sense to support both. For exit1(), it makes some sense to handle all kclear ASTs, but not so much in NFS.

sys/kern/kern_exit.c
257 ↗(On Diff #108493)

I do not see why NFS needs an exception. The geom AST is predicated on TDP_GEOM, both in HEAD and in the new code. Then practically, NFS server threads should not trigger TDP_GEOM at the first place, but if they did, I see no reason why they should not wait for geom events to finish propagation.

sys/arm64/arm64/exception.S
182 ↗(On Diff #108554)
sys/dev/hwpmc/hwpmc_soft.c
459 ↗(On Diff #108554)

Extra newline.

sys/kern/kern_exit.c
255 ↗(On Diff #108554)
sys/kern/kern_racct.c
1106–1107 ↗(On Diff #108554)

The racct_enable check is unneeded: the handler is registered if and only if racct_enable is set, and it's never cleared. The check should go after the assertion.

sys/kern/kern_sig.c
3408 ↗(On Diff #108554)

Or if (!td_ast_pending(td, TDA_SIG) && !td_ast_pending(td, TDA_SUSPEND)), I expect the compiler would fuse the checks.

sys/kern/subr_trap.c
226 ↗(On Diff #108554)

What's the purpose of the fence?

231 ↗(On Diff #108554)

I would note here that deregistration does not drain threads executing the handler.

kib marked 8 inline comments as done.Jul 27 2022, 9:17 AM
kib added inline comments.
sys/kern/subr_trap.c
226 ↗(On Diff #108554)

Practically, this is a designator for the place where something should be done if we are serious about making registration and deregistration fully async correct. Initially it was rel for ae_f assignment there and in ast_deregister(), and acq in ast_handler() load of f. But then I decided that the overhead is not worth it, esp. because this does not solve the issue completely.

Still I find it somewhat useful in the sense of place-keeper.

kib marked an inline comment as done.

Further Mark' notes.

Try to fix arm64 compilation issue reported by emaste: TD_AST too large to fit into one (signed) byte field for immediate offset of the ldr instruction.

hselasky added inline comments.
sys/amd64/include/vmm.h
368 ↗(On Diff #108589)

Fix referencing curthread twice?

struct thread *td = curthread;

sys/sys/proc.h
506 ↗(On Diff #108589)

Could hide the prefix inside the macro:

(TDA_##tda)

--HPS

kib marked 2 inline comments as done.Jul 27 2022, 10:03 AM
kib added inline comments.
sys/amd64/include/vmm.h
368 ↗(On Diff #108589)

curthread (actually the __curthread() inline function, the curthread macro is expanded to) is designated as pure2. This is an instruction for significantly smart compiler that the second call is not needed, it can reuse the value.

Anyway, I did as you suggested.

sys/sys/proc.h
506 ↗(On Diff #108589)

IMO it would be too obscure, I like to see the namespace of the identifier explicitly.

kib marked 2 inline comments as done.

Use td in vmm.h, as suggested by Hans.

markj added inline comments.
sys/amd64/amd64/vm_machdep.c
173

Why is it needed here? Do other platforms need the same re-initialization?

This revision is now accepted and ready to land.Jul 27 2022, 3:17 PM
kib marked an inline comment as done.Jul 27 2022, 3:30 PM
kib added inline comments.
sys/amd64/amd64/vm_machdep.c
173

See https://kib.kiev.ua/git/gitweb.cgi?p=deviant3.git;a=commit;h=f66589606ecfb0180ae7c0045feac6a6669bbd6b

I think yes, other platforms need this as well, but I worked around it in the very start of ast_handler():

	if (framep != NULL) {
		kmsan_mark(framep, sizeof(*framep), KMSAN_STATE_INITED);
		td->td_frame = framep;
	}

Now compiles on arm64 but seems to hang starting init.

 CPU  0: ARM Cortex-A57 r1p0 affinity:  0
                   Cache Type = <64 byte D-cacheline,64 byte I-cacheline,PIPT ICache,64 byte ERG,64 byte CWG>
 Instruction Set Attributes 0 = <CRC32,SHA2,SHA1,AES+PMULL>
 Instruction Set Attributes 1 = <>
         Processor Features 0 = <AdvSIMD,FP,EL1 32,EL0 32>
         Processor Features 1 = <>
      Memory Model Features 0 = <TGran4,TGran64,SNSMem,BigEnd,16bit ASID,16TB PA>
      Memory Model Features 1 = <8bit VMID>
      Memory Model Features 2 = <32bit CCIDX,48bit VA>
             Debug Features 0 = <DoubleLock,2 CTX BKPTs,4 Watchpoints,6 Breakpoints,PMUv3,Debugv8>
             Debug Features 1 = <>
         Auxiliary Features 0 = <>
         Auxiliary Features 1 = <>
AArch32 Instruction Set Attributes 5 = <CRC32,SHA2,SHA1,AES+VMULL,SEVL>
AArch32 Media and VFP Features 0 = <FPRound,FPSqrt,FPDivide,DP VFPv3+v4,SP VFPv3+v4,AdvSIMD>
AArch32 Media and VFP Features 1 = <SIMDFMAC,FPHP DP Conv,SIMDHP SP Conv,SIMDSP,SIMDInt,SIMDLS,FPDNaN,FPFtZ>
 L1 cache: 48KB (instruction), 32KB (data)
 L2 cache: 2048KB (unified)
WARNING: WITNESS option enabled, expect reduced performance.
regulator: shutting down unused regulators
Trying to mount root from msdosfs:/dev/vtbd0s1 []...
efirtc0: providing initial system time
start_init: trying /sbin/init
qemu-system-aarch64: terminating on signal 15 from pid 39643 (timeout)

[double-checking that main is still OK in a moment]

kib marked an inline comment as done.

Hopefully fix arm64 hang returning to usermode

This revision now requires review to proceed.Jul 27 2022, 3:52 PM

smoke test boots on arm64 with latest change

start_init: trying /sbin/init
Hello world.
vm.stats.vm.v_wire_count: 5685
Shutdown NOW!
shutdown: [pid 19]
2022-07-27T18:22:40.108491+00:00 - shutdown 19 - - power-down by root:

smoke test boots on amd64 with latest change

Sorry, on amd64 or arm64?

In D35888#816479, @kib wrote:

smoke test boots on amd64 with latest change

Sorry, on amd64 or arm64?

Oops, that smoke test quoted above was indeed arm64. "amd64" was an error (finger muscle memory autocompletion).

amd64 smoke test: https://cirrus-ci.com/task/6286548705476608
arm64 smoke test: https://cirrus-ci.com/task/4879173821923328

I like the idea in general.

sys/geom/geom_event.c
84 ↗(On Diff #108613)

Why did this change to take an unused td argument?

sys/kern/kern_ktrace.c
233 ↗(On Diff #108613)

Extra blank line.

As a followup commit I believe you could switch this one to ASTR_FREQUIRED or whatever the flag is to require TDA_KTRACE instead of doing it unconditionally. When it was inline it made sense to depend on the existing nested check in KTRUSERRET() rather than adding a new td_flags flag. However, since we now have a TDA_KTRACE flag anyway, we might as well check it and avoid the call. I haven't checked any other ASTR_UNCOND to see if they are in similar shape, but this one at least I think is safe to change. I agree with preserving existing behavior initially though and doing the change as a followup change.

sys/kern/subr_trap.c
105 ↗(On Diff #108613)

Can this use td_ast_pending()?

kib marked 3 inline comments as done.Jul 28 2022, 6:35 AM
kib added inline comments.
sys/geom/geom_event.c
84 ↗(On Diff #108613)

This is bug, I forgot to change the curthread instance below.

sys/kern/kern_ktrace.c
233 ↗(On Diff #108613)

Done.

Other ASTR_UNCOND are signals and hwpmc_soft. For signals, I am sure that it is required to do ASTR_UNCOND due to possible rescheduling of the process-globally delivered signals among the threads. We check for p->pendingcnt > 0 and !SIGISEMPTY(p->p_siglist) as one of the conditions.

For hwpmc_soft perhaps it is possible to remove ASTR_UNCOND, but this is definitely outside scope of testing for this change.

kib marked 2 inline comments as done.

Fix curthread/td use in g_waitidle().
Remove TDAI2().
Require scheduled AST for ktrace ast handler call.

sys/compat/linuxkpi/common/include/linux/sched.h
133 ↗(On Diff #108627)

Should this be (curthread->td_owepreempt || td_ast_pending(curthread, TDA_SCHED))?

Hopefully the final version of the patch, passing Peter' testing.

Handle td != curthread for ast_kclean(), most importantly do not rely on the td_lock.
Move some assertions from ast_handler() to ast(). For instance, we could own Giant or sleepable mutexes while ast_kclear() is called from the process destructor.

markj added inline comments.
sys/kern/subr_trap.c
330 ↗(On Diff #108803)
This revision is now accepted and ready to land.Aug 2 2022, 1:12 PM