Page MenuHomeFreeBSD

Stop couting threads executing dynamic syscalls
Needs ReviewPublic

Authored by mjg on Sun, Feb 9, 12:47 PM.

Details

Reviewers
kib
jeff
manu
jhb
Summary

The current approach results in dirtying the syscall table each time such a syscall is executed which can significantly distort results when using pmcstat. Namely if something is syscall heavy and happens to have a syscall residing in the dirtied area, it ends up ping ponging cachelines with the cpu running pmcstat.

The entire thing can be done no worse by observing all threads not executing the dynamic syscall of interest.

Test Plan

a dynamic syscall module which does DELAY(10000); vs a concumer which calls it in a loop vs kldload && kldunload in a loop

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

mjg created this revision.Sun, Feb 9, 12:47 PM
mjg added a reviewer: jhb.Sun, Feb 9, 12:49 PM
kib added a comment.Sun, Feb 9, 3:44 PM

Other than the notes I think it is fine.

sys/kern/kern_syscalls.c
78–89

Should be a blank line after this.

81

This should be acquire load, see below.

sys/sys/sysent.h
300

I do think that syscall_read_barrier() should be replaced by the store to sa.callp with acquire semantic, to sync/with the release in the syscall_sysent_replace, to see previous writes from _replace.

Next, _exit should be a release store, so that accesses to the dynamic syscall' sysent cannot be reordered after the exit is observed by the waiter.

mjg updated this revision to Diff 68013.Sun, Feb 9, 4:46 PM
  • test for td_sa.code, not callp
mjg added inline comments.Sun, Feb 9, 4:55 PM
sys/kern/kern_syscalls.c
155

Note this is not really correct in the sense that nothing ever guarantees the thread using the syscall fetches a consistent copy. This is bug compatible with the current code and I don't think it's worth fixing. Arguably the only thing which should be legimitately changing here is the func pointer.

sys/sys/sysent.h
300

This callp thing was a brainfart from the previous iteration of the patch (where I used D23582 to wait for everyone to leave cpu_fetch_syscall_args).

I think testing for td_sa.code + the seq_cst fence injected with the IPI are sufficient. syscall_read_barrier + cpu_fence_seq_cst after setting td_sa.code means if we observe it, the target thread may be on the way to use the old (or new) sysent state, both of which are fine. We only care to observe that it switches td_sa.code /at some point/ to something else. Even if it switches back to the waited for syscall number we are fine since by that time the to be unloaded syscall is guaranteed to not be seen by this particular thread. If we observe anything but passed code, the thread is either executing something else or about to resolve this very syscall, but thanks to the fence it will read the new content.

Note that cpu_fence_seq_cst injects a fence for everyone, including the current cpu.

mjg updated this revision to Diff 68318.Fri, Feb 14, 2:07 PM
jeff added a comment.Sun, Feb 16, 10:18 AM

This would be a good use for tick based (non-atomic) smr. It would have a cheaper simpler write path. I don't object to this patch but it might get replaced again.

mjg added a comment.EditedSun, Feb 16, 12:19 PM
In D23587#520606, @jeff wrote:

This would be a good use for tick based (non-atomic) smr. It would have a cheaper simpler write path. I don't object to this patch but it might get replaced again.

In the current code the commonly executed codepath is expanded with:
#define syscall_read_barrier() __compiler_membar()

it does not get any cheaper than that for that end. Anything else *adds* work.

On the other hand we really don't expect to register/unregister syscalls a whole lot and let alone catch someone executing them while we do so. If someone would be looking at improving the area, they can consider working on lending the priority to blocked syscall users.

In fact in my opinion the facility should be removed.

If I understand your proposal correctly, it wont make unloading cheaper. Avoiding the thread list scan needs a semi-centralized way to know there are still users and the best I can see which could be done here is a per-cpu counter for dynamic syscall. This would likely save the scan for the almost never executed case while adding work to the code executed constantly.