mjg (Mateusz Guzik)
nice guy

User Details

User Since
Jun 4 2014, 10:38 AM (219 w, 6 d)

Recent Activity

Fri, Aug 17

mjg added a comment to D16744: Document seq(9).

This page looks fishy, I'll have to think about it.

Fri, Aug 17, 2:59 PM
mjg accepted D16756: Add INVARIANTS-only fences to synchronize lockless refcount updates..

I think the current names, while justifiable, only add confusion (why is refcount *acquire* paired with fence *rel*?). How about 'VNODE_REFCOUNT_FENCE_BEFORE' and a matching 'after' variant?

Fri, Aug 17, 2:53 PM

Jul 13 2018

mjg committed rS336267: lockmgr: tidy up slock/sunlock similar to other locks.
lockmgr: tidy up slock/sunlock similar to other locks
Jul 13 2018, 10:40 PM

Jul 12 2018

mjg committed rS336232: fd: stop passing M_ZERO to uma_zalloc.
fd: stop passing M_ZERO to uma_zalloc
Jul 12 2018, 10:48 PM
mjg committed rS336231: uma: whack main zone counter update in the slow path, freeing side.
uma: whack main zone counter update in the slow path, freeing side
Jul 12 2018, 10:36 PM
mjg committed rS336230: sx: remove the spurious macro value difference vs rwlocks.
sx: remove the spurious macro value difference vs rwlocks
Jul 12 2018, 10:34 PM

Jul 9 2018

mjg added a comment to D16191: Fix vm_waitpfault on numa.

Yes, GB. Is there a problem reproducing the bug?

Jul 9 2018, 9:42 PM
mjg added a comment to D16191: Fix vm_waitpfault on numa.

This does not fix the problem for me -- now things start wedging on 'vmwait'.

Jul 9 2018, 11:26 AM

Jul 3 2018

mjg added inline comments to D16111: Add setproctitle_fast() for frequent callers..
Jul 3 2018, 3:16 PM
mjg abandoned D15531: Inline critical_enter/exit for amd64.

A different variant went in with r335879.

Jul 3 2018, 3:14 PM

Jun 24 2018

mjg committed rS335600: vm: stop passing M_ZERO when allocating radix nodes.
vm: stop passing M_ZERO when allocating radix nodes
Jun 24 2018, 1:08 PM
mjg closed D15989: vm: stop passing M_ZERO when allocating radix nodes.
Jun 24 2018, 1:08 PM
mjg added a comment to D15989: vm: stop passing M_ZERO when allocating radix nodes.

There is one already and it can be seen in the diff: vm_radix_node_zone_dtor

Jun 24 2018, 12:41 PM
mjg created D15989: vm: stop passing M_ZERO when allocating radix nodes.
Jun 24 2018, 7:16 AM

Jun 15 2018

mjg added a comment to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
In D15799#334362, @jeff wrote:
In D15799#334063, @mjg wrote:
In D15799#334059, @jeff wrote:

I understand but you can't guarantee the thread is the only thing which is accessing these file descriptors. Off the top of my head, the unix domain socket gc thread does fdrops() on a task queue. It _may_ be possible to start to work around these things but it becomes incredibly hard to reason about. And you'd have to audit everything else in the kernel that uses a file * to understand whether it imposes restrictions on things you can do single threaded.

None of this is of any concern.

If the process is single threaded and the file descriptor table is not shared, it is the only entity which can modify its own fd table.

So in particular if it has a file installed, it holds a reference to keep it alive. Also nothing but curthread can drop it.

Let's say the same file object is being inspected by the unix gc thread - it is of no significance for this process. Let's say it fdrops. Does not matter, the process at hand still has its own ref.

The optimisation of not refing/unrefing files in single-threaded processes is implemented in Linux for all syscalls translating fd -> file.

The only caveat here is that you have to remember whether you grabbed the reference or not, since after you got one the other thread/whatever can disappear and you may transition to being single-threaded.

So the idiom is of this sort:
fp = fd2fp(fd, &need_fdrop);
....
fdrop_cond(fp, need_fdrop);

I would say there is pretty much no obfuscation in the caller, and it is possibly beneficial to apply globally, not only here.
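A minimal user-space sketch of the idiom quoted above, under stated assumptions. This is not FreeBSD code: `struct file`, `struct proc`, the `nthreads` and `fdtable_holders` fields, and the bodies of `fd2fp` and `fdrop_cond` are hypothetical stand-ins, and the plain integer increment stands in for the atomic operation a kernel would use. The point it illustrates is that the decision to take a reference is recorded at lookup time, so the drop stays correct even if the process becomes single-threaded in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NFILES 8

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct file {
	int refcount;
};

struct proc {
	int nthreads;			/* threads in this process */
	int fdtable_holders;		/* processes sharing the fd table */
	struct file *files[TOY_NFILES];	/* toy fd table */
};

/*
 * Translate fd -> file. A reference is taken only when another thread or
 * another holder of the table could close the fd concurrently; the choice
 * is reported through need_fdrop so the caller can undo exactly what was
 * done here.
 */
static struct file *
fd2fp(struct proc *p, int fd, bool *need_fdrop)
{
	struct file *fp;

	*need_fdrop = false;
	if (fd < 0 || fd >= TOY_NFILES || (fp = p->files[fd]) == NULL)
		return (NULL);
	if (p->nthreads > 1 || p->fdtable_holders > 1) {
		fp->refcount++;		/* an atomic op in a real kernel */
		*need_fdrop = true;
	}
	return (fp);
}

/* Drop the reference only if fd2fp actually took one. */
static void
fdrop_cond(struct file *fp, bool need_fdrop)
{
	if (!need_fdrop)
		return;
	assert(fp->refcount > 0);
	fp->refcount--;
}
```

Because the flag is captured once and carried by the caller, a later change in the thread count cannot cause a missed or spurious drop.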

Matt found that most of the poll cost scales per-fd. So the atomic cost remains 5% at higher fd counts. I have said elsewhere, but 5% is not worth polluting the code.

Today the code tolerates threads other than the owning process modifying the fd table. There are interfaces in the kernel which can do so. It may be that this approach is safe with everything in the tree. I don't know. Is it safe with all of the kernel modules in ports? What about third party code?

The change eliminates a scaling bottleneck and makes select and poll more similar as well as reducing the number of atomics in select. There are reasonably written applications which use select on a small set of fds in multiple threads. I think this is a positive step.

Jun 15 2018, 9:59 PM

Jun 14 2018

mjg added inline comments to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
Jun 14 2018, 11:30 PM
mjg added inline comments to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
Jun 14 2018, 7:01 PM
mjg added a comment to D15809: proc0_post: Fix some locking issues.

What is the purpose of this code to begin with? It looks like it should just be removed. If it is needed (what for?), it probably has to run after all initial forking is finished.

Jun 14 2018, 6:12 PM
mjg added a comment to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
In D15799#334059, @jeff wrote:

I understand but you can't guarantee the thread is the only thing which is accessing these file descriptors. Off the top of my head, the unix domain socket gc thread does fdrops() on a task queue. It _may_ be possible to start to work around these things but it becomes incredibly hard to reason about. And you'd have to audit everything else in the kernel that uses a file * to understand whether it imposes restrictions on things you can do single threaded.

Jun 14 2018, 7:00 AM
mjg added inline comments to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
Jun 14 2018, 6:51 AM
mjg added a comment to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
In D15799#334056, @jeff wrote:
In D15799#334039, @mjg wrote:

This avoidably pessimizes the common case of single-threaded execution by adding an atomic op pair for each fd. The code can check whether the process is single-threaded and the fd table is not shared, in which case there is no need to grab a ref on files. This will end up being a minor pessimization for the multithreaded (and presumably rare) case while being a win for the single-threaded one.

On my machine it takes less than 40 clock cycles, or 11 ns, to do an atomic_add, atomic_fetchadd pair on a line that is in cache. I would really prefer that we did not obfuscate the code with fragile exceptions for a tiny bit of performance. There are far more profitable ways to improve our single-threaded perf in poll.

Jun 14 2018, 6:45 AM
mjg added inline comments to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
Jun 14 2018, 6:01 AM
mjg added inline comments to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
Jun 14 2018, 5:55 AM
mjg added a comment to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.

This avoidably pessimizes the common case of single-threaded execution by adding an atomic op pair for each fd. The code can check whether the process is single-threaded and the fd table is not shared, in which case there is no need to grab a ref on files. This will end up being a minor pessimization for the multithreaded (and presumably rare) case while being a win for the single-threaded one.

Jun 14 2018, 5:52 AM

Jun 10 2018

mjg created D15736: Implement fast path for malloc and free.
Jun 10 2018, 1:36 PM

Jun 8 2018

mjg committed rS334863: counter: add a bit missed in r334858.
counter: add a bit missed in r334858
Jun 8 2018, 10:06 PM
mjg committed rS334858: uma: implement provisional api for per-cpu zones.
uma: implement provisional api for per-cpu zones
Jun 8 2018, 9:40 PM
mjg committed rS334830: uma: fix up r334824.
uma: fix up r334824
Jun 8 2018, 5:40 AM
mjg committed rS334826: amd64: remove now unused bzero, bcmp and bcopy. move pagecopy higher up..
amd64: remove now unused bzero, bcmp and bcopy. move pagecopy higher up.
Jun 8 2018, 4:18 AM
mjg committed rS334824: uma: remove M_ZERO support for pcpu zones.
uma: remove M_ZERO support for pcpu zones
Jun 8 2018, 3:16 AM
mjg committed rS334820: amd64: fix a retarded bug in memset.
amd64: fix a retarded bug in memset
Jun 8 2018, 12:47 AM

Jun 6 2018

mjg committed rS334702: malloc: elaborate on r334545 due to frequent questions.
malloc: elaborate on r334545 due to frequent questions
Jun 6 2018, 5:08 AM

Jun 2 2018

mjg committed rS334546: Remove an unused argument to turnstile_unpend..
Remove an unused argument to turnstile_unpend.
Jun 2 2018, 10:38 PM
mjg committed rS334545: malloc: try to use builtins for zeroing at the callsite.
malloc: try to use builtins for zeroing at the callsite
Jun 2 2018, 10:20 PM
mjg committed rS334537: amd64: add a mild depessimization to rep mov/stos users.
amd64: add a mild depessimization to rep mov/stos users
Jun 2 2018, 8:15 PM
mjg committed rS334534: Use __builtin for various mem* and b* (e.g. bzero) routines..
Use __builtin for various mem* and b* (e.g. bzero) routines.
Jun 2 2018, 6:03 PM
mjg committed rS334533: libkern: tidy up memset.
libkern: tidy up memset
Jun 2 2018, 5:57 PM

May 31 2018

mjg committed rS334437: MFC r329276,r329451,r330294,r330414,r330415,r330418,r331109,r332394,r332398,.
MFC r329276,r329451,r330294,r330414,r330415,r330418,r331109,r332394,r332398,
May 31 2018, 3:58 PM
mjg committed rS334419: amd64: switch pagecopy from non-temporal stores to rep movsq.
amd64: switch pagecopy from non-temporal stores to rep movsq
May 31 2018, 9:56 AM

May 30 2018

mjg added inline comments to D15526: reduce overhead of entropy collection.
May 30 2018, 9:16 PM

May 29 2018

mjg added a comment to D15607: Improve error handling for KERN_PROC_{CWD,FILEDESC}..

I have no opinion one way or the other.

May 29 2018, 7:34 PM

May 23 2018

mjg added a comment to D15531: Inline critical_enter/exit for amd64.

I tried that. Some mutex users can't afford to pull in proc.h either. The motivation here is to provide mutex unlock without atomics, which requires partial access to thread layout.

May 23 2018, 12:42 PM
mjg added a comment to D15531: Inline critical_enter/exit for amd64.

I noted I have a use for this in mutex code, but mutexes are included by proc.h which creates a lot of dependency fun. An attempt to create _thread.h failed as there is just too much work.

May 23 2018, 12:32 PM
mjg committed rS334087: Remove incorrect owepreempt assertion added in r334062.
Remove incorrect owepreempt assertion added in r334062
May 23 2018, 10:13 AM
mjg created D15531: Inline critical_enter/exit for amd64.
May 23 2018, 5:16 AM

May 22 2018

mjg committed rS334062: Move preemption handling out of critical_exit..
Move preemption handling out of critical_exit.
May 22 2018, 7:25 PM
mjg committed rS334048: sx: fixup a braino in r334024.
sx: fixup a braino in r334024
May 22 2018, 3:13 PM
mjg committed rS334026: Reduce sdt-related branch-fest in mi_switch..
Reduce sdt-related branch-fest in mi_switch.
May 22 2018, 8:27 AM
mjg committed rS334024: sx: port over writer starvation prevention measures from rwlock.
sx: port over writer starvation prevention measures from rwlock
May 22 2018, 7:20 AM
mjg committed rS334023: rw: decrease writer starvation.
rw: decrease writer starvation
May 22 2018, 7:16 AM

May 21 2018

mjg committed rS333966: amd64: annotate pti with __read_frequently.
amd64: annotate pti with __read_frequently
May 21 2018, 5:20 AM

May 20 2018

mjg committed rS333916: vfs: simplify vop_stdlock/unlock.
vfs: simplify vop_stdlock/unlock
May 20 2018, 4:45 AM

May 18 2018

mjg committed rS333816: lockmgr: avoid atomic on unlock in the slow path.
lockmgr: avoid atomic on unlock in the slow path
May 18 2018, 10:58 PM
mjg added a comment to D15483: More bcmp "optimization".

First a minor note is that I took the liberty of s/bzero/bcmp, which I presume was intended.

May 18 2018, 7:52 PM
mjg retitled D15483: More bcmp "optimization" from More bzero "optimization" to More bcmp "optimization".
May 18 2018, 7:31 PM
mjg committed rS333784: amd64: tweak the read_frequently section.
amd64: tweak the read_frequently section
May 18 2018, 7:31 AM

May 11 2018

mjg committed rS333486: amd64: align the .data.exclusive_cache_line section to 128.
amd64: align the .data.exclusive_cache_line section to 128
May 11 2018, 8:57 AM
mjg committed rS333484: uma: increase alignment to 128 bytes on amd64.
uma: increase alignment to 128 bytes on amd64
May 11 2018, 7:05 AM
mjg closed D15346: Reduce false sharing in UMA on amd64 by increasing padding to 128 bytes.
May 11 2018, 7:05 AM
mjg committed rS333483: rmlock: partially depessimize lock/unlock fastpath.
rmlock: partially depessimize lock/unlock fastpath
May 11 2018, 7:00 AM

May 9 2018

mjg committed rS333413: amd64: depessimize bcmp for small buffers.
amd64: depessimize bcmp for small buffers
May 9 2018, 3:16 PM
mjg accepted D15367: Avoid bzero() before ireloc..

This makes the kernel boot fine with mem* routines flipped to erms.

May 9 2018, 2:20 PM

May 8 2018

mjg set the repository for D15346: Reduce false sharing in UMA on amd64 by increasing padding to 128 bytes to rS FreeBSD src repository.
May 8 2018, 12:52 AM
mjg created D15346: Reduce false sharing in UMA on amd64 by increasing padding to 128 bytes.
May 8 2018, 12:52 AM

May 7 2018

mjg committed rS333344: Inlined sched_userret..
Inlined sched_userret.
May 7 2018, 11:36 PM
mjg committed rS333342: Change trap_enotcap to bool and annotate with __read_frequently.
Change trap_enotcap to bool and annotate with __read_frequently
May 7 2018, 11:10 PM
mjg committed rS333339: Avoid calls to syscall_thread_enter/exit for statically defined syscalls.
Avoid calls to syscall_thread_enter/exit for statically defined syscalls
May 7 2018, 10:29 PM
mjg committed rS333337: amd64: stop asserting params != NULL in the syscall path.
amd64: stop asserting params != NULL in the syscall path
May 7 2018, 9:32 PM
mjg committed rS333332: amd64: fix up memset added in r333324.
amd64: fix up memset added in r333324
May 7 2018, 8:54 PM
mjg committed rS333328: amd64: tweak the memmove comment regarding authorship.
amd64: tweak the memmove comment regarding authorship
May 7 2018, 5:37 PM
mjg committed rS333324: amd64: replace libkern's memset and memmove with assembly variants.
amd64: replace libkern's memset and memmove with assembly variants
May 7 2018, 3:07 PM

May 4 2018

mjg committed rS333267: tc: bcopy -> memcpy.
tc: bcopy -> memcpy
May 4 2018, 10:48 PM
mjg committed rS333266: amd64: syscall path bcopy -> memcpy.
amd64: syscall path bcopy -> memcpy
May 4 2018, 10:41 PM
mjg committed rS333265: Allow the compiler to use __builtin_memcpy.
Allow the compiler to use __builtin_memcpy
May 4 2018, 10:34 PM
mjg committed rS333241: amd64: get rid of the pessimized bcopy in syscall arg copy.
amd64: get rid of the pessimized bcopy in syscall arg copy
May 4 2018, 4:05 AM
mjg committed rS333240: Allow __builtin_memmove instead of bcopy for small buffers of known size.
Allow __builtin_memmove instead of bcopy for small buffers of known size
May 4 2018, 4:01 AM

Apr 30 2018

mjg added a comment to D15233: make ucred thread private.

I meant per-uid accounting specifically. I have no issue with counting swap usage in general, which also happens to not induce the problem.

Apr 30 2018, 10:10 PM
mjg added a comment to D15233: make ucred thread private.
In D15233#321179, @jeff wrote:
In D15233#321171, @mjg wrote:

The way to go is with per-cpu based reference counting which reverts back to single-word atomics as needed. This paired with a separate count of "legitimate" refs would take care of the problem altogether without duplicating everything.

I agree that would be better although we need to look at how this would be implemented and see if we have a reasonable solution. I assume it would likely need to be some variant of counter(9). Presumably when the per-cpu var transitioned from 0 to 1 you would grab an atomic on the global? And then release on 1->0?
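The scheme asked about in this quote can be sketched roughly as follows. This is only an illustration of the 0 -> 1 and 1 -> 0 transitions described in the question, not counter(9) and not code from the review: `struct pcpu_ref`, `NCPU`, and all function names are hypothetical, the CPU index is passed explicitly instead of being read from the running CPU, and the sketch assumes each release happens on the same CPU as its matching acquire with the caller pinned for the duration.

```c
#include <assert.h>
#include <stdatomic.h>

#define NCPU 4	/* hypothetical CPU count for the sketch */

struct pcpu_ref {
	atomic_int global;	/* number of CPUs with a nonzero local count */
	int local[NCPU];	/* per-CPU counts, touched only by their CPU */
};

static void
pcpu_ref_init(struct pcpu_ref *r)
{
	atomic_init(&r->global, 0);
	for (int i = 0; i < NCPU; i++)
		r->local[i] = 0;
}

/* Only the 0 -> 1 transition of the local count touches the global. */
static void
pcpu_ref_acquire(struct pcpu_ref *r, int cpu)
{
	if (r->local[cpu]++ == 0)
		atomic_fetch_add(&r->global, 1);
}

/* Only the 1 -> 0 transition of the local count touches the global. */
static void
pcpu_ref_release(struct pcpu_ref *r, int cpu)
{
	assert(r->local[cpu] > 0);
	if (--r->local[cpu] == 0)
		atomic_fetch_sub(&r->global, 1);
}
```

Repeated acquires on a CPU that already holds a reference cost no atomic operation; the same-CPU assumption is the part a real design would have to relax.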

Apr 30 2018, 9:39 PM
mjg added a comment to D15233: make ucred thread private.

I completely disagree with this.

Apr 30 2018, 9:18 AM

Apr 27 2018

mjg added a comment to D15122: Eliminate vm object relocks in vm fault..
In D15122#320743, @kib wrote:

mjg@, does this still provide an improvement in your benchmarks? Can you provide the numbers?

Apr 27 2018, 4:27 PM
mjg committed rS333066: Unbreak world build after r333064.
Unbreak world build after r333064
Apr 27 2018, 3:50 PM
mjg committed rS333064: systrace: track it like sdt probes.
systrace: track it like sdt probes
Apr 27 2018, 3:16 PM
mjg committed rS333052: uma: whack main zone counter update in the slow path.
uma: whack main zone counter update in the slow path
Apr 27 2018, 5:37 AM
mjg committed rS333051: vm: move vm_cnt to __read_mostly now that it is not written to.
vm: move vm_cnt to __read_mostly now that it is not written to
Apr 27 2018, 5:36 AM

Apr 24 2018

mjg committed rS332911: lockf: change the owner hash from pid to vnode-based.
lockf: change the owner hash from pid to vnode-based
Apr 24 2018, 6:10 AM
mjg committed rS332901: dtrace: depessimize dtmalloc when dtrace is active.
dtrace: depessimize dtmalloc when dtrace is active
Apr 24 2018, 1:06 AM
mjg committed rS332900: lockstat: track lockstat just like sdt probes.
lockstat: track lockstat just like sdt probes
Apr 24 2018, 1:04 AM

Apr 23 2018

mjg committed rS332896: malloc: stop reading the subzone if MALLOC_DEBUG_MAXZONES == 1 (the default).
malloc: stop reading the subzone if MALLOC_DEBUG_MAXZONES == 1 (the default)
Apr 23 2018, 10:29 PM
mjg committed rS332882: lockf: add per-chain locks to the owner hash.
lockf: add per-chain locks to the owner hash
Apr 23 2018, 8:23 AM
mjg committed rS332881: lockf: skip locking the graph if not necessary (common case).
lockf: skip locking the graph if not necessary (common case)
Apr 23 2018, 7:54 AM
mjg committed rS332880: lockf: perform wakeup onlly when there is anybody waiting.
lockf: perform wakeup onlly when there is anybody waiting
Apr 23 2018, 7:53 AM
mjg committed rS332879: lockf: skip the hard work in lf_purgelocks if possible.
lockf: skip the hard work in lf_purgelocks if possible
Apr 23 2018, 7:52 AM
mjg committed rS332878: lockf: free state only when recycling the vnode.
lockf: free state only when recycling the vnode
Apr 23 2018, 7:51 AM

Apr 22 2018

mjg committed rS332870: lockf: slightly depessimize.
lockf: slightly depessimize
Apr 22 2018, 9:30 AM

Apr 18 2018

mjg added a comment to D15122: Eliminate vm object relocks in vm fault..

I made the change flippable with a sysctl. It nicely speeds up the initial page faults when postgres warms up:

Apr 18 2018, 6:03 PM

Apr 17 2018

mjg accepted D15106: Add PROC_PDEATHSIG_SET to procctl interface..
Apr 17 2018, 2:43 PM

Apr 16 2018

mjg accepted D15106: Add PROC_PDEATHSIG_SET to procctl interface..

I only have cosmetic remarks. Definitely looks good enough to be shipped.

Apr 16 2018, 10:34 PM

Apr 13 2018

mjg added inline comments to D15047: Properly do a deep copy of the ioctls capability array for fget_cap()..
Apr 13 2018, 4:21 AM

Apr 12 2018

mjg committed rS332422: iflib: fix up a mismerge in r332419.
iflib: fix up a mismerge in r332419
Apr 12 2018, 4:11 AM
mjg requested changes to D15047: Properly do a deep copy of the ioctls capability array for fget_cap()..
Apr 12 2018, 1:55 AM