
kib (Konstantin Belousov)

User Since
May 16 2014, 7:35 PM (610 w, 19 h)

Recent Activity

Today

kib updated the diff for D54831: Make ULE and 4BSD coexists.

Deduplicate sched stats, sdt probes, and kdtrace hooks helper vars.
At least amd64 GENERIC and LINT build.

Sat, Jan 24, 6:25 AM
kib updated the diff for D54831: Make ULE and 4BSD coexists.

One more #ifdef KTR for 4bsd.

Sat, Jan 24, 2:10 AM
kib added a comment to D54831: Make ULE and 4BSD coexists.
Sat, Jan 24, 2:07 AM
kib updated the diff for D54831: Make ULE and 4BSD coexists.

Take the DEFINE_SHIM() proposal.
Fix inlined strcmp().
Also hopefully fix compilation issues, but I only started tinderbox.

Sat, Jan 24, 2:01 AM
kib committed rG95eec982c37a: vm/swap_pager.c: silence compiler warning (authored by kib).
vm/swap_pager.c: silence compiler warning
Sat, Jan 24, 12:33 AM
kib committed rG2a27aefcefe0: swap_pager_getpages(): some pages from ma[] might be bogus (authored by kib).
swap_pager_getpages(): some pages from ma[] might be bogus
Sat, Jan 24, 12:33 AM
kib committed rG102400e5d07a: swap_pager_getpages(): assert that bp->b_pages[] is accessed in bounds (authored by kib).
swap_pager_getpages(): assert that bp->b_pages[] is accessed in bounds
Sat, Jan 24, 12:33 AM
kib committed rGb3e6c8eb7eba: tuning.7: wording fixes (authored by kib).
tuning.7: wording fixes
Sat, Jan 24, 12:33 AM
kib committed rGa6cc48e22ba7: sendfile: remove calculation of unused bsize (authored by kib).
sendfile: remove calculation of unused bsize
Sat, Jan 24, 12:33 AM
kib committed rG9dbc47d79efe: vm_map_entry_delete(): fix the calculation of swap release (authored by kib).
vm_map_entry_delete(): fix the calculation of swap release
Sat, Jan 24, 12:33 AM
kib committed rG7768be681bed: tuning.7: use the correct word for collapsing (authored by Oliver Pinter <oliver.pntr+freebsd@gmail.com>).
tuning.7: use the correct word for collapsing
Sat, Jan 24, 12:33 AM
kib committed rG10af3b3b71a0: tuning.7: add more explanation about swap (over-)accounting (authored by kib).
tuning.7: add more explanation about swap (over-)accounting
Sat, Jan 24, 12:33 AM
kib committed rGa4123ac9f596: vm_object: remove the charge member (authored by kib).
vm_object: remove the charge member
Sat, Jan 24, 12:33 AM
kib committed rG0ab96c91676d: rfork(2): fix swap accounting in vmspace_unshare() (authored by kib).
rfork(2): fix swap accounting in vmspace_unshare()
Sat, Jan 24, 12:33 AM
kib committed rG1f6db7d73474: swap_release_by_cred*(): give some additional info on panics due to underflow (authored by kib).
swap_release_by_cred*(): give some additional info on panics due to underflow
Sat, Jan 24, 12:33 AM
kib committed rG249939298daf: vm_object_coalesce(): return swap reservation back if overcharged (authored by kib).
vm_object_coalesce(): return swap reservation back if overcharged
Sat, Jan 24, 12:33 AM
kib committed rG1e1727a7d7bd: vm_object_coalesce(): do not account holes twice (authored by kib).
vm_object_coalesce(): do not account holes twice
Sat, Jan 24, 12:33 AM
kib committed rG99fab30f7272: vm_object_coalesce(): simplify common expression (authored by kib).
vm_object_coalesce(): simplify common expression
Sat, Jan 24, 12:32 AM
kib committed rG84cab089ed77: vm_object_coalesce(): remove commented out code (authored by kib).
vm_object_coalesce(): remove commented out code
Sat, Jan 24, 12:32 AM

Yesterday

kib added a comment to D54850: fusefs: fix a vnode locking bug during VOP_READ.
In D54850#1253849, @kib wrote:

Is this the only place where the update of the cache under the shared vnode lock occurs?
Or even better, can you point me to the single entry point of the cache update code?

I'm auditing now to look for other cases. I think I've found one in fuse_internal_invalidate_inode (which is used by very few real-world FUSE file systems, but has coverage in the test suite).

That said, there is a much better strategy to solve the issue than trying to upgrade the lock.
Add one more mutex (or sx, I am not sure about ops that are done for cache update), and lock it around the cache update code if the vnode lock is only share-locked.

I did not code it myself because I wanted an easy experiment to see whether exclusive locking would solve the problem. Also, I do not know where the cache update code is.

I could do that. In terms of performance, how does mutex compare to the vnode lock?

Fri, Jan 23, 9:57 PM
kib updated the diff for D54831: Make ULE and 4BSD coexists.

Add function-pointer-based workaround for riscv and arm.
Hopefully at least riscv would grow ifuncs in some time frame.
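For readers unfamiliar with the trade-off, a function-pointer dispatch of the general kind mentioned here might look like the sketch below. It is not taken from the diff: the `demo_` names, the operations table, and the returned values are all hypothetical placeholders. With ifuncs the call target is bound once when the symbol is resolved; without them every call goes through a pointer that is chosen at boot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-scheduler operations table. */
struct demo_sched_ops {
    const char *name;
    int (*rr_interval)(void);
};

/* Placeholder implementations; the returned values are arbitrary. */
static int demo_ule_rr_interval(void) { return (10); }
static int demo_4bsd_rr_interval(void) { return (100); }

static const struct demo_sched_ops demo_ule_ops = {
    "ULE", demo_ule_rr_interval
};
static const struct demo_sched_ops demo_4bsd_ops = {
    "4BSD", demo_4bsd_rr_interval
};

/* Set once, early, before any wrapper below is called. */
static const struct demo_sched_ops *demo_active_sched;

static void
demo_sched_select(const struct demo_sched_ops *ops)
{
    assert(ops != NULL);
    demo_active_sched = ops;
}

/* Wrapper that callers use; it costs one indirect call per invocation. */
static int
demo_rr_interval(void)
{
    return (demo_active_sched->rr_interval());
}
```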

Fri, Jan 23, 9:51 PM
kib added a comment to D54850: fusefs: fix a vnode locking bug during VOP_READ.

Is this the only place where the update of the cache under the shared vnode lock occurs?
Or even better, can you point me to the single entry point of the cache update code?

Fri, Jan 23, 9:46 PM
kib added a comment to D54831: Make ULE and 4BSD coexists.

armv7 and riscv64 are tier 2, breaking them is not permitted. I have an idea for how to make them work though, without being too invasive.

Fri, Jan 23, 8:53 PM
kib accepted D54840: sys: Use __is_aligned and __align_down for some kstack alignment operations.
Fri, Jan 23, 6:50 PM
kib updated the diff for D54831: Make ULE and 4BSD coexists.

Rename sched_instance_name variable to sched_name.

Fri, Jan 23, 6:31 PM
kib added a comment to D54831: Make ULE and 4BSD coexists.
In D54831#1253606, @kib wrote:

I am not sure what you mean by kern.sched.instance_name, I cannot find such a thing.

Indeed, I meant kern.sched.sched_instance_name. Could you rename it to simply kern.sched.name?

Fri, Jan 23, 6:30 PM
kib updated the diff for D54831: Make ULE and 4BSD coexists.

Add sched_instance_select() call to all arches.
Rename the tunable to kern.sched.name.
Automatically fall back to some scheduler if the named one is not found, and there is one.
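The fallback rule in the last item can be illustrated with a small userland sketch. It assumes a table of compiled-in schedulers and a requested name taken from the tunable; `struct demo_sched_instance` and `demo_sched_pick()` are hypothetical names, and the real code may not be able to call library routines such as strcmp() this early in boot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor of a compiled-in scheduler. */
struct demo_sched_instance {
    const char *name;
};

/*
 * Return the scheduler whose name matches 'wanted'.  If there is no match,
 * or no name was requested, fall back to the first available one.  Return
 * NULL only when the table is empty.
 */
static const struct demo_sched_instance *
demo_sched_pick(const struct demo_sched_instance *const *tbl, size_t n,
    const char *wanted)
{
    size_t i;

    assert(tbl != NULL || n == 0);
    if (wanted != NULL) {
        for (i = 0; i < n; i++) {
            if (strcmp(tbl[i]->name, wanted) == 0)
                return (tbl[i]);
        }
    }
    return (n > 0 ? tbl[0] : NULL);
}
```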

Fri, Jan 23, 5:28 PM
kib added a comment to D54831: Make ULE and 4BSD coexists.
In D54831#1253559, @kib wrote:

Make kern.sched sysctls work when 4BSD is selected:

I do not see any new change related to that (compared to the previous diff)?

Fri, Jan 23, 5:26 PM
kib added inline comments to D54831: Make ULE and 4BSD coexists.
Fri, Jan 23, 4:37 PM
kib updated the diff for D54831: Make ULE and 4BSD coexists.

Handle all arches for cpu_switch.S.

Fri, Jan 23, 4:17 PM
kib updated the diff for D54831: Make ULE and 4BSD coexists.

Make kern.sched sysctls work when 4BSD is selected:
sysctl kern.sched.ule.topology_spec: allow to run if ULE is not initialized
sched_shim: restore kern.ccpu sysctl

It apparently should be considered part of the ABI, and is used by
the base top(1).  But do not declare the ccpu variable in headers, it is
needed only by 4bsd. So put the variable definition into sched_shim.c to
make the kernel buildable without SCHED_4BSD.
Fri, Jan 23, 3:20 PM
kib added a comment to D54831: Make ULE and 4BSD coexists.
In D54831#1253408, @kib wrote:

I'd also like us to investigate why the loop on the blocked lock was elided for 4BSD.

Simply because threads are never blocked because there is only one global scheduler lock on 4BSD instead of per-runq lock on ULE. So on ULE you are not guaranteed that the sched_switch() finished with removing the thread from CPU while cpu_switch() already tries to switch to the new context. The blocked state makes this race closed without the need to somehow lock several runqs. On 4BSD it is impossible (global lock).

I know well what it is used for in ULE. I was not sure about 4BSD, as it uses the blocked lock as well. After a check, that is just to release the locks of sleepqueues et al. earlier (as ULE also does), and indeed the newly selected thread cannot have the blocked lock.

Fri, Jan 23, 3:19 PM

Thu, Jan 22

kib added inline comments to D54831: Make ULE and 4BSD coexists.
Thu, Jan 22, 10:43 PM
kib updated subscribers of D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.
Thu, Jan 22, 10:40 PM
kib added a comment to D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.
In D54816#1253431, @kib wrote:

So I went ahead and implemented the third (?) approach. IMO it is the only safe option there.

We cannot convert ktrace_mtx to a spinlock?

Thu, Jan 22, 10:40 PM
kib added inline comments to D54831: Make ULE and 4BSD coexists.
Thu, Jan 22, 10:02 PM
kib updated the diff for D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.

Remove comment.
Add __ktrace_used to 'tv' decls.

Thu, Jan 22, 10:00 PM
kib updated the diff for D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.

Upload the right diff.

Thu, Jan 22, 9:53 PM
kib updated the diff for D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.

So I went ahead and implemented the third (?) approach. IMO it is the only safe option there.

Thu, Jan 22, 9:52 PM
kib added a comment to D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.

If you really dislike this approach, I think a workable solution that also works for msleep_spin() is the following.

Thu, Jan 22, 9:29 PM
kib updated the diff for D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.

Actually commit the comment before diffing.

Thu, Jan 22, 9:23 PM
kib added a comment to D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.
In D54816#1253403, @kib wrote:

The issue is that for ktrcsw() we lock the ktrace_mtx mutex while owning the interlock from a subsystem that called msleep().

Would another solution be to modify _sleep() to release the spinlock later, effectively copying what msleep_spin_sbt() does?

I do not quite follow, sorry? We either own the interlock, or we own the spinlock. ktrcsw() cannot be called under spinlock at all.

msleep_spin_sbt() drops the sleepqueue spinlock in order to call ktrcsw(). So, it is not susceptible to the problem fixed by this diff. Can we use the same strategy elsewhere instead of adding a new field to struct thread?

Thu, Jan 22, 9:22 PM
kib added a comment to D54831: Make ULE and 4BSD coexists.
In D54831#1253408, @kib wrote:

I'd also like us to investigate why the loop on the blocked lock was elided for 4BSD.

Simply because threads are never blocked because there is only one global scheduler lock on 4BSD instead of per-runq lock on ULE. So on ULE you are not guaranteed that the sched_switch() finished with removing the thread from CPU while cpu_switch() already tries to switch to the new context. The blocked state makes this race closed without the need to somehow lock several runqs. On 4BSD it is impossible (global lock).

Checking the blocked state on 4BSD would give 4 instructions overhead which I dismiss with prejudice.
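The wait being discussed can be pictured with the C11-atomics sketch below. It is only an analogy: the real check lives in the machine-dependent cpu_switch() code, and `demo_blocked_lock`, `struct demo_thread`, `demo_unblock()` and `demo_wait_unblocked()` are hypothetical names. The outgoing side publishes the real lock pointer once the thread is fully off its CPU; the incoming side spins while the thread's lock pointer still equals the sentinel.

```c
#include <assert.h>
#include <stdatomic.h>

struct demo_lock {
    int unused;
};

/* Sentinel meaning "this thread is still being switched out". */
static struct demo_lock demo_blocked_lock;

struct demo_thread {
    _Atomic(struct demo_lock *) td_lock;
};

/* Outgoing side: the switch-out is complete, publish the real lock. */
static void
demo_unblock(struct demo_thread *td, struct demo_lock *newlock)
{
    assert(newlock != &demo_blocked_lock);
    atomic_store_explicit(&td->td_lock, newlock, memory_order_release);
}

/* Incoming side: spin until the thread is no longer marked blocked. */
static struct demo_lock *
demo_wait_unblocked(struct demo_thread *td)
{
    struct demo_lock *l;

    while ((l = atomic_load_explicit(&td->td_lock,
        memory_order_acquire)) == &demo_blocked_lock)
        ;   /* busy-wait; the other CPU will store the real lock soon */
    return (l);
}
```

With a single global scheduler lock, as described for 4BSD, the incoming side would never observe the sentinel, so the loop reduces to one load and one compare that always fall through.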

Thu, Jan 22, 9:16 PM
kib added a comment to D54831: Make ULE and 4BSD coexists.

I'd also like us to investigate why the loop on the blocked lock was elided for 4BSD.

Thu, Jan 22, 9:15 PM
kib updated the diff for D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.

Rename to ktrace_mtx_unlock(), add explanation.

Thu, Jan 22, 9:11 PM
kib added a comment to D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.

The issue is that for ktrcsw() we lock the ktrace_mtx mutex while owning the interlock from a subsystem that called msleep().

Would another solution be to modify _sleep() to release the spinlock later, effectively copying what msleep_spin_sbt() does?

Thu, Jan 22, 9:11 PM
kib added a comment to D54831: Make ULE and 4BSD coexists.
In D54831#1253386, @kib wrote:

@olce and I are working toward the same goal but with a different approach. Instead of putting them both in the kernel binary, our approach is to make sched_4bsd a kernel module which overrides sched_ule's weak symbols on boot. This gives us two advantages:

  1. Kernel binary size stays the same (but letting them coexist won't have a significant impact on kernel size, so this is not a huge advantage)
  2. We can let users create third-party schedulers as kernel modules. This way, they can run their own scheduler by just building their kernel module, without buildkernel and installworld.

The first step of this approach is D54830 which I opened a few hours ago.

Weak symbols do not work exactly this way generally, and esp. in kernel. Anyway, you would not get what you want without significantly hacking either very early linker or even the loader.
I am slightly curious how far this would lead you.

I think I was trying to tackle that idea without a deeper understanding of the kernel linker. I'll do more research on this. But at least I want to hear opinions on loading a scheduler as a kernel module.

Thu, Jan 22, 9:08 PM
kib added a comment to D54831: Make ULE and 4BSD coexists.

@olce and I are working toward the same goal but with a different approach. Instead of putting them both in the kernel binary, our approach is to make sched_4bsd a kernel module which overrides sched_ule's weak symbols on boot. This gives us two advantages:

  1. Kernel binary size stays the same (but letting them coexist won't have a significant impact on kernel size, so this is not a huge advantage)
  2. We can let users create third-party schedulers as kernel modules. This way, they can run their own scheduler by just building their kernel module, without buildkernel and installworld.

The first step of this approach is D54830 which I opened a few hours ago.

Thu, Jan 22, 8:32 PM
kib added a comment to D54831: Make ULE and 4BSD coexists.

I only tried to boot into ULE so far.
There are two MD places that need to be handled for each arch, but I only did that for amd64 right now:

  • cpu_switch() is made to wait unconditionally for the blocked thread lock to be released
  • sched_instance_select() must be called before ifuncs are resolved.
Thu, Jan 22, 7:17 PM
kib requested review of D54831: Make ULE and 4BSD coexists.
Thu, Jan 22, 7:13 PM
kib committed rGdfc4186c6dcf: x86 lapic: Dump LVTs from the ddb show lapic command (authored by kib).
x86 lapic: Dump LVTs from the ddb show lapic command
Thu, Jan 22, 7:10 PM
kib committed rG2b1db07bec92: x86: add machine/ifunc.h (authored by kib).
x86: add machine/ifunc.h
Thu, Jan 22, 7:10 PM
kib added a comment to D54820: sendfile(2): document that EINTR never happens on non-blocking socket.

Ok, my reluctance about this documentation change is because such a promise, of never returning a specific error code, is very hard to fulfill. For instance, on an intr NFS mount, VOP_GETATTR() can return EINTR, and so on.

Thu, Jan 22, 4:06 PM
kib accepted D54785: witness: Provide facility to print detailed lock tree.
Thu, Jan 22, 3:49 PM
kib accepted D54825: xdr_string: don't leak strings with xdr_free.
Thu, Jan 22, 2:28 PM
kib accepted D54824: rpc/xdr.h: make xdrproc_t always take two arguments.
Thu, Jan 22, 2:26 PM
kib added a comment to D54824: rpc/xdr.h: make xdrproc_t always take two arguments.

In summary: it is defined in the comments as a function of two arguments, I believe.

Thu, Jan 22, 2:25 PM
kib added a comment to D54820: sendfile(2): document that EINTR never happens on non-blocking socket.

There is a call to kern_writev() on the socket, is it true that it never returns EINTR for non-blocking socket?

Thu, Jan 22, 2:19 PM

Wed, Jan 21

kib accepted D54785: witness: Provide facility to print detailed lock tree.
Wed, Jan 21, 6:28 PM
kib added inline comments to D54785: witness: Provide facility to print detailed lock tree.
Wed, Jan 21, 6:27 PM
kib added a comment to D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.

i386 adjustments for kern_thread.c KBI asserts will be done after the review

Wed, Jan 21, 6:22 PM
kib requested review of D54816: ktrcsw(): should not be called when the thread is owning interlock or on sleepq.
Wed, Jan 21, 6:22 PM
kib updated subscribers of D54592: Add pdrfork(2) and pdwait(2).
Wed, Jan 21, 6:21 PM
kib updated the diff for D54592: Add pdrfork(2) and pdwait(2).

Assert that PDF_CLOSED is not set for fd operated by pdwait().
Make pdwait() cancellable.

Wed, Jan 21, 3:34 PM
kib added inline comments to D54592: Add pdrfork(2) and pdwait(2).
Wed, Jan 21, 2:20 PM
kib updated the diff for D54592: Add pdrfork(2) and pdwait(2).

Avoid racy access to the p_pid member of possibly reclaimed struct proc.
Other review comments are handled.

Wed, Jan 21, 12:33 AM
kib added inline comments to D54592: Add pdrfork(2) and pdwait(2).
Wed, Jan 21, 12:31 AM

Tue, Jan 20

kib added a comment to D54785: witness: Provide facility to print detailed lock tree.
In D54785#1252692, @jtl wrote:
In D54785#1252666, @kib wrote:

I do not think that trying to schedule a task is very robust with a LOR detected. We are already in the situation risking the deadlock.

Fair point!

If you refuse to make this the default, I am asking to add a simple knob which would make this additional info printed in all situations.

I don't "refuse"; I was just giving my thinking.

Ok, refuse was probably too strong a word, sorry about this.

Tue, Jan 20, 9:03 PM
kib added a comment to D54785: witness: Provide facility to print detailed lock tree.

I do not think that trying to schedule a task is very robust with a LOR detected. We are already in the situation risking the deadlock.
Also, I do not think that the doubts about stack overflow are well grounded, since WITNESS typically has spinlock recording disabled (I believe there are unsolved issues with the console spinlock which in reality cause a hard hang). So there is definitely space for an interrupt frame, and more, when WITNESS is running.

Tue, Jan 20, 7:55 PM
kib closed D54804: ktrace: do not enqueue request if the process' ktrioparams are freed.
Tue, Jan 20, 7:47 PM
kib committed rG6bb3f208617b: ktrace: do not enqueue request if the process' ktrioparams are freed (authored by kib).
ktrace: do not enqueue request if the process' ktrioparams are freed
Tue, Jan 20, 7:47 PM
kib requested review of D54804: ktrace: do not enqueue request if the process' ktrioparams are freed.
Tue, Jan 20, 7:38 PM
kib updated the diff for D54592: Add pdrfork(2) and pdwait(2).

Correct macro names.

Tue, Jan 20, 3:03 PM
kib committed rG96acaa960023: compat32: provide a type and a macro for (u)int64_t handling on non-x86 arches (authored by kib).
compat32: provide a type and a macro for (u)int64_t handling on non-x86 arches
Tue, Jan 20, 2:43 PM
kib committed rGbe1b2da855cc: sys/abi_compat.h: fix UB for bintime32 handling (authored by kib).
sys/abi_compat.h: fix UB for bintime32 handling
Tue, Jan 20, 2:43 PM
kib closed D54663: sys/abi_compat.h: fix UB.
Tue, Jan 20, 2:43 PM
kib updated the diff for D54592: Add pdrfork(2) and pdwait(2).

Grammar and wording of the man page update.

Tue, Jan 20, 2:35 PM
kib added inline comments to D54592: Add pdrfork(2) and pdwait(2).
Tue, Jan 20, 2:34 PM
kib updated the diff for D54592: Add pdrfork(2) and pdwait(2).

Add note about the pid reuse.

Tue, Jan 20, 3:45 AM
kib added a comment to D54592: Add pdrfork(2) and pdwait(2).

This looks great, @kib! If you commit it, I'll follow up with the test suite. A few questions:

  • Are you planning to write the man page yourself, or would you like help with that?
Tue, Jan 20, 3:42 AM
kib updated the diff for D54592: Add pdrfork(2) and pdwait(2).

Add documentation.
Fix pdwait() prototype.

Tue, Jan 20, 3:41 AM

Mon, Jan 19

kib added a comment to D54785: witness: Provide facility to print detailed lock tree.

This should be extremely useful, I already anticipate it. But, why not make this information available by default? I am not even sure that the knob is needed to hide it.

Mon, Jan 19, 8:19 PM
kib updated the diff for D54592: Add pdrfork(2) and pdwait(2).

Renumber AUE_PDRFORK.
Drop contrib/openbsm patch, similar to how other new AUE_ constants are not imported there.

Mon, Jan 19, 7:20 PM
kib added a comment to D54323: krb5: Expose missing symbols.
In D54323#1252001, @cy wrote:

DSO bump.

Mon, Jan 19, 5:34 PM
kib committed rG902e3057cd5c: lib/libthr: add pthread_tryjoin(3) test (authored by kib).
lib/libthr: add pthread_tryjoin(3) test
Mon, Jan 19, 4:58 PM
kib committed rG7f026a58691d: Document pthread_tryjoin_np(3) (authored by kib).
Document pthread_tryjoin_np(3)
Mon, Jan 19, 4:58 PM
kib committed rGafa70a8496e9: libthr: add pthread_tryjoin_np() (authored by kib).
libthr: add pthread_tryjoin_np()
Mon, Jan 19, 4:58 PM
kib closed D54766: Add pthread_tryjoin_np(3) as provided by glibc.
Mon, Jan 19, 4:58 PM
kib committed rGce16be73707e: libthr/thread/thr_join.c: deduplicate backout_join() helper (authored by kib).
libthr/thread/thr_join.c: deduplicate backout_join() helper
Mon, Jan 19, 4:58 PM
kib committed rG002c50ea23b9: amd64/vmm: remove unused static function vcpu_state2str() (authored by kib).
amd64/vmm: remove unused static function vcpu_state2str()
Mon, Jan 19, 4:46 PM
kib closed D54781: amd64/vmm: remove unused static function vcpu_state2str().
Mon, Jan 19, 4:46 PM
kib requested review of D54781: amd64/vmm: remove unused static function vcpu_state2str().
Mon, Jan 19, 4:30 PM
kib committed rG709a53c8b20b: x86/local_apic.c: Properly calculate the number of LVT entries (authored by kib).
x86/local_apic.c: Properly calculate the number of LVT entries
Mon, Jan 19, 4:21 PM
kib committed rGad5e3cb95034: x86/local_apic.c: add lapic_maxlvt() helper (authored by kib).
x86/local_apic.c: add lapic_maxlvt() helper
Mon, Jan 19, 4:21 PM
kib committed rG83d988288675: sys: do not allow entering vm_fault() on boot until VM is initialized (authored by kib).
sys: do not allow entering vm_fault() on boot until VM is initialized
Mon, Jan 19, 4:21 PM
kib closed D54773: x86/local_apic.c: Properly calculate the number of LVT entries.
Mon, Jan 19, 4:21 PM
kib closed D54768: sys: do not allow entering vm_fault() on boot until VM is initialized.
Mon, Jan 19, 4:21 PM
kib updated the diff for D54773: x86/local_apic.c: Properly calculate the number of LVT entries.

Use lapic_maxlvt() everywhere.

Mon, Jan 19, 4:07 PM
kib updated the diff for D54768: sys: do not allow entering vm_fault() on boot until VM is initialized.

Save file before diffing.

Mon, Jan 19, 4:02 PM
kib updated the diff for D54768: sys: do not allow entering vm_fault() on boot until VM is initialized.

Add comment in vm_init.c

Mon, Jan 19, 4:01 PM