avg (Andriy Gapon)
User

User Details

User Since
Jun 4 2014, 6:42 AM (211 w, 4 d)

Recent Activity

Fri, Jun 22

avg committed rS335557: MFC r333667: followup to r332730/r332752: set kdb_why to "trap" for fatal traps.
Fri, Jun 22, 11:16 AM
avg committed rS335556: MFC r333667: followup to r332730/r332752: set kdb_why to "trap" for fatal traps.
Fri, Jun 22, 10:49 AM
avg committed rS335555: MFC r333321,r333707: x86 cpususpend_handler: call wbinvd after setting suspend….
Fri, Jun 22, 10:44 AM
avg committed rS335554: MFC r332918, r333222: go deeper for ACPI suspend bounce test.
Fri, Jun 22, 10:39 AM
avg created D15964: subr_bus: add a list of attached children to record their order.
Fri, Jun 22, 10:21 AM
avg committed rS335549: Revert r335546 as temporary pool name feature has not been merged.
Fri, Jun 22, 10:13 AM
avg committed rS335546: MFC r333630: Fix 'zpool create -t <tempname>'.
Fri, Jun 22, 9:41 AM
avg committed rS335545: MFC r333630: Fix 'zpool create -t <tempname>'.
Fri, Jun 22, 9:37 AM
avg committed rS335544: MFC r334785: expand descriptions of x86 panic_on_nmi and kdb_on_nmi sysctls.
Fri, Jun 22, 9:29 AM
avg committed rS335543: MFC r333269: amdsbwd: fix reboot status reporting.
Fri, Jun 22, 9:26 AM
avg committed rS335542: MFC r333269: amdsbwd: fix reboot status reporting.
Fri, Jun 22, 9:25 AM
avg committed rS335541: MFC r333243: opensolaris system_taskq does not need to run at maximum priority.
Fri, Jun 22, 9:23 AM
avg committed rS335540: MFC r333243: opensolaris system_taskq does not need to run at maximum priority.
Fri, Jun 22, 9:22 AM
avg committed rS335538: MFC r333212: amdsbwd: add suspend and resume methods.
Fri, Jun 22, 9:21 AM
avg committed rS335537: MFC r333212: amdsbwd: add suspend and resume methods.
Fri, Jun 22, 9:20 AM
avg committed rS335536: MFC r332816: call racct_proc_ucred_changed() under the proc lock.
Fri, Jun 22, 9:19 AM
avg committed rS335534: MFC r333209: hpet: use macros instead of magic values for the timer mode.
Fri, Jun 22, 9:10 AM
avg committed rS335533: MFC r333209: hpet: use macros instead of magic values for the timer mode.
Fri, Jun 22, 9:08 AM
avg added a comment to D15905: safer wait-free iteration of shared interrupt handlers.
In D15905#337248, @cem wrote:

The code may have been in-tree, but it was not used. Also, I think it did more than just that, but I don't recall. What were the ramifications of using it, and was there a good reason it was never enabled? Or just inertia? Feel free to reverse the default if it makes sense — I just want one copy of the interrupt code.

Fri, Jun 22, 7:35 AM
avg added a comment to D15905: safer wait-free iteration of shared interrupt handlers.

By the way, here is a "purified" variant of the interrupt handling code without all the complications of the interrupt threads, software interrupts, interactions with other subsystems, etc.
This is purely a list of handlers that can be iterated in a wait-free manner.
I think that it could be used, for example, for installing NMI handlers in a more regular fashion than what we have now.
https://reviews.freebsd.org/P185
I hope that that code makes the idea behind this change a little bit more clear.

Fri, Jun 22, 7:24 AM
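
The idea of a handler list that can be iterated in a wait-free manner, as described in the comment above, can be sketched in portable C11. This is not the code from P185 or from the kernel: the struct and function names are hypothetical, writers are assumed to be serialized by some external lock, and handler removal (which needs deferred reclamation) is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical types; not the structures used in P185 or in the kernel. */
struct handler {
	void (*func)(void *);
	void *arg;
	struct handler *_Atomic next;
};

struct handler_list {
	struct handler *_Atomic head;
};

static void
handler_list_init(struct handler_list *l)
{
	atomic_init(&l->head, (struct handler *)NULL);
}

/*
 * Publish a new handler at the head of the list.
 * Assumption: callers serialize insertions with an external lock.
 */
static void
handler_insert(struct handler_list *l, struct handler *h)
{
	struct handler *old;

	assert(h != NULL);
	old = atomic_load_explicit(&l->head, memory_order_relaxed);
	atomic_store_explicit(&h->next, old, memory_order_relaxed);
	/* The release store makes the fully initialized node visible. */
	atomic_store_explicit(&l->head, h, memory_order_release);
}

/*
 * Call every handler; returns the number of handlers called.
 * The reader never takes a lock and never retries, so each step
 * completes in a bounded number of instructions.
 */
static int
handler_run_all(struct handler_list *l)
{
	struct handler *h;
	int n = 0;

	for (h = atomic_load_explicit(&l->head, memory_order_acquire);
	    h != NULL;
	    h = atomic_load_explicit(&h->next, memory_order_acquire)) {
		h->func(h->arg);
		n++;
	}
	return (n);
}

/* Sample callback for illustration: increments an int counter. */
static void
count_call(void *arg)
{
	(*(int *)arg)++;
}
```

A reader that races with an insertion sees either the old head or the new one, and in both cases walks a consistent list.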
avg created P185 wait-free event handlers.
Fri, Jun 22, 7:18 AM
avg added a reviewer for D15905: safer wait-free iteration of shared interrupt handlers: mjg.
Fri, Jun 22, 6:42 AM

Thu, Jun 21

avg updated the diff for D15905: safer wait-free iteration of shared interrupt handlers.

Rework the proposed change.

Thu, Jun 21, 9:43 PM

Wed, Jun 20

avg added a comment to D15905: safer wait-free iteration of shared interrupt handlers.
In D15905#337244, @cem wrote:
Wed, Jun 20, 9:09 PM
avg added a comment to D15905: safer wait-free iteration of shared interrupt handlers.
In D15905#337244, @cem wrote:
Wed, Jun 20, 9:08 PM
avg added inline comments to D15905: safer wait-free iteration of shared interrupt handlers.
Wed, Jun 20, 9:06 PM
avg added a comment to D15905: safer wait-free iteration of shared interrupt handlers.
In D15905#337073, @cem wrote:

Sure, but they will already ping-pong the linked list walk between cores if there is contention there; might as well ping pong the real lock instead.

Wed, Jun 20, 8:41 PM
avg added a comment to D15905: safer wait-free iteration of shared interrupt handlers.
In D15905#337019, @cem wrote:

CK_SLIST when most of accesses are done under a lock is an unneeded overhead.

Can you elaborate on that overhead?

Wed, Jun 20, 4:39 PM

Tue, Jun 19

avg created D15905: safer wait-free iteration of shared interrupt handlers.
Tue, Jun 19, 3:25 PM

Fri, Jun 15

avg accepted D15725: Fix build of atk0110 with base gcc on i386.

Oops, I thought I approved this, but apparently it was "only" on IRC.
I hope that your commit message will not contain "assume" and will instead state that we do pass a pointer by means of uintmax_t (which must not be narrower than uintptr_t), as we discussed.

Fri, Jun 15, 8:38 PM
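
The point about uintmax_t in the D15725 comment above can be illustrated with a small, self-contained C11 sketch. The helper names are hypothetical and unrelated to the atk0110 driver; the sketch only shows a pointer making a round trip through uintmax_t by way of uintptr_t, with the width requirement checked at compile time rather than assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Compile-time check: uintmax_t must not be narrower than uintptr_t. */
static_assert(sizeof(uintmax_t) >= sizeof(uintptr_t),
    "uintmax_t is narrower than uintptr_t");

/* Hypothetical helper: carry a pointer in a uintmax_t. */
static uintmax_t
ptr_to_umax(void *p)
{
	/* Going through uintptr_t avoids pointer-to-int size warnings. */
	return ((uintmax_t)(uintptr_t)p);
}

/* Hypothetical helper: recover the pointer from the uintmax_t. */
static void *
umax_to_ptr(uintmax_t v)
{
	return ((void *)(uintptr_t)v);
}
```

The intermediate uintptr_t cast matters on i386, where a pointer is 32 bits wide and uintmax_t is 64 bits wide.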
avg added a comment to D15207: loader: zfs reader must use uint64_t instead of off_t.

Well, we haven't seen any problems from this code. Except for that oddball system you mentioned earlier.
Not sure if this much churn is really warranted.

Fri, Jun 15, 4:23 PM

Tue, Jun 12

avg added a comment to D15755: add support for marking interrupt handlers as suspended.
In D15755#333364, @kib wrote:

I suspect it might be useful to get an ack from the interrupt thread that it sees the IH_SUSP flag, so that device power down does not occur while the handler is still running?

Tue, Jun 12, 3:26 PM
avg added inline comments to D15755: add support for marking interrupt handlers as suspended.
Tue, Jun 12, 8:15 AM

Mon, Jun 11

avg added inline comments to D15755: add support for marking interrupt handlers as suspended.
Mon, Jun 11, 2:12 PM
avg updated the diff for D15755: add support for marking interrupt handlers as suspended.

fix inline documentation of suspend_intr and resume_intr bus methods

Mon, Jun 11, 9:48 AM
avg updated subscribers of D15755: add support for marking interrupt handlers as suspended.

I would appreciate any comments and suggestions on the overall approach, the interrupt code changes and the bus code changes (their design, API extension).
I suspect that there can be a more elegant way to get the desired functionality.

Mon, Jun 11, 9:40 AM
avg added a comment to D15688: power down devices on pci bus only after having suspended all attached drivers.

I've just posted D15755.

Mon, Jun 11, 9:37 AM
avg created D15755: add support for marking interrupt handlers as suspended.
Mon, Jun 11, 9:36 AM

Fri, Jun 8

avg abandoned D15688: power down devices on pci bus only after having suspended all attached drivers.
Fri, Jun 8, 2:41 PM
avg added a comment to D15688: power down devices on pci bus only after having suspended all attached drivers.

Thank you for pointing out those problems.
I have an alternative WIP where I added support for marking interrupts as suspended.
Not sure if I did that right. I'll post a review on Monday so that problems with it can be pointed out and better ideas can be suggested.
In that work, I also suspend interrupts only for PCI devices and only legacy interrupts.
Interrupts are suspended and resumed in pci_suspend_child and pci_resume_child.

Fri, Jun 8, 2:40 PM
avg added a comment to D15646: Provide option to panic when the IPMI creates an NMI.

I also have a review request for a new kind of NMI watchdog, D15630.
In that case I do invoke a new callback to check whether the watchdog recognizes the NMI as caused by its hardware.
Not sure if that would be overkill in this case.

Fri, Jun 8, 2:32 PM

Thu, Jun 7

avg accepted D15696: Use int instead of char to take the result of getopt() in ZFS utilities.

LGTM.
Please consider opening an issue at https://www.illumos.org/projects/illumos-gate/issues even though illumos does not support PowerPC.

Thu, Jun 7, 7:41 PM
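
Why the result of getopt() belongs in an int can be shown with a short sketch. It is not taken from the ZFS utilities; the function names and the "ab" option string are hypothetical. getopt() returns -1 at the end of the options, and on ABIs where plain char is unsigned (as is commonly the case on PowerPC) a char cannot hold -1, so a loop testing a char against -1 never terminates.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Shows what is left of -1 once it has been stored in an unsigned char:
 * the value becomes 255 and no longer compares equal to -1.
 * Returns 1 if the end marker is still recognizable, 0 otherwise.
 */
static int
end_marker_survives_unsigned_char(void)
{
	unsigned char ch = (unsigned char)-1;
	int widened = ch;	/* integer promotion yields 255 */

	return (widened == -1);
}

/*
 * Count -a and -b flags (hypothetical options).
 * Assumption: this is the first getopt() scan in the process,
 * so no scanner state needs to be reset.
 */
static int
count_flags(int argc, char *argv[])
{
	int ch;		/* int, so that -1 stays distinct from any option */
	int n = 0;

	assert(argv != NULL);
	while ((ch = getopt(argc, argv, "ab")) != -1) {
		if (ch == 'a' || ch == 'b')
			n++;
	}
	return (n);
}
```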
avg created D15693: replace preempt_thresh with a set of knobs to control preemption for each priority class.
Thu, Jun 7, 5:10 PM
avg committed rS334786: x86: reorganize code that deals with unexpected NMI-s.
Thu, Jun 7, 2:47 PM
avg committed rS334785: expand descriptions of x86 panic_on_nmi and kdb_on_nmi sysctls.
Thu, Jun 7, 2:25 PM
avg created D15688: power down devices on pci bus only after having suspended all attached drivers.
Thu, Jun 7, 7:15 AM

Tue, Jun 5

avg accepted D15342: Break recursion involving getnewvnode and zfs_rmnode.

LGTM.
Thank you!

Tue, Jun 5, 7:03 AM

Sun, Jun 3

avg closed D15306: call AcpiLeaveSleepStatePrep after re-enabling interrupts.

The change has actually been committed, see rS334479.

Sun, Jun 3, 12:16 PM
avg accepted D15306: call AcpiLeaveSleepStatePrep after re-enabling interrupts.
Sun, Jun 3, 12:15 PM

Fri, Jun 1

avg committed rS334479: call AcpiLeaveSleepStatePrep after re-enabling interrupts.
Fri, Jun 1, 9:44 AM

Thu, May 31

avg updated the test plan for D15630: HPET-based NMI (debug) watchdog.
Thu, May 31, 7:37 AM
avg updated subscribers of D15630: HPET-based NMI (debug) watchdog.

I guess I can split up the patch. Maybe when committing it, if that ever happens.
But while the change consists of several logical parts that can be isolated, each is useless without others.
Only the small improvements in sys/x86/x86/cpu_machdep.c (extending sysctl descriptions, etc) are completely independent of the rest of the changes.

Thu, May 31, 7:34 AM
avg added a comment to D15293: Handle the race between fork/vm_object_split() and faults.

We tested the earlier version of this patch and didn't see any regressions.
We also didn't see any recurrence of the problem, but it was very rare without the patch too.

Thu, May 31, 7:02 AM

Wed, May 30

avg updated the diff for D15630: HPET-based NMI (debug) watchdog.

fix a typo

Wed, May 30, 9:24 PM
avg created D15630: HPET-based NMI (debug) watchdog.
Wed, May 30, 9:16 PM

Tue, May 29

avg committed rS334340: add support for console resuming, implement it for uart, use on x86.
Tue, May 29, 4:16 PM
avg closed D15552: add support for console resuming, implement it for uart, use on x86.
Tue, May 29, 4:16 PM
avg committed rS334338: fix x86 UP build broken by r334204, TSC resynchronization.
Tue, May 29, 4:03 PM

May 25 2018

avg committed rS334204: re-synchronize TSC-s on SMP systems after resume, if necessary.
May 25 2018, 7:33 AM
avg closed D15551: re-synchronize TSC-s on SMP systems after resume, if necessary.
May 25 2018, 7:33 AM
avg committed rS334203: fix zfs_getpages crash when called from sendfile, followup to r329363.
May 25 2018, 7:30 AM
avg added a comment to D15562: ZFS sorted scans.

Just a quick question, is this the same change that was recently presented at OpenZFS summit (http://open-zfs.org/w/images/a/a0/Saso_-_resilver_update.pdf)?
Or an alternative to it?

May 25 2018, 6:31 AM · ZFS
avg added inline comments to D15551: re-synchronize TSC-s on SMP systems after resume, if necessary.
May 25 2018, 6:06 AM
avg added inline comments to D15552: add support for console resuming, implement it for uart, use on x86.
May 25 2018, 5:39 AM

May 24 2018

avg added a comment to D15560: Use a spin lock to serialize updates on AMD CPUs.

Thanks a lot!
About this:

Unclear if updates to independent cores need to be serialized. Seems harmless — upper bound of 100M cycles = 50 milliseconds at 2GHz. Even on 32-core Epyc that's max 1.6 seconds to load all cores, with a low-clock part.
The memory pointed to can't cause a page or segmentation fault, but obviously that shouldn't happen for any valid kernel memory anyway.

May 24 2018, 8:39 PM
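
The timing estimate quoted in the comment above can be checked with a few lines of C. The function names are hypothetical; the inputs (an upper bound of 100M cycles per update, a 2 GHz clock, 32 cores updated one after another) are the figures from the quoted text.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count to whole milliseconds at the given clock rate. */
static uint64_t
cycles_to_ms(uint64_t cycles, uint64_t hz)
{
	assert(hz != 0);
	return (cycles * 1000 / hz);
}

/* Total time when the update runs on each core in turn (serialized). */
static uint64_t
serialized_update_ms(uint64_t cycles_per_update, uint64_t hz, unsigned ncores)
{
	return (cycles_to_ms(cycles_per_update, hz) * ncores);
}
```

With those inputs, cycles_to_ms(100000000, 2000000000) gives 50 ms per core and serialized_update_ms(100000000, 2000000000, 32) gives 1600 ms, which matches the 1.6 seconds in the quoted estimate.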
avg added a comment to D15560: Use a spin lock to serialize updates on AMD CPUs.

We already check MSR in userland (in cpucontrol), but treat it as a global.
In any case, all three alternatives are fine by me.
Although, my (weak) preference is to use the topology information.

May 24 2018, 8:12 PM
avg added a comment to D15560: Use a spin lock to serialize updates on AMD CPUs.

@markj something like that, yes.
I am not sure if logical_cpus_mask is accurate on Ryzen, but I think that we collect enough topology information to either make it correct or to provide another way of checking whether a CPU is a "primary" thread of a core.

May 24 2018, 7:54 PM
avg added a comment to D15560: Use a spin lock to serialize updates on AMD CPUs.

I think that it should be relatively easy to instruct threads to do either an update or just a spin.
Having said that, I won't object to this change if it fixes a problem that has been experienced.

May 24 2018, 7:38 PM
avg added a comment to D15560: Use a spin lock to serialize updates on AMD CPUs.

I am curious if both (all) hardware threads within a core need to do the update or if it would be sufficient for just one thread per core to do it.

May 24 2018, 7:26 PM
avg created D15552: add support for console resuming, implement it for uart, use on x86.
May 24 2018, 12:56 PM
avg added inline comments to D15551: re-synchronize TSC-s on SMP systems after resume, if necessary.
May 24 2018, 11:31 AM
avg created D15551: re-synchronize TSC-s on SMP systems after resume, if necessary.
May 24 2018, 10:30 AM

May 21 2018

avg committed rS334002: uchcom: extend hardware support to version 0x30.
May 21 2018, 9:04 PM
avg closed D15498: uchcom: extend hardware support, other enhancements.
May 21 2018, 9:04 PM
avg committed rS334001: uchcom: remove UCHCOM_REG_BREAK2 alias of UCHCOM_REG_LCR1.
May 21 2018, 9:02 PM
avg committed rS334000: uchcom: reject parity and double stop bits as unsupported.
May 21 2018, 9:00 PM
avg committed rS333999: uchcom: add a hardware configuration tweak seen in Linux code.
May 21 2018, 8:59 PM
avg committed rS333998: uchcom: add DPRINTF-s to aid debugging of the driver.
May 21 2018, 8:58 PM
avg committed rS333997: uchcom: report detected product based on USB product ID.
May 21 2018, 8:57 PM
avg committed rS333994: stop and restart kernel event timers in the suspend / resume cycle.
May 21 2018, 8:23 PM
avg closed D15413: stop and restart kernel event timers in the suspend / resume cycle.
May 21 2018, 8:23 PM
avg added a comment to D15342: Break recursion involving getnewvnode and zfs_rmnode.

The change looks good to me. A couple of small comments inline.
Thank you!

May 21 2018, 4:35 PM

May 20 2018

avg updated the diff for D15498: uchcom: extend hardware support, other enhancements.

remove a change not intended for committing

May 20 2018, 9:44 AM
avg created D15498: uchcom: extend hardware support, other enhancements.
May 20 2018, 9:43 AM

May 17 2018

avg committed rS333707: fix a problem with bad performance after wakeup caused by r333321.
May 17 2018, 10:16 AM

May 16 2018

avg updated the diff for D15424: nfsrvd_readdirplus: for some errors, skip an entry instead of failing the request.
  • restore formatting of a block that has no functional changes
  • re-arrange the code that handles attribute errors per Rick's suggestion
May 16 2018, 7:54 AM
avg updated the diff for D15424: nfsrvd_readdirplus: for some errors, skip an entry instead of failing the request.

fix bugs pointed out by the reviewers

May 16 2018, 7:32 AM
avg added inline comments to D15424: nfsrvd_readdirplus: for some errors, skip an entry instead of failing the request.
May 16 2018, 7:24 AM
avg committed rS333667: followup to r332730/r332752: set kdb_why to "trap" for fatal traps.
May 16 2018, 6:52 AM
avg closed D15431: follow-up to r332730 and r332752: set kdb_why to "trap" for fatal traps.
May 16 2018, 6:52 AM
avg added a comment to D15413: stop and restart kernel event timers in the suspend / resume cycle.

I agree. Although, in practice, it seems that at present the timer devices (on x86, at least) don't have much, if any, state to save or restore.
It's sufficient that the resume code puts them in workable state and then the clocksource code just programs them for the next event.

May 16 2018, 6:28 AM

May 15 2018

avg committed rS333638: calibrate lapic timer in native_lapic_setup.
May 15 2018, 4:56 PM
avg closed D15422: calibrate lapic timer in native_lapic_setup.
May 15 2018, 4:56 PM
avg committed rS333630: Fix 'zpool create -t <tempname>'.
May 15 2018, 1:27 PM
avg added a comment to D15413: stop and restart kernel event timers in the suspend / resume cycle.

You'll need to add a suspending check to pause() so that it uses DELAY() instead of _sleep().

May 15 2018, 10:37 AM
avg updated the diff for D15431: follow-up to r332730 and r332752: set kdb_why to "trap" for fatal traps.

place 'handled' under KDB, fix a whitespace issue near #ifdef

May 15 2018, 8:43 AM
avg added inline comments to D15431: follow-up to r332730 and r332752: set kdb_why to "trap" for fatal traps.
May 15 2018, 7:37 AM
avg added inline comments to D15424: nfsrvd_readdirplus: for some errors, skip an entry instead of failing the request.
May 15 2018, 6:07 AM
avg updated the diff for D15424: nfsrvd_readdirplus: for some errors, skip an entry instead of failing the request.

wordsmithing

May 15 2018, 6:04 AM

May 14 2018

avg updated subscribers of D15413: stop and restart kernel event timers in the suspend / resume cycle.
May 14 2018, 8:35 PM