crest_bultmann.eu (Jan Bramkamp)
User

Projects

User does not belong to any projects.

User Details

User Since
Sep 13 2018, 3:26 PM (31 w, 9 h)

Recent Activity

Oct 15 2018

crest_bultmann.eu added a comment to D17569: sysutils/s6-rc: Update to 0.4.1.0.

I would just like to state that I'm not the maintainer of s6-rc, and the maintainer of s6-rc (moo@arthepsy.eu) did not set a maintainer-approved flag for my patch in PR #232053.

Oct 15 2018, 4:35 PM
crest_bultmann.eu accepted D17568: sysutils/s6: Update to 2.7.2.1.
Oct 15 2018, 4:02 PM

Oct 1 2018

crest_bultmann.eu added a comment to D17304: Swap in processes unless there's a global memory shortage..

I'm testing this patch against r338924 on the same 32-core EPYC 7551P system used for testing against D17059.

Oct 1 2018, 12:41 PM

Sep 25 2018

crest_bultmann.eu added a comment to D17304: Swap in processes unless there's a global memory shortage..

I'm testing this patch against r338924 on the same 32-core EPYC 7551P system used for testing against D17059.

Sep 25 2018, 8:51 AM

Sep 24 2018

crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

I removed enough DIMMs to balance all four NUMA domains on my 32-core EPYC system. Now each of the four domains contains a single 32GB DIMM for a total of 128GB. Under load (again multiple dd processes writing to ZFS) the system still swaps out complete processes (e.g. login shells running zpool iostat or top). If those processes exit and their parent shell was swapped out, it can take over a minute until the shell is swapped back in, although there are at least 3GB of free memory spread over all domains according to top.

Sep 24 2018, 11:23 AM · NUMA

Sep 18 2018

crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

I have to revise my statement. I tried another torture test (six dd if=/dev/zero bs=1m of=/kkdata/benchmark/$RANDOM writing to an uncompressed dataset). The system is still writing at about 1GB/s with the patch, but trying to exit some tools (e.g. zpool, top) hangs. Here is the procstat -kka output:

I don't see any such processes in the procstat output. Did you try this test without "options NUMA"?

Did you look at the zsh processes as well? I observed no hangs without "options NUMA".

Yes, seems they're just waiting for children to report an exit status. I am wondering if the processes got swapped out. Could you provide "ps auxwwwH" output?

Sep 18 2018, 4:11 PM · NUMA
crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

I have to revise my statement. I tried another torture test (six dd if=/dev/zero bs=1m of=/kkdata/benchmark/$RANDOM writing to an uncompressed dataset). The system is still writing at about 1GB/s with the patch, but trying to exit some tools (e.g. zpool, top) hangs. Here is the procstat -kka output:

I don't see any such processes in the procstat output. Did you try this test without "options NUMA"?

Sep 18 2018, 3:36 PM · NUMA
crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

I have to revise my statement. I tried another torture test (six dd if=/dev/zero bs=1m of=/kkdata/benchmark/$RANDOM writing to an uncompressed dataset). The system is still writing at about 1GB/s with the patch, but trying to exit some tools (e.g. zpool, top) hangs. Here is the procstat -kka output:

Sep 18 2018, 10:54 AM · NUMA
crest_bultmann.eu added a comment to D17209: Only update the domain cursor once in keg_fetch_slab()..
In D17209#366880, @cem wrote:

This changes keg cursor advancement behavior slightly. I'm not sure that matters.

Sep 18 2018, 8:52 AM
crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

I think I see the problem. Could you test with the diff at D17209 applied?

Sep 18 2018, 8:30 AM · NUMA

Sep 17 2018

crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

This is the output from top -HSazo res when writes to ZFS stopped being processed on the system with a NUMA-enabled kernel:

Thanks. Could you also grab "procstat -kka" output from the system in this state?

Here is the requested output from procstat -kka of a hanging system.

Great, this helps. Finally, could I ask for output from "sysctl vm", again from the system in this state?

Sep 17 2018, 5:45 PM · NUMA
crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

This is the output from top -HSazo res when writes to ZFS stopped being processed on the system with a NUMA-enabled kernel:

Thanks. Could you also grab "procstat -kka" output from the system in this state?

Sep 17 2018, 5:37 PM · NUMA
crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

This is the output from top -HSazo res when writes to ZFS stopped being processed on the system with a NUMA-enabled kernel:

Sep 17 2018, 1:31 PM · NUMA
crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

This time I triggered a panic via sysctl a few minutes after ZFS writes hung but before the kernel panic()ed on its own.

Sep 17 2018, 12:29 PM · NUMA
crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

I gave up after >500 screenshots of the IPMI KVM output. I haven't yet found a working configuration for Serial over LAN. I'm trying again with a dump device large enough to hold >200GB of RAM.

Sep 17 2018, 10:23 AM · NUMA
crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

Never mind. The ghosts in the machine read my post. The kernel just panic()ed again. I'm at the kernel debugger prompt in the IPMI KVM web interface.

Sep 17 2018, 9:04 AM · NUMA
crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

After copying 110TB between two pools with zfs send | mbuffer -m1g -s128k | zfs recv on a kernel without "options NUMA", I booted a kernel with "options NUMA" built from revision 338698. ZFS writes still hang, but the system doesn't panic. The mbuffer output shows that the buffer remains 100% full when writes hang.

Sep 17 2018, 9:02 AM · NUMA

Sep 14 2018

crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

I attached a screenshot of the system console taken via IPMI. Ignore the nvme-related lines. I reproduced the same panic with them unplugged. I used the ALPHA5 memstick (r338518) to install and encountered the panic with the GENERIC kernel from that installation. I checked out r338638, which includes NUMA in GENERIC, compiled a GENERIC-NODEBUG kernel, and disabled malloc debugging to get a realistic impression of the hardware's potential. The EPYC system compiled the kernel and world just fine, so I attached and imported the old ZFS pool from its predecessor (a FreeBSD 11.2 system) and tried to send | recv the relevant datasets from the old pool to a new pool. This repeatedly hung after about 70-80GB. Until writes stopped, the system transferred 1.0 to 1.1GB/s. I remembered reading about starvation in the NUMA code and disabled it on a hunch. With NUMA disabled the system is stable (so far) and currently halfway through copying 107TB from the old pool to the new pool.

Sep 14 2018, 9:49 AM · NUMA

Sep 13 2018

crest_bultmann.eu added a comment to D17059: Enable options NUMA on amd64 GENERIC/MINIMAL:.

With the NUMA option enabled, ZFS hangs after a few minutes of heavy write load, causing the deadman switch to panic the kernel on a 32-core AMD EPYC 7551P. I can still write to the swap partitions on the same disks while writes to ZFS on another partition hang.

Sep 13 2018, 3:37 PM · NUMA