nvme(4): Report NPWA before NPWG as stripesize.

Description

New Samsung 980 SSDs report a Namespace Preferred Write Alignment
(NPWA) of 8 (4KB) and a Namespace Preferred Write Granularity (NPWG)
of 32 (16KB). My quick tests show that 16KB is the minimal sequential
write size at which the SSD reaches peak IOPS, so writing much less is
very slow. But writing slightly less or slightly more does not change
much, so it seems to be not so much a size granularity as a minimum
I/O size.
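
The preference order named in the title can be sketched roughly as follows. This is a minimal illustration, not the driver's actual code: the struct `ns_hints`, its field names and the function `pick_stripesize()` are hypothetical. It assumes 512-byte logical blocks (which is what makes 8 blocks equal 4KB), that the raw fields are 0's based as the NVMe specification describes them (so raw 7 means 8 blocks), and that a raw value of 0 is treated as "not reported".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of the namespace fields discussed above. */
struct ns_hints {
	uint32_t sector_size;	/* logical block size in bytes (assumed 512) */
	uint16_t npwg;		/* raw NPWG, 0's based; 0 treated as unset */
	uint16_t npwa;		/* raw NPWA, 0's based; 0 treated as unset */
	uint32_t boundary;	/* fallback stripe size in bytes */
};

/*
 * Pick the stripe size in bytes: prefer NPWA, then NPWG, and fall back
 * to the boundary value when neither hint is reported.
 */
static uint32_t
pick_stripesize(const struct ns_hints *ns)
{
	assert(ns->sector_size != 0);
	if (ns->npwa != 0)
		return (((uint32_t)ns->npwa + 1) * ns->sector_size);
	if (ns->npwg != 0)
		return (((uint32_t)ns->npwg + 1) * ns->sector_size);
	return (ns->boundary);
}
```

With raw NPWA = 7 and raw NPWG = 31 on 512-byte blocks, this sketch yields 4096 bytes. Checking NPWG first would yield 16384 bytes instead.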

Thinking about different stripesize consumers:

  • Partition alignment should be based on NPWA by definition.
  • ZFS ashift, to the extent that it forces alignment of all I/Os,
    should also be based on NPWA. To the extent that it forces size
    granularity, it may be set to NPWG if really needed, but too big a
    value can make ZFS too space-inefficient, and 16KB is actually the
    biggest supported value there now.

  • ZFS recordsize/volblocksize could potentially be tuned up toward
    NPWG to work as I/O size granularity, but enabled compression makes
    it too fuzzy. And those are normally user-configurable things.

  • ZFS I/O aggregation code could definitely use an Optimal Write Size
    value, and maybe NPWG, but we don't have fields in GEOM now to report
    the minimal and optimal I/O sizes, and even the maximal one is not
    reported outside GEOM DISK to be used by ZFS.
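
To illustrate the space-efficiency concern for ashift, here is a small sketch of the arithmetic. The helper names `size_to_ashift()` and `round_to_ashift()` are hypothetical, and the model is deliberately simplified: it only rounds a block up to a multiple of 2^ashift and ignores RAID-Z, metadata and other allocation details.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Smallest shift such that (1 << shift) >= size.
 * size must be nonzero and no larger than 2^63.
 */
static unsigned
size_to_ashift(uint64_t size)
{
	unsigned shift = 0;

	assert(size != 0 && size <= ((uint64_t)1 << 63));
	while (((uint64_t)1 << shift) < size)
		shift++;
	return (shift);
}

/* Round a byte count up to the next multiple of (1 << ashift). */
static uint64_t
round_to_ashift(uint64_t bytes, unsigned ashift)
{
	uint64_t unit = (uint64_t)1 << ashift;

	return ((bytes + unit - 1) & ~(unit - 1));
}
```

Under these assumptions, a 4KB stripe size maps to ashift 12 and a 16KB one to ashift 14. A block that compresses to 5KB would then occupy 8KB at ashift 12 but 16KB at ashift 14, which is the kind of overhead the NPWG-based choice risks.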

MFC after: 1 week

Details

Provenance
mav authored on Jul 6 2021, 2:19 AM
Parents
rGe41fde3ed71c: On a failed fcmpset don't pointlessly repeat tests