Page MenuHomeFreeBSD

ZSTD compression for ZFS
AbandonedPublic

Authored by allanjude on Jun 10 2017, 2:06 AM.
Tags
None
Referenced Files
F108547954: D11124.id30301.diff
Sun, Jan 26, 5:10 AM
Unknown Object (File)
Sat, Jan 25, 10:58 PM
Unknown Object (File)
Sat, Jan 25, 10:50 PM
Unknown Object (File)
Wed, Jan 22, 5:31 PM
Unknown Object (File)
Sat, Jan 11, 5:11 PM
Unknown Object (File)
Wed, Jan 1, 8:32 PM
Unknown Object (File)
Dec 24 2024, 9:00 AM
Unknown Object (File)
Dec 11 2024, 12:04 PM

Details

Summary

This patch adds ZSTD compression to ZFS

It currently supports the default 19 compression levels. Support for a subset of the negative levels is planned for a future commit.

It includes the required changes to boot from zstd compressed datasets for BIOS/LEGACY and UEFI.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 17245
Build 17088: arc lint + arc unit

Event Timeline

Lots of cleanup

Works with Zstandard 1.2.0 now

Reduced diff against Zstandard

Update to support ZSTD 1.3.x

This is a much cleaner version of the patch

Depends on D10407 for libzstd changes

Some of the libstand Makefile goo will go away when Warner finishes his cleanup of sys/boot

This version works, including booting BIOS and EFI with zstd compression on the boot environment.

Still considering switching away from having the 3 compression levels (min, default, max), to just 'zstd', with a separate property that controls the level (1-19)

comments on the sys/boot / libstand parts of this.

lib/libstand/Makefile
115–116 ↗(On Diff #33855)

${SRCTOP}/ instead of ${LIBSTAND_SRC}/../../ works better and fits with changes I've made to this file.

118 ↗(On Diff #33855)

ditto

sys/boot/efi/boot1/Makefile
94–96 ↗(On Diff #33855)

Just nuke these. The first line isn't needed anymore, and the last line I've already done.

sys/boot/efi/loader/Makefile
152 ↗(On Diff #33855)

kill this, it's OBE

sys/boot/i386/gptzfsboot/Makefile
91–92 ↗(On Diff #33855)

Although this is a wrap fix, it's gratuitous. I'd avoid it. It will also conflict.

sys/boot/i386/gptzfsboot/Makefile
91–92 ↗(On Diff #33855)

yes, this was unintentional, it is left over from when there was a libzstd32 for a short while before I just added zstd to libstand instead.

Just some optional performance suggestions

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zstd.c
195

Performance suggestion :

re-using an _existing_ compression context saves a lot of allocation and initialisation work.
The smaller the block size, the more measurable the savings.

217

Same logic here as for compression side :
re-using a context saves allocation and initialisation costs.

The impact is less pronounced on the decompression side, because initialisation work is much lighter.

@yann.collet.73_gmail.com : Is there any tuning we can do to get the size of the compression context down, since we know the maximum size if the input will be 16MB?

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zstd.c
195

I have not worked out had to handle that safely yet. There are 2 issues:

There are multiple kernel threads that will be calling this function concurrently, and they may be doing different compression levels. I don't want to have N unused -19 compression contexts outstanding, as they are up to 50MB each.

So preallocating a number of contexts probably won't work because of the diversity of input block sizes and compression levels, and the multi-threaded nature of the surrounding code. ZSTD won't be invoked in its multi-threaded mode, but we may be compressing many independent blocks at once.

Is there any tuning we can do to get the size of the compression context down, since we know the maximum size if the input will be 16MB?

A large proportion of the compression context size comes from hash and binary table sizes.
The associated parameters are cparams.hashlog and cparams.chainlog.
They are automatically set when invoking ZSTD_getCParams(compressionLevel, inputSize, 0).

inputSize is very important to automatically tune down compression parameters in presence of short input.
You have already described in your presentation that selecting a small inputSize has a big impact on compression context size.

For 16 MB though, no tune down will happen, since it's considered a "large" input.

In order to reduce context size, you have a few choices :

  • reduce compression level
  • reduce input size
  • manually change compression parameters

If you are interested in scenario 3, cparams.hashlog and cparams.chainlog will be your prime candidates.
Obviously, changing these values will have additional side effects, on top of reducing memory requirements.
Reducing chainlog will reduce compression ratio, and slightly increase speed.
Reducing hashlog will reduce speed, and slightly reduce compression ratio.
Some other parameters can be modified to compensate, such as increasing searchLog, which will reduce speed, and slightly increase compression ratio.
Well, many variations are possible.
When in doubt if manually selected parameters are valid, use [ZSTD_adjustCParams(cparams, inputSize, 0)](https://github.com/facebook/zstd/blob/dev/lib/zstd.h#L626) to get them automatically fixed.

For a selected set of custom compression parameters, ZSTD_estimateCCtxSize_advanced(cparams) will tell how much memory the compression context needs.

allanjude edited the summary of this revision. (Show Details)

Newer version of ZSTD (1.3.2)

Incorporate feedback from ZFS developer summit

  • Store ZSTD level in a new hidden property. This allows all 19 levels without filling up the zio_compress enum.

Note: stand/ and sys/contrib/zstd changes will be committed separately

So you only have one new compression value in the BP? Have you looked at how this interacts with L2ARC when compressed ARC is disabled? See arc_cksum_is_equal(), I don't think this will work right, because you won't be able to reproduce the on-disk compressed data without knowing the compression level.

Solve issues with L2ARC if compressed_arc_enabled=0

Catch up the latest -current

Update to r329600

Need to retest boot support

Rebase to r331496

This solves merge conflicts with ZFS device removal and related feature flags that have come in recently

Add support for zstd compression levels to ZFS channel programs

Also update to Sr333575

Can you bump the .Dd on zpool-features.7 (content change)?

allanjude retitled this revision from WIP: ZSTD in ZFS to ZSTD compression for ZFS.Jun 2 2018, 10:24 PM

Update to r334532

No changes to ZSTD/ZFS

Add a 'zfs send' feature flag, to indicate that zstd is required to receive this stream

Booting from zstd compressed pools has been confirmed working.

Known issues:

  • the zstd_compress feature flag is not being 'activated' properly.
  • the zstd_compress_level 'property' is not invisible as intended

Create the ZSTD replication feature flag

  • Set for both -c (compressed send) and -e (embedded blocks)
  • Fix the activation of the ZSTD per-dataset feature flag. The first time a ZSTD compressed block is written to a dataset, it is activated
  • Correctly hide the zstd_compress_level property

I was just looking at making ZSTD a compile option via ZSTDIO here https://reviews.freebsd.org/D16007

Is there any conflict between this review and what I'm doing?

Update patch to r336271 and add ZSTD-Fast modes

ZSTD 1.3.4 and later support modes faster than the old 'level 1'.
The 'negative' levels are unbounded, so I chose a subset:
1-10
20-100 in increments of 10
100, 500, and 1000

Compression results of base + kernel + kernel-dbg + src: http://dpaste.com/3GVJY39

Make ZSTD compile in if you enable ZFS

stand/i386/zfsboot/Makefile
34

We don't need this in every single Makefile, do we? Can't we centralize this and expand flags here for ZFS? We already have 4 lines for it, which is 3 too many.

stand/i386/zfsboot/Makefile
34

actually we could have something like:

CFLAGS.somefile.c += ...

I much improved version of this has just been committed upstream: https://github.com/openzfs/zfs/commit/10b3c7f5e424f54b3ba82dbf1600d866e64ec0a0

It will be pulled back into FreeBSD as part of the switch to OpenZFS in the next few weeks.