Page MenuHomeFreeBSD

Mk/bsd.port.mk: introduce PKG_CREATE_THREADS_NUMBER
Needs ReviewPublic

Authored by siva on Thu, Mar 5, 4:35 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Mar 9, 4:28 PM
Unknown Object (File)
Mon, Mar 9, 6:17 AM
Unknown Object (File)
Sun, Mar 8, 4:26 PM
Unknown Object (File)
Sun, Mar 8, 12:09 AM
Subscribers
None

Details

Reviewers
bapt
adamw
diizzy
vvd
Group Reviewers
portmgr
Summary

There are a number of concerns with how
53c1ad8c4add9656b4ee3e0d1a1c1b643d617f84 introduced concurrency to
pkg-create:

  1. It is tied to MAKE_JOBS, which is affected by MAKE_JOBS_UNSAFE.
  2. It could affect compression ratio, depending on the pkg workload and the parallelism available to the system
  3. On top of (2), it implies packaging reproducibility issues

This commit aims to address all of these issues by adding a new
knob PKG_CREATE_THREADS_NUMBER, which has the following semantics:

  • if defined, it must be >= 1
  • if not defined, it defaults to the number of physical CPU cores if using the default tzst packaging format, otherwise defaults to 1

zstd guarantees deterministic builds on all _multithreaded_ compression
runs, so we must always pass the number of requested threads explicitly
to avoid --single-thread differences.

Discussed With: adamw, diizzy, vvd
Fixes: 53c1ad8c4add9656b4ee3e0d1a1c1b643d617f84

Diff Detail

Repository
R11 FreeBSD ports repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 71202
Build 68085: arc lint + arc unit

Event Timeline

siva requested review of this revision.Thu, Mar 5, 4:35 PM
siva created this revision.

I'm a little confused by the logic.

If PKG_CREATE_THREADS_NUMBER <= 0, it gets kern.smp.cores || nproc. But aren't -T0 and -T$(sysctl kern.smp.cores) expected to mean the same thing?

  1. 0 is a perfectly valid value for -T, as your comment notes. But if that's what the user requests, it gets overwritten by...
  2. kern.smp.cores, which should be identical to just doing -T0 but with an extra shell call. Otherwise it becomes...
  3. nproc, which on a HT processor is twice the above values

I'm not clear what the difference would be between your version and:

.if !defined(PKG_CREATE_THREADS_NUMBER) || ${PKG_CREATE_THREADS_NUMBER} < 0
.    if ${whatever} == tzst
PKG_CREATE_THREADS_NUMBER!= ${NPROC} 2>/dev/null
.    else
PKG_CREATE_THREADS_NUMBER= 1
.    endif
.endif

But couldn't we just simplify it all the way?

.if ${PKG_COMPRESSION_FORMAT} == tzst
PKG_CREATE_THREADS_NUMBER?= 0
.elif ${PKG_COMPRESSION_FORMAT} == xz
PKG_CREATE_THREADS_NUMBER?= 1
.endif

Thinking as an end-user that's sortof what I'd expect. What do we ultimately gain from bringing in sysctl and/or nproc, and overwriting a valid value of 0?

If PKG_CREATE_THREADS_NUMBER <= 0, it gets kern.smp.cores || nproc. But aren't -T0 and -T$(sysctl kern.smp.cores) expected to mean the same thing?

  1. 0 is a perfectly valid value for -T, as your comment notes. But if that's what the user requests, it gets overwritten by...
  2. kern.smp.cores, which should be identical to just doing -T0 but with an extra shell call.

-T0 only means that for the CLI version of zstd. pkg-create uses libarchive to request compression in a compressor-agnostic way. Here is the relevant logic for zstd specifically: https://cgit.freebsd.org/src/tree/contrib/libarchive/libarchive/archive_write_add_filter_zstd.c

The problem is that libzstd inteprets threads=0 as --single-threaded as per the manual page. At the ZSTD_c_nbWorkers entry in https://facebook.github.io/zstd/zstd_manual.html:

when nbWorkers >= 1, triggers asynchronous mode when invoking ZSTD_compressStream*()
[...]
default value is 0, aka "single-threaded mode"

We should not rely on 0 meaning the same thing across all compressors. If we decide to add a new improved compression format, the number of threads that we request in pkg-create should still semantically mean the same thing.