Somewhat similar to stpncpy, except that we need to compute the full
source length even if the buffer is shorter than the source.
New unit tests derived from the stpncpy unit tests cover strlcpy.
strlcat is implemented as a simple wrapper around strlcpy. The scalar
implementation of strlcpy just calls into strlen() and memcpy() to do
the job.
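
As an illustration only, the scalar approach described above can be
sketched roughly as follows. This is not the committed code: the names
sketch_strlcpy and sketch_strlcat are hypothetical, chosen to avoid
clashing with the libc symbols, and the use of memchr() to bound the
destination scan in the strlcat wrapper is an assumption.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch of a scalar strlcpy built on strlen() and memcpy().
 * Returns strlen(src), so callers can detect truncation by checking
 * whether the return value is >= dstsize.
 */
size_t
sketch_strlcpy(char *restrict dst, const char *restrict src, size_t dstsize)
{
	/* The full source length is always computed, even when truncating. */
	size_t len = strlen(src);

	if (dstsize != 0) {
		/* Copy as much as fits, leaving room for the terminator. */
		size_t n = len < dstsize ? len : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}

	return (len);
}

/*
 * Hypothetical sketch of strlcat as a thin wrapper: find the end of dst
 * (looking at no more than dstsize bytes), then let the strlcpy sketch
 * handle the remaining space.
 */
size_t
sketch_strlcat(char *restrict dst, const char *restrict src, size_t dstsize)
{
	const char *end = memchr(dst, '\0', dstsize);
	size_t dlen;

	/* dst is not terminated within dstsize bytes: nothing can be appended. */
	if (end == NULL)
		return (dstsize + strlen(src));

	dlen = (size_t)(end - dst);
	return (dlen + sketch_strlcpy(dst + dlen, src, dstsize - dlen));
}
```

In this sketch, a return value greater than or equal to dstsize means
the result was truncated.
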
Perf-wise we're very close to stpncpy. The code is slightly slower as
it has to keep looking for the end of the source string even if the
buffer ends before the string does. glibc does not have this function
yet (it'll be added in the next version; no SIMD variant is in sight,
though it seems like they are working on one), so there is no
comparison with glibc this time.
Benchmark results as usual:
os: FreeBSD
arch: amd64
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
        │ strlcpy.pre.out │          strlcpy.scalar.out          │         strlcpy.baseline.out         │
        │     sec/op      │    sec/op     vs base                │    sec/op     vs base                │
Short        131.10µ ± 0%   129.10µ ± 0%   -1.52% (p=0.000 n=20)    79.67µ ± 0%  -39.23% (p=0.000 n=20)
Mid          77.161µ ± 0%   29.928µ ± 1%  -61.21% (p=0.000 n=20)    8.652µ ± 1%  -88.79% (p=0.000 n=20)
Long         32.156µ ± 0%   10.064µ ± 0%  -68.70% (p=0.000 n=20)    3.577µ ± 0%  -88.87% (p=0.000 n=20)
geomean      68.77µ         33.88µ        -50.74%                   13.51µ       -80.36%

        │ strlcpy.pre.out │          strlcpy.scalar.out           │          strlcpy.baseline.out          │
        │       B/s       │     B/s       vs base                 │      B/s       vs base                 │
Short        909.3Mi ± 0%   923.4Mi ± 0%    +1.54% (p=0.000 n=20)   1496.3Mi ± 0%   +64.55% (p=0.000 n=20)
Mid          1.509Gi ± 0%   3.890Gi ± 1%  +157.82% (p=0.000 n=20)   13.455Gi ± 1%  +791.79% (p=0.000 n=20)
Long         3.620Gi ± 0%  11.568Gi ± 0%  +219.52% (p=0.000 n=20)   32.541Gi ± 0%  +798.85% (p=0.000 n=20)
geomean      1.693Gi        3.436Gi       +103.00%                   8.617Gi       +409.04%

Sponsored by: The FreeBSD Foundation