Index: head/contrib/xz/ChangeLog =================================================================== --- head/contrib/xz/ChangeLog (revision 359200) +++ head/contrib/xz/ChangeLog (revision 359201) @@ -1,5615 +1,6947 @@ +commit 2327a461e1afce862c22269b80d3517801103c1b +Author: Lasse Collin +Date: 2020-03-17 16:27:42 +0200 + + Bump version and soname for 5.2.5. + + src/liblzma/Makefile.am | 2 +- + src/liblzma/api/lzma/version.h | 2 +- + 2 files changed, 2 insertions(+), 2 deletions(-) + +commit 3be82d2f7dc882258caf0f0a69214e5916b2bdda +Author: Lasse Collin +Date: 2020-03-17 16:26:04 +0200 + + Update NEWS for 5.2.5. + + NEWS | 105 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 1 file changed, 105 insertions(+) + +commit ab3e57539c7337f0653b13b75dbc5d03ade9700e +Author: Lasse Collin +Date: 2020-03-16 21:57:21 +0200 + + Translations: Rebuild cs.po to avoid incorrect fuzzy strings. + + "make dist" updates the .po files and the fuzzy strings would + result in multiple very wrong translations. + + po/cs.po | 592 ++++++++++++++++++++++++++++++++++----------------------------- + 1 file changed, 322 insertions(+), 270 deletions(-) + +commit 3a6f38309dc5d44d8a63ebb337b6b2028561c93e +Author: Lasse Collin +Date: 2020-03-16 20:01:37 +0200 + + README: Update outdated sections. + + README | 21 +++++++++++---------- + 1 file changed, 11 insertions(+), 10 deletions(-) + +commit 9cc0901798217e258e91c13cf6fda7ad42ba108c +Author: Lasse Collin +Date: 2020-03-16 19:46:27 +0200 + + README: Mention that translatable strings will change after 5.2.x. + + README | 74 +++--------------------------------------------------------------- + 1 file changed, 3 insertions(+), 71 deletions(-) + +commit cc163574249f6a4a66f3dc09d6fe5a71bee24fab +Author: Lasse Collin +Date: 2020-03-16 19:39:45 +0200 + + README: Mention that man pages can be translated. 
+ + README | 7 ++++--- + 1 file changed, 4 insertions(+), 3 deletions(-) + +commit ca261994edc3f2d03d5589c037171c63471ee9dc +Author: Lasse Collin +Date: 2020-03-16 17:30:39 +0200 + + Translations: Add partial Danish translation. + + I made a few minor white space changes without getting them + approved by the Danish translation team. + + po/LINGUAS | 1 + + po/da.po | 896 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 2 files changed, 897 insertions(+) + +commit 51cd5d051fc730d61411dee292e863582784e189 +Author: Lasse Collin +Date: 2020-03-16 16:43:29 +0200 + + Update INSTALL.generic from Automake 1.16.1. + + INSTALL.generic | 321 ++++++++++++++++++++++++++++---------------------------- + 1 file changed, 162 insertions(+), 159 deletions(-) + +commit 69d694e5f1beae2bbfa3b6c348ec0ec5f14b5cd0 +Author: Lasse Collin +Date: 2020-03-15 15:27:22 +0200 + + Update INSTALL for Windows and DOS and add preliminary info for z/OS. + + INSTALL | 51 +++++++++++++++++++++++++++++++++++++++++---------- + 1 file changed, 41 insertions(+), 10 deletions(-) + +commit 2c3b1bb80a3ca7e09728fe4d7a1d8648a5cb9bca +Author: Lasse Collin +Date: 2020-03-15 15:26:20 +0200 + + Build: Update m4/ax_pthread.m4 from Autoconf Archive (again). + + m4/ax_pthread.m4 | 219 +++++++++++++++++++++++++++++-------------------------- + 1 file changed, 117 insertions(+), 102 deletions(-) + +commit 74a5af180a6a6c4b8c90cefb37ee900d3fea7dc6 +Author: Lasse Collin +Date: 2020-03-11 21:15:35 +0200 + + xz: Never use thousand separators in DJGPP builds. + + DJGPP 2.05 added support for thousands separators but it's + broken at least under WinXP with Finnish locale that uses + a non-breaking space as the thousands separator. Workaround + by disabling thousands separators for DJGPP builds. 
+ + src/xz/util.c | 14 ++++++++++++-- + 1 file changed, 12 insertions(+), 2 deletions(-) + +commit ceba0d25e826bcdbf64bb4cb03385a2a66f8cbcb +Author: Lasse Collin +Date: 2020-03-11 19:38:08 +0200 + + DOS: Update dos/Makefile for DJGPP 2.05. + + It doesn't need -fgnu89-inline like 2.04beta did. + + dos/Makefile | 4 +--- + 1 file changed, 1 insertion(+), 3 deletions(-) + +commit 29e5bd71612253281fb22bbaa0a566990a74dcc3 +Author: Lasse Collin +Date: 2020-03-11 19:36:07 +0200 + + DOS: Update instructions in dos/INSTALL.txt. + + dos/INSTALL.txt | 59 ++++++++++++++++++++++++++++----------------------------- + 1 file changed, 29 insertions(+), 30 deletions(-) + +commit 00a037ee9c8ee5a03cf9744e05570ae93d56b875 +Author: Lasse Collin +Date: 2020-03-11 17:58:51 +0200 + + DOS: Update config.h. + + The added defines assume GCC >= 4.8. + + dos/config.h | 8 ++++++++ + 1 file changed, 8 insertions(+) + +commit 4ec2feaefa310b4249eb41893caf526e5c51ee39 +Author: Lasse Collin +Date: 2020-03-11 22:37:54 +0200 + + Translations: Add hu, zh_CN, and zh_TW. + + I made a few white space changes to these without getting them + approved by the translation teams. (I tried to contact the hu and + zh_TW teams but didn't succeed. I didn't contact the zh_CN team.) + + po/LINGUAS | 3 + + po/hu.po | 985 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + po/zh_CN.po | 963 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + po/zh_TW.po | 956 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 4 files changed, 2907 insertions(+) + +commit b6ed09729ae408be4533a0ddbc7df3d6f566846a +Author: Lasse Collin +Date: 2020-03-11 14:33:30 +0200 + + Translations: Update vi.po to match the file from the TP. + + The translated strings haven't been updated but word wrapping + is different. 
+ + po/vi.po | 407 ++++++++++++++++++++++++++++----------------------------------- + 1 file changed, 179 insertions(+), 228 deletions(-) + +commit 7c85e8953ced204c858101872a15183e4639e9fb +Author: Lasse Collin +Date: 2020-03-11 14:18:03 +0200 + + Translations: Add fi and pt_BR, and update de, fr, it, and pl. + + The German translation isn't identical to the file in + the Translation Project but the changes (white space changes + only) were approved by the translator Mario Blättermann. + + po/LINGUAS | 2 + + po/de.po | 476 ++++++++++++++-------------- + po/fi.po | 974 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + po/fr.po | 272 ++++++++-------- + po/it.po | 479 ++++++++++++---------------- + po/pl.po | 239 +++++++------- + po/pt_BR.po | 1001 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 7 files changed, 2697 insertions(+), 746 deletions(-) + +commit 7da3ebc67fb5414034685ec16c7a29dad03dfa9b +Author: Lasse Collin +Date: 2020-02-25 21:35:14 +0200 + + Update THANKS. + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 1acc48794364606c9091cae6fa56db75a1325114 +Author: Lasse Collin +Date: 2020-03-11 13:05:29 +0200 + + Build: Add very limited experimental CMake support. + + This version matches CMake files in the master branch (commit + 265daa873c0d871f5f23f9b56e133a6f20045a0a) except that this omits + two source files that aren't in v5.2 and in the beginning of + CMakeLists.txt the first paragraph in the comment is slightly + different to point out possible issues in building shared liblzma. 
+ + CMakeLists.txt | 659 ++++++++++++++++++++++++++++++++++++++++++++ + cmake/tuklib_common.cmake | 49 ++++ + cmake/tuklib_cpucores.cmake | 175 ++++++++++++ + cmake/tuklib_integer.cmake | 102 +++++++ + cmake/tuklib_mbstr.cmake | 20 ++ + cmake/tuklib_physmem.cmake | 150 ++++++++++ + cmake/tuklib_progname.cmake | 19 ++ + 7 files changed, 1174 insertions(+) + +commit 9acc6abea1552803c74c1486fbb10af119550772 +Author: Lasse Collin +Date: 2020-02-27 20:24:27 +0200 + + Build: Add support for --no-po4a option to autogen.sh. + + Normally, if po4a isn't available, autogen.sh will return + with non-zero exit status. The option --no-po4a can be useful + when one knows that po4a isn't available but wants autogen.sh + to still return with zero exit status. + + autogen.sh | 11 ++++++++++- + 1 file changed, 10 insertions(+), 1 deletion(-) + +commit c8853b31545db7bd0551be85949624b1261efd47 +Author: Lasse Collin +Date: 2020-02-24 23:37:07 +0200 + + Update m4/.gitignore. + + m4/.gitignore | 1 + + 1 file changed, 1 insertion(+) + +commit 901eb4a8c992354c3ea482f5bad60a1f8ad6fcc8 +Author: Lasse Collin +Date: 2020-02-24 23:01:00 +0200 + + liblzma: Remove unneeded from fastpos_tablegen.c. + + This file only generates fastpos_table.c. + It isn't built as a part of liblzma. + + src/liblzma/lzma/fastpos_tablegen.c | 1 - + 1 file changed, 1 deletion(-) + +commit ac35c9585fb734b7a19785d490c152e0b8cd4663 +Author: Lasse Collin +Date: 2020-02-22 14:15:07 +0200 + + Use defined(__GNUC__) before __GNUC__ in preprocessor lines. + + This should silence the equivalent of -Wundef in compilers that + don't define __GNUC__. + + src/common/sysdefs.h | 3 ++- + src/liblzma/api/lzma.h | 5 +++-- + 2 files changed, 5 insertions(+), 3 deletions(-) + +commit fb9cada7cfade1156d6277717280e05b5cd342d6 +Author: Lasse Collin +Date: 2020-02-21 17:40:02 +0200 + + liblzma: Add more uses of lzma_memcmplen() to the normal mode of LZMA. + + This gives a tiny encoder speed improvement. 
This could have been done + in 2014 after the commit 544aaa3d13554e8640f9caf7db717a96360ec0f6 but + it was forgotten. + + src/liblzma/lzma/lzma_encoder_optimum_normal.c | 16 ++++++++++------ + 1 file changed, 10 insertions(+), 6 deletions(-) + +commit 6117955af0b9cef5acde7859e86f773692b5f43c +Author: Lasse Collin +Date: 2020-02-21 17:01:15 +0200 + + Build: Add visibility.m4 from gnulib. + + Appears that this file used to get included as a side effect of + gettext. After the change to gettext version requirements this file + no longer got copied to the package and so the build was broken. + + m4/.gitignore | 1 - + m4/visibility.m4 | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 2 files changed, 77 insertions(+), 1 deletion(-) + +commit c2cc64d78c098834231f9cfd7d852c9cd8950d74 +Author: Lasse Collin +Date: 2020-02-21 16:10:44 +0200 + + xz: Silence a warning when sig_atomic_t is long int. + + It can be true at least on z/OS. + + src/xz/signals.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit b6314aa275b35c714e0a191d0b2e9b6106129ea9 +Author: Lasse Collin +Date: 2020-02-21 15:59:26 +0200 + + xz: Avoid unneeded access of a volatile variable. + + src/xz/signals.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit f772a1572f723e5dc7d2d32e1d4287ac7a0da55e +Author: Lasse Collin +Date: 2020-02-21 01:24:18 +0200 + + tuklib_integer.m4: Optimize the check order. + + The __builtin byteswapping is the preferred one so check for it first. + + m4/tuklib_integer.m4 | 56 +++++++++++++++++++++++++++------------------------- + 1 file changed, 29 insertions(+), 27 deletions(-) + +commit 641042e63f665f3231c2fd1241fd3dddda3fb313 +Author: Lasse Collin +Date: 2020-02-20 18:54:04 +0200 + + tuklib_exit: Add missing header. + + strerror() needs <string.h> which happened to be included via + tuklib_common.h -> tuklib_config.h -> sysdefs.h if HAVE_CONFIG_H + was defined. This wasn't tested without config.h before so it + had worked fine.
+ + src/common/tuklib_exit.c | 1 + + 1 file changed, 1 insertion(+) + +commit dbd55a69e530fec9ae866aaf6c3ccc0b4daf1f1f +Author: Lasse Collin +Date: 2020-02-16 11:18:28 +0200 + + sysdefs.h: Omit the conditionals around string.h and limits.h. + + string.h is used unconditionally elsewhere in the project and + configure has always stopped if limits.h is missing, so these + headers must have been always available even on the weirdest + systems. + + src/common/sysdefs.h | 8 ++------ + 1 file changed, 2 insertions(+), 6 deletions(-) + +commit 9294909861e6d22b32418467e0e988f953a82264 +Author: Lasse Collin +Date: 2020-02-15 15:07:11 +0200 + + Build: Bump Autoconf and Libtool version requirements. + + There is no specific reason for this other than blocking + the most ancient versions. These are still old: + + Autoconf 2.69 (2012) + Automake 1.12 (2012) + gettext 0.19.6 (2015) + Libtool 2.4 (2010) + + configure.ac | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +commit bd09081bbdf552f730030d2fd0e5e39ccb3936af +Author: Lasse Collin +Date: 2020-02-15 03:08:32 +0200 + + Build: Use AM_GNU_GETTEXT_REQUIRE_VERSION and require 0.19.6. + + This bumps the version requirement from 0.19 (from 2014) to + 0.19.6 (2015). + + Using only the old AM_GNU_GETTEXT_VERSION results in old + gettext infrastructure being placed in the package. By using + both macros we get the latest gettext files while the other + programs in the Autotools family can still see the old macro. + + configure.ac | 6 +++++- + 1 file changed, 5 insertions(+), 1 deletion(-) + +commit 1e5e08d86534aec7ca957982c7f6e90203c19e9f +Author: Lasse Collin +Date: 2020-02-14 20:42:06 +0200 + + Translations: Add German translation of the man pages. + + Thanks to Mario Blättermann. 
+ + po4a/de.po | 5532 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + po4a/po4a.conf | 2 +- + 2 files changed, 5533 insertions(+), 1 deletion(-) + +commit 4b1447809ffbc0d77c0ad456bd6b3afcf0b8623e +Author: Lasse Collin +Date: 2020-02-07 15:32:21 +0200 + + Build: Add support for translated man pages using po4a. + + The dependency on po4a is optional. It's never required to install + the translated man pages when xz is built from a release tarball. + If po4a is missing when building from xz.git, the translated man + pages won't be generated but otherwise the build will work normally. + + The translations are only updated automatically by autogen.sh and + by "make mydist". This makes it easy to keep po4a as an optional + dependency and ensures that I won't forget to put updated + translations to a release tarball. + + The translated man pages aren't installed if --disable-nls is used. + + The installation of translated man pages abuses Automake internals + by calling "install-man" with redefined dist_man_MANS and man_MANS. + This makes the hairy script code slightly less hairy. If it breaks + some day, this code needs to be fixed; don't blame Automake developers. + + Also, this adds more quotes to the existing shell script code in + the Makefile.am "-hook"s. + + Makefile.am | 4 ++++ + autogen.sh | 8 ++++--- + po4a/.gitignore | 2 ++ + po4a/po4a.conf | 14 +++++++++++ + po4a/update-po | 45 ++++++++++++++++++++++++++++++++++ + src/scripts/Makefile.am | 64 +++++++++++++++++++++++++++++++++++++------------ + src/xz/Makefile.am | 50 +++++++++++++++++++++++++++----------- + src/xzdec/Makefile.am | 55 ++++++++++++++++++++++++++++++++---------- + 8 files changed, 197 insertions(+), 45 deletions(-) + +commit 882fcfdcd86525cc5c6f6d0bf0230d0089206d13 +Author: Lasse Collin +Date: 2020-02-06 00:04:42 +0200 + + Update THANKS (sync with the master branch). 
+ + THANKS | 3 +++ + 1 file changed, 3 insertions(+) + +commit 134bb7765815d5f265eb0bc9e6ebacd9ae4a52bc +Author: Lasse Collin +Date: 2020-02-05 22:35:06 +0200 + + Update tests/.gitignore. + + .gitignore | 4 ++++ + 1 file changed, 4 insertions(+) + +commit 6912472fafb656be8f4c5b4ac9ea28fea3065de4 +Author: Lasse Collin +Date: 2020-02-05 22:28:51 +0200 + + Update m4/.gitignore. + + m4/.gitignore | 1 + + 1 file changed, 1 insertion(+) + +commit 68c60735bbb6e51d4205ba8a9fde307bcfb22f8c +Author: Lasse Collin +Date: 2020-02-05 20:47:38 +0200 + + Update THANKS. + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit e1beaa74bc7cb5a409d59b55870e01ae7784ce3a +Author: Lasse Collin +Date: 2020-02-05 20:33:50 +0200 + + xz: Comment out annoying sandboxing messages. + + src/xz/file_io.c | 10 +++++++--- + 1 file changed, 7 insertions(+), 3 deletions(-) + +commit 8238192652290df78bd728b20e3f6542d1a2819e +Author: Lasse Collin +Date: 2020-02-05 19:33:37 +0200 + + Build: Workaround a POSIX shell detection problem on Solaris. + + I don't know if the problem is in gnulib's gl_POSIX_SHELL macro + or if xzgrep does something that isn't in POSIX. The workaround + adds a special case for Solaris: if /usr/xpg4/bin/sh exists and + gl_cv_posix_shell wasn't overridden on the configure command line, + use that shell for xzgrep and other scripts. That shell is known + to work and exists on most Solaris systems. + + configure.ac | 10 ++++++++++ + 1 file changed, 10 insertions(+) + +commit 93a1f61e892e145607dd938e3b30098af19a1672 +Author: Lasse Collin +Date: 2020-02-03 22:03:50 +0200 + + Build: Update m4/ax_pthread.m4 from Autoconf Archive. + + m4/ax_pthread.m4 | 398 ++++++++++++++++++++++++++++++++++++++----------------- + 1 file changed, 279 insertions(+), 119 deletions(-) + +commit d0daa21792ff861e5423bbd82aaa6c8ba9fa0462 +Author: Lasse Collin +Date: 2020-02-01 19:56:18 +0200 + + xz: Limit --memlimit-compress to at most 4020 MiB for 32-bit xz. + + See the code comment for reasoning.
It's far from perfect but + hopefully good enough for certain cases while hopefully doing + nothing bad in other situations. + + At presets -5 ... -9, 4020 MiB vs. 4096 MiB makes no difference + on how xz scales down the number of threads. + + The limit has to be a few MiB below 4096 MiB because otherwise + things like "xz --lzma2=dict=500MiB" won't scale down the dict + size enough and xz cannot allocate enough memory. With + "ulimit -v $((4096 * 1024))" on x86-64, the limit in xz had + to be no more than 4085 MiB. Some safety margin is good though. + + This is a hack but it should be useful when running 32-bit xz on + a 64-bit kernel that gives full 4 GiB address space to xz. + Hopefully this is enough to solve this: + + https://bugzilla.redhat.com/show_bug.cgi?id=1196786 + + FreeBSD has a patch that limits the result in tuklib_physmem() + to SIZE_MAX on 32-bit systems. While I think it's not the way + to do it, the results on --memlimit-compress have been good. This + commit should achieve practically identical results for compression + while leaving decompression and tuklib_physmem() and thus + lzma_physmem() unaffected. + + src/xz/hardware.c | 32 +++++++++++++++++++++++++++++++- + src/xz/xz.1 | 21 ++++++++++++++++++++- + 2 files changed, 51 insertions(+), 2 deletions(-) + +commit 4433c2dc5727ee6aef570e001a5a024e0d94e609 +Author: Lasse Collin +Date: 2020-01-26 20:53:25 +0200 + + xz: Set the --flush-timeout deadline when the first input byte arrives. + + xz --flush-timeout=2000, old version: + + 1. xz is started. The next flush will happen after two seconds. + 2. No input for one second. + 3. A burst of a few kilobytes of input. + 4. No input for one second. + 5. Two seconds have passed and flushing starts. + + The first second counted towards the flush-timeout even though + there was no pending data. This can cause flushing to occur more + often than needed. + + xz --flush-timeout=2000, after this commit: + + 1. xz is started. + 2. No input for one second. + 3.
A burst of a few kilobytes of input. The next flush will + happen after two seconds counted from the time when the + first bytes of the burst were read. + 4. No input for one second. + 5. No input for another second. + 6. Two seconds have passed and flushing starts. + + src/xz/coder.c | 6 +----- + src/xz/file_io.c | 6 +++++- + src/xz/mytime.c | 1 - + 3 files changed, 6 insertions(+), 7 deletions(-) + +commit acc0ef3ac80f18e349c6d0252177707105c0a29c +Author: Lasse Collin +Date: 2020-01-26 20:19:19 +0200 + + xz: Move flush_needed from mytime.h to file_pair struct in file_io.h. + + src/xz/coder.c | 3 ++- + src/xz/file_io.c | 3 ++- + src/xz/file_io.h | 3 +++ + src/xz/mytime.c | 3 --- + src/xz/mytime.h | 4 ---- + 5 files changed, 7 insertions(+), 9 deletions(-) + +commit 4afe69d30b66812682a2016ee18441958019cbb2 +Author: Lasse Collin +Date: 2020-01-26 14:49:22 +0200 + + xz: coder.c: Make writing output a separate function. + + The same code sequence repeats so it's nicer as a separate function. + Note that in one case there was no test for opt_mode != MODE_TEST, + but that was only because that condition would always be true, so + this commit doesn't change the behavior there. + + src/xz/coder.c | 30 +++++++++++++++++------------- + 1 file changed, 17 insertions(+), 13 deletions(-) + +commit ec26f3ace5f9b260ca91508030f07465ae2f9f78 +Author: Lasse Collin +Date: 2020-01-26 14:13:42 +0200 + + xz: Fix semi-busy-waiting in xz --flush-timeout. + + When input blocked, xz --flush-timeout=1 would wake up every + millisecond and initiate flushing which would have nothing to + flush and thus would just waste CPU time. The fix disables the + timeout when no input has been seen since the previous flush. 
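The two --flush-timeout changes described above (arming the deadline only when the first input bytes arrive, and disabling the timeout while nothing is pending) can be illustrated with a small sketch. This is not the code from file_io.c or mytime.c; the struct and every function name below are hypothetical, and time is passed in as plain milliseconds to keep the example self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state for illustration only; it does not mirror xz's code. */
struct flush_timer {
	uint64_t timeout_ms;   /* value given to --flush-timeout */
	uint64_t deadline_ms;  /* meaningful only while pending is true */
	bool pending;          /* input has arrived since the last flush */
};

static inline void
flush_timer_init(struct flush_timer *t, uint64_t timeout_ms)
{
	t->timeout_ms = timeout_ms;
	t->deadline_ms = 0;
	t->pending = false;
}

/* Call after reading input. Only the first bytes since the previous
 * flush arm the deadline; later bytes leave it unchanged. */
static inline void
flush_timer_on_input(struct flush_timer *t, uint64_t now_ms)
{
	if (!t->pending) {
		t->pending = true;
		t->deadline_ms = now_ms + t->timeout_ms;
	}
}

/* Timeout to hand to a poll()-style wait. -1 means "no timeout":
 * with nothing pending there is nothing to flush, so waking up
 * periodically would only waste CPU time. */
static inline int64_t
flush_timer_poll_timeout(const struct flush_timer *t, uint64_t now_ms)
{
	if (!t->pending)
		return -1;
	if (now_ms >= t->deadline_ms)
		return 0;
	return (int64_t)(t->deadline_ms - now_ms);
}

/* True when pending data has waited for the full timeout. */
static inline bool
flush_timer_expired(const struct flush_timer *t, uint64_t now_ms)
{
	return t->pending && now_ms >= t->deadline_ms;
}

/* Call after a flush has been done. */
static inline void
flush_timer_flushed(struct flush_timer *t)
{
	t->pending = false;
}
```

With a 2000 ms timeout and a burst of input at the one-second mark, this sketch reports expiry at the three-second mark, which matches the "after this commit" timeline in the log message.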
+ + src/xz/coder.c | 4 ++++ + src/xz/file_io.c | 15 +++++++++++---- + src/xz/file_io.h | 4 ++++ + 3 files changed, 19 insertions(+), 4 deletions(-) + +commit 38915703241e69a64f133ff9a02ec9100c6019c6 +Author: Lasse Collin +Date: 2020-01-26 13:47:31 +0200 + + xz: Refactor io_read() a bit. + + src/xz/file_io.c | 17 ++++++++--------- + 1 file changed, 8 insertions(+), 9 deletions(-) + +commit f6d24245349cecfae6ec0d2366fa80716c9f6d37 +Author: Lasse Collin +Date: 2020-01-26 13:37:08 +0200 + + xz: Update a comment in file_io.h. + + src/xz/file_io.h | 5 ++++- + 1 file changed, 4 insertions(+), 1 deletion(-) + +commit 15b55d5c63d27f81776edb1abc05872a751fc674 +Author: Lasse Collin +Date: 2020-01-26 13:27:51 +0200 + + xz: Move the setting of flush_needed in file_io.c to a nicer location. + + src/xz/file_io.c | 6 ++---- + 1 file changed, 2 insertions(+), 4 deletions(-) + +commit 609c7067859146ffc62ac655f6ba53599c891801 +Author: Lasse Collin +Date: 2020-02-05 19:56:09 +0200 + + xz: Enable Capsicum sandboxing by default if available. + + It has been enabled in FreeBSD for a while and reported to work fine. + + Thanks to Xin Li. + + INSTALL | 6 ------ + configure.ac | 8 ++++---- + 2 files changed, 4 insertions(+), 10 deletions(-) + +commit 00517d125cc26ecece0eebb84c1c3975cd19bee0 +Author: Lasse Collin +Date: 2019-12-31 22:41:45 +0200 + + Rename unaligned_read32ne to read32ne, and similarly for the others. 
+ + src/common/tuklib_integer.h | 64 +++++++++++++++---------------- + src/liblzma/common/alone_encoder.c | 2 +- + src/liblzma/common/block_header_decoder.c | 2 +- + src/liblzma/common/block_header_encoder.c | 2 +- + src/liblzma/common/memcmplen.h | 9 ++--- + src/liblzma/common/stream_flags_decoder.c | 6 +-- + src/liblzma/common/stream_flags_encoder.c | 8 ++-- + src/liblzma/lz/lz_encoder_hash.h | 2 +- + src/liblzma/lzma/lzma_decoder.c | 2 +- + src/liblzma/lzma/lzma_encoder.c | 2 +- + src/liblzma/lzma/lzma_encoder_private.h | 3 +- + src/liblzma/simple/simple_decoder.c | 2 +- + src/liblzma/simple/simple_encoder.c | 2 +- + tests/test_block_header.c | 4 +- + tests/test_stream_flags.c | 6 +-- + 15 files changed, 54 insertions(+), 62 deletions(-) + +commit 52d89d8443c4a31a69c0701062f2c7711d82bbed +Author: Lasse Collin +Date: 2019-12-31 00:29:48 +0200 + + Rename read32ne to aligned_read32ne, and similarly for the others. + + Using the aligned methods requires more care to ensure that + the address really is aligned, so it's nicer if the aligned + methods are prefixed. The next commit will remove the unaligned_ + prefix from the unaligned methods which in liblzma are used in + more places than the aligned ones. + + src/common/tuklib_integer.h | 56 +++++++++++++++++++++--------------------- + src/liblzma/check/crc32_fast.c | 4 +-- + src/liblzma/check/crc64_fast.c | 4 +-- + 3 files changed, 32 insertions(+), 32 deletions(-) + +commit 850620468b57d49f16093e5870d1050886fcb37a +Author: Lasse Collin +Date: 2019-12-31 00:18:24 +0200 + + Revise tuklib_integer.h and .m4. + + Add a configure option --enable-unsafe-type-punning to get the + old non-conforming memory access methods. It can be useful with + old compilers or in some other less typical situations but + shouldn't normally be used. + + Omit the packed struct trick for unaligned access. While it's + best in some cases, this is simpler. If the memcpy trick doesn't + work, one can request unsafe type punning from configure. 
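The "memcpy trick" for unaligned access mentioned above can be sketched roughly as follows. This is a minimal illustration and not the code from tuklib_integer.h; the names sketch_read32ne and sketch_write32ne are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helpers, not the real tuklib_integer.h functions.
 * They read or write a 32-bit native-endian integer at an address
 * that may be unaligned. Going through memcpy() avoids casting the
 * byte pointer to uint32_t *, which is the type punning that the
 * log message calls non-conforming. Optimizing compilers typically
 * reduce a fixed-size memcpy() like this to a plain load or store
 * on targets that allow unaligned access.
 */
static inline uint32_t
sketch_read32ne(const uint8_t *buf)
{
	uint32_t num;
	memcpy(&num, buf, sizeof(num));
	return num;
}

static inline void
sketch_write32ne(uint8_t *buf, uint32_t num)
{
	memcpy(buf, &num, sizeof(num));
}
```

A value written with sketch_write32ne(buf + 1, x) reads back unchanged with sketch_read32ne(buf + 1) even though buf + 1 is usually not 4-byte aligned.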
+ + Because CRC32/CRC64 code needs fast aligned reads, if no very + safe way to do it is found, type punning is used as a fallback. + This sucks but since it currently works in practice, it seems to + be the least bad option. It's never needed with GCC >= 4.7 or + Clang >= 3.6 since these support __builtin_assume_aligned and + thus fast aligned access can be done with the memcpy trick. + + Other things: + - Support GCC/Clang __builtin_bswapXX + - Cleaner bswap fallback macros + - Minor cleanups + + m4/tuklib_integer.m4 | 43 ++++ + src/common/tuklib_integer.h | 488 ++++++++++++++++++++++++-------------------- + 2 files changed, 314 insertions(+), 217 deletions(-) + +commit a45badf0342666462cc6a7107a071418570ab773 +Author: Lasse Collin +Date: 2019-12-29 22:51:58 +0200 + + Tests: Hopefully fix test_check.c to work on EBCDIC systems. + + Thanks to Daniel Richard G. + + tests/test_check.c | 9 +++++++-- + 1 file changed, 7 insertions(+), 2 deletions(-) + +commit c9a8071e6690a8db8a485c075920df254e7c70ea +Author: Lasse Collin +Date: 2019-09-24 23:02:40 +0300 + + Scripts: Put /usr/xpg4/bin to the beginning of PATH on Solaris. + + This adds a configure option --enable-path-for-scripts=PREFIX + which defaults to empty except on Solaris it is /usr/xpg4/bin + to make POSIX grep and others available. The Solaris case had + been documented in INSTALL with a manual fix but it's better + to do this automatically since it is needed on most Solaris + systems anyway. + + Thanks to Daniel Richard G. + + INSTALL | 43 +++++++++++++++++++++++++++++++++++-------- + configure.ac | 26 ++++++++++++++++++++++++++ + src/scripts/xzdiff.in | 1 + + src/scripts/xzgrep.in | 1 + + src/scripts/xzless.in | 1 + + src/scripts/xzmore.in | 1 + + 6 files changed, 65 insertions(+), 8 deletions(-) + +commit aba140e2df3ff63ad124ae997de16d517b98ca50 +Author: Lasse Collin +Date: 2019-07-12 18:57:43 +0300 + + Fix comment typos in tuklib_mbstr* files. 
+ + src/common/tuklib_mbstr.h | 2 +- + src/common/tuklib_mbstr_fw.c | 2 +- + src/common/tuklib_mbstr_width.c | 2 +- + 3 files changed, 3 insertions(+), 3 deletions(-) + +commit 710f5bd769a5d2bd8684256c2727d15350ee2ab8 +Author: Lasse Collin +Date: 2019-07-12 18:30:46 +0300 + + Add missing include to tuklib_mbstr_width.c. + + It didn't matter in XZ Utils because sysdefs.h + includes string.h anyway. + + src/common/tuklib_mbstr_width.c | 1 + + 1 file changed, 1 insertion(+) + +commit 0e491aa8cd72e0100cd15c1b9469cd57fae500b0 +Author: Lasse Collin +Date: 2019-06-25 23:15:21 +0300 + + liblzma: Fix a buggy comment. + + src/liblzma/lz/lz_encoder_mf.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit bfc245569f340a75bd71ad32a6beba786712683b +Author: Lasse Collin +Date: 2019-06-25 00:16:06 +0300 + + configure.ac: Fix a typo in a comment. + + configure.ac | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit f18eee9d15a22c8449ef395a05f0eb637c4ad253 +Author: Lasse Collin +Date: 2019-06-25 00:08:13 +0300 + + Tests: Silence warnings from clang -Wassign-enum. + + Also changed 999 to 99 so it fits even if lzma_check happened + to be 8 bits wide. + + tests/test_block_header.c | 3 ++- + tests/test_stream_flags.c | 2 +- + 2 files changed, 3 insertions(+), 2 deletions(-) + +commit 25f74554723e8deabc66fed1abf0ec27a4ed19d5 +Author: Lasse Collin +Date: 2019-06-24 23:52:17 +0300 + + liblzma: Add a comment. + + src/liblzma/common/stream_encoder_mt.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 44eb961f2a51d02420d017bc5ff470360663650c +Author: Lasse Collin +Date: 2019-06-24 23:45:21 +0300 + + liblzma: Silence clang -Wmissing-variable-declarations. + + src/liblzma/check/crc32_table.c | 3 +++ + src/liblzma/check/crc64_table.c | 3 +++ + 2 files changed, 6 insertions(+) + +commit 267afcd9955e668c1532b069230c21c348eb4f82 +Author: Lasse Collin +Date: 2019-06-24 22:57:43 +0300 + + xz: Silence a warning from clang -Wsign-conversion in main.c. 
+ + src/xz/main.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 0e3c4002f809311ecef239b05e556d9c462b5703 +Author: Lasse Collin +Date: 2019-06-24 22:47:39 +0300 + + liblzma: Remove incorrect uses of lzma_attribute((__unused__)). + + Caught by clang -Wused-but-marked-unused. + + src/liblzma/common/alone_decoder.c | 3 +-- + src/liblzma/common/alone_encoder.c | 3 +-- + src/liblzma/lz/lz_decoder.c | 3 +-- + 3 files changed, 3 insertions(+), 6 deletions(-) + +commit cb708e8fa3405ec13a0ebfebbbf2793f927deab1 +Author: Lasse Collin +Date: 2019-06-24 20:53:55 +0300 + + Tests: Silence a warning from -Wsign-conversion. + + tests/create_compress_files.c | 8 ++++---- + 1 file changed, 4 insertions(+), 4 deletions(-) + +commit c8cace3d6e965c0fb537591372bf71b9357dd76c +Author: Lasse Collin +Date: 2019-06-24 20:45:49 +0300 + + xz: Fix an integer overflow with 32-bit off_t. + + Or any off_t which isn't very big (like the signed 64-bit integer + that most systems have). A small off_t could overflow if the + file being decompressed had a long enough run of zero bytes, + which would result in corrupt output. + + src/xz/file_io.c | 11 +++++++++-- + 1 file changed, 9 insertions(+), 2 deletions(-) + +commit 65a42741e290fbcd85dfc5db8a62c4bce5f7712c +Author: Lasse Collin +Date: 2019-06-24 00:57:23 +0300 + + Tests: Remove a duplicate branch from tests/tests.h. + + The duplication was introduced about eleven years ago and + should have been cleaned up back then already. + + This was caught by -Wduplicated-branches. + + tests/tests.h | 9 ++------- + 1 file changed, 2 insertions(+), 7 deletions(-) + +commit 5c4fb60e8df026e933afab0cfe0a8b55be20036c +Author: Lasse Collin +Date: 2019-06-23 23:22:45 +0300 + + tuklib_mbstr_width: Fix a warning from -Wsign-conversion.
+ + src/common/tuklib_mbstr_width.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 37df03ce52ce53710e1513387648763f8a744154 +Author: Lasse Collin +Date: 2019-06-23 23:19:34 +0300 + + xz: Fix some of the warnings from -Wsign-conversion. + + src/xz/args.c | 4 ++-- + src/xz/coder.c | 4 ++-- + src/xz/file_io.c | 5 +++-- + src/xz/message.c | 4 ++-- + src/xz/mytime.c | 4 ++-- + src/xz/options.c | 2 +- + src/xz/util.c | 4 ++-- + 7 files changed, 14 insertions(+), 13 deletions(-) + +commit 7c65ae0f5f2e2431f88621e8fe6d1dc7907e30c1 +Author: Lasse Collin +Date: 2019-06-23 22:27:45 +0300 + + tuklib_cpucores: Silence warnings from -Wsign-conversion. + + src/common/tuklib_cpucores.c | 10 +++++----- + 1 file changed, 5 insertions(+), 5 deletions(-) + +commit a502dd1d000b598406637d452f535f4bbd43e2a4 +Author: Lasse Collin +Date: 2019-06-23 21:40:47 +0300 + + xzdec: Fix warnings from -Wsign-conversion. + + src/xzdec/xzdec.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit a45d1a5374ceb22e23255b0b595b9e641e9860af +Author: Lasse Collin +Date: 2019-06-23 21:38:56 +0300 + + liblzma: Fix warnings from -Wsign-conversion. + + Also, more parentheses were added to the literal_subcoder + macro in lzma_common.h (better style but no functional change + in the current usage). + + src/liblzma/common/block_header_decoder.c | 2 +- + src/liblzma/delta/delta_decoder.c | 2 +- + src/liblzma/lzma/fastpos.h | 2 +- + src/liblzma/lzma/lzma2_decoder.c | 8 ++++---- + src/liblzma/lzma/lzma_common.h | 3 ++- + src/liblzma/lzma/lzma_decoder.c | 16 ++++++++-------- + src/liblzma/simple/arm.c | 6 +++--- + src/liblzma/simple/armthumb.c | 8 ++++---- + src/liblzma/simple/ia64.c | 2 +- + src/liblzma/simple/powerpc.c | 9 +++++---- + src/liblzma/simple/x86.c | 2 +- + 11 files changed, 31 insertions(+), 29 deletions(-) + +commit 4ff87ddf80ed7cb233444cddd86ab1940b5b55ec +Author: Lasse Collin +Date: 2019-06-23 19:33:55 +0300 + + tuklib_integer: Silence warnings from -Wsign-conversion.
+ + src/common/tuklib_integer.h | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +commit ed1a9d33984a3a37ae9a775a46859850d98ea4d0 +Author: Lasse Collin +Date: 2019-06-20 19:40:30 +0300 + + tuklib_integer: Fix usage of conv macros. + + Use a temporary variable instead of e.g. + conv32le(unaligned_read32ne(buf)) because the macro can + evaluate its argument multiple times. + + src/common/tuklib_integer.h | 12 ++++++++---- + 1 file changed, 8 insertions(+), 4 deletions(-) + +commit 612c88dfc08e2c572623954ecfde541d21c84882 +Author: Lasse Collin +Date: 2019-06-03 20:44:19 +0300 + + Update THANKS. + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 85da31d8b882b8b9671ab3e3d74d88bd945cd0bb +Author: Lasse Collin +Date: 2019-06-03 20:41:54 +0300 + + liblzma: Fix comments. + + Thanks to Bruce Stark. + + src/liblzma/common/alone_encoder.c | 4 ++-- + src/liblzma/common/block_util.c | 2 +- + src/liblzma/common/common.c | 2 +- + src/liblzma/common/filter_common.h | 2 +- + src/liblzma/common/filter_decoder.h | 2 +- + src/liblzma/common/filter_flags_encoder.c | 2 +- + 6 files changed, 7 insertions(+), 7 deletions(-) + +commit 6a73a7889587aa394e236c7e9e4f870b44851036 +Author: Lasse Collin +Date: 2019-06-02 00:50:59 +0300 + + liblzma: Fix one more unaligned read to use unaligned_read16ne(). + + src/liblzma/lz/lz_encoder_hash.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit ce59b34ec9ac344d62a57cad5f94f695f42cdaee +Author: Lasse Collin +Date: 2019-06-01 21:41:55 +0300 + + Update THANKS. + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 94aa3fb568fe41dd4925a961966ed5cf8213bd1f +Author: Lasse Collin +Date: 2019-06-01 21:36:13 +0300 + + liblzma: memcmplen: Use ctz32() from tuklib_integer.h. 
+ + The same compiler-specific #ifdefs are already in tuklib_integer.h + + src/liblzma/common/memcmplen.h | 10 +--------- + 1 file changed, 1 insertion(+), 9 deletions(-) + +commit 412791486dfb430219d8e30bcbebbfc57a99484a +Author: Lasse Collin +Date: 2019-06-01 21:30:03 +0300 + + tuklib_integer: Cleanup MSVC-specific code. + + src/common/tuklib_integer.h | 20 +++++++++----------- + 1 file changed, 9 insertions(+), 11 deletions(-) + +commit efbf6e5f0932e6c1a4250f91ee99059f449f2470 +Author: Lasse Collin +Date: 2019-06-01 19:01:21 +0300 + + liblzma: Use unaligned_readXXne functions instead of type punning. + + Now gcc -fsanitize=undefined should be clean. + + Thanks to Jeffrey Walton. + + src/liblzma/common/memcmplen.h | 12 ++++++------ + src/liblzma/lzma/lzma_encoder_private.h | 2 +- + 2 files changed, 7 insertions(+), 7 deletions(-) + +commit 29afef03486d461c23f57150ac5436684bff7811 +Author: Lasse Collin +Date: 2019-06-01 18:41:16 +0300 + + tuklib_integer: Improve unaligned memory access. + + Now memcpy() or GNU C packed structs for unaligned access instead + of type punning. See the comment in this commit for details. + + Avoiding type punning with unaligned access is needed to + silence gcc -fsanitize=undefined. + + New functions: unaligned_readXXne and unaligned_writeXXne where + XX is 16, 32, or 64. + + src/common/tuklib_integer.h | 180 +++++++++++++++++++++++++++++++++++++++++--- + 1 file changed, 168 insertions(+), 12 deletions(-) + +commit 596ed3de4485a4b1d83b5fe506ae9d0a172139b4 +Author: Lasse Collin +Date: 2019-05-13 20:05:17 +0300 + + liblzma: Avoid memcpy(NULL, foo, 0) because it is undefined behavior. + + I should have always known this but I didn't. Here is an example + as a reminder to myself: + + int mycopy(void *dest, void *src, size_t n) + { + memcpy(dest, src, n); + return dest == NULL; + } + + In the example, a compiler may assume that dest != NULL because + passing NULL to memcpy() would be undefined behavior. 
Testing + with GCC 8.2.1, mycopy(NULL, NULL, 0) returns 1 with -O0 and -O1. + With -O2 the return value is 0 because the compiler infers that + dest cannot be NULL because it was already used with memcpy() + and thus the test for NULL gets optimized out. + + In liblzma, if a null-pointer was passed to memcpy(), there were + no checks for NULL *after* the memcpy() call, so I cautiously + suspect that it shouldn't have caused bad behavior in practice, + but it's hard to be sure, and the problematic cases had to be + fixed anyway. + + Thanks to Jeffrey Walton. + + src/liblzma/common/common.c | 6 +++++- + src/liblzma/lz/lz_decoder.c | 12 +++++++++--- + src/liblzma/simple/simple_coder.c | 10 +++++++++- + 3 files changed, 23 insertions(+), 5 deletions(-) + +commit b4b83555c576e1d845a2b98a193b23c021437804 +Author: Lasse Collin +Date: 2019-05-11 20:56:08 +0300 + + Update THANKS. + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 8d4906262b45557ed164cd74adb270e6ef7f6f03 +Author: Lasse Collin +Date: 2019-05-11 20:54:12 +0300 + + xz: Update xz man page date. 
+ + src/xz/xz.1 | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 0d318402f8a022f707622c72f8f1894ea476cf89 +Author: Antoine Cœur +Date: 2019-05-08 13:30:57 +0800 + + spelling + + Doxyfile.in | 2 +- + NEWS | 2 +- + src/liblzma/api/lzma/block.h | 2 +- + src/liblzma/api/lzma/hardware.h | 2 +- + src/liblzma/api/lzma/lzma12.h | 2 +- + src/liblzma/api/lzma/vli.h | 2 +- + src/liblzma/common/hardware_physmem.c | 2 +- + src/liblzma/common/index.c | 4 ++-- + src/liblzma/common/stream_encoder_mt.c | 2 +- + src/liblzma/common/vli_decoder.c | 2 +- + src/liblzma/lz/lz_decoder.c | 2 +- + src/scripts/xzgrep.in | 2 +- + src/xz/args.c | 2 +- + src/xz/coder.c | 4 ++-- + src/xz/main.c | 2 +- + src/xz/mytime.h | 2 +- + src/xz/private.h | 2 +- + src/xz/xz.1 | 2 +- + windows/build.bash | 2 +- + 19 files changed, 21 insertions(+), 21 deletions(-) + +commit aeb3be8ac4c4b06a745c3799b80b38159fb78b1a +Author: Lasse Collin +Date: 2019-03-04 22:49:04 +0200 + + README: Update translation instructions. + + XZ Utils is now part of the Translation Project + . + + README | 32 +++++++++++++------------------- + 1 file changed, 13 insertions(+), 19 deletions(-) + +commit 0c238dc3feb0a3eea1e713feb8d338c8dfba9f74 +Author: Lasse Collin +Date: 2018-12-20 20:42:29 +0200 + + Update THANKS. + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 3ca432d9cce4bf7e793de173dd22025b68611c42 +Author: Lasse Collin +Date: 2018-12-14 20:34:30 +0200 + + xz: Fix a crash in progress indicator when in passthru mode. + + "xz -dcfv not_an_xz_file" crashed (all four options are + required to trigger it). It caused xz to call + lzma_get_progress(&strm, ...) when no coder was initialized + in strm. In this situation strm.internal is NULL which leads + to a crash in lzma_get_progress(). + + The bug was introduced when xz started using lzma_get_progress() + to get progress info for multi-threaded compression, so the + bug is present in versions 5.1.3alpha and higher. 
+ + Thanks to Filip Palian for + the bug report. + + src/xz/coder.c | 11 +++++++---- + src/xz/message.c | 18 ++++++++++++++++-- + src/xz/message.h | 3 ++- + 3 files changed, 25 insertions(+), 7 deletions(-) + +commit fcc419e3c3f77a8b6fc5056a86b1b8abbe266e62 +Author: Lasse Collin +Date: 2018-11-22 17:20:31 +0200 + + xz: Update man page timestamp. + + src/xz/xz.1 | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 5a2fc3cd0194e55df329dd29f805299aaca5f32f +Author: Pavel Raiskup +Date: 2018-11-22 15:14:34 +0100 + + 'have have' typos + + src/xz/signals.c | 2 +- + src/xz/xz.1 | 2 +- + 2 files changed, 2 insertions(+), 2 deletions(-) + +commit 7143b04fe49390807f355b1dad686a3d8c4dbdcf +Author: Lasse Collin +Date: 2018-07-27 18:10:44 +0300 + + xzless: Rename unused variables to silence static analysers. + + In this particular case I don't see this affecting readability + of the code. + + Thanks to Pavel Raiskup. + + src/scripts/xzless.in | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 273c33297bb69621045ed19665eaf8338bcf4a50 +Author: Lasse Collin +Date: 2018-07-27 16:02:58 +0300 + + liblzma: Remove an always-true condition from lzma_index_cat(). + + This should help static analysis tools to see that newg + isn't leaked. + + Thanks to Pavel Raiskup. + + src/liblzma/common/index.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +commit 65b4aba6d06d2cd24ba9ad01fa389c238ad8f352 +Author: Lasse Collin +Date: 2018-05-19 21:23:25 +0300 + + liblzma: Improve lzma_properties_decode() API documentation. + + src/liblzma/api/lzma/filter.h | 7 ++++--- + 1 file changed, 4 insertions(+), 3 deletions(-) + +commit 531e78e5a253a3e2c4d4dd1505acaccee48f4083 +Author: Lasse Collin +Date: 2019-05-01 16:52:36 +0300 + + Update THANKS. + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 905de7e93528ca5a47039e7e1e5270163f9fc67e +Author: Lasse Collin +Date: 2019-05-01 16:43:16 +0300 + + Windows: Update VS version in windows/vs2019/config.h. 
+ + windows/vs2019/config.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 0ffd30e172fd18cc619823b2a86448bf56a67e22 +Author: Julien Marrec +Date: 2019-04-25 17:44:06 +0200 + + Windows: Upgrade solution itself + + windows/vs2019/xz_win.sln | 7 +++++-- + 1 file changed, 5 insertions(+), 2 deletions(-) + +commit c2ef96685fc7ca36311649eeb2284b9808292040 +Author: Julien Marrec +Date: 2019-04-25 17:40:24 +0200 + + Windows: Upgrade solution with VS2019 + + windows/vs2019/liblzma.vcxproj | 15 ++++++++------- + windows/vs2019/liblzma_dll.vcxproj | 15 ++++++++------- + 2 files changed, 16 insertions(+), 14 deletions(-) + +commit 25fccaf00bea399d8aa026e5b8fa254ce196e6e0 +Author: Julien Marrec +Date: 2019-04-25 17:39:32 +0200 + + Windows: Duplicate windows/vs2017 before upgrading + + windows/vs2019/config.h | 148 ++++++++++++++ + windows/vs2019/liblzma.vcxproj | 354 ++++++++++++++++++++++++++++++++++ + windows/vs2019/liblzma_dll.vcxproj | 383 +++++++++++++++++++++++++++++++++++++ + windows/vs2019/xz_win.sln | 48 +++++ + 4 files changed, 933 insertions(+) + +commit 1424078d6328291c7c524b64328ce9660617cb24 +Author: Lasse Collin +Date: 2019-01-13 17:29:23 +0200 + + Windows/VS2017: Omit WindowsTargetPlatformVersion from project files. + + I understood that if a WTPV is specified, it's often wrong + because different VS installations have different SDK version + installed. Omitting the WTPV tag makes VS2017 default to + Windows SDK 8.1 which often is also missing, so in any case + people may need to specify the WTPV before building. But some + day in the future a missing WTPV tag will start to default to + the latest installed SDK which sounds reasonable: + + https://developercommunity.visualstudio.com/content/problem/140294/windowstargetplatformversion-makes-it-impossible-t.html + + Thanks to "dom". 
+ + windows/INSTALL-MSVC.txt | 4 ++++ + windows/vs2017/liblzma.vcxproj | 1 - + windows/vs2017/liblzma_dll.vcxproj | 1 - + 3 files changed, 4 insertions(+), 2 deletions(-) + commit b5be61cc06088bb07f488f9baf7d447ff47b37c1 Author: Lasse Collin Date: 2018-04-29 19:00:06 +0300 Bump version and soname for 5.2.4. src/liblzma/Makefile.am | 2 +- src/liblzma/api/lzma/version.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) commit c47fa6d06745bb2e99866e76b81ac7a9c5a8bfec Author: Lasse Collin Date: 2018-04-29 18:48:00 +0300 extra/scanlzma: Fix compiler warnings. extra/scanlzma/scanlzma.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) commit 7b350fe21aa4fd6495a3b6188a40e3f1ae7c0edf Author: Lasse Collin Date: 2018-04-29 18:15:37 +0300 Add NEWS for 5.2.4. NEWS | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) commit 5801591162a280aa52d156dfde42c531ec7fd8b6 Author: Lasse Collin Date: 2018-02-06 19:36:30 +0200 Update THANKS. THANKS | 2 ++ 1 file changed, 2 insertions(+) commit c4a616f4536146f8906e1b4412eefeec07b28fae Author: Ben Boeckel Date: 2018-01-29 13:58:18 -0500 nothrow: use noexcept for C++11 and newer In C++11, the `throw()` specifier is deprecated and `noexcept` is preffered instead. src/liblzma/api/lzma.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) commit 0b8947782ff3c5ef830a7f85412e44dcf3cdeb77 Author: Lasse Collin Date: 2018-02-06 18:02:48 +0200 liblzma: Remove incorrect #ifdef from range_common.h. In most cases it was harmless but it could affect some custom build systems. Thanks to Pippijn van Steenhoven. src/liblzma/rangecoder/range_common.h | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) commit 48f3b9f73ffea7f55d5678997aba0e79d2e82168 Author: Lasse Collin Date: 2018-01-10 22:10:39 +0200 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit a3ce3e902342be37c626a561ce3d9ffcf27d0f94 Author: Lasse Collin Date: 2018-01-10 21:54:27 +0200 tuklib_integer: New Intel C compiler needs immintrin.h. 
Thanks to Melanie Blower (Intel) for the patch. src/common/tuklib_integer.h | 11 +++++++++++ 1 file changed, 11 insertions(+) commit 4505ca483985f88c6923c05a43b4327feaab83b1 Author: Lasse Collin Date: 2017-09-24 20:04:24 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 1ef3cc226e3ce173575c218238b71a4eecabc470 Author: Lasse Collin Date: 2017-09-16 20:36:20 +0300 Windows: Fix paths in VS project files. Some paths use slashes instead of backslashes as directory separators... now it should work (I tested VS2013 version). windows/vs2013/liblzma.vcxproj | 12 ++++++------ windows/vs2013/liblzma_dll.vcxproj | 24 ++++++++++++------------ windows/vs2017/liblzma.vcxproj | 12 ++++++------ windows/vs2017/liblzma_dll.vcxproj | 24 ++++++++++++------------ 4 files changed, 36 insertions(+), 36 deletions(-) commit e775d2a8189d24f60470e6e49d8af881df3a1680 Author: Lasse Collin Date: 2017-09-16 12:54:23 +0300 Windows: Add project files for VS2017. These files match the v5.2 branch (no file info decoder). windows/vs2017/config.h | 148 ++++++++++++++ windows/vs2017/liblzma.vcxproj | 355 ++++++++++++++++++++++++++++++++++ windows/vs2017/liblzma_dll.vcxproj | 384 +++++++++++++++++++++++++++++++++++++ windows/vs2017/xz_win.sln | 48 +++++ 4 files changed, 935 insertions(+) commit 10e02e0fbb6e2173f8b41f6e39b7b570f47dd74d Author: Lasse Collin Date: 2017-09-16 12:39:43 +0300 Windows: Move VS2013 files into windows/vs2013 directory. windows/{ => vs2013}/config.h | 0 windows/{ => vs2013}/liblzma.vcxproj | 278 +++++++++++++++--------------- windows/{ => vs2013}/liblzma_dll.vcxproj | 280 +++++++++++++++---------------- windows/{ => vs2013}/xz_win.sln | 0 4 files changed, 279 insertions(+), 279 deletions(-) commit 06eebd4543196ded36fa9b8b9544195b38b24ef2 Author: Lasse Collin Date: 2017-08-14 20:08:33 +0300 Fix or hide warnings from GCC 7's -Wimplicit-fallthrough. 
src/liblzma/lzma/lzma_decoder.c | 6 ++++++ src/xz/list.c | 2 ++ 2 files changed, 8 insertions(+) commit ea4ea1dffafebaa8b2770bf3eca46900e4dd22dc Author: Alexey Tourbin Date: 2017-05-16 23:56:35 +0300 Docs: Fix a typo in a comment in doc/examples/02_decompress.c. doc/examples/02_decompress.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit eb2ef4c79bf405ea0d215f3b1df3d0eaf5e1d27b Author: Lasse Collin Date: 2017-05-23 18:34:43 +0300 xz: Fix "xz --list --robot missing_or_bad_file.xz". It ended up printing an uninitialized char-array when trying to print the check names (column 7) on the "totals" line. This also changes the column 12 (minimum xz version) to 50000002 (xz 5.0.0) instead of 0 when there are no valid input files. Thanks to kidmin for the bug report. src/xz/list.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) commit 3ea5dbd9b0d79048e336e40cef3b6d814fb74e13 Author: Lasse Collin Date: 2017-04-24 19:48:47 +0300 Build: Omit pre-5.0.0 entries from the generated ChangeLog. It makes ChangeLog significantly smaller. Makefile.am | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) commit bae24675936df99064de1502593c006bd902594b Author: Lasse Collin Date: 2017-04-24 19:30:22 +0300 Update the Git repository URL to HTTPS in ChangeLog. ChangeLog | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 70f479211973b5361f4d7cb08ba5be69b4266e7a Author: Lasse Collin Date: 2017-04-19 22:17:35 +0300 Update the home page URLs to HTTPS. COPYING | 2 +- README | 2 +- configure.ac | 2 +- doc/faq.txt | 4 ++-- dos/config.h | 2 +- src/common/common_w32res.rc | 2 +- src/xz/xz.1 | 6 +++--- src/xzdec/xzdec.1 | 4 ++-- windows/README-Windows.txt | 2 +- windows/config.h | 2 +- 10 files changed, 14 insertions(+), 14 deletions(-) commit 2a4b2fa75d06a097261a02ecd3cf2b6d449bf754 Author: Lasse Collin Date: 2017-03-30 22:01:54 +0300 xz: Use POSIX_FADV_RANDOM for in "xz --list" mode. xz --list is random access so POSIX_FADV_SEQUENTIAL was clearly wrong. 
src/xz/file_io.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) commit eb25743ade39170cffd9566a1aae272098cce216 Author: Lasse Collin Date: 2017-03-30 19:47:45 +0300 liblzma: Fix lzma_memlimit_set(strm, 0). The 0 got treated specially in a buggy way and as a result the function did nothing. The API doc said that 0 was supposed to return LZMA_PROG_ERROR but it didn't. Now 0 is treated as if 1 had been specified. This is done because 0 is already used to indicate an error from lzma_memlimit_get() and lzma_memusage(). In addition, lzma_memlimit_set() no longer checks that the new limit is at least LZMA_MEMUSAGE_BASE. It's counter-productive for the Index decoder and was actually needed only by the auto decoder. Auto decoder has now been modified to check for LZMA_MEMUSAGE_BASE. src/liblzma/api/lzma/base.h | 7 ++++++- src/liblzma/common/auto_decoder.c | 3 +++ src/liblzma/common/common.c | 6 ++++-- 3 files changed, 13 insertions(+), 3 deletions(-) commit ef36c6362f3f3853f21b8a6359bcd06576ebf207 Author: Lasse Collin Date: 2017-03-30 19:16:55 +0300 liblzma: Similar memlimit fix for stream_, alone_, and auto_decoder. src/liblzma/api/lzma/container.h | 21 +++++++++++++++++---- src/liblzma/common/alone_decoder.c | 5 +---- src/liblzma/common/auto_decoder.c | 5 +---- src/liblzma/common/stream_decoder.c | 5 +---- 4 files changed, 20 insertions(+), 16 deletions(-) commit 57616032650f03840480b696d7878acdd2065521 Author: Lasse Collin Date: 2017-03-30 18:58:18 +0300 liblzma: Fix handling of memlimit == 0 in lzma_index_decoder(). It returned LZMA_PROG_ERROR, which was done to avoid zero as the limit (because it's a special value elsewhere), but using LZMA_PROG_ERROR is simply inconvenient and can cause bugs. The fix/workaround is to treat 0 as if it were 1 byte. It's effectively the same thing. The only weird consequence is that then lzma_memlimit_get() will return 1 even when 0 was specified as the limit. 
This fixes a very rare corner case in xz --list where a specific memory usage limit and a multi-stream file could print the error message "Internal error (bug)" instead of saying that the memory usage limit is too low. src/liblzma/api/lzma/index.h | 18 +++++++++++------- src/liblzma/common/index_decoder.c | 4 ++-- 2 files changed, 13 insertions(+), 9 deletions(-) commit 3d566cd519017eee1a400e7961ff14058dfaf33c Author: Lasse Collin Date: 2016-12-30 13:26:36 +0200 Bump version and soname for 5.2.3. src/liblzma/Makefile.am | 2 +- src/liblzma/api/lzma/version.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) commit 053e624fe33795e779ff736f16ce44a129c829b5 Author: Lasse Collin Date: 2016-12-30 13:25:10 +0200 Update NEWS for 5.2.3. NEWS | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) commit cae412b2b77d7fd88d187ed7659331709311f80d Author: Lasse Collin Date: 2015-04-01 14:45:25 +0300 xz: Fix the Capsicum rights on user_abort_pipe. src/xz/file_io.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) commit 9ccbae41000572193b9a09e7102f9e84dc6d96de Author: Lasse Collin Date: 2016-12-28 21:05:22 +0200 Mention potential sandboxing bugs in INSTALL. INSTALL | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) commit e013a337d3de77cce24360dffe956ea2339489b6 Author: Lasse Collin Date: 2016-11-21 20:24:50 +0200 liblzma: Avoid multiple definitions of lzma_coder structures. Only one definition was visible in a translation unit. It avoided a few casts and temp variables but seems that this hack doesn't work with link-time optimizations in compilers as it's not C99/C11 compliant. 
Fixes: http://www.mail-archive.com/xz-devel@tukaani.org/msg00279.html src/liblzma/common/alone_decoder.c | 44 +++++---- src/liblzma/common/alone_encoder.c | 34 ++++--- src/liblzma/common/auto_decoder.c | 35 ++++--- src/liblzma/common/block_decoder.c | 41 ++++---- src/liblzma/common/block_encoder.c | 40 ++++---- src/liblzma/common/common.h | 18 ++-- src/liblzma/common/index_decoder.c | 33 ++++--- src/liblzma/common/index_encoder.c | 16 ++-- src/liblzma/common/stream_decoder.c | 50 +++++----- src/liblzma/common/stream_encoder.c | 56 ++++++----- src/liblzma/common/stream_encoder_mt.c | 124 ++++++++++++++----------- src/liblzma/delta/delta_common.c | 25 ++--- src/liblzma/delta/delta_decoder.c | 6 +- src/liblzma/delta/delta_encoder.c | 12 ++- src/liblzma/delta/delta_private.h | 4 +- src/liblzma/lz/lz_decoder.c | 60 ++++++------ src/liblzma/lz/lz_decoder.h | 13 ++- src/liblzma/lz/lz_encoder.c | 57 +++++++----- src/liblzma/lz/lz_encoder.h | 9 +- src/liblzma/lzma/lzma2_decoder.c | 32 ++++--- src/liblzma/lzma/lzma2_encoder.c | 51 +++++----- src/liblzma/lzma/lzma_decoder.c | 27 +++--- src/liblzma/lzma/lzma_encoder.c | 29 +++--- src/liblzma/lzma/lzma_encoder.h | 9 +- src/liblzma/lzma/lzma_encoder_optimum_fast.c | 3 +- src/liblzma/lzma/lzma_encoder_optimum_normal.c | 23 ++--- src/liblzma/lzma/lzma_encoder_private.h | 6 +- src/liblzma/simple/arm.c | 2 +- src/liblzma/simple/armthumb.c | 2 +- src/liblzma/simple/ia64.c | 2 +- src/liblzma/simple/powerpc.c | 2 +- src/liblzma/simple/simple_coder.c | 61 ++++++------ src/liblzma/simple/simple_private.h | 12 +-- src/liblzma/simple/sparc.c | 2 +- src/liblzma/simple/x86.c | 15 +-- 35 files changed, 532 insertions(+), 423 deletions(-) commit 8e0f1af3dcaec00a3879cce8ad7441edc6359d1c Author: Lasse Collin Date: 2016-12-26 20:50:25 +0200 Document --enable-sandbox configure option in INSTALL. 
INSTALL | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) commit ce2542d220de06acd618fd9f5c0a6683029fb4eb Author: Lasse Collin Date: 2015-03-31 22:19:34 +0300 xz: Add support for sandboxing with Capsicum (disabled by default). In the v5.2 branch this feature is considered experimental and thus disabled by default. The sandboxing is used conditionally as described in main.c. This isn't optimal but it was much easier to implement than a full sandboxing solution and it still covers the most common use cases where xz is writing to standard output. This should have practically no effect on performance even with small files as fork() isn't needed. C and locale libraries can open files as needed. This has been fine in the past, but it's a problem with things like Capsicum. io_sandbox_enter() tries to ensure that various locale-related files have been loaded before cap_enter() is called, but it's possible that there are other similar problems which haven't been seen yet. Currently Capsicum is available on FreeBSD 10 and later and there is a port to Linux too. Thanks to Loganaden Velvindron for help. configure.ac | 41 +++++++++++++++++++++++++++ src/xz/Makefile.am | 2 +- src/xz/file_io.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ src/xz/file_io.h | 6 ++++ src/xz/main.c | 18 ++++++++++++ src/xz/private.h | 4 +++ 6 files changed, 151 insertions(+), 1 deletion(-) commit 3ca1d5e6320111043e19434da881065fadafa0e4 Author: Lasse Collin Date: 2015-03-31 21:12:30 +0300 Fix bugs and otherwise improve ax_check_capsicum.m4. AU_ALIAS was removed because the new version is incompatible with the old version. It no longer checks for separately. It's enough to test for it as part of AC_CHECK_DECL. The defines HAVE_CAPSICUM_SYS_CAPSICUM_H and HAVE_CAPSICUM_SYS_CAPABILITY_H were removed as unneeded. HAVE_SYS_CAPSICUM_H from AC_CHECK_HEADERS is enough. It no longer does a useless search for the Capsicum library if the header wasn't found. 
Fixed a bug in ACTION-IF-FOUND (the first argument). Specifying the argument omitted the default action but the given action wasn't used instead. AC_DEFINE([HAVE_CAPSICUM]) is now always called when Capsicum support is found. Previously it was part of the default ACTION-IF-FOUND which a custom action would override. Now the default action only prepends ${CAPSICUM_LIB} to LIBS. The documentation was updated. Since there as no serial number, "#serial 2" was added. m4/ax_check_capsicum.m4 | 103 ++++++++++++++++++++++++------------------------ 1 file changed, 51 insertions(+), 52 deletions(-) commit 5f3a742b64197fe8bedb6f05fc6ce5d177d11145 Author: Lasse Collin Date: 2015-03-31 19:20:24 +0300 Add m4/ax_check_capsicum.m4 for detecting Capsicum support. The file was loaded from this web page: https://github.com/google/capsicum-test/blob/dev/autoconf/m4/ax_check_capsicum.m4 Thanks to Loganaden Velvindron for pointing it out for me. m4/ax_check_capsicum.m4 | 86 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 86 insertions(+) commit d74377e62b4c649e40294dd441de72c0f092e67c Author: Lasse Collin Date: 2015-10-12 20:29:09 +0300 liblzma: Fix a memory leak in error path of lzma_index_dup(). lzma_index_dup() calls index_dup_stream() which, in case of an error, calls index_stream_end() to free memory allocated by index_stream_init(). However, it illogically didn't actually free the memory. To make it logical, the tree handling code was modified a bit in addition to changing index_stream_end(). Thanks to Evan Nemerson for the bug report. src/liblzma/common/index.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) commit f580732216dcf971f3f006fe8e01cd4979e1d964 Author: Lasse Collin Date: 2016-10-24 18:53:25 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 88d7a7fd153bf1355cdf798ffdac7443d0169afc Author: Lasse Collin Date: 2016-10-24 18:51:36 +0300 tuklib_cpucores: Add support for sched_getaffinity(). 
It's available in glibc (GNU/Linux, GNU/kFreeBSD). It's better than sysconf(_SC_NPROCESSORS_ONLN) because sched_getaffinity() gives the number of cores available to the process instead of the total number of cores online. As a side effect, this commit fixes a bug on GNU/kFreeBSD where configure would detect the FreeBSD-specific cpuset_getaffinity() but it wouldn't actually work because on GNU/kFreeBSD it requires using -lfreebsd-glue when linking. Now the glibc-specific function will be used instead. Thanks to Sebastian Andrzej Siewior for the original patch and testing. m4/tuklib_cpucores.m4 | 30 +++++++++++++++++++++++++++++- src/common/tuklib_cpucores.c | 9 +++++++++ 2 files changed, 38 insertions(+), 1 deletion(-) commit 51baf684376903dbeddd840582bfdf9fa91b311b Author: Lasse Collin Date: 2016-06-30 20:27:36 +0300 xz: Fix copying of timestamps on Windows. xz used to call utime() on Windows, but its result gets lost on close(). Using _futime() seems to work. Thanks to Martok for reporting the bug: http://www.mail-archive.com/xz-devel@tukaani.org/msg00261.html configure.ac | 2 +- src/xz/file_io.c | 18 ++++++++++++++++++ 2 files changed, 19 insertions(+), 1 deletion(-) commit 1ddc479851139d6e8202e5835421bfe6578d9e07 Author: Lasse Collin Date: 2016-06-16 22:46:02 +0300 xz: Silence warnings from -Wlogical-op. Thanks to Evan Nemerson. src/xz/file_io.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) commit be647ff5ed5a1c244a65722af6ce250259f3b14a Author: Lasse Collin Date: 2016-04-10 20:55:49 +0300 Build: Fix = to += for xz_SOURCES in src/xz/Makefile.am. Thanks to Christian Kujau. src/xz/Makefile.am | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit fb6d50c15343831f35305982cefa82053099191d Author: Lasse Collin Date: 2016-04-10 20:54:17 +0300 Build: Bump GNU Gettext version requirement to 0.19. It silences a few warnings and most people probably have 0.19 even on stable distributions. Thanks to Christian Kujau. 
configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 74f8dad9f912a2993768d93d108ea2b0b2c196e0 Author: Lasse Collin Date: 2016-03-13 20:21:49 +0200 liblzma: Disable external SHA-256 by default. This is the sane thing to do. The conflict with OpenSSL on some OSes and especially that the OS-provided versions can be significantly slower makes it clear that it was a mistake to have the external SHA-256 support enabled by default. Those who want it can now pass --enable-external-sha256 to configure. INSTALL was updated with notes about OSes where this can be a bad idea. The SHA-256 detection code in configure.ac had some bugs that could lead to a build failure in some situations. These were fixed, although it doesn't matter that much now that the external SHA-256 is disabled by default. MINIX >= 3.2.0 uses NetBSD's libc and thus has SHA256_Init in libc instead of libutil. Support for the libutil version was removed. INSTALL | 36 ++++++++++++++++++++++ configure.ac | 76 +++++++++++++++++++++++------------------------ src/liblzma/check/check.h | 16 ++++------ 3 files changed, 79 insertions(+), 49 deletions(-) commit ea7f6ff04cb5bb1498088eb09960a4c3f13dfe39 Author: Lasse Collin Date: 2016-03-10 20:27:05 +0200 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit d0e018016b311232e82d9a98dc68f1e3dabce794 Author: Lasse Collin Date: 2016-03-10 20:26:49 +0200 Build: Avoid SHA256_Init on FreeBSD and MINIX 3. On FreeBSD 10 and older, SHA256_Init from libmd conflicts with libcrypto from OpenSSL. The OpenSSL version has different sizeof(SHA256_CTX) and it can cause weird problems if wrong SHA256_Init gets used. Looking at the source, MINIX 3 seems to have a similar issue but I'm not sure. To be safe, I disabled SHA256_Init on MINIX 3 too. NetBSD has SHA256_Init in libc and they had a similar problem, but they already fixed it in 2009. Thanks to Jim Wilcoxson for the bug report that helped in finding the problem. 
configure.ac | 27 +++++++++++++++++++++------ 1 file changed, 21 insertions(+), 6 deletions(-) commit 5daae123915f32a4ed6dc948b831533c2d1beec3 Author: Lasse Collin Date: 2015-11-08 20:16:10 +0200 tuklib_physmem: Hopefully silence a warning on Windows. src/common/tuklib_physmem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) commit 491acc406e098167ccb7fce0728b94c2f32cff9f Author: Lasse Collin Date: 2015-11-04 23:17:43 +0200 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 8173ff8790ad3502d04e1c07d014cb84a3b8187b Author: Lasse Collin Date: 2015-11-04 23:14:00 +0200 liblzma: Make Valgrind happier with optimized (gcc -O2) liblzma. When optimizing, GCC can reorder code so that an uninitialized value gets used in a comparison, which makes Valgrind unhappy. It doesn't happen when compiled with -O0, which I tend to use when running Valgrind. Thanks to Rich Prohaska. I remember this being mentioned long ago by someone else but nothing was done back then. src/liblzma/lz/lz_encoder.c | 4 ++++ 1 file changed, 4 insertions(+) commit 013de2b5ab8094d2c82a2771f3d143eeb656eda9 Author: Lasse Collin Date: 2015-11-03 20:55:45 +0200 liblzma: Rename lzma_presets.c back to lzma_encoder_presets.c. It would be too annoying to update other build systems just because of this. src/liblzma/lzma/Makefile.inc | 2 +- src/liblzma/lzma/{lzma_presets.c => lzma_encoder_presets.c} | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) commit a322f70ad96de88968c2c36e6a36bc08ae30bd20 Author: Lasse Collin Date: 2015-11-03 20:47:07 +0200 Build: Disable xzdec, lzmadec, and lzmainfo when they cannot be built. They all need decoder support and if that isn't available, there's no point trying to build them. configure.ac | 3 +++ 1 file changed, 3 insertions(+) commit 8ea49606cf6427e32319de7693eca9e43f1c8ad6 Author: Lasse Collin Date: 2015-11-03 20:35:19 +0200 Build: Simplify $enable_{encoders,decoders} usage a bit. 
configure.ac | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) commit 42131a25e52bfe400acfa7df93469a96bb78bb78 Author: Lasse Collin Date: 2015-11-03 20:31:31 +0200 Windows/MSVC: Update config.h. windows/config.h | 6 ++++++ 1 file changed, 6 insertions(+) commit e9184e87cc989d14c7413e6adb3eca98f6ae0290 Author: Lasse Collin Date: 2015-11-03 20:29:58 +0200 DOS: Update config.h. dos/config.h | 6 ++++++ 1 file changed, 6 insertions(+) commit 2296778f3c9a1e3a8699973b09dd3610b8baa402 Author: Lasse Collin Date: 2015-11-03 20:29:33 +0200 xz: Make xz buildable even when encoders or decoders are disabled. The patch is quite long but it's mostly about adding new #ifdefs to omit code when encoders or decoders have been disabled. This adds two new #defines to config.h: HAVE_ENCODERS and HAVE_DECODERS. configure.ac | 4 ++++ src/xz/Makefile.am | 8 ++++++-- src/xz/args.c | 16 ++++++++++++++++ src/xz/coder.c | 33 +++++++++++++++++++++++++-------- src/xz/main.c | 9 +++++++-- src/xz/private.h | 5 ++++- 6 files changed, 62 insertions(+), 13 deletions(-) commit 97a3109281e475d9cf1b5095237d672fa0ad25e5 Author: Lasse Collin Date: 2015-11-03 18:06:40 +0200 Build: Build LZMA1/2 presets also when only decoder is wanted. People shouldn't rely on the presets when decoding raw streams, but xz uses the presets as the starting point for raw decoder options anyway. lzma_encocder_presets.c was renamed to lzma_presets.c to make it clear it's not used solely by the encoder code. src/liblzma/lzma/Makefile.inc | 6 +++++- src/liblzma/lzma/{lzma_encoder_presets.c => lzma_presets.c} | 3 ++- 2 files changed, 7 insertions(+), 2 deletions(-) commit dc6b78d7f0f6fe43e9d4215146e8581feb8090e7 Author: Lasse Collin Date: 2015-11-03 17:54:48 +0200 Build: Fix configure to handle LZMA1 dependency with LZMA2. Now it gives an error if LZMA1 encoder/decoder is missing when LZMA2 encoder/decoder was requested. Even better would be LZMA2 implicitly enabling LZMA1 but it would need more code. 
configure.ac | 5 ----- 1 file changed, 5 deletions(-) commit 46d76c9cd3cb26a31f5ae6c3a8bbcf38e6da1add Author: Lasse Collin Date: 2015-11-03 17:41:54 +0200 Build: Don't omit lzma_cputhreads() unless using --disable-threads. Previously it was omitted if encoders were disabled with --disable-encoders. It didn't make sense and it also broke the build. src/liblzma/common/Makefile.inc | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) commit 16d68f874d89f1e4a1919786a35bbaef7d71a077 Author: Lasse Collin Date: 2015-11-02 18:16:51 +0200 liblzma: Fix a build failure related to external SHA-256 support. If an appropriate header and structure were found by configure, but a library with a usable SHA-256 functions wasn't, the build failed. src/liblzma/check/check.h | 32 +++++++++++++++++++++++--------- 1 file changed, 23 insertions(+), 9 deletions(-) commit d9311647fc1ab512a3394596221ab8039c00af6b Author: Lasse Collin Date: 2015-11-02 15:19:10 +0200 xz: Always close the file before trying to delete it. unlink() can return EBUSY in errno for open files on some operating systems and file systems. src/xz/file_io.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) commit f59c4183f3c9066626ce45dc3db4642fa603fa21 Author: Lasse Collin Date: 2015-10-12 21:08:42 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 35f189673e280c12e4c5129f9f97e54eef3bbc04 Author: Lasse Collin Date: 2015-10-12 21:07:41 +0300 Tests: Add tests for the two bugs fixed in index.c. tests/test_index.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) commit e10bfdb0fcaff12f3a6dadee51e0a022aadccb51 Author: Lasse Collin Date: 2015-10-12 20:45:15 +0300 liblzma: Fix lzma_index_dup() for empty Streams. Stream Flags and Stream Padding weren't copied from empty Streams. 
src/liblzma/common/index.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) commit 06f434bd8980f25ca23232eb7bb7df7e37dc8448 Author: Lasse Collin Date: 2015-10-12 20:31:44 +0300 liblzma: Add a note to index.c for those using static analyzers. src/liblzma/common/index.c | 3 +++ 1 file changed, 3 insertions(+) commit 9815cdf6987ef91a85493bfcfd1ce2aaf3b47a0a Author: Lasse Collin Date: 2015-09-29 13:59:35 +0300 Bump version and soname for 5.2.2. src/liblzma/Makefile.am | 2 +- src/liblzma/api/lzma/version.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) commit cbe0cec8476bdd0416c7ca9bc83895c9bea1cf78 Author: Lasse Collin Date: 2015-09-29 13:57:28 +0300 Update NEWS for 5.2.2. NEWS | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) commit 49427ce7eececdd18bbd35dab23c81910d083e1c Author: Andre Noll Date: 2015-05-28 15:50:00 +0200 Fix typo in German translation. As pointed out by Robert Pollak, there's a typo in the German translation of the compression preset option (-0 ... -9) help text. "The compressor" translates to "der Komprimierer", and the genitive form is "des Komprimierers". The old word makes no sense at all. po/de.po | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 608d6f06c940e7f28c25de005e8b99bdff42d27c Author: Hauke Henningsen Date: 2015-08-17 04:59:54 +0200 Update German translation, mostly wrt orthography Provide an update of the German translation. * A lot of compound words were previously written with spaces, while German orthography is relatively clear in that the components should not be separated. * When referring to the actual process of (de)compression rather than the concept, replace “(De-)Kompression” with “(De-)Komprimierung”. Previously, both forms were used in this context and are now used in a manner consistent with “Komprimierung” being more likely to refer to a process. 
* Consistently translate “standard input”/“output” * Use “Zeichen” instead of false friend “Charakter” for “character” * Insert commas around relative clauses (as required in German) * Some other minor corrections * Capitalize “ß” as “ẞ” * Consistently start option descriptions in --help with capital letters Acked-By: Andre Noll * Update after msgmerge po/de.po | 383 ++++++++++++++++++++++++++++++++------------------------------- 1 file changed, 196 insertions(+), 187 deletions(-) commit c8988414e5b67b8ef2fe0ba7b1ccdd0ec73c60d3 Author: Lasse Collin Date: 2015-08-11 13:23:04 +0300 Build: Minor Cygwin cleanup. Some tests used "cygwin*" and some used "cygwin". I changed them all to use "cygwin". Shouldn't affect anything in practice. configure.ac | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) commit 85a6dfed53477906bfe9a7c0123dd412e391cb48 Author: Lasse Collin Date: 2015-08-11 13:21:52 +0300 Build: Support building of MSYS2 binaries. configure.ac | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) commit 77f270be8432df2e4516a0c48814b6976d6618c5 Author: Lasse Collin Date: 2015-08-09 21:06:26 +0300 Windows: Define DLL_EXPORT when building liblzma.dll with MSVC. src/liblzma/common/common.h uses it to set __declspec(dllexport) for the API symbols. Thanks to Adam Walling. windows/liblzma_dll.vcxproj | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) commit 8c975446c5903090a5a8493b5b96b71003056a88 Author: Lasse Collin Date: 2015-08-09 21:02:20 +0300 Windows: Omit unneeded header files from MSVC project files. windows/liblzma.vcxproj | 5 ----- windows/liblzma_dll.vcxproj | 5 ----- 2 files changed, 10 deletions(-) commit 119a00434954726ca58e4a578e6469f530fca30e Author: Lasse Collin Date: 2015-07-12 20:48:19 +0300 liblzma: A MSVC-specific hack isn't needed with MSVC 2013 and newer. 
src/liblzma/api/lzma.h | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) commit d4e7c557fcab353539c9481a8d95cb04bcb15c7c Author: Lasse Collin Date: 2015-06-19 20:38:55 +0300 Update THANKS. THANKS | 2 ++ 1 file changed, 2 insertions(+) commit 98001740ca56c894a7bd32eb47e9857a8a7d878d Author: Lasse Collin Date: 2015-06-19 20:21:30 +0300 Windows: Update the docs. INSTALL | 29 ++++++++----- windows/INSTALL-MSVC.txt | 47 ++++++++++++++++++++++ windows/{INSTALL-Windows.txt => INSTALL-MinGW.txt} | 2 +- 3 files changed, 67 insertions(+), 11 deletions(-) commit 28195e4c877007cc760ecea1d17f740693d66873 Author: Lasse Collin Date: 2015-06-19 17:25:31 +0300 Windows: Add MSVC project files for building liblzma. Thanks to Adam Walling for creating these files. windows/liblzma.vcxproj | 359 ++++++++++++++++++++++++++++++++++++++++ windows/liblzma_dll.vcxproj | 388 ++++++++++++++++++++++++++++++++++++++++++++ windows/xz_win.sln | 48 ++++++ 3 files changed, 795 insertions(+) commit 960440f3230dc628f6966d9f7614fc1b28baf44e Author: Lasse Collin Date: 2015-05-13 20:57:55 +0300 Tests: Fix a memory leak in test_bcj_exact_size. Thanks to Cristian Rodríguez. tests/test_bcj_exact_size.c | 1 + 1 file changed, 1 insertion(+) commit 68cd35acafbdcdf4e8ea8b5bb843c736939d6f8b Author: Lasse Collin Date: 2015-05-12 18:08:24 +0300 Fix NEWS about threading in 5.2.0. Thanks to Andy Hochhaus. NEWS | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) commit ff96ed6d25786728356017a13baf8c14731b4f1e Author: Lasse Collin Date: 2015-05-11 21:26:16 +0300 xz: Document that threaded decompression hasn't been implemented yet. src/xz/xz.1 | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) commit 00d37b64a64ea8597fd2422d5187afd761ab9531 Author: Lasse Collin Date: 2015-04-20 20:20:29 +0300 Update THANKS. 
THANKS | 1 + 1 file changed, 1 insertion(+) commit db190a832c49ca3aed6d69cc992fa5583cae7b11 Author: Lasse Collin Date: 2015-04-20 19:59:18 +0300 Revert "xz: Use pipe2() if available." This reverts commit 7a11c4a8e5e15f13d5fa59233b3172e65428efdd. It is a problem when libc has pipe2() but the kernel is too old to have pipe2() and thus pipe2() fails. In xz it's pointless to have a fallback for non-functioning pipe2(); it's better to avoid pipe2() completely. Thanks to Michael Fox for the bug report. configure.ac | 4 ++-- src/xz/file_io.c | 9 +-------- 2 files changed, 3 insertions(+), 10 deletions(-) commit eccd8155e107c5ada03d13e7730675cdf1a44ddc Author: Lasse Collin Date: 2015-03-29 22:14:47 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 25263fd9e7a8a913395cb93d7c104cd48c2b4a00 Author: Lasse Collin Date: 2015-03-29 22:13:48 +0300 Fix the detection of installed RAM on QNX. The earlier version compiled but didn't actually work since sysconf(_SC_PHYS_PAGES) always fails (or so I was told). Thanks to Ole André Vadla Ravnås for the patch and testing. m4/tuklib_physmem.m4 | 6 +++--- src/common/tuklib_physmem.c | 14 +++++++++++++- 2 files changed, 16 insertions(+), 4 deletions(-) commit 4c544d2410903d38402221cb783ed85585b6a007 Author: Lasse Collin Date: 2015-03-27 22:39:07 +0200 Fix CPU core count detection on QNX. It tried to use sysctl() on QNX but - it broke the build because sysctl() needs -lsocket on QNX; - sysctl() doesn't work for detecting the core count on QNX even if it compiled. sysconf() works. An alternative would have been to use QNX-specific SYSPAGE_ENTRY(num_cpu) from . Thanks to Ole André Vadla Ravnås. m4/tuklib_cpucores.m4 | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) commit e0ea6737b03e83ccaff4514d00e31bb926f8f0f3 Author: Lasse Collin Date: 2015-03-07 22:05:57 +0200 xz: size_t/uint32_t cleanup in options.c. 
src/xz/options.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) commit 8bcca29a65335fd679c13814b70b35b68fa5daed Author: Lasse Collin Date: 2015-03-07 22:04:23 +0200 xz: Fix a comment and silence a warning in message.c. src/xz/message.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) commit f243f5f44c6b19a7c289a0ec73a03ee08364cb5b Author: Lasse Collin Date: 2015-03-07 22:01:00 +0200 liblzma: Silence more uint32_t vs. size_t warnings. src/liblzma/lz/lz_encoder.c | 2 +- src/liblzma/lzma/lzma_encoder.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) commit 7f0a4c50f4a374c40acf4b86848f301ad1e82d34 Author: Lasse Collin Date: 2015-03-07 19:54:00 +0200 xz: Make arg_count an unsigned int to silence a warning. Actually the value of arg_count cannot exceed INT_MAX but it's nicer as an unsigned int. src/xz/args.h | 2 +- src/xz/main.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) commit f6ec46801588b1be29c07c9db98558b521304002 Author: Lasse Collin Date: 2015-03-07 19:33:17 +0200 liblzma: Fix a warning in index.c. src/liblzma/common/index.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) commit a24518971cc621315af142dd3bb7614fab04ad27 Author: Lasse Collin Date: 2015-02-26 20:46:14 +0200 Build: Fix a CR+LF problem when running autoreconf -fi on OS/2. build-aux/version.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit dec11497a71518423b5ff0e759100cf8aadf6c7b Author: Lasse Collin Date: 2015-02-26 16:53:44 +0200 Bump version and soname for 5.2.1. src/liblzma/Makefile.am | 2 +- src/liblzma/api/lzma/version.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) commit 29e39c79975ab89ee5dd671e97064534a9f3a649 Author: Lasse Collin Date: 2015-02-26 13:01:09 +0200 Update NEWS for 5.2.1. NEWS | 14 ++++++++++++++ 1 file changed, 14 insertions(+) commit 7a11c4a8e5e15f13d5fa59233b3172e65428efdd Author: Lasse Collin Date: 2015-02-22 19:38:48 +0200 xz: Use pipe2() if available. 
configure.ac | 4 ++-- src/xz/file_io.c | 9 ++++++++- 2 files changed, 10 insertions(+), 3 deletions(-) commit 117d962685c72682c63edc9bb765367189800202 Author: Lasse Collin Date: 2015-02-21 23:40:26 +0200 liblzma: Fix a compression-ratio regression in LZMA1/2 in fast mode. The bug was added in the commit f48fce093b07aeda95c18850f5e086d9f2383380 and thus affected 5.1.4beta and 5.2.0. Luckily the bug cannot cause data corruption or other nasty things. src/liblzma/lzma/lzma_encoder_optimum_fast.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit ae984e31c167d3bc52972ec422dd1ebd5f5d5719 Author: Lasse Collin Date: 2015-02-21 23:00:19 +0200 xz: Fix the fcntl() usage when creating a pipe for the self-pipe trick. Now it reads the old flags instead of blindly setting O_NONBLOCK. The old code may have worked correctly, but this is better. src/xz/file_io.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) commit 2205bb5853098aea36a56df6f5747037175f66b4 Author: Lasse Collin Date: 2015-02-10 15:29:34 +0200 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit d935b0cdf3db440269b9d952b2b281b18f8c7b08 Author: Lasse Collin Date: 2015-02-10 15:28:30 +0200 tuklib_cpucores: Use cpuset_getaffinity() on FreeBSD if available. In FreeBSD, cpuset_getaffinity() is the preferred way to get the number of available cores. Thanks to Rui Paulo for the patch. I edited it slightly, but hopefully I didn't break anything. m4/tuklib_cpucores.m4 | 23 ++++++++++++++++++++++- src/common/tuklib_cpucores.c | 18 ++++++++++++++++++ 2 files changed, 40 insertions(+), 1 deletion(-) commit eb61bc58c20769cac4d05f363b9c0e8c9c71a560 Author: Lasse Collin Date: 2015-02-09 22:08:37 +0200 xzdiff: Make the mktemp usage compatible with FreeBSD's mktemp. Thanks to Rui Paulo for the fix. 
src/scripts/xzdiff.in | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) commit b9a5b6b7a29029680af733082b6a46e0fc01623a Author: Lasse Collin Date: 2015-02-03 21:45:53 +0200 Add a few casts to tuklib_integer.h to silence possible warnings. I heard that Visual Studio 2013 gave warnings without the casts. Thanks to Gabi Davar. src/common/tuklib_integer.h | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) commit c45757135f40e4a0de730ba5fff0100219493982 Author: Lasse Collin Date: 2015-01-26 21:24:39 +0200 liblzma: Set LZMA_MEMCMPLEN_EXTRA depending on the compare method. src/liblzma/common/memcmplen.h | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) commit 3c500174ed5485f550972a2a6109c361e875f069 Author: Lasse Collin Date: 2015-01-26 20:40:16 +0200 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit fec88d41e672d9e197c9442aecf02bd0dfa6d516 Author: Lasse Collin Date: 2015-01-26 20:39:28 +0200 liblzma: Silence harmless Valgrind errors. Thanks to Torsten Rupp for reporting this. I had forgotten to run Valgrind before the 5.2.0 release. src/liblzma/lz/lz_encoder.c | 6 ++++++ 1 file changed, 6 insertions(+) commit a9b45badfec0928d20a27c7176c005fa637f7d1e Author: Lasse Collin Date: 2015-01-09 21:50:19 +0200 xz: Fix comments. src/xz/file_io.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) commit 541aee6dd4aa97a809aba281475a21b641bb89e2 Author: Lasse Collin Date: 2015-01-09 21:35:06 +0200 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 4170edc914655310d2363baccf5e615e09b04911 Author: Lasse Collin Date: 2015-01-09 21:34:06 +0200 xz: Don't fail if stdout doesn't support O_NONBLOCK. This is similar to the case with stdin. Thanks to Brad Smith for the bug report and testing on OpenBSD. 
src/xz/file_io.c | 36 +++++++++++++++--------------------- 1 file changed, 15 insertions(+), 21 deletions(-) commit 04bbc0c2843c50c8ad1cba42b937118e38b0508d Author: Lasse Collin Date: 2015-01-07 19:18:20 +0200 xz: Fix a memory leak in DOS-specific code. src/xz/file_io.c | 2 ++ 1 file changed, 2 insertions(+) commit f0f1f6c7235ffa901cf76fe18e33749e200b3eea Author: Lasse Collin Date: 2015-01-07 19:08:06 +0200 xz: Don't fail if stdin doesn't support O_NONBLOCK. It's a problem at least on OpenBSD which doesn't support O_NONBLOCK on e.g. /dev/null. I'm not surprised if it's a problem on other OSes too since this behavior is allowed in POSIX.1-2008. The code relying on this behavior was committed in June 2013 and included in 5.1.3alpha released on 2013-10-26. Clearly the development releases only get limited testing. src/xz/file_io.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) commit d2d484647d9d9d679f03c75abb0404f67069271c Author: Lasse Collin Date: 2015-01-06 20:30:15 +0200 Tests: Don't hide unexpected error messages in test_files.sh. Hiding them makes no sense since normally there's no error when testing the "good" files. With "bad" files errors are expected and then it makes sense to keep the messages hidden. tests/test_files.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) commit aae6a6aeda51cf94a47e39ad624728f9bee75e30 Author: Lasse Collin Date: 2014-12-30 11:17:16 +0200 Update Solaris notes in INSTALL. Mention the possible "make check" failure on Solaris in the Solaris-specific section of INSTALL. It was already in section 4.5 but it is better to mention it in the OS-specific section too. INSTALL | 4 ++++ 1 file changed, 4 insertions(+) commit 7815112153178800a3521b9f31960e7cdc26cfba Author: Lasse Collin Date: 2014-12-26 12:00:05 +0200 Build: POSIX shell isn't required if scripts are disabled.
INSTALL | 3 ++- configure.ac | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) commit a0cd05ee71d330b79ead6eb9222e1b24e1559d3a Author: Lasse Collin Date: 2014-12-21 20:48:37 +0200 DOS: Update Makefile. dos/Makefile | 1 + 1 file changed, 1 insertion(+) commit b85ee0905ec4ab7656d22e63519fdd3bedb21f2e Author: Lasse Collin Date: 2014-12-21 19:50:38 +0200 Windows: Fix bin_i486 to bin_i686 in build.bash. windows/build.bash | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit cbafa710918195dbba3db02c3fab4f0538235206 Author: Lasse Collin Date: 2014-12-21 18:58:44 +0200 Docs: Use lzma_cputhreads() in 04_compress_easy_mt.c. doc/examples/04_compress_easy_mt.c | 30 ++++++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-) commit 8dbb57238d372c7263cfeb3e7f7fd9a73173156a Author: Lasse Collin Date: 2014-12-21 18:56:44 +0200 Docs: Update docs/examples/00_README.txt. doc/examples/00_README.txt | 4 ++++ 1 file changed, 4 insertions(+) commit 6060f7dc76fd6c2a8a1f8e85d0e4d86bb78273e6 Author: Lasse Collin Date: 2014-12-21 18:11:17 +0200 Bump version and soname for 5.2.0. I know that soname != app version, but I skip AGE=1 in -version-info to make the soname match the liblzma version anyway. It doesn't hurt anything as long as it doesn't conflict with library versioning rules. src/liblzma/Makefile.am | 2 +- src/liblzma/api/lzma/version.h | 6 +++--- src/liblzma/liblzma.map | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) commit 3e8bd1d15e417f2d588e9be50ce027ee3d48b2da Author: Lasse Collin Date: 2014-12-21 18:05:03 +0200 Avoid variable-length arrays in the debug programs. debug/full_flush.c | 3 ++- debug/sync_flush.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) commit 72f7307cfdceb941aeb2bf30d424cc0d13621786 Author: Lasse Collin Date: 2014-12-21 18:01:45 +0200 Build: Include 04_compress_easy_mt.c in the tarball. 
Makefile.am | 1 + 1 file changed, 1 insertion(+) commit 2cb82ff21c62def11f3683a8bb0aaf363102aaa0 Author: Lasse Collin Date: 2014-12-21 18:00:38 +0200 Fix build when --disable-threads is used. src/common/mythread.h | 2 ++ 1 file changed, 2 insertions(+) commit 9b9e3536e458ef958f66b0e8982efc9d36de4d17 Author: Adrien Nader Date: 2014-12-21 15:56:15 +0100 po/fr: improve wording for help for --lzma1/--lzma2. po/fr.po | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit a8b6b569e7fadbf5b5b9139d53bc764015c15027 Author: Adrien Nader Date: 2014-12-21 15:55:48 +0100 po/fr: missing line in translation of --extreme. po/fr.po | 1 + 1 file changed, 1 insertion(+) commit f168a6fd1a888cf4f0caaddcafcb21dadc6ab6e9 Author: Lasse Collin Date: 2014-12-21 14:32:33 +0200 Update NEWS for 5.2.0. NEWS | 65 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+) commit cec2ee863b3a88f4bf039cb00f73c4a4fc93a429 Author: Lasse Collin Date: 2014-12-21 14:32:22 +0200 Update NEWS for 5.0.8. NEWS | 12 ++++++++++++ 1 file changed, 12 insertions(+) commit 42e97a32649bf53ce43be2258b902a417c6e7fa1 Author: Lasse Collin Date: 2014-12-21 14:07:54 +0200 xz: Fix a comment. src/xz/options.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) commit 29b95d5d6665cedffa6a9d6d3d914f981e852182 Author: Lasse Collin Date: 2014-12-20 20:43:14 +0200 Update INSTALL about the dependencies of the scripts. INSTALL | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) commit 3af91040bb42c21afbb81f5568c3313125e61192 Author: Lasse Collin Date: 2014-12-20 20:42:33 +0200 Windows: Update build instructions. INSTALL | 15 +++++++++------ windows/INSTALL-Windows.txt | 44 +++++++++++++++++++++----------------------- 2 files changed, 30 insertions(+), 29 deletions(-) commit 0152f72bf6289d744823dc6c849538f3a139ad70 Author: Lasse Collin Date: 2014-12-20 20:41:48 +0200 Windows: Update the build script and README-Windows.txt. 
The 32-bit build is now for i686 or newer because the prebuilt MinGW-w64 toolchains include i686 code in the executables even if one uses -march=i486. The build script builds 32-bit SSE2 enabled version too. Run-time detection of SSE2 support would be nice (on any OS) but it's not implemented in XZ Utils yet. windows/README-Windows.txt | 30 ++++++++++++++++-------------- windows/build.bash | 23 ++++++++++++++--------- 2 files changed, 30 insertions(+), 23 deletions(-) commit 4a1f6133ee5533cee8d91e06fcc22443e5f1881a Author: Lasse Collin Date: 2014-12-19 15:51:50 +0200 Windows: Define TUKLIB_SYMBOL_PREFIX in config.h. It is to keep all symbols in the lzma_ namespace. windows/config.h | 3 +++ 1 file changed, 3 insertions(+) commit 7f7d093de79eee0c7dbfd7433647e46302f19f82 Author: Lasse Collin Date: 2014-12-16 21:00:09 +0200 xz: Update the man page about --threads. src/xz/xz.1 | 5 ----- 1 file changed, 5 deletions(-) commit 009823448b82aa5f465668878a544c5842885407 Author: Lasse Collin Date: 2014-12-16 20:57:43 +0200 xz: Update the man page about --block-size. src/xz/xz.1 | 41 +++++++++++++++++++++++++++++++++-------- 1 file changed, 33 insertions(+), 8 deletions(-) commit 7dddfbeb499e528940bc12047355c184644aafe9 Author: Adrien Nader Date: 2014-12-10 22:26:57 +0100 po/fr: several more translation updates: reword and handle --ignore-check. po/fr.po | 50 ++++++++++++++++++++++++++------------------------ 1 file changed, 26 insertions(+), 24 deletions(-) commit 6eca5be40e04ddc4b738d493e4e56835956d8b69 Author: Adrien Nader Date: 2014-12-10 22:23:01 +0100 po/fr: yet another place where my email address had to be updated. po/fr.po | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit d1003673e92ba47edd6aeeb3dbea05c18269d0e7 Author: Adrien Nader Date: 2014-12-10 22:22:20 +0100 po/fr: fix several typos that have been around since the beginning. 
po/fr.po | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) commit 4c5aa911a0df027e46171e368debc543d2fa72b2 Author: Adrien Nader Date: 2014-12-03 20:02:31 +0100 po/fr: last batch of new translations for now. Four new error messages. po/fr.po | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) commit 3e3099e36d27059499e7996fb38a62e8ab01d356 Author: Adrien Nader Date: 2014-12-03 20:01:32 +0100 po/fr: translations for --threads, --block-size and --block-list. po/fr.po | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) commit e7d96a5933eec4e9d4a62569ee88df0ebb0f1d53 Author: Adrien Nader Date: 2014-12-03 20:00:53 +0100 po/fr: remove fuzzy marker for error messages that will be kept in English. The following is a copy of a comment inside fr.po: Note from translator on "file status flags". The following entry is kept un-translated on purpose. It is difficult to translate and should only happen in exceptional circumstances which means that translating would: - lose some of the meaning - make it more difficult to look up in search engines; it might happen one in a million times, if we dilute the error message in 20 languages, it will be almost impossible to find an explanation and support for the error. po/fr.po | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) commit 46cbb9033af8a21fafe543302d6919746e0d72af Author: Adrien Nader Date: 2014-12-03 19:58:25 +0100 po/fr: several minor updates and better wording. Meaning doesn't change at all: it's only for better wording and/or formatting of a few strings. po/fr.po | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) commit 7ce49d444f04e73145f79c832eb4d510594b074a Author: Adrien Nader Date: 2014-12-03 19:56:12 +0100 po/fr: update my email address and copyright years. 
po/fr.po | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) commit 214c553ebc3047cd720da1ce5c80cf7c38118d3c Author: Adrien Nader Date: 2014-11-26 10:08:26 +0100 fr.po: commit file after only "update-po" so actual is readable. po/fr.po | 311 ++++++++++++++++++++++++++++++++++++++++----------------------- 1 file changed, 199 insertions(+), 112 deletions(-) commit 1190c641af09cde85f8bd0fbe5c4906f4a29431b Author: Lasse Collin Date: 2014-12-02 20:04:07 +0200 liblzma: Document how lzma_mt.block_size affects memory usage. src/liblzma/api/lzma/container.h | 4 ++++ 1 file changed, 4 insertions(+) commit e4fc1d2f9571fba79ce383595be2ea2a9257def0 Author: Lasse Collin Date: 2014-11-28 20:07:18 +0200 Update INSTALL about a "make check" failure in test_scripts.sh. INSTALL | 24 +++++++++++++++++------- 1 file changed, 17 insertions(+), 7 deletions(-) commit 34f9e40a0a0c3bd2c2730cdb9cd550bbb8a3f2fe Author: Lasse Collin Date: 2014-11-26 20:12:27 +0200 Remove LZMA_UNSTABLE macro. src/liblzma/api/lzma/container.h | 4 ---- src/liblzma/common/common.h | 2 -- src/xz/private.h | 1 - 3 files changed, 7 deletions(-) commit 6d9c0ce9f2677b159e32b224aba5b535b304a705 Author: Lasse Collin Date: 2014-11-26 20:10:33 +0200 liblzma: Update lzma_stream_encoder_mt() API docs. src/liblzma/api/lzma/container.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) commit 2301f3f05dd9742f42cda8f0f318864f5dc39ab3 Author: Lasse Collin Date: 2014-11-25 12:32:05 +0200 liblzma: Verify the filter chain in threaded encoder initialization. This way an invalid filter chain is detected at the Stream encoder initialization instead of delaying it to the first call to lzma_code() which triggers the initialization of the actual filter encoder(s). src/liblzma/common/stream_encoder_mt.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) commit 107a263d5bb63cd3593fd6a5c938706539f84523 Author: Lasse Collin Date: 2014-11-17 19:11:49 +0200 Build: Update m4/ax_pthread.m4 from Autoconf Archive. 
m4/ax_pthread.m4 | 71 +++++++++++++++++++++++++++++++++++++------------------- 1 file changed, 47 insertions(+), 24 deletions(-) commit b13a781833399ff5726cfc997f3cb2f0acbdbf31 Author: Lasse Collin Date: 2014-11-17 18:52:21 +0200 Build: Replace obsolete AC_HELP_STRING with AS_HELP_STRING. configure.ac | 36 ++++++++++++++++++------------------ m4/tuklib_integer.m4 | 2 +- 2 files changed, 19 insertions(+), 19 deletions(-) commit 542cac122ed3550148a2af0033af22b757491378 Author: Lasse Collin Date: 2014-11-17 18:43:19 +0200 Build: Fix Autoconf warnings about escaped backquotes. Thanks to Daniel Richard G. for pointing out that it's good to sometimes run autoreconf -fi with -Wall. configure.ac | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) commit 7b03a15cea8cd4f19ed680b51c4bcbae3ce4142f Author: Lasse Collin Date: 2014-11-10 18:54:40 +0200 xzdiff: Use mkdir if mktemp isn't available. src/scripts/xzdiff.in | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) commit f8c13e5e3609581d5dd9f8777985ca07f2390ad7 Author: Lasse Collin Date: 2014-11-10 18:45:01 +0200 xzdiff: Create a temporary directory to hold a temporary file. This avoids the possibility of "File name too long" when creating a temp file when the input file name is very long. This also means that other users on the system can no longer see the input file names in /tmp (or whatever $TMPDIR is) since the temporary directory will have a generic name. This usually doesn't matter since on many systems one can see the arguments given to all processes anyway. The number of X chars passed to mktemp was increased from 6 to 10. Note that with some shells temp files or dirs won't be used at all. src/scripts/xzdiff.in | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) commit 7716dcf9df7f457500cb657314e7a9aea5fedb06 Author: Lasse Collin Date: 2014-11-10 15:38:47 +0200 liblzma: Fix lzma_mt.preset in lzma_stream_encoder_mt_memusage(). It read the filter chain from a wrong variable.
This is similar to the bug that was fixed in 9494fb6d0ff41c585326f00aa8f7fe58f8106a5e. src/liblzma/common/stream_encoder_mt.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) commit 230fa4a605542c84b4178a57381695a0af4e779b Author: Lasse Collin Date: 2014-11-10 14:49:55 +0200 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 4e4ae08bc7c1711e399c9f2d26eb375d39d08101 Author: Lasse Collin Date: 2014-10-29 21:28:25 +0200 Update .gitignore files. .gitignore | 2 ++ m4/.gitignore | 3 +++ 2 files changed, 5 insertions(+) commit c923b140b27d1a055db6284e10fd546ad1a7fcdb Author: Lasse Collin Date: 2014-10-29 21:15:35 +0200 Build: Prepare to support Automake's subdir-objects. Due to a bug in Automake, subdir-objects won't be enabled for now. http://debbugs.gnu.org/cgi/bugreport.cgi?bug=17354 Thanks to Daniel Richard G. for the original patches. configure.ac | 7 ++++++- src/Makefile.am | 22 +++++++++++++++++++++- src/liblzma/Makefile.am | 4 ++-- src/lzmainfo/Makefile.am | 4 ++-- src/xz/Makefile.am | 10 +++++----- src/xzdec/Makefile.am | 8 ++++---- 6 files changed, 40 insertions(+), 15 deletions(-) commit 08c2aa16bea0df82828f665d51fba2e0a5e8997f Author: Lasse Collin Date: 2014-10-24 20:09:29 +0300 Translations: Update the Italian translation. Thanks to Milo Casagrande. po/it.po | 452 ++++++++++++++++++++++++++++++++++++++------------------------- 1 file changed, 275 insertions(+), 177 deletions(-) commit 2f9f61aa83539c54ff6c118a2693890f0519b3dd Author: Lasse Collin Date: 2014-10-18 18:51:45 +0300 Translations: Update the Polish translation. Thanks to Jakub Bogusz. po/pl.po | 332 ++++++++++++++++++++++++++++++++++++++++----------------------- 1 file changed, 214 insertions(+), 118 deletions(-) commit 4f9d233f67aea25e532824d11b7642cf7dee7a76 Author: Andre Noll Date: 2014-10-14 17:30:30 +0200 l10n: de.po: Change translator email address. Although the old address is still working, the new one should be preferred.
So this commit changes all three places in de.po accordingly. Signed-off-by: Andre Noll po/de.po | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) commit 00502b2bedad43f0cc167ac17ae0608837ee196b Author: Andre Noll Date: 2014-10-14 17:30:29 +0200 l10n: de.po: Update German translation Signed-off-by: Andre Noll po/de.po | 531 +++++++++++++++++++++++++++++++++------------------------------ 1 file changed, 281 insertions(+), 250 deletions(-) commit 706b0496753fb609e69f1570ec603f11162189d1 Author: Andre Noll Date: 2014-10-14 17:30:28 +0200 l10n: de.po: Fix typo: Schießen -> Schließen. That's a funny one since "schießen" means to shoot :) Signed-off-by: Andre Noll po/de.po | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 7c32e6a935c3d7ee366abad1679bd5f322f0c7d4 Author: Lasse Collin Date: 2014-10-09 19:42:26 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 076258cc458f1e705041ac7a729b15ffe8c5214a Author: Lasse Collin Date: 2014-10-09 19:41:51 +0300 Add support for AmigaOS/AROS to tuklib_physmem(). Thanks to Fredrik Wikstrom. m4/tuklib_physmem.m4 | 3 ++- src/common/tuklib_physmem.c | 7 +++++++ 2 files changed, 9 insertions(+), 1 deletion(-) commit efa7b0a210e1baa8e128fc98c5443a944c39ad24 Author: Lasse Collin Date: 2014-10-09 18:42:14 +0300 xzgrep: Avoid passing both -q and -l to grep. The behavior of grep -ql varies: - GNU grep behaves like grep -q. - OpenBSD grep behaves like grep -l. POSIX doesn't make it 100 % clear what behavior is expected. Anyway, using both -q and -l at the same time makes no sense so both options simply should never be used at the same time. Thanks to Christian Weisgerber. 
src/scripts/xzgrep.in | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) commit 9c5f76098c9986b48d2fc574a0b764f4cde0c538 Author: Trần Ngọc Quân Date: 2014-09-25 09:22:45 +0700 l10n: vi.po: Update Vietnamese translation Signed-off-by: Trần Ngọc Quân po/vi.po | 136 +++++++++++++++++++++++++++++++++++++++------------------------ 1 file changed, 84 insertions(+), 52 deletions(-) commit c4911f2db36d811896c73c008b4218d8fa9a4730 Author: Lasse Collin Date: 2014-09-25 18:38:48 +0300 Build: Detect supported compiler warning flags better. Clang and nowadays also GCC accept any -Wfoobar option but then may give a warning that an unknown warning option was specified. To avoid adding unsupported warning options, the options are now tested with -Werror. Thanks to Charles Diza. configure.ac | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) commit 76e75522ed6f5c228d55587dee5a997893f6e474 Author: Lasse Collin Date: 2014-09-20 21:01:21 +0300 Update NEWS for 5.0.7. NEWS | 11 +++++++++++ 1 file changed, 11 insertions(+) commit d62028b4c1174fc67b6929f126f5eb24c018c700 Author: Lasse Collin Date: 2014-09-20 19:42:56 +0300 liblzma: Fix a portability problem in Makefile.am. POSIX supports $< only in inference rules (suffix rules). Using it elsewhere is a GNU make extension and doesn't work e.g. with OpenBSD make. Thanks to Christian Weisgerber for the patch. src/liblzma/Makefile.am | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit c35de31d4283edad3e57d37ffe939406542cb7bb Author: Lasse Collin Date: 2014-09-14 21:54:09 +0300 Bump the version number to 5.1.4beta. src/liblzma/api/lzma/version.h | 4 ++-- src/liblzma/liblzma.map | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) commit e9e097e22cacdaa23e5414fea7913535449cb340 Author: Lasse Collin Date: 2014-09-14 21:50:13 +0300 Update NEWS for 5.0.6 and 5.1.4beta. 
NEWS | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) commit 642f856bb8562ab66704b1e01ac7bc08b6d0a663 Author: Lasse Collin Date: 2014-09-14 21:02:41 +0300 Update TODO. TODO | 38 ++++++++++++++++++++++++++++++++++---- 1 file changed, 34 insertions(+), 4 deletions(-) commit 6b5e3b9eff5b8cedb2aac5f524d4d60fc8a48124 Author: Lasse Collin Date: 2014-08-05 22:32:36 +0300 xz: Add --ignore-check. src/xz/args.c | 7 +++++++ src/xz/args.h | 1 + src/xz/coder.c | 10 +++++++++- src/xz/message.c | 2 ++ src/xz/xz.1 | 19 +++++++++++++++++++ 5 files changed, 38 insertions(+), 1 deletion(-) commit 9adbc2ff373f979c917cdfd3679ce0ebd59f1040 Author: Lasse Collin Date: 2014-08-05 22:15:07 +0300 liblzma: Add support for LZMA_IGNORE_CHECK. src/liblzma/api/lzma/container.h | 24 ++++++++++++++++++++++++ src/liblzma/common/common.h | 1 + src/liblzma/common/stream_decoder.c | 14 ++++++++++++-- 3 files changed, 37 insertions(+), 2 deletions(-) commit 0e0f34b8e4f1c60ecaec15c2105982381cc9c3e6 Author: Lasse Collin Date: 2014-08-05 22:03:30 +0300 liblzma: Add support for lzma_block.ignore_check. Note that this slightly changes how lzma_block_header_decode() has been documented. Earlier it said that the .version is set to the lowest required value, but now it says that the .version field is kept unchanged if possible. In practice this doesn't affect any old code, because before this commit the only possible .version was 0. 
src/liblzma/api/lzma/block.h | 50 ++++++++++++++++++++++++------- src/liblzma/common/block_buffer_encoder.c | 2 +- src/liblzma/common/block_decoder.c | 18 ++++++++--- src/liblzma/common/block_encoder.c | 2 +- src/liblzma/common/block_header_decoder.c | 12 ++++++-- src/liblzma/common/block_header_encoder.c | 2 +- src/liblzma/common/block_util.c | 2 +- 7 files changed, 68 insertions(+), 20 deletions(-) commit 71e1437ab585b46f7a25f5a131557d3d1c0cbaa2 Author: Lasse Collin Date: 2014-08-04 19:25:58 +0300 liblzma: Use lzma_memcmplen() in the BT3 match finder. I had missed this when writing the commit 5db75054e900fa06ef5ade5f2c21dffdd5d16141. Thanks to Jun I Jin. src/liblzma/lz/lz_encoder_mf.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) commit 41dc9ea06e1414ebe8ef52afc8fc15b6e3282b04 Author: Lasse Collin Date: 2014-08-04 00:25:44 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 5dcffdbcc23a68abc3ac3539b30be71bc9b5af84 Author: Lasse Collin Date: 2014-08-03 21:32:25 +0300 liblzma: SHA-256: Optimize the Maj macro slightly. The Maj macro is used where multiple things are added together, so making Maj a sum of two expressions allows some extra freedom for the compiler to schedule the instructions. I learned this trick from . src/liblzma/check/sha256.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit a9477d1e0c6fd0e47e637d051e7b9e2a5d9af517 Author: Lasse Collin Date: 2014-08-03 21:08:12 +0300 liblzma: SHA-256: Optimize the way rotations are done. This looks weird because the rotations become sequential, but it helps quite a bit on both 32-bit and 64-bit x86: - It requires fewer instructions on two-operand instruction sets like x86. - It requires one register less which matters especially on 32-bit x86. I hope this doesn't hurt other archs. I didn't invent this idea myself, but I don't remember where I saw it first. 
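As a rough illustration of the sequential-rotation idea described in the entry above, the SHA-256 "big sigma 0" function can be written either as three independent rotations or as nested rotations of a running XOR. This is only a sketch and is not copied from sha256.c; the names rotr32(), sigma0_plain() and sigma0_sequential() are assumptions made for this example. Because rotation distributes over XOR, both forms give the same value.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits; n must be in the range 1..31. */
static inline uint32_t rotr32(uint32_t x, unsigned n)
{
	return (x >> n) | (x << (32 - n));
}

/* Textbook form: three independent rotations of x. */
static inline uint32_t sigma0_plain(uint32_t x)
{
	return rotr32(x, 2) ^ rotr32(x, 13) ^ rotr32(x, 22);
}

/*
 * Sequential form: each rotation feeds the next one.
 *   rotr(x ^ rotr(x ^ rotr(x, 9), 11), 2)
 *     = rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
 * since 11 + 2 = 13 and 9 + 11 + 2 = 22.
 */
static inline uint32_t sigma0_sequential(uint32_t x)
{
	return rotr32(x ^ rotr32(x ^ rotr32(x, 9), 11), 2);
}
```

The sequential form only ever needs the original value and one running temporary, which is presumably where the register saving mentioned above comes from.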
src/liblzma/check/sha256.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) commit 5a76c7c8ee9a0afbeedb1c211db9224260404347 Author: Lasse Collin Date: 2014-08-03 20:38:13 +0300 liblzma: SHA-256: Remove the GCC #pragma that became unneeded. The unrolling in the previous commit should avoid the situation where a compiler may think that an uninitialized variable might be accessed. src/liblzma/check/sha256.c | 5 ----- 1 file changed, 5 deletions(-) commit 9a096f8e57509775c331950b8351bbca77bdcfa8 Author: Lasse Collin Date: 2014-08-03 20:33:38 +0300 liblzma: SHA-256: Unroll a little more. This way a branch isn't needed for each operation to choose between blk0 and blk2, and still the code doesn't grow as much as it would with full unrolling. src/liblzma/check/sha256.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) commit bc7650d87bf27f85f1a2a806dc2db1780e09e6a5 Author: Lasse Collin Date: 2014-08-03 19:56:43 +0300 liblzma: SHA-256: Do the byteswapping without a temporary buffer. src/liblzma/check/sha256.c | 13 +------------ 1 file changed, 1 insertion(+), 12 deletions(-) commit 544aaa3d13554e8640f9caf7db717a96360ec0f6 Author: Lasse Collin Date: 2014-07-25 22:38:28 +0300 liblzma: Use lzma_memcmplen() in normal mode of LZMA. Two locations were not changed yet because the simplest change assumes that the initial "len" may be greater than "limit". src/liblzma/lzma/lzma_encoder_optimum_normal.c | 20 +++++--------------- 1 file changed, 5 insertions(+), 15 deletions(-) commit f48fce093b07aeda95c18850f5e086d9f2383380 Author: Lasse Collin Date: 2014-07-25 22:30:38 +0300 liblzma: Simplify LZMA fast mode code by using memcmp(). src/liblzma/lzma/lzma_encoder_optimum_fast.c | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-) commit 6bf5308e34e23dede5b301b1b9b4f131dacd9218 Author: Lasse Collin Date: 2014-07-25 22:29:49 +0300 liblzma: Use lzma_memcmplen() in fast mode of LZMA. 
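For readers unfamiliar with the idea behind a memcmplen-style helper, the following is a minimal portable sketch of a match-length comparison that checks eight bytes at a time and falls back to single bytes near the first difference. It is an illustration only: the name match_len() and its parameters are hypothetical, and the actual code in memcmplen.h is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the number of leading bytes (at most limit) that are equal in
 * a and b, given that the first len bytes are already known to match.
 */
static size_t match_len(const uint8_t *a, const uint8_t *b,
		size_t len, size_t limit)
{
	/* Compare eight bytes per step while a full word still fits. */
	while (len + 8 <= limit) {
		uint64_t x, y;
		memcpy(&x, a + len, sizeof(x)); /* safe unaligned load */
		memcpy(&y, b + len, sizeof(y));
		if (x != y)
			break;
		len += 8;
	}

	/* Finish byte by byte; this also locates the exact mismatch. */
	while (len < limit && a[len] == b[len])
		++len;

	return len;
}
```

A tuned version would typically locate the differing byte inside the word with a bit-scan instruction instead of the second loop; that part is left out here to keep the sketch independent of byte order and compiler builtins.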
src/liblzma/lzma/lzma_encoder_optimum_fast.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) commit 353212137e51e45b105a3a3fc2e6879f1cf0d492 Author: Lasse Collin Date: 2014-07-25 21:16:23 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 5db75054e900fa06ef5ade5f2c21dffdd5d16141 Author: Lasse Collin Date: 2014-07-25 21:15:07 +0300 liblzma: Use lzma_memcmplen() in the match finders. This doesn't change the match finder output. src/liblzma/lz/lz_encoder.c | 13 ++++++++++++- src/liblzma/lz/lz_encoder_mf.c | 33 +++++++++++---------------------- 2 files changed, 23 insertions(+), 23 deletions(-) commit e1c8f1d01f4a4e2136173edab2dc63c71ef038f4 Author: Lasse Collin Date: 2014-07-25 20:57:20 +0300 liblzma: Add lzma_memcmplen() for fast memory comparison. This commit just adds the function. Its uses will be in separate commits. This hasn't been tested much yet and it's perhaps a bit early to commit it but if there are bugs they should get found quite quickly. Thanks to Jun I Jin from Intel for help and for pointing out that string comparison needs to be optimized in liblzma. configure.ac | 13 +++ src/liblzma/common/Makefile.inc | 1 + src/liblzma/common/memcmplen.h | 170 ++++++++++++++++++++++++++++++++++++++++ 3 files changed, 184 insertions(+) commit 765735cf52e5123586e74a51b9c073b5257f631f Author: Lasse Collin Date: 2014-07-12 21:10:09 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 59da01785ef66c7e62f36e70ca808fd2824bb995 Author: Lasse Collin Date: 2014-07-12 20:06:08 +0300 Translations: Add Vietnamese translation. Thanks to Trần Ngọc Quân. po/LINGUAS | 1 + po/vi.po | 1007 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 1008 insertions(+) commit 17215f751c354852700e7f8592ccf319570a0721 Author: Lasse Collin Date: 2014-06-29 20:54:14 +0300 xz: Update the help message of a few options. 
Updated: --threads, --block-size, and --block-list Added: --flush-timeout src/xz/message.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) commit 96864a6ddf91ad693d102ea165f3d7918744d582 Author: Lasse Collin Date: 2014-06-18 22:07:06 +0300 xz: Use lzma_cputhreads() instead of own copy of tuklib_cpucores(). src/xz/Makefile.am | 1 - src/xz/hardware.c | 12 +++++++++--- 2 files changed, 9 insertions(+), 4 deletions(-) commit a115cc3748482e277f42a968baa3cd266f031dba Author: Lasse Collin Date: 2014-06-18 22:04:24 +0300 liblzma: Add lzma_cputhreads(). src/liblzma/Makefile.am | 8 +++++++- src/liblzma/api/lzma/hardware.h | 14 ++++++++++++++ src/liblzma/common/Makefile.inc | 1 + src/liblzma/common/hardware_cputhreads.c | 22 ++++++++++++++++++++++ src/liblzma/liblzma.map | 1 + 5 files changed, 45 insertions(+), 1 deletion(-) commit 3ce3e7976904fbab4e6482bafa442856f77a51fa Author: Lasse Collin Date: 2014-06-18 19:11:52 +0300 xz: Check for filter chain compatibility for --flush-timeout. This avoids LZMA_PROG_ERROR from lzma_code() with filter chains that don't support LZMA_SYNC_FLUSH. src/xz/coder.c | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) commit 381ac14ed79e5d38809f251705be8b3193bba417 Author: Lasse Collin Date: 2014-06-13 19:21:54 +0300 xzgrep: List xzgrep_expected_output in tests/Makefile.am. tests/Makefile.am | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) commit 4244b65b06d5ecaf6f9dd0387ac7e3166bd2364e Author: Lasse Collin Date: 2014-06-13 18:58:22 +0300 xzgrep: Improve the test script. Now it should be close to the functionality of the original version by Pavel Raiskup. 
tests/Makefile.am | 3 ++- tests/test_scripts.sh | 24 ++++++++++++++---------- tests/xzgrep_expected_output | 39 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 55 insertions(+), 11 deletions(-) commit 1e60f2c0a0ee6c18b02943ce56214799a70aac26 Author: Lasse Collin Date: 2014-06-11 21:03:25 +0300 xzgrep: Add a test for the previous fix. This is a simplified version of Pavel Raiskup's original patch. tests/test_scripts.sh | 26 ++++++++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-) commit ceca37901783988204caaf40dff4623d535cc789 Author: Lasse Collin Date: 2014-06-11 20:43:28 +0300 xzgrep: exit 0 when at least one file matches. Mimic the original grep behavior and return exit_success when at least one xz compressed file matches given pattern. Original bugreport: https://bugzilla.redhat.com/show_bug.cgi?id=1108085 Thanks to Pavel Raiskup for the patch. src/scripts/xzgrep.in | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) commit 8c19216baccb92d011694590df8a1262da2e980c Author: Lasse Collin Date: 2014-06-09 21:21:24 +0300 xz: Force single-threaded mode when --flush-timeout is used. src/xz/coder.c | 11 +++++++++++ 1 file changed, 11 insertions(+) commit 87f1a24810805187d7bbc8ac5512e7eec307ddf5 Author: Lasse Collin Date: 2014-05-25 22:05:39 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit da1718f266fcfc091e7bf08aae1bc986d0e6cc6b Author: Lasse Collin Date: 2014-05-25 21:45:56 +0300 liblzma: Use lzma_alloc_zero() in LZ encoder initialization. This avoids a memzero() call for a newly-allocated memory, which can be expensive when encoding small streams with an over-sized dictionary. To avoid using lzma_alloc_zero() for memory that doesn't need to be zeroed, lzma_mf.son is now allocated separately, which requires handling it separately in normalize() too. Thanks to Vincenzo Innocente for reporting the problem. 
src/liblzma/lz/lz_encoder.c | 84 ++++++++++++++++++++++-------------------- src/liblzma/lz/lz_encoder.h | 2 +- src/liblzma/lz/lz_encoder_mf.c | 31 +++++++++------- 3 files changed, 62 insertions(+), 55 deletions(-) commit 28af24e9cf2eb259997c85dce13d4c97b3daa47a Author: Lasse Collin Date: 2014-05-25 19:25:57 +0300 liblzma: Add the internal function lzma_alloc_zero(). src/liblzma/common/common.c | 21 +++++++++++++++++++++ src/liblzma/common/common.h | 6 ++++++ 2 files changed, 27 insertions(+) commit ed9ac85822c490e34b68c259afa0b385d21d1c40 Author: Lasse Collin Date: 2014-05-08 18:03:09 +0300 xz: Fix uint64_t vs. size_t which broke 32-bit build. Thanks to Christian Hesse. src/xz/coder.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit d716acdae3fa7996f9e68a7bac012e6d8d13dd02 Author: Lasse Collin Date: 2014-05-04 11:09:11 +0300 Docs: Update comments to refer to lzma/lzma12.h in example programs. doc/examples/03_compress_custom.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) commit 4d5b7b3fda31241ca86ed35e08e73f776ee916e0 Author: Lasse Collin Date: 2014-05-04 11:07:17 +0300 liblzma: Rename the private API header lzma/lzma.h to lzma/lzma12.h. It can be confusing that two header files have the same name. The public API file is still lzma.h. src/liblzma/api/Makefile.am | 2 +- src/liblzma/api/lzma.h | 2 +- src/liblzma/api/lzma/{lzma.h => lzma12.h} | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) commit 1555a9c5664afc7893a2b75e9970105437f01ef1 Author: Lasse Collin Date: 2014-04-25 17:53:42 +0300 Build: Fix the combination of --disable-xzdec --enable-lzmadec. In this case "make install" could fail if the man page directory didn't already exist at the destination. If it did exist, a dangling symlink was created there. Now the link is omitted instead. This isn't the best fix but it's better than the old behavior. 
src/xzdec/Makefile.am | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) commit 56056571df3377eaa6ae6233b3ccc5d72e81d43d Author: Lasse Collin Date: 2014-04-25 17:44:26 +0300 Build: Add --disable-doc to configure. INSTALL | 6 ++++++ Makefile.am | 2 ++ configure.ac | 6 ++++++ 3 files changed, 14 insertions(+) commit 6de61d8721097a6214810841aa85b08e303ac538 Author: Lasse Collin Date: 2014-04-24 18:06:24 +0300 Update INSTALL. Add a note about failing "make check". The source of the problem should be fixed in libtool (if it really is a libtool bug and not mine) but I'm unable to spend time on that for now. Thanks to Nelson H. F. Beebe for reporting the issue. Add a note about a possible need to run "ldconfig" after "make install". INSTALL | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) commit 54df428799a8d853639b753d0e6784694d73eb3e Author: Lasse Collin Date: 2014-04-09 17:26:10 +0300 xz: Rename a variable to avoid a namespace collision on Solaris. I don't know the details but I have an impression that there's no problem in practice if using GCC since people have built xz with GCC (without patching xz), but renaming the variable cannot hurt either. Thanks to Mark Ashley. src/xz/signals.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) commit 5876ca27daa1429676b1160007d9688266907f00 Author: Lasse Collin Date: 2014-01-29 20:19:41 +0200 Docs: Add example program for threaded encoding. I didn't add -DLZMA_UNSTABLE to Makefile so one has to specify it manually as long as LZMA_UNSTABLE is needed. doc/examples/04_compress_easy_mt.c | 184 +++++++++++++++++++++++++++++++++++++ doc/examples/Makefile | 3 +- 2 files changed, 186 insertions(+), 1 deletion(-) commit 9494fb6d0ff41c585326f00aa8f7fe58f8106a5e Author: Lasse Collin Date: 2014-01-29 20:13:51 +0200 liblzma: Fix lzma_mt.preset not working with lzma_stream_encoder_mt(). It read the filter chain from a wrong variable. 
src/liblzma/common/stream_encoder_mt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) commit 673a4cb53de3a715685cb1b836da57a3c7dcd43c Author: Lasse Collin Date: 2014-01-20 11:20:40 +0200 liblzma: Fix typo in a comment. src/liblzma/api/lzma/block.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit ad96a871a1470eb76d6233d3890ce9338047b7a3 Author: Lasse Collin Date: 2014-01-12 19:38:43 +0200 Windows: Add config.h for building liblzma with MSVC 2013. This is for building liblzma. Building the xz tool too requires a little more work. Maybe it will be supported, but for most MSVC users it's enough to be able to build liblzma. C99 support in MSVC 2013 is almost usable which is a big improvement over earlier versions. It's "almost" because there's a dumb bug that breaks mixed declarations after an "if" statement unless the "if" statement uses braces: https://connect.microsoft.com/VisualStudio/feedback/details/808650/visual-studio-2013-c99-compiler-bug https://connect.microsoft.com/VisualStudio/feedback/details/808472/c99-support-of-mixed-declarations-and-statements-fails-with-certain-types-and-constructs Hopefully it will get fixed. Then liblzma should be compilable with MSVC 2013 without patching. windows/config.h | 139 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 139 insertions(+) commit 3d5c090872fab4212b57c290e8ed4d02c78c1737 Author: Lasse Collin Date: 2014-01-12 17:41:14 +0200 xz: Fix a comment. src/xz/coder.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) commit 69fd4e1c932c7975476a0143c86e45d81b60d3f9 Author: Lasse Collin Date: 2014-01-12 17:04:33 +0200 Windows: Add MSVC defines for inline and restrict keywords. src/common/sysdefs.h | 10 ++++++++++ 1 file changed, 10 insertions(+) commit a19d9e8575ee6647cd9154cf1f20203f1330485f Author: Lasse Collin Date: 2014-01-12 16:44:52 +0200 liblzma: Avoid C99 compound literal arrays. MSVC 2013 doesn't like them.
Maybe they aren't so good for readability either since many aren't used to them. src/liblzma/lzma/lzma_encoder_presets.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) commit e28528f1c867b2ed4ac91195ad08efb9bb8a6263 Author: Lasse Collin Date: 2014-01-12 12:50:30 +0200 liblzma: Remove a useless C99ism from sha256.c. Unsurprisingly it makes no difference in compiled output. src/liblzma/check/sha256.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 5ad1effc45adfb7dabc9a98e79736077e6b7e2d5 Author: Lasse Collin Date: 2014-01-12 12:17:08 +0200 xz: Fix use of wrong variable. Since the only call to suffix_set() uses optarg as the argument, fixing this bug doesn't change the behavior of the program. src/xz/suffix.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 3e62c68d75b5a3fdd46dbb34bb335d73289860d5 Author: Lasse Collin Date: 2014-01-12 12:11:36 +0200 Fix typos in comments. src/common/mythread.h | 2 +- src/liblzma/check/crc32_fast.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) commit e90ea601fb72867ec04adf456cbe4bf9520fd412 Author: Lasse Collin Date: 2013-11-26 18:20:16 +0200 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit b22e94d8d15764416354e04729382a7371ae2c30 Author: Lasse Collin Date: 2013-11-26 18:20:09 +0200 liblzma: Document the need for block->check for lzma_block_header_decode(). Thanks to Tomer Chachamu. src/liblzma/api/lzma/block.h | 3 +++ 1 file changed, 3 insertions(+) commit d1cd8b1cb824b72421d1ee370e628024d2fcbec4 Author: Lasse Collin Date: 2013-11-12 16:38:57 +0200 xz: Update the man page about --block-size and --block-list. src/xz/xz.1 | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) commit 76be7c612e6bcc38724488ccc3b8bcb1cfec9f0a Author: Lasse Collin Date: 2013-11-12 16:30:53 +0200 Update THANKS. 
THANKS | 1 + 1 file changed, 1 insertion(+) commit dd750acbe2259d75444ef0f8da2d4bacc90d7afc Author: Lasse Collin Date: 2013-11-12 16:29:48 +0200 xz: Make --block-list and --block-size work together in single-threaded. Previously, --block-list and --block-size only worked together in threaded mode. Boundaries are specified by --block-list, but --block-size specifies the maximum size for a Block. Now this works in single-threaded mode too. Thanks to James M Leddy for the original patch. src/xz/coder.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 75 insertions(+), 15 deletions(-) commit ae222fe9805d0161d022d75ba8485dab8bf6d7d5 Author: Lasse Collin Date: 2013-10-26 13:26:14 +0300 Bump the version number to 5.1.3alpha. src/liblzma/api/lzma/version.h | 2 +- src/liblzma/liblzma.map | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) commit 2193837a6a597cd3bf4e9ddf49421a5697d8e155 Author: Lasse Collin Date: 2013-10-26 13:25:02 +0300 Update NEWS for 5.1.3alpha. NEWS | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) commit ed48e75e2763876173aef8902da407a8eb28854b Author: Lasse Collin Date: 2013-10-26 12:47:04 +0300 Update TODO. TODO | 4 ---- 1 file changed, 4 deletions(-) commit 841da0352d79a56a44796a4c39163429c9f039a3 Author: Lasse Collin Date: 2013-10-25 22:41:28 +0300 xz: Document behavior of --block-list with threads. This needs to be updated before 5.2.0. src/xz/xz.1 | 24 +++++++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-) commit 56feb8665b78c1032aabd53c619c62af51defe64 Author: Lasse Collin Date: 2013-10-22 20:03:12 +0300 xz: Document --flush-timeout=TIMEOUT on the man page. src/xz/xz.1 | 37 ++++++++++++++++++++++++++++++++++++- 1 file changed, 36 insertions(+), 1 deletion(-) commit ba413da1d5bb3324287cf3174922acd921165971 Author: Lasse Collin Date: 2013-10-22 19:51:55 +0300 xz: Take advantage of LZMA_FULL_BARRIER with --block-list. 
Now if --block-list is used in threaded mode, the encoder won't need to flush at each Block boundary specified via --block-list. This improves performance a lot, making threading helpful with --block-list. The flush timer was reset after LZMA_FULL_FLUSH but since LZMA_FULL_BARRIER doesn't flush, resetting the timer is no longer done. src/xz/coder.c | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) commit 0cd45fc2bc5537de287a0bc005e2d67467a92148 Author: Lasse Collin Date: 2013-10-02 20:05:23 +0300 liblzma: Support LZMA_FULL_FLUSH and _BARRIER in threaded encoder. Now --block-list=SIZES works in the threaded mode too, although the performance is still bad due to the use of LZMA_FULL_FLUSH instead of the new LZMA_FULL_BARRIER. src/liblzma/common/stream_encoder_mt.c | 55 ++++++++++++++++++++++++---------- 1 file changed, 39 insertions(+), 16 deletions(-) commit 97bb38712f414fabecca908af2e38a12570293fd Author: Lasse Collin Date: 2013-10-02 12:55:11 +0300 liblzma: Add LZMA_FULL_BARRIER support to single-threaded encoder. In the single-threaded encoder LZMA_FULL_BARRIER is simply an alias for LZMA_FULL_FLUSH. src/liblzma/api/lzma/base.h | 37 ++++++++++++++++++++++++++++++------- src/liblzma/common/common.c | 17 +++++++++++++++-- src/liblzma/common/common.h | 7 ++++++- src/liblzma/common/stream_encoder.c | 4 +++- 4 files changed, 54 insertions(+), 11 deletions(-) commit fef0c6b410c08e581c9178700a4e7599f0895ff9 Author: Lasse Collin Date: 2013-09-17 11:57:51 +0300 liblzma: Add block_buffer_encoder.h into Makefile.inc. This should have been in b465da5988dd59ad98fda10c2e4ea13d0b9c73bc. src/liblzma/common/Makefile.inc | 1 + 1 file changed, 1 insertion(+) commit 8083e03291b6d21c0f538163e187b4e8cd5594e4 Author: Lasse Collin Date: 2013-09-17 11:55:38 +0300 xz: Add a missing test for TUKLIB_DOSLIKE.
src/xz/file_io.c | 2 ++ 1 file changed, 2 insertions(+) commit 6b44b4a775fe29ecc7bcb7996e086e3bc09e5fd0 Author: Lasse Collin Date: 2013-09-17 11:52:28 +0300 Add native threading support on Windows. Now liblzma only uses "mythread" functions and types which are defined in mythread.h matching the desired threading method. Before Windows Vista, there is no direct equivalent to pthread condition variables. Since this package doesn't use pthread_cond_broadcast(), pre-Vista threading can still be kept quite simple. The pre-Vista code doesn't use anything that wasn't already available in Windows 95, so the binaries should run even on Windows 95 if someone happens to care. INSTALL | 41 ++- configure.ac | 118 ++++++-- src/common/mythread.h | 513 ++++++++++++++++++++++++++------- src/liblzma/common/stream_encoder_mt.c | 83 +++--- src/xz/coder.c | 8 +- windows/README-Windows.txt | 2 +- windows/build.bash | 23 +- 7 files changed, 573 insertions(+), 215 deletions(-) commit ae0ab74a88d5b9b15845f1d9a24ade4349a54f9f Author: Lasse Collin Date: 2013-09-11 14:40:35 +0300 Build: Remove a comment about Automake 1.10 from configure.ac. The previous commit supports silent rules and that requires Automake 1.11. configure.ac | 2 -- 1 file changed, 2 deletions(-) commit 72975df6c8c59aaf849138ab3606e8fb6970596a Author: Lasse Collin Date: 2013-09-09 20:37:03 +0300 Build: Create liblzma.pc in a src/liblzma/Makefile.am. Previously it was done in configure, but doing that goes against the Autoconf manual. Autoconf requires that it is possible to override e.g. prefix after running configure and that doesn't work correctly if liblzma.pc is created by configure. A potential downside of this change is that now e.g. libdir in liblzma.pc is a standalone string instead of being defined via ${prefix}, so if one overrides prefix when running pkg-config the libdir won't get the new value. I don't know if this matters in practice. Thanks to Vincent Torri. 
configure.ac | 1 - src/liblzma/Makefile.am | 20 ++++++++++++++++++++ 2 files changed, 20 insertions(+), 1 deletion(-) commit 1c2b6e7e8382ed390f53e140f160488bb2205ecc Author: Lasse Collin Date: 2013-08-04 15:24:09 +0300 Fix the previous commit which broke the build. Apparently I didn't even compile-test the previous commit. Thanks to Christian Hesse. src/common/tuklib_cpucores.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 124eb69c7857f618b4807588c51bc9ba21bf8691 Author: Lasse Collin Date: 2013-08-03 13:52:58 +0300 Windows: Add Windows support to tuklib_cpucores(). It is used for Cygwin too. I'm not sure if that is a good or bad idea. Thanks to Vincent Torri. m4/tuklib_cpucores.m4 | 19 +++++++++++++++++-- src/common/tuklib_cpucores.c | 13 ++++++++++++- 2 files changed, 29 insertions(+), 3 deletions(-) commit eada8a875ce3fd521cb42e4ace2624d3d49c5f35 Author: Anders F Bjorklund Date: 2013-08-02 15:59:46 +0200 macosx: separate liblzma package macosx/build.sh | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-) commit be0100d01ca6a75899d051bee00acf17e6dc0c15 Author: Anders F Bjorklund Date: 2013-08-02 15:58:44 +0200 macosx: set minimum to leopard macosx/build.sh | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) commit 416729e2d743f4b2fe9fd438eedeb98adce033c3 Author: Anders F Bjorklund Date: 2011-08-07 13:13:30 +0200 move configurables into variables macosx/build.sh | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-) commit 16581080e5f29f9a4e49efece21c5bf572323acc Author: Lasse Collin Date: 2013-07-15 14:08:41 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 3e2b198ba37b624efd9c7caee2a435dc986b46c6 Author: Lasse Collin Date: 2013-07-15 14:08:02 +0300 Build: Fix the detection of missing CRC32. Thanks to Vincent Torri. 
configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit dee6ad3d5915422bc30a6821efeacaeb8ca8ef00 Author: Lasse Collin Date: 2013-07-04 14:18:46 +0300 xz: Add preliminary support for --flush-timeout=TIMEOUT. When --flush-timeout=TIMEOUT is used, xz will use LZMA_SYNC_FLUSH if read() would block and at least TIMEOUT milliseconds have elapsed since the previous flush. This can be useful in realtime-like use cases where the data is simultaneously decompressed by another process (possibly on a different computer). If new uncompressed input data is produced slowly, without this option xz could buffer the data for a long time until it would become decompressible from the output. If TIMEOUT is 0, the feature is disabled. This is the default. This commit affects the compression side. Using xz for the decompression side for the above purpose doesn't work so well yet because there is quite a bit of input and output buffering when decompressing. Neither --long-help nor the man page has been updated yet. The details of this feature may change. src/xz/args.c | 7 +++++++ src/xz/coder.c | 46 +++++++++++++++++++++++++++++++++++----------- src/xz/file_io.c | 46 ++++++++++++++++++++++++++++++++++++---------- 3 files changed, 78 insertions(+), 21 deletions(-) commit fa381acaf9a29a8114e1c0a97de99bab9adb014e Author: Lasse Collin Date: 2013-07-04 13:41:03 +0300 xz: Don't set src_eof=true after an I/O error because it's useless. src/xz/file_io.c | 3 --- 1 file changed, 3 deletions(-) commit ea00545beace5b950f709ec21e46878e0f448678 Author: Lasse Collin Date: 2013-07-04 13:25:11 +0300 xz: Fix the test when to read more input. Testing for end of file was no longer correct after full flushing became possible with --block-size=SIZE and --block-list=SIZES. There was no bug in practice though because xz just made a few unneeded zero-byte reads.
src/xz/coder.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) commit 736903c64bef394c06685d79908e397bcb08b88f Author: Lasse Collin Date: 2013-07-04 12:51:57 +0300 xz: Move some of the timing code into mytime.[hc]. This switches units from microseconds to milliseconds. New clock_gettime(CLOCK_MONOTONIC) will be used if available. There is still a fallback to gettimeofday(). src/xz/Makefile.am | 2 ++ src/xz/coder.c | 5 +++ src/xz/message.c | 54 +++++++++------------------------ src/xz/mytime.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ src/xz/mytime.h | 47 ++++++++++++++++++++++++++++ src/xz/private.h | 1 + 6 files changed, 158 insertions(+), 40 deletions(-) commit 24edf8d807e24ffaa1e793114d94cca3b970027d Author: Lasse Collin Date: 2013-07-01 14:35:03 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit c0627b3fceacfa1ed162f5f55235360ea26f569a Author: Lasse Collin Date: 2013-07-01 14:34:11 +0300 xz: Silence a warning seen with _FORTIFY_SOURCE=2. Thanks to Christian Hesse. src/xz/file_io.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) commit 1936718bb38ee394bd89836fdd4eabc0beb02443 Author: Lasse Collin Date: 2013-06-30 19:40:11 +0300 Update NEWS for 5.0.5. NEWS | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) commit a37ae8b5eb6093a530198f109c6f7a538c80ecf0 Author: Lasse Collin Date: 2013-06-30 18:02:27 +0300 Man pages: Use similar syntax for synopsis as in xz. The man pages of lzmainfo, xzmore, and xzdec had similar constructs as the man page of xz had before the commit eb6ca9854b8eb9fbf72497c1cf608d6b19d2d494. Eric S. Raymond didn't mention these man pages in his bug report, but it's nice to be consistent. 
src/lzmainfo/lzmainfo.1 | 4 ++-- src/scripts/xzmore.1 | 6 +++--- src/xzdec/xzdec.1 | 10 +++++----- 3 files changed, 10 insertions(+), 10 deletions(-) commit cdba9ddd870ae72fd6219a125662c20ec997f86c Author: Lasse Collin Date: 2013-06-29 15:59:13 +0300 xz: Use non-blocking I/O for the output file. Now both reading and writing should be without race conditions with signals. There might still be signal handling issues left. Signals are blocked during many operations to avoid EINTR but it may cause problems e.g. if writing to stderr blocks when trying to display an error message. src/xz/file_io.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 49 insertions(+), 8 deletions(-) commit e61a5c95da3fe31281d959e5e842885a8ba2b5bd Author: Lasse Collin Date: 2013-06-28 23:56:17 +0300 xz: Fix return value type in io_write_buf(). It didn't affect the behavior of the code since -1 becomes true anyway. src/xz/file_io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 9dc319eabb34a826f4945f91c71620f14a60e9e2 Author: Lasse Collin Date: 2013-06-28 23:48:05 +0300 xz: Use the self-pipe trick to avoid a race condition with signals. It is possible that a signal to set user_abort arrives right before a blocking system call is made. In this case the call may block until another signal arrives, while the wanted behavior is to make xz clean up and exit as soon as possible. After this commit, the race condition is avoided with the input side which already uses non-blocking I/O. The output side still uses blocking I/O and thus has the race condition. src/xz/file_io.c | 56 ++++++++++++++++++++++++++++++++++++++++++++------------ src/xz/file_io.h | 8 ++++++++ src/xz/signals.c | 5 +++++ 3 files changed, 57 insertions(+), 12 deletions(-) commit 3541bc79d0cfabc0ad155c99bfdad1289f17fec3 Author: Lasse Collin Date: 2013-06-28 22:51:02 +0300 xz: Use non-blocking I/O for the input file.
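The self-pipe trick and the non-blocking input handling mentioned in the entries above can be sketched roughly as follows. This is a generic illustration, not the code in file_io.c: the names wake_pipe, on_signal(), wake_pipe_init() and wait_for_input() are assumptions made for this example. The signal handler writes one byte to a non-blocking pipe, and the waiting code polls both the input descriptor and the pipe, so a signal that arrives just before the wait still wakes it up.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stdbool.h>
#include <unistd.h>

static int wake_pipe[2] = { -1, -1 };
static volatile sig_atomic_t user_abort = 0;

/* Signal handler: set the flag and make the pipe readable. */
static void on_signal(int sig)
{
	const int saved_errno = errno;
	const unsigned char b = 0;
	(void)sig;
	user_abort = 1;
	if (write(wake_pipe[1], &b, 1) < 0) {
		/* Pipe is full, so a wakeup is already pending. */
	}
	errno = saved_errno;
}

static bool set_nonblock(int fd)
{
	const int flags = fcntl(fd, F_GETFL);
	return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

/* Create the pipe and make both ends non-blocking. */
static bool wake_pipe_init(void)
{
	if (pipe(wake_pipe) != 0)
		return false;
	return set_nonblock(wake_pipe[0]) && set_nonblock(wake_pipe[1]);
}

/*
 * Wait until fd is readable. Returns true if there is input to read,
 * false if a signal requested an abort or poll() failed.
 */
static bool wait_for_input(int fd)
{
	struct pollfd pfd[2] = {
		{ .fd = fd, .events = POLLIN },
		{ .fd = wake_pipe[0], .events = POLLIN },
	};

	while (!user_abort) {
		if (poll(pfd, 2, -1) == -1) {
			if (errno == EINTR)
				continue;
			return false;
		}
		if (pfd[1].revents != 0)
			return false; /* woken up by the signal handler */
		if (pfd[0].revents != 0)
			return true;
	}
	return false;
}
```

Checking the flag alone would leave a window between the check and the poll() call; the byte in the pipe closes that window because poll() then returns immediately.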
src/xz/file_io.c | 156 +++++++++++++++++++++++++++++++++++++++---------------- 1 file changed, 111 insertions(+), 45 deletions(-) commit 78673a08bed5066c81e8a8e90d20e670c28ecfd5 Author: Lasse Collin Date: 2013-06-28 18:46:13 +0300 xz: Remove an outdated NetBSD-specific comment. Nowadays errno == EFTYPE is documented in open(2). src/xz/file_io.c | 4 ---- 1 file changed, 4 deletions(-) commit a616fdad34b48b2932ef03fb87309dcc8b829527 Author: Lasse Collin Date: 2013-06-28 18:09:47 +0300 xz: Fix error detection of fcntl(fd, F_SETFL, flags) calls. POSIX says that fcntl(fd, F_SETFL, flags) returns -1 on error and "other than -1" on success. This is how it is documented e.g. on OpenBSD too. On Linux, success with F_SETFL is always 0 (at least according to fcntl(2) from man-pages 3.51). src/xz/file_io.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) commit 4a08a6e4c61c65ab763ab314100a6d7a3bb89298 Author: Lasse Collin Date: 2013-06-28 17:36:47 +0300 xz: Fix use of wrong variable in a fcntl() call. Due to a wrong variable name, when writing a sparse file to standard output, *all* file status flags were cleared (to the extent the operating system allowed it) instead of only clearing the O_APPEND flag. In practice this worked fine in the common situations on GNU/Linux, but I didn't check how it behaved elsewhere. The original flags were still restored correctly. I still changed the code to use a separate boolean variable to indicate when the flags should be restored instead of relying on a special value in stdout_flags. src/xz/file_io.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-) commit b790b435daa3351067f80a5973b647f8d55367a2 Author: Lasse Collin Date: 2013-06-28 14:55:37 +0300 xz: Fix assertion related to posix_fadvise(). Input file can be a FIFO or something else that doesn't support posix_fadvise() so don't check the return value even with an assertion. Nothing bad happens if the call to posix_fadvise() fails.
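The points made in the entries above about fcntl() and posix_fadvise() can be illustrated with a short sketch. It assumes a system that provides posix_fadvise(), and the function names clear_append() and hint_sequential() are hypothetical; this is not the code from file_io.c. The first function clears only O_APPEND and tests the F_SETFL result against -1, and the second treats posix_fadvise() purely as a hint.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdbool.h>

/*
 * Clear only O_APPEND and leave the other file status flags alone.
 * The original flags are stored in *orig_flags so that the caller
 * can restore them later.
 */
static bool clear_append(int fd, int *orig_flags)
{
	const int flags = fcntl(fd, F_GETFL);
	if (flags == -1)
		return false;

	*orig_flags = flags;

	/* Compare against -1 only; the success value isn't specified
	 * beyond being "other than -1". */
	return fcntl(fd, F_SETFL, flags & ~O_APPEND) != -1;
}

/*
 * Tell the kernel that fd will be read sequentially. This can fail,
 * for example when fd is a pipe or a FIFO, and that is harmless, so
 * the return value is deliberately ignored.
 */
static void hint_sequential(int fd)
{
	(void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}
```

Keeping the saved flags in a separate variable, with a boolean in the caller to record whether they need restoring, avoids overloading a special flag value for that purpose.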
src/xz/file_io.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) commit 84d2da6c9dc252f441deb7626c2522202b005d4d Author: Lasse Collin Date: 2013-06-26 13:30:57 +0300 xz: Check the value of lzma_stream_flags.version in --list. It is a no-op for now, but if an old xz version is used together with a newer liblzma that supports something new, then this check becomes important and will stop the old xz from trying to parse files that it won't understand. src/xz/list.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) commit 9376f5f8f762296f2173d61af9101112c36f38c0 Author: Lasse Collin Date: 2013-06-26 12:17:00 +0300 Build: Require Automake 1.12 and use serial-tests option. It should actually still work with Automake 1.10 if the serial-tests option is removed. Automake 1.13 started using parallel tests by default and the option to get the old behavior isn't supported before 1.12. At least for now, parallel tests don't improve anything in XZ Utils but they hide the progress output from test_compress.sh. configure.ac | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) commit b7e200d7bd0a3c7c171c13ad37d68296d6f73374 Author: Lasse Collin Date: 2013-06-23 18:59:13 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 46540e4c10923e363741ff5aab99e79fc0ce6ee8 Author: Lasse Collin Date: 2013-06-23 18:57:23 +0300 liblzma: Avoid a warning about a shadowed variable. On Mac OS X wait() is declared in that we include one way or other so don't use "wait" as a variable name. Thanks to Christian Kujau. src/liblzma/common/stream_encoder_mt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) commit ebb501ec73cecc546c67117dd01b5e33c90bfb4a Author: Lasse Collin Date: 2013-06-23 17:36:47 +0300 xz: Validate Uncompressed Size from Block Header in list.c. This affects only "xz -lvv". Normal decompression with xz already detected if Block Header and Index had mismatched Uncompressed Size fields. 
So this just makes "xz -lvv" show such files as corrupt instead of showing the Uncompressed Size from Index. src/xz/list.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) commit c09e91dd236d3cabee0fc48312b3dc8cceae41ab Author: Lasse Collin Date: 2013-06-21 22:08:11 +0300 Update THANKS. THANKS | 2 ++ 1 file changed, 2 insertions(+) commit eb6ca9854b8eb9fbf72497c1cf608d6b19d2d494 Author: Lasse Collin Date: 2013-06-21 22:04:45 +0300 xz: Make the man page more friendly to doclifter. Thanks to Eric S. Raymond. src/xz/xz.1 | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) commit 0c0a1947e6ad90a0a10b7a5c39f6ab99a0aa5c93 Author: Lasse Collin Date: 2013-06-21 21:54:59 +0300 xz: A couple of man page fixes. Now the interaction of presets and custom filter chains is described correctly. Earlier it contradicted itself. Thanks to DevHC who reported these issues on IRC to me on 2012-12-14. src/xz/xz.1 | 35 +++++++++++++++++++++++------------ 1 file changed, 23 insertions(+), 12 deletions(-) commit 2fcda89939c903106c429e109083d43d894049e0 Author: Lasse Collin Date: 2013-06-21 21:50:26 +0300 xz: Fix interaction between preset and custom filter chains. There was somewhat illogical behavior when --extreme was specified and mixed with custom filter chains. Before this commit, "xz -9 --lzma2 -e" was equivalent to "xz --lzma2". After it is equivalent to "xz -6e" (all earlier preset options get forgotten when a custom filter chain is specified and the default preset is 6 to which -e is applied). I find this less illogical. This also affects the meaning of "xz -9e --lzma2 -7". Earlier it was equivalent to "xz -7e" (the -e specified before a custom filter chain wasn't forgotten). Now it is "xz -7". Note that "xz -7e" still is the same as "xz -e7". Hopefully very few cared about this in the first place, so pretty much no one should even notice this change. Thanks to Conley Moorhous. 
src/xz/coder.c | 35 +++++++++++++++++++++-------------- 1 file changed, 21 insertions(+), 14 deletions(-) commit 97379c5ea758da3f8b0bc444d5f7fa43753ce610 Author: Lasse Collin Date: 2013-04-27 22:07:46 +0300 Build: Use -Wvla with GCC if supported. Variable-length arrays are mandatory in C99 but optional in C11. The code doesn't currently use any VLAs and it shouldn't in the future either to stay compatible with C11 without requiring any optional C11 features. configure.ac | 1 + 1 file changed, 1 insertion(+) commit 8957c58609d3987c58aa72b96c436cf565cc4917 Author: Lasse Collin Date: 2013-04-15 19:29:09 +0300 xzdec: Improve the --help message. The options are now ordered in the same order as in xz's help message. Descriptions were added to the options that are ignored. I left them in parentheses even if it looks a bit weird because I find it easier to spot the ignored vs. non-ignored options from the list that way. src/xzdec/xzdec.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) commit ed886e1a92534a24401d0e99c11f1dcff3b5220a Author: Lasse Collin Date: 2013-04-05 19:25:40 +0300 Update THANKS. THANKS | 2 ++ 1 file changed, 2 insertions(+) commit 5019413a055ce29e660dbbf15e02443cb5a26c59 Author: Jeff Bastian Date: 2013-04-03 13:59:17 +0200 xzgrep: make the '-h' option to be --no-filename equivalent * src/scripts/xzgrep.in: Accept the '-h' option in argument parsing. src/scripts/xzgrep.in | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 5ea900cb5ad862bca81316729f92357c1fc040ce Author: Lasse Collin Date: 2013-03-23 22:25:15 +0200 liblzma: Be less picky in lzma_alone_decoder(). To avoid false positives when detecting .lzma files, rare values in dictionary size and uncompressed size fields were rejected. They will still be rejected if .lzma files are decoded with lzma_auto_decoder(), but when using lzma_alone_decoder() directly, such files will now be accepted. Hopefully this is an OK compromise.
This doesn't affect xz because xz still has its own file format detection code. This does affect lzmadec though. So after this commit lzmadec will accept files that xz or xz-emulating-lzma doesn't. NOTE: lzma_alone_decoder() still won't decode all .lzma files because liblzma's LZMA decoder doesn't support lc + lp > 4. Reported here: http://sourceforge.net/projects/lzmautils/forums/forum/708858/topic/7068827 src/liblzma/common/alone_decoder.c | 22 ++++++++++++++-------- src/liblzma/common/alone_decoder.h | 5 +++-- src/liblzma/common/auto_decoder.c | 2 +- 3 files changed, 18 insertions(+), 11 deletions(-) commit bb117fffa84604b6e3811b068c80db82bf7f7b05 Author: Lasse Collin Date: 2013-03-23 21:55:13 +0200 liblzma: Use lzma_block_buffer_bound64() in threaded encoder. Now it uses lzma_block_uncomp_encode() if the data doesn't fit into the space calculated by lzma_block_buffer_bound64(). src/liblzma/common/stream_encoder_mt.c | 66 +++++++++++++++++++++++++--------- 1 file changed, 50 insertions(+), 16 deletions(-) commit e572e123b55b29527e54ce5f0807f115481d78b9 Author: Lasse Collin Date: 2013-03-23 21:51:38 +0200 liblzma: Fix another deadlock in the threaded encoder. This race condition could cause a deadlock if lzma_end() was called before finishing the encoding. This can happen with xz with debugging enabled (non-debugging version doesn't call lzma_end() before exiting). src/liblzma/common/stream_encoder_mt.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) commit b465da5988dd59ad98fda10c2e4ea13d0b9c73bc Author: Lasse Collin Date: 2013-03-23 19:17:33 +0200 liblzma: Add lzma_block_uncomp_encode(). This also adds a new internal function lzma_block_buffer_bound64() which is similar to lzma_block_buffer_bound() but uses uint64_t instead of size_t. 
src/liblzma/api/lzma/block.h | 18 ++++++ src/liblzma/common/block_buffer_encoder.c | 94 +++++++++++++++++++++---------- src/liblzma/common/block_buffer_encoder.h | 24 ++++++++ src/liblzma/liblzma.map | 1 + 4 files changed, 106 insertions(+), 31 deletions(-) commit 9e6dabcf22ef4679f4faaae15ebd5b137ae2fad1 Author: Lasse Collin Date: 2013-03-05 19:14:50 +0200 Avoid unneeded use of awk in xzless. Use "read" instead of "awk" in xzless to get the version number of "less". The need for awk was introduced in the commit db5c1817fabf7cbb9e4087b1576eb26f0747338e. Thanks to Ariel P for the patch. src/scripts/xzless.in | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) commit e7b424d267a34803db8d92a3515528be2ed45abd Author: Lasse Collin Date: 2012-12-14 20:13:32 +0200 Make the progress indicator smooth in threaded mode. This adds lzma_get_progress() to liblzma and takes advantage of it in xz. lzma_get_progress() collects progress information from the thread-specific structures so that fairly accurate progress information is available to applications. Adding a new function seemed to be a better way than making the information directly available in lzma_stream (like total_in and total_out are) because collecting the information requires locking mutexes. It's a waste of time to do it more often than the up to date information is actually needed by an application. src/liblzma/api/lzma/base.h | 22 +++++++++- src/liblzma/common/common.c | 16 +++++++ src/liblzma/common/common.h | 6 +++ src/liblzma/common/stream_encoder_mt.c | 77 +++++++++++++++++++++++++++++++--- src/liblzma/liblzma.map | 1 + src/xz/message.c | 20 +++++---- 6 files changed, 129 insertions(+), 13 deletions(-) commit 2ebbb994e367f55f2561aa7c9e7451703c171f2f Author: Lasse Collin Date: 2012-12-14 11:01:41 +0200 liblzma: Fix mythread_sync for nested locking.
src/common/mythread.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) commit 4c7e28705f6de418d19cc77324ef301f996e01ff Author: Lasse Collin Date: 2012-12-13 21:05:36 +0200 xz: Mention --threads in --help. Thanks to Olivier Delhomme for pointing out that this was still missing. src/xz/message.c | 4 ++++ 1 file changed, 4 insertions(+) commit db5c1817fabf7cbb9e4087b1576eb26f0747338e Author: Jonathan Nieder Date: 2012-11-19 00:10:10 -0800 xzless: Make "less -V" parsing more robust In v4.999.9beta~30 (xzless: Support compressed standard input, 2009-08-09), xzless learned to parse ‘less -V’ output to figure out whether less is new enough to handle $LESSOPEN settings starting with “|-”. That worked well for a while, but the version string from ‘less’ versions 448 (June, 2012) is misparsed, producing a warning: $ xzless /tmp/test.xz; echo $? /usr/bin/xzless: line 49: test: 456 (GNU regular expressions): \ integer expression expected 0 More precisely, modern ‘less’ lists the regexp implementation along with its version number, and xzless passes the entire version number with attached parenthetical phrase as a number to "test $a -gt $b", producing the above confusing message. $ less-444 -V | head -1 less 444 $ less -V | head -1 less 456 (no regular expressions) So relax the pattern matched --- instead of expecting "less <number>", look for a line of the form "less <number>[ (extra parenthetical)]". While at it, improve the behavior when no matching line is found --- instead of producing a cryptic message, we can fall back on a LESSPIPE setting that is supported by all versions of ‘less’. The implementation uses "awk" for simplicity. Hopefully that’s portable enough. Reported-by: Jörg-Volker Peetz Signed-off-by: Jonathan Nieder src/scripts/xzless.in | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) commit 65536214a31ecd33b6b03b68a351fb597d3703d6 Author: Lasse Collin Date: 2012-10-03 15:54:24 +0300 xz: Fix the note about --rsyncable on the man page.
src/xz/xz.1 | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) commit 3d93b6354927247a1569caf22ad27b07e97ee904 Author: Lasse Collin Date: 2012-09-28 20:11:09 +0300 xz: Improve handling of failed realloc in xrealloc. Thanks to Jim Meyering. src/xz/util.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) commit ab225620664e235637833be2329935f9d290ba80 Author: Lasse Collin Date: 2012-08-24 16:27:31 +0300 A few typo fixes to comments and the xz man page. Thanks to Jim Meyering. configure.ac | 2 +- src/liblzma/check/sha256.c | 1 - src/xz/xz.1 | 4 ++-- 3 files changed, 3 insertions(+), 4 deletions(-) commit f3c1ec69d910175ffd431fd82968dd35cec806ed Author: Lasse Collin Date: 2012-08-13 21:40:09 +0300 xz: Add a warning to --help about alpha and beta versions. src/xz/message.c | 5 +++++ 1 file changed, 5 insertions(+) commit d8eaf9d8278c23c2cf2b7ca5562d4de570d3b5db Author: Lasse Collin Date: 2012-08-02 17:13:30 +0300 Build: Bump gettext version requirement to 0.18. Otherwise too old version of m4/lib-link.m4 gets included when autoreconf -fi is run. configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 96e08902b09f0f304d4ff80c6e83ef7fff883f34 Author: Lasse Collin Date: 2012-07-17 18:29:08 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 3778db1be53e61ff285c573af5ee468803008456 Author: Lasse Collin Date: 2012-07-17 18:19:59 +0300 liblzma: Make the use of lzma_allocator const-correct. There is a tiny risk of causing breakage: If an application assigns lzma_stream.allocator to a non-const pointer, such code won't compile anymore. I don't know why anyone would do such a thing though, so in practice this shouldn't cause trouble. Thanks to Jan Kratochvil for the patch. 
src/liblzma/api/lzma/base.h | 4 +++- src/liblzma/api/lzma/block.h | 6 ++--- src/liblzma/api/lzma/container.h | 9 +++++--- src/liblzma/api/lzma/filter.h | 13 ++++++----- src/liblzma/api/lzma/index.h | 16 ++++++------- src/liblzma/api/lzma/index_hash.h | 4 ++-- src/liblzma/common/alone_decoder.c | 6 ++--- src/liblzma/common/alone_decoder.h | 2 +- src/liblzma/common/alone_encoder.c | 8 +++---- src/liblzma/common/auto_decoder.c | 6 ++--- src/liblzma/common/block_buffer_decoder.c | 2 +- src/liblzma/common/block_buffer_encoder.c | 4 ++-- src/liblzma/common/block_decoder.c | 6 ++--- src/liblzma/common/block_decoder.h | 2 +- src/liblzma/common/block_encoder.c | 8 +++---- src/liblzma/common/block_encoder.h | 2 +- src/liblzma/common/block_header_decoder.c | 4 ++-- src/liblzma/common/common.c | 10 ++++----- src/liblzma/common/common.h | 20 +++++++++-------- src/liblzma/common/easy_buffer_encoder.c | 4 ++-- src/liblzma/common/filter_buffer_decoder.c | 3 ++- src/liblzma/common/filter_buffer_encoder.c | 7 +++--- src/liblzma/common/filter_common.c | 4 ++-- src/liblzma/common/filter_common.h | 2 +- src/liblzma/common/filter_decoder.c | 7 +++--- src/liblzma/common/filter_decoder.h | 2 +- src/liblzma/common/filter_encoder.c | 2 +- src/liblzma/common/filter_encoder.h | 2 +- src/liblzma/common/filter_flags_decoder.c | 2 +- src/liblzma/common/index.c | 26 ++++++++++----------- src/liblzma/common/index_decoder.c | 12 +++++----- src/liblzma/common/index_encoder.c | 6 ++--- src/liblzma/common/index_encoder.h | 2 +- src/liblzma/common/index_hash.c | 6 +++-- src/liblzma/common/outqueue.c | 4 ++-- src/liblzma/common/outqueue.h | 5 +++-- src/liblzma/common/stream_buffer_decoder.c | 2 +- src/liblzma/common/stream_buffer_encoder.c | 3 ++- src/liblzma/common/stream_decoder.c | 9 ++++---- src/liblzma/common/stream_decoder.h | 5 +++-- src/liblzma/common/stream_encoder.c | 10 ++++----- src/liblzma/common/stream_encoder_mt.c | 16 ++++++------- src/liblzma/delta/delta_common.c | 4 ++-- 
src/liblzma/delta/delta_decoder.c | 6 ++--- src/liblzma/delta/delta_decoder.h | 5 +++-- src/liblzma/delta/delta_encoder.c | 6 ++--- src/liblzma/delta/delta_encoder.h | 3 ++- src/liblzma/delta/delta_private.h | 2 +- src/liblzma/lz/lz_decoder.c | 8 +++---- src/liblzma/lz/lz_decoder.h | 7 +++--- src/liblzma/lz/lz_encoder.c | 19 ++++++++-------- src/liblzma/lz/lz_encoder.h | 6 ++--- src/liblzma/lzma/lzma2_decoder.c | 8 +++---- src/liblzma/lzma/lzma2_decoder.h | 5 +++-- src/liblzma/lzma/lzma2_encoder.c | 6 ++--- src/liblzma/lzma/lzma2_encoder.h | 2 +- src/liblzma/lzma/lzma_decoder.c | 8 +++---- src/liblzma/lzma/lzma_decoder.h | 7 +++--- src/liblzma/lzma/lzma_encoder.c | 7 +++--- src/liblzma/lzma/lzma_encoder.h | 5 +++-- src/liblzma/simple/arm.c | 8 ++++--- src/liblzma/simple/armthumb.c | 8 ++++--- src/liblzma/simple/ia64.c | 8 ++++--- src/liblzma/simple/powerpc.c | 8 ++++--- src/liblzma/simple/simple_coder.c | 10 ++++----- src/liblzma/simple/simple_coder.h | 36 ++++++++++++++++++++---------- src/liblzma/simple/simple_decoder.c | 2 +- src/liblzma/simple/simple_decoder.h | 2 +- src/liblzma/simple/simple_private.h | 3 ++- src/liblzma/simple/sparc.c | 8 ++++--- src/liblzma/simple/x86.c | 8 ++++--- 71 files changed, 269 insertions(+), 219 deletions(-) commit d625c7cf824fd3b61c6da84f56179e94917ff603 Author: Lasse Collin Date: 2012-07-05 07:36:28 +0300 Tests: Remove tests/test_block.c that had gotten committed accidentally. tests/test_block.c | 52 ---------------------------------------------------- 1 file changed, 52 deletions(-) commit 0b09d266cce72bc4841933b171e79551e488927c Author: Lasse Collin Date: 2012-07-05 07:33:35 +0300 Build: Include macosx/build.sh in the distribution. It has been in the Git repository since 2010 but probably few people have seen it since it hasn't been included in the release tarballs. 
:-( Makefile.am | 1 + 1 file changed, 1 insertion(+) commit d6e0b23d4613b9f417893dd96cc168c8005ece3d Author: Lasse Collin Date: 2012-07-05 07:28:53 +0300 Build: Include validate_map.sh in the distribution. It's required by "make mydist". Fix also the location of EXTRA_DIST+= so that those files get distributed also if symbol versioning isn't enabled. src/liblzma/Makefile.am | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 19de545d86097c3954d69ab5d12820387f6a09bc Author: Lasse Collin Date: 2012-07-05 07:24:45 +0300 Docs: Fix the name LZMA Utils -> XZ Utils in debug/README. debug/README | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 672eccf57c31a40dfb956b7662db06d43e18618e Author: Lasse Collin Date: 2012-07-05 07:23:17 +0300 Include debug/translation.bash in the distribution. Also fix the script name mentioned in README. README | 4 ++-- debug/Makefile.am | 3 +++ 2 files changed, 5 insertions(+), 2 deletions(-) commit cafb523adac1caf305e70a04bc37f25602bf990c Author: Lasse Collin Date: 2012-07-04 22:31:58 +0300 xz: Document --block-list better. Thanks to Jonathan Nieder. src/xz/xz.1 | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) commit c7ff218528bc8f7c65e7ef73c6515777346c6794 Author: Lasse Collin Date: 2012-07-04 20:01:49 +0300 Bump the version number to 5.1.2alpha. src/liblzma/api/lzma/version.h | 2 +- src/liblzma/liblzma.map | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) commit 8f3c1d886f93e6478ad509ff52102b2ce7faa999 Author: Lasse Collin Date: 2012-07-04 20:01:19 +0300 Update NEWS for 5.1.2alpha. NEWS | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) commit 0d5fa05466e580fbc458820f87013ae7644e20e5 Author: Lasse Collin Date: 2012-07-04 19:58:23 +0300 xz: Fix the version number printed by xz -lvv. The decoder bug was fixed in 5.0.2 instead of 5.0.3. 
src/xz/list.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) commit df11317985a4165731dde12bb0f0028da0e7b77f Author: Lasse Collin Date: 2012-07-04 17:11:31 +0300 Build: Add a comment to configure.ac about symbol versioning. configure.ac | 4 ++++ 1 file changed, 4 insertions(+) commit bd9cc179e8be3ef515201d3ed9c7dd79ae88869d Author: Lasse Collin Date: 2012-07-04 17:06:49 +0300 Update TODO. TODO | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) commit 4a238dd9b22f462cac5e199828bf1beb0df05884 Author: Lasse Collin Date: 2012-07-04 17:05:46 +0300 Document --enable-symbol-versions in INSTALL. INSTALL | 5 +++++ 1 file changed, 5 insertions(+) commit 88ccf47205d7f3aa314d358c72ef214f10f68b43 Author: Lasse Collin Date: 2012-07-03 21:16:39 +0300 xz: Add incomplete support for --block-list. It's broken with threads and when also --block-size is used. src/xz/args.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ src/xz/args.h | 1 + src/xz/coder.c | 48 ++++++++++++++++++++++++++++------ src/xz/coder.h | 4 +++ src/xz/main.c | 1 + src/xz/message.c | 6 +++++ src/xz/xz.1 | 23 +++++++++++++++-- 7 files changed, 151 insertions(+), 10 deletions(-) commit 972179cdcdf5d8949c48ee31737d87d3050b44af Author: Lasse Collin Date: 2012-07-01 18:44:33 +0300 xz: Update the man page about the new field in --robot -lvv. src/xz/xz.1 | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) commit 1403707fc64a70976aebe66f8d9a9bd12f73a2c5 Author: Lasse Collin Date: 2012-06-28 10:47:49 +0300 liblzma: Check that the first byte of range encoded data is 0x00. It is just to be more pedantic and thus perhaps catch broken files slightly earlier. src/liblzma/lzma/lzma_decoder.c | 8 ++++++-- src/liblzma/rangecoder/range_decoder.h | 12 +++++++++--- 2 files changed, 15 insertions(+), 5 deletions(-) commit eccd8017ffe2c5de473222c4963ec53c62f7fda2 Author: Lasse Collin Date: 2012-06-22 19:00:23 +0300 Update NEWS from 5.0.4. 
NEWS | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) commit 2e6754eac26a431e8d340c28906f63bcd1e177e8 Author: Lasse Collin Date: 2012-06-22 14:34:03 +0300 xz: Update man page date to match the latest update. src/xz/xz.1 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit b3235a0b1af45d5e1244cbe3191516966c076fa0 Author: Lasse Collin Date: 2012-06-18 21:27:47 +0300 Docs: Language fix to 01_compress_easy.c. Thanks to Jonathan Nieder. doc/examples/01_compress_easy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit f1675f765fe228cb5a5f904f853445a03e33cfe9 Author: Lasse Collin Date: 2012-06-14 20:15:30 +0300 Fix the top-level Makefile.am for the new example programs. Makefile.am | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) commit 3a0c5378abefaf86aa39a62a7c9682bdb21568a1 Author: Lasse Collin Date: 2012-06-14 10:52:33 +0300 Docs: Add new example programs. These have more comments than the old examples and human-readable error messages. More tutorial-like examples are needed but these are a start. doc/examples/00_README.txt | 27 ++++ doc/examples/01_compress_easy.c | 297 ++++++++++++++++++++++++++++++++++++++ doc/examples/02_decompress.c | 287 ++++++++++++++++++++++++++++++++++++ doc/examples/03_compress_custom.c | 193 +++++++++++++++++++++++++ doc/examples/Makefile | 23 +++ 5 files changed, 827 insertions(+) commit 1bd2c2c553e30c4a73cfb82abc6908efd6be6b8d Author: Lasse Collin Date: 2012-06-14 10:33:27 +0300 Docs: Move xz_pipe_comp.c and xz_pipe_decomp.c to doc/examples_old. It is good to keep these around so that if someone has copied the decompressor bug from xz_pipe_decomp.c he has an example how to easily fix it.
doc/{examples => examples_old}/xz_pipe_comp.c | 0 doc/{examples => examples_old}/xz_pipe_decomp.c | 0 2 files changed, 0 insertions(+), 0 deletions(-) commit 905f0ab5b5ce544d4b68a2ed6077df0f3d021292 Author: Lasse Collin Date: 2012-06-14 10:33:01 +0300 Docs: Fix a bug in xz_pipe_decomp.c example program. doc/examples/xz_pipe_decomp.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) commit 4bd1a3bd5fdf4870b2f96dd0b8a21657c8a58ad8 Author: Lasse Collin Date: 2012-05-30 23:14:33 +0300 Translations: Update the French translation. Thanks to Adrien Nader. po/fr.po | 148 ++++++++++++++++++++++++++++++++++----------------------------- 1 file changed, 79 insertions(+), 69 deletions(-) commit d2e836f2f3a87df6fe6bb0589b037db51205d910 Author: Lasse Collin Date: 2012-05-29 23:42:37 +0300 Translations: Update the German translation. The previous only included the new strings in v5.0. po/de.po | 229 +++++++++++++++++++++++++++++++++++++-------------------------- 1 file changed, 133 insertions(+), 96 deletions(-) commit c9a16151577ba459afd6e3528df23bc0ddb95171 Author: Lasse Collin Date: 2012-05-29 22:26:27 +0300 Translations: Update the German translation. po/de.po | 169 ++++++++++++++++++++++++++++++++++----------------------------- 1 file changed, 91 insertions(+), 78 deletions(-) commit 1530a74fd48f8493372edad137a24541efe24713 Author: Lasse Collin Date: 2012-05-29 22:14:21 +0300 Translations: Update Polish translation. po/pl.po | 283 +++++++++++++++++++++++++++++++++++++-------------------------- 1 file changed, 165 insertions(+), 118 deletions(-) commit d8db706acb8316f9861abd432cfbe001dd6d0c5c Author: Lasse Collin Date: 2012-05-28 20:42:11 +0300 liblzma: Fix possibility of incorrect LZMA_BUF_ERROR. lzma_code() could incorrectly return LZMA_BUF_ERROR if all of the following was true: - The caller knows how many bytes of output to expect and only provides that much output space. 
- When the last output bytes are decoded, the caller-provided input buffer ends right before the LZMA2 end of payload marker. So LZMA2 won't provide more output anymore, but it won't know it yet and thus won't return LZMA_STREAM_END yet. - A BCJ filter is in use and it hasn't left any unfiltered bytes in the temp buffer. This can happen with any BCJ filter, but in practice it's more likely with filters other than the x86 BCJ. Another situation where the bug can be triggered happens if the uncompressed size is zero bytes and no output space is provided. In this case the decompression can fail even if the whole input file is given to lzma_code(). A similar bug was fixed in XZ Embedded on 2011-09-19. src/liblzma/simple/simple_coder.c | 2 +- tests/Makefile.am | 4 +- tests/test_bcj_exact_size.c | 112 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 116 insertions(+), 2 deletions(-) commit 3f94b6d87f1b8f1c421ba548f8ebb83dca9c8cda Author: Lasse Collin Date: 2012-05-28 15:38:32 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 7769ea051d739a38a1640fd448cf5eb83cb119c6 Author: Lasse Collin Date: 2012-05-28 15:37:43 +0300 xz: Don't show a huge number in -vv when memory limit is disabled. src/xz/message.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) commit ec921105725e4d3ef0a683dd83eee6f24ab60ccd Author: Lasse Collin Date: 2012-05-27 22:30:17 +0300 xz: Document the "summary" lines of --robot -lvv. This documents only the columns that are in v5.0. The new columns added in the master branch aren't necessarily stable yet. src/xz/xz.1 | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) commit 27d24eb0a9f6eed96d6a4594c2b0bf7a91d29f9a Author: Lasse Collin Date: 2012-05-27 21:53:20 +0300 xz: Fix output of verbose --robot --list modes. It printed the filename in "filename (x/y)" format which it obviously shouldn't do in robot mode. 
src/xz/message.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit ab25b82a91754d9388c89abddf806424671d9431 Author: Lasse Collin Date: 2012-05-24 18:33:54 +0300 Build: Upgrade m4/acx_pthread.m4 to the latest version. m4/ax_pthread.m4 | 98 +++++++++++++++++++++++++++++++++++--------------------- 1 file changed, 62 insertions(+), 36 deletions(-) commit d05d6d65c41a4bc83f162fa3d67c5d84e8751634 Author: Lasse Collin Date: 2012-05-10 21:15:17 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit e077391982f9f28dbfe542bba8800e7c5b916666 Author: Lasse Collin Date: 2012-05-10 21:14:16 +0300 Docs: Cleanup line wrapping a bit. README | 12 ++++++------ doc/history.txt | 49 +++++++++++++++++++++++++------------------------ 2 files changed, 31 insertions(+), 30 deletions(-) commit fc39849c350225c6a1cd7f6e6adff1020521eabc Author: Benno Schulenberg Date: 2012-03-13 22:04:04 +0100 Fix a few typos and add some missing articles in some documents. Also hyphenate several compound adjectives. Signed-off-by: Benno Schulenberg AUTHORS | 6 +++--- README | 42 ++++++++++++++++++++--------------------- doc/faq.txt | 24 ++++++++++++------------ doc/history.txt | 58 ++++++++++++++++++++++++++++----------------------------- 4 files changed, 65 insertions(+), 65 deletions(-) commit 29fa0566d5df199cb9acb2d17bf7eea61acc7fa1 Author: Lasse Collin Date: 2012-04-29 11:51:25 +0300 Windows: Update notes about static linking with MSVC. windows/README-Windows.txt | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) commit aac1b31ea4e66cf5a7a8c116bdaa15aa45e6c56e Author: Lasse Collin Date: 2012-04-19 15:25:26 +0300 liblzma: Remove outdated comments. src/liblzma/simple/simple_coder.c | 3 --- src/liblzma/simple/simple_private.h | 3 +-- 2 files changed, 1 insertion(+), 5 deletions(-) commit df14a46013bea70c0bd35be7821b0b9108f97de7 Author: Lasse Collin Date: 2012-04-19 14:17:52 +0300 DOS: Link against DJGPP's libemu to support FPU emulation. 
This way xz should work on 386SX and 486SX. Floating point is only needed for verbose output in xz. dos/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 03ed742a3a4931bb5c821357832083b26f577b13 Author: Lasse Collin Date: 2012-04-19 14:02:25 +0300 liblzma: Fix Libs.private in liblzma.pc to include -lrt when needed. src/liblzma/liblzma.pc.in | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 8c5b13ad59df70f49429bfdfd6ac120b8f892fda Author: Lasse Collin Date: 2012-04-19 13:58:55 +0300 Docs: Update MINIX 3 information in INSTALL. INSTALL | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) commit c7376fc415a1566f38b2de4b516a17013d516a8b Author: Lasse Collin Date: 2012-02-22 14:23:13 +0200 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit cff070aba6281ba743d29a62b8c0c66e5da4b2a6 Author: Lasse Collin Date: 2012-02-22 14:02:34 +0200 Fix exit status of xzgrep when grepping binary files. When grepping binary files, grep may exit before it has read all the input. In this case, gzip -q returns 2 (eating SIGPIPE), but xz and bzip2 show SIGPIPE as the exit status (e.g. 141). This causes wrong exit status when grepping xz- or bzip2-compressed binary files. The fix checks for the special exit status that indicates SIGPIPE. It uses kill -l which should be supported everywhere since it is in both SUSv2 (1997) and POSIX.1-2008. Thanks to James Buren for the bug report. src/scripts/xzgrep.in | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) commit 41cafb2bf9beea915710ee68f05fe929cd17759c Author: Lasse Collin Date: 2012-02-22 12:08:43 +0200 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 2dcea03712fa881930d69ec9eff70855c3d126d9 Author: Lasse Collin Date: 2012-02-22 12:00:16 +0200 Fix compiling with IBM XL C on AIX.
INSTALL | 36 ++++++++++++++++++++++-------------- configure.ac | 6 +++++- 2 files changed, 27 insertions(+), 15 deletions(-) commit 7db6bdf4abcf524115be2cf5659ed540cef074c5 Author: Lasse Collin Date: 2012-01-10 17:13:03 +0200 Tests: Fix a compiler warning with _FORTIFY_SOURCE. Reported here: http://sourceforge.net/projects/lzmautils/forums/forum/708858/topic/4927385 tests/create_compress_files.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) commit 694952d545b6cf056547893ced69486eff9ece55 Author: Lasse Collin Date: 2011-12-19 21:21:29 +0200 Docs: Explain the stable releases better in README. README | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) commit 418fe668b3c53a9a20020b6cc652aaf25c734b29 Author: Lasse Collin Date: 2011-11-07 13:07:52 +0200 xz: Show minimum required XZ Utils version in xz -lvv. Man page wasn't updated yet. src/xz/list.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 57 insertions(+), 6 deletions(-) commit 7081d82c37326bac97184e338345fa1c327e3580 Author: Lasse Collin Date: 2011-11-04 17:57:16 +0200 xz: Fix a typo in a comment. Thanks to Bela Lubkin. src/xz/args.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 232fe7cd70ad258d6a37f17e860e0f1b1891eeb5 Author: Lasse Collin Date: 2011-11-03 17:08:02 +0200 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 74d2bae4d3449c68453b0473dd3430ce91fd90c1 Author: Lasse Collin Date: 2011-11-03 17:07:22 +0200 xz: Fix xz on EBCDIC systems. Thanks to Chris Donawa. src/xz/coder.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) commit 4ac4923f47cc0ef97dd9ca5cfcc44fc53eeab34a Author: Lasse Collin Date: 2011-10-23 17:09:10 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit ab50ae3ef40c81e5bf613905ca3fd636548b75e7 Author: Lasse Collin Date: 2011-10-23 17:08:14 +0300 liblzma: Fix invalid free() in the threaded encoder. It was triggered if initialization failed e.g. due to running out of memory. 
Thanks to Arkadiusz Miskiewicz. src/liblzma/common/outqueue.c | 4 ++++ 1 file changed, 4 insertions(+) commit 6b620a0f0813d28c3c544b4ff8cb595b38a6e908 Author: Lasse Collin Date: 2011-10-23 17:05:55 +0300 liblzma: Fix a deadlock in the threaded encoder. It was triggered when reinitializing the encoder, e.g. when encoding two files. src/liblzma/common/stream_encoder_mt.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) commit bd52cf150ecd51e3ab63a9cc1a3cff6a77500178 Author: Lasse Collin Date: 2011-09-06 12:03:41 +0300 Build: Fix "make check" on Windows. tests/Makefile.am | 7 +++++-- windows/build.bash | 2 ++ 2 files changed, 7 insertions(+), 2 deletions(-) commit 5c5b2256969ac473001b7d67615ed3bd0a54cc82 Author: Lasse Collin Date: 2011-08-09 21:19:13 +0300 Update THANKS. THANKS | 2 ++ 1 file changed, 2 insertions(+) commit 5b1e1f10741af9e4bbe4cfc3261fb7c7b04f7809 Author: Lasse Collin Date: 2011-08-09 21:16:44 +0300 Workaround unusual SIZE_MAX on SCO OpenServer. src/common/sysdefs.h | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) commit e9ed88126eee86e2511fa42681a5c7104820cf0a Author: Lasse Collin Date: 2011-08-06 20:37:28 +0300 Run the scripts with the correct shell in test_scripts.sh. The scripts are now made executable in the build tree. This way the scripts can be run like programs in test_scripts.sh. Previously test_scripts.sh always used sh but it's not correct if @POSIX_SHELL@ is set to something else by configure. Thanks to Jonathan Nieder for the patch. configure.ac | 8 ++++---- tests/test_scripts.sh | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-) commit 1c673e5681720491a74fc4b2992e075f47302c22 Author: Lasse Collin Date: 2011-07-31 11:01:47 +0300 Fix exit status of "xzdiff foo.xz bar.xz". xzdiff was clobbering the exit status from diff in a case statement used to analyze the exit statuses from "xz" when its operands were two compressed files. Save and restore diff's exit status to fix this. 
The bug is inherited from zdiff in GNU gzip and was fixed there on 2009-10-09. Thanks to Jonathan Nieder for the patch and to Peter Pallinger for reporting the bug. src/scripts/xzdiff.in | 2 ++ tests/Makefile.am | 4 +++- tests/test_scripts.sh | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 59 insertions(+), 1 deletion(-) commit 324cde7a864f4506c32ae7846d688c359a83fe65 Author: Lasse Collin Date: 2011-06-16 12:15:29 +0300 liblzma: Remove unneeded semicolon. src/liblzma/lz/lz_encoder_hash.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 492c86345551a51a29bf18e55fe55a5e86f169ce Author: Lasse Collin Date: 2011-05-28 19:24:56 +0300 Build: Make configure print if symbol versioning is enabled or not. configure.ac | 2 ++ 1 file changed, 2 insertions(+) commit fc4d4436969bd4d71b704d400a165875e596034a Author: Lasse Collin Date: 2011-05-28 16:43:26 +0300 Don't call close(-1) in tuklib_open_stdxxx() on error. Thanks to Jim Meyering. src/common/tuklib_open_stdxxx.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) commit bd35d903a04c4d388adb4065b0fa271302380895 Author: Lasse Collin Date: 2011-05-28 15:55:39 +0300 liblzma: Use symbol versioning. Symbol versioning is enabled by default on GNU/Linux, other GNU-based systems, and FreeBSD. I'm not sure how stable this is, so it may need backward-incompatible changes before the next release. The idea is that alpha and beta symbols are considered unstable and require recompiling the applications that use those symbols. Once a symbol is stable, it may get extended with new features in ways that don't break compatibility with older ABI & API. The mydist target runs validate_map.sh which should catch some probable problems in liblzma.map. Otherwise I would forget to update the map file for new releases. 
Makefile.am | 1 + configure.ac | 21 +++++++++ src/liblzma/Makefile.am | 6 +++ src/liblzma/liblzma.map | 105 ++++++++++++++++++++++++++++++++++++++++++++ src/liblzma/validate_map.sh | 68 ++++++++++++++++++++++++++++ 5 files changed, 201 insertions(+) commit afbb244362c9426a37ce4eb9d54aab768da3adad Author: Lasse Collin Date: 2011-05-28 09:46:46 +0300 Translations: Update the Italian translation. Thanks to Milo Casagrande. po/it.po | 365 +++++++++++++++++++++++++++++++++++++-------------------------- 1 file changed, 216 insertions(+), 149 deletions(-) commit 79bef85e0543c0c3723281c3c817616c6cec343b Author: Lasse Collin Date: 2011-05-28 08:46:04 +0300 Tests: Add a test file for the bug in the previous commit. tests/files/README | 4 ++++ tests/files/bad-1-block_header-6.xz | Bin 0 -> 72 bytes 2 files changed, 4 insertions(+) commit c0297445064951807803457dca1611b3c47e7f0f Author: Lasse Collin Date: 2011-05-27 22:25:44 +0300 xz: Fix error handling in xz -lvv. It could do an invalid free() and read past the end of the uninitialized filters array. src/xz/list.c | 21 ++++++--------------- 1 file changed, 6 insertions(+), 15 deletions(-) commit 8bd91918ac50731f00b1a2a48072980572eb2ff9 Author: Lasse Collin Date: 2011-05-27 22:09:49 +0300 liblzma: Handle allocation failures correctly in lzma_index_init(). Thanks to Jim Meyering. src/liblzma/common/index.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) commit fe00f95828ef5627721b57e054f7eb2d42a2c961 Author: Lasse Collin Date: 2011-05-24 00:23:46 +0300 Build: Fix checking for system-provided SHA-256. configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 21b45b9bab541f419712cbfd473ccc31802e0397 Author: Lasse Collin Date: 2011-05-23 18:30:30 +0300 Build: Set GZIP_ENV=-9n in top-level Makefile.am. Makefile.am | 3 +++ 1 file changed, 3 insertions(+) commit 48053e8a4550233af46359024538bff90c870ab1 Author: Lasse Collin Date: 2011-05-22 16:42:11 +0300 Update NEWS for 5.0.3. 
NEWS | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) commit bba37df2c9e54ad773e15ff00a09d2d6989fb3b2 Author: Lasse Collin Date: 2011-05-21 16:28:44 +0300 Add French translation. It is known that the BCJ filter --help text is only partially translated. po/LINGUAS | 1 + po/fr.po | 864 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 865 insertions(+) commit 4161d7634965a7a287bf208dcd79f6185f448fe8 Author: Lasse Collin Date: 2011-05-21 15:12:10 +0300 xz: Translate also the string used to print the program name. French needs a space before a colon, e.g. "xz : foo error". src/xz/message.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) commit b94aa0c8380cdb18cddb33440d625474c16643cf Author: Lasse Collin Date: 2011-05-21 15:08:44 +0300 liblzma: Try to use SHA-256 from the operating system. If the operating system libc or other base libraries provide SHA-256, use that instead of our own copy. Note that this doesn't use OpenSSL or libgcrypt or such libraries to avoid creating dependencies to other packages. This supports at least FreeBSD, NetBSD, OpenBSD, Solaris, MINIX, and Darwin. They all provide similar but not identical SHA-256 APIs; everyone is a little different. Thanks to Wim Lewis for the original patch, improvements, and testing. configure.ac | 54 +++++++++++++++++++++++++++ src/liblzma/check/Makefile.inc | 2 + src/liblzma/check/check.h | 83 ++++++++++++++++++++++++++++++++++++++---- 3 files changed, 131 insertions(+), 8 deletions(-) commit f004128678d43ea10b4a6401aa184cf83252d6ec Author: Lasse Collin Date: 2011-05-17 12:52:18 +0300 Don't use clockid_t in mythread.h when clock_gettime() isn't available. Thanks to Wim Lewis for the patch. src/common/mythread.h | 2 ++ 1 file changed, 2 insertions(+) commit f779516f42ebd2db47a5b7d6143459bf7737cf2f Author: Lasse Collin Date: 2011-05-17 12:26:28 +0300 Update THANKS. 
THANKS | 3 +++ 1 file changed, 3 insertions(+) commit 830ba587775bb562f6eaf05cad61bf669d1f8892 Author: Lasse Collin Date: 2011-05-17 12:21:33 +0300 Update INSTALL with a note about linker problem on OpenSolaris x86. INSTALL | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-) commit ec7106309c8060e9c646dba20c4f15689a0bbb04 Author: Lasse Collin Date: 2011-05-17 12:01:37 +0300 Build: Fix initialization of enable_check_* variables in configure.ac. This doesn't matter much in practice since it is unlikely that anyone would have such environment variable names. Thanks to Wim Lewis. configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 4c6e146df99696920f12410fb17754412797ef36 Author: Lasse Collin Date: 2011-05-17 11:54:38 +0300 Add underscores to attributes (__attribute((__foo__))). src/liblzma/common/alone_decoder.c | 2 +- src/liblzma/common/alone_encoder.c | 2 +- src/liblzma/common/block_encoder.c | 2 +- src/liblzma/common/common.c | 2 +- src/liblzma/common/common.h | 2 +- src/liblzma/common/index_decoder.c | 9 +++++---- src/liblzma/common/index_encoder.c | 11 ++++++----- src/liblzma/delta/delta_encoder.c | 2 +- src/liblzma/lz/lz_decoder.c | 2 +- src/liblzma/lz/lz_encoder.c | 2 +- src/liblzma/simple/arm.c | 2 +- src/liblzma/simple/armthumb.c | 2 +- src/liblzma/simple/ia64.c | 2 +- src/liblzma/simple/powerpc.c | 2 +- src/liblzma/simple/simple_coder.c | 2 +- src/liblzma/simple/sparc.c | 2 +- src/lzmainfo/lzmainfo.c | 4 ++-- src/xz/coder.c | 2 +- src/xz/hardware.h | 2 +- src/xz/message.c | 2 +- src/xz/message.h | 18 +++++++++--------- src/xz/options.c | 6 +++--- src/xz/signals.c | 2 +- src/xz/util.h | 6 +++--- src/xzdec/xzdec.c | 6 +++--- 25 files changed, 49 insertions(+), 47 deletions(-) commit 7a480e485938884ef3021b48c3b0b9f9699dc9b6 Author: Lasse Collin Date: 2011-05-01 12:24:23 +0300 xz: Fix input file position when --single-stream is used. 
Now the following works as you would expect: echo foo | xz > foo.xz echo bar | xz >> foo.xz ( xz -dc --single-stream ; xz -dc --single-stream ) < foo.xz Note that it doesn't work if the input is not seekable or if there is Stream Padding between the concatenated .xz Streams. src/xz/coder.c | 1 + src/xz/file_io.c | 15 +++++++++++++++ src/xz/file_io.h | 13 +++++++++++++ 3 files changed, 29 insertions(+) commit c29e6630c1450c630c4e7b783bdd76515db9004c Author: Lasse Collin Date: 2011-05-01 12:15:51 +0300 xz: Print the maximum number of worker threads in xz -vv. src/xz/coder.c | 4 ++++ 1 file changed, 4 insertions(+) commit 0b77c4a75158ccc416b07d6e81df8ee0abaea720 Author: Lasse Collin Date: 2011-04-19 10:44:48 +0300 Build: Warn if no supported method to detect the number of CPU cores. configure.ac | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) commit e4622df9ab4982f8faa53d85b17be66216175a58 Author: Lasse Collin Date: 2011-04-19 09:55:06 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 9c1b05828a88eff54409760b92162c7cc2c7cff6 Author: Lasse Collin Date: 2011-04-19 09:20:44 +0300 Fix portability problems in mythread.h. Use gettimeofday() if clock_gettime() isn't available (e.g. Darwin). The test for availability of pthread_condattr_setclock() and CLOCK_MONOTONIC was incorrect. Instead of fixing the #ifdefs, use an Autoconf test. That way if there exists a system that supports them but doesn't specify the matching POSIX #defines, the features will still get detected. Don't try to use pthread_sigmask() on OpenVMS. It doesn't have that function. Guard mythread.h against being #included multiple times. configure.ac | 7 +++++++ src/common/mythread.h | 31 +++++++++++++++++++++++++++---- 2 files changed, 34 insertions(+), 4 deletions(-) commit 3de00cc75da7b0e7b65e84c62b5351e231f501e9 Author: Lasse Collin Date: 2011-04-18 19:35:49 +0300 Update THANKS. 
THANKS | 2 ++ 1 file changed, 2 insertions(+) commit bd5002f5821e3d1b04f2f56989e4a19318e73633 Author: Martin Väth Date: 2011-04-15 04:54:49 -0400 xzgrep: fix typo in $0 parsing Reported-by: Diego Elio Pettenò Signed-off-by: Martin Väth Signed-off-by: Mike Frysinger src/scripts/xzgrep.in | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) commit 6ef4eabc0acc49e1bb9dc68064706e19fa9fcf48 Author: Lasse Collin Date: 2011-04-12 12:48:31 +0300 Bump the version number to 5.1.1alpha and liblzma soname to 5.0.99. src/liblzma/Makefile.am | 2 +- src/liblzma/api/lzma/version.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) commit 9a4377be0d21e597c66bad6c7452873aebfb3c1c Author: Lasse Collin Date: 2011-04-12 12:42:37 +0300 Put the unstable APIs behind #ifdef LZMA_UNSTABLE. This way people hopefully won't complain if these APIs change and break code that used an older API. src/liblzma/api/lzma/container.h | 4 ++++ src/liblzma/common/common.h | 2 ++ src/xz/private.h | 2 ++ 3 files changed, 8 insertions(+) commit 3e321a3acd50002cf6fdfd259e910f56d3389bc3 Author: Lasse Collin Date: 2011-04-12 11:59:49 +0300 Remove doubled words from documentation and comments. Spot candidates by running these commands: git ls-files |xargs perl -0777 -n \ -e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt]o)\s+\1\b/gims)' \ -e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g; print "$ARGV:$n:$v\n"}' Thanks to Jim Meyering for the original patch. doc/lzma-file-format.txt | 4 ++-- src/liblzma/common/alone_encoder.c | 2 +- src/liblzma/lzma/lzma2_encoder.c | 2 +- src/xz/file_io.c | 2 +- src/xz/xz.1 | 2 +- windows/INSTALL-Windows.txt | 2 +- 6 files changed, 7 insertions(+), 7 deletions(-) commit d91a84b534b012d19474f2fda1fbcaef873e1ba4 Author: Lasse Collin Date: 2011-04-12 11:46:01 +0300 Update NEWS. 
NEWS | 47 +++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 45 insertions(+), 2 deletions(-) commit 14e6ad8cfe0165c1a8beeb5b2a1536558b29b0a1 Author: Lasse Collin Date: 2011-04-12 11:45:40 +0300 Update TODO. TODO | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) commit 70e750f59793f9b5cd306a5adce9b8e427739e04 Author: Lasse Collin Date: 2011-04-12 11:08:55 +0300 xz: Update the man page about threading. src/xz/xz.1 | 34 ++++++++++++++++++++-------------- 1 file changed, 20 insertions(+), 14 deletions(-) commit 24e0406c0fb7494d2037dec033686faf1bf67068 Author: Lasse Collin Date: 2011-04-11 22:06:03 +0300 xz: Add support for threaded compression. src/xz/args.c | 3 +- src/xz/coder.c | 202 +++++++++++++++++++++++++++++++++++---------------------- 2 files changed, 125 insertions(+), 80 deletions(-) commit de678e0c924aa79a19293a8a6ed82e8cb6572a42 Author: Lasse Collin Date: 2011-04-11 22:03:30 +0300 liblzma: Add lzma_stream_encoder_mt() for threaded compression. This is the simplest method to do threading, which splits the uncompressed data into blocks and compresses them independently from each other. There's room for improvement especially to reduce the memory usage, but nevertheless, this is a good start. configure.ac | 1 + src/liblzma/api/lzma/container.h | 163 +++++ src/liblzma/common/Makefile.inc | 7 + src/liblzma/common/common.c | 9 +- src/liblzma/common/common.h | 14 + src/liblzma/common/outqueue.c | 180 ++++++ src/liblzma/common/outqueue.h | 155 +++++ src/liblzma/common/stream_encoder_mt.c | 1011 ++++++++++++++++++++++++++++++++ 8 files changed, 1539 insertions(+), 1 deletion(-) commit 25fe729532cdf4b8fed56a4519b73cf31efaec50 Author: Lasse Collin Date: 2011-04-11 21:15:07 +0300 liblzma: Add the forgotten lzma_lzma2_block_size(). This should have been in 5eefc0086d24a65e136352f8c1d19cefb0cbac7a. 
src/liblzma/lzma/lzma2_encoder.c | 10 ++++++++++ src/liblzma/lzma/lzma2_encoder.h | 2 ++ 2 files changed, 12 insertions(+) commit 91afb785a1dee34862078d9bf844ef12b8cc3e35 Author: Lasse Collin Date: 2011-04-11 21:04:13 +0300 liblzma: Document lzma_easy_(enc|dec)oder_memusage() better too. src/liblzma/api/lzma/container.h | 9 +++++++++ 1 file changed, 9 insertions(+) commit 4a9905302a9e4a1601ae09d650d3f08ce98ae9ee Author: Lasse Collin Date: 2011-04-11 20:59:07 +0300 liblzma: Document lzma_raw_(enc|dec)oder_memusage() better. It didn't mention the return value that is used if an error occurs. src/liblzma/api/lzma/filter.h | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) commit 0badb0b1bd649163322783b0bd9e590b4bc7a93d Author: Lasse Collin Date: 2011-04-11 19:28:18 +0300 liblzma: Use memzero() to initialize supported_actions[]. This is cleaner and makes it simpler to add new members to lzma_action enumeration. src/liblzma/common/common.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) commit a7934c446a58e20268689899d2a39f50e571f251 Author: Lasse Collin Date: 2011-04-11 19:26:27 +0300 liblzma: API comment about lzma_allocator with threaded coding. src/liblzma/api/lzma/base.h | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) commit 5eefc0086d24a65e136352f8c1d19cefb0cbac7a Author: Lasse Collin Date: 2011-04-11 19:16:30 +0300 liblzma: Add an internal function lzma_mt_block_size(). This is based on lzma_chunk_size() that was included in some development version of liblzma. src/liblzma/common/filter_encoder.c | 46 ++++++++++++++++++------------------- src/liblzma/common/filter_encoder.h | 4 ++-- 2 files changed, 24 insertions(+), 26 deletions(-) commit d1199274758049fc523d98c5b860ff814a799eec Author: Lasse Collin Date: 2011-04-11 13:59:50 +0300 liblzma: Don't create an empty Block in lzma_stream_buffer_encode(). Empty Block was created if the input buffer was empty.
Empty Block wastes a few bytes of space, but more importantly it triggers a bug in XZ Utils 5.0.1 and older when trying to decompress such a file. 5.0.1 and older consider such files to be corrupt. I thought that no encoder creates empty Blocks when releasing 5.0.2 but I was wrong. src/liblzma/common/stream_buffer_encoder.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) commit 3b22fc2c87ec85fcdd385c163b68fc49c97aa848 Author: Lasse Collin Date: 2011-04-11 13:28:40 +0300 liblzma: Fix API docs to mention LZMA_UNSUPPORTED_CHECK. This return value was missing from the API comments of four functions. src/liblzma/api/lzma/block.h | 1 + src/liblzma/api/lzma/container.h | 3 +++ 2 files changed, 4 insertions(+) commit 71b9380145dccf001f22e66a06b9d508905c25ce Author: Lasse Collin Date: 2011-04-11 13:21:28 +0300 liblzma: Validate encoder arguments better. The biggest problem was that the integrity check type wasn't validated, and e.g. lzma_easy_buffer_encode() would create a corrupt .xz Stream if given an unsupported Check ID. Luckily applications don't usually try to use an unsupported Check ID, so this bug is unlikely to cause many real-world problems. src/liblzma/common/block_buffer_encoder.c | 18 ++++++++++++------ src/liblzma/common/block_encoder.c | 5 +++++ src/liblzma/common/stream_buffer_encoder.c | 3 +++ 3 files changed, 20 insertions(+), 6 deletions(-) commit ec7e3dbad704268825fc48f0bdd4577bc46b4f13 Author: Lasse Collin Date: 2011-04-11 09:57:30 +0300 xz: Move the description of --block-size in --long-help. src/xz/message.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) commit cd3086ff443bb282bdf556919c28b3e3cbed8169 Author: Lasse Collin Date: 2011-04-11 09:55:35 +0300 Docs: Document --single-stream and --block-size.
src/xz/xz.1 | 38 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 36 insertions(+), 2 deletions(-) commit fb64a4924334e3c440865710990fe08090f2fed0 Author: Lasse Collin Date: 2011-04-11 09:27:57 +0300 liblzma: Make lzma_stream_encoder_init() static (second try). It's an internal function and it's not needed by anything outside stream_encoder.c. src/liblzma/common/Makefile.inc | 1 - src/liblzma/common/easy_encoder.c | 1 - src/liblzma/common/stream_encoder.c | 13 ++++++------- src/liblzma/common/stream_encoder.h | 23 ----------------------- 4 files changed, 6 insertions(+), 32 deletions(-) commit a34730cf6af4d33a4057914e57227b6dfde6567e Author: Lasse Collin Date: 2011-04-11 08:31:42 +0300 Revert "liblzma: Make lzma_stream_encoder_init() static." This reverts commit 352ac82db5d3f64585c07b39e4759388dec0e4d7. I don't know what I was thinking. src/liblzma/common/Makefile.inc | 1 + src/liblzma/common/stream_encoder.c | 9 +++++---- src/liblzma/common/stream_encoder.h | 23 +++++++++++++++++++++++ 3 files changed, 29 insertions(+), 4 deletions(-) commit 9f0a806aef7ea79718e3f1f2baf3564295229a27 Author: Lasse Collin Date: 2011-04-10 21:23:21 +0300 Revise mythread.h. This adds: - mythread_sync() macro to create synchronized blocks - mythread_cond structure and related functions and macros for condition variables with timed waiting using a relative timeout - mythread_create() to create a thread with all signals blocked Some of these wouldn't need to be inline functions, but I'll keep them this way for now for simplicity. For timed waiting on a condition variable, librt is now required on some systems to use clock_gettime(). configure.ac was updated to handle this. configure.ac | 1 + src/common/mythread.h | 200 +++++++++++++++++++++++++++++++++++++++++++++----- 2 files changed, 181 insertions(+), 20 deletions(-) commit 352ac82db5d3f64585c07b39e4759388dec0e4d7 Author: Lasse Collin Date: 2011-04-10 20:37:36 +0300 liblzma: Make lzma_stream_encoder_init() static. 
It's an internal function and it's not needed by anything outside stream_encoder.c. src/liblzma/common/Makefile.inc | 1 - src/liblzma/common/stream_encoder.c | 9 ++++----- src/liblzma/common/stream_encoder.h | 23 ----------------------- 3 files changed, 4 insertions(+), 29 deletions(-) commit 9e807fe3fe79618ac48f58207cf7082ea20a6928 Author: Lasse Collin Date: 2011-04-10 14:58:10 +0300 DOS: Update the docs and include notes about 8.3 filenames. dos/{README => INSTALL.txt} | 13 +---- dos/README.txt | 123 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 125 insertions(+), 11 deletions(-) commit ebd54dbd6e481d31e80757f900ac8109ad1423c6 Author: Lasse Collin Date: 2011-04-10 13:09:42 +0300 xz/DOS: Add experimental 8.3 filename support. This is incompatible with the 8.3 support patch made by Juan Manuel Guerrero. I think this one is nicer, but I need to get feedback from DOS users before saying that this is the final version of 8.3 filename support. src/xz/suffix.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 167 insertions(+), 9 deletions(-) commit cd4fe97852bcaeffe674ee51b4613709292a0972 Author: Lasse Collin Date: 2011-04-10 12:47:47 +0300 xz/DOS: Be more careful with the destination file. Try to avoid overwriting the source file if --force is used and the generated destination filename refers to the source file. This can happen with 8.3 filenames where extra characters are ignored. If the generated output file refers to a special file like "con" or "prn", refuse to write to it even if --force is used. src/xz/file_io.c | 35 +++++++++++++++++++++++++++++++++-- 1 file changed, 33 insertions(+), 2 deletions(-) commit 607f9f98ae5ef6d49f4c21c806d462bf6b3d6796 Author: Lasse Collin Date: 2011-04-09 18:29:30 +0300 Update THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit fca396b37410d272b754843a5dc13847be443a3a Author: Lasse Collin Date: 2011-04-09 18:28:58 +0300 liblzma: Add missing #ifdefs to filter_common.c. 
Passing --disable-decoders to configure broke a few encoders due to missing #ifdefs in filter_common.c. Thanks to Jason Gorski for the patch. src/liblzma/common/filter_common.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) commit b03f6cd3ebadd675f2cc9d518cb26fa860269447 Author: Lasse Collin Date: 2011-04-09 15:24:59 +0300 xz: Avoid unneeded fstat() on DOS-like systems. src/xz/file_io.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) commit 335fe260a81f61ec99ff5940df733b4c50aedb7c Author: Lasse Collin Date: 2011-04-09 15:11:13 +0300 xz: Minor internal changes to handling of --threads. Now it always defaults to one thread. Maybe this will change again if a threading method is added that doesn't affect memory usage. src/xz/args.c | 4 ++-- src/xz/hardware.c | 24 ++++++++++++------------ src/xz/hardware.h | 9 ++++----- 3 files changed, 18 insertions(+), 19 deletions(-) commit 9edd6ee895fbe71d245a173f48e511f154a99875 Author: Lasse Collin Date: 2011-04-08 17:53:05 +0300 xz: Change size_t to uint32_t in a few places. src/xz/coder.c | 6 +++--- src/xz/coder.h | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) commit 411013ea4506a6df24d35a060fcbd73a57b73eb3 Author: Lasse Collin Date: 2011-04-08 17:48:41 +0300 xz: Fix a typo in a comment. src/xz/coder.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit b34c5ce4b22e8d7b81f9895d15054af41d17f805 Author: Lasse Collin Date: 2011-04-05 22:41:33 +0300 liblzma: Use TUKLIB_GNUC_REQ to check GCC version in sha256.c. src/liblzma/check/sha256.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) commit db33117cc85c17e0b897b5312bd5eb43aac41c03 Author: Lasse Collin Date: 2011-04-05 17:12:20 +0300 Build: Upgrade m4/acx_pthread.m4 to the latest version. It was renamed to ax_pthread.m4 in Autoconf Archive. 
configure.ac | 2 +- m4/{acx_pthread.m4 => ax_pthread.m4} | 170 ++++++++++++++++++----------------- 2 files changed, 88 insertions(+), 84 deletions(-) commit 1039bfcfc098b69d56ecb39d198a092552eacf6d Author: Lasse Collin Date: 2011-04-05 15:27:26 +0300 xz: Use posix_fadvise() if it is available. configure.ac | 3 +++ src/xz/file_io.c | 15 +++++++++++++++ 2 files changed, 18 insertions(+) commit 1ef3cf44a8eb9512480af4482a5232ea08363b14 Author: Lasse Collin Date: 2011-04-05 15:13:29 +0300 xz: Call lzma_end(&strm) before exiting if debugging is enabled. src/xz/coder.c | 10 ++++++++++ src/xz/coder.h | 5 +++++ src/xz/main.c | 4 ++++ 3 files changed, 19 insertions(+) commit bd432015d33dcade611d297bc01eb0700088ef6c Author: Lasse Collin Date: 2011-04-02 14:49:56 +0300 liblzma: Fix a memory leak in stream_encoder.c. It leaks old filter options structures (hundred bytes or so) every time the lzma_stream is reinitialized. With the xz tool, this happens when compressing multiple files. src/liblzma/common/stream_encoder.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 16889013214e7620d204b6e6c1bf9f3103a13655 Author: Lasse Collin Date: 2011-04-01 08:47:20 +0300 Updated NEWS for 5.0.2. NEWS | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) commit 85cdf7dd4e97b078e7b929e47f55a7f1da36010f Author: Lasse Collin Date: 2011-03-31 15:06:58 +0300 Update INSTALL with another note about IRIX. INSTALL | 4 ++++ 1 file changed, 4 insertions(+) commit c3f4995586873d6a4fb7e451010a128571a9a370 Author: Lasse Collin Date: 2011-03-31 12:22:55 +0300 Tests: Add a new file to test empty LZMA2 streams. tests/files/README | 4 ++++ tests/files/good-1-lzma2-5.xz | Bin 0 -> 52 bytes 2 files changed, 4 insertions(+) commit 0d21f49a809dc2088da6cc0da7f948404df7ecfa Author: Lasse Collin Date: 2011-03-31 11:54:48 +0300 liblzma: Fix decoding of LZMA2 streams having no uncompressed data. The decoder considered empty LZMA2 streams to be corrupt. 
This shouldn't matter much with .xz files, because no encoder creates empty LZMA2 streams in .xz. This bug is more likely to cause problems in applications that use raw LZMA2 streams. src/liblzma/lzma/lzma2_decoder.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) commit 40277998cb9bad564ce4827aff152e6e1c904dfa Author: Lasse Collin Date: 2011-03-24 01:42:49 +0200 Scripts: Better fix for xzgrep. Now it uses "grep -q". Thanks to Gregory Margo. src/scripts/xzgrep.in | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) commit 2118733045ad0ca183a3f181a0399baf876983a6 Author: Lasse Collin Date: 2011-03-24 01:22:18 +0200 Updated THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit c7210d9a3fca6f31a57208bfddfc9ab20a2e097a Author: Lasse Collin Date: 2011-03-24 01:21:32 +0200 Scripts: Fix xzgrep -l. It didn't work at all. It tried to use the -q option for grep, but it appended it after "--". This works around it by redirecting to /dev/null. The downside is that this can be slower with big files compared to proper use of "grep -q". Thanks to Gregory Margo. src/scripts/xzgrep.in | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) commit 4eb83e32046a6d670862bc91c3d82530963b455e Author: Lasse Collin Date: 2011-03-19 13:08:22 +0200 Scripts: Add lzop (.lzo) support to xzdiff and xzgrep. src/scripts/xzdiff.1 | 6 ++++-- src/scripts/xzdiff.in | 22 ++++++++++++++-------- src/scripts/xzgrep.1 | 11 +++++++---- src/scripts/xzgrep.in | 5 +++-- 4 files changed, 28 insertions(+), 16 deletions(-) commit 923b22483bd9356f3219b2b784d96f455f4dc499 Author: Lasse Collin Date: 2011-03-18 19:10:30 +0200 xz: Add --block-size=SIZE. This uses LZMA_FULL_FLUSH every SIZE bytes of input. Man page wasn't updated yet. 
src/xz/args.c | 7 +++++++ src/xz/coder.c | 50 ++++++++++++++++++++++++++++++++++++++++---------- src/xz/coder.h | 3 +++ src/xz/message.c | 4 ++++ 4 files changed, 54 insertions(+), 10 deletions(-) commit 57597d42ca1740ad506437be168d800a50f1a0ad Author: Lasse Collin Date: 2011-03-18 18:19:19 +0200 xz: Add --single-stream. This can be useful when there is garbage after the compressed stream (.xz, .lzma, or raw stream). Man page wasn't updated yet. src/xz/args.c | 6 ++++++ src/xz/coder.c | 11 +++++++++-- src/xz/coder.h | 3 +++ src/xz/message.c | 6 +++++- 4 files changed, 23 insertions(+), 3 deletions(-) commit 96f94bc925d579a700147fa5d7793b64d69cfc18 Author: Lasse Collin Date: 2011-02-04 22:49:31 +0200 xz: Clean up suffix.c. struct suffix_pair isn't needed in compressed_name() so get rid of it there. src/xz/suffix.c | 44 ++++++++++++++++++++------------------------ 1 file changed, 20 insertions(+), 24 deletions(-) commit 8930c7ae3f82bdae15aa129f01de08be23d7e8d7 Author: Lasse Collin Date: 2011-02-04 11:29:47 +0200 xz: Check if the file already has custom suffix when compressing. Now "xz -S .test foo.test" refuses to compress the file because it already has the suffix .test. The man page had it documented this way already. src/xz/suffix.c | 9 +++++++++ 1 file changed, 9 insertions(+) commit 940d5852c6cf08abccc6befd9d1b5411c9076a58 Author: Lasse Collin Date: 2011-02-02 23:01:51 +0200 Updated THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit 4ebe65f839613f27f127bab7b8c347d982330ee3 Author: Lasse Collin Date: 2011-02-02 23:00:33 +0200 Translations: Add Polish translation. Thanks to Jakub Bogusz. po/LINGUAS | 1 + po/pl.po | 825 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 826 insertions(+) commit fc1d292dca1925dfd17174f443f91a696ecd5bf8 Author: Lasse Collin Date: 2011-02-02 22:24:00 +0200 Updated THANKS.
THANKS | 1 + 1 file changed, 1 insertion(+) commit 6dd061adfd2775428b079eb03d6fd47d7c0f1ffe Merge: 9d542ce 5fbce0b Author: Lasse Collin Date: 2011-02-06 20:13:01 +0200 Merge commit '5fbce0b8d96dc96775aa0215e3581addc830e23d' commit 5fbce0b8d96dc96775aa0215e3581addc830e23d Author: Lasse Collin Date: 2011-01-28 20:16:57 +0200 Update NEWS for 5.0.1. NEWS | 14 ++++++++++++++ 1 file changed, 14 insertions(+) commit 03ebd1bbb314f9f204940219a835c883bf442475 Author: Lasse Collin Date: 2011-01-26 12:19:08 +0200 xz: Fix --force on setuid/setgid/sticky and multi-hardlink files. xz didn't compress setuid/setgid/sticky files and files with multiple hard links even with --force. This bug was introduced in 23ac2c44c3ac76994825adb7f9a8f719f78b5ee4. Thanks to Charles Wilson. src/xz/file_io.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) commit 9d542ceebcbe40b174169c132ccfcdc720ca7089 Merge: 4f2c69a 7bd0a5e Author: Lasse Collin Date: 2011-01-19 11:45:35 +0200 Merge branch 'v5.0' commit 7bd0a5e7ccc354f7c2e95c8bc27569c820f6a136 Author: Lasse Collin Date: 2011-01-18 21:25:24 +0200 Updated THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit f71c4e16e913f660977526f0ef8d2acdf458d7c9 Author: Lasse Collin Date: 2011-01-18 21:23:50 +0200 Add alloc_size and malloc attributes to a few functions. Thanks to Cristian Rodríguez for the original patch. src/common/sysdefs.h | 6 ++++++ src/liblzma/common/common.h | 2 +- src/xz/util.h | 5 +++-- 3 files changed, 10 insertions(+), 3 deletions(-) commit 316cbe24465143edde8f6ffb7532834b7b2ea93f Author: Lasse Collin Date: 2010-12-13 16:36:33 +0200 Scripts: Fix gzip and bzip2 support in xzdiff. 
src/scripts/xzdiff.in | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) commit 4f2c69a4e3e0aee2e37b0b1671d34086e20c8ac6 Merge: adb89e6 9311774 Author: Lasse Collin Date: 2010-12-12 23:13:22 +0200 Merge branch 'v5.0' commit 9311774c493c19deab51ded919dcd2e9c4aa2829 Author: Lasse Collin Date: 2010-12-12 21:23:55 +0200 Build: Enable ASM on DJGPP by default. configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 4a42aaee282fc73b482581684d65110506d5efdd Author: Lasse Collin Date: 2010-12-12 16:09:42 +0200 Updated THANKS. THANKS | 1 + 1 file changed, 1 insertion(+) commit ce56f63c41ee210e6308090eb6d49221fdf67d6c Author: Lasse Collin Date: 2010-12-12 16:07:11 +0200 Add missing PRIx32 and PRIx64 compatibility definitions. This fixes portability to systems that lack C99 inttypes.h. Thanks to Juan Manuel Guerrero. src/common/sysdefs.h | 9 +++++++++ 1 file changed, 9 insertions(+) commit e6baedddcf54e7da049ebc49183565b99facd4c7 Author: Lasse Collin Date: 2010-12-12 14:50:04 +0200 DOS-like: Treat \ and : as directory separators in addition to /. Juan Manuel Guerrero had fixed this in his XZ Utils port to DOS/DJGPP. The bug affects also Windows and OS/2. src/xz/suffix.c | 33 +++++++++++++++++++++++++++++---- 1 file changed, 29 insertions(+), 4 deletions(-) commit adb89e68d43a4cadb0c215b45ef7a75737c9c3ec Merge: 7c24e0d b7afd3e Author: Lasse Collin Date: 2010-12-07 18:53:04 +0200 Merge branch 'v5.0' commit b7afd3e22a8fac115b75c738d40d3eb1de7e286f Author: Lasse Collin Date: 2010-12-07 18:52:04 +0200 Translations: Fix Czech translation of "sparse file". Thanks to Petr Hubený and Marek Černocký. 
po/cs.po | 88 ++++++++++++++++++++++++++++++++-------------------------------- 1 file changed, 44 insertions(+), 44 deletions(-) commit 7c24e0d1b8a2e86e9263b0d56d39621e01aed7af Merge: b4d42f1 3e56470 Author: Lasse Collin Date: 2010-11-15 14:33:01 +0200 Merge branch 'v5.0' commit 3e564704bc6f463cb2db11e3f3f0dbd71d85992e Author: Lasse Collin Date: 2010-11-15 14:28:26 +0200 liblzma: Document the return value of lzma_lzma_preset(). src/liblzma/api/lzma/lzma.h | 3 +++ 1 file changed, 3 insertions(+) commit 2964d8d691ed92abdcf214888d79ad6d79774735 Author: Jonathan Nieder Date: 2010-11-12 15:22:13 -0600 Simplify paths in generated API docs Currently the file list generated by Doxygen has src/ at the beginning of each path. Paths like common/sysdefs.h and liblzma/api/lzma.h are easier to read without such a prefix. Builds from a separate build directory with mkdir build cd build ../configure doxygen Doxyfile include an even longer prefix /home/someone/src/xz/src; this patch has the nice side-effect of eliminating that prefix, too. Fixes: http://bugs.debian.org/572273 Doxyfile.in | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit b4d42f1a7120e2cefeb2f14425efe2ca6db85416 Author: Anders F Bjorklund Date: 2010-11-05 12:56:11 +0100 add build script for macosx universal macosx/build.sh | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 92 insertions(+) commit 15ee6935abe4a2fc76639ee342ca2e69af3e0ad6 Author: Lasse Collin Date: 2010-11-04 18:31:40 +0200 Update the copies of GPLv2 and LGPLv2.1 from gnu.org. There are only a few white space changes. 
COPYING.GPLv2 | 14 +++++++------- COPYING.LGPLv2.1 | 16 +++++++--------- 2 files changed, 14 insertions(+), 16 deletions(-) commit 8e355f7fdbeee6fe394eb02a28f267ce99a882a2 Merge: 974ebe6 37c2565 Author: Lasse Collin Date: 2010-10-26 15:53:06 +0300 Merge branch 'v5.0' commit 37c25658efd25b034266daf87cd381d20d1df776 Author: Lasse Collin Date: 2010-10-26 15:48:48 +0300 Build: Copy the example programs to $docdir/examples. The example programs by Daniel Mealha Cabrita were included in the git repository, but I had forgot to add them to Makefile.am. Thus, they didn't get included in the source package at all by "make dist". Makefile.am | 5 +++++ windows/build.bash | 3 ++- 2 files changed, 7 insertions(+), 1 deletion(-) commit 974ebe63497bdf0d262e06474f0dd5a70b1dd000 Author: Lasse Collin Date: 2010-10-26 10:36:41 +0300 liblzma: Rename a few variables and constants. This has no semantic changes. I find the new names slightly more logical and they match the names that are already used in XZ Embedded. The name fastpos wasn't changed (not worth the hassle). src/liblzma/lzma/fastpos.h | 55 +++++------ src/liblzma/lzma/lzma2_encoder.c | 2 +- src/liblzma/lzma/lzma_common.h | 45 ++++----- src/liblzma/lzma/lzma_decoder.c | 58 +++++------ src/liblzma/lzma/lzma_encoder.c | 56 +++++------ src/liblzma/lzma/lzma_encoder_optimum_fast.c | 9 +- src/liblzma/lzma/lzma_encoder_optimum_normal.c | 128 ++++++++++++------------- src/liblzma/lzma/lzma_encoder_private.h | 16 ++-- 8 files changed, 183 insertions(+), 186 deletions(-) commit 7c427ec38d016c0070a42315d752857e33792fc4 Author: Lasse Collin Date: 2010-10-25 12:59:25 +0300 Bump version 5.1.0alpha. src/liblzma/api/lzma/version.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) commit e45929260cd902036efd40c5610a8d0a50d5712b Author: Lasse Collin Date: 2010-10-23 17:25:52 +0300 Build: Fix mydist rule when .git doesn't exist. 
Makefile.am | 1 + 1 file changed, 1 insertion(+) commit 6e1326fcdf6b6209949be57cfe3ad4b781b65168 Author: Lasse Collin Date: 2010-10-23 14:15:35 +0300 Add NEWS for 5.0.0. NEWS | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 62 insertions(+) commit b667a3ef6338a2c1db7b7706b1f6c99ea392221c Author: Lasse Collin Date: 2010-10-23 14:02:53 +0300 Bump version to 5.0.0 and liblzma version-info to 5:0:0. src/liblzma/Makefile.am | 2 +- src/liblzma/api/lzma/version.h | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-) Index: head/contrib/xz/FREEBSD-Xlist =================================================================== --- head/contrib/xz/FREEBSD-Xlist (revision 359200) +++ head/contrib/xz/FREEBSD-Xlist (revision 359201) @@ -1,38 +1,39 @@ $FreeBSD$ */*/*/Makefile.* */*/Makefile.* */.gitignore */Makefile.* .git .gitignore ABOUT-NLS COPYING.GPLv2 COPYING.GPLv3 COPYING.LGPLv2.1 Doxyfile.in INSTALL INSTALL.generic Makefile Makefile.* NEWS PACKAGERS aclocal.m4 autogen.sh build-aux/ config.h.in configure configure.ac debug/ doc/ dos/ extra/ lib/ m4/ macosx/ makefile.am po/ +po4a/ src/*/*.rc src/scripts/ tests/ version.sh windows/ Index: head/contrib/xz/README =================================================================== --- head/contrib/xz/README (revision 359200) +++ head/contrib/xz/README (revision 359201) @@ -1,308 +1,236 @@ XZ Utils ======== 0. Overview 1. Documentation 1.1. Overall documentation 1.2. Documentation for command-line tools 1.3. Documentation for liblzma 2. Version numbering 3. Reporting bugs - 4. Translating the xz tool + 4. Translations 5. Other implementations of the .xz format 6. Contact information 0. Overview ----------- XZ Utils provide a general-purpose data-compression library plus command-line tools. The native file format is the .xz format, but also the legacy .lzma format is supported. The .xz format supports multiple compression algorithms, which are called "filters" in the context of XZ Utils. 
The primary filter is currently LZMA2. With typical files, XZ Utils create about 30 % smaller files than gzip. To ease adapting support for the .xz format into existing applications and scripts, the API of liblzma is somewhat similar to the API of the popular zlib library. For the same reason, the command-line tool xz has a command-line syntax similar to that of gzip. When aiming for the highest compression ratio, the LZMA2 encoder uses a lot of CPU time and may use, depending on the settings, even hundreds of megabytes of RAM. However, in fast modes, the LZMA2 encoder competes with bzip2 in compression speed, RAM usage, and compression ratio. LZMA2 is reasonably fast to decompress. It is a little slower than gzip, but a lot faster than bzip2. Being fast to decompress means that the .xz format is especially nice when the same file will be decompressed very many times (usually on different computers), which is the case e.g. when distributing software packages. In such situations, it's not too bad if the compression takes some time, since that needs to be done only once to benefit many people. With some file types, combining (or "chaining") LZMA2 with an additional filter can improve the compression ratio. A filter chain may contain up to four filters, although usually only one or two are used. For example, putting a BCJ (Branch/Call/Jump) filter before LZMA2 in the filter chain can improve compression ratio of executable files. Since the .xz format allows adding new filter IDs, it is possible that some day there will be a filter that is, for example, much faster to compress than LZMA2 (but probably with worse compression ratio). Similarly, it is possible that some day there is a filter that will compress better than LZMA2. - XZ Utils doesn't support multithreaded compression or decompression - yet. It has been planned though and taken into account when designing - the .xz file format. + XZ Utils supports multithreaded compression. 
XZ Utils doesn't support + multithreaded decompression yet. It has been planned though and taken + into account when designing the .xz file format. In the future, files + that were created in threaded mode can be decompressed in threaded + mode too. 1. Documentation ---------------- 1.1. Overall documentation README This file INSTALL.generic Generic install instructions for those not familiar with packages using GNU Autotools INSTALL Installation instructions specific to XZ Utils PACKAGERS Information to packagers of XZ Utils COPYING XZ Utils copyright and license information COPYING.GPLv2 GNU General Public License version 2 COPYING.GPLv3 GNU General Public License version 3 COPYING.LGPLv2.1 GNU Lesser General Public License version 2.1 AUTHORS The main authors of XZ Utils THANKS Incomplete list of people who have helped making this software NEWS User-visible changes between XZ Utils releases ChangeLog Detailed list of changes (commit log) TODO Known bugs and some sort of to-do list Note that only some of the above files are included in binary packages. 1.2. Documentation for command-line tools The command-line tools are documented as man pages. In source code releases (and possibly also in some binary packages), the man pages are also provided in plain text (ASCII only) and PDF formats in the directory "doc/man" to make the man pages more accessible to those whose operating system doesn't provide an easy way to view man pages. 1.3. Documentation for liblzma The liblzma API headers include short docs about each function and data type as Doxygen tags. These docs should be quite OK as a quick reference. - I have planned to write a bunch of very well documented example - programs, which (due to comments) should work as a tutorial to - various features of liblzma. No such example programs have been - written yet. + There are a few example/tutorial programs that should help in + getting started with liblzma. 
In the source package the examples + are in "doc/examples" and in binary packages they may be under + "examples" in the same directory as this README. - For now, if you have never used liblzma, libbzip2, or zlib, I - recommend learning the *basics* of the zlib API. Once you know that, - it should be easier to learn liblzma. + Since the liblzma API has similarities to the zlib API, some people + may find it useful to read the zlib docs and tutorial too: http://zlib.net/manual.html http://zlib.net/zlib_how.html 2. Version numbering -------------------- The version number format of XZ Utils is X.Y.ZS: - X is the major version. When this is incremented, the library API and ABI break. - Y is the minor version. It is incremented when new features are added without breaking the existing API or ABI. An even Y indicates a stable release and an odd Y indicates unstable (alpha or beta version). - Z is the revision. This has a different meaning for stable and unstable releases: * Stable: Z is incremented when bugs get fixed without adding any new features. This is intended to be convenient for downstream distributors that want bug fixes but don't want any new features to minimize the risk of introducing new bugs. * Unstable: Z is just a counter. API or ABI of features added in earlier unstable releases having the same X.Y may break. - S indicates stability of the release. It is missing from the stable releases, where Y is an even number. When Y is odd, S is either "alpha" or "beta" to make it very clear that such versions are not stable releases. The same X.Y.Z combination is not used for more than one stability level, i.e. after X.Y.Zalpha, the next version can be X.Y.(Z+1)beta but not X.Y.Zbeta. 3. Reporting bugs ----------------- Naturally it is easiest for me if you already know what causes the unexpected behavior. Even better if you have a patch to propose. 
However, quite often the reason for unexpected behavior is unknown, so here are a few things to do before sending a bug report: 1. Try to create a small example how to reproduce the issue. 2. Compile XZ Utils with debugging code using configure switches --enable-debug and, if possible, --disable-shared. If you are using GCC, use CFLAGS='-O0 -ggdb3'. Don't strip the resulting binaries. 3. Turn on core dumps. The exact command depends on your shell; for example in GNU bash it is done with "ulimit -c unlimited", and in tcsh with "limit coredumpsize unlimited". 4. Try to reproduce the suspected bug. If you get "assertion failed" message, be sure to include the complete message in your bug report. If the application leaves a coredump, get a backtrace using gdb: $ gdb /path/to/app-binary # Load the app to the debugger. (gdb) core core # Open the coredump. (gdb) bt # Print the backtrace. Copy & paste to bug report. (gdb) quit # Quit gdb. Report your bug via email or IRC (see Contact information below). Don't send core dump files or any executables. If you have a small example file(s) (total size less than 256 KiB), please include it/them as an attachment. If you have bigger test files, put them online somewhere and include a URL to the file(s) in the bug report. Always include the exact version number of XZ Utils in the bug report. If you are using a snapshot from the git repository, use "git describe" to get the exact snapshot version. If you are using XZ Utils shipped in an operating system distribution, mention the distribution name, distribution version, and exact xz package version; if you cannot repeat the bug with the code compiled from unpatched source code, you probably need to report a bug to your distribution's bug tracking system. -4. Translating the xz tool --------------------------- +4. Translations +--------------- - The messages from the xz tool have been translated into a few - languages. 
Before starting to translate into a new language, ask - the author whether someone else hasn't already started working on it. + The xz command line tool and all man pages can be translated. + The translations are handled via the Translation Project. If you + wish to help translating xz, please join the Translation Project: - Test your translation. Testing includes comparing the translated - output to the original English version by running the same commands - in both your target locale and with LC_ALL=C. Ask someone to - proof-read and test the translation. + https://translationproject.org/html/translators.html - Testing can be done e.g. by installing xz into a temporary directory: - - ./configure --disable-shared --prefix=/tmp/xz-test - # - make -C po update-po - make install - bash debug/translation.bash | less - bash debug/translation.bash | less -S # For --list outputs - - Repeat the above as needed (no need to re-run configure though). - - Note especially the following: - - - The output of --help and --long-help must look nice on - an 80-column terminal. It's OK to add extra lines if needed. - - - In contrast, don't add extra lines to error messages and such. - They are often preceded with e.g. a filename on the same line, - so you have no way to predict where to put a \n. Let the terminal - do the wrapping even if it looks ugly. Adding new lines will be - even uglier in the generic case even if it looks nice in a few - limited examples. - - - Be careful with column alignment in tables and table-like output - (--list, --list --verbose --verbose, --info-memory, --help, and - --long-help): - - * All descriptions of options in --help should start in the - same column (but it doesn't need to be the same column as - in the English messages; just be consistent if you change it). - Check that both --help and --long-help look OK, since they - share several strings. - - * --list --verbose and --info-memory print lines that have - the format "Description: %s". 
If you need a longer - description, you can put extra space between the colon - and %s. Then you may need to add extra space to other - strings too so that the result as a whole looks good (all - values start at the same column). - - * The columns of the actual tables in --list --verbose --verbose - should be aligned properly. Abbreviate if necessary. It might - be good to keep at least 2 or 3 spaces between column headings - and avoid spaces in the headings so that the columns stand out - better, but this is a matter of opinion. Do what you think - looks best. - - - Be careful to put a period at the end of a sentence when the - original version has it, and don't put it when the original - doesn't have it. Similarly, be careful with \n characters - at the beginning and end of the strings. - - - Read the TRANSLATORS comments that have been extracted from the - source code and included in xz.pot. If they suggest testing the - translation with some type of command, do it. If testing needs - input files, use e.g. tests/files/good-*.xz. - - - When updating the translation, read the fuzzy (modified) strings - carefully, and don't mark them as updated before you actually - have updated them. Reading through the unchanged messages can be - good too; sometimes you may find a better wording for them. - - - If you find language problems in the original English strings, - feel free to suggest improvements. Ask if something is unclear. - - - The translated messages should be understandable (sometimes this - may be a problem with the original English messages too). Don't - make a direct word-by-word translation from English especially if - the result doesn't sound good in your language. - - In short, take your time and pay attention to the details. Making - a good translation is not a quick and trivial thing to do. The - translated xz should look as polished as the English version. 
+ Several strings will change in a future version of xz so if you + wish to start a new translation, look at the code in the xz git + repository instead of a 5.2.x release. 5. Other implementations of the .xz format ------------------------------------------ 7-Zip and the p7zip port of 7-Zip support the .xz format starting from the version 9.00alpha. http://7-zip.org/ http://p7zip.sourceforge.net/ XZ Embedded is a limited implementation written for use in the Linux kernel, but it is also suitable for other embedded use. https://tukaani.org/xz/embedded.html 6. Contact information ---------------------- If you have questions, bug reports, patches etc. related to XZ Utils, contact Lasse Collin (in Finnish or English). I'm sometimes slow at replying. If you haven't got a reply within two weeks, assume that your email has got lost and resend it or use IRC. You can find me also from #tukaani on Freenode; my nick is Larhzu. The channel tends to be pretty quiet, so just ask your question and someone may wake up. Index: head/contrib/xz/THANKS =================================================================== --- head/contrib/xz/THANKS (revision 359200) +++ head/contrib/xz/THANKS (revision 359201) @@ -1,124 +1,134 @@ Thanks ====== Some people have helped more, some less, but nevertheless everyone's help has been important. :-) In alphabetical order: - Mark Adler - H. Peter Anvin - Jeff Bastian - Nelson H. F. Beebe - Karl Berry - Anders F. Björklund - Emmanuel Blot - Melanie Blower - Martin Blumenstingl - Ben Boeckel - Jakub Bogusz - Maarten Bosmans - Trent W. Buck - James Buren - David Burklund - Daniel Mealha Cabrita - Milo Casagrande - Marek Černocký - Tomer Chachamu + - Antoine Cœur - Gabi Davar - Chris Donawa - Andrew Dudman - Markus Duft - İsmail Dönmez - Robert Elz - Gilles Espinasse - Denis Excoffier - Michael Felt - Michael Fox - Mike Frysinger - Daniel Richard G.
- Bill Glessner - Jason Gorski - Juan Manuel Guerrero - Diederik de Haas - Joachim Henke - Christian Hesse - Vincenzo Innocente - Peter Ivanov - Jouk Jansen - Jun I Jin + - Kiyoshi Kanazawa - Per Øyvind Karlsen - Thomas Klausner - Richard Koch - Ville Koskinen - Jan Kratochvil - Christian Kujau - Stephan Kulow - Peter Lawler - James M Leddy - Hin-Tak Leung - Andraž 'ruskie' Levstik - Cary Lewis - Wim Lewis + - Xin Li - Eric Lindblad - Lorenzo De Liso - Bela Lubkin - Gregory Margo + - Julien Marrec + - Martin Matuška - Jim Meyering - Arkadiusz Miskiewicz - Conley Moorhous - Rafał Mużyło - Adrien Nader - Evan Nemerson - Hongbo Ni - Jonathan Nieder - Andre Noll - Peter O'Gorman + - Filip Palian - Peter Pallinger - Rui Paulo - Igor Pavlov - Diego Elio Pettenò - Elbert Pol - Mikko Pouru - Rich Prohaska - Trần Ngọc Quân - Pavel Raiskup - Ole André Vadla Ravnås - Robert Readman - Bernhard Reutner-Fischer - Eric S. Raymond - Cristian Rodríguez - Christian von Roques - Torsten Rupp - Jukka Salmi - Alexandre Sauvé - Benno Schulenberg - Andreas Schwab + - Bhargava Shastry - Dan Shechter - Stuart Shelton - Sebastian Andrzej Siewior - Brad Smith + - Bruce Stark - Pippijn van Steenhoven - Jonathan Stott - Dan Stromberg - Vincent Torri - Paul Townsend - Mohammed Adnène Trojette - Alexey Tourbin + - Loganaden Velvindron - Patrick J. Volkerding - Martin Väth - Adam Walling + - Jeffrey Walton - Christian Weisgerber - Bert Wesarg - Fredrik Wikstrom - Jim Wilcoxson - Ralf Wildenhues - Charles Wilson - Lars Wirzenius - Pilorz Wojciech - Ryan Young - Andreas Zieringer Also thanks to all the people who have participated in the Tukaani project. I have probably forgot to add some names to the above list. Sorry about that and thanks for your help. 
Index: head/contrib/xz/src/common/sysdefs.h
===================================================================
--- head/contrib/xz/src/common/sysdefs.h (revision 359200)
+++ head/contrib/xz/src/common/sysdefs.h (revision 359201)
@@ -1,202 +1,199 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file sysdefs.h
 /// \brief Common includes, definitions, system-specific things etc.
 ///
 /// This file is used also by the lzma command line tool, that's why this
 /// file is separate from common.h.
 //
 // Author: Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////
 #ifndef LZMA_SYSDEFS_H
 #define LZMA_SYSDEFS_H
 //////////////
 // Includes //
 //////////////
 #ifdef HAVE_CONFIG_H
 # include <config.h>
 #endif
 // Get standard-compliant stdio functions under MinGW and MinGW-w64.
 #ifdef __MINGW32__
 # define __USE_MINGW_ANSI_STDIO 1
 #endif
 // size_t and NULL
 #include <stddef.h>
 #ifdef HAVE_INTTYPES_H
 # include <inttypes.h>
 #endif
 // C99 says that inttypes.h always includes stdint.h, but some systems
 // don't do that, and require including stdint.h separately.
 #ifdef HAVE_STDINT_H
 # include <stdint.h>
 #endif
 // Some pre-C99 systems have SIZE_MAX in limits.h instead of stdint.h. The
 // limits are also used to figure out some macros missing from pre-C99 systems.
-#ifdef HAVE_LIMITS_H
-# include <limits.h>
-#endif
+#include <limits.h>
 // Be more compatible with systems that have non-conforming inttypes.h.
 // We assume that int is 32-bit and that long is either 32-bit or 64-bit.
 // Full Autoconf test could be more correct, but this should work well enough.
 // Note that this duplicates some code from lzma.h, but this is better since
 // we can work without inttypes.h thanks to Autoconf tests.
 #ifndef UINT32_C
 # if UINT_MAX != 4294967295U
 # error UINT32_C is not defined and unsigned int is not 32-bit.
 # endif
 # define UINT32_C(n) n ## U
 #endif
 #ifndef UINT32_MAX
 # define UINT32_MAX UINT32_C(4294967295)
 #endif
 #ifndef PRIu32
 # define PRIu32 "u"
 #endif
 #ifndef PRIx32
 # define PRIx32 "x"
 #endif
 #ifndef PRIX32
 # define PRIX32 "X"
 #endif
 #if ULONG_MAX == 4294967295UL
 # ifndef UINT64_C
 # define UINT64_C(n) n ## ULL
 # endif
 # ifndef PRIu64
 # define PRIu64 "llu"
 # endif
 # ifndef PRIx64
 # define PRIx64 "llx"
 # endif
 # ifndef PRIX64
 # define PRIX64 "llX"
 # endif
 #else
 # ifndef UINT64_C
 # define UINT64_C(n) n ## UL
 # endif
 # ifndef PRIu64
 # define PRIu64 "lu"
 # endif
 # ifndef PRIx64
 # define PRIx64 "lx"
 # endif
 # ifndef PRIX64
 # define PRIX64 "lX"
 # endif
 #endif
 #ifndef UINT64_MAX
 # define UINT64_MAX UINT64_C(18446744073709551615)
 #endif
 // Incorrect(?) SIZE_MAX:
 // - Interix headers typedef size_t to unsigned long,
 // but a few lines later define SIZE_MAX to INT32_MAX.
 // - SCO OpenServer (x86) headers typedef size_t to unsigned int
 // but define SIZE_MAX to INT32_MAX.
 #if defined(__INTERIX) || defined(_SCO_DS)
 # undef SIZE_MAX
 #endif
 // The code currently assumes that size_t is either 32-bit or 64-bit.
 #ifndef SIZE_MAX
 # if SIZEOF_SIZE_T == 4
 # define SIZE_MAX UINT32_MAX
 # elif SIZEOF_SIZE_T == 8
 # define SIZE_MAX UINT64_MAX
 # else
 # error size_t is not 32-bit or 64-bit
 # endif
 #endif
 #if SIZE_MAX != UINT32_MAX && SIZE_MAX != UINT64_MAX
 # error size_t is not 32-bit or 64-bit
 #endif
 #include <stdlib.h>
 #include <assert.h>
 // Pre-C99 systems lack stdbool.h. All the code in LZMA Utils must be written
 // so that it works with fake bool type, for example:
 //
 // bool foo = (flags & 0x100) != 0;
 // bool bar = !!(flags & 0x100);
 //
 // This works with the real C99 bool but breaks with fake bool:
 //
 // bool baz = (flags & 0x100);
 //
 #ifdef HAVE_STDBOOL_H
 # include <stdbool.h>
 #else
 # if ! HAVE__BOOL
 typedef unsigned char _Bool;
 # endif
 # define bool _Bool
 # define false 0
 # define true 1
 # define __bool_true_false_are_defined 1
 #endif
 // string.h should be enough but let's include strings.h and memory.h too if
 // they exist, since that shouldn't do any harm, but may improve portability.
-#ifdef HAVE_STRING_H
-# include <string.h>
-#endif
+#include <string.h>
 #ifdef HAVE_STRINGS_H
 # include <strings.h>
 #endif
 #ifdef HAVE_MEMORY_H
 # include <memory.h>
 #endif
 // As of MSVC 2013, inline and restrict are supported with
 // non-standard keywords.
 #if defined(_WIN32) && defined(_MSC_VER)
 # ifndef inline
 # define inline __inline
 # endif
 # ifndef restrict
 # define restrict __restrict
 # endif
 #endif
 ////////////
 // Macros //
 ////////////
 #undef memzero
 #define memzero(s, n) memset(s, 0, n)
 // NOTE: Avoid using MIN() and MAX(), because even conditionally defining
 // those macros can cause some portability trouble, since on some systems
 // the system headers insist defining their own versions.
 #define my_min(x, y) ((x) < (y) ? (x) : (y))
 #define my_max(x, y) ((x) > (y) ? (x) : (y))
 #ifndef ARRAY_SIZE
 # define ARRAY_SIZE(array) (sizeof(array) / sizeof((array)[0]))
 #endif
-#if (__GNUC__ == 4 && __GNUC_MINOR__ >= 3) || __GNUC__ > 4
+#if defined(__GNUC__) \
+ && ((__GNUC__ == 4 && __GNUC_MINOR__ >= 3) || __GNUC__ > 4)
 # define lzma_attr_alloc_size(x) __attribute__((__alloc_size__(x)))
 #else
 # define lzma_attr_alloc_size(x)
 #endif
 #endif
Index: head/contrib/xz/src/common/tuklib_cpucores.c
===================================================================
--- head/contrib/xz/src/common/tuklib_cpucores.c (revision 359200)
+++ head/contrib/xz/src/common/tuklib_cpucores.c (revision 359201)
@@ -1,100 +1,100 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file tuklib_cpucores.c
 /// \brief Get the number of CPU cores online
 //
 // Author: Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////
 #include "tuklib_cpucores.h"
 #if defined(_WIN32) || defined(__CYGWIN__)
 # ifndef _WIN32_WINNT
 # define _WIN32_WINNT 0x0500
 # endif
 # include <windows.h>
 // glibc >= 2.9
 #elif defined(TUKLIB_CPUCORES_SCHED_GETAFFINITY)
 # include <sched.h>
 // FreeBSD
 #elif defined(TUKLIB_CPUCORES_CPUSET)
 # include <sys/param.h>
 # include <sys/cpuset.h>
 #elif defined(TUKLIB_CPUCORES_SYSCTL)
 # ifdef HAVE_SYS_PARAM_H
 # include <sys/param.h>
 # endif
 # include <sys/sysctl.h>
 #elif defined(TUKLIB_CPUCORES_SYSCONF)
 # include <unistd.h>
 // HP-UX
 #elif defined(TUKLIB_CPUCORES_PSTAT_GETDYNAMIC)
 # include <sys/param.h>
 # include <sys/pstat.h>
 #endif
 extern uint32_t
 tuklib_cpucores(void)
 {
 	uint32_t ret = 0;
 #if defined(_WIN32) || defined(__CYGWIN__)
 	SYSTEM_INFO sysinfo;
 	GetSystemInfo(&sysinfo);
 	ret = sysinfo.dwNumberOfProcessors;
 #elif defined(TUKLIB_CPUCORES_SCHED_GETAFFINITY)
 	cpu_set_t cpu_mask;
 	if (sched_getaffinity(0, sizeof(cpu_mask), &cpu_mask) == 0)
-		ret = CPU_COUNT(&cpu_mask);
+		ret = (uint32_t)CPU_COUNT(&cpu_mask);
 #elif defined(TUKLIB_CPUCORES_CPUSET)
 	cpuset_t set;
 	if (cpuset_getaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID, -1,
 			sizeof(set), &set) == 0) {
 # ifdef CPU_COUNT
-		ret = CPU_COUNT(&set);
+		ret = (uint32_t)CPU_COUNT(&set);
 # else
 		for (unsigned i = 0; i < CPU_SETSIZE; ++i)
 			if (CPU_ISSET(i, &set))
 				++ret;
 # endif
 	}
 #elif defined(TUKLIB_CPUCORES_SYSCTL)
 	int name[2] = { CTL_HW, HW_NCPU };
 	int cpus;
 	size_t cpus_size = sizeof(cpus);
 	if (sysctl(name, 2, &cpus, &cpus_size, NULL, 0) != -1
 			&& cpus_size == sizeof(cpus) && cpus > 0)
-		ret = cpus;
+		ret = (uint32_t)cpus;
 #elif defined(TUKLIB_CPUCORES_SYSCONF)
 # ifdef _SC_NPROCESSORS_ONLN
 	// Most systems
 	const long cpus = sysconf(_SC_NPROCESSORS_ONLN);
 # else
 	// IRIX
 	const long cpus = sysconf(_SC_NPROC_ONLN);
 # endif
 	if (cpus > 0)
-		ret = cpus;
+		ret = (uint32_t)cpus;
 #elif defined(TUKLIB_CPUCORES_PSTAT_GETDYNAMIC)
 	struct pst_dynamic pst;
 	if (pstat_getdynamic(&pst, sizeof(pst), 1, 0) != -1)
-		ret = pst.psd_proc_cnt;
+		ret = (uint32_t)pst.psd_proc_cnt;
 #endif
 	return ret;
 }
Index: head/contrib/xz/src/common/tuklib_exit.c
===================================================================
--- head/contrib/xz/src/common/tuklib_exit.c (revision 359200)
+++ head/contrib/xz/src/common/tuklib_exit.c (revision 359201)
@@ -1,57 +1,58 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file tuklib_exit.c
 /// \brief Close stdout and stderr, and exit
 //
 // Author: Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////
 #include "tuklib_common.h"
 #include <stdlib.h>
 #include <stdio.h>
+#include <string.h>
 #include "tuklib_gettext.h"
 #include "tuklib_progname.h"
 #include "tuklib_exit.h"
 extern void
 tuklib_exit(int status, int err_status, int show_error)
 {
 	if (status != err_status) {
 		// Close stdout. If something goes wrong,
 		// print an error message to stderr.
 		const int ferror_err = ferror(stdout);
 		const int fclose_err = fclose(stdout);
 		if (ferror_err || fclose_err) {
 			status = err_status;
 			// If it was fclose() that failed, we have the reason
 			// in errno. If only ferror() indicated an error,
 			// we have no idea what the reason was.
 			if (show_error)
 				fprintf(stderr, "%s: %s: %s\n", progname,
 						_("Writing to standard "
 							"output failed"),
 						fclose_err ? strerror(errno)
 							: _("Unknown error"));
 		}
 	}
 	if (status != err_status) {
 		// Close stderr. If something goes wrong, there's
 		// nothing where we could print an error message.
 		// Just set the exit status.
 		const int ferror_err = ferror(stderr);
 		const int fclose_err = fclose(stderr);
 		if (fclose_err || ferror_err)
 			status = err_status;
 	}
 	exit(status);
 }
Index: head/contrib/xz/src/common/tuklib_integer.h
===================================================================
--- head/contrib/xz/src/common/tuklib_integer.h (revision 359200)
+++ head/contrib/xz/src/common/tuklib_integer.h (revision 359201)
@@ -1,534 +1,742 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file tuklib_integer.h
 /// \brief Various integer and bit operations
 ///
 /// This file provides macros or functions to do some basic integer and bit
 /// operations.
 ///
-/// Endianness related integer operations (XX = 16, 32, or 64; Y = b or l):
+/// Native endian inline functions (XX = 16, 32, or 64):
+/// - Unaligned native endian reads: readXXne(ptr)
+/// - Unaligned native endian writes: writeXXne(ptr, num)
+/// - Aligned native endian reads: aligned_readXXne(ptr)
+/// - Aligned native endian writes: aligned_writeXXne(ptr, num)
+///
+/// Endianness-converting integer operations (these can be macros!)
+/// (XX = 16, 32, or 64; Y = b or l):
 /// - Byte swapping: bswapXX(num)
-/// - Byte order conversions to/from native: convXXYe(num)
-/// - Aligned reads: readXXYe(ptr)
-/// - Aligned writes: writeXXYe(ptr, num)
-/// - Unaligned reads (16/32-bit only): unaligned_readXXYe(ptr)
-/// - Unaligned writes (16/32-bit only): unaligned_writeXXYe(ptr, num)
+/// - Byte order conversions to/from native (byteswaps if Y isn't
+/// the native endianness): convXXYe(num)
+/// - Unaligned reads (16/32-bit only): readXXYe(ptr)
+/// - Unaligned writes (16/32-bit only): writeXXYe(ptr, num)
+/// - Aligned reads: aligned_readXXYe(ptr)
+/// - Aligned writes: aligned_writeXXYe(ptr, num)
 ///
-/// Since they can macros, the arguments should have no side effects since
-/// they may be evaluated more than once.
+/// Since the above can be macros, the arguments should have no side effects
+/// because they may be evaluated more than once.
 ///
-/// \todo PowerPC and possibly some other architectures support
-/// byte swapping load and store instructions. This file
-/// doesn't take advantage of those instructions.
-///
-/// Bit scan operations for non-zero 32-bit integers:
+/// Bit scan operations for non-zero 32-bit integers (inline functions):
 /// - Bit scan reverse (find highest non-zero bit): bsr32(num)
 /// - Count leading zeros: clz32(num)
 /// - Count trailing zeros: ctz32(num)
 /// - Bit scan forward (simply an alias for ctz32()): bsf32(num)
 ///
 /// The above bit scan operations return 0-31. If num is zero,
 /// the result is undefined.
 //
 // Authors: Lasse Collin
 // Joachim Henke
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////
 #ifndef TUKLIB_INTEGER_H
 #define TUKLIB_INTEGER_H
 #include "tuklib_common.h"
+#include <string.h>
+// Newer Intel C compilers require immintrin.h for _bit_scan_reverse()
+// and such functions.
+#if defined(__INTEL_COMPILER) && (__INTEL_COMPILER >= 1500)
+# include <immintrin.h>
+#endif
-////////////////////////////////////////
-// Operating system specific features //
-////////////////////////////////////////
-#if defined(HAVE_BYTESWAP_H)
+///////////////////
+// Byte swapping //
+///////////////////
+
+#if defined(HAVE___BUILTIN_BSWAPXX)
+	// GCC >= 4.8 and Clang
+# define bswap16(n) __builtin_bswap16(n)
+# define bswap32(n) __builtin_bswap32(n)
+# define bswap64(n) __builtin_bswap64(n)
+
+#elif defined(HAVE_BYTESWAP_H)
 	// glibc, uClibc, dietlibc
 # include <byteswap.h>
 # ifdef HAVE_BSWAP_16
 # define bswap16(num) bswap_16(num)
 # endif
 # ifdef HAVE_BSWAP_32
 # define bswap32(num) bswap_32(num)
 # endif
 # ifdef HAVE_BSWAP_64
 # define bswap64(num) bswap_64(num)
 # endif
 #elif defined(HAVE_SYS_ENDIAN_H)
 	// *BSDs and Darwin
 # include <sys/endian.h>
 #elif defined(HAVE_SYS_BYTEORDER_H)
 	// Solaris
 # include <sys/byteorder.h>
 # ifdef BSWAP_16
 # define bswap16(num) BSWAP_16(num)
 # endif
 # ifdef BSWAP_32
 # define bswap32(num) BSWAP_32(num)
 # endif
 # ifdef BSWAP_64
 # define bswap64(num) BSWAP_64(num)
 # endif
 # ifdef BE_16
 # define conv16be(num) BE_16(num)
 # endif
 # ifdef BE_32
 # define conv32be(num) BE_32(num)
 # endif
 # ifdef BE_64
 # define conv64be(num) BE_64(num)
 # endif
 # ifdef LE_16
 # define conv16le(num) LE_16(num)
 # endif
 # ifdef LE_32
 # define conv32le(num) LE_32(num)
 # endif
 # ifdef LE_64
 # define conv64le(num) LE_64(num)
 # endif
 #endif
-
-////////////////////////////////
-// Compiler-specific features //
-////////////////////////////////
-
-// Newer Intel C compilers require immintrin.h for _bit_scan_reverse()
-// and such functions.
-#if defined(__INTEL_COMPILER) && (__INTEL_COMPILER >= 1500) -# include <immintrin.h> -#endif - - -/////////////////// -// Byte swapping // -/////////////////// - #ifndef bswap16 -# define bswap16(num) \ - (((uint16_t)(num) << 8) | ((uint16_t)(num) >> 8)) +# define bswap16(n) (uint16_t)( \ + (((n) & 0x00FFU) << 8) \ + | (((n) & 0xFF00U) >> 8) \ + ) #endif #ifndef bswap32 -# define bswap32(num) \ - ( (((uint32_t)(num) << 24) ) \ - | (((uint32_t)(num) << 8) & UINT32_C(0x00FF0000)) \ - | (((uint32_t)(num) >> 8) & UINT32_C(0x0000FF00)) \ - | (((uint32_t)(num) >> 24) ) ) +# define bswap32(n) (uint32_t)( \ + (((n) & UINT32_C(0x000000FF)) << 24) \ + | (((n) & UINT32_C(0x0000FF00)) << 8) \ + | (((n) & UINT32_C(0x00FF0000)) >> 8) \ + | (((n) & UINT32_C(0xFF000000)) >> 24) \ + ) #endif #ifndef bswap64 -# define bswap64(num) \ - ( (((uint64_t)(num) << 56) ) \ - | (((uint64_t)(num) << 40) & UINT64_C(0x00FF000000000000)) \ - | (((uint64_t)(num) << 24) & UINT64_C(0x0000FF0000000000)) \ - | (((uint64_t)(num) << 8) & UINT64_C(0x000000FF00000000)) \ - | (((uint64_t)(num) >> 8) & UINT64_C(0x00000000FF000000)) \ - | (((uint64_t)(num) >> 24) & UINT64_C(0x0000000000FF0000)) \ - | (((uint64_t)(num) >> 40) & UINT64_C(0x000000000000FF00)) \ - | (((uint64_t)(num) >> 56) ) ) +# define bswap64(n) (uint64_t)( \ + (((n) & UINT64_C(0x00000000000000FF)) << 56) \ + | (((n) & UINT64_C(0x000000000000FF00)) << 40) \ + | (((n) & UINT64_C(0x0000000000FF0000)) << 24) \ + | (((n) & UINT64_C(0x00000000FF000000)) << 8) \ + | (((n) & UINT64_C(0x000000FF00000000)) >> 8) \ + | (((n) & UINT64_C(0x0000FF0000000000)) >> 24) \ + | (((n) & UINT64_C(0x00FF000000000000)) >> 40) \ + | (((n) & UINT64_C(0xFF00000000000000)) >> 56) \ + ) #endif // Define conversion macros using the basic byte swapping macros.
#ifdef WORDS_BIGENDIAN # ifndef conv16be # define conv16be(num) ((uint16_t)(num)) # endif # ifndef conv32be # define conv32be(num) ((uint32_t)(num)) # endif # ifndef conv64be # define conv64be(num) ((uint64_t)(num)) # endif # ifndef conv16le # define conv16le(num) bswap16(num) # endif # ifndef conv32le # define conv32le(num) bswap32(num) # endif # ifndef conv64le # define conv64le(num) bswap64(num) # endif #else # ifndef conv16be # define conv16be(num) bswap16(num) # endif # ifndef conv32be # define conv32be(num) bswap32(num) # endif # ifndef conv64be # define conv64be(num) bswap64(num) # endif # ifndef conv16le # define conv16le(num) ((uint16_t)(num)) # endif # ifndef conv32le # define conv32le(num) ((uint32_t)(num)) # endif # ifndef conv64le # define conv64le(num) ((uint64_t)(num)) # endif #endif -////////////////////////////// -// Aligned reads and writes // -////////////////////////////// +//////////////////////////////// +// Unaligned reads and writes // +//////////////////////////////// -static inline uint16_t -read16be(const uint8_t *buf) -{ - uint16_t num = *(const uint16_t *)buf; - return conv16be(num); -} +// The traditional way of casting e.g. *(const uint16_t *)uint8_pointer +// is bad even if the uint8_pointer is properly aligned because this kind +// of casts break strict aliasing rules and result in undefined behavior. +// With unaligned pointers it's even worse: compilers may emit vector +// instructions that require aligned pointers even if non-vector +// instructions work with unaligned pointers. +// +// Using memcpy() is the standard compliant way to do unaligned access. +// Many modern compilers inline it so there is no function call overhead. +// For those compilers that don't handle the memcpy() method well, the +// old casting method (that violates strict aliasing) can be requested at +// build time. A third method, casting to a packed struct, would also be +// an option but isn't provided to keep things simpler (it's already a mess). 
+// Hopefully this is flexible enough in practice. - static inline uint16_t -read16le(const uint8_t *buf) +read16ne(const uint8_t *buf) { - uint16_t num = *(const uint16_t *)buf; - return conv16le(num); +#if defined(TUKLIB_FAST_UNALIGNED_ACCESS) \ + && defined(TUKLIB_USE_UNSAFE_TYPE_PUNNING) + return *(const uint16_t *)buf; +#else + uint16_t num; + memcpy(&num, buf, sizeof(num)); + return num; +#endif } static inline uint32_t -read32be(const uint8_t *buf) +read32ne(const uint8_t *buf) { - uint32_t num = *(const uint32_t *)buf; - return conv32be(num); +#if defined(TUKLIB_FAST_UNALIGNED_ACCESS) \ + && defined(TUKLIB_USE_UNSAFE_TYPE_PUNNING) + return *(const uint32_t *)buf; +#else + uint32_t num; + memcpy(&num, buf, sizeof(num)); + return num; +#endif } -static inline uint32_t -read32le(const uint8_t *buf) -{ - uint32_t num = *(const uint32_t *)buf; - return conv32le(num); -} - - static inline uint64_t -read64be(const uint8_t *buf) +read64ne(const uint8_t *buf) { - uint64_t num = *(const uint64_t *)buf; - return conv64be(num); +#if defined(TUKLIB_FAST_UNALIGNED_ACCESS) \ + && defined(TUKLIB_USE_UNSAFE_TYPE_PUNNING) + return *(const uint64_t *)buf; +#else + uint64_t num; + memcpy(&num, buf, sizeof(num)); + return num; +#endif } -static inline uint64_t -read64le(const uint8_t *buf) -{ - uint64_t num = *(const uint64_t *)buf; - return conv64le(num); -} - - -// NOTE: Possible byte swapping must be done in a macro to allow GCC -// to optimize byte swapping of constants when using glibc's or *BSD's -// byte swapping macros. The actual write is done in an inline function -// to make type checking of the buf pointer possible similarly to readXXYe() -// functions. 
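The memcpy() method used by read16ne(), read32ne() and read64ne() above can be tried outside tuklib. The following is a minimal sketch under stated assumptions, not the tuklib code itself: the demo_* names are hypothetical, and the big and little endian readers are assembled byte by byte (like the fallback branches in this header) so that their results do not depend on the byte order of the host.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical helper, not part of tuklib: read a 32-bit integer in
// native byte order from a possibly unaligned address. memcpy() avoids
// both the alignment requirement and the strict aliasing violation of
// *(const uint32_t *)buf.
static inline uint32_t
demo_read32ne(const uint8_t *buf)
{
	uint32_t num;
	memcpy(&num, buf, sizeof(num));
	return num;
}

// Hypothetical helper: big endian read done one byte at a time, so it
// works with any alignment and on any host byte order.
static inline uint32_t
demo_read32be(const uint8_t *buf)
{
	uint32_t num = (uint32_t)buf[0] << 24;
	num |= (uint32_t)buf[1] << 16;
	num |= (uint32_t)buf[2] << 8;
	num |= (uint32_t)buf[3];
	return num;
}

// Hypothetical helper: little endian counterpart of demo_read32be().
static inline uint32_t
demo_read32le(const uint8_t *buf)
{
	uint32_t num = (uint32_t)buf[0];
	num |= (uint32_t)buf[1] << 8;
	num |= (uint32_t)buf[2] << 16;
	num |= (uint32_t)buf[3] << 24;
	return num;
}
```

For example, given const uint8_t buf[5] = { 0x00, 0x78, 0x56, 0x34, 0x12 }, demo_read32le(buf + 1) yields 0x12345678 and demo_read32be(buf + 1) yields 0x78563412 even though buf + 1 is not 4-byte aligned. On a little or big endian host, demo_read32ne(buf + 1) matches one of the two.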
- -#define write16be(buf, num) write16ne((buf), conv16be(num)) -#define write16le(buf, num) write16ne((buf), conv16le(num)) -#define write32be(buf, num) write32ne((buf), conv32be(num)) -#define write32le(buf, num) write32ne((buf), conv32le(num)) -#define write64be(buf, num) write64ne((buf), conv64be(num)) -#define write64le(buf, num) write64ne((buf), conv64le(num)) - - static inline void write16ne(uint8_t *buf, uint16_t num) { +#if defined(TUKLIB_FAST_UNALIGNED_ACCESS) \ + && defined(TUKLIB_USE_UNSAFE_TYPE_PUNNING) *(uint16_t *)buf = num; +#else + memcpy(buf, &num, sizeof(num)); +#endif return; } static inline void write32ne(uint8_t *buf, uint32_t num) { +#if defined(TUKLIB_FAST_UNALIGNED_ACCESS) \ + && defined(TUKLIB_USE_UNSAFE_TYPE_PUNNING) *(uint32_t *)buf = num; +#else + memcpy(buf, &num, sizeof(num)); +#endif return; } static inline void write64ne(uint8_t *buf, uint64_t num) { +#if defined(TUKLIB_FAST_UNALIGNED_ACCESS) \ + && defined(TUKLIB_USE_UNSAFE_TYPE_PUNNING) *(uint64_t *)buf = num; +#else + memcpy(buf, &num, sizeof(num)); +#endif return; } -//////////////////////////////// -// Unaligned reads and writes // -//////////////////////////////// - -// NOTE: TUKLIB_FAST_UNALIGNED_ACCESS indicates only support for 16-bit and -// 32-bit unaligned integer loads and stores. It's possible that 64-bit -// unaligned access doesn't work or is slower than byte-by-byte access. -// Since unaligned 64-bit is probably not needed as often as 16-bit or -// 32-bit, we simply don't support 64-bit unaligned access for now. 
-#ifdef TUKLIB_FAST_UNALIGNED_ACCESS -# define unaligned_read16be read16be -# define unaligned_read16le read16le -# define unaligned_read32be read32be -# define unaligned_read32le read32le -# define unaligned_write16be write16be -# define unaligned_write16le write16le -# define unaligned_write32be write32be -# define unaligned_write32le write32le - -#else - static inline uint16_t -unaligned_read16be(const uint8_t *buf) +read16be(const uint8_t *buf) { +#if defined(WORDS_BIGENDIAN) || defined(TUKLIB_FAST_UNALIGNED_ACCESS) + uint16_t num = read16ne(buf); + return conv16be(num); +#else uint16_t num = ((uint16_t)buf[0] << 8) | (uint16_t)buf[1]; return num; +#endif } static inline uint16_t -unaligned_read16le(const uint8_t *buf) +read16le(const uint8_t *buf) { +#if !defined(WORDS_BIGENDIAN) || defined(TUKLIB_FAST_UNALIGNED_ACCESS) + uint16_t num = read16ne(buf); + return conv16le(num); +#else uint16_t num = ((uint16_t)buf[0]) | ((uint16_t)buf[1] << 8); return num; +#endif } static inline uint32_t -unaligned_read32be(const uint8_t *buf) +read32be(const uint8_t *buf) { +#if defined(WORDS_BIGENDIAN) || defined(TUKLIB_FAST_UNALIGNED_ACCESS) + uint32_t num = read32ne(buf); + return conv32be(num); +#else uint32_t num = (uint32_t)buf[0] << 24; num |= (uint32_t)buf[1] << 16; num |= (uint32_t)buf[2] << 8; num |= (uint32_t)buf[3]; return num; +#endif } static inline uint32_t -unaligned_read32le(const uint8_t *buf) +read32le(const uint8_t *buf) { +#if !defined(WORDS_BIGENDIAN) || defined(TUKLIB_FAST_UNALIGNED_ACCESS) + uint32_t num = read32ne(buf); + return conv32le(num); +#else uint32_t num = (uint32_t)buf[0]; num |= (uint32_t)buf[1] << 8; num |= (uint32_t)buf[2] << 16; num |= (uint32_t)buf[3] << 24; return num; +#endif } +// NOTE: Possible byte swapping must be done in a macro to allow the compiler +// to optimize byte swapping of constants when using glibc's or *BSD's +// byte swapping macros. 
The actual write is done in an inline function +// to make type checking of the buf pointer possible. +#if defined(WORDS_BIGENDIAN) || defined(TUKLIB_FAST_UNALIGNED_ACCESS) +# define write16be(buf, num) write16ne(buf, conv16be(num)) +# define write32be(buf, num) write32ne(buf, conv32be(num)) +#endif + +#if !defined(WORDS_BIGENDIAN) || defined(TUKLIB_FAST_UNALIGNED_ACCESS) +# define write16le(buf, num) write16ne(buf, conv16le(num)) +# define write32le(buf, num) write32ne(buf, conv32le(num)) +#endif + + +#ifndef write16be static inline void -unaligned_write16be(uint8_t *buf, uint16_t num) +write16be(uint8_t *buf, uint16_t num) { buf[0] = (uint8_t)(num >> 8); buf[1] = (uint8_t)num; return; } +#endif +#ifndef write16le static inline void -unaligned_write16le(uint8_t *buf, uint16_t num) +write16le(uint8_t *buf, uint16_t num) { buf[0] = (uint8_t)num; buf[1] = (uint8_t)(num >> 8); return; } +#endif +#ifndef write32be static inline void -unaligned_write32be(uint8_t *buf, uint32_t num) +write32be(uint8_t *buf, uint32_t num) { buf[0] = (uint8_t)(num >> 24); buf[1] = (uint8_t)(num >> 16); buf[2] = (uint8_t)(num >> 8); buf[3] = (uint8_t)num; return; } +#endif +#ifndef write32le static inline void -unaligned_write32le(uint8_t *buf, uint32_t num) +write32le(uint8_t *buf, uint32_t num) { buf[0] = (uint8_t)num; buf[1] = (uint8_t)(num >> 8); buf[2] = (uint8_t)(num >> 16); buf[3] = (uint8_t)(num >> 24); return; } +#endif + +////////////////////////////// +// Aligned reads and writes // +////////////////////////////// + +// Separate functions for aligned reads and writes are provided since on +// strict-align archs aligned access is much faster than unaligned access. +// +// Just like in the unaligned case, memcpy() is needed to avoid +// strict aliasing violations. However, on archs that don't support +// unaligned access the compiler cannot know that the pointers given +// to memcpy() are aligned which results in slow code. 
As of C11 there is +// no standard way to tell the compiler that we know that the address is +// aligned but some compilers have language extensions to do that. With +// such language extensions the memcpy() method gives excellent results. +// +// What to do on a strict-align system when no known language extensions +// are available? Falling back to byte-by-byte access would be safe but ruin +// optimizations that have been made specifically with aligned access in mind. +// As a compromise, aligned reads will fall back to non-compliant type punning +// but aligned writes will be byte-by-byte, that is, fast reads are preferred +// over fast writes. This obviously isn't great but hopefully it's a working +// compromise for now. +// +// __builtin_assume_aligned is supported by GCC >= 4.7 and clang >= 3.6. +#ifdef HAVE___BUILTIN_ASSUME_ALIGNED +# define tuklib_memcpy_aligned(dest, src, size) \ + memcpy(dest, __builtin_assume_aligned(src, size), size) +#else +# define tuklib_memcpy_aligned(dest, src, size) \ + memcpy(dest, src, size) +# ifndef TUKLIB_FAST_UNALIGNED_ACCESS +# define TUKLIB_USE_UNSAFE_ALIGNED_READS 1 +# endif #endif +static inline uint16_t +aligned_read16ne(const uint8_t *buf) +{ +#if defined(TUKLIB_USE_UNSAFE_TYPE_PUNNING) \ + || defined(TUKLIB_USE_UNSAFE_ALIGNED_READS) + return *(const uint16_t *)buf; +#else + uint16_t num; + tuklib_memcpy_aligned(&num, buf, sizeof(num)); + return num; +#endif +} + + static inline uint32_t +aligned_read32ne(const uint8_t *buf) +{ +#if defined(TUKLIB_USE_UNSAFE_TYPE_PUNNING) \ + || defined(TUKLIB_USE_UNSAFE_ALIGNED_READS) + return *(const uint32_t *)buf; +#else + uint32_t num; + tuklib_memcpy_aligned(&num, buf, sizeof(num)); + return num; +#endif +} + + +static inline uint64_t +aligned_read64ne(const uint8_t *buf) +{ +#if defined(TUKLIB_USE_UNSAFE_TYPE_PUNNING) \ + || defined(TUKLIB_USE_UNSAFE_ALIGNED_READS) + return *(const uint64_t *)buf; +#else + uint64_t num; + tuklib_memcpy_aligned(&num, buf, sizeof(num)); +
return num; +#endif +} + + +static inline void +aligned_write16ne(uint8_t *buf, uint16_t num) +{ +#ifdef TUKLIB_USE_UNSAFE_TYPE_PUNNING + *(uint16_t *)buf = num; +#else + tuklib_memcpy_aligned(buf, &num, sizeof(num)); +#endif + return; +} + + +static inline void +aligned_write32ne(uint8_t *buf, uint32_t num) +{ +#ifdef TUKLIB_USE_UNSAFE_TYPE_PUNNING + *(uint32_t *)buf = num; +#else + tuklib_memcpy_aligned(buf, &num, sizeof(num)); +#endif + return; +} + + +static inline void +aligned_write64ne(uint8_t *buf, uint64_t num) +{ +#ifdef TUKLIB_USE_UNSAFE_TYPE_PUNNING + *(uint64_t *)buf = num; +#else + tuklib_memcpy_aligned(buf, &num, sizeof(num)); +#endif + return; +} + + +static inline uint16_t +aligned_read16be(const uint8_t *buf) +{ + uint16_t num = aligned_read16ne(buf); + return conv16be(num); +} + + +static inline uint16_t +aligned_read16le(const uint8_t *buf) +{ + uint16_t num = aligned_read16ne(buf); + return conv16le(num); +} + + +static inline uint32_t +aligned_read32be(const uint8_t *buf) +{ + uint32_t num = aligned_read32ne(buf); + return conv32be(num); +} + + +static inline uint32_t +aligned_read32le(const uint8_t *buf) +{ + uint32_t num = aligned_read32ne(buf); + return conv32le(num); +} + + +static inline uint64_t +aligned_read64be(const uint8_t *buf) +{ + uint64_t num = aligned_read64ne(buf); + return conv64be(num); +} + + +static inline uint64_t +aligned_read64le(const uint8_t *buf) +{ + uint64_t num = aligned_read64ne(buf); + return conv64le(num); +} + + +// These need to be macros like in the unaligned case. 
+#define aligned_write16be(buf, num) aligned_write16ne((buf), conv16be(num)) +#define aligned_write16le(buf, num) aligned_write16ne((buf), conv16le(num)) +#define aligned_write32be(buf, num) aligned_write32ne((buf), conv32be(num)) +#define aligned_write32le(buf, num) aligned_write32ne((buf), conv32le(num)) +#define aligned_write64be(buf, num) aligned_write64ne((buf), conv64be(num)) +#define aligned_write64le(buf, num) aligned_write64ne((buf), conv64le(num)) + + +//////////////////// +// Bit operations // +//////////////////// + +static inline uint32_t bsr32(uint32_t n) { // Check for ICC first, since it tends to define __GNUC__ too. #if defined(__INTEL_COMPILER) return _bit_scan_reverse(n); #elif TUKLIB_GNUC_REQ(3, 4) && UINT_MAX == UINT32_MAX // GCC >= 3.4 has __builtin_clz(), which gives good results on // multiple architectures. On x86, __builtin_clz() ^ 31U becomes // either plain BSR (so the XOR gets optimized away) or LZCNT and // XOR (if -march indicates that SSE4a instructions are supported). - return __builtin_clz(n) ^ 31U; + return (uint32_t)__builtin_clz(n) ^ 31U; #elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__)) uint32_t i; __asm__("bsrl %1, %0" : "=r" (i) : "rm" (n)); return i; -#elif defined(_MSC_VER) && _MSC_VER >= 1400 - // MSVC isn't supported by tuklib, but since this code exists, - // it doesn't hurt to have it here anyway. 
- uint32_t i; - _BitScanReverse((DWORD *)&i, n); +#elif defined(_MSC_VER) + unsigned long i; + _BitScanReverse(&i, n); return i; #else uint32_t i = 31; - if ((n & UINT32_C(0xFFFF0000)) == 0) { + if ((n & 0xFFFF0000) == 0) { n <<= 16; i = 15; } - if ((n & UINT32_C(0xFF000000)) == 0) { + if ((n & 0xFF000000) == 0) { n <<= 8; i -= 8; } - if ((n & UINT32_C(0xF0000000)) == 0) { + if ((n & 0xF0000000) == 0) { n <<= 4; i -= 4; } - if ((n & UINT32_C(0xC0000000)) == 0) { + if ((n & 0xC0000000) == 0) { n <<= 2; i -= 2; } - if ((n & UINT32_C(0x80000000)) == 0) + if ((n & 0x80000000) == 0) --i; return i; #endif } static inline uint32_t clz32(uint32_t n) { #if defined(__INTEL_COMPILER) return _bit_scan_reverse(n) ^ 31U; #elif TUKLIB_GNUC_REQ(3, 4) && UINT_MAX == UINT32_MAX - return __builtin_clz(n); + return (uint32_t)__builtin_clz(n); #elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__)) uint32_t i; __asm__("bsrl %1, %0\n\t" "xorl $31, %0" : "=r" (i) : "rm" (n)); return i; -#elif defined(_MSC_VER) && _MSC_VER >= 1400 - uint32_t i; - _BitScanReverse((DWORD *)&i, n); +#elif defined(_MSC_VER) + unsigned long i; + _BitScanReverse(&i, n); return i ^ 31U; #else uint32_t i = 0; - if ((n & UINT32_C(0xFFFF0000)) == 0) { + if ((n & 0xFFFF0000) == 0) { n <<= 16; i = 16; } - if ((n & UINT32_C(0xFF000000)) == 0) { + if ((n & 0xFF000000) == 0) { n <<= 8; i += 8; } - if ((n & UINT32_C(0xF0000000)) == 0) { + if ((n & 0xF0000000) == 0) { n <<= 4; i += 4; } - if ((n & UINT32_C(0xC0000000)) == 0) { + if ((n & 0xC0000000) == 0) { n <<= 2; i += 2; } - if ((n & UINT32_C(0x80000000)) == 0) + if ((n & 0x80000000) == 0) ++i; return i; #endif } static inline uint32_t ctz32(uint32_t n) { #if defined(__INTEL_COMPILER) return _bit_scan_forward(n); #elif TUKLIB_GNUC_REQ(3, 4) && UINT_MAX >= UINT32_MAX - return __builtin_ctz(n); + return (uint32_t)__builtin_ctz(n); #elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__)) uint32_t i; __asm__("bsfl %1, %0" : "=r" (i) : "rm" (n)); 
return i; -#elif defined(_MSC_VER) && _MSC_VER >= 1400 - uint32_t i; - _BitScanForward((DWORD *)&i, n); +#elif defined(_MSC_VER) + unsigned long i; + _BitScanForward(&i, n); return i; #else uint32_t i = 0; - if ((n & UINT32_C(0x0000FFFF)) == 0) { + if ((n & 0x0000FFFF) == 0) { n >>= 16; i = 16; } - if ((n & UINT32_C(0x000000FF)) == 0) { + if ((n & 0x000000FF) == 0) { n >>= 8; i += 8; } - if ((n & UINT32_C(0x0000000F)) == 0) { + if ((n & 0x0000000F) == 0) { n >>= 4; i += 4; } - if ((n & UINT32_C(0x00000003)) == 0) { + if ((n & 0x00000003) == 0) { n >>= 2; i += 2; } - if ((n & UINT32_C(0x00000001)) == 0) + if ((n & 0x00000001) == 0) ++i; return i; #endif } #define bsf32 ctz32 #endif Index: head/contrib/xz/src/common/tuklib_mbstr.h =================================================================== --- head/contrib/xz/src/common/tuklib_mbstr.h (revision 359200) +++ head/contrib/xz/src/common/tuklib_mbstr.h (revision 359201) @@ -1,66 +1,66 @@ /////////////////////////////////////////////////////////////////////////////// // -/// \file tuklib_mstr.h +/// \file tuklib_mbstr.h /// \brief Utility functions for handling multibyte strings /// /// If not enough multibyte string support is available in the C library, /// these functions keep working with the assumption that all strings /// are in a single-byte character set without combining characters, e.g. /// US-ASCII or ISO-8859-*. // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #ifndef TUKLIB_MBSTR_H #define TUKLIB_MBSTR_H #include "tuklib_common.h" TUKLIB_DECLS_BEGIN #define tuklib_mbstr_width TUKLIB_SYMBOL(tuklib_mbstr_width) extern size_t tuklib_mbstr_width(const char *str, size_t *bytes); ///< /// \brief Get the number of columns needed for the multibyte string /// /// This is somewhat similar to wcswidth() but works on multibyte strings. 
/// /// \param str String whose width is to be calculated. If the /// current locale uses a multibyte character set /// that has shift states, the string must begin /// and end in the initial shift state. /// \param bytes If this is not NULL, *bytes is set to the /// value returned by strlen(str) (even if an /// error occurs when calculating the width). /// /// \return On success, the number of columns needed to display the /// string e.g. in a terminal emulator is returned. On error, /// (size_t)-1 is returned. Possible errors include invalid, /// partial, or non-printable multibyte character in str, or /// that str doesn't end in the initial shift state. #define tuklib_mbstr_fw TUKLIB_SYMBOL(tuklib_mbstr_fw) extern int tuklib_mbstr_fw(const char *str, int columns_min); ///< /// \brief Get the field width for printf() e.g. to align table columns /// /// Printing simple tables to a terminal can be done using the field width /// feature in the printf() format string, but it works only with single-byte /// character sets. To do the same with multibyte strings, tuklib_mbstr_fw() /// can be used to calculate appropriate field width. /// /// The behavior of this function is undefined, if /// - str is NULL or not terminated with '\0'; /// - columns_min <= 0; or /// - the calculated field width exceeds INT_MAX. /// /// \return If tuklib_mbstr_width(str, NULL) fails, -1 is returned. /// If str needs more columns than columns_min, zero is returned. /// Otherwise a positive integer is returned, which can be /// used as the field width, e.g. printf("%*s", fw, str).
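The field width arithmetic described above can be illustrated without any locale support. The sketch below is an assumption-laden illustration rather than the tuklib implementation: demo_field_width() is a hypothetical name, and it takes the column width and the byte length as arguments instead of computing them from the string. It relies on the fact that printf() counts bytes, so the field width is the byte length plus the number of padding columns.

```c
#include <stddef.h>

// Hypothetical helper mirroring the arithmetic of tuklib_mbstr_fw():
//   width        number of terminal columns the string occupies,
//                or (size_t)-1 if the width could not be calculated
//   len          length of the string in bytes
//   columns_min  wanted minimum number of columns (must be > 0)
// Returns -1 on error, 0 if the string is wider than columns_min, and
// otherwise the field width to pass to printf("%*s", fw, str).
static int
demo_field_width(size_t width, size_t len, int columns_min)
{
	if (width == (size_t)-1)
		return -1;

	if (width > (size_t)columns_min)
		return 0;

	// Pad by the number of missing columns, counted on top of
	// the byte length because printf() pads in units of bytes.
	return (int)(len + ((size_t)columns_min - width));
}
```

As a worked case, a UTF-8 string that takes 5 columns but 6 bytes (one two-byte character) and is aligned to 8 columns gets the field width 6 + (8 - 5) = 9, so printf() emits three padding spaces and the cell ends up 8 columns wide.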
TUKLIB_DECLS_END #endif Index: head/contrib/xz/src/common/tuklib_mbstr_fw.c =================================================================== --- head/contrib/xz/src/common/tuklib_mbstr_fw.c (revision 359200) +++ head/contrib/xz/src/common/tuklib_mbstr_fw.c (revision 359201) @@ -1,31 +1,31 @@ /////////////////////////////////////////////////////////////////////////////// // -/// \file tuklib_mstr_fw.c +/// \file tuklib_mbstr_fw.c /// \brief Get the field width for printf() e.g. to align table columns // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "tuklib_mbstr.h" extern int tuklib_mbstr_fw(const char *str, int columns_min) { size_t len; const size_t width = tuklib_mbstr_width(str, &len); if (width == (size_t)-1) return -1; if (width > (size_t)columns_min) return 0; if (width < (size_t)columns_min) len += (size_t)columns_min - width; return len; } Index: head/contrib/xz/src/common/tuklib_mbstr_width.c =================================================================== --- head/contrib/xz/src/common/tuklib_mbstr_width.c (revision 359200) +++ head/contrib/xz/src/common/tuklib_mbstr_width.c (revision 359201) @@ -1,64 +1,65 @@ /////////////////////////////////////////////////////////////////////////////// // -/// \file tuklib_mstr_width.c +/// \file tuklib_mbstr_width.c /// \brief Calculate width of a multibyte string // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. 
// /////////////////////////////////////////////////////////////////////////////// #include "tuklib_mbstr.h" +#include <string.h> #if defined(HAVE_MBRTOWC) && defined(HAVE_WCWIDTH) # include <wchar.h> #endif extern size_t tuklib_mbstr_width(const char *str, size_t *bytes) { const size_t len = strlen(str); if (bytes != NULL) *bytes = len; #if !(defined(HAVE_MBRTOWC) && defined(HAVE_WCWIDTH)) // In single-byte mode, the width of the string is the same // as its length. return len; #else mbstate_t state; memset(&state, 0, sizeof(state)); size_t width = 0; size_t i = 0; // Convert one multibyte character at a time to wchar_t // and get its width using wcwidth(). while (i < len) { wchar_t wc; const size_t ret = mbrtowc(&wc, str + i, len - i, &state); if (ret < 1 || ret > len) return (size_t)-1; i += ret; const int wc_width = wcwidth(wc); if (wc_width < 0) return (size_t)-1; - width += wc_width; + width += (size_t)wc_width; } // Require that the string ends in the initial shift state. // This way the caller can combine the string with other // strings without needing to worry about the shift states. if (!mbsinit(&state)) return (size_t)-1; return width; #endif } Index: head/contrib/xz/src/liblzma/api/lzma/block.h =================================================================== --- head/contrib/xz/src/liblzma/api/lzma/block.h (revision 359200) +++ head/contrib/xz/src/liblzma/api/lzma/block.h (revision 359201) @@ -1,581 +1,581 @@ /** * \file lzma/block.h * \brief .xz Block handling */ /* * Author: Lasse Collin * * This file has been put into the public domain. * You can do whatever you want with this file. * * See ../lzma.h for information about liblzma as a whole. */ #ifndef LZMA_H_INTERNAL # error Never include this file directly. Use <lzma.h> instead. #endif /** * \brief Options for the Block and Block Header encoders and decoders * * Different Block handling functions use different parts of this structure. * Some read some members, other functions write, and some do both.
Only the * members listed for reading need to be initialized when the specified * functions are called. The members marked for writing will be assigned * new values at some point either by calling the given function or by * later calls to lzma_code(). */ typedef struct { /** * \brief Block format version * * To prevent API and ABI breakages when new features are needed, * a version number is used to indicate which fields in this * structure are in use: * - liblzma >= 5.0.0: version = 0 is supported. * - liblzma >= 5.1.4beta: Support for version = 1 was added, * which adds the ignore_check field. * * If version is greater than one, most Block related functions * will return LZMA_OPTIONS_ERROR (lzma_block_header_decode() works * with any version value). * * Read by: * - All functions that take pointer to lzma_block as argument, * including lzma_block_header_decode(). * * Written by: * - lzma_block_header_decode() */ uint32_t version; /** * \brief Size of the Block Header field * * This is always a multiple of four. * * Read by: * - lzma_block_header_encode() * - lzma_block_header_decode() * - lzma_block_compressed_size() * - lzma_block_unpadded_size() * - lzma_block_total_size() * - lzma_block_decoder() * - lzma_block_buffer_decode() * * Written by: * - lzma_block_header_size() * - lzma_block_buffer_encode() */ uint32_t header_size; # define LZMA_BLOCK_HEADER_SIZE_MIN 8 # define LZMA_BLOCK_HEADER_SIZE_MAX 1024 /** * \brief Type of integrity Check * * The Check ID is not stored into the Block Header, thus its value * must be provided also when decoding. 
* * Read by: * - lzma_block_header_encode() * - lzma_block_header_decode() * - lzma_block_compressed_size() * - lzma_block_unpadded_size() * - lzma_block_total_size() * - lzma_block_encoder() * - lzma_block_decoder() * - lzma_block_buffer_encode() * - lzma_block_buffer_decode() */ lzma_check check; /** * \brief Size of the Compressed Data in bytes * * Encoding: If this is not LZMA_VLI_UNKNOWN, Block Header encoder * will store this value to the Block Header. Block encoder doesn't * care about this value, but will set it once the encoding has been * finished. * * Decoding: If this is not LZMA_VLI_UNKNOWN, Block decoder will * verify that the size of the Compressed Data field matches * compressed_size. * * Usually you don't know this value when encoding in streamed mode, * and thus cannot write this field into the Block Header. * * In non-streamed mode you can reserve space for this field before * encoding the actual Block. After encoding the data, finish the * Block by encoding the Block Header. Steps in detail: * * - Set compressed_size to some big enough value. If you don't know * better, use LZMA_VLI_MAX, but remember that bigger values take * more space in Block Header. * * - Call lzma_block_header_size() to see how much space you need to * reserve for the Block Header. * * - Encode the Block using lzma_block_encoder() and lzma_code(). * It sets compressed_size to the correct value. * * - Use lzma_block_header_encode() to encode the Block Header. * Because space was reserved in the first step, you don't need * to call lzma_block_header_size() anymore, because due to * reserving, header_size has to be big enough. If it is "too big", * lzma_block_header_encode() will add enough Header Padding to * make Block Header to match the size specified by header_size. 
* * Read by: * - lzma_block_header_size() * - lzma_block_header_encode() * - lzma_block_compressed_size() * - lzma_block_unpadded_size() * - lzma_block_total_size() * - lzma_block_decoder() * - lzma_block_buffer_decode() * * Written by: * - lzma_block_header_decode() * - lzma_block_compressed_size() * - lzma_block_encoder() * - lzma_block_decoder() * - lzma_block_buffer_encode() * - lzma_block_buffer_decode() */ lzma_vli compressed_size; /** * \brief Uncompressed Size in bytes * * This is handled very similarly to compressed_size above. * * uncompressed_size is needed by fewer functions than * compressed_size. This is because uncompressed_size isn't * needed to validate that Block stays within proper limits. * * Read by: * - lzma_block_header_size() * - lzma_block_header_encode() * - lzma_block_decoder() * - lzma_block_buffer_decode() * * Written by: * - lzma_block_header_decode() * - lzma_block_encoder() * - lzma_block_decoder() * - lzma_block_buffer_encode() * - lzma_block_buffer_decode() */ lzma_vli uncompressed_size; /** * \brief Array of filters * * There can be 1-4 filters. The end of the array is marked with * .id = LZMA_VLI_UNKNOWN. * * Read by: * - lzma_block_header_size() * - lzma_block_header_encode() * - lzma_block_encoder() * - lzma_block_decoder() * - lzma_block_buffer_encode() * - lzma_block_buffer_decode() * * Written by: * - lzma_block_header_decode(): Note that this does NOT free() * the old filter options structures. All unused filters[] will * have .id == LZMA_VLI_UNKNOWN and .options == NULL. If * decoding fails, all filters[] are guaranteed to be * LZMA_VLI_UNKNOWN and NULL. * * \note Because the array is terminated with * .id = LZMA_VLI_UNKNOWN, the actual array must * have LZMA_FILTERS_MAX + 1 members or the Block * Header decoder will overflow the buffer.
*/ lzma_filter *filters; /** * \brief Raw value stored in the Check field * * After successful coding, the first lzma_check_size(check) bytes * of this array contain the raw value stored in the Check field. * * Note that CRC32 and CRC64 are stored in little endian byte order. * Take it into account if you display the Check values to the user. * * Written by: * - lzma_block_encoder() * - lzma_block_decoder() * - lzma_block_buffer_encode() * - lzma_block_buffer_decode() */ uint8_t raw_check[LZMA_CHECK_SIZE_MAX]; /* * Reserved space to allow possible future extensions without * breaking the ABI. You should not touch these, because the names * of these variables may change. These are not, and never will be, used * with the currently supported options, so it is safe to leave these * uninitialized. */ void *reserved_ptr1; void *reserved_ptr2; void *reserved_ptr3; uint32_t reserved_int1; uint32_t reserved_int2; lzma_vli reserved_int3; lzma_vli reserved_int4; lzma_vli reserved_int5; lzma_vli reserved_int6; lzma_vli reserved_int7; lzma_vli reserved_int8; lzma_reserved_enum reserved_enum1; lzma_reserved_enum reserved_enum2; lzma_reserved_enum reserved_enum3; lzma_reserved_enum reserved_enum4; /** * \brief A flag to Block decoder to not verify the Check field * * This field is supported by liblzma >= 5.1.4beta if .version >= 1. * * If this is set to true, the integrity check won't be calculated * and verified. Unless you know what you are doing, you should * leave this to false. (A reason to set this to true is when the * file integrity is verified externally anyway and you want to * speed up the decompression, which matters mostly when using * SHA-256 as the integrity check.)
* * If .version >= 1, read by: * - lzma_block_decoder() * - lzma_block_buffer_decode() * * Written by (.version is ignored): * - lzma_block_header_decode() always sets this to false */ lzma_bool ignore_check; lzma_bool reserved_bool2; lzma_bool reserved_bool3; lzma_bool reserved_bool4; lzma_bool reserved_bool5; lzma_bool reserved_bool6; lzma_bool reserved_bool7; lzma_bool reserved_bool8; } lzma_block; /** * \brief Decode the Block Header Size field * * To decode Block Header using lzma_block_header_decode(), the size of the * Block Header has to be known and stored into lzma_block.header_size. * The size can be calculated from the first byte of a Block using this macro. * Note that if the first byte is 0x00, it indicates beginning of Index; use * this macro only when the byte is not 0x00. * * There is no encoding macro, because Block Header encoder is enough for that. */ #define lzma_block_header_size_decode(b) (((uint32_t)(b) + 1) * 4) /** * \brief Calculate Block Header Size * * Calculate the minimum size needed for the Block Header field using the * settings specified in the lzma_block structure. Note that it is OK to * increase the calculated header_size value as long as it is a multiple of * four and doesn't exceed LZMA_BLOCK_HEADER_SIZE_MAX. Increasing header_size * just means that lzma_block_header_encode() will add Header Padding. * * \return - LZMA_OK: Size calculated successfully and stored to * block->header_size. * - LZMA_OPTIONS_ERROR: Unsupported version, filters or * filter options. * - LZMA_PROG_ERROR: Invalid values like compressed_size == 0. * * \note This doesn't check that all the options are valid i.e. this * may return LZMA_OK even if lzma_block_header_encode() or * lzma_block_encoder() would fail. If you want to validate the * filter chain, consider using lzma_memlimit_encoder() which as * a side-effect validates the filter chain. 
*/ extern LZMA_API(lzma_ret) lzma_block_header_size(lzma_block *block) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Encode Block Header * * The caller must have calculated the size of the Block Header already with * lzma_block_header_size(). If a value larger than the one calculated by * lzma_block_header_size() is used, the Block Header will be padded to the * specified size. * * \param out Beginning of the output buffer. This must be * at least block->header_size bytes. * \param block Block options to be encoded. * * \return - LZMA_OK: Encoding was successful. block->header_size * bytes were written to output buffer. * - LZMA_OPTIONS_ERROR: Invalid or unsupported options. * - LZMA_PROG_ERROR: Invalid arguments, for example * block->header_size is invalid or block->filters is NULL. */ extern LZMA_API(lzma_ret) lzma_block_header_encode( const lzma_block *block, uint8_t *out) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Decode Block Header * * block->version should (usually) be set to the highest value supported * by the application. If the application sets block->version to a value * higher than supported by the current liblzma version, this function will * downgrade block->version to the highest value supported by it. Thus one * should check the value of block->version after calling this function if * block->version was set to a non-zero value and the application doesn't * otherwise know that the liblzma version being used is new enough to * support the specified block->version. * * The size of the Block Header must have already been decoded with * lzma_block_header_size_decode() macro and stored to block->header_size. * * The integrity check type from Stream Header must have been stored * to block->check. * * block->filters must have been allocated, but they don't need to be * initialized (possible existing filter options are not freed). * * \param block Destination for Block options. 
* \param allocator lzma_allocator for custom allocator functions. * Set to NULL to use malloc() (and also free() * if an error occurs). * \param in Beginning of the input buffer. This must be * at least block->header_size bytes. * * \return - LZMA_OK: Decoding was successful. block->header_size * bytes were read from the input buffer. * - LZMA_OPTIONS_ERROR: The Block Header specifies some * unsupported options such as unsupported filters. This can * happen also if block->version was set to a too low value * compared to what would be required to properly represent * the information stored in the Block Header. * - LZMA_DATA_ERROR: Block Header is corrupt, for example, * the CRC32 doesn't match. * - LZMA_PROG_ERROR: Invalid arguments, for example * block->header_size is invalid or block->filters is NULL. */ extern LZMA_API(lzma_ret) lzma_block_header_decode(lzma_block *block, const lzma_allocator *allocator, const uint8_t *in) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Validate and set Compressed Size according to Unpadded Size * * Block Header stores Compressed Size, but Index has Unpadded Size. If the * application has already parsed the Index and is now decoding Blocks, * it can calculate Compressed Size from Unpadded Size. This function does * exactly that with error checking: * * - Compressed Size calculated from Unpadded Size must be positive integer, * that is, Unpadded Size must be big enough that after Block Header and * Check fields there's still at least one byte for Compressed Size. * * - If Compressed Size was present in Block Header, the new value * calculated from Unpadded Size is compared against the value * from Block Header. * * \note This function must be called _after_ decoding the Block Header * field so that it can properly validate Compressed Size if it * was present in Block Header. * * \return - LZMA_OK: block->compressed_size was set successfully. 
* - LZMA_DATA_ERROR: unpadded_size is too small compared to * block->header_size and lzma_check_size(block->check). * - LZMA_PROG_ERROR: Some values are invalid. For example, * block->header_size must be a multiple of four and * between 8 and 1024 inclusive. */ extern LZMA_API(lzma_ret) lzma_block_compressed_size( lzma_block *block, lzma_vli unpadded_size) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Calculate Unpadded Size * * The Index field stores Unpadded Size and Uncompressed Size. The latter * can be taken directly from the lzma_block structure after coding a Block, * but Unpadded Size needs to be calculated from Block Header Size, * Compressed Size, and size of the Check field. This is where this function * is needed. * * \return Unpadded Size on success, or zero on error. */ extern LZMA_API(lzma_vli) lzma_block_unpadded_size(const lzma_block *block) lzma_nothrow lzma_attr_pure; /** * \brief Calculate the total encoded size of a Block * * This is equivalent to lzma_block_unpadded_size() except that the returned * value includes the size of the Block Padding field. * * \return On success, total encoded size of the Block. On error, * zero is returned. */ extern LZMA_API(lzma_vli) lzma_block_total_size(const lzma_block *block) lzma_nothrow lzma_attr_pure; /** * \brief Initialize .xz Block encoder * * Valid actions for lzma_code() are LZMA_RUN, LZMA_SYNC_FLUSH (only if the * filter chain supports it), and LZMA_FINISH. * * \return - LZMA_OK: All good, continue with lzma_code(). * - LZMA_MEM_ERROR * - LZMA_OPTIONS_ERROR * - LZMA_UNSUPPORTED_CHECK: block->check specifies a Check ID - * that is not supported by this buid of liblzma. Initializing + * that is not supported by this build of liblzma. Initializing * the encoder failed. 
* - LZMA_PROG_ERROR */ extern LZMA_API(lzma_ret) lzma_block_encoder( lzma_stream *strm, lzma_block *block) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Initialize .xz Block decoder * * Valid actions for lzma_code() are LZMA_RUN and LZMA_FINISH. Using * LZMA_FINISH is not required. It is supported only for convenience. * * \return - LZMA_OK: All good, continue with lzma_code(). * - LZMA_UNSUPPORTED_CHECK: Initialization was successful, but * the given Check ID is not supported, thus Check will be * ignored. * - LZMA_PROG_ERROR * - LZMA_MEM_ERROR */ extern LZMA_API(lzma_ret) lzma_block_decoder( lzma_stream *strm, lzma_block *block) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Calculate maximum output size for single-call Block encoding * * This is equivalent to lzma_stream_buffer_bound() but for .xz Blocks. * See the documentation of lzma_stream_buffer_bound(). */ extern LZMA_API(size_t) lzma_block_buffer_bound(size_t uncompressed_size) lzma_nothrow; /** * \brief Single-call .xz Block encoder * * In contrast to the multi-call encoder initialized with * lzma_block_encoder(), this function encodes also the Block Header. This * is required to make it possible to write appropriate Block Header also * in case the data isn't compressible, and different filter chain has to be * used to encode the data in uncompressed form using uncompressed chunks * of the LZMA2 filter. * * When the data isn't compressible, header_size, compressed_size, and * uncompressed_size are set just like when the data was compressible, but * it is possible that header_size is too small to hold the filter chain * specified in block->filters, because that isn't necessarily the filter * chain that was actually used to encode the data. lzma_block_unpadded_size() * still works normally, because it doesn't read the filters array. * * \param block Block options: block->version, block->check, * and block->filters must have been initialized. 
* \param allocator lzma_allocator for custom allocator functions. * Set to NULL to use malloc() and free(). * \param in Beginning of the input buffer * \param in_size Size of the input buffer * \param out Beginning of the output buffer * \param out_pos The next byte will be written to out[*out_pos]. * *out_pos is updated only if encoding succeeds. * \param out_size Size of the out buffer; the first byte into * which no data is written to is out[out_size]. * * \return - LZMA_OK: Encoding was successful. * - LZMA_BUF_ERROR: Not enough output buffer space. * - LZMA_UNSUPPORTED_CHECK * - LZMA_OPTIONS_ERROR * - LZMA_MEM_ERROR * - LZMA_DATA_ERROR * - LZMA_PROG_ERROR */ extern LZMA_API(lzma_ret) lzma_block_buffer_encode( lzma_block *block, const lzma_allocator *allocator, const uint8_t *in, size_t in_size, uint8_t *out, size_t *out_pos, size_t out_size) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Single-call uncompressed .xz Block encoder * * This is like lzma_block_buffer_encode() except this doesn't try to * compress the data and instead encodes the data using LZMA2 uncompressed * chunks. The required output buffer size can be determined with * lzma_block_buffer_bound(). * * Since the data won't be compressed, this function ignores block->filters. * This function doesn't take lzma_allocator because this function doesn't * allocate any memory from the heap. */ extern LZMA_API(lzma_ret) lzma_block_uncomp_encode(lzma_block *block, const uint8_t *in, size_t in_size, uint8_t *out, size_t *out_pos, size_t out_size) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Single-call .xz Block decoder * * This is single-call equivalent of lzma_block_decoder(), and requires that * the caller has already decoded Block Header and checked its memory usage. * * \param block Block options just like with lzma_block_decoder(). * \param allocator lzma_allocator for custom allocator functions. * Set to NULL to use malloc() and free(). 
* \param in Beginning of the input buffer * \param in_pos The next byte will be read from in[*in_pos]. * *in_pos is updated only if decoding succeeds. * \param in_size Size of the input buffer; the first byte that * won't be read is in[in_size]. * \param out Beginning of the output buffer * \param out_pos The next byte will be written to out[*out_pos]. * *out_pos is updated only if encoding succeeds. * \param out_size Size of the out buffer; the first byte into * which no data is written to is out[out_size]. * * \return - LZMA_OK: Decoding was successful. * - LZMA_OPTIONS_ERROR * - LZMA_DATA_ERROR * - LZMA_MEM_ERROR * - LZMA_BUF_ERROR: Output buffer was too small. * - LZMA_PROG_ERROR */ extern LZMA_API(lzma_ret) lzma_block_buffer_decode( lzma_block *block, const lzma_allocator *allocator, const uint8_t *in, size_t *in_pos, size_t in_size, uint8_t *out, size_t *out_pos, size_t out_size) lzma_nothrow; Index: head/contrib/xz/src/liblzma/api/lzma/filter.h =================================================================== --- head/contrib/xz/src/liblzma/api/lzma/filter.h (revision 359200) +++ head/contrib/xz/src/liblzma/api/lzma/filter.h (revision 359201) @@ -1,425 +1,426 @@ /** * \file lzma/filter.h * \brief Common filter related types and functions */ /* * Author: Lasse Collin * * This file has been put into the public domain. * You can do whatever you want with this file. * * See ../lzma.h for information about liblzma as a whole. */ #ifndef LZMA_H_INTERNAL # error Never include this file directly. Use <lzma.h> instead. #endif /** * \brief Maximum number of filters in a chain * * A filter chain can have 1-4 filters, of which three are allowed to change * the size of the data. Usually only one or two filters are needed. */ #define LZMA_FILTERS_MAX 4 /** * \brief Filter options * * This structure is used to pass Filter ID and a pointer filter's * options to liblzma. A few functions work with a single lzma_filter * structure, while most functions expect a filter chain.
* * A filter chain is indicated with an array of lzma_filter structures. * The array is terminated with .id = LZMA_VLI_UNKNOWN. Thus, the filter * array must have LZMA_FILTERS_MAX + 1 elements (that is, five) to * be able to hold any arbitrary filter chain. This is important when * using lzma_block_header_decode() from block.h, because too small * array would make liblzma write past the end of the filters array. */ typedef struct { /** * \brief Filter ID * * Use constants whose name begin with `LZMA_FILTER_' to specify * different filters. In an array of lzma_filter structures, use * LZMA_VLI_UNKNOWN to indicate end of filters. * * \note This is not an enum, because on some systems enums * cannot be 64-bit. */ lzma_vli id; /** * \brief Pointer to filter-specific options structure * * If the filter doesn't need options, set this to NULL. If id is * set to LZMA_VLI_UNKNOWN, options is ignored, and thus * doesn't need be initialized. */ void *options; } lzma_filter; /** * \brief Test if the given Filter ID is supported for encoding * * Return true if the give Filter ID is supported for encoding by this * liblzma build. Otherwise false is returned. * * There is no way to list which filters are available in this particular * liblzma version and build. It would be useless, because the application * couldn't know what kind of options the filter would need. */ extern LZMA_API(lzma_bool) lzma_filter_encoder_is_supported(lzma_vli id) lzma_nothrow lzma_attr_const; /** * \brief Test if the given Filter ID is supported for decoding * * Return true if the give Filter ID is supported for decoding by this * liblzma build. Otherwise false is returned. */ extern LZMA_API(lzma_bool) lzma_filter_decoder_is_supported(lzma_vli id) lzma_nothrow lzma_attr_const; /** * \brief Copy the filters array * * Copy the Filter IDs and filter-specific options from src to dest. * Up to LZMA_FILTERS_MAX filters are copied, plus the terminating * .id == LZMA_VLI_UNKNOWN. 
Thus, dest should have at least * LZMA_FILTERS_MAX + 1 elements space unless the caller knows that * src is smaller than that. * * Unless the filter-specific options is NULL, the Filter ID has to be * supported by liblzma, because liblzma needs to know the size of every * filter-specific options structure. The filter-specific options are not * validated. If options is NULL, any unsupported Filter IDs are copied * without returning an error. * * Old filter-specific options in dest are not freed, so dest doesn't * need to be initialized by the caller in any way. * * If an error occurs, memory possibly already allocated by this function * is always freed. * * \return - LZMA_OK * - LZMA_MEM_ERROR * - LZMA_OPTIONS_ERROR: Unsupported Filter ID and its options * is not NULL. * - LZMA_PROG_ERROR: src or dest is NULL. */ extern LZMA_API(lzma_ret) lzma_filters_copy( const lzma_filter *src, lzma_filter *dest, const lzma_allocator *allocator) lzma_nothrow; /** * \brief Calculate approximate memory requirements for raw encoder * * This function can be used to calculate the memory requirements for * Block and Stream encoders too because Block and Stream encoders don't * need significantly more memory than raw encoder. * * \param filters Array of filters terminated with * .id == LZMA_VLI_UNKNOWN. * * \return Number of bytes of memory required for the given * filter chain when encoding. If an error occurs, * for example due to unsupported filter chain, * UINT64_MAX is returned. */ extern LZMA_API(uint64_t) lzma_raw_encoder_memusage(const lzma_filter *filters) lzma_nothrow lzma_attr_pure; /** * \brief Calculate approximate memory requirements for raw decoder * * This function can be used to calculate the memory requirements for * Block and Stream decoders too because Block and Stream decoders don't * need significantly more memory than raw decoder. * * \param filters Array of filters terminated with * .id == LZMA_VLI_UNKNOWN. 
* * \return Number of bytes of memory required for the given * filter chain when decoding. If an error occurs, * for example due to unsupported filter chain, * UINT64_MAX is returned. */ extern LZMA_API(uint64_t) lzma_raw_decoder_memusage(const lzma_filter *filters) lzma_nothrow lzma_attr_pure; /** * \brief Initialize raw encoder * * This function may be useful when implementing custom file formats. * * \param strm Pointer to properly prepared lzma_stream * \param filters Array of lzma_filter structures. The end of the * array must be marked with .id = LZMA_VLI_UNKNOWN. * * The `action' with lzma_code() can be LZMA_RUN, LZMA_SYNC_FLUSH (if the * filter chain supports it), or LZMA_FINISH. * * \return - LZMA_OK * - LZMA_MEM_ERROR * - LZMA_OPTIONS_ERROR * - LZMA_PROG_ERROR */ extern LZMA_API(lzma_ret) lzma_raw_encoder( lzma_stream *strm, const lzma_filter *filters) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Initialize raw decoder * * The initialization of raw decoder goes similarly to raw encoder. * * The `action' with lzma_code() can be LZMA_RUN or LZMA_FINISH. Using * LZMA_FINISH is not required, it is supported just for convenience. * * \return - LZMA_OK * - LZMA_MEM_ERROR * - LZMA_OPTIONS_ERROR * - LZMA_PROG_ERROR */ extern LZMA_API(lzma_ret) lzma_raw_decoder( lzma_stream *strm, const lzma_filter *filters) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Update the filter chain in the encoder * * This function is for advanced users only. This function has two slightly * different purposes: * * - After LZMA_FULL_FLUSH when using Stream encoder: Set a new filter * chain, which will be used starting from the next Block. * * - After LZMA_SYNC_FLUSH using Raw, Block, or Stream encoder: Change * the filter-specific options in the middle of encoding. The actual * filters in the chain (Filter IDs) cannot be changed. In the future, * it might become possible to change the filter options without * using LZMA_SYNC_FLUSH. 
* * While rarely useful, this function may be called also when no data has * been compressed yet. In that case, this function will behave as if * LZMA_FULL_FLUSH (Stream encoder) or LZMA_SYNC_FLUSH (Raw or Block * encoder) had been used right before calling this function. * * \return - LZMA_OK * - LZMA_MEM_ERROR * - LZMA_MEMLIMIT_ERROR * - LZMA_OPTIONS_ERROR * - LZMA_PROG_ERROR */ extern LZMA_API(lzma_ret) lzma_filters_update( lzma_stream *strm, const lzma_filter *filters) lzma_nothrow; /** * \brief Single-call raw encoder * * \param filters Array of lzma_filter structures. The end of the * array must be marked with .id = LZMA_VLI_UNKNOWN. * \param allocator lzma_allocator for custom allocator functions. * Set to NULL to use malloc() and free(). * \param in Beginning of the input buffer * \param in_size Size of the input buffer * \param out Beginning of the output buffer * \param out_pos The next byte will be written to out[*out_pos]. * *out_pos is updated only if encoding succeeds. * \param out_size Size of the out buffer; the first byte into * which no data is written to is out[out_size]. * * \return - LZMA_OK: Encoding was successful. * - LZMA_BUF_ERROR: Not enough output buffer space. * - LZMA_OPTIONS_ERROR * - LZMA_MEM_ERROR * - LZMA_DATA_ERROR * - LZMA_PROG_ERROR * * \note There is no function to calculate how big output buffer * would surely be big enough. (lzma_stream_buffer_bound() * works only for lzma_stream_buffer_encode(); raw encoder * won't necessarily meet that bound.) */ extern LZMA_API(lzma_ret) lzma_raw_buffer_encode( const lzma_filter *filters, const lzma_allocator *allocator, const uint8_t *in, size_t in_size, uint8_t *out, size_t *out_pos, size_t out_size) lzma_nothrow; /** * \brief Single-call raw decoder * * \param filters Array of lzma_filter structures. The end of the * array must be marked with .id = LZMA_VLI_UNKNOWN. * \param allocator lzma_allocator for custom allocator functions. * Set to NULL to use malloc() and free(). 
* \param in Beginning of the input buffer * \param in_pos The next byte will be read from in[*in_pos]. * *in_pos is updated only if decoding succeeds. * \param in_size Size of the input buffer; the first byte that * won't be read is in[in_size]. * \param out Beginning of the output buffer * \param out_pos The next byte will be written to out[*out_pos]. * *out_pos is updated only if encoding succeeds. * \param out_size Size of the out buffer; the first byte into * which no data is written to is out[out_size]. */ extern LZMA_API(lzma_ret) lzma_raw_buffer_decode( const lzma_filter *filters, const lzma_allocator *allocator, const uint8_t *in, size_t *in_pos, size_t in_size, uint8_t *out, size_t *out_pos, size_t out_size) lzma_nothrow; /** * \brief Get the size of the Filter Properties field * * This function may be useful when implementing custom file formats * using the raw encoder and decoder. * * \param size Pointer to uint32_t to hold the size of the properties * \param filter Filter ID and options (the size of the properties may * vary depending on the options) * * \return - LZMA_OK * - LZMA_OPTIONS_ERROR * - LZMA_PROG_ERROR * * \note This function validates the Filter ID, but does not * necessarily validate the options. Thus, it is possible * that this returns LZMA_OK while the following call to * lzma_properties_encode() returns LZMA_OPTIONS_ERROR. */ extern LZMA_API(lzma_ret) lzma_properties_size( uint32_t *size, const lzma_filter *filter) lzma_nothrow; /** * \brief Encode the Filter Properties field * * \param filter Filter ID and options * \param props Buffer to hold the encoded options. The size of * buffer must have been already determined with * lzma_properties_size(). * * \return - LZMA_OK * - LZMA_OPTIONS_ERROR * - LZMA_PROG_ERROR * * \note Even this function won't validate more options than actually * necessary. Thus, it is possible that encoding the properties * succeeds but using the same options to initialize the encoder * will fail. 
* * \note If lzma_properties_size() indicated that the size * of the Filter Properties field is zero, calling * lzma_properties_encode() is not required, but it * won't do any harm either. */ extern LZMA_API(lzma_ret) lzma_properties_encode( const lzma_filter *filter, uint8_t *props) lzma_nothrow; /** * \brief Decode the Filter Properties field * * \param filter filter->id must have been set to the correct * Filter ID. filter->options doesn't need to be * initialized (it's not freed by this function). The - * decoded options will be stored to filter->options. - * filter->options is set to NULL if there are no - * properties or if an error occurs. + * decoded options will be stored in filter->options; + * it's application's responsibility to free it when + * appropriate. filter->options is set to NULL if + * there are no properties or if an error occurs. * \param allocator Custom memory allocator used to allocate the * options. Set to NULL to use the default malloc(), * and in case of an error, also free(). * \param props Input buffer containing the properties. * \param props_size Size of the properties. This must be the exact * size; giving too much or too little input will * return LZMA_OPTIONS_ERROR. * * \return - LZMA_OK * - LZMA_OPTIONS_ERROR * - LZMA_MEM_ERROR */ extern LZMA_API(lzma_ret) lzma_properties_decode( lzma_filter *filter, const lzma_allocator *allocator, const uint8_t *props, size_t props_size) lzma_nothrow; /** * \brief Calculate encoded size of a Filter Flags field * * Knowing the size of Filter Flags is useful to know when allocating * memory to hold the encoded Filter Flags. * * \param size Pointer to integer to hold the calculated size * \param filter Filter ID and associated options whose encoded * size is to be calculated * * \return - LZMA_OK: *size set successfully. Note that this doesn't * guarantee that filter->options is valid, thus * lzma_filter_flags_encode() may still fail. 
* - LZMA_OPTIONS_ERROR: Unknown Filter ID or unsupported options. * - LZMA_PROG_ERROR: Invalid options * * \note If you need to calculate size of List of Filter Flags, * you need to loop over every lzma_filter entry. */ extern LZMA_API(lzma_ret) lzma_filter_flags_size( uint32_t *size, const lzma_filter *filter) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Encode Filter Flags into given buffer * * In contrast to some functions, this doesn't allocate the needed buffer. * This is due to how this function is used internally by liblzma. * * \param filter Filter ID and options to be encoded * \param out Beginning of the output buffer * \param out_pos out[*out_pos] is the next write position. This * is updated by the encoder. * \param out_size out[out_size] is the first byte to not write. * * \return - LZMA_OK: Encoding was successful. * - LZMA_OPTIONS_ERROR: Invalid or unsupported options. * - LZMA_PROG_ERROR: Invalid options or not enough output * buffer space (you should have checked it with * lzma_filter_flags_size()). */ extern LZMA_API(lzma_ret) lzma_filter_flags_encode(const lzma_filter *filter, uint8_t *out, size_t *out_pos, size_t out_size) lzma_nothrow lzma_attr_warn_unused_result; /** * \brief Decode Filter Flags from given buffer * * The decoded result is stored into *filter. The old value of * filter->options is not free()d. 
* * \return - LZMA_OK * - LZMA_OPTIONS_ERROR * - LZMA_MEM_ERROR * - LZMA_PROG_ERROR */ extern LZMA_API(lzma_ret) lzma_filter_flags_decode( lzma_filter *filter, const lzma_allocator *allocator, const uint8_t *in, size_t *in_pos, size_t in_size) lzma_nothrow lzma_attr_warn_unused_result; Index: head/contrib/xz/src/liblzma/api/lzma/hardware.h =================================================================== --- head/contrib/xz/src/liblzma/api/lzma/hardware.h (revision 359200) +++ head/contrib/xz/src/liblzma/api/lzma/hardware.h (revision 359201) @@ -1,64 +1,64 @@ /** * \file lzma/hardware.h * \brief Hardware information * * Since liblzma can consume a lot of system resources, it also provides * ways to limit the resource usage. Applications linking against liblzma * need to do the actual decisions how much resources to let liblzma to use. * To ease making these decisions, liblzma provides functions to find out - * the relevant capabilities of the underlaying hardware. Currently there + * the relevant capabilities of the underlying hardware. Currently there * is only a function to find out the amount of RAM, but in the future there * will be also a function to detect how many concurrent threads the system * can run. * * \note On some operating systems, these function may temporarily * load a shared library or open file descriptor(s) to find out * the requested hardware information. Unless the application * assumes that specific file descriptors are not touched by * other threads, this should have no effect on thread safety. * Possible operations involving file descriptors will restart * the syscalls if they return EINTR. */ /* * Author: Lasse Collin * * This file has been put into the public domain. * You can do whatever you want with this file. * * See ../lzma.h for information about liblzma as a whole. */ #ifndef LZMA_H_INTERNAL # error Never include this file directly. Use <lzma.h> instead.
#endif /** * \brief Get the total amount of physical memory (RAM) in bytes * * This function may be useful when determining a reasonable memory * usage limit for decompressing or how much memory it is OK to use * for compressing. * * \return On success, the total amount of physical memory in bytes * is returned. If the amount of RAM cannot be determined, * zero is returned. This can happen if an error occurs * or if there is no code in liblzma to detect the amount * of RAM on the specific operating system. */ extern LZMA_API(uint64_t) lzma_physmem(void) lzma_nothrow; /** * \brief Get the number of processor cores or threads * * This function may be useful when determining how many threads to use. * If the hardware supports more than one thread per CPU core, the number * of hardware threads is returned if that information is available. * * \brief On success, the number of available CPU threads or cores is * returned. If this information isn't available or an error * occurs, zero is returned. */ extern LZMA_API(uint32_t) lzma_cputhreads(void) lzma_nothrow; Index: head/contrib/xz/src/liblzma/api/lzma/lzma12.h =================================================================== --- head/contrib/xz/src/liblzma/api/lzma/lzma12.h (revision 359200) +++ head/contrib/xz/src/liblzma/api/lzma/lzma12.h (revision 359201) @@ -1,420 +1,420 @@ /** * \file lzma/lzma12.h * \brief LZMA1 and LZMA2 filters */ /* * Author: Lasse Collin * * This file has been put into the public domain. * You can do whatever you want with this file. * * See ../lzma.h for information about liblzma as a whole. */ #ifndef LZMA_H_INTERNAL # error Never include this file directly. Use <lzma.h> instead. #endif /** * \brief LZMA1 Filter ID * * LZMA1 is the very same thing as what was called just LZMA in LZMA Utils, * 7-Zip, and LZMA SDK. It's called LZMA1 here to prevent developers from * accidentally using LZMA when they actually want LZMA2.
* * LZMA1 shouldn't be used for new applications unless you _really_ know * what you are doing. LZMA2 is almost always a better choice. */ #define LZMA_FILTER_LZMA1 LZMA_VLI_C(0x4000000000000001) /** * \brief LZMA2 Filter ID * * Usually you want this instead of LZMA1. Compared to LZMA1, LZMA2 adds * support for LZMA_SYNC_FLUSH, uncompressed chunks (smaller expansion * when trying to compress uncompressible data), possibility to change * lc/lp/pb in the middle of encoding, and some other internal improvements. */ #define LZMA_FILTER_LZMA2 LZMA_VLI_C(0x21) /** * \brief Match finders * * Match finder has major effect on both speed and compression ratio. * Usually hash chains are faster than binary trees. * * If you will use LZMA_SYNC_FLUSH often, the hash chains may be a better * choice, because binary trees get much higher compression ratio penalty * with LZMA_SYNC_FLUSH. * * The memory usage formulas are only rough estimates, which are closest to * reality when dict_size is a power of two. The formulas are more complex * in reality, and can also change a little between liblzma versions. Use * lzma_raw_encoder_memusage() to get more accurate estimate of memory usage. 
*/ typedef enum { LZMA_MF_HC3 = 0x03, /**< * \brief Hash Chain with 2- and 3-byte hashing * * Minimum nice_len: 3 * * Memory usage: * - dict_size <= 16 MiB: dict_size * 7.5 * - dict_size > 16 MiB: dict_size * 5.5 + 64 MiB */ LZMA_MF_HC4 = 0x04, /**< * \brief Hash Chain with 2-, 3-, and 4-byte hashing * * Minimum nice_len: 4 * * Memory usage: * - dict_size <= 32 MiB: dict_size * 7.5 * - dict_size > 32 MiB: dict_size * 6.5 */ LZMA_MF_BT2 = 0x12, /**< * \brief Binary Tree with 2-byte hashing * * Minimum nice_len: 2 * * Memory usage: dict_size * 9.5 */ LZMA_MF_BT3 = 0x13, /**< * \brief Binary Tree with 2- and 3-byte hashing * * Minimum nice_len: 3 * * Memory usage: * - dict_size <= 16 MiB: dict_size * 11.5 * - dict_size > 16 MiB: dict_size * 9.5 + 64 MiB */ LZMA_MF_BT4 = 0x14 /**< * \brief Binary Tree with 2-, 3-, and 4-byte hashing * * Minimum nice_len: 4 * * Memory usage: * - dict_size <= 32 MiB: dict_size * 11.5 * - dict_size > 32 MiB: dict_size * 10.5 */ } lzma_match_finder; /** * \brief Test if given match finder is supported * * Return true if the given match finder is supported by this liblzma build. * Otherwise false is returned. It is safe to call this with a value that * isn't listed in lzma_match_finder enumeration; the return value will be * false. * * There is no way to list which match finders are available in this * particular liblzma version and build. It would be useless, because * a new match finder, which the application developer wasn't aware, * could require giving additional options to the encoder that the older * match finders don't need. */ extern LZMA_API(lzma_bool) lzma_mf_is_supported(lzma_match_finder match_finder) lzma_nothrow lzma_attr_const; /** * \brief Compression modes * * This selects the function used to analyze the data produced by the match * finder. */ typedef enum { LZMA_MODE_FAST = 1, /**< * \brief Fast compression * * Fast mode is usually at its best when combined with * a hash chain match finder. 
*/ LZMA_MODE_NORMAL = 2 /**< * \brief Normal compression * * This is usually notably slower than fast mode. Use this * together with binary tree match finders to expose the * full potential of the LZMA1 or LZMA2 encoder. */ } lzma_mode; /** * \brief Test if given compression mode is supported * * Return true if the given compression mode is supported by this liblzma * build. Otherwise false is returned. It is safe to call this with a value * that isn't listed in lzma_mode enumeration; the return value will be false. * * There is no way to list which modes are available in this particular * liblzma version and build. It would be useless, because a new compression * mode, which the application developer wasn't aware, could require giving * additional options to the encoder that the older modes don't need. */ extern LZMA_API(lzma_bool) lzma_mode_is_supported(lzma_mode mode) lzma_nothrow lzma_attr_const; /** * \brief Options specific to the LZMA1 and LZMA2 filters * * Since LZMA1 and LZMA2 share most of the code, it's simplest to share * the options structure too. For encoding, all but the reserved variables * need to be initialized unless specifically mentioned otherwise. * lzma_lzma_preset() can be used to get a good starting point. * * For raw decoding, both LZMA1 and LZMA2 need dict_size, preset_dict, and * preset_dict_size (if preset_dict != NULL). LZMA1 needs also lc, lp, and pb. */ typedef struct { /** * \brief Dictionary size in bytes * * Dictionary size indicates how many bytes of the recently processed * uncompressed data is kept in memory. One method to reduce size of * the uncompressed data is to store distance-length pairs, which * indicate what data to repeat from the dictionary buffer. Thus, * the bigger the dictionary, the better the compression ratio * usually is. 
* * Maximum size of the dictionary depends on multiple things: * - Memory usage limit * - Available address space (not a problem on 64-bit systems) * - Selected match finder (encoder only) * * Currently the maximum dictionary size for encoding is 1.5 GiB * (i.e. (UINT32_C(1) << 30) + (UINT32_C(1) << 29)) even on 64-bit * systems for certain match finder implementation reasons. In the * future, there may be match finders that support bigger * dictionaries. * * Decoder already supports dictionaries up to 4 GiB - 1 B (i.e. * UINT32_MAX), so increasing the maximum dictionary size of the * encoder won't cause problems for old decoders. * * Because extremely small dictionaries sizes would have unneeded * overhead in the decoder, the minimum dictionary size is 4096 bytes. * * \note When decoding, too big dictionary does no other harm * than wasting memory. */ uint32_t dict_size; # define LZMA_DICT_SIZE_MIN UINT32_C(4096) # define LZMA_DICT_SIZE_DEFAULT (UINT32_C(1) << 23) /** * \brief Pointer to an initial dictionary * * It is possible to initialize the LZ77 history window using * a preset dictionary. It is useful when compressing many * similar, relatively small chunks of data independently from * each other. The preset dictionary should contain typical * strings that occur in the files being compressed. The most * probable strings should be near the end of the preset dictionary. * * This feature should be used only in special situations. For * now, it works correctly only with raw encoding and decoding. * Currently none of the container formats supported by * liblzma allow preset dictionary when decoding, thus if * you create a .xz or .lzma file with preset dictionary, it * cannot be decoded with the regular decoder functions. In the * future, the .xz format will likely get support for preset * dictionary though. */ const uint8_t *preset_dict; /** * \brief Size of the preset dictionary * * Specifies the size of the preset dictionary. 
If the size is * bigger than dict_size, only the last dict_size bytes are * processed. * * This variable is read only when preset_dict is not NULL. * If preset_dict is not NULL but preset_dict_size is zero, * no preset dictionary is used (identical to only setting * preset_dict to NULL). */ uint32_t preset_dict_size; /** * \brief Number of literal context bits * * How many of the highest bits of the previous uncompressed * eight-bit byte (also known as `literal') are taken into * account when predicting the bits of the next literal. * * E.g. in typical English text, an upper-case letter is * often followed by a lower-case letter, and a lower-case * letter is usually followed by another lower-case letter. * In the US-ASCII character set, the highest three bits are 010 * for upper-case letters and 011 for lower-case letters. * When lc is at least 3, the literal coding can take advantage of * this property in the uncompressed data. * * There is a limit that applies to literal context bits and literal * position bits together: lc + lp <= 4. Without this limit the * decoding could become very slow, which could have security related * results in some cases like email servers doing virus scanning. * This limit also simplifies the internal implementation in liblzma. * * There may be LZMA1 streams that have lc + lp > 4 (maximum possible * lc would be 8). It is not possible to decode such streams with * liblzma. */ uint32_t lc; # define LZMA_LCLP_MIN 0 # define LZMA_LCLP_MAX 4 # define LZMA_LC_DEFAULT 3 /** * \brief Number of literal position bits * * lp affects what kind of alignment in the uncompressed data is * assumed when encoding literals. A literal is a single 8-bit byte. * See pb below for more information about alignment. */ uint32_t lp; # define LZMA_LP_DEFAULT 0 /** * \brief Number of position bits * * pb affects what kind of alignment in the uncompressed data is * assumed in general. 
The default means four-byte alignment * (2^ pb =2^2=4), which is often a good choice when there's * no better guess. * - * When the aligment is known, setting pb accordingly may reduce + * When the alignment is known, setting pb accordingly may reduce * the file size a little. E.g. with text files having one-byte * alignment (US-ASCII, ISO-8859-*, UTF-8), setting pb=0 can * improve compression slightly. For UTF-16 text, pb=1 is a good * choice. If the alignment is an odd number like 3 bytes, pb=0 * might be the best choice. * * Even though the assumed alignment can be adjusted with pb and * lp, LZMA1 and LZMA2 still slightly favor 16-byte alignment. * It might be worth taking into account when designing file formats * that are likely to be often compressed with LZMA1 or LZMA2. */ uint32_t pb; # define LZMA_PB_MIN 0 # define LZMA_PB_MAX 4 # define LZMA_PB_DEFAULT 2 /** Compression mode */ lzma_mode mode; /** * \brief Nice length of a match * * This determines how many bytes the encoder compares from the match * candidates when looking for the best match. Once a match of at * least nice_len bytes long is found, the encoder stops looking for * better candidates and encodes the match. (Naturally, if the found * match is actually longer than nice_len, the actual length is * encoded; it's not truncated to nice_len.) * * Bigger values usually increase the compression ratio and * compression time. For most files, 32 to 128 is a good value, * which gives very good compression ratio at good speed. * * The exact minimum value depends on the match finder. The maximum * is 273, which is the maximum length of a match that LZMA1 and * LZMA2 can encode. */ uint32_t nice_len; /** Match finder ID */ lzma_match_finder mf; /** * \brief Maximum search depth in the match finder * * For every input byte, match finder searches through the hash chain * or binary tree in a loop, each iteration going one step deeper in * the chain or tree. 
The searching stops if * - a match of at least nice_len bytes long is found; * - all match candidates from the hash chain or binary tree have * been checked; or * - maximum search depth is reached. * * Maximum search depth is needed to prevent the match finder from * wasting too much time in case there are lots of short match * candidates. On the other hand, stopping the search before all * candidates have been checked can reduce compression ratio. * * Setting depth to zero tells liblzma to use an automatic default * value, that depends on the selected match finder and nice_len. * The default is in the range [4, 200] or so (it may vary between * liblzma versions). * * Using a bigger depth value than the default can increase * compression ratio in some cases. There is no strict maximum value, * but high values (thousands or millions) should be used with care: * the encoder could remain fast enough with typical input, but * malicious input could cause the match finder to slow down * dramatically, possibly creating a denial of service attack. */ uint32_t depth; /* * Reserved space to allow possible future extensions without * breaking the ABI. You should not touch these, because the names * of these variables may change. These are and will never be used * with the currently supported options, so it is safe to leave these * uninitialized. */ uint32_t reserved_int1; uint32_t reserved_int2; uint32_t reserved_int3; uint32_t reserved_int4; uint32_t reserved_int5; uint32_t reserved_int6; uint32_t reserved_int7; uint32_t reserved_int8; lzma_reserved_enum reserved_enum1; lzma_reserved_enum reserved_enum2; lzma_reserved_enum reserved_enum3; lzma_reserved_enum reserved_enum4; void *reserved_ptr1; void *reserved_ptr2; } lzma_options_lzma; /** * \brief Set a compression preset to lzma_options_lzma structure * * 0 is the fastest and 9 is the slowest. These match the switches -0 .. -9 * of the xz command line tool. In addition, it is possible to bitwise-or * flags to the preset. 
Currently only LZMA_PRESET_EXTREME is supported. * The flags are defined in container.h, because the flags are used also * with lzma_easy_encoder(). * * The preset values are subject to changes between liblzma versions. * * This function is available only if LZMA1 or LZMA2 encoder has been enabled * when building liblzma. * * \return On success, false is returned. If the preset is not * supported, true is returned. */ extern LZMA_API(lzma_bool) lzma_lzma_preset( lzma_options_lzma *options, uint32_t preset) lzma_nothrow; Index: head/contrib/xz/src/liblzma/api/lzma/version.h =================================================================== --- head/contrib/xz/src/liblzma/api/lzma/version.h (revision 359200) +++ head/contrib/xz/src/liblzma/api/lzma/version.h (revision 359201) @@ -1,121 +1,121 @@ /** * \file lzma/version.h * \brief Version number */ /* * Author: Lasse Collin * * This file has been put into the public domain. * You can do whatever you want with this file. * * See ../lzma.h for information about liblzma as a whole. */ #ifndef LZMA_H_INTERNAL # error Never include this file directly. Use <lzma.h> instead. #endif /* * Version number split into components */ #define LZMA_VERSION_MAJOR 5 #define LZMA_VERSION_MINOR 2 -#define LZMA_VERSION_PATCH 4 +#define LZMA_VERSION_PATCH 5 #define LZMA_VERSION_STABILITY LZMA_VERSION_STABILITY_STABLE #ifndef LZMA_VERSION_COMMIT # define LZMA_VERSION_COMMIT "" #endif /* * Map symbolic stability levels to integers. */ #define LZMA_VERSION_STABILITY_ALPHA 0 #define LZMA_VERSION_STABILITY_BETA 1 #define LZMA_VERSION_STABILITY_STABLE 2 /** * \brief Compile-time version number * * The version number is of format xyyyzzzs where * - x = major * - yyy = minor * - zzz = revision * - s indicates stability: 0 = alpha, 1 = beta, 2 = stable * * The same xyyyzzz triplet is never reused with different stability levels. * For example, if 5.1.0alpha has been released, there will never be 5.1.0beta * or 5.1.0 stable.
* * \note The version number of liblzma has nothing to with * the version number of Igor Pavlov's LZMA SDK. */ #define LZMA_VERSION (LZMA_VERSION_MAJOR * UINT32_C(10000000) \ + LZMA_VERSION_MINOR * UINT32_C(10000) \ + LZMA_VERSION_PATCH * UINT32_C(10) \ + LZMA_VERSION_STABILITY) /* * Macros to construct the compile-time version string */ #if LZMA_VERSION_STABILITY == LZMA_VERSION_STABILITY_ALPHA # define LZMA_VERSION_STABILITY_STRING "alpha" #elif LZMA_VERSION_STABILITY == LZMA_VERSION_STABILITY_BETA # define LZMA_VERSION_STABILITY_STRING "beta" #elif LZMA_VERSION_STABILITY == LZMA_VERSION_STABILITY_STABLE # define LZMA_VERSION_STABILITY_STRING "" #else # error Incorrect LZMA_VERSION_STABILITY #endif #define LZMA_VERSION_STRING_C_(major, minor, patch, stability, commit) \ #major "." #minor "." #patch stability commit #define LZMA_VERSION_STRING_C(major, minor, patch, stability, commit) \ LZMA_VERSION_STRING_C_(major, minor, patch, stability, commit) /** * \brief Compile-time version as a string * * This can be for example "4.999.5alpha", "4.999.8beta", or "5.0.0" (stable * versions don't have any "stable" suffix). In future, a snapshot built * from source code repository may include an additional suffix, for example * "4.999.8beta-21-g1d92". The commit ID won't be available in numeric form * in LZMA_VERSION macro. */ #define LZMA_VERSION_STRING LZMA_VERSION_STRING_C( \ LZMA_VERSION_MAJOR, LZMA_VERSION_MINOR, \ LZMA_VERSION_PATCH, LZMA_VERSION_STABILITY_STRING, \ LZMA_VERSION_COMMIT) /* #ifndef is needed for use with windres (MinGW or Cygwin). */ #ifndef LZMA_H_INTERNAL_RC /** * \brief Run-time version number as an integer * * Return the value of LZMA_VERSION macro at the compile time of liblzma. * This allows the application to compare if it was built against the same, * older, or newer version of liblzma that is currently running. 
*/ extern LZMA_API(uint32_t) lzma_version_number(void) lzma_nothrow lzma_attr_const; /** * \brief Run-time version as a string * * This function may be useful if you want to display which version of * liblzma your application is currently using. */ extern LZMA_API(const char *) lzma_version_string(void) lzma_nothrow lzma_attr_const; #endif Index: head/contrib/xz/src/liblzma/api/lzma/vli.h =================================================================== --- head/contrib/xz/src/liblzma/api/lzma/vli.h (revision 359200) +++ head/contrib/xz/src/liblzma/api/lzma/vli.h (revision 359201) @@ -1,166 +1,166 @@ /** * \file lzma/vli.h * \brief Variable-length integer handling * * In the .xz format, most integers are encoded in a variable-length * representation, which is sometimes called little endian base-128 encoding. * This saves space when smaller values are more likely than bigger values. * * The encoding scheme encodes seven bits to every byte, using minimum * number of bytes required to represent the given value. Encodings that use * non-minimum number of bytes are invalid, thus every integer has exactly * one encoded representation. The maximum number of bits in a VLI is 63, * thus the vli argument must be less than or equal to UINT64_MAX / 2. You * should use LZMA_VLI_MAX for clarity. */ /* * Author: Lasse Collin * * This file has been put into the public domain. * You can do whatever you want with this file. * * See ../lzma.h for information about liblzma as a whole. */ #ifndef LZMA_H_INTERNAL # error Never include this file directly. Use <lzma.h> instead.
#endif /** * \brief Maximum supported value of a variable-length integer */ #define LZMA_VLI_MAX (UINT64_MAX / 2) /** * \brief VLI value to denote that the value is unknown */ #define LZMA_VLI_UNKNOWN UINT64_MAX /** * \brief Maximum supported encoded length of variable length integers */ #define LZMA_VLI_BYTES_MAX 9 /** * \brief VLI constant suffix */ #define LZMA_VLI_C(n) UINT64_C(n) /** * \brief Variable-length integer type * * Valid VLI values are in the range [0, LZMA_VLI_MAX]. Unknown value is * indicated with LZMA_VLI_UNKNOWN, which is the maximum value of the - * underlaying integer type. + * underlying integer type. * * lzma_vli will be uint64_t for the foreseeable future. If a bigger size * is needed in the future, it is guaranteed that 2 * LZMA_VLI_MAX will * not overflow lzma_vli. This simplifies integer overflow detection. */ typedef uint64_t lzma_vli; /** * \brief Validate a variable-length integer * * This is useful to test that application has given acceptable values * for example in the uncompressed_size and compressed_size variables. * * \return True if the integer is representable as VLI or if it * indicates unknown value. */ #define lzma_vli_is_valid(vli) \ ((vli) <= LZMA_VLI_MAX || (vli) == LZMA_VLI_UNKNOWN) /** * \brief Encode a variable-length integer * * This function has two modes: single-call and multi-call. Single-call mode * encodes the whole integer at once; it is an error if the output buffer is * too small. Multi-call mode saves the position in *vli_pos, and thus it is * possible to continue encoding if the buffer becomes full before the whole * integer has been encoded. * * \param vli Integer to be encoded * \param vli_pos How many VLI-encoded bytes have already been written * out. When starting to encode a new integer in * multi-call mode, *vli_pos must be set to zero. * To use single-call encoding, set vli_pos to NULL. * \param out Beginning of the output buffer * \param out_pos The next byte will be written to out[*out_pos]. 
* \param out_size Size of the out buffer; the first byte into * which no data is written to is out[out_size]. * * \return Slightly different return values are used in multi-call and * single-call modes. * * Single-call (vli_pos == NULL): * - LZMA_OK: Integer successfully encoded. * - LZMA_PROG_ERROR: Arguments are not sane. This can be due * to too little output space; single-call mode doesn't use * LZMA_BUF_ERROR, since the application should have checked * the encoded size with lzma_vli_size(). * * Multi-call (vli_pos != NULL): * - LZMA_OK: So far all OK, but the integer is not * completely written out yet. * - LZMA_STREAM_END: Integer successfully encoded. * - LZMA_BUF_ERROR: No output space was provided. * - LZMA_PROG_ERROR: Arguments are not sane. */ extern LZMA_API(lzma_ret) lzma_vli_encode(lzma_vli vli, size_t *vli_pos, uint8_t *out, size_t *out_pos, size_t out_size) lzma_nothrow; /** * \brief Decode a variable-length integer * * Like lzma_vli_encode(), this function has single-call and multi-call modes. * * \param vli Pointer to decoded integer. The decoder will * initialize it to zero when *vli_pos == 0, so * application isn't required to initialize *vli. * \param vli_pos How many bytes have already been decoded. When * starting to decode a new integer in multi-call * mode, *vli_pos must be initialized to zero. To * use single-call decoding, set vli_pos to NULL. * \param in Beginning of the input buffer * \param in_pos The next byte will be read from in[*in_pos]. * \param in_size Size of the input buffer; the first byte that * won't be read is in[in_size]. * * \return Slightly different return values are used in multi-call and * single-call modes. * * Single-call (vli_pos == NULL): * - LZMA_OK: Integer successfully decoded. * - LZMA_DATA_ERROR: Integer is corrupt. This includes hitting * the end of the input buffer before the whole integer was * decoded; providing no input at all will use LZMA_DATA_ERROR. * - LZMA_PROG_ERROR: Arguments are not sane. 
* * Multi-call (vli_pos != NULL): * - LZMA_OK: So far all OK, but the integer is not * completely decoded yet. * - LZMA_STREAM_END: Integer successfully decoded. * - LZMA_DATA_ERROR: Integer is corrupt. * - LZMA_BUF_ERROR: No input was provided. * - LZMA_PROG_ERROR: Arguments are not sane. */ extern LZMA_API(lzma_ret) lzma_vli_decode(lzma_vli *vli, size_t *vli_pos, const uint8_t *in, size_t *in_pos, size_t in_size) lzma_nothrow; /** * \brief Get the number of bytes required to encode a VLI * * \return Number of bytes on success (1-9). If vli isn't valid, * zero is returned. */ extern LZMA_API(uint32_t) lzma_vli_size(lzma_vli vli) lzma_nothrow lzma_attr_pure; Index: head/contrib/xz/src/liblzma/api/lzma.h =================================================================== --- head/contrib/xz/src/liblzma/api/lzma.h (revision 359200) +++ head/contrib/xz/src/liblzma/api/lzma.h (revision 359201) @@ -1,325 +1,326 @@ /** * \file api/lzma.h * \brief The public API of liblzma data compression library * * liblzma is a public domain general-purpose data compression library with * a zlib-like API. The native file format is .xz, but also the old .lzma * format and raw (no headers) streams are supported. Multiple compression * algorithms (filters) are supported. Currently LZMA2 is the primary filter. * * liblzma is part of XZ Utils . XZ Utils includes * a gzip-like command line tool named xz and some other tools. XZ Utils * is developed and maintained by Lasse Collin. * * Major parts of liblzma are based on Igor Pavlov's public domain LZMA SDK * . * * The SHA-256 implementation is based on the public domain code found from * 7-Zip , which has a modified version of the public * domain SHA-256 code found from Crypto++ . * The SHA-256 code in Crypto++ was written by Kevin Springle and Wei Dai. */ /* * Author: Lasse Collin * * This file has been put into the public domain. * You can do whatever you want with this file. 
*/ #ifndef LZMA_H #define LZMA_H /***************************** * Required standard headers * *****************************/ /* * liblzma API headers need some standard types and macros. To allow * including lzma.h without requiring the application to include other * headers first, lzma.h includes the required standard headers unless * they already seem to be included already or if LZMA_MANUAL_HEADERS * has been defined. * * Here's what types and macros are needed and from which headers: * - stddef.h: size_t, NULL * - stdint.h: uint8_t, uint32_t, uint64_t, UINT32_C(n), uint64_C(n), * UINT32_MAX, UINT64_MAX * * However, inttypes.h is a little more portable than stdint.h, although * inttypes.h declares some unneeded things compared to plain stdint.h. * * The hacks below aren't perfect, specifically they assume that inttypes.h * exists and that it typedefs at least uint8_t, uint32_t, and uint64_t, * and that, in case of incomplete inttypes.h, unsigned int is 32-bit. * If the application already takes care of setting up all the types and * macros properly (for example by using gnulib's stdint.h or inttypes.h), * we try to detect that the macros are already defined and don't include * inttypes.h here again. However, you may define LZMA_MANUAL_HEADERS to * force this file to never include any system headers. * * Some could argue that liblzma API should provide all the required types, * for example lzma_uint64, LZMA_UINT64_C(n), and LZMA_UINT64_MAX. This was * seen as an unnecessary mess, since most systems already provide all the * necessary types and macros in the standard headers. * * Note that liblzma API still has lzma_bool, because using stdbool.h would * break C89 and C++ programs on many systems. sizeof(bool) in C99 isn't * necessarily the same as sizeof(bool) in C++. */ #ifndef LZMA_MANUAL_HEADERS /* * I suppose this works portably also in C++. Note that in C++, * we need to get size_t into the global namespace. 
*/ # include <stddef.h> /* * Skip inttypes.h if we already have all the required macros. If we * have the macros, we assume that we have the matching typedefs too. */ # if !defined(UINT32_C) || !defined(UINT64_C) \ || !defined(UINT32_MAX) || !defined(UINT64_MAX) /* * MSVC versions older than 2013 have no C99 support, and * thus they cannot be used to compile liblzma. Using an * existing liblzma.dll with old MSVC can work though(*), * but we need to define the required standard integer * types here in a MSVC-specific way. * * (*) If you do this, the existing liblzma.dll probably uses * a different runtime library than your MSVC-built * application. Mixing runtimes is generally bad, but * in this case it should work as long as you avoid * the few rarely-needed liblzma functions that allocate * memory and expect the caller to free it using free(). */ # if defined(_WIN32) && defined(_MSC_VER) && _MSC_VER < 1800 typedef unsigned __int8 uint8_t; typedef unsigned __int32 uint32_t; typedef unsigned __int64 uint64_t; # else /* Use the standard inttypes.h. */ # ifdef __cplusplus /* * C99 sections 7.18.2 and 7.18.4 specify * that C++ implementations define the limit * and constant macros only if specifically * requested. Note that if you want the * format macros (PRIu64 etc.) too, you need * to define __STDC_FORMAT_MACROS before * including lzma.h, since re-including * inttypes.h with __STDC_FORMAT_MACROS * defined doesn't necessarily work. */ # ifndef __STDC_LIMIT_MACROS # define __STDC_LIMIT_MACROS 1 # endif # ifndef __STDC_CONSTANT_MACROS # define __STDC_CONSTANT_MACROS 1 # endif # endif # include <inttypes.h> # endif /* * Some old systems have only the typedefs in inttypes.h, and * lack all the macros. For those systems, we need a few more * hacks. We assume that unsigned int is 32-bit and unsigned * long is either 32-bit or 64-bit. If these hacks aren't * enough, the application has to setup the types manually * before including lzma.h.
*/ # ifndef UINT32_C # if defined(_WIN32) && defined(_MSC_VER) # define UINT32_C(n) n ## UI32 # else # define UINT32_C(n) n ## U # endif # endif # ifndef UINT64_C # if defined(_WIN32) && defined(_MSC_VER) # define UINT64_C(n) n ## UI64 # else /* Get ULONG_MAX. */ # include <limits.h> # if ULONG_MAX == 4294967295UL # define UINT64_C(n) n ## ULL # else # define UINT64_C(n) n ## UL # endif # endif # endif # ifndef UINT32_MAX # define UINT32_MAX (UINT32_C(4294967295)) # endif # ifndef UINT64_MAX # define UINT64_MAX (UINT64_C(18446744073709551615)) # endif # endif #endif /* ifdef LZMA_MANUAL_HEADERS */ /****************** * LZMA_API macro * ******************/ /* * Some systems require that the functions and function pointers are * declared specially in the headers. LZMA_API_IMPORT is for importing * symbols and LZMA_API_CALL is to specify the calling convention. * * By default it is assumed that the application will link dynamically * against liblzma. #define LZMA_API_STATIC in your application if you * want to link against static liblzma. If you don't care about portability * to operating systems like Windows, or at least don't care about linking * against static liblzma on them, don't worry about LZMA_API_STATIC. That * is, most developers will never need to use LZMA_API_STATIC. * * The GCC variants are a special case on Windows (Cygwin and MinGW). * We rely on GCC doing the right thing with its auto-import feature, * and thus don't use __declspec(dllimport). This way developers don't * need to worry about LZMA_API_STATIC. Also the calling convention is * omitted on Cygwin but not on MinGW.
*/ #ifndef LZMA_API_IMPORT # if !defined(LZMA_API_STATIC) && defined(_WIN32) && !defined(__GNUC__) # define LZMA_API_IMPORT __declspec(dllimport) # else # define LZMA_API_IMPORT # endif #endif #ifndef LZMA_API_CALL # if defined(_WIN32) && !defined(__CYGWIN__) # define LZMA_API_CALL __cdecl # else # define LZMA_API_CALL # endif #endif #ifndef LZMA_API # define LZMA_API(type) LZMA_API_IMPORT type LZMA_API_CALL #endif /*********** * nothrow * ***********/ /* * None of the functions in liblzma may throw an exception. Even * the functions that use callback functions won't throw exceptions, * because liblzma would break if a callback function threw an exception. */ #ifndef lzma_nothrow # if defined(__cplusplus) # if __cplusplus >= 201103L # define lzma_nothrow noexcept # else # define lzma_nothrow throw() # endif -# elif __GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 3) +# elif defined(__GNUC__) && (__GNUC__ > 3 \ + || (__GNUC__ == 3 && __GNUC_MINOR__ >= 3)) # define lzma_nothrow __attribute__((__nothrow__)) # else # define lzma_nothrow # endif #endif /******************** * GNU C extensions * ********************/ /* * GNU C extensions are used conditionally in the public API. It doesn't * break anything if these are sometimes enabled and sometimes not, only * affects warnings and optimizations. */ -#if __GNUC__ >= 3 +#if defined(__GNUC__) && __GNUC__ >= 3 # ifndef lzma_attribute # define lzma_attribute(attr) __attribute__(attr) # endif /* warn_unused_result was added in GCC 3.4. 
*/ # ifndef lzma_attr_warn_unused_result # if __GNUC__ == 3 && __GNUC_MINOR__ < 4 # define lzma_attr_warn_unused_result # endif # endif #else # ifndef lzma_attribute # define lzma_attribute(attr) # endif #endif #ifndef lzma_attr_pure # define lzma_attr_pure lzma_attribute((__pure__)) #endif #ifndef lzma_attr_const # define lzma_attr_const lzma_attribute((__const__)) #endif #ifndef lzma_attr_warn_unused_result # define lzma_attr_warn_unused_result \ lzma_attribute((__warn_unused_result__)) #endif /************** * Subheaders * **************/ #ifdef __cplusplus extern "C" { #endif /* * Subheaders check that this is defined. It is to prevent including * them directly from applications. */ #define LZMA_H_INTERNAL 1 /* Basic features */ #include "lzma/version.h" #include "lzma/base.h" #include "lzma/vli.h" #include "lzma/check.h" /* Filters */ #include "lzma/filter.h" #include "lzma/bcj.h" #include "lzma/delta.h" #include "lzma/lzma12.h" /* Container formats */ #include "lzma/container.h" /* Advanced features */ #include "lzma/stream_flags.h" #include "lzma/block.h" #include "lzma/index.h" #include "lzma/index_hash.h" /* Hardware information */ #include "lzma/hardware.h" /* * All subheaders included. Undefine LZMA_H_INTERNAL to prevent applications * re-including the subheaders. */ #undef LZMA_H_INTERNAL #ifdef __cplusplus } #endif #endif /* ifndef LZMA_H */ Index: head/contrib/xz/src/liblzma/check/crc32_fast.c =================================================================== --- head/contrib/xz/src/liblzma/check/crc32_fast.c (revision 359200) +++ head/contrib/xz/src/liblzma/check/crc32_fast.c (revision 359201) @@ -1,82 +1,82 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file crc32.c /// \brief CRC32 calculation /// /// Calculate the CRC32 using the slice-by-eight algorithm. 
/// It is explained in this document: /// http://www.intel.com/technology/comms/perfnet/download/CRC_generators.pdf /// The code in this file is not the same as in Intel's paper, but /// the basic principle is identical. // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "check.h" #include "crc_macros.h" // If you make any changes, do some benchmarking! Seemingly unrelated // changes can very easily ruin the performance (and very probably is // very compiler dependent). extern LZMA_API(uint32_t) lzma_crc32(const uint8_t *buf, size_t size, uint32_t crc) { crc = ~crc; #ifdef WORDS_BIGENDIAN crc = bswap32(crc); #endif if (size > 8) { // Fix the alignment, if needed. The if statement above // ensures that this won't read past the end of buf[]. while ((uintptr_t)(buf) & 7) { crc = lzma_crc32_table[0][*buf++ ^ A(crc)] ^ S8(crc); --size; } // Calculate the position where to stop. const uint8_t *const limit = buf + (size & ~(size_t)(7)); // Calculate how many bytes must be calculated separately // before returning the result. size &= (size_t)(7); // Calculate the CRC32 using the slice-by-eight algorithm. while (buf < limit) { - crc ^= *(const uint32_t *)(buf); + crc ^= aligned_read32ne(buf); buf += 4; crc = lzma_crc32_table[7][A(crc)] ^ lzma_crc32_table[6][B(crc)] ^ lzma_crc32_table[5][C(crc)] ^ lzma_crc32_table[4][D(crc)]; - const uint32_t tmp = *(const uint32_t *)(buf); + const uint32_t tmp = aligned_read32ne(buf); buf += 4; // At least with some compilers, it is critical for // performance, that the crc variable is XORed // between the two table-lookup pairs. 
crc = lzma_crc32_table[3][A(tmp)] ^ lzma_crc32_table[2][B(tmp)] ^ crc ^ lzma_crc32_table[1][C(tmp)] ^ lzma_crc32_table[0][D(tmp)]; } } while (size-- != 0) crc = lzma_crc32_table[0][*buf++ ^ A(crc)] ^ S8(crc); #ifdef WORDS_BIGENDIAN crc = bswap32(crc); #endif return ~crc; } Index: head/contrib/xz/src/liblzma/check/crc32_table.c =================================================================== --- head/contrib/xz/src/liblzma/check/crc32_table.c (revision 359200) +++ head/contrib/xz/src/liblzma/check/crc32_table.c (revision 359201) @@ -1,19 +1,22 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file crc32_table.c /// \brief Precalculated CRC32 table with correct endianness // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "common.h" +// Having the declaration here silences clang -Wmissing-variable-declarations. +extern const uint32_t lzma_crc32_table[8][256]; + #ifdef WORDS_BIGENDIAN # include "crc32_table_be.h" #else # include "crc32_table_le.h" #endif Index: head/contrib/xz/src/liblzma/check/crc64_fast.c =================================================================== --- head/contrib/xz/src/liblzma/check/crc64_fast.c (revision 359200) +++ head/contrib/xz/src/liblzma/check/crc64_fast.c (revision 359201) @@ -1,72 +1,72 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file crc64.c /// \brief CRC64 calculation /// /// Calculate the CRC64 using the slice-by-four algorithm. This is the same /// idea that is used in crc32_fast.c, but for CRC64 we use only four tables /// instead of eight to avoid increasing CPU cache usage. // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. 
// /////////////////////////////////////////////////////////////////////////////// #include "check.h" #include "crc_macros.h" #ifdef WORDS_BIGENDIAN # define A1(x) ((x) >> 56) #else # define A1 A #endif // See the comments in crc32_fast.c. They aren't duplicated here. extern LZMA_API(uint64_t) lzma_crc64(const uint8_t *buf, size_t size, uint64_t crc) { crc = ~crc; #ifdef WORDS_BIGENDIAN crc = bswap64(crc); #endif if (size > 4) { while ((uintptr_t)(buf) & 3) { crc = lzma_crc64_table[0][*buf++ ^ A1(crc)] ^ S8(crc); --size; } const uint8_t *const limit = buf + (size & ~(size_t)(3)); size &= (size_t)(3); while (buf < limit) { #ifdef WORDS_BIGENDIAN const uint32_t tmp = (crc >> 32) - ^ *(const uint32_t *)(buf); + ^ aligned_read32ne(buf); #else - const uint32_t tmp = crc ^ *(const uint32_t *)(buf); + const uint32_t tmp = crc ^ aligned_read32ne(buf); #endif buf += 4; crc = lzma_crc64_table[3][A(tmp)] ^ lzma_crc64_table[2][B(tmp)] ^ S32(crc) ^ lzma_crc64_table[1][C(tmp)] ^ lzma_crc64_table[0][D(tmp)]; } } while (size-- != 0) crc = lzma_crc64_table[0][*buf++ ^ A1(crc)] ^ S8(crc); #ifdef WORDS_BIGENDIAN crc = bswap64(crc); #endif return ~crc; } Index: head/contrib/xz/src/liblzma/check/crc64_table.c =================================================================== --- head/contrib/xz/src/liblzma/check/crc64_table.c (revision 359200) +++ head/contrib/xz/src/liblzma/check/crc64_table.c (revision 359201) @@ -1,19 +1,22 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file crc64_table.c /// \brief Precalculated CRC64 table with correct endianness // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "common.h" +// Having the declaration here silences clang -Wmissing-variable-declarations. 
+extern const uint64_t lzma_crc64_table[4][256]; + #ifdef WORDS_BIGENDIAN # include "crc64_table_be.h" #else # include "crc64_table_le.h" #endif Index: head/contrib/xz/src/liblzma/common/alone_decoder.c =================================================================== --- head/contrib/xz/src/liblzma/common/alone_decoder.c (revision 359200) +++ head/contrib/xz/src/liblzma/common/alone_decoder.c (revision 359201) @@ -1,243 +1,242 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file alone_decoder.c /// \brief Decoder for LZMA_Alone files // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "alone_decoder.h" #include "lzma_decoder.h" #include "lz_decoder.h" typedef struct { lzma_next_coder next; enum { SEQ_PROPERTIES, SEQ_DICTIONARY_SIZE, SEQ_UNCOMPRESSED_SIZE, SEQ_CODER_INIT, SEQ_CODE, } sequence; /// If true, reject files that are unlikely to be .lzma files. /// If false, more non-.lzma files get accepted and will give /// LZMA_DATA_ERROR either immediately or after a few output bytes. 
bool picky; /// Position in the header fields size_t pos; /// Uncompressed size decoded from the header lzma_vli uncompressed_size; /// Memory usage limit uint64_t memlimit; /// Amount of memory actually needed (only an estimate) uint64_t memusage; /// Options decoded from the header needed to initialize /// the LZMA decoder lzma_options_lzma options; } lzma_alone_coder; static lzma_ret -alone_decode(void *coder_ptr, - const lzma_allocator *allocator lzma_attribute((__unused__)), +alone_decode(void *coder_ptr, const lzma_allocator *allocator, const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size, uint8_t *restrict out, size_t *restrict out_pos, size_t out_size, lzma_action action) { lzma_alone_coder *coder = coder_ptr; while (*out_pos < out_size && (coder->sequence == SEQ_CODE || *in_pos < in_size)) switch (coder->sequence) { case SEQ_PROPERTIES: if (lzma_lzma_lclppb_decode(&coder->options, in[*in_pos])) return LZMA_FORMAT_ERROR; coder->sequence = SEQ_DICTIONARY_SIZE; ++*in_pos; break; case SEQ_DICTIONARY_SIZE: coder->options.dict_size |= (size_t)(in[*in_pos]) << (coder->pos * 8); if (++coder->pos == 4) { if (coder->picky && coder->options.dict_size != UINT32_MAX) { // A hack to ditch tons of false positives: // We allow only dictionary sizes that are // 2^n or 2^n + 2^(n-1). LZMA_Alone created // only files with 2^n, but accepts any // dictionary size. uint32_t d = coder->options.dict_size - 1; d |= d >> 2; d |= d >> 3; d |= d >> 4; d |= d >> 8; d |= d >> 16; ++d; if (d != coder->options.dict_size) return LZMA_FORMAT_ERROR; } coder->pos = 0; coder->sequence = SEQ_UNCOMPRESSED_SIZE; } ++*in_pos; break; case SEQ_UNCOMPRESSED_SIZE: coder->uncompressed_size |= (lzma_vli)(in[*in_pos]) << (coder->pos * 8); ++*in_pos; if (++coder->pos < 8) break; // Another hack to ditch false positives: Assume that // if the uncompressed size is known, it must be less // than 256 GiB. 
if (coder->picky && coder->uncompressed_size != LZMA_VLI_UNKNOWN && coder->uncompressed_size >= (LZMA_VLI_C(1) << 38)) return LZMA_FORMAT_ERROR; // Calculate the memory usage so that it is ready // for SEQ_CODER_INIT. coder->memusage = lzma_lzma_decoder_memusage(&coder->options) + LZMA_MEMUSAGE_BASE; coder->pos = 0; coder->sequence = SEQ_CODER_INIT; // Fall through case SEQ_CODER_INIT: { if (coder->memusage > coder->memlimit) return LZMA_MEMLIMIT_ERROR; lzma_filter_info filters[2] = { { .init = &lzma_lzma_decoder_init, .options = &coder->options, }, { .init = NULL, } }; const lzma_ret ret = lzma_next_filter_init(&coder->next, allocator, filters); if (ret != LZMA_OK) return ret; // Use a hack to set the uncompressed size. lzma_lz_decoder_uncompressed(coder->next.coder, coder->uncompressed_size); coder->sequence = SEQ_CODE; break; } case SEQ_CODE: { return coder->next.code(coder->next.coder, allocator, in, in_pos, in_size, out, out_pos, out_size, action); } default: return LZMA_PROG_ERROR; } return LZMA_OK; } static void alone_decoder_end(void *coder_ptr, const lzma_allocator *allocator) { lzma_alone_coder *coder = coder_ptr; lzma_next_end(&coder->next, allocator); lzma_free(coder, allocator); return; } static lzma_ret alone_decoder_memconfig(void *coder_ptr, uint64_t *memusage, uint64_t *old_memlimit, uint64_t new_memlimit) { lzma_alone_coder *coder = coder_ptr; *memusage = coder->memusage; *old_memlimit = coder->memlimit; if (new_memlimit != 0) { if (new_memlimit < coder->memusage) return LZMA_MEMLIMIT_ERROR; coder->memlimit = new_memlimit; } return LZMA_OK; } extern lzma_ret lzma_alone_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator, uint64_t memlimit, bool picky) { lzma_next_coder_init(&lzma_alone_decoder_init, next, allocator); lzma_alone_coder *coder = next->coder; if (coder == NULL) { coder = lzma_alloc(sizeof(lzma_alone_coder), allocator); if (coder == NULL) return LZMA_MEM_ERROR; next->coder = coder; next->code = &alone_decode; next->end 
= &alone_decoder_end; next->memconfig = &alone_decoder_memconfig; coder->next = LZMA_NEXT_CODER_INIT; } coder->sequence = SEQ_PROPERTIES; coder->picky = picky; coder->pos = 0; coder->options.dict_size = 0; coder->options.preset_dict = NULL; coder->options.preset_dict_size = 0; coder->uncompressed_size = 0; coder->memlimit = my_max(1, memlimit); coder->memusage = LZMA_MEMUSAGE_BASE; return LZMA_OK; } extern LZMA_API(lzma_ret) lzma_alone_decoder(lzma_stream *strm, uint64_t memlimit) { lzma_next_strm_init(lzma_alone_decoder_init, strm, memlimit, false); strm->internal->supported_actions[LZMA_RUN] = true; strm->internal->supported_actions[LZMA_FINISH] = true; return LZMA_OK; } Index: head/contrib/xz/src/liblzma/common/alone_encoder.c =================================================================== --- head/contrib/xz/src/liblzma/common/alone_encoder.c (revision 359200) +++ head/contrib/xz/src/liblzma/common/alone_encoder.c (revision 359201) @@ -1,163 +1,162 @@ /////////////////////////////////////////////////////////////////////////////// // -/// \file alone_decoder.c -/// \brief Decoder for LZMA_Alone files +/// \file alone_encoder.c +/// \brief Encoder for LZMA_Alone files // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. 
// /////////////////////////////////////////////////////////////////////////////// #include "common.h" #include "lzma_encoder.h" #define ALONE_HEADER_SIZE (1 + 4 + 8) typedef struct { lzma_next_coder next; enum { SEQ_HEADER, SEQ_CODE, } sequence; size_t header_pos; uint8_t header[ALONE_HEADER_SIZE]; } lzma_alone_coder; static lzma_ret -alone_encode(void *coder_ptr, - const lzma_allocator *allocator lzma_attribute((__unused__)), +alone_encode(void *coder_ptr, const lzma_allocator *allocator, const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size, uint8_t *restrict out, size_t *restrict out_pos, size_t out_size, lzma_action action) { lzma_alone_coder *coder = coder_ptr; while (*out_pos < out_size) switch (coder->sequence) { case SEQ_HEADER: lzma_bufcpy(coder->header, &coder->header_pos, ALONE_HEADER_SIZE, out, out_pos, out_size); if (coder->header_pos < ALONE_HEADER_SIZE) return LZMA_OK; coder->sequence = SEQ_CODE; break; case SEQ_CODE: return coder->next.code(coder->next.coder, allocator, in, in_pos, in_size, out, out_pos, out_size, action); default: assert(0); return LZMA_PROG_ERROR; } return LZMA_OK; } static void alone_encoder_end(void *coder_ptr, const lzma_allocator *allocator) { lzma_alone_coder *coder = coder_ptr; lzma_next_end(&coder->next, allocator); lzma_free(coder, allocator); return; } // At least for now, this is not used by any internal function. 
static lzma_ret alone_encoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_options_lzma *options) { lzma_next_coder_init(&alone_encoder_init, next, allocator); lzma_alone_coder *coder = next->coder; if (coder == NULL) { coder = lzma_alloc(sizeof(lzma_alone_coder), allocator); if (coder == NULL) return LZMA_MEM_ERROR; next->coder = coder; next->code = &alone_encode; next->end = &alone_encoder_end; coder->next = LZMA_NEXT_CODER_INIT; } // Basic initializations coder->sequence = SEQ_HEADER; coder->header_pos = 0; // Encode the header: // - Properties (1 byte) if (lzma_lzma_lclppb_encode(options, coder->header)) return LZMA_OPTIONS_ERROR; // - Dictionary size (4 bytes) if (options->dict_size < LZMA_DICT_SIZE_MIN) return LZMA_OPTIONS_ERROR; // Round up to the next 2^n or 2^n + 2^(n - 1) depending on which // one is the next unless it is UINT32_MAX. While the header would // allow any 32-bit integer, we do this to keep the decoder of liblzma // accepting the resulting files. uint32_t d = options->dict_size - 1; d |= d >> 2; d |= d >> 3; d |= d >> 4; d |= d >> 8; d |= d >> 16; if (d != UINT32_MAX) ++d; - unaligned_write32le(coder->header + 1, d); + write32le(coder->header + 1, d); // - Uncompressed size (always unknown and using EOPM) memset(coder->header + 1 + 4, 0xFF, 8); // Initialize the LZMA encoder. 
const lzma_filter_info filters[2] = { { .init = &lzma_lzma_encoder_init, .options = (void *)(options), }, { .init = NULL, } }; return lzma_next_filter_init(&coder->next, allocator, filters); } /* extern lzma_ret lzma_alone_encoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_options_alone *options) { lzma_next_coder_init(&alone_encoder_init, next, allocator, options); } */ extern LZMA_API(lzma_ret) lzma_alone_encoder(lzma_stream *strm, const lzma_options_lzma *options) { lzma_next_strm_init(alone_encoder_init, strm, options); strm->internal->supported_actions[LZMA_RUN] = true; strm->internal->supported_actions[LZMA_FINISH] = true; return LZMA_OK; } Index: head/contrib/xz/src/liblzma/common/block_header_decoder.c =================================================================== --- head/contrib/xz/src/liblzma/common/block_header_decoder.c (revision 359200) +++ head/contrib/xz/src/liblzma/common/block_header_decoder.c (revision 359201) @@ -1,124 +1,124 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file block_header_decoder.c /// \brief Decodes Block Header from .xz files // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "common.h" #include "check.h" static void free_properties(lzma_block *block, const lzma_allocator *allocator) { // Free allocated filter options. The last array member is not // touched after the initialization in the beginning of // lzma_block_header_decode(), so we don't need to touch that here. 
for (size_t i = 0; i < LZMA_FILTERS_MAX; ++i) { lzma_free(block->filters[i].options, allocator); block->filters[i].id = LZMA_VLI_UNKNOWN; block->filters[i].options = NULL; } return; } extern LZMA_API(lzma_ret) lzma_block_header_decode(lzma_block *block, const lzma_allocator *allocator, const uint8_t *in) { // NOTE: We consider the header to be corrupt not only when the // CRC32 doesn't match, but also when variable-length integers // are invalid or over 63 bits, or if the header is too small // to contain the claimed information. // Initialize the filter options array. This way the caller can // safely free() the options even if an error occurs in this function. for (size_t i = 0; i <= LZMA_FILTERS_MAX; ++i) { block->filters[i].id = LZMA_VLI_UNKNOWN; block->filters[i].options = NULL; } // Versions 0 and 1 are supported. If a newer version was specified, // we need to downgrade it. if (block->version > 1) block->version = 1; // This isn't a Block Header option, but since the decompressor will // read it if version >= 1, it's better to initialize it here than // to expect the caller to do it since in almost all cases this // should be false. block->ignore_check = false; // Validate Block Header Size and Check type. The caller must have // already set these, so it is a programming error if this test fails. if (lzma_block_header_size_decode(in[0]) != block->header_size || (unsigned int)(block->check) > LZMA_CHECK_ID_MAX) return LZMA_PROG_ERROR; // Exclude the CRC32 field. const size_t in_size = block->header_size - 4; // Verify CRC32 - if (lzma_crc32(in, in_size, 0) != unaligned_read32le(in + in_size)) + if (lzma_crc32(in, in_size, 0) != read32le(in + in_size)) return LZMA_DATA_ERROR; // Check for unsupported flags. if (in[1] & 0x3C) return LZMA_OPTIONS_ERROR; // Start after the Block Header Size and Block Flags fields. 
size_t in_pos = 2; // Compressed Size if (in[1] & 0x40) { return_if_error(lzma_vli_decode(&block->compressed_size, NULL, in, &in_pos, in_size)); // Validate Compressed Size. This checks that it isn't zero // and that the total size of the Block is a valid VLI. if (lzma_block_unpadded_size(block) == 0) return LZMA_DATA_ERROR; } else { block->compressed_size = LZMA_VLI_UNKNOWN; } // Uncompressed Size if (in[1] & 0x80) return_if_error(lzma_vli_decode(&block->uncompressed_size, NULL, in, &in_pos, in_size)); else block->uncompressed_size = LZMA_VLI_UNKNOWN; // Filter Flags - const size_t filter_count = (in[1] & 3) + 1; + const size_t filter_count = (in[1] & 3U) + 1; for (size_t i = 0; i < filter_count; ++i) { const lzma_ret ret = lzma_filter_flags_decode( &block->filters[i], allocator, in, &in_pos, in_size); if (ret != LZMA_OK) { free_properties(block, allocator); return ret; } } // Padding while (in_pos < in_size) { if (in[in_pos++] != 0x00) { free_properties(block, allocator); // Possibly some new field present so use // LZMA_OPTIONS_ERROR instead of LZMA_DATA_ERROR. return LZMA_OPTIONS_ERROR; } } return LZMA_OK; } Index: head/contrib/xz/src/liblzma/common/block_header_encoder.c =================================================================== --- head/contrib/xz/src/liblzma/common/block_header_encoder.c (revision 359200) +++ head/contrib/xz/src/liblzma/common/block_header_encoder.c (revision 359201) @@ -1,132 +1,132 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file block_header_encoder.c /// \brief Encodes Block Header for .xz files // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. 
// /////////////////////////////////////////////////////////////////////////////// #include "common.h" #include "check.h" extern LZMA_API(lzma_ret) lzma_block_header_size(lzma_block *block) { if (block->version > 1) return LZMA_OPTIONS_ERROR; // Block Header Size + Block Flags + CRC32. uint32_t size = 1 + 1 + 4; // Compressed Size if (block->compressed_size != LZMA_VLI_UNKNOWN) { const uint32_t add = lzma_vli_size(block->compressed_size); if (add == 0 || block->compressed_size == 0) return LZMA_PROG_ERROR; size += add; } // Uncompressed Size if (block->uncompressed_size != LZMA_VLI_UNKNOWN) { const uint32_t add = lzma_vli_size(block->uncompressed_size); if (add == 0) return LZMA_PROG_ERROR; size += add; } // List of Filter Flags if (block->filters == NULL || block->filters[0].id == LZMA_VLI_UNKNOWN) return LZMA_PROG_ERROR; for (size_t i = 0; block->filters[i].id != LZMA_VLI_UNKNOWN; ++i) { // Don't allow too many filters. if (i == LZMA_FILTERS_MAX) return LZMA_PROG_ERROR; uint32_t add; return_if_error(lzma_filter_flags_size(&add, block->filters + i)); size += add; } // Pad to a multiple of four bytes. block->header_size = (size + 3) & ~UINT32_C(3); // NOTE: We don't verify that the encoded size of the Block stays // within limits. This is because it is possible that we are called // with exaggerated Compressed Size (e.g. LZMA_VLI_MAX) to reserve // space for Block Header, and later called again with lower, // real values. return LZMA_OK; } extern LZMA_API(lzma_ret) lzma_block_header_encode(const lzma_block *block, uint8_t *out) { // Validate everything but filters. if (lzma_block_unpadded_size(block) == 0 || !lzma_vli_is_valid(block->uncompressed_size)) return LZMA_PROG_ERROR; // Indicate the size of the buffer _excluding_ the CRC32 field. const size_t out_size = block->header_size - 4; // Store the Block Header Size. out[0] = out_size / 4; // We write Block Flags in pieces. 
out[1] = 0x00; size_t out_pos = 2; // Compressed Size if (block->compressed_size != LZMA_VLI_UNKNOWN) { return_if_error(lzma_vli_encode(block->compressed_size, NULL, out, &out_pos, out_size)); out[1] |= 0x40; } // Uncompressed Size if (block->uncompressed_size != LZMA_VLI_UNKNOWN) { return_if_error(lzma_vli_encode(block->uncompressed_size, NULL, out, &out_pos, out_size)); out[1] |= 0x80; } // Filter Flags if (block->filters == NULL || block->filters[0].id == LZMA_VLI_UNKNOWN) return LZMA_PROG_ERROR; size_t filter_count = 0; do { // There can be a maximum of four filters. if (filter_count == LZMA_FILTERS_MAX) return LZMA_PROG_ERROR; return_if_error(lzma_filter_flags_encode( block->filters + filter_count, out, &out_pos, out_size)); } while (block->filters[++filter_count].id != LZMA_VLI_UNKNOWN); out[1] |= filter_count - 1; // Padding memzero(out + out_pos, out_size - out_pos); // CRC32 - unaligned_write32le(out + out_size, lzma_crc32(out, out_size, 0)); + write32le(out + out_size, lzma_crc32(out, out_size, 0)); return LZMA_OK; } Index: head/contrib/xz/src/liblzma/common/block_util.c =================================================================== --- head/contrib/xz/src/liblzma/common/block_util.c (revision 359200) +++ head/contrib/xz/src/liblzma/common/block_util.c (revision 359201) @@ -1,90 +1,90 @@ /////////////////////////////////////////////////////////////////////////////// // -/// \file block_header.c +/// \file block_util.c /// \brief Utility functions to handle lzma_block // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "common.h" #include "index.h" extern LZMA_API(lzma_ret) lzma_block_compressed_size(lzma_block *block, lzma_vli unpadded_size) { // Validate everything but Uncompressed Size and filters. 
if (lzma_block_unpadded_size(block) == 0) return LZMA_PROG_ERROR; const uint32_t container_size = block->header_size + lzma_check_size(block->check); // Validate that Compressed Size will be greater than zero. if (unpadded_size <= container_size) return LZMA_DATA_ERROR; // Calculate what Compressed Size is supposed to be. // If Compressed Size was present in Block Header, // compare that the new value matches it. const lzma_vli compressed_size = unpadded_size - container_size; if (block->compressed_size != LZMA_VLI_UNKNOWN && block->compressed_size != compressed_size) return LZMA_DATA_ERROR; block->compressed_size = compressed_size; return LZMA_OK; } extern LZMA_API(lzma_vli) lzma_block_unpadded_size(const lzma_block *block) { // Validate the values that we are interested in i.e. all but // Uncompressed Size and the filters. // // NOTE: This function is used for validation too, so it is // essential that these checks are always done even if // Compressed Size is unknown. if (block == NULL || block->version > 1 || block->header_size < LZMA_BLOCK_HEADER_SIZE_MIN || block->header_size > LZMA_BLOCK_HEADER_SIZE_MAX || (block->header_size & 3) || !lzma_vli_is_valid(block->compressed_size) || block->compressed_size == 0 || (unsigned int)(block->check) > LZMA_CHECK_ID_MAX) return 0; // If Compressed Size is unknown, return that we cannot know // size of the Block either. if (block->compressed_size == LZMA_VLI_UNKNOWN) return LZMA_VLI_UNKNOWN; // Calculate Unpadded Size and validate it. 
const lzma_vli unpadded_size = block->compressed_size + block->header_size + lzma_check_size(block->check); assert(unpadded_size >= UNPADDED_SIZE_MIN); if (unpadded_size > UNPADDED_SIZE_MAX) return 0; return unpadded_size; } extern LZMA_API(lzma_vli) lzma_block_total_size(const lzma_block *block) { lzma_vli unpadded_size = lzma_block_unpadded_size(block); if (unpadded_size != LZMA_VLI_UNKNOWN) unpadded_size = vli_ceil4(unpadded_size); return unpadded_size; } Index: head/contrib/xz/src/liblzma/common/common.c =================================================================== --- head/contrib/xz/src/liblzma/common/common.c (revision 359200) +++ head/contrib/xz/src/liblzma/common/common.c (revision 359201) @@ -1,445 +1,449 @@ /////////////////////////////////////////////////////////////////////////////// // -/// \file common.h +/// \file common.c /// \brief Common functions needed in many places in liblzma // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "common.h" ///////////// // Version // ///////////// extern LZMA_API(uint32_t) lzma_version_number(void) { return LZMA_VERSION; } extern LZMA_API(const char *) lzma_version_string(void) { return LZMA_VERSION_STRING; } /////////////////////// // Memory allocation // /////////////////////// extern void * lzma_attribute((__malloc__)) lzma_attr_alloc_size(1) lzma_alloc(size_t size, const lzma_allocator *allocator) { // Some malloc() variants return NULL if called with size == 0. if (size == 0) size = 1; void *ptr; if (allocator != NULL && allocator->alloc != NULL) ptr = allocator->alloc(allocator->opaque, 1, size); else ptr = malloc(size); return ptr; } extern void * lzma_attribute((__malloc__)) lzma_attr_alloc_size(1) lzma_alloc_zero(size_t size, const lzma_allocator *allocator) { // Some calloc() variants return NULL if called with size == 0. 
if (size == 0) size = 1; void *ptr; if (allocator != NULL && allocator->alloc != NULL) { ptr = allocator->alloc(allocator->opaque, 1, size); if (ptr != NULL) memzero(ptr, size); } else { ptr = calloc(1, size); } return ptr; } extern void lzma_free(void *ptr, const lzma_allocator *allocator) { if (allocator != NULL && allocator->free != NULL) allocator->free(allocator->opaque, ptr); else free(ptr); return; } ////////// // Misc // ////////// extern size_t lzma_bufcpy(const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size, uint8_t *restrict out, size_t *restrict out_pos, size_t out_size) { const size_t in_avail = in_size - *in_pos; const size_t out_avail = out_size - *out_pos; const size_t copy_size = my_min(in_avail, out_avail); - memcpy(out + *out_pos, in + *in_pos, copy_size); + // Call memcpy() only if there is something to copy. If there is + // nothing to copy, in or out might be NULL and then the memcpy() + // call would trigger undefined behavior. + if (copy_size > 0) + memcpy(out + *out_pos, in + *in_pos, copy_size); *in_pos += copy_size; *out_pos += copy_size; return copy_size; } extern lzma_ret lzma_next_filter_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { lzma_next_coder_init(filters[0].init, next, allocator); next->id = filters[0].id; return filters[0].init == NULL ? LZMA_OK : filters[0].init(next, allocator, filters); } extern lzma_ret lzma_next_filter_update(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter *reversed_filters) { // Check that the application isn't trying to change the Filter ID. // End of filters is indicated with LZMA_VLI_UNKNOWN in both // reversed_filters[0].id and next->id. 
if (reversed_filters[0].id != next->id) return LZMA_PROG_ERROR; if (reversed_filters[0].id == LZMA_VLI_UNKNOWN) return LZMA_OK; assert(next->update != NULL); return next->update(next->coder, allocator, NULL, reversed_filters); } extern void lzma_next_end(lzma_next_coder *next, const lzma_allocator *allocator) { if (next->init != (uintptr_t)(NULL)) { // To avoid tiny end functions that simply call // lzma_free(coder, allocator), we allow leaving next->end // NULL and call lzma_free() here. if (next->end != NULL) next->end(next->coder, allocator); else lzma_free(next->coder, allocator); // Reset the variables so the we don't accidentally think // that it is an already initialized coder. *next = LZMA_NEXT_CODER_INIT; } return; } ////////////////////////////////////// // External to internal API wrapper // ////////////////////////////////////// extern lzma_ret lzma_strm_init(lzma_stream *strm) { if (strm == NULL) return LZMA_PROG_ERROR; if (strm->internal == NULL) { strm->internal = lzma_alloc(sizeof(lzma_internal), strm->allocator); if (strm->internal == NULL) return LZMA_MEM_ERROR; strm->internal->next = LZMA_NEXT_CODER_INIT; } memzero(strm->internal->supported_actions, sizeof(strm->internal->supported_actions)); strm->internal->sequence = ISEQ_RUN; strm->internal->allow_buf_error = false; strm->total_in = 0; strm->total_out = 0; return LZMA_OK; } extern LZMA_API(lzma_ret) lzma_code(lzma_stream *strm, lzma_action action) { // Sanity checks if ((strm->next_in == NULL && strm->avail_in != 0) || (strm->next_out == NULL && strm->avail_out != 0) || strm->internal == NULL || strm->internal->next.code == NULL || (unsigned int)(action) > LZMA_ACTION_MAX || !strm->internal->supported_actions[action]) return LZMA_PROG_ERROR; // Check if unsupported members have been set to non-zero or non-NULL, // which would indicate that some new feature is wanted. 
if (strm->reserved_ptr1 != NULL || strm->reserved_ptr2 != NULL || strm->reserved_ptr3 != NULL || strm->reserved_ptr4 != NULL || strm->reserved_int1 != 0 || strm->reserved_int2 != 0 || strm->reserved_int3 != 0 || strm->reserved_int4 != 0 || strm->reserved_enum1 != LZMA_RESERVED_ENUM || strm->reserved_enum2 != LZMA_RESERVED_ENUM) return LZMA_OPTIONS_ERROR; switch (strm->internal->sequence) { case ISEQ_RUN: switch (action) { case LZMA_RUN: break; case LZMA_SYNC_FLUSH: strm->internal->sequence = ISEQ_SYNC_FLUSH; break; case LZMA_FULL_FLUSH: strm->internal->sequence = ISEQ_FULL_FLUSH; break; case LZMA_FINISH: strm->internal->sequence = ISEQ_FINISH; break; case LZMA_FULL_BARRIER: strm->internal->sequence = ISEQ_FULL_BARRIER; break; } break; case ISEQ_SYNC_FLUSH: // The same action must be used until we return // LZMA_STREAM_END, and the amount of input must not change. if (action != LZMA_SYNC_FLUSH || strm->internal->avail_in != strm->avail_in) return LZMA_PROG_ERROR; break; case ISEQ_FULL_FLUSH: if (action != LZMA_FULL_FLUSH || strm->internal->avail_in != strm->avail_in) return LZMA_PROG_ERROR; break; case ISEQ_FINISH: if (action != LZMA_FINISH || strm->internal->avail_in != strm->avail_in) return LZMA_PROG_ERROR; break; case ISEQ_FULL_BARRIER: if (action != LZMA_FULL_BARRIER || strm->internal->avail_in != strm->avail_in) return LZMA_PROG_ERROR; break; case ISEQ_END: return LZMA_STREAM_END; case ISEQ_ERROR: default: return LZMA_PROG_ERROR; } size_t in_pos = 0; size_t out_pos = 0; lzma_ret ret = strm->internal->next.code( strm->internal->next.coder, strm->allocator, strm->next_in, &in_pos, strm->avail_in, strm->next_out, &out_pos, strm->avail_out, action); strm->next_in += in_pos; strm->avail_in -= in_pos; strm->total_in += in_pos; strm->next_out += out_pos; strm->avail_out -= out_pos; strm->total_out += out_pos; strm->internal->avail_in = strm->avail_in; // Cast is needed to silence a warning about LZMA_TIMED_OUT, which // isn't part of lzma_ret enumeration. 
switch ((unsigned int)(ret)) { case LZMA_OK: // Don't return LZMA_BUF_ERROR when it happens the first time. // This is to avoid returning LZMA_BUF_ERROR when avail_out // was zero but still there was no more data left to written // to next_out. if (out_pos == 0 && in_pos == 0) { if (strm->internal->allow_buf_error) ret = LZMA_BUF_ERROR; else strm->internal->allow_buf_error = true; } else { strm->internal->allow_buf_error = false; } break; case LZMA_TIMED_OUT: strm->internal->allow_buf_error = false; ret = LZMA_OK; break; case LZMA_STREAM_END: if (strm->internal->sequence == ISEQ_SYNC_FLUSH || strm->internal->sequence == ISEQ_FULL_FLUSH || strm->internal->sequence == ISEQ_FULL_BARRIER) strm->internal->sequence = ISEQ_RUN; else strm->internal->sequence = ISEQ_END; // Fall through case LZMA_NO_CHECK: case LZMA_UNSUPPORTED_CHECK: case LZMA_GET_CHECK: case LZMA_MEMLIMIT_ERROR: // Something else than LZMA_OK, but not a fatal error, // that is, coding may be continued (except if ISEQ_END). strm->internal->allow_buf_error = false; break; default: // All the other errors are fatal; coding cannot be continued. assert(ret != LZMA_BUF_ERROR); strm->internal->sequence = ISEQ_ERROR; break; } return ret; } extern LZMA_API(void) lzma_end(lzma_stream *strm) { if (strm != NULL && strm->internal != NULL) { lzma_next_end(&strm->internal->next, strm->allocator); lzma_free(strm->internal, strm->allocator); strm->internal = NULL; } return; } extern LZMA_API(void) lzma_get_progress(lzma_stream *strm, uint64_t *progress_in, uint64_t *progress_out) { if (strm->internal->next.get_progress != NULL) { strm->internal->next.get_progress(strm->internal->next.coder, progress_in, progress_out); } else { *progress_in = strm->total_in; *progress_out = strm->total_out; } return; } extern LZMA_API(lzma_check) lzma_get_check(const lzma_stream *strm) { // Return LZMA_CHECK_NONE if we cannot know the check type. // It's a bug in the application if this happens. 
if (strm->internal->next.get_check == NULL) return LZMA_CHECK_NONE; return strm->internal->next.get_check(strm->internal->next.coder); } extern LZMA_API(uint64_t) lzma_memusage(const lzma_stream *strm) { uint64_t memusage; uint64_t old_memlimit; if (strm == NULL || strm->internal == NULL || strm->internal->next.memconfig == NULL || strm->internal->next.memconfig( strm->internal->next.coder, &memusage, &old_memlimit, 0) != LZMA_OK) return 0; return memusage; } extern LZMA_API(uint64_t) lzma_memlimit_get(const lzma_stream *strm) { uint64_t old_memlimit; uint64_t memusage; if (strm == NULL || strm->internal == NULL || strm->internal->next.memconfig == NULL || strm->internal->next.memconfig( strm->internal->next.coder, &memusage, &old_memlimit, 0) != LZMA_OK) return 0; return old_memlimit; } extern LZMA_API(lzma_ret) lzma_memlimit_set(lzma_stream *strm, uint64_t new_memlimit) { // Dummy variables to simplify memconfig functions uint64_t old_memlimit; uint64_t memusage; if (strm == NULL || strm->internal == NULL || strm->internal->next.memconfig == NULL) return LZMA_PROG_ERROR; // Zero is a special value that cannot be used as an actual limit. // If 0 was specified, use 1 instead. if (new_memlimit == 0) new_memlimit = 1; return strm->internal->next.memconfig(strm->internal->next.coder, &memusage, &old_memlimit, new_memlimit); } Index: head/contrib/xz/src/liblzma/common/filter_common.h =================================================================== --- head/contrib/xz/src/liblzma/common/filter_common.h (revision 359200) +++ head/contrib/xz/src/liblzma/common/filter_common.h (revision 359201) @@ -1,48 +1,48 @@ /////////////////////////////////////////////////////////////////////////////// // -/// \file filter_common.c +/// \file filter_common.h /// \brief Filter-specific stuff common for both encoder and decoder // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. 
// /////////////////////////////////////////////////////////////////////////////// #ifndef LZMA_FILTER_COMMON_H #define LZMA_FILTER_COMMON_H #include "common.h" /// Both lzma_filter_encoder and lzma_filter_decoder begin with these members. typedef struct { /// Filter ID lzma_vli id; /// Initializes the filter encoder and calls lzma_next_filter_init() /// for filters + 1. lzma_init_function init; /// Calculates memory usage of the encoder. If the options are /// invalid, UINT64_MAX is returned. uint64_t (*memusage)(const void *options); } lzma_filter_coder; typedef const lzma_filter_coder *(*lzma_filter_find)(lzma_vli id); extern lzma_ret lzma_raw_coder_init( lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter *filters, lzma_filter_find coder_find, bool is_encoder); extern uint64_t lzma_raw_coder_memusage(lzma_filter_find coder_find, const lzma_filter *filters); #endif Index: head/contrib/xz/src/liblzma/common/filter_decoder.h =================================================================== --- head/contrib/xz/src/liblzma/common/filter_decoder.h (revision 359200) +++ head/contrib/xz/src/liblzma/common/filter_decoder.h (revision 359201) @@ -1,23 +1,23 @@ /////////////////////////////////////////////////////////////////////////////// // -/// \file filter_decoder.c +/// \file filter_decoder.h /// \brief Filter ID mapping to filter-specific functions // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. 
// /////////////////////////////////////////////////////////////////////////////// #ifndef LZMA_FILTER_DECODER_H #define LZMA_FILTER_DECODER_H #include "common.h" extern lzma_ret lzma_raw_decoder_init( lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter *options); #endif Index: head/contrib/xz/src/liblzma/common/filter_flags_encoder.c =================================================================== --- head/contrib/xz/src/liblzma/common/filter_flags_encoder.c (revision 359200) +++ head/contrib/xz/src/liblzma/common/filter_flags_encoder.c (revision 359201) @@ -1,56 +1,56 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file filter_flags_encoder.c -/// \brief Decodes a Filter Flags field +/// \brief Encodes a Filter Flags field // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "filter_encoder.h" extern LZMA_API(lzma_ret) lzma_filter_flags_size(uint32_t *size, const lzma_filter *filter) { if (filter->id >= LZMA_FILTER_RESERVED_START) return LZMA_PROG_ERROR; return_if_error(lzma_properties_size(size, filter)); *size += lzma_vli_size(filter->id) + lzma_vli_size(*size); return LZMA_OK; } extern LZMA_API(lzma_ret) lzma_filter_flags_encode(const lzma_filter *filter, uint8_t *out, size_t *out_pos, size_t out_size) { // Filter ID if (filter->id >= LZMA_FILTER_RESERVED_START) return LZMA_PROG_ERROR; return_if_error(lzma_vli_encode(filter->id, NULL, out, out_pos, out_size)); // Size of Properties uint32_t props_size; return_if_error(lzma_properties_size(&props_size, filter)); return_if_error(lzma_vli_encode(props_size, NULL, out, out_pos, out_size)); // Filter Properties if (out_size - *out_pos < props_size) return LZMA_PROG_ERROR; return_if_error(lzma_properties_encode(filter, out + *out_pos)); *out_pos += props_size; return LZMA_OK; } Index: 
head/contrib/xz/src/liblzma/common/hardware_physmem.c =================================================================== --- head/contrib/xz/src/liblzma/common/hardware_physmem.c (revision 359200) +++ head/contrib/xz/src/liblzma/common/hardware_physmem.c (revision 359201) @@ -1,25 +1,25 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file hardware_physmem.c /// \brief Get the total amount of physical memory (RAM) // // Author: Jonathan Nieder // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "common.h" #include "tuklib_physmem.h" extern LZMA_API(uint64_t) lzma_physmem(void) { // It is simpler to make lzma_physmem() a wrapper for - // tuklib_physmem() than to hack appropriate symbol visiblity + // tuklib_physmem() than to hack appropriate symbol visibility // support for the tuklib modules. return tuklib_physmem(); } Index: head/contrib/xz/src/liblzma/common/index.c =================================================================== --- head/contrib/xz/src/liblzma/common/index.c (revision 359200) +++ head/contrib/xz/src/liblzma/common/index.c (revision 359201) @@ -1,1250 +1,1250 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file index.c /// \brief Handling of .xz Indexes and some other Stream information // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "index.h" #include "stream_flags_common.h" /// \brief How many Records to allocate at once /// /// This should be big enough to avoid making lots of tiny allocations /// but small enough to avoid too much unused memory at once. 
#define INDEX_GROUP_SIZE 512 /// \brief How many Records can be allocated at once at maximum #define PREALLOC_MAX ((SIZE_MAX - sizeof(index_group)) / sizeof(index_record)) /// \brief Base structure for index_stream and index_group structures typedef struct index_tree_node_s index_tree_node; struct index_tree_node_s { /// Uncompressed start offset of this Stream (relative to the /// beginning of the file) or Block (relative to the beginning /// of the Stream) lzma_vli uncompressed_base; /// Compressed start offset of this Stream or Block lzma_vli compressed_base; index_tree_node *parent; index_tree_node *left; index_tree_node *right; }; /// \brief AVL tree to hold index_stream or index_group structures typedef struct { /// Root node index_tree_node *root; /// Leftmost node. Since the tree will be filled sequentially, /// this won't change after the first node has been added to /// the tree. index_tree_node *leftmost; /// The rightmost node in the tree. Since the tree is filled /// sequentially, this is always the node where to add the new data. index_tree_node *rightmost; /// Number of nodes in the tree uint32_t count; } index_tree; typedef struct { lzma_vli uncompressed_sum; lzma_vli unpadded_sum; } index_record; typedef struct { /// Every Record group is part of index_stream.groups tree. index_tree_node node; /// Number of Blocks in this Stream before this group. lzma_vli number_base; /// Number of Records that can be put in records[]. size_t allocated; /// Index of the last Record in use. size_t last; /// The sizes in this array are stored as cumulative sums relative /// to the beginning of the Stream. This makes it possible to /// use binary search in lzma_index_locate(). /// /// Note that the cumulative summing is done specially for /// unpadded_sum: The previous value is rounded up to the next /// multiple of four before adding the Unpadded Size of the new /// Block. 
The total encoded size of the Blocks in the Stream /// is records[last].unpadded_sum in the last Record group of /// the Stream. /// /// For example, if the Unpadded Sizes are 39, 57, and 81, the /// stored values are 39, 97 (40 + 57), and 181 (100 + 81). /// The total encoded size of these Blocks is 184. /// /// This is a flexible array, because it makes it easy to optimize /// memory usage in case someone concatenates many Streams that /// have only one or few Blocks. index_record records[]; } index_group; typedef struct { - /// Every index_stream is a node in the tree of Sreams. + /// Every index_stream is a node in the tree of Streams. index_tree_node node; /// Number of this Stream (first one is 1) uint32_t number; /// Total number of Blocks before this Stream lzma_vli block_number_base; /// Record groups of this Stream are stored in a tree. /// It's a T-tree with AVL-tree balancing. There are /// INDEX_GROUP_SIZE Records per node by default. /// This keeps the number of memory allocations reasonable /// and finding a Record is fast. index_tree groups; /// Number of Records in this Stream lzma_vli record_count; /// Size of the List of Records field in this Stream. This is used /// together with record_count to calculate the size of the Index /// field and thus the total size of the Stream. lzma_vli index_list_size; /// Stream Flags of this Stream. This is meaningful only if /// the Stream Flags have been told us with lzma_index_stream_flags(). /// Initially stream_flags.version is set to UINT32_MAX to indicate /// that the Stream Flags are unknown. lzma_stream_flags stream_flags; /// Amount of Stream Padding after this Stream. This defaults to /// zero and can be set with lzma_index_stream_padding(). lzma_vli stream_padding; } index_stream; struct lzma_index_s { /// AVL-tree containing the Stream(s). Often there is just one /// Stream, but using a tree keeps lookups fast even when there /// are many concatenated Streams.
index_tree streams; /// Uncompressed size of all the Blocks in the Stream(s) lzma_vli uncompressed_size; /// Total size of all the Blocks in the Stream(s) lzma_vli total_size; /// Total number of Records in all Streams in this lzma_index lzma_vli record_count; /// Size of the List of Records field if all the Streams in this /// lzma_index were packed into a single Stream (makes it simpler to /// take many .xz files and combine them into a single Stream). /// /// This value together with record_count is needed to calculate /// Backward Size that is stored into Stream Footer. lzma_vli index_list_size; /// How many Records to allocate at once in lzma_index_append(). - /// This defaults to INDEX_GROUP_SIZE but can be overriden with + /// This defaults to INDEX_GROUP_SIZE but can be overridden with /// lzma_index_prealloc(). size_t prealloc; /// Bitmask indicating what integrity check types have been used /// as set by lzma_index_stream_flags(). The bit of the last Stream /// is not included here, since it is possible to change it by /// calling lzma_index_stream_flags() again. uint32_t checks; }; static void index_tree_init(index_tree *tree) { tree->root = NULL; tree->leftmost = NULL; tree->rightmost = NULL; tree->count = 0; return; } /// Helper for index_tree_end() static void index_tree_node_end(index_tree_node *node, const lzma_allocator *allocator, void (*free_func)(void *node, const lzma_allocator *allocator)) { // The tree won't ever be very huge, so recursion should be fine. // 20 levels in the tree is likely quite a lot already in practice. if (node->left != NULL) index_tree_node_end(node->left, allocator, free_func); if (node->right != NULL) index_tree_node_end(node->right, allocator, free_func); free_func(node, allocator); return; } /// Free the memory allocated for a tree. Each node is freed using the /// given free_func which is either &lzma_free or &index_stream_end. 
/// The latter is used to free the Record groups from each index_stream /// before freeing the index_stream itself. static void index_tree_end(index_tree *tree, const lzma_allocator *allocator, void (*free_func)(void *node, const lzma_allocator *allocator)) { assert(free_func != NULL); if (tree->root != NULL) index_tree_node_end(tree->root, allocator, free_func); return; } /// Add a new node to the tree. node->uncompressed_base and /// node->compressed_base must have been set by the caller already. static void index_tree_append(index_tree *tree, index_tree_node *node) { node->parent = tree->rightmost; node->left = NULL; node->right = NULL; ++tree->count; // Handle the special case of adding the first node. if (tree->root == NULL) { tree->root = node; tree->leftmost = node; tree->rightmost = node; return; } // The tree is always filled sequentially. assert(tree->rightmost->uncompressed_base <= node->uncompressed_base); assert(tree->rightmost->compressed_base < node->compressed_base); // Add the new node after the rightmost node. It's the correct // place due to the reason above. tree->rightmost->right = node; tree->rightmost = node; // Balance the AVL-tree if needed. We don't need to keep the balance // factors in nodes, because we always fill the tree sequentially, // and thus know the state of the tree just by looking at the node // count. From the node count we can calculate how many steps to go // up in the tree to find the rotation root. uint32_t up = tree->count ^ (UINT32_C(1) << bsr32(tree->count)); if (up != 0) { // Locate the root node for the rotation. up = ctz32(tree->count) + 2; do { node = node->parent; } while (--up > 0); // Rotate left using node as the rotation root. 
index_tree_node *pivot = node->right; if (node->parent == NULL) { tree->root = pivot; } else { assert(node->parent->right == node); node->parent->right = pivot; } pivot->parent = node->parent; node->right = pivot->left; if (node->right != NULL) node->right->parent = node; pivot->left = node; node->parent = pivot; } return; } /// Get the next node in the tree. Return NULL if there are no more nodes. static void * index_tree_next(const index_tree_node *node) { if (node->right != NULL) { node = node->right; while (node->left != NULL) node = node->left; return (void *)(node); } while (node->parent != NULL && node->parent->right == node) node = node->parent; return (void *)(node->parent); } /// Locate a node that contains the given uncompressed offset. It is /// caller's job to check that target is not bigger than the uncompressed /// size of the tree (the last node would be returned in that case still). static void * index_tree_locate(const index_tree *tree, lzma_vli target) { const index_tree_node *result = NULL; const index_tree_node *node = tree->root; assert(tree->leftmost == NULL || tree->leftmost->uncompressed_base == 0); // Consecutive nodes may have the same uncompressed_base. // We must pick the rightmost one. while (node != NULL) { if (node->uncompressed_base > target) { node = node->left; } else { result = node; node = node->right; } } return (void *)(result); } /// Allocate and initialize a new Stream using the given base offsets. 
static index_stream * index_stream_init(lzma_vli compressed_base, lzma_vli uncompressed_base, uint32_t stream_number, lzma_vli block_number_base, const lzma_allocator *allocator) { index_stream *s = lzma_alloc(sizeof(index_stream), allocator); if (s == NULL) return NULL; s->node.uncompressed_base = uncompressed_base; s->node.compressed_base = compressed_base; s->node.parent = NULL; s->node.left = NULL; s->node.right = NULL; s->number = stream_number; s->block_number_base = block_number_base; index_tree_init(&s->groups); s->record_count = 0; s->index_list_size = 0; s->stream_flags.version = UINT32_MAX; s->stream_padding = 0; return s; } /// Free the memory allocated for a Stream and its Record groups. static void index_stream_end(void *node, const lzma_allocator *allocator) { index_stream *s = node; index_tree_end(&s->groups, allocator, &lzma_free); lzma_free(s, allocator); return; } static lzma_index * index_init_plain(const lzma_allocator *allocator) { lzma_index *i = lzma_alloc(sizeof(lzma_index), allocator); if (i != NULL) { index_tree_init(&i->streams); i->uncompressed_size = 0; i->total_size = 0; i->record_count = 0; i->index_list_size = 0; i->prealloc = INDEX_GROUP_SIZE; i->checks = 0; } return i; } extern LZMA_API(lzma_index *) lzma_index_init(const lzma_allocator *allocator) { lzma_index *i = index_init_plain(allocator); if (i == NULL) return NULL; index_stream *s = index_stream_init(0, 0, 1, 0, allocator); if (s == NULL) { lzma_free(i, allocator); return NULL; } index_tree_append(&i->streams, &s->node); return i; } extern LZMA_API(void) lzma_index_end(lzma_index *i, const lzma_allocator *allocator) { // NOTE: If you modify this function, check also the bottom // of lzma_index_cat(). 
if (i != NULL) { index_tree_end(&i->streams, allocator, &index_stream_end); lzma_free(i, allocator); } return; } extern void lzma_index_prealloc(lzma_index *i, lzma_vli records) { if (records > PREALLOC_MAX) records = PREALLOC_MAX; i->prealloc = (size_t)(records); return; } extern LZMA_API(uint64_t) lzma_index_memusage(lzma_vli streams, lzma_vli blocks) { // This calculates an upper bound that is only a little bit // bigger than the exact maximum memory usage with the given // parameters. // Typical malloc() overhead is 2 * sizeof(void *) but we take // a little bit extra just in case. Using LZMA_MEMUSAGE_BASE // instead would give too inaccurate estimate. const size_t alloc_overhead = 4 * sizeof(void *); // Amount of memory needed for each Stream base structures. // We assume that every Stream has at least one Block and // thus at least one group. const size_t stream_base = sizeof(index_stream) + sizeof(index_group) + 2 * alloc_overhead; // Amount of memory needed per group. const size_t group_base = sizeof(index_group) + INDEX_GROUP_SIZE * sizeof(index_record) + alloc_overhead; // Number of groups. There may actually be more, but that overhead // has been taken into account in stream_base already. const lzma_vli groups = (blocks + INDEX_GROUP_SIZE - 1) / INDEX_GROUP_SIZE; // Memory used by index_stream and index_group structures. const uint64_t streams_mem = streams * stream_base; const uint64_t groups_mem = groups * group_base; // Memory used by the base structure. const uint64_t index_base = sizeof(lzma_index) + alloc_overhead; // Validate the arguments and catch integer overflows. // Maximum number of Streams is "only" UINT32_MAX, because // that limit is used by the tree containing the Streams. 
const uint64_t limit = UINT64_MAX - index_base; if (streams == 0 || streams > UINT32_MAX || blocks > LZMA_VLI_MAX || streams > limit / stream_base || groups > limit / group_base || limit - streams_mem < groups_mem) return UINT64_MAX; return index_base + streams_mem + groups_mem; } extern LZMA_API(uint64_t) lzma_index_memused(const lzma_index *i) { return lzma_index_memusage(i->streams.count, i->record_count); } extern LZMA_API(lzma_vli) lzma_index_block_count(const lzma_index *i) { return i->record_count; } extern LZMA_API(lzma_vli) lzma_index_stream_count(const lzma_index *i) { return i->streams.count; } extern LZMA_API(lzma_vli) lzma_index_size(const lzma_index *i) { return index_size(i->record_count, i->index_list_size); } extern LZMA_API(lzma_vli) lzma_index_total_size(const lzma_index *i) { return i->total_size; } extern LZMA_API(lzma_vli) lzma_index_stream_size(const lzma_index *i) { // Stream Header + Blocks + Index + Stream Footer return LZMA_STREAM_HEADER_SIZE + i->total_size + index_size(i->record_count, i->index_list_size) + LZMA_STREAM_HEADER_SIZE; } static lzma_vli index_file_size(lzma_vli compressed_base, lzma_vli unpadded_sum, lzma_vli record_count, lzma_vli index_list_size, lzma_vli stream_padding) { // Earlier Streams and Stream Paddings + Stream Header // + Blocks + Index + Stream Footer + Stream Padding // // This might go over LZMA_VLI_MAX due to too big unpadded_sum // when this function is used in lzma_index_append(). lzma_vli file_size = compressed_base + 2 * LZMA_STREAM_HEADER_SIZE + stream_padding + vli_ceil4(unpadded_sum); if (file_size > LZMA_VLI_MAX) return LZMA_VLI_UNKNOWN; // The same applies here. 
file_size += index_size(record_count, index_list_size); if (file_size > LZMA_VLI_MAX) return LZMA_VLI_UNKNOWN; return file_size; } extern LZMA_API(lzma_vli) lzma_index_file_size(const lzma_index *i) { const index_stream *s = (const index_stream *)(i->streams.rightmost); const index_group *g = (const index_group *)(s->groups.rightmost); return index_file_size(s->node.compressed_base, g == NULL ? 0 : g->records[g->last].unpadded_sum, s->record_count, s->index_list_size, s->stream_padding); } extern LZMA_API(lzma_vli) lzma_index_uncompressed_size(const lzma_index *i) { return i->uncompressed_size; } extern LZMA_API(uint32_t) lzma_index_checks(const lzma_index *i) { uint32_t checks = i->checks; // Get the type of the Check of the last Stream too. const index_stream *s = (const index_stream *)(i->streams.rightmost); if (s->stream_flags.version != UINT32_MAX) checks |= UINT32_C(1) << s->stream_flags.check; return checks; } extern uint32_t lzma_index_padding_size(const lzma_index *i) { return (LZMA_VLI_C(4) - index_size_unpadded( i->record_count, i->index_list_size)) & 3; } extern LZMA_API(lzma_ret) lzma_index_stream_flags(lzma_index *i, const lzma_stream_flags *stream_flags) { if (i == NULL || stream_flags == NULL) return LZMA_PROG_ERROR; // Validate the Stream Flags. return_if_error(lzma_stream_flags_compare( stream_flags, stream_flags)); index_stream *s = (index_stream *)(i->streams.rightmost); s->stream_flags = *stream_flags; return LZMA_OK; } extern LZMA_API(lzma_ret) lzma_index_stream_padding(lzma_index *i, lzma_vli stream_padding) { if (i == NULL || stream_padding > LZMA_VLI_MAX || (stream_padding & 3) != 0) return LZMA_PROG_ERROR; index_stream *s = (index_stream *)(i->streams.rightmost); // Check that the new value won't make the file grow too big. 
const lzma_vli old_stream_padding = s->stream_padding; s->stream_padding = 0; if (lzma_index_file_size(i) + stream_padding > LZMA_VLI_MAX) { s->stream_padding = old_stream_padding; return LZMA_DATA_ERROR; } s->stream_padding = stream_padding; return LZMA_OK; } extern LZMA_API(lzma_ret) lzma_index_append(lzma_index *i, const lzma_allocator *allocator, lzma_vli unpadded_size, lzma_vli uncompressed_size) { // Validate. if (i == NULL || unpadded_size < UNPADDED_SIZE_MIN || unpadded_size > UNPADDED_SIZE_MAX || uncompressed_size > LZMA_VLI_MAX) return LZMA_PROG_ERROR; index_stream *s = (index_stream *)(i->streams.rightmost); index_group *g = (index_group *)(s->groups.rightmost); const lzma_vli compressed_base = g == NULL ? 0 : vli_ceil4(g->records[g->last].unpadded_sum); const lzma_vli uncompressed_base = g == NULL ? 0 : g->records[g->last].uncompressed_sum; const uint32_t index_list_size_add = lzma_vli_size(unpadded_size) + lzma_vli_size(uncompressed_size); // Check that the file size will stay within limits. if (index_file_size(s->node.compressed_base, compressed_base + unpadded_size, s->record_count + 1, s->index_list_size + index_list_size_add, s->stream_padding) == LZMA_VLI_UNKNOWN) return LZMA_DATA_ERROR; // The size of the Index field must not exceed the maximum value // that can be stored in the Backward Size field. if (index_size(i->record_count + 1, i->index_list_size + index_list_size_add) > LZMA_BACKWARD_SIZE_MAX) return LZMA_DATA_ERROR; if (g != NULL && g->last + 1 < g->allocated) { // There is space in the last group at least for one Record. ++g->last; } else { // We need to allocate a new group. g = lzma_alloc(sizeof(index_group) + i->prealloc * sizeof(index_record), allocator); if (g == NULL) return LZMA_MEM_ERROR; g->last = 0; g->allocated = i->prealloc; // Reset prealloc so that if the application happens to // add new Records, the allocation size will be sane. i->prealloc = INDEX_GROUP_SIZE; // Set the start offsets of this group. 
g->node.uncompressed_base = uncompressed_base; g->node.compressed_base = compressed_base; g->number_base = s->record_count + 1; // Add the new group to the Stream. index_tree_append(&s->groups, &g->node); } // Add the new Record to the group. g->records[g->last].uncompressed_sum = uncompressed_base + uncompressed_size; g->records[g->last].unpadded_sum = compressed_base + unpadded_size; // Update the totals. ++s->record_count; s->index_list_size += index_list_size_add; i->total_size += vli_ceil4(unpadded_size); i->uncompressed_size += uncompressed_size; ++i->record_count; i->index_list_size += index_list_size_add; return LZMA_OK; } /// Structure to pass info to index_cat_helper() typedef struct { /// Uncompressed size of the destination lzma_vli uncompressed_size; /// Compressed file size of the destination lzma_vli file_size; /// Same as above but for Block numbers lzma_vli block_number_add; /// Number of Streams that were in the destination index before we /// started appending new Streams from the source index. This is /// used to fix the Stream numbering. uint32_t stream_number_add; /// Destination index' Stream tree index_tree *streams; } index_cat_info; /// Add the Stream nodes from the source index to dest using recursion. /// Simplest iterative traversal of the source tree wouldn't work, because /// we update the pointers in nodes when moving them to the destination tree. 
static void index_cat_helper(const index_cat_info *info, index_stream *this) { index_stream *left = (index_stream *)(this->node.left); index_stream *right = (index_stream *)(this->node.right); if (left != NULL) index_cat_helper(info, left); this->node.uncompressed_base += info->uncompressed_size; this->node.compressed_base += info->file_size; this->number += info->stream_number_add; this->block_number_base += info->block_number_add; index_tree_append(info->streams, &this->node); if (right != NULL) index_cat_helper(info, right); return; } extern LZMA_API(lzma_ret) lzma_index_cat(lzma_index *restrict dest, lzma_index *restrict src, const lzma_allocator *allocator) { const lzma_vli dest_file_size = lzma_index_file_size(dest); // Check that we don't exceed the file size limits. if (dest_file_size + lzma_index_file_size(src) > LZMA_VLI_MAX || dest->uncompressed_size + src->uncompressed_size > LZMA_VLI_MAX) return LZMA_DATA_ERROR; // Check that the encoded size of the combined lzma_indexes stays // within limits. In theory, this should be done only if we know // that the user plans to actually combine the Streams and thus // construct a single Index (probably rare). However, exceeding // this limit is quite theoretical, so we do this check always // to simplify things elsewhere. { const lzma_vli dest_size = index_size_unpadded( dest->record_count, dest->index_list_size); const lzma_vli src_size = index_size_unpadded( src->record_count, src->index_list_size); if (vli_ceil4(dest_size + src_size) > LZMA_BACKWARD_SIZE_MAX) return LZMA_DATA_ERROR; } // Optimize the last group to minimize memory usage. Allocation has // to be done before modifying dest or src. 
{ index_stream *s = (index_stream *)(dest->streams.rightmost); index_group *g = (index_group *)(s->groups.rightmost); if (g != NULL && g->last + 1 < g->allocated) { assert(g->node.left == NULL); assert(g->node.right == NULL); index_group *newg = lzma_alloc(sizeof(index_group) + (g->last + 1) * sizeof(index_record), allocator); if (newg == NULL) return LZMA_MEM_ERROR; newg->node = g->node; newg->allocated = g->last + 1; newg->last = g->last; newg->number_base = g->number_base; memcpy(newg->records, g->records, newg->allocated * sizeof(index_record)); if (g->node.parent != NULL) { assert(g->node.parent->right == &g->node); g->node.parent->right = &newg->node; } if (s->groups.leftmost == &g->node) { assert(s->groups.root == &g->node); s->groups.leftmost = &newg->node; s->groups.root = &newg->node; } - if (s->groups.rightmost == &g->node) - s->groups.rightmost = &newg->node; + assert(s->groups.rightmost == &g->node); + s->groups.rightmost = &newg->node; lzma_free(g, allocator); // NOTE: newg isn't leaked here because // newg == (void *)&newg->node. } } // Add all the Streams from src to dest. Update the base offsets // of each Stream from src. const index_cat_info info = { .uncompressed_size = dest->uncompressed_size, .file_size = dest_file_size, .stream_number_add = dest->streams.count, .block_number_add = dest->record_count, .streams = &dest->streams, }; index_cat_helper(&info, (index_stream *)(src->streams.root)); // Update info about all the combined Streams. dest->uncompressed_size += src->uncompressed_size; dest->total_size += src->total_size; dest->record_count += src->record_count; dest->index_list_size += src->index_list_size; dest->checks = lzma_index_checks(dest) | src->checks; // There's nothing else left in src than the base structure. lzma_free(src, allocator); return LZMA_OK; } /// Duplicate an index_stream. static index_stream * index_dup_stream(const index_stream *src, const lzma_allocator *allocator) { // Catch a somewhat theoretical integer overflow. 
if (src->record_count > PREALLOC_MAX) return NULL; // Allocate and initialize a new Stream. index_stream *dest = index_stream_init(src->node.compressed_base, src->node.uncompressed_base, src->number, src->block_number_base, allocator); if (dest == NULL) return NULL; // Copy the overall information. dest->record_count = src->record_count; dest->index_list_size = src->index_list_size; dest->stream_flags = src->stream_flags; dest->stream_padding = src->stream_padding; // Return if there are no groups to duplicate. if (src->groups.leftmost == NULL) return dest; // Allocate memory for the Records. We put all the Records into // a single group. It's simplest and also tends to make // lzma_index_locate() a little bit faster with very big Indexes. index_group *destg = lzma_alloc(sizeof(index_group) + src->record_count * sizeof(index_record), allocator); if (destg == NULL) { index_stream_end(dest, allocator); return NULL; } // Initialize destg. destg->node.uncompressed_base = 0; destg->node.compressed_base = 0; destg->number_base = 1; destg->allocated = src->record_count; destg->last = src->record_count - 1; // Go through all the groups in src and copy the Records into destg. const index_group *srcg = (const index_group *)(src->groups.leftmost); size_t i = 0; do { memcpy(destg->records + i, srcg->records, (srcg->last + 1) * sizeof(index_record)); i += srcg->last + 1; srcg = index_tree_next(&srcg->node); } while (srcg != NULL); assert(i == destg->allocated); // Add the group to the new Stream. index_tree_append(&dest->groups, &destg->node); return dest; } extern LZMA_API(lzma_index *) lzma_index_dup(const lzma_index *src, const lzma_allocator *allocator) { // Allocate the base structure (no initial Stream). lzma_index *dest = index_init_plain(allocator); if (dest == NULL) return NULL; // Copy the totals. 
dest->uncompressed_size = src->uncompressed_size; dest->total_size = src->total_size; dest->record_count = src->record_count; dest->index_list_size = src->index_list_size; // Copy the Streams and the groups in them. const index_stream *srcstream = (const index_stream *)(src->streams.leftmost); do { index_stream *deststream = index_dup_stream( srcstream, allocator); if (deststream == NULL) { lzma_index_end(dest, allocator); return NULL; } index_tree_append(&dest->streams, &deststream->node); srcstream = index_tree_next(&srcstream->node); } while (srcstream != NULL); return dest; } /// Indexing for lzma_index_iter.internal[] enum { ITER_INDEX, ITER_STREAM, ITER_GROUP, ITER_RECORD, ITER_METHOD, }; /// Values for lzma_index_iter.internal[ITER_METHOD].s enum { ITER_METHOD_NORMAL, ITER_METHOD_NEXT, ITER_METHOD_LEFTMOST, }; static void iter_set_info(lzma_index_iter *iter) { const lzma_index *i = iter->internal[ITER_INDEX].p; const index_stream *stream = iter->internal[ITER_STREAM].p; const index_group *group = iter->internal[ITER_GROUP].p; const size_t record = iter->internal[ITER_RECORD].s; // lzma_index_iter.internal must not contain a pointer to the last // group in the index, because that may be reallocated by // lzma_index_cat(). if (group == NULL) { // There are no groups. assert(stream->groups.root == NULL); iter->internal[ITER_METHOD].s = ITER_METHOD_LEFTMOST; } else if (i->streams.rightmost != &stream->node || stream->groups.rightmost != &group->node) { // The group is not the last group in the index. iter->internal[ITER_METHOD].s = ITER_METHOD_NORMAL; } else if (stream->groups.leftmost != &group->node) { // The group isn't the only group in the Stream, thus we // know that it must have a parent group i.e. it's not // the root node.
assert(stream->groups.root != &group->node); assert(group->node.parent->right == &group->node); iter->internal[ITER_METHOD].s = ITER_METHOD_NEXT; iter->internal[ITER_GROUP].p = group->node.parent; } else { // The Stream has only one group. assert(stream->groups.root == &group->node); assert(group->node.parent == NULL); iter->internal[ITER_METHOD].s = ITER_METHOD_LEFTMOST; iter->internal[ITER_GROUP].p = NULL; } // NOTE: lzma_index_iter.stream.number is lzma_vli but we use uint32_t // internally. iter->stream.number = stream->number; iter->stream.block_count = stream->record_count; iter->stream.compressed_offset = stream->node.compressed_base; iter->stream.uncompressed_offset = stream->node.uncompressed_base; // iter->stream.flags will be NULL if the Stream Flags haven't been // set with lzma_index_stream_flags(). iter->stream.flags = stream->stream_flags.version == UINT32_MAX ? NULL : &stream->stream_flags; iter->stream.padding = stream->stream_padding; if (stream->groups.rightmost == NULL) { // Stream has no Blocks. iter->stream.compressed_size = index_size(0, 0) + 2 * LZMA_STREAM_HEADER_SIZE; iter->stream.uncompressed_size = 0; } else { const index_group *g = (const index_group *)( stream->groups.rightmost); // Stream Header + Stream Footer + Index + Blocks iter->stream.compressed_size = 2 * LZMA_STREAM_HEADER_SIZE + index_size(stream->record_count, stream->index_list_size) + vli_ceil4(g->records[g->last].unpadded_sum); iter->stream.uncompressed_size = g->records[g->last].uncompressed_sum; } if (group != NULL) { iter->block.number_in_stream = group->number_base + record; iter->block.number_in_file = iter->block.number_in_stream + stream->block_number_base; iter->block.compressed_stream_offset = record == 0 ? group->node.compressed_base : vli_ceil4(group->records[ record - 1].unpadded_sum); iter->block.uncompressed_stream_offset = record == 0 ? 
group->node.uncompressed_base : group->records[record - 1].uncompressed_sum; iter->block.uncompressed_size = group->records[record].uncompressed_sum - iter->block.uncompressed_stream_offset; iter->block.unpadded_size = group->records[record].unpadded_sum - iter->block.compressed_stream_offset; iter->block.total_size = vli_ceil4(iter->block.unpadded_size); iter->block.compressed_stream_offset += LZMA_STREAM_HEADER_SIZE; iter->block.compressed_file_offset = iter->block.compressed_stream_offset + iter->stream.compressed_offset; iter->block.uncompressed_file_offset = iter->block.uncompressed_stream_offset + iter->stream.uncompressed_offset; } return; } extern LZMA_API(void) lzma_index_iter_init(lzma_index_iter *iter, const lzma_index *i) { iter->internal[ITER_INDEX].p = i; lzma_index_iter_rewind(iter); return; } extern LZMA_API(void) lzma_index_iter_rewind(lzma_index_iter *iter) { iter->internal[ITER_STREAM].p = NULL; iter->internal[ITER_GROUP].p = NULL; iter->internal[ITER_RECORD].s = 0; iter->internal[ITER_METHOD].s = ITER_METHOD_NORMAL; return; } extern LZMA_API(lzma_bool) lzma_index_iter_next(lzma_index_iter *iter, lzma_index_iter_mode mode) { // Catch unsupported mode values. if ((unsigned int)(mode) > LZMA_INDEX_ITER_NONEMPTY_BLOCK) return true; const lzma_index *i = iter->internal[ITER_INDEX].p; const index_stream *stream = iter->internal[ITER_STREAM].p; const index_group *group = NULL; size_t record = iter->internal[ITER_RECORD].s; // If we are being asked for the next Stream, leave group to NULL // so that the rest of the this function thinks that this Stream // has no groups and will thus go to the next Stream. if (mode != LZMA_INDEX_ITER_STREAM) { // Get the pointer to the current group. See iter_set_inf() // for explanation. 
switch (iter->internal[ITER_METHOD].s) { case ITER_METHOD_NORMAL: group = iter->internal[ITER_GROUP].p; break; case ITER_METHOD_NEXT: group = index_tree_next(iter->internal[ITER_GROUP].p); break; case ITER_METHOD_LEFTMOST: group = (const index_group *)( stream->groups.leftmost); break; } } again: if (stream == NULL) { // We at the beginning of the lzma_index. // Locate the first Stream. stream = (const index_stream *)(i->streams.leftmost); if (mode >= LZMA_INDEX_ITER_BLOCK) { // Since we are being asked to return information // about the first a Block, skip Streams that have // no Blocks. while (stream->groups.leftmost == NULL) { stream = index_tree_next(&stream->node); if (stream == NULL) return true; } } // Start from the first Record in the Stream. group = (const index_group *)(stream->groups.leftmost); record = 0; } else if (group != NULL && record < group->last) { // The next Record is in the same group. ++record; } else { // This group has no more Records or this Stream has // no Blocks at all. record = 0; // If group is not NULL, this Stream has at least one Block // and thus at least one group. Find the next group. if (group != NULL) group = index_tree_next(&group->node); if (group == NULL) { // This Stream has no more Records. Find the next // Stream. If we are being asked to return information // about a Block, we skip empty Streams. do { stream = index_tree_next(&stream->node); if (stream == NULL) return true; } while (mode >= LZMA_INDEX_ITER_BLOCK && stream->groups.leftmost == NULL); group = (const index_group *)( stream->groups.leftmost); } } if (mode == LZMA_INDEX_ITER_NONEMPTY_BLOCK) { // We need to look for the next Block again if this Block // is empty. 
if (record == 0) { if (group->node.uncompressed_base == group->records[0].uncompressed_sum) goto again; } else if (group->records[record - 1].uncompressed_sum == group->records[record].uncompressed_sum) { goto again; } } iter->internal[ITER_STREAM].p = stream; iter->internal[ITER_GROUP].p = group; iter->internal[ITER_RECORD].s = record; iter_set_info(iter); return false; } extern LZMA_API(lzma_bool) lzma_index_iter_locate(lzma_index_iter *iter, lzma_vli target) { const lzma_index *i = iter->internal[ITER_INDEX].p; // If the target is past the end of the file, return immediately. if (i->uncompressed_size <= target) return true; // Locate the Stream containing the target offset. const index_stream *stream = index_tree_locate(&i->streams, target); assert(stream != NULL); target -= stream->node.uncompressed_base; // Locate the group containing the target offset. const index_group *group = index_tree_locate(&stream->groups, target); assert(group != NULL); // Use binary search to locate the exact Record. It is the first // Record whose uncompressed_sum is greater than target. // This is because we want the rightmost Record that fullfills the // search criterion. It is possible that there are empty Blocks; // we don't want to return them. 
 	size_t left = 0;
 	size_t right = group->last;
 	while (left < right) {
 		const size_t pos = left + (right - left) / 2;
 		if (group->records[pos].uncompressed_sum <= target)
 			left = pos + 1;
 		else
 			right = pos;
 	}
 	iter->internal[ITER_STREAM].p = stream;
 	iter->internal[ITER_GROUP].p = group;
 	iter->internal[ITER_RECORD].s = left;
 	iter_set_info(iter);
 	return false;
 }
Index: head/contrib/xz/src/liblzma/common/memcmplen.h
===================================================================
--- head/contrib/xz/src/liblzma/common/memcmplen.h	(revision 359200)
+++ head/contrib/xz/src/liblzma/common/memcmplen.h	(revision 359201)
@@ -1,175 +1,164 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file       memcmplen.h
 /// \brief      Optimized comparison of two buffers
 //
 //  Author:     Lasse Collin
 //
 //  This file has been put into the public domain.
 //  You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////
 #ifndef LZMA_MEMCMPLEN_H
 #define LZMA_MEMCMPLEN_H
 #include "common.h"
 #ifdef HAVE_IMMINTRIN_H
 #	include <immintrin.h>
 #endif
 /// Find out how many equal bytes the two buffers have.
 ///
 /// \param      buf1    First buffer
 /// \param      buf2    Second buffer
 /// \param      len     How many bytes have already been compared and will
 ///                     be assumed to match
 /// \param      limit   How many bytes to compare at most, including the
 ///                     already-compared bytes. This must be significantly
 ///                     smaller than UINT32_MAX to avoid integer overflows.
 ///                     Up to LZMA_MEMCMPLEN_EXTRA bytes may be read past
 ///                     the specified limit from both buf1 and buf2.
 ///
 /// \return     Number of equal bytes in the buffers is returned.
 ///             This is always at least len and at most limit.
 ///
 /// \note       LZMA_MEMCMPLEN_EXTRA defines how many extra bytes may be read.
 ///             It's rounded up to 2^n. This extra amount needs to be
 ///             allocated in the buffers being used. It needs to be
 ///             initialized too to keep Valgrind quiet.
 static inline uint32_t lzma_attribute((__always_inline__))
 lzma_memcmplen(const uint8_t *buf1, const uint8_t *buf2,
 		uint32_t len, uint32_t limit)
 {
 	assert(len <= limit);
 	assert(limit <= UINT32_MAX / 2);
 #if defined(TUKLIB_FAST_UNALIGNED_ACCESS) \
 		&& ((TUKLIB_GNUC_REQ(3, 4) && defined(__x86_64__)) \
 			|| (defined(__INTEL_COMPILER) && defined(__x86_64__)) \
 			|| (defined(__INTEL_COMPILER) && defined(_M_X64)) \
 			|| (defined(_MSC_VER) && defined(_M_X64)))
 	// NOTE: This will use 64-bit unaligned access which
 	// TUKLIB_FAST_UNALIGNED_ACCESS wasn't meant to permit, but
 	// it's convenient here at least as long as it's x86-64 only.
 	//
 	// I keep this x86-64 only for now since that's where I know this
 	// to be a good method. This may be fine on other 64-bit CPUs too.
 	// On big endian one should use xor instead of subtraction and switch
 	// to __builtin_clzll().
 #define LZMA_MEMCMPLEN_EXTRA 8
 	while (len < limit) {
-		const uint64_t x = *(const uint64_t *)(buf1 + len)
-				- *(const uint64_t *)(buf2 + len);
+		const uint64_t x = read64ne(buf1 + len) - read64ne(buf2 + len);
 		if (x != 0) {
 #	if defined(_M_X64) // MSVC or Intel C compiler on Windows
 			unsigned long tmp;
 			_BitScanForward64(&tmp, x);
 			len += (uint32_t)tmp >> 3;
 #	else // GCC, clang, or Intel C compiler
 			len += (uint32_t)__builtin_ctzll(x) >> 3;
 #	endif
 			return my_min(len, limit);
 		}
 		len += 8;
 	}
 	return limit;
 #elif defined(TUKLIB_FAST_UNALIGNED_ACCESS) \
 		&& defined(HAVE__MM_MOVEMASK_EPI8) \
 		&& ((defined(__GNUC__) && defined(__SSE2_MATH__)) \
 			|| (defined(__INTEL_COMPILER) && defined(__SSE2__)) \
 			|| (defined(_MSC_VER) && defined(_M_IX86_FP) \
 				&& _M_IX86_FP >= 2))
 	// NOTE: Like above, this will use 128-bit unaligned access which
 	// TUKLIB_FAST_UNALIGNED_ACCESS wasn't meant to permit.
 	//
 	// SSE2 version for 32-bit and 64-bit x86. On x86-64 the above
 	// version is sometimes significantly faster and sometimes
 	// slightly slower than this SSE2 version, so this SSE2
 	// version isn't used on x86-64.
 #	define LZMA_MEMCMPLEN_EXTRA 16
 	while (len < limit) {
 		const uint32_t x = 0xFFFF ^ _mm_movemask_epi8(_mm_cmpeq_epi8(
 			_mm_loadu_si128((const __m128i *)(buf1 + len)),
 			_mm_loadu_si128((const __m128i *)(buf2 + len))));
 		if (x != 0) {
-#	if defined(__INTEL_COMPILER)
-			len += _bit_scan_forward(x);
-#	elif defined(_MSC_VER)
-			unsigned long tmp;
-			_BitScanForward(&tmp, x);
-			len += tmp;
-#	else
-			len += __builtin_ctz(x);
-#	endif
+			len += ctz32(x);
 			return my_min(len, limit);
 		}
 		len += 16;
 	}
 	return limit;
 #elif defined(TUKLIB_FAST_UNALIGNED_ACCESS) && !defined(WORDS_BIGENDIAN)
 	// Generic 32-bit little endian method
 #	define LZMA_MEMCMPLEN_EXTRA 4
 	while (len < limit) {
-		uint32_t x = *(const uint32_t *)(buf1 + len)
-				- *(const uint32_t *)(buf2 + len);
+		uint32_t x = read32ne(buf1 + len) - read32ne(buf2 + len);
 		if (x != 0) {
 			if ((x & 0xFFFF) == 0) {
 				len += 2;
 				x >>= 16;
 			}
 			if ((x & 0xFF) == 0)
 				++len;
 			return my_min(len, limit);
 		}
 		len += 4;
 	}
 	return limit;
 #elif defined(TUKLIB_FAST_UNALIGNED_ACCESS) && defined(WORDS_BIGENDIAN)
 	// Generic 32-bit big endian method
 #	define LZMA_MEMCMPLEN_EXTRA 4
 	while (len < limit) {
-		uint32_t x = *(const uint32_t *)(buf1 + len)
-				^ *(const uint32_t *)(buf2 + len);
+		uint32_t x = read32ne(buf1 + len) ^ read32ne(buf2 + len);
 		if (x != 0) {
 			if ((x & 0xFFFF0000) == 0) {
 				len += 2;
 				x <<= 16;
 			}
 			if ((x & 0xFF000000) == 0)
 				++len;
 			return my_min(len, limit);
 		}
 		len += 4;
 	}
 	return limit;
 #else
 	// Simple portable version that doesn't use unaligned access.
# define LZMA_MEMCMPLEN_EXTRA 0 while (len < limit && buf1[len] == buf2[len]) ++len; return len; #endif } #endif Index: head/contrib/xz/src/liblzma/common/stream_encoder_mt.c =================================================================== --- head/contrib/xz/src/liblzma/common/stream_encoder_mt.c (revision 359200) +++ head/contrib/xz/src/liblzma/common/stream_encoder_mt.c (revision 359201) @@ -1,1143 +1,1143 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file stream_encoder_mt.c /// \brief Multithreaded .xz Stream encoder // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "filter_encoder.h" #include "easy_preset.h" #include "block_encoder.h" #include "block_buffer_encoder.h" #include "index_encoder.h" #include "outqueue.h" /// Maximum supported block size. This makes it simpler to prevent integer /// overflows if we are given unusually large block size. #define BLOCK_SIZE_MAX (UINT64_MAX / LZMA_THREADS_MAX) typedef enum { /// Waiting for work. THR_IDLE, /// Encoding is in progress. THR_RUN, /// Encoding is in progress but no more input data will /// be read. THR_FINISH, /// The main thread wants the thread to stop whatever it was doing /// but not exit. THR_STOP, /// The main thread wants the thread to exit. We could use /// cancellation but since there's stopped anyway, this is lazier. THR_EXIT, } worker_state; typedef struct lzma_stream_coder_s lzma_stream_coder; typedef struct worker_thread_s worker_thread; struct worker_thread_s { worker_state state; /// Input buffer of coder->block_size bytes. The main thread will /// put new input into this and update in_size accordingly. Once /// no more input is coming, state will be set to THR_FINISH. uint8_t *in; /// Amount of data available in the input buffer. This is modified /// only by the main thread. 
size_t in_size; /// Output buffer for this thread. This is set by the main /// thread every time a new Block is started with this thread /// structure. lzma_outbuf *outbuf; /// Pointer to the main structure is needed when putting this /// thread back to the stack of free threads. lzma_stream_coder *coder; /// The allocator is set by the main thread. Since a copy of the /// pointer is kept here, the application must not change the /// allocator before calling lzma_end(). const lzma_allocator *allocator; /// Amount of uncompressed data that has already been compressed. uint64_t progress_in; /// Amount of compressed data that is ready. uint64_t progress_out; /// Block encoder lzma_next_coder block_encoder; /// Compression options for this Block lzma_block block_options; /// Next structure in the stack of free worker threads. worker_thread *next; mythread_mutex mutex; mythread_cond cond; /// The ID of this thread is used to join the thread /// when it's not needed anymore. mythread thread_id; }; struct lzma_stream_coder_s { enum { SEQ_STREAM_HEADER, SEQ_BLOCK, SEQ_INDEX, SEQ_STREAM_FOOTER, } sequence; /// Start a new Block every block_size bytes of input unless /// LZMA_FULL_FLUSH or LZMA_FULL_BARRIER is used earlier. size_t block_size; /// The filter chain currently in use lzma_filter filters[LZMA_FILTERS_MAX + 1]; /// Index to hold sizes of the Blocks lzma_index *index; /// Index encoder lzma_next_coder index_encoder; /// Stream Flags for encoding the Stream Header and Stream Footer. lzma_stream_flags stream_flags; /// Buffer to hold Stream Header and Stream Footer. uint8_t header[LZMA_STREAM_HEADER_SIZE]; /// Read position in header[] size_t header_pos; /// Output buffer queue for compressed data lzma_outq outq; /// Maximum wait time if cannot use all the input and cannot /// fill the output buffer. This is in milliseconds. 
uint32_t timeout; /// Error code from a worker thread lzma_ret thread_error; /// Array of allocated thread-specific structures worker_thread *threads; /// Number of structures in "threads" above. This is also the /// number of threads that will be created at maximum. uint32_t threads_max; /// Number of thread structures that have been initialized, and /// thus the number of worker threads actually created so far. uint32_t threads_initialized; /// Stack of free threads. When a thread finishes, it puts itself /// back into this stack. This starts as empty because threads /// are created only when actually needed. worker_thread *threads_free; /// The most recent worker thread to which the main thread writes /// the new input from the application. worker_thread *thr; /// Amount of uncompressed data in Blocks that have already /// been finished. uint64_t progress_in; /// Amount of compressed data in Stream Header + Blocks that /// have already been finished. uint64_t progress_out; mythread_mutex mutex; mythread_cond cond; }; /// Tell the main thread that something has gone wrong. static void worker_error(worker_thread *thr, lzma_ret ret) { assert(ret != LZMA_OK); assert(ret != LZMA_STREAM_END); mythread_sync(thr->coder->mutex) { if (thr->coder->thread_error == LZMA_OK) thr->coder->thread_error = ret; mythread_cond_signal(&thr->coder->cond); } return; } static worker_state worker_encode(worker_thread *thr, worker_state state) { assert(thr->progress_in == 0); assert(thr->progress_out == 0); // Set the Block options. thr->block_options = (lzma_block){ .version = 0, .check = thr->coder->stream_flags.check, .compressed_size = thr->coder->outq.buf_size_max, .uncompressed_size = thr->coder->block_size, // TODO: To allow changing the filter chain, the filters // array must be copied to each worker_thread. .filters = thr->coder->filters, }; // Calculate maximum size of the Block Header. 
This amount is // reserved in the beginning of the buffer so that Block Header // along with Compressed Size and Uncompressed Size can be // written there. lzma_ret ret = lzma_block_header_size(&thr->block_options); if (ret != LZMA_OK) { worker_error(thr, ret); return THR_STOP; } // Initialize the Block encoder. ret = lzma_block_encoder_init(&thr->block_encoder, thr->allocator, &thr->block_options); if (ret != LZMA_OK) { worker_error(thr, ret); return THR_STOP; } size_t in_pos = 0; size_t in_size = 0; thr->outbuf->size = thr->block_options.header_size; const size_t out_size = thr->coder->outq.buf_size_max; do { mythread_sync(thr->mutex) { // Store in_pos and out_pos into *thr so that // an application may read them via // lzma_get_progress() to get progress information. // // NOTE: These aren't updated when the encoding // finishes. Instead, the final values are taken // later from thr->outbuf. thr->progress_in = in_pos; thr->progress_out = thr->outbuf->size; while (in_size == thr->in_size && thr->state == THR_RUN) mythread_cond_wait(&thr->cond, &thr->mutex); state = thr->state; in_size = thr->in_size; } // Return if we were asked to stop or exit. if (state >= THR_STOP) return state; lzma_action action = state == THR_FINISH ? LZMA_FINISH : LZMA_RUN; // Limit the amount of input given to the Block encoder // at once. This way this thread can react fairly quickly // if the main thread wants us to stop or exit. static const size_t in_chunk_max = 16384; size_t in_limit = in_size; if (in_size - in_pos > in_chunk_max) { in_limit = in_pos + in_chunk_max; action = LZMA_RUN; } ret = thr->block_encoder.code( thr->block_encoder.coder, thr->allocator, thr->in, &in_pos, in_limit, thr->outbuf->buf, &thr->outbuf->size, out_size, action); } while (ret == LZMA_OK && thr->outbuf->size < out_size); switch (ret) { case LZMA_STREAM_END: assert(state == THR_FINISH); // Encode the Block Header. 
By doing it after // the compression, we can store the Compressed Size // and Uncompressed Size fields. ret = lzma_block_header_encode(&thr->block_options, thr->outbuf->buf); if (ret != LZMA_OK) { worker_error(thr, ret); return THR_STOP; } break; case LZMA_OK: // The data was incompressible. Encode it using uncompressed // LZMA2 chunks. // // First wait that we have gotten all the input. mythread_sync(thr->mutex) { while (thr->state == THR_RUN) mythread_cond_wait(&thr->cond, &thr->mutex); state = thr->state; in_size = thr->in_size; } if (state >= THR_STOP) return state; // Do the encoding. This takes care of the Block Header too. thr->outbuf->size = 0; ret = lzma_block_uncomp_encode(&thr->block_options, thr->in, in_size, thr->outbuf->buf, &thr->outbuf->size, out_size); // It shouldn't fail. if (ret != LZMA_OK) { worker_error(thr, LZMA_PROG_ERROR); return THR_STOP; } break; default: worker_error(thr, ret); return THR_STOP; } // Set the size information that will be read by the main thread // to write the Index field. thr->outbuf->unpadded_size = lzma_block_unpadded_size(&thr->block_options); assert(thr->outbuf->unpadded_size != 0); thr->outbuf->uncompressed_size = thr->block_options.uncompressed_size; return THR_FINISH; } static MYTHREAD_RET_TYPE worker_start(void *thr_ptr) { worker_thread *thr = thr_ptr; worker_state state = THR_IDLE; // Init to silence a warning while (true) { // Wait for work. mythread_sync(thr->mutex) { while (true) { // The thread is already idle so if we are // requested to stop, just set the state. if (thr->state == THR_STOP) { thr->state = THR_IDLE; mythread_cond_signal(&thr->cond); } state = thr->state; if (state != THR_IDLE) break; mythread_cond_wait(&thr->cond, &thr->mutex); } } assert(state != THR_IDLE); assert(state != THR_STOP); if (state <= THR_FINISH) state = worker_encode(thr, state); if (state == THR_EXIT) break; // Mark the thread as idle unless the main thread has // told us to exit. 
Signal is needed for the case // where the main thread is waiting for the threads to stop. mythread_sync(thr->mutex) { if (thr->state != THR_EXIT) { thr->state = THR_IDLE; mythread_cond_signal(&thr->cond); } } mythread_sync(thr->coder->mutex) { // Mark the output buffer as finished if // no errors occurred. thr->outbuf->finished = state == THR_FINISH; // Update the main progress info. thr->coder->progress_in += thr->outbuf->uncompressed_size; thr->coder->progress_out += thr->outbuf->size; thr->progress_in = 0; thr->progress_out = 0; // Return this thread to the stack of free threads. thr->next = thr->coder->threads_free; thr->coder->threads_free = thr; mythread_cond_signal(&thr->coder->cond); } } // Exiting, free the resources. mythread_mutex_destroy(&thr->mutex); mythread_cond_destroy(&thr->cond); lzma_next_end(&thr->block_encoder, thr->allocator); lzma_free(thr->in, thr->allocator); return MYTHREAD_RET_VALUE; } /// Make the threads stop but not exit. Optionally wait for them to stop. static void threads_stop(lzma_stream_coder *coder, bool wait_for_threads) { // Tell the threads to stop. for (uint32_t i = 0; i < coder->threads_initialized; ++i) { mythread_sync(coder->threads[i].mutex) { coder->threads[i].state = THR_STOP; mythread_cond_signal(&coder->threads[i].cond); } } if (!wait_for_threads) return; // Wait for the threads to settle in the idle state. for (uint32_t i = 0; i < coder->threads_initialized; ++i) { mythread_sync(coder->threads[i].mutex) { while (coder->threads[i].state != THR_IDLE) mythread_cond_wait(&coder->threads[i].cond, &coder->threads[i].mutex); } } return; } /// Stop the threads and free the resources associated with them. /// Wait until the threads have exited. 
static void threads_end(lzma_stream_coder *coder, const lzma_allocator *allocator) { for (uint32_t i = 0; i < coder->threads_initialized; ++i) { mythread_sync(coder->threads[i].mutex) { coder->threads[i].state = THR_EXIT; mythread_cond_signal(&coder->threads[i].cond); } } for (uint32_t i = 0; i < coder->threads_initialized; ++i) { int ret = mythread_join(coder->threads[i].thread_id); assert(ret == 0); (void)ret; } lzma_free(coder->threads, allocator); return; } /// Initialize a new worker_thread structure and create a new thread. static lzma_ret initialize_new_thread(lzma_stream_coder *coder, const lzma_allocator *allocator) { worker_thread *thr = &coder->threads[coder->threads_initialized]; thr->in = lzma_alloc(coder->block_size, allocator); if (thr->in == NULL) return LZMA_MEM_ERROR; if (mythread_mutex_init(&thr->mutex)) goto error_mutex; if (mythread_cond_init(&thr->cond)) goto error_cond; thr->state = THR_IDLE; thr->allocator = allocator; thr->coder = coder; thr->progress_in = 0; thr->progress_out = 0; thr->block_encoder = LZMA_NEXT_CODER_INIT; if (mythread_create(&thr->thread_id, &worker_start, thr)) goto error_thread; ++coder->threads_initialized; coder->thr = thr; return LZMA_OK; error_thread: mythread_cond_destroy(&thr->cond); error_cond: mythread_mutex_destroy(&thr->mutex); error_mutex: lzma_free(thr->in, allocator); return LZMA_MEM_ERROR; } static lzma_ret get_thread(lzma_stream_coder *coder, const lzma_allocator *allocator) { // If there are no free output subqueues, there is no // point to try getting a thread. if (!lzma_outq_has_buf(&coder->outq)) return LZMA_OK; // If there is a free structure on the stack, use it. mythread_sync(coder->mutex) { if (coder->threads_free != NULL) { coder->thr = coder->threads_free; coder->threads_free = coder->threads_free->next; } } if (coder->thr == NULL) { // If there are no uninitialized structures left, return. if (coder->threads_initialized == coder->threads_max) return LZMA_OK; // Initialize a new thread. 
return_if_error(initialize_new_thread(coder, allocator)); } // Reset the parts of the thread state that have to be done // in the main thread. mythread_sync(coder->thr->mutex) { coder->thr->state = THR_RUN; coder->thr->in_size = 0; coder->thr->outbuf = lzma_outq_get_buf(&coder->outq); mythread_cond_signal(&coder->thr->cond); } return LZMA_OK; } static lzma_ret stream_encode_in(lzma_stream_coder *coder, const lzma_allocator *allocator, const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size, lzma_action action) { while (*in_pos < in_size || (coder->thr != NULL && action != LZMA_RUN)) { if (coder->thr == NULL) { // Get a new thread. const lzma_ret ret = get_thread(coder, allocator); if (coder->thr == NULL) return ret; } // Copy the input data to thread's buffer. size_t thr_in_size = coder->thr->in_size; lzma_bufcpy(in, in_pos, in_size, coder->thr->in, &thr_in_size, coder->block_size); // Tell the Block encoder to finish if // - it has got block_size bytes of input; or // - all input was used and LZMA_FINISH, LZMA_FULL_FLUSH, // or LZMA_FULL_BARRIER was used. // // TODO: LZMA_SYNC_FLUSH and LZMA_SYNC_BARRIER. const bool finish = thr_in_size == coder->block_size || (*in_pos == in_size && action != LZMA_RUN); bool block_error = false; mythread_sync(coder->thr->mutex) { if (coder->thr->state == THR_IDLE) { // Something has gone wrong with the Block // encoder. It has set coder->thread_error // which we will read a few lines later. block_error = true; } else { // Tell the Block encoder its new amount // of input and update the state if needed. coder->thr->in_size = thr_in_size; if (finish) coder->thr->state = THR_FINISH; mythread_cond_signal(&coder->thr->cond); } } if (block_error) { lzma_ret ret; mythread_sync(coder->mutex) { ret = coder->thread_error; } return ret; } if (finish) coder->thr = NULL; } return LZMA_OK; } /// Wait until more input can be consumed, more output can be read, or /// an optional timeout is reached. 
static bool wait_for_work(lzma_stream_coder *coder, mythread_condtime *wait_abs, bool *has_blocked, bool has_input) { if (coder->timeout != 0 && !*has_blocked) { // Every time when stream_encode_mt() is called via // lzma_code(), *has_blocked starts as false. We set it // to true here and calculate the absolute time when // we must return if there's nothing to do. // // The idea of *has_blocked is to avoid unneeded calls // to mythread_condtime_set(), which may do a syscall // depending on the operating system. *has_blocked = true; mythread_condtime_set(wait_abs, &coder->cond, coder->timeout); } bool timed_out = false; mythread_sync(coder->mutex) { // There are four things that we wait. If one of them // becomes possible, we return. // - If there is input left, we need to get a free // worker thread and an output buffer for it. // - Data ready to be read from the output queue. // - A worker thread indicates an error. // - Time out occurs. while ((!has_input || coder->threads_free == NULL || !lzma_outq_has_buf(&coder->outq)) && !lzma_outq_is_readable(&coder->outq) && coder->thread_error == LZMA_OK && !timed_out) { if (coder->timeout != 0) timed_out = mythread_cond_timedwait( &coder->cond, &coder->mutex, wait_abs) != 0; else mythread_cond_wait(&coder->cond, &coder->mutex); } } return timed_out; } static lzma_ret stream_encode_mt(void *coder_ptr, const lzma_allocator *allocator, const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size, uint8_t *restrict out, size_t *restrict out_pos, size_t out_size, lzma_action action) { lzma_stream_coder *coder = coder_ptr; switch (coder->sequence) { case SEQ_STREAM_HEADER: lzma_bufcpy(coder->header, &coder->header_pos, sizeof(coder->header), out, out_pos, out_size); if (coder->header_pos < sizeof(coder->header)) return LZMA_OK; coder->header_pos = 0; coder->sequence = SEQ_BLOCK; // Fall through case SEQ_BLOCK: { // Initialized to silence warnings. 
 		lzma_vli unpadded_size = 0;
 		lzma_vli uncompressed_size = 0;
 		lzma_ret ret = LZMA_OK;
 		// These are for wait_for_work().
 		bool has_blocked = false;
 		mythread_condtime wait_abs;
 		while (true) {
 			mythread_sync(coder->mutex) {
 				// Check for Block encoder errors.
 				ret = coder->thread_error;
 				if (ret != LZMA_OK) {
 					assert(ret != LZMA_STREAM_END);
-					break;
+					break; // Break out of mythread_sync.
 				}
 				// Try to read compressed data to out[].
 				ret = lzma_outq_read(&coder->outq,
 						out, out_pos, out_size,
 						&unpadded_size,
 						&uncompressed_size);
 			}
 			if (ret == LZMA_STREAM_END) {
 				// End of Block. Add it to the Index.
 				ret = lzma_index_append(coder->index,
 						allocator, unpadded_size,
 						uncompressed_size);
 				// If we didn't fill the output buffer yet,
 				// try to read more data. Maybe the next
 				// outbuf has been finished already too.
 				if (*out_pos < out_size)
 					continue;
 			}
 			if (ret != LZMA_OK) {
 				// coder->thread_error was set or
 				// lzma_index_append() failed.
 				threads_stop(coder, false);
 				return ret;
 			}
 			// Try to give uncompressed data to a worker thread.
 			ret = stream_encode_in(coder, allocator,
 					in, in_pos, in_size, action);
 			if (ret != LZMA_OK) {
 				threads_stop(coder, false);
 				return ret;
 			}
 			// See if we should wait or return.
 			//
 			// TODO: LZMA_SYNC_FLUSH and LZMA_SYNC_BARRIER.
 			if (*in_pos == in_size) {
 				// LZMA_RUN: More data is probably coming
 				// so return to let the caller fill the
 				// input buffer.
 				if (action == LZMA_RUN)
 					return LZMA_OK;
 				// LZMA_FULL_BARRIER: The same as with
 				// LZMA_RUN but tell the caller that the
 				// barrier was completed.
 				if (action == LZMA_FULL_BARRIER)
 					return LZMA_STREAM_END;
 				// Finishing or flushing isn't completed until
 				// all input data has been encoded and copied
 				// to the output buffer.
 				if (lzma_outq_is_empty(&coder->outq)) {
 					// LZMA_FINISH: Continue to encode
 					// the Index field.
 					if (action == LZMA_FINISH)
 						break;
 					// LZMA_FULL_FLUSH: Return to tell
 					// the caller that flushing was
 					// completed.
if (action == LZMA_FULL_FLUSH) return LZMA_STREAM_END; } } // Return if there is no output space left. // This check must be done after testing the input // buffer, because we might want to use a different // return code. if (*out_pos == out_size) return LZMA_OK; // Neither in nor out has been used completely. // Wait until there's something we can do. if (wait_for_work(coder, &wait_abs, &has_blocked, *in_pos < in_size)) return LZMA_TIMED_OUT; } // All Blocks have been encoded and the threads have stopped. // Prepare to encode the Index field. return_if_error(lzma_index_encoder_init( &coder->index_encoder, allocator, coder->index)); coder->sequence = SEQ_INDEX; // Update the progress info to take the Index and // Stream Footer into account. Those are very fast to encode // so in terms of progress information they can be thought // to be ready to be copied out. coder->progress_out += lzma_index_size(coder->index) + LZMA_STREAM_HEADER_SIZE; } // Fall through case SEQ_INDEX: { // Call the Index encoder. It doesn't take any input, so // those pointers can be NULL. const lzma_ret ret = coder->index_encoder.code( coder->index_encoder.coder, allocator, NULL, NULL, 0, out, out_pos, out_size, LZMA_RUN); if (ret != LZMA_STREAM_END) return ret; // Encode the Stream Footer into coder->buffer. coder->stream_flags.backward_size = lzma_index_size(coder->index); if (lzma_stream_footer_encode(&coder->stream_flags, coder->header) != LZMA_OK) return LZMA_PROG_ERROR; coder->sequence = SEQ_STREAM_FOOTER; } // Fall through case SEQ_STREAM_FOOTER: lzma_bufcpy(coder->header, &coder->header_pos, sizeof(coder->header), out, out_pos, out_size); return coder->header_pos < sizeof(coder->header) ? LZMA_OK : LZMA_STREAM_END; } assert(0); return LZMA_PROG_ERROR; } static void stream_encoder_mt_end(void *coder_ptr, const lzma_allocator *allocator) { lzma_stream_coder *coder = coder_ptr; // Threads must be killed before the output queue can be freed. 
threads_end(coder, allocator); lzma_outq_end(&coder->outq, allocator); for (size_t i = 0; coder->filters[i].id != LZMA_VLI_UNKNOWN; ++i) lzma_free(coder->filters[i].options, allocator); lzma_next_end(&coder->index_encoder, allocator); lzma_index_end(coder->index, allocator); mythread_cond_destroy(&coder->cond); mythread_mutex_destroy(&coder->mutex); lzma_free(coder, allocator); return; } /// Options handling for lzma_stream_encoder_mt_init() and /// lzma_stream_encoder_mt_memusage() static lzma_ret get_options(const lzma_mt *options, lzma_options_easy *opt_easy, const lzma_filter **filters, uint64_t *block_size, uint64_t *outbuf_size_max) { // Validate some of the options. if (options == NULL) return LZMA_PROG_ERROR; if (options->flags != 0 || options->threads == 0 || options->threads > LZMA_THREADS_MAX) return LZMA_OPTIONS_ERROR; if (options->filters != NULL) { // Filter chain was given, use it as is. *filters = options->filters; } else { // Use a preset. if (lzma_easy_preset(opt_easy, options->preset)) return LZMA_OPTIONS_ERROR; *filters = opt_easy->filters; } // Block size if (options->block_size > 0) { if (options->block_size > BLOCK_SIZE_MAX) return LZMA_OPTIONS_ERROR; *block_size = options->block_size; } else { // Determine the Block size from the filter chain. *block_size = lzma_mt_block_size(*filters); if (*block_size == 0) return LZMA_OPTIONS_ERROR; assert(*block_size <= BLOCK_SIZE_MAX); } // Calculate the maximum amount output that a single output buffer // may need to hold. This is the same as the maximum total size of // a Block. *outbuf_size_max = lzma_block_buffer_bound64(*block_size); if (*outbuf_size_max == 0) return LZMA_MEM_ERROR; return LZMA_OK; } static void get_progress(void *coder_ptr, uint64_t *progress_in, uint64_t *progress_out) { lzma_stream_coder *coder = coder_ptr; // Lock coder->mutex to prevent finishing threads from moving their // progress info from the worker_thread structure to lzma_stream_coder. 
 	mythread_sync(coder->mutex) {
 		*progress_in = coder->progress_in;
 		*progress_out = coder->progress_out;

 		for (size_t i = 0; i < coder->threads_initialized; ++i) {
 			mythread_sync(coder->threads[i].mutex) {
 				*progress_in += coder->threads[i].progress_in;
 				*progress_out += coder->threads[i]
 						.progress_out;
 			}
 		}
 	}

 	return;
 }


 static lzma_ret
 stream_encoder_mt_init(lzma_next_coder *next, const lzma_allocator *allocator,
 		const lzma_mt *options)
 {
 	lzma_next_coder_init(&stream_encoder_mt_init, next, allocator);

 	// Get the filter chain.
 	lzma_options_easy easy;
 	const lzma_filter *filters;
 	uint64_t block_size;
 	uint64_t outbuf_size_max;
 	return_if_error(get_options(options, &easy, &filters,
 			&block_size, &outbuf_size_max));

 #if SIZE_MAX < UINT64_MAX
 	if (block_size > SIZE_MAX)
 		return LZMA_MEM_ERROR;
 #endif

 	// Validate the filter chain so that we can give an error in this
 	// function instead of delaying it to the first call to lzma_code().
 	// The memory usage calculation verifies the filter chain as
-	// a side effect so we take advatange of that.
+	// a side effect so we take advantage of that.
 	if (lzma_raw_encoder_memusage(filters) == UINT64_MAX)
 		return LZMA_OPTIONS_ERROR;

 	// Validate the Check ID.
 	if ((unsigned int)(options->check) > LZMA_CHECK_ID_MAX)
 		return LZMA_PROG_ERROR;

 	if (!lzma_check_is_supported(options->check))
 		return LZMA_UNSUPPORTED_CHECK;

 	// Allocate and initialize the base structure if needed.
 	lzma_stream_coder *coder = next->coder;
 	if (coder == NULL) {
 		coder = lzma_alloc(sizeof(lzma_stream_coder), allocator);
 		if (coder == NULL)
 			return LZMA_MEM_ERROR;

 		next->coder = coder;

 		// For the mutex and condition variable initializations
 		// the error handling has to be done here because
 		// stream_encoder_mt_end() doesn't know if they have
 		// already been initialized or not.
 		if (mythread_mutex_init(&coder->mutex)) {
 			lzma_free(coder, allocator);
 			next->coder = NULL;
 			return LZMA_MEM_ERROR;
 		}

 		if (mythread_cond_init(&coder->cond)) {
 			mythread_mutex_destroy(&coder->mutex);
 			lzma_free(coder, allocator);
 			next->coder = NULL;
 			return LZMA_MEM_ERROR;
 		}

 		next->code = &stream_encode_mt;
 		next->end = &stream_encoder_mt_end;
 		next->get_progress = &get_progress;
 // 		next->update = &stream_encoder_mt_update;

 		coder->filters[0].id = LZMA_VLI_UNKNOWN;
 		coder->index_encoder = LZMA_NEXT_CODER_INIT;
 		coder->index = NULL;
 		memzero(&coder->outq, sizeof(coder->outq));
 		coder->threads = NULL;
 		coder->threads_max = 0;
 		coder->threads_initialized = 0;
 	}

 	// Basic initializations
 	coder->sequence = SEQ_STREAM_HEADER;
 	coder->block_size = (size_t)(block_size);
 	coder->thread_error = LZMA_OK;
 	coder->thr = NULL;

 	// Allocate the thread-specific base structures.
 	assert(options->threads > 0);
 	if (coder->threads_max != options->threads) {
 		threads_end(coder, allocator);

 		coder->threads = NULL;
 		coder->threads_max = 0;

 		coder->threads_initialized = 0;
 		coder->threads_free = NULL;

 		coder->threads = lzma_alloc(
 				options->threads * sizeof(worker_thread),
 				allocator);
 		if (coder->threads == NULL)
 			return LZMA_MEM_ERROR;

 		coder->threads_max = options->threads;
 	} else {
 		// Reuse the old structures and threads. Tell the running
 		// threads to stop and wait until they have stopped.
 		threads_stop(coder, true);
 	}

 	// Output queue
 	return_if_error(lzma_outq_init(&coder->outq, allocator,
 			outbuf_size_max, options->threads));

 	// Timeout
 	coder->timeout = options->timeout;

 	// Free the old filter chain and copy the new one.
 	for (size_t i = 0; coder->filters[i].id != LZMA_VLI_UNKNOWN; ++i)
 		lzma_free(coder->filters[i].options, allocator);

 	return_if_error(lzma_filters_copy(
 			filters, coder->filters, allocator));

 	// Index
 	lzma_index_end(coder->index, allocator);
 	coder->index = lzma_index_init(allocator);
 	if (coder->index == NULL)
 		return LZMA_MEM_ERROR;

 	// Stream Header
 	coder->stream_flags.version = 0;
 	coder->stream_flags.check = options->check;
 	return_if_error(lzma_stream_header_encode(
 			&coder->stream_flags, coder->header));

 	coder->header_pos = 0;

 	// Progress info
 	coder->progress_in = 0;
 	coder->progress_out = LZMA_STREAM_HEADER_SIZE;

 	return LZMA_OK;
 }


 extern LZMA_API(lzma_ret)
 lzma_stream_encoder_mt(lzma_stream *strm, const lzma_mt *options)
 {
 	lzma_next_strm_init(stream_encoder_mt_init, strm, options);

 	strm->internal->supported_actions[LZMA_RUN] = true;
 // 	strm->internal->supported_actions[LZMA_SYNC_FLUSH] = true;
 	strm->internal->supported_actions[LZMA_FULL_FLUSH] = true;
 	strm->internal->supported_actions[LZMA_FULL_BARRIER] = true;
 	strm->internal->supported_actions[LZMA_FINISH] = true;

 	return LZMA_OK;
 }


 // This function name is a monster but it's consistent with the older
 // monster names. :-( 31 chars is the max that C99 requires so in that
 // sense it's not too long. ;-)
 extern LZMA_API(uint64_t)
 lzma_stream_encoder_mt_memusage(const lzma_mt *options)
 {
 	lzma_options_easy easy;
 	const lzma_filter *filters;
 	uint64_t block_size;
 	uint64_t outbuf_size_max;

 	if (get_options(options, &easy, &filters, &block_size,
 			&outbuf_size_max) != LZMA_OK)
 		return UINT64_MAX;

 	// Memory usage of the input buffers
 	const uint64_t inbuf_memusage = options->threads * block_size;

 	// Memory usage of the filter encoders
 	uint64_t filters_memusage = lzma_raw_encoder_memusage(filters);
 	if (filters_memusage == UINT64_MAX)
 		return UINT64_MAX;

 	filters_memusage *= options->threads;

 	// Memory usage of the output queue
 	const uint64_t outq_memusage = lzma_outq_memusage(
 			outbuf_size_max, options->threads);
 	if (outq_memusage == UINT64_MAX)
 		return UINT64_MAX;

 	// Sum them with overflow checking.
 	uint64_t total_memusage = LZMA_MEMUSAGE_BASE
 			+ sizeof(lzma_stream_coder)
 			+ options->threads * sizeof(worker_thread);

 	if (UINT64_MAX - total_memusage < inbuf_memusage)
 		return UINT64_MAX;

 	total_memusage += inbuf_memusage;

 	if (UINT64_MAX - total_memusage < filters_memusage)
 		return UINT64_MAX;

 	total_memusage += filters_memusage;

 	if (UINT64_MAX - total_memusage < outq_memusage)
 		return UINT64_MAX;

 	return total_memusage + outq_memusage;
 }
Index: head/contrib/xz/src/liblzma/common/stream_flags_decoder.c
===================================================================
--- head/contrib/xz/src/liblzma/common/stream_flags_decoder.c	(revision 359200)
+++ head/contrib/xz/src/liblzma/common/stream_flags_decoder.c	(revision 359201)
@@ -1,82 +1,82 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file stream_flags_decoder.c
 /// \brief Decodes Stream Header and Stream Footer from .xz files
 //
 // Author: Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 #include "stream_flags_common.h"


 static bool
 stream_flags_decode(lzma_stream_flags *options, const uint8_t *in)
 {
 	// Reserved bits must be unset.
 	if (in[0] != 0x00 || (in[1] & 0xF0))
 		return true;

 	options->version = 0;
 	options->check = in[1] & 0x0F;

 	return false;
 }


 extern LZMA_API(lzma_ret)
 lzma_stream_header_decode(lzma_stream_flags *options, const uint8_t *in)
 {
 	// Magic
 	if (memcmp(in, lzma_header_magic, sizeof(lzma_header_magic)) != 0)
 		return LZMA_FORMAT_ERROR;

 	// Verify the CRC32 so we can distinguish between corrupt
 	// and unsupported files.
 	const uint32_t crc = lzma_crc32(in + sizeof(lzma_header_magic),
 			LZMA_STREAM_FLAGS_SIZE, 0);
-	if (crc != unaligned_read32le(in + sizeof(lzma_header_magic)
+	if (crc != read32le(in + sizeof(lzma_header_magic)
 			+ LZMA_STREAM_FLAGS_SIZE))
 		return LZMA_DATA_ERROR;

 	// Stream Flags
 	if (stream_flags_decode(options, in + sizeof(lzma_header_magic)))
 		return LZMA_OPTIONS_ERROR;

 	// Set Backward Size to indicate unknown value. That way
 	// lzma_stream_flags_compare() can be used to compare Stream Header
 	// and Stream Footer while keeping it useful also for comparing
 	// two Stream Footers.
 	options->backward_size = LZMA_VLI_UNKNOWN;

 	return LZMA_OK;
 }


 extern LZMA_API(lzma_ret)
 lzma_stream_footer_decode(lzma_stream_flags *options, const uint8_t *in)
 {
 	// Magic
 	if (memcmp(in + sizeof(uint32_t) * 2 + LZMA_STREAM_FLAGS_SIZE,
 			lzma_footer_magic, sizeof(lzma_footer_magic)) != 0)
 		return LZMA_FORMAT_ERROR;

 	// CRC32
 	const uint32_t crc = lzma_crc32(in + sizeof(uint32_t),
 			sizeof(uint32_t) + LZMA_STREAM_FLAGS_SIZE, 0);
-	if (crc != unaligned_read32le(in))
+	if (crc != read32le(in))
 		return LZMA_DATA_ERROR;

 	// Stream Flags
 	if (stream_flags_decode(options, in + sizeof(uint32_t) * 2))
 		return LZMA_OPTIONS_ERROR;

 	// Backward Size
-	options->backward_size = unaligned_read32le(in + sizeof(uint32_t));
+	options->backward_size = read32le(in + sizeof(uint32_t));
 	options->backward_size = (options->backward_size + 1) * 4;

 	return LZMA_OK;
 }
Index: head/contrib/xz/src/liblzma/common/stream_flags_encoder.c
===================================================================
--- head/contrib/xz/src/liblzma/common/stream_flags_encoder.c	(revision 359200)
+++ head/contrib/xz/src/liblzma/common/stream_flags_encoder.c	(revision 359201)
@@ -1,86 +1,86 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file stream_flags_encoder.c
 /// \brief Encodes Stream Header and Stream Footer for .xz files
 //
 // Author: Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 #include "stream_flags_common.h"


 static bool
 stream_flags_encode(const lzma_stream_flags *options, uint8_t *out)
 {
 	if ((unsigned int)(options->check) > LZMA_CHECK_ID_MAX)
 		return true;

 	out[0] = 0x00;
 	out[1] = options->check;

 	return false;
 }


 extern LZMA_API(lzma_ret)
 lzma_stream_header_encode(const lzma_stream_flags *options, uint8_t *out)
 {
 	assert(sizeof(lzma_header_magic) + LZMA_STREAM_FLAGS_SIZE
 			+ 4 == LZMA_STREAM_HEADER_SIZE);

 	if (options->version != 0)
 		return LZMA_OPTIONS_ERROR;

 	// Magic
 	memcpy(out, lzma_header_magic, sizeof(lzma_header_magic));

 	// Stream Flags
 	if (stream_flags_encode(options, out + sizeof(lzma_header_magic)))
 		return LZMA_PROG_ERROR;

 	// CRC32 of the Stream Header
 	const uint32_t crc = lzma_crc32(out + sizeof(lzma_header_magic),
 			LZMA_STREAM_FLAGS_SIZE, 0);

-	unaligned_write32le(out + sizeof(lzma_header_magic)
-			+ LZMA_STREAM_FLAGS_SIZE, crc);
+	write32le(out + sizeof(lzma_header_magic) + LZMA_STREAM_FLAGS_SIZE,
+			crc);

 	return LZMA_OK;
 }


 extern LZMA_API(lzma_ret)
 lzma_stream_footer_encode(const lzma_stream_flags *options, uint8_t *out)
 {
 	assert(2 * 4 + LZMA_STREAM_FLAGS_SIZE + sizeof(lzma_footer_magic)
 			== LZMA_STREAM_HEADER_SIZE);

 	if (options->version != 0)
 		return LZMA_OPTIONS_ERROR;

 	// Backward Size
 	if (!is_backward_size_valid(options))
 		return LZMA_PROG_ERROR;

-	unaligned_write32le(out + 4, options->backward_size / 4 - 1);
+	write32le(out + 4, options->backward_size / 4 - 1);

 	// Stream Flags
 	if (stream_flags_encode(options, out + 2 * 4))
 		return LZMA_PROG_ERROR;

 	// CRC32
 	const uint32_t crc = lzma_crc32(
 			out + 4, 4 + LZMA_STREAM_FLAGS_SIZE, 0);

-	unaligned_write32le(out, crc);
+	write32le(out, crc);

 	// Magic
 	memcpy(out + 2 * 4 + LZMA_STREAM_FLAGS_SIZE,
 			lzma_footer_magic, sizeof(lzma_footer_magic));

 	return LZMA_OK;
 }
Index: head/contrib/xz/src/liblzma/common/vli_decoder.c
===================================================================
--- head/contrib/xz/src/liblzma/common/vli_decoder.c	(revision 359200)
+++ head/contrib/xz/src/liblzma/common/vli_decoder.c	(revision 359201)
@@ -1,86 +1,86 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file vli_decoder.c
 /// \brief Decodes variable-length integers
 //
 // Author: Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 #include "common.h"


 extern LZMA_API(lzma_ret)
 lzma_vli_decode(lzma_vli *restrict vli, size_t *vli_pos,
 		const uint8_t *restrict in, size_t *restrict in_pos,
 		size_t in_size)
 {
 	// If we haven't been given vli_pos, work in single-call mode.
 	size_t vli_pos_internal = 0;
 	if (vli_pos == NULL) {
 		vli_pos = &vli_pos_internal;
 		*vli = 0;

 		// If there's no input, use LZMA_DATA_ERROR. This way it is
 		// easy to decode VLIs from buffers that have known size,
 		// and get the correct error code in case the buffer is
 		// too short.
 		if (*in_pos >= in_size)
 			return LZMA_DATA_ERROR;

 	} else {
 		// Initialize *vli when starting to decode a new integer.
 		if (*vli_pos == 0)
 			*vli = 0;

 		// Validate the arguments.
 		if (*vli_pos >= LZMA_VLI_BYTES_MAX
 				|| (*vli >> (*vli_pos * 7)) != 0)
 			return LZMA_PROG_ERROR;;

 		if (*in_pos >= in_size)
 			return LZMA_BUF_ERROR;
 	}

 	do {
 		// Read the next byte. Use a temporary variable so that we
 		// can update *in_pos immediately.
 		const uint8_t byte = in[*in_pos];
 		++*in_pos;

 		// Add the newly read byte to *vli.
 		*vli += (lzma_vli)(byte & 0x7F) << (*vli_pos * 7);
 		++*vli_pos;

 		// Check if this is the last byte of a multibyte integer.
 		if ((byte & 0x80) == 0) {
 			// We don't allow using variable-length integers as
 			// padding i.e. the encoding must use the most the
 			// compact form.
 			if (byte == 0x00 && *vli_pos > 1)
 				return LZMA_DATA_ERROR;

 			return vli_pos == &vli_pos_internal
 					? LZMA_OK : LZMA_STREAM_END;
 		}

 		// There is at least one more byte coming. If we have already
 		// read maximum number of bytes, the integer is considered
 		// corrupt.
 		//
 		// If we need bigger integers in future, old versions liblzma
-		// will confusingly indicate the file being corrupt istead of
+		// will confusingly indicate the file being corrupt instead of
 		// unsupported. I suppose it's still better this way, because
 		// in the foreseeable future (writing this in 2008) the only
 		// reason why files would appear having over 63-bit integers
 		// is that the files are simply corrupt.
 		if (*vli_pos == LZMA_VLI_BYTES_MAX)
 			return LZMA_DATA_ERROR;

 	} while (*in_pos < in_size);

 	return vli_pos == &vli_pos_internal ? LZMA_DATA_ERROR : LZMA_OK;
 }
Index: head/contrib/xz/src/liblzma/delta/delta_decoder.c
===================================================================
--- head/contrib/xz/src/liblzma/delta/delta_decoder.c	(revision 359200)
+++ head/contrib/xz/src/liblzma/delta/delta_decoder.c	(revision 359201)
@@ -1,78 +1,78 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file delta_decoder.c
 /// \brief Delta filter decoder
 //
 // Author: Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 #include "delta_decoder.h"
 #include "delta_private.h"


 static void
 decode_buffer(lzma_delta_coder *coder, uint8_t *buffer, size_t size)
 {
 	const size_t distance = coder->distance;

 	for (size_t i = 0; i < size; ++i) {
 		buffer[i] += coder->history[(distance + coder->pos) & 0xFF];
 		coder->history[coder->pos-- & 0xFF] = buffer[i];
 	}
 }


 static lzma_ret
 delta_decode(void *coder_ptr, const lzma_allocator *allocator,
 		const uint8_t *restrict in, size_t *restrict in_pos,
 		size_t in_size, uint8_t *restrict out,
 		size_t *restrict out_pos, size_t out_size, lzma_action action)
 {
 	lzma_delta_coder *coder = coder_ptr;

 	assert(coder->next.code != NULL);

 	const size_t out_start = *out_pos;

 	const lzma_ret ret = coder->next.code(coder->next.coder, allocator,
 			in, in_pos, in_size, out, out_pos, out_size,
 			action);

 	decode_buffer(coder, out + out_start, *out_pos - out_start);

 	return ret;
 }


 extern lzma_ret
 lzma_delta_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator,
 		const lzma_filter_info *filters)
 {
 	next->code = &delta_decode;
 	return lzma_delta_coder_init(next, allocator, filters);
 }


 extern lzma_ret
 lzma_delta_props_decode(void **options, const lzma_allocator *allocator,
 		const uint8_t *props, size_t props_size)
 {
 	if (props_size != 1)
 		return LZMA_OPTIONS_ERROR;

 	lzma_options_delta *opt
 			= lzma_alloc(sizeof(lzma_options_delta), allocator);
 	if (opt == NULL)
 		return LZMA_MEM_ERROR;

 	opt->type = LZMA_DELTA_TYPE_BYTE;
-	opt->dist = props[0] + 1;
+	opt->dist = props[0] + 1U;

 	*options = opt;

 	return LZMA_OK;
 }
Index: head/contrib/xz/src/liblzma/lz/lz_decoder.c
===================================================================
--- head/contrib/xz/src/liblzma/lz/lz_decoder.c	(revision 359200)
+++ head/contrib/xz/src/liblzma/lz/lz_decoder.c	(revision 359201)
@@ -1,306 +1,311 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file lz_decoder.c
 /// \brief LZ out window
 ///
 // Authors: Igor Pavlov
 // Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 // liblzma supports multiple LZ77-based filters. The LZ part is shared
 // between these filters. The LZ code takes care of dictionary handling
 // and passing the data between filters in the chain. The filter-specific
 // part decodes from the input buffer to the dictionary.


 #include "lz_decoder.h"


 typedef struct {
 	/// Dictionary (history buffer)
 	lzma_dict dict;

 	/// The actual LZ-based decoder e.g. LZMA
 	lzma_lz_decoder lz;

 	/// Next filter in the chain, if any. Note that LZMA and LZMA2 are
 	/// only allowed as the last filter, but the long-range filter in
 	/// future can be in the middle of the chain.
 	lzma_next_coder next;

 	/// True if the next filter in the chain has returned LZMA_STREAM_END.
 	bool next_finished;

 	/// True if the LZ decoder (e.g. LZMA) has detected end of payload
 	/// marker. This may become true before next_finished becomes true.
 	bool this_finished;

 	/// Temporary buffer needed when the LZ-based filter is not the last
 	/// filter in the chain. The output of the next filter is first
 	/// decoded into buffer[], which is then used as input for the actual
 	/// LZ-based decoder.
 	struct {
 		size_t pos;
 		size_t size;
 		uint8_t buffer[LZMA_BUFFER_SIZE];
 	} temp;
 } lzma_coder;


 static void
 lz_decoder_reset(lzma_coder *coder)
 {
 	coder->dict.pos = 0;
 	coder->dict.full = 0;
 	coder->dict.buf[coder->dict.size - 1] = '\0';
 	coder->dict.need_reset = false;
 	return;
 }


 static lzma_ret
 decode_buffer(lzma_coder *coder,
 		const uint8_t *restrict in, size_t *restrict in_pos,
 		size_t in_size, uint8_t *restrict out,
 		size_t *restrict out_pos, size_t out_size)
 {
 	while (true) {
 		// Wrap the dictionary if needed.
 		if (coder->dict.pos == coder->dict.size)
 			coder->dict.pos = 0;

 		// Store the current dictionary position. It is needed to know
 		// where to start copying to the out[] buffer.
 		const size_t dict_start = coder->dict.pos;

 		// Calculate how much we allow coder->lz.code() to decode.
 		// It must not decode past the end of the dictionary
 		// buffer, and we don't want it to decode more than is
 		// actually needed to fill the out[] buffer.
 		coder->dict.limit = coder->dict.pos
 				+ my_min(out_size - *out_pos,
 					coder->dict.size - coder->dict.pos);

 		// Call the coder->lz.code() to do the actual decoding.
 		const lzma_ret ret = coder->lz.code(
 				coder->lz.coder, &coder->dict,
 				in, in_pos, in_size);

 		// Copy the decoded data from the dictionary to the out[]
-		// buffer.
+		// buffer. Do it conditionally because out can be NULL
+		// (in which case copy_size is always 0). Calling memcpy()
+		// with a null-pointer is undefined even if the third
+		// argument is 0.
 		const size_t copy_size = coder->dict.pos - dict_start;
 		assert(copy_size <= out_size - *out_pos);
-		memcpy(out + *out_pos, coder->dict.buf + dict_start,
-				copy_size);
+
+		if (copy_size > 0)
+			memcpy(out + *out_pos, coder->dict.buf + dict_start,
+					copy_size);
+
 		*out_pos += copy_size;

 		// Reset the dictionary if so requested by coder->lz.code().
 		if (coder->dict.need_reset) {
 			lz_decoder_reset(coder);

 			// Since we reset dictionary, we don't check if
 			// dictionary became full.
 			if (ret != LZMA_OK || *out_pos == out_size)
 				return ret;
 		} else {
 			// Return if everything got decoded or an error
 			// occurred, or if there's no more data to decode.
 			//
 			// Note that detecting if there's something to decode
 			// is done by looking if dictionary become full
 			// instead of looking if *in_pos == in_size. This
 			// is because it is possible that all the input was
 			// consumed already but some data is pending to be
 			// written to the dictionary.
 			if (ret != LZMA_OK || *out_pos == out_size
 					|| coder->dict.pos < coder->dict.size)
 				return ret;
 		}
 	}
 }


 static lzma_ret
-lz_decode(void *coder_ptr,
-		const lzma_allocator *allocator lzma_attribute((__unused__)),
+lz_decode(void *coder_ptr, const lzma_allocator *allocator,
 		const uint8_t *restrict in, size_t *restrict in_pos,
 		size_t in_size, uint8_t *restrict out,
 		size_t *restrict out_pos, size_t out_size,
 		lzma_action action)
 {
 	lzma_coder *coder = coder_ptr;

 	if (coder->next.code == NULL)
 		return decode_buffer(coder, in, in_pos, in_size,
 				out, out_pos, out_size);

 	// We aren't the last coder in the chain, we need to decode
 	// our input to a temporary buffer.
 	while (*out_pos < out_size) {
 		// Fill the temporary buffer if it is empty.
 		if (!coder->next_finished
 				&& coder->temp.pos == coder->temp.size) {
 			coder->temp.pos = 0;
 			coder->temp.size = 0;

 			const lzma_ret ret = coder->next.code(
 					coder->next.coder,
 					allocator, in, in_pos, in_size,
 					coder->temp.buffer, &coder->temp.size,
 					LZMA_BUFFER_SIZE, action);

 			if (ret == LZMA_STREAM_END)
 				coder->next_finished = true;
 			else if (ret != LZMA_OK || coder->temp.size == 0)
 				return ret;
 		}

 		if (coder->this_finished) {
 			if (coder->temp.size != 0)
 				return LZMA_DATA_ERROR;

 			if (coder->next_finished)
 				return LZMA_STREAM_END;

 			return LZMA_OK;
 		}

 		const lzma_ret ret = decode_buffer(coder, coder->temp.buffer,
 				&coder->temp.pos, coder->temp.size,
 				out, out_pos, out_size);

 		if (ret == LZMA_STREAM_END)
 			coder->this_finished = true;
 		else if (ret != LZMA_OK)
 			return ret;
 		else if (coder->next_finished && *out_pos < out_size)
 			return LZMA_DATA_ERROR;
 	}

 	return LZMA_OK;
 }


 static void
 lz_decoder_end(void *coder_ptr, const lzma_allocator *allocator)
 {
 	lzma_coder *coder = coder_ptr;

 	lzma_next_end(&coder->next, allocator);
 	lzma_free(coder->dict.buf, allocator);

 	if (coder->lz.end != NULL)
 		coder->lz.end(coder->lz.coder, allocator);
 	else
 		lzma_free(coder->lz.coder, allocator);

 	lzma_free(coder, allocator);
 	return;
 }


 extern lzma_ret
 lzma_lz_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator,
 		const lzma_filter_info *filters,
 		lzma_ret (*lz_init)(lzma_lz_decoder *lz,
 			const lzma_allocator *allocator, const void *options,
 			lzma_lz_options *lz_options))
 {
 	// Allocate the base structure if it isn't already allocated.
 	lzma_coder *coder = next->coder;
 	if (coder == NULL) {
 		coder = lzma_alloc(sizeof(lzma_coder), allocator);
 		if (coder == NULL)
 			return LZMA_MEM_ERROR;

 		next->coder = coder;
 		next->code = &lz_decode;
 		next->end = &lz_decoder_end;

 		coder->dict.buf = NULL;
 		coder->dict.size = 0;
 		coder->lz = LZMA_LZ_DECODER_INIT;
 		coder->next = LZMA_NEXT_CODER_INIT;
 	}

 	// Allocate and initialize the LZ-based decoder. It will also give
 	// us the dictionary size.
 	lzma_lz_options lz_options;
 	return_if_error(lz_init(&coder->lz, allocator,
 			filters[0].options, &lz_options));

 	// If the dictionary size is very small, increase it to 4096 bytes.
 	// This is to prevent constant wrapping of the dictionary, which
 	// would slow things down. The downside is that since we don't check
 	// separately for the real dictionary size, we may happily accept
 	// corrupt files.
 	if (lz_options.dict_size < 4096)
 		lz_options.dict_size = 4096;

-	// Make dictionary size a multipe of 16. Some LZ-based decoders like
+	// Make dictionary size a multiple of 16. Some LZ-based decoders like
 	// LZMA use the lowest bits lzma_dict.pos to know the alignment of the
 	// data. Aligned buffer is also good when memcpying from the
 	// dictionary to the output buffer, since applications are
 	// recommended to give aligned buffers to liblzma.
 	//
 	// Avoid integer overflow.
 	if (lz_options.dict_size > SIZE_MAX - 15)
 		return LZMA_MEM_ERROR;

 	lz_options.dict_size = (lz_options.dict_size + 15) & ~((size_t)(15));

 	// Allocate and initialize the dictionary.
 	if (coder->dict.size != lz_options.dict_size) {
 		lzma_free(coder->dict.buf, allocator);
 		coder->dict.buf
 				= lzma_alloc(lz_options.dict_size, allocator);
 		if (coder->dict.buf == NULL)
 			return LZMA_MEM_ERROR;

 		coder->dict.size = lz_options.dict_size;
 	}

 	lz_decoder_reset(next->coder);

 	// Use the preset dictionary if it was given to us.
 	if (lz_options.preset_dict != NULL
 			&& lz_options.preset_dict_size > 0) {
 		// If the preset dictionary is bigger than the actual
 		// dictionary, copy only the tail.
 		const size_t copy_size = my_min(lz_options.preset_dict_size,
 				lz_options.dict_size);
 		const size_t offset = lz_options.preset_dict_size - copy_size;
 		memcpy(coder->dict.buf, lz_options.preset_dict + offset,
 				copy_size);
 		coder->dict.pos = copy_size;
 		coder->dict.full = copy_size;
 	}

 	// Miscellaneous initializations
 	coder->next_finished = false;
 	coder->this_finished = false;
 	coder->temp.pos = 0;
 	coder->temp.size = 0;

 	// Initialize the next filter in the chain, if any.
 	return lzma_next_filter_init(&coder->next, allocator, filters + 1);
 }


 extern uint64_t
 lzma_lz_decoder_memusage(size_t dictionary_size)
 {
 	return sizeof(lzma_coder) + (uint64_t)(dictionary_size);
 }


 extern void
 lzma_lz_decoder_uncompressed(void *coder_ptr, lzma_vli uncompressed_size)
 {
 	lzma_coder *coder = coder_ptr;
 	coder->lz.set_uncompressed(coder->lz.coder, uncompressed_size);
 }
Index: head/contrib/xz/src/liblzma/lz/lz_encoder_hash.h
===================================================================
--- head/contrib/xz/src/liblzma/lz/lz_encoder_hash.h	(revision 359200)
+++ head/contrib/xz/src/liblzma/lz/lz_encoder_hash.h	(revision 359201)
@@ -1,108 +1,108 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file lz_encoder_hash.h
 /// \brief Hash macros for match finders
 //
 // Author: Igor Pavlov
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 #ifndef LZMA_LZ_ENCODER_HASH_H
 #define LZMA_LZ_ENCODER_HASH_H

 #if defined(WORDS_BIGENDIAN) && !defined(HAVE_SMALL)
 	// This is to make liblzma produce the same output on big endian
 	// systems that it does on little endian systems. lz_encoder.c
 	// takes care of including the actual table.
 	extern const uint32_t lzma_lz_hash_table[256];
 #	define hash_table lzma_lz_hash_table
 #else
 #	include "check.h"
 #	define hash_table lzma_crc32_table[0]
 #endif

 #define HASH_2_SIZE (UINT32_C(1) << 10)
 #define HASH_3_SIZE (UINT32_C(1) << 16)
 #define HASH_4_SIZE (UINT32_C(1) << 20)

 #define HASH_2_MASK (HASH_2_SIZE - 1)
 #define HASH_3_MASK (HASH_3_SIZE - 1)
 #define HASH_4_MASK (HASH_4_SIZE - 1)

 #define FIX_3_HASH_SIZE (HASH_2_SIZE)
 #define FIX_4_HASH_SIZE (HASH_2_SIZE + HASH_3_SIZE)
 #define FIX_5_HASH_SIZE (HASH_2_SIZE + HASH_3_SIZE + HASH_4_SIZE)

 // Endianness doesn't matter in hash_2_calc() (no effect on the output).
 #ifdef TUKLIB_FAST_UNALIGNED_ACCESS
 #	define hash_2_calc() \
-		const uint32_t hash_value = *(const uint16_t *)(cur)
+		const uint32_t hash_value = read16ne(cur)
 #else
 #	define hash_2_calc() \
 		const uint32_t hash_value \
 			= (uint32_t)(cur[0]) | ((uint32_t)(cur[1]) << 8)
 #endif

 #define hash_3_calc() \
 	const uint32_t temp = hash_table[cur[0]] ^ cur[1]; \
 	const uint32_t hash_2_value = temp & HASH_2_MASK; \
 	const uint32_t hash_value \
 			= (temp ^ ((uint32_t)(cur[2]) << 8)) & mf->hash_mask

 #define hash_4_calc() \
 	const uint32_t temp = hash_table[cur[0]] ^ cur[1]; \
 	const uint32_t hash_2_value = temp & HASH_2_MASK; \
 	const uint32_t hash_3_value \
 			= (temp ^ ((uint32_t)(cur[2]) << 8)) & HASH_3_MASK; \
 	const uint32_t hash_value = (temp ^ ((uint32_t)(cur[2]) << 8) \
 			^ (hash_table[cur[3]] << 5)) & mf->hash_mask


 // The following are not currently used.
 #define hash_5_calc() \
 	const uint32_t temp = hash_table[cur[0]] ^ cur[1]; \
 	const uint32_t hash_2_value = temp & HASH_2_MASK; \
 	const uint32_t hash_3_value \
 			= (temp ^ ((uint32_t)(cur[2]) << 8)) & HASH_3_MASK; \
 	uint32_t hash_4_value = (temp ^ ((uint32_t)(cur[2]) << 8) ^ \
 			^ hash_table[cur[3]] << 5); \
 	const uint32_t hash_value \
 			= (hash_4_value ^ (hash_table[cur[4]] << 3)) \
 				& mf->hash_mask; \
 	hash_4_value &= HASH_4_MASK

 /*
 #define hash_zip_calc() \
 	const uint32_t hash_value \
 			= (((uint32_t)(cur[0]) | ((uint32_t)(cur[1]) << 8)) \
 				^ hash_table[cur[2]]) & 0xFFFF
 */

 #define hash_zip_calc() \
 	const uint32_t hash_value \
 			= (((uint32_t)(cur[2]) | ((uint32_t)(cur[0]) << 8)) \
 				^ hash_table[cur[1]]) & 0xFFFF

 #define mt_hash_2_calc() \
 	const uint32_t hash_2_value \
 			= (hash_table[cur[0]] ^ cur[1]) & HASH_2_MASK

 #define mt_hash_3_calc() \
 	const uint32_t temp = hash_table[cur[0]] ^ cur[1]; \
 	const uint32_t hash_2_value = temp & HASH_2_MASK; \
 	const uint32_t hash_3_value \
 			= (temp ^ ((uint32_t)(cur[2]) << 8)) & HASH_3_MASK

 #define mt_hash_4_calc() \
 	const uint32_t temp = hash_table[cur[0]] ^ cur[1]; \
 	const uint32_t hash_2_value = temp & HASH_2_MASK; \
 	const uint32_t hash_3_value \
 			= (temp ^ ((uint32_t)(cur[2]) << 8)) & HASH_3_MASK; \
 	const uint32_t hash_4_value = (temp ^ ((uint32_t)(cur[2]) << 8) ^ \
 			(hash_table[cur[3]] << 5)) & HASH_4_MASK

 #endif
Index: head/contrib/xz/src/liblzma/lz/lz_encoder_mf.c
===================================================================
--- head/contrib/xz/src/liblzma/lz/lz_encoder_mf.c	(revision 359200)
+++ head/contrib/xz/src/liblzma/lz/lz_encoder_mf.c	(revision 359201)
@@ -1,744 +1,744 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file lz_encoder_mf.c
 /// \brief Match finders
 ///
 // Authors: Igor Pavlov
 // Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 #include "lz_encoder.h"
 #include "lz_encoder_hash.h"
 #include "memcmplen.h"


 /// \brief Find matches starting from the current byte
 ///
 /// \return The length of the longest match found
 extern uint32_t
 lzma_mf_find(lzma_mf *mf, uint32_t *count_ptr, lzma_match *matches)
 {
 	// Call the match finder. It returns the number of length-distance
 	// pairs found.
 	// FIXME: Minimum count is zero, what _exactly_ is the maximum?
 	const uint32_t count = mf->find(mf, matches);

 	// Length of the longest match; assume that no matches were found
 	// and thus the maximum length is zero.
 	uint32_t len_best = 0;

 	if (count > 0) {
 #ifndef NDEBUG
 		// Validate the matches.
 		for (uint32_t i = 0; i < count; ++i) {
 			assert(matches[i].len <= mf->nice_len);
 			assert(matches[i].dist < mf->read_pos);
 			assert(memcmp(mf_ptr(mf) - 1,
 				mf_ptr(mf) - matches[i].dist - 2,
 				matches[i].len) == 0);
 		}
 #endif

 		// The last used element in the array contains
 		// the longest match.
 		len_best = matches[count - 1].len;

 		// If a match of maximum search length was found, try to
 		// extend the match to maximum possible length.
 		if (len_best == mf->nice_len) {
 			// The limit for the match length is either the
 			// maximum match length supported by the LZ-based
 			// encoder or the number of bytes left in the
 			// dictionary, whichever is smaller.
 			uint32_t limit = mf_avail(mf) + 1;
 			if (limit > mf->match_len_max)
 				limit = mf->match_len_max;

 			// Pointer to the byte we just ran through
 			// the match finder.
 			const uint8_t *p1 = mf_ptr(mf) - 1;

 			// Pointer to the beginning of the match. We need -1
 			// here because the match distances are zero based.
 			const uint8_t *p2 = p1 - matches[count - 1].dist - 1;

 			len_best = lzma_memcmplen(p1, p2, len_best, limit);
 		}
 	}

 	*count_ptr = count;

 	// Finally update the read position to indicate that match finder was
 	// run for this dictionary offset.
 	++mf->read_ahead;

 	return len_best;
 }


 /// Hash value to indicate unused element in the hash. Since we start the
 /// positions from dict_size + 1, zero is always too far to qualify
 /// as usable match position.
 #define EMPTY_HASH_VALUE 0


 /// Normalization must be done when lzma_mf.offset + lzma_mf.read_pos
 /// reaches MUST_NORMALIZE_POS.
 #define MUST_NORMALIZE_POS UINT32_MAX


 /// \brief Normalizes hash values
 ///
 /// The hash arrays store positions of match candidates. The positions are
 /// relative to an arbitrary offset that is not the same as the absolute
 /// offset in the input stream. The relative position of the current byte
 /// is lzma_mf.offset + lzma_mf.read_pos. The distances of the matches are
 /// the differences of the current read position and the position found from
 /// the hash.
 ///
 /// To prevent integer overflows of the offsets stored in the hash arrays,
 /// we need to "normalize" the stored values now and then. During the
 /// normalization, we drop values that indicate distance greater than the
 /// dictionary size, thus making space for new values.
 static void
 normalize(lzma_mf *mf)
 {
 	assert(mf->read_pos + mf->offset == MUST_NORMALIZE_POS);

 	// In future we may not want to touch the lowest bits, because there
 	// may be match finders that use larger resolution than one byte.
 	const uint32_t subvalue
 			= (MUST_NORMALIZE_POS - mf->cyclic_size);
-				// & (~(UINT32_C(1) << 10) - 1);
+				// & ~((UINT32_C(1) << 10) - 1);

 	for (uint32_t i = 0; i < mf->hash_count; ++i) {
 		// If the distance is greater than the dictionary size,
 		// we can simply mark the hash element as empty.
 		if (mf->hash[i] <= subvalue)
 			mf->hash[i] = EMPTY_HASH_VALUE;
 		else
 			mf->hash[i] -= subvalue;
 	}

 	for (uint32_t i = 0; i < mf->sons_count; ++i) {
 		// Do the same for mf->son.
 		//
 		// NOTE: There may be uninitialized elements in mf->son.
 		// Valgrind may complain that the "if" below depends on
 		// an uninitialized value. In this case it is safe to ignore
 		// the warning. See also the comments in lz_encoder_init()
 		// in lz_encoder.c.
if (mf->son[i] <= subvalue) mf->son[i] = EMPTY_HASH_VALUE; else mf->son[i] -= subvalue; } // Update offset to match the new locations. mf->offset -= subvalue; return; } /// Mark the current byte as processed from point of view of the match finder. static void move_pos(lzma_mf *mf) { if (++mf->cyclic_pos == mf->cyclic_size) mf->cyclic_pos = 0; ++mf->read_pos; assert(mf->read_pos <= mf->write_pos); if (unlikely(mf->read_pos + mf->offset == UINT32_MAX)) normalize(mf); } /// When flushing, we cannot run the match finder unless there is nice_len /// bytes available in the dictionary. Instead, we skip running the match /// finder (indicating that no match was found), and count how many bytes we /// have ignored this way. /// /// When new data is given after the flushing was completed, the match finder /// is restarted by rewinding mf->read_pos backwards by mf->pending. Then /// the missed bytes are added to the hash using the match finder's skip /// function (with small amount of input, it may start using mf->pending /// again if flushing). /// /// Due to this rewinding, we don't touch cyclic_pos or test for /// normalization. It will be done when the match finder's skip function /// catches up after a flush. static void move_pending(lzma_mf *mf) { ++mf->read_pos; assert(mf->read_pos <= mf->write_pos); ++mf->pending; } /// Calculate len_limit and determine if there is enough input to run /// the actual match finder code. Sets up "cur" and "pos". This macro /// is used by all find functions and binary tree skip functions. Hash /// chain skip function doesn't need len_limit so a simpler code is used /// in them. 
 #define header(is_bt, len_min, ret_op) \
 	uint32_t len_limit = mf_avail(mf); \
 	if (mf->nice_len <= len_limit) { \
 		len_limit = mf->nice_len; \
 	} else if (len_limit < (len_min) \
 			|| (is_bt && mf->action == LZMA_SYNC_FLUSH)) { \
 		assert(mf->action != LZMA_RUN); \
 		move_pending(mf); \
 		ret_op; \
 	} \
 	const uint8_t *cur = mf_ptr(mf); \
 	const uint32_t pos = mf->read_pos + mf->offset


 /// Header for find functions. "return 0" indicates that zero matches
 /// were found.
 #define header_find(is_bt, len_min) \
 	header(is_bt, len_min, return 0); \
 	uint32_t matches_count = 0


 /// Header for a loop in a skip function. "continue" tells to skip the rest
 /// of the code in the loop.
 #define header_skip(is_bt, len_min) \
 	header(is_bt, len_min, continue)


 /// Calls hc_find_func() or bt_find_func() and calculates the total number
 /// of matches found. Updates the dictionary position and returns the number
 /// of matches found.
 #define call_find(func, len_best) \
 do { \
 	matches_count = func(len_limit, pos, cur, cur_match, mf->depth, \
 				mf->son, mf->cyclic_pos, mf->cyclic_size, \
 				matches + matches_count, len_best) \
 			- matches; \
 	move_pos(mf); \
 	return matches_count; \
 } while (0)


 ////////////////
 // Hash Chain //
 ////////////////

 #if defined(HAVE_MF_HC3) || defined(HAVE_MF_HC4)
 ///
 ///
 /// \param      len_limit       Don't look for matches longer than len_limit.
 /// \param      pos             lzma_mf.read_pos + lzma_mf.offset
 /// \param      cur             Pointer to current byte (mf_ptr(mf))
 /// \param      cur_match       Start position of the current match candidate
 /// \param      depth           Maximum length of the hash chain
 /// \param      son             lzma_mf.son (contains the hash chain)
 /// \param      cyclic_pos
 /// \param      cyclic_size
 /// \param      matches         Array to hold the matches.
 /// \param      len_best        The length of the longest match found so far.
 static lzma_match *
 hc_find_func(
 		const uint32_t len_limit,
 		const uint32_t pos,
 		const uint8_t *const cur,
 		uint32_t cur_match,
 		uint32_t depth,
 		uint32_t *const son,
 		const uint32_t cyclic_pos,
 		const uint32_t cyclic_size,
 		lzma_match *matches,
 		uint32_t len_best)
 {
 	son[cyclic_pos] = cur_match;

 	while (true) {
 		const uint32_t delta = pos - cur_match;
 		if (depth-- == 0 || delta >= cyclic_size)
 			return matches;

 		const uint8_t *const pb = cur - delta;
 		cur_match = son[cyclic_pos - delta
 				+ (delta > cyclic_pos ? cyclic_size : 0)];

 		if (pb[len_best] == cur[len_best] && pb[0] == cur[0]) {
 			uint32_t len = lzma_memcmplen(pb, cur, 1, len_limit);

 			if (len_best < len) {
 				len_best = len;
 				matches->len = len;
 				matches->dist = delta - 1;
 				++matches;

 				if (len == len_limit)
 					return matches;
 			}
 		}
 	}
 }


 #define hc_find(len_best) \
 	call_find(hc_find_func, len_best)


 #define hc_skip() \
 do { \
 	mf->son[mf->cyclic_pos] = cur_match; \
 	move_pos(mf); \
 } while (0)

 #endif


 #ifdef HAVE_MF_HC3
 extern uint32_t
 lzma_mf_hc3_find(lzma_mf *mf, lzma_match *matches)
 {
 	header_find(false, 3);

 	hash_3_calc();

 	const uint32_t delta2 = pos - mf->hash[hash_2_value];
 	const uint32_t cur_match = mf->hash[FIX_3_HASH_SIZE + hash_value];

 	mf->hash[hash_2_value] = pos;
 	mf->hash[FIX_3_HASH_SIZE + hash_value] = pos;

 	uint32_t len_best = 2;

 	if (delta2 < mf->cyclic_size && *(cur - delta2) == *cur) {
 		len_best = lzma_memcmplen(cur - delta2, cur,
 				len_best, len_limit);

 		matches[0].len = len_best;
 		matches[0].dist = delta2 - 1;
 		matches_count = 1;

 		if (len_best == len_limit) {
 			hc_skip();
 			return 1; // matches_count
 		}
 	}

 	hc_find(len_best);
 }


 extern void
 lzma_mf_hc3_skip(lzma_mf *mf, uint32_t amount)
 {
 	do {
 		if (mf_avail(mf) < 3) {
 			move_pending(mf);
 			continue;
 		}

 		const uint8_t *cur = mf_ptr(mf);
 		const uint32_t pos = mf->read_pos + mf->offset;

 		hash_3_calc();

 		const uint32_t cur_match
 				= mf->hash[FIX_3_HASH_SIZE + hash_value];

 		mf->hash[hash_2_value] = pos;
 		mf->hash[FIX_3_HASH_SIZE + hash_value] = pos;

 		hc_skip();

 	} while (--amount != 0);
 }
 #endif


 #ifdef HAVE_MF_HC4
 extern uint32_t
 lzma_mf_hc4_find(lzma_mf *mf, lzma_match *matches)
 {
 	header_find(false, 4);

 	hash_4_calc();

 	uint32_t delta2 = pos - mf->hash[hash_2_value];
 	const uint32_t delta3
 			= pos - mf->hash[FIX_3_HASH_SIZE + hash_3_value];
 	const uint32_t cur_match = mf->hash[FIX_4_HASH_SIZE + hash_value];

 	mf->hash[hash_2_value ] = pos;
 	mf->hash[FIX_3_HASH_SIZE + hash_3_value] = pos;
 	mf->hash[FIX_4_HASH_SIZE + hash_value] = pos;

 	uint32_t len_best = 1;

 	if (delta2 < mf->cyclic_size && *(cur - delta2) == *cur) {
 		len_best = 2;
 		matches[0].len = 2;
 		matches[0].dist = delta2 - 1;
 		matches_count = 1;
 	}

 	if (delta2 != delta3 && delta3 < mf->cyclic_size
 			&& *(cur - delta3) == *cur) {
 		len_best = 3;
 		matches[matches_count++].dist = delta3 - 1;
 		delta2 = delta3;
 	}

 	if (matches_count != 0) {
 		len_best = lzma_memcmplen(cur - delta2, cur,
 				len_best, len_limit);

 		matches[matches_count - 1].len = len_best;

 		if (len_best == len_limit) {
 			hc_skip();
 			return matches_count;
 		}
 	}

 	if (len_best < 3)
 		len_best = 3;

 	hc_find(len_best);
 }


 extern void
 lzma_mf_hc4_skip(lzma_mf *mf, uint32_t amount)
 {
 	do {
 		if (mf_avail(mf) < 4) {
 			move_pending(mf);
 			continue;
 		}

 		const uint8_t *cur = mf_ptr(mf);
 		const uint32_t pos = mf->read_pos + mf->offset;

 		hash_4_calc();

 		const uint32_t cur_match
 				= mf->hash[FIX_4_HASH_SIZE + hash_value];

 		mf->hash[hash_2_value] = pos;
 		mf->hash[FIX_3_HASH_SIZE + hash_3_value] = pos;
 		mf->hash[FIX_4_HASH_SIZE + hash_value] = pos;

 		hc_skip();

 	} while (--amount != 0);
 }
 #endif


 /////////////////
 // Binary Tree //
 /////////////////

 #if defined(HAVE_MF_BT2) || defined(HAVE_MF_BT3) || defined(HAVE_MF_BT4)
 static lzma_match *
 bt_find_func(
 		const uint32_t len_limit,
 		const uint32_t pos,
 		const uint8_t *const cur,
 		uint32_t cur_match,
 		uint32_t depth,
 		uint32_t *const son,
 		const uint32_t cyclic_pos,
 		const uint32_t cyclic_size,
 		lzma_match *matches,
 		uint32_t len_best)
 {
 	uint32_t *ptr0 = son + (cyclic_pos << 1) + 1;
 	uint32_t *ptr1 = son + (cyclic_pos << 1);

 	uint32_t len0 = 0;
 	uint32_t len1 = 0;

 	while (true) {
 		const uint32_t delta = pos - cur_match;
 		if (depth-- == 0 || delta >= cyclic_size) {
 			*ptr0 = EMPTY_HASH_VALUE;
 			*ptr1 = EMPTY_HASH_VALUE;
 			return matches;
 		}

 		uint32_t *const pair = son + ((cyclic_pos - delta
 				+ (delta > cyclic_pos ? cyclic_size : 0))
 				<< 1);

 		const uint8_t *const pb = cur - delta;
 		uint32_t len = my_min(len0, len1);

 		if (pb[len] == cur[len]) {
 			len = lzma_memcmplen(pb, cur, len + 1, len_limit);

 			if (len_best < len) {
 				len_best = len;
 				matches->len = len;
 				matches->dist = delta - 1;
 				++matches;

 				if (len == len_limit) {
 					*ptr1 = pair[0];
 					*ptr0 = pair[1];
 					return matches;
 				}
 			}
 		}

 		if (pb[len] < cur[len]) {
 			*ptr1 = cur_match;
 			ptr1 = pair + 1;
 			cur_match = *ptr1;
 			len1 = len;
 		} else {
 			*ptr0 = cur_match;
 			ptr0 = pair;
 			cur_match = *ptr0;
 			len0 = len;
 		}
 	}
 }


 static void
 bt_skip_func(
 		const uint32_t len_limit,
 		const uint32_t pos,
 		const uint8_t *const cur,
 		uint32_t cur_match,
 		uint32_t depth,
 		uint32_t *const son,
 		const uint32_t cyclic_pos,
 		const uint32_t cyclic_size)
 {
 	uint32_t *ptr0 = son + (cyclic_pos << 1) + 1;
 	uint32_t *ptr1 = son + (cyclic_pos << 1);

 	uint32_t len0 = 0;
 	uint32_t len1 = 0;

 	while (true) {
 		const uint32_t delta = pos - cur_match;
 		if (depth-- == 0 || delta >= cyclic_size) {
 			*ptr0 = EMPTY_HASH_VALUE;
 			*ptr1 = EMPTY_HASH_VALUE;
 			return;
 		}

 		uint32_t *pair = son + ((cyclic_pos - delta
 				+ (delta > cyclic_pos ? cyclic_size : 0))
 				<< 1);
 		const uint8_t *pb = cur - delta;
 		uint32_t len = my_min(len0, len1);

 		if (pb[len] == cur[len]) {
 			len = lzma_memcmplen(pb, cur, len + 1, len_limit);

 			if (len == len_limit) {
 				*ptr1 = pair[0];
 				*ptr0 = pair[1];
 				return;
 			}
 		}

 		if (pb[len] < cur[len]) {
 			*ptr1 = cur_match;
 			ptr1 = pair + 1;
 			cur_match = *ptr1;
 			len1 = len;
 		} else {
 			*ptr0 = cur_match;
 			ptr0 = pair;
 			cur_match = *ptr0;
 			len0 = len;
 		}
 	}
 }


 #define bt_find(len_best) \
 	call_find(bt_find_func, len_best)

 #define bt_skip() \
 do { \
 	bt_skip_func(len_limit, pos, cur, cur_match, mf->depth, \
 			mf->son, mf->cyclic_pos, \
 			mf->cyclic_size); \
 	move_pos(mf); \
 } while (0)

 #endif


 #ifdef HAVE_MF_BT2
 extern uint32_t
 lzma_mf_bt2_find(lzma_mf *mf, lzma_match *matches)
 {
 	header_find(true, 2);

 	hash_2_calc();

 	const uint32_t cur_match = mf->hash[hash_value];
 	mf->hash[hash_value] = pos;

 	bt_find(1);
 }


 extern void
 lzma_mf_bt2_skip(lzma_mf *mf, uint32_t amount)
 {
 	do {
 		header_skip(true, 2);

 		hash_2_calc();

 		const uint32_t cur_match = mf->hash[hash_value];
 		mf->hash[hash_value] = pos;

 		bt_skip();

 	} while (--amount != 0);
 }
 #endif


 #ifdef HAVE_MF_BT3
 extern uint32_t
 lzma_mf_bt3_find(lzma_mf *mf, lzma_match *matches)
 {
 	header_find(true, 3);

 	hash_3_calc();

 	const uint32_t delta2 = pos - mf->hash[hash_2_value];
 	const uint32_t cur_match = mf->hash[FIX_3_HASH_SIZE + hash_value];

 	mf->hash[hash_2_value] = pos;
 	mf->hash[FIX_3_HASH_SIZE + hash_value] = pos;

 	uint32_t len_best = 2;

 	if (delta2 < mf->cyclic_size && *(cur - delta2) == *cur) {
 		len_best = lzma_memcmplen(
 				cur, cur - delta2, len_best, len_limit);

 		matches[0].len = len_best;
 		matches[0].dist = delta2 - 1;
 		matches_count = 1;

 		if (len_best == len_limit) {
 			bt_skip();
 			return 1; // matches_count
 		}
 	}

 	bt_find(len_best);
 }


 extern void
 lzma_mf_bt3_skip(lzma_mf *mf, uint32_t amount)
 {
 	do {
 		header_skip(true, 3);

 		hash_3_calc();

 		const uint32_t cur_match
 				= mf->hash[FIX_3_HASH_SIZE + hash_value];

 		mf->hash[hash_2_value] = pos;
 		mf->hash[FIX_3_HASH_SIZE + hash_value] = pos;

 		bt_skip();

 	} while (--amount != 0);
 }
 #endif


 #ifdef HAVE_MF_BT4
 extern uint32_t
 lzma_mf_bt4_find(lzma_mf *mf, lzma_match *matches)
 {
 	header_find(true, 4);

 	hash_4_calc();

 	uint32_t delta2 = pos - mf->hash[hash_2_value];
 	const uint32_t delta3
 			= pos - mf->hash[FIX_3_HASH_SIZE + hash_3_value];
 	const uint32_t cur_match = mf->hash[FIX_4_HASH_SIZE + hash_value];

 	mf->hash[hash_2_value] = pos;
 	mf->hash[FIX_3_HASH_SIZE + hash_3_value] = pos;
 	mf->hash[FIX_4_HASH_SIZE + hash_value] = pos;

 	uint32_t len_best = 1;

 	if (delta2 < mf->cyclic_size && *(cur - delta2) == *cur) {
 		len_best = 2;
 		matches[0].len = 2;
 		matches[0].dist = delta2 - 1;
 		matches_count = 1;
 	}

 	if (delta2 != delta3 && delta3 < mf->cyclic_size
 			&& *(cur - delta3) == *cur) {
 		len_best = 3;
 		matches[matches_count++].dist = delta3 - 1;
 		delta2 = delta3;
 	}

 	if (matches_count != 0) {
 		len_best = lzma_memcmplen(
 				cur, cur - delta2, len_best, len_limit);

 		matches[matches_count - 1].len = len_best;

 		if (len_best == len_limit) {
 			bt_skip();
 			return matches_count;
 		}
 	}

 	if (len_best < 3)
 		len_best = 3;

 	bt_find(len_best);
 }


 extern void
 lzma_mf_bt4_skip(lzma_mf *mf, uint32_t amount)
 {
 	do {
 		header_skip(true, 4);

 		hash_4_calc();

 		const uint32_t cur_match
 				= mf->hash[FIX_4_HASH_SIZE + hash_value];

 		mf->hash[hash_2_value] = pos;
 		mf->hash[FIX_3_HASH_SIZE + hash_3_value] = pos;
 		mf->hash[FIX_4_HASH_SIZE + hash_value] = pos;

 		bt_skip();

 	} while (--amount != 0);
 }
 #endif
Index: head/contrib/xz/src/liblzma/lzma/fastpos.h
===================================================================
--- head/contrib/xz/src/liblzma/lzma/fastpos.h (revision 359200)
+++ head/contrib/xz/src/liblzma/lzma/fastpos.h (revision 359201)
@@ -1,141 +1,141 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file       fastpos.h
 /// \brief      Kind of two-bit version of bit scan reverse
 ///
 //  Authors:    Igor Pavlov
 //              Lasse Collin
 //
 //  This file has been put into the public domain.
 //  You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 #ifndef LZMA_FASTPOS_H
 #define LZMA_FASTPOS_H

 // LZMA encodes match distances by storing the highest two bits using
 // a six-bit value [0, 63], and then the missing lower bits.
 // Dictionary size is also stored using this encoding in the .xz
 // file format header.
 //
 // fastpos.h provides a way to quickly find out the correct six-bit
 // values. The following table gives some examples of this encoding:
 //
 //     dist   return
 //       0       0
 //       1       1
 //       2       2
 //       3       3
 //       4       4
 //       5       4
 //       6       5
 //       7       5
 //       8       6
 //      11       6
 //      12       7
 //     ...      ...
 //      15       7
 //      16       8
 //      17       8
 //     ...      ...
 //      23       8
 //      24       9
 //      25       9
 //     ...      ...
 //
 //
 // Provided functions or macros
 // ----------------------------
 //
 // get_dist_slot(dist) is the basic version. get_dist_slot_2(dist)
 // assumes that dist >= FULL_DISTANCES, thus the result is at least
 // FULL_DISTANCES_BITS * 2. Using get_dist_slot(dist) instead of
 // get_dist_slot_2(dist) would give the same result, but get_dist_slot_2(dist)
 // should be tiny bit faster due to the assumption being made.
 //
 //
 // Size vs. speed
 // --------------
 //
 // With some CPUs that have fast BSR (bit scan reverse) instruction, the
 // size optimized version is slightly faster than the bigger table based
 // approach. Such CPUs include Intel Pentium Pro, Pentium II, Pentium III
 // and Core 2 (possibly others). AMD K7 seems to have slower BSR, but that
 // would still have speed roughly comparable to the table version. Older
 // x86 CPUs like the original Pentium have very slow BSR; on those systems
 // the table version is a lot faster.
 //
 // On some CPUs, the table version is a lot faster when using position
 // dependent code, but with position independent code the size optimized
 // version is slightly faster. This occurs at least on 32-bit SPARC (no
 // ASM optimizations).
 //
 // I'm making the table version the default, because that has good speed
 // on all systems I have tried. The size optimized version is sometimes
 // slightly faster, but sometimes it is a lot slower.

 #ifdef HAVE_SMALL
 #	define get_dist_slot(dist) \
 		((dist) <= 4 ? (dist) : get_dist_slot_2(dist))

 static inline uint32_t
 get_dist_slot_2(uint32_t dist)
 {
 	const uint32_t i = bsr32(dist);
 	return (i + i) + ((dist >> (i - 1)) & 1);
 }


 #else

 #define FASTPOS_BITS 13

 extern const uint8_t lzma_fastpos[1 << FASTPOS_BITS];


 #define fastpos_shift(extra, n) \
 	((extra) + (n) * (FASTPOS_BITS - 1))

 #define fastpos_limit(extra, n) \
 	(UINT32_C(1) << (FASTPOS_BITS + fastpos_shift(extra, n)))

 #define fastpos_result(dist, extra, n) \
-	lzma_fastpos[(dist) >> fastpos_shift(extra, n)] \
+	(uint32_t)(lzma_fastpos[(dist) >> fastpos_shift(extra, n)]) \
 			+ 2 * fastpos_shift(extra, n)


 static inline uint32_t
 get_dist_slot(uint32_t dist)
 {
 	// If it is small enough, we can pick the result directly from
 	// the precalculated table.
 	if (dist < fastpos_limit(0, 0))
 		return lzma_fastpos[dist];

 	if (dist < fastpos_limit(0, 1))
 		return fastpos_result(dist, 0, 1);

 	return fastpos_result(dist, 0, 2);
 }


 #ifdef FULL_DISTANCES_BITS
 static inline uint32_t
 get_dist_slot_2(uint32_t dist)
 {
 	assert(dist >= FULL_DISTANCES);

 	if (dist < fastpos_limit(FULL_DISTANCES_BITS - 1, 0))
 		return fastpos_result(dist, FULL_DISTANCES_BITS - 1, 0);

 	if (dist < fastpos_limit(FULL_DISTANCES_BITS - 1, 1))
 		return fastpos_result(dist, FULL_DISTANCES_BITS - 1, 1);

 	return fastpos_result(dist, FULL_DISTANCES_BITS - 1, 2);
 }
 #endif

 #endif

 #endif
Index: head/contrib/xz/src/liblzma/lzma/fastpos_tablegen.c
===================================================================
--- head/contrib/xz/src/liblzma/lzma/fastpos_tablegen.c (revision 359200)
+++ head/contrib/xz/src/liblzma/lzma/fastpos_tablegen.c (revision 359201)
@@ -1,56 +1,55 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file       fastpos_tablegen.c
 /// \brief      Generates the lzma_fastpos[] lookup table
 ///
 //  Authors:    Igor Pavlov
 //              Lasse Collin
 //
 //  This file has been put into the public domain.
 //  You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

-#include <sys/types.h>
 #include <inttypes.h>
 #include <stdio.h>
 #include "fastpos.h"


 int
 main(void)
 {
 	uint8_t fastpos[1 << FASTPOS_BITS];

 	const uint8_t fast_slots = 2 * FASTPOS_BITS;
 	uint32_t c = 2;

 	fastpos[0] = 0;
 	fastpos[1] = 1;

 	for (uint8_t slot_fast = 2; slot_fast < fast_slots; ++slot_fast) {
 		const uint32_t k = 1 << ((slot_fast >> 1) - 1);
 		for (uint32_t j = 0; j < k; ++j, ++c)
 			fastpos[c] = slot_fast;
 	}

 	printf("/* This file has been automatically generated "
 			"by fastpos_tablegen.c. */\n\n"
 			"#include \"common.h\"\n"
 			"#include \"fastpos.h\"\n\n"
 			"const uint8_t lzma_fastpos[1 << FASTPOS_BITS] = {");

 	for (size_t i = 0; i < (1 << FASTPOS_BITS); ++i) {
 		if (i % 16 == 0)
 			printf("\n\t");

 		printf("%3u", (unsigned int)(fastpos[i]));

 		if (i != (1 << FASTPOS_BITS) - 1)
 			printf(",");
 	}

 	printf("\n};\n");

 	return 0;
 }
Index: head/contrib/xz/src/liblzma/lzma/lzma2_decoder.c
===================================================================
--- head/contrib/xz/src/liblzma/lzma/lzma2_decoder.c (revision 359200)
+++ head/contrib/xz/src/liblzma/lzma/lzma2_decoder.c (revision 359201)
@@ -1,310 +1,310 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file       lzma2_decoder.c
 /// \brief      LZMA2 decoder
 ///
 //  Authors:    Igor Pavlov
 //              Lasse Collin
 //
 //  This file has been put into the public domain.
 //  You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 #include "lzma2_decoder.h"
 #include "lz_decoder.h"
 #include "lzma_decoder.h"


 typedef struct {
 	enum sequence {
 		SEQ_CONTROL,
 		SEQ_UNCOMPRESSED_1,
 		SEQ_UNCOMPRESSED_2,
 		SEQ_COMPRESSED_0,
 		SEQ_COMPRESSED_1,
 		SEQ_PROPERTIES,
 		SEQ_LZMA,
 		SEQ_COPY,
 	} sequence;

 	/// Sequence after the size fields have been decoded.
 	enum sequence next_sequence;

 	/// LZMA decoder
 	lzma_lz_decoder lzma;

 	/// Uncompressed size of LZMA chunk
 	size_t uncompressed_size;

 	/// Compressed size of the chunk (naturally equals to uncompressed
 	/// size of uncompressed chunk)
 	size_t compressed_size;

 	/// True if properties are needed. This is false before the
 	/// first LZMA chunk.
 	bool need_properties;

 	/// True if dictionary reset is needed. This is false before the
 	/// first chunk (LZMA or uncompressed).
 	bool need_dictionary_reset;

 	lzma_options_lzma options;
 } lzma_lzma2_coder;


 static lzma_ret
 lzma2_decode(void *coder_ptr, lzma_dict *restrict dict,
 		const uint8_t *restrict in, size_t *restrict in_pos,
 		size_t in_size)
 {
 	lzma_lzma2_coder *restrict coder = coder_ptr;

 	// With SEQ_LZMA it is possible that no new input is needed to do
 	// some progress. The rest of the sequences assume that there is
 	// at least one byte of input.
 	while (*in_pos < in_size || coder->sequence == SEQ_LZMA)
 	switch (coder->sequence) {
 	case SEQ_CONTROL: {
 		const uint32_t control = in[*in_pos];
 		++*in_pos;

 		// End marker
 		if (control == 0x00)
 			return LZMA_STREAM_END;

 		if (control >= 0xE0 || control == 1) {
 			// Dictionary reset implies that next LZMA chunk has
 			// to set new properties.
 			coder->need_properties = true;
 			coder->need_dictionary_reset = true;
 		} else if (coder->need_dictionary_reset) {
 			return LZMA_DATA_ERROR;
 		}

 		if (control >= 0x80) {
 			// LZMA chunk. The highest five bits of the
 			// uncompressed size are taken from the control byte.
 			coder->uncompressed_size = (control & 0x1F) << 16;
 			coder->sequence = SEQ_UNCOMPRESSED_1;

 			// See if there are new properties or if we need to
 			// reset the state.
 			if (control >= 0xC0) {
 				// When there are new properties, state reset
 				// is done at SEQ_PROPERTIES.
 				coder->need_properties = false;
 				coder->next_sequence = SEQ_PROPERTIES;

 			} else if (coder->need_properties) {
 				return LZMA_DATA_ERROR;

 			} else {
 				coder->next_sequence = SEQ_LZMA;

 				// If only state reset is wanted with old
 				// properties, do the resetting here for
 				// simplicity.
 				if (control >= 0xA0)
 					coder->lzma.reset(coder->lzma.coder,
 							&coder->options);
 			}
 		} else {
 			// Invalid control values
 			if (control > 2)
 				return LZMA_DATA_ERROR;

 			// It's uncompressed chunk
 			coder->sequence = SEQ_COMPRESSED_0;
 			coder->next_sequence = SEQ_COPY;
 		}

 		if (coder->need_dictionary_reset) {
 			// Finish the dictionary reset and let the caller
 			// flush the dictionary to the actual output buffer.
 			coder->need_dictionary_reset = false;
 			dict_reset(dict);
 			return LZMA_OK;
 		}

 		break;
 	}

 	case SEQ_UNCOMPRESSED_1:
 		coder->uncompressed_size += (uint32_t)(in[(*in_pos)++]) << 8;
 		coder->sequence = SEQ_UNCOMPRESSED_2;
 		break;

 	case SEQ_UNCOMPRESSED_2:
-		coder->uncompressed_size += in[(*in_pos)++] + 1;
+		coder->uncompressed_size += in[(*in_pos)++] + 1U;
 		coder->sequence = SEQ_COMPRESSED_0;
 		coder->lzma.set_uncompressed(coder->lzma.coder,
 				coder->uncompressed_size);
 		break;

 	case SEQ_COMPRESSED_0:
 		coder->compressed_size = (uint32_t)(in[(*in_pos)++]) << 8;
 		coder->sequence = SEQ_COMPRESSED_1;
 		break;

 	case SEQ_COMPRESSED_1:
-		coder->compressed_size += in[(*in_pos)++] + 1;
+		coder->compressed_size += in[(*in_pos)++] + 1U;
 		coder->sequence = coder->next_sequence;
 		break;

 	case SEQ_PROPERTIES:
 		if (lzma_lzma_lclppb_decode(&coder->options, in[(*in_pos)++]))
 			return LZMA_DATA_ERROR;

 		coder->lzma.reset(coder->lzma.coder, &coder->options);

 		coder->sequence = SEQ_LZMA;
 		break;

 	case SEQ_LZMA: {
 		// Store the start offset so that we can update
 		// coder->compressed_size later.
 		const size_t in_start = *in_pos;

 		// Decode from in[] to *dict.
 		const lzma_ret ret = coder->lzma.code(coder->lzma.coder,
 				dict, in, in_pos, in_size);

 		// Validate and update coder->compressed_size.
 		const size_t in_used = *in_pos - in_start;
 		if (in_used > coder->compressed_size)
 			return LZMA_DATA_ERROR;

 		coder->compressed_size -= in_used;

 		// Return if we didn't finish the chunk, or an error occurred.
 		if (ret != LZMA_STREAM_END)
 			return ret;

 		// The LZMA decoder must have consumed the whole chunk now.
 		// We don't need to worry about uncompressed size since it
 		// is checked by the LZMA decoder.
 		if (coder->compressed_size != 0)
 			return LZMA_DATA_ERROR;

 		coder->sequence = SEQ_CONTROL;
 		break;
 	}

 	case SEQ_COPY: {
 		// Copy from input to the dictionary as is.
 		dict_write(dict, in, in_pos, in_size, &coder->compressed_size);
 		if (coder->compressed_size != 0)
 			return LZMA_OK;

 		coder->sequence = SEQ_CONTROL;
 		break;
 	}

 	default:
 		assert(0);
 		return LZMA_PROG_ERROR;
 	}

 	return LZMA_OK;
 }


 static void
 lzma2_decoder_end(void *coder_ptr, const lzma_allocator *allocator)
 {
 	lzma_lzma2_coder *coder = coder_ptr;

 	assert(coder->lzma.end == NULL);

 	lzma_free(coder->lzma.coder, allocator);

 	lzma_free(coder, allocator);

 	return;
 }


 static lzma_ret
 lzma2_decoder_init(lzma_lz_decoder *lz, const lzma_allocator *allocator,
 		const void *opt, lzma_lz_options *lz_options)
 {
 	lzma_lzma2_coder *coder = lz->coder;
 	if (coder == NULL) {
 		coder = lzma_alloc(sizeof(lzma_lzma2_coder), allocator);
 		if (coder == NULL)
 			return LZMA_MEM_ERROR;

 		lz->coder = coder;
 		lz->code = &lzma2_decode;
 		lz->end = &lzma2_decoder_end;

 		coder->lzma = LZMA_LZ_DECODER_INIT;
 	}

 	const lzma_options_lzma *options = opt;

 	coder->sequence = SEQ_CONTROL;
 	coder->need_properties = true;
 	coder->need_dictionary_reset = options->preset_dict == NULL
 			|| options->preset_dict_size == 0;

 	return lzma_lzma_decoder_create(&coder->lzma,
 			allocator, options, lz_options);
 }


 extern lzma_ret
 lzma_lzma2_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator,
 		const lzma_filter_info *filters)
 {
 	// LZMA2 can only be the last filter in the chain. This is enforced
 	// by the raw_decoder initialization.
 	assert(filters[1].init == NULL);

 	return lzma_lz_decoder_init(next, allocator, filters,
 			&lzma2_decoder_init);
 }


 extern uint64_t
 lzma_lzma2_decoder_memusage(const void *options)
 {
 	return sizeof(lzma_lzma2_coder)
 			+ lzma_lzma_decoder_memusage_nocheck(options);
 }


 extern lzma_ret
 lzma_lzma2_props_decode(void **options, const lzma_allocator *allocator,
 		const uint8_t *props, size_t props_size)
 {
 	if (props_size != 1)
 		return LZMA_OPTIONS_ERROR;

 	// Check that reserved bits are unset.
 	if (props[0] & 0xC0)
 		return LZMA_OPTIONS_ERROR;

 	// Decode the dictionary size.
 	if (props[0] > 40)
 		return LZMA_OPTIONS_ERROR;

 	lzma_options_lzma *opt = lzma_alloc(
 			sizeof(lzma_options_lzma), allocator);
 	if (opt == NULL)
 		return LZMA_MEM_ERROR;

 	if (props[0] == 40) {
 		opt->dict_size = UINT32_MAX;
 	} else {
-		opt->dict_size = 2 | (props[0] & 1);
-		opt->dict_size <<= props[0] / 2 + 11;
+		opt->dict_size = 2 | (props[0] & 1U);
+		opt->dict_size <<= props[0] / 2U + 11;
 	}

 	opt->preset_dict = NULL;
 	opt->preset_dict_size = 0;

 	*options = opt;

 	return LZMA_OK;
 }
Index: head/contrib/xz/src/liblzma/lzma/lzma_common.h
===================================================================
--- head/contrib/xz/src/liblzma/lzma/lzma_common.h (revision 359200)
+++ head/contrib/xz/src/liblzma/lzma/lzma_common.h (revision 359201)
@@ -1,224 +1,225 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file       lzma_common.h
 /// \brief      Private definitions common to LZMA encoder and decoder
 ///
 //  Authors:    Igor Pavlov
 //              Lasse Collin
 //
 //  This file has been put into the public domain.
 //  You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 #ifndef LZMA_LZMA_COMMON_H
 #define LZMA_LZMA_COMMON_H

 #include "common.h"
 #include "range_common.h"


 ///////////////////
 // Miscellaneous //
 ///////////////////

 /// Maximum number of position states. A position state is the lowest pos bits
 /// number of bits of the current uncompressed offset. In some places there
 /// are different sets of probabilities for different pos states.
 #define POS_STATES_MAX (1 << LZMA_PB_MAX)


 /// Validates lc, lp, and pb.
 static inline bool
 is_lclppb_valid(const lzma_options_lzma *options)
 {
 	return options->lc <= LZMA_LCLP_MAX && options->lp <= LZMA_LCLP_MAX
 			&& options->lc + options->lp <= LZMA_LCLP_MAX
 			&& options->pb <= LZMA_PB_MAX;
 }


 ///////////
 // State //
 ///////////

 /// This enum is used to track which events have occurred most recently and
 /// in which order. This information is used to predict the next event.
 ///
 /// Events:
 ///  - Literal: One 8-bit byte
 ///  - Match: Repeat a chunk of data at some distance
 ///  - Long repeat: Multi-byte match at a recently seen distance
 ///  - Short repeat: One-byte repeat at a recently seen distance
 ///
 /// The event names are in from STATE_oldest_older_previous. REP means
 /// either short or long repeated match, and NONLIT means any non-literal.
 typedef enum {
 	STATE_LIT_LIT,
 	STATE_MATCH_LIT_LIT,
 	STATE_REP_LIT_LIT,
 	STATE_SHORTREP_LIT_LIT,
 	STATE_MATCH_LIT,
 	STATE_REP_LIT,
 	STATE_SHORTREP_LIT,
 	STATE_LIT_MATCH,
 	STATE_LIT_LONGREP,
 	STATE_LIT_SHORTREP,
 	STATE_NONLIT_MATCH,
 	STATE_NONLIT_REP,
 } lzma_lzma_state;


 /// Total number of states
 #define STATES 12

 /// The lowest 7 states indicate that the previous state was a literal.
 #define LIT_STATES 7


 /// Indicate that the latest state was a literal.
 #define update_literal(state) \
 	state = ((state) <= STATE_SHORTREP_LIT_LIT \
 			? STATE_LIT_LIT \
 			: ((state) <= STATE_LIT_SHORTREP \
 				? (state) - 3 \
 				: (state) - 6))

 /// Indicate that the latest state was a match.
 #define update_match(state) \
 	state = ((state) < LIT_STATES ? STATE_LIT_MATCH : STATE_NONLIT_MATCH)

 /// Indicate that the latest state was a long repeated match.
 #define update_long_rep(state) \
 	state = ((state) < LIT_STATES ? STATE_LIT_LONGREP : STATE_NONLIT_REP)

 /// Indicate that the latest state was a short match.
 #define update_short_rep(state) \
 	state = ((state) < LIT_STATES ? STATE_LIT_SHORTREP : STATE_NONLIT_REP)

 /// Test if the previous state was a literal.
 #define is_literal_state(state) \
 	((state) < LIT_STATES)


 /////////////
 // Literal //
 /////////////

 /// Each literal coder is divided in three sections:
 ///   - 0x001-0x0FF: Without match byte
 ///   - 0x101-0x1FF: With match byte; match bit is 0
 ///   - 0x201-0x2FF: With match byte; match bit is 1
 ///
 /// Match byte is used when the previous LZMA symbol was something else than
 /// a literal (that is, it was some kind of match).
 #define LITERAL_CODER_SIZE 0x300

 /// Maximum number of literal coders
 #define LITERAL_CODERS_MAX (1 << LZMA_LCLP_MAX)

 /// Locate the literal coder for the next literal byte. The choice depends on
 ///   - the lowest literal_pos_bits bits of the position of the current
 ///     byte; and
 ///   - the highest literal_context_bits bits of the previous byte.
 #define literal_subcoder(probs, lc, lp_mask, pos, prev_byte) \
-	((probs)[(((pos) & lp_mask) << lc) + ((prev_byte) >> (8 - lc))])
+	((probs)[(((pos) & (lp_mask)) << (lc)) \
+			+ ((uint32_t)(prev_byte) >> (8U - (lc)))])


 static inline void
 literal_init(probability (*probs)[LITERAL_CODER_SIZE],
 		uint32_t lc, uint32_t lp)
 {
 	assert(lc + lp <= LZMA_LCLP_MAX);

 	const uint32_t coders = 1U << (lc + lp);

 	for (uint32_t i = 0; i < coders; ++i)
 		for (uint32_t j = 0; j < LITERAL_CODER_SIZE; ++j)
 			bit_reset(probs[i][j]);

 	return;
 }


 //////////////////
 // Match length //
 //////////////////

 // Minimum length of a match is two bytes.
 #define MATCH_LEN_MIN 2

 // Match length is encoded with 4, 5, or 10 bits.
 //
 // Length   Bits
 //  2-9      4 = Choice=0 + 3 bits
 // 10-17     5 = Choice=1 + Choice2=0 + 3 bits
 // 18-273   10 = Choice=1 + Choice2=1 + 8 bits
 #define LEN_LOW_BITS 3
 #define LEN_LOW_SYMBOLS (1 << LEN_LOW_BITS)
 #define LEN_MID_BITS 3
 #define LEN_MID_SYMBOLS (1 << LEN_MID_BITS)
 #define LEN_HIGH_BITS 8
 #define LEN_HIGH_SYMBOLS (1 << LEN_HIGH_BITS)
 #define LEN_SYMBOLS (LEN_LOW_SYMBOLS + LEN_MID_SYMBOLS + LEN_HIGH_SYMBOLS)

 // Maximum length of a match is 273 which is a result of the encoding
 // described above.
 #define MATCH_LEN_MAX (MATCH_LEN_MIN + LEN_SYMBOLS - 1)


 ////////////////////
 // Match distance //
 ////////////////////

 // Different sets of probabilities are used for match distances that have very
 // short match length: Lengths of 2, 3, and 4 bytes have a separate set of
 // probabilities for each length. The matches with longer length use a shared
 // set of probabilities.
 #define DIST_STATES 4

 // Macro to get the index of the appropriate probability array.
 #define get_dist_state(len) \
 	((len) < DIST_STATES + MATCH_LEN_MIN \
 		? (len) - MATCH_LEN_MIN \
 		: DIST_STATES - 1)

 // The highest two bits of a match distance (distance slot) are encoded
 // using six bits. See fastpos.h for more explanation.
 #define DIST_SLOT_BITS 6
 #define DIST_SLOTS (1 << DIST_SLOT_BITS)

 // Match distances up to 127 are fully encoded using probabilities. Since
 // the highest two bits (distance slot) are always encoded using six bits,
 // the distances 0-3 don't need any additional bits to encode, since the
 // distance slot itself is the same as the actual distance. DIST_MODEL_START
 // indicates the first distance slot where at least one additional bit is
 // needed.
 #define DIST_MODEL_START 4

 // Match distances greater than 127 are encoded in three pieces:
 //   - distance slot: the highest two bits
 //   - direct bits: 2-26 bits below the highest two bits
 //   - alignment bits: four lowest bits
 //
 // Direct bits don't use any probabilities.
 //
 // The distance slot value of 14 is for distances 128-191 (see the table in
 // fastpos.h to understand why).
 #define DIST_MODEL_END 14

 // Distance slots that indicate a distance <= 127.
 #define FULL_DISTANCES_BITS (DIST_MODEL_END / 2)
 #define FULL_DISTANCES (1 << FULL_DISTANCES_BITS)

 // For match distances greater than 127, only the highest two bits and the
 // lowest four bits (alignment) is encoded using probabilities.
 #define ALIGN_BITS 4
 #define ALIGN_SIZE (1 << ALIGN_BITS)
 #define ALIGN_MASK (ALIGN_SIZE - 1)

 // LZMA remembers the four most recent match distances. Reusing these distances
 // tends to take less space than re-encoding the actual distance value.
 #define REPS 4

 #endif
Index: head/contrib/xz/src/liblzma/lzma/lzma_decoder.c
===================================================================
--- head/contrib/xz/src/liblzma/lzma/lzma_decoder.c (revision 359200)
+++ head/contrib/xz/src/liblzma/lzma/lzma_decoder.c (revision 359201)
@@ -1,1064 +1,1064 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file       lzma_decoder.c
 /// \brief      LZMA decoder
 ///
 //  Authors:    Igor Pavlov
 //              Lasse Collin
 //
 //  This file has been put into the public domain.
 //  You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 #include "lz_decoder.h"
 #include "lzma_common.h"
 #include "lzma_decoder.h"
 #include "range_decoder.h"

 // The macros unroll loops with switch statements.
 // Silence warnings about missing fall-through comments.
 #if TUKLIB_GNUC_REQ(7, 0)
 #	pragma GCC diagnostic ignored "-Wimplicit-fallthrough"
 #endif


 #ifdef HAVE_SMALL

 // Macros for (somewhat) size-optimized code.
 #define seq_4(seq) seq

 #define seq_6(seq) seq

 #define seq_8(seq) seq

 #define seq_len(seq) \
 	seq ## _CHOICE, \
 	seq ## _CHOICE2, \
 	seq ## _BITTREE

 #define len_decode(target, ld, pos_state, seq) \
 do { \
 case seq ## _CHOICE: \
 	rc_if_0(ld.choice, seq ## _CHOICE) { \
 		rc_update_0(ld.choice); \
 		probs = ld.low[pos_state];\
 		limit = LEN_LOW_SYMBOLS; \
 		target = MATCH_LEN_MIN; \
 	} else { \
 		rc_update_1(ld.choice); \
 case seq ## _CHOICE2: \
 		rc_if_0(ld.choice2, seq ## _CHOICE2) { \
 			rc_update_0(ld.choice2); \
 			probs = ld.mid[pos_state]; \
 			limit = LEN_MID_SYMBOLS; \
 			target = MATCH_LEN_MIN + LEN_LOW_SYMBOLS; \
 		} else { \
 			rc_update_1(ld.choice2); \
 			probs = ld.high; \
 			limit = LEN_HIGH_SYMBOLS; \
 			target = MATCH_LEN_MIN + LEN_LOW_SYMBOLS \
 					+ LEN_MID_SYMBOLS; \
 		} \
 	} \
 	symbol = 1; \
 case seq ## _BITTREE: \
 	do { \
 		rc_bit(probs[symbol], , , seq ## _BITTREE); \
 	} while (symbol < limit); \
 	target += symbol - limit; \
 } while (0)

 #else // HAVE_SMALL

 // Unrolled versions
 #define seq_4(seq) \
 	seq ## 0, \
 	seq ## 1, \
 	seq ## 2, \
 	seq ## 3

 #define seq_6(seq) \
 	seq ## 0, \
 	seq ## 1, \
 	seq ## 2, \
 	seq ## 3, \
 	seq ## 4, \
 	seq ## 5

 #define seq_8(seq) \
 	seq ## 0, \
 	seq ## 1, \
 	seq ## 2, \
 	seq ## 3, \
 	seq ## 4, \
 	seq ## 5, \
 	seq ## 6, \
 	seq ## 7

 #define seq_len(seq) \
 	seq ## _CHOICE, \
 	seq ## _LOW0, \
 	seq ## _LOW1, \
 	seq ## _LOW2, \
 	seq ## _CHOICE2, \
 	seq ## _MID0, \
 	seq ## _MID1, \
 	seq ## _MID2, \
 	seq ## _HIGH0, \
 	seq ## _HIGH1, \
 	seq ## _HIGH2, \
 	seq ## _HIGH3, \
 	seq ## _HIGH4, \
 	seq ## _HIGH5, \
 	seq ## _HIGH6, \
 	seq ## _HIGH7

 #define len_decode(target, ld, pos_state, seq) \
 do { \
 	symbol = 1; \
 case seq ## _CHOICE: \
 	rc_if_0(ld.choice, seq ## _CHOICE) { \
 		rc_update_0(ld.choice); \
 		rc_bit_case(ld.low[pos_state][symbol], , , seq ## _LOW0); \
 		rc_bit_case(ld.low[pos_state][symbol], , , seq ## _LOW1); \
 		rc_bit_case(ld.low[pos_state][symbol], , , seq ## _LOW2); \
 		target = symbol - LEN_LOW_SYMBOLS + MATCH_LEN_MIN; \
 	} else { \
 		rc_update_1(ld.choice); \
 case seq ## _CHOICE2: \
 		rc_if_0(ld.choice2, seq ## _CHOICE2) { \
 			rc_update_0(ld.choice2); \
 			rc_bit_case(ld.mid[pos_state][symbol], , , \
 					seq ## _MID0); \
 			rc_bit_case(ld.mid[pos_state][symbol], , , \
 					seq ## _MID1); \
 			rc_bit_case(ld.mid[pos_state][symbol], , , \
 					seq ## _MID2); \
 			target = symbol - LEN_MID_SYMBOLS \
 					+ MATCH_LEN_MIN + LEN_LOW_SYMBOLS; \
 		} else { \
 			rc_update_1(ld.choice2); \
 			rc_bit_case(ld.high[symbol], , , seq ## _HIGH0); \
 			rc_bit_case(ld.high[symbol], , , seq ## _HIGH1); \
 			rc_bit_case(ld.high[symbol], , , seq ## _HIGH2); \
 			rc_bit_case(ld.high[symbol], , , seq ## _HIGH3); \
 			rc_bit_case(ld.high[symbol], , , seq ## _HIGH4); \
 			rc_bit_case(ld.high[symbol], , , seq ## _HIGH5); \
 			rc_bit_case(ld.high[symbol], , , seq ## _HIGH6); \
 			rc_bit_case(ld.high[symbol], , , seq ## _HIGH7); \
 			target = symbol - LEN_HIGH_SYMBOLS \
 					+ MATCH_LEN_MIN \
 					+ LEN_LOW_SYMBOLS + LEN_MID_SYMBOLS; \
 		} \
 	} \
 } while (0)

 #endif // HAVE_SMALL


 /// Length decoder probabilities; see comments in lzma_common.h.
 typedef struct {
 	probability choice;
 	probability choice2;
 	probability low[POS_STATES_MAX][LEN_LOW_SYMBOLS];
 	probability mid[POS_STATES_MAX][LEN_MID_SYMBOLS];
 	probability high[LEN_HIGH_SYMBOLS];
 } lzma_length_decoder;


 typedef struct {
 	///////////////////
 	// Probabilities //
 	///////////////////

 	/// Literals; see comments in lzma_common.h.
 	probability literal[LITERAL_CODERS_MAX][LITERAL_CODER_SIZE];

 	/// If 1, it's a match. Otherwise it's a single 8-bit literal.
 	probability is_match[STATES][POS_STATES_MAX];

 	/// If 1, it's a repeated match. The distance is one of rep0 .. rep3.
 	probability is_rep[STATES];

 	/// If 0, distance of a repeated match is rep0.
 	/// Otherwise check is_rep1.
 	probability is_rep0[STATES];

 	/// If 0, distance of a repeated match is rep1.
 	/// Otherwise check is_rep2.
 	probability is_rep1[STATES];

 	/// If 0, distance of a repeated match is rep2. Otherwise it is rep3.
 	probability is_rep2[STATES];

 	/// If 1, the repeated match has length of one byte. Otherwise
 	/// the length is decoded from rep_len_decoder.
probability is_rep0_long[STATES][POS_STATES_MAX]; /// Probability tree for the highest two bits of the match distance. /// There is a separate probability tree for match lengths of /// 2 (i.e. MATCH_LEN_MIN), 3, 4, and [5, 273]. probability dist_slot[DIST_STATES][DIST_SLOTS]; /// Probability trees for additional bits for match distance when the /// distance is in the range [4, 127]. probability pos_special[FULL_DISTANCES - DIST_MODEL_END]; /// Probability tree for the lowest four bits of a match distance /// that is equal to or greater than 128. probability pos_align[ALIGN_SIZE]; /// Length of a normal match lzma_length_decoder match_len_decoder; /// Length of a repeated match lzma_length_decoder rep_len_decoder; /////////////////// // Decoder state // /////////////////// // Range coder lzma_range_decoder rc; // Types of the most recently seen LZMA symbols lzma_lzma_state state; uint32_t rep0; ///< Distance of the latest match uint32_t rep1; ///< Distance of second latest match uint32_t rep2; ///< Distance of third latest match uint32_t rep3; ///< Distance of fourth latest match uint32_t pos_mask; // (1U << pb) - 1 uint32_t literal_context_bits; uint32_t literal_pos_mask; /// Uncompressed size as bytes, or LZMA_VLI_UNKNOWN if end of /// payload marker is expected. lzma_vli uncompressed_size; //////////////////////////////// // State of incomplete symbol // //////////////////////////////// /// Position where to continue the decoder loop enum { SEQ_NORMALIZE, SEQ_IS_MATCH, seq_8(SEQ_LITERAL), seq_8(SEQ_LITERAL_MATCHED), SEQ_LITERAL_WRITE, SEQ_IS_REP, seq_len(SEQ_MATCH_LEN), seq_6(SEQ_DIST_SLOT), SEQ_DIST_MODEL, SEQ_DIRECT, seq_4(SEQ_ALIGN), SEQ_EOPM, SEQ_IS_REP0, SEQ_SHORTREP, SEQ_IS_REP0_LONG, SEQ_IS_REP1, SEQ_IS_REP2, seq_len(SEQ_REP_LEN), SEQ_COPY, } sequence; /// Base of the current probability tree probability *probs; /// Symbol being decoded. 
This is also used as an index variable in /// bittree decoders: probs[symbol] uint32_t symbol; /// Used as a loop termination condition on bittree decoders and /// direct bits decoder. uint32_t limit; /// Matched literal decoder: 0x100 or 0 to help avoiding branches. /// Bittree reverse decoders: Offset of the next bit: 1 << offset uint32_t offset; /// If decoding a literal: match byte. /// If decoding a match: length of the match. uint32_t len; } lzma_lzma1_decoder; static lzma_ret lzma_decode(void *coder_ptr, lzma_dict *restrict dictptr, const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size) { lzma_lzma1_decoder *restrict coder = coder_ptr; //////////////////// // Initialization // //////////////////// { const lzma_ret ret = rc_read_init( &coder->rc, in, in_pos, in_size); if (ret != LZMA_STREAM_END) return ret; } /////////////// // Variables // /////////////// // Making local copies of often-used variables improves both // speed and readability. lzma_dict dict = *dictptr; const size_t dict_start = dict.pos; // Range decoder rc_to_local(coder->rc, *in_pos); // State uint32_t state = coder->state; uint32_t rep0 = coder->rep0; uint32_t rep1 = coder->rep1; uint32_t rep2 = coder->rep2; uint32_t rep3 = coder->rep3; const uint32_t pos_mask = coder->pos_mask; // These variables are actually needed only if we last time ran // out of input in the middle of the decoder loop. probability *probs = coder->probs; uint32_t symbol = coder->symbol; uint32_t limit = coder->limit; uint32_t offset = coder->offset; uint32_t len = coder->len; const uint32_t literal_pos_mask = coder->literal_pos_mask; const uint32_t literal_context_bits = coder->literal_context_bits; // Temporary variables uint32_t pos_state = dict.pos & pos_mask; lzma_ret ret = LZMA_OK; // If uncompressed size is known, there must be no end of payload // marker. 
const bool no_eopm = coder->uncompressed_size != LZMA_VLI_UNKNOWN; if (no_eopm && coder->uncompressed_size < dict.limit - dict.pos) dict.limit = dict.pos + (size_t)(coder->uncompressed_size); // The main decoder loop. The "switch" is used to restart the decoder at // correct location. Once restarted, the "switch" is no longer used. switch (coder->sequence) while (true) { // Calculate new pos_state. This is skipped on the first loop // since we already calculated it when setting up the local // variables. pos_state = dict.pos & pos_mask; case SEQ_NORMALIZE: case SEQ_IS_MATCH: if (unlikely(no_eopm && dict.pos == dict.limit)) break; rc_if_0(coder->is_match[state][pos_state], SEQ_IS_MATCH) { rc_update_0(coder->is_match[state][pos_state]); // It's a literal i.e. a single 8-bit byte. probs = literal_subcoder(coder->literal, literal_context_bits, literal_pos_mask, dict.pos, dict_get(&dict, 0)); symbol = 1; if (is_literal_state(state)) { // Decode literal without match byte. #ifdef HAVE_SMALL case SEQ_LITERAL: do { rc_bit(probs[symbol], , , SEQ_LITERAL); } while (symbol < (1 << 8)); #else rc_bit_case(probs[symbol], , , SEQ_LITERAL0); rc_bit_case(probs[symbol], , , SEQ_LITERAL1); rc_bit_case(probs[symbol], , , SEQ_LITERAL2); rc_bit_case(probs[symbol], , , SEQ_LITERAL3); rc_bit_case(probs[symbol], , , SEQ_LITERAL4); rc_bit_case(probs[symbol], , , SEQ_LITERAL5); rc_bit_case(probs[symbol], , , SEQ_LITERAL6); rc_bit_case(probs[symbol], , , SEQ_LITERAL7); #endif } else { // Decode literal with match byte. // // We store the byte we compare against // ("match byte") to "len" to minimize the // number of variables we need to store // between decoder calls. - len = dict_get(&dict, rep0) << 1; + len = (uint32_t)(dict_get(&dict, rep0)) << 1; // The usage of "offset" allows omitting some // branches, which should give tiny speed // improvement on some CPUs. "offset" gets // set to zero if match_bit didn't match. 
offset = 0x100; #ifdef HAVE_SMALL case SEQ_LITERAL_MATCHED: do { const uint32_t match_bit = len & offset; const uint32_t subcoder_index = offset + match_bit + symbol; rc_bit(probs[subcoder_index], offset &= ~match_bit, offset &= match_bit, SEQ_LITERAL_MATCHED); // It seems to be faster to do this // here instead of putting it to the // beginning of the loop and then // putting the "case" in the middle // of the loop. len <<= 1; } while (symbol < (1 << 8)); #else // Unroll the loop. uint32_t match_bit; uint32_t subcoder_index; # define d(seq) \ case seq: \ match_bit = len & offset; \ subcoder_index = offset + match_bit + symbol; \ rc_bit(probs[subcoder_index], \ offset &= ~match_bit, \ offset &= match_bit, \ seq) d(SEQ_LITERAL_MATCHED0); len <<= 1; d(SEQ_LITERAL_MATCHED1); len <<= 1; d(SEQ_LITERAL_MATCHED2); len <<= 1; d(SEQ_LITERAL_MATCHED3); len <<= 1; d(SEQ_LITERAL_MATCHED4); len <<= 1; d(SEQ_LITERAL_MATCHED5); len <<= 1; d(SEQ_LITERAL_MATCHED6); len <<= 1; d(SEQ_LITERAL_MATCHED7); # undef d #endif } //update_literal(state); // Use a lookup table to update to literal state, // since compared to other state updates, this would // need two branches. static const lzma_lzma_state next_state[] = { STATE_LIT_LIT, STATE_LIT_LIT, STATE_LIT_LIT, STATE_LIT_LIT, STATE_MATCH_LIT_LIT, STATE_REP_LIT_LIT, STATE_SHORTREP_LIT_LIT, STATE_MATCH_LIT, STATE_REP_LIT, STATE_SHORTREP_LIT, STATE_MATCH_LIT, STATE_REP_LIT }; state = next_state[state]; case SEQ_LITERAL_WRITE: if (unlikely(dict_put(&dict, symbol))) { coder->sequence = SEQ_LITERAL_WRITE; goto out; } continue; } // Instead of a new byte we are going to get a byte range // (distance and length) which will be repeated from our // output history. rc_update_1(coder->is_match[state][pos_state]); case SEQ_IS_REP: rc_if_0(coder->is_rep[state], SEQ_IS_REP) { // Not a repeated match rc_update_0(coder->is_rep[state]); update_match(state); // The latest three match distances are kept in // memory in case there are repeated matches. 
rep3 = rep2; rep2 = rep1; rep1 = rep0; // Decode the length of the match. len_decode(len, coder->match_len_decoder, pos_state, SEQ_MATCH_LEN); // Prepare to decode the highest two bits of the // match distance. probs = coder->dist_slot[get_dist_state(len)]; symbol = 1; #ifdef HAVE_SMALL case SEQ_DIST_SLOT: do { rc_bit(probs[symbol], , , SEQ_DIST_SLOT); } while (symbol < DIST_SLOTS); #else rc_bit_case(probs[symbol], , , SEQ_DIST_SLOT0); rc_bit_case(probs[symbol], , , SEQ_DIST_SLOT1); rc_bit_case(probs[symbol], , , SEQ_DIST_SLOT2); rc_bit_case(probs[symbol], , , SEQ_DIST_SLOT3); rc_bit_case(probs[symbol], , , SEQ_DIST_SLOT4); rc_bit_case(probs[symbol], , , SEQ_DIST_SLOT5); #endif // Get rid of the highest bit that was needed for // indexing of the probability array. symbol -= DIST_SLOTS; assert(symbol <= 63); if (symbol < DIST_MODEL_START) { // Match distances [0, 3] have only two bits. rep0 = symbol; } else { // Decode the lowest [1, 29] bits of // the match distance. limit = (symbol >> 1) - 1; assert(limit >= 1 && limit <= 30); rep0 = 2 + (symbol & 1); if (symbol < DIST_MODEL_END) { // Prepare to decode the low bits for // a distance of [4, 127]. assert(limit <= 5); rep0 <<= limit; assert(rep0 <= 96); // -1 is fine, because we start // decoding at probs[1], not probs[0]. // NOTE: This violates the C standard, // since we are doing pointer // arithmetic past the beginning of // the array. 
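//
// Worked bounds (illustrative): slot 4 gives
// limit = 1 and rep0 = 2 << 1 = 4, so the offset
// is 4 - 4 - 1 = -1. Slot 13 gives limit = 5 and
// rep0 = 3 << 5 = 96, so the offset is
// 96 - 13 - 1 = 82. With five bits to decode,
// "symbol" indexes at most probs[31], i.e.
// pos_special[82 + 31] = pos_special[113], the
// last of the FULL_DISTANCES - DIST_MODEL_END
// = 128 - 14 = 114 elements.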
assert((int32_t)(rep0 - symbol - 1) >= -1); assert((int32_t)(rep0 - symbol - 1) <= 82); probs = coder->pos_special + rep0 - symbol - 1; symbol = 1; offset = 0; case SEQ_DIST_MODEL: #ifdef HAVE_SMALL do { rc_bit(probs[symbol], , - rep0 += 1 << offset, + rep0 += 1U << offset, SEQ_DIST_MODEL); } while (++offset < limit); #else switch (limit) { case 5: assert(offset == 0); rc_bit(probs[symbol], , - rep0 += 1, + rep0 += 1U, SEQ_DIST_MODEL); ++offset; --limit; case 4: rc_bit(probs[symbol], , - rep0 += 1 << offset, + rep0 += 1U << offset, SEQ_DIST_MODEL); ++offset; --limit; case 3: rc_bit(probs[symbol], , - rep0 += 1 << offset, + rep0 += 1U << offset, SEQ_DIST_MODEL); ++offset; --limit; case 2: rc_bit(probs[symbol], , - rep0 += 1 << offset, + rep0 += 1U << offset, SEQ_DIST_MODEL); ++offset; --limit; case 1: // We need "symbol" only for // indexing the probability // array, thus we can use // rc_bit_last() here to omit // the unneeded updating of // "symbol". rc_bit_last(probs[symbol], , - rep0 += 1 << offset, + rep0 += 1U << offset, SEQ_DIST_MODEL); } #endif } else { // The distance is >= 128. Decode the // lower bits without probabilities // except the lowest four bits. assert(symbol >= 14); assert(limit >= 6); limit -= ALIGN_BITS; assert(limit >= 2); case SEQ_DIRECT: // Not worth manual unrolling do { rc_direct(rep0, SEQ_DIRECT); } while (--limit > 0); // Decode the lowest four bits using // probabilities. rep0 <<= ALIGN_BITS; symbol = 1; #ifdef HAVE_SMALL offset = 0; case SEQ_ALIGN: do { rc_bit(coder->pos_align[ symbol], , - rep0 += 1 << offset, + rep0 += 1U << offset, SEQ_ALIGN); } while (++offset < ALIGN_BITS); #else case SEQ_ALIGN0: rc_bit(coder->pos_align[symbol], , rep0 += 1, SEQ_ALIGN0); case SEQ_ALIGN1: rc_bit(coder->pos_align[symbol], , rep0 += 2, SEQ_ALIGN1); case SEQ_ALIGN2: rc_bit(coder->pos_align[symbol], , rep0 += 4, SEQ_ALIGN2); case SEQ_ALIGN3: // Like in SEQ_DIST_MODEL, we don't // need "symbol" for anything else // than indexing the probability array. 
rc_bit_last(coder->pos_align[symbol], , rep0 += 8, SEQ_ALIGN3); #endif if (rep0 == UINT32_MAX) { // End of payload marker was // found. It must not be // present if uncompressed // size is known. if (coder->uncompressed_size != LZMA_VLI_UNKNOWN) { ret = LZMA_DATA_ERROR; goto out; } case SEQ_EOPM: // LZMA1 stream with // end-of-payload marker. rc_normalize(SEQ_EOPM); ret = LZMA_STREAM_END; goto out; } } } // Validate the distance we just decoded. if (unlikely(!dict_is_distance_valid(&dict, rep0))) { ret = LZMA_DATA_ERROR; goto out; } } else { rc_update_1(coder->is_rep[state]); // Repeated match // // The match distance is a value that we have had // earlier. The latest four match distances are // available as rep0, rep1, rep2 and rep3. We will // now decode which of them is the new distance. // // There cannot be a match if we haven't produced // any output, so check that first. if (unlikely(!dict_is_distance_valid(&dict, 0))) { ret = LZMA_DATA_ERROR; goto out; } case SEQ_IS_REP0: rc_if_0(coder->is_rep0[state], SEQ_IS_REP0) { rc_update_0(coder->is_rep0[state]); // The distance is rep0. case SEQ_IS_REP0_LONG: rc_if_0(coder->is_rep0_long[state][pos_state], SEQ_IS_REP0_LONG) { rc_update_0(coder->is_rep0_long[ state][pos_state]); update_short_rep(state); case SEQ_SHORTREP: if (unlikely(dict_put(&dict, dict_get( &dict, rep0)))) { coder->sequence = SEQ_SHORTREP; goto out; } continue; } // Repeating more than one byte at // distance of rep0. rc_update_1(coder->is_rep0_long[ state][pos_state]); } else { rc_update_1(coder->is_rep0[state]); case SEQ_IS_REP1: // The distance is rep1, rep2 or rep3. Once // we find out which one of these three, it // is stored to rep0 and rep1, rep2 and rep3 // are updated accordingly. 
rc_if_0(coder->is_rep1[state], SEQ_IS_REP1) { rc_update_0(coder->is_rep1[state]); const uint32_t distance = rep1; rep1 = rep0; rep0 = distance; } else { rc_update_1(coder->is_rep1[state]); case SEQ_IS_REP2: rc_if_0(coder->is_rep2[state], SEQ_IS_REP2) { rc_update_0(coder->is_rep2[ state]); const uint32_t distance = rep2; rep2 = rep1; rep1 = rep0; rep0 = distance; } else { rc_update_1(coder->is_rep2[ state]); const uint32_t distance = rep3; rep3 = rep2; rep2 = rep1; rep1 = rep0; rep0 = distance; } } } update_long_rep(state); // Decode the length of the repeated match. len_decode(len, coder->rep_len_decoder, pos_state, SEQ_REP_LEN); } ///////////////////////////////// // Repeat from history buffer. // ///////////////////////////////// // The length is always between these limits. There is no way // to trigger the algorithm to set len outside this range. assert(len >= MATCH_LEN_MIN); assert(len <= MATCH_LEN_MAX); case SEQ_COPY: // Repeat len bytes from distance of rep0. if (unlikely(dict_repeat(&dict, rep0, &len))) { coder->sequence = SEQ_COPY; goto out; } } rc_normalize(SEQ_NORMALIZE); coder->sequence = SEQ_IS_MATCH; out: // Save state // NOTE: Must not copy dict.limit. dictptr->pos = dict.pos; dictptr->full = dict.full; rc_from_local(coder->rc, *in_pos); coder->state = state; coder->rep0 = rep0; coder->rep1 = rep1; coder->rep2 = rep2; coder->rep3 = rep3; coder->probs = probs; coder->symbol = symbol; coder->limit = limit; coder->offset = offset; coder->len = len; // Update the remaining amount of uncompressed data if uncompressed // size was known. if (coder->uncompressed_size != LZMA_VLI_UNKNOWN) { coder->uncompressed_size -= dict.pos - dict_start; // Since there cannot be end of payload marker if the // uncompressed size was known, we check here if we // finished decoding. if (coder->uncompressed_size == 0 && ret == LZMA_OK && coder->sequence != SEQ_NORMALIZE) ret = coder->sequence == SEQ_IS_MATCH ? 
LZMA_STREAM_END : LZMA_DATA_ERROR; } // We can do an additional check in the range decoder to catch some // corrupted files. if (ret == LZMA_STREAM_END) { if (!rc_is_finished(coder->rc)) ret = LZMA_DATA_ERROR; // Reset the range decoder so that it is ready to reinitialize // for a new LZMA2 chunk. rc_reset(coder->rc); } return ret; } static void lzma_decoder_uncompressed(void *coder_ptr, lzma_vli uncompressed_size) { lzma_lzma1_decoder *coder = coder_ptr; coder->uncompressed_size = uncompressed_size; } static void lzma_decoder_reset(void *coder_ptr, const void *opt) { lzma_lzma1_decoder *coder = coder_ptr; const lzma_options_lzma *options = opt; // NOTE: We assume that lc/lp/pb are valid since they were // successfully decoded with lzma_lzma_decode_properties(). // Calculate pos_mask. We don't need pos_bits as is for anything. coder->pos_mask = (1U << options->pb) - 1; // Initialize the literal decoder. literal_init(coder->literal, options->lc, options->lp); coder->literal_context_bits = options->lc; coder->literal_pos_mask = (1U << options->lp) - 1; // State coder->state = STATE_LIT_LIT; coder->rep0 = 0; coder->rep1 = 0; coder->rep2 = 0; coder->rep3 = 0; coder->pos_mask = (1U << options->pb) - 1; // Range decoder rc_reset(coder->rc); // Bit and bittree decoders for (uint32_t i = 0; i < STATES; ++i) { for (uint32_t j = 0; j <= coder->pos_mask; ++j) { bit_reset(coder->is_match[i][j]); bit_reset(coder->is_rep0_long[i][j]); } bit_reset(coder->is_rep[i]); bit_reset(coder->is_rep0[i]); bit_reset(coder->is_rep1[i]); bit_reset(coder->is_rep2[i]); } for (uint32_t i = 0; i < DIST_STATES; ++i) bittree_reset(coder->dist_slot[i], DIST_SLOT_BITS); for (uint32_t i = 0; i < FULL_DISTANCES - DIST_MODEL_END; ++i) bit_reset(coder->pos_special[i]); bittree_reset(coder->pos_align, ALIGN_BITS); // Len decoders (also bit/bittree) const uint32_t num_pos_states = 1U << options->pb; bit_reset(coder->match_len_decoder.choice); bit_reset(coder->match_len_decoder.choice2); 
bit_reset(coder->rep_len_decoder.choice); bit_reset(coder->rep_len_decoder.choice2); for (uint32_t pos_state = 0; pos_state < num_pos_states; ++pos_state) { bittree_reset(coder->match_len_decoder.low[pos_state], LEN_LOW_BITS); bittree_reset(coder->match_len_decoder.mid[pos_state], LEN_MID_BITS); bittree_reset(coder->rep_len_decoder.low[pos_state], LEN_LOW_BITS); bittree_reset(coder->rep_len_decoder.mid[pos_state], LEN_MID_BITS); } bittree_reset(coder->match_len_decoder.high, LEN_HIGH_BITS); bittree_reset(coder->rep_len_decoder.high, LEN_HIGH_BITS); coder->sequence = SEQ_IS_MATCH; coder->probs = NULL; coder->symbol = 0; coder->limit = 0; coder->offset = 0; coder->len = 0; return; } extern lzma_ret lzma_lzma_decoder_create(lzma_lz_decoder *lz, const lzma_allocator *allocator, const void *opt, lzma_lz_options *lz_options) { if (lz->coder == NULL) { lz->coder = lzma_alloc(sizeof(lzma_lzma1_decoder), allocator); if (lz->coder == NULL) return LZMA_MEM_ERROR; lz->code = &lzma_decode; lz->reset = &lzma_decoder_reset; lz->set_uncompressed = &lzma_decoder_uncompressed; } // All dictionary sizes are OK here. LZ decoder will take care of // the special cases. const lzma_options_lzma *options = opt; lz_options->dict_size = options->dict_size; lz_options->preset_dict = options->preset_dict; lz_options->preset_dict_size = options->preset_dict_size; return LZMA_OK; } /// Allocate and initialize LZMA decoder. This is used only via LZ /// initialization (lzma_lzma_decoder_init() passes function pointer to /// the LZ initialization). 
static lzma_ret lzma_decoder_init(lzma_lz_decoder *lz, const lzma_allocator *allocator, const void *options, lzma_lz_options *lz_options) { if (!is_lclppb_valid(options)) return LZMA_PROG_ERROR; return_if_error(lzma_lzma_decoder_create( lz, allocator, options, lz_options)); lzma_decoder_reset(lz->coder, options); lzma_decoder_uncompressed(lz->coder, LZMA_VLI_UNKNOWN); return LZMA_OK; } extern lzma_ret lzma_lzma_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { // LZMA can only be the last filter in the chain. This is enforced // by the raw_decoder initialization. assert(filters[1].init == NULL); return lzma_lz_decoder_init(next, allocator, filters, &lzma_decoder_init); } extern bool lzma_lzma_lclppb_decode(lzma_options_lzma *options, uint8_t byte) { if (byte > (4 * 5 + 4) * 9 + 8) return true; // See the file format specification to understand this. options->pb = byte / (9 * 5); byte -= options->pb * 9 * 5; options->lp = byte / 9; options->lc = byte - options->lp * 9; return options->lc + options->lp > LZMA_LCLP_MAX; } extern uint64_t lzma_lzma_decoder_memusage_nocheck(const void *options) { const lzma_options_lzma *const opt = options; return sizeof(lzma_lzma1_decoder) + lzma_lz_decoder_memusage(opt->dict_size); } extern uint64_t lzma_lzma_decoder_memusage(const void *options) { if (!is_lclppb_valid(options)) return UINT64_MAX; return lzma_lzma_decoder_memusage_nocheck(options); } extern lzma_ret lzma_lzma_props_decode(void **options, const lzma_allocator *allocator, const uint8_t *props, size_t props_size) { if (props_size != 5) return LZMA_OPTIONS_ERROR; lzma_options_lzma *opt = lzma_alloc(sizeof(lzma_options_lzma), allocator); if (opt == NULL) return LZMA_MEM_ERROR; if (lzma_lzma_lclppb_decode(opt, props[0])) goto error; // All dictionary sizes are accepted, including zero. LZ decoder // will automatically use a dictionary at least a few KiB even if // a smaller dictionary is requested. 
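//
// Worked example (illustrative; the byte values are chosen only for
// this example): props = { 0x5D, 0x00, 0x00, 0x80, 0x00 }
// - props[0] = 0x5D = 93: pb = 93 / 45 = 2, leaving 93 - 90 = 3;
//   lp = 3 / 9 = 0; lc = 3 - 0 * 9 = 3
// - props[1..4], read as a little endian 32-bit integer, give
//   0x00800000 = 8 MiB as the dictionary size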
- opt->dict_size = unaligned_read32le(props + 1); + opt->dict_size = read32le(props + 1); opt->preset_dict = NULL; opt->preset_dict_size = 0; *options = opt; return LZMA_OK; error: lzma_free(opt, allocator); return LZMA_OPTIONS_ERROR; } Index: head/contrib/xz/src/liblzma/lzma/lzma_encoder.c =================================================================== --- head/contrib/xz/src/liblzma/lzma/lzma_encoder.c (revision 359200) +++ head/contrib/xz/src/liblzma/lzma/lzma_encoder.c (revision 359201) @@ -1,677 +1,677 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file lzma_encoder.c /// \brief LZMA encoder /// // Authors: Igor Pavlov // Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "lzma2_encoder.h" #include "lzma_encoder_private.h" #include "fastpos.h" ///////////// // Literal // ///////////// static inline void literal_matched(lzma_range_encoder *rc, probability *subcoder, uint32_t match_byte, uint32_t symbol) { uint32_t offset = 0x100; symbol += UINT32_C(1) << 8; do { match_byte <<= 1; const uint32_t match_bit = match_byte & offset; const uint32_t subcoder_index = offset + match_bit + (symbol >> 8); const uint32_t bit = (symbol >> 7) & 1; rc_bit(rc, &subcoder[subcoder_index], bit); symbol <<= 1; offset &= ~(match_byte ^ symbol); } while (symbol < (UINT32_C(1) << 16)); } static inline void literal(lzma_lzma1_encoder *coder, lzma_mf *mf, uint32_t position) { // Locate the literal byte to be encoded and the subcoder. const uint8_t cur_byte = mf->buffer[ mf->read_pos - mf->read_ahead]; probability *subcoder = literal_subcoder(coder->literal, coder->literal_context_bits, coder->literal_pos_mask, position, mf->buffer[mf->read_pos - mf->read_ahead - 1]); if (is_literal_state(coder->state)) { // Previous LZMA-symbol was a literal. 
Encode a normal // literal without a match byte. rc_bittree(&coder->rc, subcoder, 8, cur_byte); } else { // Previous LZMA-symbol was a match. Use the last byte of // the match as a "match byte". That is, compare the bits // of the current literal and the match byte. const uint8_t match_byte = mf->buffer[ mf->read_pos - coder->reps[0] - 1 - mf->read_ahead]; literal_matched(&coder->rc, subcoder, match_byte, cur_byte); } update_literal(coder->state); } ////////////////// // Match length // ////////////////// static void length_update_prices(lzma_length_encoder *lc, const uint32_t pos_state) { const uint32_t table_size = lc->table_size; lc->counters[pos_state] = table_size; const uint32_t a0 = rc_bit_0_price(lc->choice); const uint32_t a1 = rc_bit_1_price(lc->choice); const uint32_t b0 = a1 + rc_bit_0_price(lc->choice2); const uint32_t b1 = a1 + rc_bit_1_price(lc->choice2); uint32_t *const prices = lc->prices[pos_state]; uint32_t i; for (i = 0; i < table_size && i < LEN_LOW_SYMBOLS; ++i) prices[i] = a0 + rc_bittree_price(lc->low[pos_state], LEN_LOW_BITS, i); for (; i < table_size && i < LEN_LOW_SYMBOLS + LEN_MID_SYMBOLS; ++i) prices[i] = b0 + rc_bittree_price(lc->mid[pos_state], LEN_MID_BITS, i - LEN_LOW_SYMBOLS); for (; i < table_size; ++i) prices[i] = b1 + rc_bittree_price(lc->high, LEN_HIGH_BITS, i - LEN_LOW_SYMBOLS - LEN_MID_SYMBOLS); return; } static inline void length(lzma_range_encoder *rc, lzma_length_encoder *lc, const uint32_t pos_state, uint32_t len, const bool fast_mode) { assert(len <= MATCH_LEN_MAX); len -= MATCH_LEN_MIN; if (len < LEN_LOW_SYMBOLS) { rc_bit(rc, &lc->choice, 0); rc_bittree(rc, lc->low[pos_state], LEN_LOW_BITS, len); } else { rc_bit(rc, &lc->choice, 1); len -= LEN_LOW_SYMBOLS; if (len < LEN_MID_SYMBOLS) { rc_bit(rc, &lc->choice2, 0); rc_bittree(rc, lc->mid[pos_state], LEN_MID_BITS, len); } else { rc_bit(rc, &lc->choice2, 1); len -= LEN_MID_SYMBOLS; rc_bittree(rc, lc->high, LEN_HIGH_BITS, len); } } // Only getoptimum uses the prices so don't 
update the table when // in fast mode. if (!fast_mode) if (--lc->counters[pos_state] == 0) length_update_prices(lc, pos_state); } /////////// // Match // /////////// static inline void match(lzma_lzma1_encoder *coder, const uint32_t pos_state, const uint32_t distance, const uint32_t len) { update_match(coder->state); length(&coder->rc, &coder->match_len_encoder, pos_state, len, coder->fast_mode); const uint32_t dist_slot = get_dist_slot(distance); const uint32_t dist_state = get_dist_state(len); rc_bittree(&coder->rc, coder->dist_slot[dist_state], DIST_SLOT_BITS, dist_slot); if (dist_slot >= DIST_MODEL_START) { const uint32_t footer_bits = (dist_slot >> 1) - 1; const uint32_t base = (2 | (dist_slot & 1)) << footer_bits; const uint32_t dist_reduced = distance - base; if (dist_slot < DIST_MODEL_END) { // Careful here: base - dist_slot - 1 can be -1, but // rc_bittree_reverse starts at probs[1], not probs[0]. rc_bittree_reverse(&coder->rc, coder->dist_special + base - dist_slot - 1, footer_bits, dist_reduced); } else { rc_direct(&coder->rc, dist_reduced >> ALIGN_BITS, footer_bits - ALIGN_BITS); rc_bittree_reverse( &coder->rc, coder->dist_align, ALIGN_BITS, dist_reduced & ALIGN_MASK); ++coder->align_price_count; } } coder->reps[3] = coder->reps[2]; coder->reps[2] = coder->reps[1]; coder->reps[1] = coder->reps[0]; coder->reps[0] = distance; ++coder->match_price_count; } //////////////////// // Repeated match // //////////////////// static inline void rep_match(lzma_lzma1_encoder *coder, const uint32_t pos_state, const uint32_t rep, const uint32_t len) { if (rep == 0) { rc_bit(&coder->rc, &coder->is_rep0[coder->state], 0); rc_bit(&coder->rc, &coder->is_rep0_long[coder->state][pos_state], len != 1); } else { const uint32_t distance = coder->reps[rep]; rc_bit(&coder->rc, &coder->is_rep0[coder->state], 1); if (rep == 1) { rc_bit(&coder->rc, &coder->is_rep1[coder->state], 0); } else { rc_bit(&coder->rc, &coder->is_rep1[coder->state], 1); rc_bit(&coder->rc, 
&coder->is_rep2[coder->state], rep - 2); if (rep == 3) coder->reps[3] = coder->reps[2]; coder->reps[2] = coder->reps[1]; } coder->reps[1] = coder->reps[0]; coder->reps[0] = distance; } if (len == 1) { update_short_rep(coder->state); } else { length(&coder->rc, &coder->rep_len_encoder, pos_state, len, coder->fast_mode); update_long_rep(coder->state); } } ////////// // Main // ////////// static void encode_symbol(lzma_lzma1_encoder *coder, lzma_mf *mf, uint32_t back, uint32_t len, uint32_t position) { const uint32_t pos_state = position & coder->pos_mask; if (back == UINT32_MAX) { // Literal i.e. eight-bit byte assert(len == 1); rc_bit(&coder->rc, &coder->is_match[coder->state][pos_state], 0); literal(coder, mf, position); } else { // Some type of match rc_bit(&coder->rc, &coder->is_match[coder->state][pos_state], 1); if (back < REPS) { // It's a repeated match i.e. the same distance // has been used earlier. rc_bit(&coder->rc, &coder->is_rep[coder->state], 1); rep_match(coder, pos_state, back, len); } else { // Normal match rc_bit(&coder->rc, &coder->is_rep[coder->state], 0); match(coder, pos_state, back - REPS, len); } } assert(mf->read_ahead >= len); mf->read_ahead -= len; } static bool encode_init(lzma_lzma1_encoder *coder, lzma_mf *mf) { assert(mf_position(mf) == 0); if (mf->read_pos == mf->read_limit) { if (mf->action == LZMA_RUN) return false; // We cannot do anything. // We are finishing (we cannot get here when flushing). assert(mf->write_pos == mf->read_pos); assert(mf->action == LZMA_FINISH); } else { // Do the actual initialization. The first LZMA symbol must // always be a literal. mf_skip(mf, 1); mf->read_ahead = 0; rc_bit(&coder->rc, &coder->is_match[0][0], 0); rc_bittree(&coder->rc, coder->literal[0], 8, mf->buffer[0]); } // Initialization is done (except if empty file). 
coder->is_initialized = true; return true; } static void encode_eopm(lzma_lzma1_encoder *coder, uint32_t position) { const uint32_t pos_state = position & coder->pos_mask; rc_bit(&coder->rc, &coder->is_match[coder->state][pos_state], 1); rc_bit(&coder->rc, &coder->is_rep[coder->state], 0); match(coder, pos_state, UINT32_MAX, MATCH_LEN_MIN); } /// Number of bytes that a single encoding loop in lzma_lzma_encode() can /// consume from the dictionary. This limit comes from lzma_lzma_optimum() /// and may need to be updated if that function is significantly modified. #define LOOP_INPUT_MAX (OPTS + 1) extern lzma_ret lzma_lzma_encode(lzma_lzma1_encoder *restrict coder, lzma_mf *restrict mf, uint8_t *restrict out, size_t *restrict out_pos, size_t out_size, uint32_t limit) { // Initialize the stream if no data has been encoded yet. if (!coder->is_initialized && !encode_init(coder, mf)) return LZMA_OK; // Get the lowest bits of the uncompressed offset from the LZ layer. uint32_t position = mf_position(mf); while (true) { // Encode pending bits, if any. Calling this before encoding // the next symbol is needed only with plain LZMA, since // LZMA2 always provides big enough buffer to flush // everything out from the range encoder. For the same reason, // rc_encode() never returns true when this function is used // as part of LZMA2 encoder. if (rc_encode(&coder->rc, out, out_pos, out_size)) { assert(limit == UINT32_MAX); return LZMA_OK; } // With LZMA2 we need to take care that compressed size of // a chunk doesn't get too big. // FIXME? Check if this could be improved. if (limit != UINT32_MAX && (mf->read_pos - mf->read_ahead >= limit || *out_pos + rc_pending(&coder->rc) >= LZMA2_CHUNK_MAX - LOOP_INPUT_MAX)) break; // Check that there is some input to process. if (mf->read_pos >= mf->read_limit) { if (mf->action == LZMA_RUN) return LZMA_OK; if (mf->read_ahead == 0) break; } // Get optimal match (repeat position and length). 
// Value ranges for pos: // - [0, REPS): repeated match // - [REPS, UINT32_MAX): // match at (pos - REPS) // - UINT32_MAX: not a match but a literal // Value ranges for len: // - [MATCH_LEN_MIN, MATCH_LEN_MAX] uint32_t len; uint32_t back; if (coder->fast_mode) lzma_lzma_optimum_fast(coder, mf, &back, &len); else lzma_lzma_optimum_normal( coder, mf, &back, &len, position); encode_symbol(coder, mf, back, len, position); position += len; } if (!coder->is_flushed) { coder->is_flushed = true; // We don't support encoding plain LZMA streams without EOPM, // and LZMA2 doesn't use EOPM at LZMA level. if (limit == UINT32_MAX) encode_eopm(coder, position); // Flush the remaining bytes from the range encoder. rc_flush(&coder->rc); // Copy the remaining bytes to the output buffer. If there // isn't enough output space, we will copy out the remaining // bytes on the next call to this function by using // the rc_encode() call in the encoding loop above. if (rc_encode(&coder->rc, out, out_pos, out_size)) { assert(limit == UINT32_MAX); return LZMA_OK; } } // Make it ready for the next LZMA2 chunk. coder->is_flushed = false; return LZMA_STREAM_END; } static lzma_ret lzma_encode(void *coder, lzma_mf *restrict mf, uint8_t *restrict out, size_t *restrict out_pos, size_t out_size) { // Plain LZMA has no support for sync-flushing. if (unlikely(mf->action == LZMA_SYNC_FLUSH)) return LZMA_OPTIONS_ERROR; return lzma_lzma_encode(coder, mf, out, out_pos, out_size, UINT32_MAX); } //////////////////// // Initialization // //////////////////// static bool is_options_valid(const lzma_options_lzma *options) { // Validate some of the options. LZ encoder validates nice_len too // but we need a valid value here earlier. 
return is_lclppb_valid(options) && options->nice_len >= MATCH_LEN_MIN && options->nice_len <= MATCH_LEN_MAX && (options->mode == LZMA_MODE_FAST || options->mode == LZMA_MODE_NORMAL); } static void set_lz_options(lzma_lz_options *lz_options, const lzma_options_lzma *options) { // LZ encoder initialization does the validation for these so we // don't need to validate here. lz_options->before_size = OPTS; lz_options->dict_size = options->dict_size; lz_options->after_size = LOOP_INPUT_MAX; lz_options->match_len_max = MATCH_LEN_MAX; lz_options->nice_len = options->nice_len; lz_options->match_finder = options->mf; lz_options->depth = options->depth; lz_options->preset_dict = options->preset_dict; lz_options->preset_dict_size = options->preset_dict_size; return; } static void length_encoder_reset(lzma_length_encoder *lencoder, const uint32_t num_pos_states, const bool fast_mode) { bit_reset(lencoder->choice); bit_reset(lencoder->choice2); for (size_t pos_state = 0; pos_state < num_pos_states; ++pos_state) { bittree_reset(lencoder->low[pos_state], LEN_LOW_BITS); bittree_reset(lencoder->mid[pos_state], LEN_MID_BITS); } bittree_reset(lencoder->high, LEN_HIGH_BITS); if (!fast_mode) for (uint32_t pos_state = 0; pos_state < num_pos_states; ++pos_state) length_update_prices(lencoder, pos_state); return; } extern lzma_ret lzma_lzma_encoder_reset(lzma_lzma1_encoder *coder, const lzma_options_lzma *options) { if (!is_options_valid(options)) return LZMA_OPTIONS_ERROR; coder->pos_mask = (1U << options->pb) - 1; coder->literal_context_bits = options->lc; coder->literal_pos_mask = (1U << options->lp) - 1; // Range coder rc_reset(&coder->rc); // State coder->state = STATE_LIT_LIT; for (size_t i = 0; i < REPS; ++i) coder->reps[i] = 0; literal_init(coder->literal, options->lc, options->lp); // Bit encoders for (size_t i = 0; i < STATES; ++i) { for (size_t j = 0; j <= coder->pos_mask; ++j) { bit_reset(coder->is_match[i][j]); bit_reset(coder->is_rep0_long[i][j]); } 
bit_reset(coder->is_rep[i]); bit_reset(coder->is_rep0[i]); bit_reset(coder->is_rep1[i]); bit_reset(coder->is_rep2[i]); } for (size_t i = 0; i < FULL_DISTANCES - DIST_MODEL_END; ++i) bit_reset(coder->dist_special[i]); // Bit tree encoders for (size_t i = 0; i < DIST_STATES; ++i) bittree_reset(coder->dist_slot[i], DIST_SLOT_BITS); bittree_reset(coder->dist_align, ALIGN_BITS); // Length encoders length_encoder_reset(&coder->match_len_encoder, 1U << options->pb, coder->fast_mode); length_encoder_reset(&coder->rep_len_encoder, 1U << options->pb, coder->fast_mode); // Price counts are incremented every time appropriate probabilities // are changed. Price counts are set to zero when the price tables // are updated, which is done when the appropriate price counts have // big enough value, and lzma_mf.read_ahead == 0 which happens at // least every OPTS (a few thousand) possible price count increments. // // By resetting price counts to UINT32_MAX / 2, we make sure that the // price tables will be initialized before they will be used (since // the value is definitely big enough), and that it is OK to increment // price counts without risk of integer overflow (since UINT32_MAX / 2 // is small enough). The current code doesn't increment price counts // before initializing price tables, but it may be done in the future if // we add support for saving the state between LZMA2 chunks. coder->match_price_count = UINT32_MAX / 2; coder->align_price_count = UINT32_MAX / 2; coder->opts_end_index = 0; coder->opts_current_index = 0; return LZMA_OK; } extern lzma_ret lzma_lzma_encoder_create(void **coder_ptr, const lzma_allocator *allocator, const lzma_options_lzma *options, lzma_lz_options *lz_options) { // Allocate lzma_lzma1_encoder if it wasn't already allocated. if (*coder_ptr == NULL) { *coder_ptr = lzma_alloc(sizeof(lzma_lzma1_encoder), allocator); if (*coder_ptr == NULL) return LZMA_MEM_ERROR; } lzma_lzma1_encoder *coder = *coder_ptr; // Set compression mode.
We haven't validated the options yet, // but it's OK here, since nothing bad happens with invalid // options in the code below, and they will get rejected by the // lzma_lzma_encoder_reset() call at the end of this function. switch (options->mode) { case LZMA_MODE_FAST: coder->fast_mode = true; break; case LZMA_MODE_NORMAL: { coder->fast_mode = false; // Set dist_table_size. // Round the dictionary size up to next 2^n. uint32_t log_size = 0; while ((UINT32_C(1) << log_size) < options->dict_size) ++log_size; coder->dist_table_size = log_size * 2; // Length encoders' price table size coder->match_len_encoder.table_size = options->nice_len + 1 - MATCH_LEN_MIN; coder->rep_len_encoder.table_size = options->nice_len + 1 - MATCH_LEN_MIN; break; } default: return LZMA_OPTIONS_ERROR; } // We don't need to write the first byte as literal if there is // a non-empty preset dictionary. encode_init() wouldn't even work // if there is a non-empty preset dictionary, because encode_init() // assumes that position is zero and previous byte is also zero.
coder->is_initialized = options->preset_dict != NULL && options->preset_dict_size > 0; coder->is_flushed = false; set_lz_options(lz_options, options); return lzma_lzma_encoder_reset(coder, options); } static lzma_ret lzma_encoder_init(lzma_lz_encoder *lz, const lzma_allocator *allocator, const void *options, lzma_lz_options *lz_options) { lz->code = &lzma_encode; return lzma_lzma_encoder_create( &lz->coder, allocator, options, lz_options); } extern lzma_ret lzma_lzma_encoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { return lzma_lz_encoder_init( next, allocator, filters, &lzma_encoder_init); } extern uint64_t lzma_lzma_encoder_memusage(const void *options) { if (!is_options_valid(options)) return UINT64_MAX; lzma_lz_options lz_options; set_lz_options(&lz_options, options); const uint64_t lz_memusage = lzma_lz_encoder_memusage(&lz_options); if (lz_memusage == UINT64_MAX) return UINT64_MAX; return (uint64_t)(sizeof(lzma_lzma1_encoder)) + lz_memusage; } extern bool lzma_lzma_lclppb_encode(const lzma_options_lzma *options, uint8_t *byte) { if (!is_lclppb_valid(options)) return true; *byte = (options->pb * 5 + options->lp) * 9 + options->lc; assert(*byte <= (4 * 5 + 4) * 9 + 8); return false; } #ifdef HAVE_ENCODER_LZMA1 extern lzma_ret lzma_lzma_props_encode(const void *options, uint8_t *out) { const lzma_options_lzma *const opt = options; if (lzma_lzma_lclppb_encode(opt, out)) return LZMA_PROG_ERROR; - unaligned_write32le(out + 1, opt->dict_size); + write32le(out + 1, opt->dict_size); return LZMA_OK; } #endif extern LZMA_API(lzma_bool) lzma_mode_is_supported(lzma_mode mode) { return mode == LZMA_MODE_FAST || mode == LZMA_MODE_NORMAL; } Index: head/contrib/xz/src/liblzma/lzma/lzma_encoder_optimum_normal.c =================================================================== --- head/contrib/xz/src/liblzma/lzma/lzma_encoder_optimum_normal.c (revision 359200) +++ 
head/contrib/xz/src/liblzma/lzma/lzma_encoder_optimum_normal.c (revision 359201) @@ -1,855 +1,859 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file lzma_encoder_optimum_normal.c // // Author: Igor Pavlov // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "lzma_encoder_private.h" #include "fastpos.h" #include "memcmplen.h" //////////// // Prices // //////////// static uint32_t get_literal_price(const lzma_lzma1_encoder *const coder, const uint32_t pos, const uint32_t prev_byte, const bool match_mode, uint32_t match_byte, uint32_t symbol) { const probability *const subcoder = literal_subcoder(coder->literal, coder->literal_context_bits, coder->literal_pos_mask, pos, prev_byte); uint32_t price = 0; if (!match_mode) { price = rc_bittree_price(subcoder, 8, symbol); } else { uint32_t offset = 0x100; symbol += UINT32_C(1) << 8; do { match_byte <<= 1; const uint32_t match_bit = match_byte & offset; const uint32_t subcoder_index = offset + match_bit + (symbol >> 8); const uint32_t bit = (symbol >> 7) & 1; price += rc_bit_price(subcoder[subcoder_index], bit); symbol <<= 1; offset &= ~(match_byte ^ symbol); } while (symbol < (UINT32_C(1) << 16)); } return price; } static inline uint32_t get_len_price(const lzma_length_encoder *const lencoder, const uint32_t len, const uint32_t pos_state) { // NOTE: Unlike the other price tables, length prices are updated // in lzma_encoder.c return lencoder->prices[pos_state][len - MATCH_LEN_MIN]; } static inline uint32_t get_short_rep_price(const lzma_lzma1_encoder *const coder, const lzma_lzma_state state, const uint32_t pos_state) { return rc_bit_0_price(coder->is_rep0[state]) + rc_bit_0_price(coder->is_rep0_long[state][pos_state]); } static inline uint32_t get_pure_rep_price(const lzma_lzma1_encoder *const coder, const uint32_t rep_index, const 
lzma_lzma_state state, uint32_t pos_state) { uint32_t price; if (rep_index == 0) { price = rc_bit_0_price(coder->is_rep0[state]); price += rc_bit_1_price(coder->is_rep0_long[state][pos_state]); } else { price = rc_bit_1_price(coder->is_rep0[state]); if (rep_index == 1) { price += rc_bit_0_price(coder->is_rep1[state]); } else { price += rc_bit_1_price(coder->is_rep1[state]); price += rc_bit_price(coder->is_rep2[state], rep_index - 2); } } return price; } static inline uint32_t get_rep_price(const lzma_lzma1_encoder *const coder, const uint32_t rep_index, const uint32_t len, const lzma_lzma_state state, const uint32_t pos_state) { return get_len_price(&coder->rep_len_encoder, len, pos_state) + get_pure_rep_price(coder, rep_index, state, pos_state); } static inline uint32_t get_dist_len_price(const lzma_lzma1_encoder *const coder, const uint32_t dist, const uint32_t len, const uint32_t pos_state) { const uint32_t dist_state = get_dist_state(len); uint32_t price; if (dist < FULL_DISTANCES) { price = coder->dist_prices[dist_state][dist]; } else { const uint32_t dist_slot = get_dist_slot_2(dist); price = coder->dist_slot_prices[dist_state][dist_slot] + coder->align_prices[dist & ALIGN_MASK]; } price += get_len_price(&coder->match_len_encoder, len, pos_state); return price; } static void fill_dist_prices(lzma_lzma1_encoder *coder) { for (uint32_t dist_state = 0; dist_state < DIST_STATES; ++dist_state) { uint32_t *const dist_slot_prices = coder->dist_slot_prices[dist_state]; // Price to encode the dist_slot. for (uint32_t dist_slot = 0; dist_slot < coder->dist_table_size; ++dist_slot) dist_slot_prices[dist_slot] = rc_bittree_price( coder->dist_slot[dist_state], DIST_SLOT_BITS, dist_slot); // For matches with distance >= FULL_DISTANCES, add the price // of the direct bits part of the match distance. (Align bits // are handled by fill_align_prices()). 
for (uint32_t dist_slot = DIST_MODEL_END; dist_slot < coder->dist_table_size; ++dist_slot) dist_slot_prices[dist_slot] += rc_direct_price( ((dist_slot >> 1) - 1) - ALIGN_BITS); // Distances in the range [0, 3] are fully encoded with // dist_slot, so they are used for coder->dist_prices // as is. for (uint32_t i = 0; i < DIST_MODEL_START; ++i) coder->dist_prices[dist_state][i] = dist_slot_prices[i]; } // Distances in the range [4, 127] depend on dist_slot and // dist_special. We do this in a loop separate from the above // loop to avoid redundant calls to get_dist_slot(). for (uint32_t i = DIST_MODEL_START; i < FULL_DISTANCES; ++i) { const uint32_t dist_slot = get_dist_slot(i); const uint32_t footer_bits = ((dist_slot >> 1) - 1); const uint32_t base = (2 | (dist_slot & 1)) << footer_bits; const uint32_t price = rc_bittree_reverse_price( coder->dist_special + base - dist_slot - 1, footer_bits, i - base); for (uint32_t dist_state = 0; dist_state < DIST_STATES; ++dist_state) coder->dist_prices[dist_state][i] = price + coder->dist_slot_prices[ dist_state][dist_slot]; } coder->match_price_count = 0; return; } static void fill_align_prices(lzma_lzma1_encoder *coder) { for (uint32_t i = 0; i < ALIGN_SIZE; ++i) coder->align_prices[i] = rc_bittree_reverse_price( coder->dist_align, ALIGN_BITS, i); coder->align_price_count = 0; return; } ///////////// // Optimal // ///////////// static inline void make_literal(lzma_optimal *optimal) { optimal->back_prev = UINT32_MAX; optimal->prev_1_is_literal = false; } static inline void make_short_rep(lzma_optimal *optimal) { optimal->back_prev = 0; optimal->prev_1_is_literal = false; } #define is_short_rep(optimal) \ ((optimal).back_prev == 0) static void backward(lzma_lzma1_encoder *restrict coder, uint32_t *restrict len_res, uint32_t *restrict back_res, uint32_t cur) { coder->opts_end_index = cur; uint32_t pos_mem = coder->opts[cur].pos_prev; uint32_t back_mem = coder->opts[cur].back_prev; do { if (coder->opts[cur].prev_1_is_literal) { 
make_literal(&coder->opts[pos_mem]); coder->opts[pos_mem].pos_prev = pos_mem - 1; if (coder->opts[cur].prev_2) { coder->opts[pos_mem - 1].prev_1_is_literal = false; coder->opts[pos_mem - 1].pos_prev = coder->opts[cur].pos_prev_2; coder->opts[pos_mem - 1].back_prev = coder->opts[cur].back_prev_2; } } const uint32_t pos_prev = pos_mem; const uint32_t back_cur = back_mem; back_mem = coder->opts[pos_prev].back_prev; pos_mem = coder->opts[pos_prev].pos_prev; coder->opts[pos_prev].back_prev = back_cur; coder->opts[pos_prev].pos_prev = cur; cur = pos_prev; } while (cur != 0); coder->opts_current_index = coder->opts[0].pos_prev; *len_res = coder->opts[0].pos_prev; *back_res = coder->opts[0].back_prev; return; } ////////// // Main // ////////// static inline uint32_t helper1(lzma_lzma1_encoder *restrict coder, lzma_mf *restrict mf, uint32_t *restrict back_res, uint32_t *restrict len_res, uint32_t position) { const uint32_t nice_len = mf->nice_len; uint32_t len_main; uint32_t matches_count; if (mf->read_ahead == 0) { len_main = mf_find(mf, &matches_count, coder->matches); } else { assert(mf->read_ahead == 1); len_main = coder->longest_match_length; matches_count = coder->matches_count; } const uint32_t buf_avail = my_min(mf_avail(mf) + 1, MATCH_LEN_MAX); if (buf_avail < 2) { *back_res = UINT32_MAX; *len_res = 1; return UINT32_MAX; } const uint8_t *const buf = mf_ptr(mf) - 1; uint32_t rep_lens[REPS]; uint32_t rep_max_index = 0; for (uint32_t i = 0; i < REPS; ++i) { const uint8_t *const buf_back = buf - coder->reps[i] - 1; if (not_equal_16(buf, buf_back)) { rep_lens[i] = 0; continue; } rep_lens[i] = lzma_memcmplen(buf, buf_back, 2, buf_avail); if (rep_lens[i] > rep_lens[rep_max_index]) rep_max_index = i; } if (rep_lens[rep_max_index] >= nice_len) { *back_res = rep_max_index; *len_res = rep_lens[rep_max_index]; mf_skip(mf, *len_res - 1); return UINT32_MAX; } if (len_main >= nice_len) { *back_res = coder->matches[matches_count - 1].dist + REPS; *len_res = len_main; mf_skip(mf, 
len_main - 1); return UINT32_MAX; } const uint8_t current_byte = *buf; const uint8_t match_byte = *(buf - coder->reps[0] - 1); if (len_main < 2 && current_byte != match_byte && rep_lens[rep_max_index] < 2) { *back_res = UINT32_MAX; *len_res = 1; return UINT32_MAX; } coder->opts[0].state = coder->state; const uint32_t pos_state = position & coder->pos_mask; coder->opts[1].price = rc_bit_0_price( coder->is_match[coder->state][pos_state]) + get_literal_price(coder, position, buf[-1], !is_literal_state(coder->state), match_byte, current_byte); make_literal(&coder->opts[1]); const uint32_t match_price = rc_bit_1_price( coder->is_match[coder->state][pos_state]); const uint32_t rep_match_price = match_price + rc_bit_1_price(coder->is_rep[coder->state]); if (match_byte == current_byte) { const uint32_t short_rep_price = rep_match_price + get_short_rep_price( coder, coder->state, pos_state); if (short_rep_price < coder->opts[1].price) { coder->opts[1].price = short_rep_price; make_short_rep(&coder->opts[1]); } } const uint32_t len_end = my_max(len_main, rep_lens[rep_max_index]); if (len_end < 2) { *back_res = coder->opts[1].back_prev; *len_res = 1; return UINT32_MAX; } coder->opts[1].pos_prev = 0; for (uint32_t i = 0; i < REPS; ++i) coder->opts[0].backs[i] = coder->reps[i]; uint32_t len = len_end; do { coder->opts[len].price = RC_INFINITY_PRICE; } while (--len >= 2); for (uint32_t i = 0; i < REPS; ++i) { uint32_t rep_len = rep_lens[i]; if (rep_len < 2) continue; const uint32_t price = rep_match_price + get_pure_rep_price( coder, i, coder->state, pos_state); do { const uint32_t cur_and_len_price = price + get_len_price( &coder->rep_len_encoder, rep_len, pos_state); if (cur_and_len_price < coder->opts[rep_len].price) { coder->opts[rep_len].price = cur_and_len_price; coder->opts[rep_len].pos_prev = 0; coder->opts[rep_len].back_prev = i; coder->opts[rep_len].prev_1_is_literal = false; } } while (--rep_len >= 2); } const uint32_t normal_match_price = match_price + 
rc_bit_0_price(coder->is_rep[coder->state]); len = rep_lens[0] >= 2 ? rep_lens[0] + 1 : 2; if (len <= len_main) { uint32_t i = 0; while (len > coder->matches[i].len) ++i; for(; ; ++len) { const uint32_t dist = coder->matches[i].dist; const uint32_t cur_and_len_price = normal_match_price + get_dist_len_price(coder, dist, len, pos_state); if (cur_and_len_price < coder->opts[len].price) { coder->opts[len].price = cur_and_len_price; coder->opts[len].pos_prev = 0; coder->opts[len].back_prev = dist + REPS; coder->opts[len].prev_1_is_literal = false; } if (len == coder->matches[i].len) if (++i == matches_count) break; } } return len_end; } static inline uint32_t helper2(lzma_lzma1_encoder *coder, uint32_t *reps, const uint8_t *buf, uint32_t len_end, uint32_t position, const uint32_t cur, const uint32_t nice_len, const uint32_t buf_avail_full) { uint32_t matches_count = coder->matches_count; uint32_t new_len = coder->longest_match_length; uint32_t pos_prev = coder->opts[cur].pos_prev; lzma_lzma_state state; if (coder->opts[cur].prev_1_is_literal) { --pos_prev; if (coder->opts[cur].prev_2) { state = coder->opts[coder->opts[cur].pos_prev_2].state; if (coder->opts[cur].back_prev_2 < REPS) update_long_rep(state); else update_match(state); } else { state = coder->opts[pos_prev].state; } update_literal(state); } else { state = coder->opts[pos_prev].state; } if (pos_prev == cur - 1) { if (is_short_rep(coder->opts[cur])) update_short_rep(state); else update_literal(state); } else { uint32_t pos; if (coder->opts[cur].prev_1_is_literal && coder->opts[cur].prev_2) { pos_prev = coder->opts[cur].pos_prev_2; pos = coder->opts[cur].back_prev_2; update_long_rep(state); } else { pos = coder->opts[cur].back_prev; if (pos < REPS) update_long_rep(state); else update_match(state); } if (pos < REPS) { reps[0] = coder->opts[pos_prev].backs[pos]; uint32_t i; for (i = 1; i <= pos; ++i) reps[i] = coder->opts[pos_prev].backs[i - 1]; for (; i < REPS; ++i) reps[i] = coder->opts[pos_prev].backs[i]; } 
else { reps[0] = pos - REPS; for (uint32_t i = 1; i < REPS; ++i) reps[i] = coder->opts[pos_prev].backs[i - 1]; } } coder->opts[cur].state = state; for (uint32_t i = 0; i < REPS; ++i) coder->opts[cur].backs[i] = reps[i]; const uint32_t cur_price = coder->opts[cur].price; const uint8_t current_byte = *buf; const uint8_t match_byte = *(buf - reps[0] - 1); const uint32_t pos_state = position & coder->pos_mask; const uint32_t cur_and_1_price = cur_price + rc_bit_0_price(coder->is_match[state][pos_state]) + get_literal_price(coder, position, buf[-1], !is_literal_state(state), match_byte, current_byte); bool next_is_literal = false; if (cur_and_1_price < coder->opts[cur + 1].price) { coder->opts[cur + 1].price = cur_and_1_price; coder->opts[cur + 1].pos_prev = cur; make_literal(&coder->opts[cur + 1]); next_is_literal = true; } const uint32_t match_price = cur_price + rc_bit_1_price(coder->is_match[state][pos_state]); const uint32_t rep_match_price = match_price + rc_bit_1_price(coder->is_rep[state]); if (match_byte == current_byte && !(coder->opts[cur + 1].pos_prev < cur && coder->opts[cur + 1].back_prev == 0)) { const uint32_t short_rep_price = rep_match_price + get_short_rep_price(coder, state, pos_state); if (short_rep_price <= coder->opts[cur + 1].price) { coder->opts[cur + 1].price = short_rep_price; coder->opts[cur + 1].pos_prev = cur; make_short_rep(&coder->opts[cur + 1]); next_is_literal = true; } } if (buf_avail_full < 2) return len_end; const uint32_t buf_avail = my_min(buf_avail_full, nice_len); if (!next_is_literal && match_byte != current_byte) { // speed optimization // try literal + rep0 const uint8_t *const buf_back = buf - reps[0] - 1; const uint32_t limit = my_min(buf_avail_full, nice_len + 1); const uint32_t len_test = lzma_memcmplen(buf, buf_back, 1, limit) - 1; if (len_test >= 2) { lzma_lzma_state state_2 = state; update_literal(state_2); const uint32_t pos_state_next = (position + 1) & coder->pos_mask; const uint32_t next_rep_match_price = 
cur_and_1_price + rc_bit_1_price(coder->is_match[state_2][pos_state_next]) + rc_bit_1_price(coder->is_rep[state_2]); //for (; len_test >= 2; --len_test) { const uint32_t offset = cur + 1 + len_test; while (len_end < offset) coder->opts[++len_end].price = RC_INFINITY_PRICE; const uint32_t cur_and_len_price = next_rep_match_price + get_rep_price(coder, 0, len_test, state_2, pos_state_next); if (cur_and_len_price < coder->opts[offset].price) { coder->opts[offset].price = cur_and_len_price; coder->opts[offset].pos_prev = cur + 1; coder->opts[offset].back_prev = 0; coder->opts[offset].prev_1_is_literal = true; coder->opts[offset].prev_2 = false; } //} } } uint32_t start_len = 2; // speed optimization for (uint32_t rep_index = 0; rep_index < REPS; ++rep_index) { const uint8_t *const buf_back = buf - reps[rep_index] - 1; if (not_equal_16(buf, buf_back)) continue; uint32_t len_test = lzma_memcmplen(buf, buf_back, 2, buf_avail); while (len_end < cur + len_test) coder->opts[++len_end].price = RC_INFINITY_PRICE; const uint32_t len_test_temp = len_test; const uint32_t price = rep_match_price + get_pure_rep_price( coder, rep_index, state, pos_state); do { const uint32_t cur_and_len_price = price + get_len_price(&coder->rep_len_encoder, len_test, pos_state); if (cur_and_len_price < coder->opts[cur + len_test].price) { coder->opts[cur + len_test].price = cur_and_len_price; coder->opts[cur + len_test].pos_prev = cur; coder->opts[cur + len_test].back_prev = rep_index; coder->opts[cur + len_test].prev_1_is_literal = false; } } while (--len_test >= 2); len_test = len_test_temp; if (rep_index == 0) start_len = len_test + 1; uint32_t len_test_2 = len_test + 1; const uint32_t limit = my_min(buf_avail_full, len_test_2 + nice_len); - for (; len_test_2 < limit - && buf[len_test_2] == buf_back[len_test_2]; - ++len_test_2) ; + // NOTE: len_test_2 may be greater than limit so the call to + // lzma_memcmplen() must be done conditionally. 
+ if (len_test_2 < limit) + len_test_2 = lzma_memcmplen(buf, buf_back, len_test_2, limit); len_test_2 -= len_test + 1; if (len_test_2 >= 2) { lzma_lzma_state state_2 = state; update_long_rep(state_2); uint32_t pos_state_next = (position + len_test) & coder->pos_mask; const uint32_t cur_and_len_literal_price = price + get_len_price(&coder->rep_len_encoder, len_test, pos_state) + rc_bit_0_price(coder->is_match[state_2][pos_state_next]) + get_literal_price(coder, position + len_test, buf[len_test - 1], true, buf_back[len_test], buf[len_test]); update_literal(state_2); pos_state_next = (position + len_test + 1) & coder->pos_mask; const uint32_t next_rep_match_price = cur_and_len_literal_price + rc_bit_1_price(coder->is_match[state_2][pos_state_next]) + rc_bit_1_price(coder->is_rep[state_2]); //for(; len_test_2 >= 2; len_test_2--) { const uint32_t offset = cur + len_test + 1 + len_test_2; while (len_end < offset) coder->opts[++len_end].price = RC_INFINITY_PRICE; const uint32_t cur_and_len_price = next_rep_match_price + get_rep_price(coder, 0, len_test_2, state_2, pos_state_next); if (cur_and_len_price < coder->opts[offset].price) { coder->opts[offset].price = cur_and_len_price; coder->opts[offset].pos_prev = cur + len_test + 1; coder->opts[offset].back_prev = 0; coder->opts[offset].prev_1_is_literal = true; coder->opts[offset].prev_2 = true; coder->opts[offset].pos_prev_2 = cur; coder->opts[offset].back_prev_2 = rep_index; } //} } } //for (uint32_t len_test = 2; len_test <= new_len; ++len_test) if (new_len > buf_avail) { new_len = buf_avail; matches_count = 0; while (new_len > coder->matches[matches_count].len) ++matches_count; coder->matches[matches_count++].len = new_len; } if (new_len >= start_len) { const uint32_t normal_match_price = match_price + rc_bit_0_price(coder->is_rep[state]); while (len_end < cur + new_len) coder->opts[++len_end].price = RC_INFINITY_PRICE; uint32_t i = 0; while (start_len > coder->matches[i].len) ++i; for (uint32_t len_test = start_len; ; 
++len_test) { const uint32_t cur_back = coder->matches[i].dist; uint32_t cur_and_len_price = normal_match_price + get_dist_len_price(coder, cur_back, len_test, pos_state); if (cur_and_len_price < coder->opts[cur + len_test].price) { coder->opts[cur + len_test].price = cur_and_len_price; coder->opts[cur + len_test].pos_prev = cur; coder->opts[cur + len_test].back_prev = cur_back + REPS; coder->opts[cur + len_test].prev_1_is_literal = false; } if (len_test == coder->matches[i].len) { // Try Match + Literal + Rep0 const uint8_t *const buf_back = buf - cur_back - 1; uint32_t len_test_2 = len_test + 1; const uint32_t limit = my_min(buf_avail_full, len_test_2 + nice_len); - for (; len_test_2 < limit && - buf[len_test_2] == buf_back[len_test_2]; - ++len_test_2) ; + // NOTE: len_test_2 may be greater than limit + // so the call to lzma_memcmplen() must be + // done conditionally. + if (len_test_2 < limit) + len_test_2 = lzma_memcmplen(buf, buf_back, + len_test_2, limit); len_test_2 -= len_test + 1; if (len_test_2 >= 2) { lzma_lzma_state state_2 = state; update_match(state_2); uint32_t pos_state_next = (position + len_test) & coder->pos_mask; const uint32_t cur_and_len_literal_price = cur_and_len_price + rc_bit_0_price( coder->is_match[state_2][pos_state_next]) + get_literal_price(coder, position + len_test, buf[len_test - 1], true, buf_back[len_test], buf[len_test]); update_literal(state_2); pos_state_next = (pos_state_next + 1) & coder->pos_mask; const uint32_t next_rep_match_price = cur_and_len_literal_price + rc_bit_1_price( coder->is_match[state_2][pos_state_next]) + rc_bit_1_price(coder->is_rep[state_2]); // for(; len_test_2 >= 2; --len_test_2) { const uint32_t offset = cur + len_test + 1 + len_test_2; while (len_end < offset) coder->opts[++len_end].price = RC_INFINITY_PRICE; cur_and_len_price = next_rep_match_price + get_rep_price(coder, 0, len_test_2, state_2, pos_state_next); if (cur_and_len_price < coder->opts[offset].price) { coder->opts[offset].price = 
cur_and_len_price; coder->opts[offset].pos_prev = cur + len_test + 1; coder->opts[offset].back_prev = 0; coder->opts[offset].prev_1_is_literal = true; coder->opts[offset].prev_2 = true; coder->opts[offset].pos_prev_2 = cur; coder->opts[offset].back_prev_2 = cur_back + REPS; } //} } if (++i == matches_count) break; } } } return len_end; } extern void lzma_lzma_optimum_normal(lzma_lzma1_encoder *restrict coder, lzma_mf *restrict mf, uint32_t *restrict back_res, uint32_t *restrict len_res, uint32_t position) { // If we have symbols pending, return the next pending symbol. if (coder->opts_end_index != coder->opts_current_index) { assert(mf->read_ahead > 0); *len_res = coder->opts[coder->opts_current_index].pos_prev - coder->opts_current_index; *back_res = coder->opts[coder->opts_current_index].back_prev; coder->opts_current_index = coder->opts[ coder->opts_current_index].pos_prev; return; } // Update the price tables. In LZMA SDK <= 4.60 (and possibly later) // this was done in both initialization function and in the main loop. // In liblzma they were moved into this single place. if (mf->read_ahead == 0) { if (coder->match_price_count >= (1 << 7)) fill_dist_prices(coder); if (coder->align_price_count >= ALIGN_SIZE) fill_align_prices(coder); } // TODO: This needs quite a bit of cleaning still. But splitting // the original function into two pieces makes it at least a little // more readable, since those two parts don't share many variables. 
uint32_t len_end = helper1(coder, mf, back_res, len_res, position); if (len_end == UINT32_MAX) return; uint32_t reps[REPS]; memcpy(reps, coder->reps, sizeof(reps)); uint32_t cur; for (cur = 1; cur < len_end; ++cur) { assert(cur < OPTS); coder->longest_match_length = mf_find( mf, &coder->matches_count, coder->matches); if (coder->longest_match_length >= mf->nice_len) break; len_end = helper2(coder, reps, mf_ptr(mf) - 1, len_end, position + cur, cur, mf->nice_len, my_min(mf_avail(mf) + 1, OPTS - 1 - cur)); } backward(coder, len_res, back_res, cur); return; } Index: head/contrib/xz/src/liblzma/lzma/lzma_encoder_private.h =================================================================== --- head/contrib/xz/src/liblzma/lzma/lzma_encoder_private.h (revision 359200) +++ head/contrib/xz/src/liblzma/lzma/lzma_encoder_private.h (revision 359201) @@ -1,148 +1,147 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file lzma_encoder_private.h /// \brief Private definitions for LZMA encoder /// // Authors: Igor Pavlov // Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #ifndef LZMA_LZMA_ENCODER_PRIVATE_H #define LZMA_LZMA_ENCODER_PRIVATE_H #include "lz_encoder.h" #include "range_encoder.h" #include "lzma_common.h" #include "lzma_encoder.h" // Macro to compare if the first two bytes in two buffers differ. This is // needed in lzma_lzma_optimum_*() to test if the match is at least // MATCH_LEN_MIN bytes. Unaligned access gives tiny gain so there's no // reason to not use it when it is supported. 
#ifdef TUKLIB_FAST_UNALIGNED_ACCESS -# define not_equal_16(a, b) \ - (*(const uint16_t *)(a) != *(const uint16_t *)(b)) +# define not_equal_16(a, b) (read16ne(a) != read16ne(b)) #else # define not_equal_16(a, b) \ ((a)[0] != (b)[0] || (a)[1] != (b)[1]) #endif // Optimal - Number of entries in the optimum array. #define OPTS (1 << 12) typedef struct { probability choice; probability choice2; probability low[POS_STATES_MAX][LEN_LOW_SYMBOLS]; probability mid[POS_STATES_MAX][LEN_MID_SYMBOLS]; probability high[LEN_HIGH_SYMBOLS]; uint32_t prices[POS_STATES_MAX][LEN_SYMBOLS]; uint32_t table_size; uint32_t counters[POS_STATES_MAX]; } lzma_length_encoder; typedef struct { lzma_lzma_state state; bool prev_1_is_literal; bool prev_2; uint32_t pos_prev_2; uint32_t back_prev_2; uint32_t price; uint32_t pos_prev; // pos_next; uint32_t back_prev; uint32_t backs[REPS]; } lzma_optimal; struct lzma_lzma1_encoder_s { /// Range encoder lzma_range_encoder rc; /// State lzma_lzma_state state; /// The four most recent match distances uint32_t reps[REPS]; /// Array of match candidates lzma_match matches[MATCH_LEN_MAX + 1]; /// Number of match candidates in matches[] uint32_t matches_count; /// Variable to hold the length of the longest match between calls /// to lzma_lzma_optimum_*(). uint32_t longest_match_length; /// True if using getoptimumfast bool fast_mode; /// True if the encoder has been initialized by encoding the first /// byte as a literal. bool is_initialized; /// True if the range encoder has been flushed, but not all bytes /// have been written to the output buffer yet. bool is_flushed; uint32_t pos_mask; ///< (1 << pos_bits) - 1 uint32_t literal_context_bits; uint32_t literal_pos_mask; // These are the same as in lzma_decoder.c. See comments there. 
probability literal[LITERAL_CODERS_MAX][LITERAL_CODER_SIZE]; probability is_match[STATES][POS_STATES_MAX]; probability is_rep[STATES]; probability is_rep0[STATES]; probability is_rep1[STATES]; probability is_rep2[STATES]; probability is_rep0_long[STATES][POS_STATES_MAX]; probability dist_slot[DIST_STATES][DIST_SLOTS]; probability dist_special[FULL_DISTANCES - DIST_MODEL_END]; probability dist_align[ALIGN_SIZE]; // These are the same as in lzma_decoder.c except that the encoders // include also price tables. lzma_length_encoder match_len_encoder; lzma_length_encoder rep_len_encoder; // Price tables uint32_t dist_slot_prices[DIST_STATES][DIST_SLOTS]; uint32_t dist_prices[DIST_STATES][FULL_DISTANCES]; uint32_t dist_table_size; uint32_t match_price_count; uint32_t align_prices[ALIGN_SIZE]; uint32_t align_price_count; // Optimal uint32_t opts_end_index; uint32_t opts_current_index; lzma_optimal opts[OPTS]; }; extern void lzma_lzma_optimum_fast( lzma_lzma1_encoder *restrict coder, lzma_mf *restrict mf, uint32_t *restrict back_res, uint32_t *restrict len_res); extern void lzma_lzma_optimum_normal(lzma_lzma1_encoder *restrict coder, lzma_mf *restrict mf, uint32_t *restrict back_res, uint32_t *restrict len_res, uint32_t position); #endif Index: head/contrib/xz/src/liblzma/simple/arm.c =================================================================== --- head/contrib/xz/src/liblzma/simple/arm.c (revision 359200) +++ head/contrib/xz/src/liblzma/simple/arm.c (revision 359201) @@ -1,71 +1,71 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file arm.c /// \brief Filter for ARM binaries /// // Authors: Igor Pavlov // Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. 
// /////////////////////////////////////////////////////////////////////////////// #include "simple_private.h" static size_t arm_code(void *simple lzma_attribute((__unused__)), uint32_t now_pos, bool is_encoder, uint8_t *buffer, size_t size) { size_t i; for (i = 0; i + 4 <= size; i += 4) { if (buffer[i + 3] == 0xEB) { - uint32_t src = (buffer[i + 2] << 16) - | (buffer[i + 1] << 8) - | (buffer[i + 0]); + uint32_t src = ((uint32_t)(buffer[i + 2]) << 16) + | ((uint32_t)(buffer[i + 1]) << 8) + | (uint32_t)(buffer[i + 0]); src <<= 2; uint32_t dest; if (is_encoder) dest = now_pos + (uint32_t)(i) + 8 + src; else dest = src - (now_pos + (uint32_t)(i) + 8); dest >>= 2; buffer[i + 2] = (dest >> 16); buffer[i + 1] = (dest >> 8); buffer[i + 0] = dest; } } return i; } static lzma_ret arm_coder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters, bool is_encoder) { return lzma_simple_coder_init(next, allocator, filters, &arm_code, 0, 4, 4, is_encoder); } extern lzma_ret lzma_simple_arm_encoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { return arm_coder_init(next, allocator, filters, true); } extern lzma_ret lzma_simple_arm_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { return arm_coder_init(next, allocator, filters, false); } Index: head/contrib/xz/src/liblzma/simple/armthumb.c =================================================================== --- head/contrib/xz/src/liblzma/simple/armthumb.c (revision 359200) +++ head/contrib/xz/src/liblzma/simple/armthumb.c (revision 359201) @@ -1,76 +1,76 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file armthumb.c /// \brief Filter for ARM-Thumb binaries /// // Authors: Igor Pavlov // Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. 
// /////////////////////////////////////////////////////////////////////////////// #include "simple_private.h" static size_t armthumb_code(void *simple lzma_attribute((__unused__)), uint32_t now_pos, bool is_encoder, uint8_t *buffer, size_t size) { size_t i; for (i = 0; i + 4 <= size; i += 2) { if ((buffer[i + 1] & 0xF8) == 0xF0 && (buffer[i + 3] & 0xF8) == 0xF8) { - uint32_t src = ((buffer[i + 1] & 0x7) << 19) - | (buffer[i + 0] << 11) - | ((buffer[i + 3] & 0x7) << 8) - | (buffer[i + 2]); + uint32_t src = (((uint32_t)(buffer[i + 1]) & 7) << 19) + | ((uint32_t)(buffer[i + 0]) << 11) + | (((uint32_t)(buffer[i + 3]) & 7) << 8) + | (uint32_t)(buffer[i + 2]); src <<= 1; uint32_t dest; if (is_encoder) dest = now_pos + (uint32_t)(i) + 4 + src; else dest = src - (now_pos + (uint32_t)(i) + 4); dest >>= 1; buffer[i + 1] = 0xF0 | ((dest >> 19) & 0x7); buffer[i + 0] = (dest >> 11); buffer[i + 3] = 0xF8 | ((dest >> 8) & 0x7); buffer[i + 2] = (dest); i += 2; } } return i; } static lzma_ret armthumb_coder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters, bool is_encoder) { return lzma_simple_coder_init(next, allocator, filters, &armthumb_code, 0, 4, 2, is_encoder); } extern lzma_ret lzma_simple_armthumb_encoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { return armthumb_coder_init(next, allocator, filters, true); } extern lzma_ret lzma_simple_armthumb_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { return armthumb_coder_init(next, allocator, filters, false); } Index: head/contrib/xz/src/liblzma/simple/ia64.c =================================================================== --- head/contrib/xz/src/liblzma/simple/ia64.c (revision 359200) +++ head/contrib/xz/src/liblzma/simple/ia64.c (revision 359201) @@ -1,112 +1,112 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file ia64.c /// \brief 
Filter for IA64 (Itanium) binaries /// // Authors: Igor Pavlov // Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "simple_private.h" static size_t ia64_code(void *simple lzma_attribute((__unused__)), uint32_t now_pos, bool is_encoder, uint8_t *buffer, size_t size) { static const uint32_t BRANCH_TABLE[32] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 4, 6, 6, 0, 0, 7, 7, 4, 4, 0, 0, 4, 4, 0, 0 }; size_t i; for (i = 0; i + 16 <= size; i += 16) { const uint32_t instr_template = buffer[i] & 0x1F; const uint32_t mask = BRANCH_TABLE[instr_template]; uint32_t bit_pos = 5; for (size_t slot = 0; slot < 3; ++slot, bit_pos += 41) { if (((mask >> slot) & 1) == 0) continue; const size_t byte_pos = (bit_pos >> 3); const uint32_t bit_res = bit_pos & 0x7; uint64_t instruction = 0; for (size_t j = 0; j < 6; ++j) instruction += (uint64_t)( buffer[i + j + byte_pos]) << (8 * j); uint64_t inst_norm = instruction >> bit_res; if (((inst_norm >> 37) & 0xF) == 0x5 && ((inst_norm >> 9) & 0x7) == 0 /* && (inst_norm & 0x3F)== 0 */ ) { uint32_t src = (uint32_t)( (inst_norm >> 13) & 0xFFFFF); src |= ((inst_norm >> 36) & 1) << 20; src <<= 4; uint32_t dest; if (is_encoder) dest = now_pos + (uint32_t)(i) + src; else dest = src - (now_pos + (uint32_t)(i)); dest >>= 4; inst_norm &= ~((uint64_t)(0x8FFFFF) << 13); inst_norm |= (uint64_t)(dest & 0xFFFFF) << 13; inst_norm |= (uint64_t)(dest & 0x100000) << (36 - 20); - instruction &= (1 << bit_res) - 1; + instruction &= (1U << bit_res) - 1; instruction |= (inst_norm << bit_res); for (size_t j = 0; j < 6; j++) buffer[i + j + byte_pos] = (uint8_t)( instruction >> (8 * j)); } } } return i; } static lzma_ret ia64_coder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters, bool is_encoder) { return lzma_simple_coder_init(next, allocator, filters, 
&ia64_code, 0, 16, 16, is_encoder); } extern lzma_ret lzma_simple_ia64_encoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { return ia64_coder_init(next, allocator, filters, true); } extern lzma_ret lzma_simple_ia64_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { return ia64_coder_init(next, allocator, filters, false); } Index: head/contrib/xz/src/liblzma/simple/powerpc.c =================================================================== --- head/contrib/xz/src/liblzma/simple/powerpc.c (revision 359200) +++ head/contrib/xz/src/liblzma/simple/powerpc.c (revision 359201) @@ -1,75 +1,76 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file powerpc.c /// \brief Filter for PowerPC (big endian) binaries /// // Authors: Igor Pavlov // Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "simple_private.h" static size_t powerpc_code(void *simple lzma_attribute((__unused__)), uint32_t now_pos, bool is_encoder, uint8_t *buffer, size_t size) { size_t i; for (i = 0; i + 4 <= size; i += 4) { // PowerPC branch 6(48) 24(Offset) 1(Abs) 1(Link) if ((buffer[i] >> 2) == 0x12 && ((buffer[i + 3] & 3) == 1)) { - const uint32_t src = ((buffer[i + 0] & 3) << 24) - | (buffer[i + 1] << 16) - | (buffer[i + 2] << 8) - | (buffer[i + 3] & (~3)); + const uint32_t src + = (((uint32_t)(buffer[i + 0]) & 3) << 24) + | ((uint32_t)(buffer[i + 1]) << 16) + | ((uint32_t)(buffer[i + 2]) << 8) + | ((uint32_t)(buffer[i + 3]) & ~UINT32_C(3)); uint32_t dest; if (is_encoder) dest = now_pos + (uint32_t)(i) + src; else dest = src - (now_pos + (uint32_t)(i)); buffer[i + 0] = 0x48 | ((dest >> 24) & 0x03); buffer[i + 1] = (dest >> 16); buffer[i + 2] = (dest >> 8); buffer[i + 3] &= 0x03; buffer[i + 3] |= dest; } } 
return i; } static lzma_ret powerpc_coder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters, bool is_encoder) { return lzma_simple_coder_init(next, allocator, filters, &powerpc_code, 0, 4, 4, is_encoder); } extern lzma_ret lzma_simple_powerpc_encoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { return powerpc_coder_init(next, allocator, filters, true); } extern lzma_ret lzma_simple_powerpc_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { return powerpc_coder_init(next, allocator, filters, false); } Index: head/contrib/xz/src/liblzma/simple/simple_coder.c =================================================================== --- head/contrib/xz/src/liblzma/simple/simple_coder.c (revision 359200) +++ head/contrib/xz/src/liblzma/simple/simple_coder.c (revision 359201) @@ -1,282 +1,290 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file simple_coder.c /// \brief Wrapper for simple filters /// /// Simple filters don't change the size of the data i.e. number of bytes /// in equals the number of bytes out. // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "simple_private.h" /// Copied or encodes/decodes more data to out[]. static lzma_ret copy_or_code(lzma_simple_coder *coder, const lzma_allocator *allocator, const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size, uint8_t *restrict out, size_t *restrict out_pos, size_t out_size, lzma_action action) { assert(!coder->end_was_reached); if (coder->next.code == NULL) { lzma_bufcpy(in, in_pos, in_size, out, out_pos, out_size); // Check if end of stream was reached. 
if (coder->is_encoder && action == LZMA_FINISH && *in_pos == in_size) coder->end_was_reached = true; } else { // Call the next coder in the chain to provide us some data. const lzma_ret ret = coder->next.code( coder->next.coder, allocator, in, in_pos, in_size, out, out_pos, out_size, action); if (ret == LZMA_STREAM_END) { assert(!coder->is_encoder || action == LZMA_FINISH); coder->end_was_reached = true; } else if (ret != LZMA_OK) { return ret; } } return LZMA_OK; } static size_t call_filter(lzma_simple_coder *coder, uint8_t *buffer, size_t size) { const size_t filtered = coder->filter(coder->simple, coder->now_pos, coder->is_encoder, buffer, size); coder->now_pos += filtered; return filtered; } static lzma_ret simple_code(void *coder_ptr, const lzma_allocator *allocator, const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size, uint8_t *restrict out, size_t *restrict out_pos, size_t out_size, lzma_action action) { lzma_simple_coder *coder = coder_ptr; // TODO: Add partial support for LZMA_SYNC_FLUSH. We can support it // in cases when the filter is able to filter everything. With most // simple filters it can be done at offset that is a multiple of 2, // 4, or 16. With x86 filter, it needs good luck, and thus cannot // be made to work predictably. if (action == LZMA_SYNC_FLUSH) return LZMA_OPTIONS_ERROR; // Flush already filtered data from coder->buffer[] to out[]. if (coder->pos < coder->filtered) { lzma_bufcpy(coder->buffer, &coder->pos, coder->filtered, out, out_pos, out_size); // If we couldn't flush all the filtered data, return to // application immediately. if (coder->pos < coder->filtered) return LZMA_OK; if (coder->end_was_reached) { assert(coder->filtered == coder->size); return LZMA_STREAM_END; } } // If we get here, there is no filtered data left in the buffer. 
coder->filtered = 0; assert(!coder->end_was_reached); // If there is more output space left than there is unfiltered data // in coder->buffer[], flush coder->buffer[] to out[], and copy/code // more data to out[] hopefully filling it completely. Then filter // the data in out[]. This step is where most of the data gets // filtered if the buffer sizes used by the application are reasonable. const size_t out_avail = out_size - *out_pos; const size_t buf_avail = coder->size - coder->pos; if (out_avail > buf_avail || buf_avail == 0) { // Store the old position so that we know from which byte // to start filtering. const size_t out_start = *out_pos; // Flush data from coder->buffer[] to out[], but don't reset // coder->pos and coder->size yet. This way the coder can be // restarted if the next filter in the chain returns e.g. // LZMA_MEM_ERROR. - memcpy(out + *out_pos, coder->buffer + coder->pos, buf_avail); + // + // Do the memcpy() conditionally because out can be NULL + // (in which case buf_avail is always 0). Calling memcpy() + // with a null-pointer is undefined even if the third + // argument is 0. + if (buf_avail > 0) + memcpy(out + *out_pos, coder->buffer + coder->pos, + buf_avail); + *out_pos += buf_avail; // Copy/Encode/Decode more data to out[]. { const lzma_ret ret = copy_or_code(coder, allocator, in, in_pos, in_size, out, out_pos, out_size, action); assert(ret != LZMA_STREAM_END); if (ret != LZMA_OK) return ret; } // Filter out[]. const size_t size = *out_pos - out_start; const size_t filtered = call_filter( coder, out + out_start, size); const size_t unfiltered = size - filtered; assert(unfiltered <= coder->allocated / 2); // Now we can update coder->pos and coder->size, because // the next coder in the chain (if any) was successful. coder->pos = 0; coder->size = unfiltered; if (coder->end_was_reached) { // The last byte has been copied to out[] already. // They are left as is. 
coder->size = 0; } else if (unfiltered > 0) { // There is unfiltered data left in out[]. Copy it to // coder->buffer[] and rewind *out_pos appropriately. *out_pos -= unfiltered; memcpy(coder->buffer, out + *out_pos, unfiltered); } } else if (coder->pos > 0) { memmove(coder->buffer, coder->buffer + coder->pos, buf_avail); coder->size -= coder->pos; coder->pos = 0; } assert(coder->pos == 0); // If coder->buffer[] isn't empty, try to fill it by copying/decoding // more data. Then filter coder->buffer[] and copy the successfully // filtered data to out[]. It is probable, that some filtered and // unfiltered data will be left to coder->buffer[]. if (coder->size > 0) { { const lzma_ret ret = copy_or_code(coder, allocator, in, in_pos, in_size, coder->buffer, &coder->size, coder->allocated, action); assert(ret != LZMA_STREAM_END); if (ret != LZMA_OK) return ret; } coder->filtered = call_filter( coder, coder->buffer, coder->size); // Everything is considered to be filtered if coder->buffer[] // contains the last bytes of the data. if (coder->end_was_reached) coder->filtered = coder->size; // Flush as much as possible. lzma_bufcpy(coder->buffer, &coder->pos, coder->filtered, out, out_pos, out_size); } // Check if we got everything done. if (coder->end_was_reached && coder->pos == coder->size) return LZMA_STREAM_END; return LZMA_OK; } static void simple_coder_end(void *coder_ptr, const lzma_allocator *allocator) { lzma_simple_coder *coder = coder_ptr; lzma_next_end(&coder->next, allocator); lzma_free(coder->simple, allocator); lzma_free(coder, allocator); return; } static lzma_ret simple_coder_update(void *coder_ptr, const lzma_allocator *allocator, const lzma_filter *filters_null lzma_attribute((__unused__)), const lzma_filter *reversed_filters) { lzma_simple_coder *coder = coder_ptr; // No update support, just call the next filter in the chain. 
return lzma_next_filter_update( &coder->next, allocator, reversed_filters + 1); } extern lzma_ret lzma_simple_coder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters, size_t (*filter)(void *simple, uint32_t now_pos, bool is_encoder, uint8_t *buffer, size_t size), size_t simple_size, size_t unfiltered_max, uint32_t alignment, bool is_encoder) { // Allocate memory for the lzma_simple_coder structure if needed. lzma_simple_coder *coder = next->coder; if (coder == NULL) { // Here we allocate space also for the temporary buffer. We // need twice the size of unfiltered_max, because then it // is always possible to filter at least unfiltered_max bytes // more data in coder->buffer[] if it can be filled completely. coder = lzma_alloc(sizeof(lzma_simple_coder) + 2 * unfiltered_max, allocator); if (coder == NULL) return LZMA_MEM_ERROR; next->coder = coder; next->code = &simple_code; next->end = &simple_coder_end; next->update = &simple_coder_update; coder->next = LZMA_NEXT_CODER_INIT; coder->filter = filter; coder->allocated = 2 * unfiltered_max; // Allocate memory for filter-specific data structure. if (simple_size > 0) { coder->simple = lzma_alloc(simple_size, allocator); if (coder->simple == NULL) return LZMA_MEM_ERROR; } else { coder->simple = NULL; } } if (filters[0].options != NULL) { const lzma_options_bcj *simple = filters[0].options; coder->now_pos = simple->start_offset; if (coder->now_pos & (alignment - 1)) return LZMA_OPTIONS_ERROR; } else { coder->now_pos = 0; } // Reset variables. 
coder->is_encoder = is_encoder; coder->end_was_reached = false; coder->pos = 0; coder->filtered = 0; coder->size = 0; return lzma_next_filter_init(&coder->next, allocator, filters + 1); } Index: head/contrib/xz/src/liblzma/simple/simple_decoder.c =================================================================== --- head/contrib/xz/src/liblzma/simple/simple_decoder.c (revision 359200) +++ head/contrib/xz/src/liblzma/simple/simple_decoder.c (revision 359201) @@ -1,40 +1,40 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file simple_decoder.c /// \brief Properties decoder for simple filters // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "simple_decoder.h" extern lzma_ret lzma_simple_props_decode(void **options, const lzma_allocator *allocator, const uint8_t *props, size_t props_size) { if (props_size == 0) return LZMA_OK; if (props_size != 4) return LZMA_OPTIONS_ERROR; lzma_options_bcj *opt = lzma_alloc( sizeof(lzma_options_bcj), allocator); if (opt == NULL) return LZMA_MEM_ERROR; - opt->start_offset = unaligned_read32le(props); + opt->start_offset = read32le(props); // Don't leave an options structure allocated if start_offset is zero. if (opt->start_offset == 0) lzma_free(opt, allocator); else *options = opt; return LZMA_OK; } Index: head/contrib/xz/src/liblzma/simple/simple_encoder.c =================================================================== --- head/contrib/xz/src/liblzma/simple/simple_encoder.c (revision 359200) +++ head/contrib/xz/src/liblzma/simple/simple_encoder.c (revision 359201) @@ -1,38 +1,38 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file simple_encoder.c /// \brief Properties encoder for simple filters // // Author: Lasse Collin // // This file has been put into the public domain. 
// You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "simple_encoder.h" extern lzma_ret lzma_simple_props_size(uint32_t *size, const void *options) { const lzma_options_bcj *const opt = options; *size = (opt == NULL || opt->start_offset == 0) ? 0 : 4; return LZMA_OK; } extern lzma_ret lzma_simple_props_encode(const void *options, uint8_t *out) { const lzma_options_bcj *const opt = options; // The default start offset is zero, so we don't need to store any // options unless the start offset is non-zero. if (opt == NULL || opt->start_offset == 0) return LZMA_OK; - unaligned_write32le(out, opt->start_offset); + write32le(out, opt->start_offset); return LZMA_OK; } Index: head/contrib/xz/src/liblzma/simple/x86.c =================================================================== --- head/contrib/xz/src/liblzma/simple/x86.c (revision 359200) +++ head/contrib/xz/src/liblzma/simple/x86.c (revision 359201) @@ -1,159 +1,159 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file x86.c /// \brief Filter for x86 binaries (BCJ filter) /// // Authors: Igor Pavlov // Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. 
// /////////////////////////////////////////////////////////////////////////////// #include "simple_private.h" #define Test86MSByte(b) ((b) == 0 || (b) == 0xFF) typedef struct { uint32_t prev_mask; uint32_t prev_pos; } lzma_simple_x86; static size_t x86_code(void *simple_ptr, uint32_t now_pos, bool is_encoder, uint8_t *buffer, size_t size) { static const bool MASK_TO_ALLOWED_STATUS[8] = { true, true, true, false, true, false, false, false }; static const uint32_t MASK_TO_BIT_NUMBER[8] = { 0, 1, 2, 2, 3, 3, 3, 3 }; lzma_simple_x86 *simple = simple_ptr; uint32_t prev_mask = simple->prev_mask; uint32_t prev_pos = simple->prev_pos; if (size < 5) return 0; if (now_pos - prev_pos > 5) prev_pos = now_pos - 5; const size_t limit = size - 5; size_t buffer_pos = 0; while (buffer_pos <= limit) { uint8_t b = buffer[buffer_pos]; if (b != 0xE8 && b != 0xE9) { ++buffer_pos; continue; } const uint32_t offset = now_pos + (uint32_t)(buffer_pos) - prev_pos; prev_pos = now_pos + (uint32_t)(buffer_pos); if (offset > 5) { prev_mask = 0; } else { for (uint32_t i = 0; i < offset; ++i) { prev_mask &= 0x77; prev_mask <<= 1; } } b = buffer[buffer_pos + 4]; if (Test86MSByte(b) && MASK_TO_ALLOWED_STATUS[(prev_mask >> 1) & 0x7] && (prev_mask >> 1) < 0x10) { uint32_t src = ((uint32_t)(b) << 24) | ((uint32_t)(buffer[buffer_pos + 3]) << 16) | ((uint32_t)(buffer[buffer_pos + 2]) << 8) | (buffer[buffer_pos + 1]); uint32_t dest; while (true) { if (is_encoder) dest = src + (now_pos + (uint32_t)( buffer_pos) + 5); else dest = src - (now_pos + (uint32_t)( buffer_pos) + 5); if (prev_mask == 0) break; const uint32_t i = MASK_TO_BIT_NUMBER[ prev_mask >> 1]; b = (uint8_t)(dest >> (24 - i * 8)); if (!Test86MSByte(b)) break; - src = dest ^ ((1 << (32 - i * 8)) - 1); + src = dest ^ ((1U << (32 - i * 8)) - 1); } buffer[buffer_pos + 4] = (uint8_t)(~(((dest >> 24) & 1) - 1)); buffer[buffer_pos + 3] = (uint8_t)(dest >> 16); buffer[buffer_pos + 2] = (uint8_t)(dest >> 8); buffer[buffer_pos + 1] = (uint8_t)(dest); 
buffer_pos += 5; prev_mask = 0; } else { ++buffer_pos; prev_mask |= 1; if (Test86MSByte(b)) prev_mask |= 0x10; } } simple->prev_mask = prev_mask; simple->prev_pos = prev_pos; return buffer_pos; } static lzma_ret x86_coder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters, bool is_encoder) { const lzma_ret ret = lzma_simple_coder_init(next, allocator, filters, &x86_code, sizeof(lzma_simple_x86), 5, 1, is_encoder); if (ret == LZMA_OK) { lzma_simple_coder *coder = next->coder; lzma_simple_x86 *simple = coder->simple; simple->prev_mask = 0; simple->prev_pos = (uint32_t)(-5); } return ret; } extern lzma_ret lzma_simple_x86_encoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { return x86_coder_init(next, allocator, filters, true); } extern lzma_ret lzma_simple_x86_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator, const lzma_filter_info *filters) { return x86_coder_init(next, allocator, filters, false); } Index: head/contrib/xz/src/xz/args.c =================================================================== --- head/contrib/xz/src/xz/args.c (revision 359200) +++ head/contrib/xz/src/xz/args.c (revision 359201) @@ -1,700 +1,700 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file args.c /// \brief Argument parsing /// /// \note Filter-specific options parsing is in options.c. // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "private.h" #include "getopt.h" #include bool opt_stdout = false; bool opt_force = false; bool opt_keep_original = false; bool opt_robot = false; bool opt_ignore_check = false; // We don't modify or free() this, but we need to assign it in some // non-const pointers. 
const char stdin_filename[] = "(stdin)"; /// Parse and set the memory usage limit for compression and/or decompression. static void parse_memlimit(const char *name, const char *name_percentage, char *str, bool set_compress, bool set_decompress) { bool is_percentage = false; uint64_t value; const size_t len = strlen(str); if (len > 0 && str[len - 1] == '%') { str[len - 1] = '\0'; is_percentage = true; value = str_to_uint64(name_percentage, str, 1, 100); } else { // On 32-bit systems, SIZE_MAX would make more sense than // UINT64_MAX. But use UINT64_MAX still so that scripts // that assume > 4 GiB values don't break. value = str_to_uint64(name, str, 0, UINT64_MAX); } hardware_memlimit_set( value, set_compress, set_decompress, is_percentage); return; } static void parse_block_list(char *str) { // It must be non-empty and not begin with a comma. if (str[0] == '\0' || str[0] == ',') message_fatal(_("%s: Invalid argument to --block-list"), str); // Count the number of comma-separated strings. size_t count = 1; for (size_t i = 0; str[i] != '\0'; ++i) if (str[i] == ',') ++count; // Prevent an unlikely integer overflow. if (count > SIZE_MAX / sizeof(uint64_t) - 1) message_fatal(_("%s: Too many arguments to --block-list"), str); // Allocate memory to hold all the sizes specified. // If --block-list was specified already, its value is forgotten. free(opt_block_list); opt_block_list = xmalloc((count + 1) * sizeof(uint64_t)); for (size_t i = 0; i < count; ++i) { // Locate the next comma and replace it with \0. char *p = strchr(str, ','); if (p != NULL) *p = '\0'; if (str[0] == '\0') { // There is no string, that is, a comma follows // another comma. Use the previous value. // - // NOTE: We checked earler that the first char + // NOTE: We checked earlier that the first char // of the whole list cannot be a comma. 
assert(i > 0); opt_block_list[i] = opt_block_list[i - 1]; } else { opt_block_list[i] = str_to_uint64("block-list", str, 0, UINT64_MAX); // Zero indicates no more new Blocks. if (opt_block_list[i] == 0) { if (i + 1 != count) message_fatal(_("0 can only be used " "as the last element " "in --block-list")); opt_block_list[i] = UINT64_MAX; } } str = p + 1; } // Terminate the array. opt_block_list[count] = 0; return; } static void parse_real(args_info *args, int argc, char **argv) { enum { OPT_X86 = INT_MIN, OPT_POWERPC, OPT_IA64, OPT_ARM, OPT_ARMTHUMB, OPT_SPARC, OPT_DELTA, OPT_LZMA1, OPT_LZMA2, OPT_SINGLE_STREAM, OPT_NO_SPARSE, OPT_FILES, OPT_FILES0, OPT_BLOCK_SIZE, OPT_BLOCK_LIST, OPT_MEM_COMPRESS, OPT_MEM_DECOMPRESS, OPT_NO_ADJUST, OPT_INFO_MEMORY, OPT_ROBOT, OPT_FLUSH_TIMEOUT, OPT_IGNORE_CHECK, }; static const char short_opts[] = "cC:defF:hHlkM:qQrS:tT:vVz0123456789"; static const struct option long_opts[] = { // Operation mode { "compress", no_argument, NULL, 'z' }, { "decompress", no_argument, NULL, 'd' }, { "uncompress", no_argument, NULL, 'd' }, { "test", no_argument, NULL, 't' }, { "list", no_argument, NULL, 'l' }, // Operation modifiers { "keep", no_argument, NULL, 'k' }, { "force", no_argument, NULL, 'f' }, { "stdout", no_argument, NULL, 'c' }, { "to-stdout", no_argument, NULL, 'c' }, { "single-stream", no_argument, NULL, OPT_SINGLE_STREAM }, { "no-sparse", no_argument, NULL, OPT_NO_SPARSE }, { "suffix", required_argument, NULL, 'S' }, // { "recursive", no_argument, NULL, 'r' }, // TODO { "files", optional_argument, NULL, OPT_FILES }, { "files0", optional_argument, NULL, OPT_FILES0 }, // Basic compression settings { "format", required_argument, NULL, 'F' }, { "check", required_argument, NULL, 'C' }, { "ignore-check", no_argument, NULL, OPT_IGNORE_CHECK }, { "block-size", required_argument, NULL, OPT_BLOCK_SIZE }, { "block-list", required_argument, NULL, OPT_BLOCK_LIST }, { "memlimit-compress", required_argument, NULL, OPT_MEM_COMPRESS }, { 
"memlimit-decompress", required_argument, NULL, OPT_MEM_DECOMPRESS }, { "memlimit", required_argument, NULL, 'M' }, { "memory", required_argument, NULL, 'M' }, // Old alias { "no-adjust", no_argument, NULL, OPT_NO_ADJUST }, { "threads", required_argument, NULL, 'T' }, { "flush-timeout", required_argument, NULL, OPT_FLUSH_TIMEOUT }, { "extreme", no_argument, NULL, 'e' }, { "fast", no_argument, NULL, '0' }, { "best", no_argument, NULL, '9' }, // Filters { "lzma1", optional_argument, NULL, OPT_LZMA1 }, { "lzma2", optional_argument, NULL, OPT_LZMA2 }, { "x86", optional_argument, NULL, OPT_X86 }, { "powerpc", optional_argument, NULL, OPT_POWERPC }, { "ia64", optional_argument, NULL, OPT_IA64 }, { "arm", optional_argument, NULL, OPT_ARM }, { "armthumb", optional_argument, NULL, OPT_ARMTHUMB }, { "sparc", optional_argument, NULL, OPT_SPARC }, { "delta", optional_argument, NULL, OPT_DELTA }, // Other options { "quiet", no_argument, NULL, 'q' }, { "verbose", no_argument, NULL, 'v' }, { "no-warn", no_argument, NULL, 'Q' }, { "robot", no_argument, NULL, OPT_ROBOT }, { "info-memory", no_argument, NULL, OPT_INFO_MEMORY }, { "help", no_argument, NULL, 'h' }, { "long-help", no_argument, NULL, 'H' }, { "version", no_argument, NULL, 'V' }, { NULL, 0, NULL, 0 } }; int c; while ((c = getopt_long(argc, argv, short_opts, long_opts, NULL)) != -1) { switch (c) { // Compression preset (also for decompression if --format=raw) case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': - coder_set_preset(c - '0'); + coder_set_preset((uint32_t)(c - '0')); break; // --memlimit-compress case OPT_MEM_COMPRESS: parse_memlimit("memlimit-compress", "memlimit-compress%", optarg, true, false); break; // --memlimit-decompress case OPT_MEM_DECOMPRESS: parse_memlimit("memlimit-decompress", "memlimit-decompress%", optarg, false, true); break; // --memlimit case 'M': parse_memlimit("memlimit", "memlimit%", optarg, true, true); break; // --suffix case 'S': 
suffix_set(optarg); break; case 'T': // The max is from src/liblzma/common/common.h. hardware_threads_set(str_to_uint64("threads", optarg, 0, 16384)); break; // --version case 'V': // This doesn't return. message_version(); // --stdout case 'c': opt_stdout = true; break; // --decompress case 'd': opt_mode = MODE_DECOMPRESS; break; // --extreme case 'e': coder_set_extreme(); break; // --force case 'f': opt_force = true; break; // --info-memory case OPT_INFO_MEMORY: // This doesn't return. hardware_memlimit_show(); // --help case 'h': // This doesn't return. message_help(false); // --long-help case 'H': // This doesn't return. message_help(true); // --list case 'l': opt_mode = MODE_LIST; break; // --keep case 'k': opt_keep_original = true; break; // --quiet case 'q': message_verbosity_decrease(); break; case 'Q': set_exit_no_warn(); break; case 't': opt_mode = MODE_TEST; break; // --verbose case 'v': message_verbosity_increase(); break; // --robot case OPT_ROBOT: opt_robot = true; // This is to make sure that floating point numbers // always have a dot as decimal separator. 
setlocale(LC_NUMERIC, "C"); break; case 'z': opt_mode = MODE_COMPRESS; break; // Filter setup case OPT_X86: coder_add_filter(LZMA_FILTER_X86, options_bcj(optarg)); break; case OPT_POWERPC: coder_add_filter(LZMA_FILTER_POWERPC, options_bcj(optarg)); break; case OPT_IA64: coder_add_filter(LZMA_FILTER_IA64, options_bcj(optarg)); break; case OPT_ARM: coder_add_filter(LZMA_FILTER_ARM, options_bcj(optarg)); break; case OPT_ARMTHUMB: coder_add_filter(LZMA_FILTER_ARMTHUMB, options_bcj(optarg)); break; case OPT_SPARC: coder_add_filter(LZMA_FILTER_SPARC, options_bcj(optarg)); break; case OPT_DELTA: coder_add_filter(LZMA_FILTER_DELTA, options_delta(optarg)); break; case OPT_LZMA1: coder_add_filter(LZMA_FILTER_LZMA1, options_lzma(optarg)); break; case OPT_LZMA2: coder_add_filter(LZMA_FILTER_LZMA2, options_lzma(optarg)); break; // Other // --format case 'F': { // Just in case, support both "lzma" and "alone" since // the latter was used for forward compatibility in // LZMA Utils 4.32.x. static const struct { char str[8]; enum format_type format; } types[] = { { "auto", FORMAT_AUTO }, { "xz", FORMAT_XZ }, { "lzma", FORMAT_LZMA }, { "alone", FORMAT_LZMA }, // { "gzip", FORMAT_GZIP }, // { "gz", FORMAT_GZIP }, { "raw", FORMAT_RAW }, }; size_t i = 0; while (strcmp(types[i].str, optarg) != 0) if (++i == ARRAY_SIZE(types)) message_fatal(_("%s: Unknown file " "format type"), optarg); opt_format = types[i].format; break; } // --check case 'C': { static const struct { char str[8]; lzma_check check; } types[] = { { "none", LZMA_CHECK_NONE }, { "crc32", LZMA_CHECK_CRC32 }, { "crc64", LZMA_CHECK_CRC64 }, { "sha256", LZMA_CHECK_SHA256 }, }; size_t i = 0; while (strcmp(types[i].str, optarg) != 0) { if (++i == ARRAY_SIZE(types)) message_fatal(_("%s: Unsupported " "integrity " "check type"), optarg); } // Use a separate check in case we are using different // liblzma than what was used to compile us. 
if (!lzma_check_is_supported(types[i].check)) message_fatal(_("%s: Unsupported integrity " "check type"), optarg); coder_set_check(types[i].check); break; } case OPT_IGNORE_CHECK: opt_ignore_check = true; break; case OPT_BLOCK_SIZE: opt_block_size = str_to_uint64("block-size", optarg, 0, LZMA_VLI_MAX); break; case OPT_BLOCK_LIST: { parse_block_list(optarg); break; } case OPT_SINGLE_STREAM: opt_single_stream = true; break; case OPT_NO_SPARSE: io_no_sparse(); break; case OPT_FILES: args->files_delim = '\n'; // Fall through case OPT_FILES0: if (args->files_name != NULL) message_fatal(_("Only one file can be " "specified with `--files' " "or `--files0'.")); if (optarg == NULL) { args->files_name = (char *)stdin_filename; args->files_file = stdin; } else { args->files_name = optarg; args->files_file = fopen(optarg, c == OPT_FILES ? "r" : "rb"); if (args->files_file == NULL) message_fatal("%s: %s", optarg, strerror(errno)); } break; case OPT_NO_ADJUST: opt_auto_adjust = false; break; case OPT_FLUSH_TIMEOUT: opt_flush_timeout = str_to_uint64("flush-timeout", optarg, 0, UINT64_MAX); break; default: message_try_help(); tuklib_exit(E_ERROR, E_ERROR, false); } } return; } static void parse_environment(args_info *args, char *argv0, const char *varname) { char *env = getenv(varname); if (env == NULL) return; // We modify the string, so make a copy of it. env = xstrdup(env); // Calculate the number of arguments in env. argc starts at one // to include space for the program name. int argc = 1; bool prev_was_space = true; for (size_t i = 0; env[i] != '\0'; ++i) { // NOTE: Cast to unsigned char is needed so that correct // value gets passed to isspace(), which expects // unsigned char cast to int. Casting to int is done // automatically due to integer promotion, but we need to // force char to unsigned char manually. Otherwise 8-bit // characters would get promoted to wrong value if // char is signed. 
if (isspace((unsigned char)env[i])) { prev_was_space = true; } else if (prev_was_space) { prev_was_space = false; // Keep argc small enough to fit into a signed int // and to keep it usable for memory allocation. if (++argc == my_min( INT_MAX, SIZE_MAX / sizeof(char *))) message_fatal(_("The environment variable " "%s contains too many " "arguments"), varname); } } // Allocate memory to hold pointers to the arguments. Add one to get // space for the terminating NULL (if some systems happen to need it). char **argv = xmalloc(((size_t)(argc) + 1) * sizeof(char *)); argv[0] = argv0; argv[argc] = NULL; // Go through the string again. Split the arguments using '\0' // characters and add pointers to the resulting strings to argv. argc = 1; prev_was_space = true; for (size_t i = 0; env[i] != '\0'; ++i) { if (isspace((unsigned char)env[i])) { prev_was_space = true; env[i] = '\0'; } else if (prev_was_space) { prev_was_space = false; argv[argc++] = env + i; } } // Parse the argument list we got from the environment. All non-option // arguments i.e. filenames are ignored. parse_real(args, argc, argv); // Reset the state of the getopt_long() so that we can parse the // command line options too. There are two incompatible ways to // do it. #ifdef HAVE_OPTRESET // BSD optind = 1; optreset = 1; #else // GNU, Solaris optind = 0; #endif // We don't need the argument list from environment anymore. free(argv); free(env); return; } extern void args_parse(args_info *args, int argc, char **argv) { // Initialize those parts of *args that we need later. args->files_name = NULL; args->files_file = NULL; args->files_delim = '\0'; // Check how we were called. { // Remove the leading path name, if any. const char *name = strrchr(argv[0], '/'); if (name == NULL) name = argv[0]; else ++name; // NOTE: It's possible that name[0] is now '\0' if argv[0] // is weird, but it doesn't matter here. 
// Look for full command names instead of substrings like // "un", "cat", and "lz" to reduce possibility of false // positives when the programs have been renamed. if (strstr(name, "xzcat") != NULL) { opt_mode = MODE_DECOMPRESS; opt_stdout = true; } else if (strstr(name, "unxz") != NULL) { opt_mode = MODE_DECOMPRESS; } else if (strstr(name, "lzcat") != NULL) { opt_format = FORMAT_LZMA; opt_mode = MODE_DECOMPRESS; opt_stdout = true; } else if (strstr(name, "unlzma") != NULL) { opt_format = FORMAT_LZMA; opt_mode = MODE_DECOMPRESS; } else if (strstr(name, "lzma") != NULL) { opt_format = FORMAT_LZMA; } } // First the flags from the environment parse_environment(args, argv[0], "XZ_DEFAULTS"); parse_environment(args, argv[0], "XZ_OPT"); // Then from the command line parse_real(args, argc, argv); // If encoder or decoder support was omitted at build time, // show an error now so that the rest of the code can rely on // that whatever is in opt_mode is also supported. #ifndef HAVE_ENCODERS if (opt_mode == MODE_COMPRESS) message_fatal(_("Compression support was disabled " "at build time")); #endif #ifndef HAVE_DECODERS // Even MODE_LIST cannot work without decoder support so MODE_COMPRESS // is the only valid choice. if (opt_mode != MODE_COMPRESS) message_fatal(_("Decompression support was disabled " "at build time")); #endif // Never remove the source file when the destination is not on disk. // In test mode the data is written nowhere, but setting opt_stdout // will make the rest of the code behave well. if (opt_stdout || opt_mode == MODE_TEST) { opt_keep_original = true; opt_stdout = true; } // When compressing, if no --format flag was used, or it // was --format=auto, we compress to the .xz format. if (opt_mode == MODE_COMPRESS && opt_format == FORMAT_AUTO) opt_format = FORMAT_XZ; // Compression settings need to be validated (options themselves and // their memory usage) when compressing to any file format. 
It has to // be done also when uncompressing raw data, since for raw decoding // the options given on the command line are used to know what kind // of raw data we are supposed to decode. if (opt_mode == MODE_COMPRESS || opt_format == FORMAT_RAW) coder_set_compression_settings(); // If no filenames are given, use stdin. if (argv[optind] == NULL && args->files_name == NULL) { // We don't modify or free() the "-" constant. The caller // modifies this so don't make the struct itself const. static char *names_stdin[2] = { (char *)"-", NULL }; args->arg_names = names_stdin; args->arg_count = 1; } else { // We got at least one filename from the command line, or // --files or --files0 was specified. args->arg_names = argv + optind; - args->arg_count = argc - optind; + args->arg_count = (unsigned int)(argc - optind); } return; } #ifndef NDEBUG extern void args_free(void) { free(opt_block_list); return; } #endif Index: head/contrib/xz/src/xz/coder.c =================================================================== --- head/contrib/xz/src/xz/coder.c (revision 359200) +++ head/contrib/xz/src/xz/coder.c (revision 359201) @@ -1,936 +1,944 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file coder.c /// \brief Compresses or uncompresses a file // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "private.h" /// Return value type for coder_init(). 
enum coder_init_ret { CODER_INIT_NORMAL, CODER_INIT_PASSTHRU, CODER_INIT_ERROR, }; enum operation_mode opt_mode = MODE_COMPRESS; enum format_type opt_format = FORMAT_AUTO; bool opt_auto_adjust = true; bool opt_single_stream = false; uint64_t opt_block_size = 0; uint64_t *opt_block_list = NULL; /// Stream used to communicate with liblzma static lzma_stream strm = LZMA_STREAM_INIT; /// Filters needed for all encoding all formats, and also decoding in raw data static lzma_filter filters[LZMA_FILTERS_MAX + 1]; /// Input and output buffers static io_buf in_buf; static io_buf out_buf; /// Number of filters. Zero indicates that we are using a preset. static uint32_t filters_count = 0; /// Number of the preset (0-9) static uint32_t preset_number = LZMA_PRESET_DEFAULT; /// Integrity check type static lzma_check check; /// This becomes false if the --check=CHECK option is used. static bool check_default = true; #if defined(HAVE_ENCODERS) && defined(MYTHREAD_ENABLED) static lzma_mt mt_options = { .flags = 0, .timeout = 300, .filters = filters, }; #endif extern void coder_set_check(lzma_check new_check) { check = new_check; check_default = false; return; } static void forget_filter_chain(void) { // Setting a preset makes us forget a possibly defined custom // filter chain. while (filters_count > 0) { --filters_count; free(filters[filters_count].options); filters[filters_count].options = NULL; } return; } extern void coder_set_preset(uint32_t new_preset) { preset_number &= ~LZMA_PRESET_LEVEL_MASK; preset_number |= new_preset; forget_filter_chain(); return; } extern void coder_set_extreme(void) { preset_number |= LZMA_PRESET_EXTREME; forget_filter_chain(); return; } extern void coder_add_filter(lzma_vli id, void *options) { if (filters_count == LZMA_FILTERS_MAX) message_fatal(_("Maximum number of filters is four")); filters[filters_count].id = id; filters[filters_count].options = options; ++filters_count; // Setting a custom filter chain makes us forget the preset options. 
// This makes a difference if one specifies e.g. "xz -9 --lzma2 -e" // where the custom filter chain resets the preset level back to // the default 6, making the example equivalent to "xz -6e". preset_number = LZMA_PRESET_DEFAULT; return; } static void lzma_attribute((__noreturn__)) memlimit_too_small(uint64_t memory_usage) { message(V_ERROR, _("Memory usage limit is too low for the given " "filter setup.")); message_mem_needed(V_ERROR, memory_usage); tuklib_exit(E_ERROR, E_ERROR, false); } extern void coder_set_compression_settings(void) { // The default check type is CRC64, but fallback to CRC32 // if CRC64 isn't supported by the copy of liblzma we are // using. CRC32 is always supported. if (check_default) { check = LZMA_CHECK_CRC64; if (!lzma_check_is_supported(check)) check = LZMA_CHECK_CRC32; } // Options for LZMA1 or LZMA2 in case we are using a preset. static lzma_options_lzma opt_lzma; if (filters_count == 0) { // We are using a preset. This is not a good idea in raw mode // except when playing around with things. Different versions // of this software may use different options in presets, and // thus make uncompressing the raw data difficult. if (opt_format == FORMAT_RAW) { // The message is shown only if warnings are allowed // but the exit status isn't changed. message(V_WARNING, _("Using a preset in raw mode " "is discouraged.")); message(V_WARNING, _("The exact options of the " "presets may vary between software " "versions.")); } // Get the preset for LZMA1 or LZMA2. if (lzma_lzma_preset(&opt_lzma, preset_number)) message_bug(); // Use LZMA2 except with --format=lzma we use LZMA1. filters[0].id = opt_format == FORMAT_LZMA ? LZMA_FILTER_LZMA1 : LZMA_FILTER_LZMA2; filters[0].options = &opt_lzma; filters_count = 1; } // Terminate the filter options array. filters[filters_count].id = LZMA_VLI_UNKNOWN; // If we are using the .lzma format, allow exactly one filter // which has to be LZMA1. 
if (opt_format == FORMAT_LZMA && (filters_count != 1 || filters[0].id != LZMA_FILTER_LZMA1)) message_fatal(_("The .lzma format supports only " "the LZMA1 filter")); // If we are using the .xz format, make sure that there is no LZMA1 // filter to prevent LZMA_PROG_ERROR. if (opt_format == FORMAT_XZ) for (size_t i = 0; i < filters_count; ++i) if (filters[i].id == LZMA_FILTER_LZMA1) message_fatal(_("LZMA1 cannot be used " "with the .xz format")); // Print the selected filter chain. message_filters_show(V_DEBUG, filters); // The --flush-timeout option requires LZMA_SYNC_FLUSH support // from the filter chain. Currently threaded encoder doesn't support // LZMA_SYNC_FLUSH so single-threaded mode must be used. if (opt_mode == MODE_COMPRESS && opt_flush_timeout != 0) { for (size_t i = 0; i < filters_count; ++i) { switch (filters[i].id) { case LZMA_FILTER_LZMA2: case LZMA_FILTER_DELTA: break; default: message_fatal(_("The filter chain is " "incompatible with --flush-timeout")); } } if (hardware_threads_get() > 1) { message(V_WARNING, _("Switching to single-threaded " "mode due to --flush-timeout")); hardware_threads_set(1); } } // Get the memory usage. Note that if --format=raw was used, // we can be decompressing. 
const uint64_t memory_limit = hardware_memlimit_get(opt_mode); uint64_t memory_usage = UINT64_MAX; if (opt_mode == MODE_COMPRESS) { #ifdef HAVE_ENCODERS # ifdef MYTHREAD_ENABLED if (opt_format == FORMAT_XZ && hardware_threads_get() > 1) { mt_options.threads = hardware_threads_get(); mt_options.block_size = opt_block_size; mt_options.check = check; memory_usage = lzma_stream_encoder_mt_memusage( &mt_options); if (memory_usage != UINT64_MAX) message(V_DEBUG, _("Using up to %" PRIu32 " threads."), mt_options.threads); } else # endif { memory_usage = lzma_raw_encoder_memusage(filters); } #endif } else { #ifdef HAVE_DECODERS memory_usage = lzma_raw_decoder_memusage(filters); #endif } if (memory_usage == UINT64_MAX) message_fatal(_("Unsupported filter chain or filter options")); // Print memory usage info before possible dictionary // size auto-adjusting. // // NOTE: If only encoder support was built, we cannot show // what the decoder memory usage will be. message_mem_needed(V_DEBUG, memory_usage); #ifdef HAVE_DECODERS if (opt_mode == MODE_COMPRESS) { const uint64_t decmem = lzma_raw_decoder_memusage(filters); if (decmem != UINT64_MAX) message(V_DEBUG, _("Decompression will need " "%s MiB of memory."), uint64_to_str( round_up_to_mib(decmem), 0)); } #endif if (memory_usage <= memory_limit) return; // If --no-adjust was used or we didn't find LZMA1 or // LZMA2 as the last filter, give an error immediately. // --format=raw implies --no-adjust. if (!opt_auto_adjust || opt_format == FORMAT_RAW) memlimit_too_small(memory_usage); assert(opt_mode == MODE_COMPRESS); #ifdef HAVE_ENCODERS # ifdef MYTHREAD_ENABLED if (opt_format == FORMAT_XZ && mt_options.threads > 1) { // Try to reduce the number of threads before // adjusting the compression settings down. do { // FIXME? The real single-threaded mode has // lower memory usage, but it's not comparable // because it doesn't write the size info // into Block Headers. 
if (--mt_options.threads == 0) memlimit_too_small(memory_usage); memory_usage = lzma_stream_encoder_mt_memusage( &mt_options); if (memory_usage == UINT64_MAX) message_bug(); } while (memory_usage > memory_limit); message(V_WARNING, _("Adjusted the number of threads " "from %s to %s to not exceed " "the memory usage limit of %s MiB"), uint64_to_str(hardware_threads_get(), 0), uint64_to_str(mt_options.threads, 1), uint64_to_str(round_up_to_mib( memory_limit), 2)); } # endif if (memory_usage <= memory_limit) return; // Look for the last filter if it is LZMA2 or LZMA1, so we can make // it use less RAM. With other filters we don't know what to do. size_t i = 0; while (filters[i].id != LZMA_FILTER_LZMA2 && filters[i].id != LZMA_FILTER_LZMA1) { if (filters[i].id == LZMA_VLI_UNKNOWN) memlimit_too_small(memory_usage); ++i; } // Decrease the dictionary size until we meet the memory // usage limit. First round down to full mebibytes. lzma_options_lzma *opt = filters[i].options; const uint32_t orig_dict_size = opt->dict_size; opt->dict_size &= ~((UINT32_C(1) << 20) - 1); while (true) { // If it is below 1 MiB, auto-adjusting failed. We could be // more sophisticated and scale it down even more, but let's // see if many complain about this version. // // FIXME: Displays the scaled memory usage instead // of the original. if (opt->dict_size < (UINT32_C(1) << 20)) memlimit_too_small(memory_usage); memory_usage = lzma_raw_encoder_memusage(filters); if (memory_usage == UINT64_MAX) message_bug(); // Accept it if it is low enough. if (memory_usage <= memory_limit) break; // Otherwise 1 MiB down and try again. I hope this // isn't too slow method for cases where the original // dict_size is very big. opt->dict_size -= UINT32_C(1) << 20; } // Tell the user that we decreased the dictionary size. message(V_WARNING, _("Adjusted LZMA%c dictionary size " "from %s MiB to %s MiB to not exceed " "the memory usage limit of %s MiB"), filters[i].id == LZMA_FILTER_LZMA2 ? 
'2' : '1', uint64_to_str(orig_dict_size >> 20, 0), uint64_to_str(opt->dict_size >> 20, 1), uint64_to_str(round_up_to_mib(memory_limit), 2)); #endif return; } #ifdef HAVE_DECODERS /// Return true if the data in in_buf seems to be in the .xz format. static bool is_format_xz(void) { // Specify the magic as hex to be compatible with EBCDIC systems. static const uint8_t magic[6] = { 0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00 }; return strm.avail_in >= sizeof(magic) && memcmp(in_buf.u8, magic, sizeof(magic)) == 0; } /// Return true if the data in in_buf seems to be in the .lzma format. static bool is_format_lzma(void) { // The .lzma header is 13 bytes. if (strm.avail_in < 13) return false; // Decode the LZMA1 properties. lzma_filter filter = { .id = LZMA_FILTER_LZMA1 }; if (lzma_properties_decode(&filter, NULL, in_buf.u8, 5) != LZMA_OK) return false; // A hack to ditch tons of false positives: We allow only dictionary // sizes that are 2^n or 2^n + 2^(n-1) or UINT32_MAX. LZMA_Alone // created only files with 2^n, but accepts any dictionary size. // If someone complains, this will be reconsidered. lzma_options_lzma *opt = filter.options; const uint32_t dict_size = opt->dict_size; free(opt); if (dict_size != UINT32_MAX) { uint32_t d = dict_size - 1; d |= d >> 2; d |= d >> 3; d |= d >> 4; d |= d >> 8; d |= d >> 16; ++d; if (d != dict_size || dict_size == 0) return false; } // Another hack to ditch false positives: Assume that if the // uncompressed size is known, it must be less than 256 GiB. // Again, if someone complains, this will be reconsidered. uint64_t uncompressed_size = 0; for (size_t i = 0; i < 8; ++i) uncompressed_size |= (uint64_t)(in_buf.u8[5 + i]) << (i * 8); if (uncompressed_size != UINT64_MAX && uncompressed_size > (UINT64_C(1) << 38)) return false; return true; } #endif /// Detect the input file type (for now, this done only when decompressing), /// and initialize an appropriate coder. 
Return value indicates if a normal /// liblzma-based coder was initialized (CODER_INIT_NORMAL), if passthru /// mode should be used (CODER_INIT_PASSTHRU), or if an error occurred /// (CODER_INIT_ERROR). static enum coder_init_ret coder_init(file_pair *pair) { lzma_ret ret = LZMA_PROG_ERROR; if (opt_mode == MODE_COMPRESS) { #ifdef HAVE_ENCODERS switch (opt_format) { case FORMAT_AUTO: // args.c ensures this. assert(0); break; case FORMAT_XZ: # ifdef MYTHREAD_ENABLED if (hardware_threads_get() > 1) ret = lzma_stream_encoder_mt( &strm, &mt_options); else # endif ret = lzma_stream_encoder( &strm, filters, check); break; case FORMAT_LZMA: ret = lzma_alone_encoder(&strm, filters[0].options); break; case FORMAT_RAW: ret = lzma_raw_encoder(&strm, filters); break; } #endif } else { #ifdef HAVE_DECODERS uint32_t flags = 0; // It seems silly to warn about unsupported check if the // check won't be verified anyway due to --ignore-check. if (opt_ignore_check) flags |= LZMA_IGNORE_CHECK; else flags |= LZMA_TELL_UNSUPPORTED_CHECK; if (!opt_single_stream) flags |= LZMA_CONCATENATED; // We abuse FORMAT_AUTO to indicate unknown file format, // for which we may consider passthru mode. enum format_type init_format = FORMAT_AUTO; switch (opt_format) { case FORMAT_AUTO: if (is_format_xz()) init_format = FORMAT_XZ; else if (is_format_lzma()) init_format = FORMAT_LZMA; break; case FORMAT_XZ: if (is_format_xz()) init_format = FORMAT_XZ; break; case FORMAT_LZMA: if (is_format_lzma()) init_format = FORMAT_LZMA; break; case FORMAT_RAW: init_format = FORMAT_RAW; break; } switch (init_format) { case FORMAT_AUTO: // Unknown file format. If --decompress --stdout // --force have been given, then we copy the input // as is to stdout. Checking for MODE_DECOMPRESS // is needed, because we don't want to use // passthru mode with --test. 
if (opt_mode == MODE_DECOMPRESS && opt_stdout && opt_force) return CODER_INIT_PASSTHRU; ret = LZMA_FORMAT_ERROR; break; case FORMAT_XZ: ret = lzma_stream_decoder(&strm, hardware_memlimit_get( MODE_DECOMPRESS), flags); break; case FORMAT_LZMA: ret = lzma_alone_decoder(&strm, hardware_memlimit_get( MODE_DECOMPRESS)); break; case FORMAT_RAW: // Memory usage has already been checked in // coder_set_compression_settings(). ret = lzma_raw_decoder(&strm, filters); break; } // Try to decode the headers. This will catch too low // memory usage limit in case it happens in the first // Block of the first Stream, which is where it very // probably will happen if it is going to happen. if (ret == LZMA_OK && init_format != FORMAT_RAW) { strm.next_out = NULL; strm.avail_out = 0; ret = lzma_code(&strm, LZMA_RUN); } #endif } if (ret != LZMA_OK) { message_error("%s: %s", pair->src_name, message_strm(ret)); if (ret == LZMA_MEMLIMIT_ERROR) message_mem_needed(V_ERROR, lzma_memusage(&strm)); return CODER_INIT_ERROR; } return CODER_INIT_NORMAL; } /// Resolve conflicts between opt_block_size and opt_block_list in single /// threaded mode. We want to default to opt_block_list, except when it is /// larger than opt_block_size. If this is the case for the current Block /// at *list_pos, then we break into smaller Blocks. Otherwise advance /// to the next Block in opt_block_list, and break apart if needed. static void split_block(uint64_t *block_remaining, uint64_t *next_block_remaining, size_t *list_pos) { if (*next_block_remaining > 0) { // The Block at *list_pos has previously been split up. assert(hardware_threads_get() == 1); assert(opt_block_size > 0); assert(opt_block_list != NULL); if (*next_block_remaining > opt_block_size) { // We have to split the current Block at *list_pos // into another opt_block_size length Block. *block_remaining = opt_block_size; } else { // This is the last remaining split Block for the // Block at *list_pos. 
*block_remaining = *next_block_remaining; } *next_block_remaining -= *block_remaining; } else { // The Block at *list_pos has been finished. Go to the next // entry in the list. If the end of the list has been reached, // reuse the size of the last Block. if (opt_block_list[*list_pos + 1] != 0) ++*list_pos; *block_remaining = opt_block_list[*list_pos]; // If in single-threaded mode, split up the Block if needed. // This is not needed in multi-threaded mode because liblzma // will do this due to how threaded encoding works. if (hardware_threads_get() == 1 && opt_block_size > 0 && *block_remaining > opt_block_size) { *next_block_remaining = *block_remaining - opt_block_size; *block_remaining = opt_block_size; } } } +static bool +coder_write_output(file_pair *pair) +{ + if (opt_mode != MODE_TEST) { + if (io_write(pair, &out_buf, IO_BUFFER_SIZE - strm.avail_out)) + return true; + } + + strm.next_out = out_buf.u8; + strm.avail_out = IO_BUFFER_SIZE; + return false; +} + + /// Compress or decompress using liblzma. static bool coder_normal(file_pair *pair) { // Encoder needs to know when we have given all the input to it. // The decoders need to know it too when we are using // LZMA_CONCATENATED. We need to check for src_eof here, because // the first input chunk has been already read if decompressing, // and that may have been the only chunk we will read. lzma_action action = pair->src_eof ? LZMA_FINISH : LZMA_RUN; lzma_ret ret; // Assume that something goes wrong. bool success = false; // block_remaining indicates how many input bytes to encode before // finishing the current .xz Block. The Block size is set with // --block-size=SIZE and --block-list. They have an effect only when // compressing to the .xz format. If block_remaining == UINT64_MAX, // only a single block is created. 
uint64_t block_remaining = UINT64_MAX; - // next_block_remining for when we are in single-threaded mode and + // next_block_remaining for when we are in single-threaded mode and // the Block in --block-list is larger than the --block-size=SIZE. uint64_t next_block_remaining = 0; // Position in opt_block_list. Unused if --block-list wasn't used. size_t list_pos = 0; // Handle --block-size for single-threaded mode and the first step // of --block-list. if (opt_mode == MODE_COMPRESS && opt_format == FORMAT_XZ) { // --block-size doesn't do anything here in threaded mode, // because the threaded encoder will take care of splitting // to fixed-sized Blocks. if (hardware_threads_get() == 1 && opt_block_size > 0) block_remaining = opt_block_size; // If --block-list was used, start with the first size. // // For threaded case, --block-size specifies how big Blocks // the encoder needs to be prepared to create at maximum // and --block-list will simultaneously cause new Blocks // to be started at specified intervals. To keep things // logical, the same is done in single-threaded mode. The // output is still not identical because in single-threaded // mode the size info isn't written into Block Headers. if (opt_block_list != NULL) { if (block_remaining < opt_block_list[list_pos]) { assert(hardware_threads_get() == 1); next_block_remaining = opt_block_list[list_pos] - block_remaining; } else { block_remaining = opt_block_list[list_pos]; } } } strm.next_out = out_buf.u8; strm.avail_out = IO_BUFFER_SIZE; while (!user_abort) { // Fill the input buffer if it is empty and we aren't // flushing or finishing. if (strm.avail_in == 0 && action == LZMA_RUN) { strm.next_in = in_buf.u8; strm.avail_in = io_read(pair, &in_buf, my_min(block_remaining, IO_BUFFER_SIZE)); if (strm.avail_in == SIZE_MAX) break; if (pair->src_eof) { action = LZMA_FINISH; } else if (block_remaining != UINT64_MAX) { // Start a new Block after every // opt_block_size bytes of input. 
block_remaining -= strm.avail_in; if (block_remaining == 0) action = LZMA_FULL_BARRIER; } - if (action == LZMA_RUN && flush_needed) + if (action == LZMA_RUN && pair->flush_needed) action = LZMA_SYNC_FLUSH; } // Let liblzma do the actual work. ret = lzma_code(&strm, action); // Write out if the output buffer became full. if (strm.avail_out == 0) { - if (opt_mode != MODE_TEST && io_write(pair, &out_buf, - IO_BUFFER_SIZE - strm.avail_out)) + if (coder_write_output(pair)) break; - - strm.next_out = out_buf.u8; - strm.avail_out = IO_BUFFER_SIZE; } if (ret == LZMA_STREAM_END && (action == LZMA_SYNC_FLUSH || action == LZMA_FULL_BARRIER)) { if (action == LZMA_SYNC_FLUSH) { // Flushing completed. Write the pending data - // out immediatelly so that the reading side + // out immediately so that the reading side // can decompress everything compressed so far. - if (io_write(pair, &out_buf, IO_BUFFER_SIZE - - strm.avail_out)) + if (coder_write_output(pair)) break; - strm.next_out = out_buf.u8; - strm.avail_out = IO_BUFFER_SIZE; - - // Set the time of the most recent flushing. - mytime_set_flush_time(); + // Mark that we haven't seen any new input + // since the previous flush. + pair->src_has_seen_input = false; + pair->flush_needed = false; } else { // Start a new Block after LZMA_FULL_BARRIER. if (opt_block_list == NULL) { assert(hardware_threads_get() == 1); assert(opt_block_size > 0); block_remaining = opt_block_size; } else { split_block(&block_remaining, &next_block_remaining, &list_pos); } } // Start a new Block after LZMA_FULL_FLUSH or continue // the same block after LZMA_SYNC_FLUSH. action = LZMA_RUN; } else if (ret != LZMA_OK) { // Determine if the return value indicates that we // won't continue coding. 
const bool stop = ret != LZMA_NO_CHECK && ret != LZMA_UNSUPPORTED_CHECK; if (stop) { // Write the remaining bytes even if something // went wrong, because that way the user gets // as much data as possible, which can be good // when trying to get at least some useful // data out of damaged files. - if (opt_mode != MODE_TEST && io_write(pair, - &out_buf, IO_BUFFER_SIZE - - strm.avail_out)) + if (coder_write_output(pair)) break; } if (ret == LZMA_STREAM_END) { if (opt_single_stream) { io_fix_src_pos(pair, strm.avail_in); success = true; break; } // Check that there is no trailing garbage. // This is needed for LZMA_Alone and raw // streams. if (strm.avail_in == 0 && !pair->src_eof) { // Try reading one more byte. // Hopefully we don't get any more // input, and thus pair->src_eof // becomes true. strm.avail_in = io_read( pair, &in_buf, 1); if (strm.avail_in == SIZE_MAX) break; assert(strm.avail_in == 0 || strm.avail_in == 1); } if (strm.avail_in == 0) { assert(pair->src_eof); success = true; break; } // We hadn't reached the end of the file. ret = LZMA_DATA_ERROR; assert(stop); } // If we get here and stop is true, something went // wrong and we print an error. Otherwise it's just // a warning and coding can continue. if (stop) { message_error("%s: %s", pair->src_name, message_strm(ret)); } else { message_warning("%s: %s", pair->src_name, message_strm(ret)); // When compressing, all possible errors set // stop to true. assert(opt_mode != MODE_COMPRESS); } if (ret == LZMA_MEMLIMIT_ERROR) { // Display how much memory it would have // actually needed. message_mem_needed(V_ERROR, lzma_memusage(&strm)); } if (stop) break; } // Show progress information under certain conditions. message_progress_update(); } return success; } /// Copy from input file to output file without processing the data in any /// way. This is used only when trying to decompress unrecognized files /// with --decompress --stdout --force, so the output is always stdout. 
static bool coder_passthru(file_pair *pair) { while (strm.avail_in != 0) { if (user_abort) return false; if (io_write(pair, &in_buf, strm.avail_in)) return false; strm.total_in += strm.avail_in; strm.total_out = strm.total_in; message_progress_update(); strm.avail_in = io_read(pair, &in_buf, IO_BUFFER_SIZE); if (strm.avail_in == SIZE_MAX) return false; } return true; } extern void coder_run(const char *filename) { // Set and possibly print the filename for the progress message. message_filename(filename); // Try to open the input file. file_pair *pair = io_open_src(filename); if (pair == NULL) return; // Assume that something goes wrong. bool success = false; if (opt_mode == MODE_COMPRESS) { strm.next_in = NULL; strm.avail_in = 0; } else { // Read the first chunk of input data. This is needed // to detect the input file type. strm.next_in = in_buf.u8; strm.avail_in = io_read(pair, &in_buf, IO_BUFFER_SIZE); } if (strm.avail_in != SIZE_MAX) { // Initialize the coder. This will detect the file format // and, in decompression or testing mode, check the memory // usage of the first Block too. This way we don't try to // open the destination file if we see that coding wouldn't // work at all anyway. This also avoids deleting the old // "target" file if --force was used. const enum coder_init_ret init_ret = coder_init(pair); if (init_ret != CODER_INIT_ERROR && !user_abort) { // Don't open the destination file when --test // is used. if (opt_mode == MODE_TEST || !io_open_dest(pair)) { // Remember the current time. It is needed - // for progress indicator and for timed - // flushing. + // for progress indicator. mytime_set_start_time(); // Initialize the progress indicator. + const bool is_passthru = init_ret + == CODER_INIT_PASSTHRU; const uint64_t in_size - = pair->src_st.st_size <= 0 - ? 0 : pair->src_st.st_size; - message_progress_start(&strm, in_size); + = pair->src_st.st_size <= 0 + ? 
0 : (uint64_t)(pair->src_st.st_size); + message_progress_start(&strm, + is_passthru, in_size); // Do the actual coding or passthru. - if (init_ret == CODER_INIT_NORMAL) - success = coder_normal(pair); - else + if (is_passthru) success = coder_passthru(pair); + else + success = coder_normal(pair); message_progress_end(success); } } } // Close the file pair. It needs to know if coding was successful to // know if the source or target file should be unlinked. io_close(pair, success); return; } #ifndef NDEBUG extern void coder_free(void) { lzma_end(&strm); return; } #endif Index: head/contrib/xz/src/xz/file_io.c =================================================================== --- head/contrib/xz/src/xz/file_io.c (revision 359200) +++ head/contrib/xz/src/xz/file_io.c (revision 359201) @@ -1,1300 +1,1321 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file file_io.c /// \brief File opening, unlinking, and closing // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "private.h" #include <fcntl.h> #ifdef TUKLIB_DOSLIKE # include <io.h> #else # include <poll.h> static bool warn_fchown; #endif #if defined(HAVE_FUTIMES) || defined(HAVE_FUTIMESAT) || defined(HAVE_UTIMES) # include <sys/time.h> #elif defined(HAVE__FUTIME) # include <sys/utime.h> #elif defined(HAVE_UTIME) # include <utime.h> #endif #ifdef HAVE_CAPSICUM # ifdef HAVE_SYS_CAPSICUM_H # include <sys/capsicum.h> # else # include <sys/capability.h> # endif #endif #include "tuklib_open_stdxxx.h" #ifndef O_BINARY # define O_BINARY 0 #endif #ifndef O_NOCTTY # define O_NOCTTY 0 #endif // Using this macro to silence a warning from gcc -Wlogical-op. #if EAGAIN == EWOULDBLOCK # define IS_EAGAIN_OR_EWOULDBLOCK(e) ((e) == EAGAIN) #else # define IS_EAGAIN_OR_EWOULDBLOCK(e) \ ((e) == EAGAIN || (e) == EWOULDBLOCK) #endif typedef enum { IO_WAIT_MORE, // Reading or writing is possible.
IO_WAIT_ERROR, // Error or user_abort IO_WAIT_TIMEOUT, // poll() timed out } io_wait_ret; /// If true, try to create sparse files when decompressing. static bool try_sparse = true; #ifdef ENABLE_SANDBOX /// True if the conditions for sandboxing (described in main()) have been met. static bool sandbox_allowed = false; #endif #ifndef TUKLIB_DOSLIKE /// File status flags of standard input. This is used by io_open_src() /// and io_close_src(). static int stdin_flags; static bool restore_stdin_flags = false; /// Original file status flags of standard output. This is used by /// io_open_dest() and io_close_dest() to save and restore the flags. static int stdout_flags; static bool restore_stdout_flags = false; /// Self-pipe used together with the user_abort variable to avoid /// race conditions with signal handling. static int user_abort_pipe[2]; #endif static bool io_write_buf(file_pair *pair, const uint8_t *buf, size_t size); extern void io_init(void) { // Make sure that stdin, stdout, and stderr are connected to // a valid file descriptor. Exit immediately with exit code ERROR // if we cannot make the file descriptors valid. Maybe we should // print an error message, but our stderr could be screwed anyway. tuklib_open_stdxxx(E_ERROR); #ifndef TUKLIB_DOSLIKE // If fchown() fails setting the owner, we warn about it only if // we are root. warn_fchown = geteuid() == 0; // Create a pipe for the self-pipe trick. if (pipe(user_abort_pipe)) message_fatal(_("Error creating a pipe: %s"), strerror(errno)); // Make both ends of the pipe non-blocking. for (unsigned i = 0; i < 2; ++i) { int flags = fcntl(user_abort_pipe[i], F_GETFL); if (flags == -1 || fcntl(user_abort_pipe[i], F_SETFL, flags | O_NONBLOCK) == -1) message_fatal(_("Error creating a pipe: %s"), strerror(errno)); } #endif #ifdef __DJGPP__ // Avoid doing useless things when statting files. // This isn't important but doesn't hurt. 
_djstat_flags = _STAT_EXEC_EXT | _STAT_EXEC_MAGIC | _STAT_DIRSIZE; #endif return; } #ifndef TUKLIB_DOSLIKE extern void io_write_to_user_abort_pipe(void) { // If the write() fails, it's probably due to the pipe being full. // Failing in that case is fine. If the reason is something else, // there's not much we can do since this is called in a signal // handler. So ignore the errors and try to avoid warnings with // GCC and glibc when _FORTIFY_SOURCE=2 is used. uint8_t b = '\0'; const int ret = write(user_abort_pipe[1], &b, 1); (void)ret; return; } #endif extern void io_no_sparse(void) { try_sparse = false; return; } #ifdef ENABLE_SANDBOX extern void io_allow_sandbox(void) { sandbox_allowed = true; return; } /// Enables operating-system-specific sandbox if it is possible. /// src_fd is the file descriptor of the input file. static void io_sandbox_enter(int src_fd) { if (!sandbox_allowed) { - message(V_DEBUG, _("Sandbox is disabled due " - "to incompatible command line arguments")); + // This message is more often annoying than useful so + // it's commented out. It can be useful when developing + // the sandboxing code. + //message(V_DEBUG, _("Sandbox is disabled due " + // "to incompatible command line arguments")); return; } const char dummy_str[] = "x"; // Try to ensure that both libc and xz locale files have been // loaded when NLS is enabled. snprintf(NULL, 0, "%s%s", _(dummy_str), strerror(EINVAL)); // Try to ensure that iconv data files needed for handling multibyte // characters have been loaded. This is needed at least with glibc. tuklib_mbstr_width(dummy_str, NULL); #ifdef HAVE_CAPSICUM // Capsicum needs FreeBSD 10.0 or later. 
cap_rights_t rights; if (cap_rights_limit(src_fd, cap_rights_init(&rights, CAP_EVENT, CAP_FCNTL, CAP_LOOKUP, CAP_READ, CAP_SEEK))) goto error; if (cap_rights_limit(STDOUT_FILENO, cap_rights_init(&rights, CAP_EVENT, CAP_FCNTL, CAP_FSTAT, CAP_LOOKUP, CAP_WRITE, CAP_SEEK))) goto error; if (cap_rights_limit(user_abort_pipe[0], cap_rights_init(&rights, CAP_EVENT))) goto error; if (cap_rights_limit(user_abort_pipe[1], cap_rights_init(&rights, CAP_WRITE))) goto error; if (cap_enter()) goto error; #else # error ENABLE_SANDBOX is defined but no sandboxing method was found. #endif - message(V_DEBUG, _("Sandbox was successfully enabled")); + // This message is annoying in xz -lvv. + //message(V_DEBUG, _("Sandbox was successfully enabled")); return; error: message(V_DEBUG, _("Failed to enable the sandbox")); } #endif // ENABLE_SANDBOX #ifndef TUKLIB_DOSLIKE /// \brief Waits for input or output to become available or for a signal /// /// This uses the self-pipe trick to avoid a race condition that can occur /// if a signal is caught after user_abort has been checked but before e.g. /// read() has been called. In that situation read() could block unless /// non-blocking I/O is used. With non-blocking I/O something like select() /// or poll() is needed to avoid a busy-wait loop, and the same race condition /// pops up again. There are pselect() (POSIX-1.2001) and ppoll() (not in /// POSIX) but neither is portable enough in 2013. The self-pipe trick is /// old and very portable. static io_wait_ret io_wait(file_pair *pair, int timeout, bool is_reading) { struct pollfd pfd[2]; if (is_reading) { pfd[0].fd = pair->src_fd; pfd[0].events = POLLIN; } else { pfd[0].fd = pair->dest_fd; pfd[0].events = POLLOUT; } pfd[1].fd = user_abort_pipe[0]; pfd[1].events = POLLIN; while (true) { const int ret = poll(pfd, 2, timeout); if (user_abort) return IO_WAIT_ERROR; if (ret == -1) { if (errno == EINTR || errno == EAGAIN) continue; message_error(_("%s: poll() failed: %s"), is_reading ? 
pair->src_name : pair->dest_name, strerror(errno)); return IO_WAIT_ERROR; } - if (ret == 0) { - assert(opt_flush_timeout != 0); - flush_needed = true; + if (ret == 0) return IO_WAIT_TIMEOUT; - } if (pfd[0].revents != 0) return IO_WAIT_MORE; } } #endif /// \brief Unlink a file /// /// This tries to verify that the file being unlinked really is the file that /// we want to unlink by verifying device and inode numbers. There's still /// a small unavoidable race, but this is much better than nothing (the file /// could have been moved/replaced even hours earlier). static void io_unlink(const char *name, const struct stat *known_st) { #if defined(TUKLIB_DOSLIKE) // On DOS-like systems, st_ino is meaningless, so don't bother // testing it. Just silence a compiler warning. (void)known_st; #else struct stat new_st; // If --force was used, use stat() instead of lstat(). This way // (de)compressing symlinks works correctly. However, it also means // that xz cannot detect if a regular file foo is renamed to bar // and then a symlink foo -> bar is created. Because of stat() // instead of lstat(), xz will think that foo hasn't been replaced // with another file. Thus, xz will remove foo even though it no // longer is the same file that xz used when it started compressing. // Probably it's not too bad though, so this doesn't need a more // complex fix. const int stat_ret = opt_force ? stat(name, &new_st) : lstat(name, &new_st); if (stat_ret # ifdef __VMS // st_ino is an array, and we don't want to // compare st_dev at all. || memcmp(&new_st.st_ino, &known_st->st_ino, sizeof(new_st.st_ino)) != 0 # else // Typical POSIX-like system || new_st.st_dev != known_st->st_dev || new_st.st_ino != known_st->st_ino # endif ) // TRANSLATORS: When compression or decompression finishes, // and xz is going to remove the source file, xz first checks // if the source file still exists, and if it does, does its // device and inode numbers match what xz saw when it opened // the source file. 
If these checks fail, this message is // shown, %s being the filename, and the file is not deleted. // The check for device and inode numbers is there, because // it is possible that the user has put a new file in place // of the original file, and in that case it obviously // shouldn't be removed. message_error(_("%s: File seems to have been moved, " "not removing"), name); else #endif // There's a race condition between lstat() and unlink() // but at least we have tried to avoid removing wrong file. if (unlink(name)) message_error(_("%s: Cannot remove: %s"), name, strerror(errno)); return; } /// \brief Copies owner/group and permissions /// /// \todo ACL and EA support /// static void io_copy_attrs(const file_pair *pair) { // Skip chown and chmod on Windows. #ifndef TUKLIB_DOSLIKE // This function is more tricky than you may think at first. // Blindly copying permissions may permit users to access the // destination file who didn't have permission to access the // source file. // Try changing the owner of the file. If we aren't root or the owner // isn't already us, fchown() probably doesn't succeed. We warn // about failing fchown() only if we are root. - if (fchown(pair->dest_fd, pair->src_st.st_uid, -1) && warn_fchown) + if (fchown(pair->dest_fd, pair->src_st.st_uid, (gid_t)(-1)) + && warn_fchown) message_warning(_("%s: Cannot set the file owner: %s"), pair->dest_name, strerror(errno)); mode_t mode; - if (fchown(pair->dest_fd, -1, pair->src_st.st_gid)) { + if (fchown(pair->dest_fd, (uid_t)(-1), pair->src_st.st_gid)) { message_warning(_("%s: Cannot set the file group: %s"), pair->dest_name, strerror(errno)); // We can still safely copy some additional permissions: // `group' must be at least as strict as `other' and // also vice versa. // // NOTE: After this, the owner of the source file may // get additional permissions. This shouldn't be too bad, // because the owner would have had permission to chmod // the original file anyway. 
mode = ((pair->src_st.st_mode & 0070) >> 3) & (pair->src_st.st_mode & 0007); mode = (pair->src_st.st_mode & 0700) | (mode << 3) | mode; } else { // Drop the setuid, setgid, and sticky bits. mode = pair->src_st.st_mode & 0777; } if (fchmod(pair->dest_fd, mode)) message_warning(_("%s: Cannot set the file permissions: %s"), pair->dest_name, strerror(errno)); #endif // Copy the timestamps. We have several possible ways to do this, of // which some are better in both security and precision. // // First, get the nanosecond part of the timestamps. As of writing, // it's not standardized by POSIX, and there are several names for // the same thing in struct stat. long atime_nsec; long mtime_nsec; # if defined(HAVE_STRUCT_STAT_ST_ATIM_TV_NSEC) // GNU and Solaris atime_nsec = pair->src_st.st_atim.tv_nsec; mtime_nsec = pair->src_st.st_mtim.tv_nsec; # elif defined(HAVE_STRUCT_STAT_ST_ATIMESPEC_TV_NSEC) // BSD atime_nsec = pair->src_st.st_atimespec.tv_nsec; mtime_nsec = pair->src_st.st_mtimespec.tv_nsec; # elif defined(HAVE_STRUCT_STAT_ST_ATIMENSEC) // GNU and BSD without extensions atime_nsec = pair->src_st.st_atimensec; mtime_nsec = pair->src_st.st_mtimensec; # elif defined(HAVE_STRUCT_STAT_ST_UATIME) // Tru64 atime_nsec = pair->src_st.st_uatime * 1000; mtime_nsec = pair->src_st.st_umtime * 1000; # elif defined(HAVE_STRUCT_STAT_ST_ATIM_ST__TIM_TV_NSEC) // UnixWare atime_nsec = pair->src_st.st_atim.st__tim.tv_nsec; mtime_nsec = pair->src_st.st_mtim.st__tim.tv_nsec; # else // Safe fallback atime_nsec = 0; mtime_nsec = 0; # endif // Construct a structure to hold the timestamps and call appropriate // function to set the timestamps. #if defined(HAVE_FUTIMENS) // Use nanosecond precision. struct timespec tv[2]; tv[0].tv_sec = pair->src_st.st_atime; tv[0].tv_nsec = atime_nsec; tv[1].tv_sec = pair->src_st.st_mtime; tv[1].tv_nsec = mtime_nsec; (void)futimens(pair->dest_fd, tv); #elif defined(HAVE_FUTIMES) || defined(HAVE_FUTIMESAT) || defined(HAVE_UTIMES) // Use microsecond precision. 
struct timeval tv[2]; tv[0].tv_sec = pair->src_st.st_atime; tv[0].tv_usec = atime_nsec / 1000; tv[1].tv_sec = pair->src_st.st_mtime; tv[1].tv_usec = mtime_nsec / 1000; # if defined(HAVE_FUTIMES) (void)futimes(pair->dest_fd, tv); # elif defined(HAVE_FUTIMESAT) (void)futimesat(pair->dest_fd, NULL, tv); # else // Argh, no function to use a file descriptor to set the timestamp. (void)utimes(pair->dest_name, tv); # endif #elif defined(HAVE__FUTIME) // Use one-second precision with Windows-specific _futime(). // We could use utime() too except that for some reason the // timestamp will get reset at close(). With _futime() it works. // This struct cannot be const as _futime() takes a non-const pointer. struct _utimbuf buf = { .actime = pair->src_st.st_atime, .modtime = pair->src_st.st_mtime, }; // Avoid warnings. (void)atime_nsec; (void)mtime_nsec; (void)_futime(pair->dest_fd, &buf); #elif defined(HAVE_UTIME) // Use one-second precision. utime() doesn't support using file // descriptor either. Some systems have broken utime() prototype // so don't make this const. struct utimbuf buf = { .actime = pair->src_st.st_atime, .modtime = pair->src_st.st_mtime, }; // Avoid warnings. (void)atime_nsec; (void)mtime_nsec; (void)utime(pair->dest_name, &buf); #endif return; } /// Opens the source file. Returns false on success, true on error. static bool io_open_src_real(file_pair *pair) { // There's nothing to open when reading from stdin. if (pair->src_name == stdin_filename) { pair->src_fd = STDIN_FILENO; #ifdef TUKLIB_DOSLIKE setmode(STDIN_FILENO, O_BINARY); #else // Try to set stdin to non-blocking mode. It won't work // e.g. on OpenBSD if stdout is e.g. /dev/null. In such // case we proceed as if stdin were non-blocking anyway // (in case of /dev/null it will be in practice). The // same applies to stdout in io_open_dest_real(). 
stdin_flags = fcntl(STDIN_FILENO, F_GETFL); if (stdin_flags == -1) { message_error(_("Error getting the file status flags " "from standard input: %s"), strerror(errno)); return true; } if ((stdin_flags & O_NONBLOCK) == 0 && fcntl(STDIN_FILENO, F_SETFL, stdin_flags | O_NONBLOCK) != -1) restore_stdin_flags = true; #endif #ifdef HAVE_POSIX_FADVISE // It will fail if stdin is a pipe and that's fine. (void)posix_fadvise(STDIN_FILENO, 0, 0, opt_mode == MODE_LIST ? POSIX_FADV_RANDOM : POSIX_FADV_SEQUENTIAL); #endif return false; } // Symlinks are not followed unless writing to stdout or --force // was used. const bool follow_symlinks = opt_stdout || opt_force; // We accept only regular files if we are writing the output // to disk too. bzip2 allows overriding this with --force but // gzip and xz don't. const bool reg_files_only = !opt_stdout; // Flags for open() int flags = O_RDONLY | O_BINARY | O_NOCTTY; #ifndef TUKLIB_DOSLIKE // Use non-blocking I/O: // - It prevents blocking when opening FIFOs and some other // special files, which is good if we want to accept only // regular files. // - It can help avoiding some race conditions with signal handling. flags |= O_NONBLOCK; #endif #if defined(O_NOFOLLOW) if (!follow_symlinks) flags |= O_NOFOLLOW; #elif !defined(TUKLIB_DOSLIKE) // Some POSIX-like systems lack O_NOFOLLOW (it's not required // by POSIX). Check for symlinks with a separate lstat() on // these systems. if (!follow_symlinks) { struct stat st; if (lstat(pair->src_name, &st)) { message_error("%s: %s", pair->src_name, strerror(errno)); return true; } else if (S_ISLNK(st.st_mode)) { message_warning(_("%s: Is a symbolic link, " "skipping"), pair->src_name); return true; } } #else // Avoid warnings. (void)follow_symlinks; #endif // Try to open the file. Signals have been blocked so EINTR shouldn't // be possible. pair->src_fd = open(pair->src_name, flags); if (pair->src_fd == -1) { // Signals (that have a signal handler) have been blocked. 
assert(errno != EINTR); #ifdef O_NOFOLLOW // Give an understandable error message if the reason // for failing was that the file was a symbolic link. // // Note that at least Linux, OpenBSD, Solaris, and Darwin // use ELOOP to indicate that O_NOFOLLOW was the reason // that open() failed. Because there may be // directories in the pathname, ELOOP may occur also // because of a symlink loop in the directory part. // So ELOOP doesn't tell us what actually went wrong, // and this stupidity went into POSIX-1.2008 too. // // FreeBSD associates EMLINK with O_NOFOLLOW and // Tru64 uses ENOTSUP. We use these directly here // and skip the lstat() call and the associated race. // I want to hear if there are other kernels that // fail with something else than ELOOP with O_NOFOLLOW. bool was_symlink = false; # if defined(__FreeBSD__) || defined(__DragonFly__) if (errno == EMLINK) was_symlink = true; # elif defined(__digital__) && defined(__unix__) if (errno == ENOTSUP) was_symlink = true; # elif defined(__NetBSD__) if (errno == EFTYPE) was_symlink = true; # else if (errno == ELOOP && !follow_symlinks) { const int saved_errno = errno; struct stat st; if (lstat(pair->src_name, &st) == 0 && S_ISLNK(st.st_mode)) was_symlink = true; errno = saved_errno; } # endif if (was_symlink) message_warning(_("%s: Is a symbolic link, " "skipping"), pair->src_name); else #endif // Something else than O_NOFOLLOW failing // (assuming that the race conditions didn't // confuse us). message_error("%s: %s", pair->src_name, strerror(errno)); return true; } // Stat the source file. We need the result also when we copy // the permissions, and when unlinking. // // NOTE: Use stat() instead of fstat() with DJGPP, because // then we have a better chance to get st_ino value that can // be used in io_open_dest_real() to prevent overwriting the // source file. 
#ifdef __DJGPP__ if (stat(pair->src_name, &pair->src_st)) goto error_msg; #else if (fstat(pair->src_fd, &pair->src_st)) goto error_msg; #endif if (S_ISDIR(pair->src_st.st_mode)) { message_warning(_("%s: Is a directory, skipping"), pair->src_name); goto error; } if (reg_files_only && !S_ISREG(pair->src_st.st_mode)) { message_warning(_("%s: Not a regular file, skipping"), pair->src_name); goto error; } #ifndef TUKLIB_DOSLIKE if (reg_files_only && !opt_force) { if (pair->src_st.st_mode & (S_ISUID | S_ISGID)) { // gzip rejects setuid and setgid files even // when --force was used. bzip2 doesn't check // for them, but calls fchown() after fchmod(), // and many systems automatically drop setuid // and setgid bits there. // // We accept setuid and setgid files if // --force was used. We drop these bits // explicitly in io_copy_attr(). message_warning(_("%s: File has setuid or " "setgid bit set, skipping"), pair->src_name); goto error; } if (pair->src_st.st_mode & S_ISVTX) { message_warning(_("%s: File has sticky bit " "set, skipping"), pair->src_name); goto error; } if (pair->src_st.st_nlink > 1) { message_warning(_("%s: Input file has more " "than one hard link, " "skipping"), pair->src_name); goto error; } } // If it is something else than a regular file, wait until // there is input available. This way reading from FIFOs // will work when open() is used with O_NONBLOCK. if (!S_ISREG(pair->src_st.st_mode)) { signals_unblock(); const io_wait_ret ret = io_wait(pair, -1, true); signals_block(); if (ret != IO_WAIT_MORE) goto error; } #endif #ifdef HAVE_POSIX_FADVISE // It will fail with some special files like FIFOs but that is fine. (void)posix_fadvise(pair->src_fd, 0, 0, opt_mode == MODE_LIST ? 
POSIX_FADV_RANDOM : POSIX_FADV_SEQUENTIAL); #endif return false; error_msg: message_error("%s: %s", pair->src_name, strerror(errno)); error: (void)close(pair->src_fd); return true; } extern file_pair * io_open_src(const char *src_name) { if (is_empty_filename(src_name)) return NULL; // Since we have only one file open at a time, we can use // a statically allocated structure. static file_pair pair; pair = (file_pair){ .src_name = src_name, .dest_name = NULL, .src_fd = -1, .dest_fd = -1, .src_eof = false, + .src_has_seen_input = false, + .flush_needed = false, .dest_try_sparse = false, .dest_pending_sparse = 0, }; // Block the signals, for which we have a custom signal handler, so // that we don't need to worry about EINTR. signals_block(); const bool error = io_open_src_real(&pair); signals_unblock(); #ifdef ENABLE_SANDBOX if (!error) io_sandbox_enter(pair.src_fd); #endif return error ? NULL : &pair; } /// \brief Closes source file of the file_pair structure /// /// \param pair File whose src_fd should be closed /// \param success If true, the file will be removed from the disk if /// closing succeeds and --keep hasn't been used. static void io_close_src(file_pair *pair, bool success) { #ifndef TUKLIB_DOSLIKE if (restore_stdin_flags) { assert(pair->src_fd == STDIN_FILENO); restore_stdin_flags = false; if (fcntl(STDIN_FILENO, F_SETFL, stdin_flags) == -1) message_error(_("Error restoring the status flags " "to standard input: %s"), strerror(errno)); } #endif if (pair->src_fd != STDIN_FILENO && pair->src_fd != -1) { // Close the file before possibly unlinking it. On DOS-like // systems this is always required since unlinking will fail // if the file is open. On POSIX systems it usually works // to unlink open files, but in some cases it doesn't and // one gets EBUSY in errno. // // xz 5.2.2 and older unlinked the file before closing it // (except on DOS-like systems). The old code didn't handle // EBUSY and could fail e.g. on some CIFS shares. 
The // advantage of unlinking before closing is negligible // (avoids a race between close() and stat()/lstat() and // unlink()), so let's keep this simple. (void)close(pair->src_fd); if (success && !opt_keep_original) io_unlink(pair->src_name, &pair->src_st); } return; } static bool io_open_dest_real(file_pair *pair) { if (opt_stdout || pair->src_fd == STDIN_FILENO) { // We don't modify or free() this. pair->dest_name = (char *)"(stdout)"; pair->dest_fd = STDOUT_FILENO; #ifdef TUKLIB_DOSLIKE setmode(STDOUT_FILENO, O_BINARY); #else // Try to set O_NONBLOCK if it isn't already set. // If it fails, we assume that stdout is non-blocking // in practice. See the comments in io_open_src_real() // for similar situation with stdin. // // NOTE: O_APPEND may be unset later in this function // and it relies on stdout_flags being set here. stdout_flags = fcntl(STDOUT_FILENO, F_GETFL); if (stdout_flags == -1) { message_error(_("Error getting the file status flags " "from standard output: %s"), strerror(errno)); return true; } if ((stdout_flags & O_NONBLOCK) == 0 && fcntl(STDOUT_FILENO, F_SETFL, stdout_flags | O_NONBLOCK) != -1) restore_stdout_flags = true; #endif } else { pair->dest_name = suffix_get_dest_name(pair->src_name); if (pair->dest_name == NULL) return true; #ifdef __DJGPP__ struct stat st; if (stat(pair->dest_name, &st) == 0) { // Check that it isn't a special file like "prn". if (st.st_dev == -1) { message_error("%s: Refusing to write to " "a DOS special file", pair->dest_name); free(pair->dest_name); return true; } // Check that we aren't overwriting the source file. if (st.st_dev == pair->src_st.st_dev && st.st_ino == pair->src_st.st_ino) { message_error("%s: Output file is the same " "as the input file", pair->dest_name); free(pair->dest_name); return true; } } #endif // If --force was used, unlink the target file first. 
if (opt_force && unlink(pair->dest_name) && errno != ENOENT) { message_error(_("%s: Cannot remove: %s"), pair->dest_name, strerror(errno)); free(pair->dest_name); return true; } // Open the file. int flags = O_WRONLY | O_BINARY | O_NOCTTY | O_CREAT | O_EXCL; #ifndef TUKLIB_DOSLIKE flags |= O_NONBLOCK; #endif const mode_t mode = S_IRUSR | S_IWUSR; pair->dest_fd = open(pair->dest_name, flags, mode); if (pair->dest_fd == -1) { message_error("%s: %s", pair->dest_name, strerror(errno)); free(pair->dest_name); return true; } } #ifndef TUKLIB_DOSLIKE // dest_st isn't used on DOS-like systems except as a dummy // argument to io_unlink(), so don't fstat() on such systems. if (fstat(pair->dest_fd, &pair->dest_st)) { // If fstat() really fails, we have a safe fallback here. # if defined(__VMS) pair->dest_st.st_ino[0] = 0; pair->dest_st.st_ino[1] = 0; pair->dest_st.st_ino[2] = 0; # else pair->dest_st.st_dev = 0; pair->dest_st.st_ino = 0; # endif } else if (try_sparse && opt_mode == MODE_DECOMPRESS) { // When writing to standard output, we need to be extra // careful: // - It may be connected to something else than // a regular file. // - We aren't necessarily writing to a new empty file // or to the end of an existing file. // - O_APPEND may be active. // // TODO: I'm keeping this disabled for DOS-like systems // for now. FAT doesn't support sparse files, but NTFS // does, so maybe this should be enabled on Windows after // some testing. if (pair->dest_fd == STDOUT_FILENO) { if (!S_ISREG(pair->dest_st.st_mode)) return false; if (stdout_flags & O_APPEND) { // Creating a sparse file is not possible // when O_APPEND is active (it's used by // shell's >> redirection). As I understand // it, it is safe to temporarily disable // O_APPEND in xz, because if someone // happened to write to the same file at the // same time, results would be bad anyway // (users shouldn't assume that xz uses any // specific block size when writing data). 
// // The write position may be something else // than the end of the file, so we must fix // it to start writing at the end of the file // to imitate O_APPEND. if (lseek(STDOUT_FILENO, 0, SEEK_END) == -1) return false; // Construct the new file status flags. // If O_NONBLOCK was set earlier in this // function, it must be kept here too. int flags = stdout_flags & ~O_APPEND; if (restore_stdout_flags) flags |= O_NONBLOCK; // If this fcntl() fails, we continue but won't // try to create sparse output. The original // flags will still be restored if needed (to // unset O_NONBLOCK) when the file is finished. if (fcntl(STDOUT_FILENO, F_SETFL, flags) == -1) return false; // Disabling O_APPEND succeeded. Mark // that the flags should be restored // in io_close_dest(). (This may have already // been set when enabling O_NONBLOCK.) restore_stdout_flags = true; } else if (lseek(STDOUT_FILENO, 0, SEEK_CUR) != pair->dest_st.st_size) { // Writing won't start exactly at the end // of the file. We cannot use sparse output, // because it would probably corrupt the file. return false; } } pair->dest_try_sparse = true; } #endif return false; } extern bool io_open_dest(file_pair *pair) { signals_block(); const bool ret = io_open_dest_real(pair); signals_unblock(); return ret; } /// \brief Closes destination file of the file_pair structure /// /// \param pair File whose dest_fd should be closed /// \param success If false, the file will be removed from the disk. /// /// \return Zero if closing succeeds. On error, -1 is returned and /// error message printed. static bool io_close_dest(file_pair *pair, bool success) { #ifndef TUKLIB_DOSLIKE // If io_open_dest() has disabled O_APPEND, restore it here. 
if (restore_stdout_flags) { assert(pair->dest_fd == STDOUT_FILENO); restore_stdout_flags = false; if (fcntl(STDOUT_FILENO, F_SETFL, stdout_flags) == -1) { message_error(_("Error restoring the O_APPEND flag " "to standard output: %s"), strerror(errno)); return true; } } #endif if (pair->dest_fd == -1 || pair->dest_fd == STDOUT_FILENO) return false; if (close(pair->dest_fd)) { message_error(_("%s: Closing the file failed: %s"), pair->dest_name, strerror(errno)); // Closing destination file failed, so we cannot trust its // contents. Get rid of junk: io_unlink(pair->dest_name, &pair->dest_st); free(pair->dest_name); return true; } // If the operation using this file wasn't successful, we get rid // of the junk file. if (!success) io_unlink(pair->dest_name, &pair->dest_st); free(pair->dest_name); return false; } extern void io_close(file_pair *pair, bool success) { // Take care of sparseness at the end of the output file. if (success && pair->dest_try_sparse && pair->dest_pending_sparse > 0) { // Seek forward one byte less than the size of the pending // hole, then write one zero-byte. This way the file grows // to its correct size. An alternative would be to use // ftruncate() but that isn't portable enough (e.g. it // doesn't work with FAT on Linux; FAT isn't that important // since it doesn't support sparse files anyway, but we don't // want to create corrupt files on it). if (lseek(pair->dest_fd, pair->dest_pending_sparse - 1, SEEK_CUR) == -1) { message_error(_("%s: Seeking failed when trying " "to create a sparse file: %s"), pair->dest_name, strerror(errno)); success = false; } else { const uint8_t zero[1] = { '\0' }; if (io_write_buf(pair, zero, 1)) success = false; } } signals_block(); // Copy the file attributes. We need to skip this if destination // file isn't open or it is standard output. if (success && pair->dest_fd != -1 && pair->dest_fd != STDOUT_FILENO) io_copy_attrs(pair); // Close the destination first.
If it fails, we must not remove // the source file! if (io_close_dest(pair, success)) success = false; // Close the source file, and unlink it if the operation using this // file pair was successful and we haven't requested to keep the // source file. io_close_src(pair, success); signals_unblock(); return; } extern void io_fix_src_pos(file_pair *pair, size_t rewind_size) { assert(rewind_size <= IO_BUFFER_SIZE); if (rewind_size > 0) { // This doesn't need to work on unseekable file descriptors, // so just ignore possible errors. (void)lseek(pair->src_fd, -(off_t)(rewind_size), SEEK_CUR); } return; } extern size_t -io_read(file_pair *pair, io_buf *buf_union, size_t size) +io_read(file_pair *pair, io_buf *buf, size_t size) { // We use small buffers here. assert(size < SSIZE_MAX); - uint8_t *buf = buf_union->u8; - size_t left = size; + size_t pos = 0; - while (left > 0) { - const ssize_t amount = read(pair->src_fd, buf, left); + while (pos < size) { + const ssize_t amount = read( + pair->src_fd, buf->u8 + pos, size - pos); if (amount == 0) { pair->src_eof = true; break; } if (amount == -1) { if (errno == EINTR) { if (user_abort) return SIZE_MAX; continue; } #ifndef TUKLIB_DOSLIKE if (IS_EAGAIN_OR_EWOULDBLOCK(errno)) { - const io_wait_ret ret = io_wait(pair, - mytime_get_flush_timeout(), - true); - switch (ret) { + // Disable the flush-timeout if no input has + // been seen since the previous flush and thus + // there would be nothing to flush after the + // timeout expires (avoids busy waiting). + const int timeout = pair->src_has_seen_input + ? 
mytime_get_flush_timeout() + : -1; + + switch (io_wait(pair, timeout, true)) { case IO_WAIT_MORE: continue; case IO_WAIT_ERROR: return SIZE_MAX; case IO_WAIT_TIMEOUT: - return size - left; + pair->flush_needed = true; + return pos; default: message_bug(); } } #endif message_error(_("%s: Read error: %s"), pair->src_name, strerror(errno)); return SIZE_MAX; } - buf += (size_t)(amount); - left -= (size_t)(amount); + pos += (size_t)(amount); + + if (!pair->src_has_seen_input) { + pair->src_has_seen_input = true; + mytime_set_flush_time(); + } } - return size - left; + return pos; } extern bool io_pread(file_pair *pair, io_buf *buf, size_t size, off_t pos) { // Using lseek() and read() is more portable than pread() and // for us it is as good as real pread(). if (lseek(pair->src_fd, pos, SEEK_SET) != pos) { message_error(_("%s: Error seeking the file: %s"), pair->src_name, strerror(errno)); return true; } const size_t amount = io_read(pair, buf, size); if (amount == SIZE_MAX) return true; if (amount != size) { message_error(_("%s: Unexpected end of file"), pair->src_name); return true; } return false; } static bool is_sparse(const io_buf *buf) { assert(IO_BUFFER_SIZE % sizeof(uint64_t) == 0); for (size_t i = 0; i < ARRAY_SIZE(buf->u64); ++i) if (buf->u64[i] != 0) return false; return true; } static bool io_write_buf(file_pair *pair, const uint8_t *buf, size_t size) { assert(size < SSIZE_MAX); while (size > 0) { const ssize_t amount = write(pair->dest_fd, buf, size); if (amount == -1) { if (errno == EINTR) { if (user_abort) return true; continue; } #ifndef TUKLIB_DOSLIKE if (IS_EAGAIN_OR_EWOULDBLOCK(errno)) { if (io_wait(pair, -1, false) == IO_WAIT_MORE) continue; return true; } #endif // Handle broken pipe specially. gzip and bzip2 // don't print anything on SIGPIPE. In addition, // gzip --quiet uses exit status 2 (warning) on // broken pipe instead of whatever raise(SIGPIPE) // would make it return. 
It is there to hide "Broken // pipe" message on some old shells (probably old // GNU bash). // // We don't do anything special with --quiet, which // is what bzip2 does too. If we get SIGPIPE, we // will handle it like other signals by setting // user_abort, and get EPIPE here. if (errno != EPIPE) message_error(_("%s: Write error: %s"), pair->dest_name, strerror(errno)); return true; } buf += (size_t)(amount); size -= (size_t)(amount); } return false; } extern bool io_write(file_pair *pair, const io_buf *buf, size_t size) { assert(size <= IO_BUFFER_SIZE); if (pair->dest_try_sparse) { // Check if the block is sparse (contains only zeros). If it // sparse, we just store the amount and return. We will take // care of actually skipping over the hole when we hit the // next data block or close the file. // // Since io_close() requires that dest_pending_sparse > 0 // if the file ends with sparse block, we must also return // if size == 0 to avoid doing the lseek(). if (size == IO_BUFFER_SIZE) { - if (is_sparse(buf)) { - pair->dest_pending_sparse += size; + // Even if the block was sparse, treat it as non-sparse + // if the pending sparse amount is large compared to + // the size of off_t. In practice this only matters + // on 32-bit systems where off_t isn't always 64 bits. + const off_t pending_max + = (off_t)(1) << (sizeof(off_t) * CHAR_BIT - 2); + if (is_sparse(buf) && pair->dest_pending_sparse + < pending_max) { + pair->dest_pending_sparse += (off_t)(size); return false; } } else if (size == 0) { return false; } // This is not a sparse block. If we have a pending hole, // skip it now. 
if (pair->dest_pending_sparse > 0) { if (lseek(pair->dest_fd, pair->dest_pending_sparse, SEEK_CUR) == -1) { message_error(_("%s: Seeking failed when " "trying to create a sparse " "file: %s"), pair->dest_name, strerror(errno)); return true; } pair->dest_pending_sparse = 0; } } return io_write_buf(pair, buf->u8, size); } Index: head/contrib/xz/src/xz/file_io.h =================================================================== --- head/contrib/xz/src/xz/file_io.h (revision 359200) +++ head/contrib/xz/src/xz/file_io.h (revision 359201) @@ -1,156 +1,166 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file file_io.h /// \brief I/O types and functions // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// // Some systems have suboptimal BUFSIZ. Use a bit bigger value on them. // We also need that IO_BUFFER_SIZE is a multiple of 8 (sizeof(uint64_t)) #if BUFSIZ <= 1024 # define IO_BUFFER_SIZE 8192 #else # define IO_BUFFER_SIZE (BUFSIZ & ~7U) #endif /// is_sparse() accesses the buffer as uint64_t for maximum speed. -/// Use an union to make sure that the buffer is properly aligned. +/// The u32 and u64 members must only be access through this union +/// to avoid strict aliasing violations. Taking a pointer of u8 +/// should be fine as long as uint8_t maps to unsigned char which +/// can alias anything. typedef union { uint8_t u8[IO_BUFFER_SIZE]; uint32_t u32[IO_BUFFER_SIZE / sizeof(uint32_t)]; uint64_t u64[IO_BUFFER_SIZE / sizeof(uint64_t)]; } io_buf; typedef struct { /// Name of the source filename (as given on the command line) or /// pointer to static "(stdin)" when reading from standard input. const char *src_name; /// Destination filename converted from src_name or pointer to static /// "(stdout)" when writing to standard output. 
char *dest_name; /// File descriptor of the source file int src_fd; /// File descriptor of the target file int dest_fd; /// True once end of the source file has been detected. bool src_eof; + + /// For --flush-timeout: True if at least one byte has been read + /// since the previous flush or the start of the file. + bool src_has_seen_input; + + /// For --flush-timeout: True when flushing is needed. + bool flush_needed; /// If true, we look for long chunks of zeros and try to create /// a sparse file. bool dest_try_sparse; /// This is used only if dest_try_sparse is true. This holds the /// number of zero bytes we haven't written out, because we plan /// to make that byte range a sparse chunk. off_t dest_pending_sparse; /// Stat of the source file. struct stat src_st; /// Stat of the destination file. struct stat dest_st; } file_pair; /// \brief Initialize the I/O module extern void io_init(void); #ifndef TUKLIB_DOSLIKE /// \brief Write a byte to user_abort_pipe[1] /// /// This is called from a signal handler. extern void io_write_to_user_abort_pipe(void); #endif /// \brief Disable creation of sparse files when decompressing extern void io_no_sparse(void); #ifdef ENABLE_SANDBOX /// \brief main() calls this if conditions for sandboxing have been met. extern void io_allow_sandbox(void); #endif /// \brief Open the source file extern file_pair *io_open_src(const char *src_name); /// \brief Open the destination file extern bool io_open_dest(file_pair *pair); /// \brief Closes the file descriptors and frees possible allocated memory /// /// The success argument determines if source or destination file gets /// unlinked: /// - false: The destination file is unlinked. /// - true: The source file is unlinked unless writing to stdout or --keep /// was used. 
extern void io_close(file_pair *pair, bool success); /// \brief Reads from the source file to a buffer /// /// \param pair File pair having the source file open for reading /// \param buf Destination buffer to hold the read data /// \param size Size of the buffer; assumed to be smaller than SSIZE_MAX /// /// \return On success, number of bytes read is returned. On end of /// file zero is returned and pair->src_eof set to true. /// On error, SIZE_MAX is returned and error message printed. extern size_t io_read(file_pair *pair, io_buf *buf, size_t size); /// \brief Fix the position in src_fd /// /// This is used when --single-stream has been specified and decompression /// is successful. If the input file descriptor supports seeking, this /// function fixes the input position to point to the next byte after the /// decompressed stream. /// /// \param pair File pair having the source file open for reading /// \param rewind_size How many extra bytes have been read, i.e. /// how much to seek backwards. extern void io_fix_src_pos(file_pair *pair, size_t rewind_size); /// \brief Read from source file from given offset to a buffer /// /// This is remotely similar to standard pread(). This uses lseek() though, /// so the read offset is changed on each call. /// /// \param pair Seekable source file /// \param buf Destination buffer /// \param size Amount of data to read /// \param pos Offset relative to the beginning of the file, /// from which the data should be read. /// /// \return On success, false is returned. On error, error message /// is printed and true is returned. extern bool io_pread(file_pair *pair, io_buf *buf, size_t size, off_t pos); /// \brief Writes a buffer to the destination file /// /// \param pair File pair having the destination file open for writing /// \param buf Buffer containing the data to be written /// \param size Size of the buffer; assumed to be smaller than SSIZE_MAX /// /// \return On success, zero is returned.
On error, -1 is returned /// and error message printed. extern bool io_write(file_pair *pair, const io_buf *buf, size_t size); Index: head/contrib/xz/src/xz/main.c =================================================================== --- head/contrib/xz/src/xz/main.c (revision 359200) +++ head/contrib/xz/src/xz/main.c (revision 359201) @@ -1,330 +1,330 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file main.c /// \brief main() // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "private.h" #include /// Exit status to use. This can be changed with set_exit_status(). static enum exit_status_type exit_status = E_SUCCESS; #if defined(_WIN32) && !defined(__CYGWIN__) /// exit_status has to be protected with a critical section due to /// how "signal handling" is done on Windows. See signals.c for details. static CRITICAL_SECTION exit_status_cs; #endif /// True if --no-warn is specified. When this is true, we don't set /// the exit status to E_WARNING when something worth a warning happens. static bool no_warn = false; extern void set_exit_status(enum exit_status_type new_status) { assert(new_status == E_WARNING || new_status == E_ERROR); #if defined(_WIN32) && !defined(__CYGWIN__) EnterCriticalSection(&exit_status_cs); #endif if (exit_status != E_ERROR) exit_status = new_status; #if defined(_WIN32) && !defined(__CYGWIN__) LeaveCriticalSection(&exit_status_cs); #endif return; } extern void set_exit_no_warn(void) { no_warn = true; return; } static const char * read_name(const args_info *args) { // FIXME: Maybe we should have some kind of memory usage limit here // like the tool has for the actual compression and decompression. // Giving some huge text file with --files0 makes us to read the // whole file in RAM. 
static char *name = NULL; static size_t size = 256; // Allocate the initial buffer. This is never freed, since after it // is no longer needed, the program exits very soon. It is safe to // use xmalloc() and xrealloc() in this function, because while // executing this function, no files are open for writing, and thus // there's no need to cleanup anything before exiting. if (name == NULL) name = xmalloc(size); // Write position in name size_t pos = 0; // Read one character at a time into name. while (!user_abort) { const int c = fgetc(args->files_file); if (ferror(args->files_file)) { // Take care of EINTR since we have established // the signal handlers already. if (errno == EINTR) continue; message_error(_("%s: Error reading filenames: %s"), args->files_name, strerror(errno)); return NULL; } if (feof(args->files_file)) { if (pos != 0) message_error(_("%s: Unexpected end of input " "when reading filenames"), args->files_name); return NULL; } if (c == args->files_delim) { // We allow consecutive newline (--files) or '\0' // characters (--files0), and ignore such empty // filenames. if (pos == 0) continue; // A non-empty name was read. Terminate it with '\0' // and return it. name[pos] = '\0'; return name; } if (c == '\0') { // A null character was found when using --files, // which expects plain text input separated with // newlines. message_error(_("%s: Null character found when " "reading filenames; maybe you meant " "to use `--files0' instead " "of `--files'?"), args->files_name); return NULL; } name[pos++] = c; // Allocate more memory if needed. There must always be space // at least for one character to allow terminating the string // with '\0'. if (pos == size) { size *= 2; name = xrealloc(name, size); } } return NULL; } int main(int argc, char **argv) { #if defined(_WIN32) && !defined(__CYGWIN__) InitializeCriticalSection(&exit_status_cs); #endif // Set up the progname variable. tuklib_progname_init(argv); // Initialize the file I/O. 
This makes sure that // stdin, stdout, and stderr are something valid. io_init(); // Set up the locale and message translations. tuklib_gettext_init(PACKAGE, LOCALEDIR); // Initialize handling of error/warning/other messages. message_init(); - // Set hardware-dependent default values. These can be overriden + // Set hardware-dependent default values. These can be overridden // on the command line, thus this must be done before args_parse(). hardware_init(); // Parse the command line arguments and get an array of filenames. // This doesn't return if something is wrong with the command line // arguments. If there are no arguments, one filename ("-") is still // returned to indicate stdin. args_info args; args_parse(&args, argc, argv); if (opt_mode != MODE_LIST && opt_robot) message_fatal(_("Compression and decompression with --robot " "are not supported yet.")); // Tell the message handling code how many input files there are if // we know it. This way the progress indicator can show it. if (args.files_name != NULL) message_set_files(0); else message_set_files(args.arg_count); // Refuse to write compressed data to standard output if it is // a terminal. if (opt_mode == MODE_COMPRESS) { if (opt_stdout || (args.arg_count == 1 && strcmp(args.arg_names[0], "-") == 0)) { if (is_tty_stdout()) { message_try_help(); tuklib_exit(E_ERROR, E_ERROR, false); } } } // Set up the signal handlers. We don't need these before we // start the actual action and not in --list mode, so this is // done after parsing the command line arguments. // // It's good to keep signal handlers in normal compression and // decompression modes even when only writing to stdout, because // we might need to restore O_APPEND flag on stdout before exiting. // In --test mode, signal handlers aren't really needed, but let's // keep them there for consistency with normal decompression. 
if (opt_mode != MODE_LIST) signals_init(); #ifdef ENABLE_SANDBOX // Set a flag that sandboxing is allowed if all these are true: // - --files or --files0 wasn't used. // - There is exactly one input file or we are reading from stdin. // - We won't create any files: output goes to stdout or --test // or --list was used. Note that --test implies opt_stdout = true // but --list doesn't. // // This is obviously not ideal but it was easy to implement and // it covers the most common use cases. // // TODO: Make sandboxing work for other situations too. if (args.files_name == NULL && args.arg_count == 1 && (opt_stdout || strcmp("-", args.arg_names[0]) == 0 || opt_mode == MODE_LIST)) io_allow_sandbox(); #endif // coder_run() handles compression, decompression, and testing. // list_file() is for --list. void (*run)(const char *filename) = &coder_run; #ifdef HAVE_DECODERS if (opt_mode == MODE_LIST) run = &list_file; #endif // Process the files given on the command line. Note that if no names // were given, args_parse() gave us a fake "-" filename. for (unsigned i = 0; i < args.arg_count && !user_abort; ++i) { if (strcmp("-", args.arg_names[i]) == 0) { // Processing from stdin to stdout. Check that we // aren't writing compressed data to a terminal or // reading it from a terminal. if (opt_mode == MODE_COMPRESS) { if (is_tty_stdout()) continue; } else if (is_tty_stdin()) { continue; } // It doesn't make sense to compress data from stdin // if we are supposed to read filenames from stdin // too (enabled with --files or --files0). if (args.files_name == stdin_filename) { message_error(_("Cannot read data from " "standard input when " "reading filenames " "from standard input")); continue; } // Replace the "-" with a special pointer, which is // recognized by coder_run() and other things. // This way error messages get a proper filename // string and the code still knows that it is // handling the special case of stdin. 
args.arg_names[i] = (char *)stdin_filename; } // Do the actual compression or decompression. run(args.arg_names[i]); } // If --files or --files0 was used, process the filenames from the // given file or stdin. Note that here we don't consider "-" to // indicate stdin like we do with the command line arguments. if (args.files_name != NULL) { // read_name() checks for user_abort so we don't need to // check it as loop termination condition. while (true) { const char *name = read_name(&args); if (name == NULL) break; // read_name() doesn't return empty names. assert(name[0] != '\0'); run(name); } if (args.files_name != stdin_filename) (void)fclose(args.files_file); } #ifdef HAVE_DECODERS // All files have now been handled. If in --list mode, display // the totals before exiting. We don't have signal handlers // enabled in --list mode, so we don't need to check user_abort. if (opt_mode == MODE_LIST) { assert(!user_abort); list_totals(); } #endif #ifndef NDEBUG coder_free(); args_free(); #endif // If we have got a signal, raise it to kill the program instead // of calling tuklib_exit(). signals_exit(); // Make a local copy of exit_status to keep the Windows code // thread safe. At this point it is fine if we miss the user // pressing C-c and don't set the exit_status to E_ERROR on // Windows. #if defined(_WIN32) && !defined(__CYGWIN__) EnterCriticalSection(&exit_status_cs); #endif enum exit_status_type es = exit_status; #if defined(_WIN32) && !defined(__CYGWIN__) LeaveCriticalSection(&exit_status_cs); #endif // Suppress the exit status indicating a warning if --no-warn // was specified. 
if (es == E_WARNING && no_warn) es = E_SUCCESS; - tuklib_exit(es, E_ERROR, message_verbosity_get() != V_SILENT); + tuklib_exit((int)es, E_ERROR, message_verbosity_get() != V_SILENT); } Index: head/contrib/xz/src/xz/message.c =================================================================== --- head/contrib/xz/src/xz/message.c (revision 359200) +++ head/contrib/xz/src/xz/message.c (revision 359201) @@ -1,1258 +1,1272 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file message.c /// \brief Printing messages // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "private.h" #include /// Number of the current file static unsigned int files_pos = 0; /// Total number of input files; zero if unknown. static unsigned int files_total; /// Verbosity level static enum message_verbosity verbosity = V_WARNING; /// Filename which we will print with the verbose messages static const char *filename; /// True once the a filename has been printed to stderr as part of progress /// message. If automatic progress updating isn't enabled, this becomes true /// after the first progress message has been printed due to user sending /// SIGINFO, SIGUSR1, or SIGALRM. Once this variable is true, we will print /// an empty line before the next filename to make the output more readable. static bool first_filename_printed = false; /// This is set to true when we have printed the current filename to stderr /// as part of a progress message. This variable is useful only if not /// updating progress automatically: if user sends many SIGINFO, SIGUSR1, or /// SIGALRM signals, we won't print the name of the same file multiple times. static bool current_filename_printed = false; /// True if we should print progress indicator and update it automatically /// if also verbose >= V_VERBOSE. 
static bool progress_automatic; /// True if message_progress_start() has been called but /// message_progress_end() hasn't been called yet. static bool progress_started = false; /// This is true when a progress message was printed and the cursor is still /// on the same line with the progress message. In that case, a newline has /// to be printed before any error messages. static bool progress_active = false; /// Pointer to lzma_stream used to do the encoding or decoding. static lzma_stream *progress_strm; +/// This is true if we are in passthru mode (not actually compressing or +/// decompressing) and thus cannot use lzma_get_progress(progress_strm, ...). +/// That is, we are using coder_passthru() in coder.c. +static bool progress_is_from_passthru; + /// Expected size of the input stream is needed to show completion percentage /// and estimate remaining time. static uint64_t expected_in_size; // Use alarm() and SIGALRM when they are supported. This has two minor // advantages over the alternative of polling gettimeofday(): // - It is possible for the user to send SIGINFO, SIGUSR1, or SIGALRM to // get intermediate progress information even when --verbose wasn't used // or stderr is not a terminal. // - alarm() + SIGALRM seems to have slightly less overhead than polling // gettimeofday(). #ifdef SIGALRM const int message_progress_sigs[] = { SIGALRM, #ifdef SIGINFO SIGINFO, #endif #ifdef SIGUSR1 SIGUSR1, #endif 0 }; /// The signal handler for SIGALRM sets this to true. It is set back to false /// once the progress message has been updated. static volatile sig_atomic_t progress_needs_updating = false; /// Signal handler for SIGALRM static void progress_signal_handler(int sig lzma_attribute((__unused__))) { progress_needs_updating = true; return; } #else /// This is true when progress message printing is wanted. Using the same /// variable name as above to avoid some ifdefs. 
static bool progress_needs_updating = false; /// Elapsed time when the next progress message update should be done. static uint64_t progress_next_update; #endif extern void message_init(void) { // If --verbose is used, we use a progress indicator if and only // if stderr is a terminal. If stderr is not a terminal, we print // verbose information only after finishing the file. As a special // exception, even if --verbose was not used, user can send SIGALRM // to make us print progress information once without automatic // updating. progress_automatic = isatty(STDERR_FILENO); // Commented out because COLUMNS is rarely exported to environment. // Most users have at least 80 columns anyway, let's think something // fancy here if enough people complain. /* if (progress_automatic) { // stderr is a terminal. Check the COLUMNS environment // variable to see if the terminal is wide enough. If COLUMNS // doesn't exist or it has some unparsable value, we assume // that the terminal is wide enough. const char *columns_str = getenv("COLUMNS"); if (columns_str != NULL) { char *endptr; const long columns = strtol(columns_str, &endptr, 10); if (*endptr != '\0' || columns < 80) progress_automatic = false; } } */ #ifdef SIGALRM // Establish the signal handlers which set a flag to tell us that // progress info should be updated. 
struct sigaction sa; sigemptyset(&sa.sa_mask); sa.sa_flags = 0; sa.sa_handler = &progress_signal_handler; for (size_t i = 0; message_progress_sigs[i] != 0; ++i) if (sigaction(message_progress_sigs[i], &sa, NULL)) message_signal_handler(); #endif return; } extern void message_verbosity_increase(void) { if (verbosity < V_DEBUG) ++verbosity; return; } extern void message_verbosity_decrease(void) { if (verbosity > V_SILENT) --verbosity; return; } extern enum message_verbosity message_verbosity_get(void) { return verbosity; } extern void message_set_files(unsigned int files) { files_total = files; return; } /// Prints the name of the current file if it hasn't been printed already, /// except if we are processing exactly one stream from stdin to stdout. /// I think it looks nicer to not print "(stdin)" when --verbose is used /// in a pipe and no other files are processed. static void print_filename(void) { if (!opt_robot && (files_total != 1 || filename != stdin_filename)) { signals_block(); FILE *file = opt_mode == MODE_LIST ? stdout : stderr; // If a file was already processed, put an empty line // before the next filename to improve readability. if (first_filename_printed) fputc('\n', file); first_filename_printed = true; current_filename_printed = true; // If we don't know how many files there will be due // to usage of --files or --files0. if (files_total == 0) fprintf(file, "%s (%u)\n", filename, files_pos); else fprintf(file, "%s (%u/%u)\n", filename, files_pos, files_total); signals_unblock(); } return; } extern void message_filename(const char *src_name) { // Start numbering the files starting from one. 
++files_pos; filename = src_name; if (verbosity >= V_VERBOSE && (progress_automatic || opt_mode == MODE_LIST)) print_filename(); else current_filename_printed = false; return; } extern void -message_progress_start(lzma_stream *strm, uint64_t in_size) +message_progress_start(lzma_stream *strm, bool is_passthru, uint64_t in_size) { // Store the pointer to the lzma_stream used to do the coding. // It is needed to find out the position in the stream. progress_strm = strm; + progress_is_from_passthru = is_passthru; // Store the expected size of the file. If we aren't printing any // statistics, then is will be unused. But since it is possible // that the user sends us a signal to show statistics, we need // to have it available anyway. expected_in_size = in_size; // Indicate that progress info may need to be printed before // printing error messages. progress_started = true; // If progress indicator is wanted, print the filename and possibly // the file count now. if (verbosity >= V_VERBOSE && progress_automatic) { // Start the timer to display the first progress message // after one second. An alternative would be to show the // first message almost immediately, but delaying by one // second looks better to me, since extremely early // progress info is pretty much useless. #ifdef SIGALRM // First disable a possibly existing alarm. alarm(0); progress_needs_updating = false; alarm(1); #else progress_needs_updating = true; progress_next_update = 1000; #endif } return; } /// Make the string indicating completion percentage. static const char * progress_percentage(uint64_t in_pos) { // If the size of the input file is unknown or the size told us is // clearly wrong since we have processed more data than the alleged // size of the file, show a static string indicating that we have // no idea of the completion percentage. if (expected_in_size == 0 || in_pos > expected_in_size) return "--- %"; // Never show 100.0 % before we actually are finished. 
double percentage = (double)(in_pos) / (double)(expected_in_size) * 99.9; // Use big enough buffer to hold e.g. a multibyte decimal point. static char buf[16]; snprintf(buf, sizeof(buf), "%.1f %%", percentage); return buf; } /// Make the string containing the amount of input processed, amount of /// output produced, and the compression ratio. static const char * progress_sizes(uint64_t compressed_pos, uint64_t uncompressed_pos, bool final) { // Use big enough buffer to hold e.g. a multibyte thousand separators. static char buf[128]; char *pos = buf; size_t left = sizeof(buf); // Print the sizes. If this the final message, use more reasonable // units than MiB if the file was small. const enum nicestr_unit unit_min = final ? NICESTR_B : NICESTR_MIB; my_snprintf(&pos, &left, "%s / %s", uint64_to_nicestr(compressed_pos, unit_min, NICESTR_TIB, false, 0), uint64_to_nicestr(uncompressed_pos, unit_min, NICESTR_TIB, false, 1)); // Avoid division by zero. If we cannot calculate the ratio, set // it to some nice number greater than 10.0 so that it gets caught // in the next if-clause. const double ratio = uncompressed_pos > 0 ? (double)(compressed_pos) / (double)(uncompressed_pos) : 16.0; // If the ratio is very bad, just indicate that it is greater than // 9.999. This way the length of the ratio field stays fixed. if (ratio > 9.999) snprintf(pos, left, " > %.3f", 9.999); else snprintf(pos, left, " = %.3f", ratio); return buf; } /// Make the string containing the processing speed of uncompressed data. static const char * progress_speed(uint64_t uncompressed_pos, uint64_t elapsed) { // Don't print the speed immediately, since the early values look // somewhat random. if (elapsed < 3000) return ""; static const char unit[][8] = { "KiB/s", "MiB/s", "GiB/s", }; size_t unit_index = 0; // Calculate the speed as KiB/s. double speed = (double)(uncompressed_pos) / ((double)(elapsed) * (1024.0 / 1000.0)); // Adjust the unit of the speed if needed. 
while (speed > 999.0) { speed /= 1024.0; if (++unit_index == ARRAY_SIZE(unit)) return ""; // Way too fast ;-) } // Use decimal point only if the number is small. Examples: // - 0.1 KiB/s // - 9.9 KiB/s // - 99 KiB/s // - 999 KiB/s // Use big enough buffer to hold e.g. a multibyte decimal point. static char buf[16]; snprintf(buf, sizeof(buf), "%.*f %s", speed > 9.9 ? 0 : 1, speed, unit[unit_index]); return buf; } /// Make a string indicating elapsed time. The format is either /// M:SS or H:MM:SS depending on if the time is an hour or more. static const char * progress_time(uint64_t mseconds) { // 9999 hours = 416 days static char buf[sizeof("9999:59:59")]; // 32-bit variable is enough for elapsed time (136 years). uint32_t seconds = (uint32_t)(mseconds / 1000); // Don't show anything if the time is zero or ridiculously big. if (seconds == 0 || seconds > ((9999 * 60) + 59) * 60 + 59) return ""; uint32_t minutes = seconds / 60; seconds %= 60; if (minutes >= 60) { const uint32_t hours = minutes / 60; minutes %= 60; snprintf(buf, sizeof(buf), "%" PRIu32 ":%02" PRIu32 ":%02" PRIu32, hours, minutes, seconds); } else { snprintf(buf, sizeof(buf), "%" PRIu32 ":%02" PRIu32, minutes, seconds); } return buf; } /// Return a string containing estimated remaining time when /// reasonably possible. static const char * progress_remaining(uint64_t in_pos, uint64_t elapsed) { // Don't show the estimated remaining time when it wouldn't // make sense: // - Input size is unknown. // - Input has grown bigger since we started (de)compressing. // - We haven't processed much data yet, so estimate would be // too inaccurate. // - Only a few seconds has passed since we started (de)compressing, // so estimate would be too inaccurate. if (expected_in_size == 0 || in_pos > expected_in_size || in_pos < (UINT64_C(1) << 19) || elapsed < 8000) return ""; // Calculate the estimate. 
Don't give an estimate of zero seconds, // since it is possible that all the input has been already passed // to the library, but there is still quite a bit of output pending. - uint32_t remaining = (double)(expected_in_size - in_pos) - * ((double)(elapsed) / 1000.0) / (double)(in_pos); + uint32_t remaining = (uint32_t)((double)(expected_in_size - in_pos) + * ((double)(elapsed) / 1000.0) / (double)(in_pos)); if (remaining < 1) remaining = 1; static char buf[sizeof("9 h 55 min")]; // Select appropriate precision for the estimated remaining time. if (remaining <= 10) { // A maximum of 10 seconds remaining. // Show the number of seconds as is. snprintf(buf, sizeof(buf), "%" PRIu32 " s", remaining); } else if (remaining <= 50) { // A maximum of 50 seconds remaining. // Round up to the next multiple of five seconds. remaining = (remaining + 4) / 5 * 5; snprintf(buf, sizeof(buf), "%" PRIu32 " s", remaining); } else if (remaining <= 590) { // A maximum of 9 minutes and 50 seconds remaining. // Round up to the next multiple of ten seconds. remaining = (remaining + 9) / 10 * 10; snprintf(buf, sizeof(buf), "%" PRIu32 " min %" PRIu32 " s", remaining / 60, remaining % 60); } else if (remaining <= 59 * 60) { // A maximum of 59 minutes remaining. // Round up to the next multiple of a minute. remaining = (remaining + 59) / 60; snprintf(buf, sizeof(buf), "%" PRIu32 " min", remaining); } else if (remaining <= 9 * 3600 + 50 * 60) { // A maximum of 9 hours and 50 minutes left. // Round up to the next multiple of ten minutes. remaining = (remaining + 599) / 600 * 10; snprintf(buf, sizeof(buf), "%" PRIu32 " h %" PRIu32 " min", remaining / 60, remaining % 60); } else if (remaining <= 23 * 3600) { // A maximum of 23 hours remaining. // Round up to the next multiple of an hour. remaining = (remaining + 3599) / 3600; snprintf(buf, sizeof(buf), "%" PRIu32 " h", remaining); } else if (remaining <= 9 * 24 * 3600 + 23 * 3600) { // A maximum of 9 days and 23 hours remaining. 
// Round up to the next multiple of an hour. remaining = (remaining + 3599) / 3600; snprintf(buf, sizeof(buf), "%" PRIu32 " d %" PRIu32 " h", remaining / 24, remaining % 24); } else if (remaining <= 999 * 24 * 3600) { // A maximum of 999 days remaining. ;-) // Round up to the next multiple of a day. remaining = (remaining + 24 * 3600 - 1) / (24 * 3600); snprintf(buf, sizeof(buf), "%" PRIu32 " d", remaining); } else { // The estimated remaining time is too big. Don't show it. return ""; } return buf; } /// Get how much uncompressed and compressed data has been processed. static void progress_pos(uint64_t *in_pos, uint64_t *compressed_pos, uint64_t *uncompressed_pos) { uint64_t out_pos; - lzma_get_progress(progress_strm, in_pos, &out_pos); + if (progress_is_from_passthru) { + // In passthru mode the progress info is in total_in/out but + // the *progress_strm itself isn't initialized and thus we + // cannot use lzma_get_progress(). + *in_pos = progress_strm->total_in; + out_pos = progress_strm->total_out; + } else { + lzma_get_progress(progress_strm, in_pos, &out_pos); + } // It cannot have processed more input than it has been given. assert(*in_pos <= progress_strm->total_in); // It cannot have produced more output than it claims to have ready. assert(out_pos >= progress_strm->total_out); if (opt_mode == MODE_COMPRESS) { *compressed_pos = out_pos; *uncompressed_pos = *in_pos; } else { *compressed_pos = *in_pos; *uncompressed_pos = out_pos; } return; } extern void message_progress_update(void) { if (!progress_needs_updating) return; // Calculate how long we have been processing this file. const uint64_t elapsed = mytime_get_elapsed(); #ifndef SIGALRM if (progress_next_update > elapsed) return; progress_next_update = elapsed + 1000; #endif // Get our current position in the stream. uint64_t in_pos; uint64_t compressed_pos; uint64_t uncompressed_pos; progress_pos(&in_pos, &compressed_pos, &uncompressed_pos); // Block signals so that fprintf() doesn't get interrupted. 
signals_block(); // Print the filename if it hasn't been printed yet. if (!current_filename_printed) print_filename(); // Print the actual progress message. The idea is that there is at // least three spaces between the fields in typical situations, but // even in rare situations there is at least one space. const char *cols[5] = { progress_percentage(in_pos), progress_sizes(compressed_pos, uncompressed_pos, false), progress_speed(uncompressed_pos, elapsed), progress_time(elapsed), progress_remaining(in_pos, elapsed), }; fprintf(stderr, "\r %*s %*s %*s %10s %10s\r", tuklib_mbstr_fw(cols[0], 6), cols[0], tuklib_mbstr_fw(cols[1], 35), cols[1], tuklib_mbstr_fw(cols[2], 9), cols[2], cols[3], cols[4]); #ifdef SIGALRM // Updating the progress info was finished. Reset // progress_needs_updating to wait for the next SIGALRM. // // NOTE: This has to be done before alarm(1) or with (very) bad // luck we could be setting this to false after the alarm has already // been triggered. progress_needs_updating = false; if (verbosity >= V_VERBOSE && progress_automatic) { // Mark that the progress indicator is active, so if an error // occurs, the error message gets printed cleanly. progress_active = true; // Restart the timer so that progress_needs_updating gets // set to true after about one second. alarm(1); } else { // The progress message was printed because user had sent us // SIGALRM. In this case, each progress message is printed // on its own line. fputc('\n', stderr); } #else // When SIGALRM isn't supported and we get here, it's always due to // automatic progress update. We set progress_active here too like // described above. 
assert(verbosity >= V_VERBOSE); assert(progress_automatic); progress_active = true; #endif signals_unblock(); return; } static void progress_flush(bool finished) { if (!progress_started || verbosity < V_VERBOSE) return; uint64_t in_pos; uint64_t compressed_pos; uint64_t uncompressed_pos; progress_pos(&in_pos, &compressed_pos, &uncompressed_pos); // Avoid printing intermediate progress info if some error occurs // in the beginning of the stream. (If something goes wrong later in // the stream, it is sometimes useful to tell the user where the // error approximately occurred, especially if the error occurs // after a time-consuming operation.) if (!finished && !progress_active && (compressed_pos == 0 || uncompressed_pos == 0)) return; progress_active = false; const uint64_t elapsed = mytime_get_elapsed(); signals_block(); // When using the auto-updating progress indicator, the final // statistics are printed in the same format as the progress // indicator itself. if (progress_automatic) { const char *cols[5] = { finished ? "100 %" : progress_percentage(in_pos), progress_sizes(compressed_pos, uncompressed_pos, true), progress_speed(uncompressed_pos, elapsed), progress_time(elapsed), finished ? "" : progress_remaining(in_pos, elapsed), }; fprintf(stderr, "\r %*s %*s %*s %10s %10s\n", tuklib_mbstr_fw(cols[0], 6), cols[0], tuklib_mbstr_fw(cols[1], 35), cols[1], tuklib_mbstr_fw(cols[2], 9), cols[2], cols[3], cols[4]); } else { // The filename is always printed. fprintf(stderr, "%s: ", filename); // Percentage is printed only if we didn't finish yet. if (!finished) { // Don't print the percentage when it isn't known // (starts with a dash). const char *percentage = progress_percentage(in_pos); if (percentage[0] != '-') fprintf(stderr, "%s, ", percentage); } // Size information is always printed. fprintf(stderr, "%s", progress_sizes( compressed_pos, uncompressed_pos, true)); // The speed and elapsed time aren't always shown. 
const char *speed = progress_speed(uncompressed_pos, elapsed); if (speed[0] != '\0') fprintf(stderr, ", %s", speed); const char *elapsed_str = progress_time(elapsed); if (elapsed_str[0] != '\0') fprintf(stderr, ", %s", elapsed_str); fputc('\n', stderr); } signals_unblock(); return; } extern void message_progress_end(bool success) { assert(progress_started); progress_flush(success); progress_started = false; return; } static void vmessage(enum message_verbosity v, const char *fmt, va_list ap) { if (v <= verbosity) { signals_block(); progress_flush(false); // TRANSLATORS: This is the program name in the beginning // of the line in messages. Usually it becomes "xz: ". // This is a translatable string because French needs // a space before a colon. fprintf(stderr, _("%s: "), progname); vfprintf(stderr, fmt, ap); fputc('\n', stderr); signals_unblock(); } return; } extern void message(enum message_verbosity v, const char *fmt, ...) { va_list ap; va_start(ap, fmt); vmessage(v, fmt, ap); va_end(ap); return; } extern void message_warning(const char *fmt, ...) { va_list ap; va_start(ap, fmt); vmessage(V_WARNING, fmt, ap); va_end(ap); set_exit_status(E_WARNING); return; } extern void message_error(const char *fmt, ...) { va_list ap; va_start(ap, fmt); vmessage(V_ERROR, fmt, ap); va_end(ap); set_exit_status(E_ERROR); return; } extern void message_fatal(const char *fmt, ...) 
{ va_list ap; va_start(ap, fmt); vmessage(V_ERROR, fmt, ap); va_end(ap); tuklib_exit(E_ERROR, E_ERROR, false); } extern void message_bug(void) { message_fatal(_("Internal error (bug)")); } extern void message_signal_handler(void) { message_fatal(_("Cannot establish signal handlers")); } extern const char * message_strm(lzma_ret code) { switch (code) { case LZMA_NO_CHECK: return _("No integrity check; not verifying file integrity"); case LZMA_UNSUPPORTED_CHECK: return _("Unsupported type of integrity check; " "not verifying file integrity"); case LZMA_MEM_ERROR: return strerror(ENOMEM); case LZMA_MEMLIMIT_ERROR: return _("Memory usage limit reached"); case LZMA_FORMAT_ERROR: return _("File format not recognized"); case LZMA_OPTIONS_ERROR: return _("Unsupported options"); case LZMA_DATA_ERROR: return _("Compressed data is corrupt"); case LZMA_BUF_ERROR: return _("Unexpected end of input"); case LZMA_OK: case LZMA_STREAM_END: case LZMA_GET_CHECK: case LZMA_PROG_ERROR: // Without "default", compiler will warn if new constants // are added to lzma_ret, it is not too easy to forget to // add the new constants to this function. break; } return _("Internal error (bug)"); } extern void message_mem_needed(enum message_verbosity v, uint64_t memusage) { if (v > verbosity) return; // Convert memusage to MiB, rounding up to the next full MiB. // This way the user can always use the displayed usage as // the new memory usage limit. (If we rounded to the nearest, // the user might need to +1 MiB to get high enough limit.) memusage = round_up_to_mib(memusage); uint64_t memlimit = hardware_memlimit_get(opt_mode); // Handle the case when there is no memory usage limit. // This way we don't print a weird message with a huge number. if (memlimit == UINT64_MAX) { message(v, _("%s MiB of memory is required. 
" "The limiter is disabled."), uint64_to_str(memusage, 0)); return; } // With US-ASCII: // 2^64 with thousand separators + " MiB" suffix + '\0' = 26 + 4 + 1 // But there may be multibyte chars so reserve enough space. char memlimitstr[128]; // Show the memory usage limit as MiB unless it is less than 1 MiB. // This way it's easy to notice errors where one has typed // --memory=123 instead of --memory=123MiB. if (memlimit < (UINT32_C(1) << 20)) { snprintf(memlimitstr, sizeof(memlimitstr), "%s B", uint64_to_str(memlimit, 1)); } else { // Round up just like with memusage. If this function is // called for informational purposes (to just show the // current usage and limit), we should never show that // the usage is higher than the limit, which would give // a false impression that the memory usage limit isn't // properly enforced. snprintf(memlimitstr, sizeof(memlimitstr), "%s MiB", uint64_to_str(round_up_to_mib(memlimit), 1)); } message(v, _("%s MiB of memory is required. The limit is %s."), uint64_to_str(memusage, 0), memlimitstr); return; } /// \brief Convert uint32_t to a nice string for --lzma[12]=dict=SIZE /// /// The idea is to use KiB or MiB suffix when possible. static const char * uint32_to_optstr(uint32_t num) { static char buf[16]; if ((num & ((UINT32_C(1) << 20) - 1)) == 0) snprintf(buf, sizeof(buf), "%" PRIu32 "MiB", num >> 20); else if ((num & ((UINT32_C(1) << 10) - 1)) == 0) snprintf(buf, sizeof(buf), "%" PRIu32 "KiB", num >> 10); else snprintf(buf, sizeof(buf), "%" PRIu32, num); return buf; } extern void message_filters_to_str(char buf[FILTERS_STR_SIZE], const lzma_filter *filters, bool all_known) { char *pos = buf; size_t left = FILTERS_STR_SIZE; for (size_t i = 0; filters[i].id != LZMA_VLI_UNKNOWN; ++i) { // Add the dashes for the filter option. A space is // needed after the first and later filters. my_snprintf(&pos, &left, "%s", i == 0 ? 
"--" : " --"); switch (filters[i].id) { case LZMA_FILTER_LZMA1: case LZMA_FILTER_LZMA2: { const lzma_options_lzma *opt = filters[i].options; const char *mode = NULL; const char *mf = NULL; if (all_known) { switch (opt->mode) { case LZMA_MODE_FAST: mode = "fast"; break; case LZMA_MODE_NORMAL: mode = "normal"; break; default: mode = "UNKNOWN"; break; } switch (opt->mf) { case LZMA_MF_HC3: mf = "hc3"; break; case LZMA_MF_HC4: mf = "hc4"; break; case LZMA_MF_BT2: mf = "bt2"; break; case LZMA_MF_BT3: mf = "bt3"; break; case LZMA_MF_BT4: mf = "bt4"; break; default: mf = "UNKNOWN"; break; } } // Add the filter name and dictionary size, which // is always known. my_snprintf(&pos, &left, "lzma%c=dict=%s", filters[i].id == LZMA_FILTER_LZMA2 ? '2' : '1', uint32_to_optstr(opt->dict_size)); // With LZMA1 also lc/lp/pb are known when // decompressing, but this function is never // used to print information about .lzma headers. assert(filters[i].id == LZMA_FILTER_LZMA2 || all_known); // Print the rest of the options, which are known // only when compressing. if (all_known) my_snprintf(&pos, &left, ",lc=%" PRIu32 ",lp=%" PRIu32 ",pb=%" PRIu32 ",mode=%s,nice=%" PRIu32 ",mf=%s" ",depth=%" PRIu32, opt->lc, opt->lp, opt->pb, mode, opt->nice_len, mf, opt->depth); break; } case LZMA_FILTER_X86: case LZMA_FILTER_POWERPC: case LZMA_FILTER_IA64: case LZMA_FILTER_ARM: case LZMA_FILTER_ARMTHUMB: case LZMA_FILTER_SPARC: { static const char bcj_names[][9] = { "x86", "powerpc", "ia64", "arm", "armthumb", "sparc", }; const lzma_options_bcj *opt = filters[i].options; my_snprintf(&pos, &left, "%s", bcj_names[filters[i].id - LZMA_FILTER_X86]); // Show the start offset only when really needed. 
if (opt != NULL && opt->start_offset != 0) my_snprintf(&pos, &left, "=start=%" PRIu32, opt->start_offset); break; } case LZMA_FILTER_DELTA: { const lzma_options_delta *opt = filters[i].options; my_snprintf(&pos, &left, "delta=dist=%" PRIu32, opt->dist); break; } default: // This should be possible only if liblzma is // newer than the xz tool. my_snprintf(&pos, &left, "UNKNOWN"); break; } } return; } extern void message_filters_show(enum message_verbosity v, const lzma_filter *filters) { if (v > verbosity) return; char buf[FILTERS_STR_SIZE]; message_filters_to_str(buf, filters, true); fprintf(stderr, _("%s: Filter chain: %s\n"), progname, buf); return; } extern void message_try_help(void) { // Print this with V_WARNING instead of V_ERROR to prevent it from // showing up when --quiet has been specified. message(V_WARNING, _("Try `%s --help' for more information."), progname); return; } extern void message_version(void) { // It is possible that liblzma version is different than the command // line tool version, so print both. if (opt_robot) { printf("XZ_VERSION=%" PRIu32 "\nLIBLZMA_VERSION=%" PRIu32 "\n", LZMA_VERSION, lzma_version_number()); } else { printf("xz (" PACKAGE_NAME ") " LZMA_VERSION_STRING "\n"); printf("liblzma %s\n", lzma_version_string()); } tuklib_exit(E_SUCCESS, E_ERROR, verbosity != V_SILENT); } extern void message_help(bool long_help) { printf(_("Usage: %s [OPTION]... [FILE]...\n" "Compress or decompress FILEs in the .xz format.\n\n"), progname); // NOTE: The short help doesn't currently have options that // take arguments. 
if (long_help) puts(_("Mandatory arguments to long options are mandatory " "for short options too.\n")); if (long_help) puts(_(" Operation mode:\n")); puts(_( " -z, --compress force compression\n" " -d, --decompress force decompression\n" " -t, --test test compressed file integrity\n" " -l, --list list information about .xz files")); if (long_help) puts(_("\n Operation modifiers:\n")); puts(_( " -k, --keep keep (don't delete) input files\n" " -f, --force force overwrite of output file and (de)compress links\n" " -c, --stdout write to standard output and don't delete input files")); if (long_help) { puts(_( " --single-stream decompress only the first stream, and silently\n" " ignore possible remaining input data")); puts(_( " --no-sparse do not create sparse files when decompressing\n" " -S, --suffix=.SUF use the suffix `.SUF' on compressed files\n" " --files[=FILE] read filenames to process from FILE; if FILE is\n" " omitted, filenames are read from the standard input;\n" " filenames must be terminated with the newline character\n" " --files0[=FILE] like --files but use the null character as terminator")); } if (long_help) { puts(_("\n Basic file format and compression options:\n")); puts(_( " -F, --format=FMT file format to encode or decode; possible values are\n" " `auto' (default), `xz', `lzma', and `raw'\n" " -C, --check=CHECK integrity check type: `none' (use with caution),\n" " `crc32', `crc64' (default), or `sha256'")); puts(_( " --ignore-check don't verify the integrity check when decompressing")); } puts(_( " -0 ... 
-9 compression preset; default is 6; take compressor *and*\n" " decompressor memory usage into account before using 7-9!")); puts(_( " -e, --extreme try to improve compression ratio by using more CPU time;\n" " does not affect decompressor memory requirements")); puts(_( " -T, --threads=NUM use at most NUM threads; the default is 1; set to 0\n" " to use as many threads as there are processor cores")); if (long_help) { puts(_( " --block-size=SIZE\n" " start a new .xz block after every SIZE bytes of input;\n" " use this to set the block size for threaded compression")); puts(_( " --block-list=SIZES\n" " start a new .xz block after the given comma-separated\n" " intervals of uncompressed data")); puts(_( " --flush-timeout=TIMEOUT\n" " when compressing, if more than TIMEOUT milliseconds has\n" " passed since the previous flush and reading more input\n" " would block, all pending data is flushed out" )); puts(_( // xgettext:no-c-format " --memlimit-compress=LIMIT\n" " --memlimit-decompress=LIMIT\n" " -M, --memlimit=LIMIT\n" " set memory usage limit for compression, decompression,\n" " or both; LIMIT is in bytes, % of RAM, or 0 for defaults")); puts(_( " --no-adjust if compression settings exceed the memory usage limit,\n" " give an error instead of adjusting the settings downwards")); } if (long_help) { puts(_( "\n Custom filter chain for compression (alternative for using presets):")); #if defined(HAVE_ENCODER_LZMA1) || defined(HAVE_DECODER_LZMA1) \ || defined(HAVE_ENCODER_LZMA2) || defined(HAVE_DECODER_LZMA2) // TRANSLATORS: The word "literal" in "literal context bits" // means how many "context bits" to use when encoding // literals. A literal is a single 8-bit byte. It doesn't // mean "literally" here. 
puts(_( "\n" " --lzma1[=OPTS] LZMA1 or LZMA2; OPTS is a comma-separated list of zero or\n" " --lzma2[=OPTS] more of the following options (valid values; default):\n" " preset=PRE reset options to a preset (0-9[e])\n" " dict=NUM dictionary size (4KiB - 1536MiB; 8MiB)\n" " lc=NUM number of literal context bits (0-4; 3)\n" " lp=NUM number of literal position bits (0-4; 0)\n" " pb=NUM number of position bits (0-4; 2)\n" " mode=MODE compression mode (fast, normal; normal)\n" " nice=NUM nice length of a match (2-273; 64)\n" " mf=NAME match finder (hc3, hc4, bt2, bt3, bt4; bt4)\n" " depth=NUM maximum search depth; 0=automatic (default)")); #endif puts(_( "\n" " --x86[=OPTS] x86 BCJ filter (32-bit and 64-bit)\n" " --powerpc[=OPTS] PowerPC BCJ filter (big endian only)\n" " --ia64[=OPTS] IA-64 (Itanium) BCJ filter\n" " --arm[=OPTS] ARM BCJ filter (little endian only)\n" " --armthumb[=OPTS] ARM-Thumb BCJ filter (little endian only)\n" " --sparc[=OPTS] SPARC BCJ filter\n" " Valid OPTS for all BCJ filters:\n" " start=NUM start offset for conversions (default=0)")); #if defined(HAVE_ENCODER_DELTA) || defined(HAVE_DECODER_DELTA) puts(_( "\n" " --delta[=OPTS] Delta filter; valid OPTS (valid values; default):\n" " dist=NUM distance between bytes being subtracted\n" " from each other (1-256; 1)")); #endif } if (long_help) puts(_("\n Other options:\n")); puts(_( " -q, --quiet suppress warnings; specify twice to suppress errors too\n" " -v, --verbose be verbose; specify twice for even more verbose")); if (long_help) { puts(_( " -Q, --no-warn make warnings not affect the exit status")); puts(_( " --robot use machine-parsable messages (useful for scripts)")); puts(""); puts(_( " --info-memory display the total amount of RAM and the currently active\n" " memory usage limits, and exit")); puts(_( " -h, --help display the short help (lists only the basic options)\n" " -H, --long-help display this long help and exit")); } else { puts(_( " -h, --help display this short help and exit\n" " -H, 
--long-help display the long help (lists also the advanced options)")); } puts(_( " -V, --version display the version number and exit")); puts(_("\nWith no FILE, or when FILE is -, read standard input.\n")); // TRANSLATORS: This message indicates the bug reporting address // for this package. Please add _another line_ saying // "Report translation bugs to <...>\n" with the email or WWW // address for translation bugs. Thanks. printf(_("Report bugs to <%s> (in English or Finnish).\n"), PACKAGE_BUGREPORT); printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL); #if LZMA_VERSION_STABILITY != LZMA_VERSION_STABILITY_STABLE puts(_( "THIS IS A DEVELOPMENT VERSION NOT INTENDED FOR PRODUCTION USE.")); #endif tuklib_exit(E_SUCCESS, E_ERROR, verbosity != V_SILENT); } Index: head/contrib/xz/src/xz/message.h =================================================================== --- head/contrib/xz/src/xz/message.h (revision 359200) +++ head/contrib/xz/src/xz/message.h (revision 359201) @@ -1,167 +1,168 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file message.h /// \brief Printing messages to stderr // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// /// Verbosity levels enum message_verbosity { V_SILENT, ///< No messages V_ERROR, ///< Only error messages V_WARNING, ///< Errors and warnings V_VERBOSE, ///< Errors, warnings, and verbose statistics V_DEBUG, ///< Very verbose }; /// \brief Signals used for progress message handling extern const int message_progress_sigs[]; /// \brief Initializes the message functions /// /// If an error occurs, this function doesn't return. /// extern void message_init(void); /// Increase verbosity level by one step unless it was at maximum. 
 extern void message_verbosity_increase(void);

 /// Decrease verbosity level by one step unless it was at minimum.
 extern void message_verbosity_decrease(void);

 /// Get the current verbosity level.
 extern enum message_verbosity message_verbosity_get(void);

 /// \brief Print a message if verbosity level is at least "verbosity"
 ///
 /// This doesn't touch the exit status.
 extern void message(enum message_verbosity verbosity, const char *fmt, ...)
 		lzma_attribute((__format__(__printf__, 2, 3)));

 /// \brief Prints a warning and possibly sets exit status
 ///
 /// The message is printed only if verbosity level is at least V_WARNING.
 /// The exit status is set to WARNING unless it was already at ERROR.
 extern void message_warning(const char *fmt, ...)
 		lzma_attribute((__format__(__printf__, 1, 2)));

 /// \brief Prints an error message and sets exit status
 ///
 /// The message is printed only if verbosity level is at least V_ERROR.
 /// The exit status is set to ERROR.
 extern void message_error(const char *fmt, ...)
 		lzma_attribute((__format__(__printf__, 1, 2)));

 /// \brief Prints an error message and exits with EXIT_ERROR
 ///
 /// The message is printed only if verbosity level is at least V_ERROR.
 extern void message_fatal(const char *fmt, ...)
 		lzma_attribute((__format__(__printf__, 1, 2)))
 		lzma_attribute((__noreturn__));

 /// Print an error message that an internal error occurred and exit with
 /// EXIT_ERROR.
 extern void message_bug(void) lzma_attribute((__noreturn__));

 /// Print a message that establishing signal handlers failed, and exit with
 /// exit status ERROR.
 extern void message_signal_handler(void) lzma_attribute((__noreturn__));

 /// Convert lzma_ret to a string.
 extern const char *message_strm(lzma_ret code);

 /// Display how much memory was needed and how much the limit was.
 extern void message_mem_needed(enum message_verbosity v, uint64_t memusage);

 /// Buffer size for message_filters_to_str()
 #define FILTERS_STR_SIZE 512

 /// \brief Get the filter chain as a string
 ///
 /// \param buf Pointer to caller allocated buffer to hold
 ///            the filter chain string
 /// \param filters Pointer to the filter chain
 /// \param all_known If true, all filter options are printed.
 ///                  If false, only the options that get stored
 ///                  into .xz headers are printed.
 extern void message_filters_to_str(char buf[FILTERS_STR_SIZE],
 		const lzma_filter *filters, bool all_known);

 /// Print the filter chain.
 extern void message_filters_show(
 		enum message_verbosity v, const lzma_filter *filters);

 /// Print a message that user should try --help.
 extern void message_try_help(void);

 /// Prints the version number to stdout and exits with exit status SUCCESS.
 extern void message_version(void) lzma_attribute((__noreturn__));

 /// Print the help message.
 extern void message_help(bool long_help) lzma_attribute((__noreturn__));

 /// \brief Set the total number of files to be processed
 ///
 /// Standard input is counted as a file here. This is used when printing
 /// the filename via message_filename().
 extern void message_set_files(unsigned int files);

 /// \brief Set the name of the current file and possibly print it too
 ///
 /// The name is printed immediately if --list was used or if --verbose
 /// was used and stderr is a terminal. Even when the filename isn't printed,
 /// it is stored so that it can be printed later if needed for progress
 /// messages.
 extern void message_filename(const char *src_name);

 /// \brief Start progress info handling
 ///
 /// message_filename() must be called before this function to set
 /// the filename.
 ///
 /// This must be paired with a call to message_progress_end() before the
 /// given *strm becomes invalid.
 ///
 /// \param strm Pointer to lzma_stream used for the coding.
 /// \param in_size Size of the input file, or zero if unknown.
 ///
-extern void message_progress_start(lzma_stream *strm, uint64_t in_size);
+extern void message_progress_start(lzma_stream *strm,
+		bool is_passthru, uint64_t in_size);

 /// Update the progress info if in verbose mode and enough time has passed
 /// since the previous update. This can be called only when
 /// message_progress_start() has already been used.
 extern void message_progress_update(void);

 /// \brief Finishes the progress message if we were in verbose mode
 ///
 /// \param finished True if the whole stream was successfully coded
 ///                 and output written to the output stream.
 ///
 extern void message_progress_end(bool finished);
Index: head/contrib/xz/src/xz/mytime.c
===================================================================
--- head/contrib/xz/src/xz/mytime.c (revision 359200)
+++ head/contrib/xz/src/xz/mytime.c (revision 359201)
@@ -1,89 +1,85 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file mytime.c
 /// \brief Time handling functions
 //
 // Author: Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 #include "private.h"

 #if !(defined(HAVE_CLOCK_GETTIME) && HAVE_DECL_CLOCK_MONOTONIC)
 # include
 #endif

 uint64_t opt_flush_timeout = 0;
-bool flush_needed;

 static uint64_t start_time;
 static uint64_t next_flush;

 /// \brief Get the current time as milliseconds
 ///
 /// It's relative to some point but not necessarily to the UNIX Epoch.
 static uint64_t
 mytime_now(void)
 {
 	// NOTE: HAVE_DECL_CLOCK_MONOTONIC is always defined to 0 or 1.
 #if defined(HAVE_CLOCK_GETTIME) && HAVE_DECL_CLOCK_MONOTONIC
 	// If CLOCK_MONOTONIC was available at compile time but for some
 	// reason isn't at runtime, fallback to CLOCK_REALTIME which
 	// according to POSIX is mandatory for all implementations.
 	static clockid_t clk_id = CLOCK_MONOTONIC;
 	struct timespec tv;
 	while (clock_gettime(clk_id, &tv))
 		clk_id = CLOCK_REALTIME;

-	return (uint64_t)(tv.tv_sec) * UINT64_C(1000) + tv.tv_nsec / 1000000;
+	return (uint64_t)tv.tv_sec * 1000 + (uint64_t)(tv.tv_nsec / 1000000);
 #else
 	struct timeval tv;
 	gettimeofday(&tv, NULL);
-	return (uint64_t)(tv.tv_sec) * UINT64_C(1000) + tv.tv_usec / 1000;
+	return (uint64_t)tv.tv_sec * 1000 + (uint64_t)(tv.tv_usec / 1000);
 #endif
 }

 extern void
 mytime_set_start_time(void)
 {
 	start_time = mytime_now();
-	next_flush = start_time + opt_flush_timeout;
-	flush_needed = false;
 	return;
 }

 extern uint64_t
 mytime_get_elapsed(void)
 {
 	return mytime_now() - start_time;
 }

 extern void
 mytime_set_flush_time(void)
 {
 	next_flush = mytime_now() + opt_flush_timeout;
-	flush_needed = false;
 	return;
 }

 extern int
 mytime_get_flush_timeout(void)
 {
 	if (opt_flush_timeout == 0 || opt_mode != MODE_COMPRESS)
 		return -1;

 	const uint64_t now = mytime_now();
 	if (now >= next_flush)
 		return 0;

 	const uint64_t remaining = next_flush - now;
 	return remaining > INT_MAX ? INT_MAX : (int)remaining;
 }
Index: head/contrib/xz/src/xz/mytime.h
===================================================================
--- head/contrib/xz/src/xz/mytime.h (revision 359200)
+++ head/contrib/xz/src/xz/mytime.h (revision 359201)
@@ -1,47 +1,43 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file mytime.h
 /// \brief Time handling functions
 //
 // Author: Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
 //
 ///////////////////////////////////////////////////////////////////////////////

 /// \brief Number of milliseconds to between LZMA_SYNC_FLUSHes
 ///
 /// If 0, timed flushing is disabled.
Otherwise if no more input is available
 /// and not at the end of the file and at least opt_flush_timeout milliseconds
 /// has elapsed since the start of compression or the previous flushing
 /// (LZMA_SYNC_FLUSH or LZMA_FULL_FLUSH), set LZMA_SYNC_FLUSH to flush
 /// the pending data.
 extern uint64_t opt_flush_timeout;

-/// \brief True when flushing is needed due to expired timeout
-extern bool flush_needed;
-
-
 /// \brief Store the time when (de)compression was started
 ///
 /// The start time is also stored as the time of the first flush.
 extern void mytime_set_start_time(void);

 /// \brief Get the number of milliseconds since the operation started
 extern uint64_t mytime_get_elapsed(void);

 /// \brief Store the time of when compressor was flushed
 extern void mytime_set_flush_time(void);

 /// \brief Get the number of milliseconds until the next flush
 ///
 /// This returns -1 if no timed flushing is used.
 ///
-/// The return value is inteded for use with poll().
+/// The return value is intended for use with poll().
 extern int mytime_get_flush_timeout(void);
Index: head/contrib/xz/src/xz/options.c
===================================================================
--- head/contrib/xz/src/xz/options.c (revision 359200)
+++ head/contrib/xz/src/xz/options.c (revision 359201)
@@ -1,363 +1,363 @@
 ///////////////////////////////////////////////////////////////////////////////
 //
 /// \file options.c
 /// \brief Parser for filter-specific options
 //
 // Author: Lasse Collin
 //
 // This file has been put into the public domain.
 // You can do whatever you want with this file.
// /////////////////////////////////////////////////////////////////////////////// #include "private.h" /////////////////// // Generic stuff // /////////////////// typedef struct { const char *name; uint64_t id; } name_id_map; typedef struct { const char *name; const name_id_map *map; uint64_t min; uint64_t max; } option_map; /// Parses option=value pairs that are separated with commas: /// opt=val,opt=val,opt=val /// /// Each option is a string, that is converted to an integer using the /// index where the option string is in the array. /// /// Value can be /// - a string-id map mapping a list of possible string values to integers /// (opts[i].map != NULL, opts[i].min and opts[i].max are ignored); /// - a number with minimum and maximum value limit /// (opts[i].map == NULL && opts[i].min != UINT64_MAX); /// - a string that will be parsed by the filter-specific code /// (opts[i].map == NULL && opts[i].min == UINT64_MAX, opts[i].max ignored) /// /// When parsing both option and value succeed, a filter-specific function /// is called, which should update the given value to filter-specific /// options structure. /// /// \param str String containing the options from the command line /// \param opts Filter-specific option map /// \param set Filter-specific function to update filter_options /// \param filter_options Pointer to filter-specific options structure /// /// \return Returns only if no errors occur. 
/// static void parse_options(const char *str, const option_map *opts, void (*set)(void *filter_options, unsigned key, uint64_t value, const char *valuestr), void *filter_options) { if (str == NULL || str[0] == '\0') return; char *s = xstrdup(str); char *name = s; while (*name != '\0') { if (*name == ',') { ++name; continue; } char *split = strchr(name, ','); if (split != NULL) *split = '\0'; char *value = strchr(name, '='); if (value != NULL) *value++ = '\0'; if (value == NULL || value[0] == '\0') message_fatal(_("%s: Options must be `name=value' " "pairs separated with commas"), str); // Look for the option name from the option map. unsigned i = 0; while (true) { if (opts[i].name == NULL) message_fatal(_("%s: Invalid option name"), name); if (strcmp(name, opts[i].name) == 0) break; ++i; } // Option was found from the map. See how we should handle it. if (opts[i].map != NULL) { // value is a string which we should map // to an integer. unsigned j; for (j = 0; opts[i].map[j].name != NULL; ++j) { if (strcmp(opts[i].map[j].name, value) == 0) break; } if (opts[i].map[j].name == NULL) message_fatal(_("%s: Invalid option value"), value); set(filter_options, i, opts[i].map[j].id, value); } else if (opts[i].min == UINT64_MAX) { // value is a special string that will be // parsed by set(). set(filter_options, i, 0, value); } else { // value is an integer. const uint64_t v = str_to_uint64(name, value, opts[i].min, opts[i].max); set(filter_options, i, v, value); } // Check if it was the last option. 
if (split == NULL) break; name = split + 1; } free(s); return; } /////////// // Delta // /////////// enum { OPT_DIST, }; static void set_delta(void *options, unsigned key, uint64_t value, const char *valuestr lzma_attribute((__unused__))) { lzma_options_delta *opt = options; switch (key) { case OPT_DIST: opt->dist = value; break; } } extern lzma_options_delta * options_delta(const char *str) { static const option_map opts[] = { { "dist", NULL, LZMA_DELTA_DIST_MIN, LZMA_DELTA_DIST_MAX }, { NULL, NULL, 0, 0 } }; lzma_options_delta *options = xmalloc(sizeof(lzma_options_delta)); *options = (lzma_options_delta){ // It's hard to give a useful default for this. .type = LZMA_DELTA_TYPE_BYTE, .dist = LZMA_DELTA_DIST_MIN, }; parse_options(str, opts, &set_delta, options); return options; } ///////// // BCJ // ///////// enum { OPT_START_OFFSET, }; static void set_bcj(void *options, unsigned key, uint64_t value, const char *valuestr lzma_attribute((__unused__))) { lzma_options_bcj *opt = options; switch (key) { case OPT_START_OFFSET: opt->start_offset = value; break; } } extern lzma_options_bcj * options_bcj(const char *str) { static const option_map opts[] = { { "start", NULL, 0, UINT32_MAX }, { NULL, NULL, 0, 0 } }; lzma_options_bcj *options = xmalloc(sizeof(lzma_options_bcj)); *options = (lzma_options_bcj){ .start_offset = 0, }; parse_options(str, opts, &set_bcj, options); return options; } ////////// // LZMA // ////////// enum { OPT_PRESET, OPT_DICT, OPT_LC, OPT_LP, OPT_PB, OPT_MODE, OPT_NICE, OPT_MF, OPT_DEPTH, }; static void lzma_attribute((__noreturn__)) error_lzma_preset(const char *valuestr) { message_fatal(_("Unsupported LZMA1/LZMA2 preset: %s"), valuestr); } static void set_lzma(void *options, unsigned key, uint64_t value, const char *valuestr) { lzma_options_lzma *opt = options; switch (key) { case OPT_PRESET: { if (valuestr[0] < '0' || valuestr[0] > '9') error_lzma_preset(valuestr); - uint32_t preset = valuestr[0] - '0'; + uint32_t preset = (uint32_t)(valuestr[0] 
- '0'); // Currently only "e" is supported as a modifier, // so keep this simple for now. if (valuestr[1] != '\0') { if (valuestr[1] == 'e') preset |= LZMA_PRESET_EXTREME; else error_lzma_preset(valuestr); if (valuestr[2] != '\0') error_lzma_preset(valuestr); } if (lzma_lzma_preset(options, preset)) error_lzma_preset(valuestr); break; } case OPT_DICT: opt->dict_size = value; break; case OPT_LC: opt->lc = value; break; case OPT_LP: opt->lp = value; break; case OPT_PB: opt->pb = value; break; case OPT_MODE: opt->mode = value; break; case OPT_NICE: opt->nice_len = value; break; case OPT_MF: opt->mf = value; break; case OPT_DEPTH: opt->depth = value; break; } } extern lzma_options_lzma * options_lzma(const char *str) { static const name_id_map modes[] = { { "fast", LZMA_MODE_FAST }, { "normal", LZMA_MODE_NORMAL }, { NULL, 0 } }; static const name_id_map mfs[] = { { "hc3", LZMA_MF_HC3 }, { "hc4", LZMA_MF_HC4 }, { "bt2", LZMA_MF_BT2 }, { "bt3", LZMA_MF_BT3 }, { "bt4", LZMA_MF_BT4 }, { NULL, 0 } }; static const option_map opts[] = { { "preset", NULL, UINT64_MAX, 0 }, { "dict", NULL, LZMA_DICT_SIZE_MIN, (UINT32_C(1) << 30) + (UINT32_C(1) << 29) }, { "lc", NULL, LZMA_LCLP_MIN, LZMA_LCLP_MAX }, { "lp", NULL, LZMA_LCLP_MIN, LZMA_LCLP_MAX }, { "pb", NULL, LZMA_PB_MIN, LZMA_PB_MAX }, { "mode", modes, 0, 0 }, { "nice", NULL, 2, 273 }, { "mf", mfs, 0, 0 }, { "depth", NULL, 0, UINT32_MAX }, { NULL, NULL, 0, 0 } }; lzma_options_lzma *options = xmalloc(sizeof(lzma_options_lzma)); if (lzma_lzma_preset(options, LZMA_PRESET_DEFAULT)) message_bug(); parse_options(str, opts, &set_lzma, options); if (options->lc + options->lp > LZMA_LCLP_MAX) message_fatal(_("The sum of lc and lp must not exceed 4")); const uint32_t nice_len_min = options->mf & 0x0F; if (options->nice_len < nice_len_min) message_fatal(_("The selected match finder requires at " "least nice=%" PRIu32), nice_len_min); return options; } Index: head/contrib/xz/src/xz/private.h 
=================================================================== --- head/contrib/xz/src/xz/private.h (revision 359200) +++ head/contrib/xz/src/xz/private.h (revision 359201) @@ -1,66 +1,66 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file private.h -/// \brief Common includes, definions, and prototypes +/// \brief Common includes, definitions, and prototypes // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "sysdefs.h" #include "mythread.h" #include "lzma.h" #include #include #include #include #include #include #include #include "tuklib_gettext.h" #include "tuklib_progname.h" #include "tuklib_exit.h" #include "tuklib_mbstr.h" #if defined(_WIN32) && !defined(__CYGWIN__) # define WIN32_LEAN_AND_MEAN # include #endif #ifndef STDIN_FILENO # define STDIN_FILENO (fileno(stdin)) #endif #ifndef STDOUT_FILENO # define STDOUT_FILENO (fileno(stdout)) #endif #ifndef STDERR_FILENO # define STDERR_FILENO (fileno(stderr)) #endif #ifdef HAVE_CAPSICUM # define ENABLE_SANDBOX 1 #endif #include "main.h" #include "mytime.h" #include "coder.h" #include "message.h" #include "args.h" #include "hardware.h" #include "file_io.h" #include "options.h" #include "signals.h" #include "suffix.h" #include "util.h" #ifdef HAVE_DECODERS # include "list.h" #endif Index: head/contrib/xz/src/xz/signals.c =================================================================== --- head/contrib/xz/src/xz/signals.c (revision 359200) +++ head/contrib/xz/src/xz/signals.c (revision 359201) @@ -1,209 +1,209 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file signals.c /// \brief Handling signals to abort operation // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. 
// /////////////////////////////////////////////////////////////////////////////// #include "private.h" volatile sig_atomic_t user_abort = false; #if !(defined(_WIN32) && !defined(__CYGWIN__)) /// If we were interrupted by a signal, we store the signal number so that /// we can raise that signal to kill the program when all cleanups have /// been done. static volatile sig_atomic_t exit_signal = 0; -/// Mask of signals for which have have established a signal handler to set +/// Mask of signals for which we have established a signal handler to set /// user_abort to true. static sigset_t hooked_signals; /// True once signals_init() has finished. This is used to skip blocking /// signals (with uninitialized hooked_signals) if signals_block() and /// signals_unblock() are called before signals_init() has been called. static bool signals_are_initialized = false; /// signals_block() and signals_unblock() can be called recursively. static size_t signals_block_count = 0; static void signal_handler(int sig) { exit_signal = sig; user_abort = true; #ifndef TUKLIB_DOSLIKE io_write_to_user_abort_pipe(); #endif return; } extern void signals_init(void) { // List of signals for which we establish the signal handler. static const int sigs[] = { SIGINT, SIGTERM, #ifdef SIGHUP SIGHUP, #endif #ifdef SIGPIPE SIGPIPE, #endif #ifdef SIGXCPU SIGXCPU, #endif #ifdef SIGXFSZ SIGXFSZ, #endif }; // Mask of the signals for which we have established a signal handler. sigemptyset(&hooked_signals); for (size_t i = 0; i < ARRAY_SIZE(sigs); ++i) sigaddset(&hooked_signals, sigs[i]); #ifdef SIGALRM // Add also the signals from message.c to hooked_signals. for (size_t i = 0; message_progress_sigs[i] != 0; ++i) sigaddset(&hooked_signals, message_progress_sigs[i]); #endif // Using "my_sa" because "sa" may conflict with a sockaddr variable // from system headers on Solaris. struct sigaction my_sa; // All the signals that we handle we also blocked while the signal // handler runs. 
my_sa.sa_mask = hooked_signals; // Don't set SA_RESTART, because we want EINTR so that we can check // for user_abort and cleanup before exiting. We block the signals // for which we have established a handler when we don't want EINTR. my_sa.sa_flags = 0; my_sa.sa_handler = &signal_handler; for (size_t i = 0; i < ARRAY_SIZE(sigs); ++i) { // If the parent process has left some signals ignored, // we don't unignore them. struct sigaction old; if (sigaction(sigs[i], NULL, &old) == 0 && old.sa_handler == SIG_IGN) continue; // Establish the signal handler. if (sigaction(sigs[i], &my_sa, NULL)) message_signal_handler(); } signals_are_initialized = true; return; } #ifndef __VMS extern void signals_block(void) { if (signals_are_initialized) { if (signals_block_count++ == 0) { const int saved_errno = errno; mythread_sigmask(SIG_BLOCK, &hooked_signals, NULL); errno = saved_errno; } } return; } extern void signals_unblock(void) { if (signals_are_initialized) { assert(signals_block_count > 0); if (--signals_block_count == 0) { const int saved_errno = errno; mythread_sigmask(SIG_UNBLOCK, &hooked_signals, NULL); errno = saved_errno; } } return; } #endif extern void signals_exit(void) { - const int sig = exit_signal; + const int sig = (int)exit_signal; if (sig != 0) { #if defined(TUKLIB_DOSLIKE) || defined(__VMS) // Don't raise(), set only exit status. This avoids // printing unwanted message about SIGINT when the user // presses C-c. set_exit_status(E_ERROR); #else struct sigaction sa; sa.sa_handler = SIG_DFL; sigfillset(&sa.sa_mask); sa.sa_flags = 0; sigaction(sig, &sa, NULL); - raise(exit_signal); + raise(sig); #endif } return; } #else // While Windows has some very basic signal handling functions as required // by C89, they are not really used, and e.g. SIGINT doesn't work exactly // the way it does on POSIX (Windows creates a new thread for the signal // handler). 
Instead, we use SetConsoleCtrlHandler() to catch user // pressing C-c, because that seems to be the recommended way to do it. // // NOTE: This doesn't work under MSYS. Trying with SIGINT doesn't work // either even if it appeared to work at first. So test using Windows // console window. static BOOL WINAPI signal_handler(DWORD type lzma_attribute((__unused__))) { // Since we don't get a signal number which we could raise() at // signals_exit() like on POSIX, just set the exit status to // indicate an error, so that we cannot return with zero exit status. set_exit_status(E_ERROR); user_abort = true; return TRUE; } extern void signals_init(void) { if (!SetConsoleCtrlHandler(&signal_handler, TRUE)) message_signal_handler(); return; } #endif Index: head/contrib/xz/src/xz/util.c =================================================================== --- head/contrib/xz/src/xz/util.c (revision 359200) +++ head/contrib/xz/src/xz/util.c (revision 359201) @@ -1,288 +1,298 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file util.c /// \brief Miscellaneous utility functions // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "private.h" #include /// Buffers for uint64_to_str() and uint64_to_nicestr() static char bufs[4][128]; /// Thousand separator support in uint64_to_str() and uint64_to_nicestr() static enum { UNKNOWN, WORKS, BROKEN } thousand = UNKNOWN; extern void * xrealloc(void *ptr, size_t size) { assert(size > 0); // Save ptr so that we can free it if realloc fails. // The point is that message_fatal ends up calling stdio functions // which in some libc implementations might allocate memory from // the heap. Freeing ptr improves the chances that there's free // memory for stdio functions if they need it. 
void *p = ptr; ptr = realloc(ptr, size); if (ptr == NULL) { const int saved_errno = errno; free(p); message_fatal("%s", strerror(saved_errno)); } return ptr; } extern char * xstrdup(const char *src) { assert(src != NULL); const size_t size = strlen(src) + 1; char *dest = xmalloc(size); return memcpy(dest, src, size); } extern uint64_t str_to_uint64(const char *name, const char *value, uint64_t min, uint64_t max) { uint64_t result = 0; // Skip blanks. while (*value == ' ' || *value == '\t') ++value; // Accept special value "max". Supporting "min" doesn't seem useful. if (strcmp(value, "max") == 0) return max; if (*value < '0' || *value > '9') message_fatal(_("%s: Value is not a non-negative " "decimal integer"), value); do { // Don't overflow. if (result > UINT64_MAX / 10) goto error; result *= 10; // Another overflow check - const uint32_t add = *value - '0'; + const uint32_t add = (uint32_t)(*value - '0'); if (UINT64_MAX - add < result) goto error; result += add; ++value; } while (*value >= '0' && *value <= '9'); if (*value != '\0') { // Look for suffix. Originally this supported both base-2 // and base-10, but since there seems to be little need // for base-10 in this program, treat everything as base-2 // and also be more relaxed about the case of the first // letter of the suffix. uint64_t multiplier = 0; if (*value == 'k' || *value == 'K') multiplier = UINT64_C(1) << 10; else if (*value == 'm' || *value == 'M') multiplier = UINT64_C(1) << 20; else if (*value == 'g' || *value == 'G') multiplier = UINT64_C(1) << 30; ++value; // Allow also e.g. Ki, KiB, and KB. if (*value != '\0' && strcmp(value, "i") != 0 && strcmp(value, "iB") != 0 && strcmp(value, "B") != 0) multiplier = 0; if (multiplier == 0) { message(V_ERROR, _("%s: Invalid multiplier suffix"), value - 1); message_fatal(_("Valid suffixes are `KiB' (2^10), " "`MiB' (2^20), and `GiB' (2^30).")); } // Don't overflow here either. 
if (result > UINT64_MAX / multiplier) goto error; result *= multiplier; } if (result < min || result > max) goto error; return result; error: message_fatal(_("Value of the option `%s' must be in the range " "[%" PRIu64 ", %" PRIu64 "]"), name, min, max); } extern uint64_t round_up_to_mib(uint64_t n) { return (n >> 20) + ((n & ((UINT32_C(1) << 20) - 1)) != 0); } -/// Check if thousand separator is supported. Run-time checking is easiest, -/// because it seems to be sometimes lacking even on POSIXish system. +/// Check if thousands separator is supported. Run-time checking is easiest +/// because it seems to be sometimes lacking even on a POSIXish system. +/// Note that trying to use thousands separators when snprintf() doesn't +/// support them results in undefined behavior. This just has happened to +/// work well enough in practice. +/// +/// DJGPP 2.05 added support for thousands separators but it's broken +/// at least under WinXP with Finnish locale that uses a non-breaking space +/// as the thousands separator. Workaround by disabling thousands separators +/// for DJGPP builds. static void check_thousand_sep(uint32_t slot) { if (thousand == UNKNOWN) { bufs[slot][0] = '\0'; +#ifndef __DJGPP__ snprintf(bufs[slot], sizeof(bufs[slot]), "%'u", 1U); +#endif thousand = bufs[slot][0] == '1' ? 
WORKS : BROKEN; } return; } extern const char * uint64_to_str(uint64_t value, uint32_t slot) { assert(slot < ARRAY_SIZE(bufs)); check_thousand_sep(slot); if (thousand == WORKS) snprintf(bufs[slot], sizeof(bufs[slot]), "%'" PRIu64, value); else snprintf(bufs[slot], sizeof(bufs[slot]), "%" PRIu64, value); return bufs[slot]; } extern const char * uint64_to_nicestr(uint64_t value, enum nicestr_unit unit_min, enum nicestr_unit unit_max, bool always_also_bytes, uint32_t slot) { assert(unit_min <= unit_max); assert(unit_max <= NICESTR_TIB); assert(slot < ARRAY_SIZE(bufs)); check_thousand_sep(slot); enum nicestr_unit unit = NICESTR_B; char *pos = bufs[slot]; size_t left = sizeof(bufs[slot]); if ((unit_min == NICESTR_B && value < 10000) || unit_max == NICESTR_B) { // The value is shown as bytes. if (thousand == WORKS) my_snprintf(&pos, &left, "%'u", (unsigned int)value); else my_snprintf(&pos, &left, "%u", (unsigned int)value); } else { // Scale the value to a nicer unit. Unless unit_min and // unit_max limit us, we will show at most five significant // digits with one decimal place. double d = (double)(value); do { d /= 1024.0; ++unit; } while (unit < unit_min || (d > 9999.9 && unit < unit_max)); if (thousand == WORKS) my_snprintf(&pos, &left, "%'.1f", d); else my_snprintf(&pos, &left, "%.1f", d); } static const char suffix[5][4] = { "B", "KiB", "MiB", "GiB", "TiB" }; my_snprintf(&pos, &left, " %s", suffix[unit]); if (always_also_bytes && value >= 10000) { if (thousand == WORKS) snprintf(pos, left, " (%'" PRIu64 " B)", value); else snprintf(pos, left, " (%" PRIu64 " B)", value); } return bufs[slot]; } extern void my_snprintf(char **pos, size_t *left, const char *fmt, ...) { va_list ap; va_start(ap, fmt); const int len = vsnprintf(*pos, *left, fmt, ap); va_end(ap); // If an error occurred, we want the caller to think that the whole // buffer was used. This way no more data will be written to the // buffer. 
We don't need better error handling here, although it // is possible that the result looks garbage on the terminal if // e.g. an UTF-8 character gets split. That shouldn't (easily) // happen though, because the buffers used have some extra room. if (len < 0 || (size_t)(len) >= *left) { *left = 0; } else { *pos += len; - *left -= len; + *left -= (size_t)(len); } return; } extern bool is_empty_filename(const char *filename) { if (filename[0] == '\0') { message_error(_("Empty filename, skipping")); return true; } return false; } extern bool is_tty_stdin(void) { const bool ret = isatty(STDIN_FILENO); if (ret) message_error(_("Compressed data cannot be read from " "a terminal")); return ret; } extern bool is_tty_stdout(void) { const bool ret = isatty(STDOUT_FILENO); if (ret) message_error(_("Compressed data cannot be written to " "a terminal")); return ret; } Index: head/contrib/xz/src/xz/xz.1 =================================================================== --- head/contrib/xz/src/xz/xz.1 (revision 359200) +++ head/contrib/xz/src/xz/xz.1 (revision 359201) @@ -1,2805 +1,2805 @@ '\" t .\" .\" Author: Lasse Collin .\" .\" This file has been put into the public domain. .\" You can do whatever you want with this file. .\" -.TH XZ 1 "2017-04-19" "Tukaani" "XZ Utils" +.TH XZ 1 "2020-02-01" "Tukaani" "XZ Utils" . .SH NAME xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files . .SH SYNOPSIS .B xz .RI [ option... ] .RI [ file... ] . .SH COMMAND ALIASES .B unxz is equivalent to .BR "xz \-\-decompress" . .br .B xzcat is equivalent to .BR "xz \-\-decompress \-\-stdout" . .br .B lzma is equivalent to .BR "xz \-\-format=lzma" . .br .B unlzma is equivalent to .BR "xz \-\-format=lzma \-\-decompress" . .br .B lzcat is equivalent to .BR "xz \-\-format=lzma \-\-decompress \-\-stdout" . 
.PP When writing scripts that need to decompress files, it is recommended to always use the name .B xz with appropriate arguments .RB ( "xz \-d" or .BR "xz \-dc" ) instead of the names .B unxz and .BR xzcat . . .SH DESCRIPTION .B xz is a general-purpose data compression tool with command line syntax similar to .BR gzip (1) and .BR bzip2 (1). The native file format is the .B .xz format, but the legacy .B .lzma format used by LZMA Utils and raw compressed streams with no container format headers are also supported. .PP .B xz compresses or decompresses each .I file according to the selected operation mode. If no .I files are given or .I file is .BR \- , .B xz reads from standard input and writes the processed data to standard output. .B xz will refuse (display an error and skip the .IR file ) to write compressed data to standard output if it is a terminal. Similarly, .B xz will refuse to read compressed data from standard input if it is a terminal. .PP Unless .B \-\-stdout is specified, .I files other than .B \- are written to a new file whose name is derived from the source .I file name: .IP \(bu 3 When compressing, the suffix of the target file format .RB ( .xz or .BR .lzma ) is appended to the source filename to get the target filename. .IP \(bu 3 When decompressing, the .B .xz or .B .lzma suffix is removed from the filename to get the target filename. .B xz also recognizes the suffixes .B .txz and .BR .tlz , and replaces them with the .B .tar suffix. .PP If the target file already exists, an error is displayed and the .I file is skipped. .PP Unless writing to standard output, .B xz will display a warning and skip the .I file if any of the following applies: .IP \(bu 3 .I File is not a regular file. Symbolic links are not followed, and thus they are not considered to be regular files. .IP \(bu 3 .I File has more than one hard link. .IP \(bu 3 .I File has setuid, setgid, or sticky bit set. 
.IP \(bu 3 The operation mode is set to compress and the .I file already has a suffix of the target file format .RB ( .xz or .B .txz when compressing to the .B .xz format, and .B .lzma or .B .tlz when compressing to the .B .lzma format). .IP \(bu 3 The operation mode is set to decompress and the .I file doesn't have a suffix of any of the supported file formats .RB ( .xz , .BR .txz , .BR .lzma , or .BR .tlz ). .PP After successfully compressing or decompressing the .IR file , .B xz copies the owner, group, permissions, access time, and modification time from the source .I file to the target file. If copying the group fails, the permissions are modified so that the target file doesn't become accessible to users who didn't have permission to access the source .IR file . .B xz doesn't support copying other metadata like access control lists or extended attributes yet. .PP Once the target file has been successfully closed, the source .I file is removed unless .B \-\-keep was specified. The source .I file is never removed if the output is written to standard output. .PP Sending .B SIGINFO or .B SIGUSR1 to the .B xz process makes it print progress information to standard error. This has only limited use since when standard error is a terminal, using .B \-\-verbose will display an automatically updating progress indicator. . .SS "Memory usage" The memory usage of .B xz varies from a few hundred kilobytes to several gigabytes depending on the compression settings. The settings used when compressing a file determine the memory requirements of the decompressor. Typically the decompressor needs 5\ % to 20\ % of the amount of memory that the compressor needed when creating the file. For example, decompressing a file created with .B xz \-9 currently requires 65\ MiB of memory. Still, it is possible to have .B .xz files that require several gigabytes of memory to decompress. .PP Especially users of older systems may find the possibility of very large memory usage annoying. 
To prevent uncomfortable surprises, .B xz has a built-in memory usage limiter, which is disabled by default. While some operating systems provide ways to limit the memory usage of processes, relying on it wasn't deemed to be flexible enough (e.g. using .BR ulimit (1) to limit virtual memory tends to cripple .BR mmap (2)). .PP The memory usage limiter can be enabled with the command line option \fB\-\-memlimit=\fIlimit\fR. Often it is more convenient to enable the limiter by default by setting the environment variable .BR XZ_DEFAULTS , e.g.\& .BR XZ_DEFAULTS=\-\-memlimit=150MiB . It is possible to set the limits separately for compression and decompression by using \fB\-\-memlimit\-compress=\fIlimit\fR and \fB\-\-memlimit\-decompress=\fIlimit\fR. Using these two options outside .B XZ_DEFAULTS is rarely useful because a single run of .B xz cannot do both compression and decompression and .BI \-\-memlimit= limit (or \fB\-M\fR \fIlimit\fR) is shorter to type on the command line. .PP If the specified memory usage limit is exceeded when decompressing, .B xz will display an error and decompressing the file will fail. If the limit is exceeded when compressing, .B xz will try to scale the settings down so that the limit is no longer exceeded (except when using \fB\-\-format=raw\fR or \fB\-\-no\-adjust\fR). This way the operation won't fail unless the limit is very small. The scaling of the settings is done in steps that don't match the compression level presets, e.g. if the limit is only slightly less than the amount required for .BR "xz \-9" , the settings will be scaled down only a little, not all the way down to .BR "xz \-8" . . .SS "Concatenation and padding with .xz files" It is possible to concatenate .B .xz files as is. .B xz will decompress such files as if they were a single .B .xz file. .PP It is possible to insert padding between the concatenated parts or after the last part. 
The padding must consist of null bytes and the size of the padding must be a multiple of four bytes. This can be useful e.g. if the .B .xz file is stored on a medium that measures file sizes in 512-byte blocks. .PP Concatenation and padding are not allowed with .B .lzma files or raw streams. . .SH OPTIONS . .SS "Integer suffixes and special values" In most places where an integer argument is expected, an optional suffix is supported to easily indicate large integers. There must be no space between the integer and the suffix. .TP .B KiB Multiply the integer by 1,024 (2^10). .BR Ki , .BR k , .BR kB , .BR K , and .B KB are accepted as synonyms for .BR KiB . .TP .B MiB Multiply the integer by 1,048,576 (2^20). .BR Mi , .BR m , .BR M , and .B MB are accepted as synonyms for .BR MiB . .TP .B GiB Multiply the integer by 1,073,741,824 (2^30). .BR Gi , .BR g , .BR G , and .B GB are accepted as synonyms for .BR GiB . .PP The special value .B max can be used to indicate the maximum integer value supported by the option. . .SS "Operation mode" If multiple operation mode options are given, the last one takes effect. .TP .BR \-z ", " \-\-compress Compress. This is the default operation mode when no operation mode option is specified and no other operation mode is implied from the command name (for example, .B unxz implies .BR \-\-decompress ). .TP .BR \-d ", " \-\-decompress ", " \-\-uncompress Decompress. .TP .BR \-t ", " \-\-test Test the integrity of compressed .IR files . This option is equivalent to .B "\-\-decompress \-\-stdout" except that the decompressed data is discarded instead of being written to standard output. No files are created or removed. .TP .BR \-l ", " \-\-list Print information about compressed .IR files . No uncompressed output is produced, and no files are created or removed. In list mode, the program cannot read the compressed data from standard input or from other unseekable sources. 
.IP "" The default listing shows basic information about .IR files , one file per line. To get more detailed information, use also the .B \-\-verbose option. For even more information, use .B \-\-verbose twice, but note that this may be slow, because getting all the extra information requires many seeks. The width of verbose output exceeds 80 characters, so piping the output to e.g.\& .B "less\ \-S" may be convenient if the terminal isn't wide enough. .IP "" The exact output may vary between .B xz versions and different locales. For machine-readable output, .B \-\-robot \-\-list should be used. . .SS "Operation modifiers" .TP .BR \-k ", " \-\-keep Don't delete the input files. .TP .BR \-f ", " \-\-force This option has several effects: .RS .IP \(bu 3 If the target file already exists, delete it before compressing or decompressing. .IP \(bu 3 Compress or decompress even if the input is a symbolic link to a regular file, has more than one hard link, or has the setuid, setgid, or sticky bit set. The setuid, setgid, and sticky bits are not copied to the target file. .IP \(bu 3 When used with .B \-\-decompress .BR \-\-stdout and .B xz cannot recognize the type of the source file, copy the source file as is to standard output. This allows .B xzcat .B \-\-force to be used like .BR cat (1) for files that have not been compressed with .BR xz . Note that in future, .B xz might support new compressed file formats, which may make .B xz decompress more types of files instead of copying them as is to standard output. .BI \-\-format= format can be used to restrict .B xz to decompress only a single file format. .RE .TP .BR \-c ", " \-\-stdout ", " \-\-to\-stdout Write the compressed or decompressed data to standard output instead of a file. This implies .BR \-\-keep . .TP .B \-\-single\-stream Decompress only the first .B .xz stream, and silently ignore possible remaining input data following the stream. Normally such trailing garbage makes .B xz display an error. 
.IP "" .B xz never decompresses more than one stream from .B .lzma files or raw streams, but this option still makes .B xz ignore the possible trailing data after the .B .lzma file or raw stream. .IP "" This option has no effect if the operation mode is not .B \-\-decompress or .BR \-\-test . .TP .B \-\-no\-sparse Disable creation of sparse files. By default, if decompressing into a regular file, .B xz tries to make the file sparse if the decompressed data contains long sequences of binary zeros. It also works when writing to standard output as long as standard output is connected to a regular file and certain additional conditions are met to make it safe. Creating sparse files may save disk space and speed up the decompression by reducing the amount of disk I/O. .TP \fB\-S\fR \fI.suf\fR, \fB\-\-suffix=\fI.suf When compressing, use .I .suf as the suffix for the target file instead of .B .xz or .BR .lzma . If not writing to standard output and the source file already has the suffix .IR .suf , a warning is displayed and the file is skipped. .IP "" When decompressing, recognize files with the suffix .I .suf in addition to files with the .BR .xz , .BR .txz , .BR .lzma , or .B .tlz suffix. If the source file has the suffix .IR .suf , the suffix is removed to get the target filename. .IP "" When compressing or decompressing raw streams .RB ( \-\-format=raw ), the suffix must always be specified unless writing to standard output, because there is no default suffix for raw streams. .TP \fB\-\-files\fR[\fB=\fIfile\fR] Read the filenames to process from .IR file ; if .I file is omitted, filenames are read from standard input. Filenames must be terminated with the newline character. A dash .RB ( \- ) is taken as a regular filename; it doesn't mean standard input. If filenames are given also as command line arguments, they are processed before the filenames read from .IR file . 
.TP \fB\-\-files0\fR[\fB=\fIfile\fR] This is identical to \fB\-\-files\fR[\fB=\fIfile\fR] except that each filename must be terminated with the null character. . .SS "Basic file format and compression options" .TP \fB\-F\fR \fIformat\fR, \fB\-\-format=\fIformat Specify the file .I format to compress or decompress: .RS .TP .B auto This is the default. When compressing, .B auto is equivalent to .BR xz . When decompressing, the format of the input file is automatically detected. Note that raw streams (created with .BR \-\-format=raw ) cannot be auto-detected. .TP .B xz Compress to the .B .xz file format, or accept only .B .xz files when decompressing. .TP .BR lzma ", " alone Compress to the legacy .B .lzma file format, or accept only .B .lzma files when decompressing. The alternative name .B alone is provided for backwards compatibility with LZMA Utils. .TP .B raw Compress or uncompress a raw stream (no headers). This is meant for advanced users only. To decode raw streams, you need use .B \-\-format=raw and explicitly specify the filter chain, which normally would have been stored in the container headers. .RE .TP \fB\-C\fR \fIcheck\fR, \fB\-\-check=\fIcheck Specify the type of the integrity check. The check is calculated from the uncompressed data and stored in the .B .xz file. This option has an effect only when compressing into the .B .xz format; the .B .lzma format doesn't support integrity checks. The integrity check (if any) is verified when the .B .xz file is decompressed. .IP "" Supported .I check types: .RS .TP .B none Don't calculate an integrity check at all. This is usually a bad idea. This can be useful when integrity of the data is verified by other means anyway. .TP .B crc32 Calculate CRC32 using the polynomial from IEEE-802.3 (Ethernet). .TP .B crc64 Calculate CRC64 using the polynomial from ECMA-182. This is the default, since it is slightly better than CRC32 at detecting damaged files and the speed difference is negligible. 
.TP .B sha256 Calculate SHA-256. This is somewhat slower than CRC32 and CRC64. .RE .IP "" Integrity of the .B .xz headers is always verified with CRC32. It is not possible to change or disable it. .TP .B \-\-ignore\-check Don't verify the integrity check of the compressed data when decompressing. The CRC32 values in the .B .xz headers will still be verified normally. .IP "" .B "Do not use this option unless you know what you are doing." Possible reasons to use this option: .RS .IP \(bu 3 Trying to recover data from a corrupt .xz file. .IP \(bu 3 Speeding up decompression. This matters mostly with SHA-256 or with files that have compressed extremely well. It's recommended to not use this option for this purpose unless the file integrity is verified externally in some other way. .RE .TP .BR \-0 " ... " \-9 Select a compression preset level. The default is .BR \-6 . If multiple preset levels are specified, the last one takes effect. If a custom filter chain was already specified, setting a compression preset level clears the custom filter chain. .IP "" The differences between the presets are more significant than with .BR gzip (1) and .BR bzip2 (1). The selected compression settings determine the memory requirements of the decompressor, thus using a too high preset level might make it painful to decompress the file on an old system with little RAM. Specifically, .B "it's not a good idea to blindly use \-9 for everything" like it often is with .BR gzip (1) and .BR bzip2 (1). .RS .TP .BR "\-0" " ... " "\-3" These are somewhat fast presets. .B \-0 is sometimes faster than .B "gzip \-9" while compressing much better. The higher ones often have speed comparable to .BR bzip2 (1) with comparable or better compression ratio, although the results depend a lot on the type of data being compressed. .TP .BR "\-4" " ... " "\-6" Good to very good compression while keeping decompressor memory usage reasonable even for old systems. 
.B \-6 is the default, which is usually a good choice e.g. for distributing files that need to be decompressible even on systems with only 16\ MiB RAM. .RB ( \-5e or .B \-6e may be worth considering too. See .BR \-\-extreme .) .TP .B "\-7 ... \-9" These are like .B \-6 but with higher compressor and decompressor memory requirements. These are useful only when compressing files bigger than 8\ MiB, 16\ MiB, and 32\ MiB, respectively. .RE .IP "" On the same hardware, the decompression speed is approximately a constant number of bytes of compressed data per second. In other words, the better the compression, the faster the decompression will usually be. This also means that the amount of uncompressed output produced per second can vary a lot. .IP "" The following table summarises the features of the presets: .RS .RS .PP .TS tab(;); c c c c c n n n n n. Preset;DictSize;CompCPU;CompMem;DecMem \-0;256 KiB;0;3 MiB;1 MiB \-1;1 MiB;1;9 MiB;2 MiB \-2;2 MiB;2;17 MiB;3 MiB \-3;4 MiB;3;32 MiB;5 MiB \-4;4 MiB;4;48 MiB;5 MiB \-5;8 MiB;5;94 MiB;9 MiB \-6;8 MiB;6;94 MiB;9 MiB \-7;16 MiB;6;186 MiB;17 MiB \-8;32 MiB;6;370 MiB;33 MiB \-9;64 MiB;6;674 MiB;65 MiB .TE .RE .RE .IP "" Column descriptions: .RS .IP \(bu 3 DictSize is the LZMA2 dictionary size. It is waste of memory to use a dictionary bigger than the size of the uncompressed file. This is why it is good to avoid using the presets .BR \-7 " ... " \-9 when there's no real need for them. At .B \-6 and lower, the amount of memory wasted is usually low enough to not matter. .IP \(bu 3 CompCPU is a simplified representation of the LZMA2 settings that affect compression speed. The dictionary size affects speed too, so while CompCPU is the same for levels .BR \-6 " ... " \-9 , higher levels still tend to be a little slower. To get even slower and thus possibly better compression, see .BR \-\-extreme . .IP \(bu 3 CompMem contains the compressor memory requirements in the single-threaded mode. 
It may vary slightly between .B xz versions. Memory requirements of some of the future multithreaded modes may be dramatically higher than that of the single-threaded mode. .IP \(bu 3 DecMem contains the decompressor memory requirements. That is, the compression settings determine the memory requirements of the decompressor. The exact decompressor memory usage is slightly more than the LZMA2 dictionary size, but the values in the table have been rounded up to the next full MiB. .RE .TP .BR \-e ", " \-\-extreme Use a slower variant of the selected compression preset level .RB ( \-0 " ... " \-9 ) to hopefully get a little bit better compression ratio, but with bad luck this can also make it worse. Decompressor memory usage is not affected, but compressor memory usage increases a little at preset levels .BR \-0 " ... " \-3 . .IP "" Since there are two presets with dictionary sizes 4\ MiB and 8\ MiB, the presets .B \-3e and .B \-5e use slightly faster settings (lower CompCPU) than .B \-4e and .BR \-6e , respectively. That way no two presets are identical. .RS .RS .PP .TS tab(;); c c c c c n n n n n. Preset;DictSize;CompCPU;CompMem;DecMem \-0e;256 KiB;8;4 MiB;1 MiB \-1e;1 MiB;8;13 MiB;2 MiB \-2e;2 MiB;8;25 MiB;3 MiB \-3e;4 MiB;7;48 MiB;5 MiB \-4e;4 MiB;8;48 MiB;5 MiB \-5e;8 MiB;7;94 MiB;9 MiB \-6e;8 MiB;8;94 MiB;9 MiB \-7e;16 MiB;8;186 MiB;17 MiB \-8e;32 MiB;8;370 MiB;33 MiB \-9e;64 MiB;8;674 MiB;65 MiB .TE .RE .RE .IP "" For example, there are a total of four presets that use 8\ MiB dictionary, whose order from the fastest to the slowest is .BR \-5 , .BR \-6 , .BR \-5e , and .BR \-6e . .TP .B \-\-fast .PD 0 .TP .B \-\-best .PD These are somewhat misleading aliases for .B \-0 and .BR \-9 , respectively. These are provided only for backwards compatibility with LZMA Utils. Avoid using these options. .TP .BI \-\-block\-size= size When compressing to the .B .xz format, split the input data into blocks of .I size bytes. 
The blocks are compressed independently from each other, which helps with multi-threading and makes limited random-access decompression possible. This option is typically used to override the default block size in multi-threaded mode, but this option can be used in single-threaded mode too. .IP "" In multi-threaded mode about three times .I size bytes will be allocated in each thread for buffering input and output. The default .I size is three times the LZMA2 dictionary size or 1 MiB, whichever is more. Typically a good value is 2\-4 times the size of the LZMA2 dictionary or at least 1 MiB. Using .I size less than the LZMA2 dictionary size is a waste of RAM because then the LZMA2 dictionary buffer will never get fully used. The sizes of the blocks are stored in the block headers, which a future version of .B xz will use for multi-threaded decompression. .IP "" In single-threaded mode no block splitting is done by default. Setting this option doesn't affect memory usage. No size information is stored in block headers, thus files created in single-threaded mode won't be identical to files created in multi-threaded mode. The lack of size information also means that a future version of .B xz won't be able to decompress the files in multi-threaded mode. .TP .BI \-\-block\-list= sizes When compressing to the .B .xz format, start a new block after the given intervals of uncompressed data. .IP "" The uncompressed .I sizes of the blocks are specified as a comma-separated list. Omitting a size (two or more consecutive commas) is a shorthand to use the size of the previous block. .IP "" If the input file is bigger than the sum of .IR sizes , the last value in .I sizes is repeated until the end of the file. A special value of .B 0 may be used as the last value to indicate that the rest of the file should be encoded as a single block.
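The rules for \-\-block\-list can be sketched in a few lines of Python. This is only an illustration, not the code used by xz: the names parse_block_list and plan_blocks are hypothetical, only the KiB, MiB, and GiB suffixes are handled, and the optional block_size cap models the interaction with \-\-block\-size that the next paragraph describes. Applying the cap to a trailing 0 entry is an assumption of the sketch.

```python
# Hypothetical sketch of the --block-list rules; not taken from xz itself.
SUFFIXES = {"KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}


def parse_size(text):
    """Parse an integer with an optional KiB/MiB/GiB suffix (simplified)."""
    for suffix, multiplier in SUFFIXES.items():
        if text.endswith(suffix):
            return int(text[:-len(suffix)]) * multiplier
    return int(text)


def parse_block_list(spec):
    """Turn a comma-separated list into sizes in bytes.

    An omitted size (consecutive commas) reuses the previous size.
    """
    sizes = []
    for field in spec.split(","):
        if field == "":
            if not sizes:
                raise ValueError("the first size cannot be omitted")
            sizes.append(sizes[-1])
        else:
            sizes.append(parse_size(field))
    if 0 in sizes[:-1]:
        raise ValueError("0 is only meaningful as the last value")
    return sizes


def plan_blocks(input_size, sizes, block_size=None):
    """Return the block sizes produced for an input of input_size bytes."""
    blocks = []
    remaining = input_size
    index = 0
    while remaining > 0:
        # Past the end of the list, the last value is repeated.
        wanted = sizes[min(index, len(sizes) - 1)]
        index += 1
        if wanted == 0:
            wanted = remaining  # 0 means "the rest of the file"
        wanted = min(wanted, remaining)
        remaining -= wanted
        # Assumption: the encoder's block size caps every block, so a
        # larger request is split while keeping the requested boundary.
        while wanted > 0:
            piece = wanted if block_size is None else min(wanted, block_size)
            blocks.append(piece)
            wanted -= piece
    return blocks
```

With an 80 MiB input, a 10 MiB cap, and the list 5MiB,10MiB,8MiB,12MiB,24MiB, the sketch yields the same eleven block sizes as the example in the following paragraph.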
.IP "" If one specifies .I sizes that exceed the encoder's block size (either the default value in threaded mode or the value specified with \fB\-\-block\-size=\fIsize\fR), the encoder will create additional blocks while keeping the boundaries specified in .IR sizes . For example, if one specifies .B \-\-block\-size=10MiB .B \-\-block\-list=5MiB,10MiB,8MiB,12MiB,24MiB and the input file is 80 MiB, one will get 11 blocks: 5, 10, 8, 10, 2, 10, 10, 4, 10, 10, and 1 MiB. .IP "" In multi-threaded mode the sizes of the blocks are stored in the block headers. This isn't done in single-threaded mode, so the encoded output won't be identical to that of the multi-threaded mode. .TP .BI \-\-flush\-timeout= timeout When compressing, if more than .I timeout milliseconds (a positive integer) has passed since the previous flush and reading more input would block, all the pending input data is flushed from the encoder and made available in the output stream. This can be useful if .B xz is used to compress data that is streamed over a network. Small .I timeout values make the data available at the receiving end with a small delay, but large .I timeout values give better compression ratio. .IP "" This feature is disabled by default. If this option is specified more than once, the last one takes effect. The special .I timeout value of .B 0 can be used to explicitly disable this feature. .IP "" This feature is not available on non-POSIX systems. .IP "" .\" FIXME .B "This feature is still experimental." Currently .B xz is unsuitable for decompressing the stream in real time due to how .B xz does buffering. .TP .BI \-\-memlimit\-compress= limit Set a memory usage limit for compression. If this option is specified multiple times, the last one takes effect. .IP "" If the compression settings exceed the .IR limit , .B xz will adjust the settings downwards so that the limit is no longer exceeded and display a notice that automatic adjustment was done. 
Such adjustments are not made when compressing with .B \-\-format=raw or if .B \-\-no\-adjust has been specified. In those cases, an error is displayed and .B xz will exit with exit status 1. .IP "" The .I limit can be specified in multiple ways: .RS .IP \(bu 3 The .I limit can be an absolute value in bytes. Using an integer suffix like .B MiB can be useful. Example: .B "\-\-memlimit\-compress=80MiB" .IP \(bu 3 The .I limit can be specified as a percentage of total physical memory (RAM). This can be useful especially when setting the .B XZ_DEFAULTS environment variable in a shell initialization script that is shared between different computers. That way the limit is automatically bigger on systems with more memory. Example: .B "\-\-memlimit\-compress=70%" .IP \(bu 3 The .I limit can be reset back to its default value by setting it to .BR 0 . This is currently equivalent to setting the .I limit to .B max (no memory usage limit). Once multithreading support has been implemented, there may be a difference between .B 0 and .B max for the multithreaded case, so it is recommended to use .B 0 instead of .B max until the details have been decided. .RE .IP "" For 32-bit .BR xz there is a special case: if the .I limit would be over .BR "4020\ MiB" , the .I limit is set to .BR "4020\ MiB" . (The values .B 0 and .B max aren't affected by this. A similar feature doesn't exist for decompression.) This can be helpful when a 32-bit executable has access to 4\ GiB address space while hopefully doing no harm in other situations. .IP "" See also the section .BR "Memory usage" . .TP .BI \-\-memlimit\-decompress= limit Set a memory usage limit for decompression. This also affects the .B \-\-list mode. If the operation is not possible without exceeding the .IR limit , .B xz will display an error and decompressing the file will fail. See .BI \-\-memlimit\-compress= limit for possible ways to specify the .IR limit . 
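The ways of specifying a limit described under \-\-memlimit\-compress can be sketched as follows. This is a minimal illustration under stated assumptions, not the parser used by xz: the function name resolve_memlimit and its parameters are hypothetical, only the MiB suffix from the example above is handled, and None is used here to stand for "no memory usage limit".

```python
# Hypothetical sketch of how a memory limit string could be interpreted.
MIB = 1024 ** 2
CAP_32BIT = 4020 * MIB  # special case documented for 32-bit xz


def resolve_memlimit(text, total_ram, compress_32bit=False):
    """Return the limit in bytes, or None when no limit applies.

    text           -- the option value, e.g. "80MiB", "70%", "0", "max"
    total_ram      -- total physical memory in bytes
    compress_32bit -- True to model --memlimit-compress on 32-bit xz
    """
    # 0 resets to the default, which is currently the same as max.
    # These two values are not affected by the 32-bit cap.
    if text in ("0", "max"):
        return None
    if text.endswith("%"):
        limit = total_ram * int(text[:-1]) // 100
    elif text.endswith("MiB"):
        limit = int(text[:-3]) * MIB
    else:
        limit = int(text)  # absolute value in bytes
    if compress_32bit and limit > CAP_32BIT:
        limit = CAP_32BIT
    return limit
```

For instance, resolve_memlimit("70%", total_ram) scales with the amount of RAM, which is why a percentage suits a shell initialization script shared between computers.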
.TP \fB\-M\fR \fIlimit\fR, \fB\-\-memlimit=\fIlimit\fR, \fB\-\-memory=\fIlimit This is equivalent to specifying \fB\-\-memlimit\-compress=\fIlimit \fB\-\-memlimit\-decompress=\fIlimit\fR. .TP .B \-\-no\-adjust Display an error and exit if the compression settings exceed the memory usage limit. The default is to adjust the settings downwards so that the memory usage limit is not exceeded. Automatic adjusting is always disabled when creating raw streams .RB ( \-\-format=raw ). .TP \fB\-T\fR \fIthreads\fR, \fB\-\-threads=\fIthreads Specify the number of worker threads to use. Setting .I threads to a special value .B 0 makes .B xz use as many threads as there are CPU cores on the system. The actual number of threads can be less than .I threads if the input file is not big enough for threading with the given settings or if using more threads would exceed the memory usage limit. .IP "" Currently the only threading method is to split the input into blocks and compress them independently from each other. The default block size depends on the compression level and -can be overriden with the +can be overridden with the .BI \-\-block\-size= size option. .IP "" Threaded decompression hasn't been implemented yet. It will only work on files that contain multiple blocks with size information in block headers. All files compressed in multi-threaded mode meet this condition, but files compressed in single-threaded mode don't even if .BI \-\-block\-size= size is used. . .SS "Custom compressor filter chains" A custom filter chain allows specifying the compression settings in detail instead of relying on the settings associated to the presets. When a custom filter chain is specified, preset options (\fB\-0\fR ... \fB\-9\fR and \fB\-\-extreme\fR) earlier on the command line are forgotten. If a preset option is specified after one or more custom filter chain options, the new preset takes effect and the custom filter chain options specified earlier are forgotten. 
.PP A filter chain is comparable to piping on the command line. When compressing, the uncompressed input goes to the first filter, whose output goes to the next filter (if any). The output of the last filter gets written to the compressed file. The maximum number of filters in the chain is four, but typically a filter chain has only one or two filters. .PP Many filters have limitations on where they can be in the filter chain: some filters can work only as the last filter in the chain, some only as a non-last filter, and some work in any position in the chain. Depending on the filter, this limitation is either inherent to the filter design or exists to prevent security issues. .PP A custom filter chain is specified by using one or more filter options in the order they are wanted in the filter chain. That is, the order of filter options is significant! When decoding raw streams .RB ( \-\-format=raw ), the filter chain is specified in the same order as it was specified when compressing. .PP Filters take filter-specific .I options as a comma-separated list. Extra commas in .I options are ignored. Every option has a default value, so you need to specify only those you want to change. .PP To see the whole filter chain and .IR options , use .B "xz \-vv" (that is, use .B \-\-verbose twice). This works also for viewing the filter chain options used by presets. .TP \fB\-\-lzma1\fR[\fB=\fIoptions\fR] .PD 0 .TP \fB\-\-lzma2\fR[\fB=\fIoptions\fR] .PD Add LZMA1 or LZMA2 filter to the filter chain. These filters can be used only as the last filter in the chain. .IP "" LZMA1 is a legacy filter, which is supported almost solely due to the legacy .B .lzma file format, which supports only LZMA1. LZMA2 is an updated version of LZMA1 to fix some practical issues of LZMA1. The .B .xz format uses LZMA2 and doesn't support LZMA1 at all. Compression speed and ratios of LZMA1 and LZMA2 are practically the same. 
.IP "" LZMA1 and LZMA2 share the same set of .IR options : .RS .TP .BI preset= preset Reset all LZMA1 or LZMA2 .I options to .IR preset . .I Preset consists of an integer, which may be followed by single-letter preset modifiers. The integer can be from .B 0 to .BR 9 , matching the command line options \fB\-0\fR ... \fB\-9\fR. The only supported modifier is currently .BR e , which matches .BR \-\-extreme . If no .B preset is specified, the default values of LZMA1 or LZMA2 .I options are taken from the preset .BR 6 . .TP .BI dict= size Dictionary (history buffer) .I size indicates how many bytes of the recently processed uncompressed data are kept in memory. The algorithm tries to find repeating byte sequences (matches) in the uncompressed data, and replaces them with references to the data currently in the dictionary. The bigger the dictionary, the higher the chance of finding a match. Thus, increasing dictionary .I size usually improves compression ratio, but a dictionary bigger than the uncompressed file is a waste of memory. .IP "" Typical dictionary .I size is from 64\ KiB to 64\ MiB. The minimum is 4\ KiB. The maximum for compression is currently 1.5\ GiB (1536\ MiB). The decompressor already supports dictionaries up to one byte less than 4\ GiB, which is the maximum for the LZMA1 and LZMA2 stream formats. .IP "" Dictionary .I size and match finder .RI ( mf ) together determine the memory usage of the LZMA1 or LZMA2 encoder. The same (or bigger) dictionary .I size is required for decompressing as was used when compressing, thus the memory usage of the decoder is determined by the dictionary size used when compressing. The .B .xz headers store the dictionary .I size either as .RI "2^" n or .RI "2^" n " + 2^(" n "\-1)," so these .I sizes are somewhat preferred for compression. Other .I sizes will get rounded up when stored in the .B .xz headers. .TP .BI lc= lc Specify the number of literal context bits. The minimum is 0 and the maximum is 4; the default is 3.
In addition, the sum of .I lc and .I lp must not exceed 4. .IP "" All bytes that cannot be encoded as matches are encoded as literals. That is, literals are simply 8-bit bytes that are encoded one at a time. .IP "" The literal coding makes an assumption that the highest .I lc bits of the previous uncompressed byte correlate with the next byte. E.g. in typical English text, an upper-case letter is often followed by a lower-case letter, and a lower-case letter is usually followed by another lower-case letter. In the US-ASCII character set, the highest three bits are 010 for upper-case letters and 011 for lower-case letters. When .I lc is at least 3, the literal coding can take advantage of this property in the uncompressed data. .IP "" The default value (3) is usually good. If you want maximum compression, test .BR lc=4 . Sometimes it helps a little, and sometimes it makes compression worse. If it makes it worse, test e.g.\& .B lc=2 too. .TP .BI lp= lp Specify the number of literal position bits. The minimum is 0 and the maximum is 4; the default is 0. .IP "" .I Lp affects what kind of alignment in the uncompressed data is assumed when encoding literals. See .I pb below for more information about alignment. .TP .BI pb= pb Specify the number of position bits. The minimum is 0 and the maximum is 4; the default is 2. .IP "" .I Pb affects what kind of alignment in the uncompressed data is assumed in general. The default means four-byte alignment .RI (2^ pb =2^2=4), which is often a good choice when there's no better guess. .IP "" When the alignment is known, setting .I pb accordingly may reduce the file size a little. E.g. with text files having one-byte alignment (US-ASCII, ISO-8859-*, UTF-8), setting .B pb=0 can improve compression slightly. For UTF-16 text, .B pb=1 is a good choice. If the alignment is an odd number like 3 bytes, .B pb=0 might be the best choice.
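The relation between a known alignment and pb (alignment = 2^pb) can be sketched as below. The helper suggest_pb is hypothetical and only a starting point for experiments. Falling back to pb=0 for every alignment that is not a power of two up to 16 is an assumption that generalizes the advice given above for odd alignments.

```python
# Hypothetical helper: pick a pb value for a known data alignment in bytes.
def suggest_pb(alignment):
    """Return pb such that 2**pb == alignment, or 0 as a fallback."""
    if alignment < 1:
        raise ValueError("alignment must be a positive number of bytes")
    for pb in range(5):  # pb may be 0..4, i.e. alignments 1..16
        if 1 << pb == alignment:
            return pb
    # Assumption: alignments such as 3 bytes are treated like one-byte data.
    return 0
```

Under this sketch, one-byte text gives pb=0, UTF-16 gives pb=1, and the default pb=2 corresponds to four-byte alignment.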
.IP "" Even though the assumed alignment can be adjusted with .I pb and .IR lp , LZMA1 and LZMA2 still slightly favor 16-byte alignment. It might be worth taking into account when designing file formats that are likely to be often compressed with LZMA1 or LZMA2. .TP .BI mf= mf Match finder has a major effect on encoder speed, memory usage, and compression ratio. Usually Hash Chain match finders are faster than Binary Tree match finders. The default depends on the .IR preset : 0 uses .BR hc3 , 1\-3 use .BR hc4 , and the rest use .BR bt4 . .IP "" The following match finders are supported. The memory usage formulas below are rough approximations, which are closest to the reality when .I dict is a power of two. .RS .TP .B hc3 Hash Chain with 2- and 3-byte hashing .br Minimum value for .IR nice : 3 .br Memory usage: .br .I dict * 7.5 (if .I dict <= 16 MiB); .br .I dict * 5.5 + 64 MiB (if .I dict > 16 MiB) .TP .B hc4 Hash Chain with 2-, 3-, and 4-byte hashing .br Minimum value for .IR nice : 4 .br Memory usage: .br .I dict * 7.5 (if .I dict <= 32 MiB); .br .I dict * 6.5 (if .I dict > 32 MiB) .TP .B bt2 Binary Tree with 2-byte hashing .br Minimum value for .IR nice : 2 .br Memory usage: .I dict * 9.5 .TP .B bt3 Binary Tree with 2- and 3-byte hashing .br Minimum value for .IR nice : 3 .br Memory usage: .br .I dict * 11.5 (if .I dict <= 16 MiB); .br .I dict * 9.5 + 64 MiB (if .I dict > 16 MiB) .TP .B bt4 Binary Tree with 2-, 3-, and 4-byte hashing .br Minimum value for .IR nice : 4 .br Memory usage: .br .I dict * 11.5 (if .I dict <= 32 MiB); .br .I dict * 10.5 (if .I dict > 32 MiB) .RE .TP .BI mode= mode Compression .I mode specifies the method to analyze the data produced by the match finder. Supported .I modes are .B fast and .BR normal . The default is .B fast for .I presets 0\-3 and .B normal for .I presets 4\-9. .IP "" Usually .B fast is used with Hash Chain match finders and .B normal with Binary Tree match finders. This is also what the .I presets do. 
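The match finder memory formulas listed under mf above can be collected into one function for quick estimates. This is a rough sketch only: the name mf_memory_estimate is hypothetical, and the results inherit the approximate nature of the formulas.

```python
# Hypothetical sketch: encoder memory estimates from the mf formulas.
MIB = 1024 ** 2


def mf_memory_estimate(mf, dict_size):
    """Return the approximate encoder memory usage in bytes.

    mf        -- one of "hc3", "hc4", "bt2", "bt3", "bt4"
    dict_size -- dictionary size in bytes
    """
    if mf == "hc3":
        if dict_size <= 16 * MIB:
            return dict_size * 7.5
        return dict_size * 5.5 + 64 * MIB
    if mf == "hc4":
        return dict_size * (7.5 if dict_size <= 32 * MIB else 6.5)
    if mf == "bt2":
        return dict_size * 9.5
    if mf == "bt3":
        if dict_size <= 16 * MIB:
            return dict_size * 11.5
        return dict_size * 9.5 + 64 * MIB
    if mf == "bt4":
        return dict_size * (11.5 if dict_size <= 32 * MIB else 10.5)
    raise ValueError("unknown match finder: " + mf)
```

For bt4 with an 8 MiB dictionary the estimate is 92 MiB, which is close to the 94 MiB that the preset table lists as CompMem for \-6.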
.TP .BI nice= nice Specify what is considered to be a nice length for a match. Once a match of at least .I nice bytes is found, the algorithm stops looking for possibly better matches. .IP "" .I Nice can be 2\-273 bytes. Higher values tend to give better compression ratio at the expense of speed. The default depends on the .IR preset . .TP .BI depth= depth Specify the maximum search depth in the match finder. The default is the special value of 0, which makes the compressor determine a reasonable .I depth from .I mf and .IR nice . .IP "" Reasonable .I depth for Hash Chains is 4\-100 and 16\-1000 for Binary Trees. Using very high values for .I depth can make the encoder extremely slow with some files. Avoid setting the .I depth over 1000 unless you are prepared to interrupt the compression in case it is taking far too long. .RE .IP "" When decoding raw streams .RB ( \-\-format=raw ), LZMA2 needs only the dictionary .IR size . LZMA1 needs also .IR lc , .IR lp , and .IR pb . .TP \fB\-\-x86\fR[\fB=\fIoptions\fR] .PD 0 .TP \fB\-\-powerpc\fR[\fB=\fIoptions\fR] .TP \fB\-\-ia64\fR[\fB=\fIoptions\fR] .TP \fB\-\-arm\fR[\fB=\fIoptions\fR] .TP \fB\-\-armthumb\fR[\fB=\fIoptions\fR] .TP \fB\-\-sparc\fR[\fB=\fIoptions\fR] .PD Add a branch/call/jump (BCJ) filter to the filter chain. These filters can be used only as a non-last filter in the filter chain. .IP "" A BCJ filter converts relative addresses in the machine code to their absolute counterparts. This doesn't change the size of the data, but it increases redundancy, which can help LZMA2 to produce 0\-15\ % smaller .B .xz file. The BCJ filters are always reversible, so using a BCJ filter for wrong type of data doesn't cause any data loss, although it may make the compression ratio slightly worse. .IP "" It is fine to apply a BCJ filter on a whole executable; there's no need to apply it only on the executable section. 
Applying a BCJ filter on an archive that contains both executable and non-executable files may or may not give good results, so it generally isn't good to blindly apply a BCJ filter when compressing binary packages for distribution. .IP "" These BCJ filters are very fast and use insignificant amount of memory. If a BCJ filter improves compression ratio of a file, it can improve decompression speed at the same time. This is because, on the same hardware, the decompression speed of LZMA2 is roughly a fixed number of bytes of compressed data per second. .IP "" These BCJ filters have known problems related to the compression ratio: .RS .IP \(bu 3 Some types of files containing executable code (e.g. object files, static libraries, and Linux kernel modules) have the addresses in the instructions filled with filler values. These BCJ filters will still do the address conversion, which will make the compression worse with these files. .IP \(bu 3 Applying a BCJ filter on an archive containing multiple similar executables can make the compression ratio worse than not using a BCJ filter. This is because the BCJ filter doesn't detect the boundaries of the executable files, and doesn't reset the address conversion counter for each executable. .RE .IP "" Both of the above problems will be fixed in the future in a new filter. The old BCJ filters will still be useful in embedded systems, because the decoder of the new filter will be bigger and use more memory. .IP "" -Different instruction sets have have different alignment: +Different instruction sets have different alignment: .RS .RS .PP .TS tab(;); l n l l n l. 
Filter;Alignment;Notes x86;1;32-bit or 64-bit x86 PowerPC;4;Big endian only ARM;4;Little endian only ARM-Thumb;2;Little endian only IA-64;16;Big or little endian SPARC;4;Big or little endian .TE .RE .RE .IP "" Since the BCJ-filtered data is usually compressed with LZMA2, the compression ratio may be improved slightly if the LZMA2 options are set to match the alignment of the selected BCJ filter. For example, with the IA-64 filter, it's good to set .B pb=4 with LZMA2 (2^4=16). The x86 filter is an exception; it's usually good to stick to LZMA2's default four-byte alignment when compressing x86 executables. .IP "" All BCJ filters support the same .IR options : .RS .TP .BI start= offset Specify the start .I offset that is used when converting between relative and absolute addresses. The .I offset must be a multiple of the alignment of the filter (see the table above). The default is zero. In practice, the default is good; specifying a custom .I offset is almost never useful. .RE .TP \fB\-\-delta\fR[\fB=\fIoptions\fR] Add the Delta filter to the filter chain. The Delta filter can be only used as a non-last filter in the filter chain. .IP "" Currently only simple byte-wise delta calculation is supported. It can be useful when compressing e.g. uncompressed bitmap images or uncompressed PCM audio. However, special purpose algorithms may give significantly better results than Delta + LZMA2. This is true especially with audio, which compresses faster and better e.g. with .BR flac (1). .IP "" Supported .IR options : .RS .TP .BI dist= distance Specify the .I distance of the delta calculation in bytes. .I distance must be 1\-256. The default is 1. .IP "" For example, with .B dist=2 and eight-byte input A1 B1 A2 B3 A3 B5 A4 B7, the output will be A1 B1 01 02 01 02 01 02. .RE . .SS "Other options" .TP .BR \-q ", " \-\-quiet Suppress warnings and notices. Specify this twice to suppress errors too. This option has no effect on the exit status. 
That is, even if a warning was suppressed, the exit status to indicate a warning is still used. .TP .BR \-v ", " \-\-verbose Be verbose. If standard error is connected to a terminal, .B xz will display a progress indicator. Specifying .B \-\-verbose twice will give even more verbose output. .IP "" The progress indicator shows the following information: .RS .IP \(bu 3 Completion percentage is shown if the size of the input file is known. That is, the percentage cannot be shown in pipes. .IP \(bu 3 Amount of compressed data produced (compressing) or consumed (decompressing). .IP \(bu 3 Amount of uncompressed data consumed (compressing) or produced (decompressing). .IP \(bu 3 Compression ratio, which is calculated by dividing the amount of compressed data processed so far by the amount of uncompressed data processed so far. .IP \(bu 3 Compression or decompression speed. This is measured as the amount of uncompressed data consumed (compression) or produced (decompression) per second. It is shown after a few seconds have passed since .B xz started processing the file. .IP \(bu 3 Elapsed time in the format M:SS or H:MM:SS. .IP \(bu 3 Estimated remaining time is shown only when the size of the input file is known and a couple of seconds have already passed since .B xz started processing the file. The time is shown in a less precise format which never has any colons, e.g. 2 min 30 s. .RE .IP "" When standard error is not a terminal, .B \-\-verbose will make .B xz print the filename, compressed size, uncompressed size, compression ratio, and possibly also the speed and elapsed time on a single line to standard error after compressing or decompressing the file. The speed and elapsed time are included only when the operation took at least a few seconds. If the operation didn't finish, e.g. due to user interruption, also the completion percentage is printed if the size of the input file is known. 
.TP .BR \-Q ", " \-\-no\-warn Don't set the exit status to 2 even if a condition worth a warning was detected. This option doesn't affect the verbosity level, thus both .B \-\-quiet and .B \-\-no\-warn have to be used to not display warnings and to not alter the exit status. .TP .B \-\-robot Print messages in a machine-parsable format. This is intended to ease writing frontends that want to use .B xz instead of liblzma, which may be the case with various scripts. The output with this option enabled is meant to be stable across .B xz releases. See the section .B "ROBOT MODE" for details. .TP .BR \-\-info\-memory Display, in human-readable format, how much physical memory (RAM) .B xz thinks the system has and the memory usage limits for compression and decompression, and exit successfully. .TP .BR \-h ", " \-\-help Display a help message describing the most commonly used options, and exit successfully. .TP .BR \-H ", " \-\-long\-help Display a help message describing all features of .BR xz , and exit successfully .TP .BR \-V ", " \-\-version Display the version number of .B xz and liblzma in human readable format. To get machine-parsable output, specify .B \-\-robot before .BR \-\-version . . .SH "ROBOT MODE" The robot mode is activated with the .B \-\-robot option. It makes the output of .B xz easier to parse by other programs. Currently .B \-\-robot is supported only together with .BR \-\-version , .BR \-\-info\-memory , and .BR \-\-list . It will be supported for compression and decompression in the future. . .SS Version .B "xz \-\-robot \-\-version" will print the version number of .B xz and liblzma in the following format: .PP .BI XZ_VERSION= XYYYZZZS .br .BI LIBLZMA_VERSION= XYYYZZZS .TP .I X Major version. .TP .I YYY Minor version. Even numbers are stable. Odd numbers are alpha or beta versions. .TP .I ZZZ Patch level for stable releases or just a counter for development releases. .TP .I S Stability. 0 is alpha, 1 is beta, and 2 is stable. 
.I S should be always 2 when .I YYY is even. .PP .I XYYYZZZS are the same on both lines if .B xz and liblzma are from the same XZ Utils release. .PP Examples: 4.999.9beta is .B 49990091 and 5.0.0 is .BR 50000002 . . .SS "Memory limit information" .B "xz \-\-robot \-\-info\-memory" prints a single line with three tab-separated columns: .IP 1. 4 Total amount of physical memory (RAM) in bytes .IP 2. 4 Memory usage limit for compression in bytes. A special value of zero indicates the default setting, which for single-threaded mode is the same as no limit. .IP 3. 4 Memory usage limit for decompression in bytes. A special value of zero indicates the default setting, which for single-threaded mode is the same as no limit. .PP In the future, the output of .B "xz \-\-robot \-\-info\-memory" may have more columns, but never more than a single line. . .SS "List mode" .B "xz \-\-robot \-\-list" uses tab-separated output. The first column of every line has a string that indicates the type of the information found on that line: .TP .B name This is always the first line when starting to list a file. The second column on the line is the filename. .TP .B file This line contains overall information about the .B .xz file. This line is always printed after the .B name line. .TP .B stream This line type is used only when .B \-\-verbose was specified. There are as many .B stream lines as there are streams in the .B .xz file. .TP .B block This line type is used only when .B \-\-verbose was specified. There are as many .B block lines as there are blocks in the .B .xz file. The .B block lines are shown after all the .B stream lines; different line types are not interleaved. .TP .B summary This line type is used only when .B \-\-verbose was specified twice. This line is printed after all .B block lines. Like the .B file line, the .B summary line contains overall information about the .B .xz file. .TP .B totals This line is always the very last line of the list output. 
It shows the total counts and sizes. .PP The columns of the .B file lines: .PD 0 .RS .IP 2. 4 Number of streams in the file .IP 3. 4 Total number of blocks in the stream(s) .IP 4. 4 Compressed size of the file .IP 5. 4 Uncompressed size of the file .IP 6. 4 Compression ratio, for example .BR 0.123. If ratio is over 9.999, three dashes .RB ( \-\-\- ) are displayed instead of the ratio. .IP 7. 4 Comma-separated list of integrity check names. The following strings are used for the known check types: .BR None , .BR CRC32 , .BR CRC64 , and .BR SHA\-256 . For unknown check types, .BI Unknown\- N is used, where .I N is the Check ID as a decimal number (one or two digits). .IP 8. 4 Total size of stream padding in the file .RE .PD .PP The columns of the .B stream lines: .PD 0 .RS .IP 2. 4 Stream number (the first stream is 1) .IP 3. 4 Number of blocks in the stream .IP 4. 4 Compressed start offset .IP 5. 4 Uncompressed start offset .IP 6. 4 Compressed size (does not include stream padding) .IP 7. 4 Uncompressed size .IP 8. 4 Compression ratio .IP 9. 4 Name of the integrity check .IP 10. 4 Size of stream padding .RE .PD .PP The columns of the .B block lines: .PD 0 .RS .IP 2. 4 Number of the stream containing this block .IP 3. 4 Block number relative to the beginning of the stream (the first block is 1) .IP 4. 4 Block number relative to the beginning of the file .IP 5. 4 Compressed start offset relative to the beginning of the file .IP 6. 4 Uncompressed start offset relative to the beginning of the file .IP 7. 4 Total compressed size of the block (includes headers) .IP 8. 4 Uncompressed size .IP 9. 4 Compression ratio .IP 10. 4 Name of the integrity check .RE .PD .PP If .B \-\-verbose was specified twice, additional columns are included on the .B block lines. These are not displayed with a single .BR \-\-verbose , because getting this information requires many seeks and can thus be slow: .PD 0 .RS .IP 11. 4 Value of the integrity check in hexadecimal .IP 12. 
4 Block header size .IP 13. 4 Block flags: .B c indicates that compressed size is present, and .B u indicates that uncompressed size is present. If the flag is not set, a dash .RB ( \- ) is shown instead to keep the string length fixed. New flags may be added to the end of the string in the future. .IP 14. 4 Size of the actual compressed data in the block (this excludes the block header, block padding, and check fields) .IP 15. 4 Amount of memory (in bytes) required to decompress this block with this .B xz version .IP 16. 4 Filter chain. Note that most of the options used at compression time cannot be known, because only the options that are needed for decompression are stored in the .B .xz headers. .RE .PD .PP The columns of the .B summary lines: .PD 0 .RS .IP 2. 4 Amount of memory (in bytes) required to decompress this file with this .B xz version .IP 3. 4 .B yes or .B no indicating if all block headers have both compressed size and uncompressed size stored in them .PP .I Since .B xz .I 5.1.2alpha: .IP 4. 4 Minimum .B xz version required to decompress the file .RE .PD .PP The columns of the .B totals line: .PD 0 .RS .IP 2. 4 Number of streams .IP 3. 4 Number of blocks .IP 4. 4 Compressed size .IP 5. 4 Uncompressed size .IP 6. 4 Average compression ratio .IP 7. 4 Comma-separated list of integrity check names that were present in the files .IP 8. 4 Stream padding size .IP 9. 4 Number of files. This is here to keep the order of the earlier columns the same as on .B file lines. .PD .RE .PP If .B \-\-verbose was specified twice, additional columns are included on the .B totals line: .PD 0 .RS .IP 10. 4 Maximum amount of memory (in bytes) required to decompress the files with this .B xz version .IP 11. 4 .B yes or .B no indicating if all block headers have both compressed size and uncompressed size stored in them .PP .I Since .B xz .I 5.1.2alpha: .IP 12. 
4 Minimum .B xz version required to decompress the file .RE .PD .PP Future versions may add new line types and new columns can be added to the existing line types, but the existing columns won't be changed. . .SH "EXIT STATUS" .TP .B 0 All is good. .TP .B 1 An error occurred. .TP .B 2 Something worth a warning occurred, but no actual errors occurred. .PP Notices (not warnings or errors) printed on standard error don't affect the exit status. . .SH ENVIRONMENT .B xz parses space-separated lists of options from the environment variables .B XZ_DEFAULTS and .BR XZ_OPT , in this order, before parsing the options from the command line. Note that only options are parsed from the environment variables; all non-options are silently ignored. Parsing is done with .BR getopt_long (3) which is used also for the command line arguments. .TP .B XZ_DEFAULTS User-specific or system-wide default options. Typically this is set in a shell initialization script to enable .BR xz 's memory usage limiter by default. Excluding shell initialization scripts and similar special cases, scripts must never set or unset .BR XZ_DEFAULTS . .TP .B XZ_OPT This is for passing options to .B xz when it is not possible to set the options directly on the .B xz command line. This is the case e.g. when .B xz is run by a script or tool, e.g. GNU .BR tar (1): .RS .RS .PP .nf .ft CW XZ_OPT=\-2v tar caf foo.tar.xz foo .ft R .fi .RE .RE .IP "" Scripts may use .B XZ_OPT e.g. to set script-specific default compression options. It is still recommended to allow users to override .B XZ_OPT if that is reasonable, e.g. in .BR sh (1) scripts one may use something like this: .RS .RS .PP .nf .ft CW XZ_OPT=${XZ_OPT\-"\-7e"} export XZ_OPT .ft R .fi .RE .RE . .SH "LZMA UTILS COMPATIBILITY" The command line syntax of .B xz is practically a superset of .BR lzma , .BR unlzma , and .BR lzcat as found from LZMA Utils 4.32.x. In most cases, it is possible to replace LZMA Utils with XZ Utils without breaking existing scripts. 
There are some incompatibilities though, which may sometimes cause problems. . .SS "Compression preset levels" The numbering of the compression level presets is not identical in .B xz and LZMA Utils. The most important difference is how dictionary sizes are mapped to different presets. Dictionary size is roughly equal to the decompressor memory usage. .RS .PP .TS tab(;); c c c c n n. Level;xz;LZMA Utils \-0;256 KiB;N/A \-1;1 MiB;64 KiB \-2;2 MiB;1 MiB \-3;4 MiB;512 KiB \-4;4 MiB;1 MiB \-5;8 MiB;2 MiB \-6;8 MiB;4 MiB \-7;16 MiB;8 MiB \-8;32 MiB;16 MiB \-9;64 MiB;32 MiB .TE .RE .PP The dictionary size differences affect the compressor memory usage too, but there are some other differences between LZMA Utils and XZ Utils, which make the difference even bigger: .RS .PP .TS tab(;); c c c c n n. Level;xz;LZMA Utils 4.32.x \-0;3 MiB;N/A \-1;9 MiB;2 MiB \-2;17 MiB;12 MiB \-3;32 MiB;12 MiB \-4;48 MiB;16 MiB \-5;94 MiB;26 MiB \-6;94 MiB;45 MiB \-7;186 MiB;83 MiB \-8;370 MiB;159 MiB \-9;674 MiB;311 MiB .TE .RE .PP The default preset level in LZMA Utils is .B \-7 while in XZ Utils it is .BR \-6 , so both use an 8 MiB dictionary by default. . .SS "Streamed vs. non-streamed .lzma files" The uncompressed size of the file can be stored in the .B .lzma header. LZMA Utils does that when compressing regular files. The alternative is to mark that uncompressed size is unknown and use end-of-payload marker to indicate where the decompressor should stop. LZMA Utils uses this method when uncompressed size isn't known, which is the case for example in pipes. .PP .B xz supports decompressing .B .lzma files with or without end-of-payload marker, but all .B .lzma files created by .B xz will use end-of-payload marker and have uncompressed size marked as unknown in the .B .lzma header. This may be a problem in some uncommon situations. For example, a .B .lzma decompressor in an embedded device might work only with files that have known uncompressed size. 
If you hit this problem, you need to use LZMA Utils or LZMA SDK to create .B .lzma files with known uncompressed size. . .SS "Unsupported .lzma files" The .B .lzma format allows .I lc values up to 8, and .I lp values up to 4. LZMA Utils can decompress files with any .I lc and .IR lp , but always creates files with .B lc=3 and .BR lp=0 . Creating files with other .I lc and .I lp is possible with .B xz and with LZMA SDK. .PP The implementation of the LZMA1 filter in liblzma requires that the sum of .I lc and .I lp must not exceed 4. Thus, .B .lzma files, which exceed this limitation, cannot be decompressed with .BR xz . .PP LZMA Utils creates only .B .lzma files which have a dictionary size of .RI "2^" n (a power of 2) but accepts files with any dictionary size. liblzma accepts only .B .lzma files which have a dictionary size of .RI "2^" n or .RI "2^" n " + 2^(" n "\-1)." This is to decrease false positives when detecting .B .lzma files. .PP These limitations shouldn't be a problem in practice, since practically all .B .lzma files have been compressed with settings that liblzma will accept. . .SS "Trailing garbage" When decompressing, LZMA Utils silently ignore everything after the first .B .lzma stream. In most situations, this is a bug. This also means that LZMA Utils don't support decompressing concatenated .B .lzma files. .PP If there is data left after the first .B .lzma stream, .B xz considers the file to be corrupt unless .B \-\-single\-stream was used. This may break obscure scripts which have assumed that trailing garbage is ignored. . .SH NOTES . .SS "Compressed output may vary" The exact compressed output produced from the same uncompressed input file may vary between XZ Utils versions even if compression options are identical. This is because the encoder can be improved (faster or better compression) without affecting the file format. The output can vary even between different builds of the same XZ Utils version, if different build options are used. 
.PP The above means that once .B \-\-rsyncable has been implemented, the resulting files won't necessarily be rsyncable unless both old and new files have been compressed with the same xz version. This problem can be fixed if a part of the encoder implementation is frozen to keep rsyncable output stable across xz versions. . .SS "Embedded .xz decompressors" Embedded .B .xz decompressor implementations like XZ Embedded don't necessarily support files created with integrity .I check types other than .B none and .BR crc32 . Since the default is .BR \-\-check=crc64 , you must use .B \-\-check=none or .B \-\-check=crc32 when creating files for embedded systems. .PP Outside embedded systems, all .B .xz format decompressors support all the .I check types, or at least are able to decompress the file without verifying the integrity check if the particular .I check is not supported. .PP XZ Embedded supports BCJ filters, but only with the default start offset. . .SH EXAMPLES . .SS Basics Compress the file .I foo into .I foo.xz using the default compression level .RB ( \-6 ), and remove .I foo if compression is successful: .RS .PP .nf .ft CW xz foo .ft R .fi .RE .PP Decompress .I bar.xz into .I bar and don't remove .I bar.xz even if decompression is successful: .RS .PP .nf .ft CW xz \-dk bar.xz .ft R .fi .RE .PP Create .I baz.tar.xz with the preset .B \-4e .RB ( "\-4 \-\-extreme" ), which is slower than e.g. the default .BR \-6 , but needs less memory for compression and decompression (48\ MiB and 5\ MiB, respectively): .RS .PP .nf .ft CW tar cf \- baz | xz \-4e > baz.tar.xz .ft R .fi .RE .PP A mix of compressed and uncompressed files can be decompressed to standard output with a single command: .RS .PP .nf .ft CW xz \-dcf a.txt b.txt.xz c.txt d.txt.lzma > abcd.txt .ft R .fi .RE . .SS "Parallel compression of many files" On GNU and *BSD, .BR find (1) and .BR xargs (1) can be used to parallelize compression of many files: .RS .PP .nf .ft CW find . \-type f \e! 
\-name '*.xz' \-print0 \e | xargs \-0r \-P4 \-n16 xz \-T1 .ft R .fi .RE .PP The .B \-P option to .BR xargs (1) sets the number of parallel .B xz processes. The best value for the .B \-n option depends on how many files there are to be compressed. If there are only a couple of files, the value should probably be 1; with tens of thousands of files, 100 or even more may be appropriate to reduce the number of .B xz processes that .BR xargs (1) will eventually create. .PP The option .B \-T1 for .B xz is there to force it to single-threaded mode, because .BR xargs (1) is used to control the amount of parallelization. . .SS "Robot mode" Calculate how many bytes have been saved in total after compressing multiple files: .RS .PP .nf .ft CW xz \-\-robot \-\-list *.xz | awk '/^totals/{print $5\-$4}' .ft R .fi .RE .PP A script may want to know that it is using new enough .BR xz . The following .BR sh (1) script checks that the version number of the .B xz tool is at least 5.0.0. This method is compatible with old beta versions, which didn't support the .B \-\-robot option: .RS .PP .nf .ft CW if ! eval "$(xz \-\-robot \-\-version 2> /dev/null)" || [ "$XZ_VERSION" \-lt 50000002 ]; then echo "Your xz is too old." fi unset XZ_VERSION LIBLZMA_VERSION .ft R .fi .RE .PP Set a memory usage limit for decompression using .BR XZ_OPT , but if a limit has already been set, don't increase it: .RS .PP .nf .ft CW NEWLIM=$((123 << 20)) # 123 MiB OLDLIM=$(xz \-\-robot \-\-info\-memory | cut \-f3) if [ $OLDLIM \-eq 0 \-o $OLDLIM \-gt $NEWLIM ]; then XZ_OPT="$XZ_OPT \-\-memlimit\-decompress=$NEWLIM" export XZ_OPT fi .ft R .fi .RE . .SS "Custom compressor filter chains" The simplest use for custom filter chains is customizing a LZMA2 preset. This can be useful, because the presets cover only a subset of the potentially useful combinations of compression settings. .PP The CompCPU columns of the tables from the descriptions of the options .BR "\-0" " ... 
" "\-9" and .B \-\-extreme are useful when customizing LZMA2 presets. Here are the relevant parts collected from those two tables: .RS .PP .TS tab(;); c c n n. Preset;CompCPU \-0;0 \-1;1 \-2;2 \-3;3 \-4;4 \-5;5 \-6;6 \-5e;7 \-6e;8 .TE .RE .PP If you know that a file requires somewhat big dictionary (e.g. 32 MiB) to compress well, but you want to compress it quicker than .B "xz \-8" would do, a preset with a low CompCPU value (e.g. 1) can be modified to use a bigger dictionary: .RS .PP .nf .ft CW xz \-\-lzma2=preset=1,dict=32MiB foo.tar .ft R .fi .RE .PP With certain files, the above command may be faster than .B "xz \-6" while compressing significantly better. However, it must be emphasized that only some files benefit from a big dictionary while keeping the CompCPU value low. The most obvious situation, where a big dictionary can help a lot, is an archive containing very similar files of at least a few megabytes each. The dictionary size has to be significantly bigger than any individual file to allow LZMA2 to take full advantage of the similarities between consecutive files. .PP If very high compressor and decompressor memory usage is fine, and the file being compressed is at least several hundred megabytes, it may be useful to use an even bigger dictionary than the 64 MiB that .B "xz \-9" would use: .RS .PP .nf .ft CW xz \-vv \-\-lzma2=dict=192MiB big_foo.tar .ft R .fi .RE .PP Using .B \-vv .RB ( "\-\-verbose \-\-verbose" ) like in the above example can be useful to see the memory requirements of the compressor and decompressor. Remember that using a dictionary bigger than the size of the uncompressed file is waste of memory, so the above command isn't useful for small files. .PP Sometimes the compression time doesn't matter, but the decompressor memory usage has to be kept low e.g. to make it possible to decompress the file on an embedded system. The following command uses .B \-6e .RB ( "\-6 \-\-extreme" ) as a base and sets the dictionary to only 64\ KiB. 
The resulting file can be decompressed with XZ Embedded (that's why there is .BR \-\-check=crc32 ) using about 100\ KiB of memory. .RS .PP .nf .ft CW xz \-\-check=crc32 \-\-lzma2=preset=6e,dict=64KiB foo .ft R .fi .RE .PP If you want to squeeze out as many bytes as possible, adjusting the number of literal context bits .RI ( lc ) and number of position bits .RI ( pb ) can sometimes help. Adjusting the number of literal position bits .RI ( lp ) might help too, but usually .I lc and .I pb are more important. E.g. a source code archive contains mostly US-ASCII text, so something like the following might give slightly (like 0.1\ %) smaller file than .B "xz \-6e" (try also without .BR lc=4 ): .RS .PP .nf .ft CW xz \-\-lzma2=preset=6e,pb=0,lc=4 source_code.tar .ft R .fi .RE .PP Using another filter together with LZMA2 can improve compression with certain file types. E.g. to compress a x86-32 or x86-64 shared library using the x86 BCJ filter: .RS .PP .nf .ft CW xz \-\-x86 \-\-lzma2 libfoo.so .ft R .fi .RE .PP Note that the order of the filter options is significant. If .B \-\-x86 is specified after .BR \-\-lzma2 , .B xz will give an error, because there cannot be any filter after LZMA2, and also because the x86 BCJ filter cannot be used as the last filter in the chain. .PP The Delta filter together with LZMA2 can give good results with bitmap images. It should usually beat PNG, which has a few more advanced filters than simple delta but uses Deflate for the actual compression. .PP The image has to be saved in uncompressed format, e.g. as uncompressed TIFF. The distance parameter of the Delta filter is set to match the number of bytes per pixel in the image. E.g. 
24-bit RGB bitmap needs .BR dist=3 , and it is also good to pass .B pb=0 to LZMA2 to accommodate the three-byte alignment: .RS .PP .nf .ft CW xz \-\-delta=dist=3 \-\-lzma2=pb=0 foo.tiff .ft R .fi .RE .PP If multiple images have been put into a single archive (e.g.\& .BR .tar ), the Delta filter will work on that too as long as all images have the same number of bytes per pixel. . .SH "SEE ALSO" .BR xzdec (1), .BR xzdiff (1), .BR xzgrep (1), .BR xzless (1), .BR xzmore (1), .BR gzip (1), .BR bzip2 (1), .BR 7z (1) .PP XZ Utils: .br XZ Embedded: .br LZMA SDK: Index: head/contrib/xz/src/xzdec/xzdec.c =================================================================== --- head/contrib/xz/src/xzdec/xzdec.c (revision 359200) +++ head/contrib/xz/src/xzdec/xzdec.c (revision 359201) @@ -1,323 +1,323 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file xzdec.c /// \brief Simple single-threaded tool to uncompress .xz or .lzma files // // Author: Lasse Collin // // This file has been put into the public domain. // You can do whatever you want with this file. // /////////////////////////////////////////////////////////////////////////////// #include "sysdefs.h" #include "lzma.h" #include #include #include #include #include "getopt.h" #include "tuklib_progname.h" #include "tuklib_exit.h" #ifdef TUKLIB_DOSLIKE # include # include #endif #ifdef LZMADEC # define TOOL_FORMAT "lzma" #else # define TOOL_FORMAT "xz" #endif /// Error messages are suppressed if this is zero, which is the case when /// --quiet has been given at least twice. -static unsigned int display_errors = 2; +static int display_errors = 2; static void lzma_attribute((__format__(__printf__, 1, 2))) my_errorf(const char *fmt, ...) { va_list ap; va_start(ap, fmt); if (display_errors) { fprintf(stderr, "%s: ", progname); vfprintf(stderr, fmt, ap); fprintf(stderr, "\n"); } va_end(ap); return; } static void lzma_attribute((__noreturn__)) help(void) { printf( "Usage: %s [OPTION]... 
[FILE]...\n" "Decompress files in the ." TOOL_FORMAT " format to standard output.\n" "\n" " -d, --decompress (ignored, only decompression is supported)\n" " -k, --keep (ignored, files are never deleted)\n" " -c, --stdout (ignored, output is always written to standard output)\n" " -q, --quiet specify *twice* to suppress errors\n" " -Q, --no-warn (ignored, the exit status 2 is never used)\n" " -h, --help display this help and exit\n" " -V, --version display the version number and exit\n" "\n" "With no FILE, or when FILE is -, read standard input.\n" "\n" "Report bugs to <" PACKAGE_BUGREPORT "> (in English or Finnish).\n" PACKAGE_NAME " home page: <" PACKAGE_URL ">\n", progname); tuklib_exit(EXIT_SUCCESS, EXIT_FAILURE, display_errors); } static void lzma_attribute((__noreturn__)) version(void) { printf(TOOL_FORMAT "dec (" PACKAGE_NAME ") " LZMA_VERSION_STRING "\n" "liblzma %s\n", lzma_version_string()); tuklib_exit(EXIT_SUCCESS, EXIT_FAILURE, display_errors); } /// Parses command line options. 
static void parse_options(int argc, char **argv) { static const char short_opts[] = "cdkM:hqQV"; static const struct option long_opts[] = { { "stdout", no_argument, NULL, 'c' }, { "to-stdout", no_argument, NULL, 'c' }, { "decompress", no_argument, NULL, 'd' }, { "uncompress", no_argument, NULL, 'd' }, { "keep", no_argument, NULL, 'k' }, { "quiet", no_argument, NULL, 'q' }, { "no-warn", no_argument, NULL, 'Q' }, { "help", no_argument, NULL, 'h' }, { "version", no_argument, NULL, 'V' }, { NULL, 0, NULL, 0 } }; int c; while ((c = getopt_long(argc, argv, short_opts, long_opts, NULL)) != -1) { switch (c) { case 'c': case 'd': case 'k': case 'Q': break; case 'q': if (display_errors > 0) --display_errors; break; case 'h': help(); case 'V': version(); default: exit(EXIT_FAILURE); } } return; } static void uncompress(lzma_stream *strm, FILE *file, const char *filename) { lzma_ret ret; // Initialize the decoder #ifdef LZMADEC ret = lzma_alone_decoder(strm, UINT64_MAX); #else ret = lzma_stream_decoder(strm, UINT64_MAX, LZMA_CONCATENATED); #endif // The only reasonable error here is LZMA_MEM_ERROR. if (ret != LZMA_OK) { my_errorf("%s", ret == LZMA_MEM_ERROR ? strerror(ENOMEM) : "Internal error (bug)"); exit(EXIT_FAILURE); } // Input and output buffers uint8_t in_buf[BUFSIZ]; uint8_t out_buf[BUFSIZ]; strm->avail_in = 0; strm->next_out = out_buf; strm->avail_out = BUFSIZ; lzma_action action = LZMA_RUN; while (true) { if (strm->avail_in == 0) { strm->next_in = in_buf; strm->avail_in = fread(in_buf, 1, BUFSIZ, file); if (ferror(file)) { // POSIX says that fread() sets errno if // an error occurred. ferror() doesn't // touch errno. my_errorf("%s: Error reading input file: %s", filename, strerror(errno)); exit(EXIT_FAILURE); } #ifndef LZMADEC // When using LZMA_CONCATENATED, we need to tell // liblzma when it has got all the input. if (feof(file)) action = LZMA_FINISH; #endif } ret = lzma_code(strm, action); // Write and check write error before checking decoder error. 
// This way as much data as possible gets written to output // even if decoder detected an error. if (strm->avail_out == 0 || ret != LZMA_OK) { const size_t write_size = BUFSIZ - strm->avail_out; if (fwrite(out_buf, 1, write_size, stdout) != write_size) { // Wouldn't be a surprise if writing to stderr // would fail too but at least try to show an // error message. my_errorf("Cannot write to standard output: " "%s", strerror(errno)); exit(EXIT_FAILURE); } strm->next_out = out_buf; strm->avail_out = BUFSIZ; } if (ret != LZMA_OK) { if (ret == LZMA_STREAM_END) { #ifdef LZMADEC // Check that there's no trailing garbage. if (strm->avail_in != 0 || fread(in_buf, 1, 1, file) != 0 || !feof(file)) ret = LZMA_DATA_ERROR; else return; #else // lzma_stream_decoder() already guarantees // that there's no trailing garbage. assert(strm->avail_in == 0); assert(action == LZMA_FINISH); assert(feof(file)); return; #endif } const char *msg; switch (ret) { case LZMA_MEM_ERROR: msg = strerror(ENOMEM); break; case LZMA_FORMAT_ERROR: msg = "File format not recognized"; break; case LZMA_OPTIONS_ERROR: // FIXME: Better message? msg = "Unsupported compression options"; break; case LZMA_DATA_ERROR: msg = "File is corrupt"; break; case LZMA_BUF_ERROR: msg = "Unexpected end of input"; break; default: msg = "Internal error (bug)"; break; } my_errorf("%s: %s", filename, msg); exit(EXIT_FAILURE); } } } int main(int argc, char **argv) { // Initialize progname which we will be used in error messages. tuklib_progname_init(argv); // Parse the command line options. parse_options(argc, argv); // The same lzma_stream is used for all files that we decode. This way // we don't need to reallocate memory for every file if they use same // compression settings. lzma_stream strm = LZMA_STREAM_INIT; // Some systems require setting stdin and stdout to binary mode. 
#ifdef TUKLIB_DOSLIKE setmode(fileno(stdin), O_BINARY); setmode(fileno(stdout), O_BINARY); #endif if (optind == argc) { // No filenames given, decode from stdin. uncompress(&strm, stdin, "(stdin)"); } else { // Loop through the filenames given on the command line. do { // "-" indicates stdin. if (strcmp(argv[optind], "-") == 0) { uncompress(&strm, stdin, "(stdin)"); } else { FILE *file = fopen(argv[optind], "rb"); if (file == NULL) { my_errorf("%s: %s", argv[optind], strerror(errno)); exit(EXIT_FAILURE); } uncompress(&strm, file, argv[optind]); fclose(file); } } while (++optind < argc); } #ifndef NDEBUG // Free the memory only when debugging. Freeing wastes some time, // but allows detecting possible memory leaks with Valgrind. lzma_end(&strm); #endif tuklib_exit(EXIT_SUCCESS, EXIT_FAILURE, display_errors); } Index: head/contrib/xz =================================================================== --- head/contrib/xz (revision 359200) +++ head/contrib/xz (revision 359201) Property changes on: head/contrib/xz ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /vendor/xz/dist:r357609-359197 Index: head/lib/liblzma/config.h =================================================================== --- head/lib/liblzma/config.h (revision 359200) +++ head/lib/liblzma/config.h (revision 359201) @@ -1,516 +1,533 @@ /* $FreeBSD$ */ /* config.h. Generated from config.h.in by configure. */ /* config.h.in. Generated from configure.ac by autoheader. */ /* Define if building universal (internal helper macro) */ /* #undef AC_APPLE_UNIVERSAL_BUILD */ /* How many MiB of RAM to assume if the real amount cannot be determined. */ #define ASSUME_RAM 128 /* Define to 1 if translation of program messages to the user's native language is requested. */ /* FreeBSD - disabled intentionally */ /* #undef ENABLE_NLS */ /* Define to 1 if bswap_16 is available. */ /* #undef HAVE_BSWAP_16 */ /* Define to 1 if bswap_32 is available. 
*/ /* #undef HAVE_BSWAP_32 */ /* Define to 1 if bswap_64 is available. */ /* #undef HAVE_BSWAP_64 */ /* Define to 1 if you have the header file. */ /* #undef HAVE_BYTESWAP_H */ /* Define to 1 if Capsicum is available. */ #define HAVE_CAPSICUM 1 /* Define to 1 if the system has the type `CC_SHA256_CTX'. */ /* #undef HAVE_CC_SHA256_CTX */ /* Define to 1 if you have the `CC_SHA256_Init' function. */ /* #undef HAVE_CC_SHA256_INIT */ /* Define to 1 if you have the Mac OS X function CFLocaleCopyCurrent in the CoreFoundation framework. */ /* #undef HAVE_CFLOCALECOPYCURRENT */ +/* Define to 1 if you have the Mac OS X function + CFLocaleCopyPreferredLanguages in the CoreFoundation framework. */ +/* #undef HAVE_CFLOCALECOPYPREFERREDLANGUAGES */ + /* Define to 1 if you have the Mac OS X function CFPreferencesCopyAppValue in the CoreFoundation framework. */ /* #undef HAVE_CFPREFERENCESCOPYAPPVALUE */ /* Define to 1 if crc32 integrity check is enabled. */ #define HAVE_CHECK_CRC32 1 /* Define to 1 if crc64 integrity check is enabled. */ #define HAVE_CHECK_CRC64 1 /* Define to 1 if sha256 integrity check is enabled. */ #define HAVE_CHECK_SHA256 1 /* Define to 1 if you have the `clock_gettime' function. */ #define HAVE_CLOCK_GETTIME 1 /* Define to 1 if you have the header file. */ /* #undef HAVE_COMMONCRYPTO_COMMONDIGEST_H */ /* Define if the GNU dcgettext() function is already present or preinstalled. */ /* FreeBSD - disabled intentionally */ /* #undef HAVE_DCGETTEXT */ /* Define to 1 if you have the declaration of `CLOCK_MONOTONIC', and to 0 if you don't. */ #define HAVE_DECL_CLOCK_MONOTONIC 1 /* Define to 1 if you have the declaration of `program_invocation_name', and to 0 if you don't. */ #define HAVE_DECL_PROGRAM_INVOCATION_NAME 0 /* Define to 1 if any of HAVE_DECODER_foo have been defined. */ #define HAVE_DECODERS 1 /* Define to 1 if arm decoder is enabled. */ #define HAVE_DECODER_ARM 1 /* Define to 1 if armthumb decoder is enabled. 
*/ #define HAVE_DECODER_ARMTHUMB 1 /* Define to 1 if delta decoder is enabled. */ #define HAVE_DECODER_DELTA 1 /* Define to 1 if ia64 decoder is enabled. */ #define HAVE_DECODER_IA64 1 /* Define to 1 if lzma1 decoder is enabled. */ #define HAVE_DECODER_LZMA1 1 /* Define to 1 if lzma2 decoder is enabled. */ #define HAVE_DECODER_LZMA2 1 /* Define to 1 if powerpc decoder is enabled. */ #define HAVE_DECODER_POWERPC 1 /* Define to 1 if sparc decoder is enabled. */ #define HAVE_DECODER_SPARC 1 /* Define to 1 if x86 decoder is enabled. */ #define HAVE_DECODER_X86 1 /* Define to 1 if you have the header file. */ #define HAVE_DLFCN_H 1 /* Define to 1 if any of HAVE_ENCODER_foo have been defined. */ #define HAVE_ENCODERS 1 /* Define to 1 if arm encoder is enabled. */ #define HAVE_ENCODER_ARM 1 /* Define to 1 if armthumb encoder is enabled. */ #define HAVE_ENCODER_ARMTHUMB 1 /* Define to 1 if delta encoder is enabled. */ #define HAVE_ENCODER_DELTA 1 /* Define to 1 if ia64 encoder is enabled. */ #define HAVE_ENCODER_IA64 1 /* Define to 1 if lzma1 encoder is enabled. */ #define HAVE_ENCODER_LZMA1 1 /* Define to 1 if lzma2 encoder is enabled. */ #define HAVE_ENCODER_LZMA2 1 /* Define to 1 if powerpc encoder is enabled. */ #define HAVE_ENCODER_POWERPC 1 /* Define to 1 if sparc encoder is enabled. */ #define HAVE_ENCODER_SPARC 1 /* Define to 1 if x86 encoder is enabled. */ #define HAVE_ENCODER_X86 1 /* Define to 1 if you have the header file. */ #define HAVE_FCNTL_H 1 /* Define to 1 if you have the `futimens' function. */ #define HAVE_FUTIMENS 1 /* Define to 1 if you have the `futimes' function. */ /* #undef HAVE_FUTIMES */ /* Define to 1 if you have the `futimesat' function. */ /* #undef HAVE_FUTIMESAT */ /* Define to 1 if you have the header file. */ #define HAVE_GETOPT_H 1 /* Define to 1 if you have the `getopt_long' function. */ #define HAVE_GETOPT_LONG 1 /* Define if the GNU gettext() function is already present or preinstalled. 
*/ /* FreeBSD - disabled intentionally */ /* #undef HAVE_GETTEXT */ /* Define if you have the iconv() function and it works. */ #define HAVE_ICONV 1 /* Define to 1 if you have the header file. */ /* FreeBSD - only with clang because the base gcc does not support it */ #if defined(__clang__) && defined(__FreeBSD__) && defined(__amd64__) #define HAVE_IMMINTRIN_H 1 #endif /* Define to 1 if you have the header file. */ #define HAVE_INTTYPES_H 1 /* Define to 1 if you have the header file. */ #define HAVE_LIMITS_H 1 /* Define to 1 if mbrtowc and mbstate_t are properly declared. */ #define HAVE_MBRTOWC 1 /* Define to 1 if you have the header file. */ #define HAVE_MEMORY_H 1 /* Define to 1 to enable bt2 match finder. */ #define HAVE_MF_BT2 1 /* Define to 1 to enable bt3 match finder. */ #define HAVE_MF_BT3 1 /* Define to 1 to enable bt4 match finder. */ #define HAVE_MF_BT4 1 /* Define to 1 to enable hc3 match finder. */ #define HAVE_MF_HC3 1 /* Define to 1 to enable hc4 match finder. */ #define HAVE_MF_HC4 1 /* Define to 1 if getopt.h declares extern int optreset. */ #define HAVE_OPTRESET 1 /* Define to 1 if you have the `posix_fadvise' function. */ #define HAVE_POSIX_FADVISE 1 /* Define to 1 if you have the `pthread_condattr_setclock' function. */ #define HAVE_PTHREAD_CONDATTR_SETCLOCK 1 /* Have PTHREAD_PRIO_INHERIT. */ #define HAVE_PTHREAD_PRIO_INHERIT 1 /* Define to 1 if you have the `SHA256Init' function. */ /* #undef HAVE_SHA256INIT */ /* Define to 1 if the system has the type `SHA256_CTX'. */ /* FreeBSD - disabled libmd SHA256 for now */ /* #undef HAVE_SHA256_CTX */ /* Define to 1 if you have the header file. */ /* FreeBSD - disabled libmd SHA256 for now */ /* #undef HAVE_SHA256_H */ /* Define to 1 if you have the `SHA256_Init' function. */ /* FreeBSD - disabled libmd SHA256 for now */ /* #undef HAVE_SHA256_INIT */ /* Define to 1 if the system has the type `SHA2_CTX'. */ /* #undef HAVE_SHA2_CTX */ /* Define to 1 if you have the header file. 
*/ /* #undef HAVE_SHA2_H */ /* Define to 1 if optimizing for size. */ /* #undef HAVE_SMALL */ /* Define to 1 if stdbool.h conforms to C99. */ #define HAVE_STDBOOL_H 1 /* Define to 1 if you have the header file. */ #define HAVE_STDINT_H 1 /* Define to 1 if you have the header file. */ #define HAVE_STDLIB_H 1 /* Define to 1 if you have the header file. */ #define HAVE_STRINGS_H 1 /* Define to 1 if you have the header file. */ #define HAVE_STRING_H 1 /* Define to 1 if `st_atimensec' is a member of `struct stat'. */ -/* #undef HAVE_STRUCT_STAT_ST_ATIMENSEC */ +#define HAVE_STRUCT_STAT_ST_ATIMENSEC 1 /* Define to 1 if `st_atimespec.tv_nsec' is a member of `struct stat'. */ #define HAVE_STRUCT_STAT_ST_ATIMESPEC_TV_NSEC 1 /* Define to 1 if `st_atim.st__tim.tv_nsec' is a member of `struct stat'. */ /* #undef HAVE_STRUCT_STAT_ST_ATIM_ST__TIM_TV_NSEC */ /* Define to 1 if `st_atim.tv_nsec' is a member of `struct stat'. */ #define HAVE_STRUCT_STAT_ST_ATIM_TV_NSEC 1 /* Define to 1 if `st_uatime' is a member of `struct stat'. */ /* #undef HAVE_STRUCT_STAT_ST_UATIME */ /* Define to 1 if you have the header file. */ /* #undef HAVE_SYS_BYTEORDER_H */ /* Define to 1 if you have the header file. */ #define HAVE_SYS_CAPSICUM_H 1 /* Define to 1 if you have the header file. */ #define HAVE_SYS_ENDIAN_H 1 /* Define to 1 if you have the header file. */ #define HAVE_SYS_PARAM_H 1 /* Define to 1 if you have the header file. */ #define HAVE_SYS_STAT_H 1 /* Define to 1 if you have the header file. */ #define HAVE_SYS_TIME_H 1 /* Define to 1 if you have the header file. */ #define HAVE_SYS_TYPES_H 1 /* Define to 1 if the system has the type `uintptr_t'. */ #define HAVE_UINTPTR_T 1 /* Define to 1 if you have the header file. */ #define HAVE_UNISTD_H 1 /* Define to 1 if you have the `utime' function. */ /* #undef HAVE_UTIME */ /* Define to 1 if you have the `utimes' function. */ /* #undef HAVE_UTIMES */ /* Define to 1 or 0, depending whether the compiler supports simple visibility declarations. 
    */
 #define HAVE_VISIBILITY 1

 /* Define to 1 if you have the `wcwidth' function. */
 #define HAVE_WCWIDTH 1

 /* Define to 1 if the system has the type `_Bool'. */
 #define HAVE__BOOL 1

 /* Define to 1 if you have the `_futime' function. */
 /* #undef HAVE__FUTIME */

 /* Define to 1 if _mm_movemask_epi8 is available. */
 #if defined(__FreeBSD__) && defined(__amd64__)
 #define HAVE__MM_MOVEMASK_EPI8 1
 #endif

+/* Define to 1 if the GNU C extension __builtin_assume_aligned is supported.
+   */
+#define HAVE___BUILTIN_ASSUME_ALIGNED 1
+
+/* Define to 1 if the GNU C extensions __builtin_bswap16/32/64 are supported.
+   */
+#define HAVE___BUILTIN_BSWAPXX 1
+
 /* Define to the sub-directory where libtool stores uninstalled libraries. */
 #define LT_OBJDIR ".libs/"

 /* Define to 1 when using POSIX threads (pthreads). */
 #define MYTHREAD_POSIX 1

 /* Define to 1 when using Windows Vista compatible threads. This uses features
    that are not available on Windows XP. */
 /* #undef MYTHREAD_VISTA */

 /* Define to 1 when using Windows 95 (and thus XP) compatible threads. This
    avoids use of features that were added in Windows Vista. */
 /* #undef MYTHREAD_WIN95 */

 /* Define to 1 to disable debugging code. */
 #define NDEBUG 1

 /* Name of package */
 #define PACKAGE "xz"

 /* Define to the address where bug reports for this package should be sent. */
 #define PACKAGE_BUGREPORT "lasse.collin@tukaani.org"

 /* Define to the full name of this package. */
 #define PACKAGE_NAME "XZ Utils"

 /* Define to the full name and version of this package. */
-#define PACKAGE_STRING "XZ Utils 5.2.4"
+#define PACKAGE_STRING "XZ Utils 5.2.5"

 /* Define to the one symbol short name of this package. */
 #define PACKAGE_TARNAME "xz"

 /* Define to the home page for this package. */
 #define PACKAGE_URL "https://tukaani.org/xz/"

 /* Define to the version of this package. */
-#define PACKAGE_VERSION "5.2.4"
+#define PACKAGE_VERSION "5.2.5"

 /* Define to necessary symbol if this constant uses a non-standard name on
    your system.
    */
 /* #undef PTHREAD_CREATE_JOINABLE */

 /* The size of `size_t', as computed by sizeof. */
 #define SIZEOF_SIZE_T 8

 /* Define to 1 if you have the ANSI C header files. */
 #define STDC_HEADERS 1

 /* Define to 1 if the number of available CPU cores can be detected with
    cpuset(2). */
 #define TUKLIB_CPUCORES_CPUSET 1

 /* Define to 1 if the number of available CPU cores can be detected with
    pstat_getdynamic(). */
 /* #undef TUKLIB_CPUCORES_PSTAT_GETDYNAMIC */

 /* Define to 1 if the number of available CPU cores can be detected with
    sched_getaffinity() */
 /* #undef TUKLIB_CPUCORES_SCHED_GETAFFINITY */

 /* Define to 1 if the number of available CPU cores can be detected with
    sysconf(_SC_NPROCESSORS_ONLN) or sysconf(_SC_NPROC_ONLN). */
 /* #undef TUKLIB_CPUCORES_SYSCONF */

 /* Define to 1 if the number of available CPU cores can be detected with
    sysctl(). */
 /* #undef TUKLIB_CPUCORES_SYSCTL */

 /* Define to 1 if the system supports fast unaligned access to 16-bit and
    32-bit integers. */
 /* FreeBSD - derive from __NO_STRICT_ALIGNMENT */
 /* #undef TUKLIB_FAST_UNALIGNED_ACCESS */

 /* Define to 1 if the amount of physical memory can be detected with
    _system_configuration.physmem. */
 /* #undef TUKLIB_PHYSMEM_AIX */

 /* Define to 1 if the amount of physical memory can be detected with
    getinvent_r(). */
 /* #undef TUKLIB_PHYSMEM_GETINVENT_R */

 /* Define to 1 if the amount of physical memory can be detected with
    getsysinfo(). */
 /* #undef TUKLIB_PHYSMEM_GETSYSINFO */

 /* Define to 1 if the amount of physical memory can be detected with
    pstat_getstatic(). */
 /* #undef TUKLIB_PHYSMEM_PSTAT_GETSTATIC */

 /* Define to 1 if the amount of physical memory can be detected with
    sysconf(_SC_PAGESIZE) and sysconf(_SC_PHYS_PAGES). */
 #define TUKLIB_PHYSMEM_SYSCONF 1

 /* Define to 1 if the amount of physical memory can be detected with
    sysctl(). */
 /* #undef TUKLIB_PHYSMEM_SYSCTL */

 /* Define to 1 if the amount of physical memory can be detected with Linux
    sysinfo().
    */
 /* #undef TUKLIB_PHYSMEM_SYSINFO */

+/* Define to 1 to use unsafe type punning, e.g. char *x = ...; *(int *)x =
+   123; which violates strict aliasing rules and thus is undefined behavior
+   and might result in broken code. */
+/* #undef TUKLIB_USE_UNSAFE_TYPE_PUNNING */
+
 /* Enable extensions on AIX 3, Interix. */
 #ifndef _ALL_SOURCE
 # define _ALL_SOURCE 1
 #endif
 /* Enable GNU extensions on systems that have them. */
 #ifndef _GNU_SOURCE
 # define _GNU_SOURCE 1
 #endif
 /* Enable threading extensions on Solaris. */
 #ifndef _POSIX_PTHREAD_SEMANTICS
 # define _POSIX_PTHREAD_SEMANTICS 1
 #endif
 /* Enable extensions on HP NonStop. */
 #ifndef _TANDEM_SOURCE
 # define _TANDEM_SOURCE 1
 #endif
 /* Enable general extensions on Solaris. */
 #ifndef __EXTENSIONS__
 # define __EXTENSIONS__ 1
 #endif

 /* Version number of package */
-#define VERSION "5.2.4"
+#define VERSION "5.2.5"

 /* Define WORDS_BIGENDIAN to 1 if your processor stores words with the most
    significant byte first (like Motorola and SPARC, unlike Intel). */
 #if defined(__FreeBSD__)
 #include
 #if defined(__NO_STRICT_ALIGNMENT)
 #define TUKLIB_FAST_UNALIGNED_ACCESS 1
 #endif
 #include
 #if _BYTE_ORDER == _BIG_ENDIAN
 # define WORDS_BIGENDIAN 1
 #endif
 #endif

 /* Enable large inode numbers on Mac OS X 10.5. */
 #ifndef _DARWIN_USE_64_BIT_INODE
 # define _DARWIN_USE_64_BIT_INODE 1
 #endif

 /* Number of bits in a file offset, on hosts where this is settable. */
 /* #undef _FILE_OFFSET_BITS */

 /* Define for large files, on AIX-style hosts. */
 /* #undef _LARGE_FILES */

 /* Define to 1 if on MINIX. */
 /* #undef _MINIX */

 /* Define to 2 if the system does not provide POSIX.1 features except with
    this defined. */
 /* #undef _POSIX_1_SOURCE */

 /* Define to 1 if you need to in order for `stat' and other things to work. */
 /* #undef _POSIX_SOURCE */

 /* Define for Solaris 2.5.1 so the uint32_t typedef from , , or is not used.
    If the typedef were allowed, the #define below would cause a syntax error.
    */
 /* #undef _UINT32_T */

 /* Define for Solaris 2.5.1 so the uint64_t typedef from , , or is not used.
    If the typedef were allowed, the #define below would cause a syntax error. */
 /* #undef _UINT64_T */

 /* Define for Solaris 2.5.1 so the uint8_t typedef from , , or is not used.
    If the typedef were allowed, the #define below would cause a syntax error. */
 /* #undef _UINT8_T */

 /* Define to rpl_ if the getopt replacement functions and variables should be
    used. */
 /* #undef __GETOPT_PREFIX */

 /* Define to the type of a signed integer type of width exactly 32 bits if
    such a type exists and the standard includes do not define it. */
 /* #undef int32_t */

 /* Define to the type of a signed integer type of width exactly 64 bits if
    such a type exists and the standard includes do not define it. */
 /* #undef int64_t */

 /* Define to the type of an unsigned integer type of width exactly 16 bits if
    such a type exists and the standard includes do not define it. */
 /* #undef uint16_t */

 /* Define to the type of an unsigned integer type of width exactly 32 bits if
    such a type exists and the standard includes do not define it. */
 /* #undef uint32_t */

 /* Define to the type of an unsigned integer type of width exactly 64 bits if
    such a type exists and the standard includes do not define it. */
 /* #undef uint64_t */

 /* Define to the type of an unsigned integer type of width exactly 8 bits if
    such a type exists and the standard includes do not define it. */
 /* #undef uint8_t */

 /* Define to the type of an unsigned integer type wide enough to hold a
    pointer, if such a type exists, and if the system does not define it. */
 /* #undef uintptr_t */
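Illustrative note (not part of the commit): the macros added in this revision, HAVE___BUILTIN_BSWAPXX and TUKLIB_USE_UNSAFE_TYPE_PUNNING, together with the existing TUKLIB_FAST_UNALIGNED_ACCESS and WORDS_BIGENDIAN, are the kind of switches that integer-access helpers typically test. The sketch below shows one way such code might consume them. The example_* function names are hypothetical and are not taken from the xz sources; the memcpy path is the portable default, and the type-punning path is compiled only when both macros are explicitly defined.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration only; not the tuklib API. */

/* Read a native-endian 32-bit value from a possibly unaligned buffer. */
static inline uint32_t
example_read32ne(const uint8_t *buf)
{
#if defined(TUKLIB_FAST_UNALIGNED_ACCESS) \
        && defined(TUKLIB_USE_UNSAFE_TYPE_PUNNING)
    /* Violates strict aliasing (undefined behavior); opt-in only. */
    return *(const uint32_t *)buf;
#else
    /* Portable path: compilers usually turn this into a single load. */
    uint32_t v;
    memcpy(&v, buf, sizeof(v));
    return v;
#endif
}

/* Swap the byte order of a 32-bit value. */
static inline uint32_t
example_bswap32(uint32_t v)
{
#ifdef HAVE___BUILTIN_BSWAPXX
    return __builtin_bswap32(v);
#else
    return ((v & UINT32_C(0x000000FF)) << 24)
            | ((v & UINT32_C(0x0000FF00)) << 8)
            | ((v & UINT32_C(0x00FF0000)) >> 8)
            | ((v & UINT32_C(0xFF000000)) >> 24);
#endif
}

/* Read a little-endian 32-bit value; relies on config.h having defined
 * WORDS_BIGENDIAN on big-endian hosts. */
static inline uint32_t
example_read32le(const uint8_t *buf)
{
    uint32_t v = example_read32ne(buf);
#ifdef WORDS_BIGENDIAN
    v = example_bswap32(v);
#endif
    return v;
}
```

With TUKLIB_USE_UNSAFE_TYPE_PUNNING left undefined, as in this config.h, only the memcpy branch of example_read32ne() would be compiled, even on platforms where TUKLIB_FAST_UNALIGNED_ACCESS is set.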