Introduction
XZ Utils are a complete C99 implementation of the .xz file format. XZ Utils were originally written for POSIX systems but have been ported to a few non-POSIX systems as well.
The core of the XZ Utils compression code is based on LZMA SDK but it has been modified significantly to be suitable for XZ Utils.
XZ Utils consist of several components:
-
liblzma is a compression library with an API similar to that of zlib.
-
xz is a command line tool with syntax similar to that of gzip.
-
xzdec is a decompression-only tool smaller than the full-featured xz tool.
-
A set of shell scripts (xzgrep, xzdiff, etc.) have been adapted from gzip to ease viewing, grepping, and comparing compressed files.
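As a rough illustration of the gzip-like syntax, the sketch below round-trips a small file through xz in a temporary directory. The helper name xz_roundtrip_demo and the sample file name are made up for this example, and it assumes that xz and xzgrep are installed and that mktemp -d is available.

```shell
# Hypothetical helper: compress, list, decompress, and grep a tiny sample file.
xz_roundtrip_demo() {
    tmpdir=$(mktemp -d) || return 1
    printf 'hello xz\nsecond line\n' > "$tmpdir/sample.txt"

    # Compress but keep the original file (compare: gzip -k).
    xz -k "$tmpdir/sample.txt"

    # Show compressed and uncompressed sizes of the .xz file.
    xz -l "$tmpdir/sample.txt.xz"

    # Decompress to standard output (compare: gzip -dc).
    xz -dc "$tmpdir/sample.txt.xz"

    # Search inside the compressed file without unpacking it first.
    if command -v xzgrep >/dev/null 2>&1; then
        xzgrep -n 'second' "$tmpdir/sample.txt.xz"
    fi

    rm -rf "$tmpdir"
}

# Only run the demo when xz is actually available.
if command -v xz >/dev/null 2>&1; then
    xz_roundtrip_demo
else
    echo "xz not found in PATH; nothing to demonstrate"
fi
```

The same options (-k, -d, -c, -l) behave much like their gzip counterparts, which is why existing gzip habits mostly carry over.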
Documentation
Man pages with keyword indexes:
liblzma API documentation was generated using Doxygen.
Security issues
CVE-2024-47611: Argument injection on Windows
When built for native Windows (MinGW-w64 or MSVC), the command line tools from XZ Utils 5.6.2 and older have a command line argument injection vulnerability. Command line tools built for Cygwin or MSYS2 are unaffected. liblzma is unaffected.
If a command line contains Unicode characters (for example, filenames) that don’t exist in the current legacy code page, the characters are converted to similar-looking characters with best-fit mapping. Some best-fit mappings result in ASCII characters that change the meaning of the command line, which can be exploited with malicious filenames to do argument injection or directory traversal attacks.
This issue was discovered by Orange Tsai and splitline from DEVCORE Research Team.
XZ Utils 5.6.3 (and the backported commits in the v5.4 and v5.2 branches) force the process code page to UTF-8. This avoids best-fit mappings and thus fixes the issue. However, forcing the process code page to UTF-8 is possible only on Windows 10 version 1903 and later. The command line tools remain vulnerable if used on an older version of Windows.
A related smaller issue remains: Windows filenames may contain unpaired surrogates (invalid UTF-16). These are converted to the replacement character U+FFFD in the UTF-8 code page. Thus, filenames with different unpaired surrogates appear identical and aren’t distinguishable from filenames that contain the actual replacement character U+FFFD.
Compatibility notes:
-
UTF-8 is now the expected encoding of the file lists read using the --files and --files0 options when running on Windows 10 version 1903 or later.
-
If building with a MinGW-w64 toolchain, it is recommended to use UCRT version instead of the old MSVCRT. With the UTF-8 code page, messages with non-ASCII characters are not shown properly with MSVCRT.
CVE-2024-3094: liblzma backdoor
XZ Utils 5.6.0 (2024-02-24) and 5.6.1 (2024-03-09) release tarballs contain a backdoor that was inserted by a malicious co-maintainer. It was discovered by Andres Freund and made public on 2024-03-29. The incident is known as CVE-2024-3094.
This is still being investigated. See the XZ Utils backdoor page for more information and also the XZ Utils review notes.
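A quick first check is to compare the installed version number against the two releases named above. The sketch below does only that. The function name is_affected_release is hypothetical, and the parsing assumes that the first line of xz --version ends with the version number. A matching number doesn't prove that the backdoor is present, and a non-matching one isn't a full audit.

```shell
# Hypothetical helper: succeeds if the version is 5.6.0 or 5.6.1.
is_affected_release() {
    case $1 in
        5.6.0|5.6.1) return 0 ;;
        *)           return 1 ;;
    esac
}

if command -v xz >/dev/null 2>&1; then
    # Assumption: the first line looks like "xz (XZ Utils) 5.6.3".
    installed=$(xz --version | head -n 1 | awk '{print $NF}')
    if is_affected_release "$installed"; then
        echo "xz $installed: release number affected by CVE-2024-3094; investigate further"
    else
        echo "xz $installed: not one of the affected release numbers"
    fi
fi
```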
CVE-2022-1271: xzgrep filename handling
CVE-2022-1271 is also known as ZDI-CAN-16587.
Malicious filenames can make xzgrep write to arbitrary files or (with a GNU sed extension) lead to arbitrary code execution. A patch to fix it was made public on 2022-04-07. The patch applies to 4.999.9beta, 5.0.0 to 5.2.5, 5.3.1alpha, and 5.3.2alpha. Newer XZ Utils releases include an improved fix for the problem.
The vulnerability was discovered by cleemy desu wayo working with Trend Micro Zero Day Initiative.
CVE-2020-22916: A bogus CVE
CVE-2020-22916 is bogus; it’s not a security issue or a bug.
The report had a corrupt .lzma file which uses a tiny 256-byte dictionary, so decompression needs very little memory. The reporter claimed that decompressing it “could cause endless output”.
Both XZ Utils and the long-deprecated LZMA Utils produce 114,881,179 bytes of output from the file before reporting an error. This is not “endless output”. The decompression speed is good too.
Source packages
See the NEWS file for a summary of changes between versions.
The releases have been signed with Lasse Collin’s OpenPGP key.
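Checking a signature could look like the sketch below. The helper name verify_release is hypothetical, and the file names are assumptions: it presumes a detached signature stored next to the tarball with a .sig suffix, and that the signing key has already been imported into the local GnuPG keyring.

```shell
# Hypothetical helper: verify a detached OpenPGP signature for a tarball.
# Usage: verify_release TARBALL [SIGNATURE]
verify_release() {
    tarball=$1
    # Assumption: the signature is named like the tarball plus ".sig".
    signature=${2:-$tarball.sig}
    gpg --verify "$signature" "$tarball"
}

# Example call (left commented out because it needs the downloaded files):
# verify_release xz-5.6.3.tar.gz
```

gpg reports whether the signature is good and which key made it. Comparing that key's fingerprint against the one published for the releases is still up to the person verifying.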
Stable
XZ Utils 5.6.3 were released on 2024-10-01. The release includes a security fix for a command line tool argument injection vulnerability on Windows. For the old 5.2.13 and 5.4.7 releases, the fix is available in the v5.2 and v5.4 branches in the Git repository. New releases won’t be made from the old branches.
XZ Utils source packages are also available on Sourceforge.
Old releases
Source and binary packages of the old XZ Utils releases are available on a separate page.
Development
The primary Git repository is on GitHub:
git clone https://github.com/tukaani-project/xz
The repository is mirrored (with some delay) to git.tukaani.org as well.
The master branch contains the latest development code.
Maintenance status of the stable branches:
-
v5.6: maintained until 5.8.0 is ready
-
v5.4: critical fixes only (no new releases)
-
v5.2: critical fixes only (no new releases)
-
v5.0: unmaintained
The other branches on GitHub are temporary development branches which also see force pushes. These branches aren’t mirrored to git.tukaani.org.
Building from xz.git
Two multi-platform build systems are supported:
-
The GNU Autotools-based build is old, feature complete, and the most tested. Apart from Windows, DOS, and OpenVMS, the Autotools-based build is the most likely to work on less common or old platforms. Run ./autogen.sh to generate the configure script and other files.
-
The CMake-based build became feature complete in June 2024. However, it hasn’t received as much testing as the Autotools-based build. See the notes at the top of CMakeLists.txt.
Special cases: OpenVMS and DOS builds use different build systems. See the file INSTALL.
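The two build paths could be driven roughly as follows from the top of an xz.git checkout. This is a sketch, not the project's official recipe: the function names and the build directory name are made up for this example, and the file INSTALL remains the authoritative reference for options.

```shell
# Hypothetical wrappers; nothing runs until one of them is called
# from the top-level directory of an xz.git checkout.

build_with_autotools() {
    ./autogen.sh || return 1    # generates configure and other files
    ./configure  || return 1
    make         || return 1
    make check                  # run the test suite
}

build_with_cmake() {
    # "build" is an arbitrary out-of-tree build directory name.
    cmake -S . -B build    || return 1
    cmake --build build    || return 1
    ctest --test-dir build      # run the test suite
}
```

Calling build_with_autotools or build_with_cmake inside the checkout runs the corresponding sequence and stops at the first failing step.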
Minimum Autotools and CMake versions
For GNU Autotools, it is recommended to use the latest versions. The minimum versions required are old though:
-
Autoconf 2.69
-
Automake 1.12
-
gettext 0.19.6
-
Libtool 2.4
For the CMake build, version 3.20 or greater is required. Translation support also requires that GNU gettext-tools are installed.
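To see whether the installed tools meet these minimums, something like the following sketch may help. The helper names version_ge and check_tool are hypothetical. It assumes a sort that supports -V (GNU coreutils does) and that the first line of each tool's --version output ends with the version number, which may not hold on every platform.

```shell
# version_ge A B: succeed if version A is greater than or equal to version B.
# Assumption: sort(1) supports -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME MINIMUM: report whether NAME is new enough.
check_tool() {
    tool=$1
    minimum=$2
    # Assumption: the first line of "--version" ends with the version number.
    found=$("$tool" --version 2>/dev/null | head -n 1 | awk '{print $NF}')
    if [ -z "$found" ]; then
        echo "$tool: not found"
    elif version_ge "$found" "$minimum"; then
        echo "$tool $found: ok (>= $minimum)"
    else
        echo "$tool $found: too old (need >= $minimum)"
    fi
}

check_tool autoconf 2.69
check_tool automake 1.12
check_tool gettext  0.19.6
check_tool libtool  2.4
check_tool cmake    3.20
```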
Optional dependencies
po4a is needed for translated documentation (man pages).
-
Autotools: To build without po4a, pass --no-po4a as the argument to autogen.sh.
-
CMake: Run the shell script po4a/update-po to generate the translated man pages inside the source tree (thus the source tree will have extra files; it won’t stay completely clean). If the translated man pages exist in the source tree, make install will install them if translation support was enabled (XZ_NLS).
Doxygen can be used to generate liblzma API documentation in HTML format, which make install will also install. Doxygen usage is disabled by default.
-
Autotools: Pass --enable-doxygen as an argument to configure.
-
CMake: Pass -DXZ_DOXYGEN=ON as an argument to cmake.
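Put together, the optional-dependency switches above might be used as in this sketch. The function names and the build directory name are made up for this example. Passing -DXZ_NLS=ON explicitly is an assumption about how translation support would be requested; check CMakeLists.txt for the actual defaults.

```shell
# Hypothetical wrappers; call them from the top of an xz.git checkout.

# Autotools: skip translated man pages (no po4a needed), enable Doxygen docs.
configure_autotools_with_docs() {
    ./autogen.sh --no-po4a || return 1
    ./configure --enable-doxygen
}

# CMake: generate translated man pages in the source tree first,
# then configure with translation support and Doxygen docs enabled.
configure_cmake_with_docs() {
    po4a/update-po || return 1
    # "build" is an arbitrary build directory; XZ_NLS=ON is an assumption.
    cmake -S . -B build -DXZ_NLS=ON -DXZ_DOXYGEN=ON
}
```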
Future build system plans
People have wished for Meson support and work on it has been started.
While dropping Autotools is tempting, there are use cases where Autotools still have benefits:
-
Easier bootstrapping on modern operating systems. muon might make this easier with Meson-based build though.
-
Better support for old or obscure operating systems. As time goes on, these get less and less important though.
Binary packages
Many free software operating systems already provide easy-to-install XZ Utils binaries. It doesn’t make sense to provide links to all those here.
No up-to-date binaries for Windows or DOS are currently available. See the old releases page for old versions.
Supported platforms
Below is an incomplete and somewhat vague (version numbers mostly missing) list of operating systems on which XZ Utils should work. Compilers or toolchains are mentioned in parentheses. GCC refers to GCC 3 or later. If Clang/LLVM is available for the operating system then it should work too. Additions and corrections are welcome.
-
GNU/Linux (GCC, Clang, ICC, ICX, XL C)
-
GNU/Hurd (GCC)
-
DragonFly BSD
-
FreeBSD
-
MirBSD
-
NetBSD
-
OpenBSD
-
MINIX 3.3.0 and later [1]
-
Haiku
-
SerenityOS
-
AROS and AmigaOS
-
macOS / Mac OS X / Darwin
-
Solaris 10, 11 (GCC, Sun Studio / Oracle Developer Studio) [1]
-
AIX (GCC, XL C) [1]
-
z/OS UNIX System Services (XL C) [1]
-
QNX
-
HP-UX (GCC, HP ANSI C) [2]
-
OpenVMS (HP C compiler) [1]
-
OpenVOS 17 (GCC)
-
Windows 2000 and later (GCC or Clang/LLVM with MinGW-w64, GCC/Cygwin, Visual Studio 2015 or later) [1]
-
DOS e.g. FreeDOS and MS-DOS (GCC/DJGPP) [1]
XZ Utils have or had support for the following operating systems but recent releases might not work anymore. If the latest XZ Utils don’t work, try XZ Utils 5.4.7 or an even older release. Support for obsolete operating systems and versions might be retained or restored if it is easy to do.
Licensing
From version 5.5.2beta onwards, the core components of XZ Utils are under the BSD Zero Clause License (0BSD). The earlier versions that were released as public domain obviously remain in the public domain.
Some parts of XZ Utils (for example, scripts from GNU gzip and some build system files) are under different free software licenses such as GNU LGPLv2.1, GNU GPLv2, or GNU GPLv3.
See the file COPYING for more details.