Class XZInputStream

java.lang.Object
java.io.InputStream
org.tukaani.xz.XZInputStream
All Implemented Interfaces:
Closeable, AutoCloseable

public class XZInputStream
extends InputStream
Decompresses a .xz file in streamed mode (no seeking).

Use this to decompress regular standalone .xz files. This reads from its input stream until the end of the input or until an error occurs. This supports decompressing concatenated .xz files.

Typical use cases

Getting an input stream to decompress a .xz file:

 InputStream infile = new FileInputStream("foo.xz");
 XZInputStream inxz = new XZInputStream(infile);
 

It's important to keep in mind that decompressor memory usage depends on the settings used to compress the file. The worst-case memory usage of XZInputStream is currently 1.5 GiB. Still, very few files will require more than about 65 MiB because that's how much decompressing a file created with the highest preset level will need, and only a few people use settings other than the predefined presets.

It is possible to specify a memory usage limit for XZInputStream. If decompression requires more memory than the specified limit, MemoryLimitException will be thrown when reading from the stream. For example, the following sets the memory usage limit to 100 MiB:

 InputStream infile = new FileInputStream("foo.xz");
 XZInputStream inxz = new XZInputStream(infile, 100 * 1024);
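
Because the memoryLimit argument is expressed in KiB, a limit chosen in MiB has to be multiplied by 1024 before it is passed to the constructor. The following is a minimal sketch of that conversion; the class name XZMemoryLimits and the method mibToKiB are hypothetical helpers for illustration and are not part of org.tukaani.xz:

```java
final class XZMemoryLimits {
    private XZMemoryLimits() {}

    /**
     * Converts a limit given in MiB to the KiB value expected by the
     * memoryLimit constructor argument. Throws ArithmeticException if
     * the result does not fit in an int.
     */
    static int mibToKiB(int mib) {
        if (mib < 0) {
            throw new IllegalArgumentException("limit must not be negative: " + mib);
        }
        return Math.multiplyExact(mib, 1024);
    }
}
```

With such a helper, the second line of the example above could be written as new XZInputStream(infile, XZMemoryLimits.mibToKiB(100)).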
 

When uncompressed size is known beforehand

If you are decompressing complete files and your application knows exactly how much uncompressed data there should be, it is good practice to read one more byte by calling read() and check that it returns -1. This makes the decompressor parse the file footers and verify the integrity checks, giving the caller more confidence that the uncompressed data is valid. (This advice seems to apply to java.util.zip.GZIPInputStream too.)
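
The extra read() described above can be sketched as a small helper. It relies only on the general InputStream contract, so the stream passed in could be an XZInputStream or any other decompressing stream; the names ExactSizeReader and readExactly are hypothetical and not part of this library:

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class ExactSizeReader {
    private ExactSizeReader() {}

    /**
     * Reads exactly expectedSize bytes from in, then reads one more byte
     * and requires it to be -1 (end of stream).
     */
    static byte[] readExactly(InputStream in, int expectedSize) throws IOException {
        byte[] data = new byte[expectedSize];
        int filled = 0;
        while (filled < expectedSize) {
            int n = in.read(data, filled, expectedSize - filled);
            if (n == -1) {
                throw new EOFException(
                        "expected " + expectedSize + " bytes, got only " + filled);
            }
            filled += n;
        }
        // The extra read() gives a decompressing stream the chance to parse
        // its footers and verify integrity checks before the data is trusted.
        if (in.read() != -1) {
            throw new IOException(
                    "more than " + expectedSize + " bytes of uncompressed data");
        }
        return data;
    }
}
```

For a .xz file, the call might look like ExactSizeReader.readExactly(new XZInputStream(infile), expectedSize). Any exception thrown by the final read() indicates that the data returned so far should not be trusted.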

See Also:
SingleXZInputStream
  • Constructor Details

    • XZInputStream

      public XZInputStream(InputStream in) throws IOException
      Creates a new XZ decompressor without a memory usage limit.

      This constructor reads and parses the XZ Stream Header (12 bytes) from in. The header of the first Block is not read until read is called.

      Parameters:
      in - input stream from which XZ-compressed data is read
      Throws:
      XZFormatException - input is not in the XZ format
      CorruptedInputException - XZ header CRC32 doesn't match
      UnsupportedOptionsException - XZ header is valid but specifies options not supported by this implementation
      EOFException - fewer than 12 bytes of input were available from in
      IOException - may be thrown by in
    • XZInputStream

      public XZInputStream(InputStream in, ArrayCache arrayCache) throws IOException
      Creates a new XZ decompressor without a memory usage limit.

      This is identical to XZInputStream(InputStream) except that this also takes the arrayCache argument.

      Parameters:
      in - input stream from which XZ-compressed data is read
      arrayCache - cache to be used for allocating large arrays
      Throws:
      XZFormatException - input is not in the XZ format
      CorruptedInputException - XZ header CRC32 doesn't match
      UnsupportedOptionsException - XZ header is valid but specifies options not supported by this implementation
      EOFException - fewer than 12 bytes of input were available from in
      IOException - may be thrown by in
      Since:
      1.7
    • XZInputStream

      public XZInputStream(InputStream in, int memoryLimit) throws IOException
      Creates a new XZ decompressor with an optional memory usage limit.

      This is identical to XZInputStream(InputStream) except that this also takes the memoryLimit argument.

      Parameters:
      in - input stream from which XZ-compressed data is read
      memoryLimit - memory usage limit in kibibytes (KiB) or -1 to impose no memory usage limit
      Throws:
      XZFormatException - input is not in the XZ format
      CorruptedInputException - XZ header CRC32 doesn't match
      UnsupportedOptionsException - XZ header is valid but specifies options not supported by this implementation
      EOFException - fewer than 12 bytes of input were available from in
      IOException - may be thrown by in
    • XZInputStream

      public XZInputStream(InputStream in, int memoryLimit, ArrayCache arrayCache) throws IOException
      Creates a new XZ decompressor with an optional memory usage limit.

      This is identical to XZInputStream(InputStream) except that this also takes the memoryLimit and arrayCache arguments.

      Parameters:
      in - input stream from which XZ-compressed data is read
      memoryLimit - memory usage limit in kibibytes (KiB) or -1 to impose no memory usage limit
      arrayCache - cache to be used for allocating large arrays
      Throws:
      XZFormatException - input is not in the XZ format
      CorruptedInputException - XZ header CRC32 doesn't match
      UnsupportedOptionsException - XZ header is valid but specifies options not supported by this implementation
      EOFException - fewer than 12 bytes of input were available from in
      IOException - may be thrown by in
      Since:
      1.7
    • XZInputStream

      public XZInputStream(InputStream in, int memoryLimit, boolean verifyCheck) throws IOException
      Creates a new XZ decompressor with an optional memory usage limit and ability to disable verification of integrity checks.

      This is identical to XZInputStream(InputStream,int) except that this also takes the verifyCheck argument.

      Note that integrity check verification should almost never be disabled. Possible reasons to disable integrity check verification:

      • Trying to recover data from a corrupt .xz file.
      • Speeding up decompression. This matters mostly with SHA-256 or with files that have compressed extremely well. Disabling integrity checking for performance reasons is not recommended unless the file integrity is verified externally in some other way.

      verifyCheck only affects the integrity check of the actual compressed data. The CRC32 fields in the headers are always verified.

      Parameters:
      in - input stream from which XZ-compressed data is read
      memoryLimit - memory usage limit in kibibytes (KiB) or -1 to impose no memory usage limit
      verifyCheck - if true, the integrity checks will be verified; this should almost never be set to false
      Throws:
      XZFormatException - input is not in the XZ format
      CorruptedInputException - XZ header CRC32 doesn't match
      UnsupportedOptionsException - XZ header is valid but specifies options not supported by this implementation
      EOFException - fewer than 12 bytes of input were available from in
      IOException - may be thrown by in
      Since:
      1.6
    • XZInputStream

      public XZInputStream(InputStream in, int memoryLimit, boolean verifyCheck, ArrayCache arrayCache) throws IOException
      Creates a new XZ decompressor with an optional memory usage limit and ability to disable verification of integrity checks.

      This is identical to XZInputStream(InputStream,int,boolean) except that this also takes the arrayCache argument.

      Parameters:
      in - input stream from which XZ-compressed data is read
      memoryLimit - memory usage limit in kibibytes (KiB) or -1 to impose no memory usage limit
      verifyCheck - if true, the integrity checks will be verified; this should almost never be set to false
      arrayCache - cache to be used for allocating large arrays
      Throws:
      XZFormatException - input is not in the XZ format
      CorruptedInputException - XZ header CRC32 doesn't match
      UnsupportedOptionsException - XZ header is valid but specifies options not supported by this implementation
      EOFException - fewer than 12 bytes of input were available from in
      IOException - may be thrown by in
      Since:
      1.7
  • Method Details

    • read

      public int read() throws IOException
      Decompresses the next byte from this input stream.

      Reading lots of data with read() from this input stream may be inefficient. Wrap it in BufferedInputStream if you need to read lots of data one byte at a time.
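
      The BufferedInputStream wrapping mentioned above can be sketched as follows. The helper takes any InputStream, so the argument could be an XZInputStream; the names ByteCounter and count are hypothetical and only serve the illustration:

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ByteCounter {
    private ByteCounter() {}

    /**
     * Counts how often the byte value target occurs in the stream,
     * reading one byte at a time through a BufferedInputStream so that
     * the wrapped stream is asked for large chunks instead of single bytes.
     */
    static long count(InputStream decompressed, int target) throws IOException {
        InputStream in = new BufferedInputStream(decompressed);
        long hits = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (b == target) {
                hits++;
            }
        }
        return hits;
    }
}
```

      For example, ByteCounter.count(new XZInputStream(infile), '\n') would count the newline bytes in the decompressed data.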

      Specified by:
      read in class InputStream
      Returns:
      the next decompressed byte, or -1 to indicate the end of the compressed stream
      Throws:
      CorruptedInputException
      UnsupportedOptionsException
      MemoryLimitException
      XZIOException - if the stream has been closed
      EOFException - compressed input is truncated or corrupt
      IOException - may be thrown by in
    • read

      public int read(byte[] buf, int off, int len) throws IOException
      Decompresses into an array of bytes.

      If len is zero, no bytes are read and 0 is returned. Otherwise this will try to decompress len bytes of uncompressed data. Fewer than len bytes may be read only in the following situations:

      • The end of the compressed data was reached successfully.
      • An error is detected after at least one but fewer than len bytes have already been successfully decompressed. The next call with non-zero len will immediately throw the pending exception.
      • An exception is thrown.
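
      Since a call may return fewer than len bytes, callers should use the returned count rather than assume the buffer was filled. A minimal sketch of a copy loop written that way; it relies only on the InputStream contract, and the names StreamCopy and copy are hypothetical:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {
    private StreamCopy() {}

    /**
     * Copies everything from in to out and returns the number of bytes
     * copied. Only the n bytes reported by each read call are written.
     */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }
}
```

      If an error was detected after a partial read, the bytes decompressed before the error are still written, and the pending exception propagates out of the loop on the following read call.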
      Overrides:
      read in class InputStream
      Parameters:
      buf - target buffer for uncompressed data
      off - start offset in buf
      len - maximum number of uncompressed bytes to read
      Returns:
      number of bytes read, or -1 to indicate the end of the compressed stream
      Throws:
      CorruptedInputException
      UnsupportedOptionsException
      MemoryLimitException
      XZIOException - if the stream has been closed
      EOFException - compressed input is truncated or corrupt
      IOException - may be thrown by in
    • available

      public int available() throws IOException
      Returns the number of uncompressed bytes that can be read without blocking. The value assumes that the compressed input data is valid. If the compressed data is corrupt, CorruptedInputException may be thrown before the number of bytes claimed to be available has been read from this input stream.
      Overrides:
      available in class InputStream
      Returns:
      the number of uncompressed bytes that can be read without blocking
      Throws:
      IOException
    • close

      public void close() throws IOException
      Closes the stream and calls in.close(). If the stream was already closed, this does nothing.

      This is equivalent to close(true).

      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Overrides:
      close in class InputStream
      Throws:
      IOException - if thrown by in.close()
    • close

      public void close(boolean closeInput) throws IOException
      Closes the stream and optionally calls in.close(). If the stream was already closed, this does nothing. If close(false) has been called, a subsequent call to close(true) does nothing (it doesn't call in.close()).

      If you don't want to close the underlying InputStream, there is usually no need to worry about closing this stream either; it's fine to do nothing and let the garbage collector handle it. However, if you are using ArrayCache, close(false) can be useful to return the allocated arrays to the cache without closing the underlying InputStream.

      Note that if you successfully reach the end of the stream (read returns -1), the arrays are automatically returned to the cache by that read call. In this situation close(false) is redundant (but harmless).

      Throws:
      IOException - if thrown by in.close()
      Since:
      1.7