Class LZMA2Options
- All Implemented Interfaces:
Cloneable
public class LZMA2Options extends FilterOptions
While this allows setting the LZMA2 compression options in detail, often you only need LZMA2Options() or LZMA2Options(int).
-
Field Summary
Fields
Modifier and Type, Field, Description
static int DICT_SIZE_DEFAULT - The default dictionary size is 8 MiB.
static int DICT_SIZE_MAX - Maximum dictionary size for compression is 768 MiB.
static int DICT_SIZE_MIN - Minimum dictionary size is 4 KiB.
static int LC_DEFAULT - The default number of literal context bits is 3.
static int LC_LP_MAX - Maximum value for lc + lp is 4.
static int LP_DEFAULT - The default number of literal position bits is 0.
static int MF_BT4 - Match finder: Binary tree 2-3-4
static int MF_HC4 - Match finder: Hash Chain 2-3-4
static int MODE_FAST - Compression mode: fast.
static int MODE_NORMAL - Compression mode: normal.
static int MODE_UNCOMPRESSED - Compression mode: uncompressed.
static int NICE_LEN_MAX - Maximum value for niceLen is 273.
static int NICE_LEN_MIN - Minimum value for niceLen is 8.
static int PB_DEFAULT - The default number of position bits is 2.
static int PB_MAX - Maximum value for pb is 4.
static int PRESET_DEFAULT - Default compression preset level is 6.
static int PRESET_MAX - Maximum valid compression preset level is 9.
static int PRESET_MIN - Minimum valid compression preset level is 0.
-
Constructor Summary
Constructors
Constructor, Description
LZMA2Options() - Creates new LZMA2 options and sets them to the default values.
LZMA2Options(int preset) - Creates new LZMA2 options and sets them to the given preset.
LZMA2Options(int dictSize, int lc, int lp, int pb, int mode, int niceLen, int mf, int depthLimit) - Creates new LZMA2 options and sets them to the given custom values.
-
Method Summary
Modifier and Type, Method, Description
Object clone()
int getDecoderMemoryUsage() - Gets how much memory the LZMA2 decoder will need to decompress the data that was encoded with these options and stored in a .xz file.
int getDepthLimit() - Gets the match finder search depth limit.
int getDictSize() - Gets the dictionary size in bytes.
int getEncoderMemoryUsage() - Gets how much memory the encoder will need with these options.
InputStream getInputStream(InputStream in, ArrayCache arrayCache) - Gets a raw (no XZ headers) decoder input stream using these options and the given ArrayCache.
int getLc() - Gets the number of literal context bits.
int getLp() - Gets the number of literal position bits.
int getMatchFinder() - Gets the match finder type.
int getMode() - Gets the compression mode.
int getNiceLen() - Gets the nice length of matches.
FinishableOutputStream getOutputStream(FinishableOutputStream out, ArrayCache arrayCache) - Gets a raw (no XZ headers) encoder output stream using these options and the given ArrayCache.
int getPb() - Gets the number of position bits.
byte[] getPresetDict() - Gets the preset dictionary.
void setDepthLimit(int depthLimit) - Sets the match finder search depth limit.
void setDictSize(int dictSize) - Sets the dictionary size in bytes.
void setLc(int lc) - Sets the number of literal context bits.
void setLcLp(int lc, int lp) - Sets the number of literal context bits and literal position bits.
void setLp(int lp) - Sets the number of literal position bits.
void setMatchFinder(int mf) - Sets the match finder type.
void setMode(int mode) - Sets the compression mode.
void setNiceLen(int niceLen) - Sets the nice length of matches.
void setPb(int pb) - Sets the number of position bits.
void setPreset(int preset) - Sets the compression options to the given preset.
void setPresetDict(byte[] presetDict) - Sets a preset dictionary.
Methods inherited from class org.tukaani.xz.FilterOptions
getDecoderMemoryUsage, getEncoderMemoryUsage, getInputStream, getOutputStream
-
Field Details
-
PRESET_MIN
public static final int PRESET_MIN
Minimum valid compression preset level is 0.
See Also: Constant Field Values
-
PRESET_MAX
public static final int PRESET_MAX
Maximum valid compression preset level is 9.
See Also: Constant Field Values
-
PRESET_DEFAULT
public static final int PRESET_DEFAULT
Default compression preset level is 6.
See Also: Constant Field Values
-
DICT_SIZE_MIN
public static final int DICT_SIZE_MIN
Minimum dictionary size is 4 KiB.
See Also: Constant Field Values
-
DICT_SIZE_MAX
public static final int DICT_SIZE_MAX
Maximum dictionary size for compression is 768 MiB.
The decompressor supports bigger dictionaries, up to almost 2 GiB. With HC4 the encoder would support dictionaries bigger than 768 MiB. The 768 MiB limit comes from the current implementation of BT4 where we would otherwise hit the limits of signed ints in array indexing.
If you really need a bigger dictionary for decompression, use LZMA2InputStream directly.
See Also: Constant Field Values
-
DICT_SIZE_DEFAULT
public static final int DICT_SIZE_DEFAULT
The default dictionary size is 8 MiB.
See Also: Constant Field Values
-
LC_LP_MAX
public static final int LC_LP_MAX
Maximum value for lc + lp is 4.
See Also: Constant Field Values
-
LC_DEFAULT
public static final int LC_DEFAULT
The default number of literal context bits is 3.
See Also: Constant Field Values
-
LP_DEFAULT
public static final int LP_DEFAULT
The default number of literal position bits is 0.
See Also: Constant Field Values
-
PB_MAX
public static final int PB_MAX
Maximum value for pb is 4.
See Also: Constant Field Values
-
PB_DEFAULT
public static final int PB_DEFAULT
The default number of position bits is 2.
See Also: Constant Field Values
-
MODE_UNCOMPRESSED
public static final int MODE_UNCOMPRESSED
Compression mode: uncompressed. The data is wrapped into an LZMA2 stream without compression.
See Also: Constant Field Values
-
MODE_FAST
public static final int MODE_FAST
Compression mode: fast. This is usually combined with a hash chain match finder.
See Also: Constant Field Values
-
MODE_NORMAL
public static final int MODE_NORMAL
Compression mode: normal. This is usually combined with a binary tree match finder.
See Also: Constant Field Values
-
NICE_LEN_MIN
public static final int NICE_LEN_MIN
Minimum value for niceLen is 8.
See Also: Constant Field Values
-
NICE_LEN_MAX
public static final int NICE_LEN_MAX
Maximum value for niceLen is 273.
See Also: Constant Field Values
-
MF_HC4
public static final int MF_HC4
Match finder: Hash Chain 2-3-4
See Also: Constant Field Values
-
MF_BT4
public static final int MF_BT4
Match finder: Binary tree 2-3-4
See Also: Constant Field Values
-
-
Constructor Details
-
LZMA2Options
public LZMA2Options()
Creates new LZMA2 options and sets them to the default values. This is equivalent to LZMA2Options(PRESET_DEFAULT).
-
LZMA2Options
public LZMA2Options(int preset) throws UnsupportedOptionsException
Creates new LZMA2 options and sets them to the given preset.
- Throws:
UnsupportedOptionsException - preset is not supported
-
LZMA2Options
public LZMA2Options(int dictSize, int lc, int lp, int pb, int mode, int niceLen, int mf, int depthLimit) throws UnsupportedOptionsException
Creates new LZMA2 options and sets them to the given custom values.
- Throws:
UnsupportedOptionsException - unsupported options were specified
-
-
Method Details
-
setPreset
public void setPreset(int preset) throws UnsupportedOptionsException
Sets the compression options to the given preset.
The presets 0-3 are fast presets with medium compression. The presets 4-6 are fairly slow presets with high compression. The default preset (PRESET_DEFAULT) is 6.
The presets 7-9 are like the preset 6 but use bigger dictionaries and have higher compressor and decompressor memory requirements. Unless the uncompressed size of the file exceeds 8 MiB, 16 MiB, or 32 MiB, it is a waste of memory to use the presets 7, 8, or 9, respectively.
- Throws:
UnsupportedOptionsException - preset is not supported
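The size thresholds above can be turned into a small guard that lowers presets 7-9 when the input is too small to benefit from them. This is only a sketch: PresetAdvisor and capPreset are hypothetical names, not part of XZ for Java, and the caller is assumed to know the uncompressed size in advance.

```java
/** Hypothetical helper (not part of XZ for Java). */
final class PresetAdvisor {
    private static final long MIB = 1024L * 1024L;

    /**
     * Lowers presets 7-9 when the uncompressed size does not exceed
     * the documented thresholds of 8, 16, and 32 MiB respectively.
     * Presets 0-6 are returned unchanged.
     */
    static int capPreset(int preset, long uncompressedSize) {
        if (preset < 0 || preset > 9)
            throw new IllegalArgumentException("preset must be 0-9: " + preset);

        int cap;
        if (uncompressedSize > 32 * MIB)
            cap = 9;
        else if (uncompressedSize > 16 * MIB)
            cap = 8;
        else if (uncompressedSize > 8 * MIB)
            cap = 7;
        else
            cap = 6;

        return Math.min(preset, cap);
    }
}
```

The result could then be passed to setPreset or to the LZMA2Options(int) constructor.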
-
setDictSize
public void setDictSize(int dictSize) throws UnsupportedOptionsException
Sets the dictionary size in bytes.
The dictionary (or history buffer) holds the most recently seen uncompressed data. A bigger dictionary usually means better compression. However, using a dictionary bigger than the size of the uncompressed data is a waste of memory.
Any value in the range [DICT_SIZE_MIN, DICT_SIZE_MAX] is valid, but sizes of 2^n and 2^n + 2^(n-1) bytes are somewhat recommended.
- Throws:
UnsupportedOptionsException - dictSize is not supported
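To make the "2^n and 2^n + 2^(n-1)" recommendation concrete, the following sketch tests whether a candidate size has one of those two forms. DictSizeCheck is a hypothetical class written for illustration; the two limits are copied from the documented values of DICT_SIZE_MIN and DICT_SIZE_MAX rather than read from the library.

```java
/** Hypothetical helper (not part of XZ for Java). */
final class DictSizeCheck {
    static final int DICT_SIZE_MIN = 4096;              // 4 KiB, as documented
    static final int DICT_SIZE_MAX = 768 * 1024 * 1024; // 768 MiB, as documented

    /** True if dictSize lies in [DICT_SIZE_MIN, DICT_SIZE_MAX]. */
    static boolean isValid(int dictSize) {
        return dictSize >= DICT_SIZE_MIN && dictSize <= DICT_SIZE_MAX;
    }

    /** True if dictSize is valid and equals 2^n or 2^n + 2^(n-1). */
    static boolean isRecommended(int dictSize) {
        if (!isValid(dictSize))
            return false;

        // Strip trailing zero bits: 2^n leaves 1, and
        // 2^n + 2^(n-1) = 3 * 2^(n-1) leaves 3.
        int odd = dictSize >>> Integer.numberOfTrailingZeros(dictSize);
        return odd == 1 || odd == 3;
    }
}
```

For example, 8 MiB and 12 MiB are of the recommended form, while 10 MiB is valid but not of that form.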
-
getDictSize
public int getDictSize()
Gets the dictionary size in bytes.
-
setPresetDict
public void setPresetDict(byte[] presetDict)
Sets a preset dictionary. Use null to disable the use of a preset dictionary. By default there is no preset dictionary.
The .xz format doesn't support a preset dictionary for now. Do not set a preset dictionary unless you use raw LZMA2.
A preset dictionary can be useful when compressing many similar, relatively small chunks of data independently from each other. A preset dictionary should contain typical strings that occur in the files being compressed. The most probable strings should be near the end of the preset dictionary. The preset dictionary used for compression is also needed for decompression.
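One way to follow the ordering advice is to concatenate sample strings from least to most probable, so that the likeliest ones end up at the end of the array. The sketch below uses only the JDK; PresetDictBuilder is a hypothetical helper, and the choice of UTF-8 is an assumption about the data being compressed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper (not part of XZ for Java). */
final class PresetDictBuilder {
    /**
     * Concatenates the samples in the given order. Pass them sorted from
     * least to most probable so that the most probable strings are placed
     * near the end of the resulting dictionary.
     */
    static byte[] build(List<String> samplesLeastToMostProbable) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (String sample : samplesLeastToMostProbable) {
            byte[] bytes = sample.getBytes(StandardCharsets.UTF_8);
            buf.write(bytes, 0, bytes.length);
        }
        return buf.toByteArray();
    }
}
```

The resulting array could be handed to setPresetDict when working with raw LZMA2 streams; the same bytes must be available to the decoder.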
-
getPresetDict
public byte[] getPresetDict()
Gets the preset dictionary.
-
setLcLp
public void setLcLp(int lc, int lp) throws UnsupportedOptionsException
Sets the number of literal context bits and literal position bits.
The sum of lc and lp is limited to 4. Trying to exceed it will throw an exception. This function lets you change both at the same time.
- Throws:
UnsupportedOptionsException - lc and lp are invalid
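The reason for a combined setter becomes clearer with a small check of the documented constraint. Starting from the defaults lc = 3 and lp = 0, moving to lc = 0 and lp = 4 through the individual setters would pass through the invalid pair (3, 4) if lp were changed first. LcLpCheck below is a hypothetical stand-in that mirrors the documented rule only; it is not the library's validation code.

```java
/** Hypothetical helper (not part of XZ for Java). */
final class LcLpCheck {
    static final int LC_LP_MAX = 4; // documented maximum for lc + lp

    /** True if both values are non-negative and lc + lp <= LC_LP_MAX. */
    static boolean isValid(int lc, int lp) {
        // Bounding each value first keeps lc + lp from overflowing.
        return lc >= 0 && lc <= LC_LP_MAX
                && lp >= 0 && lp <= LC_LP_MAX
                && lc + lp <= LC_LP_MAX;
    }
}
```

Here isValid(3, 0) and isValid(0, 4) both hold, while the intermediate pair isValid(3, 4) does not.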
-
setLc
public void setLc(int lc) throws UnsupportedOptionsException
Sets the number of literal context bits.
All bytes that cannot be encoded as matches are encoded as literals. That is, literals are simply 8-bit bytes that are encoded one at a time.
The literal coding makes an assumption that the highest lc bits of the previous uncompressed byte correlate with the next byte. For example, in typical English text, an upper-case letter is often followed by a lower-case letter, and a lower-case letter is usually followed by another lower-case letter. In the US-ASCII character set, the highest three bits are 010 for upper-case letters and 011 for lower-case letters. When lc is at least 3, the literal coding can take advantage of this property in the uncompressed data.
The default value (3) is usually good. If you want maximum compression, try setLc(4). Sometimes it helps a little, and sometimes it makes compression worse. If it makes it worse, also test, for example, setLc(2).
- Throws:
UnsupportedOptionsException - lc is invalid, or the sum of lc and lp exceeds LC_LP_MAX
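The US-ASCII observation can be verified with a few lines of bit arithmetic. The sketch below extracts the highest lc bits of a byte; LiteralContext is a hypothetical class for illustration and says nothing about how the encoder stores or uses this context internally.

```java
/** Hypothetical helper (not part of XZ for Java). */
final class LiteralContext {
    /** Returns the highest lc bits of prev as a value in [0, 2^lc). */
    static int highBits(byte prev, int lc) {
        if (lc < 0 || lc > 4) // 4 is the documented LC_LP_MAX
            throw new IllegalArgumentException("lc must be 0-4: " + lc);

        // Treat the byte as unsigned, then drop the low (8 - lc) bits.
        return (prev & 0xFF) >>> (8 - lc);
    }
}
```

With lc = 3, 'A' and 'Z' both map to 2 (binary 010) and 'a' and 'z' both map to 3 (binary 011). With lc = 2, upper and lower case both map to 1 (binary 01), so the case of the previous letter is no longer visible in the context.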
-
setLp
public void setLp(int lp) throws UnsupportedOptionsException
Sets the number of literal position bits.
This affects what kind of alignment in the uncompressed data is assumed when encoding literals. See setPb for more information about alignment.
- Throws:
UnsupportedOptionsException - lp is invalid, or the sum of lc and lp exceeds LC_LP_MAX
-
getLc
public int getLc()
Gets the number of literal context bits.
-
getLp
public int getLp()
Gets the number of literal position bits.
-
setPb
public void setPb(int pb) throws UnsupportedOptionsException
Sets the number of position bits.
This affects what kind of alignment in the uncompressed data is assumed in general. The default (2) means four-byte alignment (2^pb = 2^2 = 4), which is often a good choice when there's no better guess.
When the alignment is known, setting the number of position bits accordingly may reduce the file size a little. For example, with text files having one-byte alignment (US-ASCII, ISO-8859-*, UTF-8), using setPb(0) can improve compression slightly. For UTF-16 text, setPb(1) is a good choice. If the alignment is an odd number like 3 bytes, setPb(0) might be the best choice.
Even though the assumed alignment can be adjusted with setPb and setLp, LZMA2 still slightly favors 16-byte alignment. It might be worth taking into account when designing file formats that are likely to be often compressed with LZMA2.
- Throws:
UnsupportedOptionsException - pb is invalid
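The relation between pb and the assumed alignment is simply 2^pb. The sketch below computes it and also offers one possible way to pick pb from a known alignment: take the largest pb, up to PB_MAX, such that 2^pb divides the alignment. That selection rule is an assumption made for this example. It reproduces the cases mentioned above (1 byte gives 0, 2 bytes gives 1, 4 bytes gives 2, 3 bytes gives 0), but it is not a rule stated by the library. PositionBits is a hypothetical class.

```java
/** Hypothetical helper (not part of XZ for Java). */
final class PositionBits {
    static final int PB_MAX = 4; // documented maximum for pb

    /** Returns the alignment in bytes assumed for the given pb, that is 2^pb. */
    static int alignmentFor(int pb) {
        if (pb < 0 || pb > PB_MAX)
            throw new IllegalArgumentException("pb must be 0-" + PB_MAX + ": " + pb);
        return 1 << pb;
    }

    /**
     * Heuristic (an assumption of this sketch): the largest pb <= PB_MAX
     * such that 2^pb divides the alignment.
     */
    static int suggestPb(int alignment) {
        if (alignment <= 0)
            throw new IllegalArgumentException("alignment must be positive: " + alignment);
        return Math.min(PB_MAX, Integer.numberOfTrailingZeros(alignment));
    }
}
```

Any value suggested this way is only a starting point; compressing a sample with a couple of different pb values shows which one is actually smallest.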
-
getPb
public int getPb()
Gets the number of position bits.
-
setMode
public void setMode(int mode) throws UnsupportedOptionsException
Sets the compression mode.
This specifies the method to analyze the data produced by a match finder. The default is MODE_FAST for presets 0-3 and MODE_NORMAL for presets 4-9.
Usually MODE_FAST is used with Hash Chain match finders and MODE_NORMAL with Binary Tree match finders. This is also what the presets do.
The special mode MODE_UNCOMPRESSED doesn't try to compress the data at all (and doesn't use a match finder) and will simply wrap it in uncompressed LZMA2 chunks.
- Throws:
UnsupportedOptionsException - mode is not supported
-
getMode
public int getMode()
Gets the compression mode.
-
setNiceLen
public void setNiceLen(int niceLen) throws UnsupportedOptionsException
Sets the nice length of matches. Once a match of at least niceLen bytes is found, the algorithm stops looking for better matches. Higher values tend to give better compression at the expense of speed. The default depends on the preset.
- Throws:
UnsupportedOptionsException - niceLen is invalid
-
getNiceLen
public int getNiceLen()
Gets the nice length of matches.
-
setMatchFinder
public void setMatchFinder(int mf) throws UnsupportedOptionsException
Sets the match finder type.
The match finder has a major effect on compression speed, memory usage, and compression ratio. Usually Hash Chain match finders are faster than Binary Tree match finders. The default depends on the preset: 0-3 use MF_HC4 and 4-9 use MF_BT4.
- Throws:
UnsupportedOptionsException - mf is not supported
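The documented pairing of presets with modes and match finders can be summarized as a lookup. The sketch below uses two small enums as stand-ins for the int constants MODE_FAST, MODE_NORMAL, MF_HC4, and MF_BT4; PresetDefaults and its enums are hypothetical and only restate the defaults described in this section.

```java
/** Hypothetical summary of the documented preset defaults. */
final class PresetDefaults {
    enum Mode { FAST, NORMAL }        // stand-ins for MODE_FAST, MODE_NORMAL
    enum MatchFinder { HC4, BT4 }     // stand-ins for MF_HC4, MF_BT4

    /** Presets 0-3 default to the fast mode, presets 4-9 to the normal mode. */
    static Mode modeFor(int preset) {
        checkPreset(preset);
        return preset <= 3 ? Mode.FAST : Mode.NORMAL;
    }

    /** Presets 0-3 default to Hash Chain, presets 4-9 to Binary Tree. */
    static MatchFinder matchFinderFor(int preset) {
        checkPreset(preset);
        return preset <= 3 ? MatchFinder.HC4 : MatchFinder.BT4;
    }

    private static void checkPreset(int preset) {
        if (preset < 0 || preset > 9)
            throw new IllegalArgumentException("preset must be 0-9: " + preset);
    }
}
```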
-
getMatchFinder
public int getMatchFinder()
Gets the match finder type.
-
setDepthLimit
public void setDepthLimit(int depthLimit) throws UnsupportedOptionsException
Sets the match finder search depth limit.
The default is a special value of 0 which indicates that the depth limit should be automatically calculated by the selected match finder from the nice length of matches.
A reasonable depth limit is 4-100 for Hash Chain match finders and 16-1000 for Binary Tree match finders. Using very high values can make the compressor extremely slow with some files. Avoid settings higher than 1000 unless you are prepared to interrupt the compression in case it is taking far too long.
- Throws:
UnsupportedOptionsException - depthLimit is invalid
-
getDepthLimit
public int getDepthLimit()
Gets the match finder search depth limit.
-
getEncoderMemoryUsage
public int getEncoderMemoryUsage()
Description copied from class: FilterOptions
Gets how much memory the encoder will need with these options.
- Specified by:
getEncoderMemoryUsage in class FilterOptions
-
getOutputStream
public FinishableOutputStream getOutputStream(FinishableOutputStream out, ArrayCache arrayCache)
Description copied from class: FilterOptions
Gets a raw (no XZ headers) encoder output stream using these options and the given ArrayCache. Raw streams are an advanced feature. In most cases you want to store the compressed data in the .xz container format instead of using a raw stream. To use this filter in a .xz file, pass this object to XZOutputStream.
- Specified by:
getOutputStream in class FilterOptions
-
getDecoderMemoryUsage
public int getDecoderMemoryUsage()
Gets how much memory the LZMA2 decoder will need to decompress the data that was encoded with these options and stored in a .xz file.
The returned value may be bigger than the value returned by a direct call to LZMA2InputStream.getMemoryUsage(int) if the dictionary size is not 2^n or 2^n + 2^(n-1) bytes. This is because the .xz headers store the dictionary size in such a format and other values are rounded up to the next such value. Such rounding is harmless except that it might waste some memory if an unusual dictionary size is used.
If you use raw LZMA2 streams and an unusual dictionary size, call LZMA2InputStream.getMemoryUsage(int) directly to get raw decoder memory requirements.
- Specified by:
getDecoderMemoryUsage in class FilterOptions
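The rounding described above can be illustrated by walking the candidate sizes 2^n and 2^n + 2^(n-1) in increasing order and returning the first one that is not smaller than the requested size. This is a plain sketch of the arithmetic: DictSizeRounding is a hypothetical class, the limits are the documented 4 KiB and 768 MiB, and the library may compute the same result differently.

```java
/** Hypothetical helper (not part of XZ for Java). */
final class DictSizeRounding {
    static final int DICT_SIZE_MIN = 4096;              // 4 KiB, as documented
    static final int DICT_SIZE_MAX = 768 * 1024 * 1024; // 768 MiB, as documented

    /** Rounds dictSize up to the next value of the form 2^n or 2^n + 2^(n-1). */
    static int roundUp(int dictSize) {
        if (dictSize < DICT_SIZE_MIN || dictSize > DICT_SIZE_MAX)
            throw new IllegalArgumentException("unsupported dictSize: " + dictSize);

        // 2^12 = 4 KiB is the smallest candidate;
        // 2^29 + 2^28 = 768 MiB is the largest one needed.
        for (int n = 12; n <= 29; n++) {
            long pow = 1L << n;
            if (pow >= dictSize)
                return (int) pow;

            long mid = pow + (pow >> 1);
            if (mid >= dictSize)
                return (int) mid;
        }

        throw new AssertionError("unreachable for sizes up to DICT_SIZE_MAX");
    }
}
```

For instance, a 10 MiB dictionary is rounded up to 12 MiB, which is why the memory usage reported for a .xz file may exceed the figure for a raw stream with the same options.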
-
getInputStream
public InputStream getInputStream(InputStream in, ArrayCache arrayCache) throws IOException
Description copied from class: FilterOptions
Gets a raw (no XZ headers) decoder input stream using these options and the given ArrayCache.
- Specified by:
getInputStream in class FilterOptions
- Throws:
IOException
-
clone
-