From e7ab81d5a542f9f9253013ffe75720c3730a8a4b Mon Sep 17 00:00:00 2001
From: bapt `compressedSize` : must be the _exact_ size of some number of compressed and/or skippable frames.
- `dstCapacity` is an upper bound of originalSize.
+ `dstCapacity` is an upper bound of originalSize to regenerate.
If user cannot imply a maximum upper bound, it's better to use streaming mode to decompress data.
@return : the number of bytes decompressed into `dst` (<= `dstCapacity`),
or an errorCode if it fails (which can be tested using ZSTD_isError()).
NOTE: This function is planned to be obsolete, in favour of ZSTD_getFrameContentSize.
- ZSTD_getFrameContentSize functions the same way, returning the decompressed size of a single
- frame, but distinguishes empty frames from frames with an unknown size, or errors.
-
- Additionally, ZSTD_findDecompressedSize can be used instead. It can handle multiple
- concatenated frames in one buffer, and so is more general.
- As a result however, it requires more computation and entire frames to be passed to it,
- as opposed to ZSTD_getFrameContentSize which requires only a single frame's header.
+ NOTE: This function is planned to be obsolete, in favor of ZSTD_getFrameContentSize().
+ ZSTD_getFrameContentSize() works the same way,
+ returning the decompressed size of a single frame,
+ but distinguishes empty frames from frames with an unknown size, or errors.
'src' is the start of a zstd compressed frame.
@return : content size to be decompressed, as a 64-bits value _if known_, 0 otherwise.
- note 1 : decompressed size is an optional field, that may not be present, especially in streaming mode.
+ note 1 : decompressed size is an optional field, it may not be present, typically in streaming mode.
When `return==0`, data to decompress could be any size.
In which case, it's necessary to use streaming mode to decompress data.
- Optionally, application can still use ZSTD_decompress() while relying on implied limits.
+ Optionally, application can use ZSTD_decompress() while relying on implied limits.
(For example, data may be necessarily cut into blocks <= 16 KB).
note 2 : decompressed size is always present when compression is done with ZSTD_compress()
note 3 : decompressed size can be very large (64-bits value),
@@ -96,7 +93,7 @@
note 4 : If source is untrusted, decompressed size could be wrong or intentionally modified.
Always ensure result fits within application's authorized limits.
Each application can set its own limits.
- note 5 : when `return==0`, if precise failure cause is needed, use ZSTD_getFrameParams() to know more.
+ note 5 : when `return==0`, if precise failure cause is needed, use ZSTD_getFrameHeader() to know more.
Same as ZSTD_compress(), requires an allocated ZSTD_CCtx (see ZSTD_createCCtx()).
Same as ZSTD_decompress(), requires an allocated ZSTD_DCtx (see ZSTD_createDCtx()).
+ Same as ZSTD_decompress(), requires an allocated ZSTD_DCtx (see ZSTD_createDCtx())
Compression using a predefined Dictionary (see dictBuilder/zdict.h).
- Note : This function loads the dictionary, resulting in significant startup delay.
- Note : When `dict == NULL || dictSize < 8` no dictionary is used.
+ Compression using a predefined Dictionary (see dictBuilder/zdict.h).
+ Note : This function loads the dictionary, resulting in significant startup delay.
+ Note : When `dict == NULL || dictSize < 8` no dictionary is used.
Decompression using a predefined Dictionary (see dictBuilder/zdict.h).
- Dictionary must be identical to the one used during compression.
- Note : This function loads the dictionary, resulting in significant startup delay.
- Note : When `dict == NULL || dictSize < 8` no dictionary is used.
+ Decompression using a predefined Dictionary (see dictBuilder/zdict.h).
+ Dictionary must be identical to the one used during compression.
+ Note : This function loads the dictionary, resulting in significant startup delay.
+ Note : When `dict == NULL || dictSize < 8` no dictionary is used.
When compressing multiple messages / blocks with the same dictionary, it's recommended to load it just once.
- ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay.
- ZSTD_CDict can be created once and used by multiple threads concurrently, as its usage is read-only.
- `dictBuffer` can be released after ZSTD_CDict creation, as its content is copied within CDict
+ When compressing multiple messages / blocks with the same dictionary, it's recommended to load it just once.
+ ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay.
+ ZSTD_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only.
+ `dictBuffer` can be released after ZSTD_CDict creation, since its content is copied within CDict
Function frees memory allocated by ZSTD_createCDict().
+ Function frees memory allocated by ZSTD_createCDict().
zstd 1.2.0 Manual
+zstd 1.3.0 Manual
Contents
@@ -13,14 +13,14 @@
Introduction
- zstd, short for Zstandard, is a fast lossless compression algorithm, targeting real-time compression scenarios
- at zlib-level and better compression ratios. The zstd compression library provides in-memory compression and
- decompression functions. The library supports compression levels from 1 up to ZSTD_maxCLevel() which is 22.
+ zstd, short for Zstandard, is a fast lossless compression algorithm,
+ targeting real-time compression scenarios at zlib-level and better compression ratios.
+ The zstd compression library provides in-memory compression and decompression functions.
+ The library supports compression levels from 1 up to ZSTD_maxCLevel() which is currently 22.
Levels >= 20, labeled `--ultra`, should be used with caution, as they require more memory.
Compression can be done in:
- a single step (described as Simple API)
- a single step, reusing a context (described as Explicit memory management)
- unbounded multiple steps (described as Streaming compression)
- The compression ratio achievable on small data can be highly improved using compression with a dictionary in:
+ The compression ratio achievable on small data can be highly improved using a dictionary in:
- a single step (described as Simple dictionary API)
- a single step, reusing a dictionary (described as Fast dictionary API)
Advanced experimental functions can be accessed using #define ZSTD_STATIC_LINKING_ONLY before including zstd.h.
- These APIs shall never be used with a dynamic library.
+ Advanced experimental APIs shall never be used with a dynamic library.
They are not "stable", their definition may change in the future. Only static linking is allowed.
Version
-unsigned ZSTD_versionNumber(void); /**< library version number; to be used when checking dll version */
+
unsigned ZSTD_versionNumber(void); /**< useful to check dll version */
Simple API
@@ -66,28 +67,24 @@
size_t ZSTD_decompress( void* dst, size_t dstCapacity,
const void* src, size_t compressedSize);
unsigned long long ZSTD_getDecompressedSize(const void* src, size_t srcSize);
-
Helper functions
int ZSTD_maxCLevel(void);
/*!< maximum compression level available */
@@ -114,20 +111,26 @@ const char* ZSTD_getErrorName(size_t code); /*!< provides readable strin
ZSTD_CCtx* ZSTD_createCCtx(void);
size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx);
-size_t ZSTD_compressCCtx(ZSTD_CCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, int compressionLevel);
+
size_t ZSTD_compressCCtx(ZSTD_CCtx* ctx,
+ void* dst, size_t dstCapacity,
+ const void* src, size_t srcSize,
+ int compressionLevel);
Decompression context
When decompressing many times,
- it is recommended to allocate a context just once, and re-use it for each successive compression operation.
+ it is recommended to allocate a context only once,
+ and re-use it for each successive compression operation.
This will make workload friendlier for system's memory.
- Use one context per thread for parallel execution in multi-threaded environments.
+ Use one context per thread for parallel execution.
typedef struct ZSTD_DCtx_s ZSTD_DCtx;
ZSTD_DCtx* ZSTD_createDCtx(void);
size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx);
-size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
-
size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx,
+ void* dst, size_t dstCapacity,
+ const void* src, size_t srcSize);
+
Simple dictionary API
@@ -137,32 +140,33 @@ size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx);
const void* src, size_t srcSize,
const void* dict,size_t dictSize,
int compressionLevel);
-
size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx,
void* dst, size_t dstCapacity,
const void* src, size_t srcSize,
const void* dict,size_t dictSize);
-
-Fast dictionary API
+Bulk processing dictionary API
-ZSTD_CDict* ZSTD_createCDict(const void* dictBuffer, size_t dictSize, int compressionLevel);
-
ZSTD_CDict* ZSTD_createCDict(const void* dictBuffer, size_t dictSize,
+ int compressionLevel);
+
size_t ZSTD_freeCDict(ZSTD_CDict* CDict);
-
size_t ZSTD_compress_usingCDict(ZSTD_CCtx* cctx,
@@ -176,20 +180,20 @@ size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx);
ZSTD_DDict* ZSTD_createDDict(const void* dictBuffer, size_t dictSize); -Create a digested dictionary, ready to start decompression operation without startup delay. - dictBuffer can be released after DDict creation, as its content is copied inside DDict +
Create a digested dictionary, ready to start decompression operation without startup delay. + dictBuffer can be released after DDict creation, as its content is copied inside DDict
size_t ZSTD_freeDDict(ZSTD_DDict* ddict); -Function frees memory allocated with ZSTD_createDDict() +
Function frees memory allocated with ZSTD_createDDict()
size_t ZSTD_decompress_usingDDict(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, const ZSTD_DDict* ddict); -Decompression using a digested Dictionary. - Faster startup than ZSTD_decompress_usingDict(), recommended when same dictionary is used multiple times. +
Decompression using a digested Dictionary. + Faster startup than ZSTD_decompress_usingDict(), recommended when same dictionary is used multiple times.
typedef ZSTD_CCtx ZSTD_CStream; /**< CCtx and CStream are now effectively same object (>= v1.3.0) */ +
ZSTD_CStream* ZSTD_createCStream(void); size_t ZSTD_freeCStream(ZSTD_CStream* zcs);
typedef ZSTD_DCtx ZSTD_DStream; /**< DCtx and DStream are now effectively same object (>= v1.3.0) */ +
ZSTD_DStream* ZSTD_createDStream(void); size_t ZSTD_freeDStream(ZSTD_DStream* zds);
typedef enum { ZSTD_fast, ZSTD_dfast, ZSTD_greedy, ZSTD_lazy, ZSTD_lazy2, ZSTD_btlazy2, ZSTD_btopt, ZSTD_btopt2 } ZSTD_strategy; /* from faster to stronger */ +typedef enum { ZSTD_fast=1, ZSTD_dfast, ZSTD_greedy, ZSTD_lazy, ZSTD_lazy2, + ZSTD_btlazy2, ZSTD_btopt, ZSTD_btultra } ZSTD_strategy; /* from faster to stronger */
typedef struct { unsigned windowLog; /**< largest match distance : larger == more compression, more memory needed during decompression */ @@ -319,68 +330,141 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB ZSTD_frameParameters fParams; } ZSTD_parameters;
+typedef struct { + unsigned long long frameContentSize; + size_t windowSize; + unsigned dictID; + unsigned checksumFlag; +} ZSTD_frameHeader; +
Custom memory allocation functions
typedef void* (*ZSTD_allocFunction) (void* opaque, size_t size); typedef void (*ZSTD_freeFunction) (void* opaque, void* address); typedef struct { ZSTD_allocFunction customAlloc; ZSTD_freeFunction customFree; void* opaque; } ZSTD_customMem; +/* use this constant to defer to stdlib's functions */ +static const ZSTD_customMem ZSTD_defaultCMem = { NULL, NULL, NULL };
size_t ZSTD_findFrameCompressedSize(const void* src, size_t srcSize);`src` should point to the start of a ZSTD encoded frame or skippable frame `srcSize` must be at least as large as the frame - @return : the compressed size of the frame pointed to by `src`, suitable to pass to - `ZSTD_decompress` or similar, or an error code if given invalid input. + @return : the compressed size of the frame pointed to by `src`, + suitable to pass to `ZSTD_decompress` or similar, + or an error code if given invalid input.
unsigned long long ZSTD_getFrameContentSize(const void *src, size_t srcSize); -`src` should point to the start of a ZSTD encoded frame - `srcSize` must be at least as large as the frame header. A value greater than or equal - to `ZSTD_frameHeaderSize_max` is guaranteed to be large enough in all cases. - @return : decompressed size of the frame pointed to be `src` if known, otherwise - - ZSTD_CONTENTSIZE_UNKNOWN if the size cannot be determined - - ZSTD_CONTENTSIZE_ERROR if an error occurred (e.g. invalid magic number, srcSize too small) +
#define ZSTD_CONTENTSIZE_UNKNOWN (0ULL - 1) +#define ZSTD_CONTENTSIZE_ERROR (0ULL - 2) +unsigned long long ZSTD_getFrameContentSize(const void *src, size_t srcSize); +`src` should point to the start of a ZSTD encoded frame. + `srcSize` must be at least as large as the frame header. + A value >= `ZSTD_frameHeaderSize_max` is guaranteed to be large enough. + @return : - decompressed size of the frame pointed to be `src` if known + - ZSTD_CONTENTSIZE_UNKNOWN if the size cannot be determined + - ZSTD_CONTENTSIZE_ERROR if an error occurred (e.g. invalid magic number, srcSize too small)
unsigned long long ZSTD_findDecompressedSize(const void* src, size_t srcSize); -`src` should point the start of a series of ZSTD encoded and/or skippable frames - `srcSize` must be the _exact_ size of this series +
`src` should point the start of a series of ZSTD encoded and/or skippable frames + `srcSize` must be the _exact_ size of this series (i.e. there should be a frame boundary exactly `srcSize` bytes after `src`) - @return : the decompressed size of all data in the contained frames, as a 64-bit value _if known_ - - if the decompressed size cannot be determined: ZSTD_CONTENTSIZE_UNKNOWN - - if an error occurred: ZSTD_CONTENTSIZE_ERROR + @return : - decompressed size of all data in all successive frames + - if the decompressed size cannot be determined: ZSTD_CONTENTSIZE_UNKNOWN + - if an error occurred: ZSTD_CONTENTSIZE_ERROR - note 1 : decompressed size is an optional field, that may not be present, especially in streaming mode. - When `return==ZSTD_CONTENTSIZE_UNKNOWN`, data to decompress could be any size. - In which case, it's necessary to use streaming mode to decompress data. - Optionally, application can still use ZSTD_decompress() while relying on implied limits. - (For example, data may be necessarily cut into blocks <= 16 KB). - note 2 : decompressed size is always present when compression is done with ZSTD_compress() - note 3 : decompressed size can be very large (64-bits value), - potentially larger than what local system can handle as a single memory segment. - In which case, it's necessary to use streaming mode to decompress data. - note 4 : If source is untrusted, decompressed size could be wrong or intentionally modified. - Always ensure result fits within application's authorized limits. - Each application can set its own limits. - note 5 : ZSTD_findDecompressedSize handles multiple frames, and so it must traverse the input to - read each contained frame header. This is efficient as most of the data is skipped, - however it does mean that all frame data must be present and valid. + note 1 : decompressed size is an optional field, that may not be present, especially in streaming mode. + When `return==ZSTD_CONTENTSIZE_UNKNOWN`, data to decompress could be any size. + In which case, it's necessary to use streaming mode to decompress data. + Optionally, application can still use ZSTD_decompress() while relying on implied limits. + (For example, data may be necessarily cut into blocks <= 16 KB). + note 2 : decompressed size is always present when compression is done with ZSTD_compress() + note 3 : decompressed size can be very large (64-bits value), + potentially larger than what local system can handle as a single memory segment. + In which case, it's necessary to use streaming mode to decompress data. + note 4 : If source is untrusted, decompressed size could be wrong or intentionally modified. + Always ensure result fits within application's authorized limits. + Each application can set its own limits. + note 5 : ZSTD_findDecompressedSize handles multiple frames, and so it must traverse the input to + read each contained frame header. This is efficient as most of the data is skipped, + however it does mean that all frame data must be present and valid. +
+ +size_t ZSTD_frameHeaderSize(const void* src, size_t srcSize); +`src` should point to the start of a ZSTD frame + `srcSize` must be >= ZSTD_frameHeaderSize_prefix. + @return : size of the Frame Header +
+ +Context memory usage
+ +size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx); +size_t ZSTD_sizeof_DCtx(const ZSTD_DCtx* dctx); +size_t ZSTD_sizeof_CStream(const ZSTD_CStream* zcs); +size_t ZSTD_sizeof_DStream(const ZSTD_DStream* zds); +size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict); +size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict); +These functions give the current memory usage of selected object. + Object memory usage can evolve if it's re-used multiple times. +
+ +size_t ZSTD_estimateCCtxSize(int compressionLevel); +size_t ZSTD_estimateCCtxSize_advanced(ZSTD_compressionParameters cParams); +size_t ZSTD_estimateDCtxSize(void); +These functions make it possible to estimate memory usage + of a future {D,C}Ctx, before its creation. + ZSTD_estimateCCtxSize() will provide a budget large enough for any compression level up to selected one. + It will also consider src size to be arbitrarily "large", which is worst case. + If srcSize is known to always be small, ZSTD_estimateCCtxSize_advanced() can provide a tighter estimation. + ZSTD_estimateCCtxSize_advanced() can be used in tandem with ZSTD_getCParams() to create cParams from compressionLevel. + Note : CCtx estimation is only correct for single-threaded compression +
+ +size_t ZSTD_estimateCStreamSize(int compressionLevel); +size_t ZSTD_estimateCStreamSize_advanced(ZSTD_compressionParameters cParams); +size_t ZSTD_estimateDStreamSize(size_t windowSize); +size_t ZSTD_estimateDStreamSize_fromFrame(const void* src, size_t srcSize); +ZSTD_estimateCStreamSize() will provide a budget large enough for any compression level up to selected one. + It will also consider src size to be arbitrarily "large", which is worst case. + If srcSize is known to always be small, ZSTD_estimateCStreamSize_advanced() can provide a tighter estimation. + ZSTD_estimateCStreamSize_advanced() can be used in tandem with ZSTD_getCParams() to create cParams from compressionLevel. + Note : CStream estimation is only correct for single-threaded compression. + ZSTD_DStream memory budget depends on window Size. + This information can be passed manually, using ZSTD_estimateDStreamSize, + or deducted from a valid frame Header, using ZSTD_estimateDStreamSize_fromFrame(); + Note : if streaming is init with function ZSTD_init?Stream_usingDict(), + an internal ?Dict will be created, which additional size is not estimated here. + In this case, get total size by adding ZSTD_estimate?DictSize +
+ +size_t ZSTD_estimateCDictSize(size_t dictSize, int compressionLevel); +size_t ZSTD_estimateCDictSize_advanced(size_t dictSize, ZSTD_compressionParameters cParams, unsigned byReference); +size_t ZSTD_estimateDDictSize(size_t dictSize, unsigned byReference); +ZSTD_estimateCDictSize() will bet that src size is relatively "small", and content is copied, like ZSTD_createCDict(). + ZSTD_estimateCStreamSize_advanced() makes it possible to control precisely compression parameters, like ZSTD_createCDict_advanced(). + Note : dictionary created "byReference" are smaller
Advanced compression functions
-size_t ZSTD_estimateCCtxSize(ZSTD_compressionParameters cParams); -Gives the amount of memory allocated for a ZSTD_CCtx given a set of compression parameters. - `frameContentSize` is an optional parameter, provide `0` if unknown -
-ZSTD_CCtx* ZSTD_createCCtx_advanced(ZSTD_customMem customMem);Create a ZSTD compression context using external alloc and free functions
-size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx); -Gives the amount of memory used by a given ZSTD_CCtx +
ZSTD_CCtx* ZSTD_initStaticCCtx(void* workspace, size_t workspaceSize); +workspace: The memory area to emplace the context into. + Provided pointer must 8-bytes aligned. + It must outlive context usage. + workspaceSize: Use ZSTD_estimateCCtxSize() or ZSTD_estimateCStreamSize() + to determine how large workspace must be to support scenario. + @return : pointer to ZSTD_CCtx*, or NULL if error (size too small) + Note : zstd will never resize nor malloc() when using a static cctx. + If it needs more memory than available, it will simply error out. + Note 2 : there is no corresponding "free" function. + Since workspace was allocated externally, it must be freed externally too. + Limitation 1 : currently not compatible with internal CDict creation, such as + ZSTD_CCtx_loadDictionary() or ZSTD_initCStream_usingDict(). + Limitation 2 : currently not compatible with multi-threading +
typedef enum { @@ -399,13 +483,34 @@ typedef struct { ZSTD_allocFunction customAlloc; ZSTD_freeFunction customFree; v It is important that dictBuffer outlives CDict, it must remain read accessible throughout the lifetime of CDict
-ZSTD_CDict* ZSTD_createCDict_advanced(const void* dict, size_t dictSize, unsigned byReference, +typedef enum { ZSTD_dm_auto=0, /* dictionary is "full" if it starts with ZSTD_MAGIC_DICTIONARY, rawContent otherwize */ + ZSTD_dm_rawContent, /* ensures dictionary is always loaded as rawContent, even if it starts with ZSTD_MAGIC_DICTIONARY */ + ZSTD_dm_fullDict /* refuses to load a dictionary if it does not respect Zstandard's specification */ +} ZSTD_dictMode_e; +
+ZSTD_CDict* ZSTD_createCDict_advanced(const void* dict, size_t dictSize, + unsigned byReference, ZSTD_dictMode_e dictMode, ZSTD_compressionParameters cParams, ZSTD_customMem customMem);Create a ZSTD_CDict using external alloc and free, and customized compression parameters
-size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict); -Gives the amount of memory used by a given ZSTD_sizeof_CDict +
ZSTD_CDict* ZSTD_initStaticCDict( + void* workspace, size_t workspaceSize, + const void* dict, size_t dictSize, + unsigned byReference, ZSTD_dictMode_e dictMode, + ZSTD_compressionParameters cParams); +Generate a digested dictionary in provided memory area. + workspace: The memory area to emplace the dictionary into. + Provided pointer must 8-bytes aligned. + It must outlive dictionary usage. + workspaceSize: Use ZSTD_estimateCDictSize() + to determine how large workspace must be. + cParams : use ZSTD_getCParams() to transform a compression level + into its relevants cParams. + @return : pointer to ZSTD_CDict*, or NULL if error (size too small) + Note : there is no corresponding "free" function. + Since workspace was allocated externally, it must be freed externally. +
ZSTD_compressionParameters ZSTD_getCParams(int compressionLevel, unsigned long long estimatedSrcSize, size_t dictSize); @@ -423,8 +528,8 @@ typedef struct { ZSTD_allocFunction customAlloc; ZSTD_freeFunction customFree; v
ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, unsigned long long srcSize, size_t dictSize); -optimize params for a given `srcSize` and `dictSize`. - both values are optional, select `0` if unknown. +
optimize params for a given `srcSize` and `dictSize`. + both values are optional, select `0` if unknown.
size_t ZSTD_compress_advanced (ZSTD_CCtx* cctx, @@ -451,22 +556,32 @@ typedef struct { ZSTD_allocFunction customAlloc; ZSTD_freeFunction customFree; v Note 3 : Skippable Frame Identifiers are considered valid.
-size_t ZSTD_estimateDCtxSize(void); -Gives the potential amount of memory allocated to create a ZSTD_DCtx -
-ZSTD_DCtx* ZSTD_createDCtx_advanced(ZSTD_customMem customMem);Create a ZSTD decompression context using external alloc and free functions
-size_t ZSTD_sizeof_DCtx(const ZSTD_DCtx* dctx); -Gives the amount of memory used by a given ZSTD_DCtx +
ZSTD_DCtx* ZSTD_initStaticDCtx(void* workspace, size_t workspaceSize); +workspace: The memory area to emplace the context into. + Provided pointer must 8-bytes aligned. + It must outlive context usage. + workspaceSize: Use ZSTD_estimateDCtxSize() or ZSTD_estimateDStreamSize() + to determine how large workspace must be to support scenario. + @return : pointer to ZSTD_DCtx*, or NULL if error (size too small) + Note : zstd will never resize nor malloc() when using a static dctx. + If it needs more memory than available, it will simply error out. + Note 2 : static dctx is incompatible with legacy support + Note 3 : there is no corresponding "free" function. + Since workspace was allocated externally, it must be freed externally. + Limitation : currently not compatible with internal DDict creation, + such as ZSTD_initDStream_usingDict(). +
ZSTD_DDict* ZSTD_createDDict_byReference(const void* dictBuffer, size_t dictSize);Create a digested dictionary, ready to start decompression operation without startup delay. - Dictionary content is simply referenced, and therefore stays in dictBuffer. - It is important that dictBuffer outlives DDict, it must remain read accessible throughout the lifetime of DDict + Dictionary content is referenced, and therefore stays in dictBuffer. + It is important that dictBuffer outlives DDict, + it must remain read accessible throughout the lifetime of DDict
ZSTD_DDict* ZSTD_createDDict_advanced(const void* dict, size_t dictSize, @@ -474,8 +589,19 @@ typedef struct { ZSTD_allocFunction customAlloc; ZSTD_freeFunction customFree; vCreate a ZSTD_DDict using external alloc and free, optionally by reference
-size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict); -Gives the amount of memory used by a given ZSTD_DDict +
ZSTD_DDict* ZSTD_initStaticDDict(void* workspace, size_t workspaceSize, + const void* dict, size_t dictSize, + unsigned byReference); +Generate a digested dictionary in provided memory area. + workspace: The memory area to emplace the dictionary into. + Provided pointer must 8-bytes aligned. + It must outlive dictionary usage. + workspaceSize: Use ZSTD_estimateDDictSize() + to determine how large workspace must be. + @return : pointer to ZSTD_DDict*, or NULL if error (size too small) + Note : there is no corresponding "free" function. + Since workspace was allocated externally, it must be freed externally. +
unsigned ZSTD_getDictID_fromDict(const void* dict, size_t dictSize); @@ -499,19 +625,19 @@ typedef struct { ZSTD_allocFunction customAlloc; ZSTD_freeFunction customFree; v Note : this use case also happens when using a non-conformant dictionary. - `srcSize` is too small, and as a result, the frame header could not be decoded (only possible if `srcSize < ZSTD_FRAMEHEADERSIZE_MAX`). - This is not a Zstandard frame. - When identifying the exact failure cause, it's possible to use ZSTD_getFrameParams(), which will provide a more precise error code. + When identifying the exact failure cause, it's possible to use ZSTD_getFrameHeader(), which will provide a more precise error code.
Advanced streaming functions
Advanced Streaming compression functions
ZSTD_CStream* ZSTD_createCStream_advanced(ZSTD_customMem customMem); -size_t ZSTD_sizeof_CStream(const ZSTD_CStream* zcs);/**< size of CStream is variable, depending primarily on compression level */ +ZSTD_CStream* ZSTD_initStaticCStream(void* workspace, size_t workspaceSize); /**< same as ZSTD_initStaticCCtx() */ size_t ZSTD_initCStream_srcSize(ZSTD_CStream* zcs, int compressionLevel, unsigned long long pledgedSrcSize); /**< pledgedSrcSize must be correct, a size of 0 means unknown. for a frame size of 0 use initCStream_advanced */ -size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel); /**< note: a dict will not be used if dict == NULL or dictSize < 8 */ +size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel); /**< creates of an internal CDict (incompatible with static CCtx), except if dict == NULL or dictSize < 8, in which case no dict is used. */ size_t ZSTD_initCStream_advanced(ZSTD_CStream* zcs, const void* dict, size_t dictSize, ZSTD_parameters params, unsigned long long pledgedSrcSize); /**< pledgedSrcSize is optional and can be 0 (meaning unknown). note: if the contentSizeFlag is set, pledgedSrcSize == 0 means the source size is actually 0 */ size_t ZSTD_initCStream_usingCDict(ZSTD_CStream* zcs, const ZSTD_CDict* cdict); /**< note : cdict will just be referenced, and must outlive compression session */ -size_t ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, const ZSTD_CDict* cdict, unsigned long long pledgedSrcSize, ZSTD_frameParameters fParams); /**< same as ZSTD_initCStream_usingCDict(), with control over frame parameters */ +size_t ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, const ZSTD_CDict* cdict, ZSTD_frameParameters fParams, unsigned long long pledgedSrcSize); /**< same as ZSTD_initCStream_usingCDict(), with control over frame parameters */
size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize);start a new compression job, using same parameters from previous job. @@ -524,11 +650,11 @@ size_t ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, const ZSTD_CDict*
Advanced Streaming decompression functions
typedef enum { DStream_p_maxWindowSize } ZSTD_DStreamParameter_e; ZSTD_DStream* ZSTD_createDStream_advanced(ZSTD_customMem customMem); -size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize);/**< note: a dict will not be used if dict == NULL or dictSize < 8 */ +ZSTD_DStream* ZSTD_initStaticDStream(void* workspace, size_t workspaceSize); /**< same as ZSTD_initStaticDCtx() */ size_t ZSTD_setDStreamParameter(ZSTD_DStream* zds, ZSTD_DStreamParameter_e paramType, unsigned paramValue); +size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize); /**< note: a dict will not be used if dict == NULL or dictSize < 8 */ size_t ZSTD_initDStream_usingDDict(ZSTD_DStream* zds, const ZSTD_DDict* ddict); /**< note : ddict will just be referenced, and must outlive decompression session */ size_t ZSTD_resetDStream(ZSTD_DStream* zds); /**< re-use decompression parameters from previous init; saves dictionary loading */ -size_t ZSTD_sizeof_DStream(const ZSTD_DStream* zds);
Buffer-less and synchronous inner streaming functions
This is an advanced API, giving full control over buffer management, for users which need direct control over memory. @@ -578,21 +704,24 @@ size_t ZSTD_copyCCtx(ZSTD_CCtx* cctx, const ZSTD_CCtx* preparedCCtx, unsigned lo Use ZSTD_createDCtx() / ZSTD_freeDCtx() to manage it. A ZSTD_DCtx object can be re-used multiple times. - First typical operation is to retrieve frame parameters, using ZSTD_getFrameParams(). - It fills a ZSTD_frameParams structure which provide important information to correctly decode the frame, - such as the minimum rolling buffer size to allocate to decompress data (`windowSize`), - and the dictionary ID used. + First typical operation is to retrieve frame parameters, using ZSTD_getFrameHeader(). + It fills a ZSTD_frameHeader structure with important information to correctly decode the frame, + such as minimum rolling buffer size to allocate to decompress data (`windowSize`), + and the dictionary ID in use. (Note : content size is optional, it may not be present. 0 means : content size unknown). Note that these values could be wrong, either because of data malformation, or because an attacker is spoofing deliberate false information. As a consequence, check that values remain within valid application range, especially `windowSize`, before allocation. - Each application can set its own limit, depending on local restrictions. For extended interoperability, it is recommended to support at least 8 MB. - Frame parameters are extracted from the beginning of the compressed frame. - Data fragment must be large enough to ensure successful decoding, typically `ZSTD_frameHeaderSize_max` bytes. - @result : 0 : successful decoding, the `ZSTD_frameParams` structure is correctly filled. + Each application can set its own limit, depending on local restrictions. + For extended interoperability, it is recommended to support windowSize of at least 8 MB. + Frame header is extracted from the beginning of compressed frame, so providing only the frame's beginning is enough. + Data fragment must be large enough to ensure successful decoding. + `ZSTD_frameHeaderSize_max` bytes is guaranteed to always be large enough. + @result : 0 : successful decoding, the `ZSTD_frameHeader` structure is correctly filled. >0 : `srcSize` is too small, please provide at least @result bytes on next attempt. errorCode, which can be tested using ZSTD_isError(). - Start decompression, with ZSTD_decompressBegin() or ZSTD_decompressBegin_usingDict(). + Start decompression, with ZSTD_decompressBegin(). + If decompression requires a dictionary, use ZSTD_decompressBegin_usingDict() or ZSTD_decompressBegin_usingDDict(). Alternatively, you can copy a prepared context, using ZSTD_copyDCtx(). Then use ZSTD_nextSrcSizeToDecompress() and ZSTD_decompressContinue() alternatively. @@ -624,29 +753,196 @@ size_t ZSTD_copyCCtx(ZSTD_CCtx* cctx, const ZSTD_CCtx* preparedCCtx, unsigned lo b) Frame Size - 4 Bytes, Little endian format, unsigned 32-bits c) Frame Content - any content (User Data) of length equal to Frame Size For skippable frames ZSTD_decompressContinue() always returns 0. - For skippable frames ZSTD_getFrameParams() returns fparamsPtr->windowLog==0 what means that a frame is skippable. + For skippable frames ZSTD_getFrameHeader() returns fparamsPtr->windowLog==0 what means that a frame is skippable. Note : If fparamsPtr->frameContentSize==0, it is ambiguous: the frame might actually be a Zstd encoded frame with no content. For purposes of decompression, it is valid in both cases to skip the frame using ZSTD_findFrameCompressedSize to find its size in bytes. It also returns Frame Size as fparamsPtr->frameContentSize.-typedef struct { - unsigned long long frameContentSize; - unsigned windowSize; - unsigned dictID; - unsigned checksumFlag; -} ZSTD_frameParams; -
-Buffer-less streaming decompression functions
size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t srcSize);/**< doesn't consume input, see details below */ +Buffer-less streaming decompression functions
size_t ZSTD_getFrameHeader(ZSTD_frameHeader* zfhPtr, const void* src, size_t srcSize);/**< doesn't consume input */ size_t ZSTD_decompressBegin(ZSTD_DCtx* dctx); size_t ZSTD_decompressBegin_usingDict(ZSTD_DCtx* dctx, const void* dict, size_t dictSize); +size_t ZSTD_decompressBegin_usingDDict(ZSTD_DCtx* dctx, const ZSTD_DDict* ddict); void ZSTD_copyDCtx(ZSTD_DCtx* dctx, const ZSTD_DCtx* preparedDCtx); -size_t ZSTD_nextSrcSizeToDecompress(ZSTD_DCtx* dctx); -size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize); -typedef enum { ZSTDnit_frameHeader, ZSTDnit_blockHeader, ZSTDnit_block, ZSTDnit_lastBlock, ZSTDnit_checksum, ZSTDnit_skippableFrame } ZSTD_nextInputType_e; -ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx);
+typedef enum { ZSTDnit_frameHeader, ZSTDnit_blockHeader, ZSTDnit_block, ZSTDnit_lastBlock, ZSTDnit_checksum, ZSTDnit_skippableFrame } ZSTD_nextInputType_e; +
+New advanced API (experimental, and compression only)
+typedef enum { + /* compression parameters */ + ZSTD_p_compressionLevel=100, /* Update all compression parameters according to pre-defined cLevel table + * Default level is ZSTD_CLEVEL_DEFAULT==3. + * Special: value 0 means "do not change cLevel". */ + ZSTD_p_windowLog, /* Maximum allowed back-reference distance, expressed as power of 2. + * Must be clamped between ZSTD_WINDOWLOG_MIN and ZSTD_WINDOWLOG_MAX. + * Special: value 0 means "do not change windowLog". */ + ZSTD_p_hashLog, /* Size of the probe table, as a power of 2. + * Resulting table size is (1 << (hashLog+2)). + * Must be clamped between ZSTD_HASHLOG_MIN and ZSTD_HASHLOG_MAX. + * Larger tables improve compression ratio of strategies <= dFast, + * and improve speed of strategies > dFast. + * Special: value 0 means "do not change hashLog". */ + ZSTD_p_chainLog, /* Size of the full-search table, as a power of 2. + * Resulting table size is (1 << (chainLog+2)). + * Larger tables result in better and slower compression. + * This parameter is useless when using "fast" strategy. + * Special: value 0 means "do not change chainLog". */ + ZSTD_p_searchLog, /* Number of search attempts, as a power of 2. + * More attempts result in better and slower compression. + * This parameter is useless when using "fast" and "dFast" strategies. + * Special: value 0 means "do not change searchLog". */ + ZSTD_p_minMatch, /* Minimum size of searched matches (note : repCode matches can be smaller). + * Larger values make faster compression and decompression, but decrease ratio. + * Must be clamped between ZSTD_SEARCHLENGTH_MIN and ZSTD_SEARCHLENGTH_MAX. + * Note that currently, for all strategies < btopt, effective minimum is 4. + * Note that currently, for all strategies > fast, effective maximum is 6. + * Special: value 0 means "do not change minMatchLength". */ + ZSTD_p_targetLength, /* Only useful for strategies >= btopt. + * Length of Match considered "good enough" to stop search. + * Larger values make compression stronger and slower. + * Special: value 0 means "do not change targetLength". */ + ZSTD_p_compressionStrategy, /* See ZSTD_strategy enum definition. + * Cast selected strategy as unsigned for ZSTD_CCtx_setParameter() compatibility. + * The higher the value of selected strategy, the more complex it is, + * resulting in stronger and slower compression. + * Special: value 0 means "do not change strategy". */ + + /* frame parameters */ + ZSTD_p_contentSizeFlag=200, /* Content size is written into frame header _whenever known_ (default:1) */ + ZSTD_p_checksumFlag, /* A 32-bits checksum of content is written at end of frame (default:0) */ + ZSTD_p_dictIDFlag, /* When applicable, dictID of dictionary is provided in frame header (default:1) */ + + /* dictionary parameters (must be set before ZSTD_CCtx_loadDictionary) */ + ZSTD_p_dictMode=300, /* Select how dictionary content must be interpreted. Value must be from type ZSTD_dictMode_e. + * default : 0==auto : dictionary will be "full" if it respects specification, otherwise it will be "rawContent" */ + ZSTD_p_refDictContent, /* Dictionary content will be referenced, instead of copied (default:0==byCopy). + * It requires that dictionary buffer outlives its users */ + + /* multi-threading parameters */ + ZSTD_p_nbThreads=400, /* Select how many threads a compression job can spawn (default:1) + * More threads improve speed, but also increase memory usage. + * Can only receive a value > 1 if ZSTD_MULTITHREAD is enabled. + * Special: value 0 means "do not change nbThreads" */ + ZSTD_p_jobSize, /* Size of a compression job. Each compression job is completed in parallel. + * 0 means default, which is dynamically determined based on compression parameters. + * Job size must be a minimum of overlapSize, or 1 KB, whichever is largest + * The minimum size is automatically and transparently enforced */ + ZSTD_p_overlapSizeLog, /* Size of previous input reloaded at the beginning of each job. + * 0 => no overlap, 6(default) => use 1/8th of windowSize, >=9 => use full windowSize */ + + /* advanced parameters - may not remain available after API update */ + ZSTD_p_forceMaxWindow=1100, /* Force back-reference distances to remain < windowSize, + * even when referencing into Dictionary content (default:0) */ + +} ZSTD_cParameter; +
+size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned value); +Set one compression parameter, selected by enum ZSTD_cParameter. + Note : when `value` is an enum, cast it to unsigned for proper type checking. + @result : 0, or an error code (which can be tested with ZSTD_isError()). +
+ +size_t ZSTD_CCtx_setPledgedSrcSize(ZSTD_CCtx* cctx, unsigned long long pledgedSrcSize); +Total input data size to be compressed as a single frame. + This value will be controlled at the end, and result in error if not respected. + @result : 0, or an error code (which can be tested with ZSTD_isError()). + Note 1 : 0 means zero, empty. + In order to mean "unknown content size", pass constant ZSTD_CONTENTSIZE_UNKNOWN. + Note that ZSTD_CONTENTSIZE_UNKNOWN is default value for new compression jobs. + Note 2 : If all data is provided and consumed in a single round, + this value is overriden by srcSize instead. +
+ +size_t ZSTD_CCtx_loadDictionary(ZSTD_CCtx* cctx, const void* dict, size_t dictSize); +Create an internal CDict from dict buffer. + Decompression will have to use same buffer. + @result : 0, or an error code (which can be tested with ZSTD_isError()). + Special : Adding a NULL (or 0-size) dictionary invalidates any previous dictionary, + meaning "return to no-dictionary mode". + Note 1 : `dict` content will be copied internally, + except if ZSTD_p_refDictContent is set before loading. + Note 2 : Loading a dictionary involves building tables, which are dependent on compression parameters. + For this reason, compression parameters cannot be changed anymore after loading a dictionary. + It's also a CPU-heavy operation, with non-negligible impact on latency. + Note 3 : Dictionary will be used for all future compression jobs. + To return to "no-dictionary" situation, load a NULL dictionary +
+ +size_t ZSTD_CCtx_refCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict); +Reference a prepared dictionary, to be used for all next compression jobs. + Note that compression parameters are enforced from within CDict, + and supercede any compression parameter previously set within CCtx. + The dictionary will remain valid for future compression jobs using same CCtx. + @result : 0, or an error code (which can be tested with ZSTD_isError()). + Special : adding a NULL CDict means "return to no-dictionary mode". + Note 1 : Currently, only one dictionary can be managed. + Adding a new dictionary effectively "discards" any previous one. + Note 2 : CDict is just referenced, its lifetime must outlive CCtx. + +
+ +size_t ZSTD_CCtx_refPrefix(ZSTD_CCtx* cctx, const void* prefix, size_t prefixSize); +Reference a prefix (single-usage dictionary) for next compression job. + Decompression need same prefix to properly regenerate data. + Prefix is **only used once**. Tables are discarded at end of compression job. + Subsequent compression jobs will be done without prefix (if none is explicitly referenced). + If there is a need to use same prefix multiple times, consider embedding it into a ZSTD_CDict instead. + @result : 0, or an error code (which can be tested with ZSTD_isError()). + Special : Adding any prefix (including NULL) invalidates any previous prefix or dictionary + Note 1 : Prefix buffer is referenced. It must outlive compression job. + Note 2 : Referencing a prefix involves building tables, which are dependent on compression parameters. + It's a CPU-heavy operation, with non-negligible impact on latency. + Note 3 : it's possible to alter ZSTD_p_dictMode using ZSTD_CCtx_setParameter() +
+ +typedef enum { + ZSTD_e_continue=0, /* collect more data, encoder transparently decides when to output result, for optimal conditions */ + ZSTD_e_flush, /* flush any data provided so far - frame will continue, future data can still reference previous data for better compression */ + ZSTD_e_end /* flush any remaining data and ends current frame. Any future compression starts a new frame. */ +} ZSTD_EndDirective; +
+size_t ZSTD_compress_generic (ZSTD_CCtx* cctx, + ZSTD_outBuffer* output, + ZSTD_inBuffer* input, + ZSTD_EndDirective endOp); +Behave about the same as ZSTD_compressStream. To note : + - Compression parameters are pushed into CCtx before starting compression, using ZSTD_CCtx_setParameter() + - Compression parameters cannot be changed once compression is started. + - *dstPos must be <= dstCapacity, *srcPos must be <= srcSize + - *dspPos and *srcPos will be updated. They are guaranteed to remain below their respective limit. + - @return provides the minimum amount of data still to flush from internal buffers + or an error code, which can be tested using ZSTD_isError(). + if @return != 0, flush is not fully completed, there is some data left within internal buffers. + - after a ZSTD_e_end directive, if internal buffer is not fully flushed, + only ZSTD_e_end or ZSTD_e_flush operations are allowed. + It is necessary to fully flush internal buffers + before starting a new compression job, or changing compression parameters. + +
+ +void ZSTD_CCtx_reset(ZSTD_CCtx* cctx); /* Not ready yet ! */ +Return a CCtx to clean state. + Useful after an error, or to interrupt an ongoing compression job and start a new one. + Any internal data not yet flushed is cancelled. + Dictionary (if any) is dropped. + It's possible to modify compression parameters after a reset. + +
+ +size_t ZSTD_compress_generic_simpleArgs ( + ZSTD_CCtx* cctx, + void* dst, size_t dstCapacity, size_t* dstPos, + const void* src, size_t srcSize, size_t* srcPos, + ZSTD_EndDirective endOp); +Same as ZSTD_compress_generic(), + but using only integral types as arguments. + Argument list is larger and less expressive than ZSTD_{in,out}Buffer, + but can be helpful for binders from dynamic languages + which have troubles handling structures containing memory pointers. + +
+Block functions
Block functions produce and decode raw zstd blocks, without frame metadata. Frame metadata cost is typically ~18 bytes, which can be non-negligible for very small blocks (< 100 bytes). @@ -659,7 +955,7 @@ ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx); + compression : any ZSTD_compressBegin*() variant, including with dictionary + decompression : any ZSTD_decompressBegin*() variant, including with dictionary + copyCCtx() and copyDCtx() can be used too - - Block size is limited, it must be <= ZSTD_getBlockSizeMax() <= ZSTD_BLOCKSIZE_ABSOLUTEMAX + - Block size is limited, it must be <= ZSTD_getBlockSize() <= ZSTD_BLOCKSIZE_MAX + If input is larger than a block size, it's necessary to split input data into multiple blocks + For inputs larger than a single block size, consider using the regular ZSTD_compress() instead. Frame metadata is not that costly, and quickly becomes negligible as source size grows larger. @@ -672,7 +968,7 @@ ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx); Use ZSTD_insertBlock() for such a case.-Raw zstd block functions
size_t ZSTD_getBlockSizeMax(ZSTD_CCtx* cctx); +Raw zstd block functions
size_t ZSTD_getBlockSize (const ZSTD_CCtx* cctx); size_t ZSTD_compressBlock (ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize); size_t ZSTD_decompressBlock(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize); size_t ZSTD_insertBlock(ZSTD_DCtx* dctx, const void* blockStart, size_t blockSize);/**< insert block into `dctx` history. Useful for uncompressed blocks */ diff --git a/contrib/zstd/examples/Makefile b/contrib/zstd/examples/Makefile index b84983f08c0c..e279a537d71e 100644 --- a/contrib/zstd/examples/Makefile +++ b/contrib/zstd/examples/Makefile @@ -46,7 +46,7 @@ clean: simple_compression simple_decompression \ dictionary_compression dictionary_decompression \ streaming_compression streaming_decompression \ - multiple_streaming_compression + multiple_streaming_compression @echo Cleaning completed test: all diff --git a/contrib/zstd/lib/Makefile b/contrib/zstd/lib/Makefile index d8d8e179d205..5845cf170e71 100644 --- a/contrib/zstd/lib/Makefile +++ b/contrib/zstd/lib/Makefile @@ -22,9 +22,11 @@ VERSION?= $(LIBVER) CPPFLAGS+= -I. -I./common -DXXH_NAMESPACE=ZSTD_ CFLAGS ?= -O3 -DEBUGFLAGS = -g -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow \ - -Wstrict-aliasing=1 -Wswitch-enum -Wdeclaration-after-statement \ - -Wstrict-prototypes -Wundef -Wpointer-arith -Wformat-security +DEBUGFLAGS = -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow \ + -Wstrict-aliasing=1 -Wswitch-enum -Wdeclaration-after-statement \ + -Wstrict-prototypes -Wundef -Wpointer-arith -Wformat-security \ + -Wvla -Wformat=2 -Winit-self -Wfloat-equal -Wwrite-strings \ + -Wredundant-decls CFLAGS += $(DEBUGFLAGS) $(MOREFLAGS) FLAGS = $(CPPFLAGS) $(CFLAGS) @@ -71,7 +73,7 @@ libzstd.a: $(ZSTD_OBJ) @echo compiling static library @$(AR) $(ARFLAGS) $@ $^ -libzstd.a-mt: CPPFLAGS += -DZSTD_MULTHREAD +libzstd.a-mt: CPPFLAGS += -DZSTD_MULTITHREAD libzstd.a-mt: libzstd.a $(LIBZSTD): LDFLAGS += -shared -fPIC -fvisibility=hidden @@ -147,7 +149,7 @@ install: libzstd.a libzstd libzstd.pc @$(INSTALL) -d -m 755 $(DESTDIR)$(PKGCONFIGDIR)/ $(DESTDIR)$(INCLUDEDIR)/ @$(INSTALL_DATA) libzstd.pc $(DESTDIR)$(PKGCONFIGDIR)/ @echo Installing libraries - @$(INSTALL_LIB) libzstd.a $(DESTDIR)$(LIBDIR) + @$(INSTALL_DATA) libzstd.a $(DESTDIR)$(LIBDIR) @$(INSTALL_LIB) libzstd.$(SHARED_EXT_VER) $(DESTDIR)$(LIBDIR) @ln -sf libzstd.$(SHARED_EXT_VER) $(DESTDIR)$(LIBDIR)/libzstd.$(SHARED_EXT_MAJOR) @ln -sf libzstd.$(SHARED_EXT_VER) $(DESTDIR)$(LIBDIR)/libzstd.$(SHARED_EXT) diff --git a/contrib/zstd/lib/common/bitstream.h b/contrib/zstd/lib/common/bitstream.h index ca42850df324..07b85026c95b 100644 --- a/contrib/zstd/lib/common/bitstream.h +++ b/contrib/zstd/lib/common/bitstream.h @@ -39,7 +39,6 @@ extern "C" { #endif - /* * This API consists of small unitary functions, which must be inlined for best performance. * Since link-time-optimization is not available for all compilers, @@ -59,7 +58,9 @@ extern "C" { #if defined(BIT_DEBUG) && (BIT_DEBUG>=1) # include#else -# define assert(condition) ((void)0) +# ifndef assert +# define assert(condition) ((void)0) +# endif #endif @@ -74,6 +75,7 @@ extern "C" { #define STREAM_ACCUMULATOR_MIN_64 57 #define STREAM_ACCUMULATOR_MIN ((U32)(MEM_32bits() ? STREAM_ACCUMULATOR_MIN_32 : STREAM_ACCUMULATOR_MIN_64)) + /*-****************************************** * bitStream encoding API (write forward) ********************************************/ @@ -303,13 +305,25 @@ MEM_STATIC size_t BIT_initDStream(BIT_DStream_t* bitD, const void* srcBuffer, si bitD->bitContainer = *(const BYTE*)(bitD->start); switch(srcSize) { - case 7: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[6]) << (sizeof(bitD->bitContainer)*8 - 16); - case 6: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[5]) << (sizeof(bitD->bitContainer)*8 - 24); - case 5: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[4]) << (sizeof(bitD->bitContainer)*8 - 32); - case 4: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[3]) << 24; - case 3: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[2]) << 16; - case 2: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[1]) << 8; - default:; + case 7: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[6]) << (sizeof(bitD->bitContainer)*8 - 16); + /* fall-through */ + + case 6: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[5]) << (sizeof(bitD->bitContainer)*8 - 24); + /* fall-through */ + + case 5: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[4]) << (sizeof(bitD->bitContainer)*8 - 32); + /* fall-through */ + + case 4: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[3]) << 24; + /* fall-through */ + + case 3: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[2]) << 16; + /* fall-through */ + + case 2: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[1]) << 8; + /* fall-through */ + + default: break; } { BYTE const lastByte = ((const BYTE*)srcBuffer)[srcSize-1]; bitD->bitsConsumed = lastByte ? 8 - BIT_highbit32(lastByte) : 0; diff --git a/contrib/zstd/lib/common/error_private.c b/contrib/zstd/lib/common/error_private.c index b3287245f1ee..2d752cd23a72 100644 --- a/contrib/zstd/lib/common/error_private.c +++ b/contrib/zstd/lib/common/error_private.c @@ -24,7 +24,8 @@ const char* ERR_getErrorString(ERR_enum code) case PREFIX(frameParameter_unsupported): return "Unsupported frame parameter"; case PREFIX(frameParameter_unsupportedBy32bits): return "Frame parameter unsupported in 32-bits mode"; case PREFIX(frameParameter_windowTooLarge): return "Frame requires too much memory for decoding"; - case PREFIX(compressionParameter_unsupported): return "Compression parameter is out of bound"; + case PREFIX(compressionParameter_unsupported): return "Compression parameter is not supported"; + case PREFIX(compressionParameter_outOfBound): return "Compression parameter is out of bound"; case PREFIX(init_missing): return "Context should be init first"; case PREFIX(memory_allocation): return "Allocation error : not enough memory"; case PREFIX(stage_wrong): return "Operation not authorized at current processing stage"; @@ -38,6 +39,8 @@ const char* ERR_getErrorString(ERR_enum code) case PREFIX(dictionary_corrupted): return "Dictionary is corrupted"; case PREFIX(dictionary_wrong): return "Dictionary mismatch"; case PREFIX(dictionaryCreation_failed): return "Cannot create Dictionary from provided samples"; + case PREFIX(frameIndex_tooLarge): return "Frame index is too large"; + case PREFIX(seekableIO): return "An I/O error occurred when reading/seeking"; case PREFIX(maxCode): default: return notErrorCode; } diff --git a/contrib/zstd/lib/common/huf.h b/contrib/zstd/lib/common/huf.h index 7873ca3d42a5..dabd359915a7 100644 --- a/contrib/zstd/lib/common/huf.h +++ b/contrib/zstd/lib/common/huf.h @@ -111,6 +111,18 @@ HUF_PUBLIC_API size_t HUF_compress2 (void* dst, size_t dstCapacity, const void* #define HUF_WORKSPACE_SIZE_U32 (HUF_WORKSPACE_SIZE / sizeof(U32)) HUF_PUBLIC_API size_t HUF_compress4X_wksp (void* dst, size_t dstCapacity, const void* src, size_t srcSize, unsigned maxSymbolValue, unsigned tableLog, void* workSpace, size_t wkspSize); +/** + * The minimum workspace size for the `workSpace` used in + * HUF_readDTableX2_wksp() and HUF_readDTableX4_wksp(). + * + * The space used depends on HUF_TABLELOG_MAX, ranging from ~1500 bytes when + * HUF_TABLE_LOG_MAX=12 to ~1850 bytes when HUF_TABLE_LOG_MAX=15. + * Buffer overflow errors may potentially occur if code modifications result in + * a required workspace size greater than that specified in the following + * macro. + */ +#define HUF_DECOMPRESS_WORKSPACE_SIZE (2 << 10) +#define HUF_DECOMPRESS_WORKSPACE_SIZE_U32 (HUF_DECOMPRESS_WORKSPACE_SIZE / sizeof(U32)) /* ****************************************************************** @@ -170,8 +182,11 @@ size_t HUF_decompress4X4 (void* dst, size_t dstSize, const void* cSrc, size_t cS size_t HUF_decompress4X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< decodes RLE and uncompressed */ size_t HUF_decompress4X_hufOnly(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< considers RLE and uncompressed as errors */ +size_t HUF_decompress4X_hufOnly_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< considers RLE and uncompressed as errors */ size_t HUF_decompress4X2_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */ +size_t HUF_decompress4X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< single-symbol decoder */ size_t HUF_decompress4X4_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */ +size_t HUF_decompress4X4_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< double-symbols decoder */ /* **************************************** @@ -243,7 +258,9 @@ HUF_decompress() does the following: U32 HUF_selectDecoder (size_t dstSize, size_t cSrcSize); size_t HUF_readDTableX2 (HUF_DTable* DTable, const void* src, size_t srcSize); +size_t HUF_readDTableX2_wksp (HUF_DTable* DTable, const void* src, size_t srcSize, void* workSpace, size_t wkspSize); size_t HUF_readDTableX4 (HUF_DTable* DTable, const void* src, size_t srcSize); +size_t HUF_readDTableX4_wksp (HUF_DTable* DTable, const void* src, size_t srcSize, void* workSpace, size_t wkspSize); size_t HUF_decompress4X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); size_t HUF_decompress4X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); @@ -266,8 +283,11 @@ size_t HUF_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cS size_t HUF_decompress1X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* double-symbol decoder */ size_t HUF_decompress1X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); +size_t HUF_decompress1X_DCtx_wksp (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); size_t HUF_decompress1X2_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */ +size_t HUF_decompress1X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< single-symbol decoder */ size_t HUF_decompress1X4_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */ +size_t HUF_decompress1X4_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< double-symbols decoder */ size_t HUF_decompress1X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); /**< automatic selection of sing or double symbol decoder, based on DTable */ size_t HUF_decompress1X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); diff --git a/contrib/zstd/lib/common/mem.h b/contrib/zstd/lib/common/mem.h index 4773a8b9309e..b0e5bf60b43b 100644 --- a/contrib/zstd/lib/common/mem.h +++ b/contrib/zstd/lib/common/mem.h @@ -352,20 +352,6 @@ MEM_STATIC void MEM_writeBEST(void* memPtr, size_t val) } -/* function safe only for comparisons */ -MEM_STATIC U32 MEM_readMINMATCH(const void* memPtr, U32 length) -{ - switch (length) - { - default : - case 4 : return MEM_read32(memPtr); - case 3 : if (MEM_isLittleEndian()) - return MEM_read32(memPtr)<<8; - else - return MEM_read32(memPtr)>>8; - } -} - #if defined (__cplusplus) } #endif diff --git a/contrib/zstd/lib/common/pool.c b/contrib/zstd/lib/common/pool.c index e439fe1b0dd6..749fa4f2f7b4 100644 --- a/contrib/zstd/lib/common/pool.c +++ b/contrib/zstd/lib/common/pool.c @@ -146,6 +146,13 @@ void POOL_free(POOL_ctx *ctx) { free(ctx); } +size_t POOL_sizeof(POOL_ctx *ctx) { + if (ctx==NULL) return 0; /* supports sizeof NULL */ + return sizeof(*ctx) + + ctx->queueSize * sizeof(POOL_job) + + ctx->numThreads * sizeof(pthread_t); +} + void POOL_add(void *ctxVoid, POOL_function function, void *opaque) { POOL_ctx *ctx = (POOL_ctx *)ctxVoid; if (!ctx) { return; } @@ -191,4 +198,9 @@ void POOL_add(void *ctx, POOL_function function, void *opaque) { function(opaque); } +size_t POOL_sizeof(POOL_ctx *ctx) { + if (ctx==NULL) return 0; /* supports sizeof NULL */ + return sizeof(*ctx); +} + #endif /* ZSTD_MULTITHREAD */ diff --git a/contrib/zstd/lib/common/pool.h b/contrib/zstd/lib/common/pool.h index 50cb25b12c9f..386cd674b7c0 100644 --- a/contrib/zstd/lib/common/pool.h +++ b/contrib/zstd/lib/common/pool.h @@ -32,6 +32,11 @@ POOL_ctx *POOL_create(size_t numThreads, size_t queueSize); */ void POOL_free(POOL_ctx *ctx); +/*! POOL_sizeof() : + return memory usage of pool returned by POOL_create(). +*/ +size_t POOL_sizeof(POOL_ctx *ctx); + /*! POOL_function : The function type that can be added to a thread pool. */ diff --git a/contrib/zstd/lib/common/threading.c b/contrib/zstd/lib/common/threading.c index 32d58796a9ea..141376c56199 100644 --- a/contrib/zstd/lib/common/threading.c +++ b/contrib/zstd/lib/common/threading.c @@ -1,4 +1,3 @@ - /** * Copyright (c) 2016 Tino Reichardt * All rights reserved. diff --git a/contrib/zstd/lib/common/zstd_common.c b/contrib/zstd/lib/common/zstd_common.c index 8408a589ae8d..f68167238158 100644 --- a/contrib/zstd/lib/common/zstd_common.c +++ b/contrib/zstd/lib/common/zstd_common.c @@ -12,16 +12,19 @@ /*-************************************* * Dependencies ***************************************/ -#include /* malloc */ +#include /* malloc, calloc, free */ +#include /* memset */ #include "error_private.h" #define ZSTD_STATIC_LINKING_ONLY -#include "zstd.h" /* declaration of ZSTD_isError, ZSTD_getErrorName, ZSTD_getErrorCode, ZSTD_getErrorString, ZSTD_versionNumber */ +#include "zstd.h" /*-**************************************** * Version ******************************************/ -unsigned ZSTD_versionNumber (void) { return ZSTD_VERSION_NUMBER; } +unsigned ZSTD_versionNumber(void) { return ZSTD_VERSION_NUMBER; } + +const char* ZSTD_versionString(void) { return ZSTD_VERSION_STRING; } /*-**************************************** @@ -47,27 +50,31 @@ const char* ZSTD_getErrorString(ZSTD_ErrorCode code) { return ERR_getErrorString /*=************************************************************** * Custom allocator ****************************************************************/ -/* default uses stdlib */ -void* ZSTD_defaultAllocFunction(void* opaque, size_t size) -{ - void* address = malloc(size); - (void)opaque; - return address; -} - -void ZSTD_defaultFreeFunction(void* opaque, void* address) -{ - (void)opaque; - free(address); -} - void* ZSTD_malloc(size_t size, ZSTD_customMem customMem) { - return customMem.customAlloc(customMem.opaque, size); + if (customMem.customAlloc) + return customMem.customAlloc(customMem.opaque, size); + return malloc(size); +} + +void* ZSTD_calloc(size_t size, ZSTD_customMem customMem) +{ + if (customMem.customAlloc) { + /* calloc implemented as malloc+memset; + * not as efficient as calloc, but next best guess for custom malloc */ + void* const ptr = customMem.customAlloc(customMem.opaque, size); + memset(ptr, 0, size); + return ptr; + } + return calloc(1, size); } void ZSTD_free(void* ptr, ZSTD_customMem customMem) { - if (ptr!=NULL) - customMem.customFree(customMem.opaque, ptr); + if (ptr!=NULL) { + if (customMem.customFree) + customMem.customFree(customMem.opaque, ptr); + else + free(ptr); + } } diff --git a/contrib/zstd/lib/common/zstd_errors.h b/contrib/zstd/lib/common/zstd_errors.h index 3d579d969363..19f1597aa340 100644 --- a/contrib/zstd/lib/common/zstd_errors.h +++ b/contrib/zstd/lib/common/zstd_errors.h @@ -19,10 +19,12 @@ extern "C" { /* ===== ZSTDERRORLIB_API : control library symbols visibility ===== */ -#if defined(__GNUC__) && (__GNUC__ >= 4) -# define ZSTDERRORLIB_VISIBILITY __attribute__ ((visibility ("default"))) -#else -# define ZSTDERRORLIB_VISIBILITY +#ifndef ZSTDERRORLIB_VISIBILITY +# if defined(__GNUC__) && (__GNUC__ >= 4) +# define ZSTDERRORLIB_VISIBILITY __attribute__ ((visibility ("default"))) +# else +# define ZSTDERRORLIB_VISIBILITY +# endif #endif #if defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1) # define ZSTDERRORLIB_API __declspec(dllexport) ZSTDERRORLIB_VISIBILITY @@ -33,8 +35,11 @@ extern "C" { #endif /*-**************************************** -* error codes list -******************************************/ + * error codes list + * note : this API is still considered unstable + * it should not be used with a dynamic library + * only static linking is allowed + ******************************************/ typedef enum { ZSTD_error_no_error, ZSTD_error_GENERIC, @@ -45,6 +50,7 @@ typedef enum { ZSTD_error_frameParameter_unsupportedBy32bits, ZSTD_error_frameParameter_windowTooLarge, ZSTD_error_compressionParameter_unsupported, + ZSTD_error_compressionParameter_outOfBound, ZSTD_error_init_missing, ZSTD_error_memory_allocation, ZSTD_error_stage_wrong, @@ -58,12 +64,14 @@ typedef enum { ZSTD_error_dictionary_corrupted, ZSTD_error_dictionary_wrong, ZSTD_error_dictionaryCreation_failed, + ZSTD_error_frameIndex_tooLarge, + ZSTD_error_seekableIO, ZSTD_error_maxCode } ZSTD_ErrorCode; /*! ZSTD_getErrorCode() : convert a `size_t` function result into a `ZSTD_ErrorCode` enum type, - which can be used to compare directly with enum list published into "error_public.h" */ + which can be used to compare with enum list published above */ ZSTDERRORLIB_API ZSTD_ErrorCode ZSTD_getErrorCode(size_t functionResult); ZSTDERRORLIB_API const char* ZSTD_getErrorString(ZSTD_ErrorCode code); diff --git a/contrib/zstd/lib/common/zstd_internal.h b/contrib/zstd/lib/common/zstd_internal.h index 2533333ba83c..f2c4e6249fb8 100644 --- a/contrib/zstd/lib/common/zstd_internal.h +++ b/contrib/zstd/lib/common/zstd_internal.h @@ -18,6 +18,7 @@ # include /* For Visual 2005 */ # pragma warning(disable : 4100) /* disable: C4100: unreferenced formal parameter */ # pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */ +# pragma warning(disable : 4204) /* disable: C4204: non-constant aggregate initializer */ # pragma warning(disable : 4324) /* disable: C4324: padded structure */ #else # if defined (__cplusplus) || defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */ @@ -50,9 +51,42 @@ #define ZSTD_STATIC_LINKING_ONLY #include "zstd.h" #ifndef XXH_STATIC_LINKING_ONLY -# define XXH_STATIC_LINKING_ONLY /* XXH64_state_t */ +# define XXH_STATIC_LINKING_ONLY /* XXH64_state_t */ +#endif +#include "xxhash.h" /* XXH_reset, update, digest */ + + +/*-************************************* +* Debug +***************************************/ +#if defined(ZSTD_DEBUG) && (ZSTD_DEBUG>=1) +# include +#else +# ifndef assert +# define assert(condition) ((void)0) +# endif +#endif + +#define ZSTD_STATIC_ASSERT(c) { enum { ZSTD_static_assert = 1/(int)(!!(c)) }; } + +#if defined(ZSTD_DEBUG) && (ZSTD_DEBUG>=2) +# include +/* recommended values for ZSTD_DEBUG display levels : + * 1 : no display, enables assert() only + * 2 : reserved for currently active debugging path + * 3 : events once per object lifetime (CCtx, CDict) + * 4 : events once per frame + * 5 : events once per block + * 6 : events once per sequence (*very* verbose) */ +# define DEBUGLOG(l, ...) { \ + if (l<=ZSTD_DEBUG) { \ + fprintf(stderr, __FILE__ ": "); \ + fprintf(stderr, __VA_ARGS__); \ + fprintf(stderr, " \n"); \ + } } +#else +# define DEBUGLOG(l, ...) {} /* disabled */ #endif -#include "xxhash.h" /* XXH_reset, update, digest */ /*-************************************* @@ -70,7 +104,6 @@ * Common constants ***************************************/ #define ZSTD_OPT_NUM (1<<12) -#define ZSTD_DICT_MAGIC 0xEC30A437 /* v0.7+ */ #define ZSTD_REP_NUM 3 /* number of repcodes */ #define ZSTD_REP_CHECK (ZSTD_REP_NUM) /* number of repcodes to check by the optimal parser */ @@ -235,15 +268,10 @@ typedef struct { const seqStore_t* ZSTD_getSeqStore(const ZSTD_CCtx* ctx); void ZSTD_seqToCodes(const seqStore_t* seqStorePtr); -int ZSTD_isSkipFrame(ZSTD_DCtx* dctx); /* custom memory allocation functions */ -void* ZSTD_defaultAllocFunction(void* opaque, size_t size); -void ZSTD_defaultFreeFunction(void* opaque, void* address); -#ifndef ZSTD_DLL_IMPORT -static const ZSTD_customMem defaultCustomMem = { ZSTD_defaultAllocFunction, ZSTD_defaultFreeFunction, NULL }; -#endif void* ZSTD_malloc(size_t size, ZSTD_customMem customMem); +void* ZSTD_calloc(size_t size, ZSTD_customMem customMem); void ZSTD_free(void* ptr, ZSTD_customMem customMem); @@ -281,4 +309,26 @@ MEM_STATIC U32 ZSTD_highbit32(U32 val) void ZSTD_invalidateRepCodes(ZSTD_CCtx* cctx); +/*! ZSTD_initCStream_internal() : + * Private use only. Init streaming operation. + * expects params to be valid. + * must receive dict, or cdict, or none, but not both. + * @return : 0, or an error code */ +size_t ZSTD_initCStream_internal(ZSTD_CStream* zcs, + const void* dict, size_t dictSize, + const ZSTD_CDict* cdict, + ZSTD_parameters params, unsigned long long pledgedSrcSize); + +/*! ZSTD_compressStream_generic() : + * Private use only. To be called from zstdmt_compress.c in single-thread mode. */ +size_t ZSTD_compressStream_generic(ZSTD_CStream* zcs, + ZSTD_outBuffer* output, + ZSTD_inBuffer* input, + ZSTD_EndDirective const flushMode); + +/*! ZSTD_getParamsFromCDict() : + * as the name implies */ +ZSTD_parameters ZSTD_getParamsFromCDict(const ZSTD_CDict* cdict); + + #endif /* ZSTD_CCOMMON_H_MODULE */ diff --git a/contrib/zstd/lib/compress/huf_compress.c b/contrib/zstd/lib/compress/huf_compress.c index fe11aafb8f14..7af0789a9c58 100644 --- a/contrib/zstd/lib/compress/huf_compress.c +++ b/contrib/zstd/lib/compress/huf_compress.c @@ -266,7 +266,8 @@ static U32 HUF_setMaxHeight(nodeElt* huffNode, U32 lastNonNull, U32 maxNbBits) if (highTotal <= lowTotal) break; } } /* only triggered when no more rank 1 symbol left => find closest one (note : there is necessarily at least one !) */ - while ((nBitsToDecrease<=HUF_TABLELOG_MAX) && (rankLast[nBitsToDecrease] == noSymbol)) /* HUF_MAX_TABLELOG test just to please gcc 5+; but it should not be necessary */ + /* HUF_MAX_TABLELOG test just to please gcc 5+; but it should not be necessary */ + while ((nBitsToDecrease<=HUF_TABLELOG_MAX) && (rankLast[nBitsToDecrease] == noSymbol)) nBitsToDecrease ++; totalCost -= 1 << (nBitsToDecrease-1); if (rankLast[nBitsToDecrease-1] == noSymbol) @@ -463,12 +464,15 @@ size_t HUF_compress1X_usingCTable(void* dst, size_t dstSize, const void* src, si { case 3 : HUF_encodeSymbol(&bitC, ip[n+ 2], CTable); HUF_FLUSHBITS_2(&bitC); + /* fall-through */ case 2 : HUF_encodeSymbol(&bitC, ip[n+ 1], CTable); HUF_FLUSHBITS_1(&bitC); + /* fall-through */ case 1 : HUF_encodeSymbol(&bitC, ip[n+ 0], CTable); HUF_FLUSHBITS(&bitC); - case 0 : - default: ; + /* fall-through */ + case 0 : /* fall-through */ + default: break; } for (; n>0; n-=4) { /* note : n&3==0 at this stage */ diff --git a/contrib/zstd/lib/compress/zstd_compress.c b/contrib/zstd/lib/compress/zstd_compress.c index c08b315dab93..9300357f2d38 100644 --- a/contrib/zstd/lib/compress/zstd_compress.c +++ b/contrib/zstd/lib/compress/zstd_compress.c @@ -8,6 +8,14 @@ */ +/*-************************************* +* Tuning parameters +***************************************/ +#ifndef ZSTD_CLEVEL_DEFAULT +# define ZSTD_CLEVEL_DEFAULT 3 +#endif + + /*-************************************* * Dependencies ***************************************/ @@ -18,26 +26,7 @@ #define HUF_STATIC_LINKING_ONLY #include "huf.h" #include "zstd_internal.h" /* includes zstd.h */ - - -/*-************************************* -* Debug -***************************************/ -#if defined(ZSTD_DEBUG) && (ZSTD_DEBUG>=1) -# include -#else -# define assert(condition) ((void)0) -#endif - -#define ZSTD_STATIC_ASSERT(c) { enum { ZSTD_static_assert = 1/(int)(!!(c)) }; } - -#if defined(ZSTD_DEBUG) && (ZSTD_DEBUG>=2) -# include - static unsigned g_debugLevel = ZSTD_DEBUG; -# define DEBUGLOG(l, ...) if (l<=g_debugLevel) { fprintf(stderr, __FILE__ ": "); fprintf(stderr, __VA_ARGS__); fprintf(stderr, " \n"); } -#else -# define DEBUGLOG(l, ...) {} /* disabled */ -#endif +#include "zstdmt_compress.h" /*-************************************* @@ -79,6 +68,15 @@ static void ZSTD_resetSeqStore(seqStore_t* ssPtr) /*-************************************* * Context memory management ***************************************/ +typedef enum { zcss_init=0, zcss_load, zcss_flush } ZSTD_cStreamStage; + +struct ZSTD_CDict_s { + void* dictBuffer; + const void* dictContent; + size_t dictContentSize; + ZSTD_CCtx* refContext; +}; /* typedef'd to ZSTD_CDict within "zstd.h" */ + struct ZSTD_CCtx_s { const BYTE* nextSrc; /* next block here to continue on current prefix */ const BYTE* base; /* All regular indexes relative to this position */ @@ -90,19 +88,21 @@ struct ZSTD_CCtx_s { U32 hashLog3; /* dispatch table : larger == faster, more memory */ U32 loadedDictEnd; /* index of end of dictionary */ U32 forceWindow; /* force back-references to respect limit of 1< customMem = customMem; + cctx->compressionLevel = ZSTD_CLEVEL_DEFAULT; + ZSTD_STATIC_ASSERT(zcss_init==0); + ZSTD_STATIC_ASSERT(ZSTD_CONTENTSIZE_UNKNOWN==(0ULL - 1)); + return cctx; +} + +ZSTD_CCtx* ZSTD_initStaticCCtx(void *workspace, size_t workspaceSize) +{ + ZSTD_CCtx* cctx = (ZSTD_CCtx*) workspace; + if (workspaceSize <= sizeof(ZSTD_CCtx)) return NULL; /* minimum size */ + if ((size_t)workspace & 7) return NULL; /* must be 8-aligned */ + memset(workspace, 0, workspaceSize); /* may be a bit generous, could memset be smaller ? */ + cctx->staticSize = workspaceSize; + cctx->workSpace = (void*)(cctx+1); + cctx->workSpaceSize = workspaceSize - sizeof(ZSTD_CCtx); + + /* entropy space (never moves) */ + /* note : this code should be shared with resetCCtx, rather than copy/pasted */ + { void* ptr = cctx->workSpace; + cctx->hufCTable = (HUF_CElt*)ptr; + ptr = (char*)cctx->hufCTable + hufCTable_size; + cctx->offcodeCTable = (FSE_CTable*) ptr; + ptr = (char*)ptr + offcodeCTable_size; + cctx->matchlengthCTable = (FSE_CTable*) ptr; + ptr = (char*)ptr + matchlengthCTable_size; + cctx->litlengthCTable = (FSE_CTable*) ptr; + ptr = (char*)ptr + litlengthCTable_size; + assert(((size_t)ptr & 3) == 0); /* ensure correct alignment */ + cctx->entropyScratchSpace = (unsigned*) ptr; + } + return cctx; } size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx) { if (cctx==NULL) return 0; /* support free on NULL */ + if (cctx->staticSize) return ERROR(memory_allocation); /* not compatible with static CCtx */ ZSTD_free(cctx->workSpace, cctx->customMem); + cctx->workSpace = NULL; + ZSTD_freeCDict(cctx->cdictLocal); + cctx->cdictLocal = NULL; + ZSTDMT_freeCCtx(cctx->mtctx); + cctx->mtctx = NULL; ZSTD_free(cctx, cctx->customMem); return 0; /* reserved as a potential error code in the future */ } @@ -147,46 +208,294 @@ size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx) size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx) { if (cctx==NULL) return 0; /* support sizeof on NULL */ - return sizeof(*cctx) + cctx->workSpaceSize; + DEBUGLOG(5, "sizeof(*cctx) : %u", (U32)sizeof(*cctx)); + DEBUGLOG(5, "workSpaceSize : %u", (U32)cctx->workSpaceSize); + DEBUGLOG(5, "streaming buffers : %u", (U32)(cctx->outBuffSize + cctx->inBuffSize)); + DEBUGLOG(5, "inner MTCTX : %u", (U32)ZSTDMT_sizeof_CCtx(cctx->mtctx)); + return sizeof(*cctx) + cctx->workSpaceSize + + ZSTD_sizeof_CDict(cctx->cdictLocal) + + cctx->outBuffSize + cctx->inBuffSize + + ZSTDMT_sizeof_CCtx(cctx->mtctx); } +size_t ZSTD_sizeof_CStream(const ZSTD_CStream* zcs) +{ + return ZSTD_sizeof_CCtx(zcs); /* same object */ +} + +/* private API call, for dictBuilder only */ +const seqStore_t* ZSTD_getSeqStore(const ZSTD_CCtx* ctx) { return &(ctx->seqStore); } + +static ZSTD_parameters ZSTD_getParamsFromCCtx(const ZSTD_CCtx* cctx) { return cctx->appliedParams; } + +/* older variant; will be deprecated */ size_t ZSTD_setCCtxParameter(ZSTD_CCtx* cctx, ZSTD_CCtxParameter param, unsigned value) { switch(param) { case ZSTD_p_forceWindow : cctx->forceWindow = value>0; cctx->loadedDictEnd = 0; return 0; - case ZSTD_p_forceRawDict : cctx->forceRawDict = value>0; return 0; + ZSTD_STATIC_ASSERT(ZSTD_dm_auto==0); + ZSTD_STATIC_ASSERT(ZSTD_dm_rawContent==1); + case ZSTD_p_forceRawDict : cctx->dictMode = (ZSTD_dictMode_e)(value>0); return 0; default: return ERROR(parameter_unknown); } } -const seqStore_t* ZSTD_getSeqStore(const ZSTD_CCtx* ctx) /* hidden interface */ + +#define ZSTD_CLEVEL_CUSTOM 999 +static void ZSTD_cLevelToCParams(ZSTD_CCtx* cctx) { - return &(ctx->seqStore); + if (cctx->compressionLevel==ZSTD_CLEVEL_CUSTOM) return; + cctx->requestedParams.cParams = ZSTD_getCParams(cctx->compressionLevel, + cctx->pledgedSrcSizePlusOne-1, 0); + cctx->compressionLevel = ZSTD_CLEVEL_CUSTOM; } -static ZSTD_parameters ZSTD_getParamsFromCCtx(const ZSTD_CCtx* cctx) +#define CLAMPCHECK(val,min,max) { \ + if (((val)<(min)) | ((val)>(max))) { \ + return ERROR(compressionParameter_outOfBound); \ +} } + +size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned value) { - return cctx->params; + if (cctx->streamStage != zcss_init) return ERROR(stage_wrong); + + switch(param) + { + case ZSTD_p_compressionLevel : + if ((int)value > ZSTD_maxCLevel()) value = ZSTD_maxCLevel(); /* cap max compression level */ + if (value == 0) return 0; /* special value : 0 means "don't change anything" */ + if (cctx->cdict) return ERROR(stage_wrong); + cctx->compressionLevel = value; + return 0; + + case ZSTD_p_windowLog : + DEBUGLOG(5, "setting ZSTD_p_windowLog = %u (cdict:%u)", + value, (cctx->cdict!=NULL)); + if (value == 0) return 0; /* special value : 0 means "don't change anything" */ + if (cctx->cdict) return ERROR(stage_wrong); + CLAMPCHECK(value, ZSTD_WINDOWLOG_MIN, ZSTD_WINDOWLOG_MAX); + ZSTD_cLevelToCParams(cctx); + cctx->requestedParams.cParams.windowLog = value; + return 0; + + case ZSTD_p_hashLog : + if (value == 0) return 0; /* special value : 0 means "don't change anything" */ + if (cctx->cdict) return ERROR(stage_wrong); + CLAMPCHECK(value, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX); + ZSTD_cLevelToCParams(cctx); + cctx->requestedParams.cParams.hashLog = value; + return 0; + + case ZSTD_p_chainLog : + if (value == 0) return 0; /* special value : 0 means "don't change anything" */ + if (cctx->cdict) return ERROR(stage_wrong); + CLAMPCHECK(value, ZSTD_CHAINLOG_MIN, ZSTD_CHAINLOG_MAX); + ZSTD_cLevelToCParams(cctx); + cctx->requestedParams.cParams.chainLog = value; + return 0; + + case ZSTD_p_searchLog : + if (value == 0) return 0; /* special value : 0 means "don't change anything" */ + if (cctx->cdict) return ERROR(stage_wrong); + CLAMPCHECK(value, ZSTD_SEARCHLOG_MIN, ZSTD_SEARCHLOG_MAX); + ZSTD_cLevelToCParams(cctx); + cctx->requestedParams.cParams.searchLog = value; + return 0; + + case ZSTD_p_minMatch : + if (value == 0) return 0; /* special value : 0 means "don't change anything" */ + if (cctx->cdict) return ERROR(stage_wrong); + CLAMPCHECK(value, ZSTD_SEARCHLENGTH_MIN, ZSTD_SEARCHLENGTH_MAX); + ZSTD_cLevelToCParams(cctx); + cctx->requestedParams.cParams.searchLength = value; + return 0; + + case ZSTD_p_targetLength : + if (value == 0) return 0; /* special value : 0 means "don't change anything" */ + if (cctx->cdict) return ERROR(stage_wrong); + CLAMPCHECK(value, ZSTD_TARGETLENGTH_MIN, ZSTD_TARGETLENGTH_MAX); + ZSTD_cLevelToCParams(cctx); + cctx->requestedParams.cParams.targetLength = value; + return 0; + + case ZSTD_p_compressionStrategy : + if (value == 0) return 0; /* special value : 0 means "don't change anything" */ + if (cctx->cdict) return ERROR(stage_wrong); + CLAMPCHECK(value, (unsigned)ZSTD_fast, (unsigned)ZSTD_btultra); + ZSTD_cLevelToCParams(cctx); + cctx->requestedParams.cParams.strategy = (ZSTD_strategy)value; + return 0; + + case ZSTD_p_contentSizeFlag : + DEBUGLOG(5, "set content size flag = %u", (value>0)); + /* Content size written in frame header _when known_ (default:1) */ + cctx->requestedParams.fParams.contentSizeFlag = value>0; + return 0; + + case ZSTD_p_checksumFlag : + /* A 32-bits content checksum will be calculated and written at end of frame (default:0) */ + cctx->requestedParams.fParams.checksumFlag = value>0; + return 0; + + case ZSTD_p_dictIDFlag : /* When applicable, dictionary's dictID is provided in frame header (default:1) */ + DEBUGLOG(5, "set dictIDFlag = %u", (value>0)); + cctx->requestedParams.fParams.noDictIDFlag = (value==0); + return 0; + + /* Dictionary parameters */ + case ZSTD_p_dictMode : + if (cctx->cdict) return ERROR(stage_wrong); /* must be set before loading */ + /* restrict dictionary mode, to "rawContent" or "fullDict" only */ + ZSTD_STATIC_ASSERT((U32)ZSTD_dm_fullDict > (U32)ZSTD_dm_rawContent); + if (value > (unsigned)ZSTD_dm_fullDict) + return ERROR(compressionParameter_outOfBound); + cctx->dictMode = (ZSTD_dictMode_e)value; + return 0; + + case ZSTD_p_refDictContent : + if (cctx->cdict) return ERROR(stage_wrong); /* must be set before loading */ + /* dictionary content will be referenced, instead of copied */ + cctx->dictContentByRef = value>0; + return 0; + + case ZSTD_p_forceMaxWindow : /* Force back-references to remain < windowSize, + * even when referencing into Dictionary content + * default : 0 when using a CDict, 1 when using a Prefix */ + cctx->forceWindow = value>0; + cctx->loadedDictEnd = 0; + return 0; + + case ZSTD_p_nbThreads: + if (value==0) return 0; + DEBUGLOG(5, " setting nbThreads : %u", value); +#ifndef ZSTD_MULTITHREAD + if (value > 1) return ERROR(compressionParameter_unsupported); +#endif + if ((value>1) && (cctx->nbThreads != value)) { + if (cctx->staticSize) /* MT not compatible with static alloc */ + return ERROR(compressionParameter_unsupported); + ZSTDMT_freeCCtx(cctx->mtctx); + cctx->nbThreads = 1; + cctx->mtctx = ZSTDMT_createCCtx(value); + if (cctx->mtctx == NULL) return ERROR(memory_allocation); + } + cctx->nbThreads = value; + return 0; + + case ZSTD_p_jobSize: + if (cctx->nbThreads <= 1) return ERROR(compressionParameter_unsupported); + assert(cctx->mtctx != NULL); + return ZSTDMT_setMTCtxParameter(cctx->mtctx, ZSTDMT_p_sectionSize, value); + + case ZSTD_p_overlapSizeLog: + DEBUGLOG(5, " setting overlap with nbThreads == %u", cctx->nbThreads); + if (cctx->nbThreads <= 1) return ERROR(compressionParameter_unsupported); + assert(cctx->mtctx != NULL); + return ZSTDMT_setMTCtxParameter(cctx->mtctx, ZSTDMT_p_overlapSectionLog, value); + + default: return ERROR(parameter_unknown); + } } +ZSTDLIB_API size_t ZSTD_CCtx_setPledgedSrcSize(ZSTD_CCtx* cctx, unsigned long long pledgedSrcSize) +{ + DEBUGLOG(5, " setting pledgedSrcSize to %u", (U32)pledgedSrcSize); + if (cctx->streamStage != zcss_init) return ERROR(stage_wrong); + cctx->pledgedSrcSizePlusOne = pledgedSrcSize+1; + return 0; +} -/** ZSTD_checkParams() : - ensure param values remain within authorized range. +ZSTDLIB_API size_t ZSTD_CCtx_loadDictionary(ZSTD_CCtx* cctx, const void* dict, size_t dictSize) +{ + if (cctx->streamStage != zcss_init) return ERROR(stage_wrong); + if (cctx->staticSize) return ERROR(memory_allocation); /* no malloc for static CCtx */ + DEBUGLOG(5, "load dictionary of size %u", (U32)dictSize); + ZSTD_freeCDict(cctx->cdictLocal); /* in case one already exists */ + if (dict==NULL || dictSize==0) { /* no dictionary mode */ + cctx->cdictLocal = NULL; + cctx->cdict = NULL; + } else { + ZSTD_compressionParameters const cParams = + cctx->compressionLevel == ZSTD_CLEVEL_CUSTOM ? + cctx->requestedParams.cParams : + ZSTD_getCParams(cctx->compressionLevel, 0, dictSize); + cctx->cdictLocal = ZSTD_createCDict_advanced( + dict, dictSize, + cctx->dictContentByRef, cctx->dictMode, + cParams, cctx->customMem); + cctx->cdict = cctx->cdictLocal; + if (cctx->cdictLocal == NULL) + return ERROR(memory_allocation); + } + return 0; +} + +size_t ZSTD_CCtx_refCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict) +{ + if (cctx->streamStage != zcss_init) return ERROR(stage_wrong); + cctx->cdict = cdict; + cctx->prefix = NULL; /* exclusive */ + cctx->prefixSize = 0; + return 0; +} + +size_t ZSTD_CCtx_refPrefix(ZSTD_CCtx* cctx, const void* prefix, size_t prefixSize) +{ + if (cctx->streamStage != zcss_init) return ERROR(stage_wrong); + cctx->cdict = NULL; /* prefix discards any prior cdict */ + cctx->prefix = prefix; + cctx->prefixSize = prefixSize; + return 0; +} + +static void ZSTD_startNewCompression(ZSTD_CCtx* cctx) +{ + cctx->streamStage = zcss_init; + cctx->pledgedSrcSizePlusOne = 0; +} + +/*! ZSTD_CCtx_reset() : + * Also dumps dictionary */ +void ZSTD_CCtx_reset(ZSTD_CCtx* cctx) +{ + ZSTD_startNewCompression(cctx); + cctx->cdict = NULL; +} + +/** ZSTD_checkCParams() : + control CParam values remain within authorized range. @return : 0, or an error code if one value is beyond authorized range */ size_t ZSTD_checkCParams(ZSTD_compressionParameters cParams) { -# define CLAMPCHECK(val,min,max) { if ((val max)) return ERROR(compressionParameter_unsupported); } CLAMPCHECK(cParams.windowLog, ZSTD_WINDOWLOG_MIN, ZSTD_WINDOWLOG_MAX); CLAMPCHECK(cParams.chainLog, ZSTD_CHAINLOG_MIN, ZSTD_CHAINLOG_MAX); CLAMPCHECK(cParams.hashLog, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX); CLAMPCHECK(cParams.searchLog, ZSTD_SEARCHLOG_MIN, ZSTD_SEARCHLOG_MAX); CLAMPCHECK(cParams.searchLength, ZSTD_SEARCHLENGTH_MIN, ZSTD_SEARCHLENGTH_MAX); CLAMPCHECK(cParams.targetLength, ZSTD_TARGETLENGTH_MIN, ZSTD_TARGETLENGTH_MAX); - if ((U32)(cParams.strategy) > (U32)ZSTD_btopt2) return ERROR(compressionParameter_unsupported); + if ((U32)(cParams.strategy) > (U32)ZSTD_btultra) return ERROR(compressionParameter_unsupported); return 0; } +/** ZSTD_clampCParams() : + * make CParam values within valid range. + * @return : valid CParams */ +static ZSTD_compressionParameters ZSTD_clampCParams(ZSTD_compressionParameters cParams) +{ +# define CLAMP(val,min,max) { \ + if (val max) val=max; \ + } + CLAMP(cParams.windowLog, ZSTD_WINDOWLOG_MIN, ZSTD_WINDOWLOG_MAX); + CLAMP(cParams.chainLog, ZSTD_CHAINLOG_MIN, ZSTD_CHAINLOG_MAX); + CLAMP(cParams.hashLog, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX); + CLAMP(cParams.searchLog, ZSTD_SEARCHLOG_MIN, ZSTD_SEARCHLOG_MAX); + CLAMP(cParams.searchLength, ZSTD_SEARCHLENGTH_MIN, ZSTD_SEARCHLENGTH_MAX); + CLAMP(cParams.targetLength, ZSTD_TARGETLENGTH_MIN, ZSTD_TARGETLENGTH_MAX); + if ((U32)(cParams.strategy) > (U32)ZSTD_btultra) cParams.strategy = ZSTD_btultra; + return cParams; +} /** ZSTD_cycleLog() : * condition for correct operation : hashLog > 1 */ @@ -196,14 +505,15 @@ static U32 ZSTD_cycleLog(U32 hashLog, ZSTD_strategy strat) return hashLog - btScale; } -/** ZSTD_adjustCParams() : +/** ZSTD_adjustCParams_internal() : optimize `cPar` for a given input (`srcSize` and `dictSize`). mostly downsizing to reduce memory consumption and initialization. Both `srcSize` and `dictSize` are optional (use 0 if unknown), but if both are 0, no optimization can be done. Note : cPar is considered validated at this stage. Use ZSTD_checkParams() to ensure that. */ -ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, unsigned long long srcSize, size_t dictSize) +ZSTD_compressionParameters ZSTD_adjustCParams_internal(ZSTD_compressionParameters cPar, unsigned long long srcSize, size_t dictSize) { + assert(ZSTD_checkCParams(cPar)==0); if (srcSize+dictSize == 0) return cPar; /* no size information available : no adjustment */ /* resize params, to use less memory when necessary */ @@ -223,10 +533,16 @@ ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, u return cPar; } - -size_t ZSTD_estimateCCtxSize(ZSTD_compressionParameters cParams) +ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, unsigned long long srcSize, size_t dictSize) { - size_t const blockSize = MIN(ZSTD_BLOCKSIZE_ABSOLUTEMAX, (size_t)1 << cParams.windowLog); + cPar = ZSTD_clampCParams(cPar); + return ZSTD_adjustCParams_internal(cPar, srcSize, dictSize); +} + + +size_t ZSTD_estimateCCtxSize_advanced(ZSTD_compressionParameters cParams) +{ + size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, (size_t)1 << cParams.windowLog); U32 const divider = (cParams.searchLength==3) ? 3 : 4; size_t const maxNbSeq = blockSize / divider; size_t const tokenSpace = blockSize + 11*maxNbSeq; @@ -240,31 +556,64 @@ size_t ZSTD_estimateCCtxSize(ZSTD_compressionParameters cParams) + entropyScratchSpace_size; size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32); - size_t const optSpace = ((MaxML+1) + (MaxLL+1) + (MaxOff+1) + (1< nextSrc - cctx->base); - cctx->params = params; - cctx->frameContentSize = frameContentSize; + DEBUGLOG(5, "continue mode"); + cctx->appliedParams = params; + cctx->pledgedSrcSizePlusOne = pledgedSrcSize+1; cctx->consumedSrcSize = 0; + if (pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN) + cctx->appliedParams.fParams.contentSizeFlag = 0; + DEBUGLOG(5, "pledged content size : %u ; flag : %u", + (U32)pledgedSrcSize, cctx->appliedParams.fParams.contentSizeFlag); cctx->lowLimit = end; cctx->dictLimit = end; cctx->nextToUpdate = end+1; @@ -277,30 +626,39 @@ static size_t ZSTD_continueCCtx(ZSTD_CCtx* cctx, ZSTD_parameters params, U64 fra return 0; } -typedef enum { ZSTDcrp_continue, ZSTDcrp_noMemset, ZSTDcrp_fullReset } ZSTD_compResetPolicy_e; +typedef enum { ZSTDcrp_continue, ZSTDcrp_noMemset } ZSTD_compResetPolicy_e; +typedef enum { ZSTDb_not_buffered, ZSTDb_buffered } ZSTD_buffered_policy_e; /*! ZSTD_resetCCtx_internal() : - note : `params` must be validated */ -static size_t ZSTD_resetCCtx_internal (ZSTD_CCtx* zc, - ZSTD_parameters params, U64 frameContentSize, - ZSTD_compResetPolicy_e const crp) + note : `params` are assumed fully validated at this stage */ +static size_t ZSTD_resetCCtx_internal(ZSTD_CCtx* zc, + ZSTD_parameters params, U64 pledgedSrcSize, + ZSTD_compResetPolicy_e const crp, + ZSTD_buffered_policy_e const zbuff) { - if (crp == ZSTDcrp_continue) - if (ZSTD_equivalentParams(params, zc->params)) { + assert(!ZSTD_isError(ZSTD_checkCParams(params.cParams))); + + if (crp == ZSTDcrp_continue) { + if (ZSTD_equivalentParams(params.cParams, zc->appliedParams.cParams)) { + DEBUGLOG(5, "ZSTD_equivalentParams()==1"); zc->fseCTables_ready = 0; zc->hufCTable_repeatMode = HUF_repeat_none; - return ZSTD_continueCCtx(zc, params, frameContentSize); - } + return ZSTD_continueCCtx(zc, params, pledgedSrcSize); + } } - { size_t const blockSize = MIN(ZSTD_BLOCKSIZE_ABSOLUTEMAX, (size_t)1 << params.cParams.windowLog); + { size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, (size_t)1 << params.cParams.windowLog); U32 const divider = (params.cParams.searchLength==3) ? 3 : 4; size_t const maxNbSeq = blockSize / divider; size_t const tokenSpace = blockSize + 11*maxNbSeq; - size_t const chainSize = (params.cParams.strategy == ZSTD_fast) ? 0 : (1 << params.cParams.chainLog); + size_t const chainSize = (params.cParams.strategy == ZSTD_fast) ? + 0 : (1 << params.cParams.chainLog); size_t const hSize = ((size_t)1) << params.cParams.hashLog; - U32 const hashLog3 = (params.cParams.searchLength>3) ? 0 : MIN(ZSTD_HASHLOG3_MAX, params.cParams.windowLog); + U32 const hashLog3 = (params.cParams.searchLength>3) ? + 0 : MIN(ZSTD_HASHLOG3_MAX, params.cParams.windowLog); size_t const h3Size = ((size_t)1) << hashLog3; size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32); + size_t const buffOutSize = (zbuff==ZSTDb_buffered) ? ZSTD_compressBound(blockSize)+1 : 0; + size_t const buffInSize = (zbuff==ZSTDb_buffered) ? ((size_t)1 << params.cParams.windowLog) + blockSize : 0; void* ptr; /* Check if workSpace is large enough, alloc a new one if needed */ @@ -309,9 +667,20 @@ static size_t ZSTD_resetCCtx_internal (ZSTD_CCtx* zc, + entropyScratchSpace_size; size_t const optPotentialSpace = ((MaxML+1) + (MaxLL+1) + (MaxOff+1) + (1< workSpaceSize < neededSpace) { + size_t const optSpace = ( (params.cParams.strategy == ZSTD_btopt) + || (params.cParams.strategy == ZSTD_btultra)) ? + optPotentialSpace : 0; + size_t const bufferSpace = buffInSize + buffOutSize; + size_t const neededSpace = entropySpace + optSpace + tableSpace + + tokenSpace + bufferSpace; + + if (zc->workSpaceSize < neededSpace) { /* too small : resize /*/ + DEBUGLOG(5, "Need to update workSpaceSize from %uK to %uK \n", + (unsigned)zc->workSpaceSize>>10, + (unsigned)neededSpace>>10); + /* static cctx : no resize, error out */ + if (zc->staticSize) return ERROR(memory_allocation); + zc->workSpaceSize = 0; ZSTD_free(zc->workSpace, zc->customMem); zc->workSpace = ZSTD_malloc(neededSpace, zc->customMem); @@ -333,10 +702,14 @@ static size_t ZSTD_resetCCtx_internal (ZSTD_CCtx* zc, } } /* init params */ - zc->params = params; - zc->blockSize = blockSize; - zc->frameContentSize = frameContentSize; + zc->appliedParams = params; + zc->pledgedSrcSizePlusOne = pledgedSrcSize+1; zc->consumedSrcSize = 0; + if (pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN) + zc->appliedParams.fParams.contentSizeFlag = 0; + DEBUGLOG(5, "pledged content size : %u ; flag : %u", + (U32)pledgedSrcSize, zc->appliedParams.fParams.contentSizeFlag); + zc->blockSize = blockSize; XXH64_reset(&zc->xxhState, 0); zc->stage = ZSTDcs_init; @@ -363,7 +736,8 @@ static size_t ZSTD_resetCCtx_internal (ZSTD_CCtx* zc, ptr = (char*)zc->entropyScratchSpace + entropyScratchSpace_size; /* opt parser space */ - if ((params.cParams.strategy == ZSTD_btopt) || (params.cParams.strategy == ZSTD_btopt2)) { + if ((params.cParams.strategy == ZSTD_btopt) || (params.cParams.strategy == ZSTD_btultra)) { + DEBUGLOG(5, "reserving optimal parser space"); assert(((size_t)ptr & 3) == 0); /* ensure ptr is properly aligned */ zc->seqStore.litFreq = (U32*)ptr; zc->seqStore.litLengthFreq = zc->seqStore.litFreq + (1< seqStore.mlCode = zc->seqStore.llCode + maxNbSeq; zc->seqStore.ofCode = zc->seqStore.mlCode + maxNbSeq; zc->seqStore.litStart = zc->seqStore.ofCode + maxNbSeq; + ptr = zc->seqStore.litStart + blockSize; + + /* buffers */ + zc->inBuffSize = buffInSize; + zc->inBuff = (char*)ptr; + zc->outBuffSize = buffOutSize; + zc->outBuff = zc->inBuff + buffInSize; return 0; } @@ -411,21 +792,25 @@ void ZSTD_invalidateRepCodes(ZSTD_CCtx* cctx) { * Only works during stage ZSTDcs_init (i.e. after creation, but before first call to ZSTD_compressContinue()). * pledgedSrcSize=0 means "empty" if fParams.contentSizeFlag=1 * @return : 0, or an error code */ -size_t ZSTD_copyCCtx_internal(ZSTD_CCtx* dstCCtx, const ZSTD_CCtx* srcCCtx, - ZSTD_frameParameters fParams, unsigned long long pledgedSrcSize) +static size_t ZSTD_copyCCtx_internal(ZSTD_CCtx* dstCCtx, + const ZSTD_CCtx* srcCCtx, + ZSTD_frameParameters fParams, + unsigned long long pledgedSrcSize, + ZSTD_buffered_policy_e zbuff) { + DEBUGLOG(5, "ZSTD_copyCCtx_internal"); if (srcCCtx->stage!=ZSTDcs_init) return ERROR(stage_wrong); memcpy(&dstCCtx->customMem, &srcCCtx->customMem, sizeof(ZSTD_customMem)); - { ZSTD_parameters params = srcCCtx->params; + { ZSTD_parameters params = srcCCtx->appliedParams; params.fParams = fParams; - DEBUGLOG(5, "ZSTD_resetCCtx_internal : dictIDFlag : %u \n", !fParams.noDictIDFlag); - ZSTD_resetCCtx_internal(dstCCtx, params, pledgedSrcSize, ZSTDcrp_noMemset); + ZSTD_resetCCtx_internal(dstCCtx, params, pledgedSrcSize, + ZSTDcrp_noMemset, zbuff); } /* copy tables */ - { size_t const chainSize = (srcCCtx->params.cParams.strategy == ZSTD_fast) ? 0 : (1 << srcCCtx->params.cParams.chainLog); - size_t const hSize = (size_t)1 << srcCCtx->params.cParams.hashLog; + { size_t const chainSize = (srcCCtx->appliedParams.cParams.strategy == ZSTD_fast) ? 0 : (1 << srcCCtx->appliedParams.cParams.chainLog); + size_t const hSize = (size_t)1 << srcCCtx->appliedParams.cParams.hashLog; size_t const h3Size = (size_t)1 << srcCCtx->hashLog3; size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32); assert((U32*)dstCCtx->chainTable == (U32*)dstCCtx->hashTable + hSize); /* chainTable must follow hashTable */ @@ -467,9 +852,11 @@ size_t ZSTD_copyCCtx_internal(ZSTD_CCtx* dstCCtx, const ZSTD_CCtx* srcCCtx, size_t ZSTD_copyCCtx(ZSTD_CCtx* dstCCtx, const ZSTD_CCtx* srcCCtx, unsigned long long pledgedSrcSize) { ZSTD_frameParameters fParams = { 1 /*content*/, 0 /*checksum*/, 0 /*noDictID*/ }; + ZSTD_buffered_policy_e const zbuff = (ZSTD_buffered_policy_e)(srcCCtx->inBuffSize>0); + ZSTD_STATIC_ASSERT((U32)ZSTDb_buffered==1); fParams.contentSizeFlag = pledgedSrcSize>0; - return ZSTD_copyCCtx_internal(dstCCtx, srcCCtx, fParams, pledgedSrcSize); + return ZSTD_copyCCtx_internal(dstCCtx, srcCCtx, fParams, pledgedSrcSize, zbuff); } @@ -488,10 +875,10 @@ static void ZSTD_reduceTable (U32* const table, U32 const size, U32 const reduce * rescale all indexes to avoid future overflow (indexes are U32) */ static void ZSTD_reduceIndex (ZSTD_CCtx* zc, const U32 reducerValue) { - { U32 const hSize = 1 << zc->params.cParams.hashLog; + { U32 const hSize = 1 << zc->appliedParams.cParams.hashLog; ZSTD_reduceTable(zc->hashTable, hSize, reducerValue); } - { U32 const chainSize = (zc->params.cParams.strategy == ZSTD_fast) ? 0 : (1 << zc->params.cParams.chainLog); + { U32 const chainSize = (zc->appliedParams.cParams.strategy == ZSTD_fast) ? 0 : (1 << zc->appliedParams.cParams.chainLog); ZSTD_reduceTable(zc->chainTable, chainSize, reducerValue); } { U32 const h3Size = (zc->hashLog3) ? 1 << zc->hashLog3 : 0; @@ -529,10 +916,11 @@ static size_t ZSTD_noCompressLiterals (void* dst, size_t dstCapacity, const void case 2: /* 2 - 2 - 12 */ MEM_writeLE16(ostart, (U16)((U32)set_basic + (1<<2) + (srcSize<<4))); break; - default: /*note : should not be necessary : flSize is within {1,2,3} */ case 3: /* 2 - 2 - 20 */ MEM_writeLE32(ostart, (U32)((U32)set_basic + (3<<2) + (srcSize<<4))); break; + default: /* not necessary : flSize is {1,2,3} */ + assert(0); } memcpy(ostart + flSize, src, srcSize); @@ -554,10 +942,11 @@ static size_t ZSTD_compressRleLiteralsBlock (void* dst, size_t dstCapacity, cons case 2: /* 2 - 2 - 12 */ MEM_writeLE16(ostart, (U16)((U32)set_rle + (1<<2) + (srcSize<<4))); break; - default: /*note : should not be necessary : flSize is necessarily within {1,2,3} */ case 3: /* 2 - 2 - 20 */ MEM_writeLE32(ostart, (U32)((U32)set_rle + (3<<2) + (srcSize<<4))); break; + default: /* not necessary : flSize is {1,2,3} */ + assert(0); } ostart[flSize] = *(const BYTE*)src; @@ -587,7 +976,7 @@ static size_t ZSTD_compressLiterals (ZSTD_CCtx* zc, if (dstCapacity < lhSize+1) return ERROR(dstSize_tooSmall); /* not enough space for compression */ { HUF_repeat repeat = zc->hufCTable_repeatMode; - int const preferRepeat = zc->params.cParams.strategy < ZSTD_lazy ? srcSize <= 1024 : 0; + int const preferRepeat = zc->appliedParams.cParams.strategy < ZSTD_lazy ? srcSize <= 1024 : 0; if (repeat == HUF_repeat_valid && lhSize == 3) singleStream = 1; cLitSize = singleStream ? HUF_compress1X_repeat(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11, zc->entropyScratchSpace, entropyScratchSpace_size, zc->hufCTable, &repeat, preferRepeat) @@ -619,13 +1008,14 @@ static size_t ZSTD_compressLiterals (ZSTD_CCtx* zc, MEM_writeLE32(ostart, lhc); break; } - default: /* should not be necessary, lhSize is only {3,4,5} */ case 5: /* 2 - 2 - 18 - 18 */ { U32 const lhc = hType + (3 << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<22); MEM_writeLE32(ostart, lhc); ostart[4] = (BYTE)(cLitSize >> 10); break; } + default: /* not possible : lhSize is {3,4,5} */ + assert(0); } return lhSize+cLitSize; } @@ -676,7 +1066,7 @@ MEM_STATIC size_t ZSTD_compressSequences (ZSTD_CCtx* zc, void* dst, size_t dstCapacity, size_t srcSize) { - const int longOffsets = zc->params.cParams.windowLog > STREAM_ACCUMULATOR_MIN; + const int longOffsets = zc->appliedParams.cParams.windowLog > STREAM_ACCUMULATOR_MIN; const seqStore_t* seqStorePtr = &(zc->seqStore); U32 count[MaxSeq+1]; S16 norm[MaxSeq+1]; @@ -881,12 +1271,6 @@ _check_compressibility: return op - ostart; } -#if 0 /* for debug */ -# define STORESEQ_DEBUG -#include /* fprintf */ -U32 g_startDebug = 0; -const BYTE* g_start = NULL; -#endif /*! ZSTD_storeSeq() : Store a sequence (literal length, literals, offset code and match length code) into seqStore_t. @@ -895,16 +1279,16 @@ const BYTE* g_start = NULL; */ MEM_STATIC void ZSTD_storeSeq(seqStore_t* seqStorePtr, size_t litLength, const void* literals, U32 offsetCode, size_t matchCode) { -#ifdef STORESEQ_DEBUG - if (g_startDebug) { - const U32 pos = (U32)((const BYTE*)literals - g_start); - if (g_start==NULL) g_start = (const BYTE*)literals; - if ((pos > 1895000) && (pos < 1895300)) - DEBUGLOG(5, "Cpos %6u :%5u literals & match %3u bytes at distance %6u \n", - pos, (U32)litLength, (U32)matchCode+MINMATCH, (U32)offsetCode); - } +#if defined(ZSTD_DEBUG) && (ZSTD_DEBUG >= 6) + static const BYTE* g_start = NULL; + U32 const pos = (U32)((const BYTE*)literals - g_start); + if (g_start==NULL) g_start = (const BYTE*)literals; + if ((pos > 0) && (pos < 1000000000)) + DEBUGLOG(6, "Cpos %6u :%5u literals & match %3u bytes at distance %6u", + pos, (U32)litLength, (U32)matchCode+MINMATCH, (U32)offsetCode); #endif /* copy Literals */ + assert(seqStorePtr->lit + litLength <= seqStorePtr->litStart + 128 KB); ZSTD_wildcopy(seqStorePtr->lit, literals, litLength); seqStorePtr->lit += litLength; @@ -1078,7 +1462,7 @@ static size_t ZSTD_hashPtr(const void* p, U32 hBits, U32 mls) static void ZSTD_fillHashTable (ZSTD_CCtx* zc, const void* end, const U32 mls) { U32* const hashTable = zc->hashTable; - U32 const hBits = zc->params.cParams.hashLog; + U32 const hBits = zc->appliedParams.cParams.hashLog; const BYTE* const base = zc->base; const BYTE* ip = base + zc->nextToUpdate; const BYTE* const iend = ((const BYTE*)end) - HASH_READ_SIZE; @@ -1097,7 +1481,7 @@ void ZSTD_compressBlock_fast_generic(ZSTD_CCtx* cctx, const U32 mls) { U32* const hashTable = cctx->hashTable; - U32 const hBits = cctx->params.cParams.hashLog; + U32 const hBits = cctx->appliedParams.cParams.hashLog; seqStore_t* seqStorePtr = &(cctx->seqStore); const BYTE* const base = cctx->base; const BYTE* const istart = (const BYTE*)src; @@ -1182,7 +1566,7 @@ void ZSTD_compressBlock_fast_generic(ZSTD_CCtx* cctx, static void ZSTD_compressBlock_fast(ZSTD_CCtx* ctx, const void* src, size_t srcSize) { - const U32 mls = ctx->params.cParams.searchLength; + const U32 mls = ctx->appliedParams.cParams.searchLength; switch(mls) { default: /* includes case 3 */ @@ -1203,7 +1587,7 @@ static void ZSTD_compressBlock_fast_extDict_generic(ZSTD_CCtx* ctx, const U32 mls) { U32* hashTable = ctx->hashTable; - const U32 hBits = ctx->params.cParams.hashLog; + const U32 hBits = ctx->appliedParams.cParams.hashLog; seqStore_t* seqStorePtr = &(ctx->seqStore); const BYTE* const base = ctx->base; const BYTE* const dictBase = ctx->dictBase; @@ -1296,7 +1680,7 @@ static void ZSTD_compressBlock_fast_extDict_generic(ZSTD_CCtx* ctx, static void ZSTD_compressBlock_fast_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize) { - U32 const mls = ctx->params.cParams.searchLength; + U32 const mls = ctx->appliedParams.cParams.searchLength; switch(mls) { default: /* includes case 3 */ @@ -1318,9 +1702,9 @@ static void ZSTD_compressBlock_fast_extDict(ZSTD_CCtx* ctx, static void ZSTD_fillDoubleHashTable (ZSTD_CCtx* cctx, const void* end, const U32 mls) { U32* const hashLarge = cctx->hashTable; - U32 const hBitsL = cctx->params.cParams.hashLog; + U32 const hBitsL = cctx->appliedParams.cParams.hashLog; U32* const hashSmall = cctx->chainTable; - U32 const hBitsS = cctx->params.cParams.chainLog; + U32 const hBitsS = cctx->appliedParams.cParams.chainLog; const BYTE* const base = cctx->base; const BYTE* ip = base + cctx->nextToUpdate; const BYTE* const iend = ((const BYTE*)end) - HASH_READ_SIZE; @@ -1340,9 +1724,9 @@ void ZSTD_compressBlock_doubleFast_generic(ZSTD_CCtx* cctx, const U32 mls) { U32* const hashLong = cctx->hashTable; - const U32 hBitsL = cctx->params.cParams.hashLog; + const U32 hBitsL = cctx->appliedParams.cParams.hashLog; U32* const hashSmall = cctx->chainTable; - const U32 hBitsS = cctx->params.cParams.chainLog; + const U32 hBitsS = cctx->appliedParams.cParams.chainLog; seqStore_t* seqStorePtr = &(cctx->seqStore); const BYTE* const base = cctx->base; const BYTE* const istart = (const BYTE*)src; @@ -1452,7 +1836,7 @@ void ZSTD_compressBlock_doubleFast_generic(ZSTD_CCtx* cctx, static void ZSTD_compressBlock_doubleFast(ZSTD_CCtx* ctx, const void* src, size_t srcSize) { - const U32 mls = ctx->params.cParams.searchLength; + const U32 mls = ctx->appliedParams.cParams.searchLength; switch(mls) { default: /* includes case 3 */ @@ -1473,9 +1857,9 @@ static void ZSTD_compressBlock_doubleFast_extDict_generic(ZSTD_CCtx* ctx, const U32 mls) { U32* const hashLong = ctx->hashTable; - U32 const hBitsL = ctx->params.cParams.hashLog; + U32 const hBitsL = ctx->appliedParams.cParams.hashLog; U32* const hashSmall = ctx->chainTable; - U32 const hBitsS = ctx->params.cParams.chainLog; + U32 const hBitsS = ctx->appliedParams.cParams.chainLog; seqStore_t* seqStorePtr = &(ctx->seqStore); const BYTE* const base = ctx->base; const BYTE* const dictBase = ctx->dictBase; @@ -1602,7 +1986,7 @@ static void ZSTD_compressBlock_doubleFast_extDict_generic(ZSTD_CCtx* ctx, static void ZSTD_compressBlock_doubleFast_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize) { - U32 const mls = ctx->params.cParams.searchLength; + U32 const mls = ctx->appliedParams.cParams.searchLength; switch(mls) { default: /* includes case 3 */ @@ -1628,10 +2012,10 @@ static U32 ZSTD_insertBt1(ZSTD_CCtx* zc, const BYTE* const ip, const U32 mls, co U32 extDict) { U32* const hashTable = zc->hashTable; - U32 const hashLog = zc->params.cParams.hashLog; + U32 const hashLog = zc->appliedParams.cParams.hashLog; size_t const h = ZSTD_hashPtr(ip, hashLog, mls); U32* const bt = zc->chainTable; - U32 const btLog = zc->params.cParams.chainLog - 1; + U32 const btLog = zc->appliedParams.cParams.chainLog - 1; U32 const btMask = (1 << btLog) - 1; U32 matchIndex = hashTable[h]; size_t commonLengthSmaller=0, commonLengthLarger=0; @@ -1733,10 +2117,10 @@ static size_t ZSTD_insertBtAndFindBestMatch ( U32 extDict) { U32* const hashTable = zc->hashTable; - U32 const hashLog = zc->params.cParams.hashLog; + U32 const hashLog = zc->appliedParams.cParams.hashLog; size_t const h = ZSTD_hashPtr(ip, hashLog, mls); U32* const bt = zc->chainTable; - U32 const btLog = zc->params.cParams.chainLog - 1; + U32 const btLog = zc->appliedParams.cParams.chainLog - 1; U32 const btMask = (1 << btLog) - 1; U32 matchIndex = hashTable[h]; size_t commonLengthSmaller=0, commonLengthLarger=0; @@ -1896,9 +2280,9 @@ FORCE_INLINE U32 ZSTD_insertAndFindFirstIndex (ZSTD_CCtx* zc, const BYTE* ip, U32 mls) { U32* const hashTable = zc->hashTable; - const U32 hashLog = zc->params.cParams.hashLog; + const U32 hashLog = zc->appliedParams.cParams.hashLog; U32* const chainTable = zc->chainTable; - const U32 chainMask = (1 << zc->params.cParams.chainLog) - 1; + const U32 chainMask = (1 << zc->appliedParams.cParams.chainLog) - 1; const BYTE* const base = zc->base; const U32 target = (U32)(ip - base); U32 idx = zc->nextToUpdate; @@ -1915,8 +2299,8 @@ U32 ZSTD_insertAndFindFirstIndex (ZSTD_CCtx* zc, const BYTE* ip, U32 mls) } - -FORCE_INLINE /* inlining is important to hardwire a hot branch (template emulation) */ +/* inlining is important to hardwire a hot branch (template emulation) */ +FORCE_INLINE size_t ZSTD_HcFindBestMatch_generic ( ZSTD_CCtx* zc, /* Index table will be updated */ const BYTE* const ip, const BYTE* const iLimit, @@ -1924,7 +2308,7 @@ size_t ZSTD_HcFindBestMatch_generic ( const U32 maxNbAttempts, const U32 mls, const U32 extDict) { U32* const chainTable = zc->chainTable; - const U32 chainSize = (1 << zc->params.cParams.chainLog); + const U32 chainSize = (1 << zc->appliedParams.cParams.chainLog); const U32 chainMask = chainSize-1; const BYTE* const base = zc->base; const BYTE* const dictBase = zc->dictBase; @@ -2018,8 +2402,8 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx, const BYTE* const ilimit = iend - 8; const BYTE* const base = ctx->base + ctx->dictLimit; - U32 const maxSearches = 1 << ctx->params.cParams.searchLog; - U32 const mls = ctx->params.cParams.searchLength; + U32 const maxSearches = 1 << ctx->appliedParams.cParams.searchLog; + U32 const mls = ctx->appliedParams.cParams.searchLength; typedef size_t (*searchMax_f)(ZSTD_CCtx* zc, const BYTE* ip, const BYTE* iLimit, size_t* offsetPtr, @@ -2101,15 +2485,19 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx, break; /* nothing found : store previous solution */ } + /* NOTE: + * start[-offset+ZSTD_REP_MOVE-1] is undefined behavior. + * (-offset+ZSTD_REP_MOVE-1) is unsigned, and is added to start, which + * overflows the pointer, which is undefined behavior. + */ /* catch up */ if (offset) { while ( (start > anchor) && (start > base+offset-ZSTD_REP_MOVE) - && (start[-1] == start[-1-offset+ZSTD_REP_MOVE]) ) /* only search for offset within prefix */ + && (start[-1] == (start-offset+ZSTD_REP_MOVE)[-1]) ) /* only search for offset within prefix */ { start--; matchLength++; } offset_2 = offset_1; offset_1 = (U32)(offset - ZSTD_REP_MOVE); } - /* store sequence */ _storeSequence: { size_t const litLength = start - anchor; @@ -2182,8 +2570,8 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx, const BYTE* const dictEnd = dictBase + dictLimit; const BYTE* const dictStart = dictBase + ctx->lowLimit; - const U32 maxSearches = 1 << ctx->params.cParams.searchLog; - const U32 mls = ctx->params.cParams.searchLength; + const U32 maxSearches = 1 << ctx->appliedParams.cParams.searchLog; + const U32 mls = ctx->appliedParams.cParams.searchLength; typedef size_t (*searchMax_f)(ZSTD_CCtx* zc, const BYTE* ip, const BYTE* iLimit, size_t* offsetPtr, @@ -2370,7 +2758,7 @@ static void ZSTD_compressBlock_btopt(ZSTD_CCtx* ctx, const void* src, size_t src #endif } -static void ZSTD_compressBlock_btopt2(ZSTD_CCtx* ctx, const void* src, size_t srcSize) +static void ZSTD_compressBlock_btultra(ZSTD_CCtx* ctx, const void* src, size_t srcSize) { #ifdef ZSTD_OPT_H_91842398743 ZSTD_compressBlock_opt_generic(ctx, src, srcSize, 1); @@ -2390,7 +2778,7 @@ static void ZSTD_compressBlock_btopt_extDict(ZSTD_CCtx* ctx, const void* src, si #endif } -static void ZSTD_compressBlock_btopt2_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize) +static void ZSTD_compressBlock_btultra_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize) { #ifdef ZSTD_OPT_H_91842398743 ZSTD_compressBlock_opt_extDict_generic(ctx, src, srcSize, 1); @@ -2401,26 +2789,32 @@ static void ZSTD_compressBlock_btopt2_extDict(ZSTD_CCtx* ctx, const void* src, s } +/* ZSTD_selectBlockCompressor() : + * assumption : strat is a valid strategy */ typedef void (*ZSTD_blockCompressor) (ZSTD_CCtx* ctx, const void* src, size_t srcSize); - static ZSTD_blockCompressor ZSTD_selectBlockCompressor(ZSTD_strategy strat, int extDict) { - static const ZSTD_blockCompressor blockCompressor[2][8] = { - { ZSTD_compressBlock_fast, ZSTD_compressBlock_doubleFast, ZSTD_compressBlock_greedy, + static const ZSTD_blockCompressor blockCompressor[2][(unsigned)ZSTD_btultra+1] = { + { ZSTD_compressBlock_fast /* default for 0 */, + ZSTD_compressBlock_fast, ZSTD_compressBlock_doubleFast, ZSTD_compressBlock_greedy, ZSTD_compressBlock_lazy, ZSTD_compressBlock_lazy2, ZSTD_compressBlock_btlazy2, - ZSTD_compressBlock_btopt, ZSTD_compressBlock_btopt2 }, - { ZSTD_compressBlock_fast_extDict, ZSTD_compressBlock_doubleFast_extDict, ZSTD_compressBlock_greedy_extDict, + ZSTD_compressBlock_btopt, ZSTD_compressBlock_btultra }, + { ZSTD_compressBlock_fast_extDict /* default for 0 */, + ZSTD_compressBlock_fast_extDict, ZSTD_compressBlock_doubleFast_extDict, ZSTD_compressBlock_greedy_extDict, ZSTD_compressBlock_lazy_extDict,ZSTD_compressBlock_lazy2_extDict, ZSTD_compressBlock_btlazy2_extDict, - ZSTD_compressBlock_btopt_extDict, ZSTD_compressBlock_btopt2_extDict } + ZSTD_compressBlock_btopt_extDict, ZSTD_compressBlock_btultra_extDict } }; + ZSTD_STATIC_ASSERT((unsigned)ZSTD_fast == 1); + assert((U32)strat >= (U32)ZSTD_fast); + assert((U32)strat <= (U32)ZSTD_btultra); - return blockCompressor[extDict][(U32)strat]; + return blockCompressor[extDict!=0][(U32)strat]; } static size_t ZSTD_compressBlock_internal(ZSTD_CCtx* zc, void* dst, size_t dstCapacity, const void* src, size_t srcSize) { - ZSTD_blockCompressor const blockCompressor = ZSTD_selectBlockCompressor(zc->params.cParams.strategy, zc->lowLimit < zc->dictLimit); + ZSTD_blockCompressor const blockCompressor = ZSTD_selectBlockCompressor(zc->appliedParams.cParams.strategy, zc->lowLimit < zc->dictLimit); const BYTE* const base = zc->base; const BYTE* const istart = (const BYTE*)src; const U32 current = (U32)(istart-base); @@ -2433,14 +2827,14 @@ static size_t ZSTD_compressBlock_internal(ZSTD_CCtx* zc, void* dst, size_t dstCa } -/*! ZSTD_compress_generic() : +/*! ZSTD_compress_frameChunk() : * Compress a chunk of data into one or multiple blocks. * All blocks will be terminated, all input will be consumed. * Function will issue an error if there is not enough `dstCapacity` to hold the compressed content. * Frame is supposed already started (header already produced) * @return : compressed size, or an error code */ -static size_t ZSTD_compress_generic (ZSTD_CCtx* cctx, +static size_t ZSTD_compress_frameChunk (ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, U32 lastFrameChunk) @@ -2450,9 +2844,9 @@ static size_t ZSTD_compress_generic (ZSTD_CCtx* cctx, const BYTE* ip = (const BYTE*)src; BYTE* const ostart = (BYTE*)dst; BYTE* op = ostart; - U32 const maxDist = 1 << cctx->params.cParams.windowLog; + U32 const maxDist = 1 << cctx->appliedParams.cParams.windowLog; - if (cctx->params.fParams.checksumFlag && srcSize) + if (cctx->appliedParams.fParams.checksumFlag && srcSize) XXH64_update(&cctx->xxhState, src, srcSize); while (remaining) { @@ -2465,9 +2859,9 @@ static size_t ZSTD_compress_generic (ZSTD_CCtx* cctx, /* preemptive overflow correction */ if (cctx->lowLimit > (3U<<29)) { - U32 const cycleMask = (1 << ZSTD_cycleLog(cctx->params.cParams.hashLog, cctx->params.cParams.strategy)) - 1; + U32 const cycleMask = (1 << ZSTD_cycleLog(cctx->appliedParams.cParams.hashLog, cctx->appliedParams.cParams.strategy)) - 1; U32 const current = (U32)(ip - cctx->base); - U32 const newCurrent = (current & cycleMask) + (1 << cctx->params.cParams.windowLog); + U32 const newCurrent = (current & cycleMask) + (1 << cctx->appliedParams.cParams.windowLog); U32 const correction = current - newCurrent; ZSTD_STATIC_ASSERT(ZSTD_WINDOWLOG_MAX_64 <= 30); ZSTD_reduceIndex(cctx, correction); @@ -2522,22 +2916,20 @@ static size_t ZSTD_writeFrameHeader(void* dst, size_t dstCapacity, U32 const singleSegment = params.fParams.contentSizeFlag && (windowSize >= pledgedSrcSize); BYTE const windowLogByte = (BYTE)((params.cParams.windowLog - ZSTD_WINDOWLOG_ABSOLUTEMIN) << 3); U32 const fcsCode = params.fParams.contentSizeFlag ? - (pledgedSrcSize>=256) + (pledgedSrcSize>=65536+256) + (pledgedSrcSize>=0xFFFFFFFFU) : /* 0-3 */ - 0; + (pledgedSrcSize>=256) + (pledgedSrcSize>=65536+256) + (pledgedSrcSize>=0xFFFFFFFFU) : 0; /* 0-3 */ BYTE const frameHeaderDecriptionByte = (BYTE)(dictIDSizeCode + (checksumFlag<<2) + (singleSegment<<5) + (fcsCode<<6) ); size_t pos; if (dstCapacity < ZSTD_frameHeaderSize_max) return ERROR(dstSize_tooSmall); - DEBUGLOG(5, "ZSTD_writeFrameHeader : dictIDFlag : %u \n", !params.fParams.noDictIDFlag); - DEBUGLOG(5, "ZSTD_writeFrameHeader : dictID : %u \n", dictID); - DEBUGLOG(5, "ZSTD_writeFrameHeader : dictIDSizeCode : %u \n", dictIDSizeCode); + DEBUGLOG(5, "ZSTD_writeFrameHeader : dictIDFlag : %u ; dictID : %u ; dictIDSizeCode : %u", + !params.fParams.noDictIDFlag, dictID, dictIDSizeCode); MEM_writeLE32(dst, ZSTD_MAGICNUMBER); op[4] = frameHeaderDecriptionByte; pos=5; if (!singleSegment) op[pos++] = windowLogByte; switch(dictIDSizeCode) { - default: /* impossible */ + default: assert(0); /* impossible */ case 0 : break; case 1 : op[pos] = (BYTE)(dictID); pos++; break; case 2 : MEM_writeLE16(op+pos, (U16)dictID); pos+=2; break; @@ -2545,7 +2937,7 @@ static size_t ZSTD_writeFrameHeader(void* dst, size_t dstCapacity, } switch(fcsCode) { - default: /* impossible */ + default: assert(0); /* impossible */ case 0 : if (singleSegment) op[pos++] = (BYTE)(pledgedSrcSize); break; case 1 : MEM_writeLE16(op+pos, (U16)(pledgedSrcSize-256)); pos+=2; break; case 2 : MEM_writeLE32(op+pos, (U32)(pledgedSrcSize)); pos+=4; break; @@ -2563,10 +2955,13 @@ static size_t ZSTD_compressContinue_internal (ZSTD_CCtx* cctx, const BYTE* const ip = (const BYTE*) src; size_t fhSize = 0; + DEBUGLOG(5, "ZSTD_compressContinue_internal"); + DEBUGLOG(5, "stage: %u", cctx->stage); if (cctx->stage==ZSTDcs_created) return ERROR(stage_wrong); /* missing init (ZSTD_compressBegin) */ if (frame && (cctx->stage==ZSTDcs_init)) { - fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, cctx->params, cctx->frameContentSize, cctx->dictID); + fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, cctx->appliedParams, + cctx->pledgedSrcSizePlusOne-1, cctx->dictID); if (ZSTD_isError(fhSize)) return fhSize; dstCapacity -= fhSize; dst = (char*)dst + fhSize; @@ -2596,7 +2991,7 @@ static size_t ZSTD_compressContinue_internal (ZSTD_CCtx* cctx, if (srcSize) { size_t const cSize = frame ? - ZSTD_compress_generic (cctx, dst, dstCapacity, src, srcSize, lastFrameChunk) : + ZSTD_compress_frameChunk (cctx, dst, dstCapacity, src, srcSize, lastFrameChunk) : ZSTD_compressBlock_internal (cctx, dst, dstCapacity, src, srcSize); if (ZSTD_isError(cSize)) return cSize; cctx->consumedSrcSize += srcSize; @@ -2614,14 +3009,18 @@ size_t ZSTD_compressContinue (ZSTD_CCtx* cctx, } -size_t ZSTD_getBlockSizeMax(ZSTD_CCtx* cctx) +size_t ZSTD_getBlockSize(const ZSTD_CCtx* cctx) { - return MIN (ZSTD_BLOCKSIZE_ABSOLUTEMAX, 1 << cctx->params.cParams.windowLog); + U32 const cLevel = cctx->compressionLevel; + ZSTD_compressionParameters cParams = (cLevel == ZSTD_CLEVEL_CUSTOM) ? + cctx->appliedParams.cParams : + ZSTD_getCParams(cLevel, 0, 0); + return MIN (ZSTD_BLOCKSIZE_MAX, 1 << cParams.windowLog); } size_t ZSTD_compressBlock(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize) { - size_t const blockSizeMax = ZSTD_getBlockSizeMax(cctx); + size_t const blockSizeMax = ZSTD_getBlockSize(cctx); if (srcSize > blockSizeMax) return ERROR(srcSize_wrong); return ZSTD_compressContinue_internal(cctx, dst, dstCapacity, src, srcSize, 0 /* frame mode */, 0 /* last chunk */); } @@ -2645,32 +3044,32 @@ static size_t ZSTD_loadDictionaryContent(ZSTD_CCtx* zc, const void* src, size_t zc->nextSrc = iend; if (srcSize <= HASH_READ_SIZE) return 0; - switch(zc->params.cParams.strategy) + switch(zc->appliedParams.cParams.strategy) { case ZSTD_fast: - ZSTD_fillHashTable (zc, iend, zc->params.cParams.searchLength); + ZSTD_fillHashTable (zc, iend, zc->appliedParams.cParams.searchLength); break; case ZSTD_dfast: - ZSTD_fillDoubleHashTable (zc, iend, zc->params.cParams.searchLength); + ZSTD_fillDoubleHashTable (zc, iend, zc->appliedParams.cParams.searchLength); break; case ZSTD_greedy: case ZSTD_lazy: case ZSTD_lazy2: if (srcSize >= HASH_READ_SIZE) - ZSTD_insertAndFindFirstIndex(zc, iend-HASH_READ_SIZE, zc->params.cParams.searchLength); + ZSTD_insertAndFindFirstIndex(zc, iend-HASH_READ_SIZE, zc->appliedParams.cParams.searchLength); break; case ZSTD_btlazy2: case ZSTD_btopt: - case ZSTD_btopt2: + case ZSTD_btultra: if (srcSize >= HASH_READ_SIZE) - ZSTD_updateTree(zc, iend-HASH_READ_SIZE, iend, 1 << zc->params.cParams.searchLog, zc->params.cParams.searchLength); + ZSTD_updateTree(zc, iend-HASH_READ_SIZE, iend, 1 << zc->appliedParams.cParams.searchLog, zc->appliedParams.cParams.searchLength); break; default: - return ERROR(GENERIC); /* strategy doesn't exist; impossible */ + assert(0); /* not possible : not a valid strategy id */ } zc->nextToUpdate = (U32)(iend - zc->base); @@ -2710,7 +3109,7 @@ static size_t ZSTD_loadZstdDictionary(ZSTD_CCtx* cctx, const void* dict, size_t BYTE scratchBuffer[1< dictID = cctx->params.fParams.noDictIDFlag ? 0 : MEM_readLE32(dictPtr); + cctx->dictID = cctx->appliedParams.fParams.noDictIDFlag ? 0 : MEM_readLE32(dictPtr); dictPtr += 4; { size_t const hufHeaderSize = HUF_readCTable(cctx->hufCTable, 255, dictPtr, dictEnd-dictPtr); @@ -2781,28 +3180,56 @@ static size_t ZSTD_loadZstdDictionary(ZSTD_CCtx* cctx, const void* dict, size_t /** ZSTD_compress_insertDictionary() : * @return : 0, or an error code */ -static size_t ZSTD_compress_insertDictionary(ZSTD_CCtx* cctx, const void* dict, size_t dictSize) +static size_t ZSTD_compress_insertDictionary(ZSTD_CCtx* cctx, + const void* dict, size_t dictSize, + ZSTD_dictMode_e dictMode) { + DEBUGLOG(5, "ZSTD_compress_insertDictionary"); if ((dict==NULL) || (dictSize<=8)) return 0; - /* dict as pure content */ - if ((MEM_readLE32(dict) != ZSTD_DICT_MAGIC) || (cctx->forceRawDict)) + /* dict restricted modes */ + if (dictMode==ZSTD_dm_rawContent) return ZSTD_loadDictionaryContent(cctx, dict, dictSize); - /* dict as zstd dictionary */ + if (MEM_readLE32(dict) != ZSTD_MAGIC_DICTIONARY) { + if (dictMode == ZSTD_dm_auto) { + DEBUGLOG(5, "raw content dictionary detected"); + return ZSTD_loadDictionaryContent(cctx, dict, dictSize); + } + if (dictMode == ZSTD_dm_fullDict) + return ERROR(dictionary_wrong); + assert(0); /* impossible */ + } + + /* dict as full zstd dictionary */ return ZSTD_loadZstdDictionary(cctx, dict, dictSize); } /*! ZSTD_compressBegin_internal() : -* @return : 0, or an error code */ + * @return : 0, or an error code */ static size_t ZSTD_compressBegin_internal(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, - ZSTD_parameters params, U64 pledgedSrcSize) + ZSTD_dictMode_e dictMode, + const ZSTD_CDict* cdict, + ZSTD_parameters params, U64 pledgedSrcSize, + ZSTD_buffered_policy_e zbuff) { - ZSTD_compResetPolicy_e const crp = dictSize ? ZSTDcrp_fullReset : ZSTDcrp_continue; + DEBUGLOG(4, "ZSTD_compressBegin_internal"); + DEBUGLOG(4, "dict ? %s", dict ? "dict" : (cdict ? "cdict" : "none")); + DEBUGLOG(4, "dictMode : %u", (U32)dictMode); + /* params are supposed to be fully validated at this point */ assert(!ZSTD_isError(ZSTD_checkCParams(params.cParams))); - CHECK_F(ZSTD_resetCCtx_internal(cctx, params, pledgedSrcSize, crp)); - return ZSTD_compress_insertDictionary(cctx, dict, dictSize); + assert(!((dict) && (cdict))); /* either dict or cdict, not both */ + + if (cdict && cdict->dictContentSize>0) { + return ZSTD_copyCCtx_internal(cctx, cdict->refContext, + params.fParams, pledgedSrcSize, + zbuff); + } + + CHECK_F( ZSTD_resetCCtx_internal(cctx, params, pledgedSrcSize, + ZSTDcrp_continue, zbuff) ); + return ZSTD_compress_insertDictionary(cctx, dict, dictSize, dictMode); } @@ -2814,14 +3241,16 @@ size_t ZSTD_compressBegin_advanced(ZSTD_CCtx* cctx, { /* compression parameters verification and optimization */ CHECK_F(ZSTD_checkCParams(params.cParams)); - return ZSTD_compressBegin_internal(cctx, dict, dictSize, params, pledgedSrcSize); + return ZSTD_compressBegin_internal(cctx, dict, dictSize, ZSTD_dm_auto, NULL, + params, pledgedSrcSize, ZSTDb_not_buffered); } size_t ZSTD_compressBegin_usingDict(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, int compressionLevel) { ZSTD_parameters const params = ZSTD_getParams(compressionLevel, 0, dictSize); - return ZSTD_compressBegin_internal(cctx, dict, dictSize, params, 0); + return ZSTD_compressBegin_internal(cctx, dict, dictSize, ZSTD_dm_auto, NULL, + params, 0, ZSTDb_not_buffered); } @@ -2840,11 +3269,12 @@ static size_t ZSTD_writeEpilogue(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity) BYTE* op = ostart; size_t fhSize = 0; + DEBUGLOG(5, "ZSTD_writeEpilogue"); if (cctx->stage == ZSTDcs_created) return ERROR(stage_wrong); /* init missing */ /* special case : empty frame */ if (cctx->stage == ZSTDcs_init) { - fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, cctx->params, 0, 0); + fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, cctx->appliedParams, 0, 0); if (ZSTD_isError(fhSize)) return fhSize; dstCapacity -= fhSize; op += fhSize; @@ -2860,7 +3290,7 @@ static size_t ZSTD_writeEpilogue(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity) dstCapacity -= ZSTD_blockHeaderSize; } - if (cctx->params.fParams.checksumFlag) { + if (cctx->appliedParams.fParams.checksumFlag) { U32 const checksum = (U32) XXH64_digest(&cctx->xxhState); if (dstCapacity<4) return ERROR(dstSize_tooSmall); MEM_writeLE32(op, checksum); @@ -2877,13 +3307,19 @@ size_t ZSTD_compressEnd (ZSTD_CCtx* cctx, const void* src, size_t srcSize) { size_t endResult; - size_t const cSize = ZSTD_compressContinue_internal(cctx, dst, dstCapacity, src, srcSize, 1 /* frame mode */, 1 /* last chunk */); + size_t const cSize = ZSTD_compressContinue_internal(cctx, + dst, dstCapacity, src, srcSize, + 1 /* frame mode */, 1 /* last chunk */); if (ZSTD_isError(cSize)) return cSize; endResult = ZSTD_writeEpilogue(cctx, (char*)dst + cSize, dstCapacity-cSize); if (ZSTD_isError(endResult)) return endResult; - if (cctx->params.fParams.contentSizeFlag) { /* control src size */ - if (cctx->frameContentSize != cctx->consumedSrcSize) return ERROR(srcSize_wrong); - } + if (cctx->appliedParams.fParams.contentSizeFlag) { /* control src size */ + DEBUGLOG(5, "end of frame : controlling src size"); + if (cctx->pledgedSrcSizePlusOne != cctx->consumedSrcSize+1) { + DEBUGLOG(5, "error : pledgedSrcSize = %u, while realSrcSize = %u", + (U32)cctx->pledgedSrcSizePlusOne-1, (U32)cctx->consumedSrcSize); + return ERROR(srcSize_wrong); + } } return cSize + endResult; } @@ -2894,7 +3330,8 @@ static size_t ZSTD_compress_internal (ZSTD_CCtx* cctx, const void* dict,size_t dictSize, ZSTD_parameters params) { - CHECK_F(ZSTD_compressBegin_internal(cctx, dict, dictSize, params, srcSize)); + CHECK_F( ZSTD_compressBegin_internal(cctx, dict, dictSize, ZSTD_dm_auto, NULL, + params, srcSize, ZSTDb_not_buffered) ); return ZSTD_compressEnd(cctx, dst, dstCapacity, src, srcSize); } @@ -2926,25 +3363,36 @@ size_t ZSTD_compress(void* dst, size_t dstCapacity, const void* src, size_t srcS size_t result; ZSTD_CCtx ctxBody; memset(&ctxBody, 0, sizeof(ctxBody)); - memcpy(&ctxBody.customMem, &defaultCustomMem, sizeof(ZSTD_customMem)); + ctxBody.customMem = ZSTD_defaultCMem; result = ZSTD_compressCCtx(&ctxBody, dst, dstCapacity, src, srcSize, compressionLevel); - ZSTD_free(ctxBody.workSpace, defaultCustomMem); /* can't free ctxBody itself, as it's on stack; free only heap content */ + ZSTD_free(ctxBody.workSpace, ZSTD_defaultCMem); /* can't free ctxBody itself, as it's on stack; free only heap content */ return result; } /* ===== Dictionary API ===== */ -struct ZSTD_CDict_s { - void* dictBuffer; - const void* dictContent; - size_t dictContentSize; - ZSTD_CCtx* refContext; -}; /* typedef'd tp ZSTD_CDict within "zstd.h" */ +/*! ZSTD_estimateCDictSize_advanced() : + * Estimate amount of memory that will be needed to create a dictionary with following arguments */ +size_t ZSTD_estimateCDictSize_advanced(size_t dictSize, ZSTD_compressionParameters cParams, unsigned byReference) +{ + DEBUGLOG(5, "sizeof(ZSTD_CDict) : %u", (U32)sizeof(ZSTD_CDict)); + DEBUGLOG(5, "CCtx estimate : %u", (U32)ZSTD_estimateCCtxSize_advanced(cParams)); + return sizeof(ZSTD_CDict) + ZSTD_estimateCCtxSize_advanced(cParams) + + (byReference ? 0 : dictSize); +} + +size_t ZSTD_estimateCDictSize(size_t dictSize, int compressionLevel) +{ + ZSTD_compressionParameters const cParams = ZSTD_getCParams(compressionLevel, 0, dictSize); + return ZSTD_estimateCDictSize_advanced(dictSize, cParams, 0); +} size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict) { if (cdict==NULL) return 0; /* support sizeof on NULL */ + DEBUGLOG(5, "sizeof(*cdict) : %u", (U32)sizeof(*cdict)); + DEBUGLOG(5, "ZSTD_sizeof_CCtx : %u", (U32)ZSTD_sizeof_CCtx(cdict->refContext)); return ZSTD_sizeof_CCtx(cdict->refContext) + (cdict->dictBuffer ? cdict->dictContentSize : 0) + sizeof(*cdict); } @@ -2956,13 +3404,46 @@ static ZSTD_parameters ZSTD_makeParams(ZSTD_compressionParameters cParams, ZSTD_ return params; } -ZSTD_CDict* ZSTD_createCDict_advanced(const void* dictBuffer, size_t dictSize, unsigned byReference, +static size_t ZSTD_initCDict_internal( + ZSTD_CDict* cdict, + const void* dictBuffer, size_t dictSize, + unsigned byReference, ZSTD_dictMode_e dictMode, + ZSTD_compressionParameters cParams) +{ + DEBUGLOG(5, "ZSTD_initCDict_internal, mode %u", (U32)dictMode); + if ((byReference) || (!dictBuffer) || (!dictSize)) { + cdict->dictBuffer = NULL; + cdict->dictContent = dictBuffer; + } else { + void* const internalBuffer = ZSTD_malloc(dictSize, cdict->refContext->customMem); + cdict->dictBuffer = internalBuffer; + cdict->dictContent = internalBuffer; + if (!internalBuffer) return ERROR(memory_allocation); + memcpy(internalBuffer, dictBuffer, dictSize); + } + cdict->dictContentSize = dictSize; + + { ZSTD_frameParameters const fParams = { 0 /* contentSizeFlag */, + 0 /* checksumFlag */, 0 /* noDictIDFlag */ }; /* dummy */ + ZSTD_parameters const params = ZSTD_makeParams(cParams, fParams); + CHECK_F( ZSTD_compressBegin_internal(cdict->refContext, + cdict->dictContent, dictSize, dictMode, + NULL, + params, ZSTD_CONTENTSIZE_UNKNOWN, + ZSTDb_not_buffered) ); + } + + return 0; +} + +ZSTD_CDict* ZSTD_createCDict_advanced(const void* dictBuffer, size_t dictSize, + unsigned byReference, ZSTD_dictMode_e dictMode, ZSTD_compressionParameters cParams, ZSTD_customMem customMem) { - if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem; - if (!customMem.customAlloc || !customMem.customFree) return NULL; + DEBUGLOG(5, "ZSTD_createCDict_advanced, mode %u", (U32)dictMode); + if (!customMem.customAlloc ^ !customMem.customFree) return NULL; - { ZSTD_CDict* const cdict = (ZSTD_CDict*) ZSTD_malloc(sizeof(ZSTD_CDict), customMem); + { ZSTD_CDict* const cdict = (ZSTD_CDict*)ZSTD_malloc(sizeof(ZSTD_CDict), customMem); ZSTD_CCtx* const cctx = ZSTD_createCCtx_advanced(customMem); if (!cdict || !cctx) { @@ -2970,46 +3451,34 @@ ZSTD_CDict* ZSTD_createCDict_advanced(const void* dictBuffer, size_t dictSize, u ZSTD_freeCCtx(cctx); return NULL; } + cdict->refContext = cctx; - if ((byReference) || (!dictBuffer) || (!dictSize)) { - cdict->dictBuffer = NULL; - cdict->dictContent = dictBuffer; - } else { - void* const internalBuffer = ZSTD_malloc(dictSize, customMem); - if (!internalBuffer) { ZSTD_free(cctx, customMem); ZSTD_free(cdict, customMem); return NULL; } - memcpy(internalBuffer, dictBuffer, dictSize); - cdict->dictBuffer = internalBuffer; - cdict->dictContent = internalBuffer; + if (ZSTD_isError( ZSTD_initCDict_internal(cdict, + dictBuffer, dictSize, + byReference, dictMode, + cParams) )) { + ZSTD_freeCDict(cdict); + return NULL; } - { ZSTD_frameParameters const fParams = { 0 /* contentSizeFlag */, 0 /* checksumFlag */, 0 /* noDictIDFlag */ }; /* dummy */ - ZSTD_parameters const params = ZSTD_makeParams(cParams, fParams); - size_t const errorCode = ZSTD_compressBegin_advanced(cctx, cdict->dictContent, dictSize, params, 0); - if (ZSTD_isError(errorCode)) { - ZSTD_free(cdict->dictBuffer, customMem); - ZSTD_free(cdict, customMem); - ZSTD_freeCCtx(cctx); - return NULL; - } } - - cdict->refContext = cctx; - cdict->dictContentSize = dictSize; return cdict; } } ZSTD_CDict* ZSTD_createCDict(const void* dict, size_t dictSize, int compressionLevel) { - ZSTD_customMem const allocator = { NULL, NULL, NULL }; ZSTD_compressionParameters cParams = ZSTD_getCParams(compressionLevel, 0, dictSize); - return ZSTD_createCDict_advanced(dict, dictSize, 0, cParams, allocator); + return ZSTD_createCDict_advanced(dict, dictSize, + 0 /* byReference */, ZSTD_dm_auto, + cParams, ZSTD_defaultCMem); } ZSTD_CDict* ZSTD_createCDict_byReference(const void* dict, size_t dictSize, int compressionLevel) { - ZSTD_customMem const allocator = { NULL, NULL, NULL }; ZSTD_compressionParameters cParams = ZSTD_getCParams(compressionLevel, 0, dictSize); - return ZSTD_createCDict_advanced(dict, dictSize, 1, cParams, allocator); + return ZSTD_createCDict_advanced(dict, dictSize, + 1 /* byReference */, ZSTD_dm_auto, + cParams, ZSTD_defaultCMem); } size_t ZSTD_freeCDict(ZSTD_CDict* cdict) @@ -3023,7 +3492,54 @@ size_t ZSTD_freeCDict(ZSTD_CDict* cdict) } } -static ZSTD_parameters ZSTD_getParamsFromCDict(const ZSTD_CDict* cdict) { +/*! ZSTD_initStaticCDict_advanced() : + * Generate a digested dictionary in provided memory area. + * workspace: The memory area to emplace the dictionary into. + * Provided pointer must 8-bytes aligned. + * It must outlive dictionary usage. + * workspaceSize: Use ZSTD_estimateCDictSize() + * to determine how large workspace must be. + * cParams : use ZSTD_getCParams() to transform a compression level + * into its relevants cParams. + * @return : pointer to ZSTD_CDict*, or NULL if error (size too small) + * Note : there is no corresponding "free" function. + * Since workspace was allocated externally, it must be freed externally. + */ +ZSTD_CDict* ZSTD_initStaticCDict(void* workspace, size_t workspaceSize, + const void* dict, size_t dictSize, + unsigned byReference, ZSTD_dictMode_e dictMode, + ZSTD_compressionParameters cParams) +{ + size_t const cctxSize = ZSTD_estimateCCtxSize_advanced(cParams); + size_t const neededSize = sizeof(ZSTD_CDict) + (byReference ? 0 : dictSize) + + cctxSize; + ZSTD_CDict* const cdict = (ZSTD_CDict*) workspace; + void* ptr; + DEBUGLOG(5, "(size_t)workspace & 7 : %u", (U32)(size_t)workspace & 7); + if ((size_t)workspace & 7) return NULL; /* 8-aligned */ + DEBUGLOG(5, "(workspaceSize < neededSize) : (%u < %u) => %u", + (U32)workspaceSize, (U32)neededSize, (U32)(workspaceSize < neededSize)); + if (workspaceSize < neededSize) return NULL; + + if (!byReference) { + memcpy(cdict+1, dict, dictSize); + dict = cdict+1; + ptr = (char*)workspace + sizeof(ZSTD_CDict) + dictSize; + } else { + ptr = cdict+1; + } + cdict->refContext = ZSTD_initStaticCCtx(ptr, cctxSize); + + if (ZSTD_isError( ZSTD_initCDict_internal(cdict, + dict, dictSize, + 1 /* byReference */, dictMode, + cParams) )) + return NULL; + + return cdict; +} + +ZSTD_parameters ZSTD_getParamsFromCDict(const ZSTD_CDict* cdict) { return ZSTD_getParamsFromCCtx(cdict->refContext); } @@ -3033,16 +3549,16 @@ size_t ZSTD_compressBegin_usingCDict_advanced( ZSTD_CCtx* const cctx, const ZSTD_CDict* const cdict, ZSTD_frameParameters const fParams, unsigned long long const pledgedSrcSize) { - if (cdict==NULL) return ERROR(GENERIC); /* does not support NULL cdict */ - DEBUGLOG(5, "ZSTD_compressBegin_usingCDict_advanced : dictIDFlag == %u \n", !fParams.noDictIDFlag); - if (cdict->dictContentSize) - CHECK_F( ZSTD_copyCCtx_internal(cctx, cdict->refContext, fParams, pledgedSrcSize) ) - else { - ZSTD_parameters params = cdict->refContext->params; + if (cdict==NULL) return ERROR(dictionary_wrong); + { ZSTD_parameters params = cdict->refContext->appliedParams; params.fParams = fParams; - CHECK_F(ZSTD_compressBegin_internal(cctx, NULL, 0, params, pledgedSrcSize)); + DEBUGLOG(5, "ZSTD_compressBegin_usingCDict_advanced"); + return ZSTD_compressBegin_internal(cctx, + NULL, 0, ZSTD_dm_auto, + cdict, + params, pledgedSrcSize, + ZSTDb_not_buffered); } - return 0; } /* ZSTD_compressBegin_usingCDict() : @@ -3051,7 +3567,7 @@ size_t ZSTD_compressBegin_usingCDict_advanced( size_t ZSTD_compressBegin_usingCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict) { ZSTD_frameParameters const fParams = { 0 /*content*/, 0 /*checksum*/, 0 /*noDictID*/ }; - DEBUGLOG(5, "ZSTD_compressBegin_usingCDict : dictIDFlag == %u \n", !fParams.noDictIDFlag); + DEBUGLOG(5, "ZSTD_compressBegin_usingCDict : dictIDFlag == %u", !fParams.noDictIDFlag); return ZSTD_compressBegin_usingCDict_advanced(cctx, cdict, fParams, 0); } @@ -3084,176 +3600,133 @@ size_t ZSTD_compress_usingCDict(ZSTD_CCtx* cctx, * Streaming ********************************************************************/ -typedef enum { zcss_init, zcss_load, zcss_flush, zcss_final } ZSTD_cStreamStage; - -struct ZSTD_CStream_s { - ZSTD_CCtx* cctx; - ZSTD_CDict* cdictLocal; - const ZSTD_CDict* cdict; - char* inBuff; - size_t inBuffSize; - size_t inToCompress; - size_t inBuffPos; - size_t inBuffTarget; - size_t blockSize; - char* outBuff; - size_t outBuffSize; - size_t outBuffContentSize; - size_t outBuffFlushedSize; - ZSTD_cStreamStage stage; - U32 checksum; - U32 frameEnded; - U64 pledgedSrcSize; - ZSTD_parameters params; - ZSTD_customMem customMem; -}; /* typedef'd to ZSTD_CStream within "zstd.h" */ - ZSTD_CStream* ZSTD_createCStream(void) { - return ZSTD_createCStream_advanced(defaultCustomMem); + return ZSTD_createCStream_advanced(ZSTD_defaultCMem); +} + +ZSTD_CStream* ZSTD_initStaticCStream(void *workspace, size_t workspaceSize) +{ + return ZSTD_initStaticCCtx(workspace, workspaceSize); } ZSTD_CStream* ZSTD_createCStream_advanced(ZSTD_customMem customMem) -{ - ZSTD_CStream* zcs; - - if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem; - if (!customMem.customAlloc || !customMem.customFree) return NULL; - - zcs = (ZSTD_CStream*)ZSTD_malloc(sizeof(ZSTD_CStream), customMem); - if (zcs==NULL) return NULL; - memset(zcs, 0, sizeof(ZSTD_CStream)); - memcpy(&zcs->customMem, &customMem, sizeof(ZSTD_customMem)); - zcs->cctx = ZSTD_createCCtx_advanced(customMem); - if (zcs->cctx == NULL) { ZSTD_freeCStream(zcs); return NULL; } - return zcs; +{ /* CStream and CCtx are now same object */ + return ZSTD_createCCtx_advanced(customMem); } size_t ZSTD_freeCStream(ZSTD_CStream* zcs) { - if (zcs==NULL) return 0; /* support free on NULL */ - { ZSTD_customMem const cMem = zcs->customMem; - ZSTD_freeCCtx(zcs->cctx); - zcs->cctx = NULL; - ZSTD_freeCDict(zcs->cdictLocal); - zcs->cdictLocal = NULL; - ZSTD_free(zcs->inBuff, cMem); - zcs->inBuff = NULL; - ZSTD_free(zcs->outBuff, cMem); - zcs->outBuff = NULL; - ZSTD_free(zcs, cMem); - return 0; - } + return ZSTD_freeCCtx(zcs); /* same object */ } + /*====== Initialization ======*/ -size_t ZSTD_CStreamInSize(void) { return ZSTD_BLOCKSIZE_ABSOLUTEMAX; } +size_t ZSTD_CStreamInSize(void) { return ZSTD_BLOCKSIZE_MAX; } size_t ZSTD_CStreamOutSize(void) { - return ZSTD_compressBound(ZSTD_BLOCKSIZE_ABSOLUTEMAX) + ZSTD_blockHeaderSize + 4 /* 32-bits hash */ ; + return ZSTD_compressBound(ZSTD_BLOCKSIZE_MAX) + ZSTD_blockHeaderSize + 4 /* 32-bits hash */ ; } -static size_t ZSTD_resetCStream_internal(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize) +static size_t ZSTD_resetCStream_internal(ZSTD_CStream* zcs, + const void* dict, size_t dictSize, ZSTD_dictMode_e dictMode, + const ZSTD_CDict* cdict, + ZSTD_parameters params, unsigned long long pledgedSrcSize) { - if (zcs->inBuffSize==0) return ERROR(stage_wrong); /* zcs has not been init at least once => can't reset */ + DEBUGLOG(4, "ZSTD_resetCStream_internal"); + /* params are supposed to be fully validated at this point */ + assert(!ZSTD_isError(ZSTD_checkCParams(params.cParams))); + assert(!((dict) && (cdict))); /* either dict or cdict, not both */ - DEBUGLOG(5, "ZSTD_resetCStream_internal : dictIDFlag == %u \n", !zcs->params.fParams.noDictIDFlag); - - if (zcs->cdict) CHECK_F(ZSTD_compressBegin_usingCDict_advanced(zcs->cctx, zcs->cdict, zcs->params.fParams, pledgedSrcSize)) - else CHECK_F(ZSTD_compressBegin_internal(zcs->cctx, NULL, 0, zcs->params, pledgedSrcSize)); + CHECK_F( ZSTD_compressBegin_internal(zcs, + dict, dictSize, dictMode, + cdict, + params, pledgedSrcSize, + ZSTDb_buffered) ); zcs->inToCompress = 0; zcs->inBuffPos = 0; zcs->inBuffTarget = zcs->blockSize; zcs->outBuffContentSize = zcs->outBuffFlushedSize = 0; - zcs->stage = zcss_load; + zcs->streamStage = zcss_load; zcs->frameEnded = 0; - zcs->pledgedSrcSize = pledgedSrcSize; return 0; /* ready to go */ } size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize) { - - zcs->params.fParams.contentSizeFlag = (pledgedSrcSize > 0); - DEBUGLOG(5, "ZSTD_resetCStream : dictIDFlag == %u \n", !zcs->params.fParams.noDictIDFlag); - return ZSTD_resetCStream_internal(zcs, pledgedSrcSize); + ZSTD_parameters params = zcs->requestedParams; + params.fParams.contentSizeFlag = (pledgedSrcSize > 0); + DEBUGLOG(5, "ZSTD_resetCStream"); + if (zcs->compressionLevel != ZSTD_CLEVEL_CUSTOM) { + params.cParams = ZSTD_getCParams(zcs->compressionLevel, pledgedSrcSize, 0 /* dictSize */); + } + return ZSTD_resetCStream_internal(zcs, NULL, 0, zcs->dictMode, zcs->cdict, params, pledgedSrcSize); } -/* ZSTD_initCStream_internal() : - * params are supposed validated at this stage - * and zcs->cdict is supposed to be correct */ -static size_t ZSTD_initCStream_stage2(ZSTD_CStream* zcs, - const ZSTD_parameters params, - unsigned long long pledgedSrcSize) +/*! ZSTD_initCStream_internal() : + * Note : not static, but hidden (not exposed). Used by zstdmt_compress.c + * Assumption 1 : params are valid + * Assumption 2 : either dict, or cdict, is defined, not both */ +size_t ZSTD_initCStream_internal(ZSTD_CStream* zcs, + const void* dict, size_t dictSize, const ZSTD_CDict* cdict, + ZSTD_parameters params, unsigned long long pledgedSrcSize) { + DEBUGLOG(5, "ZSTD_initCStream_internal"); assert(!ZSTD_isError(ZSTD_checkCParams(params.cParams))); + assert(!((dict) && (cdict))); /* either dict or cdict, not both */ - /* allocate buffers */ - { size_t const neededInBuffSize = (size_t)1 << params.cParams.windowLog; - if (zcs->inBuffSize < neededInBuffSize) { - zcs->inBuffSize = 0; - ZSTD_free(zcs->inBuff, zcs->customMem); - zcs->inBuff = (char*) ZSTD_malloc(neededInBuffSize, zcs->customMem); - if (zcs->inBuff == NULL) return ERROR(memory_allocation); - zcs->inBuffSize = neededInBuffSize; + if (dict && dictSize >= 8) { + DEBUGLOG(5, "loading dictionary of size %u", (U32)dictSize); + if (zcs->staticSize) { /* static CCtx : never uses malloc */ + /* incompatible with internal cdict creation */ + return ERROR(memory_allocation); } - zcs->blockSize = MIN(ZSTD_BLOCKSIZE_ABSOLUTEMAX, neededInBuffSize); - } - if (zcs->outBuffSize < ZSTD_compressBound(zcs->blockSize)+1) { - size_t const outBuffSize = ZSTD_compressBound(zcs->blockSize)+1; - zcs->outBuffSize = 0; - ZSTD_free(zcs->outBuff, zcs->customMem); - zcs->outBuff = (char*) ZSTD_malloc(outBuffSize, zcs->customMem); - if (zcs->outBuff == NULL) return ERROR(memory_allocation); - zcs->outBuffSize = outBuffSize; + ZSTD_freeCDict(zcs->cdictLocal); + zcs->cdictLocal = ZSTD_createCDict_advanced(dict, dictSize, + zcs->dictContentByRef, zcs->dictMode, + params.cParams, zcs->customMem); + zcs->cdict = zcs->cdictLocal; + if (zcs->cdictLocal == NULL) return ERROR(memory_allocation); + } else { + if (cdict) { + ZSTD_parameters const cdictParams = ZSTD_getParamsFromCDict(cdict); + params.cParams = cdictParams.cParams; /* cParams are enforced from cdict */ + } + ZSTD_freeCDict(zcs->cdictLocal); + zcs->cdictLocal = NULL; + zcs->cdict = cdict; } - zcs->checksum = params.fParams.checksumFlag > 0; - zcs->params = params; - - DEBUGLOG(5, "ZSTD_initCStream_stage2 : dictIDFlag == %u \n", !params.fParams.noDictIDFlag); - return ZSTD_resetCStream_internal(zcs, pledgedSrcSize); + zcs->requestedParams = params; + zcs->compressionLevel = ZSTD_CLEVEL_CUSTOM; + return ZSTD_resetCStream_internal(zcs, NULL, 0, zcs->dictMode, zcs->cdict, params, pledgedSrcSize); } /* ZSTD_initCStream_usingCDict_advanced() : * same as ZSTD_initCStream_usingCDict(), with control over frame parameters */ -size_t ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, const ZSTD_CDict* cdict, unsigned long long pledgedSrcSize, ZSTD_frameParameters fParams) -{ - if (!cdict) return ERROR(GENERIC); /* cannot handle NULL cdict (does not know what to do) */ +size_t ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, + const ZSTD_CDict* cdict, + ZSTD_frameParameters fParams, + unsigned long long pledgedSrcSize) +{ /* cannot handle NULL cdict (does not know what to do) */ + if (!cdict) return ERROR(dictionary_wrong); { ZSTD_parameters params = ZSTD_getParamsFromCDict(cdict); params.fParams = fParams; - zcs->cdict = cdict; - return ZSTD_initCStream_stage2(zcs, params, pledgedSrcSize); + return ZSTD_initCStream_internal(zcs, + NULL, 0, cdict, + params, pledgedSrcSize); } } /* note : cdict must outlive compression session */ size_t ZSTD_initCStream_usingCDict(ZSTD_CStream* zcs, const ZSTD_CDict* cdict) { - ZSTD_frameParameters const fParams = { 0 /* content */, 0 /* checksum */, 0 /* noDictID */ }; - return ZSTD_initCStream_usingCDict_advanced(zcs, cdict, 0, fParams); -} - -static size_t ZSTD_initCStream_internal(ZSTD_CStream* zcs, - const void* dict, size_t dictSize, - ZSTD_parameters params, unsigned long long pledgedSrcSize) -{ - assert(!ZSTD_isError(ZSTD_checkCParams(params.cParams))); - zcs->cdict = NULL; - - if (dict && dictSize >= 8) { - ZSTD_freeCDict(zcs->cdictLocal); - zcs->cdictLocal = ZSTD_createCDict_advanced(dict, dictSize, 0 /* copy */, params.cParams, zcs->customMem); - if (zcs->cdictLocal == NULL) return ERROR(memory_allocation); - zcs->cdict = zcs->cdictLocal; - } - - DEBUGLOG(5, "ZSTD_initCStream_internal : dictIDFlag == %u \n", !params.fParams.noDictIDFlag); - return ZSTD_initCStream_stage2(zcs, params, pledgedSrcSize); + ZSTD_frameParameters const fParams = { 0 /* contentSize */, 0 /* checksum */, 0 /* hideDictID */ }; + return ZSTD_initCStream_usingCDict_advanced(zcs, cdict, fParams, 0); /* note : will check that cdict != NULL */ } size_t ZSTD_initCStream_advanced(ZSTD_CStream* zcs, @@ -3261,120 +3734,180 @@ size_t ZSTD_initCStream_advanced(ZSTD_CStream* zcs, ZSTD_parameters params, unsigned long long pledgedSrcSize) { CHECK_F( ZSTD_checkCParams(params.cParams) ); - DEBUGLOG(5, "ZSTD_initCStream_advanced : dictIDFlag == %u \n", !params.fParams.noDictIDFlag); - return ZSTD_initCStream_internal(zcs, dict, dictSize, params, pledgedSrcSize); + zcs->requestedParams = params; + zcs->compressionLevel = ZSTD_CLEVEL_CUSTOM; + return ZSTD_initCStream_internal(zcs, dict, dictSize, NULL, params, pledgedSrcSize); } size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel) { ZSTD_parameters const params = ZSTD_getParams(compressionLevel, 0, dictSize); - return ZSTD_initCStream_internal(zcs, dict, dictSize, params, 0); + zcs->compressionLevel = compressionLevel; + return ZSTD_initCStream_internal(zcs, dict, dictSize, NULL, params, 0); } size_t ZSTD_initCStream_srcSize(ZSTD_CStream* zcs, int compressionLevel, unsigned long long pledgedSrcSize) { ZSTD_parameters params = ZSTD_getParams(compressionLevel, pledgedSrcSize, 0); params.fParams.contentSizeFlag = (pledgedSrcSize>0); - return ZSTD_initCStream_internal(zcs, NULL, 0, params, pledgedSrcSize); + return ZSTD_initCStream_internal(zcs, NULL, 0, NULL, params, pledgedSrcSize); } size_t ZSTD_initCStream(ZSTD_CStream* zcs, int compressionLevel) { ZSTD_parameters const params = ZSTD_getParams(compressionLevel, 0, 0); - return ZSTD_initCStream_internal(zcs, NULL, 0, params, 0); -} - -size_t ZSTD_sizeof_CStream(const ZSTD_CStream* zcs) -{ - if (zcs==NULL) return 0; /* support sizeof on NULL */ - return sizeof(*zcs) + ZSTD_sizeof_CCtx(zcs->cctx) + ZSTD_sizeof_CDict(zcs->cdictLocal) + zcs->outBuffSize + zcs->inBuffSize; + return ZSTD_initCStream_internal(zcs, NULL, 0, NULL, params, 0); } /*====== Compression ======*/ -typedef enum { zsf_gather, zsf_flush, zsf_end } ZSTD_flush_e; - -MEM_STATIC size_t ZSTD_limitCopy(void* dst, size_t dstCapacity, const void* src, size_t srcSize) +MEM_STATIC size_t ZSTD_limitCopy(void* dst, size_t dstCapacity, + const void* src, size_t srcSize) { size_t const length = MIN(dstCapacity, srcSize); - memcpy(dst, src, length); + if (length) memcpy(dst, src, length); return length; } -static size_t ZSTD_compressStream_generic(ZSTD_CStream* zcs, - void* dst, size_t* dstCapacityPtr, - const void* src, size_t* srcSizePtr, - ZSTD_flush_e const flush) +/** ZSTD_compressStream_generic(): + * internal function for all *compressStream*() variants and *compress_generic() + * @return : hint size for next input */ +size_t ZSTD_compressStream_generic(ZSTD_CStream* zcs, + ZSTD_outBuffer* output, + ZSTD_inBuffer* input, + ZSTD_EndDirective const flushMode) { + const char* const istart = (const char*)input->src; + const char* const iend = istart + input->size; + const char* ip = istart + input->pos; + char* const ostart = (char*)output->dst; + char* const oend = ostart + output->size; + char* op = ostart + output->pos; U32 someMoreWork = 1; - const char* const istart = (const char*)src; - const char* const iend = istart + *srcSizePtr; - const char* ip = istart; - char* const ostart = (char*)dst; - char* const oend = ostart + *dstCapacityPtr; - char* op = ostart; + + /* check expectations */ + DEBUGLOG(5, "ZSTD_compressStream_generic, flush=%u", (U32)flushMode); + assert(zcs->inBuff != NULL); + assert(zcs->inBuffSize>0); + assert(zcs->outBuff!= NULL); + assert(zcs->outBuffSize>0); + assert(output->pos <= output->size); + assert(input->pos <= input->size); while (someMoreWork) { - switch(zcs->stage) + switch(zcs->streamStage) { - case zcss_init: return ERROR(init_missing); /* call ZBUFF_compressInit() first ! */ + case zcss_init: + /* call ZSTD_initCStream() first ! */ + return ERROR(init_missing); case zcss_load: - /* complete inBuffer */ + if ( (flushMode == ZSTD_e_end) + && ((size_t)(oend-op) >= ZSTD_compressBound(iend-ip)) /* enough dstCapacity */ + && (zcs->inBuffPos == 0) ) { + /* shortcut to compression pass directly into output buffer */ + size_t const cSize = ZSTD_compressEnd(zcs, + op, oend-op, ip, iend-ip); + DEBUGLOG(4, "ZSTD_compressEnd : %u", (U32)cSize); + if (ZSTD_isError(cSize)) return cSize; + ip = iend; + op += cSize; + zcs->frameEnded = 1; + ZSTD_startNewCompression(zcs); + someMoreWork = 0; break; + } + /* complete loading into inBuffer */ { size_t const toLoad = zcs->inBuffTarget - zcs->inBuffPos; - size_t const loaded = ZSTD_limitCopy(zcs->inBuff + zcs->inBuffPos, toLoad, ip, iend-ip); + size_t const loaded = ZSTD_limitCopy( + zcs->inBuff + zcs->inBuffPos, toLoad, + ip, iend-ip); zcs->inBuffPos += loaded; ip += loaded; - if ( (zcs->inBuffPos==zcs->inToCompress) || (!flush && (toLoad != loaded)) ) { - someMoreWork = 0; break; /* not enough input to get a full block : stop there, wait for more */ - } } + if ( (flushMode == ZSTD_e_continue) + && (zcs->inBuffPos < zcs->inBuffTarget) ) { + /* not enough input to fill full block : stop here */ + someMoreWork = 0; break; + } + if ( (flushMode == ZSTD_e_flush) + && (zcs->inBuffPos == zcs->inToCompress) ) { + /* empty */ + someMoreWork = 0; break; + } + } /* compress current block (note : this stage cannot be stopped in the middle) */ + DEBUGLOG(5, "stream compression stage (flushMode==%u)", flushMode); { void* cDst; size_t cSize; size_t const iSize = zcs->inBuffPos - zcs->inToCompress; size_t oSize = oend-op; + unsigned const lastBlock = (flushMode == ZSTD_e_end) && (ip==iend); if (oSize >= ZSTD_compressBound(iSize)) - cDst = op; /* compress directly into output buffer (avoid flush stage) */ + cDst = op; /* compress into output buffer, to skip flush stage */ else cDst = zcs->outBuff, oSize = zcs->outBuffSize; - cSize = (flush == zsf_end) ? - ZSTD_compressEnd(zcs->cctx, cDst, oSize, zcs->inBuff + zcs->inToCompress, iSize) : - ZSTD_compressContinue(zcs->cctx, cDst, oSize, zcs->inBuff + zcs->inToCompress, iSize); + cSize = lastBlock ? + ZSTD_compressEnd(zcs, cDst, oSize, + zcs->inBuff + zcs->inToCompress, iSize) : + ZSTD_compressContinue(zcs, cDst, oSize, + zcs->inBuff + zcs->inToCompress, iSize); if (ZSTD_isError(cSize)) return cSize; - if (flush == zsf_end) zcs->frameEnded = 1; + zcs->frameEnded = lastBlock; /* prepare next block */ zcs->inBuffTarget = zcs->inBuffPos + zcs->blockSize; if (zcs->inBuffTarget > zcs->inBuffSize) - zcs->inBuffPos = 0, zcs->inBuffTarget = zcs->blockSize; /* note : inBuffSize >= blockSize */ + zcs->inBuffPos = 0, zcs->inBuffTarget = zcs->blockSize; + DEBUGLOG(5, "inBuffTarget:%u / inBuffSize:%u", + (U32)zcs->inBuffTarget, (U32)zcs->inBuffSize); + if (!lastBlock) + assert(zcs->inBuffTarget <= zcs->inBuffSize); zcs->inToCompress = zcs->inBuffPos; - if (cDst == op) { op += cSize; break; } /* no need to flush */ + if (cDst == op) { /* no need to flush */ + op += cSize; + if (zcs->frameEnded) { + DEBUGLOG(5, "Frame completed directly in outBuffer"); + someMoreWork = 0; + ZSTD_startNewCompression(zcs); + } + break; + } zcs->outBuffContentSize = cSize; zcs->outBuffFlushedSize = 0; - zcs->stage = zcss_flush; /* pass-through to flush stage */ + zcs->streamStage = zcss_flush; /* pass-through to flush stage */ } - + /* fall-through */ case zcss_flush: + DEBUGLOG(5, "flush stage"); { size_t const toFlush = zcs->outBuffContentSize - zcs->outBuffFlushedSize; - size_t const flushed = ZSTD_limitCopy(op, oend-op, zcs->outBuff + zcs->outBuffFlushedSize, toFlush); + size_t const flushed = ZSTD_limitCopy(op, oend-op, + zcs->outBuff + zcs->outBuffFlushedSize, toFlush); + DEBUGLOG(5, "toFlush: %u into %u ==> flushed: %u", + (U32)toFlush, (U32)(oend-op), (U32)flushed); op += flushed; zcs->outBuffFlushedSize += flushed; - if (toFlush!=flushed) { someMoreWork = 0; break; } /* dst too small to store flushed data : stop there */ + if (toFlush!=flushed) { + /* flush not fully completed, presumably because dst is too small */ + assert(op==oend); + someMoreWork = 0; + break; + } zcs->outBuffContentSize = zcs->outBuffFlushedSize = 0; - zcs->stage = zcss_load; + if (zcs->frameEnded) { + DEBUGLOG(5, "Frame completed on flush"); + someMoreWork = 0; + ZSTD_startNewCompression(zcs); + break; + } + zcs->streamStage = zcss_load; break; } - case zcss_final: - someMoreWork = 0; /* do nothing */ - break; - - default: - return ERROR(GENERIC); /* impossible */ + default: /* impossible */ + assert(0); } } - *srcSizePtr = ip - istart; - *dstCapacityPtr = op - ostart; + input->pos = ip - istart; + output->pos = op - ostart; if (zcs->frameEnded) return 0; { size_t hintInSize = zcs->inBuffTarget - zcs->inBuffPos; if (hintInSize==0) hintInSize = zcs->blockSize; @@ -3384,14 +3917,86 @@ static size_t ZSTD_compressStream_generic(ZSTD_CStream* zcs, size_t ZSTD_compressStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output, ZSTD_inBuffer* input) { - size_t sizeRead = input->size - input->pos; - size_t sizeWritten = output->size - output->pos; - size_t const result = ZSTD_compressStream_generic(zcs, - (char*)(output->dst) + output->pos, &sizeWritten, - (const char*)(input->src) + input->pos, &sizeRead, zsf_gather); - input->pos += sizeRead; - output->pos += sizeWritten; - return result; + /* check conditions */ + if (output->pos > output->size) return ERROR(GENERIC); + if (input->pos > input->size) return ERROR(GENERIC); + + return ZSTD_compressStream_generic(zcs, output, input, ZSTD_e_continue); +} + +/*! ZSTDMT_initCStream_internal() : + * Private use only. Init streaming operation. + * expects params to be valid. + * must receive dict, or cdict, or none, but not both. + * @return : 0, or an error code */ +size_t ZSTDMT_initCStream_internal(ZSTDMT_CCtx* zcs, + const void* dict, size_t dictSize, const ZSTD_CDict* cdict, + ZSTD_parameters params, unsigned long long pledgedSrcSize); + + +size_t ZSTD_compress_generic (ZSTD_CCtx* cctx, + ZSTD_outBuffer* output, + ZSTD_inBuffer* input, + ZSTD_EndDirective endOp) +{ + /* check conditions */ + if (output->pos > output->size) return ERROR(GENERIC); + if (input->pos > input->size) return ERROR(GENERIC); + assert(cctx!=NULL); + + /* transparent initialization stage */ + if (cctx->streamStage == zcss_init) { + const void* const prefix = cctx->prefix; + size_t const prefixSize = cctx->prefixSize; + ZSTD_parameters params = cctx->requestedParams; + if (cctx->compressionLevel != ZSTD_CLEVEL_CUSTOM) + params.cParams = ZSTD_getCParams(cctx->compressionLevel, + cctx->pledgedSrcSizePlusOne-1, 0 /*dictSize*/); + cctx->prefix = NULL; cctx->prefixSize = 0; /* single usage */ + assert(prefix==NULL || cctx->cdict==NULL); /* only one can be set */ + +#ifdef ZSTD_MULTITHREAD + if (cctx->nbThreads > 1) { + DEBUGLOG(4, "call ZSTDMT_initCStream_internal as nbThreads=%u", cctx->nbThreads); + CHECK_F( ZSTDMT_initCStream_internal(cctx->mtctx, prefix, prefixSize, cctx->cdict, params, cctx->pledgedSrcSizePlusOne-1) ); + cctx->streamStage = zcss_load; + } else +#endif + { + CHECK_F( ZSTD_resetCStream_internal(cctx, prefix, prefixSize, cctx->dictMode, cctx->cdict, params, cctx->pledgedSrcSizePlusOne-1) ); + } } + + /* compression stage */ +#ifdef ZSTD_MULTITHREAD + if (cctx->nbThreads > 1) { + size_t const flushMin = ZSTDMT_compressStream_generic(cctx->mtctx, output, input, endOp); + DEBUGLOG(5, "ZSTDMT_compressStream_generic : %u", (U32)flushMin); + if ( ZSTD_isError(flushMin) + || (endOp == ZSTD_e_end && flushMin == 0) ) { /* compression completed */ + ZSTD_startNewCompression(cctx); + } + return flushMin; + } +#endif + + CHECK_F( ZSTD_compressStream_generic(cctx, output, input, endOp) ); + DEBUGLOG(5, "completed ZSTD_compress_generic"); + return cctx->outBuffContentSize - cctx->outBuffFlushedSize; /* remaining to flush */ +} + +size_t ZSTD_compress_generic_simpleArgs ( + ZSTD_CCtx* cctx, + void* dst, size_t dstCapacity, size_t* dstPos, + const void* src, size_t srcSize, size_t* srcPos, + ZSTD_EndDirective endOp) +{ + ZSTD_outBuffer output = { dst, dstCapacity, *dstPos }; + ZSTD_inBuffer input = { src, srcSize, *srcPos }; + /* ZSTD_compress_generic() will check validity of dstPos and srcPos */ + size_t const cErr = ZSTD_compress_generic(cctx, &output, &input, endOp); + *dstPos = output.pos; + *srcPos = input.pos; + return cErr; } @@ -3401,89 +4006,59 @@ size_t ZSTD_compressStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output, ZSTD_inBuf * @return : amount of data remaining to flush */ size_t ZSTD_flushStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output) { - size_t srcSize = 0; - size_t sizeWritten = output->size - output->pos; - size_t const result = ZSTD_compressStream_generic(zcs, - (char*)(output->dst) + output->pos, &sizeWritten, - &srcSize, &srcSize, /* use a valid src address instead of NULL */ - zsf_flush); - output->pos += sizeWritten; - if (ZSTD_isError(result)) return result; - return zcs->outBuffContentSize - zcs->outBuffFlushedSize; /* remaining to flush */ + ZSTD_inBuffer input = { NULL, 0, 0 }; + if (output->pos > output->size) return ERROR(GENERIC); + CHECK_F( ZSTD_compressStream_generic(zcs, output, &input, ZSTD_e_flush) ); + return zcs->outBuffContentSize - zcs->outBuffFlushedSize; /* remaining to flush */ } size_t ZSTD_endStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output) { - BYTE* const ostart = (BYTE*)(output->dst) + output->pos; - BYTE* const oend = (BYTE*)(output->dst) + output->size; - BYTE* op = ostart; - - if (zcs->stage != zcss_final) { - /* flush whatever remains */ - size_t srcSize = 0; - size_t sizeWritten = output->size - output->pos; - size_t const notEnded = ZSTD_compressStream_generic(zcs, ostart, &sizeWritten, - &srcSize /* use a valid src address instead of NULL */, &srcSize, zsf_end); - size_t const remainingToFlush = zcs->outBuffContentSize - zcs->outBuffFlushedSize; - op += sizeWritten; - if (remainingToFlush) { - output->pos += sizeWritten; - return remainingToFlush + ZSTD_BLOCKHEADERSIZE /* final empty block */ + (zcs->checksum * 4); - } - /* create epilogue */ - zcs->stage = zcss_final; - zcs->outBuffContentSize = !notEnded ? 0 : - /* write epilogue, including final empty block, into outBuff */ - ZSTD_compressEnd(zcs->cctx, zcs->outBuff, zcs->outBuffSize, NULL, 0); - if (ZSTD_isError(zcs->outBuffContentSize)) return zcs->outBuffContentSize; - } - - /* flush epilogue */ - { size_t const toFlush = zcs->outBuffContentSize - zcs->outBuffFlushedSize; - size_t const flushed = ZSTD_limitCopy(op, oend-op, zcs->outBuff + zcs->outBuffFlushedSize, toFlush); - op += flushed; - zcs->outBuffFlushedSize += flushed; - output->pos += op-ostart; - if (toFlush==flushed) zcs->stage = zcss_init; /* end reached */ - return toFlush - flushed; + ZSTD_inBuffer input = { NULL, 0, 0 }; + if (output->pos > output->size) return ERROR(GENERIC); + CHECK_F( ZSTD_compressStream_generic(zcs, output, &input, ZSTD_e_end) ); + { size_t const lastBlockSize = zcs->frameEnded ? 0 : ZSTD_BLOCKHEADERSIZE; + size_t const checksumSize = zcs->frameEnded ? 0 : zcs->appliedParams.fParams.checksumFlag * 4; + size_t const toFlush = zcs->outBuffContentSize - zcs->outBuffFlushedSize + lastBlockSize + checksumSize; + DEBUGLOG(5, "ZSTD_endStream : remaining to flush : %u", + (unsigned)toFlush); + return toFlush; } } - /*-===== Pre-defined compression levels =====-*/ -#define ZSTD_DEFAULT_CLEVEL 1 #define ZSTD_MAX_CLEVEL 22 int ZSTD_maxCLevel(void) { return ZSTD_MAX_CLEVEL; } static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEVEL+1] = { -{ /* "default" */ +{ /* "default" - guarantees a monotonically increasing memory budget */ /* W, C, H, S, L, TL, strat */ { 18, 12, 12, 1, 7, 16, ZSTD_fast }, /* level 0 - never used */ { 19, 13, 14, 1, 7, 16, ZSTD_fast }, /* level 1 */ { 19, 15, 16, 1, 6, 16, ZSTD_fast }, /* level 2 */ - { 20, 16, 17, 1, 5, 16, ZSTD_dfast }, /* level 3.*/ - { 20, 18, 18, 1, 5, 16, ZSTD_dfast }, /* level 4.*/ - { 20, 15, 18, 3, 5, 16, ZSTD_greedy }, /* level 5 */ - { 21, 16, 19, 2, 5, 16, ZSTD_lazy }, /* level 6 */ - { 21, 17, 20, 3, 5, 16, ZSTD_lazy }, /* level 7 */ + { 20, 16, 17, 1, 5, 16, ZSTD_dfast }, /* level 3 */ + { 20, 17, 18, 1, 5, 16, ZSTD_dfast }, /* level 4 */ + { 20, 17, 18, 2, 5, 16, ZSTD_greedy }, /* level 5 */ + { 21, 17, 19, 2, 5, 16, ZSTD_lazy }, /* level 6 */ + { 21, 18, 19, 3, 5, 16, ZSTD_lazy }, /* level 7 */ { 21, 18, 20, 3, 5, 16, ZSTD_lazy2 }, /* level 8 */ - { 21, 20, 20, 3, 5, 16, ZSTD_lazy2 }, /* level 9 */ + { 21, 19, 20, 3, 5, 16, ZSTD_lazy2 }, /* level 9 */ { 21, 19, 21, 4, 5, 16, ZSTD_lazy2 }, /* level 10 */ { 22, 20, 22, 4, 5, 16, ZSTD_lazy2 }, /* level 11 */ { 22, 20, 22, 5, 5, 16, ZSTD_lazy2 }, /* level 12 */ { 22, 21, 22, 5, 5, 16, ZSTD_lazy2 }, /* level 13 */ { 22, 21, 22, 6, 5, 16, ZSTD_lazy2 }, /* level 14 */ - { 22, 21, 21, 5, 5, 16, ZSTD_btlazy2 }, /* level 15 */ + { 22, 21, 22, 5, 5, 16, ZSTD_btlazy2 }, /* level 15 */ { 23, 22, 22, 5, 5, 16, ZSTD_btlazy2 }, /* level 16 */ - { 23, 21, 22, 4, 5, 24, ZSTD_btopt }, /* level 17 */ + { 23, 22, 22, 4, 5, 24, ZSTD_btopt }, /* level 17 */ { 23, 22, 22, 5, 4, 32, ZSTD_btopt }, /* level 18 */ { 23, 23, 22, 6, 3, 48, ZSTD_btopt }, /* level 19 */ - { 25, 25, 23, 7, 3, 64, ZSTD_btopt2 }, /* level 20 */ - { 26, 26, 23, 7, 3,256, ZSTD_btopt2 }, /* level 21 */ - { 27, 27, 25, 9, 3,512, ZSTD_btopt2 }, /* level 22 */ + { 25, 25, 23, 7, 3, 64, ZSTD_btultra }, /* level 20 */ + { 26, 26, 24, 7, 3,256, ZSTD_btultra }, /* level 21 */ + { 27, 27, 25, 9, 3,512, ZSTD_btultra }, /* level 22 */ }, { /* for srcSize <= 256 KB */ /* W, C, H, S, L, T, strat */ @@ -3507,9 +4082,9 @@ static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEV { 18, 19, 18, 8, 3, 64, ZSTD_btopt }, /* level 17.*/ { 18, 19, 18, 9, 3,128, ZSTD_btopt }, /* level 18.*/ { 18, 19, 18, 10, 3,256, ZSTD_btopt }, /* level 19.*/ - { 18, 19, 18, 11, 3,512, ZSTD_btopt2 }, /* level 20.*/ - { 18, 19, 18, 12, 3,512, ZSTD_btopt2 }, /* level 21.*/ - { 18, 19, 18, 13, 3,512, ZSTD_btopt2 }, /* level 22.*/ + { 18, 19, 18, 11, 3,512, ZSTD_btultra }, /* level 20.*/ + { 18, 19, 18, 12, 3,512, ZSTD_btultra }, /* level 21.*/ + { 18, 19, 18, 13, 3,512, ZSTD_btultra }, /* level 22.*/ }, { /* for srcSize <= 128 KB */ /* W, C, H, S, L, T, strat */ @@ -3533,9 +4108,9 @@ static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEV { 17, 18, 17, 7, 3, 64, ZSTD_btopt }, /* level 17.*/ { 17, 18, 17, 7, 3,256, ZSTD_btopt }, /* level 18.*/ { 17, 18, 17, 8, 3,256, ZSTD_btopt }, /* level 19.*/ - { 17, 18, 17, 9, 3,256, ZSTD_btopt2 }, /* level 20.*/ - { 17, 18, 17, 10, 3,256, ZSTD_btopt2 }, /* level 21.*/ - { 17, 18, 17, 11, 3,512, ZSTD_btopt2 }, /* level 22.*/ + { 17, 18, 17, 9, 3,256, ZSTD_btultra }, /* level 20.*/ + { 17, 18, 17, 10, 3,256, ZSTD_btultra }, /* level 21.*/ + { 17, 18, 17, 11, 3,512, ZSTD_btultra }, /* level 22.*/ }, { /* for srcSize <= 16 KB */ /* W, C, H, S, L, T, strat */ @@ -3559,39 +4134,59 @@ static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEV { 14, 15, 15, 6, 3,128, ZSTD_btopt }, /* level 17.*/ { 14, 15, 15, 6, 3,256, ZSTD_btopt }, /* level 18.*/ { 14, 15, 15, 7, 3,256, ZSTD_btopt }, /* level 19.*/ - { 14, 15, 15, 8, 3,256, ZSTD_btopt2 }, /* level 20.*/ - { 14, 15, 15, 9, 3,256, ZSTD_btopt2 }, /* level 21.*/ - { 14, 15, 15, 10, 3,256, ZSTD_btopt2 }, /* level 22.*/ + { 14, 15, 15, 8, 3,256, ZSTD_btultra }, /* level 20.*/ + { 14, 15, 15, 9, 3,256, ZSTD_btultra }, /* level 21.*/ + { 14, 15, 15, 10, 3,256, ZSTD_btultra }, /* level 22.*/ }, }; +#if defined(ZSTD_DEBUG) && (ZSTD_DEBUG>=1) +/* This function just controls + * the monotonic memory budget increase of ZSTD_defaultCParameters[0]. + * Run once, on first ZSTD_getCParams() usage, if ZSTD_DEBUG is enabled + */ +MEM_STATIC void ZSTD_check_compressionLevel_monotonicIncrease_memoryBudget(void) +{ + int level; + for (level=1; level ZSTD_MAX_CLEVEL) compressionLevel = ZSTD_MAX_CLEVEL; - cp = ZSTD_defaultCParameters[tableID][compressionLevel]; - if (MEM_32bits()) { /* auto-correction, for 32-bits mode */ - if (cp.windowLog > ZSTD_WINDOWLOG_MAX) cp.windowLog = ZSTD_WINDOWLOG_MAX; - if (cp.chainLog > ZSTD_CHAINLOG_MAX) cp.chainLog = ZSTD_CHAINLOG_MAX; - if (cp.hashLog > ZSTD_HASHLOG_MAX) cp.hashLog = ZSTD_HASHLOG_MAX; + +#if defined(ZSTD_DEBUG) && (ZSTD_DEBUG>=1) + static int g_monotonicTest = 1; + if (g_monotonicTest) { + ZSTD_check_compressionLevel_monotonicIncrease_memoryBudget(); + g_monotonicTest=0; } - cp = ZSTD_adjustCParams(cp, srcSize, dictSize); - return cp; +#endif + + if (compressionLevel <= 0) compressionLevel = ZSTD_CLEVEL_DEFAULT; /* 0 == default; no negative compressionLevel yet */ + if (compressionLevel > ZSTD_MAX_CLEVEL) compressionLevel = ZSTD_MAX_CLEVEL; + { ZSTD_compressionParameters const cp = ZSTD_defaultCParameters[tableID][compressionLevel]; + return ZSTD_adjustCParams_internal(cp, srcSizeHint, dictSize); } } /*! ZSTD_getParams() : * same as ZSTD_getCParams(), but @return a `ZSTD_parameters` object (instead of `ZSTD_compressionParameters`). * All fields of `ZSTD_frameParameters` are set to default (0) */ -ZSTD_parameters ZSTD_getParams(int compressionLevel, unsigned long long srcSize, size_t dictSize) { +ZSTD_parameters ZSTD_getParams(int compressionLevel, unsigned long long srcSizeHint, size_t dictSize) { ZSTD_parameters params; - ZSTD_compressionParameters const cParams = ZSTD_getCParams(compressionLevel, srcSize, dictSize); + ZSTD_compressionParameters const cParams = ZSTD_getCParams(compressionLevel, srcSizeHint, dictSize); memset(¶ms, 0, sizeof(params)); params.cParams = cParams; return params; diff --git a/contrib/zstd/lib/compress/zstd_opt.h b/contrib/zstd/lib/compress/zstd_opt.h index 54376119121d..e8e98915ea31 100644 --- a/contrib/zstd/lib/compress/zstd_opt.h +++ b/contrib/zstd/lib/compress/zstd_opt.h @@ -43,6 +43,7 @@ MEM_STATIC void ZSTD_rescaleFreqs(seqStore_t* ssPtr, const BYTE* src, size_t src if (ssPtr->litLengthSum == 0) { if (srcSize <= 1024) ssPtr->staticPrices = 1; + assert(ssPtr->litFreq!=NULL); for (u=0; u<=MaxLit; u++) ssPtr->litFreq[u] = 0; for (u=0; u >8; + } +} + /* Update hashTable3 up to ip (excluded) Assumption : always within prefix (i.e. not within extDict) */ @@ -234,12 +249,12 @@ static U32 ZSTD_insertBtAndGetAllMatches ( { const BYTE* const base = zc->base; const U32 current = (U32)(ip-base); - const U32 hashLog = zc->params.cParams.hashLog; + const U32 hashLog = zc->appliedParams.cParams.hashLog; const size_t h = ZSTD_hashPtr(ip, hashLog, mls); U32* const hashTable = zc->hashTable; U32 matchIndex = hashTable[h]; U32* const bt = zc->chainTable; - const U32 btLog = zc->params.cParams.chainLog - 1; + const U32 btLog = zc->appliedParams.cParams.chainLog - 1; const U32 btMask= (1U << btLog) - 1; size_t commonLengthSmaller=0, commonLengthLarger=0; const BYTE* const dictBase = zc->dictBase; @@ -267,7 +282,7 @@ static U32 ZSTD_insertBtAndGetAllMatches ( if (match[bestLength] == ip[bestLength]) currentMl = ZSTD_count(ip, match, iLimit); } else { match = dictBase + matchIndex3; - if (MEM_readMINMATCH(match, MINMATCH) == MEM_readMINMATCH(ip, MINMATCH)) /* assumption : matchIndex3 <= dictLimit-4 (by table construction) */ + if (ZSTD_readMINMATCH(match, MINMATCH) == ZSTD_readMINMATCH(ip, MINMATCH)) /* assumption : matchIndex3 <= dictLimit-4 (by table construction) */ currentMl = ZSTD_count_2segments(ip+MINMATCH, match+MINMATCH, iLimit, dictEnd, prefixStart) + MINMATCH; } @@ -410,10 +425,10 @@ void ZSTD_compressBlock_opt_generic(ZSTD_CCtx* ctx, const BYTE* const base = ctx->base; const BYTE* const prefixStart = base + ctx->dictLimit; - const U32 maxSearches = 1U << ctx->params.cParams.searchLog; - const U32 sufficient_len = ctx->params.cParams.targetLength; - const U32 mls = ctx->params.cParams.searchLength; - const U32 minMatch = (ctx->params.cParams.searchLength == 3) ? 3 : 4; + const U32 maxSearches = 1U << ctx->appliedParams.cParams.searchLog; + const U32 sufficient_len = ctx->appliedParams.cParams.targetLength; + const U32 mls = ctx->appliedParams.cParams.searchLength; + const U32 minMatch = (ctx->appliedParams.cParams.searchLength == 3) ? 3 : 4; ZSTD_optimal_t* opt = seqStorePtr->priceTable; ZSTD_match_t* matches = seqStorePtr->matchTable; @@ -439,7 +454,7 @@ void ZSTD_compressBlock_opt_generic(ZSTD_CCtx* ctx, for (i=(ip == anchor); i 0) && (repCur < (S32)(ip-prefixStart)) - && (MEM_readMINMATCH(ip, minMatch) == MEM_readMINMATCH(ip - repCur, minMatch))) { + && (ZSTD_readMINMATCH(ip, minMatch) == ZSTD_readMINMATCH(ip - repCur, minMatch))) { mlen = (U32)ZSTD_count(ip+minMatch, ip+minMatch-repCur, iend) + minMatch; if (mlen > sufficient_len || mlen >= ZSTD_OPT_NUM) { best_mlen = mlen; best_off = i; cur = 0; last_pos = 1; @@ -524,7 +539,7 @@ void ZSTD_compressBlock_opt_generic(ZSTD_CCtx* ctx, for (i=(opt[cur].mlen != 1); i 0) && (repCur < (S32)(inr-prefixStart)) - && (MEM_readMINMATCH(inr, minMatch) == MEM_readMINMATCH(inr - repCur, minMatch))) { + && (ZSTD_readMINMATCH(inr, minMatch) == ZSTD_readMINMATCH(inr - repCur, minMatch))) { mlen = (U32)ZSTD_count(inr+minMatch, inr+minMatch - repCur, iend) + minMatch; if (mlen > sufficient_len || cur + mlen >= ZSTD_OPT_NUM) { @@ -663,10 +678,10 @@ void ZSTD_compressBlock_opt_extDict_generic(ZSTD_CCtx* ctx, const BYTE* const dictBase = ctx->dictBase; const BYTE* const dictEnd = dictBase + dictLimit; - const U32 maxSearches = 1U << ctx->params.cParams.searchLog; - const U32 sufficient_len = ctx->params.cParams.targetLength; - const U32 mls = ctx->params.cParams.searchLength; - const U32 minMatch = (ctx->params.cParams.searchLength == 3) ? 3 : 4; + const U32 maxSearches = 1U << ctx->appliedParams.cParams.searchLog; + const U32 sufficient_len = ctx->appliedParams.cParams.targetLength; + const U32 mls = ctx->appliedParams.cParams.searchLength; + const U32 minMatch = (ctx->appliedParams.cParams.searchLength == 3) ? 3 : 4; ZSTD_optimal_t* opt = seqStorePtr->priceTable; ZSTD_match_t* matches = seqStorePtr->matchTable; @@ -698,7 +713,7 @@ void ZSTD_compressBlock_opt_extDict_generic(ZSTD_CCtx* ctx, const BYTE* const repMatch = repBase + repIndex; if ( (repCur > 0 && repCur <= (S32)current) && (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex>lowestIndex)) /* intentional overflow */ - && (MEM_readMINMATCH(ip, minMatch) == MEM_readMINMATCH(repMatch, minMatch)) ) { + && (ZSTD_readMINMATCH(ip, minMatch) == ZSTD_readMINMATCH(repMatch, minMatch)) ) { /* repcode detected we should take it */ const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend; mlen = (U32)ZSTD_count_2segments(ip+minMatch, repMatch+minMatch, iend, repEnd, prefixStart) + minMatch; @@ -794,7 +809,7 @@ void ZSTD_compressBlock_opt_extDict_generic(ZSTD_CCtx* ctx, const BYTE* const repMatch = repBase + repIndex; if ( (repCur > 0 && repCur <= (S32)(current+cur)) && (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex>lowestIndex)) /* intentional overflow */ - && (MEM_readMINMATCH(inr, minMatch) == MEM_readMINMATCH(repMatch, minMatch)) ) { + && (ZSTD_readMINMATCH(inr, minMatch) == ZSTD_readMINMATCH(repMatch, minMatch)) ) { /* repcode detected */ const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend; mlen = (U32)ZSTD_count_2segments(inr+minMatch, repMatch+minMatch, iend, repEnd, prefixStart) + minMatch; diff --git a/contrib/zstd/lib/compress/zstdmt_compress.c b/contrib/zstd/lib/compress/zstdmt_compress.c index fc7f52a2902d..0cee01eacb86 100644 --- a/contrib/zstd/lib/compress/zstdmt_compress.c +++ b/contrib/zstd/lib/compress/zstdmt_compress.c @@ -14,34 +14,31 @@ /* ====== Compiler specifics ====== */ #if defined(_MSC_VER) -# pragma warning(disable : 4204) /* disable: C4204: non-constant aggregate initializer */ +# pragma warning(disable : 4204) /* disable: C4204: non-constant aggregate initializer */ #endif /* ====== Dependencies ====== */ -#include /* malloc */ -#include /* memcpy */ -#include "pool.h" /* threadpool */ -#include "threading.h" /* mutex */ -#include "zstd_internal.h" /* MIN, ERROR, ZSTD_*, ZSTD_highbit32 */ +#include /* memcpy, memset */ +#include "pool.h" /* threadpool */ +#include "threading.h" /* mutex */ +#include "zstd_internal.h" /* MIN, ERROR, ZSTD_*, ZSTD_highbit32 */ #include "zstdmt_compress.h" /* ====== Debug ====== */ -#if 0 +#if defined(ZSTD_DEBUG) && (ZSTD_DEBUG>=2) # include # include # include - static unsigned g_debugLevel = 5; -# define DEBUGLOGRAW(l, ...) if (l<=g_debugLevel) { fprintf(stderr, __VA_ARGS__); } -# define DEBUGLOG(l, ...) if (l<=g_debugLevel) { fprintf(stderr, __FILE__ ": "); fprintf(stderr, __VA_ARGS__); fprintf(stderr, " \n"); } +# define DEBUGLOGRAW(l, ...) if (l<=ZSTD_DEBUG) { fprintf(stderr, __VA_ARGS__); } -# define DEBUG_PRINTHEX(l,p,n) { \ - unsigned debug_u; \ - for (debug_u=0; debug_u<(n); debug_u++) \ +# define DEBUG_PRINTHEX(l,p,n) { \ + unsigned debug_u; \ + for (debug_u=0; debug_u<(n); debug_u++) \ DEBUGLOGRAW(l, "%02X ", ((const unsigned char*)(p))[debug_u]); \ - DEBUGLOGRAW(l, " \n"); \ + DEBUGLOGRAW(l, " \n"); \ } static unsigned long long GetCurrentClockTimeMicroseconds(void) @@ -53,22 +50,22 @@ static unsigned long long GetCurrentClockTimeMicroseconds(void) return ((((unsigned long long)newTicks)*(1000000))/_ticksPerSecond); } } -#define MUTEX_WAIT_TIME_DLEVEL 5 -#define PTHREAD_MUTEX_LOCK(mutex) \ -if (g_debugLevel>=MUTEX_WAIT_TIME_DLEVEL) { \ - unsigned long long const beforeTime = GetCurrentClockTimeMicroseconds(); \ - pthread_mutex_lock(mutex); \ - { unsigned long long const afterTime = GetCurrentClockTimeMicroseconds(); \ - unsigned long long const elapsedTime = (afterTime-beforeTime); \ - if (elapsedTime > 1000) { /* or whatever threshold you like; I'm using 1 millisecond here */ \ - DEBUGLOG(MUTEX_WAIT_TIME_DLEVEL, "Thread took %llu microseconds to acquire mutex %s \n", \ - elapsedTime, #mutex); \ - } } \ -} else pthread_mutex_lock(mutex); +#define MUTEX_WAIT_TIME_DLEVEL 6 +#define PTHREAD_MUTEX_LOCK(mutex) { \ + if (ZSTD_DEBUG>=MUTEX_WAIT_TIME_DLEVEL) { \ + unsigned long long const beforeTime = GetCurrentClockTimeMicroseconds(); \ + pthread_mutex_lock(mutex); \ + { unsigned long long const afterTime = GetCurrentClockTimeMicroseconds(); \ + unsigned long long const elapsedTime = (afterTime-beforeTime); \ + if (elapsedTime > 1000) { /* or whatever threshold you like; I'm using 1 millisecond here */ \ + DEBUGLOG(MUTEX_WAIT_TIME_DLEVEL, "Thread took %llu microseconds to acquire mutex %s \n", \ + elapsedTime, #mutex); \ + } } \ + } else pthread_mutex_lock(mutex); \ +} #else -# define DEBUGLOG(l, ...) {} /* disabled */ # define PTHREAD_MUTEX_LOCK(m) pthread_mutex_lock(m) # define DEBUG_PRINTHEX(l,p,n) {} @@ -87,16 +84,19 @@ static const buffer_t g_nullBuffer = { NULL, 0 }; typedef struct ZSTDMT_bufferPool_s { unsigned totalBuffers; unsigned nbBuffers; + ZSTD_customMem cMem; buffer_t bTable[1]; /* variable size */ } ZSTDMT_bufferPool; -static ZSTDMT_bufferPool* ZSTDMT_createBufferPool(unsigned nbThreads) +static ZSTDMT_bufferPool* ZSTDMT_createBufferPool(unsigned nbThreads, ZSTD_customMem cMem) { unsigned const maxNbBuffers = 2*nbThreads + 2; - ZSTDMT_bufferPool* const bufPool = (ZSTDMT_bufferPool*)calloc(1, sizeof(ZSTDMT_bufferPool) + (maxNbBuffers-1) * sizeof(buffer_t)); + ZSTDMT_bufferPool* const bufPool = (ZSTDMT_bufferPool*)ZSTD_calloc( + sizeof(ZSTDMT_bufferPool) + (maxNbBuffers-1) * sizeof(buffer_t), cMem); if (bufPool==NULL) return NULL; bufPool->totalBuffers = maxNbBuffers; bufPool->nbBuffers = 0; + bufPool->cMem = cMem; return bufPool; } @@ -105,23 +105,39 @@ static void ZSTDMT_freeBufferPool(ZSTDMT_bufferPool* bufPool) unsigned u; if (!bufPool) return; /* compatibility with free on NULL */ for (u=0; u totalBuffers; u++) - free(bufPool->bTable[u].start); - free(bufPool); + ZSTD_free(bufPool->bTable[u].start, bufPool->cMem); + ZSTD_free(bufPool, bufPool->cMem); } -/* assumption : invocation from main thread only ! */ +/* only works at initialization, not during compression */ +static size_t ZSTDMT_sizeof_bufferPool(ZSTDMT_bufferPool* bufPool) +{ + size_t const poolSize = sizeof(*bufPool) + + (bufPool->totalBuffers - 1) * sizeof(buffer_t); + unsigned u; + size_t totalBufferSize = 0; + for (u=0; u totalBuffers; u++) + totalBufferSize += bufPool->bTable[u].size; + + return poolSize + totalBufferSize; +} + +/** ZSTDMT_getBuffer() : + * assumption : invocation from main thread only ! */ static buffer_t ZSTDMT_getBuffer(ZSTDMT_bufferPool* pool, size_t bSize) { if (pool->nbBuffers) { /* try to use an existing buffer */ buffer_t const buf = pool->bTable[--(pool->nbBuffers)]; size_t const availBufferSize = buf.size; - if ((availBufferSize >= bSize) & (availBufferSize <= 10*bSize)) /* large enough, but not too much */ + if ((availBufferSize >= bSize) & (availBufferSize <= 10*bSize)) + /* large enough, but not too much */ return buf; - free(buf.start); /* size conditions not respected : scratch this buffer and create a new one */ + /* size conditions not respected : scratch this buffer, create new one */ + ZSTD_free(buf.start, pool->cMem); } /* create new buffer */ { buffer_t buffer; - void* const start = malloc(bSize); + void* const start = ZSTD_malloc(bSize, pool->cMem); if (start==NULL) bSize = 0; buffer.start = start; /* note : start can be NULL if malloc fails ! */ buffer.size = bSize; @@ -138,7 +154,7 @@ static void ZSTDMT_releaseBuffer(ZSTDMT_bufferPool* pool, buffer_t buf) return; } /* Reached bufferPool capacity (should not happen) */ - free(buf.start); + ZSTD_free(buf.start, pool->cMem); } @@ -147,6 +163,7 @@ static void ZSTDMT_releaseBuffer(ZSTDMT_bufferPool* pool, buffer_t buf) typedef struct { unsigned totalCCtx; unsigned availCCtx; + ZSTD_customMem cMem; ZSTD_CCtx* cctx[1]; /* variable size */ } ZSTDMT_CCtxPool; @@ -158,23 +175,40 @@ static void ZSTDMT_freeCCtxPool(ZSTDMT_CCtxPool* pool) unsigned u; for (u=0; u totalCCtx; u++) ZSTD_freeCCtx(pool->cctx[u]); /* note : compatible with free on NULL */ - free(pool); + ZSTD_free(pool, pool->cMem); } /* ZSTDMT_createCCtxPool() : * implies nbThreads >= 1 , checked by caller ZSTDMT_createCCtx() */ -static ZSTDMT_CCtxPool* ZSTDMT_createCCtxPool(unsigned nbThreads) +static ZSTDMT_CCtxPool* ZSTDMT_createCCtxPool(unsigned nbThreads, + ZSTD_customMem cMem) { - ZSTDMT_CCtxPool* const cctxPool = (ZSTDMT_CCtxPool*) calloc(1, sizeof(ZSTDMT_CCtxPool) + (nbThreads-1)*sizeof(ZSTD_CCtx*)); + ZSTDMT_CCtxPool* const cctxPool = (ZSTDMT_CCtxPool*) ZSTD_calloc( + sizeof(ZSTDMT_CCtxPool) + (nbThreads-1)*sizeof(ZSTD_CCtx*), cMem); if (!cctxPool) return NULL; + cctxPool->cMem = cMem; cctxPool->totalCCtx = nbThreads; cctxPool->availCCtx = 1; /* at least one cctx for single-thread mode */ - cctxPool->cctx[0] = ZSTD_createCCtx(); + cctxPool->cctx[0] = ZSTD_createCCtx_advanced(cMem); if (!cctxPool->cctx[0]) { ZSTDMT_freeCCtxPool(cctxPool); return NULL; } - DEBUGLOG(1, "cctxPool created, with %u threads", nbThreads); + DEBUGLOG(3, "cctxPool created, with %u threads", nbThreads); return cctxPool; } +/* only works during initialization phase, not during compression */ +static size_t ZSTDMT_sizeof_CCtxPool(ZSTDMT_CCtxPool* cctxPool) +{ + unsigned const nbThreads = cctxPool->totalCCtx; + size_t const poolSize = sizeof(*cctxPool) + + (nbThreads-1)*sizeof(ZSTD_CCtx*); + unsigned u; + size_t totalCCtxSize = 0; + for (u=0; u cctx[u]); + + return poolSize + totalCCtxSize; +} + static ZSTD_CCtx* ZSTDMT_getCCtx(ZSTDMT_CCtxPool* pool) { if (pool->availCCtx) { @@ -218,7 +252,7 @@ typedef struct { pthread_mutex_t* jobCompleted_mutex; pthread_cond_t* jobCompleted_cond; ZSTD_parameters params; - ZSTD_CDict* cdict; + const ZSTD_CDict* cdict; unsigned long long fullFrameSize; } ZSTDMT_jobDescription; @@ -228,11 +262,11 @@ void ZSTDMT_compressChunk(void* jobDescription) ZSTDMT_jobDescription* const job = (ZSTDMT_jobDescription*)jobDescription; const void* const src = (const char*)job->srcStart + job->dictSize; buffer_t const dstBuff = job->dstBuff; - DEBUGLOG(3, "job (first:%u) (last:%u) : dictSize %u, srcSize %u", + DEBUGLOG(5, "job (first:%u) (last:%u) : dictSize %u, srcSize %u", job->firstChunk, job->lastChunk, (U32)job->dictSize, (U32)job->srcSize); if (job->cdict) { /* should only happen for first segment */ size_t const initError = ZSTD_compressBegin_usingCDict_advanced(job->cctx, job->cdict, job->params.fParams, job->fullFrameSize); - if (job->cdict) DEBUGLOG(3, "using CDict "); + DEBUGLOG(5, "using CDict"); if (ZSTD_isError(initError)) { job->cSize = initError; goto _endJob; } } else { /* srcStart points at reloaded section */ if (!job->firstChunk) job->params.fParams.contentSizeFlag = 0; /* ensure no srcSize control */ @@ -247,12 +281,12 @@ void ZSTDMT_compressChunk(void* jobDescription) ZSTD_invalidateRepCodes(job->cctx); } - DEBUGLOG(4, "Compressing : "); + DEBUGLOG(5, "Compressing : "); DEBUG_PRINTHEX(4, job->srcStart, 12); job->cSize = (job->lastChunk) ? ZSTD_compressEnd (job->cctx, dstBuff.start, dstBuff.size, src, job->srcSize) : ZSTD_compressContinue(job->cctx, dstBuff.start, dstBuff.size, src, job->srcSize); - DEBUGLOG(3, "compressed %u bytes into %u bytes (first:%u) (last:%u)", + DEBUGLOG(5, "compressed %u bytes into %u bytes (first:%u) (last:%u)", (unsigned)job->srcSize, (unsigned)job->cSize, job->firstChunk, job->lastChunk); DEBUGLOG(5, "dstBuff.size : %u ; => %s", (U32)dstBuff.size, ZSTD_getErrorName(job->cSize)); @@ -271,6 +305,7 @@ _endJob: struct ZSTDMT_CCtx_s { POOL_ctx* factory; + ZSTDMT_jobDescription* jobs; ZSTDMT_bufferPool* buffPool; ZSTDMT_CCtxPool* cctxPool; pthread_mutex_t jobCompleted_mutex; @@ -292,50 +327,64 @@ struct ZSTDMT_CCtx_s { unsigned overlapRLog; unsigned long long frameContentSize; size_t sectionSize; - ZSTD_CDict* cdict; - ZSTD_CStream* cstream; - ZSTDMT_jobDescription jobs[1]; /* variable size (must lies at the end) */ + ZSTD_customMem cMem; + ZSTD_CDict* cdictLocal; + const ZSTD_CDict* cdict; }; -ZSTDMT_CCtx *ZSTDMT_createCCtx(unsigned nbThreads) +static ZSTDMT_jobDescription* ZSTDMT_allocJobsTable(U32* nbJobsPtr, ZSTD_customMem cMem) { - ZSTDMT_CCtx* cctx; - U32 const minNbJobs = nbThreads + 2; - U32 const nbJobsLog2 = ZSTD_highbit32(minNbJobs) + 1; + U32 const nbJobsLog2 = ZSTD_highbit32(*nbJobsPtr) + 1; U32 const nbJobs = 1 << nbJobsLog2; - DEBUGLOG(5, "nbThreads : %u ; minNbJobs : %u ; nbJobsLog2 : %u ; nbJobs : %u \n", - nbThreads, minNbJobs, nbJobsLog2, nbJobs); + *nbJobsPtr = nbJobs; + return (ZSTDMT_jobDescription*) ZSTD_calloc( + nbJobs * sizeof(ZSTDMT_jobDescription), cMem); +} + +ZSTDMT_CCtx* ZSTDMT_createCCtx_advanced(unsigned nbThreads, ZSTD_customMem cMem) +{ + ZSTDMT_CCtx* mtctx; + U32 nbJobs = nbThreads + 2; + DEBUGLOG(3, "ZSTDMT_createCCtx_advanced"); + if ((nbThreads < 1) | (nbThreads > ZSTDMT_NBTHREADS_MAX)) return NULL; - cctx = (ZSTDMT_CCtx*) calloc(1, sizeof(ZSTDMT_CCtx) + nbJobs*sizeof(ZSTDMT_jobDescription)); - if (!cctx) return NULL; - cctx->nbThreads = nbThreads; - cctx->jobIDMask = nbJobs - 1; - cctx->allJobsCompleted = 1; - cctx->sectionSize = 0; - cctx->overlapRLog = 3; - cctx->factory = POOL_create(nbThreads, 1); - cctx->buffPool = ZSTDMT_createBufferPool(nbThreads); - cctx->cctxPool = ZSTDMT_createCCtxPool(nbThreads); - if (!cctx->factory | !cctx->buffPool | !cctx->cctxPool) { /* one object was not created */ - ZSTDMT_freeCCtx(cctx); + if ((cMem.customAlloc!=NULL) ^ (cMem.customFree!=NULL)) + /* invalid custom allocator */ + return NULL; + + mtctx = (ZSTDMT_CCtx*) ZSTD_calloc(sizeof(ZSTDMT_CCtx), cMem); + if (!mtctx) return NULL; + mtctx->cMem = cMem; + mtctx->nbThreads = nbThreads; + mtctx->allJobsCompleted = 1; + mtctx->sectionSize = 0; + mtctx->overlapRLog = 3; + mtctx->factory = POOL_create(nbThreads, 1); + mtctx->jobs = ZSTDMT_allocJobsTable(&nbJobs, cMem); + mtctx->jobIDMask = nbJobs - 1; + mtctx->buffPool = ZSTDMT_createBufferPool(nbThreads, cMem); + mtctx->cctxPool = ZSTDMT_createCCtxPool(nbThreads, cMem); + if (!mtctx->factory | !mtctx->jobs | !mtctx->buffPool | !mtctx->cctxPool) { + ZSTDMT_freeCCtx(mtctx); return NULL; } - if (nbThreads==1) { - cctx->cstream = ZSTD_createCStream(); - if (!cctx->cstream) { - ZSTDMT_freeCCtx(cctx); return NULL; - } } - pthread_mutex_init(&cctx->jobCompleted_mutex, NULL); /* Todo : check init function return */ - pthread_cond_init(&cctx->jobCompleted_cond, NULL); - DEBUGLOG(4, "mt_cctx created, for %u threads \n", nbThreads); - return cctx; + pthread_mutex_init(&mtctx->jobCompleted_mutex, NULL); /* Todo : check init function return */ + pthread_cond_init(&mtctx->jobCompleted_cond, NULL); + DEBUGLOG(3, "mt_cctx created, for %u threads", nbThreads); + return mtctx; +} + +ZSTDMT_CCtx* ZSTDMT_createCCtx(unsigned nbThreads) +{ + return ZSTDMT_createCCtx_advanced(nbThreads, ZSTD_defaultCMem); } /* ZSTDMT_releaseAllJobResources() : - * Ensure all workers are killed first. */ + * note : ensure all workers are killed first ! */ static void ZSTDMT_releaseAllJobResources(ZSTDMT_CCtx* mtctx) { unsigned jobID; + DEBUGLOG(3, "ZSTDMT_releaseAllJobResources"); for (jobID=0; jobID <= mtctx->jobIDMask; jobID++) { ZSTDMT_releaseBuffer(mtctx->buffPool, mtctx->jobs[jobID].dstBuff); mtctx->jobs[jobID].dstBuff = g_nullBuffer; @@ -356,15 +405,26 @@ size_t ZSTDMT_freeCCtx(ZSTDMT_CCtx* mtctx) POOL_free(mtctx->factory); if (!mtctx->allJobsCompleted) ZSTDMT_releaseAllJobResources(mtctx); /* stop workers first */ ZSTDMT_freeBufferPool(mtctx->buffPool); /* release job resources into pools first */ + ZSTD_free(mtctx->jobs, mtctx->cMem); ZSTDMT_freeCCtxPool(mtctx->cctxPool); - ZSTD_freeCDict(mtctx->cdict); - ZSTD_freeCStream(mtctx->cstream); + ZSTD_freeCDict(mtctx->cdictLocal); pthread_mutex_destroy(&mtctx->jobCompleted_mutex); pthread_cond_destroy(&mtctx->jobCompleted_cond); - free(mtctx); + ZSTD_free(mtctx, mtctx->cMem); return 0; } +size_t ZSTDMT_sizeof_CCtx(ZSTDMT_CCtx* mtctx) +{ + if (mtctx == NULL) return 0; /* supports sizeof NULL */ + return sizeof(*mtctx) + + POOL_sizeof(mtctx->factory) + + ZSTDMT_sizeof_bufferPool(mtctx->buffPool) + + (mtctx->jobIDMask+1) * sizeof(ZSTDMT_jobDescription) + + ZSTDMT_sizeof_CCtxPool(mtctx->cctxPool) + + ZSTD_sizeof_CDict(mtctx->cdictLocal); +} + size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSDTMT_parameter parameter, unsigned value) { switch(parameter) @@ -373,7 +433,7 @@ size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSDTMT_parameter parameter, mtctx->sectionSize = value; return 0; case ZSTDMT_p_overlapSectionLog : - DEBUGLOG(4, "ZSTDMT_p_overlapSectionLog : %u", value); + DEBUGLOG(5, "ZSTDMT_p_overlapSectionLog : %u", value); mtctx->overlapRLog = (value >= 9) ? 0 : 9 - value; return 0; default : @@ -386,31 +446,49 @@ size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSDTMT_parameter parameter, /* ===== Multi-threaded compression ===== */ /* ------------------------------------------ */ -size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* mtctx, +static unsigned computeNbChunks(size_t srcSize, unsigned windowLog, unsigned nbThreads) { + size_t const chunkSizeTarget = (size_t)1 << (windowLog + 2); + size_t const chunkMaxSize = chunkSizeTarget << 2; + size_t const passSizeMax = chunkMaxSize * nbThreads; + unsigned const multiplier = (unsigned)(srcSize / passSizeMax) + 1; + unsigned const nbChunksLarge = multiplier * nbThreads; + unsigned const nbChunksMax = (unsigned)(srcSize / chunkSizeTarget) + 1; + unsigned const nbChunksSmall = MIN(nbChunksMax, nbThreads); + return (multiplier>1) ? nbChunksLarge : nbChunksSmall; +} + + +size_t ZSTDMT_compress_advanced(ZSTDMT_CCtx* mtctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, - int compressionLevel) + const ZSTD_CDict* cdict, + ZSTD_parameters const params, + unsigned overlapRLog) { - ZSTD_parameters params = ZSTD_getParams(compressionLevel, srcSize, 0); - U32 const overlapLog = (compressionLevel >= ZSTD_maxCLevel()) ? 0 : 3; - size_t const overlapSize = (size_t)1 << (params.cParams.windowLog - overlapLog); - size_t const chunkTargetSize = (size_t)1 << (params.cParams.windowLog + 2); - unsigned const nbChunksMax = (unsigned)(srcSize / chunkTargetSize) + 1; - unsigned nbChunks = MIN(nbChunksMax, mtctx->nbThreads); + size_t const overlapSize = (overlapRLog>=9) ? 0 : (size_t)1 << (params.cParams.windowLog - overlapRLog); + unsigned nbChunks = computeNbChunks(srcSize, params.cParams.windowLog, mtctx->nbThreads); size_t const proposedChunkSize = (srcSize + (nbChunks-1)) / nbChunks; - size_t const avgChunkSize = ((proposedChunkSize & 0x1FFFF) < 0xFFFF) ? proposedChunkSize + 0xFFFF : proposedChunkSize; /* avoid too small last block */ - size_t remainingSrcSize = srcSize; + size_t const avgChunkSize = ((proposedChunkSize & 0x1FFFF) < 0x7FFF) ? proposedChunkSize + 0xFFFF : proposedChunkSize; /* avoid too small last block */ const char* const srcStart = (const char*)src; + size_t remainingSrcSize = srcSize; unsigned const compressWithinDst = (dstCapacity >= ZSTD_compressBound(srcSize)) ? nbChunks : (unsigned)(dstCapacity / ZSTD_compressBound(avgChunkSize)); /* presumes avgChunkSize >= 256 KB, which should be the case */ size_t frameStartPos = 0, dstBufferPos = 0; - DEBUGLOG(3, "windowLog : %2u => chunkTargetSize : %u bytes ", params.cParams.windowLog, (U32)chunkTargetSize); - DEBUGLOG(2, "nbChunks : %2u (chunkSize : %u bytes) ", nbChunks, (U32)avgChunkSize); - params.fParams.contentSizeFlag = 1; - + DEBUGLOG(4, "nbChunks : %2u (chunkSize : %u bytes) ", nbChunks, (U32)avgChunkSize); if (nbChunks==1) { /* fallback to single-thread mode */ ZSTD_CCtx* const cctx = mtctx->cctxPool->cctx[0]; - return ZSTD_compressCCtx(cctx, dst, dstCapacity, src, srcSize, compressionLevel); + if (cdict) return ZSTD_compress_usingCDict_advanced(cctx, dst, dstCapacity, src, srcSize, cdict, params.fParams); + return ZSTD_compress_advanced(cctx, dst, dstCapacity, src, srcSize, NULL, 0, params); + } + assert(avgChunkSize >= 256 KB); /* condition for ZSTD_compressBound(A) + ZSTD_compressBound(B) <= ZSTD_compressBound(A+B), which is useful to avoid allocating extra buffers */ + + if (nbChunks > mtctx->jobIDMask+1) { /* enlarge job table */ + U32 nbJobs = nbChunks; + ZSTD_free(mtctx->jobs, mtctx->cMem); + mtctx->jobIDMask = 0; + mtctx->jobs = ZSTDMT_allocJobsTable(&nbJobs, mtctx->cMem); + if (mtctx->jobs==NULL) return ERROR(memory_allocation); + mtctx->jobIDMask = nbJobs - 1; } { unsigned u; @@ -425,15 +503,18 @@ size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* mtctx, if ((cctx==NULL) || (dstBuffer.start==NULL)) { mtctx->jobs[u].cSize = ERROR(memory_allocation); /* job result */ mtctx->jobs[u].jobCompleted = 1; - nbChunks = u+1; + nbChunks = u+1; /* only wait and free u jobs, instead of initially expected nbChunks ones */ break; /* let's wait for previous jobs to complete, but don't start new ones */ } mtctx->jobs[u].srcStart = srcStart + frameStartPos - dictSize; mtctx->jobs[u].dictSize = dictSize; mtctx->jobs[u].srcSize = chunkSize; + mtctx->jobs[u].cdict = mtctx->nextJobID==0 ? cdict : NULL; mtctx->jobs[u].fullFrameSize = srcSize; mtctx->jobs[u].params = params; + /* do not calculate checksum within sections, but write it in header for first section */ + if (u!=0) mtctx->jobs[u].params.fParams.checksumFlag = 0; mtctx->jobs[u].dstBuff = dstBuffer; mtctx->jobs[u].cctx = cctx; mtctx->jobs[u].firstChunk = (u==0); @@ -442,27 +523,27 @@ size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* mtctx, mtctx->jobs[u].jobCompleted_mutex = &mtctx->jobCompleted_mutex; mtctx->jobs[u].jobCompleted_cond = &mtctx->jobCompleted_cond; - DEBUGLOG(3, "posting job %u (%u bytes)", u, (U32)chunkSize); - DEBUG_PRINTHEX(3, mtctx->jobs[u].srcStart, 12); + DEBUGLOG(5, "posting job %u (%u bytes)", u, (U32)chunkSize); + DEBUG_PRINTHEX(6, mtctx->jobs[u].srcStart, 12); POOL_add(mtctx->factory, ZSTDMT_compressChunk, &mtctx->jobs[u]); frameStartPos += chunkSize; dstBufferPos += dstBufferCapacity; remainingSrcSize -= chunkSize; } } - /* note : since nbChunks <= nbThreads, all jobs should be running immediately in parallel */ + /* collect result */ { unsigned chunkID; size_t error = 0, dstPos = 0; for (chunkID=0; chunkID jobCompleted_mutex); while (mtctx->jobs[chunkID].jobCompleted==0) { - DEBUGLOG(4, "waiting for jobCompleted signal from chunk %u", chunkID); + DEBUGLOG(5, "waiting for jobCompleted signal from chunk %u", chunkID); pthread_cond_wait(&mtctx->jobCompleted_cond, &mtctx->jobCompleted_mutex); } pthread_mutex_unlock(&mtctx->jobCompleted_mutex); - DEBUGLOG(3, "ready to write chunk %u ", chunkID); + DEBUGLOG(5, "ready to write chunk %u ", chunkID); ZSTDMT_releaseCCtx(mtctx->cctxPool, mtctx->jobs[chunkID].cctx); mtctx->jobs[chunkID].cctx = NULL; @@ -470,20 +551,33 @@ size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* mtctx, { size_t const cSize = mtctx->jobs[chunkID].cSize; if (ZSTD_isError(cSize)) error = cSize; if ((!error) && (dstPos + cSize > dstCapacity)) error = ERROR(dstSize_tooSmall); - if (chunkID) { /* note : chunk 0 is already written directly into dst */ + if (chunkID) { /* note : chunk 0 is written directly at dst, which is correct position */ if (!error) - memmove((char*)dst + dstPos, mtctx->jobs[chunkID].dstBuff.start, cSize); /* may overlap if chunk decompressed within dst */ - if (chunkID >= compressWithinDst) /* otherwise, it decompresses within dst */ + memmove((char*)dst + dstPos, mtctx->jobs[chunkID].dstBuff.start, cSize); /* may overlap when chunk compressed within dst */ + if (chunkID >= compressWithinDst) { /* chunk compressed into its own buffer, which must be released */ + DEBUGLOG(5, "releasing buffer %u>=%u", chunkID, compressWithinDst); ZSTDMT_releaseBuffer(mtctx->buffPool, mtctx->jobs[chunkID].dstBuff); + } mtctx->jobs[chunkID].dstBuff = g_nullBuffer; } dstPos += cSize ; } } - if (!error) DEBUGLOG(3, "compressed size : %u ", (U32)dstPos); + if (!error) DEBUGLOG(4, "compressed size : %u ", (U32)dstPos); return error ? error : dstPos; } +} + +size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* mtctx, + void* dst, size_t dstCapacity, + const void* src, size_t srcSize, + int compressionLevel) +{ + U32 const overlapRLog = (compressionLevel >= ZSTD_maxCLevel()) ? 0 : 3; + ZSTD_parameters params = ZSTD_getParams(compressionLevel, srcSize, 0); + params.fParams.contentSizeFlag = 1; + return ZSTDMT_compress_advanced(mtctx, dst, dstCapacity, src, srcSize, NULL, params, overlapRLog); } @@ -491,12 +585,14 @@ size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* mtctx, /* ======= Streaming API ======= */ /* ====================================== */ -static void ZSTDMT_waitForAllJobsCompleted(ZSTDMT_CCtx* zcs) { +static void ZSTDMT_waitForAllJobsCompleted(ZSTDMT_CCtx* zcs) +{ + DEBUGLOG(4, "ZSTDMT_waitForAllJobsCompleted"); while (zcs->doneJobID < zcs->nextJobID) { unsigned const jobID = zcs->doneJobID & zcs->jobIDMask; PTHREAD_MUTEX_LOCK(&zcs->jobCompleted_mutex); while (zcs->jobs[jobID].jobCompleted==0) { - DEBUGLOG(4, "waiting for jobCompleted signal from chunk %u", zcs->doneJobID); /* we want to block when waiting for data to flush */ + DEBUGLOG(5, "waiting for jobCompleted signal from chunk %u", zcs->doneJobID); /* we want to block when waiting for data to flush */ pthread_cond_wait(&zcs->jobCompleted_cond, &zcs->jobCompleted_mutex); } pthread_mutex_unlock(&zcs->jobCompleted_mutex); @@ -505,33 +601,54 @@ static void ZSTDMT_waitForAllJobsCompleted(ZSTDMT_CCtx* zcs) { } -static size_t ZSTDMT_initCStream_internal(ZSTDMT_CCtx* zcs, - const void* dict, size_t dictSize, unsigned updateDict, - ZSTD_parameters params, unsigned long long pledgedSrcSize) +/** ZSTDMT_initCStream_internal() : + * internal usage only */ +size_t ZSTDMT_initCStream_internal(ZSTDMT_CCtx* zcs, + const void* dict, size_t dictSize, const ZSTD_CDict* cdict, + ZSTD_parameters params, unsigned long long pledgedSrcSize) { - ZSTD_customMem const cmem = { NULL, NULL, NULL }; - DEBUGLOG(3, "Started new compression, with windowLog : %u", params.cParams.windowLog); - if (zcs->nbThreads==1) return ZSTD_initCStream_advanced(zcs->cstream, dict, dictSize, params, pledgedSrcSize); - if (zcs->allJobsCompleted == 0) { /* previous job not correctly finished */ + DEBUGLOG(4, "ZSTDMT_initCStream_internal"); + /* params are supposed to be fully validated at this point */ + assert(!ZSTD_isError(ZSTD_checkCParams(params.cParams))); + assert(!((dict) && (cdict))); /* either dict or cdict, not both */ + + if (zcs->nbThreads==1) { + DEBUGLOG(4, "single thread mode"); + return ZSTD_initCStream_internal(zcs->cctxPool->cctx[0], + dict, dictSize, cdict, + params, pledgedSrcSize); + } + + if (zcs->allJobsCompleted == 0) { /* previous compression not correctly finished */ ZSTDMT_waitForAllJobsCompleted(zcs); ZSTDMT_releaseAllJobResources(zcs); zcs->allJobsCompleted = 1; } + zcs->params = params; - if (updateDict) { - ZSTD_freeCDict(zcs->cdict); zcs->cdict = NULL; - if (dict && dictSize) { - zcs->cdict = ZSTD_createCDict_advanced(dict, dictSize, 0, params.cParams, cmem); - if (zcs->cdict == NULL) return ERROR(memory_allocation); - } } zcs->frameContentSize = pledgedSrcSize; + if (dict) { + DEBUGLOG(4,"cdictLocal: %08X", (U32)(size_t)zcs->cdictLocal); + ZSTD_freeCDict(zcs->cdictLocal); + zcs->cdictLocal = ZSTD_createCDict_advanced(dict, dictSize, + 0 /* byRef */, ZSTD_dm_auto, /* note : a loadPrefix becomes an internal CDict */ + params.cParams, zcs->cMem); + zcs->cdict = zcs->cdictLocal; + if (zcs->cdictLocal == NULL) return ERROR(memory_allocation); + } else { + DEBUGLOG(4,"cdictLocal: %08X", (U32)(size_t)zcs->cdictLocal); + ZSTD_freeCDict(zcs->cdictLocal); + zcs->cdictLocal = NULL; + zcs->cdict = cdict; + } + zcs->targetDictSize = (zcs->overlapRLog>=9) ? 0 : (size_t)1 << (zcs->params.cParams.windowLog - zcs->overlapRLog); DEBUGLOG(4, "overlapRLog : %u ", zcs->overlapRLog); - DEBUGLOG(3, "overlap Size : %u KB", (U32)(zcs->targetDictSize>>10)); + DEBUGLOG(4, "overlap Size : %u KB", (U32)(zcs->targetDictSize>>10)); zcs->targetSectionSize = zcs->sectionSize ? zcs->sectionSize : (size_t)1 << (zcs->params.cParams.windowLog + 2); zcs->targetSectionSize = MAX(ZSTDMT_SECTION_SIZE_MIN, zcs->targetSectionSize); zcs->targetSectionSize = MAX(zcs->targetDictSize, zcs->targetSectionSize); - DEBUGLOG(3, "Section Size : %u KB", (U32)(zcs->targetSectionSize>>10)); + DEBUGLOG(4, "Section Size : %u KB", (U32)(zcs->targetSectionSize>>10)); zcs->marginSize = zcs->targetSectionSize >> 2; zcs->inBuffSize = zcs->targetDictSize + zcs->targetSectionSize + zcs->marginSize; zcs->inBuff.buffer = ZSTDMT_getBuffer(zcs->buffPool, zcs->inBuffSize); @@ -546,24 +663,39 @@ static size_t ZSTDMT_initCStream_internal(ZSTDMT_CCtx* zcs, return 0; } -size_t ZSTDMT_initCStream_advanced(ZSTDMT_CCtx* zcs, +size_t ZSTDMT_initCStream_advanced(ZSTDMT_CCtx* mtctx, const void* dict, size_t dictSize, ZSTD_parameters params, unsigned long long pledgedSrcSize) { - return ZSTDMT_initCStream_internal(zcs, dict, dictSize, 1, params, pledgedSrcSize); + DEBUGLOG(5, "ZSTDMT_initCStream_advanced"); + return ZSTDMT_initCStream_internal(mtctx, dict, dictSize, NULL, params, pledgedSrcSize); } +size_t ZSTDMT_initCStream_usingCDict(ZSTDMT_CCtx* mtctx, + const ZSTD_CDict* cdict, + ZSTD_frameParameters fParams, + unsigned long long pledgedSrcSize) +{ + ZSTD_parameters params = ZSTD_getParamsFromCDict(cdict); + if (cdict==NULL) return ERROR(dictionary_wrong); /* method incompatible with NULL cdict */ + params.fParams = fParams; + return ZSTDMT_initCStream_internal(mtctx, NULL, 0 /*dictSize*/, cdict, + params, pledgedSrcSize); +} + + /* ZSTDMT_resetCStream() : * pledgedSrcSize is optional and can be zero == unknown */ size_t ZSTDMT_resetCStream(ZSTDMT_CCtx* zcs, unsigned long long pledgedSrcSize) { - if (zcs->nbThreads==1) return ZSTD_resetCStream(zcs->cstream, pledgedSrcSize); + if (zcs->nbThreads==1) + return ZSTD_resetCStream(zcs->cctxPool->cctx[0], pledgedSrcSize); return ZSTDMT_initCStream_internal(zcs, NULL, 0, 0, zcs->params, pledgedSrcSize); } size_t ZSTDMT_initCStream(ZSTDMT_CCtx* zcs, int compressionLevel) { ZSTD_parameters const params = ZSTD_getParams(compressionLevel, 0, 0); - return ZSTDMT_initCStream_internal(zcs, NULL, 0, 1, params, 0); + return ZSTDMT_initCStream_internal(zcs, NULL, 0, NULL, params, 0); } @@ -582,13 +714,16 @@ static size_t ZSTDMT_createCompressionJob(ZSTDMT_CCtx* zcs, size_t srcSize, unsi return ERROR(memory_allocation); } - DEBUGLOG(4, "preparing job %u to compress %u bytes with %u preload ", zcs->nextJobID, (U32)srcSize, (U32)zcs->dictSize); + DEBUGLOG(4, "preparing job %u to compress %u bytes with %u preload ", + zcs->nextJobID, (U32)srcSize, (U32)zcs->dictSize); zcs->jobs[jobID].src = zcs->inBuff.buffer; zcs->jobs[jobID].srcStart = zcs->inBuff.buffer.start; zcs->jobs[jobID].srcSize = srcSize; - zcs->jobs[jobID].dictSize = zcs->dictSize; /* note : zcs->inBuff.filled is presumed >= srcSize + dictSize */ + zcs->jobs[jobID].dictSize = zcs->dictSize; + assert(zcs->inBuff.filled >= srcSize + zcs->dictSize); zcs->jobs[jobID].params = zcs->params; - if (zcs->nextJobID) zcs->jobs[jobID].params.fParams.checksumFlag = 0; /* do not calculate checksum within sections, just keep it in header for first section */ + /* do not calculate checksum within sections, but write it in header for first section */ + if (zcs->nextJobID) zcs->jobs[jobID].params.fParams.checksumFlag = 0; zcs->jobs[jobID].cdict = zcs->nextJobID==0 ? zcs->cdict : NULL; zcs->jobs[jobID].fullFrameSize = zcs->frameContentSize; zcs->jobs[jobID].dstBuff = dstBuffer; @@ -603,6 +738,7 @@ static size_t ZSTDMT_createCompressionJob(ZSTDMT_CCtx* zcs, size_t srcSize, unsi /* get a new buffer for next input */ if (!endFrame) { size_t const newDictSize = MIN(srcSize + zcs->dictSize, zcs->targetDictSize); + DEBUGLOG(5, "ZSTDMT_createCompressionJob::endFrame = %u", endFrame); zcs->inBuff.buffer = ZSTDMT_getBuffer(zcs->buffPool, zcs->inBuffSize); if (zcs->inBuff.buffer.start == NULL) { /* not enough memory to allocate next input buffer */ zcs->jobs[jobID].jobCompleted = 1; @@ -611,22 +747,33 @@ static size_t ZSTDMT_createCompressionJob(ZSTDMT_CCtx* zcs, size_t srcSize, unsi ZSTDMT_releaseAllJobResources(zcs); return ERROR(memory_allocation); } - DEBUGLOG(5, "inBuff filled to %u", (U32)zcs->inBuff.filled); + DEBUGLOG(5, "inBuff currently filled to %u", (U32)zcs->inBuff.filled); zcs->inBuff.filled -= srcSize + zcs->dictSize - newDictSize; - DEBUGLOG(5, "new job : filled to %u, with %u dict and %u src", (U32)zcs->inBuff.filled, (U32)newDictSize, (U32)(zcs->inBuff.filled - newDictSize)); - memmove(zcs->inBuff.buffer.start, (const char*)zcs->jobs[jobID].srcStart + zcs->dictSize + srcSize - newDictSize, zcs->inBuff.filled); + DEBUGLOG(5, "new job : inBuff filled to %u, with %u dict and %u src", + (U32)zcs->inBuff.filled, (U32)newDictSize, + (U32)(zcs->inBuff.filled - newDictSize)); + memmove(zcs->inBuff.buffer.start, + (const char*)zcs->jobs[jobID].srcStart + zcs->dictSize + srcSize - newDictSize, + zcs->inBuff.filled); DEBUGLOG(5, "new inBuff pre-filled"); zcs->dictSize = newDictSize; - } else { + } else { /* if (endFrame==1) */ + DEBUGLOG(5, "ZSTDMT_createCompressionJob::endFrame = %u", endFrame); zcs->inBuff.buffer = g_nullBuffer; zcs->inBuff.filled = 0; zcs->dictSize = 0; zcs->frameEnded = 1; if (zcs->nextJobID == 0) - zcs->params.fParams.checksumFlag = 0; /* single chunk : checksum is calculated directly within worker thread */ + /* single chunk exception : checksum is calculated directly within worker thread */ + zcs->params.fParams.checksumFlag = 0; } - DEBUGLOG(3, "posting job %u : %u bytes (end:%u) (note : doneJob = %u=>%u)", zcs->nextJobID, (U32)zcs->jobs[jobID].srcSize, zcs->jobs[jobID].lastChunk, zcs->doneJobID, zcs->doneJobID & zcs->jobIDMask); + DEBUGLOG(4, "posting job %u : %u bytes (end:%u) (note : doneJob = %u=>%u)", + zcs->nextJobID, + (U32)zcs->jobs[jobID].srcSize, + zcs->jobs[jobID].lastChunk, + zcs->doneJobID, + zcs->doneJobID & zcs->jobIDMask); POOL_add(zcs->factory, ZSTDMT_compressChunk, &zcs->jobs[jobID]); /* this call is blocking when thread worker pool is exhausted */ zcs->nextJobID++; return 0; @@ -664,7 +811,7 @@ static size_t ZSTDMT_flushNextJob(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* output, unsi XXH64_update(&zcs->xxhState, (const char*)job.srcStart + job.dictSize, job.srcSize); if (zcs->frameEnded && (zcs->doneJobID+1 == zcs->nextJobID)) { /* write checksum at end of last section */ U32 const checksum = (U32)XXH64_digest(&zcs->xxhState); - DEBUGLOG(4, "writing checksum : %08X \n", checksum); + DEBUGLOG(5, "writing checksum : %08X \n", checksum); MEM_writeLE32((char*)job.dstBuff.start + job.cSize, checksum); job.cSize += 4; zcs->jobs[wJobID].cSize += 4; @@ -675,7 +822,7 @@ static size_t ZSTDMT_flushNextJob(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* output, unsi zcs->jobs[wJobID].jobScanned = 1; } { size_t const toWrite = MIN(job.cSize - job.dstFlushed, output->size - output->pos); - DEBUGLOG(4, "Flushing %u bytes from job %u ", (U32)toWrite, zcs->doneJobID); + DEBUGLOG(5, "Flushing %u bytes from job %u ", (U32)toWrite, zcs->doneJobID); memcpy((char*)output->dst + output->pos, (const char*)job.dstBuff.start + job.dstFlushed, toWrite); output->pos += toWrite; job.dstFlushed += toWrite; @@ -696,26 +843,81 @@ static size_t ZSTDMT_flushNextJob(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* output, unsi } } -size_t ZSTDMT_compressStream(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* output, ZSTD_inBuffer* input) +/** ZSTDMT_compressStream_generic() : + * internal use only + * assumption : output and input are valid (pos <= size) + * @return : minimum amount of data remaining to flush, 0 if none */ +size_t ZSTDMT_compressStream_generic(ZSTDMT_CCtx* mtctx, + ZSTD_outBuffer* output, + ZSTD_inBuffer* input, + ZSTD_EndDirective endOp) { - size_t const newJobThreshold = zcs->dictSize + zcs->targetSectionSize + zcs->marginSize; - if (zcs->frameEnded) return ERROR(stage_wrong); /* current frame being ended. Only flush is allowed. Restart with init */ - if (zcs->nbThreads==1) return ZSTD_compressStream(zcs->cstream, output, input); + size_t const newJobThreshold = mtctx->dictSize + mtctx->targetSectionSize + mtctx->marginSize; + assert(output->pos <= output->size); + assert(input->pos <= input->size); + if ((mtctx->frameEnded) && (endOp==ZSTD_e_continue)) { + /* current frame being ended. Only flush/end are allowed. Or start new frame with init */ + return ERROR(stage_wrong); + } + if (mtctx->nbThreads==1) { + return ZSTD_compressStream_generic(mtctx->cctxPool->cctx[0], output, input, endOp); + } + + /* single-pass shortcut (note : this is blocking-mode) */ + if ( (mtctx->nextJobID==0) /* just started */ + && (mtctx->inBuff.filled==0) /* nothing buffered */ + && (endOp==ZSTD_e_end) /* end order */ + && (output->size - output->pos >= ZSTD_compressBound(input->size - input->pos)) ) { /* enough room */ + size_t const cSize = ZSTDMT_compress_advanced(mtctx, + (char*)output->dst + output->pos, output->size - output->pos, + (const char*)input->src + input->pos, input->size - input->pos, + mtctx->cdict, mtctx->params, mtctx->overlapRLog); + if (ZSTD_isError(cSize)) return cSize; + input->pos = input->size; + output->pos += cSize; + ZSTDMT_releaseBuffer(mtctx->buffPool, mtctx->inBuff.buffer); /* was allocated in initStream */ + mtctx->allJobsCompleted = 1; + mtctx->frameEnded = 1; + return 0; + } /* fill input buffer */ - { size_t const toLoad = MIN(input->size - input->pos, zcs->inBuffSize - zcs->inBuff.filled); - memcpy((char*)zcs->inBuff.buffer.start + zcs->inBuff.filled, input->src, toLoad); + if ((input->src) && (mtctx->inBuff.buffer.start)) { /* support NULL input */ + size_t const toLoad = MIN(input->size - input->pos, mtctx->inBuffSize - mtctx->inBuff.filled); + DEBUGLOG(2, "inBuff:%08X; inBuffSize=%u; ToCopy=%u", (U32)(size_t)mtctx->inBuff.buffer.start, (U32)mtctx->inBuffSize, (U32)toLoad); + memcpy((char*)mtctx->inBuff.buffer.start + mtctx->inBuff.filled, (const char*)input->src + input->pos, toLoad); input->pos += toLoad; - zcs->inBuff.filled += toLoad; + mtctx->inBuff.filled += toLoad; } - if ( (zcs->inBuff.filled >= newJobThreshold) /* filled enough : let's compress */ - && (zcs->nextJobID <= zcs->doneJobID + zcs->jobIDMask) ) { /* avoid overwriting job round buffer */ - CHECK_F( ZSTDMT_createCompressionJob(zcs, zcs->targetSectionSize, 0) ); + if ( (mtctx->inBuff.filled >= newJobThreshold) /* filled enough : let's compress */ + && (mtctx->nextJobID <= mtctx->doneJobID + mtctx->jobIDMask) ) { /* avoid overwriting job round buffer */ + CHECK_F( ZSTDMT_createCompressionJob(mtctx, mtctx->targetSectionSize, 0 /* endFrame */) ); } - /* check for data to flush */ - CHECK_F( ZSTDMT_flushNextJob(zcs, output, (zcs->inBuff.filled == zcs->inBuffSize)) ); /* block if it wasn't possible to create new job due to saturation */ + /* check for potential compressed data ready to be flushed */ + CHECK_F( ZSTDMT_flushNextJob(mtctx, output, (mtctx->inBuff.filled == mtctx->inBuffSize) /* blockToFlush */) ); /* block if it wasn't possible to create new job due to saturation */ + + if (input->pos < input->size) /* input not consumed : do not flush yet */ + endOp = ZSTD_e_continue; + + switch(endOp) + { + case ZSTD_e_flush: + return ZSTDMT_flushStream(mtctx, output); + case ZSTD_e_end: + return ZSTDMT_endStream(mtctx, output); + case ZSTD_e_continue: + return 1; + default: + return ERROR(GENERIC); /* invalid endDirective */ + } +} + + +size_t ZSTDMT_compressStream(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* output, ZSTD_inBuffer* input) +{ + CHECK_F( ZSTDMT_compressStream_generic(zcs, output, input, ZSTD_e_continue) ); /* recommended next input size : fill current input buffer */ return zcs->inBuffSize - zcs->inBuff.filled; /* note : could be zero when input buffer is fully filled and no more availability to create new job */ @@ -726,26 +928,28 @@ static size_t ZSTDMT_flushStream_internal(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* outp { size_t const srcSize = zcs->inBuff.filled - zcs->dictSize; - if (srcSize) DEBUGLOG(4, "flushing : %u bytes left to compress", (U32)srcSize); if ( ((srcSize > 0) || (endFrame && !zcs->frameEnded)) && (zcs->nextJobID <= zcs->doneJobID + zcs->jobIDMask) ) { CHECK_F( ZSTDMT_createCompressionJob(zcs, srcSize, endFrame) ); } /* check if there is any data available to flush */ - DEBUGLOG(5, "zcs->doneJobID : %u ; zcs->nextJobID : %u ", zcs->doneJobID, zcs->nextJobID); - return ZSTDMT_flushNextJob(zcs, output, 1); + return ZSTDMT_flushNextJob(zcs, output, 1 /* blockToFlush */); } size_t ZSTDMT_flushStream(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* output) { - if (zcs->nbThreads==1) return ZSTD_flushStream(zcs->cstream, output); - return ZSTDMT_flushStream_internal(zcs, output, 0); + DEBUGLOG(5, "ZSTDMT_flushStream"); + if (zcs->nbThreads==1) + return ZSTD_flushStream(zcs->cctxPool->cctx[0], output); + return ZSTDMT_flushStream_internal(zcs, output, 0 /* endFrame */); } size_t ZSTDMT_endStream(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* output) { - if (zcs->nbThreads==1) return ZSTD_endStream(zcs->cstream, output); - return ZSTDMT_flushStream_internal(zcs, output, 1); + DEBUGLOG(4, "ZSTDMT_endStream"); + if (zcs->nbThreads==1) + return ZSTD_endStream(zcs->cctxPool->cctx[0], output); + return ZSTDMT_flushStream_internal(zcs, output, 1 /* endFrame */); } diff --git a/contrib/zstd/lib/compress/zstdmt_compress.h b/contrib/zstd/lib/compress/zstdmt_compress.h index 27f78ee0314a..fad63b6d8610 100644 --- a/contrib/zstd/lib/compress/zstdmt_compress.h +++ b/contrib/zstd/lib/compress/zstdmt_compress.h @@ -15,25 +15,34 @@ #endif -/* Note : All prototypes defined in this file shall be considered experimental. - * There is no guarantee of API continuity (yet) on any of these prototypes */ +/* Note : All prototypes defined in this file are labelled experimental. + * No guarantee of API continuity is provided on any of them. + * In fact, the expectation is that these prototypes will be replaced + * by ZSTD_compress_generic() API in the near future */ /* === Dependencies === */ -#include /* size_t */ +#include /* size_t */ #define ZSTD_STATIC_LINKING_ONLY /* ZSTD_parameters */ -#include "zstd.h" /* ZSTD_inBuffer, ZSTD_outBuffer, ZSTDLIB_API */ +#include "zstd.h" /* ZSTD_inBuffer, ZSTD_outBuffer, ZSTDLIB_API */ -/* === Simple one-pass functions === */ - +/* === Memory management === */ typedef struct ZSTDMT_CCtx_s ZSTDMT_CCtx; ZSTDLIB_API ZSTDMT_CCtx* ZSTDMT_createCCtx(unsigned nbThreads); -ZSTDLIB_API size_t ZSTDMT_freeCCtx(ZSTDMT_CCtx* cctx); +ZSTDLIB_API ZSTDMT_CCtx* ZSTDMT_createCCtx_advanced(unsigned nbThreads, + ZSTD_customMem cMem); +ZSTDLIB_API size_t ZSTDMT_freeCCtx(ZSTDMT_CCtx* mtctx); + +ZSTDLIB_API size_t ZSTDMT_sizeof_CCtx(ZSTDMT_CCtx* mtctx); + + +/* === Simple buffer-to-butter one-pass function === */ + +ZSTDLIB_API size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* mtctx, + void* dst, size_t dstCapacity, + const void* src, size_t srcSize, + int compressionLevel); -ZSTDLIB_API size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* cctx, - void* dst, size_t dstCapacity, - const void* src, size_t srcSize, - int compressionLevel); /* === Streaming functions === */ @@ -53,8 +62,22 @@ ZSTDLIB_API size_t ZSTDMT_endStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output); # define ZSTDMT_SECTION_SIZE_MIN (1U << 20) /* 1 MB - Minimum size of each compression job */ #endif -ZSTDLIB_API size_t ZSTDMT_initCStream_advanced(ZSTDMT_CCtx* mtctx, const void* dict, size_t dictSize, /**< dict can be released after init, a local copy is preserved within zcs */ - ZSTD_parameters params, unsigned long long pledgedSrcSize); /**< pledgedSrcSize is optional and can be zero == unknown */ +ZSTDLIB_API size_t ZSTDMT_compress_advanced(ZSTDMT_CCtx* mtctx, + void* dst, size_t dstCapacity, + const void* src, size_t srcSize, + const ZSTD_CDict* cdict, + ZSTD_parameters const params, + unsigned overlapRLog); + +ZSTDLIB_API size_t ZSTDMT_initCStream_advanced(ZSTDMT_CCtx* mtctx, + const void* dict, size_t dictSize, /* dict can be released after init, a local copy is preserved within zcs */ + ZSTD_parameters params, + unsigned long long pledgedSrcSize); /* pledgedSrcSize is optional and can be zero == unknown */ + +ZSTDLIB_API size_t ZSTDMT_initCStream_usingCDict(ZSTDMT_CCtx* mtctx, + const ZSTD_CDict* cdict, + ZSTD_frameParameters fparams, + unsigned long long pledgedSrcSize); /* note : zero means empty */ /* ZSDTMT_parameter : * List of parameters that can be set using ZSTDMT_setMTCtxParameter() */ @@ -71,6 +94,19 @@ typedef enum { ZSTDLIB_API size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSDTMT_parameter parameter, unsigned value); +/*! ZSTDMT_compressStream_generic() : + * Combines ZSTDMT_compressStream() with ZSTDMT_flushStream() or ZSTDMT_endStream() + * depending on flush directive. + * @return : minimum amount of data still to be flushed + * 0 if fully flushed + * or an error code */ +ZSTDLIB_API size_t ZSTDMT_compressStream_generic(ZSTDMT_CCtx* mtctx, + ZSTD_outBuffer* output, + ZSTD_inBuffer* input, + ZSTD_EndDirective endOp); + + + #if defined (__cplusplus) } #endif diff --git a/contrib/zstd/lib/decompress/huf_decompress.c b/contrib/zstd/lib/decompress/huf_decompress.c index ea35c3620176..2a1b70ea5ef2 100644 --- a/contrib/zstd/lib/decompress/huf_decompress.c +++ b/contrib/zstd/lib/decompress/huf_decompress.c @@ -67,6 +67,12 @@ #define HUF_STATIC_ASSERT(c) { enum { HUF_static_assert = 1/(int)(!!(c)) }; } /* use only *after* variable declarations */ +/* ************************************************************** +* Byte alignment for workSpace management +****************************************************************/ +#define HUF_ALIGN(x, a) HUF_ALIGN_MASK((x), (a) - 1) +#define HUF_ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask)) + /*-***************************/ /* generic DTableDesc */ /*-***************************/ @@ -87,16 +93,28 @@ static DTableDesc HUF_getDTableDesc(const HUF_DTable* table) typedef struct { BYTE byte; BYTE nbBits; } HUF_DEltX2; /* single-symbol decoding */ -size_t HUF_readDTableX2 (HUF_DTable* DTable, const void* src, size_t srcSize) +size_t HUF_readDTableX2_wksp(HUF_DTable* DTable, const void* src, size_t srcSize, void* workSpace, size_t wkspSize) { - BYTE huffWeight[HUF_SYMBOLVALUE_MAX + 1]; - U32 rankVal[HUF_TABLELOG_ABSOLUTEMAX + 1]; /* large enough for values from 0 to 16 */ U32 tableLog = 0; U32 nbSymbols = 0; size_t iSize; void* const dtPtr = DTable + 1; HUF_DEltX2* const dt = (HUF_DEltX2*)dtPtr; + U32* rankVal; + BYTE* huffWeight; + size_t spaceUsed32 = 0; + + rankVal = (U32 *)workSpace + spaceUsed32; + spaceUsed32 += HUF_TABLELOG_ABSOLUTEMAX + 1; + huffWeight = (BYTE *)((U32 *)workSpace + spaceUsed32); + spaceUsed32 += HUF_ALIGN(HUF_SYMBOLVALUE_MAX + 1, sizeof(U32)) >> 2; + + if ((spaceUsed32 << 2) > wkspSize) + return ERROR(tableLog_tooLarge); + workSpace = (U32 *)workSpace + spaceUsed32; + wkspSize -= (spaceUsed32 << 2); + HUF_STATIC_ASSERT(sizeof(DTableDesc) == sizeof(HUF_DTable)); /* memset(huffWeight, 0, sizeof(huffWeight)); */ /* is not necessary, even though some analyzer complain ... */ @@ -135,6 +153,13 @@ size_t HUF_readDTableX2 (HUF_DTable* DTable, const void* src, size_t srcSize) return iSize; } +size_t HUF_readDTableX2(HUF_DTable* DTable, const void* src, size_t srcSize) +{ + U32 workSpace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32]; + return HUF_readDTableX2_wksp(DTable, src, srcSize, + workSpace, sizeof(workSpace)); +} + static BYTE HUF_decodeSymbolX2(BIT_DStream_t* Dstream, const HUF_DEltX2* dt, const U32 dtLog) { @@ -212,11 +237,13 @@ size_t HUF_decompress1X2_usingDTable( return HUF_decompress1X2_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable); } -size_t HUF_decompress1X2_DCtx (HUF_DTable* DCtx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) +size_t HUF_decompress1X2_DCtx_wksp(HUF_DTable* DCtx, void* dst, size_t dstSize, + const void* cSrc, size_t cSrcSize, + void* workSpace, size_t wkspSize) { const BYTE* ip = (const BYTE*) cSrc; - size_t const hSize = HUF_readDTableX2 (DCtx, cSrc, cSrcSize); + size_t const hSize = HUF_readDTableX2_wksp(DCtx, cSrc, cSrcSize, workSpace, wkspSize); if (HUF_isError(hSize)) return hSize; if (hSize >= cSrcSize) return ERROR(srcSize_wrong); ip += hSize; cSrcSize -= hSize; @@ -224,6 +251,15 @@ size_t HUF_decompress1X2_DCtx (HUF_DTable* DCtx, void* dst, size_t dstSize, cons return HUF_decompress1X2_usingDTable_internal (dst, dstSize, ip, cSrcSize, DCtx); } + +size_t HUF_decompress1X2_DCtx(HUF_DTable* DCtx, void* dst, size_t dstSize, + const void* cSrc, size_t cSrcSize) +{ + U32 workSpace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32]; + return HUF_decompress1X2_DCtx_wksp(DCtx, dst, dstSize, cSrc, cSrcSize, + workSpace, sizeof(workSpace)); +} + size_t HUF_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) { HUF_CREATE_STATIC_DTABLEX2(DTable, HUF_TABLELOG_MAX); @@ -335,11 +371,14 @@ size_t HUF_decompress4X2_usingDTable( } -size_t HUF_decompress4X2_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) +size_t HUF_decompress4X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, + const void* cSrc, size_t cSrcSize, + void* workSpace, size_t wkspSize) { const BYTE* ip = (const BYTE*) cSrc; - size_t const hSize = HUF_readDTableX2 (dctx, cSrc, cSrcSize); + size_t const hSize = HUF_readDTableX2_wksp (dctx, cSrc, cSrcSize, + workSpace, wkspSize); if (HUF_isError(hSize)) return hSize; if (hSize >= cSrcSize) return ERROR(srcSize_wrong); ip += hSize; cSrcSize -= hSize; @@ -347,6 +386,13 @@ size_t HUF_decompress4X2_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, cons return HUF_decompress4X2_usingDTable_internal (dst, dstSize, ip, cSrcSize, dctx); } + +size_t HUF_decompress4X2_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) +{ + U32 workSpace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32]; + return HUF_decompress4X2_DCtx_wksp(dctx, dst, dstSize, cSrc, cSrcSize, + workSpace, sizeof(workSpace)); +} size_t HUF_decompress4X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) { HUF_CREATE_STATIC_DTABLEX2(DTable, HUF_TABLELOG_MAX); @@ -403,7 +449,8 @@ static void HUF_fillDTableX4Level2(HUF_DEltX4* DTable, U32 sizeLog, const U32 co } } } -typedef U32 rankVal_t[HUF_TABLELOG_MAX][HUF_TABLELOG_MAX + 1]; +typedef U32 rankValCol_t[HUF_TABLELOG_MAX + 1]; +typedef rankValCol_t rankVal_t[HUF_TABLELOG_MAX]; static void HUF_fillDTableX4(HUF_DEltX4* DTable, const U32 targetLog, const sortedSymbol_t* sortedList, const U32 sortedListSize, @@ -447,20 +494,43 @@ static void HUF_fillDTableX4(HUF_DEltX4* DTable, const U32 targetLog, } } -size_t HUF_readDTableX4 (HUF_DTable* DTable, const void* src, size_t srcSize) +size_t HUF_readDTableX4_wksp(HUF_DTable* DTable, const void* src, + size_t srcSize, void* workSpace, + size_t wkspSize) { - BYTE weightList[HUF_SYMBOLVALUE_MAX + 1]; - sortedSymbol_t sortedSymbol[HUF_SYMBOLVALUE_MAX + 1]; - U32 rankStats[HUF_TABLELOG_MAX + 1] = { 0 }; - U32 rankStart0[HUF_TABLELOG_MAX + 2] = { 0 }; - U32* const rankStart = rankStart0+1; - rankVal_t rankVal; U32 tableLog, maxW, sizeOfSort, nbSymbols; DTableDesc dtd = HUF_getDTableDesc(DTable); U32 const maxTableLog = dtd.maxTableLog; size_t iSize; void* dtPtr = DTable+1; /* force compiler to avoid strict-aliasing */ HUF_DEltX4* const dt = (HUF_DEltX4*)dtPtr; + U32 *rankStart; + + rankValCol_t* rankVal; + U32* rankStats; + U32* rankStart0; + sortedSymbol_t* sortedSymbol; + BYTE* weightList; + size_t spaceUsed32 = 0; + + rankVal = (rankValCol_t *)((U32 *)workSpace + spaceUsed32); + spaceUsed32 += (sizeof(rankValCol_t) * HUF_TABLELOG_MAX) >> 2; + rankStats = (U32 *)workSpace + spaceUsed32; + spaceUsed32 += HUF_TABLELOG_MAX + 1; + rankStart0 = (U32 *)workSpace + spaceUsed32; + spaceUsed32 += HUF_TABLELOG_MAX + 2; + sortedSymbol = (sortedSymbol_t *)workSpace + (spaceUsed32 * sizeof(U32)) / sizeof(sortedSymbol_t); + spaceUsed32 += HUF_ALIGN(sizeof(sortedSymbol_t) * (HUF_SYMBOLVALUE_MAX + 1), sizeof(U32)) >> 2; + weightList = (BYTE *)((U32 *)workSpace + spaceUsed32); + spaceUsed32 += HUF_ALIGN(HUF_SYMBOLVALUE_MAX + 1, sizeof(U32)) >> 2; + + if ((spaceUsed32 << 2) > wkspSize) + return ERROR(tableLog_tooLarge); + workSpace = (U32 *)workSpace + spaceUsed32; + wkspSize -= (spaceUsed32 << 2); + + rankStart = rankStart0 + 1; + memset(rankStats, 0, sizeof(U32) * (2 * HUF_TABLELOG_MAX + 2 + 1)); HUF_STATIC_ASSERT(sizeof(HUF_DEltX4) == sizeof(HUF_DTable)); /* if compiler fails here, assertion is wrong */ if (maxTableLog > HUF_TABLELOG_MAX) return ERROR(tableLog_tooLarge); @@ -527,6 +597,12 @@ size_t HUF_readDTableX4 (HUF_DTable* DTable, const void* src, size_t srcSize) return iSize; } +size_t HUF_readDTableX4(HUF_DTable* DTable, const void* src, size_t srcSize) +{ + U32 workSpace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32]; + return HUF_readDTableX4_wksp(DTable, src, srcSize, + workSpace, sizeof(workSpace)); +} static U32 HUF_decodeSymbolX4(void* op, BIT_DStream_t* DStream, const HUF_DEltX4* dt, const U32 dtLog) { @@ -545,7 +621,8 @@ static U32 HUF_decodeLastSymbolX4(void* op, BIT_DStream_t* DStream, const HUF_DE if (DStream->bitsConsumed < (sizeof(DStream->bitContainer)*8)) { BIT_skipBits(DStream, dt[val].nbBits); if (DStream->bitsConsumed > (sizeof(DStream->bitContainer)*8)) - DStream->bitsConsumed = (sizeof(DStream->bitContainer)*8); /* ugly hack; works only because it's the last symbol. Note : can't easily extract nbBits from just this symbol */ + /* ugly hack; works only because it's the last symbol. Note : can't easily extract nbBits from just this symbol */ + DStream->bitsConsumed = (sizeof(DStream->bitContainer)*8); } } return 1; } @@ -626,11 +703,14 @@ size_t HUF_decompress1X4_usingDTable( return HUF_decompress1X4_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable); } -size_t HUF_decompress1X4_DCtx (HUF_DTable* DCtx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) +size_t HUF_decompress1X4_DCtx_wksp(HUF_DTable* DCtx, void* dst, size_t dstSize, + const void* cSrc, size_t cSrcSize, + void* workSpace, size_t wkspSize) { const BYTE* ip = (const BYTE*) cSrc; - size_t const hSize = HUF_readDTableX4 (DCtx, cSrc, cSrcSize); + size_t const hSize = HUF_readDTableX4_wksp(DCtx, cSrc, cSrcSize, + workSpace, wkspSize); if (HUF_isError(hSize)) return hSize; if (hSize >= cSrcSize) return ERROR(srcSize_wrong); ip += hSize; cSrcSize -= hSize; @@ -638,6 +718,15 @@ size_t HUF_decompress1X4_DCtx (HUF_DTable* DCtx, void* dst, size_t dstSize, cons return HUF_decompress1X4_usingDTable_internal (dst, dstSize, ip, cSrcSize, DCtx); } + +size_t HUF_decompress1X4_DCtx(HUF_DTable* DCtx, void* dst, size_t dstSize, + const void* cSrc, size_t cSrcSize) +{ + U32 workSpace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32]; + return HUF_decompress1X4_DCtx_wksp(DCtx, dst, dstSize, cSrc, cSrcSize, + workSpace, sizeof(workSpace)); +} + size_t HUF_decompress1X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) { HUF_CREATE_STATIC_DTABLEX4(DTable, HUF_TABLELOG_MAX); @@ -748,11 +837,14 @@ size_t HUF_decompress4X4_usingDTable( } -size_t HUF_decompress4X4_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) +size_t HUF_decompress4X4_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, + const void* cSrc, size_t cSrcSize, + void* workSpace, size_t wkspSize) { const BYTE* ip = (const BYTE*) cSrc; - size_t hSize = HUF_readDTableX4 (dctx, cSrc, cSrcSize); + size_t hSize = HUF_readDTableX4_wksp(dctx, cSrc, cSrcSize, + workSpace, wkspSize); if (HUF_isError(hSize)) return hSize; if (hSize >= cSrcSize) return ERROR(srcSize_wrong); ip += hSize; cSrcSize -= hSize; @@ -760,6 +852,15 @@ size_t HUF_decompress4X4_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, cons return HUF_decompress4X4_usingDTable_internal(dst, dstSize, ip, cSrcSize, dctx); } + +size_t HUF_decompress4X4_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, + const void* cSrc, size_t cSrcSize) +{ + U32 workSpace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32]; + return HUF_decompress4X4_DCtx_wksp(dctx, dst, dstSize, cSrc, cSrcSize, + workSpace, sizeof(workSpace)); +} + size_t HUF_decompress4X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) { HUF_CREATE_STATIC_DTABLEX4(DTable, HUF_TABLELOG_MAX); @@ -861,19 +962,32 @@ size_t HUF_decompress4X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const } } -size_t HUF_decompress4X_hufOnly (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) +size_t HUF_decompress4X_hufOnly(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) +{ + U32 workSpace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32]; + return HUF_decompress4X_hufOnly_wksp(dctx, dst, dstSize, cSrc, cSrcSize, + workSpace, sizeof(workSpace)); +} + + +size_t HUF_decompress4X_hufOnly_wksp(HUF_DTable* dctx, void* dst, + size_t dstSize, const void* cSrc, + size_t cSrcSize, void* workSpace, + size_t wkspSize) { /* validation checks */ if (dstSize == 0) return ERROR(dstSize_tooSmall); if ((cSrcSize >= dstSize) || (cSrcSize <= 1)) return ERROR(corruption_detected); /* invalid */ { U32 const algoNb = HUF_selectDecoder(dstSize, cSrcSize); - return algoNb ? HUF_decompress4X4_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) : - HUF_decompress4X2_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) ; + return algoNb ? HUF_decompress4X4_DCtx_wksp(dctx, dst, dstSize, cSrc, cSrcSize, workSpace, wkspSize): + HUF_decompress4X2_DCtx_wksp(dctx, dst, dstSize, cSrc, cSrcSize, workSpace, wkspSize); } } -size_t HUF_decompress1X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize) +size_t HUF_decompress1X_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, + const void* cSrc, size_t cSrcSize, + void* workSpace, size_t wkspSize) { /* validation checks */ if (dstSize == 0) return ERROR(dstSize_tooSmall); @@ -882,7 +996,17 @@ size_t HUF_decompress1X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const if (cSrcSize == 1) { memset(dst, *(const BYTE*)cSrc, dstSize); return dstSize; } /* RLE */ { U32 const algoNb = HUF_selectDecoder(dstSize, cSrcSize); - return algoNb ? HUF_decompress1X4_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) : - HUF_decompress1X2_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) ; + return algoNb ? HUF_decompress1X4_DCtx_wksp(dctx, dst, dstSize, cSrc, + cSrcSize, workSpace, wkspSize): + HUF_decompress1X2_DCtx_wksp(dctx, dst, dstSize, cSrc, + cSrcSize, workSpace, wkspSize); } } + +size_t HUF_decompress1X_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, + const void* cSrc, size_t cSrcSize) +{ + U32 workSpace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32]; + return HUF_decompress1X_DCtx_wksp(dctx, dst, dstSize, cSrc, cSrcSize, + workSpace, sizeof(workSpace)); +} diff --git a/contrib/zstd/lib/decompress/zstd_decompress.c b/contrib/zstd/lib/decompress/zstd_decompress.c index 910f9ab783c3..003d703a5eb3 100644 --- a/contrib/zstd/lib/decompress/zstd_decompress.c +++ b/contrib/zstd/lib/decompress/zstd_decompress.c @@ -53,8 +53,7 @@ # include "zstd_legacy.h" #endif - -#if defined(_MSC_VER) +#if defined(_MSC_VER) && !defined(_M_IA64) /* _mm_prefetch() is not defined for ia64 */ # include /* https://msdn.microsoft.com/fr-fr/library/84szxsww(v=vs.90).aspx */ # define ZSTD_PREFETCH(ptr) _mm_prefetch((const char*)ptr, _MM_HINT_T0) #elif defined(__GNUC__) @@ -63,8 +62,9 @@ # define ZSTD_PREFETCH(ptr) /* disabled */ #endif + /*-************************************* -* Macros +* Errors ***************************************/ #define ZSTD_isError ERR_isError /* for inlining */ #define FSE_isError ERR_isError @@ -85,11 +85,15 @@ typedef enum { ZSTDds_getFrameHeaderSize, ZSTDds_decodeFrameHeader, ZSTDds_decompressLastBlock, ZSTDds_checkChecksum, ZSTDds_decodeSkippableHeader, ZSTDds_skipFrame } ZSTD_dStage; +typedef enum { zdss_init=0, zdss_loadHeader, + zdss_read, zdss_load, zdss_flush } ZSTD_dStreamStage; + typedef struct { FSE_DTable LLTable[FSE_DTABLE_SIZE_U32(LLFSELog)]; FSE_DTable OFTable[FSE_DTABLE_SIZE_U32(OffFSELog)]; FSE_DTable MLTable[FSE_DTABLE_SIZE_U32(MLFSELog)]; HUF_DTable hufTable[HUF_DTABLE_SIZE(HufLog)]; /* can accommodate HUF_decompress4X */ + U32 workspace[HUF_DECOMPRESS_WORKSPACE_SIZE_U32]; U32 rep[ZSTD_REP_NUM]; } ZSTD_entropyTables_t; @@ -105,7 +109,7 @@ struct ZSTD_DCtx_s const void* vBase; /* virtual start of previous segment if it was just before current one */ const void* dictEnd; /* end of previous segment */ size_t expected; - ZSTD_frameParams fParams; + ZSTD_frameHeader fParams; blockType_e bType; /* used in ZSTD_decompressContinue(), to transfer blockType between header decoding and block decoding stages */ ZSTD_dStage stage; U32 litEntropy; @@ -117,11 +121,39 @@ struct ZSTD_DCtx_s ZSTD_customMem customMem; size_t litSize; size_t rleSize; - BYTE litBuffer[ZSTD_BLOCKSIZE_ABSOLUTEMAX + WILDCOPY_OVERLENGTH]; + size_t staticSize; + + /* streaming */ + ZSTD_DDict* ddictLocal; + const ZSTD_DDict* ddict; + ZSTD_dStreamStage streamStage; + char* inBuff; + size_t inBuffSize; + size_t inPos; + size_t maxWindowSize; + char* outBuff; + size_t outBuffSize; + size_t outStart; + size_t outEnd; + size_t blockSize; + size_t lhSize; + void* legacyContext; + U32 previousLegacyVersion; + U32 legacyVersion; + U32 hostageByte; + + /* workspace */ + BYTE litBuffer[ZSTD_BLOCKSIZE_MAX + WILDCOPY_OVERLENGTH]; BYTE headerBuffer[ZSTD_FRAMEHEADERSIZE_MAX]; }; /* typedef'd to ZSTD_DCtx within "zstd.h" */ -size_t ZSTD_sizeof_DCtx (const ZSTD_DCtx* dctx) { return (dctx==NULL) ? 0 : sizeof(ZSTD_DCtx); } +size_t ZSTD_sizeof_DCtx (const ZSTD_DCtx* dctx) +{ + if (dctx==NULL) return 0; /* support sizeof NULL */ + return sizeof(*dctx) + + ZSTD_sizeof_DDict(dctx->ddictLocal) + + dctx->inBuffSize + dctx->outBuffSize; +} size_t ZSTD_estimateDCtxSize(void) { return sizeof(ZSTD_DCtx); } @@ -145,40 +177,76 @@ size_t ZSTD_decompressBegin(ZSTD_DCtx* dctx) return 0; } +static void ZSTD_initDCtx_internal(ZSTD_DCtx* dctx) +{ + ZSTD_decompressBegin(dctx); /* cannot fail */ + dctx->staticSize = 0; + dctx->maxWindowSize = ZSTD_MAXWINDOWSIZE_DEFAULT; + dctx->ddict = NULL; + dctx->ddictLocal = NULL; + dctx->inBuff = NULL; + dctx->inBuffSize = 0; + dctx->outBuffSize= 0; + dctx->streamStage = zdss_init; +} + ZSTD_DCtx* ZSTD_createDCtx_advanced(ZSTD_customMem customMem) { - ZSTD_DCtx* dctx; + if (!customMem.customAlloc ^ !customMem.customFree) return NULL; - if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem; - if (!customMem.customAlloc || !customMem.customFree) return NULL; + { ZSTD_DCtx* const dctx = (ZSTD_DCtx*)ZSTD_malloc(sizeof(*dctx), customMem); + if (!dctx) return NULL; + dctx->customMem = customMem; + dctx->legacyContext = NULL; + dctx->previousLegacyVersion = 0; + ZSTD_initDCtx_internal(dctx); + return dctx; + } +} - dctx = (ZSTD_DCtx*)ZSTD_malloc(sizeof(ZSTD_DCtx), customMem); - if (!dctx) return NULL; - memcpy(&dctx->customMem, &customMem, sizeof(customMem)); - ZSTD_decompressBegin(dctx); +ZSTD_DCtx* ZSTD_initStaticDCtx(void *workspace, size_t workspaceSize) +{ + ZSTD_DCtx* dctx = (ZSTD_DCtx*) workspace; + + if ((size_t)workspace & 7) return NULL; /* 8-aligned */ + if (workspaceSize < sizeof(ZSTD_DCtx)) return NULL; /* minimum size */ + + ZSTD_initDCtx_internal(dctx); + dctx->staticSize = workspaceSize; + dctx->inBuff = (char*)(dctx+1); return dctx; } ZSTD_DCtx* ZSTD_createDCtx(void) { - return ZSTD_createDCtx_advanced(defaultCustomMem); + return ZSTD_createDCtx_advanced(ZSTD_defaultCMem); } size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx) { if (dctx==NULL) return 0; /* support free on NULL */ - ZSTD_free(dctx, dctx->customMem); - return 0; /* reserved as a potential error code in the future */ + if (dctx->staticSize) return ERROR(memory_allocation); /* not compatible with static DCtx */ + { ZSTD_customMem const cMem = dctx->customMem; + ZSTD_freeDDict(dctx->ddictLocal); + dctx->ddictLocal = NULL; + ZSTD_free(dctx->inBuff, cMem); + dctx->inBuff = NULL; +#if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT >= 1) + if (dctx->legacyContext) + ZSTD_freeLegacyStreamContext(dctx->legacyContext, dctx->previousLegacyVersion); +#endif + ZSTD_free(dctx, cMem); + return 0; + } } +/* no longer useful */ void ZSTD_copyDCtx(ZSTD_DCtx* dstDCtx, const ZSTD_DCtx* srcDCtx) { - size_t const workSpaceSize = (ZSTD_BLOCKSIZE_ABSOLUTEMAX+WILDCOPY_OVERLENGTH) + ZSTD_frameHeaderSize_max; - memcpy(dstDCtx, srcDCtx, sizeof(ZSTD_DCtx) - workSpaceSize); /* no need to copy workspace */ + size_t const toCopy = (size_t)((char*)(&dstDCtx->inBuff) - (char*)dstDCtx); + memcpy(dstDCtx, srcDCtx, toCopy); /* no need to copy workspace */ } -static void ZSTD_refDDict(ZSTD_DCtx* dstDCtx, const ZSTD_DDict* ddict); - /*-************************************************************* * Decompression section @@ -206,7 +274,7 @@ unsigned ZSTD_isFrame(const void* buffer, size_t size) /** ZSTD_frameHeaderSize() : * srcSize must be >= ZSTD_frameHeaderSize_prefix. * @return : size of the Frame Header */ -static size_t ZSTD_frameHeaderSize(const void* src, size_t srcSize) +size_t ZSTD_frameHeaderSize(const void* src, size_t srcSize) { if (srcSize < ZSTD_frameHeaderSize_prefix) return ERROR(srcSize_wrong); { BYTE const fhd = ((const BYTE*)src)[4]; @@ -219,22 +287,24 @@ static size_t ZSTD_frameHeaderSize(const void* src, size_t srcSize) } -/** ZSTD_getFrameParams() : +/** ZSTD_getFrameHeader() : * decode Frame Header, or require larger `srcSize`. -* @return : 0, `fparamsPtr` is correctly filled, +* @return : 0, `zfhPtr` is correctly filled, * >0, `srcSize` is too small, result is expected `srcSize`, * or an error code, which can be tested using ZSTD_isError() */ -size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t srcSize) +size_t ZSTD_getFrameHeader(ZSTD_frameHeader* zfhPtr, const void* src, size_t srcSize) { const BYTE* ip = (const BYTE*)src; - if (srcSize < ZSTD_frameHeaderSize_prefix) return ZSTD_frameHeaderSize_prefix; + if (MEM_readLE32(src) != ZSTD_MAGICNUMBER) { if ((MEM_readLE32(src) & 0xFFFFFFF0U) == ZSTD_MAGIC_SKIPPABLE_START) { - if (srcSize < ZSTD_skippableHeaderSize) return ZSTD_skippableHeaderSize; /* magic number + skippable frame length */ - memset(fparamsPtr, 0, sizeof(*fparamsPtr)); - fparamsPtr->frameContentSize = MEM_readLE32((const char *)src + 4); - fparamsPtr->windowSize = 0; /* windowSize==0 means a frame is skippable */ + /* skippable frame */ + if (srcSize < ZSTD_skippableHeaderSize) + return ZSTD_skippableHeaderSize; /* magic number + frame length */ + memset(zfhPtr, 0, sizeof(*zfhPtr)); + zfhPtr->frameContentSize = MEM_readLE32((const char *)src + 4); + zfhPtr->windowSize = 0; /* windowSize==0 means a frame is skippable */ return 0; } return ERROR(prefix_unknown); @@ -254,11 +324,13 @@ size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t U32 windowSize = 0; U32 dictID = 0; U64 frameContentSize = 0; - if ((fhdByte & 0x08) != 0) return ERROR(frameParameter_unsupported); /* reserved bits, which must be zero */ + if ((fhdByte & 0x08) != 0) + return ERROR(frameParameter_unsupported); /* reserved bits, must be zero */ if (!singleSegment) { BYTE const wlByte = ip[pos++]; U32 const windowLog = (wlByte >> 3) + ZSTD_WINDOWLOG_ABSOLUTEMIN; - if (windowLog > ZSTD_WINDOWLOG_MAX) return ERROR(frameParameter_windowTooLarge); /* avoids issue with 1 << windowLog */ + if (windowLog > ZSTD_WINDOWLOG_MAX) + return ERROR(frameParameter_windowTooLarge); windowSize = (1U << windowLog); windowSize += (windowSize >> 3) * (wlByte&7); } @@ -281,10 +353,10 @@ size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t } if (!windowSize) windowSize = (U32)frameContentSize; if (windowSize > windowSizeMax) return ERROR(frameParameter_windowTooLarge); - fparamsPtr->frameContentSize = frameContentSize; - fparamsPtr->windowSize = windowSize; - fparamsPtr->dictID = dictID; - fparamsPtr->checksumFlag = checksumFlag; + zfhPtr->frameContentSize = frameContentSize; + zfhPtr->windowSize = windowSize; + zfhPtr->dictID = dictID; + zfhPtr->checksumFlag = checksumFlag; } return 0; } @@ -302,9 +374,8 @@ unsigned long long ZSTD_getFrameContentSize(const void *src, size_t srcSize) return ret == 0 ? ZSTD_CONTENTSIZE_UNKNOWN : ret; } #endif - { - ZSTD_frameParams fParams; - if (ZSTD_getFrameParams(&fParams, src, srcSize) != 0) return ZSTD_CONTENTSIZE_ERROR; + { ZSTD_frameHeader fParams; + if (ZSTD_getFrameHeader(&fParams, src, srcSize) != 0) return ZSTD_CONTENTSIZE_ERROR; if (fParams.windowSize == 0) { /* Either skippable or empty frame, size == 0 either way */ return 0; @@ -323,51 +394,48 @@ unsigned long long ZSTD_getFrameContentSize(const void *src, size_t srcSize) * @return : decompressed size of the frames contained */ unsigned long long ZSTD_findDecompressedSize(const void* src, size_t srcSize) { - { - unsigned long long totalDstSize = 0; - while (srcSize >= ZSTD_frameHeaderSize_prefix) { - const U32 magicNumber = MEM_readLE32(src); + unsigned long long totalDstSize = 0; - if ((magicNumber & 0xFFFFFFF0U) == ZSTD_MAGIC_SKIPPABLE_START) { - size_t skippableSize; - if (srcSize < ZSTD_skippableHeaderSize) - return ERROR(srcSize_wrong); - skippableSize = MEM_readLE32((const BYTE *)src + 4) + - ZSTD_skippableHeaderSize; - if (srcSize < skippableSize) { - return ZSTD_CONTENTSIZE_ERROR; - } + while (srcSize >= ZSTD_frameHeaderSize_prefix) { + const U32 magicNumber = MEM_readLE32(src); - src = (const BYTE *)src + skippableSize; - srcSize -= skippableSize; - continue; + if ((magicNumber & 0xFFFFFFF0U) == ZSTD_MAGIC_SKIPPABLE_START) { + size_t skippableSize; + if (srcSize < ZSTD_skippableHeaderSize) + return ERROR(srcSize_wrong); + skippableSize = MEM_readLE32((const BYTE *)src + 4) + + ZSTD_skippableHeaderSize; + if (srcSize < skippableSize) { + return ZSTD_CONTENTSIZE_ERROR; } - { - unsigned long long const ret = ZSTD_getFrameContentSize(src, srcSize); - if (ret >= ZSTD_CONTENTSIZE_ERROR) return ret; - - /* check for overflow */ - if (totalDstSize + ret < totalDstSize) return ZSTD_CONTENTSIZE_ERROR; - totalDstSize += ret; - } - { - size_t const frameSrcSize = ZSTD_findFrameCompressedSize(src, srcSize); - if (ZSTD_isError(frameSrcSize)) { - return ZSTD_CONTENTSIZE_ERROR; - } - - src = (const BYTE *)src + frameSrcSize; - srcSize -= frameSrcSize; - } + src = (const BYTE *)src + skippableSize; + srcSize -= skippableSize; + continue; } - if (srcSize) { - return ZSTD_CONTENTSIZE_ERROR; - } + { unsigned long long const ret = ZSTD_getFrameContentSize(src, srcSize); + if (ret >= ZSTD_CONTENTSIZE_ERROR) return ret; - return totalDstSize; + /* check for overflow */ + if (totalDstSize + ret < totalDstSize) return ZSTD_CONTENTSIZE_ERROR; + totalDstSize += ret; + } + { size_t const frameSrcSize = ZSTD_findFrameCompressedSize(src, srcSize); + if (ZSTD_isError(frameSrcSize)) { + return ZSTD_CONTENTSIZE_ERROR; + } + + src = (const BYTE *)src + frameSrcSize; + srcSize -= frameSrcSize; + } } + + if (srcSize) { + return ZSTD_CONTENTSIZE_ERROR; + } + + return totalDstSize; } /** ZSTD_getDecompressedSize() : @@ -389,7 +457,7 @@ unsigned long long ZSTD_getDecompressedSize(const void* src, size_t srcSize) * @return : 0 if success, or an error code, which can be tested using ZSTD_isError() */ static size_t ZSTD_decodeFrameHeader(ZSTD_DCtx* dctx, const void* src, size_t headerSize) { - size_t const result = ZSTD_getFrameParams(&(dctx->fParams), src, headerSize); + size_t const result = ZSTD_getFrameHeader(&(dctx->fParams), src, headerSize); if (ZSTD_isError(result)) return result; /* invalid header */ if (result>0) return ERROR(srcSize_wrong); /* headerSize too small */ if (dctx->fParams.dictID && (dctx->dictID != dctx->fParams.dictID)) return ERROR(dictionary_wrong); @@ -485,7 +553,7 @@ size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx, litCSize = (lhc >> 22) + (istart[4] << 10); break; } - if (litSize > ZSTD_BLOCKSIZE_ABSOLUTEMAX) return ERROR(corruption_detected); + if (litSize > ZSTD_BLOCKSIZE_MAX) return ERROR(corruption_detected); if (litCSize + lhSize > srcSize) return ERROR(corruption_detected); if (HUF_isError((litEncType==set_repeat) ? @@ -493,8 +561,10 @@ size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx, HUF_decompress1X_usingDTable(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->HUFptr) : HUF_decompress4X_usingDTable(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->HUFptr) ) : ( singleStream ? - HUF_decompress1X2_DCtx(dctx->entropy.hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize) : - HUF_decompress4X_hufOnly (dctx->entropy.hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize)) )) + HUF_decompress1X2_DCtx_wksp(dctx->entropy.hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize, + dctx->entropy.workspace, sizeof(dctx->entropy.workspace)) : + HUF_decompress4X_hufOnly_wksp(dctx->entropy.hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize, + dctx->entropy.workspace, sizeof(dctx->entropy.workspace))))) return ERROR(corruption_detected); dctx->litPtr = dctx->litBuffer; @@ -557,7 +627,7 @@ size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx, if (srcSize<4) return ERROR(corruption_detected); /* srcSize >= MIN_CBLOCK_SIZE == 3; here we need lhSize+1 = 4 */ break; } - if (litSize > ZSTD_BLOCKSIZE_ABSOLUTEMAX) return ERROR(corruption_detected); + if (litSize > ZSTD_BLOCKSIZE_MAX) return ERROR(corruption_detected); memset(dctx->litBuffer, istart[lhSize], litSize + WILDCOPY_OVERLENGTH); dctx->litPtr = dctx->litBuffer; dctx->litSize = litSize; @@ -684,6 +754,7 @@ size_t ZSTD_decodeSeqHeaders(ZSTD_DCtx* dctx, int* nbSeqPtr, const BYTE* const istart = (const BYTE* const)src; const BYTE* const iend = istart + srcSize; const BYTE* ip = istart; + DEBUGLOG(5, "ZSTD_decodeSeqHeaders"); /* check */ if (srcSize < MIN_SEQUENCES_SIZE) return ERROR(srcSize_wrong); @@ -864,12 +935,18 @@ static seq_t ZSTD_decodeSequence(seqState_t* seqState) seq.offset = offset; } - seq.matchLength = ML_base[mlCode] + ((mlCode>31) ? BIT_readBitsFast(&seqState->DStream, mlBits) : 0); /* <= 16 bits */ + seq.matchLength = ML_base[mlCode] + + ((mlCode>31) ? BIT_readBitsFast(&seqState->DStream, mlBits) : 0); /* <= 16 bits */ if (MEM_32bits() && (mlBits+llBits>24)) BIT_reloadDStream(&seqState->DStream); - seq.litLength = LL_base[llCode] + ((llCode>15) ? BIT_readBitsFast(&seqState->DStream, llBits) : 0); /* <= 16 bits */ - if (MEM_32bits() || - (totalBits > 64 - 7 - (LLFSELog+MLFSELog+OffFSELog)) ) BIT_reloadDStream(&seqState->DStream); + seq.litLength = LL_base[llCode] + + ((llCode>15) ? BIT_readBitsFast(&seqState->DStream, llBits) : 0); /* <= 16 bits */ + if ( MEM_32bits() + || (totalBits > 64 - 7 - (LLFSELog+MLFSELog+OffFSELog)) ) + BIT_reloadDStream(&seqState->DStream); + + DEBUGLOG(6, "seq: litL=%u, matchL=%u, offset=%u", + (U32)seq.litLength, (U32)seq.matchLength, (U32)seq.offset); /* ANS state update */ FSE_updateState(&seqState->stateLL, &seqState->DStream); /* <= 9 bits */ @@ -909,7 +986,8 @@ size_t ZSTD_execSequence(BYTE* op, /* copy Match */ if (sequence.offset > (size_t)(oLitEnd - base)) { /* offset beyond prefix -> go into extDict */ - if (sequence.offset > (size_t)(oLitEnd - vBase)) return ERROR(corruption_detected); + if (sequence.offset > (size_t)(oLitEnd - vBase)) + return ERROR(corruption_detected); match = dictEnd + (match - base); if (match + sequence.matchLength <= dictEnd) { memmove(oLitEnd, match, sequence.matchLength); @@ -977,9 +1055,12 @@ static size_t ZSTD_decompressSequences( const BYTE* const vBase = (const BYTE*) (dctx->vBase); const BYTE* const dictEnd = (const BYTE*) (dctx->dictEnd); int nbSeq; + DEBUGLOG(5, "ZSTD_decompressSequences"); /* Build Decoding Tables */ { size_t const seqHSize = ZSTD_decodeSeqHeaders(dctx, &nbSeq, ip, seqSize); + DEBUGLOG(5, "ZSTD_decodeSeqHeaders: size=%u, nbSeq=%i", + (U32)seqHSize, nbSeq); if (ZSTD_isError(seqHSize)) return seqHSize; ip += seqHSize; } @@ -998,11 +1079,13 @@ static size_t ZSTD_decompressSequences( nbSeq--; { seq_t const sequence = ZSTD_decodeSequence(&seqState); size_t const oneSeqSize = ZSTD_execSequence(op, oend, sequence, &litPtr, litEnd, base, vBase, dictEnd); + DEBUGLOG(6, "regenerated sequence size : %u", (U32)oneSeqSize); if (ZSTD_isError(oneSeqSize)) return oneSeqSize; op += oneSeqSize; } } /* check if reached exact end */ + DEBUGLOG(5, "after decode loop, remaining nbSeq : %i", nbSeq); if (nbSeq) return ERROR(corruption_detected); /* save reps for next block */ { U32 i; for (i=0; i entropy.rep[i] = (U32)(seqState.prevOffset[i]); } @@ -1216,7 +1299,7 @@ static size_t ZSTD_decompressSequencesLong( const BYTE* const base = (const BYTE*) (dctx->base); const BYTE* const vBase = (const BYTE*) (dctx->vBase); const BYTE* const dictEnd = (const BYTE*) (dctx->dictEnd); - unsigned const windowSize = dctx->fParams.windowSize; + unsigned const windowSize32 = (unsigned)dctx->fParams.windowSize; int nbSeq; /* Build Decoding Tables */ @@ -1246,13 +1329,13 @@ static size_t ZSTD_decompressSequencesLong( /* prepare in advance */ for (seqNb=0; (BIT_reloadDStream(&seqState.DStream) <= BIT_DStream_completed) && seqNb = ZSTD_BLOCKSIZE_ABSOLUTEMAX) return ERROR(srcSize_wrong); + if (srcSize >= ZSTD_BLOCKSIZE_MAX) return ERROR(srcSize_wrong); /* Decode literals section */ { size_t const litCSize = ZSTD_decodeLiteralsBlock(dctx, src, srcSize); + DEBUGLOG(5, "ZSTD_decodeLiteralsBlock : %u", (U32)litCSize); if (ZSTD_isError(litCSize)) return litCSize; ip += litCSize; srcSize -= litCSize; @@ -1364,13 +1449,13 @@ size_t ZSTD_findFrameCompressedSize(const void *src, size_t srcSize) const BYTE* ip = (const BYTE*)src; const BYTE* const ipstart = ip; size_t remainingSize = srcSize; - ZSTD_frameParams fParams; + ZSTD_frameHeader fParams; size_t const headerSize = ZSTD_frameHeaderSize(ip, remainingSize); if (ZSTD_isError(headerSize)) return headerSize; /* Frame Header */ - { size_t const ret = ZSTD_getFrameParams(&fParams, ip, remainingSize); + { size_t const ret = ZSTD_getFrameHeader(&fParams, ip, remainingSize); if (ZSTD_isError(ret)) return ret; if (ret > 0) return ERROR(srcSize_wrong); } @@ -1505,6 +1590,8 @@ static size_t ZSTD_decompressMultiFrame(ZSTD_DCtx* dctx, size_t decodedSize; size_t const frameSize = ZSTD_findFrameCompressedSizeLegacy(src, srcSize); if (ZSTD_isError(frameSize)) return frameSize; + /* legacy support is incompatible with static dctx */ + if (dctx->staticSize) return ERROR(memory_allocation); decodedSize = ZSTD_decompressLegacy(dst, dstCapacity, src, frameSize, dict, dictSize); @@ -1540,7 +1627,7 @@ static size_t ZSTD_decompressMultiFrame(ZSTD_DCtx* dctx, if (ddict) { /* we were called from ZSTD_decompress_usingDDict */ - ZSTD_refDDict(dctx, ddict); + CHECK_F(ZSTD_decompressBegin_usingDDict(dctx, ddict)); } else { /* this will initialize correctly with no dict if dict == NULL, so * use this in all cases but ddict */ @@ -1580,7 +1667,7 @@ size_t ZSTD_decompressDCtx(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const size_t ZSTD_decompress(void* dst, size_t dstCapacity, const void* src, size_t srcSize) { -#if defined(ZSTD_HEAPMODE) && (ZSTD_HEAPMODE==1) +#if defined(ZSTD_HEAPMODE) && (ZSTD_HEAPMODE>=1) size_t regenSize; ZSTD_DCtx* const dctx = ZSTD_createDCtx(); if (dctx==NULL) return ERROR(memory_allocation); @@ -1604,6 +1691,7 @@ ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx) { switch(dctx->stage) { default: /* should not happen */ + assert(0); case ZSTDds_getFrameHeaderSize: case ZSTDds_decodeFrameHeader: return ZSTDnit_frameHeader; @@ -1621,21 +1709,24 @@ ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx) { } } -int ZSTD_isSkipFrame(ZSTD_DCtx* dctx) { return dctx->stage == ZSTDds_skipFrame; } /* for zbuff */ +static int ZSTD_isSkipFrame(ZSTD_DCtx* dctx) { return dctx->stage == ZSTDds_skipFrame; } /** ZSTD_decompressContinue() : -* @return : nb of bytes generated into `dst` (necessarily <= `dstCapacity) -* or an error code, which can be tested using ZSTD_isError() */ + * srcSize : must be the exact nb of bytes expected (see ZSTD_nextSrcSizeToDecompress()) + * @return : nb of bytes generated into `dst` (necessarily <= `dstCapacity) + * or an error code, which can be tested using ZSTD_isError() */ size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize) { + DEBUGLOG(5, "ZSTD_decompressContinue"); /* Sanity check */ - if (srcSize != dctx->expected) return ERROR(srcSize_wrong); + if (srcSize != dctx->expected) return ERROR(srcSize_wrong); /* unauthorized */ if (dstCapacity) ZSTD_checkContinuity(dctx, dst); switch (dctx->stage) { case ZSTDds_getFrameHeaderSize : - if (srcSize != ZSTD_frameHeaderSize_prefix) return ERROR(srcSize_wrong); /* impossible */ + if (srcSize != ZSTD_frameHeaderSize_prefix) return ERROR(srcSize_wrong); /* unauthorized */ + assert(src != NULL); if ((MEM_readLE32(src) & 0xFFFFFFF0U) == ZSTD_MAGIC_SKIPPABLE_START) { /* skippable frame */ memcpy(dctx->headerBuffer, src, ZSTD_frameHeaderSize_prefix); dctx->expected = ZSTD_skippableHeaderSize - ZSTD_frameHeaderSize_prefix; /* magic number + skippable frame length */ @@ -1653,6 +1744,7 @@ size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, c dctx->expected = 0; /* not necessary to copy more */ case ZSTDds_decodeFrameHeader: + assert(src != NULL); memcpy(dctx->headerBuffer + ZSTD_frameHeaderSize_prefix, src, dctx->expected); CHECK_F(ZSTD_decodeFrameHeader(dctx, dctx->headerBuffer, dctx->headerSize)); dctx->expected = ZSTD_blockHeaderSize; @@ -1680,17 +1772,19 @@ size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, c dctx->stage = ZSTDds_getFrameHeaderSize; } } else { - dctx->expected = 3; /* go directly to next header */ + dctx->expected = ZSTD_blockHeaderSize; /* jump to next header */ dctx->stage = ZSTDds_decodeBlockHeader; } return 0; } case ZSTDds_decompressLastBlock: case ZSTDds_decompressBlock: + DEBUGLOG(5, "case ZSTDds_decompressBlock"); { size_t rSize; switch(dctx->bType) { case bt_compressed: + DEBUGLOG(5, "case bt_compressed"); rSize = ZSTD_decompressBlock_internal(dctx, dst, dstCapacity, src, srcSize); break; case bt_raw : @@ -1730,7 +1824,8 @@ size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, c return 0; } case ZSTDds_decodeSkippableHeader: - { memcpy(dctx->headerBuffer + ZSTD_frameHeaderSize_prefix, src, dctx->expected); + { assert(src != NULL); + memcpy(dctx->headerBuffer + ZSTD_frameHeaderSize_prefix, src, dctx->expected); dctx->expected = MEM_readLE32(dctx->headerBuffer + 4); dctx->stage = ZSTDds_skipFrame; return 0; @@ -1767,7 +1862,9 @@ static size_t ZSTD_loadEntropy(ZSTD_entropyTables_t* entropy, const void* const dictPtr += 8; /* skip header = magic + dictID */ - { size_t const hSize = HUF_readDTableX4(entropy->hufTable, dictPtr, dictEnd-dictPtr); + { size_t const hSize = HUF_readDTableX4_wksp( + entropy->hufTable, dictPtr, dictEnd - dictPtr, + entropy->workspace, sizeof(entropy->workspace)); if (HUF_isError(hSize)) return ERROR(dictionary_corrupted); dictPtr += hSize; } @@ -1815,7 +1912,7 @@ static size_t ZSTD_decompress_insertDictionary(ZSTD_DCtx* dctx, const void* dict { if (dictSize < 8) return ZSTD_refDictContent(dctx, dict, dictSize); { U32 const magic = MEM_readLE32(dict); - if (magic != ZSTD_DICT_MAGIC) { + if (magic != ZSTD_MAGIC_DICTIONARY) { return ZSTD_refDictContent(dctx, dict, dictSize); /* pure content mode */ } } dctx->dictID = MEM_readLE32((const char*)dict + 4); @@ -1862,10 +1959,10 @@ static size_t ZSTD_DDictDictSize(const ZSTD_DDict* ddict) return ddict->dictSize; } -static void ZSTD_refDDict(ZSTD_DCtx* dstDCtx, const ZSTD_DDict* ddict) +size_t ZSTD_decompressBegin_usingDDict(ZSTD_DCtx* dstDCtx, const ZSTD_DDict* ddict) { - ZSTD_decompressBegin(dstDCtx); /* init */ - if (ddict) { /* support refDDict on NULL */ + CHECK_F(ZSTD_decompressBegin(dstDCtx)); + if (ddict) { /* support begin on NULL */ dstDCtx->dictID = ddict->dictID; dstDCtx->base = ddict->dictContent; dstDCtx->vBase = ddict->dictContent; @@ -1886,6 +1983,7 @@ static void ZSTD_refDDict(ZSTD_DCtx* dstDCtx, const ZSTD_DDict* ddict) dstDCtx->fseEntropy = 0; } } + return 0; } static size_t ZSTD_loadEntropy_inDDict(ZSTD_DDict* ddict) @@ -1894,7 +1992,7 @@ static size_t ZSTD_loadEntropy_inDDict(ZSTD_DDict* ddict) ddict->entropyPresent = 0; if (ddict->dictSize < 8) return 0; { U32 const magic = MEM_readLE32(ddict->dictContent); - if (magic != ZSTD_DICT_MAGIC) return 0; /* pure content mode */ + if (magic != ZSTD_MAGIC_DICTIONARY) return 0; /* pure content mode */ } ddict->dictID = MEM_readLE32((const char*)ddict->dictContent + 4); @@ -1905,33 +2003,39 @@ static size_t ZSTD_loadEntropy_inDDict(ZSTD_DDict* ddict) } +static size_t ZSTD_initDDict_internal(ZSTD_DDict* ddict, const void* dict, size_t dictSize, unsigned byReference) +{ + if ((byReference) || (!dict) || (!dictSize)) { + ddict->dictBuffer = NULL; + ddict->dictContent = dict; + } else { + void* const internalBuffer = ZSTD_malloc(dictSize, ddict->cMem); + ddict->dictBuffer = internalBuffer; + ddict->dictContent = internalBuffer; + if (!internalBuffer) return ERROR(memory_allocation); + memcpy(internalBuffer, dict, dictSize); + } + ddict->dictSize = dictSize; + ddict->entropy.hufTable[0] = (HUF_DTable)((HufLog)*0x1000001); /* cover both little and big endian */ + + /* parse dictionary content */ + CHECK_F( ZSTD_loadEntropy_inDDict(ddict) ); + + return 0; +} + ZSTD_DDict* ZSTD_createDDict_advanced(const void* dict, size_t dictSize, unsigned byReference, ZSTD_customMem customMem) { - if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem; - if (!customMem.customAlloc || !customMem.customFree) return NULL; + if (!customMem.customAlloc ^ !customMem.customFree) return NULL; { ZSTD_DDict* const ddict = (ZSTD_DDict*) ZSTD_malloc(sizeof(ZSTD_DDict), customMem); if (!ddict) return NULL; ddict->cMem = customMem; - if ((byReference) || (!dict) || (!dictSize)) { - ddict->dictBuffer = NULL; - ddict->dictContent = dict; - } else { - void* const internalBuffer = ZSTD_malloc(dictSize, customMem); - if (!internalBuffer) { ZSTD_freeDDict(ddict); return NULL; } - memcpy(internalBuffer, dict, dictSize); - ddict->dictBuffer = internalBuffer; - ddict->dictContent = internalBuffer; + if (ZSTD_isError( ZSTD_initDDict_internal(ddict, dict, dictSize, byReference) )) { + ZSTD_freeDDict(ddict); + return NULL; } - ddict->dictSize = dictSize; - ddict->entropy.hufTable[0] = (HUF_DTable)((HufLog)*0x1000001); /* cover both little and big endian */ - /* parse dictionary content */ - { size_t const errorCode = ZSTD_loadEntropy_inDDict(ddict); - if (ZSTD_isError(errorCode)) { - ZSTD_freeDDict(ddict); - return NULL; - } } return ddict; } @@ -1947,7 +2051,6 @@ ZSTD_DDict* ZSTD_createDDict(const void* dict, size_t dictSize) return ZSTD_createDDict_advanced(dict, dictSize, 0, allocator); } - /*! ZSTD_createDDict_byReference() : * Create a digested dictionary, to start decompression without startup delay. * Dictionary content is simply referenced, it will be accessed during decompression. @@ -1959,6 +2062,26 @@ ZSTD_DDict* ZSTD_createDDict_byReference(const void* dictBuffer, size_t dictSize } +ZSTD_DDict* ZSTD_initStaticDDict(void* workspace, size_t workspaceSize, + const void* dict, size_t dictSize, + unsigned byReference) +{ + size_t const neededSpace = sizeof(ZSTD_DDict) + (byReference ? 0 : dictSize); + ZSTD_DDict* const ddict = (ZSTD_DDict*)workspace; + assert(workspace != NULL); + assert(dict != NULL); + if ((size_t)workspace & 7) return NULL; /* 8-aligned */ + if (workspaceSize < neededSpace) return NULL; + if (!byReference) { + memcpy(ddict+1, dict, dictSize); /* local copy */ + dict = ddict+1; + } + if (ZSTD_isError( ZSTD_initDDict_internal(ddict, dict, dictSize, 1 /* byRef */) )) + return NULL; + return ddict; +} + + size_t ZSTD_freeDDict(ZSTD_DDict* ddict) { if (ddict==NULL) return 0; /* support free on NULL */ @@ -1969,6 +2092,14 @@ size_t ZSTD_freeDDict(ZSTD_DDict* ddict) } } +/*! ZSTD_estimateDDictSize() : + * Estimate amount of memory that will be needed to create a dictionary for decompression. + * Note : dictionary created "byReference" are smaller */ +size_t ZSTD_estimateDDictSize(size_t dictSize, unsigned byReference) +{ + return sizeof(ZSTD_DDict) + (byReference ? 0 : dictSize); +} + size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict) { if (ddict==NULL) return 0; /* support sizeof on NULL */ @@ -1982,7 +2113,7 @@ size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict) unsigned ZSTD_getDictID_fromDict(const void* dict, size_t dictSize) { if (dictSize < 8) return 0; - if (MEM_readLE32(dict) != ZSTD_DICT_MAGIC) return 0; + if (MEM_readLE32(dict) != ZSTD_MAGIC_DICTIONARY) return 0; return MEM_readLE32((const char*)dict + 4); } @@ -2008,11 +2139,11 @@ unsigned ZSTD_getDictID_fromDDict(const ZSTD_DDict* ddict) * Note : possible if `srcSize < ZSTD_FRAMEHEADERSIZE_MAX`. * - This is not a Zstandard frame. * When identifying the exact failure cause, it's possible to use - * ZSTD_getFrameParams(), which will provide a more precise error code. */ + * ZSTD_getFrameHeader(), which will provide a more precise error code. */ unsigned ZSTD_getDictID_fromFrame(const void* src, size_t srcSize) { - ZSTD_frameParams zfp = { 0 , 0 , 0 , 0 }; - size_t const hError = ZSTD_getFrameParams(&zfp, src, srcSize); + ZSTD_frameHeader zfp = { 0 , 0 , 0 , 0 }; + size_t const hError = ZSTD_getFrameHeader(&zfp, src, srcSize); if (ZSTD_isError(hError)) return 0; return zfp.dictID; } @@ -2037,88 +2168,35 @@ size_t ZSTD_decompress_usingDDict(ZSTD_DCtx* dctx, * Streaming decompression *====================================*/ -typedef enum { zdss_init, zdss_loadHeader, - zdss_read, zdss_load, zdss_flush } ZSTD_dStreamStage; - -/* *** Resource management *** */ -struct ZSTD_DStream_s { - ZSTD_DCtx* dctx; - ZSTD_DDict* ddictLocal; - const ZSTD_DDict* ddict; - ZSTD_frameParams fParams; - ZSTD_dStreamStage stage; - char* inBuff; - size_t inBuffSize; - size_t inPos; - size_t maxWindowSize; - char* outBuff; - size_t outBuffSize; - size_t outStart; - size_t outEnd; - size_t blockSize; - BYTE headerBuffer[ZSTD_FRAMEHEADERSIZE_MAX]; /* tmp buffer to store frame header */ - size_t lhSize; - ZSTD_customMem customMem; - void* legacyContext; - U32 previousLegacyVersion; - U32 legacyVersion; - U32 hostageByte; -}; /* typedef'd to ZSTD_DStream within "zstd.h" */ - - ZSTD_DStream* ZSTD_createDStream(void) { - return ZSTD_createDStream_advanced(defaultCustomMem); + return ZSTD_createDStream_advanced(ZSTD_defaultCMem); +} + +ZSTD_DStream* ZSTD_initStaticDStream(void *workspace, size_t workspaceSize) +{ + return ZSTD_initStaticDCtx(workspace, workspaceSize); } ZSTD_DStream* ZSTD_createDStream_advanced(ZSTD_customMem customMem) { - ZSTD_DStream* zds; - - if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem; - if (!customMem.customAlloc || !customMem.customFree) return NULL; - - zds = (ZSTD_DStream*) ZSTD_malloc(sizeof(ZSTD_DStream), customMem); - if (zds==NULL) return NULL; - memset(zds, 0, sizeof(ZSTD_DStream)); - memcpy(&zds->customMem, &customMem, sizeof(ZSTD_customMem)); - zds->dctx = ZSTD_createDCtx_advanced(customMem); - if (zds->dctx == NULL) { ZSTD_freeDStream(zds); return NULL; } - zds->stage = zdss_init; - zds->maxWindowSize = ZSTD_MAXWINDOWSIZE_DEFAULT; - return zds; + return ZSTD_createDCtx_advanced(customMem); } size_t ZSTD_freeDStream(ZSTD_DStream* zds) { - if (zds==NULL) return 0; /* support free on null */ - { ZSTD_customMem const cMem = zds->customMem; - ZSTD_freeDCtx(zds->dctx); - zds->dctx = NULL; - ZSTD_freeDDict(zds->ddictLocal); - zds->ddictLocal = NULL; - ZSTD_free(zds->inBuff, cMem); - zds->inBuff = NULL; - ZSTD_free(zds->outBuff, cMem); - zds->outBuff = NULL; -#if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT >= 1) - if (zds->legacyContext) - ZSTD_freeLegacyStreamContext(zds->legacyContext, zds->previousLegacyVersion); -#endif - ZSTD_free(zds, cMem); - return 0; - } + return ZSTD_freeDCtx(zds); } /* *** Initialization *** */ -size_t ZSTD_DStreamInSize(void) { return ZSTD_BLOCKSIZE_ABSOLUTEMAX + ZSTD_blockHeaderSize; } -size_t ZSTD_DStreamOutSize(void) { return ZSTD_BLOCKSIZE_ABSOLUTEMAX; } +size_t ZSTD_DStreamInSize(void) { return ZSTD_BLOCKSIZE_MAX + ZSTD_blockHeaderSize; } +size_t ZSTD_DStreamOutSize(void) { return ZSTD_BLOCKSIZE_MAX; } size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize) { - zds->stage = zdss_loadHeader; + zds->streamStage = zdss_loadHeader; zds->lhSize = zds->inPos = zds->outStart = zds->outEnd = 0; ZSTD_freeDDict(zds->ddictLocal); if (dict && dictSize >= 8) { @@ -2147,7 +2225,7 @@ size_t ZSTD_initDStream_usingDDict(ZSTD_DStream* zds, const ZSTD_DDict* ddict) size_t ZSTD_resetDStream(ZSTD_DStream* zds) { - zds->stage = zdss_loadHeader; + zds->streamStage = zdss_loadHeader; zds->lhSize = zds->inPos = zds->outStart = zds->outEnd = 0; zds->legacyVersion = 0; zds->hostageByte = 0; @@ -2168,11 +2246,24 @@ size_t ZSTD_setDStreamParameter(ZSTD_DStream* zds, size_t ZSTD_sizeof_DStream(const ZSTD_DStream* zds) { - if (zds==NULL) return 0; /* support sizeof NULL */ - return sizeof(*zds) - + ZSTD_sizeof_DCtx(zds->dctx) - + ZSTD_sizeof_DDict(zds->ddictLocal) - + zds->inBuffSize + zds->outBuffSize; + return ZSTD_sizeof_DCtx(zds); +} + +size_t ZSTD_estimateDStreamSize(size_t windowSize) +{ + size_t const blockSize = MIN(windowSize, ZSTD_BLOCKSIZE_MAX); + size_t const inBuffSize = blockSize; /* no block can be larger */ + size_t const outBuffSize = windowSize + blockSize + (WILDCOPY_OVERLENGTH * 2); + return sizeof(ZSTD_DStream) + ZSTD_estimateDCtxSize() + inBuffSize + outBuffSize; +} + +ZSTDLIB_API size_t ZSTD_estimateDStreamSize_fromFrame(const void* src, size_t srcSize) +{ + ZSTD_frameHeader fh; + size_t const err = ZSTD_getFrameHeader(&fh, src, srcSize); + if (ZSTD_isError(err)) return err; + if (err>0) return ERROR(srcSize_wrong); + return ZSTD_estimateDStreamSize(fh.windowSize); } @@ -2196,44 +2287,55 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB char* op = ostart; U32 someMoreWork = 1; + DEBUGLOG(5, "ZSTD_decompressStream"); + DEBUGLOG(5, "input size : %u", (U32)(input->size - input->pos)); #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT>=1) - if (zds->legacyVersion) + if (zds->legacyVersion) { + /* legacy support is incompatible with static dctx */ + if (zds->staticSize) return ERROR(memory_allocation); return ZSTD_decompressLegacyStream(zds->legacyContext, zds->legacyVersion, output, input); + } #endif while (someMoreWork) { - switch(zds->stage) + switch(zds->streamStage) { case zdss_init : ZSTD_resetDStream(zds); /* transparent reset on starting decoding a new frame */ /* fall-through */ case zdss_loadHeader : - { size_t const hSize = ZSTD_getFrameParams(&zds->fParams, zds->headerBuffer, zds->lhSize); - if (ZSTD_isError(hSize)) + { size_t const hSize = ZSTD_getFrameHeader(&zds->fParams, zds->headerBuffer, zds->lhSize); + if (ZSTD_isError(hSize)) { #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT>=1) - { U32 const legacyVersion = ZSTD_isLegacy(istart, iend-istart); + U32 const legacyVersion = ZSTD_isLegacy(istart, iend-istart); if (legacyVersion) { const void* const dict = zds->ddict ? zds->ddict->dictContent : NULL; size_t const dictSize = zds->ddict ? zds->ddict->dictSize : 0; + /* legacy support is incompatible with static dctx */ + if (zds->staticSize) return ERROR(memory_allocation); CHECK_F(ZSTD_initLegacyStream(&zds->legacyContext, zds->previousLegacyVersion, legacyVersion, dict, dictSize)); zds->legacyVersion = zds->previousLegacyVersion = legacyVersion; return ZSTD_decompressLegacyStream(zds->legacyContext, zds->legacyVersion, output, input); } else { return hSize; /* error */ - } } + } #else - return hSize; + return hSize; #endif + } if (hSize != 0) { /* need more input */ size_t const toLoad = hSize - zds->lhSize; /* if hSize!=0, hSize > zds->lhSize */ if (toLoad > (size_t)(iend-ip)) { /* not enough input to load full header */ - memcpy(zds->headerBuffer + zds->lhSize, ip, iend-ip); - zds->lhSize += iend-ip; + if (iend-ip > 0) { + memcpy(zds->headerBuffer + zds->lhSize, ip, iend-ip); + zds->lhSize += iend-ip; + } input->pos = input->size; return (MAX(ZSTD_frameHeaderSize_min, hSize) - zds->lhSize) + ZSTD_blockHeaderSize; /* remaining header bytes + next block header */ } + assert(ip != NULL); memcpy(zds->headerBuffer + zds->lhSize, ip, toLoad); zds->lhSize = hSize; ip += toLoad; break; } } @@ -2243,74 +2345,90 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB && (U64)(size_t)(oend-op) >= zds->fParams.frameContentSize) { size_t const cSize = ZSTD_findFrameCompressedSize(istart, iend-istart); if (cSize <= (size_t)(iend-istart)) { - size_t const decompressedSize = ZSTD_decompress_usingDDict(zds->dctx, op, oend-op, istart, cSize, zds->ddict); + size_t const decompressedSize = ZSTD_decompress_usingDDict(zds, op, oend-op, istart, cSize, zds->ddict); if (ZSTD_isError(decompressedSize)) return decompressedSize; ip = istart + cSize; op += decompressedSize; - zds->dctx->expected = 0; - zds->stage = zdss_init; + zds->expected = 0; + zds->streamStage = zdss_init; someMoreWork = 0; break; } } - /* Consume header */ - ZSTD_refDDict(zds->dctx, zds->ddict); - { size_t const h1Size = ZSTD_nextSrcSizeToDecompress(zds->dctx); /* == ZSTD_frameHeaderSize_prefix */ - CHECK_F(ZSTD_decompressContinue(zds->dctx, NULL, 0, zds->headerBuffer, h1Size)); - { size_t const h2Size = ZSTD_nextSrcSizeToDecompress(zds->dctx); - CHECK_F(ZSTD_decompressContinue(zds->dctx, NULL, 0, zds->headerBuffer+h1Size, h2Size)); - } } + /* Consume header (see ZSTDds_decodeFrameHeader) */ + DEBUGLOG(4, "Consume header"); + CHECK_F(ZSTD_decompressBegin_usingDDict(zds, zds->ddict)); + if ((MEM_readLE32(zds->headerBuffer) & 0xFFFFFFF0U) == ZSTD_MAGIC_SKIPPABLE_START) { /* skippable frame */ + zds->expected = MEM_readLE32(zds->headerBuffer + 4); + zds->stage = ZSTDds_skipFrame; + } else { + CHECK_F(ZSTD_decodeFrameHeader(zds, zds->headerBuffer, zds->lhSize)); + zds->expected = ZSTD_blockHeaderSize; + zds->stage = ZSTDds_decodeBlockHeader; + } + + /* control buffer memory usage */ + DEBUGLOG(4, "Control max buffer memory usage"); zds->fParams.windowSize = MAX(zds->fParams.windowSize, 1U << ZSTD_WINDOWLOG_ABSOLUTEMIN); if (zds->fParams.windowSize > zds->maxWindowSize) return ERROR(frameParameter_windowTooLarge); /* Adapt buffer sizes to frame header instructions */ - { size_t const blockSize = MIN(zds->fParams.windowSize, ZSTD_BLOCKSIZE_ABSOLUTEMAX); + { size_t const blockSize = MIN(zds->fParams.windowSize, ZSTD_BLOCKSIZE_MAX); size_t const neededOutSize = zds->fParams.windowSize + blockSize + WILDCOPY_OVERLENGTH * 2; zds->blockSize = blockSize; - if (zds->inBuffSize < blockSize) { - ZSTD_free(zds->inBuff, zds->customMem); - zds->inBuffSize = 0; - zds->inBuff = (char*)ZSTD_malloc(blockSize, zds->customMem); - if (zds->inBuff == NULL) return ERROR(memory_allocation); + if ((zds->inBuffSize < blockSize) || (zds->outBuffSize < neededOutSize)) { + size_t const bufferSize = blockSize + neededOutSize; + DEBUGLOG(4, "inBuff : from %u to %u", + (U32)zds->inBuffSize, (U32)blockSize); + DEBUGLOG(4, "outBuff : from %u to %u", + (U32)zds->outBuffSize, (U32)neededOutSize); + if (zds->staticSize) { /* static DCtx */ + DEBUGLOG(4, "staticSize : %u", (U32)zds->staticSize); + assert(zds->staticSize >= sizeof(ZSTD_DCtx)); /* controlled at init */ + if (bufferSize > zds->staticSize - sizeof(ZSTD_DCtx)) + return ERROR(memory_allocation); + } else { + ZSTD_free(zds->inBuff, zds->customMem); + zds->inBuffSize = 0; + zds->outBuffSize = 0; + zds->inBuff = (char*)ZSTD_malloc(bufferSize, zds->customMem); + if (zds->inBuff == NULL) return ERROR(memory_allocation); + } zds->inBuffSize = blockSize; - } - if (zds->outBuffSize < neededOutSize) { - ZSTD_free(zds->outBuff, zds->customMem); - zds->outBuffSize = 0; - zds->outBuff = (char*)ZSTD_malloc(neededOutSize, zds->customMem); - if (zds->outBuff == NULL) return ERROR(memory_allocation); + zds->outBuff = zds->inBuff + zds->inBuffSize; zds->outBuffSize = neededOutSize; } } - zds->stage = zdss_read; + zds->streamStage = zdss_read; /* pass-through */ case zdss_read: - { size_t const neededInSize = ZSTD_nextSrcSizeToDecompress(zds->dctx); + DEBUGLOG(5, "stage zdss_read"); + { size_t const neededInSize = ZSTD_nextSrcSizeToDecompress(zds); + DEBUGLOG(5, "neededInSize = %u", (U32)neededInSize); if (neededInSize==0) { /* end of frame */ - zds->stage = zdss_init; + zds->streamStage = zdss_init; someMoreWork = 0; break; } if ((size_t)(iend-ip) >= neededInSize) { /* decode directly from src */ - const int isSkipFrame = ZSTD_isSkipFrame(zds->dctx); - size_t const decodedSize = ZSTD_decompressContinue(zds->dctx, + int const isSkipFrame = ZSTD_isSkipFrame(zds); + size_t const decodedSize = ZSTD_decompressContinue(zds, zds->outBuff + zds->outStart, (isSkipFrame ? 0 : zds->outBuffSize - zds->outStart), ip, neededInSize); if (ZSTD_isError(decodedSize)) return decodedSize; ip += neededInSize; if (!decodedSize && !isSkipFrame) break; /* this was just a header */ zds->outEnd = zds->outStart + decodedSize; - zds->stage = zdss_flush; + zds->streamStage = zdss_flush; break; - } - if (ip==iend) { someMoreWork = 0; break; } /* no more input */ - zds->stage = zdss_load; - /* pass-through */ - } + } } + if (ip==iend) { someMoreWork = 0; break; } /* no more input */ + zds->streamStage = zdss_load; + /* pass-through */ case zdss_load: - { size_t const neededInSize = ZSTD_nextSrcSizeToDecompress(zds->dctx); + { size_t const neededInSize = ZSTD_nextSrcSizeToDecompress(zds); size_t const toLoad = neededInSize - zds->inPos; /* should always be <= remaining space within inBuff */ size_t loadedSize; if (toLoad > zds->inBuffSize - zds->inPos) return ERROR(corruption_detected); /* should never happen */ @@ -2320,17 +2438,17 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB if (loadedSize < toLoad) { someMoreWork = 0; break; } /* not enough input, wait for more */ /* decode loaded input */ - { const int isSkipFrame = ZSTD_isSkipFrame(zds->dctx); - size_t const decodedSize = ZSTD_decompressContinue(zds->dctx, + { const int isSkipFrame = ZSTD_isSkipFrame(zds); + size_t const decodedSize = ZSTD_decompressContinue(zds, zds->outBuff + zds->outStart, zds->outBuffSize - zds->outStart, zds->inBuff, neededInSize); if (ZSTD_isError(decodedSize)) return decodedSize; zds->inPos = 0; /* input is consumed */ - if (!decodedSize && !isSkipFrame) { zds->stage = zdss_read; break; } /* this was just a header */ + if (!decodedSize && !isSkipFrame) { zds->streamStage = zdss_read; break; } /* this was just a header */ zds->outEnd = zds->outStart + decodedSize; - zds->stage = zdss_flush; - /* pass-through */ } } + zds->streamStage = zdss_flush; + /* pass-through */ case zdss_flush: { size_t const toFlushSize = zds->outEnd - zds->outStart; @@ -2338,37 +2456,41 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB op += flushedSize; zds->outStart += flushedSize; if (flushedSize == toFlushSize) { /* flush completed */ - zds->stage = zdss_read; + zds->streamStage = zdss_read; if (zds->outStart + zds->blockSize > zds->outBuffSize) zds->outStart = zds->outEnd = 0; break; - } - /* cannot complete flush */ - someMoreWork = 0; - break; - } + } } + /* cannot complete flush */ + someMoreWork = 0; + break; + default: return ERROR(GENERIC); /* impossible */ } } /* result */ input->pos += (size_t)(ip-istart); output->pos += (size_t)(op-ostart); - { size_t nextSrcSizeHint = ZSTD_nextSrcSizeToDecompress(zds->dctx); + { size_t nextSrcSizeHint = ZSTD_nextSrcSizeToDecompress(zds); if (!nextSrcSizeHint) { /* frame fully decoded */ if (zds->outEnd == zds->outStart) { /* output fully flushed */ if (zds->hostageByte) { - if (input->pos >= input->size) { zds->stage = zdss_read; return 1; } /* can't release hostage (not present) */ + if (input->pos >= input->size) { + /* can't release hostage (not present) */ + zds->streamStage = zdss_read; + return 1; + } input->pos++; /* release hostage */ - } + } /* zds->hostageByte */ return 0; - } + } /* zds->outEnd == zds->outStart */ if (!zds->hostageByte) { /* output not fully flushed; keep last byte as hostage; will be released when all output is flushed */ input->pos--; /* note : pos > 0, otherwise, impossible to finish reading last block */ zds->hostageByte=1; } return 1; - } - nextSrcSizeHint += ZSTD_blockHeaderSize * (ZSTD_nextInputType(zds->dctx) == ZSTDnit_block); /* preload header of next block */ + } /* nextSrcSizeHint==0 */ + nextSrcSizeHint += ZSTD_blockHeaderSize * (ZSTD_nextInputType(zds) == ZSTDnit_block); /* preload header of next block */ if (zds->inPos > nextSrcSizeHint) return ERROR(GENERIC); /* should never happen */ nextSrcSizeHint -= zds->inPos; /* already loaded*/ return nextSrcSizeHint; diff --git a/contrib/zstd/lib/dictBuilder/cover.c b/contrib/zstd/lib/dictBuilder/cover.c index 1863c8f34542..06c1b9fadb7a 100644 --- a/contrib/zstd/lib/dictBuilder/cover.c +++ b/contrib/zstd/lib/dictBuilder/cover.c @@ -398,7 +398,8 @@ typedef struct { */ static COVER_segment_t COVER_selectSegment(const COVER_ctx_t *ctx, U32 *freqs, COVER_map_t *activeDmers, U32 begin, - U32 end, COVER_params_t parameters) { + U32 end, + ZDICT_cover_params_t parameters) { /* Constants */ const U32 k = parameters.k; const U32 d = parameters.d; @@ -478,7 +479,7 @@ static COVER_segment_t COVER_selectSegment(const COVER_ctx_t *ctx, U32 *freqs, * Check the validity of the parameters. * Returns non-zero if the parameters are valid and 0 otherwise. */ -static int COVER_checkParameters(COVER_params_t parameters) { +static int COVER_checkParameters(ZDICT_cover_params_t parameters) { /* k and d are required parameters */ if (parameters.d == 0 || parameters.k == 0) { return 0; @@ -600,7 +601,7 @@ static int COVER_ctx_init(COVER_ctx_t *ctx, const void *samplesBuffer, static size_t COVER_buildDictionary(const COVER_ctx_t *ctx, U32 *freqs, COVER_map_t *activeDmers, void *dictBuffer, size_t dictBufferCapacity, - COVER_params_t parameters) { + ZDICT_cover_params_t parameters) { BYTE *const dict = (BYTE *)dictBuffer; size_t tail = dictBufferCapacity; /* Divide the data up into epochs of equal size. @@ -639,22 +640,10 @@ static size_t COVER_buildDictionary(const COVER_ctx_t *ctx, U32 *freqs, return tail; } -/** - * Translate from COVER_params_t to ZDICT_params_t required for finalizing the - * dictionary. - */ -static ZDICT_params_t COVER_translateParams(COVER_params_t parameters) { - ZDICT_params_t zdictParams; - memset(&zdictParams, 0, sizeof(zdictParams)); - zdictParams.notificationLevel = 1; - zdictParams.dictID = parameters.dictID; - zdictParams.compressionLevel = parameters.compressionLevel; - return zdictParams; -} - -ZDICTLIB_API size_t COVER_trainFromBuffer( +ZDICTLIB_API size_t ZDICT_trainFromBuffer_cover( void *dictBuffer, size_t dictBufferCapacity, const void *samplesBuffer, - const size_t *samplesSizes, unsigned nbSamples, COVER_params_t parameters) { + const size_t *samplesSizes, unsigned nbSamples, + ZDICT_cover_params_t parameters) { BYTE *const dict = (BYTE *)dictBuffer; COVER_ctx_t ctx; COVER_map_t activeDmers; @@ -673,7 +662,7 @@ ZDICTLIB_API size_t COVER_trainFromBuffer( return ERROR(dstSize_tooSmall); } /* Initialize global data */ - g_displayLevel = parameters.notificationLevel; + g_displayLevel = parameters.zParams.notificationLevel; /* Initialize context and activeDmers */ if (!COVER_ctx_init(&ctx, samplesBuffer, samplesSizes, nbSamples, parameters.d)) { @@ -690,10 +679,9 @@ ZDICTLIB_API size_t COVER_trainFromBuffer( const size_t tail = COVER_buildDictionary(&ctx, ctx.freqs, &activeDmers, dictBuffer, dictBufferCapacity, parameters); - ZDICT_params_t zdictParams = COVER_translateParams(parameters); const size_t dictionarySize = ZDICT_finalizeDictionary( dict, dictBufferCapacity, dict + tail, dictBufferCapacity - tail, - samplesBuffer, samplesSizes, nbSamples, zdictParams); + samplesBuffer, samplesSizes, nbSamples, parameters.zParams); if (!ZSTD_isError(dictionarySize)) { DISPLAYLEVEL(2, "Constructed dictionary of size %u\n", (U32)dictionarySize); @@ -718,7 +706,7 @@ typedef struct COVER_best_s { size_t liveJobs; void *dict; size_t dictSize; - COVER_params_t parameters; + ZDICT_cover_params_t parameters; size_t compressedSize; } COVER_best_t; @@ -786,7 +774,7 @@ static void COVER_best_start(COVER_best_t *best) { * If this dictionary is the best so far save it and its parameters. */ static void COVER_best_finish(COVER_best_t *best, size_t compressedSize, - COVER_params_t parameters, void *dict, + ZDICT_cover_params_t parameters, void *dict, size_t dictSize) { if (!best) { return; @@ -830,7 +818,7 @@ typedef struct COVER_tryParameters_data_s { const COVER_ctx_t *ctx; COVER_best_t *best; size_t dictBufferCapacity; - COVER_params_t parameters; + ZDICT_cover_params_t parameters; } COVER_tryParameters_data_t; /** @@ -842,7 +830,7 @@ static void COVER_tryParameters(void *opaque) { /* Save parameters as local variables */ COVER_tryParameters_data_t *const data = (COVER_tryParameters_data_t *)opaque; const COVER_ctx_t *const ctx = data->ctx; - const COVER_params_t parameters = data->parameters; + const ZDICT_cover_params_t parameters = data->parameters; size_t dictBufferCapacity = data->dictBufferCapacity; size_t totalCompressedSize = ERROR(GENERIC); /* Allocate space for hash table, dict, and freqs */ @@ -863,10 +851,10 @@ static void COVER_tryParameters(void *opaque) { { const size_t tail = COVER_buildDictionary(ctx, freqs, &activeDmers, dict, dictBufferCapacity, parameters); - const ZDICT_params_t zdictParams = COVER_translateParams(parameters); dictBufferCapacity = ZDICT_finalizeDictionary( dict, dictBufferCapacity, dict + tail, dictBufferCapacity - tail, - ctx->samples, ctx->samplesSizes, (unsigned)ctx->nbSamples, zdictParams); + ctx->samples, ctx->samplesSizes, (unsigned)ctx->nbSamples, + parameters.zParams); if (ZDICT_isError(dictBufferCapacity)) { DISPLAYLEVEL(1, "Failed to finalize dictionary\n"); goto _cleanup; @@ -892,8 +880,8 @@ static void COVER_tryParameters(void *opaque) { } /* Create the cctx and cdict */ cctx = ZSTD_createCCtx(); - cdict = - ZSTD_createCDict(dict, dictBufferCapacity, parameters.compressionLevel); + cdict = ZSTD_createCDict(dict, dictBufferCapacity, + parameters.zParams.compressionLevel); if (!dst || !cctx || !cdict) { goto _compressCleanup; } @@ -930,12 +918,10 @@ _cleanup: } } -ZDICTLIB_API size_t COVER_optimizeTrainFromBuffer(void *dictBuffer, - size_t dictBufferCapacity, - const void *samplesBuffer, - const size_t *samplesSizes, - unsigned nbSamples, - COVER_params_t *parameters) { +ZDICTLIB_API size_t ZDICT_optimizeTrainFromBuffer_cover( + void *dictBuffer, size_t dictBufferCapacity, const void *samplesBuffer, + const size_t *samplesSizes, unsigned nbSamples, + ZDICT_cover_params_t *parameters) { /* constants */ const unsigned nbThreads = parameters->nbThreads; const unsigned kMinD = parameters->d == 0 ? 6 : parameters->d; @@ -947,7 +933,7 @@ ZDICTLIB_API size_t COVER_optimizeTrainFromBuffer(void *dictBuffer, const unsigned kIterations = (1 + (kMaxD - kMinD) / 2) * (1 + (kMaxK - kMinK) / kStepSize); /* Local variables */ - const int displayLevel = parameters->notificationLevel; + const int displayLevel = parameters->zParams.notificationLevel; unsigned iteration = 1; unsigned d; unsigned k; @@ -976,7 +962,7 @@ ZDICTLIB_API size_t COVER_optimizeTrainFromBuffer(void *dictBuffer, /* Initialization */ COVER_best_init(&best); /* Turn down global display level to clean up display at level 2 and below */ - g_displayLevel = parameters->notificationLevel - 1; + g_displayLevel = parameters->zParams.notificationLevel - 1; /* Loop through d first because each new value needs a new context */ LOCALDISPLAYLEVEL(displayLevel, 2, "Trying %u different sets of parameters\n", kIterations); diff --git a/contrib/zstd/lib/dictBuilder/zdict.c b/contrib/zstd/lib/dictBuilder/zdict.c index 179e02effa4d..742586eacdd2 100644 --- a/contrib/zstd/lib/dictBuilder/zdict.c +++ b/contrib/zstd/lib/dictBuilder/zdict.c @@ -94,7 +94,7 @@ const char* ZDICT_getErrorName(size_t errorCode) { return ERR_getErrorName(error unsigned ZDICT_getDictID(const void* dictBuffer, size_t dictSize) { if (dictSize < 8) return 0; - if (MEM_readLE32(dictBuffer) != ZSTD_DICT_MAGIC) return 0; + if (MEM_readLE32(dictBuffer) != ZSTD_MAGIC_DICTIONARY) return 0; return MEM_readLE32((const char*)dictBuffer + 4); } @@ -487,7 +487,7 @@ static U32 ZDICT_dictSize(const dictItem* dictList) } -static size_t ZDICT_trainBuffer(dictItem* dictList, U32 dictListSize, +static size_t ZDICT_trainBuffer_legacy(dictItem* dictList, U32 dictListSize, const void* const buffer, size_t bufferSize, /* buffer must end with noisy guard band */ const size_t* fileSizes, unsigned nbFiles, U32 minRatio, U32 notificationLevel) @@ -576,7 +576,7 @@ typedef struct { ZSTD_CCtx* ref; ZSTD_CCtx* zc; - void* workPlace; /* must be ZSTD_BLOCKSIZE_ABSOLUTEMAX allocated */ + void* workPlace; /* must be ZSTD_BLOCKSIZE_MAX allocated */ } EStats_ress_t; #define MAXREPOFFSET 1024 @@ -585,14 +585,14 @@ static void ZDICT_countEStats(EStats_ress_t esr, ZSTD_parameters params, U32* countLit, U32* offsetcodeCount, U32* matchlengthCount, U32* litlengthCount, U32* repOffsets, const void* src, size_t srcSize, U32 notificationLevel) { - size_t const blockSizeMax = MIN (ZSTD_BLOCKSIZE_ABSOLUTEMAX, 1 << params.cParams.windowLog); + size_t const blockSizeMax = MIN (ZSTD_BLOCKSIZE_MAX, 1 << params.cParams.windowLog); size_t cSize; if (srcSize > blockSizeMax) srcSize = blockSizeMax; /* protection vs large samples */ { size_t const errorCode = ZSTD_copyCCtx(esr.zc, esr.ref, 0); if (ZSTD_isError(errorCode)) { DISPLAYLEVEL(1, "warning : ZSTD_copyCCtx failed \n"); return; } } - cSize = ZSTD_compressBlock(esr.zc, esr.workPlace, ZSTD_BLOCKSIZE_ABSOLUTEMAX, src, srcSize); + cSize = ZSTD_compressBlock(esr.zc, esr.workPlace, ZSTD_BLOCKSIZE_MAX, src, srcSize); if (ZSTD_isError(cSize)) { DISPLAYLEVEL(3, "warning : could not compress sample size %u \n", (U32)srcSize); return; } if (cSize) { /* if == 0; block is not compressible */ @@ -634,17 +634,6 @@ static void ZDICT_countEStats(EStats_ress_t esr, ZSTD_parameters params, } } } } -/* -static size_t ZDICT_maxSampleSize(const size_t* fileSizes, unsigned nbFiles) -{ - unsigned u; - size_t max=0; - for (u=0; u = 3) { + if (params.zParams.notificationLevel>= 3) { U32 const nb = MIN(25, dictList[0].pos); U32 const dictContentSize = ZDICT_dictSize(dictList); U32 u; @@ -1026,7 +1015,7 @@ size_t ZDICT_trainFromBuffer_unsafe( dictSize = ZDICT_addEntropyTablesFromBuffer_advanced(dictBuffer, dictContentSize, maxDictSize, samplesBuffer, samplesSizes, nbSamples, - params); + params.zParams); } /* clean up */ @@ -1037,9 +1026,9 @@ size_t ZDICT_trainFromBuffer_unsafe( /* issue : samplesBuffer need to be followed by a noisy guard band. * work around : duplicate the buffer, and add the noise */ -size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacity, - const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, - ZDICT_params_t params) +size_t ZDICT_trainFromBuffer_legacy(void* dictBuffer, size_t dictBufferCapacity, + const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, + ZDICT_legacy_params_t params) { size_t result; void* newBuff; @@ -1052,10 +1041,9 @@ size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacit memcpy(newBuff, samplesBuffer, sBuffSize); ZDICT_fillNoise((char*)newBuff + sBuffSize, NOISELENGTH); /* guard band, for end of buffer condition */ - result = ZDICT_trainFromBuffer_unsafe( - dictBuffer, dictBufferCapacity, - newBuff, samplesSizes, nbSamples, - params); + result = + ZDICT_trainFromBuffer_unsafe_legacy(dictBuffer, dictBufferCapacity, newBuff, + samplesSizes, nbSamples, params); free(newBuff); return result; } @@ -1064,11 +1052,13 @@ size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacit size_t ZDICT_trainFromBuffer(void* dictBuffer, size_t dictBufferCapacity, const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples) { - ZDICT_params_t params; + ZDICT_cover_params_t params; memset(¶ms, 0, sizeof(params)); - return ZDICT_trainFromBuffer_advanced(dictBuffer, dictBufferCapacity, - samplesBuffer, samplesSizes, nbSamples, - params); + params.d = 8; + params.steps = 4; + return ZDICT_optimizeTrainFromBuffer_cover(dictBuffer, dictBufferCapacity, + samplesBuffer, samplesSizes, + nbSamples, ¶ms); } size_t ZDICT_addEntropyTablesFromBuffer(void* dictBuffer, size_t dictContentSize, size_t dictBufferCapacity, diff --git a/contrib/zstd/lib/dictBuilder/zdict.h b/contrib/zstd/lib/dictBuilder/zdict.h index 9b53de346527..7bfbb351a1dd 100644 --- a/contrib/zstd/lib/dictBuilder/zdict.h +++ b/contrib/zstd/lib/dictBuilder/zdict.h @@ -20,10 +20,12 @@ extern "C" { /* ===== ZDICTLIB_API : control library symbols visibility ===== */ -#if defined(__GNUC__) && (__GNUC__ >= 4) -# define ZDICTLIB_VISIBILITY __attribute__ ((visibility ("default"))) -#else -# define ZDICTLIB_VISIBILITY +#ifndef ZDICTLIB_VISIBILITY +# if defined(__GNUC__) && (__GNUC__ >= 4) +# define ZDICTLIB_VISIBILITY __attribute__ ((visibility ("default"))) +# else +# define ZDICTLIB_VISIBILITY +# endif #endif #if defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1) # define ZDICTLIB_API __declspec(dllexport) ZDICTLIB_VISIBILITY @@ -34,18 +36,20 @@ extern "C" { #endif -/*! ZDICT_trainFromBuffer() : - Train a dictionary from an array of samples. - Samples must be stored concatenated in a single flat buffer `samplesBuffer`, - supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order. - The resulting dictionary will be saved into `dictBuffer`. - @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`) - or an error code, which can be tested with ZDICT_isError(). - Tips : In general, a reasonable dictionary has a size of ~ 100 KB. - It's obviously possible to target smaller or larger ones, just by specifying different `dictBufferCapacity`. - In general, it's recommended to provide a few thousands samples, but this can vary a lot. - It's recommended that total size of all samples be about ~x100 times the target size of dictionary. -*/ +/*! ZDICT_trainFromBuffer(): + * Train a dictionary from an array of samples. + * Uses ZDICT_optimizeTrainFromBuffer_cover() single-threaded, with d=8 and steps=4. + * Samples must be stored concatenated in a single flat buffer `samplesBuffer`, + * supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order. + * The resulting dictionary will be saved into `dictBuffer`. + * @return: size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`) + * or an error code, which can be tested with ZDICT_isError(). + * Note: ZDICT_trainFromBuffer() requires about 9 bytes of memory for each input byte. + * Tips: In general, a reasonable dictionary has a size of ~ 100 KB. + * It's obviously possible to target smaller or larger ones, just by specifying different `dictBufferCapacity`. + * In general, it's recommended to provide a few thousands samples, but this can vary a lot. + * It's recommended that total size of all samples be about ~x100 times the target size of dictionary. + */ ZDICTLIB_API size_t ZDICT_trainFromBuffer(void* dictBuffer, size_t dictBufferCapacity, const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples); @@ -67,94 +71,78 @@ ZDICTLIB_API const char* ZDICT_getErrorName(size_t errorCode); * ==================================================================================== */ typedef struct { - unsigned selectivityLevel; /* 0 means default; larger => select more => larger dictionary */ int compressionLevel; /* 0 means default; target a specific zstd compression level */ unsigned notificationLevel; /* Write to stderr; 0 = none (default); 1 = errors; 2 = progression; 3 = details; 4 = debug; */ unsigned dictID; /* 0 means auto mode (32-bits random value); other : force dictID value */ - unsigned reserved[2]; /* reserved space for future parameters */ } ZDICT_params_t; - -/*! ZDICT_trainFromBuffer_advanced() : - Same as ZDICT_trainFromBuffer() with control over more parameters. - `parameters` is optional and can be provided with values set to 0 to mean "default". - @return : size of dictionary stored into `dictBuffer` (<= `dictBufferSize`), - or an error code, which can be tested by ZDICT_isError(). - note : ZDICT_trainFromBuffer_advanced() will send notifications into stderr if instructed to, using notificationLevel>0. -*/ -ZDICTLIB_API size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacity, - const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, - ZDICT_params_t parameters); - -/*! COVER_params_t : - For all values 0 means default. - k and d are the only required parameters. -*/ +/*! ZDICT_cover_params_t: + * For all values 0 means default. + * k and d are the only required parameters. + */ typedef struct { unsigned k; /* Segment size : constraint: 0 < k : Reasonable range [16, 2048+] */ unsigned d; /* dmer size : constraint: 0 < d <= k : Reasonable range [6, 16] */ unsigned steps; /* Number of steps : Only used for optimization : 0 means default (32) : Higher means more parameters checked */ - unsigned nbThreads; /* Number of threads : constraint: 0 < nbThreads : 1 means single-threaded : Only used for optimization : Ignored if ZSTD_MULTITHREAD is not defined */ - unsigned notificationLevel; /* Write to stderr; 0 = none (default); 1 = errors; 2 = progression; 3 = details; 4 = debug; */ - unsigned dictID; /* 0 means auto mode (32-bits random value); other : force dictID value */ - int compressionLevel; /* 0 means default; target a specific zstd compression level */ -} COVER_params_t; + ZDICT_params_t zParams; +} ZDICT_cover_params_t; -/*! COVER_trainFromBuffer() : - Train a dictionary from an array of samples using the COVER algorithm. - Samples must be stored concatenated in a single flat buffer `samplesBuffer`, - supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order. - The resulting dictionary will be saved into `dictBuffer`. - @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`) - or an error code, which can be tested with ZDICT_isError(). - Note : COVER_trainFromBuffer() requires about 9 bytes of memory for each input byte. - Tips : In general, a reasonable dictionary has a size of ~ 100 KB. - It's obviously possible to target smaller or larger ones, just by specifying different `dictBufferCapacity`. - In general, it's recommended to provide a few thousands samples, but this can vary a lot. - It's recommended that total size of all samples be about ~x100 times the target size of dictionary. -*/ -ZDICTLIB_API size_t COVER_trainFromBuffer(void* dictBuffer, size_t dictBufferCapacity, - const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, - COVER_params_t parameters); +/*! ZDICT_trainFromBuffer_cover(): + * Train a dictionary from an array of samples using the COVER algorithm. + * Samples must be stored concatenated in a single flat buffer `samplesBuffer`, + * supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order. + * The resulting dictionary will be saved into `dictBuffer`. + * @return: size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`) + * or an error code, which can be tested with ZDICT_isError(). + * Note: ZDICT_trainFromBuffer_cover() requires about 9 bytes of memory for each input byte. + * Tips: In general, a reasonable dictionary has a size of ~ 100 KB. + * It's obviously possible to target smaller or larger ones, just by specifying different `dictBufferCapacity`. + * In general, it's recommended to provide a few thousands samples, but this can vary a lot. + * It's recommended that total size of all samples be about ~x100 times the target size of dictionary. + */ +ZDICTLIB_API size_t ZDICT_trainFromBuffer_cover( + void *dictBuffer, size_t dictBufferCapacity, const void *samplesBuffer, + const size_t *samplesSizes, unsigned nbSamples, + ZDICT_cover_params_t parameters); -/*! COVER_optimizeTrainFromBuffer() : - The same requirements as above hold for all the parameters except `parameters`. - This function tries many parameter combinations and picks the best parameters. - `*parameters` is filled with the best parameters found, and the dictionary - constructed with those parameters is stored in `dictBuffer`. +/*! ZDICT_optimizeTrainFromBuffer_cover(): + * The same requirements as above hold for all the parameters except `parameters`. + * This function tries many parameter combinations and picks the best parameters. + * `*parameters` is filled with the best parameters found, and the dictionary + * constructed with those parameters is stored in `dictBuffer`. + * + * All of the parameters d, k, steps are optional. + * If d is non-zero then we don't check multiple values of d, otherwise we check d = {6, 8, 10, 12, 14, 16}. + * if steps is zero it defaults to its default value. + * If k is non-zero then we don't check multiple values of k, otherwise we check steps values in [16, 2048]. + * + * @return: size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`) + * or an error code, which can be tested with ZDICT_isError(). + * On success `*parameters` contains the parameters selected. + * Note: ZDICT_optimizeTrainFromBuffer_cover() requires about 8 bytes of memory for each input byte and additionally another 5 bytes of memory for each byte of memory for each thread. + */ +ZDICTLIB_API size_t ZDICT_optimizeTrainFromBuffer_cover( + void *dictBuffer, size_t dictBufferCapacity, const void *samplesBuffer, + const size_t *samplesSizes, unsigned nbSamples, + ZDICT_cover_params_t *parameters); - All of the parameters d, k, steps are optional. - If d is non-zero then we don't check multiple values of d, otherwise we check d = {6, 8, 10, 12, 14, 16}. - if steps is zero it defaults to its default value. - If k is non-zero then we don't check multiple values of k, otherwise we check steps values in [16, 2048]. - - @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`) - or an error code, which can be tested with ZDICT_isError(). - On success `*parameters` contains the parameters selected. - Note : COVER_optimizeTrainFromBuffer() requires about 8 bytes of memory for each input byte and additionally another 5 bytes of memory for each byte of memory for each thread. -*/ -ZDICTLIB_API size_t COVER_optimizeTrainFromBuffer(void* dictBuffer, size_t dictBufferCapacity, - const void* samplesBuffer, const size_t *samplesSizes, unsigned nbSamples, - COVER_params_t *parameters); - -/*! ZDICT_finalizeDictionary() : - - Given a custom content as a basis for dictionary, and a set of samples, - finalize dictionary by adding headers and statistics. - - Samples must be stored concatenated in a flat buffer `samplesBuffer`, - supplied with an array of sizes `samplesSizes`, providing the size of each sample in order. - - dictContentSize must be >= ZDICT_CONTENTSIZE_MIN bytes. - maxDictSize must be >= dictContentSize, and must be >= ZDICT_DICTSIZE_MIN bytes. - - @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`), - or an error code, which can be tested by ZDICT_isError(). - note : ZDICT_finalizeDictionary() will push notifications into stderr if instructed to, using notificationLevel>0. - note 2 : dictBuffer and dictContent can overlap -*/ +/*! ZDICT_finalizeDictionary(): + * Given a custom content as a basis for dictionary, and a set of samples, + * finalize dictionary by adding headers and statistics. + * + * Samples must be stored concatenated in a flat buffer `samplesBuffer`, + * supplied with an array of sizes `samplesSizes`, providing the size of each sample in order. + * + * dictContentSize must be >= ZDICT_CONTENTSIZE_MIN bytes. + * maxDictSize must be >= dictContentSize, and must be >= ZDICT_DICTSIZE_MIN bytes. + * + * @return: size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`), + * or an error code, which can be tested by ZDICT_isError(). + * Note: ZDICT_finalizeDictionary() will push notifications into stderr if instructed to, using notificationLevel>0. + * Note 2: dictBuffer and dictContent can overlap + */ #define ZDICT_CONTENTSIZE_MIN 128 #define ZDICT_DICTSIZE_MIN 256 ZDICTLIB_API size_t ZDICT_finalizeDictionary(void* dictBuffer, size_t dictBufferCapacity, @@ -162,7 +150,28 @@ ZDICTLIB_API size_t ZDICT_finalizeDictionary(void* dictBuffer, size_t dictBuffer const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, ZDICT_params_t parameters); +typedef struct { + unsigned selectivityLevel; /* 0 means default; larger => select more => larger dictionary */ + ZDICT_params_t zParams; +} ZDICT_legacy_params_t; +/*! ZDICT_trainFromBuffer_legacy(): + * Train a dictionary from an array of samples. + * Samples must be stored concatenated in a single flat buffer `samplesBuffer`, + * supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order. + * The resulting dictionary will be saved into `dictBuffer`. + * `parameters` is optional and can be provided with values set to 0 to mean "default". + * @return: size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`) + * or an error code, which can be tested with ZDICT_isError(). + * Tips: In general, a reasonable dictionary has a size of ~ 100 KB. + * It's obviously possible to target smaller or larger ones, just by specifying different `dictBufferCapacity`. + * In general, it's recommended to provide a few thousands samples, but this can vary a lot. + * It's recommended that total size of all samples be about ~x100 times the target size of dictionary. + * Note: ZDICT_trainFromBuffer_legacy() will send notifications into stderr if instructed to, using notificationLevel>0. + */ +ZDICTLIB_API size_t ZDICT_trainFromBuffer_legacy( + void *dictBuffer, size_t dictBufferCapacity, const void *samplesBuffer, + const size_t *samplesSizes, unsigned nbSamples, ZDICT_legacy_params_t parameters); /* Deprecation warnings */ /* It is generally possible to disable deprecation warnings from compiler, diff --git a/contrib/zstd/lib/legacy/zstd_v04.c b/contrib/zstd/lib/legacy/zstd_v04.c index 1a29da92d1e8..8b8e23cb09cb 100644 --- a/contrib/zstd/lib/legacy/zstd_v04.c +++ b/contrib/zstd/lib/legacy/zstd_v04.c @@ -816,13 +816,13 @@ MEM_STATIC size_t BIT_initDStream(BIT_DStream_t* bitD, const void* srcBuffer, si bitD->bitContainer = *(const BYTE*)(bitD->start); switch(srcSize) { - case 7: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[6]) << (sizeof(size_t)*8 - 16); - case 6: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[5]) << (sizeof(size_t)*8 - 24); - case 5: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[4]) << (sizeof(size_t)*8 - 32); - case 4: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[3]) << 24; - case 3: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[2]) << 16; - case 2: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[1]) << 8; - default:; + case 7: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[6]) << (sizeof(size_t)*8 - 16);/* fall-through */ + case 6: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[5]) << (sizeof(size_t)*8 - 24);/* fall-through */ + case 5: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[4]) << (sizeof(size_t)*8 - 32);/* fall-through */ + case 4: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[3]) << 24; /* fall-through */ + case 3: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[2]) << 16; /* fall-through */ + case 2: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[1]) << 8; /* fall-through */ + default: break; } contain32 = ((const BYTE*)srcBuffer)[srcSize-1]; if (contain32 == 0) return ERROR(GENERIC); /* endMark not present */ @@ -3665,7 +3665,7 @@ static size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDs break; } zbc->stage = ZBUFFds_read; - + /* fall-through */ case ZBUFFds_read: { size_t neededInSize = ZSTD_nextSrcSizeToDecompress(zbc->zc); @@ -3691,7 +3691,7 @@ static size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDs if (ip==iend) { notDone = 0; break; } /* no more input */ zbc->stage = ZBUFFds_load; } - + /* fall-through */ case ZBUFFds_load: { size_t neededInSize = ZSTD_nextSrcSizeToDecompress(zbc->zc); @@ -3711,9 +3711,10 @@ static size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDs if (!decodedSize) { zbc->stage = ZBUFFds_read; break; } /* this was just a header */ zbc->outEnd = zbc->outStart + decodedSize; zbc->stage = ZBUFFds_flush; - // break; /* ZBUFFds_flush follows */ + /* ZBUFFds_flush follows */ } } + /* fall-through */ case ZBUFFds_flush: { size_t toFlushSize = zbc->outEnd - zbc->outStart; diff --git a/contrib/zstd/lib/legacy/zstd_v05.c b/contrib/zstd/lib/legacy/zstd_v05.c index 674f5b0e4a8e..e929618a3bf5 100644 --- a/contrib/zstd/lib/legacy/zstd_v05.c +++ b/contrib/zstd/lib/legacy/zstd_v05.c @@ -326,13 +326,6 @@ size_t ZSTDv05_decompress_usingPreparedDCtx( * Streaming functions (direct mode) ****************************************/ size_t ZSTDv05_decompressBegin(ZSTDv05_DCtx* dctx); -size_t ZSTDv05_decompressBegin_usingDict(ZSTDv05_DCtx* dctx, const void* dict, size_t dictSize); -void ZSTDv05_copyDCtx(ZSTDv05_DCtx* dctx, const ZSTDv05_DCtx* preparedDCtx); - -size_t ZSTDv05_getFrameParams(ZSTDv05_parameters* params, const void* src, size_t srcSize); - -size_t ZSTDv05_nextSrcSizeToDecompress(ZSTDv05_DCtx* dctx); -size_t ZSTDv05_decompressContinue(ZSTDv05_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize); /* Streaming decompression, direct mode (bufferless) @@ -818,13 +811,13 @@ MEM_STATIC size_t BITv05_initDStream(BITv05_DStream_t* bitD, const void* srcBuff bitD->bitContainer = *(const BYTE*)(bitD->start); switch(srcSize) { - case 7: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[6]) << (sizeof(size_t)*8 - 16); - case 6: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[5]) << (sizeof(size_t)*8 - 24); - case 5: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[4]) << (sizeof(size_t)*8 - 32); - case 4: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[3]) << 24; - case 3: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[2]) << 16; - case 2: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[1]) << 8; - default:; + case 7: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[6]) << (sizeof(size_t)*8 - 16);/* fall-through */ + case 6: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[5]) << (sizeof(size_t)*8 - 24);/* fall-through */ + case 5: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[4]) << (sizeof(size_t)*8 - 32);/* fall-through */ + case 4: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[3]) << 24; /* fall-through */ + case 3: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[2]) << 16; /* fall-through */ + case 2: bitD->bitContainer += (size_t)(((const BYTE*)(bitD->start))[1]) << 8; /* fall-through */ + default: break; } contain32 = ((const BYTE*)srcBuffer)[srcSize-1]; if (contain32 == 0) return ERROR(GENERIC); /* endMark not present */ @@ -3955,7 +3948,7 @@ size_t ZBUFFv05_decompressContinue(ZBUFFv05_DCtx* zbc, void* dst, size_t* maxDst zbc->stage = ZBUFFv05ds_decodeHeader; break; } - + /* fall-through */ case ZBUFFv05ds_loadHeader: /* complete header from src */ { @@ -3973,7 +3966,7 @@ size_t ZBUFFv05_decompressContinue(ZBUFFv05_DCtx* zbc, void* dst, size_t* maxDst } // zbc->stage = ZBUFFv05ds_decodeHeader; break; /* useless : stage follows */ } - + /* fall-through */ case ZBUFFv05ds_decodeHeader: /* apply header to create / resize buffers */ { @@ -4000,7 +3993,7 @@ size_t ZBUFFv05_decompressContinue(ZBUFFv05_DCtx* zbc, void* dst, size_t* maxDst break; } zbc->stage = ZBUFFv05ds_read; - + /* fall-through */ case ZBUFFv05ds_read: { size_t neededInSize = ZSTDv05_nextSrcSizeToDecompress(zbc->zc); @@ -4024,7 +4017,7 @@ size_t ZBUFFv05_decompressContinue(ZBUFFv05_DCtx* zbc, void* dst, size_t* maxDst if (ip==iend) { notDone = 0; break; } /* no more input */ zbc->stage = ZBUFFv05ds_load; } - + /* fall-through */ case ZBUFFv05ds_load: { size_t neededInSize = ZSTDv05_nextSrcSizeToDecompress(zbc->zc); @@ -4045,7 +4038,9 @@ size_t ZBUFFv05_decompressContinue(ZBUFFv05_DCtx* zbc, void* dst, size_t* maxDst zbc->outEnd = zbc->outStart + decodedSize; zbc->stage = ZBUFFv05ds_flush; // break; /* ZBUFFv05ds_flush follows */ - } } + } + } + /* fall-through */ case ZBUFFv05ds_flush: { size_t toFlushSize = zbc->outEnd - zbc->outStart; diff --git a/contrib/zstd/lib/legacy/zstd_v06.c b/contrib/zstd/lib/legacy/zstd_v06.c index ad8c4cd3186d..26f0929da6fd 100644 --- a/contrib/zstd/lib/legacy/zstd_v06.c +++ b/contrib/zstd/lib/legacy/zstd_v06.c @@ -910,13 +910,13 @@ MEM_STATIC size_t BITv06_initDStream(BITv06_DStream_t* bitD, const void* srcBuff bitD->bitContainer = *(const BYTE*)(bitD->start); switch(srcSize) { - case 7: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[6]) << (sizeof(bitD->bitContainer)*8 - 16); - case 6: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[5]) << (sizeof(bitD->bitContainer)*8 - 24); - case 5: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[4]) << (sizeof(bitD->bitContainer)*8 - 32); - case 4: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[3]) << 24; - case 3: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[2]) << 16; - case 2: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[1]) << 8; - default:; + case 7: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[6]) << (sizeof(bitD->bitContainer)*8 - 16);/* fall-through */ + case 6: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[5]) << (sizeof(bitD->bitContainer)*8 - 24);/* fall-through */ + case 5: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[4]) << (sizeof(bitD->bitContainer)*8 - 32);/* fall-through */ + case 4: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[3]) << 24; /* fall-through */ + case 3: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[2]) << 16; /* fall-through */ + case 2: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[1]) << 8; /* fall-through */ + default: break; } { BYTE const lastByte = ((const BYTE*)srcBuffer)[srcSize-1]; if (lastByte == 0) return ERROR(GENERIC); /* endMark not present */ @@ -3789,7 +3789,7 @@ size_t ZSTDv06_decompressContinue(ZSTDv06_DCtx* dctx, void* dst, size_t dstCapac return 0; } dctx->expected = 0; /* not necessary to copy more */ - + /* fall-through */ case ZSTDds_decodeFrameHeader: { size_t result; memcpy(dctx->headerBuffer + ZSTDv06_frameHeaderSize_min, src, dctx->expected); @@ -4116,7 +4116,7 @@ size_t ZBUFFv06_decompressContinue(ZBUFFv06_DCtx* zbd, if (zbd->outBuff == NULL) return ERROR(memory_allocation); } } } zbd->stage = ZBUFFds_read; - + /* fall-through */ case ZBUFFds_read: { size_t const neededInSize = ZSTDv06_nextSrcSizeToDecompress(zbd->zd); if (neededInSize==0) { /* end of frame */ @@ -4138,7 +4138,7 @@ size_t ZBUFFv06_decompressContinue(ZBUFFv06_DCtx* zbd, if (ip==iend) { notDone = 0; break; } /* no more input */ zbd->stage = ZBUFFds_load; } - + /* fall-through */ case ZBUFFds_load: { size_t const neededInSize = ZSTDv06_nextSrcSizeToDecompress(zbd->zd); size_t const toLoad = neededInSize - zbd->inPos; /* should always be <= remaining space within inBuff */ @@ -4159,8 +4159,9 @@ size_t ZBUFFv06_decompressContinue(ZBUFFv06_DCtx* zbd, zbd->outEnd = zbd->outStart + decodedSize; zbd->stage = ZBUFFds_flush; // break; /* ZBUFFds_flush follows */ - } } - + } + } + /* fall-through */ case ZBUFFds_flush: { size_t const toFlushSize = zbd->outEnd - zbd->outStart; size_t const flushedSize = ZBUFFv06_limitCopy(op, oend-op, zbd->outBuff + zbd->outStart, toFlushSize); diff --git a/contrib/zstd/lib/legacy/zstd_v07.c b/contrib/zstd/lib/legacy/zstd_v07.c index a54ad0ffdba9..6669b71cea40 100644 --- a/contrib/zstd/lib/legacy/zstd_v07.c +++ b/contrib/zstd/lib/legacy/zstd_v07.c @@ -580,13 +580,13 @@ MEM_STATIC size_t BITv07_initDStream(BITv07_DStream_t* bitD, const void* srcBuff bitD->bitContainer = *(const BYTE*)(bitD->start); switch(srcSize) { - case 7: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[6]) << (sizeof(bitD->bitContainer)*8 - 16); - case 6: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[5]) << (sizeof(bitD->bitContainer)*8 - 24); - case 5: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[4]) << (sizeof(bitD->bitContainer)*8 - 32); - case 4: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[3]) << 24; - case 3: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[2]) << 16; - case 2: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[1]) << 8; - default:; + case 7: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[6]) << (sizeof(bitD->bitContainer)*8 - 16);/* fall-through */ + case 6: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[5]) << (sizeof(bitD->bitContainer)*8 - 24);/* fall-through */ + case 5: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[4]) << (sizeof(bitD->bitContainer)*8 - 32);/* fall-through */ + case 4: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[3]) << 24; /* fall-through */ + case 3: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[2]) << 16; /* fall-through */ + case 2: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[1]) << 8; /* fall-through */ + default: break; } { BYTE const lastByte = ((const BYTE*)srcBuffer)[srcSize-1]; bitD->bitsConsumed = lastByte ? 8 - BITv07_highbit32(lastByte) : 0; @@ -2920,8 +2920,6 @@ typedef struct { void ZSTDv07_seqToCodes(const seqStore_t* seqStorePtr, size_t const nbSeq); /* custom memory allocation functions */ -void* ZSTDv07_defaultAllocFunction(void* opaque, size_t size); -void ZSTDv07_defaultFreeFunction(void* opaque, void* address); static const ZSTDv07_customMem defaultCustomMem = { ZSTDv07_defaultAllocFunction, ZSTDv07_defaultFreeFunction, NULL }; #endif /* ZSTDv07_CCOMMON_H_MODULE */ @@ -4047,7 +4045,7 @@ size_t ZSTDv07_decompressContinue(ZSTDv07_DCtx* dctx, void* dst, size_t dstCapac return 0; } dctx->expected = 0; /* not necessary to copy more */ - + /* fall-through */ case ZSTDds_decodeFrameHeader: { size_t result; memcpy(dctx->headerBuffer + ZSTDv07_frameHeaderSize_min, src, dctx->expected); @@ -4494,7 +4492,7 @@ size_t ZBUFFv07_decompressContinue(ZBUFFv07_DCtx* zbd, } } } zbd->stage = ZBUFFds_read; /* pass-through */ - + /* fall-through */ case ZBUFFds_read: { size_t const neededInSize = ZSTDv07_nextSrcSizeToDecompress(zbd->zd); if (neededInSize==0) { /* end of frame */ @@ -4517,7 +4515,7 @@ size_t ZBUFFv07_decompressContinue(ZBUFFv07_DCtx* zbd, if (ip==iend) { notDone = 0; break; } /* no more input */ zbd->stage = ZBUFFds_load; } - + /* fall-through */ case ZBUFFds_load: { size_t const neededInSize = ZSTDv07_nextSrcSizeToDecompress(zbd->zd); size_t const toLoad = neededInSize - zbd->inPos; /* should always be <= remaining space within inBuff */ @@ -4540,8 +4538,9 @@ size_t ZBUFFv07_decompressContinue(ZBUFFv07_DCtx* zbd, zbd->stage = ZBUFFds_flush; /* break; */ /* pass-through */ - } } - + } + } + /* fall-through */ case ZBUFFds_flush: { size_t const toFlushSize = zbd->outEnd - zbd->outStart; size_t const flushedSize = ZBUFFv07_limitCopy(op, oend-op, zbd->outBuff + zbd->outStart, toFlushSize); diff --git a/contrib/zstd/lib/zstd.h b/contrib/zstd/lib/zstd.h index f8050c136104..58e9a5606db8 100644 --- a/contrib/zstd/lib/zstd.h +++ b/contrib/zstd/lib/zstd.h @@ -19,10 +19,12 @@ extern "C" { /* ===== ZSTDLIB_API : control library symbols visibility ===== */ -#if defined(__GNUC__) && (__GNUC__ >= 4) -# define ZSTDLIB_VISIBILITY __attribute__ ((visibility ("default"))) -#else -# define ZSTDLIB_VISIBILITY +#ifndef ZSTDLIB_VISIBILITY +# if defined(__GNUC__) && (__GNUC__ >= 4) +# define ZSTDLIB_VISIBILITY __attribute__ ((visibility ("default"))) +# else +# define ZSTDLIB_VISIBILITY +# endif #endif #if defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1) # define ZSTDLIB_API __declspec(dllexport) ZSTDLIB_VISIBILITY @@ -36,35 +38,37 @@ extern "C" { /******************************************************************************************************* Introduction - zstd, short for Zstandard, is a fast lossless compression algorithm, targeting real-time compression scenarios - at zlib-level and better compression ratios. The zstd compression library provides in-memory compression and - decompression functions. The library supports compression levels from 1 up to ZSTD_maxCLevel() which is 22. + zstd, short for Zstandard, is a fast lossless compression algorithm, + targeting real-time compression scenarios at zlib-level and better compression ratios. + The zstd compression library provides in-memory compression and decompression functions. + The library supports compression levels from 1 up to ZSTD_maxCLevel() which is currently 22. Levels >= 20, labeled `--ultra`, should be used with caution, as they require more memory. Compression can be done in: - a single step (described as Simple API) - a single step, reusing a context (described as Explicit memory management) - unbounded multiple steps (described as Streaming compression) - The compression ratio achievable on small data can be highly improved using compression with a dictionary in: + The compression ratio achievable on small data can be highly improved using a dictionary in: - a single step (described as Simple dictionary API) - a single step, reusing a dictionary (described as Fast dictionary API) Advanced experimental functions can be accessed using #define ZSTD_STATIC_LINKING_ONLY before including zstd.h. - These APIs shall never be used with a dynamic library. + Advanced experimental APIs shall never be used with a dynamic library. They are not "stable", their definition may change in the future. Only static linking is allowed. *********************************************************************************************************/ /*------ Version ------*/ #define ZSTD_VERSION_MAJOR 1 -#define ZSTD_VERSION_MINOR 2 +#define ZSTD_VERSION_MINOR 3 #define ZSTD_VERSION_RELEASE 0 +#define ZSTD_VERSION_NUMBER (ZSTD_VERSION_MAJOR *100*100 + ZSTD_VERSION_MINOR *100 + ZSTD_VERSION_RELEASE) +ZSTDLIB_API unsigned ZSTD_versionNumber(void); /**< useful to check dll version */ + #define ZSTD_LIB_VERSION ZSTD_VERSION_MAJOR.ZSTD_VERSION_MINOR.ZSTD_VERSION_RELEASE #define ZSTD_QUOTE(str) #str #define ZSTD_EXPAND_AND_QUOTE(str) ZSTD_QUOTE(str) #define ZSTD_VERSION_STRING ZSTD_EXPAND_AND_QUOTE(ZSTD_LIB_VERSION) - -#define ZSTD_VERSION_NUMBER (ZSTD_VERSION_MAJOR *100*100 + ZSTD_VERSION_MINOR *100 + ZSTD_VERSION_RELEASE) -ZSTDLIB_API unsigned ZSTD_versionNumber(void); /**< library version number; to be used when checking dll version */ +ZSTDLIB_API const char* ZSTD_versionString(void); /* v1.3.0 */ /*************************************** @@ -81,38 +85,48 @@ ZSTDLIB_API size_t ZSTD_compress( void* dst, size_t dstCapacity, /*! ZSTD_decompress() : * `compressedSize` : must be the _exact_ size of some number of compressed and/or skippable frames. - * `dstCapacity` is an upper bound of originalSize. + * `dstCapacity` is an upper bound of originalSize to regenerate. * If user cannot imply a maximum upper bound, it's better to use streaming mode to decompress data. * @return : the number of bytes decompressed into `dst` (<= `dstCapacity`), * or an errorCode if it fails (which can be tested using ZSTD_isError()). */ ZSTDLIB_API size_t ZSTD_decompress( void* dst, size_t dstCapacity, const void* src, size_t compressedSize); -/*! ZSTD_getDecompressedSize() : - * NOTE: This function is planned to be obsolete, in favour of ZSTD_getFrameContentSize. - * ZSTD_getFrameContentSize functions the same way, returning the decompressed size of a single - * frame, but distinguishes empty frames from frames with an unknown size, or errors. - * - * Additionally, ZSTD_findDecompressedSize can be used instead. It can handle multiple - * concatenated frames in one buffer, and so is more general. - * As a result however, it requires more computation and entire frames to be passed to it, - * as opposed to ZSTD_getFrameContentSize which requires only a single frame's header. - * - * 'src' is the start of a zstd compressed frame. - * @return : content size to be decompressed, as a 64-bits value _if known_, 0 otherwise. - * note 1 : decompressed size is an optional field, that may not be present, especially in streaming mode. - * When `return==0`, data to decompress could be any size. +/*! ZSTD_getFrameContentSize() : v1.3.0 + * `src` should point to the start of a ZSTD encoded frame. + * `srcSize` must be at least as large as the frame header. + * hint : any size >= `ZSTD_frameHeaderSize_max` is large enough. + * @return : - decompressed size of the frame in `src`, if known + * - ZSTD_CONTENTSIZE_UNKNOWN if the size cannot be determined + * - ZSTD_CONTENTSIZE_ERROR if an error occurred (e.g. invalid magic number, srcSize too small) + * note 1 : a 0 return value means the frame is valid but "empty". + * note 2 : decompressed size is an optional field, it may not be present, typically in streaming mode. + * When `return==ZSTD_CONTENTSIZE_UNKNOWN`, data to decompress could be any size. * In which case, it's necessary to use streaming mode to decompress data. - * Optionally, application can still use ZSTD_decompress() while relying on implied limits. - * (For example, data may be necessarily cut into blocks <= 16 KB). - * note 2 : decompressed size is always present when compression is done with ZSTD_compress() - * note 3 : decompressed size can be very large (64-bits value), + * Optionally, application can rely on some implicit limit, + * as ZSTD_decompress() only needs an upper bound of decompressed size. + * (For example, data could be necessarily cut into blocks <= 16 KB). + * note 3 : decompressed size is always present when compression is done with ZSTD_compress() + * note 4 : decompressed size can be very large (64-bits value), * potentially larger than what local system can handle as a single memory segment. * In which case, it's necessary to use streaming mode to decompress data. - * note 4 : If source is untrusted, decompressed size could be wrong or intentionally modified. - * Always ensure result fits within application's authorized limits. + * note 5 : If source is untrusted, decompressed size could be wrong or intentionally modified. + * Always ensure return value fits within application's authorized limits. * Each application can set its own limits. - * note 5 : when `return==0`, if precise failure cause is needed, use ZSTD_getFrameParams() to know more. */ + * note 6 : This function replaces ZSTD_getDecompressedSize() */ +#define ZSTD_CONTENTSIZE_UNKNOWN (0ULL - 1) +#define ZSTD_CONTENTSIZE_ERROR (0ULL - 2) +ZSTDLIB_API unsigned long long ZSTD_getFrameContentSize(const void *src, size_t srcSize); + +/*! ZSTD_getDecompressedSize() : + * NOTE: This function is now obsolete, in favor of ZSTD_getFrameContentSize(). + * Both functions work the same way, + * but ZSTD_getDecompressedSize() blends + * "empty", "unknown" and "error" results in the same return value (0), + * while ZSTD_getFrameContentSize() distinguishes them. + * + * 'src' is the start of a zstd compressed frame. + * @return : content size to be decompressed, as a 64-bits value _if known and not empty_, 0 otherwise. */ ZSTDLIB_API unsigned long long ZSTD_getDecompressedSize(const void* src, size_t srcSize); @@ -137,29 +151,35 @@ ZSTDLIB_API size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx); /*! ZSTD_compressCCtx() : * Same as ZSTD_compress(), requires an allocated ZSTD_CCtx (see ZSTD_createCCtx()). */ -ZSTDLIB_API size_t ZSTD_compressCCtx(ZSTD_CCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, int compressionLevel); +ZSTDLIB_API size_t ZSTD_compressCCtx(ZSTD_CCtx* ctx, + void* dst, size_t dstCapacity, + const void* src, size_t srcSize, + int compressionLevel); /*= Decompression context * When decompressing many times, - * it is recommended to allocate a context just once, and re-use it for each successive compression operation. + * it is recommended to allocate a context only once, + * and re-use it for each successive compression operation. * This will make workload friendlier for system's memory. - * Use one context per thread for parallel execution in multi-threaded environments. */ + * Use one context per thread for parallel execution. */ typedef struct ZSTD_DCtx_s ZSTD_DCtx; ZSTDLIB_API ZSTD_DCtx* ZSTD_createDCtx(void); ZSTDLIB_API size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx); /*! ZSTD_decompressDCtx() : - * Same as ZSTD_decompress(), requires an allocated ZSTD_DCtx (see ZSTD_createDCtx()). */ -ZSTDLIB_API size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize); + * Same as ZSTD_decompress(), requires an allocated ZSTD_DCtx (see ZSTD_createDCtx()) */ +ZSTDLIB_API size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx, + void* dst, size_t dstCapacity, + const void* src, size_t srcSize); /************************** * Simple dictionary API ***************************/ /*! ZSTD_compress_usingDict() : -* Compression using a predefined Dictionary (see dictBuilder/zdict.h). -* Note : This function loads the dictionary, resulting in significant startup delay. -* Note : When `dict == NULL || dictSize < 8` no dictionary is used. */ + * Compression using a predefined Dictionary (see dictBuilder/zdict.h). + * Note : This function loads the dictionary, resulting in significant startup delay. + * Note : When `dict == NULL || dictSize < 8` no dictionary is used. */ ZSTDLIB_API size_t ZSTD_compress_usingDict(ZSTD_CCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, @@ -167,30 +187,31 @@ ZSTDLIB_API size_t ZSTD_compress_usingDict(ZSTD_CCtx* ctx, int compressionLevel); /*! ZSTD_decompress_usingDict() : -* Decompression using a predefined Dictionary (see dictBuilder/zdict.h). -* Dictionary must be identical to the one used during compression. -* Note : This function loads the dictionary, resulting in significant startup delay. -* Note : When `dict == NULL || dictSize < 8` no dictionary is used. */ + * Decompression using a predefined Dictionary (see dictBuilder/zdict.h). + * Dictionary must be identical to the one used during compression. + * Note : This function loads the dictionary, resulting in significant startup delay. + * Note : When `dict == NULL || dictSize < 8` no dictionary is used. */ ZSTDLIB_API size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, const void* dict,size_t dictSize); -/**************************** -* Fast dictionary API -****************************/ +/********************************** + * Bulk processing dictionary API + *********************************/ typedef struct ZSTD_CDict_s ZSTD_CDict; /*! ZSTD_createCDict() : -* When compressing multiple messages / blocks with the same dictionary, it's recommended to load it just once. -* ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay. -* ZSTD_CDict can be created once and used by multiple threads concurrently, as its usage is read-only. -* `dictBuffer` can be released after ZSTD_CDict creation, as its content is copied within CDict */ -ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict(const void* dictBuffer, size_t dictSize, int compressionLevel); + * When compressing multiple messages / blocks with the same dictionary, it's recommended to load it just once. + * ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay. + * ZSTD_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only. + * `dictBuffer` can be released after ZSTD_CDict creation, since its content is copied within CDict */ +ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict(const void* dictBuffer, size_t dictSize, + int compressionLevel); /*! ZSTD_freeCDict() : -* Function frees memory allocated by ZSTD_createCDict(). */ + * Function frees memory allocated by ZSTD_createCDict(). */ ZSTDLIB_API size_t ZSTD_freeCDict(ZSTD_CDict* CDict); /*! ZSTD_compress_usingCDict() : @@ -207,17 +228,17 @@ ZSTDLIB_API size_t ZSTD_compress_usingCDict(ZSTD_CCtx* cctx, typedef struct ZSTD_DDict_s ZSTD_DDict; /*! ZSTD_createDDict() : -* Create a digested dictionary, ready to start decompression operation without startup delay. -* dictBuffer can be released after DDict creation, as its content is copied inside DDict */ + * Create a digested dictionary, ready to start decompression operation without startup delay. + * dictBuffer can be released after DDict creation, as its content is copied inside DDict */ ZSTDLIB_API ZSTD_DDict* ZSTD_createDDict(const void* dictBuffer, size_t dictSize); /*! ZSTD_freeDDict() : -* Function frees memory allocated with ZSTD_createDDict() */ + * Function frees memory allocated with ZSTD_createDDict() */ ZSTDLIB_API size_t ZSTD_freeDDict(ZSTD_DDict* ddict); /*! ZSTD_decompress_usingDDict() : -* Decompression using a digested Dictionary. -* Faster startup than ZSTD_decompress_usingDict(), recommended when same dictionary is used multiple times. */ + * Decompression using a digested Dictionary. + * Faster startup than ZSTD_decompress_usingDict(), recommended when same dictionary is used multiple times. */ ZSTDLIB_API size_t ZSTD_decompress_usingDDict(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, @@ -274,14 +295,17 @@ typedef struct ZSTD_outBuffer_s { * ZSTD_endStream() instructs to finish a frame. * It will perform a flush and write frame epilogue. * The epilogue is required for decoders to consider a frame completed. -* Similar to ZSTD_flushStream(), it may not be able to flush the full content if `output->size` is too small. +* ZSTD_endStream() may not be able to flush full data if `output->size` is too small. * In which case, call again ZSTD_endStream() to complete the flush. -* @return : nb of bytes still present within internal buffer (0 if it's empty, hence compression completed) +* @return : 0 if frame fully completed and fully flushed, + or >0 if some data is still present within internal buffer + (value is minimum size estimation for remaining data to flush, but it could be more) * or an error code, which can be tested using ZSTD_isError(). * * *******************************************************************/ -typedef struct ZSTD_CStream_s ZSTD_CStream; +typedef ZSTD_CCtx ZSTD_CStream; /**< CCtx and CStream are now effectively same object (>= v1.3.0) */ + /* Continue to distinguish them for compatibility with versions <= v1.2.0 */ /*===== ZSTD_CStream management functions =====*/ ZSTDLIB_API ZSTD_CStream* ZSTD_createCStream(void); ZSTDLIB_API size_t ZSTD_freeCStream(ZSTD_CStream* zcs); @@ -319,7 +343,8 @@ ZSTDLIB_API size_t ZSTD_CStreamOutSize(void); /**< recommended size for output * The return value is a suggested next input size (a hint to improve latency) that will never load more than the current frame. * *******************************************************************************/ -typedef struct ZSTD_DStream_s ZSTD_DStream; +typedef ZSTD_DCtx ZSTD_DStream; /**< DCtx and DStream are now effectively same object (>= v1.3.0) */ + /* Continue to distinguish them for compatibility with versions <= v1.2.0 */ /*===== ZSTD_DStream management functions =====*/ ZSTDLIB_API ZSTD_DStream* ZSTD_createDStream(void); ZSTDLIB_API size_t ZSTD_freeDStream(ZSTD_DStream* zds); @@ -334,23 +359,22 @@ ZSTDLIB_API size_t ZSTD_DStreamOutSize(void); /*!< recommended size for output #endif /* ZSTD_H_235446 */ -#if defined(ZSTD_STATIC_LINKING_ONLY) && !defined(ZSTD_H_ZSTD_STATIC_LINKING_ONLY) -#define ZSTD_H_ZSTD_STATIC_LINKING_ONLY /**************************************************************************************** * START OF ADVANCED AND EXPERIMENTAL FUNCTIONS * The definitions in this section are considered experimental. - * They should never be used with a dynamic library, as they may change in the future. - * They are provided for advanced usages. + * They should never be used with a dynamic library, as prototypes may change in the future. + * They are provided for advanced scenarios. * Use them only in association with static linking. * ***************************************************************************************/ +#if defined(ZSTD_STATIC_LINKING_ONLY) && !defined(ZSTD_H_ZSTD_STATIC_LINKING_ONLY) +#define ZSTD_H_ZSTD_STATIC_LINKING_ONLY + /* --- Constants ---*/ #define ZSTD_MAGICNUMBER 0xFD2FB528 /* >= v0.8.0 */ #define ZSTD_MAGIC_SKIPPABLE_START 0x184D2A50U - -#define ZSTD_CONTENTSIZE_UNKNOWN (0ULL - 1) -#define ZSTD_CONTENTSIZE_ERROR (0ULL - 2) +#define ZSTD_MAGIC_DICTIONARY 0xEC30A437 /* v0.7+ */ #define ZSTD_WINDOWLOG_MAX_32 27 #define ZSTD_WINDOWLOG_MAX_64 27 @@ -370,14 +394,15 @@ ZSTDLIB_API size_t ZSTD_DStreamOutSize(void); /*!< recommended size for output #define ZSTD_FRAMEHEADERSIZE_MAX 18 /* for static allocation */ #define ZSTD_FRAMEHEADERSIZE_MIN 6 -static const size_t ZSTD_frameHeaderSize_prefix = 5; -static const size_t ZSTD_frameHeaderSize_min = ZSTD_FRAMEHEADERSIZE_MIN; +static const size_t ZSTD_frameHeaderSize_prefix = 5; /* minimum input size to know frame header size */ static const size_t ZSTD_frameHeaderSize_max = ZSTD_FRAMEHEADERSIZE_MAX; +static const size_t ZSTD_frameHeaderSize_min = ZSTD_FRAMEHEADERSIZE_MIN; static const size_t ZSTD_skippableHeaderSize = 8; /* magic number + skippable frame length */ /*--- Advanced types ---*/ -typedef enum { ZSTD_fast, ZSTD_dfast, ZSTD_greedy, ZSTD_lazy, ZSTD_lazy2, ZSTD_btlazy2, ZSTD_btopt, ZSTD_btopt2 } ZSTD_strategy; /* from faster to stronger */ +typedef enum { ZSTD_fast=1, ZSTD_dfast, ZSTD_greedy, ZSTD_lazy, ZSTD_lazy2, + ZSTD_btlazy2, ZSTD_btopt, ZSTD_btultra } ZSTD_strategy; /* from faster to stronger */ typedef struct { unsigned windowLog; /**< largest match distance : larger == more compression, more memory needed during decompression */ @@ -400,76 +425,141 @@ typedef struct { ZSTD_frameParameters fParams; } ZSTD_parameters; +typedef struct { + unsigned long long frameContentSize; + size_t windowSize; + unsigned dictID; + unsigned checksumFlag; +} ZSTD_frameHeader; + /*= Custom memory allocation functions */ typedef void* (*ZSTD_allocFunction) (void* opaque, size_t size); typedef void (*ZSTD_freeFunction) (void* opaque, void* address); typedef struct { ZSTD_allocFunction customAlloc; ZSTD_freeFunction customFree; void* opaque; } ZSTD_customMem; +/* use this constant to defer to stdlib's functions */ +static const ZSTD_customMem ZSTD_defaultCMem = { NULL, NULL, NULL }; + /*************************************** -* Compressed size functions +* Frame size functions ***************************************/ /*! ZSTD_findFrameCompressedSize() : * `src` should point to the start of a ZSTD encoded frame or skippable frame * `srcSize` must be at least as large as the frame - * @return : the compressed size of the frame pointed to by `src`, suitable to pass to - * `ZSTD_decompress` or similar, or an error code if given invalid input. */ + * @return : the compressed size of the first frame starting at `src`, + * suitable to pass to `ZSTD_decompress` or similar, + * or an error code if input is invalid */ ZSTDLIB_API size_t ZSTD_findFrameCompressedSize(const void* src, size_t srcSize); -/*************************************** -* Decompressed size functions -***************************************/ -/*! ZSTD_getFrameContentSize() : -* `src` should point to the start of a ZSTD encoded frame -* `srcSize` must be at least as large as the frame header. A value greater than or equal -* to `ZSTD_frameHeaderSize_max` is guaranteed to be large enough in all cases. -* @return : decompressed size of the frame pointed to be `src` if known, otherwise -* - ZSTD_CONTENTSIZE_UNKNOWN if the size cannot be determined -* - ZSTD_CONTENTSIZE_ERROR if an error occurred (e.g. invalid magic number, srcSize too small) */ -ZSTDLIB_API unsigned long long ZSTD_getFrameContentSize(const void *src, size_t srcSize); - /*! ZSTD_findDecompressedSize() : -* `src` should point the start of a series of ZSTD encoded and/or skippable frames -* `srcSize` must be the _exact_ size of this series -* (i.e. there should be a frame boundary exactly `srcSize` bytes after `src`) -* @return : the decompressed size of all data in the contained frames, as a 64-bit value _if known_ -* - if the decompressed size cannot be determined: ZSTD_CONTENTSIZE_UNKNOWN -* - if an error occurred: ZSTD_CONTENTSIZE_ERROR -* -* note 1 : decompressed size is an optional field, that may not be present, especially in streaming mode. -* When `return==ZSTD_CONTENTSIZE_UNKNOWN`, data to decompress could be any size. -* In which case, it's necessary to use streaming mode to decompress data. -* Optionally, application can still use ZSTD_decompress() while relying on implied limits. -* (For example, data may be necessarily cut into blocks <= 16 KB). -* note 2 : decompressed size is always present when compression is done with ZSTD_compress() -* note 3 : decompressed size can be very large (64-bits value), -* potentially larger than what local system can handle as a single memory segment. -* In which case, it's necessary to use streaming mode to decompress data. -* note 4 : If source is untrusted, decompressed size could be wrong or intentionally modified. -* Always ensure result fits within application's authorized limits. -* Each application can set its own limits. -* note 5 : ZSTD_findDecompressedSize handles multiple frames, and so it must traverse the input to -* read each contained frame header. This is efficient as most of the data is skipped, -* however it does mean that all frame data must be present and valid. */ + * `src` should point the start of a series of ZSTD encoded and/or skippable frames + * `srcSize` must be the _exact_ size of this series + * (i.e. there should be a frame boundary exactly at `srcSize` bytes after `src`) + * @return : - decompressed size of all data in all successive frames + * - if the decompressed size cannot be determined: ZSTD_CONTENTSIZE_UNKNOWN + * - if an error occurred: ZSTD_CONTENTSIZE_ERROR + * + * note 1 : decompressed size is an optional field, that may not be present, especially in streaming mode. + * When `return==ZSTD_CONTENTSIZE_UNKNOWN`, data to decompress could be any size. + * In which case, it's necessary to use streaming mode to decompress data. + * note 2 : decompressed size is always present when compression is done with ZSTD_compress() + * note 3 : decompressed size can be very large (64-bits value), + * potentially larger than what local system can handle as a single memory segment. + * In which case, it's necessary to use streaming mode to decompress data. + * note 4 : If source is untrusted, decompressed size could be wrong or intentionally modified. + * Always ensure result fits within application's authorized limits. + * Each application can set its own limits. + * note 5 : ZSTD_findDecompressedSize handles multiple frames, and so it must traverse the input to + * read each contained frame header. This is fast as most of the data is skipped, + * however it does mean that all frame data must be present and valid. */ ZSTDLIB_API unsigned long long ZSTD_findDecompressedSize(const void* src, size_t srcSize); +/*! ZSTD_frameHeaderSize() : +* `src` should point to the start of a ZSTD frame +* `srcSize` must be >= ZSTD_frameHeaderSize_prefix. +* @return : size of the Frame Header */ +ZSTDLIB_API size_t ZSTD_frameHeaderSize(const void* src, size_t srcSize); + + +/*************************************** +* Context memory usage +***************************************/ + +/*! ZSTD_sizeof_*() : + * These functions give the current memory usage of selected object. + * Object memory usage can evolve if it's re-used multiple times. */ +ZSTDLIB_API size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx); +ZSTDLIB_API size_t ZSTD_sizeof_DCtx(const ZSTD_DCtx* dctx); +ZSTDLIB_API size_t ZSTD_sizeof_CStream(const ZSTD_CStream* zcs); +ZSTDLIB_API size_t ZSTD_sizeof_DStream(const ZSTD_DStream* zds); +ZSTDLIB_API size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict); +ZSTDLIB_API size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict); + +/*! ZSTD_estimate*() : + * These functions make it possible to estimate memory usage + * of a future {D,C}Ctx, before its creation. + * ZSTD_estimateCCtxSize() will provide a budget large enough for any compression level up to selected one. + * It will also consider src size to be arbitrarily "large", which is worst case. + * If srcSize is known to always be small, ZSTD_estimateCCtxSize_advanced() can provide a tighter estimation. + * ZSTD_estimateCCtxSize_advanced() can be used in tandem with ZSTD_getCParams() to create cParams from compressionLevel. + * Note : CCtx estimation is only correct for single-threaded compression */ +ZSTDLIB_API size_t ZSTD_estimateCCtxSize(int compressionLevel); +ZSTDLIB_API size_t ZSTD_estimateCCtxSize_advanced(ZSTD_compressionParameters cParams); +ZSTDLIB_API size_t ZSTD_estimateDCtxSize(void); + +/*! ZSTD_estimate?StreamSize() : + * ZSTD_estimateCStreamSize() will provide a budget large enough for any compression level up to selected one. + * It will also consider src size to be arbitrarily "large", which is worst case. + * If srcSize is known to always be small, ZSTD_estimateCStreamSize_advanced() can provide a tighter estimation. + * ZSTD_estimateCStreamSize_advanced() can be used in tandem with ZSTD_getCParams() to create cParams from compressionLevel. + * Note : CStream estimation is only correct for single-threaded compression. + * ZSTD_DStream memory budget depends on window Size. + * This information can be passed manually, using ZSTD_estimateDStreamSize, + * or deducted from a valid frame Header, using ZSTD_estimateDStreamSize_fromFrame(); + * Note : if streaming is init with function ZSTD_init?Stream_usingDict(), + * an internal ?Dict will be created, which additional size is not estimated here. + * In this case, get total size by adding ZSTD_estimate?DictSize */ +ZSTDLIB_API size_t ZSTD_estimateCStreamSize(int compressionLevel); +ZSTDLIB_API size_t ZSTD_estimateCStreamSize_advanced(ZSTD_compressionParameters cParams); +ZSTDLIB_API size_t ZSTD_estimateDStreamSize(size_t windowSize); +ZSTDLIB_API size_t ZSTD_estimateDStreamSize_fromFrame(const void* src, size_t srcSize); + +/*! ZSTD_estimate?DictSize() : + * ZSTD_estimateCDictSize() will bet that src size is relatively "small", and content is copied, like ZSTD_createCDict(). + * ZSTD_estimateCStreamSize_advanced() makes it possible to control precisely compression parameters, like ZSTD_createCDict_advanced(). + * Note : dictionary created "byReference" are smaller */ +ZSTDLIB_API size_t ZSTD_estimateCDictSize(size_t dictSize, int compressionLevel); +ZSTDLIB_API size_t ZSTD_estimateCDictSize_advanced(size_t dictSize, ZSTD_compressionParameters cParams, unsigned byReference); +ZSTDLIB_API size_t ZSTD_estimateDDictSize(size_t dictSize, unsigned byReference); + /*************************************** * Advanced compression functions ***************************************/ -/*! ZSTD_estimateCCtxSize() : - * Gives the amount of memory allocated for a ZSTD_CCtx given a set of compression parameters. - * `frameContentSize` is an optional parameter, provide `0` if unknown */ -ZSTDLIB_API size_t ZSTD_estimateCCtxSize(ZSTD_compressionParameters cParams); - /*! ZSTD_createCCtx_advanced() : * Create a ZSTD compression context using external alloc and free functions */ ZSTDLIB_API ZSTD_CCtx* ZSTD_createCCtx_advanced(ZSTD_customMem customMem); -/*! ZSTD_sizeofCCtx() : - * Gives the amount of memory used by a given ZSTD_CCtx */ -ZSTDLIB_API size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx); +/*! ZSTD_initStaticCCtx() : initialize a fixed-size zstd compression context + * workspace: The memory area to emplace the context into. + * Provided pointer must 8-bytes aligned. + * It must outlive context usage. + * workspaceSize: Use ZSTD_estimateCCtxSize() or ZSTD_estimateCStreamSize() + * to determine how large workspace must be to support scenario. + * @return : pointer to ZSTD_CCtx*, or NULL if error (size too small) + * Note : zstd will never resize nor malloc() when using a static cctx. + * If it needs more memory than available, it will simply error out. + * Note 2 : there is no corresponding "free" function. + * Since workspace was allocated externally, it must be freed externally too. + * Limitation 1 : currently not compatible with internal CDict creation, such as + * ZSTD_CCtx_loadDictionary() or ZSTD_initCStream_usingDict(). + * Limitation 2 : currently not compatible with multi-threading + */ +ZSTDLIB_API ZSTD_CCtx* ZSTD_initStaticCCtx(void* workspace, size_t workspaceSize); + +/* !!! To be deprecated !!! */ typedef enum { ZSTD_p_forceWindow, /* Force back-references to remain < windowSize, even when referencing Dictionary content (default:0) */ ZSTD_p_forceRawDict /* Force loading dictionary in "content-only" mode (no header analysis) */ @@ -479,20 +569,43 @@ typedef enum { * @result : 0, or an error code (which can be tested with ZSTD_isError()) */ ZSTDLIB_API size_t ZSTD_setCCtxParameter(ZSTD_CCtx* cctx, ZSTD_CCtxParameter param, unsigned value); + /*! ZSTD_createCDict_byReference() : * Create a digested dictionary for compression * Dictionary content is simply referenced, and therefore stays in dictBuffer. * It is important that dictBuffer outlives CDict, it must remain read accessible throughout the lifetime of CDict */ ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict_byReference(const void* dictBuffer, size_t dictSize, int compressionLevel); + +typedef enum { ZSTD_dm_auto=0, /* dictionary is "full" if it starts with ZSTD_MAGIC_DICTIONARY, otherwise it is "rawContent" */ + ZSTD_dm_rawContent, /* ensures dictionary is always loaded as rawContent, even if it starts with ZSTD_MAGIC_DICTIONARY */ + ZSTD_dm_fullDict /* refuses to load a dictionary if it does not respect Zstandard's specification */ +} ZSTD_dictMode_e; /*! ZSTD_createCDict_advanced() : * Create a ZSTD_CDict using external alloc and free, and customized compression parameters */ -ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict_advanced(const void* dict, size_t dictSize, unsigned byReference, - ZSTD_compressionParameters cParams, ZSTD_customMem customMem); +ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict_advanced(const void* dict, size_t dictSize, + unsigned byReference, ZSTD_dictMode_e dictMode, + ZSTD_compressionParameters cParams, + ZSTD_customMem customMem); -/*! ZSTD_sizeof_CDict() : - * Gives the amount of memory used by a given ZSTD_sizeof_CDict */ -ZSTDLIB_API size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict); +/*! ZSTD_initStaticCDict_advanced() : + * Generate a digested dictionary in provided memory area. + * workspace: The memory area to emplace the dictionary into. + * Provided pointer must 8-bytes aligned. + * It must outlive dictionary usage. + * workspaceSize: Use ZSTD_estimateCDictSize() + * to determine how large workspace must be. + * cParams : use ZSTD_getCParams() to transform a compression level + * into its relevants cParams. + * @return : pointer to ZSTD_CDict*, or NULL if error (size too small) + * Note : there is no corresponding "free" function. + * Since workspace was allocated externally, it must be freed externally. + */ +ZSTDLIB_API ZSTD_CDict* ZSTD_initStaticCDict( + void* workspace, size_t workspaceSize, + const void* dict, size_t dictSize, + unsigned byReference, ZSTD_dictMode_e dictMode, + ZSTD_compressionParameters cParams); /*! ZSTD_getCParams() : * @return ZSTD_compressionParameters structure for a selected compression level and estimated srcSize. @@ -509,8 +622,8 @@ ZSTDLIB_API ZSTD_parameters ZSTD_getParams(int compressionLevel, unsigned long l ZSTDLIB_API size_t ZSTD_checkCParams(ZSTD_compressionParameters params); /*! ZSTD_adjustCParams() : -* optimize params for a given `srcSize` and `dictSize`. -* both values are optional, select `0` if unknown. */ + * optimize params for a given `srcSize` and `dictSize`. + * both values are optional, select `0` if unknown. */ ZSTDLIB_API ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, unsigned long long srcSize, size_t dictSize); /*! ZSTD_compress_advanced() : @@ -538,22 +651,32 @@ ZSTDLIB_API size_t ZSTD_compress_usingCDict_advanced(ZSTD_CCtx* cctx, * Note 3 : Skippable Frame Identifiers are considered valid. */ ZSTDLIB_API unsigned ZSTD_isFrame(const void* buffer, size_t size); -/*! ZSTD_estimateDCtxSize() : - * Gives the potential amount of memory allocated to create a ZSTD_DCtx */ -ZSTDLIB_API size_t ZSTD_estimateDCtxSize(void); - /*! ZSTD_createDCtx_advanced() : * Create a ZSTD decompression context using external alloc and free functions */ ZSTDLIB_API ZSTD_DCtx* ZSTD_createDCtx_advanced(ZSTD_customMem customMem); -/*! ZSTD_sizeof_DCtx() : - * Gives the amount of memory used by a given ZSTD_DCtx */ -ZSTDLIB_API size_t ZSTD_sizeof_DCtx(const ZSTD_DCtx* dctx); +/*! ZSTD_initStaticDCtx() : initialize a fixed-size zstd decompression context + * workspace: The memory area to emplace the context into. + * Provided pointer must 8-bytes aligned. + * It must outlive context usage. + * workspaceSize: Use ZSTD_estimateDCtxSize() or ZSTD_estimateDStreamSize() + * to determine how large workspace must be to support scenario. + * @return : pointer to ZSTD_DCtx*, or NULL if error (size too small) + * Note : zstd will never resize nor malloc() when using a static dctx. + * If it needs more memory than available, it will simply error out. + * Note 2 : static dctx is incompatible with legacy support + * Note 3 : there is no corresponding "free" function. + * Since workspace was allocated externally, it must be freed externally. + * Limitation : currently not compatible with internal DDict creation, + * such as ZSTD_initDStream_usingDict(). + */ +ZSTDLIB_API ZSTD_DCtx* ZSTD_initStaticDCtx(void* workspace, size_t workspaceSize); /*! ZSTD_createDDict_byReference() : * Create a digested dictionary, ready to start decompression operation without startup delay. - * Dictionary content is simply referenced, and therefore stays in dictBuffer. - * It is important that dictBuffer outlives DDict, it must remain read accessible throughout the lifetime of DDict */ + * Dictionary content is referenced, and therefore stays in dictBuffer. + * It is important that dictBuffer outlives DDict, + * it must remain read accessible throughout the lifetime of DDict */ ZSTDLIB_API ZSTD_DDict* ZSTD_createDDict_byReference(const void* dictBuffer, size_t dictSize); /*! ZSTD_createDDict_advanced() : @@ -561,9 +684,20 @@ ZSTDLIB_API ZSTD_DDict* ZSTD_createDDict_byReference(const void* dictBuffer, siz ZSTDLIB_API ZSTD_DDict* ZSTD_createDDict_advanced(const void* dict, size_t dictSize, unsigned byReference, ZSTD_customMem customMem); -/*! ZSTD_sizeof_DDict() : - * Gives the amount of memory used by a given ZSTD_DDict */ -ZSTDLIB_API size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict); +/*! ZSTD_initStaticDDict() : + * Generate a digested dictionary in provided memory area. + * workspace: The memory area to emplace the dictionary into. + * Provided pointer must 8-bytes aligned. + * It must outlive dictionary usage. + * workspaceSize: Use ZSTD_estimateDDictSize() + * to determine how large workspace must be. + * @return : pointer to ZSTD_DDict*, or NULL if error (size too small) + * Note : there is no corresponding "free" function. + * Since workspace was allocated externally, it must be freed externally. + */ +ZSTDLIB_API ZSTD_DDict* ZSTD_initStaticDDict(void* workspace, size_t workspaceSize, + const void* dict, size_t dictSize, + unsigned byReference); /*! ZSTD_getDictID_fromDict() : * Provides the dictID stored within dictionary. @@ -586,7 +720,7 @@ ZSTDLIB_API unsigned ZSTD_getDictID_fromDDict(const ZSTD_DDict* ddict); * Note : this use case also happens when using a non-conformant dictionary. * - `srcSize` is too small, and as a result, the frame header could not be decoded (only possible if `srcSize < ZSTD_FRAMEHEADERSIZE_MAX`). * - This is not a Zstandard frame. - * When identifying the exact failure cause, it's possible to use ZSTD_getFrameParams(), which will provide a more precise error code. */ + * When identifying the exact failure cause, it's possible to use ZSTD_getFrameHeader(), which will provide a more precise error code. */ ZSTDLIB_API unsigned ZSTD_getDictID_fromFrame(const void* src, size_t srcSize); @@ -596,13 +730,13 @@ ZSTDLIB_API unsigned ZSTD_getDictID_fromFrame(const void* src, size_t srcSize); /*===== Advanced Streaming compression functions =====*/ ZSTDLIB_API ZSTD_CStream* ZSTD_createCStream_advanced(ZSTD_customMem customMem); -ZSTDLIB_API size_t ZSTD_sizeof_CStream(const ZSTD_CStream* zcs); /**< size of CStream is variable, depending primarily on compression level */ +ZSTDLIB_API ZSTD_CStream* ZSTD_initStaticCStream(void* workspace, size_t workspaceSize); /**< same as ZSTD_initStaticCCtx() */ ZSTDLIB_API size_t ZSTD_initCStream_srcSize(ZSTD_CStream* zcs, int compressionLevel, unsigned long long pledgedSrcSize); /**< pledgedSrcSize must be correct, a size of 0 means unknown. for a frame size of 0 use initCStream_advanced */ -ZSTDLIB_API size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel); /**< note: a dict will not be used if dict == NULL or dictSize < 8 */ +ZSTDLIB_API size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel); /**< creates of an internal CDict (incompatible with static CCtx), except if dict == NULL or dictSize < 8, in which case no dict is used. */ ZSTDLIB_API size_t ZSTD_initCStream_advanced(ZSTD_CStream* zcs, const void* dict, size_t dictSize, ZSTD_parameters params, unsigned long long pledgedSrcSize); /**< pledgedSrcSize is optional and can be 0 (meaning unknown). note: if the contentSizeFlag is set, pledgedSrcSize == 0 means the source size is actually 0 */ ZSTDLIB_API size_t ZSTD_initCStream_usingCDict(ZSTD_CStream* zcs, const ZSTD_CDict* cdict); /**< note : cdict will just be referenced, and must outlive compression session */ -ZSTDLIB_API size_t ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, const ZSTD_CDict* cdict, unsigned long long pledgedSrcSize, ZSTD_frameParameters fParams); /**< same as ZSTD_initCStream_usingCDict(), with control over frame parameters */ +ZSTDLIB_API size_t ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, const ZSTD_CDict* cdict, ZSTD_frameParameters fParams, unsigned long long pledgedSrcSize); /**< same as ZSTD_initCStream_usingCDict(), with control over frame parameters */ /*! ZSTD_resetCStream() : * start a new compression job, using same parameters from previous job. @@ -617,11 +751,11 @@ ZSTDLIB_API size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledg /*===== Advanced Streaming decompression functions =====*/ typedef enum { DStream_p_maxWindowSize } ZSTD_DStreamParameter_e; ZSTDLIB_API ZSTD_DStream* ZSTD_createDStream_advanced(ZSTD_customMem customMem); -ZSTDLIB_API size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize); /**< note: a dict will not be used if dict == NULL or dictSize < 8 */ +ZSTDLIB_API ZSTD_DStream* ZSTD_initStaticDStream(void* workspace, size_t workspaceSize); /**< same as ZSTD_initStaticDCtx() */ ZSTDLIB_API size_t ZSTD_setDStreamParameter(ZSTD_DStream* zds, ZSTD_DStreamParameter_e paramType, unsigned paramValue); +ZSTDLIB_API size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize); /**< note: a dict will not be used if dict == NULL or dictSize < 8 */ ZSTDLIB_API size_t ZSTD_initDStream_usingDDict(ZSTD_DStream* zds, const ZSTD_DDict* ddict); /**< note : ddict will just be referenced, and must outlive decompression session */ ZSTDLIB_API size_t ZSTD_resetDStream(ZSTD_DStream* zds); /**< re-use decompression parameters from previous init; saves dictionary loading */ -ZSTDLIB_API size_t ZSTD_sizeof_DStream(const ZSTD_DStream* zds); /********************************************************************* @@ -683,21 +817,24 @@ ZSTDLIB_API size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t dstCapaci Use ZSTD_createDCtx() / ZSTD_freeDCtx() to manage it. A ZSTD_DCtx object can be re-used multiple times. - First typical operation is to retrieve frame parameters, using ZSTD_getFrameParams(). - It fills a ZSTD_frameParams structure which provide important information to correctly decode the frame, - such as the minimum rolling buffer size to allocate to decompress data (`windowSize`), - and the dictionary ID used. + First typical operation is to retrieve frame parameters, using ZSTD_getFrameHeader(). + It fills a ZSTD_frameHeader structure with important information to correctly decode the frame, + such as minimum rolling buffer size to allocate to decompress data (`windowSize`), + and the dictionary ID in use. (Note : content size is optional, it may not be present. 0 means : content size unknown). Note that these values could be wrong, either because of data malformation, or because an attacker is spoofing deliberate false information. As a consequence, check that values remain within valid application range, especially `windowSize`, before allocation. - Each application can set its own limit, depending on local restrictions. For extended interoperability, it is recommended to support at least 8 MB. - Frame parameters are extracted from the beginning of the compressed frame. - Data fragment must be large enough to ensure successful decoding, typically `ZSTD_frameHeaderSize_max` bytes. - @result : 0 : successful decoding, the `ZSTD_frameParams` structure is correctly filled. + Each application can set its own limit, depending on local restrictions. + For extended interoperability, it is recommended to support windowSize of at least 8 MB. + Frame header is extracted from the beginning of compressed frame, so providing only the frame's beginning is enough. + Data fragment must be large enough to ensure successful decoding. + `ZSTD_frameHeaderSize_max` bytes is guaranteed to always be large enough. + @result : 0 : successful decoding, the `ZSTD_frameHeader` structure is correctly filled. >0 : `srcSize` is too small, please provide at least @result bytes on next attempt. errorCode, which can be tested using ZSTD_isError(). - Start decompression, with ZSTD_decompressBegin() or ZSTD_decompressBegin_usingDict(). + Start decompression, with ZSTD_decompressBegin(). + If decompression requires a dictionary, use ZSTD_decompressBegin_usingDict() or ZSTD_decompressBegin_usingDDict(). Alternatively, you can copy a prepared context, using ZSTD_copyDCtx(). Then use ZSTD_nextSrcSizeToDecompress() and ZSTD_decompressContinue() alternatively. @@ -729,30 +866,233 @@ ZSTDLIB_API size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t dstCapaci b) Frame Size - 4 Bytes, Little endian format, unsigned 32-bits c) Frame Content - any content (User Data) of length equal to Frame Size For skippable frames ZSTD_decompressContinue() always returns 0. - For skippable frames ZSTD_getFrameParams() returns fparamsPtr->windowLog==0 what means that a frame is skippable. + For skippable frames ZSTD_getFrameHeader() returns fparamsPtr->windowLog==0 what means that a frame is skippable. Note : If fparamsPtr->frameContentSize==0, it is ambiguous: the frame might actually be a Zstd encoded frame with no content. For purposes of decompression, it is valid in both cases to skip the frame using ZSTD_findFrameCompressedSize to find its size in bytes. It also returns Frame Size as fparamsPtr->frameContentSize. */ -typedef struct { - unsigned long long frameContentSize; - unsigned windowSize; - unsigned dictID; - unsigned checksumFlag; -} ZSTD_frameParams; - /*===== Buffer-less streaming decompression functions =====*/ -ZSTDLIB_API size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t srcSize); /**< doesn't consume input, see details below */ +ZSTDLIB_API size_t ZSTD_getFrameHeader(ZSTD_frameHeader* zfhPtr, const void* src, size_t srcSize); /**< doesn't consume input */ ZSTDLIB_API size_t ZSTD_decompressBegin(ZSTD_DCtx* dctx); ZSTDLIB_API size_t ZSTD_decompressBegin_usingDict(ZSTD_DCtx* dctx, const void* dict, size_t dictSize); +ZSTDLIB_API size_t ZSTD_decompressBegin_usingDDict(ZSTD_DCtx* dctx, const ZSTD_DDict* ddict); ZSTDLIB_API void ZSTD_copyDCtx(ZSTD_DCtx* dctx, const ZSTD_DCtx* preparedDCtx); + ZSTDLIB_API size_t ZSTD_nextSrcSizeToDecompress(ZSTD_DCtx* dctx); ZSTDLIB_API size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize); typedef enum { ZSTDnit_frameHeader, ZSTDnit_blockHeader, ZSTDnit_block, ZSTDnit_lastBlock, ZSTDnit_checksum, ZSTDnit_skippableFrame } ZSTD_nextInputType_e; ZSTDLIB_API ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx); + + +/*=== New advanced API (experimental, and compression only) ===*/ + +/* notes on API design : + * In this proposal, parameters are pushed one by one into an existing CCtx, + * and then applied on all subsequent compression jobs. + * When no parameter is ever provided, CCtx is created with compression level ZSTD_CLEVEL_DEFAULT. + * + * This API is intended to replace all others experimental API. + * It can basically do all other use cases, and even new ones. + * It stands a good chance to become "stable", + * after a reasonable testing period. + */ + +/* note on naming convention : + * Initially, the API favored names like ZSTD_setCCtxParameter() . + * In this proposal, convention is changed towards ZSTD_CCtx_setParameter() . + * The main driver is that it identifies more clearly the target object type. + * It feels clearer in light of potential variants : + * ZSTD_CDict_setParameter() (rather than ZSTD_setCDictParameter()) + * ZSTD_DCtx_setParameter() (rather than ZSTD_setDCtxParameter() ) + * Left variant feels easier to distinguish. + */ + +/* note on enum design : + * All enum will be manually set to explicit values before reaching "stable API" status */ + +typedef enum { + /* compression parameters */ + ZSTD_p_compressionLevel=100, /* Update all compression parameters according to pre-defined cLevel table + * Default level is ZSTD_CLEVEL_DEFAULT==3. + * Special: value 0 means "do not change cLevel". */ + ZSTD_p_windowLog, /* Maximum allowed back-reference distance, expressed as power of 2. + * Must be clamped between ZSTD_WINDOWLOG_MIN and ZSTD_WINDOWLOG_MAX. + * Special: value 0 means "do not change windowLog". */ + ZSTD_p_hashLog, /* Size of the probe table, as a power of 2. + * Resulting table size is (1 << (hashLog+2)). + * Must be clamped between ZSTD_HASHLOG_MIN and ZSTD_HASHLOG_MAX. + * Larger tables improve compression ratio of strategies <= dFast, + * and improve speed of strategies > dFast. + * Special: value 0 means "do not change hashLog". */ + ZSTD_p_chainLog, /* Size of the full-search table, as a power of 2. + * Resulting table size is (1 << (chainLog+2)). + * Larger tables result in better and slower compression. + * This parameter is useless when using "fast" strategy. + * Special: value 0 means "do not change chainLog". */ + ZSTD_p_searchLog, /* Number of search attempts, as a power of 2. + * More attempts result in better and slower compression. + * This parameter is useless when using "fast" and "dFast" strategies. + * Special: value 0 means "do not change searchLog". */ + ZSTD_p_minMatch, /* Minimum size of searched matches (note : repCode matches can be smaller). + * Larger values make faster compression and decompression, but decrease ratio. + * Must be clamped between ZSTD_SEARCHLENGTH_MIN and ZSTD_SEARCHLENGTH_MAX. + * Note that currently, for all strategies < btopt, effective minimum is 4. + * Note that currently, for all strategies > fast, effective maximum is 6. + * Special: value 0 means "do not change minMatchLength". */ + ZSTD_p_targetLength, /* Only useful for strategies >= btopt. + * Length of Match considered "good enough" to stop search. + * Larger values make compression stronger and slower. + * Special: value 0 means "do not change targetLength". */ + ZSTD_p_compressionStrategy, /* See ZSTD_strategy enum definition. + * Cast selected strategy as unsigned for ZSTD_CCtx_setParameter() compatibility. + * The higher the value of selected strategy, the more complex it is, + * resulting in stronger and slower compression. + * Special: value 0 means "do not change strategy". */ + + /* frame parameters */ + ZSTD_p_contentSizeFlag=200, /* Content size is written into frame header _whenever known_ (default:1) */ + ZSTD_p_checksumFlag, /* A 32-bits checksum of content is written at end of frame (default:0) */ + ZSTD_p_dictIDFlag, /* When applicable, dictID of dictionary is provided in frame header (default:1) */ + + /* dictionary parameters (must be set before ZSTD_CCtx_loadDictionary) */ + ZSTD_p_dictMode=300, /* Select how dictionary content must be interpreted. Value must be from type ZSTD_dictMode_e. + * default : 0==auto : dictionary will be "full" if it respects specification, otherwise it will be "rawContent" */ + ZSTD_p_refDictContent, /* Dictionary content will be referenced, instead of copied (default:0==byCopy). + * It requires that dictionary buffer outlives its users */ + + /* multi-threading parameters */ + ZSTD_p_nbThreads=400, /* Select how many threads a compression job can spawn (default:1) + * More threads improve speed, but also increase memory usage. + * Can only receive a value > 1 if ZSTD_MULTITHREAD is enabled. + * Special: value 0 means "do not change nbThreads" */ + ZSTD_p_jobSize, /* Size of a compression job. Each compression job is completed in parallel. + * 0 means default, which is dynamically determined based on compression parameters. + * Job size must be a minimum of overlapSize, or 1 KB, whichever is largest + * The minimum size is automatically and transparently enforced */ + ZSTD_p_overlapSizeLog, /* Size of previous input reloaded at the beginning of each job. + * 0 => no overlap, 6(default) => use 1/8th of windowSize, >=9 => use full windowSize */ + + /* advanced parameters - may not remain available after API update */ + ZSTD_p_forceMaxWindow=1100, /* Force back-reference distances to remain < windowSize, + * even when referencing into Dictionary content (default:0) */ + +} ZSTD_cParameter; + + +/*! ZSTD_CCtx_setParameter() : + * Set one compression parameter, selected by enum ZSTD_cParameter. + * Note : when `value` is an enum, cast it to unsigned for proper type checking. + * @result : 0, or an error code (which can be tested with ZSTD_isError()). */ +ZSTDLIB_API size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned value); + +/*! ZSTD_CCtx_setPledgedSrcSize() : + * Total input data size to be compressed as a single frame. + * This value will be controlled at the end, and result in error if not respected. + * @result : 0, or an error code (which can be tested with ZSTD_isError()). + * Note 1 : 0 means zero, empty. + * In order to mean "unknown content size", pass constant ZSTD_CONTENTSIZE_UNKNOWN. + * Note that ZSTD_CONTENTSIZE_UNKNOWN is default value for new compression jobs. + * Note 2 : If all data is provided and consumed in a single round, + * this value is overriden by srcSize instead. */ +ZSTDLIB_API size_t ZSTD_CCtx_setPledgedSrcSize(ZSTD_CCtx* cctx, unsigned long long pledgedSrcSize); + +/*! ZSTD_CCtx_loadDictionary() : + * Create an internal CDict from dict buffer. + * Decompression will have to use same buffer. + * @result : 0, or an error code (which can be tested with ZSTD_isError()). + * Special : Adding a NULL (or 0-size) dictionary invalidates any previous dictionary, + * meaning "return to no-dictionary mode". + * Note 1 : `dict` content will be copied internally, + * except if ZSTD_p_refDictContent is set before loading. + * Note 2 : Loading a dictionary involves building tables, which are dependent on compression parameters. + * For this reason, compression parameters cannot be changed anymore after loading a dictionary. + * It's also a CPU-heavy operation, with non-negligible impact on latency. + * Note 3 : Dictionary will be used for all future compression jobs. + * To return to "no-dictionary" situation, load a NULL dictionary */ +ZSTDLIB_API size_t ZSTD_CCtx_loadDictionary(ZSTD_CCtx* cctx, const void* dict, size_t dictSize); + +/*! ZSTD_CCtx_refCDict() : + * Reference a prepared dictionary, to be used for all next compression jobs. + * Note that compression parameters are enforced from within CDict, + * and supercede any compression parameter previously set within CCtx. + * The dictionary will remain valid for future compression jobs using same CCtx. + * @result : 0, or an error code (which can be tested with ZSTD_isError()). + * Special : adding a NULL CDict means "return to no-dictionary mode". + * Note 1 : Currently, only one dictionary can be managed. + * Adding a new dictionary effectively "discards" any previous one. + * Note 2 : CDict is just referenced, its lifetime must outlive CCtx. + */ +ZSTDLIB_API size_t ZSTD_CCtx_refCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict); + +/*! ZSTD_CCtx_refPrefix() : + * Reference a prefix (single-usage dictionary) for next compression job. + * Decompression need same prefix to properly regenerate data. + * Prefix is **only used once**. Tables are discarded at end of compression job. + * Subsequent compression jobs will be done without prefix (if none is explicitly referenced). + * If there is a need to use same prefix multiple times, consider embedding it into a ZSTD_CDict instead. + * @result : 0, or an error code (which can be tested with ZSTD_isError()). + * Special : Adding any prefix (including NULL) invalidates any previous prefix or dictionary + * Note 1 : Prefix buffer is referenced. It must outlive compression job. + * Note 2 : Referencing a prefix involves building tables, which are dependent on compression parameters. + * It's a CPU-heavy operation, with non-negligible impact on latency. + * Note 3 : it's possible to alter ZSTD_p_dictMode using ZSTD_CCtx_setParameter() */ +ZSTDLIB_API size_t ZSTD_CCtx_refPrefix(ZSTD_CCtx* cctx, const void* prefix, size_t prefixSize); + + + +typedef enum { + ZSTD_e_continue=0, /* collect more data, encoder transparently decides when to output result, for optimal conditions */ + ZSTD_e_flush, /* flush any data provided so far - frame will continue, future data can still reference previous data for better compression */ + ZSTD_e_end /* flush any remaining data and ends current frame. Any future compression starts a new frame. */ +} ZSTD_EndDirective; + +/*! ZSTD_compress_generic() : + * Behave about the same as ZSTD_compressStream. To note : + * - Compression parameters are pushed into CCtx before starting compression, using ZSTD_CCtx_setParameter() + * - Compression parameters cannot be changed once compression is started. + * - *dstPos must be <= dstCapacity, *srcPos must be <= srcSize + * - *dspPos and *srcPos will be updated. They are guaranteed to remain below their respective limit. + * - @return provides the minimum amount of data still to flush from internal buffers + * or an error code, which can be tested using ZSTD_isError(). + * if @return != 0, flush is not fully completed, there is some data left within internal buffers. + * - after a ZSTD_e_end directive, if internal buffer is not fully flushed, + * only ZSTD_e_end or ZSTD_e_flush operations are allowed. + * It is necessary to fully flush internal buffers + * before starting a new compression job, or changing compression parameters. + */ +ZSTDLIB_API size_t ZSTD_compress_generic (ZSTD_CCtx* cctx, + ZSTD_outBuffer* output, + ZSTD_inBuffer* input, + ZSTD_EndDirective endOp); + +/*! ZSTD_CCtx_reset() : + * Return a CCtx to clean state. + * Useful after an error, or to interrupt an ongoing compression job and start a new one. + * Any internal data not yet flushed is cancelled. + * Dictionary (if any) is dropped. + * It's possible to modify compression parameters after a reset. + */ +ZSTDLIB_API void ZSTD_CCtx_reset(ZSTD_CCtx* cctx); /* Not ready yet ! */ + + +/*! ZSTD_compress_generic_simpleArgs() : + * Same as ZSTD_compress_generic(), + * but using only integral types as arguments. + * Argument list is larger and less expressive than ZSTD_{in,out}Buffer, + * but can be helpful for binders from dynamic languages + * which have troubles handling structures containing memory pointers. + */ +size_t ZSTD_compress_generic_simpleArgs ( + ZSTD_CCtx* cctx, + void* dst, size_t dstCapacity, size_t* dstPos, + const void* src, size_t srcSize, size_t* srcPos, + ZSTD_EndDirective endOp); + + + /** Block functions @@ -767,7 +1107,7 @@ ZSTDLIB_API ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx); + compression : any ZSTD_compressBegin*() variant, including with dictionary + decompression : any ZSTD_decompressBegin*() variant, including with dictionary + copyCCtx() and copyDCtx() can be used too - - Block size is limited, it must be <= ZSTD_getBlockSizeMax() <= ZSTD_BLOCKSIZE_ABSOLUTEMAX + - Block size is limited, it must be <= ZSTD_getBlockSize() <= ZSTD_BLOCKSIZE_MAX + If input is larger than a block size, it's necessary to split input data into multiple blocks + For inputs larger than a single block size, consider using the regular ZSTD_compress() instead. Frame metadata is not that costly, and quickly becomes negligible as source size grows larger. @@ -780,9 +1120,10 @@ ZSTDLIB_API ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx); Use ZSTD_insertBlock() for such a case. */ -#define ZSTD_BLOCKSIZE_ABSOLUTEMAX (128 * 1024) /* define, for static allocation */ +#define ZSTD_BLOCKSIZELOG_MAX 17 +#define ZSTD_BLOCKSIZE_MAX (1< =32 && !g_decodeOnly) ? g_blockSize : srcSize) + (!srcSize) /* avoid div by 0 */ ; - size_t const avgSize = MIN(blockSize, (srcSize / nbFiles)); U32 const maxNbBlocks = (U32) ((srcSize + (blockSize-1)) / blockSize) + nbFiles; blockParam_t* const blockTable = (blockParam_t*) malloc(maxNbBlocks * sizeof(blockParam_t)); size_t const maxCompressedSize = ZSTD_compressBound(srcSize) + (maxNbBlocks * 1024); /* add some room for safety */ @@ -262,42 +260,72 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize, UTIL_getTime(&clockStart); if (!cCompleted) { /* still some time to do compression tests */ - ZSTD_customMem const cmem = { NULL, NULL, NULL }; U64 const clockLoop = g_nbSeconds ? TIMELOOP_MICROSEC : 1; U32 nbLoops = 0; + ZSTD_CDict* cdict = NULL; +#ifdef ZSTD_NEWAPI + ZSTD_CCtx_setParameter(ctx, ZSTD_p_nbThreads, g_nbThreads); + ZSTD_CCtx_setParameter(ctx, ZSTD_p_compressionLevel, cLevel); + ZSTD_CCtx_setParameter(ctx, ZSTD_p_windowLog, comprParams->windowLog); + ZSTD_CCtx_setParameter(ctx, ZSTD_p_chainLog, comprParams->chainLog); + ZSTD_CCtx_setParameter(ctx, ZSTD_p_searchLog, comprParams->searchLog); + ZSTD_CCtx_setParameter(ctx, ZSTD_p_minMatch, comprParams->searchLength); + ZSTD_CCtx_setParameter(ctx, ZSTD_p_targetLength, comprParams->targetLength); + ZSTD_CCtx_setParameter(ctx, ZSTD_p_compressionStrategy, comprParams->strategy); + ZSTD_CCtx_loadDictionary(ctx, dictBuffer, dictBufferSize); +#else + size_t const avgSize = MIN(blockSize, (srcSize / nbFiles)); ZSTD_parameters zparams = ZSTD_getParams(cLevel, avgSize, dictBufferSize); - ZSTD_CDict* cdict; + ZSTD_customMem const cmem = { NULL, NULL, NULL }; if (comprParams->windowLog) zparams.cParams.windowLog = comprParams->windowLog; if (comprParams->chainLog) zparams.cParams.chainLog = comprParams->chainLog; if (comprParams->hashLog) zparams.cParams.hashLog = comprParams->hashLog; if (comprParams->searchLog) zparams.cParams.searchLog = comprParams->searchLog; if (comprParams->searchLength) zparams.cParams.searchLength = comprParams->searchLength; if (comprParams->targetLength) zparams.cParams.targetLength = comprParams->targetLength; - if (comprParams->strategy) zparams.cParams.strategy = (ZSTD_strategy)(comprParams->strategy - 1); - cdict = ZSTD_createCDict_advanced(dictBuffer, dictBufferSize, 1, zparams.cParams, cmem); + if (comprParams->strategy) zparams.cParams.strategy = comprParams->strategy; + cdict = ZSTD_createCDict_advanced(dictBuffer, dictBufferSize, 1 /*byRef*/, ZSTD_dm_auto, zparams.cParams, cmem); if (cdict==NULL) EXM_THROW(1, "ZSTD_createCDict_advanced() allocation failure"); +#endif do { U32 blockNb; - size_t rSize; for (blockNb=0; blockNb 5) { int n; for (n=-5; n<0; n++) DISPLAY("%02X ", ((const BYTE*)srcBuffer)[u+n]); diff --git a/contrib/zstd/programs/dibio.c b/contrib/zstd/programs/dibio.c index aac36425cf75..31cde5c95db1 100644 --- a/contrib/zstd/programs/dibio.c +++ b/contrib/zstd/programs/dibio.c @@ -216,21 +216,21 @@ static U64 DiB_getTotalCappedFileSize(const char** fileNamesTable, unsigned nbFi } -/*! ZDICT_trainFromBuffer_unsafe() : +/*! ZDICT_trainFromBuffer_unsafe_legacy() : Strictly Internal use only !! - Same as ZDICT_trainFromBuffer_advanced(), but does not control `samplesBuffer`. + Same as ZDICT_trainFromBuffer_legacy(), but does not control `samplesBuffer`. `samplesBuffer` must be followed by noisy guard band to avoid out-of-buffer reads. @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`) or an error code. */ -size_t ZDICT_trainFromBuffer_unsafe(void* dictBuffer, size_t dictBufferCapacity, - const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, - ZDICT_params_t parameters); +size_t ZDICT_trainFromBuffer_unsafe_legacy(void* dictBuffer, size_t dictBufferCapacity, + const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, + ZDICT_legacy_params_t parameters); int DiB_trainFromFiles(const char* dictFileName, unsigned maxDictSize, const char** fileNamesTable, unsigned nbFiles, - ZDICT_params_t *params, COVER_params_t *coverParams, + ZDICT_legacy_params_t *params, ZDICT_cover_params_t *coverParams, int optimizeCover) { void* const dictBuffer = malloc(maxDictSize); @@ -243,8 +243,8 @@ int DiB_trainFromFiles(const char* dictFileName, unsigned maxDictSize, int result = 0; /* Checks */ - if (params) g_displayLevel = params->notificationLevel; - else if (coverParams) g_displayLevel = coverParams->notificationLevel; + if (params) g_displayLevel = params->zParams.notificationLevel; + else if (coverParams) g_displayLevel = coverParams->zParams.notificationLevel; else EXM_THROW(13, "Neither dictionary algorith selected"); /* should not happen */ if ((!fileSizes) || (!srcBuffer) || (!dictBuffer)) EXM_THROW(12, "not enough memory for DiB_trainFiles"); /* should not happen */ if (g_tooLargeSamples) { @@ -273,20 +273,20 @@ int DiB_trainFromFiles(const char* dictFileName, unsigned maxDictSize, size_t dictSize; if (params) { DiB_fillNoise((char*)srcBuffer + benchedSize, NOISELENGTH); /* guard band, for end of buffer condition */ - dictSize = ZDICT_trainFromBuffer_unsafe(dictBuffer, maxDictSize, - srcBuffer, fileSizes, nbFiles, - *params); + dictSize = ZDICT_trainFromBuffer_unsafe_legacy(dictBuffer, maxDictSize, + srcBuffer, fileSizes, nbFiles, + *params); } else if (optimizeCover) { - dictSize = COVER_optimizeTrainFromBuffer( - dictBuffer, maxDictSize, srcBuffer, fileSizes, nbFiles, - coverParams); + dictSize = ZDICT_optimizeTrainFromBuffer_cover(dictBuffer, maxDictSize, + srcBuffer, fileSizes, nbFiles, + coverParams); if (!ZDICT_isError(dictSize)) { - DISPLAYLEVEL(2, "k=%u\nd=%u\nsteps=%u\n", coverParams->k, coverParams->d, coverParams->steps); + DISPLAYLEVEL(2, "k=%u\nd=%u\nsteps=%u\n", coverParams->k, coverParams->d, coverParams->steps); } } else { - dictSize = COVER_trainFromBuffer(dictBuffer, maxDictSize, - srcBuffer, fileSizes, nbFiles, - *coverParams); + dictSize = + ZDICT_trainFromBuffer_cover(dictBuffer, maxDictSize, srcBuffer, + fileSizes, nbFiles, *coverParams); } if (ZDICT_isError(dictSize)) { DISPLAYLEVEL(1, "dictionary training failed : %s \n", ZDICT_getErrorName(dictSize)); /* should not happen */ diff --git a/contrib/zstd/programs/dibio.h b/contrib/zstd/programs/dibio.h index e61d0042c850..84f7d580283d 100644 --- a/contrib/zstd/programs/dibio.h +++ b/contrib/zstd/programs/dibio.h @@ -32,7 +32,7 @@ */ int DiB_trainFromFiles(const char* dictFileName, unsigned maxDictSize, const char** fileNamesTable, unsigned nbFiles, - ZDICT_params_t *params, COVER_params_t *coverParams, + ZDICT_legacy_params_t *params, ZDICT_cover_params_t *coverParams, int optimizeCover); #endif diff --git a/contrib/zstd/programs/fileio.c b/contrib/zstd/programs/fileio.c index 33850cb4dca4..276735a0ecdc 100644 --- a/contrib/zstd/programs/fileio.c +++ b/contrib/zstd/programs/fileio.c @@ -77,10 +77,7 @@ #define BLOCKSIZE (128 KB) #define ROLLBUFFERSIZE (BLOCKSIZE*8*64) -#define FIO_FRAMEHEADERSIZE 5 /* as a define, because needed to allocated table on stack */ -#define FSE_CHECKSUM_SEED 0 - -#define CACHELINE 64 +#define FIO_FRAMEHEADERSIZE 5 /* as a define, because needed to allocated table on stack */ #define DICTSIZE_MAX (32 MB) /* protection against large input (attack scenario) */ @@ -91,8 +88,9 @@ * Macros ***************************************/ #define DISPLAY(...) fprintf(stderr, __VA_ARGS__) +#define DISPLAYOUT(...) fprintf(stdout, __VA_ARGS__) #define DISPLAYLEVEL(l, ...) { if (g_displayLevel>=l) { DISPLAY(__VA_ARGS__); } } -static int g_displayLevel = 2; /* 0 : no display; 1: errors; 2 : + result + interaction + warnings; 3 : + progression; 4 : + information */ +static int g_displayLevel = 2; /* 0 : no display; 1: errors; 2: + result + interaction + warnings; 3: + progression; 4: + information */ void FIO_setNotificationLevel(unsigned level) { g_displayLevel=level; } #define DISPLAYUPDATE(l, ...) { if (g_displayLevel>=l) { \ @@ -102,10 +100,46 @@ void FIO_setNotificationLevel(unsigned level) { g_displayLevel=level; } static const clock_t refreshRate = CLOCKS_PER_SEC * 15 / 100; static clock_t g_time = 0; -#undef MIN +#undef MIN /* in case it would be already defined */ #define MIN(a,b) ((a) < (b) ? (a) : (b)) +/*-************************************* +* Errors +***************************************/ +/*-************************************* +* Debug +***************************************/ +#if defined(ZSTD_DEBUG) && (ZSTD_DEBUG>=1) +# include +#else +# ifndef assert +# define assert(condition) ((void)0) +# endif +#endif + +#ifndef ZSTD_DEBUG +# define ZSTD_DEBUG 0 +#endif +#define DEBUGLOG(l,...) if (l<=ZSTD_DEBUG) DISPLAY(__VA_ARGS__); +#define EXM_THROW(error, ...) \ +{ \ + DISPLAYLEVEL(1, "zstd: "); \ + DEBUGLOG(1, "Error defined at %s, line %i : \n", __FILE__, __LINE__); \ + DISPLAYLEVEL(1, "error %i : ", error); \ + DISPLAYLEVEL(1, __VA_ARGS__); \ + DISPLAYLEVEL(1, " \n"); \ + exit(error); \ +} + +#define CHECK(f) { \ + size_t const err = f; \ + if (ZSTD_isError(err)) { \ + DEBUGLOG(1, "%s \n", #f); \ + EXM_THROW(11, "%s", ZSTD_getErrorName(err)); \ +} } + + /* ************************************************************ * Avoid fseek()'s 2GiB barrier with MSVC, MacOS, *BSD, MinGW ***************************************************************/ @@ -145,7 +179,7 @@ static FIO_compressionType_t g_compressionType = FIO_zstdCompression; void FIO_setCompressionType(FIO_compressionType_t compressionType) { g_compressionType = compressionType; } static U32 g_overwrite = 0; void FIO_overwriteMode(void) { g_overwrite=1; } -static U32 g_sparseFileSupport = 1; /* 0 : no sparse allowed; 1: auto (file yes, stdout no); 2: force sparse */ +static U32 g_sparseFileSupport = 1; /* 0: no sparse allowed; 1: auto (file yes, stdout no); 2: force sparse */ void FIO_setSparseWrite(unsigned sparse) { g_sparseFileSupport=sparse; } static U32 g_dictIDFlag = 1; void FIO_setDictIDFlag(unsigned dictIDFlag) { g_dictIDFlag = dictIDFlag; } @@ -181,23 +215,6 @@ void FIO_setOverlapLog(unsigned overlapLog){ } -/*-************************************* -* Exceptions -***************************************/ -#ifndef DEBUG -# define DEBUG 0 -#endif -#define DEBUGOUTPUT(...) if (DEBUG) DISPLAY(__VA_ARGS__); -#define EXM_THROW(error, ...) \ -{ \ - DEBUGOUTPUT("Error defined at %s, line %i : \n", __FILE__, __LINE__); \ - DISPLAYLEVEL(1, "Error %i : ", error); \ - DISPLAYLEVEL(1, __VA_ARGS__); \ - DISPLAYLEVEL(1, " \n"); \ - exit(error); \ -} - - /*-************************************* * Functions ***************************************/ @@ -206,8 +223,8 @@ void FIO_setOverlapLog(unsigned overlapLog){ static int FIO_remove(const char* path) { #if defined(_WIN32) || defined(WIN32) - /* windows doesn't allow remove read-only files, so try to make it - * writable first */ + /* windows doesn't allow remove read-only files, + * so try to make it writable first */ chmod(path, _S_IWRITE); #endif return remove(path); @@ -225,12 +242,14 @@ static FILE* FIO_openSrcFile(const char* srcFileName) f = stdin; SET_BINARY_MODE(stdin); } else { - if (!UTIL_isRegFile(srcFileName)) { - DISPLAYLEVEL(1, "zstd: %s is not a regular file -- ignored \n", srcFileName); + if (!UTIL_isRegularFile(srcFileName)) { + DISPLAYLEVEL(1, "zstd: %s is not a regular file -- ignored \n", + srcFileName); return NULL; } f = fopen(srcFileName, "rb"); - if ( f==NULL ) DISPLAYLEVEL(1, "zstd: %s: %s \n", srcFileName, strerror(errno)); + if ( f==NULL ) + DISPLAYLEVEL(1, "zstd: %s: %s \n", srcFileName, strerror(errno)); } return f; @@ -255,26 +274,28 @@ static FILE* FIO_openDstFile(const char* dstFileName) if (g_sparseFileSupport == 1) { g_sparseFileSupport = ZSTD_SPARSE_DEFAULT; } - if (strcmp (dstFileName, nulmark)) { /* Check if destination file already exists */ + if (strcmp (dstFileName, nulmark)) { + /* Check if destination file already exists */ f = fopen( dstFileName, "rb" ); - if (f != 0) { /* dest file exists, prompt for overwrite authorization */ + if (f != 0) { /* dst file exists, prompt for overwrite authorization */ fclose(f); if (!g_overwrite) { if (g_displayLevel <= 1) { /* No interaction possible */ - DISPLAY("zstd: %s already exists; not overwritten \n", dstFileName); + DISPLAY("zstd: %s already exists; not overwritten \n", + dstFileName); return NULL; } - DISPLAY("zstd: %s already exists; do you wish to overwrite (y/N) ? ", dstFileName); + DISPLAY("zstd: %s already exists; do you wish to overwrite (y/N) ? ", + dstFileName); { int ch = getchar(); if ((ch!='Y') && (ch!='y')) { DISPLAY(" not overwritten \n"); return NULL; } - while ((ch!=EOF) && (ch!='\n')) ch = getchar(); /* flush rest of input line */ - } - } - + /* flush rest of input line */ + while ((ch!=EOF) && (ch!='\n')) ch = getchar(); + } } /* need to unlink */ FIO_remove(dstFileName); } } @@ -302,11 +323,13 @@ static size_t FIO_createDictBuffer(void** bufferPtr, const char* fileName) DISPLAYLEVEL(4,"Loading %s as dictionary \n", fileName); fileHandle = fopen(fileName, "rb"); - if (fileHandle==0) EXM_THROW(31, "zstd: %s: %s", fileName, strerror(errno)); + if (fileHandle==0) EXM_THROW(31, "%s: %s", fileName, strerror(errno)); fileSize = UTIL_getFileSize(fileName); - if (fileSize > DICTSIZE_MAX) EXM_THROW(32, "Dictionary file %s is too large (> %u MB)", fileName, DICTSIZE_MAX >> 20); /* avoid extreme cases */ + if (fileSize > DICTSIZE_MAX) + EXM_THROW(32, "Dictionary file %s is too large (> %u MB)", + fileName, DICTSIZE_MAX >> 20); /* avoid extreme cases */ *bufferPtr = malloc((size_t)fileSize); - if (*bufferPtr==NULL) EXM_THROW(34, "zstd: %s", strerror(errno)); + if (*bufferPtr==NULL) EXM_THROW(34, "%s", strerror(errno)); { size_t const readSize = fread(*bufferPtr, 1, (size_t)fileSize, fileHandle); if (readSize!=fileSize) EXM_THROW(35, "Error reading dictionary file %s", fileName); } fclose(fileHandle); @@ -325,7 +348,7 @@ typedef struct { size_t srcBufferSize; void* dstBuffer; size_t dstBufferSize; -#ifdef ZSTD_MULTITHREAD +#if !defined(ZSTD_NEWAPI) && defined(ZSTD_MULTITHREAD) ZSTDMT_CCtx* cctx; #else ZSTD_CStream* cctx; @@ -333,34 +356,65 @@ typedef struct { } cRess_t; static cRess_t FIO_createCResources(const char* dictFileName, int cLevel, - U64 srcSize, int srcRegFile, + U64 srcSize, int srcIsRegularFile, ZSTD_compressionParameters* comprParams) { cRess_t ress; memset(&ress, 0, sizeof(ress)); -#ifdef ZSTD_MULTITHREAD +#ifdef ZSTD_NEWAPI + ress.cctx = ZSTD_createCCtx(); + if (ress.cctx == NULL) + EXM_THROW(30, "allocation error : can't create ZSTD_CCtx"); +#elif defined(ZSTD_MULTITHREAD) ress.cctx = ZSTDMT_createCCtx(g_nbThreads); - if (ress.cctx == NULL) EXM_THROW(30, "zstd: allocation error : can't create ZSTD_CStream"); + if (ress.cctx == NULL) + EXM_THROW(30, "allocation error : can't create ZSTDMT_CCtx"); if ((cLevel==ZSTD_maxCLevel()) && (g_overlapLog==FIO_OVERLAP_LOG_NOTSET)) - ZSTDMT_setMTCtxParameter(ress.cctx, ZSTDMT_p_overlapSectionLog, 9); /* use complete window for overlap */ + /* use complete window for overlap */ + ZSTDMT_setMTCtxParameter(ress.cctx, ZSTDMT_p_overlapSectionLog, 9); if (g_overlapLog != FIO_OVERLAP_LOG_NOTSET) ZSTDMT_setMTCtxParameter(ress.cctx, ZSTDMT_p_overlapSectionLog, g_overlapLog); #else ress.cctx = ZSTD_createCStream(); - if (ress.cctx == NULL) EXM_THROW(30, "zstd: allocation error : can't create ZSTD_CStream"); + if (ress.cctx == NULL) + EXM_THROW(30, "allocation error : can't create ZSTD_CStream"); #endif ress.srcBufferSize = ZSTD_CStreamInSize(); ress.srcBuffer = malloc(ress.srcBufferSize); ress.dstBufferSize = ZSTD_CStreamOutSize(); ress.dstBuffer = malloc(ress.dstBufferSize); - if (!ress.srcBuffer || !ress.dstBuffer) EXM_THROW(31, "zstd: allocation error : not enough memory"); + if (!ress.srcBuffer || !ress.dstBuffer) + EXM_THROW(31, "allocation error : not enough memory"); /* dictionary */ { void* dictBuffer; size_t const dictBuffSize = FIO_createDictBuffer(&dictBuffer, dictFileName); /* works with dictFileName==NULL */ - if (dictFileName && (dictBuffer==NULL)) EXM_THROW(32, "zstd: allocation error : can't create dictBuffer"); + if (dictFileName && (dictBuffer==NULL)) + EXM_THROW(32, "allocation error : can't create dictBuffer"); + +#ifdef ZSTD_NEWAPI + { /* frame parameters */ + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_contentSizeFlag, srcIsRegularFile) ); + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_dictIDFlag, g_dictIDFlag) ); + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_checksumFlag, g_checksumFlag) ); + CHECK( ZSTD_CCtx_setPledgedSrcSize(ress.cctx, srcSize) ); + /* compression parameters */ + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_compressionLevel, cLevel) ); + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_windowLog, comprParams->windowLog) ); + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_chainLog, comprParams->chainLog) ); + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_hashLog, comprParams->hashLog) ); + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_searchLog, comprParams->searchLog) ); + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_minMatch, comprParams->searchLength) ); + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_targetLength, comprParams->targetLength) ); + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_compressionStrategy, (U32)comprParams->strategy) ); + /* multi-threading */ + CHECK( ZSTD_CCtx_setParameter(ress.cctx, ZSTD_p_nbThreads, g_nbThreads) ); + /* dictionary */ + CHECK( ZSTD_CCtx_loadDictionary(ress.cctx, dictBuffer, dictBuffSize) ); + } +#elif defined(ZSTD_MULTITHREAD) { ZSTD_parameters params = ZSTD_getParams(cLevel, srcSize, dictBuffSize); - params.fParams.contentSizeFlag = srcRegFile; + params.fParams.contentSizeFlag = srcIsRegularFile; params.fParams.checksumFlag = g_checksumFlag; params.fParams.noDictIDFlag = !g_dictIDFlag; if (comprParams->windowLog) params.cParams.windowLog = comprParams->windowLog; @@ -369,16 +423,25 @@ static cRess_t FIO_createCResources(const char* dictFileName, int cLevel, if (comprParams->searchLog) params.cParams.searchLog = comprParams->searchLog; if (comprParams->searchLength) params.cParams.searchLength = comprParams->searchLength; if (comprParams->targetLength) params.cParams.targetLength = comprParams->targetLength; - if (comprParams->strategy) params.cParams.strategy = (ZSTD_strategy)(comprParams->strategy - 1); /* 0 means : do not change */ -#ifdef ZSTD_MULTITHREAD - { size_t const errorCode = ZSTDMT_initCStream_advanced(ress.cctx, dictBuffer, dictBuffSize, params, srcSize); - if (ZSTD_isError(errorCode)) EXM_THROW(33, "Error initializing CStream : %s", ZSTD_getErrorName(errorCode)); - ZSTDMT_setMTCtxParameter(ress.cctx, ZSTDMT_p_sectionSize, g_blockSize); + if (comprParams->strategy) params.cParams.strategy = (ZSTD_strategy) comprParams->strategy; + CHECK( ZSTDMT_initCStream_advanced(ress.cctx, dictBuffer, dictBuffSize, params, srcSize) ); + ZSTDMT_setMTCtxParameter(ress.cctx, ZSTDMT_p_sectionSize, g_blockSize); + } #else - { size_t const errorCode = ZSTD_initCStream_advanced(ress.cctx, dictBuffer, dictBuffSize, params, srcSize); - if (ZSTD_isError(errorCode)) EXM_THROW(33, "Error initializing CStream : %s", ZSTD_getErrorName(errorCode)); + { ZSTD_parameters params = ZSTD_getParams(cLevel, srcSize, dictBuffSize); + params.fParams.contentSizeFlag = srcIsRegularFile; + params.fParams.checksumFlag = g_checksumFlag; + params.fParams.noDictIDFlag = !g_dictIDFlag; + if (comprParams->windowLog) params.cParams.windowLog = comprParams->windowLog; + if (comprParams->chainLog) params.cParams.chainLog = comprParams->chainLog; + if (comprParams->hashLog) params.cParams.hashLog = comprParams->hashLog; + if (comprParams->searchLog) params.cParams.searchLog = comprParams->searchLog; + if (comprParams->searchLength) params.cParams.searchLength = comprParams->searchLength; + if (comprParams->targetLength) params.cParams.targetLength = comprParams->targetLength; + if (comprParams->strategy) params.cParams.strategy = (ZSTD_strategy) comprParams->strategy; + CHECK( ZSTD_initCStream_advanced(ress.cctx, dictBuffer, dictBuffSize, params, srcSize) ); + } #endif - } } free(dictBuffer); } @@ -389,7 +452,7 @@ static void FIO_freeCResources(cRess_t ress) { free(ress.srcBuffer); free(ress.dstBuffer); -#ifdef ZSTD_MULTITHREAD +#if !defined(ZSTD_NEWAPI) && defined(ZSTD_MULTITHREAD) ZSTDMT_freeCCtx(ress.cctx); #else ZSTD_freeCStream(ress.cctx); /* never fails */ @@ -398,20 +461,26 @@ static void FIO_freeCResources(cRess_t ress) #ifdef ZSTD_GZCOMPRESS -static unsigned long long FIO_compressGzFrame(cRess_t* ress, const char* srcFileName, U64 const srcFileSize, int compressionLevel, U64* readsize) +static unsigned long long FIO_compressGzFrame(cRess_t* ress, + const char* srcFileName, U64 const srcFileSize, + int compressionLevel, U64* readsize) { unsigned long long inFileSize = 0, outFileSize = 0; z_stream strm; int ret; - if (compressionLevel > Z_BEST_COMPRESSION) compressionLevel = Z_BEST_COMPRESSION; + if (compressionLevel > Z_BEST_COMPRESSION) + compressionLevel = Z_BEST_COMPRESSION; strm.zalloc = Z_NULL; strm.zfree = Z_NULL; strm.opaque = Z_NULL; - ret = deflateInit2(&strm, compressionLevel, Z_DEFLATED, 15 /* maxWindowLogSize */ + 16 /* gzip only */, 8, Z_DEFAULT_STRATEGY); /* see http://www.zlib.net/manual.html */ - if (ret != Z_OK) EXM_THROW(71, "zstd: %s: deflateInit2 error %d \n", srcFileName, ret); + ret = deflateInit2(&strm, compressionLevel, Z_DEFLATED, + 15 /* maxWindowLogSize */ + 16 /* gzip only */, + 8, Z_DEFAULT_STRATEGY); /* see http://www.zlib.net/manual.html */ + if (ret != Z_OK) + EXM_THROW(71, "zstd: %s: deflateInit2 error %d \n", srcFileName, ret); strm.next_in = 0; strm.avail_in = 0; @@ -427,35 +496,45 @@ static unsigned long long FIO_compressGzFrame(cRess_t* ress, const char* srcFile strm.avail_in = (uInt)inSize; } ret = deflate(&strm, Z_NO_FLUSH); - if (ret != Z_OK) EXM_THROW(72, "zstd: %s: deflate error %d \n", srcFileName, ret); + if (ret != Z_OK) + EXM_THROW(72, "zstd: %s: deflate error %d \n", srcFileName, ret); { size_t const decompBytes = ress->dstBufferSize - strm.avail_out; if (decompBytes) { - if (fwrite(ress->dstBuffer, 1, decompBytes, ress->dstFile) != decompBytes) EXM_THROW(73, "Write error : cannot write to output file"); + if (fwrite(ress->dstBuffer, 1, decompBytes, ress->dstFile) != decompBytes) + EXM_THROW(73, "Write error : cannot write to output file"); outFileSize += decompBytes; strm.next_out = (Bytef*)ress->dstBuffer; strm.avail_out = (uInt)ress->dstBufferSize; } } - if (!srcFileSize) DISPLAYUPDATE(2, "\rRead : %u MB ==> %.2f%%", (U32)(inFileSize>>20), (double)outFileSize/inFileSize*100) - else DISPLAYUPDATE(2, "\rRead : %u / %u MB ==> %.2f%%", (U32)(inFileSize>>20), (U32)(srcFileSize>>20), (double)outFileSize/inFileSize*100); + if (!srcFileSize) + DISPLAYUPDATE(2, "\rRead : %u MB ==> %.2f%%", + (U32)(inFileSize>>20), + (double)outFileSize/inFileSize*100) + else + DISPLAYUPDATE(2, "\rRead : %u / %u MB ==> %.2f%%", + (U32)(inFileSize>>20), (U32)(srcFileSize>>20), + (double)outFileSize/inFileSize*100); } while (1) { ret = deflate(&strm, Z_FINISH); { size_t const decompBytes = ress->dstBufferSize - strm.avail_out; if (decompBytes) { - if (fwrite(ress->dstBuffer, 1, decompBytes, ress->dstFile) != decompBytes) EXM_THROW(75, "Write error : cannot write to output file"); + if (fwrite(ress->dstBuffer, 1, decompBytes, ress->dstFile) != decompBytes) + EXM_THROW(75, "Write error : cannot write to output file"); outFileSize += decompBytes; strm.next_out = (Bytef*)ress->dstBuffer; strm.avail_out = (uInt)ress->dstBufferSize; - } - } + } } if (ret == Z_STREAM_END) break; - if (ret != Z_BUF_ERROR) EXM_THROW(77, "zstd: %s: deflate error %d \n", srcFileName, ret); + if (ret != Z_BUF_ERROR) + EXM_THROW(77, "zstd: %s: deflate error %d \n", srcFileName, ret); } ret = deflateEnd(&strm); - if (ret != Z_OK) EXM_THROW(79, "zstd: %s: deflateEnd error %d \n", srcFileName, ret); + if (ret != Z_OK) + EXM_THROW(79, "zstd: %s: deflateEnd error %d \n", srcFileName, ret); *readsize = inFileSize; return outFileSize; @@ -464,7 +543,9 @@ static unsigned long long FIO_compressGzFrame(cRess_t* ress, const char* srcFile #ifdef ZSTD_LZMACOMPRESS -static unsigned long long FIO_compressLzmaFrame(cRess_t* ress, const char* srcFileName, U64 const srcFileSize, int compressionLevel, U64* readsize, int plain_lzma) +static unsigned long long FIO_compressLzmaFrame(cRess_t* ress, + const char* srcFileName, U64 const srcFileSize, + int compressionLevel, U64* readsize, int plain_lzma) { unsigned long long inFileSize = 0, outFileSize = 0; lzma_stream strm = LZMA_STREAM_INIT; @@ -476,17 +557,20 @@ static unsigned long long FIO_compressLzmaFrame(cRess_t* ress, const char* srcFi if (plain_lzma) { lzma_options_lzma opt_lzma; - if (lzma_lzma_preset(&opt_lzma, compressionLevel)) EXM_THROW(71, "zstd: %s: lzma_lzma_preset error", srcFileName); + if (lzma_lzma_preset(&opt_lzma, compressionLevel)) + EXM_THROW(71, "zstd: %s: lzma_lzma_preset error", srcFileName); ret = lzma_alone_encoder(&strm, &opt_lzma); /* LZMA */ - if (ret != LZMA_OK) EXM_THROW(71, "zstd: %s: lzma_alone_encoder error %d", srcFileName, ret); + if (ret != LZMA_OK) + EXM_THROW(71, "zstd: %s: lzma_alone_encoder error %d", srcFileName, ret); } else { ret = lzma_easy_encoder(&strm, compressionLevel, LZMA_CHECK_CRC64); /* XZ */ - if (ret != LZMA_OK) EXM_THROW(71, "zstd: %s: lzma_easy_encoder error %d", srcFileName, ret); + if (ret != LZMA_OK) + EXM_THROW(71, "zstd: %s: lzma_easy_encoder error %d", srcFileName, ret); } strm.next_in = 0; strm.avail_in = 0; - strm.next_out = ress->dstBuffer; + strm.next_out = (BYTE*)ress->dstBuffer; strm.avail_out = ress->dstBufferSize; while (1) { @@ -494,23 +578,30 @@ static unsigned long long FIO_compressLzmaFrame(cRess_t* ress, const char* srcFi size_t const inSize = fread(ress->srcBuffer, 1, ress->srcBufferSize, ress->srcFile); if (inSize == 0) action = LZMA_FINISH; inFileSize += inSize; - strm.next_in = ress->srcBuffer; + strm.next_in = (BYTE const*)ress->srcBuffer; strm.avail_in = inSize; } ret = lzma_code(&strm, action); - if (ret != LZMA_OK && ret != LZMA_STREAM_END) EXM_THROW(72, "zstd: %s: lzma_code encoding error %d", srcFileName, ret); + if (ret != LZMA_OK && ret != LZMA_STREAM_END) + EXM_THROW(72, "zstd: %s: lzma_code encoding error %d", srcFileName, ret); { size_t const compBytes = ress->dstBufferSize - strm.avail_out; if (compBytes) { - if (fwrite(ress->dstBuffer, 1, compBytes, ress->dstFile) != compBytes) EXM_THROW(73, "Write error : cannot write to output file"); + if (fwrite(ress->dstBuffer, 1, compBytes, ress->dstFile) != compBytes) + EXM_THROW(73, "Write error : cannot write to output file"); outFileSize += compBytes; - strm.next_out = ress->dstBuffer; + strm.next_out = (BYTE*)ress->dstBuffer; strm.avail_out = ress->dstBufferSize; - } - } - if (!srcFileSize) DISPLAYUPDATE(2, "\rRead : %u MB ==> %.2f%%", (U32)(inFileSize>>20), (double)outFileSize/inFileSize*100) - else DISPLAYUPDATE(2, "\rRead : %u / %u MB ==> %.2f%%", (U32)(inFileSize>>20), (U32)(srcFileSize>>20), (double)outFileSize/inFileSize*100); + } } + if (!srcFileSize) + DISPLAYUPDATE(2, "\rRead : %u MB ==> %.2f%%", + (U32)(inFileSize>>20), + (double)outFileSize/inFileSize*100) + else + DISPLAYUPDATE(2, "\rRead : %u / %u MB ==> %.2f%%", + (U32)(inFileSize>>20), (U32)(srcFileSize>>20), + (double)outFileSize/inFileSize*100); if (ret == LZMA_STREAM_END) break; } @@ -523,7 +614,9 @@ static unsigned long long FIO_compressLzmaFrame(cRess_t* ress, const char* srcFi #ifdef ZSTD_LZ4COMPRESS static int FIO_LZ4_GetBlockSize_FromBlockId (int id) { return (1 << (8 + (2 * id))); } -static unsigned long long FIO_compressLz4Frame(cRess_t* ress, const char* srcFileName, U64 const srcFileSize, int compressionLevel, U64* readsize) +static unsigned long long FIO_compressLz4Frame(cRess_t* ress, + const char* srcFileName, U64 const srcFileSize, + int compressionLevel, U64* readsize) { unsigned long long inFileSize = 0, outFileSize = 0; @@ -531,7 +624,8 @@ static unsigned long long FIO_compressLz4Frame(cRess_t* ress, const char* srcFil LZ4F_compressionContext_t ctx; LZ4F_errorCode_t const errorCode = LZ4F_createCompressionContext(&ctx, LZ4F_VERSION); - if (LZ4F_isError(errorCode)) EXM_THROW(31, "zstd: failed to create lz4 compression context"); + if (LZ4F_isError(errorCode)) + EXM_THROW(31, "zstd: failed to create lz4 compression context"); memset(&prefs, 0, sizeof(prefs)); @@ -553,7 +647,9 @@ static unsigned long long FIO_compressLz4Frame(cRess_t* ress, const char* srcFil size_t blockSize = FIO_LZ4_GetBlockSize_FromBlockId(LZ4F_max4MB); size_t readSize; size_t headerSize = LZ4F_compressBegin(ctx, ress->dstBuffer, ress->dstBufferSize, &prefs); - if (LZ4F_isError(headerSize)) EXM_THROW(33, "File header generation failed : %s", LZ4F_getErrorName(headerSize)); + if (LZ4F_isError(headerSize)) + EXM_THROW(33, "File header generation failed : %s", + LZ4F_getErrorName(headerSize)); { size_t const sizeCheck = fwrite(ress->dstBuffer, 1, headerSize, ress->dstFile); if (sizeCheck!=headerSize) EXM_THROW(34, "Write error : cannot write header"); } outFileSize += headerSize; @@ -568,10 +664,18 @@ static unsigned long long FIO_compressLz4Frame(cRess_t* ress, const char* srcFil /* Compress Block */ outSize = LZ4F_compressUpdate(ctx, ress->dstBuffer, ress->dstBufferSize, ress->srcBuffer, readSize, NULL); - if (LZ4F_isError(outSize)) EXM_THROW(35, "zstd: %s: lz4 compression failed : %s", srcFileName, LZ4F_getErrorName(outSize)); + if (LZ4F_isError(outSize)) + EXM_THROW(35, "zstd: %s: lz4 compression failed : %s", + srcFileName, LZ4F_getErrorName(outSize)); outFileSize += outSize; - if (!srcFileSize) DISPLAYUPDATE(2, "\rRead : %u MB ==> %.2f%%", (U32)(inFileSize>>20), (double)outFileSize/inFileSize*100) - else DISPLAYUPDATE(2, "\rRead : %u / %u MB ==> %.2f%%", (U32)(inFileSize>>20), (U32)(srcFileSize>>20), (double)outFileSize/inFileSize*100); + if (!srcFileSize) + DISPLAYUPDATE(2, "\rRead : %u MB ==> %.2f%%", + (U32)(inFileSize>>20), + (double)outFileSize/inFileSize*100) + else + DISPLAYUPDATE(2, "\rRead : %u / %u MB ==> %.2f%%", + (U32)(inFileSize>>20), (U32)(srcFileSize>>20), + (double)outFileSize/inFileSize*100); /* Write Block */ { size_t const sizeCheck = fwrite(ress->dstBuffer, 1, outSize, ress->dstFile); @@ -585,7 +689,9 @@ static unsigned long long FIO_compressLz4Frame(cRess_t* ress, const char* srcFil /* End of Stream mark */ headerSize = LZ4F_compressEnd(ctx, ress->dstBuffer, ress->dstBufferSize, NULL); - if (LZ4F_isError(headerSize)) EXM_THROW(38, "zstd: %s: lz4 end of file generation failed : %s", srcFileName, LZ4F_getErrorName(headerSize)); + if (LZ4F_isError(headerSize)) + EXM_THROW(38, "zstd: %s: lz4 end of file generation failed : %s", + srcFileName, LZ4F_getErrorName(headerSize)); { size_t const sizeCheck = fwrite(ress->dstBuffer, 1, headerSize, ress->dstFile); if (sizeCheck!=headerSize) EXM_THROW(39, "Write error : cannot write end of stream"); } @@ -623,7 +729,8 @@ static int FIO_compressFilename_internal(cRess_t ress, compressedfilesize = FIO_compressGzFrame(&ress, srcFileName, fileSize, compressionLevel, &readsize); #else (void)compressionLevel; - EXM_THROW(20, "zstd: %s: file cannot be compressed as gzip (zstd compiled without ZSTD_GZCOMPRESS) -- ignored \n", srcFileName); + EXM_THROW(20, "zstd: %s: file cannot be compressed as gzip (zstd compiled without ZSTD_GZCOMPRESS) -- ignored \n", + srcFileName); #endif goto finish; @@ -633,7 +740,8 @@ static int FIO_compressFilename_internal(cRess_t ress, compressedfilesize = FIO_compressLzmaFrame(&ress, srcFileName, fileSize, compressionLevel, &readsize, g_compressionType==FIO_lzmaCompression); #else (void)compressionLevel; - EXM_THROW(20, "zstd: %s: file cannot be compressed as xz/lzma (zstd compiled without ZSTD_LZMACOMPRESS) -- ignored \n", srcFileName); + EXM_THROW(20, "zstd: %s: file cannot be compressed as xz/lzma (zstd compiled without ZSTD_LZMACOMPRESS) -- ignored \n", + srcFileName); #endif goto finish; @@ -642,49 +750,62 @@ static int FIO_compressFilename_internal(cRess_t ress, compressedfilesize = FIO_compressLz4Frame(&ress, srcFileName, fileSize, compressionLevel, &readsize); #else (void)compressionLevel; - EXM_THROW(20, "zstd: %s: file cannot be compressed as lz4 (zstd compiled without ZSTD_LZ4COMPRESS) -- ignored \n", srcFileName); + EXM_THROW(20, "zstd: %s: file cannot be compressed as lz4 (zstd compiled without ZSTD_LZ4COMPRESS) -- ignored \n", + srcFileName); #endif goto finish; } /* init */ -#ifdef ZSTD_MULTITHREAD - { size_t const resetError = ZSTDMT_resetCStream(ress.cctx, fileSize); +#ifdef ZSTD_NEWAPI + /* nothing, reset is implied */ +#elif defined(ZSTD_MULTITHREAD) + CHECK( ZSTDMT_resetCStream(ress.cctx, fileSize) ); #else - { size_t const resetError = ZSTD_resetCStream(ress.cctx, fileSize); + CHECK( ZSTD_resetCStream(ress.cctx, fileSize) ); #endif - if (ZSTD_isError(resetError)) EXM_THROW(21, "Error initializing compression : %s", ZSTD_getErrorName(resetError)); - } /* Main compression loop */ while (1) { /* Fill input Buffer */ size_t const inSize = fread(ress.srcBuffer, (size_t)1, ress.srcBufferSize, srcFile); + ZSTD_inBuffer inBuff = { ress.srcBuffer, inSize, 0 }; if (inSize==0) break; readsize += inSize; - { ZSTD_inBuffer inBuff = { ress.srcBuffer, inSize, 0 }; - while (inBuff.pos != inBuff.size) { - ZSTD_outBuffer outBuff = { ress.dstBuffer, ress.dstBufferSize, 0 }; -#ifdef ZSTD_MULTITHREAD - size_t const result = ZSTDMT_compressStream(ress.cctx, &outBuff, &inBuff); + while (inBuff.pos != inBuff.size) { + ZSTD_outBuffer outBuff = { ress.dstBuffer, ress.dstBufferSize, 0 }; +#ifdef ZSTD_NEWAPI + CHECK( ZSTD_compress_generic(ress.cctx, + &outBuff, &inBuff, ZSTD_e_continue) ); +#elif defined(ZSTD_MULTITHREAD) + CHECK( ZSTDMT_compressStream(ress.cctx, &outBuff, &inBuff) ); #else - size_t const result = ZSTD_compressStream(ress.cctx, &outBuff, &inBuff); + CHECK( ZSTD_compressStream(ress.cctx, &outBuff, &inBuff) ); #endif - if (ZSTD_isError(result)) EXM_THROW(23, "Compression error : %s ", ZSTD_getErrorName(result)); - /* Write compressed stream */ - if (outBuff.pos) { - size_t const sizeCheck = fwrite(ress.dstBuffer, 1, outBuff.pos, dstFile); - if (sizeCheck!=outBuff.pos) EXM_THROW(25, "Write error : cannot write compressed block into %s", dstFileName); - compressedfilesize += outBuff.pos; - } } } + /* Write compressed stream */ + if (outBuff.pos) { + size_t const sizeCheck = fwrite(ress.dstBuffer, 1, outBuff.pos, dstFile); + if (sizeCheck!=outBuff.pos) + EXM_THROW(25, "Write error : cannot write compressed block into %s", dstFileName); + compressedfilesize += outBuff.pos; + } } if (g_nbThreads > 1) { - if (!fileSize) DISPLAYUPDATE(2, "\rRead : %u MB", (U32)(readsize>>20)) - else DISPLAYUPDATE(2, "\rRead : %u / %u MB", (U32)(readsize>>20), (U32)(fileSize>>20)); + if (!fileSize) + DISPLAYUPDATE(2, "\rRead : %u MB", (U32)(readsize>>20)) + else + DISPLAYUPDATE(2, "\rRead : %u / %u MB", + (U32)(readsize>>20), (U32)(fileSize>>20)); } else { - if (!fileSize) DISPLAYUPDATE(2, "\rRead : %u MB ==> %.2f%%", (U32)(readsize>>20), (double)compressedfilesize/readsize*100) - else DISPLAYUPDATE(2, "\rRead : %u / %u MB ==> %.2f%%", (U32)(readsize>>20), (U32)(fileSize>>20), (double)compressedfilesize/readsize*100); + if (!fileSize) + DISPLAYUPDATE(2, "\rRead : %u MB ==> %.2f%%", + (U32)(readsize>>20), + (double)compressedfilesize/readsize*100) + else + DISPLAYUPDATE(2, "\rRead : %u / %u MB ==> %.2f%%", + (U32)(readsize>>20), (U32)(fileSize>>20), + (double)compressedfilesize/readsize*100); } } @@ -692,12 +813,18 @@ static int FIO_compressFilename_internal(cRess_t ress, { size_t result = 1; while (result!=0) { /* note : is there any possibility of endless loop ? */ ZSTD_outBuffer outBuff = { ress.dstBuffer, ress.dstBufferSize, 0 }; -#ifdef ZSTD_MULTITHREAD +#ifdef ZSTD_NEWAPI + ZSTD_inBuffer inBuff = { NULL, 0, 0}; + result = ZSTD_compress_generic(ress.cctx, + &outBuff, &inBuff, ZSTD_e_end); +#elif defined(ZSTD_MULTITHREAD) result = ZSTDMT_endStream(ress.cctx, &outBuff); #else result = ZSTD_endStream(ress.cctx, &outBuff); #endif - if (ZSTD_isError(result)) EXM_THROW(26, "Compression error during frame end : %s", ZSTD_getErrorName(result)); + if (ZSTD_isError(result)) + EXM_THROW(26, "Compression error during frame end : %s", + ZSTD_getErrorName(result)); { size_t const sizeCheck = fwrite(ress.dstBuffer, 1, outBuff.pos, dstFile); if (sizeCheck!=outBuff.pos) EXM_THROW(27, "Write error : cannot write frame end into %s", dstFileName); } compressedfilesize += outBuff.pos; @@ -722,7 +849,8 @@ finish: * 1 : missing or pb opening srcFileName */ static int FIO_compressFilename_srcFile(cRess_t ress, - const char* dstFileName, const char* srcFileName, int compressionLevel) + const char* dstFileName, const char* srcFileName, + int compressionLevel) { int result; @@ -751,7 +879,9 @@ static int FIO_compressFilename_srcFile(cRess_t ress, * 1 : pb */ static int FIO_compressFilename_dstFile(cRess_t ress, - const char* dstFileName, const char* srcFileName, int compressionLevel) + const char* dstFileName, + const char* srcFileName, + int compressionLevel) { int result; stat_t statbuf; @@ -760,12 +890,20 @@ static int FIO_compressFilename_dstFile(cRess_t ress, ress.dstFile = FIO_openDstFile(dstFileName); if (ress.dstFile==NULL) return 1; /* could not open dstFileName */ - if (strcmp (srcFileName, stdinmark) && UTIL_getFileStat(srcFileName, &statbuf)) stat_result = 1; + if (strcmp (srcFileName, stdinmark) && UTIL_getFileStat(srcFileName, &statbuf)) + stat_result = 1; result = FIO_compressFilename_srcFile(ress, dstFileName, srcFileName, compressionLevel); - if (fclose(ress.dstFile)) { DISPLAYLEVEL(1, "zstd: %s: %s \n", dstFileName, strerror(errno)); result=1; } /* error closing dstFile */ - if (result!=0) { if (remove(dstFileName)) EXM_THROW(1, "zstd: %s: %s", dstFileName, strerror(errno)); } /* remove operation artefact */ - else if (strcmp (dstFileName, stdoutmark) && stat_result) UTIL_setFileStat(dstFileName, &statbuf); + if (fclose(ress.dstFile)) { /* error closing dstFile */ + DISPLAYLEVEL(1, "zstd: %s: %s \n", dstFileName, strerror(errno)); + result=1; + } + if (result!=0) { /* remove operation artefact */ + if (remove(dstFileName)) + EXM_THROW(1, "zstd: %s: %s", dstFileName, strerror(errno)); + } + else if (strcmp (dstFileName, stdoutmark) && stat_result) + UTIL_setFileStat(dstFileName, &statbuf); return result; } @@ -775,9 +913,9 @@ int FIO_compressFilename(const char* dstFileName, const char* srcFileName, { clock_t const start = clock(); U64 const srcSize = UTIL_getFileSize(srcFileName); - int const regFile = UTIL_isRegFile(srcFileName); + int const isRegularFile = UTIL_isRegularFile(srcFileName); - cRess_t const ress = FIO_createCResources(dictFileName, compressionLevel, srcSize, regFile, comprParams); + cRess_t const ress = FIO_createCResources(dictFileName, compressionLevel, srcSize, isRegularFile, comprParams); int const result = FIO_compressFilename_dstFile(ress, dstFileName, srcFileName, compressionLevel); double const seconds = (double)(clock() - start) / CLOCKS_PER_SEC; @@ -787,6 +925,230 @@ int FIO_compressFilename(const char* dstFileName, const char* srcFileName, return result; } +typedef struct { + int numActualFrames; + int numSkippableFrames; + unsigned long long decompressedSize; + int decompUnavailable; + unsigned long long compressedSize; + int usesCheck; +} fileInfo_t; + +/* + * Reads information from file, stores in *info + * if successful, returns 0, returns 1 for frame analysis error, returns 2 for file not compressed with zstd + * returns 3 for cases in which file could not be opened. + */ +static int getFileInfo(fileInfo_t* info, const char* inFileName){ + int detectError = 0; + FILE* const srcFile = FIO_openSrcFile(inFileName); + if (srcFile == NULL) { + DISPLAY("Error: could not open source file %s\n", inFileName); + return 3; + } + info->compressedSize = (unsigned long long)UTIL_getFileSize(inFileName); + /* begin analyzing frame */ + for ( ; ; ) { + BYTE headerBuffer[ZSTD_FRAMEHEADERSIZE_MAX]; + size_t const numBytesRead = fread(headerBuffer, 1, sizeof(headerBuffer), srcFile); + if (numBytesRead < ZSTD_frameHeaderSize_min) { + if (feof(srcFile) && numBytesRead == 0 && info->compressedSize > 0) { + break; + } + else if (feof(srcFile)) { + DISPLAY("Error: reached end of file with incomplete frame\n"); + detectError = 2; + break; + } + else { + DISPLAY("Error: did not reach end of file but ran out of frames\n"); + detectError = 1; + break; + } + } + { + U32 const magicNumber = MEM_readLE32(headerBuffer); + if (magicNumber == ZSTD_MAGICNUMBER) { + U64 const frameContentSize = ZSTD_getFrameContentSize(headerBuffer, numBytesRead); + if (frameContentSize == ZSTD_CONTENTSIZE_ERROR || frameContentSize == ZSTD_CONTENTSIZE_UNKNOWN) { + info->decompUnavailable = 1; + } + else { + info->decompressedSize += frameContentSize; + } + { + /* move to the end of the frame header */ + size_t const headerSize = ZSTD_frameHeaderSize(headerBuffer, numBytesRead); + if (ZSTD_isError(headerSize)) { + DISPLAY("Error: could not determine frame header size\n"); + detectError = 1; + break; + } + { + int const ret = fseek(srcFile, ((long)headerSize)-((long)numBytesRead), SEEK_CUR); + if (ret != 0) { + DISPLAY("Error: could not move to end of frame header\n"); + detectError = 1; + break; + } + } + } + + /* skip the rest of the blocks in the frame */ + { + int lastBlock = 0; + do { + BYTE blockHeaderBuffer[3]; + U32 blockHeader; + int blockSize; + size_t const readBytes = fread(blockHeaderBuffer, 1, 3, srcFile); + if (readBytes != 3) { + DISPLAY("There was a problem reading the block header\n"); + detectError = 1; + break; + } + blockHeader = MEM_readLE24(blockHeaderBuffer); + lastBlock = blockHeader & 1; + blockSize = blockHeader >> 3; + { + int const ret = fseek(srcFile, blockSize, SEEK_CUR); + if (ret != 0) { + DISPLAY("Error: could not skip to end of block\n"); + detectError = 1; + break; + } + } + } while (lastBlock != 1); + + if (detectError) { + break; + } + } + { + /* check if checksum is used */ + BYTE const frameHeaderDescriptor = headerBuffer[4]; + int const contentChecksumFlag = (frameHeaderDescriptor & (1 << 2)) >> 2; + if (contentChecksumFlag) { + int const ret = fseek(srcFile, 4, SEEK_CUR); + info->usesCheck = 1; + if (ret != 0) { + DISPLAY("Error: could not skip past checksum\n"); + detectError = 1; + break; + } + } + } + info->numActualFrames++; + } + else if (magicNumber == ZSTD_MAGIC_SKIPPABLE_START) { + BYTE frameSizeBuffer[4]; + size_t const readBytes = fread(frameSizeBuffer, 1, 4, srcFile); + if (readBytes != 4) { + DISPLAY("There was an error reading skippable frame size"); + detectError = 1; + break; + } + { + U32 const frameSize = MEM_readLE32(frameSizeBuffer); + int const ret = LONG_SEEK(srcFile, frameSize, SEEK_CUR); + if (ret != 0) { + DISPLAY("Error: could not find end of skippable frame\n"); + detectError = 1; + break; + } + } + info->numSkippableFrames++; + } + else { + detectError = 2; + break; + } + } + } + fclose(srcFile); + return detectError; +} + +static void displayInfo(const char* inFileName, fileInfo_t* info, int displayLevel){ + double const compressedSizeMB = (double)info->compressedSize/(1 MB); + double const decompressedSizeMB = (double)info->decompressedSize/(1 MB); + double const ratio = (info->compressedSize == 0) ? 0 : ((double)info->decompressedSize)/info->compressedSize; + const char* const checkString = (info->usesCheck ? "XXH64" : "None"); + if (displayLevel <= 2) { + if (!info->decompUnavailable) { + DISPLAYOUT("Skippable Non-Skippable Compressed Uncompressed Ratio Check Filename\n"); + DISPLAYOUT("%9d %13d %7.2f MB %9.2f MB %5.3f %5s %s\n", + info->numSkippableFrames, info->numActualFrames, compressedSizeMB, decompressedSizeMB, + ratio, checkString, inFileName); + } + else { + DISPLAYOUT("Skippable Non-Skippable Compressed Check Filename\n"); + DISPLAYOUT("%9d %13d %7.2f MB %5s %s\n", + info->numSkippableFrames, info->numActualFrames, compressedSizeMB, checkString, inFileName); + } + } + else{ + DISPLAYOUT("# Zstandard Frames: %d\n", info->numActualFrames); + DISPLAYOUT("# Skippable Frames: %d\n", info->numSkippableFrames); + DISPLAYOUT("Compressed Size: %.2f MB (%llu B)\n", compressedSizeMB, info->compressedSize); + if (!info->decompUnavailable) { + DISPLAYOUT("Decompressed Size: %.2f MB (%llu B)\n", decompressedSizeMB, info->decompressedSize); + DISPLAYOUT("Ratio: %.4f\n", ratio); + } + DISPLAYOUT("Check: %s\n", checkString); + DISPLAYOUT("\n"); + } +} + + +static int FIO_listFile(const char* inFileName, int displayLevel, unsigned fileNo, unsigned numFiles){ + /* initialize info to avoid warnings */ + fileInfo_t info; + memset(&info, 0, sizeof(info)); + DISPLAYOUT("%s (%u/%u):\n", inFileName, fileNo, numFiles); + { + int const error = getFileInfo(&info, inFileName); + if (error == 1) { + /* display error, but provide output */ + DISPLAY("An error occurred with getting file info\n"); + } + else if (error == 2) { + DISPLAYOUT("File %s not compressed with zstd\n", inFileName); + if (displayLevel > 2) { + DISPLAYOUT("\n"); + } + return 1; + } + else if (error == 3) { + /* error occurred with opening the file */ + if (displayLevel > 2) { + DISPLAYOUT("\n"); + } + return 1; + } + displayInfo(inFileName, &info, displayLevel); + return error; + } +} + +int FIO_listMultipleFiles(unsigned numFiles, const char** filenameTable, int displayLevel){ + if (numFiles == 0) { + DISPLAYOUT("No files given\n"); + return 0; + } + DISPLAYOUT("===========================================\n"); + DISPLAYOUT("Printing information about compressed files\n"); + DISPLAYOUT("===========================================\n"); + DISPLAYOUT("Number of files listed: %u\n", numFiles); + { + int error = 0; + unsigned u; + for (u=0; u srcBufferLoaded bytes already loaded and stored within buffer */ - { size_t const sizeCheck = fwrite(buffer, 1, alreadyLoaded, foutput); - if (sizeCheck != alreadyLoaded) EXM_THROW(50, "Pass-through write error"); } + { size_t const sizeCheck = fwrite(buffer, 1, alreadyLoaded, foutput); + if (sizeCheck != alreadyLoaded) { + DISPLAYLEVEL(1, "Pass-through write error \n"); + return 1; + } } while (readFromInput) { readFromInput = fread(buffer, 1, blockSize, finput); @@ -983,16 +1361,19 @@ static unsigned FIO_passThrough(FILE* foutput, FILE* finput, void* buffer, size_ /** FIO_decompressFrame() : - @return : size of decoded frame + * @return : size of decoded zstd frame, or an error code */ -unsigned long long FIO_decompressFrame(dRess_t* ress, +#define FIO_ERROR_FRAME_DECODING ((unsigned long long)(-2)) +unsigned long long FIO_decompressZstdFrame(dRess_t* ress, FILE* finput, + const char* srcFileName, U64 alreadyDecoded) { U64 frameSize = 0; U32 storedSkips = 0; ZSTD_resetDStream(ress->dctx); + if (strlen(srcFileName)>20) srcFileName += strlen(srcFileName)-20; /* display last 20 characters */ /* Header loading (optional, saves one loop) */ { size_t const toRead = 9; @@ -1005,12 +1386,17 @@ unsigned long long FIO_decompressFrame(dRess_t* ress, ZSTD_inBuffer inBuff = { ress->srcBuffer, ress->srcBufferLoaded, 0 }; ZSTD_outBuffer outBuff= { ress->dstBuffer, ress->dstBufferSize, 0 }; size_t const readSizeHint = ZSTD_decompressStream(ress->dctx, &outBuff, &inBuff); - if (ZSTD_isError(readSizeHint)) EXM_THROW(36, "Decoding error : %s", ZSTD_getErrorName(readSizeHint)); + if (ZSTD_isError(readSizeHint)) { + DISPLAYLEVEL(1, "%s : Decoding error (36) : %s \n", + srcFileName, ZSTD_getErrorName(readSizeHint)); + return FIO_ERROR_FRAME_DECODING; + } /* Write block */ storedSkips = FIO_fwriteSparse(ress->dstFile, ress->dstBuffer, outBuff.pos, storedSkips); frameSize += outBuff.pos; - DISPLAYUPDATE(2, "\rDecoded : %u MB... ", (U32)((alreadyDecoded+frameSize)>>20) ); + DISPLAYUPDATE(2, "\r%-20.20s : %u MB... ", + srcFileName, (U32)((alreadyDecoded+frameSize)>>20) ); if (inBuff.pos > 0) { memmove(ress->srcBuffer, (char*)ress->srcBuffer + inBuff.pos, inBuff.size - inBuff.pos); @@ -1018,14 +1404,22 @@ unsigned long long FIO_decompressFrame(dRess_t* ress, } if (readSizeHint == 0) break; /* end of frame */ - if (inBuff.size != inBuff.pos) EXM_THROW(37, "Decoding error : should consume entire input"); + if (inBuff.size != inBuff.pos) { + DISPLAYLEVEL(1, "%s : Decoding error (37) : should consume entire input \n", + srcFileName); + return FIO_ERROR_FRAME_DECODING; + } /* Fill input buffer */ { size_t const toRead = MIN(readSizeHint, ress->srcBufferSize); /* support large skippable frames */ if (ress->srcBufferLoaded < toRead) - ress->srcBufferLoaded += fread(((char*)ress->srcBuffer) + ress->srcBufferLoaded, 1, toRead - ress->srcBufferLoaded, finput); - if (ress->srcBufferLoaded < toRead) EXM_THROW(39, "Read error : premature end"); - } } + ress->srcBufferLoaded += fread((char*)ress->srcBuffer + ress->srcBufferLoaded, + 1, toRead - ress->srcBufferLoaded, finput); + if (ress->srcBufferLoaded < toRead) { + DISPLAYLEVEL(1, "%s : Read error (39) : premature end \n", + srcFileName); + return FIO_ERROR_FRAME_DECODING; + } } } FIO_fwriteSparseEnd(ress->dstFile, storedSkips); @@ -1034,19 +1428,22 @@ unsigned long long FIO_decompressFrame(dRess_t* ress, #ifdef ZSTD_GZDECOMPRESS -static unsigned long long FIO_decompressGzFrame(dRess_t* ress, FILE* srcFile, const char* srcFileName) +static unsigned long long FIO_decompressGzFrame(dRess_t* ress, + FILE* srcFile, const char* srcFileName) { unsigned long long outFileSize = 0; z_stream strm; int flush = Z_NO_FLUSH; - int ret; + int decodingError = 0; strm.zalloc = Z_NULL; strm.zfree = Z_NULL; strm.opaque = Z_NULL; strm.next_in = 0; strm.avail_in = 0; - if (inflateInit2(&strm, 15 /* maxWindowLogSize */ + 16 /* gzip only */) != Z_OK) return 0; /* see http://www.zlib.net/manual.html */ + /* see http://www.zlib.net/manual.html */ + if (inflateInit2(&strm, 15 /* maxWindowLogSize */ + 16 /* gzip only */) != Z_OK) + return FIO_ERROR_FRAME_DECODING; strm.next_out = (Bytef*)ress->dstBuffer; strm.avail_out = (uInt)ress->dstBufferSize; @@ -1054,6 +1451,7 @@ static unsigned long long FIO_decompressGzFrame(dRess_t* ress, FILE* srcFile, co strm.next_in = (z_const unsigned char*)ress->srcBuffer; for ( ; ; ) { + int ret; if (strm.avail_in == 0) { ress->srcBufferLoaded = fread(ress->srcBuffer, 1, ress->srcBufferSize, srcFile); if (ress->srcBufferLoaded == 0) flush = Z_FINISH; @@ -1061,11 +1459,20 @@ static unsigned long long FIO_decompressGzFrame(dRess_t* ress, FILE* srcFile, co strm.avail_in = (uInt)ress->srcBufferLoaded; } ret = inflate(&strm, flush); - if (ret == Z_BUF_ERROR) EXM_THROW(39, "zstd: %s: premature end", srcFileName); - if (ret != Z_OK && ret != Z_STREAM_END) { DISPLAY("zstd: %s: inflate error %d \n", srcFileName, ret); return 0; } + if (ret == Z_BUF_ERROR) { + DISPLAYLEVEL(1, "zstd: %s: premature gz end \n", srcFileName); + decodingError = 1; break; + } + if (ret != Z_OK && ret != Z_STREAM_END) { + DISPLAYLEVEL(1, "zstd: %s: inflate error %d \n", srcFileName, ret); + decodingError = 1; break; + } { size_t const decompBytes = ress->dstBufferSize - strm.avail_out; if (decompBytes) { - if (fwrite(ress->dstBuffer, 1, decompBytes, ress->dstFile) != decompBytes) EXM_THROW(31, "Write error : cannot write to output file"); + if (fwrite(ress->dstBuffer, 1, decompBytes, ress->dstFile) != decompBytes) { + DISPLAYLEVEL(1, "zstd: %s \n", strerror(errno)); + decodingError = 1; break; + } outFileSize += decompBytes; strm.next_out = (Bytef*)ress->dstBuffer; strm.avail_out = (uInt)ress->dstBufferSize; @@ -1074,11 +1481,15 @@ static unsigned long long FIO_decompressGzFrame(dRess_t* ress, FILE* srcFile, co if (ret == Z_STREAM_END) break; } - if (strm.avail_in > 0) memmove(ress->srcBuffer, strm.next_in, strm.avail_in); + if (strm.avail_in > 0) + memmove(ress->srcBuffer, strm.next_in, strm.avail_in); ress->srcBufferLoaded = strm.avail_in; - ret = inflateEnd(&strm); - if (ret != Z_OK) EXM_THROW(32, "zstd: %s: inflateEnd error %d", srcFileName, ret); - return outFileSize; + if ( (inflateEnd(&strm) != Z_OK) /* release resources ; error detected */ + && (decodingError==0) ) { + DISPLAYLEVEL(1, "zstd: %s: inflateEnd error \n", srcFileName); + decodingError = 1; + } + return decodingError ? FIO_ERROR_FRAME_DECODING : outFileSize; } #endif @@ -1089,69 +1500,95 @@ static unsigned long long FIO_decompressLzmaFrame(dRess_t* ress, FILE* srcFile, unsigned long long outFileSize = 0; lzma_stream strm = LZMA_STREAM_INIT; lzma_action action = LZMA_RUN; - lzma_ret ret; + lzma_ret initRet; + int decodingError = 0; strm.next_in = 0; strm.avail_in = 0; if (plain_lzma) { - ret = lzma_alone_decoder(&strm, UINT64_MAX); /* LZMA */ + initRet = lzma_alone_decoder(&strm, UINT64_MAX); /* LZMA */ } else { - ret = lzma_stream_decoder(&strm, UINT64_MAX, 0); /* XZ */ + initRet = lzma_stream_decoder(&strm, UINT64_MAX, 0); /* XZ */ } - if (ret != LZMA_OK) EXM_THROW(71, "zstd: %s: lzma_alone_decoder/lzma_stream_decoder error %d", srcFileName, ret); + if (initRet != LZMA_OK) { + DISPLAYLEVEL(1, "zstd: %s: %s error %d \n", + plain_lzma ? "lzma_alone_decoder" : "lzma_stream_decoder", + srcFileName, initRet); + return FIO_ERROR_FRAME_DECODING; + } - strm.next_out = ress->dstBuffer; + strm.next_out = (BYTE*)ress->dstBuffer; strm.avail_out = ress->dstBufferSize; + strm.next_in = (BYTE const*)ress->srcBuffer; strm.avail_in = ress->srcBufferLoaded; - strm.next_in = ress->srcBuffer; for ( ; ; ) { + lzma_ret ret; if (strm.avail_in == 0) { ress->srcBufferLoaded = fread(ress->srcBuffer, 1, ress->srcBufferSize, srcFile); if (ress->srcBufferLoaded == 0) action = LZMA_FINISH; - strm.next_in = ress->srcBuffer; + strm.next_in = (BYTE const*)ress->srcBuffer; strm.avail_in = ress->srcBufferLoaded; } ret = lzma_code(&strm, action); - if (ret == LZMA_BUF_ERROR) EXM_THROW(39, "zstd: %s: premature end", srcFileName); - if (ret != LZMA_OK && ret != LZMA_STREAM_END) { DISPLAY("zstd: %s: lzma_code decoding error %d \n", srcFileName, ret); return 0; } + if (ret == LZMA_BUF_ERROR) { + DISPLAYLEVEL(1, "zstd: %s: premature lzma end \n", srcFileName); + decodingError = 1; break; + } + if (ret != LZMA_OK && ret != LZMA_STREAM_END) { + DISPLAYLEVEL(1, "zstd: %s: lzma_code decoding error %d \n", + srcFileName, ret); + decodingError = 1; break; + } { size_t const decompBytes = ress->dstBufferSize - strm.avail_out; if (decompBytes) { - if (fwrite(ress->dstBuffer, 1, decompBytes, ress->dstFile) != decompBytes) EXM_THROW(31, "Write error : cannot write to output file"); + if (fwrite(ress->dstBuffer, 1, decompBytes, ress->dstFile) != decompBytes) { + DISPLAYLEVEL(1, "zstd: %s \n", strerror(errno)); + decodingError = 1; break; + } outFileSize += decompBytes; - strm.next_out = ress->dstBuffer; + strm.next_out = (BYTE*)ress->dstBuffer; strm.avail_out = ress->dstBufferSize; - } - } + } } if (ret == LZMA_STREAM_END) break; } - if (strm.avail_in > 0) memmove(ress->srcBuffer, strm.next_in, strm.avail_in); + if (strm.avail_in > 0) + memmove(ress->srcBuffer, strm.next_in, strm.avail_in); ress->srcBufferLoaded = strm.avail_in; lzma_end(&strm); - return outFileSize; + return decodingError ? FIO_ERROR_FRAME_DECODING : outFileSize; } #endif #ifdef ZSTD_LZ4DECOMPRESS -static unsigned long long FIO_decompressLz4Frame(dRess_t* ress, FILE* srcFile, const char* srcFileName) +static unsigned long long FIO_decompressLz4Frame(dRess_t* ress, + FILE* srcFile, const char* srcFileName) { unsigned long long filesize = 0; LZ4F_errorCode_t nextToLoad; LZ4F_decompressionContext_t dCtx; LZ4F_errorCode_t const errorCode = LZ4F_createDecompressionContext(&dCtx, LZ4F_VERSION); + int decodingError = 0; - if (LZ4F_isError(errorCode)) EXM_THROW(61, "zstd: failed to create lz4 decompression context"); + if (LZ4F_isError(errorCode)) { + DISPLAYLEVEL(1, "zstd: failed to create lz4 decompression context \n"); + return FIO_ERROR_FRAME_DECODING; + } /* Init feed with magic number (already consumed from FILE* sFile) */ { size_t inSize = 4; size_t outSize= 0; MEM_writeLE32(ress->srcBuffer, LZ4_MAGICNUMBER); nextToLoad = LZ4F_decompress(dCtx, ress->dstBuffer, &outSize, ress->srcBuffer, &inSize, NULL); - if (LZ4F_isError(nextToLoad)) EXM_THROW(62, "zstd: %s: lz4 header error : %s", srcFileName, LZ4F_getErrorName(nextToLoad)); - } + if (LZ4F_isError(nextToLoad)) { + DISPLAYLEVEL(1, "zstd: %s: lz4 header error : %s \n", + srcFileName, LZ4F_getErrorName(nextToLoad)); + LZ4F_freeDecompressionContext(dCtx); + return FIO_ERROR_FRAME_DECODING; + } } /* Main Loop */ for (;nextToLoad;) { @@ -1169,12 +1606,19 @@ static unsigned long long FIO_decompressLz4Frame(dRess_t* ress, FILE* srcFile, c size_t remaining = readSize - pos; decodedBytes = ress->dstBufferSize; nextToLoad = LZ4F_decompress(dCtx, ress->dstBuffer, &decodedBytes, (char*)(ress->srcBuffer)+pos, &remaining, NULL); - if (LZ4F_isError(nextToLoad)) EXM_THROW(66, "zstd: %s: decompression error : %s", srcFileName, LZ4F_getErrorName(nextToLoad)); + if (LZ4F_isError(nextToLoad)) { + DISPLAYLEVEL(1, "zstd: %s: lz4 decompression error : %s \n", + srcFileName, LZ4F_getErrorName(nextToLoad)); + decodingError = 1; break; + } pos += remaining; /* Write Block */ if (decodedBytes) { - if (fwrite(ress->dstBuffer, 1, decodedBytes, ress->dstFile) != decodedBytes) EXM_THROW(63, "Write error : cannot write to output file"); + if (fwrite(ress->dstBuffer, 1, decodedBytes, ress->dstFile) != decodedBytes) { + DISPLAYLEVEL(1, "zstd: %s \n", strerr(errno)); + decodingError = 1; break; + } filesize += decodedBytes; DISPLAYUPDATE(2, "\rDecompressed : %u MB ", (unsigned)(filesize>>20)); } @@ -1183,19 +1627,106 @@ static unsigned long long FIO_decompressLz4Frame(dRess_t* ress, FILE* srcFile, c } } /* can be out because readSize == 0, which could be an fread() error */ - if (ferror(srcFile)) EXM_THROW(67, "zstd: %s: read error", srcFileName); + if (ferror(srcFile)) { + DISPLAYLEVEL(1, "zstd: %s: read error \n", srcFileName); + decodingError=1; + } - if (nextToLoad!=0) EXM_THROW(68, "zstd: %s: unfinished stream", srcFileName); + if (nextToLoad!=0) { + DISPLAYLEVEL(1, "zstd: %s: unfinished lz4 stream \n", srcFileName); + decodingError=1; + } LZ4F_freeDecompressionContext(dCtx); - ress->srcBufferLoaded = 0; /* LZ4F will go to the frame boundary */ + ress->srcBufferLoaded = 0; /* LZ4F will reach exact frame boundary */ - return filesize; + return decodingError ? FIO_ERROR_FRAME_DECODING : filesize; } #endif +/** FIO_decompressFrames() : + * Find and decode frames inside srcFile + * srcFile presumed opened and valid + * @return : 0 : OK + * 1 : error + */ +static int FIO_decompressFrames(dRess_t ress, FILE* srcFile, + const char* dstFileName, const char* srcFileName) +{ + unsigned readSomething = 0; + unsigned long long filesize = 0; + assert(srcFile != NULL); + + /* for each frame */ + for ( ; ; ) { + /* check magic number -> version */ + size_t const toRead = 4; + const BYTE* const buf = (const BYTE*)ress.srcBuffer; + if (ress.srcBufferLoaded < toRead) /* load up to 4 bytes for header */ + ress.srcBufferLoaded += fread((char*)ress.srcBuffer + ress.srcBufferLoaded, + (size_t)1, toRead - ress.srcBufferLoaded, srcFile); + if (ress.srcBufferLoaded==0) { + if (readSomething==0) { /* srcFile is empty (which is invalid) */ + DISPLAYLEVEL(1, "zstd: %s: unexpected end of file \n", srcFileName); + return 1; + } /* else, just reached frame boundary */ + break; /* no more input */ + } + readSomething = 1; /* there is at least 1 byte in srcFile */ + if (ress.srcBufferLoaded < toRead) { + DISPLAYLEVEL(1, "zstd: %s: unknown header \n", srcFileName); + return 1; + } + if (ZSTD_isFrame(buf, ress.srcBufferLoaded)) { + unsigned long long const frameSize = FIO_decompressZstdFrame(&ress, srcFile, srcFileName, filesize); + if (frameSize == FIO_ERROR_FRAME_DECODING) return 1; + filesize += frameSize; + } else if (buf[0] == 31 && buf[1] == 139) { /* gz magic number */ +#ifdef ZSTD_GZDECOMPRESS + unsigned long long const frameSize = FIO_decompressGzFrame(&ress, srcFile, srcFileName); + if (frameSize == FIO_ERROR_FRAME_DECODING) return 1; + filesize += frameSize; +#else + DISPLAYLEVEL(1, "zstd: %s: gzip file cannot be uncompressed (zstd compiled without HAVE_ZLIB) -- ignored \n", srcFileName); + return 1; +#endif + } else if ((buf[0] == 0xFD && buf[1] == 0x37) /* xz magic number */ + || (buf[0] == 0x5D && buf[1] == 0x00)) { /* lzma header (no magic number) */ +#ifdef ZSTD_LZMADECOMPRESS + unsigned long long const frameSize = FIO_decompressLzmaFrame(&ress, srcFile, srcFileName, buf[0] != 0xFD); + if (frameSize == FIO_ERROR_FRAME_DECODING) return 1; + filesize += frameSize; +#else + DISPLAYLEVEL(1, "zstd: %s: xz/lzma file cannot be uncompressed (zstd compiled without HAVE_LZMA) -- ignored \n", srcFileName); + return 1; +#endif + } else if (MEM_readLE32(buf) == LZ4_MAGICNUMBER) { +#ifdef ZSTD_LZ4DECOMPRESS + unsigned long long const frameSize = FIO_decompressLz4Frame(&ress, srcFile, srcFileName); + if (frameSize == FIO_ERROR_FRAME_DECODING) return 1; + filesize += frameSize; +#else + DISPLAYLEVEL(1, "zstd: %s: lz4 file cannot be uncompressed (zstd compiled without HAVE_LZ4) -- ignored \n", srcFileName); + return 1; +#endif + } else if ((g_overwrite) && !strcmp (dstFileName, stdoutmark)) { /* pass-through mode */ + return FIO_passThrough(ress.dstFile, srcFile, + ress.srcBuffer, ress.srcBufferSize, ress.srcBufferLoaded); + } else { + DISPLAYLEVEL(1, "zstd: %s: unsupported format \n", srcFileName); + return 1; + } } /* for each frame */ + + /* Final Status */ + DISPLAYLEVEL(2, "\r%79s\r", ""); + DISPLAYLEVEL(2, "%-20s: %llu bytes \n", srcFileName, filesize); + + return 0; +} + + /** FIO_decompressSrcFile() : Decompression `srcFileName` into `ress.dstFile` @return : 0 : OK @@ -1204,8 +1735,7 @@ static unsigned long long FIO_decompressLz4Frame(dRess_t* ress, FILE* srcFile, c static int FIO_decompressSrcFile(dRess_t ress, const char* dstFileName, const char* srcFileName) { FILE* srcFile; - unsigned readSomething = 0; - unsigned long long filesize = 0; + int result; if (UTIL_isDirectory(srcFileName)) { DISPLAYLEVEL(1, "zstd: %s is a directory -- ignored \n", srcFileName); @@ -1215,70 +1745,22 @@ static int FIO_decompressSrcFile(dRess_t ress, const char* dstFileName, const ch srcFile = FIO_openSrcFile(srcFileName); if (srcFile==NULL) return 1; - /* for each frame */ - for ( ; ; ) { - /* check magic number -> version */ - size_t const toRead = 4; - const BYTE* buf = (const BYTE*)ress.srcBuffer; - if (ress.srcBufferLoaded < toRead) - ress.srcBufferLoaded += fread((char*)ress.srcBuffer + ress.srcBufferLoaded, (size_t)1, toRead - ress.srcBufferLoaded, srcFile); - if (ress.srcBufferLoaded==0) { - if (readSomething==0) { DISPLAY("zstd: %s: unexpected end of file \n", srcFileName); fclose(srcFile); return 1; } /* srcFileName is empty */ - break; /* no more input */ - } - readSomething = 1; /* there is at least >= 4 bytes in srcFile */ - if (ress.srcBufferLoaded < toRead) { DISPLAY("zstd: %s: unknown header \n", srcFileName); fclose(srcFile); return 1; } /* srcFileName is empty */ - if (buf[0] == 31 && buf[1] == 139) { /* gz magic number */ -#ifdef ZSTD_GZDECOMPRESS - unsigned long long const result = FIO_decompressGzFrame(&ress, srcFile, srcFileName); - if (result == 0) return 1; - filesize += result; -#else - DISPLAYLEVEL(1, "zstd: %s: gzip file cannot be uncompressed (zstd compiled without ZSTD_GZDECOMPRESS) -- ignored \n", srcFileName); - return 1; -#endif - } else if ((buf[0] == 0xFD && buf[1] == 0x37) /* xz magic number */ - || (buf[0] == 0x5D && buf[1] == 0x00)) { /* lzma header (no magic number) */ -#ifdef ZSTD_LZMADECOMPRESS - unsigned long long const result = FIO_decompressLzmaFrame(&ress, srcFile, srcFileName, buf[0] != 0xFD); - if (result == 0) return 1; - filesize += result; -#else - DISPLAYLEVEL(1, "zstd: %s: xz/lzma file cannot be uncompressed (zstd compiled without ZSTD_LZMADECOMPRESS) -- ignored \n", srcFileName); - return 1; -#endif - } else if (MEM_readLE32(buf) == LZ4_MAGICNUMBER) { -#ifdef ZSTD_LZ4DECOMPRESS - unsigned long long const result = FIO_decompressLz4Frame(&ress, srcFile, srcFileName); - if (result == 0) return 1; - filesize += result; -#else - DISPLAYLEVEL(1, "zstd: %s: lz4 file cannot be uncompressed (zstd compiled without ZSTD_LZ4DECOMPRESS) -- ignored \n", srcFileName); - return 1; -#endif - } else { - if (!ZSTD_isFrame(ress.srcBuffer, toRead)) { - if ((g_overwrite) && !strcmp (dstFileName, stdoutmark)) { /* pass-through mode */ - unsigned const result = FIO_passThrough(ress.dstFile, srcFile, ress.srcBuffer, ress.srcBufferSize, ress.srcBufferLoaded); - if (fclose(srcFile)) EXM_THROW(32, "zstd: %s close error", srcFileName); /* error should never happen */ - return result; - } else { - DISPLAYLEVEL(1, "zstd: %s: not in zstd format \n", srcFileName); - fclose(srcFile); - return 1; - } } - filesize += FIO_decompressFrame(&ress, srcFile, filesize); - } - } - - /* Final Status */ - DISPLAYLEVEL(2, "\r%79s\r", ""); - DISPLAYLEVEL(2, "%-20s: %llu bytes \n", srcFileName, filesize); + result = FIO_decompressFrames(ress, srcFile, dstFileName, srcFileName); /* Close file */ - if (fclose(srcFile)) EXM_THROW(33, "zstd: %s close error", srcFileName); /* error should never happen */ - if (g_removeSrcFile /* --rm */ && strcmp(srcFileName, stdinmark)) { if (remove(srcFileName)) EXM_THROW(34, "zstd: %s: %s", srcFileName, strerror(errno)); }; - return 0; + if (fclose(srcFile)) { + DISPLAYLEVEL(1, "zstd: %s: %s \n", srcFileName, strerror(errno)); /* error should never happen */ + return 1; + } + if ( g_removeSrcFile /* --rm */ + && (result==0) /* decompression successful */ + && strcmp(srcFileName, stdinmark) ) /* not stdin */ { + if (remove(srcFileName)) { + /* failed to remove src file */ + DISPLAYLEVEL(1, "zstd: %s: %s \n", srcFileName, strerror(errno)); + return 1; + } } + return result; } @@ -1297,16 +1779,21 @@ static int FIO_decompressDstFile(dRess_t ress, ress.dstFile = FIO_openDstFile(dstFileName); if (ress.dstFile==0) return 1; - if (strcmp (srcFileName, stdinmark) && UTIL_getFileStat(srcFileName, &statbuf)) stat_result = 1; + if (strcmp (srcFileName, stdinmark) && UTIL_getFileStat(srcFileName, &statbuf)) + stat_result = 1; result = FIO_decompressSrcFile(ress, dstFileName, srcFileName); - if (fclose(ress.dstFile)) EXM_THROW(38, "Write error : cannot properly close %s", dstFileName); + if (fclose(ress.dstFile)) { + DISPLAYLEVEL(1, "zstd: %s: %s \n", dstFileName, strerror(errno)); + result = 1; + } - if ( (result != 0) + if ( (result != 0) /* operation failure */ && strcmp(dstFileName, nulmark) /* special case : don't remove() /dev/null (#316) */ - && remove(dstFileName) ) + && remove(dstFileName) /* remove artefact */ ) result=1; /* don't do anything special if remove() fails */ - else if (strcmp (dstFileName, stdoutmark) && stat_result) UTIL_setFileStat(dstFileName, &statbuf); + else if (strcmp (dstFileName, stdoutmark) && stat_result) + UTIL_setFileStat(dstFileName, &statbuf); return result; } @@ -1314,13 +1801,12 @@ static int FIO_decompressDstFile(dRess_t ress, int FIO_decompressFilename(const char* dstFileName, const char* srcFileName, const char* dictFileName) { - int missingFiles = 0; - dRess_t ress = FIO_createDResources(dictFileName); + dRess_t const ress = FIO_createDResources(dictFileName); - missingFiles += FIO_decompressDstFile(ress, dstFileName, srcFileName); + int const decodingError = FIO_decompressDstFile(ress, dstFileName, srcFileName); FIO_freeDResources(ress); - return missingFiles; + return decodingError; } @@ -1333,7 +1819,8 @@ int FIO_decompressMultipleFilenames(const char** srcNamesTable, unsigned nbFiles int missingFiles = 0; dRess_t ress = FIO_createDResources(dictFileName); - if (suffix==NULL) EXM_THROW(70, "zstd: decompression: unknown dst"); /* should never happen */ + if (suffix==NULL) + EXM_THROW(70, "zstd: decompression: unknown dst"); /* should never happen */ if (!strcmp(suffix, stdoutmark) || !strcmp(suffix, nulmark)) { /* special cases : -c or -t */ unsigned u; @@ -1341,19 +1828,22 @@ int FIO_decompressMultipleFilenames(const char** srcNamesTable, unsigned nbFiles if (ress.dstFile == 0) EXM_THROW(71, "cannot open %s", suffix); for (u=0; u 100)\. . +.TP +\fB\-l\fR, \fB\-\-list\fR +Display information related to a zstd compressed file, such as size, ratio, and checksum\. Some of these fields may not be available\. This command can be augmented with the \fB\-v\fR modifier\. +. .SS "Operation modifiers" . .TP @@ -235,8 +239,8 @@ benchmark file(s) using multiple compression levels, from \fB\-b#\fR to \fB\-e#\ minimum evaluation time, in seconds (default : 3s), benchmark mode only . .TP -\fB\-B#\fR -cut file into independent blocks of size # (default: no block) +\fB\-B#\fR, \fB\-\-block\-size=#\fR +cut file(s) into independent blocks of size # (default: no block) . .TP \fB\-\-priority=rt\fR @@ -252,7 +256,7 @@ set process priority to real\-time Specify a strategy used by a match finder\. . .IP -There are 8 strategies numbered from 0 to 7, from faster to stronger: 0=ZSTD_fast, 1=ZSTD_dfast, 2=ZSTD_greedy, 3=ZSTD_lazy, 4=ZSTD_lazy2, 5=ZSTD_btlazy2, 6=ZSTD_btopt, 7=ZSTD_btopt2\. +There are 8 strategies numbered from 1 to 8, from faster to stronger: 1=ZSTD_fast, 2=ZSTD_dfast, 3=ZSTD_greedy, 4=ZSTD_lazy, 5=ZSTD_lazy2, 6=ZSTD_btlazy2, 7=ZSTD_btopt, 8=ZSTD_btultra\. . .TP \fBwindowLog\fR=\fIwlog\fR, \fBwlog\fR=\fIwlog\fR @@ -306,7 +310,7 @@ The minimum \fIslen\fR is 3 and the maximum is 7\. Specify the minimum match length that causes a match finder to stop searching for better matches\. . .IP -A larger minimum match length usually improves compression ratio but decreases compression speed\. This option is only used with strategies ZSTD_btopt and ZSTD_btopt2\. +A larger minimum match length usually improves compression ratio but decreases compression speed\. This option is only used with strategies ZSTD_btopt and ZSTD_btultra\. . .IP The minimum \fItlen\fR is 4 and the maximum is 999\. diff --git a/contrib/zstd/programs/zstd.1.md b/contrib/zstd/programs/zstd.1.md index 118c9f2f8ee5..24e25a2f3af1 100644 --- a/contrib/zstd/programs/zstd.1.md +++ b/contrib/zstd/programs/zstd.1.md @@ -93,6 +93,10 @@ the last one takes effect. * `--train FILEs`: Use FILEs as a training set to create a dictionary. The training set should contain a lot of small files (> 100). +* `-l`, `--list`: + Display information related to a zstd compressed file, such as size, ratio, and checksum. + Some of these fields may not be available. + This command can be augmented with the `-v` modifier. ### Operation modifiers @@ -230,8 +234,8 @@ BENCHMARK benchmark file(s) using multiple compression levels, from `-b#` to `-e#` (inclusive) * `-i#`: minimum evaluation time, in seconds (default : 3s), benchmark mode only -* `-B#`: - cut file into independent blocks of size # (default: no block) +* `-B#`, `--block-size=#`: + cut file(s) into independent blocks of size # (default: no block) * `--priority=rt`: set process priority to real-time @@ -250,9 +254,9 @@ The list of available _options_: - `strategy`=_strat_, `strat`=_strat_: Specify a strategy used by a match finder. - There are 8 strategies numbered from 0 to 7, from faster to stronger: - 0=ZSTD\_fast, 1=ZSTD\_dfast, 2=ZSTD\_greedy, 3=ZSTD\_lazy, - 4=ZSTD\_lazy2, 5=ZSTD\_btlazy2, 6=ZSTD\_btopt, 7=ZSTD\_btopt2. + There are 8 strategies numbered from 1 to 8, from faster to stronger: + 1=ZSTD\_fast, 2=ZSTD\_dfast, 3=ZSTD\_greedy, 4=ZSTD\_lazy, + 5=ZSTD\_lazy2, 6=ZSTD\_btlazy2, 7=ZSTD\_btopt, 8=ZSTD\_btultra. - `windowLog`=_wlog_, `wlog`=_wlog_: Specify the maximum number of bits for a match distance. @@ -304,7 +308,7 @@ The list of available _options_: A larger minimum match length usually improves compression ratio but decreases compression speed. - This option is only used with strategies ZSTD_btopt and ZSTD_btopt2. + This option is only used with strategies ZSTD_btopt and ZSTD_btultra. The minimum _tlen_ is 4 and the maximum is 999. diff --git a/contrib/zstd/programs/zstdcli.c b/contrib/zstd/programs/zstdcli.c index 236338ab7e50..8bd45925bee6 100644 --- a/contrib/zstd/programs/zstdcli.c +++ b/contrib/zstd/programs/zstdcli.c @@ -56,7 +56,9 @@ #define ZSTD_GUNZIP "gunzip" #define ZSTD_GZCAT "gzcat" #define ZSTD_LZMA "lzma" +#define ZSTD_UNLZMA "unlzma" #define ZSTD_XZ "xz" +#define ZSTD_UNXZ "unxz" #define KB *(1 <<10) #define MB *(1 <<20) @@ -117,6 +119,7 @@ static int usage_advanced(const char* programName) DISPLAY( " -v : verbose mode; specify multiple times to increase verbosity\n"); DISPLAY( " -q : suppress warnings; specify twice to suppress errors too\n"); DISPLAY( " -c : force write to standard output, even if it is the console\n"); + DISPLAY( " -l : print information about zstd compressed files.\n"); #ifndef ZSTD_NOCOMPRESS DISPLAY( "--ultra : enable levels beyond %i, up to %i (requires more memory)\n", ZSTDCLI_CLEVEL_MAX, ZSTD_maxCLevel()); #ifdef ZSTD_MULTITHREAD @@ -148,6 +151,7 @@ static int usage_advanced(const char* programName) #endif #endif DISPLAY( " -M# : Set a memory usage limit for decompression \n"); + DISPLAY( "--list : list information about a zstd compressed file \n"); DISPLAY( "-- : All arguments after \"--\" are treated as files \n"); #ifndef ZSTD_NODICT DISPLAY( "\n"); @@ -244,7 +248,7 @@ static unsigned longCommandWArg(const char** stringPtr, const char* longCommand) * @return 1 means that cover parameters were correct * @return 0 in case of malformed parameters */ -static unsigned parseCoverParameters(const char* stringPtr, COVER_params_t* params) +static unsigned parseCoverParameters(const char* stringPtr, ZDICT_cover_params_t* params) { memset(params, 0, sizeof(*params)); for (; ;) { @@ -273,9 +277,9 @@ static unsigned parseLegacyParameters(const char* stringPtr, unsigned* selectivi return 1; } -static COVER_params_t defaultCoverParams(void) +static ZDICT_cover_params_t defaultCoverParams(void) { - COVER_params_t params; + ZDICT_cover_params_t params; memset(¶ms, 0, sizeof(params)); params.d = 8; params.steps = 4; @@ -298,7 +302,7 @@ static unsigned parseCompressionParameters(const char* stringPtr, ZSTD_compressi if (longCommandWArg(&stringPtr, "searchLog=") || longCommandWArg(&stringPtr, "slog=")) { params->searchLog = readU32FromChar(&stringPtr); if (stringPtr[0]==',') { stringPtr++; continue; } else break; } if (longCommandWArg(&stringPtr, "searchLength=") || longCommandWArg(&stringPtr, "slen=")) { params->searchLength = readU32FromChar(&stringPtr); if (stringPtr[0]==',') { stringPtr++; continue; } else break; } if (longCommandWArg(&stringPtr, "targetLength=") || longCommandWArg(&stringPtr, "tlen=")) { params->targetLength = readU32FromChar(&stringPtr); if (stringPtr[0]==',') { stringPtr++; continue; } else break; } - if (longCommandWArg(&stringPtr, "strategy=") || longCommandWArg(&stringPtr, "strat=")) { params->strategy = (ZSTD_strategy)(1 + readU32FromChar(&stringPtr)); if (stringPtr[0]==',') { stringPtr++; continue; } else break; } + if (longCommandWArg(&stringPtr, "strategy=") || longCommandWArg(&stringPtr, "strat=")) { params->strategy = (ZSTD_strategy)(readU32FromChar(&stringPtr)); if (stringPtr[0]==',') { stringPtr++; continue; } else break; } if (longCommandWArg(&stringPtr, "overlapLog=") || longCommandWArg(&stringPtr, "ovlog=")) { g_overlapLog = readU32FromChar(&stringPtr); if (stringPtr[0]==',') { stringPtr++; continue; } else break; } return 0; } @@ -310,7 +314,7 @@ static unsigned parseCompressionParameters(const char* stringPtr, ZSTD_compressi } -typedef enum { zom_compress, zom_decompress, zom_test, zom_bench, zom_train } zstd_operation_mode; +typedef enum { zom_compress, zom_decompress, zom_test, zom_bench, zom_train, zom_list } zstd_operation_mode; #define CLEAN_RETURN(i) { operationResult = (i); goto _end; } @@ -354,7 +358,7 @@ int main(int argCount, const char* argv[]) unsigned fileNamesNb; #endif #ifndef ZSTD_NODICT - COVER_params_t coverParams = defaultCoverParams(); + ZDICT_cover_params_t coverParams = defaultCoverParams(); int cover = 1; #endif @@ -377,7 +381,9 @@ int main(int argCount, const char* argv[]) if (exeNameMatch(programName, ZSTD_GUNZIP)) { operation=zom_decompress; FIO_setRemoveSrcFile(1); } /* behave like gunzip */ if (exeNameMatch(programName, ZSTD_GZCAT)) { operation=zom_decompress; forceStdout=1; FIO_overwriteMode(); outFileName=stdoutmark; g_displayLevel=1; } /* behave like gzcat */ if (exeNameMatch(programName, ZSTD_LZMA)) { suffix = LZMA_EXTENSION; FIO_setCompressionType(FIO_lzmaCompression); FIO_setRemoveSrcFile(1); } /* behave like lzma */ + if (exeNameMatch(programName, ZSTD_UNLZMA)) { operation=zom_decompress; FIO_setCompressionType(FIO_lzmaCompression); FIO_setRemoveSrcFile(1); } /* behave like unlzma */ if (exeNameMatch(programName, ZSTD_XZ)) { suffix = XZ_EXTENSION; FIO_setCompressionType(FIO_xzCompression); FIO_setRemoveSrcFile(1); } /* behave like xz */ + if (exeNameMatch(programName, ZSTD_UNXZ)) { operation=zom_decompress; FIO_setCompressionType(FIO_xzCompression); FIO_setRemoveSrcFile(1); } /* behave like unxz */ memset(&compressionParams, 0, sizeof(compressionParams)); /* command switches */ @@ -401,6 +407,7 @@ int main(int argCount, const char* argv[]) if (argument[1]=='-') { /* long commands (--long-word) */ if (!strcmp(argument, "--")) { nextArgumentsAreFiles=1; continue; } /* only file names allowed from now on */ + if (!strcmp(argument, "--list")) { operation=zom_list; continue; } if (!strcmp(argument, "--compress")) { operation=zom_compress; continue; } if (!strcmp(argument, "--decompress")) { operation=zom_decompress; continue; } if (!strcmp(argument, "--uncompress")) { operation=zom_decompress; continue; } @@ -531,7 +538,7 @@ int main(int argCount, const char* argv[]) argument++; memLimit = readU32FromChar(&argument); break; - + case 'l': operation=zom_list; argument++; break; #ifdef UTIL_HAS_CREATEFILELIST /* recursive */ case 'r': recursive=1; argument++; break; @@ -672,7 +679,10 @@ int main(int argCount, const char* argv[]) } } #endif - + if (operation == zom_list) { + int const ret = FIO_listMultipleFiles(filenameIdx, filenameTable, g_displayLevel); + CLEAN_RETURN(ret); + } /* Check if benchmark is selected */ if (operation==zom_bench) { #ifndef ZSTD_NOBENCH @@ -689,20 +699,20 @@ int main(int argCount, const char* argv[]) /* Check if dictionary builder is selected */ if (operation==zom_train) { #ifndef ZSTD_NODICT + ZDICT_params_t zParams; + zParams.compressionLevel = dictCLevel; + zParams.notificationLevel = g_displayLevel; + zParams.dictID = dictID; if (cover) { int const optimize = !coverParams.k || !coverParams.d; coverParams.nbThreads = nbThreads; - coverParams.compressionLevel = dictCLevel; - coverParams.notificationLevel = g_displayLevel; - coverParams.dictID = dictID; + coverParams.zParams = zParams; operationResult = DiB_trainFromFiles(outFileName, maxDictSize, filenameTable, filenameIdx, NULL, &coverParams, optimize); } else { - ZDICT_params_t dictParams; + ZDICT_legacy_params_t dictParams; memset(&dictParams, 0, sizeof(dictParams)); - dictParams.compressionLevel = dictCLevel; dictParams.selectivityLevel = dictSelect; - dictParams.notificationLevel = g_displayLevel; - dictParams.dictID = dictID; + dictParams.zParams = zParams; operationResult = DiB_trainFromFiles(outFileName, maxDictSize, filenameTable, filenameIdx, &dictParams, NULL, 0); } #endif diff --git a/contrib/zstd/tests/Makefile b/contrib/zstd/tests/Makefile index ea58c0fe5119..82f12887bf9b 100644 --- a/contrib/zstd/tests/Makefile +++ b/contrib/zstd/tests/Makefile @@ -25,17 +25,18 @@ PRGDIR = ../programs PYTHON ?= python3 TESTARTEFACT := versionsTest namespaceTest - -DEBUGFLAGS=-g -DZSTD_DEBUG=1 -CPPFLAGS+= -I$(ZSTDDIR) -I$(ZSTDDIR)/common -I$(ZSTDDIR)/compress \ - -I$(ZSTDDIR)/dictBuilder -I$(ZSTDDIR)/deprecated -I$(PRGDIR) \ - $(DEBUGFLAG) -CFLAGS ?= -O3 -CFLAGS += -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow \ - -Wstrict-aliasing=1 -Wswitch-enum -Wdeclaration-after-statement \ - -Wstrict-prototypes -Wundef -Wformat-security -CFLAGS += $(MOREFLAGS) -FLAGS = $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) +DEBUGLEVEL= 1 +DEBUGFLAGS= -g -DZSTD_DEBUG=$(DEBUGLEVEL) +CPPFLAGS += -I$(ZSTDDIR) -I$(ZSTDDIR)/common -I$(ZSTDDIR)/compress \ + -I$(ZSTDDIR)/dictBuilder -I$(ZSTDDIR)/deprecated -I$(PRGDIR) +CFLAGS ?= -O3 +CFLAGS += -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow \ + -Wstrict-aliasing=1 -Wswitch-enum -Wdeclaration-after-statement \ + -Wstrict-prototypes -Wundef -Wformat-security \ + -Wvla -Wformat=2 -Winit-self -Wfloat-equal -Wwrite-strings \ + -Wredundant-decls +CFLAGS += $(DEBUGFLAGS) $(MOREFLAGS) +FLAGS = $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) ZSTDCOMMON_FILES := $(ZSTDDIR)/common/*.c @@ -92,11 +93,12 @@ zstd-nolegacy: gzstd: $(MAKE) -C $(PRGDIR) $@ -fullbench : $(ZSTD_FILES) $(PRGDIR)/datagen.c fullbench.c - $(CC) $(FLAGS) $^ -o $@$(EXT) - -fullbench32 : $(ZSTD_FILES) $(PRGDIR)/datagen.c fullbench.c - $(CC) -m32 $(FLAGS) $^ -o $@$(EXT) +fullbench32: CPPFLAGS += -m32 +fullbench fullbench32 : CPPFLAGS += $(MULTITHREAD_CPP) +fullbench fullbench32 : LDFLAGS += $(MULTITHREAD_LD) +fullbench fullbench32 : DEBUGFLAGS = # turn off assert() for speed measurements +fullbench fullbench32 : $(ZSTD_FILES) $(PRGDIR)/datagen.c fullbench.c + $(CC) $(FLAGS) $^ -o $@$(EXT) fullbench-lib: $(PRGDIR)/datagen.c fullbench.c $(MAKE) -C $(ZSTDDIR) libzstd.a @@ -157,7 +159,7 @@ zstreamtest-dll : $(ZSTDDIR)/common/xxhash.c $(PRGDIR)/datagen.c zstreamtest.c $(MAKE) -C $(ZSTDDIR) libzstd $(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@$(EXT) -paramgrill : DEBUGFLAG = +paramgrill : DEBUGFLAGS = paramgrill : $(ZSTD_FILES) $(PRGDIR)/datagen.c paramgrill.c $(CC) $(FLAGS) $^ -lm -o $@$(EXT) @@ -178,7 +180,7 @@ legacy : CPPFLAGS+= -I$(ZSTDDIR)/legacy legacy : $(ZSTD_FILES) $(wildcard $(ZSTDDIR)/legacy/*.c) legacy.c $(CC) $(FLAGS) $^ -o $@$(EXT) -decodecorpus : $(filter-out $(ZSTDDIR)/compress/zstd_compress.c, $(wildcard $(ZSTD_FILES))) decodecorpus.c +decodecorpus : $(filter-out $(ZSTDDIR)/compress/zstd_compress.c, $(wildcard $(ZSTD_FILES))) $(ZDICT_FILES) decodecorpus.c $(CC) $(FLAGS) $^ -o $@$(EXT) -lm symbols : symbols.c @@ -222,7 +224,7 @@ clean: ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU OpenBSD FreeBSD NetBSD DragonFly SunOS)) HOST_OS = POSIX -valgrindTest: VALGRIND = valgrind --leak-check=full --error-exitcode=1 +valgrindTest: VALGRIND = valgrind --leak-check=full --show-leak-kinds=all --error-exitcode=1 valgrindTest: zstd datagen fuzzer fullbench @echo "\n ---- valgrind tests : memory analyzer ----" $(VALGRIND) ./datagen -g50M > $(VOID) @@ -270,7 +272,7 @@ endif test32: test-zstd32 test-fullbench32 test-fuzzer32 test-zstream32 -test-all: test test32 valgrindTest +test-all: test test32 valgrindTest test-decodecorpus-cli test-zstd: ZSTD = $(PRGDIR)/zstd test-zstd: zstd zstd-playTests @@ -319,6 +321,8 @@ test-zbuff32: zbufftest32 test-zstream: zstreamtest $(QEMU_SYS) ./zstreamtest $(ZSTREAM_TESTTIME) $(FUZZER_FLAGS) + $(QEMU_SYS) ./zstreamtest --mt $(ZSTREAM_TESTTIME) $(FUZZER_FLAGS) + $(QEMU_SYS) ./zstreamtest --newapi $(ZSTREAM_TESTTIME) $(FUZZER_FLAGS) test-zstream32: zstreamtest32 $(QEMU_SYS) ./zstreamtest32 $(ZSTREAM_TESTTIME) $(FUZZER_FLAGS) @@ -338,6 +342,39 @@ test-legacy: legacy test-decodecorpus: decodecorpus $(QEMU_SYS) ./decodecorpus -t $(DECODECORPUS_TESTTIME) +test-decodecorpus-cli: decodecorpus + @echo "\n ---- decodecorpus basic cli tests ----" + @mkdir testdir + ./decodecorpus -n5 -otestdir -ptestdir + @cd testdir && \ + $(ZSTD) -d z000000.zst -o tmp0 && \ + $(ZSTD) -d z000001.zst -o tmp1 && \ + $(ZSTD) -d z000002.zst -o tmp2 && \ + $(ZSTD) -d z000003.zst -o tmp3 && \ + $(ZSTD) -d z000004.zst -o tmp4 && \ + diff z000000 tmp0 && \ + diff z000001 tmp1 && \ + diff z000002 tmp2 && \ + diff z000003 tmp3 && \ + diff z000004 tmp4 && \ + rm ./* && \ + cd .. + @echo "\n ---- decodecorpus dictionary cli tests ----" + ./decodecorpus -n5 -otestdir -ptestdir --use-dict=1MB + @cd testdir && \ + $(ZSTD) -d z000000.zst -D dictionary -o tmp0 && \ + $(ZSTD) -d z000001.zst -D dictionary -o tmp1 && \ + $(ZSTD) -d z000002.zst -D dictionary -o tmp2 && \ + $(ZSTD) -d z000003.zst -D dictionary -o tmp3 && \ + $(ZSTD) -d z000004.zst -D dictionary -o tmp4 && \ + diff z000000 tmp0 && \ + diff z000001 tmp1 && \ + diff z000002 tmp2 && \ + diff z000003 tmp3 && \ + diff z000004 tmp4 && \ + cd .. + @rm -rf testdir + test-pool: pool $(QEMU_SYS) ./pool diff --git a/contrib/zstd/tests/datagencli.c b/contrib/zstd/tests/datagencli.c index 2f3ebc4d6863..8a81939d16de 100644 --- a/contrib/zstd/tests/datagencli.c +++ b/contrib/zstd/tests/datagencli.c @@ -48,7 +48,8 @@ static int usage(const char* programName) DISPLAY( "Arguments :\n"); DISPLAY( " -g# : generate # data (default:%i)\n", SIZE_DEFAULT); DISPLAY( " -s# : Select seed (default:%i)\n", SEED_DEFAULT); - DISPLAY( " -P# : Select compressibility in %% (default:%i%%)\n", COMPRESSIBILITY_DEFAULT); + DISPLAY( " -P# : Select compressibility in %% (default:%i%%)\n", + COMPRESSIBILITY_DEFAULT); DISPLAY( " -h : display help and exit\n"); return 0; } @@ -56,7 +57,7 @@ static int usage(const char* programName) int main(int argc, const char** argv) { - double proba = (double)COMPRESSIBILITY_DEFAULT / 100; + unsigned probaU32 = COMPRESSIBILITY_DEFAULT; double litProba = 0.0; U64 size = SIZE_DEFAULT; U32 seed = SEED_DEFAULT; @@ -94,11 +95,10 @@ int main(int argc, const char** argv) break; case 'P': argument++; - proba=0.0; + probaU32 = 0; while ((*argument>='0') && (*argument<='9')) - proba *= 10, proba += *argument++ - '0'; - if (proba>100.) proba=100.; - proba /= 100.; + probaU32 *= 10, probaU32 += *argument++ - '0'; + if (probaU32>100) probaU32 = 100; break; case 'L': /* hidden argument : Literal distribution probability */ argument++; @@ -117,11 +117,12 @@ int main(int argc, const char** argv) } } } } /* for(argNb=1; argNb (b) ? (b) : (x))) - static unsigned RAND(unsigned* src) { #define RAND_rotl32(x,r) ((x << r) | (x >> (32 - r))) @@ -231,6 +231,12 @@ typedef struct { cblockStats_t oldStats; /* so they can be rolled back if uncompressible */ } frame_t; +typedef struct { + int useDict; + U32 dictID; + size_t dictContentSize; + BYTE* dictContent; +} dictInfo; /*-******************************************************* * Generator Functions *********************************************************/ @@ -240,7 +246,7 @@ struct { } opts; /* advanced options on generation */ /* Generate and write a random frame header */ -static void writeFrameHeader(U32* seed, frame_t* frame) +static void writeFrameHeader(U32* seed, frame_t* frame, dictInfo info) { BYTE* const op = frame->data; size_t pos = 0; @@ -306,15 +312,26 @@ static void writeFrameHeader(U32* seed, frame_t* frame) pos += 4; { + /* + * fcsCode: 2-bit flag specifying how many bytes used to represent Frame_Content_Size (bits 7-6) + * singleSegment: 1-bit flag describing if data must be regenerated within a single continuous memory segment. (bit 5) + * contentChecksumFlag: 1-bit flag that is set if frame includes checksum at the end -- set to 1 below (bit 2) + * dictBits: 2-bit flag describing how many bytes Dictionary_ID uses -- set to 3 (bits 1-0) + * For more information: https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#frame_header + */ + int const dictBits = info.useDict ? 3 : 0; BYTE const frameHeaderDescriptor = - (BYTE) ((fcsCode << 6) | (singleSegment << 5) | (1 << 2)); + (BYTE) ((fcsCode << 6) | (singleSegment << 5) | (1 << 2) | dictBits); op[pos++] = frameHeaderDescriptor; } if (!singleSegment) { op[pos++] = windowByte; } - + if (info.useDict) { + MEM_writeLE32(op + pos, (U32) info.dictID); + pos += 4; + } if (contentSizeFlag) { switch (fcsCode) { default: /* Impossible */ @@ -605,7 +622,7 @@ static inline void initSeqStore(seqStore_t *seqStore) { /* Randomly generate sequence commands */ static U32 generateSequences(U32* seed, frame_t* frame, seqStore_t* seqStore, - size_t contentSize, size_t literalsSize) + size_t contentSize, size_t literalsSize, dictInfo info) { /* The total length of all the matches */ size_t const remainingMatch = contentSize - literalsSize; @@ -629,7 +646,6 @@ static U32 generateSequences(U32* seed, frame_t* frame, seqStore_t* seqStore, } DISPLAYLEVEL(5, " total match lengths: %u\n", (U32)remainingMatch); - for (i = 0; i < numSequences; i++) { /* Generate match and literal lengths by exponential distribution to * ensure nice numbers */ @@ -654,14 +670,33 @@ static U32 generateSequences(U32* seed, frame_t* frame, seqStore_t* seqStore, memcpy(srcPtr, literals, literalLen); srcPtr += literalLen; - do { if (RAND(seed) & 7) { /* do a normal offset */ + U32 const dataDecompressed = (U32)((BYTE*)srcPtr-(BYTE*)frame->srcStart); offset = (RAND(seed) % MIN(frame->header.windowSize, (size_t)((BYTE*)srcPtr - (BYTE*)frame->srcStart))) + 1; + if (info.useDict && (RAND(seed) & 1) && i + 1 != numSequences && dataDecompressed < frame->header.windowSize) { + /* need to occasionally generate offsets that go past the start */ + /* including i+1 != numSequences because the last sequences has to adhere to predetermined contentSize */ + U32 lenPastStart = (RAND(seed) % info.dictContentSize) + 1; + offset = (U32)((BYTE*)srcPtr - (BYTE*)frame->srcStart)+lenPastStart; + if (offset > frame->header.windowSize) { + if (lenPastStart < MIN_SEQ_LEN) { + /* when offset > windowSize, matchLen bound by end of dictionary (lenPastStart) */ + /* this also means that lenPastStart must be greater than MIN_SEQ_LEN */ + /* make sure lenPastStart does not go past dictionary start though */ + lenPastStart = MIN(lenPastStart+MIN_SEQ_LEN, (U32)info.dictContentSize); + offset = (U32)((BYTE*)srcPtr - (BYTE*)frame->srcStart) + lenPastStart; + } + { + U32 const matchLenBound = MIN(frame->header.windowSize, lenPastStart); + matchLen = MIN(matchLen, matchLenBound); + } + } + } offsetCode = offset + ZSTD_REP_MOVE; repIndex = 2; } else { @@ -677,11 +712,20 @@ static U32 generateSequences(U32* seed, frame_t* frame, seqStore_t* seqStore, repIndex = MIN(2, offsetCode + 1); } } - } while (offset > (size_t)((BYTE*)srcPtr - (BYTE*)frame->srcStart) || offset == 0); + } while (((!info.useDict) && (offset > (size_t)((BYTE*)srcPtr - (BYTE*)frame->srcStart))) || offset == 0); - { size_t j; + { + size_t j; + BYTE* const dictEnd = info.dictContent + info.dictContentSize; for (j = 0; j < matchLen; j++) { - *srcPtr = *(srcPtr-offset); + if ((U32)((BYTE*)srcPtr - (BYTE*)frame->srcStart) < offset) { + /* copy from dictionary instead of literals */ + size_t const dictOffset = offset - (srcPtr - (BYTE*)frame->srcStart); + *srcPtr = *(dictEnd - dictOffset); + } + else { + *srcPtr = *(srcPtr-offset); + } srcPtr++; } } @@ -931,7 +975,7 @@ static size_t writeSequences(U32* seed, frame_t* frame, seqStore_t* seqStorePtr, } static size_t writeSequencesBlock(U32* seed, frame_t* frame, size_t contentSize, - size_t literalsSize) + size_t literalsSize, dictInfo info) { seqStore_t seqStore; size_t numSequences; @@ -940,14 +984,14 @@ static size_t writeSequencesBlock(U32* seed, frame_t* frame, size_t contentSize, initSeqStore(&seqStore); /* randomly generate sequences */ - numSequences = generateSequences(seed, frame, &seqStore, contentSize, literalsSize); + numSequences = generateSequences(seed, frame, &seqStore, contentSize, literalsSize, info); /* write them out to the frame data */ CHECKERR(writeSequences(seed, frame, &seqStore, numSequences)); return numSequences; } -static size_t writeCompressedBlock(U32* seed, frame_t* frame, size_t contentSize) +static size_t writeCompressedBlock(U32* seed, frame_t* frame, size_t contentSize, dictInfo info) { BYTE* const blockStart = (BYTE*)frame->data; size_t literalsSize; @@ -959,7 +1003,7 @@ static size_t writeCompressedBlock(U32* seed, frame_t* frame, size_t contentSize DISPLAYLEVEL(4, " literals size: %u\n", (U32)literalsSize); - nbSeq = writeSequencesBlock(seed, frame, contentSize, literalsSize); + nbSeq = writeSequencesBlock(seed, frame, contentSize, literalsSize, info); DISPLAYLEVEL(4, " number of sequences: %u\n", (U32)nbSeq); @@ -967,7 +1011,7 @@ static size_t writeCompressedBlock(U32* seed, frame_t* frame, size_t contentSize } static void writeBlock(U32* seed, frame_t* frame, size_t contentSize, - int lastBlock) + int lastBlock, dictInfo info) { int const blockTypeDesc = RAND(seed) % 8; size_t blockSize; @@ -1007,7 +1051,7 @@ static void writeBlock(U32* seed, frame_t* frame, size_t contentSize, frame->oldStats = frame->stats; frame->data = op; - compressedSize = writeCompressedBlock(seed, frame, contentSize); + compressedSize = writeCompressedBlock(seed, frame, contentSize, info); if (compressedSize > contentSize) { blockType = 0; memcpy(op, frame->src, contentSize); @@ -1033,7 +1077,7 @@ static void writeBlock(U32* seed, frame_t* frame, size_t contentSize, frame->data = op; } -static void writeBlocks(U32* seed, frame_t* frame) +static void writeBlocks(U32* seed, frame_t* frame, dictInfo info) { size_t contentLeft = frame->header.contentSize; size_t const maxBlockSize = MIN(MAX_BLOCK_SIZE, frame->header.windowSize); @@ -1056,7 +1100,7 @@ static void writeBlocks(U32* seed, frame_t* frame) } } - writeBlock(seed, frame, blockContentSize, lastBlock); + writeBlock(seed, frame, blockContentSize, lastBlock, info); contentLeft -= blockContentSize; if (lastBlock) break; @@ -1121,20 +1165,102 @@ static void initFrame(frame_t* fr) } /* Return the final seed */ -static U32 generateFrame(U32 seed, frame_t* fr) +static U32 generateFrame(U32 seed, frame_t* fr, dictInfo info) { /* generate a complete frame */ DISPLAYLEVEL(1, "frame seed: %u\n", seed); - initFrame(fr); - writeFrameHeader(&seed, fr); - writeBlocks(&seed, fr); + writeFrameHeader(&seed, fr, info); + writeBlocks(&seed, fr, info); writeChecksum(fr); return seed; } +/*_******************************************************* +* Dictionary Helper Functions +*********************************************************/ +/* returns 0 if successful, otherwise returns 1 upon error */ +static int genRandomDict(U32 dictID, U32 seed, size_t dictSize, BYTE* fullDict){ + /* allocate space for samples */ + int ret = 0; + unsigned const numSamples = 4; + size_t sampleSizes[4]; + BYTE* const samples = malloc(5000*sizeof(BYTE)); + if (samples == NULL) { + DISPLAY("Error: could not allocate space for samples\n"); + return 1; + } + + /* generate samples */ + { + unsigned literalValue = 1; + unsigned samplesPos = 0; + size_t currSize = 1; + while (literalValue <= 4) { + sampleSizes[literalValue - 1] = currSize; + { + size_t k; + for (k = 0; k < currSize; k++) { + *(samples + (samplesPos++)) = (BYTE)literalValue; + } + } + literalValue++; + currSize *= 16; + } + } + + + { + /* create variables */ + size_t dictWriteSize = 0; + ZDICT_params_t zdictParams; + size_t const headerSize = MAX(dictSize/4, 256); + size_t const dictContentSize = dictSize - headerSize; + BYTE* const dictContent = fullDict + headerSize; + if (dictContentSize < ZDICT_CONTENTSIZE_MIN || dictSize < ZDICT_DICTSIZE_MIN) { + DISPLAY("Error: dictionary size is too small\n"); + ret = 1; + goto exitGenRandomDict; + } + + /* init dictionary params */ + memset(&zdictParams, 0, sizeof(zdictParams)); + zdictParams.dictID = dictID; + zdictParams.notificationLevel = 1; + + /* fill in dictionary content */ + RAND_buffer(&seed, (void*)dictContent, dictContentSize); + + /* finalize dictionary with random samples */ + dictWriteSize = ZDICT_finalizeDictionary(fullDict, dictSize, + dictContent, dictContentSize, + samples, sampleSizes, numSamples, + zdictParams); + + if (ZDICT_isError(dictWriteSize)) { + DISPLAY("Could not finalize dictionary: %s\n", ZDICT_getErrorName(dictWriteSize)); + ret = 1; + } + } + +exitGenRandomDict: + free(samples); + return ret; +} + +static dictInfo initDictInfo(int useDict, size_t dictContentSize, BYTE* dictContent, U32 dictID){ + /* allocate space statically */ + dictInfo dictOp; + memset(&dictOp, 0, sizeof(dictOp)); + dictOp.useDict = useDict; + dictOp.dictContentSize = dictContentSize; + dictOp.dictContent = dictContent; + dictOp.dictID = dictID; + return dictOp; +} + /*-******************************************************* * Test Mode *********************************************************/ @@ -1196,6 +1322,65 @@ cleanup: return ret; } +static size_t testDecodeWithDict(U32 seed) +{ + /* create variables */ + size_t const dictSize = RAND(&seed) % (10 << 20) + ZDICT_DICTSIZE_MIN + ZDICT_CONTENTSIZE_MIN; + U32 const dictID = RAND(&seed); + size_t errorDetected = 0; + BYTE* const fullDict = malloc(dictSize); + if (fullDict == NULL) { + return ERROR(GENERIC); + } + + /* generate random dictionary */ + { + int const ret = genRandomDict(dictID, seed, dictSize, fullDict); + if (ret != 0) { + errorDetected = ERROR(GENERIC); + goto dictTestCleanup; + } + } + + + { + frame_t fr; + + /* generate frame */ + { + size_t const headerSize = MAX(dictSize/4, 256); + size_t const dictContentSize = dictSize-headerSize; + BYTE* const dictContent = fullDict+headerSize; + dictInfo const info = initDictInfo(1, dictContentSize, dictContent, dictID); + seed = generateFrame(seed, &fr, info); + } + + /* manually decompress and check difference */ + { + ZSTD_DCtx* const dctx = ZSTD_createDCtx(); + { + size_t const returnValue = ZSTD_decompress_usingDict(dctx, DECOMPRESSED_BUFFER, MAX_DECOMPRESSED_SIZE, + fr.dataStart, (BYTE*)fr.data - (BYTE*)fr.dataStart, + fullDict, dictSize); + if (ZSTD_isError(returnValue)) { + errorDetected = returnValue; + goto dictTestCleanup; + } + } + + if (memcmp(DECOMPRESSED_BUFFER, fr.srcStart, (BYTE*)fr.src - (BYTE*)fr.srcStart) != 0) { + errorDetected = ERROR(corruption_detected); + goto dictTestCleanup; + } + ZSTD_freeDCtx(dctx); + } + } + +dictTestCleanup: + free(fullDict); + return errorDetected; +} + static int runTestMode(U32 seed, unsigned numFiles, unsigned const testDurationS) { unsigned fnum; @@ -1209,28 +1394,39 @@ static int runTestMode(U32 seed, unsigned numFiles, unsigned const testDurationS for (fnum = 0; fnum < numFiles || clockSpan(startClock) < maxClockSpan; fnum++) { frame_t fr; - + U32 const seedCopy = seed; if (fnum < numFiles) DISPLAYUPDATE("\r%u/%u ", fnum, numFiles); else DISPLAYUPDATE("\r%u ", fnum); - seed = generateFrame(seed, &fr); + { + dictInfo const info = initDictInfo(0, 0, NULL, 0); + seed = generateFrame(seed, &fr, info); + } { size_t const r = testDecodeSimple(&fr); if (ZSTD_isError(r)) { - DISPLAY("Error in simple mode on test seed %u: %s\n", seed + fnum, + DISPLAY("Error in simple mode on test seed %u: %s\n", seedCopy, ZSTD_getErrorName(r)); return 1; } } { size_t const r = testDecodeStreaming(&fr); if (ZSTD_isError(r)) { - DISPLAY("Error in streaming mode on test seed %u: %s\n", seed + fnum, + DISPLAY("Error in streaming mode on test seed %u: %s\n", seedCopy, ZSTD_getErrorName(r)); return 1; } } + { + /* don't create a dictionary that is too big */ + size_t const r = testDecodeWithDict(seed); + if (ZSTD_isError(r)) { + DISPLAY("Error in dictionary mode on test seed %u: %s\n", seedCopy, ZSTD_getErrorName(r)); + return 1; + } + } } DISPLAY("\r%u tests completed: ", fnum); @@ -1250,7 +1446,10 @@ static int generateFile(U32 seed, const char* const path, DISPLAY("seed: %u\n", seed); - generateFrame(seed, &fr); + { + dictInfo const info = initDictInfo(0, 0, NULL, 0); + generateFrame(seed, &fr, info); + } outputBuffer(fr.dataStart, (BYTE*)fr.data - (BYTE*)fr.dataStart, path); if (origPath) { @@ -1272,7 +1471,10 @@ static int generateCorpus(U32 seed, unsigned numFiles, const char* const path, DISPLAYUPDATE("\r%u/%u ", fnum, numFiles); - seed = generateFrame(seed, &fr); + { + dictInfo const info = initDictInfo(0, 0, NULL, 0); + seed = generateFrame(seed, &fr, info); + } if (snprintf(outPath, MAX_PATH, "%s/z%06u.zst", path, fnum) + 1 > MAX_PATH) { DISPLAY("Error: path too long\n"); @@ -1294,6 +1496,93 @@ static int generateCorpus(U32 seed, unsigned numFiles, const char* const path, return 0; } +static int generateCorpusWithDict(U32 seed, unsigned numFiles, const char* const path, + const char* const origPath, const size_t dictSize) +{ + char outPath[MAX_PATH]; + BYTE* fullDict; + U32 const dictID = RAND(&seed); + int errorDetected = 0; + + if (snprintf(outPath, MAX_PATH, "%s/dictionary", path) + 1 > MAX_PATH) { + DISPLAY("Error: path too long\n"); + return 1; + } + + /* allocate space for the dictionary */ + fullDict = malloc(dictSize); + if (fullDict == NULL) { + DISPLAY("Error: could not allocate space for full dictionary.\n"); + return 1; + } + + /* randomly generate the dictionary */ + { + int const ret = genRandomDict(dictID, seed, dictSize, fullDict); + if (ret != 0) { + errorDetected = ret; + goto dictCleanup; + } + } + + /* write out dictionary */ + if (numFiles != 0) { + if (snprintf(outPath, MAX_PATH, "%s/dictionary", path) + 1 > MAX_PATH) { + DISPLAY("Error: dictionary path too long\n"); + errorDetected = 1; + goto dictCleanup; + } + outputBuffer(fullDict, dictSize, outPath); + } + else { + outputBuffer(fullDict, dictSize, "dictionary"); + } + + /* generate random compressed/decompressed files */ + { + unsigned fnum; + for (fnum = 0; fnum < MAX(numFiles, 1); fnum++) { + frame_t fr; + DISPLAYUPDATE("\r%u/%u ", fnum, numFiles); + { + size_t const headerSize = MAX(dictSize/4, 256); + size_t const dictContentSize = dictSize-headerSize; + BYTE* const dictContent = fullDict+headerSize; + dictInfo const info = initDictInfo(1, dictContentSize, dictContent, dictID); + seed = generateFrame(seed, &fr, info); + } + + if (numFiles != 0) { + if (snprintf(outPath, MAX_PATH, "%s/z%06u.zst", path, fnum) + 1 > MAX_PATH) { + DISPLAY("Error: path too long\n"); + errorDetected = 1; + goto dictCleanup; + } + outputBuffer(fr.dataStart, (BYTE*)fr.data - (BYTE*)fr.dataStart, outPath); + + if (origPath) { + if (snprintf(outPath, MAX_PATH, "%s/z%06u", origPath, fnum) + 1 > MAX_PATH) { + DISPLAY("Error: path too long\n"); + errorDetected = 1; + goto dictCleanup; + } + outputBuffer(fr.srcStart, (BYTE*)fr.src - (BYTE*)fr.srcStart, outPath); + } + } + else { + outputBuffer(fr.dataStart, (BYTE*)fr.data - (BYTE*)fr.dataStart, path); + if (origPath) { + outputBuffer(fr.srcStart, (BYTE*)fr.src - (BYTE*)fr.srcStart, origPath); + } + } + } + } + +dictCleanup: + free(fullDict); + return errorDetected; +} + /*_******************************************************* * Command line @@ -1339,6 +1628,40 @@ static void advancedUsage(const char* programName) DISPLAY( "\n"); DISPLAY( "Advanced arguments :\n"); DISPLAY( " --content-size : always include the content size in the frame header\n"); + DISPLAY( " --use-dict=# : include a dictionary used to decompress the corpus\n"); +} + +/*! readU32FromChar() : + @return : unsigned integer value read from input in `char` format + allows and interprets K, KB, KiB, M, MB and MiB suffix. + Will also modify `*stringPtr`, advancing it to position where it stopped reading. + Note : function result can overflow if digit string > MAX_UINT */ +static unsigned readU32FromChar(const char** stringPtr) +{ + unsigned result = 0; + while ((**stringPtr >='0') && (**stringPtr <='9')) + result *= 10, result += **stringPtr - '0', (*stringPtr)++ ; + if ((**stringPtr=='K') || (**stringPtr=='M')) { + result <<= 10; + if (**stringPtr=='M') result <<= 10; + (*stringPtr)++ ; + if (**stringPtr=='i') (*stringPtr)++; + if (**stringPtr=='B') (*stringPtr)++; + } + return result; +} + +/** longCommandWArg() : + * check if *stringPtr is the same as longCommand. + * If yes, @return 1 and advances *stringPtr to the position which immediately follows longCommand. + * @return 0 and doesn't modify *stringPtr otherwise. + */ +static unsigned longCommandWArg(const char** stringPtr, const char* longCommand) +{ + size_t const comSize = strlen(longCommand); + int const result = !strncmp(*stringPtr, longCommand, comSize); + if (result) *stringPtr += comSize; + return result; } int main(int argc, char** argv) @@ -1350,6 +1673,8 @@ int main(int argc, char** argv) int testMode = 0; const char* path = NULL; const char* origPath = NULL; + int useDict = 0; + unsigned dictSize = (10 << 10); /* 10 kB default */ int argNb; @@ -1410,6 +1735,9 @@ int main(int argc, char** argv) argument++; if (strcmp(argument, "content-size") == 0) { opts.contentSize = 1; + } else if (longCommandWArg(&argument, "use-dict=")) { + dictSize = readU32FromChar(&argument); + useDict = 1; } else { advancedUsage(argv[0]); return 1; @@ -1441,9 +1769,13 @@ int main(int argc, char** argv) return 1; } - if (numFiles == 0) { + if (numFiles == 0 && useDict == 0) { return generateFile(seed, path, origPath); - } else { + } else if (useDict == 0){ return generateCorpus(seed, numFiles, path, origPath); + } else { + /* should generate files with a dictionary */ + return generateCorpusWithDict(seed, numFiles, path, origPath, dictSize); } + } diff --git a/contrib/zstd/tests/fullbench.c b/contrib/zstd/tests/fullbench.c index 38cf22fa7ebc..81de5157b8e1 100644 --- a/contrib/zstd/tests/fullbench.c +++ b/contrib/zstd/tests/fullbench.c @@ -14,19 +14,19 @@ #include "util.h" /* Compiler options, UTIL_GetFileSize */ #include /* malloc */ #include /* fprintf, fopen, ftello64 */ -#include /* clock_t, clock, CLOCKS_PER_SEC */ -#include "mem.h" +#include "mem.h" /* U32 */ #ifndef ZSTD_DLL_IMPORT #include "zstd_internal.h" /* ZSTD_blockHeaderSize, blockType_e, KB, MB */ - #define ZSTD_STATIC_LINKING_ONLY /* ZSTD_compressBegin, ZSTD_compressContinue, etc. */ #else #define KB *(1 <<10) #define MB *(1 <<20) #define GB *(1U<<30) typedef enum { bt_raw, bt_rle, bt_compressed, bt_reserved } blockType_e; #endif -#include "zstd.h" /* ZSTD_VERSION_STRING */ +#define ZSTD_STATIC_LINKING_ONLY /* ZSTD_compressBegin, ZSTD_compressContinue, etc. */ +#include "zstd.h" /* ZSTD_versionString */ +#include "util.h" /* time functions */ #include "datagen.h" @@ -35,7 +35,7 @@ **************************************/ #define PROGRAM_DESCRIPTION "Zstandard speed analyzer" #define AUTHOR "Yann Collet" -#define WELCOME_MESSAGE "*** %s %s %i-bits, by %s (%s) ***\n", PROGRAM_DESCRIPTION, ZSTD_VERSION_STRING, (int)(sizeof(void*)*8), AUTHOR, __DATE__ +#define WELCOME_MESSAGE "*** %s %s %i-bits, by %s (%s) ***\n", PROGRAM_DESCRIPTION, ZSTD_versionString(), (int)(sizeof(void*)*8), AUTHOR, __DATE__ #define NBLOOPS 6 #define TIMELOOP_S 2 @@ -69,12 +69,6 @@ static void BMK_SetNbIterations(U32 nbLoops) /*_******************************************************* * Private functions *********************************************************/ -static clock_t BMK_clockSpan( clock_t clockStart ) -{ - return clock() - clockStart; /* works even if overflow, span limited to <= ~30mn */ -} - - static size_t BMK_findMaxMem(U64 requiredMem) { size_t const step = 64 MB; @@ -116,8 +110,9 @@ size_t local_ZSTD_decompress(void* dst, size_t dstSize, void* buff2, const void* return ZSTD_decompress(dst, dstSize, buff2, g_cSize); } -#ifndef ZSTD_DLL_IMPORT static ZSTD_DCtx* g_zdc = NULL; + +#ifndef ZSTD_DLL_IMPORT extern size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* ctx, const void* src, size_t srcSize); size_t local_ZSTD_decodeLiteralsBlock(void* dst, size_t dstSize, void* buff2, const void* src, size_t srcSize) { @@ -153,6 +148,74 @@ size_t local_ZSTD_compressStream(void* dst, size_t dstCapacity, void* buff2, con return buffOut.pos; } +static size_t local_ZSTD_compress_generic_end(void* dst, size_t dstCapacity, void* buff2, const void* src, size_t srcSize) +{ + ZSTD_outBuffer buffOut; + ZSTD_inBuffer buffIn; + (void)buff2; + ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_compressionLevel, 1); + buffOut.dst = dst; + buffOut.size = dstCapacity; + buffOut.pos = 0; + buffIn.src = src; + buffIn.size = srcSize; + buffIn.pos = 0; + ZSTD_compress_generic(g_cstream, &buffOut, &buffIn, ZSTD_e_end); + return buffOut.pos; +} + +static size_t local_ZSTD_compress_generic_continue(void* dst, size_t dstCapacity, void* buff2, const void* src, size_t srcSize) +{ + ZSTD_outBuffer buffOut; + ZSTD_inBuffer buffIn; + (void)buff2; + ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_compressionLevel, 1); + buffOut.dst = dst; + buffOut.size = dstCapacity; + buffOut.pos = 0; + buffIn.src = src; + buffIn.size = srcSize; + buffIn.pos = 0; + ZSTD_compress_generic(g_cstream, &buffOut, &buffIn, ZSTD_e_continue); + ZSTD_compress_generic(g_cstream, &buffOut, &buffIn, ZSTD_e_end); + return buffOut.pos; +} + +static size_t local_ZSTD_compress_generic_T2_end(void* dst, size_t dstCapacity, void* buff2, const void* src, size_t srcSize) +{ + ZSTD_outBuffer buffOut; + ZSTD_inBuffer buffIn; + (void)buff2; + ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_compressionLevel, 1); + ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_nbThreads, 2); + buffOut.dst = dst; + buffOut.size = dstCapacity; + buffOut.pos = 0; + buffIn.src = src; + buffIn.size = srcSize; + buffIn.pos = 0; + while (ZSTD_compress_generic(g_cstream, &buffOut, &buffIn, ZSTD_e_end)) {} + return buffOut.pos; +} + +static size_t local_ZSTD_compress_generic_T2_continue(void* dst, size_t dstCapacity, void* buff2, const void* src, size_t srcSize) +{ + ZSTD_outBuffer buffOut; + ZSTD_inBuffer buffIn; + (void)buff2; + ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_compressionLevel, 1); + ZSTD_CCtx_setParameter(g_cstream, ZSTD_p_nbThreads, 2); + buffOut.dst = dst; + buffOut.size = dstCapacity; + buffOut.pos = 0; + buffIn.src = src; + buffIn.size = srcSize; + buffIn.pos = 0; + ZSTD_compress_generic(g_cstream, &buffOut, &buffIn, ZSTD_e_continue); + while(ZSTD_compress_generic(g_cstream, &buffOut, &buffIn, ZSTD_e_end)) {} + return buffOut.pos; +} + static ZSTD_DStream* g_dstream= NULL; static size_t local_ZSTD_decompressStream(void* dst, size_t dstCapacity, void* buff2, const void* src, size_t srcSize) { @@ -170,8 +233,9 @@ static size_t local_ZSTD_decompressStream(void* dst, size_t dstCapacity, void* b return buffOut.pos; } -#ifndef ZSTD_DLL_IMPORT static ZSTD_CCtx* g_zcc = NULL; + +#ifndef ZSTD_DLL_IMPORT size_t local_ZSTD_compressContinue(void* dst, size_t dstCapacity, void* buff2, const void* src, size_t srcSize) { (void)buff2; @@ -236,33 +300,45 @@ static size_t benchMem(const void* src, size_t srcSize, U32 benchNb) switch(benchNb) { case 1: - benchFunction = local_ZSTD_compress; benchName = "ZSTD_compress"; + benchFunction = local_ZSTD_compress; benchName = "compress(1)"; break; case 2: - benchFunction = local_ZSTD_decompress; benchName = "ZSTD_decompress"; + benchFunction = local_ZSTD_decompress; benchName = "decompress"; break; #ifndef ZSTD_DLL_IMPORT case 11: - benchFunction = local_ZSTD_compressContinue; benchName = "ZSTD_compressContinue"; + benchFunction = local_ZSTD_compressContinue; benchName = "compressContinue(1)"; break; case 12: - benchFunction = local_ZSTD_compressContinue_extDict; benchName = "ZSTD_compressContinue_extDict"; + benchFunction = local_ZSTD_compressContinue_extDict; benchName = "compressContinue_extDict"; break; case 13: - benchFunction = local_ZSTD_decompressContinue; benchName = "ZSTD_decompressContinue"; + benchFunction = local_ZSTD_decompressContinue; benchName = "decompressContinue"; break; case 31: - benchFunction = local_ZSTD_decodeLiteralsBlock; benchName = "ZSTD_decodeLiteralsBlock"; + benchFunction = local_ZSTD_decodeLiteralsBlock; benchName = "decodeLiteralsBlock"; break; case 32: - benchFunction = local_ZSTD_decodeSeqHeaders; benchName = "ZSTD_decodeSeqHeaders"; + benchFunction = local_ZSTD_decodeSeqHeaders; benchName = "decodeSeqHeaders"; break; #endif case 41: - benchFunction = local_ZSTD_compressStream; benchName = "ZSTD_compressStream"; + benchFunction = local_ZSTD_compressStream; benchName = "compressStream(1)"; break; case 42: - benchFunction = local_ZSTD_decompressStream; benchName = "ZSTD_decompressStream"; + benchFunction = local_ZSTD_decompressStream; benchName = "decompressStream"; + break; + case 51: + benchFunction = local_ZSTD_compress_generic_continue; benchName = "compress_generic, continue"; + break; + case 52: + benchFunction = local_ZSTD_compress_generic_end; benchName = "compress_generic, end"; + break; + case 61: + benchFunction = local_ZSTD_compress_generic_T2_continue; benchName = "compress_generic, -T2, continue"; + break; + case 62: + benchFunction = local_ZSTD_compress_generic_T2_end; benchName = "compress_generic, -T2, end"; break; default : return 0; @@ -276,6 +352,10 @@ static size_t benchMem(const void* src, size_t srcSize, U32 benchNb) free(dstBuff); free(buff2); return 12; } + if (g_zcc==NULL) g_zcc = ZSTD_createCCtx(); + if (g_zdc==NULL) g_zdc = ZSTD_createDCtx(); + if (g_cstream==NULL) g_cstream = ZSTD_createCStream(); + if (g_dstream==NULL) g_dstream = ZSTD_createDStream(); /* Preparation */ switch(benchNb) @@ -284,23 +364,15 @@ static size_t benchMem(const void* src, size_t srcSize, U32 benchNb) g_cSize = ZSTD_compress(buff2, dstBuffSize, src, srcSize, 1); break; #ifndef ZSTD_DLL_IMPORT - case 11 : - if (g_zcc==NULL) g_zcc = ZSTD_createCCtx(); - break; - case 12 : - if (g_zcc==NULL) g_zcc = ZSTD_createCCtx(); - break; case 13 : - if (g_zdc==NULL) g_zdc = ZSTD_createDCtx(); g_cSize = ZSTD_compress(buff2, dstBuffSize, src, srcSize, 1); break; case 31: /* ZSTD_decodeLiteralsBlock */ - if (g_zdc==NULL) g_zdc = ZSTD_createDCtx(); { blockProperties_t bp; - ZSTD_frameParams zfp; + ZSTD_frameHeader zfp; size_t frameHeaderSize, skippedSize; g_cSize = ZSTD_compress(dstBuff, dstBuffSize, src, srcSize, 1); - frameHeaderSize = ZSTD_getFrameParams(&zfp, dstBuff, ZSTD_frameHeaderSize_min); + frameHeaderSize = ZSTD_getFrameHeader(&zfp, dstBuff, ZSTD_frameHeaderSize_min); if (frameHeaderSize==0) frameHeaderSize = ZSTD_frameHeaderSize_min; ZSTD_getcBlockSize(dstBuff+frameHeaderSize, dstBuffSize, &bp); /* Get 1st block type */ if (bp.blockType != bt_compressed) { @@ -313,15 +385,14 @@ static size_t benchMem(const void* src, size_t srcSize, U32 benchNb) break; } case 32: /* ZSTD_decodeSeqHeaders */ - if (g_zdc==NULL) g_zdc = ZSTD_createDCtx(); { blockProperties_t bp; - ZSTD_frameParams zfp; + ZSTD_frameHeader zfp; const BYTE* ip = dstBuff; const BYTE* iend; size_t frameHeaderSize, cBlockSize; ZSTD_compress(dstBuff, dstBuffSize, src, srcSize, 1); /* it would be better to use direct block compression here */ g_cSize = ZSTD_compress(dstBuff, dstBuffSize, src, srcSize, 1); - frameHeaderSize = ZSTD_getFrameParams(&zfp, dstBuff, ZSTD_frameHeaderSize_min); + frameHeaderSize = ZSTD_getFrameHeader(&zfp, dstBuff, ZSTD_frameHeaderSize_min); if (frameHeaderSize==0) frameHeaderSize = ZSTD_frameHeaderSize_min; ip += frameHeaderSize; /* Skip frame Header */ cBlockSize = ZSTD_getcBlockSize(ip, dstBuffSize, &bp); /* Get 1st block type */ @@ -342,11 +413,7 @@ static size_t benchMem(const void* src, size_t srcSize, U32 benchNb) case 31: goto _cleanOut; #endif - case 41 : - if (g_cstream==NULL) g_cstream = ZSTD_createCStream(); - break; case 42 : - if (g_dstream==NULL) g_dstream = ZSTD_createDStream(); g_cSize = ZSTD_compress(buff2, dstBuffSize, src, srcSize, 1); break; @@ -359,30 +426,37 @@ static size_t benchMem(const void* src, size_t srcSize, U32 benchNb) { size_t i; for (i=0; i %s !! \n", benchName, ZSTD_getErrorName(benchResult)); exit(1); } } - averageTime = (((double)BMK_clockSpan(clockStart)) / CLOCKS_PER_SEC) / nbRounds; - if (averageTime < bestTime) bestTime = averageTime; - DISPLAY("%2i- %-30.30s : %7.1f MB/s (%9u)\r", loopNb, benchName, (double)srcSize / (1 MB) / bestTime, (U32)benchResult); - } } + { U64 const clockSpanMicro = UTIL_clockSpanMicro(clockStart, ticksPerSecond); + double const averageTime = (double)clockSpanMicro / TIME_SEC_MICROSEC / nbRounds; + if (averageTime < bestTime) bestTime = averageTime; + DISPLAY("%2i- %-30.30s : %7.1f MB/s (%9u)\r", loopNb, benchName, (double)srcSize / (1 MB) / bestTime, (U32)benchResult); + } } } DISPLAY("%2u\n", benchNb); _cleanOut: free(dstBuff); free(buff2); + ZSTD_freeCCtx(g_zcc); g_zcc=NULL; + ZSTD_freeDCtx(g_zdc); g_zdc=NULL; + ZSTD_freeCStream(g_cstream); g_cstream=NULL; + ZSTD_freeDStream(g_dstream); g_dstream=NULL; return 0; } diff --git a/contrib/zstd/tests/fuzzer.c b/contrib/zstd/tests/fuzzer.c index a9dcf12e0702..b8f514785542 100644 --- a/contrib/zstd/tests/fuzzer.c +++ b/contrib/zstd/tests/fuzzer.c @@ -12,9 +12,9 @@ * Compiler specific **************************************/ #ifdef _MSC_VER /* Visual Studio */ -# define _CRT_SECURE_NO_WARNINGS /* fgets */ -# pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */ -# pragma warning(disable : 4204) /* disable: C4204: non-constant aggregate initializer */ +# define _CRT_SECURE_NO_WARNINGS /* fgets */ +# pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */ +# pragma warning(disable : 4204) /* disable: C4204: non-constant aggregate initializer */ #endif @@ -25,7 +25,7 @@ #include /* fgets, sscanf */ #include /* strcmp */ #include /* clock_t */ -#define ZSTD_STATIC_LINKING_ONLY /* ZSTD_compressContinue, ZSTD_compressBlock */ +#define ZSTD_STATIC_LINKING_ONLY /* ZSTD_compressContinue, ZSTD_compressBlock */ #include "zstd.h" /* ZSTD_VERSION_STRING */ #include "zstd_errors.h" /* ZSTD_getErrorCode */ #include "zstdmt_compress.h" @@ -74,7 +74,6 @@ static clock_t FUZ_clockSpan(clock_t cStart) return clock() - cStart; /* works even when overflow; max span ~ 30mn */ } - #define FUZ_rotl32(x,r) ((x << r) | (x >> (32 - r))) static unsigned FUZ_rand(unsigned* src) { @@ -104,6 +103,7 @@ static unsigned FUZ_highbit32(U32 v32) #define CHECK_V(var, fn) size_t const var = fn; if (ZSTD_isError(var)) goto _output_error #define CHECK(fn) { CHECK_V(err, fn); } #define CHECKPLUS(var, fn, more) { CHECK_V(var, fn); more; } + static int basicUnitTests(U32 seed, double compressibility) { size_t const CNBuffSize = 5 MB; @@ -190,6 +190,89 @@ static int basicUnitTests(U32 seed, double compressibility) DISPLAYLEVEL(4, "OK \n"); + /* Static CCtx tests */ +#define STATIC_CCTX_LEVEL 3 + DISPLAYLEVEL(4, "test%3i : create static CCtx for level %u :", testNb++, STATIC_CCTX_LEVEL); + { size_t const staticCCtxSize = ZSTD_estimateCStreamSize(STATIC_CCTX_LEVEL); + void* const staticCCtxBuffer = malloc(staticCCtxSize); + size_t const staticDCtxSize = ZSTD_estimateDCtxSize(); + void* const staticDCtxBuffer = malloc(staticDCtxSize); + if (staticCCtxBuffer==NULL || staticDCtxBuffer==NULL) { + free(staticCCtxBuffer); + free(staticDCtxBuffer); + DISPLAY("Not enough memory, aborting\n"); + testResult = 1; + goto _end; + } + { ZSTD_CCtx* staticCCtx = ZSTD_initStaticCCtx(staticCCtxBuffer, staticCCtxSize); + ZSTD_DCtx* staticDCtx = ZSTD_initStaticDCtx(staticDCtxBuffer, staticDCtxSize); + if ((staticCCtx==NULL) || (staticDCtx==NULL)) goto _output_error; + DISPLAYLEVEL(4, "OK \n"); + + DISPLAYLEVEL(4, "test%3i : init CCtx for level %u : ", testNb++, STATIC_CCTX_LEVEL); + { size_t const r = ZSTD_compressBegin(staticCCtx, STATIC_CCTX_LEVEL); + if (ZSTD_isError(r)) goto _output_error; } + DISPLAYLEVEL(4, "OK \n"); + + DISPLAYLEVEL(4, "test%3i : simple compression test with static CCtx : ", testNb++); + CHECKPLUS(r, ZSTD_compressCCtx(staticCCtx, + compressedBuffer, ZSTD_compressBound(CNBuffSize), + CNBuffer, CNBuffSize, STATIC_CCTX_LEVEL), + cSize=r ); + DISPLAYLEVEL(4, "OK (%u bytes : %.2f%%)\n", + (U32)cSize, (double)cSize/CNBuffSize*100); + + DISPLAYLEVEL(4, "test%3i : simple decompression test with static DCtx : ", testNb++); + { size_t const r = ZSTD_decompressDCtx(staticDCtx, + decodedBuffer, CNBuffSize, + compressedBuffer, cSize); + if (r != CNBuffSize) goto _output_error; } + DISPLAYLEVEL(4, "OK \n"); + + DISPLAYLEVEL(4, "test%3i : check decompressed result : ", testNb++); + { size_t u; + for (u=0; u = blockSize); cSize = ZSTD_compressBlock(cctx, compressedBuffer, ZSTD_compressBound(blockSize), CNBuffer, blockSize); if (ZSTD_isError(cSize)) goto _output_error; DISPLAYLEVEL(4, "OK \n"); @@ -743,8 +884,23 @@ static size_t FUZ_randomLength(U32* seed, U32 maxLog) } #undef CHECK -#define CHECK(cond, ...) if (cond) { DISPLAY("Error => "); DISPLAY(__VA_ARGS__); \ - DISPLAY(" (seed %u, test nb %u) \n", seed, testNb); goto _output_error; } +#define CHECK(cond, ...) { \ + if (cond) { \ + DISPLAY("Error => "); \ + DISPLAY(__VA_ARGS__); \ + DISPLAY(" (seed %u, test nb %u) \n", seed, testNb); \ + goto _output_error; \ +} } + +#define CHECK_Z(f) { \ + size_t const err = f; \ + if (ZSTD_isError(err)) { \ + DISPLAY("Error => %s : %s ", \ + #f, ZSTD_getErrorName(err)); \ + DISPLAY(" (seed %u, test nb %u) \n", seed, testNb); \ + goto _output_error; \ +} } + static int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, U32 const maxDurationS, double compressibility, int bigTests) { @@ -856,9 +1012,8 @@ static int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, U32 const maxD } /* frame header decompression test */ - { ZSTD_frameParams dParams; - size_t const check = ZSTD_getFrameParams(&dParams, cBuffer, cSize); - CHECK(ZSTD_isError(check), "Frame Parameters extraction failed"); + { ZSTD_frameHeader dParams; + CHECK_Z( ZSTD_getFrameHeader(&dParams, cBuffer, cSize) ); CHECK(dParams.frameContentSize != sampleSize, "Frame content size incorrect"); } @@ -945,20 +1100,17 @@ static int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, U32 const maxD dict = srcBuffer + (FUZ_rand(&lseed) % (srcBufferSize - dictSize)); if (FUZ_rand(&lseed) & 0xF) { - size_t const errorCode = ZSTD_compressBegin_usingDict(refCtx, dict, dictSize, cLevel); - CHECK (ZSTD_isError(errorCode), "ZSTD_compressBegin_usingDict error : %s", ZSTD_getErrorName(errorCode)); + CHECK_Z ( ZSTD_compressBegin_usingDict(refCtx, dict, dictSize, cLevel) ); } else { ZSTD_compressionParameters const cPar = ZSTD_getCParams(cLevel, 0, dictSize); ZSTD_frameParameters const fPar = { FUZ_rand(&lseed)&1 /* contentSizeFlag */, !(FUZ_rand(&lseed)&3) /* contentChecksumFlag*/, 0 /*NodictID*/ }; /* note : since dictionary is fake, dictIDflag has no impact */ ZSTD_parameters const p = FUZ_makeParams(cPar, fPar); - size_t const errorCode = ZSTD_compressBegin_advanced(refCtx, dict, dictSize, p, 0); - CHECK (ZSTD_isError(errorCode), "ZSTD_compressBegin_advanced error : %s", ZSTD_getErrorName(errorCode)); + CHECK_Z ( ZSTD_compressBegin_advanced(refCtx, dict, dictSize, p, 0) ); } - { size_t const errorCode = ZSTD_copyCCtx(ctx, refCtx, 0); - CHECK (ZSTD_isError(errorCode), "ZSTD_copyCCtx error : %s", ZSTD_getErrorName(errorCode)); - } } + CHECK_Z( ZSTD_copyCCtx(ctx, refCtx, 0) ); + } ZSTD_setCCtxParameter(ctx, ZSTD_p_forceWindow, FUZ_rand(&lseed) & 1); { U32 const nbChunks = (FUZ_rand(&lseed) & 127) + 2; @@ -990,8 +1142,7 @@ static int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, U32 const maxD /* streaming decompression test */ if (dictSize<8) dictSize=0, dict=NULL; /* disable dictionary */ - { size_t const errorCode = ZSTD_decompressBegin_usingDict(dctx, dict, dictSize); - CHECK (ZSTD_isError(errorCode), "ZSTD_decompressBegin_usingDict error : %s", ZSTD_getErrorName(errorCode)); } + CHECK_Z( ZSTD_decompressBegin_usingDict(dctx, dict, dictSize) ); totalCSize = 0; totalGenSize = 0; while (totalCSize < cSize) { diff --git a/contrib/zstd/tests/paramgrill.c b/contrib/zstd/tests/paramgrill.c index 1913b54d0a54..da06ccb52aab 100644 --- a/contrib/zstd/tests/paramgrill.c +++ b/contrib/zstd/tests/paramgrill.c @@ -38,7 +38,7 @@ #define GB *(1ULL<<30) #define NBLOOPS 2 -#define TIMELOOP (2 * CLOCKS_PER_SEC) +#define TIMELOOP (2 * CLOCKS_PER_SEC) #define NB_LEVELS_TRACKED 30 @@ -47,7 +47,7 @@ static const size_t maxMemory = (sizeof(size_t)==4) ? (2 GB - 64 MB) : (size_t #define COMPRESSIBILITY_DEFAULT 0.50 static const size_t sampleSize = 10000000; -static const U32 g_grillDuration_s = 60000; /* about 16 hours */ +static const double g_grillDuration_s = 90000; /* about 24 hours */ static const clock_t g_maxParamTime = 15 * CLOCKS_PER_SEC; static const clock_t g_maxVariationTime = 60 * CLOCKS_PER_SEC; static const int g_maxNbVariations = 64; @@ -87,9 +87,11 @@ void BMK_SetNbIterations(int nbLoops) * Private functions *********************************************************/ -static clock_t BMK_clockSpan(clock_t cStart) { return clock() - cStart; } /* works even if overflow ; max span ~ 30 mn */ +/* works even if overflow ; max span ~ 30 mn */ +static clock_t BMK_clockSpan(clock_t cStart) { return clock() - cStart; } -static U32 BMK_timeSpan(time_t tStart) { return (U32)difftime(time(NULL), tStart); } /* accuracy in seconds only, span can be multiple years */ +/* accuracy in seconds only, span can be multiple years */ +static double BMK_timeSpan(time_t tStart) { return difftime(time(NULL), tStart); } static size_t BMK_findMaxMem(U64 requiredMem) @@ -307,14 +309,14 @@ static size_t BMK_benchParam(BMK_result_t* resultPtr, } -const char* g_stratName[] = { "ZSTD_fast ", - "ZSTD_dfast ", - "ZSTD_greedy ", - "ZSTD_lazy ", - "ZSTD_lazy2 ", - "ZSTD_btlazy2", - "ZSTD_btopt ", - "ZSTD_btopt2 "}; +const char* g_stratName[] = { "ZSTD_fast ", + "ZSTD_dfast ", + "ZSTD_greedy ", + "ZSTD_lazy ", + "ZSTD_lazy2 ", + "ZSTD_btlazy2 ", + "ZSTD_btopt ", + "ZSTD_btultra "}; static void BMK_printWinner(FILE* f, U32 cLevel, BMK_result_t result, ZSTD_compressionParameters params, size_t srcSize) { @@ -388,8 +390,8 @@ static int BMK_seed(winnerInfo_t* winners, const ZSTD_compressionParameters para double W_DMemUsed_note = W_ratioNote * ( 40 + 9*cLevel) - log((double)W_DMemUsed); double O_DMemUsed_note = O_ratioNote * ( 40 + 9*cLevel) - log((double)O_DMemUsed); - size_t W_CMemUsed = (1 << params.windowLog) + ZSTD_estimateCCtxSize(params); - size_t O_CMemUsed = (1 << winners[cLevel].params.windowLog) + ZSTD_estimateCCtxSize(winners[cLevel].params); + size_t W_CMemUsed = (1 << params.windowLog) + ZSTD_estimateCCtxSize_advanced(params); + size_t O_CMemUsed = (1 << winners[cLevel].params.windowLog) + ZSTD_estimateCCtxSize_advanced(winners[cLevel].params); double W_CMemUsed_note = W_ratioNote * ( 50 + 13*cLevel) - log((double)W_CMemUsed); double O_CMemUsed_note = O_ratioNote * ( 50 + 13*cLevel) - log((double)O_CMemUsed); @@ -454,7 +456,7 @@ static ZSTD_compressionParameters* sanitizeParams(ZSTD_compressionParameters par g_params.chainLog = 0, g_params.searchLog = 0; if (params.strategy == ZSTD_dfast) g_params.searchLog = 0; - if (params.strategy != ZSTD_btopt && params.strategy != ZSTD_btopt2) + if (params.strategy != ZSTD_btopt && params.strategy != ZSTD_btultra) g_params.targetLength = 0; return &g_params; } @@ -558,7 +560,7 @@ static ZSTD_compressionParameters randomParams(void) p.windowLog = FUZ_rand(&g_rand) % (ZSTD_WINDOWLOG_MAX+1 - ZSTD_WINDOWLOG_MIN) + ZSTD_WINDOWLOG_MIN; p.searchLength=FUZ_rand(&g_rand) % (ZSTD_SEARCHLENGTH_MAX+1 - ZSTD_SEARCHLENGTH_MIN) + ZSTD_SEARCHLENGTH_MIN; p.targetLength=FUZ_rand(&g_rand) % (ZSTD_TARGETLENGTH_MAX+1 - ZSTD_TARGETLENGTH_MIN) + ZSTD_TARGETLENGTH_MIN; - p.strategy = (ZSTD_strategy) (FUZ_rand(&g_rand) % (ZSTD_btopt2 +1)); + p.strategy = (ZSTD_strategy) (FUZ_rand(&g_rand) % (ZSTD_btultra +1)); validated = !ZSTD_isError(ZSTD_checkCParams(p)); } return p; diff --git a/contrib/zstd/tests/playTests.sh b/contrib/zstd/tests/playTests.sh index 021fd59fe2af..2e1cc6826f11 100755 --- a/contrib/zstd/tests/playTests.sh +++ b/contrib/zstd/tests/playTests.sh @@ -92,7 +92,7 @@ $ZSTD tmp --stdout > tmpCompressed # long command format $ECHO "test : compress to named file" rm tmpCompressed $ZSTD tmp -o tmpCompressed -ls tmpCompressed # must work +test -f tmpCompressed # file must be created $ECHO "test : -o must be followed by filename (must fail)" $ZSTD tmp -of tmpCompressed && die "-o must be followed by filename " $ECHO "test : force write, correct order" @@ -142,21 +142,21 @@ $ZSTD -q -f tmpro rm -f tmpro tmpro.zst $ECHO "test : file removal" $ZSTD -f --rm tmp -ls tmp && die "tmp should no longer be present" +test ! -f tmp # tmp should no longer be present $ZSTD -f -d --rm tmp.zst -ls tmp.zst && die "tmp.zst should no longer be present" +test ! -f tmp.zst # tmp.zst should no longer be present $ECHO "test : --rm on stdin" $ECHO a | $ZSTD --rm > $INTOVOID # --rm should remain silent rm tmp $ZSTD -f tmp && die "tmp not present : should have failed" -ls tmp.zst && die "tmp.zst should not be created" +test ! -f tmp.zst # tmp.zst should not be created $ECHO "\n**** Advanced compression parameters **** " $ECHO "Hello world!" | $ZSTD --zstd=windowLog=21, - -o tmp.zst && die "wrong parameters not detected!" $ECHO "Hello world!" | $ZSTD --zstd=windowLo=21 - -o tmp.zst && die "wrong parameters not detected!" $ECHO "Hello world!" | $ZSTD --zstd=windowLog=21,slog - -o tmp.zst && die "wrong parameters not detected!" -ls tmp.zst && die "tmp.zst should not be created" +test ! -f tmp.zst # tmp.zst should not be created roundTripTest -g512K roundTripTest -g512K " --zstd=slen=3,tlen=48,strat=6" roundTripTest -g512K " --zstd=strat=6,wlog=23,clog=23,hlog=22,slog=6" @@ -201,16 +201,17 @@ $ECHO foo | $ZSTD > /dev/full && die "write error not detected!" $ECHO "$ECHO foo | $ZSTD | $ZSTD -d > /dev/full" $ECHO foo | $ZSTD | $ZSTD -d > /dev/full && die "write error not detected!" + $ECHO "\n**** symbolic link test **** " rm -f hello.tmp world.tmp hello.tmp.zst world.tmp.zst $ECHO "hello world" > hello.tmp ln -s hello.tmp world.tmp $ZSTD world.tmp hello.tmp -ls hello.tmp.zst || die "regular file should have been compressed!" -ls world.tmp.zst && die "symbolic link should not have been compressed!" +test -f hello.tmp.zst # regular file should have been compressed! +test ! -f world.tmp.zst # symbolic link should not have been compressed! $ZSTD world.tmp hello.tmp -f -ls world.tmp.zst || die "symbol link should have been compressed with --force" +test -f world.tmp.zst # symbolic link should have been compressed with --force rm -f hello.tmp world.tmp hello.tmp.zst world.tmp.zst fi @@ -225,10 +226,10 @@ $ZSTD tmpSparse -c | $ZSTD -dv --sparse -c > tmpOutSparse $DIFF -s tmpSparse tmpOutSparse $ZSTD tmpSparse -c | $ZSTD -dv --no-sparse -c > tmpOutNoSparse $DIFF -s tmpSparse tmpOutNoSparse -ls -ls tmpSparse* +ls -ls tmpSparse* # look at file size and block size on disk ./datagen -s1 -g1200007 -P100 | $ZSTD | $ZSTD -dv --sparse -c > tmpSparseOdd # Odd size file (to not finish on an exact nb of blocks) ./datagen -s1 -g1200007 -P100 | $DIFF -s - tmpSparseOdd -ls -ls tmpSparseOdd +ls -ls tmpSparseOdd # look at file size and block size on disk $ECHO "\n Sparse Compatibility with Console :" $ECHO "Hello World 1 !" | $ZSTD | $ZSTD -d -c $ECHO "Hello World 2 !" | $ZSTD | $ZSTD -d | cat @@ -238,7 +239,7 @@ cat tmpSparse1M tmpSparse1M > tmpSparse2M $ZSTD -v -f tmpSparse1M -o tmpSparseCompressed $ZSTD -d -v -f tmpSparseCompressed -o tmpSparseRegenerated $ZSTD -d -v -f tmpSparseCompressed -c >> tmpSparseRegenerated -ls -ls tmpSparse* +ls -ls tmpSparse* # look at file size and block size on disk $DIFF tmpSparse2M tmpSparseRegenerated rm tmpSparse* @@ -257,11 +258,11 @@ $ZSTD -df *.zst ls -ls tmp* $ECHO "compress tmp* into stdout > tmpall : " $ZSTD -c tmp1 tmp2 tmp3 > tmpall -ls -ls tmp* +ls -ls tmp* # check size of tmpall (should be tmp1.zst + tmp2.zst + tmp3.zst) $ECHO "decompress tmpall* into stdout > tmpdec : " cp tmpall tmpall2 $ZSTD -dc tmpall* > tmpdec -ls -ls tmp* +ls -ls tmp* # check size of tmpdec (should be 2*(tmp1 + tmp2 + tmp3)) $ECHO "compress multiple files including a missing one (notHere) : " $ZSTD -f tmp1 notHere tmp2 && die "missing file not detected!" @@ -379,7 +380,9 @@ $ZSTD -t tmp2.zst && die "bad file not detected !" $ZSTD -t tmp3 && die "bad file not detected !" # detects 0-sized files as bad $ECHO "test --rm and --test combined " $ZSTD -t --rm tmp1.zst -ls -ls tmp1.zst # check file is still present +test -f tmp1.zst # check file is still present +split -b16384 tmp1.zst tmpSplit. +$ZSTD -t tmpSplit.* && die "bad file not detected !" $ECHO "\n**** benchmark mode tests **** " @@ -439,6 +442,7 @@ if [ $LZMAMODE -eq 1 ]; then XZEXE=1 xz -V && lzma -V || XZEXE=0 if [ $XZEXE -eq 1 ]; then + $ECHO "Testing zstd xz and lzma support" ./datagen > tmp $ZSTD --format=lzma -f tmp $ZSTD --format=xz -f tmp @@ -449,6 +453,24 @@ if [ $LZMAMODE -eq 1 ]; then $ZSTD -d -f -v tmp.xz $ZSTD -d -f -v tmp.lzma rm tmp* + $ECHO "Creating symlinks" + ln -s $ZSTD ./xz + ln -s $ZSTD ./unxz + ln -s $ZSTD ./lzma + ln -s $ZSTD ./unlzma + $ECHO "Testing xz and lzma symlinks" + ./datagen > tmp + ./xz tmp + xz -d tmp.xz + ./lzma tmp + lzma -d tmp.lzma + $ECHO "Testing unxz and unlzma symlinks" + xz tmp + ./xz -d tmp.xz + lzma tmp + ./lzma -d tmp.lzma + rm xz unxz lzma unlzma + rm tmp* else $ECHO "xz binary not detected" fi @@ -534,6 +556,54 @@ fi rm tmp* +$ECHO "\n**** zstd --list/-l single frame tests ****" +./datagen > tmp1 +./datagen > tmp2 +./datagen > tmp3 +./datagen > tmp4 +$ZSTD tmp* +$ZSTD -l *.zst +$ZSTD -lv *.zst +$ZSTD --list *.zst +$ZSTD --list -v *.zst + +$ECHO "\n**** zstd --list/-l multiple frame tests ****" +cat tmp1.zst tmp2.zst > tmp12.zst +cat tmp3.zst tmp4.zst > tmp34.zst +cat tmp12.zst tmp34.zst > tmp1234.zst +cat tmp12.zst tmp4.zst > tmp124.zst +$ZSTD -l *.zst +$ZSTD -lv *.zst +$ZSTD --list *.zst +$ZSTD --list -v *.zst + +$ECHO "\n**** zstd --list/-l error detection tests ****" +! $ZSTD -l tmp1 tmp1.zst +! $ZSTD --list tmp* +! $ZSTD -lv tmp1* +! $ZSTD --list -v tmp2 tmp23.zst + +$ECHO "\n**** zstd --list/-l test with null files ****" +./datagen -g0 > tmp5 +$ZSTD tmp5 +$ZSTD -l tmp5.zst +! $ZSTD -l tmp5* +$ZSTD -lv tmp5.zst +! $ZSTD -lv tmp5* + +$ECHO "\n**** zstd --list/-l test with no content size field ****" +./datagen -g1MB | $ZSTD > tmp6.zst +$ZSTD -l tmp6.zst +$ZSTD -lv tmp6.zst + +$ECHO "\n**** zstd --list/-l test with no checksum ****" +$ZSTD -f --no-check tmp1 +$ZSTD -l tmp1.zst +$ZSTD -lv tmp1.zst + +rm tmp* + + if [ "$1" != "--test-large-data" ]; then $ECHO "Skipping large data tests" exit 0 diff --git a/contrib/zstd/tests/roundTripCrash.c b/contrib/zstd/tests/roundTripCrash.c index a296d4160e55..77c6737eebdb 100644 --- a/contrib/zstd/tests/roundTripCrash.c +++ b/contrib/zstd/tests/roundTripCrash.c @@ -68,13 +68,20 @@ static size_t checkBuffers(const void* buff1, const void* buff2, size_t buffSize return pos; } +static void crash(int errorCode){ + /* abort if AFL/libfuzzer, exit otherwise */ + #ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION /* could also use __AFL_COMPILER */ + abort(); + #else + exit(errorCode); + #endif +} static void roundTripCheck(const void* srcBuff, size_t srcBuffSize) { size_t const cBuffSize = ZSTD_compressBound(srcBuffSize); void* cBuff = malloc(cBuffSize); void* rBuff = malloc(cBuffSize); - #define CRASH { free(cBuff); free(cBuff); } /* double free, to crash program */ if (!cBuff || !rBuff) { fprintf(stderr, "not enough memory ! \n"); @@ -84,15 +91,15 @@ static void roundTripCheck(const void* srcBuff, size_t srcBuffSize) { size_t const result = roundTripTest(rBuff, cBuffSize, cBuff, cBuffSize, srcBuff, srcBuffSize); if (ZSTD_isError(result)) { fprintf(stderr, "roundTripTest error : %s \n", ZSTD_getErrorName(result)); - CRASH; + crash(1); } if (result != srcBuffSize) { fprintf(stderr, "Incorrect regenerated size : %u != %u\n", (unsigned)result, (unsigned)srcBuffSize); - CRASH; + crash(1); } if (checkBuffers(srcBuff, rBuff, srcBuffSize) != srcBuffSize) { fprintf(stderr, "Silent decoding corruption !!!"); - CRASH; + crash(1); } } diff --git a/contrib/zstd/tests/symbols.c b/contrib/zstd/tests/symbols.c index 7dacfc05892d..8920187f37f7 100644 --- a/contrib/zstd/tests/symbols.c +++ b/contrib/zstd/tests/symbols.c @@ -88,14 +88,14 @@ static const void *symbols[] = { &ZSTD_copyCCtx, &ZSTD_compressContinue, &ZSTD_compressEnd, - &ZSTD_getFrameParams, + &ZSTD_getFrameHeader, &ZSTD_decompressBegin, &ZSTD_decompressBegin_usingDict, &ZSTD_copyDCtx, &ZSTD_nextSrcSizeToDecompress, &ZSTD_decompressContinue, &ZSTD_nextInputType, - &ZSTD_getBlockSizeMax, + &ZSTD_getBlockSize, &ZSTD_compressBlock, &ZSTD_decompressBlock, &ZSTD_insertBlock, @@ -131,7 +131,10 @@ static const void *symbols[] = { &ZDICT_isError, &ZDICT_getErrorName, /* zdict.h: advanced functions */ - &ZDICT_trainFromBuffer_advanced, + &ZDICT_trainFromBuffer_cover, + &ZDICT_optimizeTrainFromBuffer_cover, + &ZDICT_finalizeDictionary, + &ZDICT_trainFromBuffer_legacy, &ZDICT_addEntropyTablesFromBuffer, NULL, }; diff --git a/contrib/zstd/tests/zstreamtest.c b/contrib/zstd/tests/zstreamtest.c index 0e09e185642b..9b2b8eaf81b7 100644 --- a/contrib/zstd/tests/zstreamtest.c +++ b/contrib/zstd/tests/zstreamtest.c @@ -12,9 +12,9 @@ * Compiler specific **************************************/ #ifdef _MSC_VER /* Visual Studio */ -# define _CRT_SECURE_NO_WARNINGS /* fgets */ -# pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */ -# pragma warning(disable : 4146) /* disable: C4146: minus unsigned expression */ +# define _CRT_SECURE_NO_WARNINGS /* fgets */ +# pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */ +# pragma warning(disable : 4146) /* disable: C4146: minus unsigned expression */ #endif @@ -25,8 +25,9 @@ #include /* fgets, sscanf */ #include /* clock_t, clock() */ #include /* strcmp */ +#include /* assert */ #include "mem.h" -#define ZSTD_STATIC_LINKING_ONLY /* ZSTD_maxCLevel, ZSTD_customMem, ZSTD_getDictID_fromFrame */ +#define ZSTD_STATIC_LINKING_ONLY /* ZSTD_maxCLevel, ZSTD_customMem, ZSTD_getDictID_fromFrame */ #include "zstd.h" /* ZSTD_compressBound */ #include "zstd_errors.h" /* ZSTD_error_srcSize_wrong */ #include "zstdmt_compress.h" @@ -44,6 +45,7 @@ #define GB *(1U<<30) static const U32 nbTestsDefault = 10000; +static const U32 g_cLevelMax_smallTests = 10; #define COMPRESSIBLE_NOISE_LENGTH (10 MB) #define FUZ_COMPRESSIBILITY_DEFAULT 50 static const U32 prime32 = 2654435761U; @@ -53,13 +55,15 @@ static const U32 prime32 = 2654435761U; * Display Macros **************************************/ #define DISPLAY(...) fprintf(stderr, __VA_ARGS__) -#define DISPLAYLEVEL(l, ...) if (g_displayLevel>=l) { DISPLAY(__VA_ARGS__); } +#define DISPLAYLEVEL(l, ...) if (g_displayLevel>=l) { \ + DISPLAY(__VA_ARGS__); \ + if (g_displayLevel>=4) fflush(stderr); } static U32 g_displayLevel = 2; #define DISPLAYUPDATE(l, ...) if (g_displayLevel>=l) { \ if ((FUZ_GetClockSpan(g_displayClock) > g_refreshRate) || (g_displayLevel>=4)) \ { g_displayClock = clock(); DISPLAY(__VA_ARGS__); \ - if (g_displayLevel>=4) fflush(stderr); } } + if (g_displayLevel>=4) fflush(stderr); } } static const clock_t g_refreshRate = CLOCKS_PER_SEC / 6; static clock_t g_displayClock = 0; @@ -155,12 +159,13 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo void* decodedBuffer = malloc(decodedBufferSize); size_t cSize; int testResult = 0; - U32 testNb=0; + U32 testNb = 1; ZSTD_CStream* zc = ZSTD_createCStream_advanced(customMem); ZSTD_DStream* zd = ZSTD_createDStream_advanced(customMem); ZSTD_inBuffer inBuff, inBuff2; ZSTD_outBuffer outBuff; buffer_t dictionary = g_nullBuffer; + size_t const dictSize = 128 KB; unsigned dictID = 0; /* Create compressible test buffer */ @@ -186,7 +191,8 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo /* Basic compression test */ DISPLAYLEVEL(3, "test%3i : compress %u bytes : ", testNb++, COMPRESSIBLE_NOISE_LENGTH); - ZSTD_initCStream_usingDict(zc, CNBuffer, 128 KB, 1); + { size_t const r = ZSTD_initCStream_usingDict(zc, CNBuffer, dictSize, 1); + if (ZSTD_isError(r)) goto _output_error; } outBuff.dst = (char*)(compressedBuffer)+cSize; outBuff.size = compressedBufferSize; outBuff.pos = 0; @@ -201,10 +207,20 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo cSize += outBuff.pos; DISPLAYLEVEL(3, "OK (%u bytes : %.2f%%)\n", (U32)cSize, (double)cSize/COMPRESSIBLE_NOISE_LENGTH*100); - DISPLAYLEVEL(3, "test%3i : check CStream size : ", testNb++); - { size_t const s = ZSTD_sizeof_CStream(zc); - if (ZSTD_isError(s)) goto _output_error; - DISPLAYLEVEL(3, "OK (%u bytes) \n", (U32)s); + /* context size functions */ + DISPLAYLEVEL(3, "test%3i : estimate CStream size : ", testNb++); + { ZSTD_compressionParameters const cParams = ZSTD_getCParams(1, CNBufferSize, dictSize); + size_t const s = ZSTD_estimateCStreamSize_advanced(cParams) + /* uses ZSTD_initCStream_usingDict() */ + + ZSTD_estimateCDictSize_advanced(dictSize, cParams, 0); + if (ZSTD_isError(s)) goto _output_error; + DISPLAYLEVEL(3, "OK (%u bytes) \n", (U32)s); + } + + DISPLAYLEVEL(3, "test%3i : check actual CStream size : ", testNb++); + { size_t const s = ZSTD_sizeof_CStream(zc); + if (ZSTD_isError(s)) goto _output_error; + DISPLAYLEVEL(3, "OK (%u bytes) \n", (U32)s); } /* Attempt bad compression parameters */ @@ -219,22 +235,25 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo /* skippable frame test */ DISPLAYLEVEL(3, "test%3i : decompress skippable frame : ", testNb++); - ZSTD_initDStream_usingDict(zd, CNBuffer, 128 KB); + if (ZSTD_isError( ZSTD_initDStream_usingDict(zd, CNBuffer, dictSize) )) + goto _output_error; inBuff.src = compressedBuffer; inBuff.size = cSize; inBuff.pos = 0; outBuff.dst = decodedBuffer; outBuff.size = CNBufferSize; outBuff.pos = 0; - { size_t const r = ZSTD_decompressStream(zd, &outBuff, &inBuff); - if (r != 0) goto _output_error; } + { size_t const r = ZSTD_decompressStream(zd, &outBuff, &inBuff); + DISPLAYLEVEL(5, " ( ZSTD_decompressStream => %u ) ", (U32)r); + if (r != 0) goto _output_error; + } if (outBuff.pos != 0) goto _output_error; /* skippable frame output len is 0 */ DISPLAYLEVEL(3, "OK \n"); /* Basic decompression test */ inBuff2 = inBuff; DISPLAYLEVEL(3, "test%3i : decompress %u bytes : ", testNb++, COMPRESSIBLE_NOISE_LENGTH); - ZSTD_initDStream_usingDict(zd, CNBuffer, 128 KB); + ZSTD_initDStream_usingDict(zd, CNBuffer, dictSize); { size_t const r = ZSTD_setDStreamParameter(zd, DStream_p_maxWindowSize, 1000000000); /* large limit */ if (ZSTD_isError(r)) goto _output_error; } { size_t const remaining = ZSTD_decompressStream(zd, &outBuff, &inBuff); @@ -260,7 +279,21 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo } } DISPLAYLEVEL(3, "OK \n"); - DISPLAYLEVEL(3, "test%3i : check DStream size : ", testNb++); + /* context size functions */ + DISPLAYLEVEL(3, "test%3i : estimate DStream size : ", testNb++); + { ZSTD_frameHeader fhi; + const void* cStart = (char*)compressedBuffer + (skippableFrameSize + 8); + size_t const gfhError = ZSTD_getFrameHeader(&fhi, cStart, cSize); + if (gfhError!=0) goto _output_error; + DISPLAYLEVEL(5, " (windowSize : %u) ", (U32)fhi.windowSize); + { size_t const s = ZSTD_estimateDStreamSize(fhi.windowSize) + /* uses ZSTD_initDStream_usingDict() */ + + ZSTD_estimateDDictSize(dictSize, 0); + if (ZSTD_isError(s)) goto _output_error; + DISPLAYLEVEL(3, "OK (%u bytes) \n", (U32)s); + } } + + DISPLAYLEVEL(3, "test%3i : check actual DStream size : ", testNb++); { size_t const s = ZSTD_sizeof_DStream(zd); if (ZSTD_isError(s)) goto _output_error; DISPLAYLEVEL(3, "OK (%u bytes) \n", (U32)s); @@ -270,7 +303,7 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo DISPLAYLEVEL(3, "test%3i : decompress byte-by-byte : ", testNb++); { /* skippable frame */ size_t r = 1; - ZSTD_initDStream_usingDict(zd, CNBuffer, 128 KB); + ZSTD_initDStream_usingDict(zd, CNBuffer, dictSize); inBuff.src = compressedBuffer; outBuff.dst = decodedBuffer; inBuff.pos = 0; @@ -282,7 +315,7 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo if (ZSTD_isError(r)) goto _output_error; } /* normal frame */ - ZSTD_initDStream_usingDict(zd, CNBuffer, 128 KB); + ZSTD_initDStream_usingDict(zd, CNBuffer, dictSize); r=1; while (r) { inBuff.size = inBuff.pos + 1; @@ -344,6 +377,7 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo if (zc==NULL) goto _output_error; /* memory allocation issue */ /* use 1 */ { size_t const inSize = 513; + DISPLAYLEVEL(5, "use1 "); ZSTD_initCStream_advanced(zc, NULL, 0, ZSTD_getParams(19, inSize, 0), inSize); /* needs btopt + search3 to trigger hashLog3 */ inBuff.src = CNBuffer; inBuff.size = inSize; @@ -351,14 +385,17 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo outBuff.dst = (char*)(compressedBuffer)+cSize; outBuff.size = ZSTD_compressBound(inSize); outBuff.pos = 0; + DISPLAYLEVEL(5, "compress1 "); { size_t const r = ZSTD_compressStream(zc, &outBuff, &inBuff); if (ZSTD_isError(r)) goto _output_error; } if (inBuff.pos != inBuff.size) goto _output_error; /* entire input should be consumed */ + DISPLAYLEVEL(5, "end1 "); { size_t const r = ZSTD_endStream(zc, &outBuff); if (r != 0) goto _output_error; } /* error, or some data not flushed */ } /* use 2 */ { size_t const inSize = 1025; /* will not continue, because tables auto-adjust and are therefore different size */ + DISPLAYLEVEL(5, "use2 "); ZSTD_initCStream_advanced(zc, NULL, 0, ZSTD_getParams(19, inSize, 0), inSize); /* needs btopt + search3 to trigger hashLog3 */ inBuff.src = CNBuffer; inBuff.size = inSize; @@ -366,9 +403,11 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo outBuff.dst = (char*)(compressedBuffer)+cSize; outBuff.size = ZSTD_compressBound(inSize); outBuff.pos = 0; + DISPLAYLEVEL(5, "compress2 "); { size_t const r = ZSTD_compressStream(zc, &outBuff, &inBuff); if (ZSTD_isError(r)) goto _output_error; } if (inBuff.pos != inBuff.size) goto _output_error; /* entire input should be consumed */ + DISPLAYLEVEL(5, "end2 "); { size_t const r = ZSTD_endStream(zc, &outBuff); if (r != 0) goto _output_error; } /* error, or some data not flushed */ } @@ -376,7 +415,7 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo /* CDict scenario */ DISPLAYLEVEL(3, "test%3i : digested dictionary : ", testNb++); - { ZSTD_CDict* const cdict = ZSTD_createCDict(dictionary.start, dictionary.filled, 1); + { ZSTD_CDict* const cdict = ZSTD_createCDict(dictionary.start, dictionary.filled, 1 /*byRef*/ ); size_t const initError = ZSTD_initCStream_usingCDict(zc, cdict); if (ZSTD_isError(initError)) goto _output_error; cSize = 0; @@ -435,7 +474,7 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo /* Memory restriction */ DISPLAYLEVEL(3, "test%3i : maxWindowSize < frame requirement : ", testNb++); - ZSTD_initDStream_usingDict(zd, CNBuffer, 128 KB); + ZSTD_initDStream_usingDict(zd, CNBuffer, dictSize); { size_t const r = ZSTD_setDStreamParameter(zd, DStream_p_maxWindowSize, 1000); /* too small limit */ if (ZSTD_isError(r)) goto _output_error; } inBuff.src = compressedBuffer; @@ -451,8 +490,8 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo DISPLAYLEVEL(3, "test%3i : ZSTD_initCStream_usingCDict_advanced with masked dictID : ", testNb++); { ZSTD_compressionParameters const cParams = ZSTD_getCParams(1, CNBufferSize, dictionary.filled); ZSTD_frameParameters const fParams = { 1 /* contentSize */, 1 /* checksum */, 1 /* noDictID */}; - ZSTD_CDict* const cdict = ZSTD_createCDict_advanced(dictionary.start, dictionary.filled, 1 /* byReference */, cParams, customMem); - size_t const initError = ZSTD_initCStream_usingCDict_advanced(zc, cdict, CNBufferSize, fParams); + ZSTD_CDict* const cdict = ZSTD_createCDict_advanced(dictionary.start, dictionary.filled, 1 /* byReference */, ZSTD_dm_auto, cParams, customMem); + size_t const initError = ZSTD_initCStream_usingCDict_advanced(zc, cdict, fParams, CNBufferSize); if (ZSTD_isError(initError)) goto _output_error; cSize = 0; outBuff.dst = compressedBuffer; @@ -463,7 +502,7 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo inBuff.pos = 0; { size_t const r = ZSTD_compressStream(zc, &outBuff, &inBuff); if (ZSTD_isError(r)) goto _output_error; } - if (inBuff.pos != inBuff.size) goto _output_error; /* entire input should be consumed */ + if (inBuff.pos != inBuff.size) goto _output_error; /* entire input should be consumed */ { size_t const r = ZSTD_endStream(zc, &outBuff); if (r != 0) goto _output_error; } /* error, or some data not flushed */ cSize = outBuff.pos; @@ -483,6 +522,55 @@ static int basicUnitTests(U32 seed, double compressibility, ZSTD_customMem custo DISPLAYLEVEL(3, "OK (%s)\n", ZSTD_getErrorName(r)); } + DISPLAYLEVEL(3, "test%3i : compress with ZSTD_CCtx_refPrefix : ", testNb++); + { size_t const refErr = ZSTD_CCtx_refPrefix(zc, dictionary.start, dictionary.filled); + if (ZSTD_isError(refErr)) goto _output_error; } + outBuff.dst = compressedBuffer; + outBuff.size = compressedBufferSize; + outBuff.pos = 0; + inBuff.src = CNBuffer; + inBuff.size = CNBufferSize; + inBuff.pos = 0; + { size_t const r = ZSTD_compress_generic(zc, &outBuff, &inBuff, ZSTD_e_end); + if (ZSTD_isError(r)) goto _output_error; } + if (inBuff.pos != inBuff.size) goto _output_error; /* entire input should be consumed */ + cSize = outBuff.pos; + DISPLAYLEVEL(3, "OK (%u bytes : %.2f%%)\n", (U32)cSize, (double)cSize/CNBufferSize*100); + + DISPLAYLEVEL(3, "test%3i : decompress with dictionary : ", testNb++); + { size_t const r = ZSTD_decompress_usingDict(zd, + decodedBuffer, CNBufferSize, + compressedBuffer, cSize, + dictionary.start, dictionary.filled); + if (ZSTD_isError(r)) goto _output_error; /* must fail : dictionary not used */ + DISPLAYLEVEL(3, "OK \n"); + } + + DISPLAYLEVEL(3, "test%3i : decompress without dictionary (should fail): ", testNb++); + { size_t const r = ZSTD_decompress(decodedBuffer, CNBufferSize, compressedBuffer, cSize); + if (!ZSTD_isError(r)) goto _output_error; /* must fail : dictionary not used */ + DISPLAYLEVEL(3, "OK (%s)\n", ZSTD_getErrorName(r)); + } + + DISPLAYLEVEL(3, "test%3i : compress again with ZSTD_compress_generic : ", testNb++); + outBuff.dst = compressedBuffer; + outBuff.size = compressedBufferSize; + outBuff.pos = 0; + inBuff.src = CNBuffer; + inBuff.size = CNBufferSize; + inBuff.pos = 0; + { size_t const r = ZSTD_compress_generic(zc, &outBuff, &inBuff, ZSTD_e_end); + if (ZSTD_isError(r)) goto _output_error; } + if (inBuff.pos != inBuff.size) goto _output_error; /* entire input should be consumed */ + cSize = outBuff.pos; + DISPLAYLEVEL(3, "OK (%u bytes : %.2f%%)\n", (U32)cSize, (double)cSize/CNBufferSize*100); + + DISPLAYLEVEL(3, "test%3i : decompress without dictionary (should work): ", testNb++); + { size_t const r = ZSTD_decompress(decodedBuffer, CNBufferSize, compressedBuffer, cSize); + if (ZSTD_isError(r)) goto _output_error; /* must fail : dictionary not used */ + DISPLAYLEVEL(3, "OK \n"); + } + /* Empty srcSize */ DISPLAYLEVEL(3, "test%3i : ZSTD_initCStream_advanced with pledgedSrcSize=0 and dict : ", testNb++); { ZSTD_parameters params = ZSTD_getParams(5, 0, 0); @@ -612,12 +700,26 @@ static size_t FUZ_randomLength(U32* seed, U32 maxLog) #define MIN(a,b) ( (a) < (b) ? (a) : (b) ) -#define CHECK(cond, ...) if (cond) { DISPLAY("Error => "); DISPLAY(__VA_ARGS__); \ - DISPLAY(" (seed %u, test nb %u) \n", seed, testNb); goto _output_error; } +#define CHECK(cond, ...) { \ + if (cond) { \ + DISPLAY("Error => "); \ + DISPLAY(__VA_ARGS__); \ + DISPLAY(" (seed %u, test nb %u) \n", seed, testNb); \ + goto _output_error; \ +} } + +#define CHECK_Z(f) { \ + size_t const err = f; \ + if (ZSTD_isError(err)) { \ + DISPLAY("Error => %s : %s ", \ + #f, ZSTD_getErrorName(err)); \ + DISPLAY(" (seed %u, test nb %u) \n", seed, testNb); \ + goto _output_error; \ +} } static int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, double compressibility, int bigTests) { - static const U32 maxSrcLog = 24; + U32 const maxSrcLog = bigTests ? 24 : 22; static const U32 maxSampleLog = 19; size_t const srcBufferSize = (size_t)1< = testNb) { DISPLAYUPDATE(2, "\r%6u/%6u ", testNb, nbTests); } - else { DISPLAYUPDATE(2, "\r%6u ", testNb); } - FUZ_rand(&coreSeed); - lseed = coreSeed ^ prime32; - - /* states full reset (deliberately not synchronized) */ - /* some issues can only happen when reusing states */ - if ((FUZ_rand(&lseed) & 0xFF) == 131) { ZSTD_freeCStream(zc); zc = ZSTD_createCStream(); resetAllowed=0; } - if ((FUZ_rand(&lseed) & 0xFF) == 132) { ZSTD_freeDStream(zd); zd = ZSTD_createDStream(); ZSTD_initDStream_usingDict(zd, NULL, 0); /* ensure at least one init */ } - - /* srcBuffer selection [0-4] */ - { U32 buffNb = FUZ_rand(&lseed) & 0x7F; - if (buffNb & 7) buffNb=2; /* most common : compressible (P) */ - else { - buffNb >>= 3; - if (buffNb & 7) { - const U32 tnb[2] = { 1, 3 }; /* barely/highly compressible */ - buffNb = tnb[buffNb >> 3]; - } else { - const U32 tnb[2] = { 0, 4 }; /* not compressible / sparse */ - buffNb = tnb[buffNb >> 3]; - } } - srcBuffer = cNoiseBuffer[buffNb]; - } - - /* compression init */ - if ((FUZ_rand(&lseed)&1) /* at beginning, to keep same nb of rand */ - && oldTestLog /* at least one test happened */ && resetAllowed) { - maxTestSize = FUZ_randomLength(&lseed, oldTestLog+2); - if (maxTestSize >= srcBufferSize) - maxTestSize = srcBufferSize-1; - { U64 const pledgedSrcSize = (FUZ_rand(&lseed) & 3) ? 0 : maxTestSize; - size_t const resetError = ZSTD_resetCStream(zc, pledgedSrcSize); - CHECK(ZSTD_isError(resetError), "ZSTD_resetCStream error : %s", ZSTD_getErrorName(resetError)); - } - } else { - U32 const testLog = FUZ_rand(&lseed) % maxSrcLog; - U32 const dictLog = FUZ_rand(&lseed) % maxSrcLog; - U32 const cLevel = ( FUZ_rand(&lseed) % - (ZSTD_maxCLevel() - - (MAX(testLog, dictLog) / cLevelLimiter))) - + 1; - maxTestSize = FUZ_rLogLength(&lseed, testLog); - oldTestLog = testLog; - /* random dictionary selection */ - dictSize = ((FUZ_rand(&lseed)&1)==1) ? FUZ_rLogLength(&lseed, dictLog) : 0; - { size_t const dictStart = FUZ_rand(&lseed) % (srcBufferSize - dictSize); - dict = srcBuffer + dictStart; - } - { U64 const pledgedSrcSize = (FUZ_rand(&lseed) & 3) ? 0 : maxTestSize; - ZSTD_parameters params = ZSTD_getParams(cLevel, pledgedSrcSize, dictSize); - params.fParams.checksumFlag = FUZ_rand(&lseed) & 1; - params.fParams.noDictIDFlag = FUZ_rand(&lseed) & 1; - { size_t const initError = ZSTD_initCStream_advanced(zc, dict, dictSize, params, pledgedSrcSize); - CHECK (ZSTD_isError(initError),"ZSTD_initCStream_advanced error : %s", ZSTD_getErrorName(initError)); - } } } - - /* multi-segments compression test */ - XXH64_reset(&xxhState, 0); - { ZSTD_outBuffer outBuff = { cBuffer, cBufferSize, 0 } ; - U32 n; - for (n=0, cSize=0, totalTestSize=0 ; totalTestSize < maxTestSize ; n++) { - /* compress random chunks into randomly sized dst buffers */ - { size_t const randomSrcSize = FUZ_randomLength(&lseed, maxSampleLog); - size_t const srcSize = MIN (maxTestSize-totalTestSize, randomSrcSize); - size_t const srcStart = FUZ_rand(&lseed) % (srcBufferSize - srcSize); - size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog); - size_t const dstBuffSize = MIN(cBufferSize - cSize, randomDstSize); - ZSTD_inBuffer inBuff = { srcBuffer+srcStart, srcSize, 0 }; - outBuff.size = outBuff.pos + dstBuffSize; - - { size_t const compressionError = ZSTD_compressStream(zc, &outBuff, &inBuff); - CHECK (ZSTD_isError(compressionError), "compression error : %s", ZSTD_getErrorName(compressionError)); } - - XXH64_update(&xxhState, srcBuffer+srcStart, inBuff.pos); - memcpy(copyBuffer+totalTestSize, srcBuffer+srcStart, inBuff.pos); - totalTestSize += inBuff.pos; - } - - /* random flush operation, to mess around */ - if ((FUZ_rand(&lseed) & 15) == 0) { - size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog); - size_t const adjustedDstSize = MIN(cBufferSize - cSize, randomDstSize); - outBuff.size = outBuff.pos + adjustedDstSize; - { size_t const flushError = ZSTD_flushStream(zc, &outBuff); - CHECK (ZSTD_isError(flushError), "flush error : %s", ZSTD_getErrorName(flushError)); - } } } - - /* final frame epilogue */ - { size_t remainingToFlush = (size_t)(-1); - while (remainingToFlush) { - size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog); - size_t const adjustedDstSize = MIN(cBufferSize - cSize, randomDstSize); - U32 const enoughDstSize = (adjustedDstSize >= remainingToFlush); - outBuff.size = outBuff.pos + adjustedDstSize; - remainingToFlush = ZSTD_endStream(zc, &outBuff); - CHECK (ZSTD_isError(remainingToFlush), "flush error : %s", ZSTD_getErrorName(remainingToFlush)); - CHECK (enoughDstSize && remainingToFlush, "ZSTD_endStream() not fully flushed (%u remaining), but enough space available", (U32)remainingToFlush); - } } - crcOrig = XXH64_digest(&xxhState); - cSize = outBuff.pos; - } - - /* multi - fragments decompression test */ - if (!dictSize /* don't reset if dictionary : could be different */ && (FUZ_rand(&lseed) & 1)) { - CHECK (ZSTD_isError(ZSTD_resetDStream(zd)), "ZSTD_resetDStream failed"); - } else { - ZSTD_initDStream_usingDict(zd, dict, dictSize); - } - { size_t decompressionResult = 1; - ZSTD_inBuffer inBuff = { cBuffer, cSize, 0 }; - ZSTD_outBuffer outBuff= { dstBuffer, dstBufferSize, 0 }; - for (totalGenSize = 0 ; decompressionResult ; ) { - size_t const readCSrcSize = FUZ_randomLength(&lseed, maxSampleLog); - size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog); - size_t const dstBuffSize = MIN(dstBufferSize - totalGenSize, randomDstSize); - inBuff.size = inBuff.pos + readCSrcSize; - outBuff.size = inBuff.pos + dstBuffSize; - decompressionResult = ZSTD_decompressStream(zd, &outBuff, &inBuff); - CHECK (ZSTD_isError(decompressionResult), "decompression error : %s", ZSTD_getErrorName(decompressionResult)); - } - CHECK (decompressionResult != 0, "frame not fully decoded"); - CHECK (outBuff.pos != totalTestSize, "decompressed data : wrong size") - CHECK (inBuff.pos != cSize, "compressed data should be fully read") - { U64 const crcDest = XXH64(dstBuffer, totalTestSize, 0); - if (crcDest!=crcOrig) findDiff(copyBuffer, dstBuffer, totalTestSize); - CHECK (crcDest!=crcOrig, "decompressed data corrupted"); - } } - - /*===== noisy/erroneous src decompression test =====*/ - - /* add some noise */ - { U32 const nbNoiseChunks = (FUZ_rand(&lseed) & 7) + 2; - U32 nn; for (nn=0; nn >= 3; + if (buffNb & 7) { + const U32 tnb[2] = { 1, 3 }; /* barely/highly compressible */ + buffNb = tnb[buffNb >> 3]; + } else { + const U32 tnb[2] = { 0, 4 }; /* not compressible / sparse */ + buffNb = tnb[buffNb >> 3]; + } } + srcBuffer = cNoiseBuffer[buffNb]; + } + + /* compression init */ + if ((FUZ_rand(&lseed)&1) /* at beginning, to keep same nb of rand */ + && oldTestLog /* at least one test happened */ && resetAllowed) { + maxTestSize = FUZ_randomLength(&lseed, oldTestLog+2); + if (maxTestSize >= srcBufferSize) + maxTestSize = srcBufferSize-1; + { U64 const pledgedSrcSize = (FUZ_rand(&lseed) & 3) ? 0 : maxTestSize; + CHECK_Z( ZSTD_resetCStream(zc, pledgedSrcSize) ); + } + } else { + U32 const testLog = FUZ_rand(&lseed) % maxSrcLog; + U32 const dictLog = FUZ_rand(&lseed) % maxSrcLog; + U32 const cLevelCandidate = ( FUZ_rand(&lseed) % + (ZSTD_maxCLevel() - + (MAX(testLog, dictLog) / 3))) + + 1; + U32 const cLevel = MIN(cLevelCandidate, cLevelMax); + maxTestSize = FUZ_rLogLength(&lseed, testLog); + oldTestLog = testLog; + /* random dictionary selection */ + dictSize = ((FUZ_rand(&lseed)&7)==1) ? FUZ_rLogLength(&lseed, dictLog) : 0; + { size_t const dictStart = FUZ_rand(&lseed) % (srcBufferSize - dictSize); + dict = srcBuffer + dictStart; + } + { U64 const pledgedSrcSize = (FUZ_rand(&lseed) & 3) ? 0 : maxTestSize; + ZSTD_parameters params = ZSTD_getParams(cLevel, pledgedSrcSize, dictSize); + params.fParams.checksumFlag = FUZ_rand(&lseed) & 1; + params.fParams.noDictIDFlag = FUZ_rand(&lseed) & 1; + CHECK_Z ( ZSTD_initCStream_advanced(zc, dict, dictSize, params, pledgedSrcSize) ); + } } + + /* multi-segments compression test */ + XXH64_reset(&xxhState, 0); + { ZSTD_outBuffer outBuff = { cBuffer, cBufferSize, 0 } ; + U32 n; + for (n=0, cSize=0, totalTestSize=0 ; totalTestSize < maxTestSize ; n++) { + /* compress random chunks into randomly sized dst buffers */ + { size_t const randomSrcSize = FUZ_randomLength(&lseed, maxSampleLog); + size_t const srcSize = MIN (maxTestSize-totalTestSize, randomSrcSize); + size_t const srcStart = FUZ_rand(&lseed) % (srcBufferSize - srcSize); + size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog); + size_t const dstBuffSize = MIN(cBufferSize - cSize, randomDstSize); + ZSTD_inBuffer inBuff = { srcBuffer+srcStart, srcSize, 0 }; + outBuff.size = outBuff.pos + dstBuffSize; + + CHECK_Z( ZSTD_compressStream(zc, &outBuff, &inBuff) ); + + XXH64_update(&xxhState, srcBuffer+srcStart, inBuff.pos); + memcpy(copyBuffer+totalTestSize, srcBuffer+srcStart, inBuff.pos); + totalTestSize += inBuff.pos; + } + + /* random flush operation, to mess around */ + if ((FUZ_rand(&lseed) & 15) == 0) { + size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog); + size_t const adjustedDstSize = MIN(cBufferSize - cSize, randomDstSize); + outBuff.size = outBuff.pos + adjustedDstSize; + CHECK_Z( ZSTD_flushStream(zc, &outBuff) ); + } } + + /* final frame epilogue */ + { size_t remainingToFlush = (size_t)(-1); + while (remainingToFlush) { + size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog); + size_t const adjustedDstSize = MIN(cBufferSize - cSize, randomDstSize); + outBuff.size = outBuff.pos + adjustedDstSize; + remainingToFlush = ZSTD_endStream(zc, &outBuff); + CHECK (ZSTD_isError(remainingToFlush), "end error : %s", ZSTD_getErrorName(remainingToFlush)); + } } + crcOrig = XXH64_digest(&xxhState); + cSize = outBuff.pos; + } + + /* multi - fragments decompression test */ + if (!dictSize /* don't reset if dictionary : could be different */ && (FUZ_rand(&lseed) & 1)) { + CHECK_Z ( ZSTD_resetDStream(zd) ); + } else { + CHECK_Z ( ZSTD_initDStream_usingDict(zd, dict, dictSize) ); + } + { size_t decompressionResult = 1; + ZSTD_inBuffer inBuff = { cBuffer, cSize, 0 }; + ZSTD_outBuffer outBuff= { dstBuffer, dstBufferSize, 0 }; + for (totalGenSize = 0 ; decompressionResult ; ) { + size_t const readCSrcSize = FUZ_randomLength(&lseed, maxSampleLog); + size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog); + size_t const dstBuffSize = MIN(dstBufferSize - totalGenSize, randomDstSize); + inBuff.size = inBuff.pos + readCSrcSize; + outBuff.size = inBuff.pos + dstBuffSize; + decompressionResult = ZSTD_decompressStream(zd, &outBuff, &inBuff); + CHECK (ZSTD_isError(decompressionResult), "decompression error : %s", ZSTD_getErrorName(decompressionResult)); + } + CHECK (decompressionResult != 0, "frame not fully decoded"); + CHECK (outBuff.pos != totalTestSize, "decompressed data : wrong size") + CHECK (inBuff.pos != cSize, "compressed data should be fully read") + { U64 const crcDest = XXH64(dstBuffer, totalTestSize, 0); + if (crcDest!=crcOrig) findDiff(copyBuffer, dstBuffer, totalTestSize); + CHECK (crcDest!=crcOrig, "decompressed data corrupted"); + } } + + /*===== noisy/erroneous src decompression test =====*/ + + /* add some noise */ + { U32 const nbNoiseChunks = (FUZ_rand(&lseed) & 7) + 2; + U32 nn; for (nn=0; nn = testNb) { DISPLAYUPDATE(2, "\r%6u/%6u ", testNb, nbTests); } + else { DISPLAYUPDATE(2, "\r%6u ", testNb); } + FUZ_rand(&coreSeed); + lseed = coreSeed ^ prime32; + + /* states full reset (deliberately not synchronized) */ + /* some issues can only happen when reusing states */ + if ((FUZ_rand(&lseed) & 0xFF) == 131) { + U32 const nbThreadsCandidate = (FUZ_rand(&lseed) % 6) + 1; + U32 const nbThreads = MIN(nbThreadsCandidate, nbThreadsMax); + DISPLAYLEVEL(5, "Creating new context with %u threads \n", nbThreads); + ZSTDMT_freeCCtx(zc); + zc = ZSTDMT_createCCtx(nbThreads); + CHECK(zc==NULL, "ZSTDMT_createCCtx allocation error") + resetAllowed=0; + } + if ((FUZ_rand(&lseed) & 0xFF) == 132) { + ZSTD_freeDStream(zd); + zd = ZSTD_createDStream(); + CHECK(zd==NULL, "ZSTDMT_createCCtx allocation error") ZSTD_initDStream_usingDict(zd, NULL, 0); /* ensure at least one init */ } @@ -952,16 +1063,16 @@ static int fuzzerTests_MT(U32 seed, U32 nbTests, unsigned startTest, double comp maxTestSize = FUZ_randomLength(&lseed, oldTestLog+2); if (maxTestSize >= srcBufferSize) maxTestSize = srcBufferSize-1; { int const compressionLevel = (FUZ_rand(&lseed) % 5) + 1; - size_t const resetError = ZSTDMT_initCStream(zc, compressionLevel); - CHECK(ZSTD_isError(resetError), "ZSTDMT_initCStream error : %s", ZSTD_getErrorName(resetError)); + CHECK_Z( ZSTDMT_initCStream(zc, compressionLevel) ); } } else { U32 const testLog = FUZ_rand(&lseed) % maxSrcLog; U32 const dictLog = FUZ_rand(&lseed) % maxSrcLog; - U32 const cLevel = (FUZ_rand(&lseed) % - (ZSTD_maxCLevel() - - (MAX(testLog, dictLog) / cLevelLimiter))) + + U32 const cLevelCandidate = (FUZ_rand(&lseed) % + (ZSTD_maxCLevel() - + (MAX(testLog, dictLog) / 3))) + 1; + U32 const cLevel = MIN(cLevelCandidate, cLevelMax); maxTestSize = FUZ_rLogLength(&lseed, testLog); oldTestLog = testLog; /* random dictionary selection */ @@ -977,10 +1088,9 @@ static int fuzzerTests_MT(U32 seed, U32 nbTests, unsigned startTest, double comp params.fParams.noDictIDFlag = FUZ_rand(&lseed) & 1; params.fParams.contentSizeFlag = pledgedSrcSize>0; DISPLAYLEVEL(5, "checksumFlag : %u \n", params.fParams.checksumFlag); - { size_t const initError = ZSTDMT_initCStream_advanced(zc, dict, dictSize, params, pledgedSrcSize); - CHECK (ZSTD_isError(initError),"ZSTDMT_initCStream_advanced error : %s", ZSTD_getErrorName(initError)); } - ZSTDMT_setMTCtxParameter(zc, ZSTDMT_p_overlapSectionLog, FUZ_rand(&lseed) % 12); - ZSTDMT_setMTCtxParameter(zc, ZSTDMT_p_sectionSize, FUZ_rand(&lseed) % (2*maxTestSize+1)); + CHECK_Z( ZSTDMT_initCStream_advanced(zc, dict, dictSize, params, pledgedSrcSize) ); + CHECK_Z( ZSTDMT_setMTCtxParameter(zc, ZSTDMT_p_overlapSectionLog, FUZ_rand(&lseed) % 12) ); + CHECK_Z( ZSTDMT_setMTCtxParameter(zc, ZSTDMT_p_sectionSize, FUZ_rand(&lseed) % (2*maxTestSize+1)) ); } } /* multi-segments compression test */ @@ -998,8 +1108,7 @@ static int fuzzerTests_MT(U32 seed, U32 nbTests, unsigned startTest, double comp outBuff.size = outBuff.pos + dstBuffSize; DISPLAYLEVEL(5, "Sending %u bytes to compress \n", (U32)srcSize); - { size_t const compressionError = ZSTDMT_compressStream(zc, &outBuff, &inBuff); - CHECK (ZSTD_isError(compressionError), "compression error : %s", ZSTD_getErrorName(compressionError)); } + CHECK_Z( ZSTDMT_compressStream(zc, &outBuff, &inBuff) ); DISPLAYLEVEL(5, "%u bytes read by ZSTDMT_compressStream \n", (U32)inBuff.pos); XXH64_update(&xxhState, srcBuffer+srcStart, inBuff.pos); @@ -1013,9 +1122,8 @@ static int fuzzerTests_MT(U32 seed, U32 nbTests, unsigned startTest, double comp size_t const adjustedDstSize = MIN(cBufferSize - cSize, randomDstSize); outBuff.size = outBuff.pos + adjustedDstSize; DISPLAYLEVEL(5, "Flushing into dst buffer of size %u \n", (U32)adjustedDstSize); - { size_t const flushError = ZSTDMT_flushStream(zc, &outBuff); - CHECK (ZSTD_isError(flushError), "ZSTDMT_flushStream error : %s", ZSTD_getErrorName(flushError)); - } } } + CHECK_Z( ZSTDMT_flushStream(zc, &outBuff) ); + } } /* final frame epilogue */ { size_t remainingToFlush = (size_t)(-1); @@ -1035,9 +1143,9 @@ static int fuzzerTests_MT(U32 seed, U32 nbTests, unsigned startTest, double comp /* multi - fragments decompression test */ if (!dictSize /* don't reset if dictionary : could be different */ && (FUZ_rand(&lseed) & 1)) { - CHECK (ZSTD_isError(ZSTD_resetDStream(zd)), "ZSTD_resetDStream failed"); + CHECK_Z( ZSTD_resetDStream(zd) ); } else { - ZSTD_initDStream_usingDict(zd, dict, dictSize); + CHECK_Z( ZSTD_initDStream_usingDict(zd, dict, dictSize) ); } { size_t decompressionResult = 1; ZSTD_inBuffer inBuff = { cBuffer, cSize, 0 }; @@ -1073,7 +1181,7 @@ static int fuzzerTests_MT(U32 seed, U32 nbTests, unsigned startTest, double comp } } /* try decompression on noisy data */ - ZSTD_initDStream(zd_noise); /* note : no dictionary */ + CHECK_Z( ZSTD_initDStream(zd_noise) ); /* note : no dictionary */ { ZSTD_inBuffer inBuff = { cBuffer, cSize, 0 }; ZSTD_outBuffer outBuff= { dstBuffer, dstBufferSize, 0 }; while (outBuff.pos < dstBufferSize) { @@ -1110,6 +1218,297 @@ _output_error: } +/* Tests for ZSTD_compress_generic() API */ +static int fuzzerTests_newAPI(U32 seed, U32 nbTests, unsigned startTest, double compressibility, int bigTests) +{ + U32 const maxSrcLog = bigTests ? 24 : 22; + static const U32 maxSampleLog = 19; + size_t const srcBufferSize = (size_t)1< = testNb) { DISPLAYUPDATE(2, "\r%6u/%6u ", testNb, nbTests); } + else { DISPLAYUPDATE(2, "\r%6u ", testNb); } + FUZ_rand(&coreSeed); + lseed = coreSeed ^ prime32; + + /* states full reset (deliberately not synchronized) */ + /* some issues can only happen when reusing states */ + if ((FUZ_rand(&lseed) & 0xFF) == 131) { + DISPLAYLEVEL(5, "Creating new context \n"); + ZSTD_freeCCtx(zc); + zc = ZSTD_createCCtx(); + CHECK(zc==NULL, "ZSTD_createCCtx allocation error"); + resetAllowed=0; + } + if ((FUZ_rand(&lseed) & 0xFF) == 132) { + ZSTD_freeDStream(zd); + zd = ZSTD_createDStream(); + CHECK(zd==NULL, "ZSTD_createDStream allocation error"); + ZSTD_initDStream_usingDict(zd, NULL, 0); /* ensure at least one init */ + } + + /* srcBuffer selection [0-4] */ + { U32 buffNb = FUZ_rand(&lseed) & 0x7F; + if (buffNb & 7) buffNb=2; /* most common : compressible (P) */ + else { + buffNb >>= 3; + if (buffNb & 7) { + const U32 tnb[2] = { 1, 3 }; /* barely/highly compressible */ + buffNb = tnb[buffNb >> 3]; + } else { + const U32 tnb[2] = { 0, 4 }; /* not compressible / sparse */ + buffNb = tnb[buffNb >> 3]; + } } + srcBuffer = cNoiseBuffer[buffNb]; + } + + /* compression init */ + CHECK_Z( ZSTD_CCtx_loadDictionary(zc, NULL, 0) ); /* cancel previous dict /*/ + if ((FUZ_rand(&lseed)&1) /* at beginning, to keep same nb of rand */ + && oldTestLog /* at least one test happened */ && resetAllowed) { + maxTestSize = FUZ_randomLength(&lseed, oldTestLog+2); + if (maxTestSize >= srcBufferSize) maxTestSize = srcBufferSize-1; + { int const compressionLevel = (FUZ_rand(&lseed) % 5) + 1; + CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_compressionLevel, compressionLevel) ); + } + } else { + U32 const testLog = FUZ_rand(&lseed) % maxSrcLog; + U32 const dictLog = FUZ_rand(&lseed) % maxSrcLog; + U32 const cLevelCandidate = (FUZ_rand(&lseed) % + (ZSTD_maxCLevel() - + (MAX(testLog, dictLog) / 3))) + + 1; + U32 const cLevel = MIN(cLevelCandidate, cLevelMax); + maxTestSize = FUZ_rLogLength(&lseed, testLog); + oldTestLog = testLog; + /* random dictionary selection */ + dictSize = ((FUZ_rand(&lseed)&63)==1) ? FUZ_rLogLength(&lseed, dictLog) : 0; + { size_t const dictStart = FUZ_rand(&lseed) % (srcBufferSize - dictSize); + dict = srcBuffer + dictStart; + if (!dictSize) dict=NULL; + } + { U64 const pledgedSrcSize = (FUZ_rand(&lseed) & 3) ? ZSTD_CONTENTSIZE_UNKNOWN : maxTestSize; + ZSTD_compressionParameters cParams = ZSTD_getCParams(cLevel, pledgedSrcSize, dictSize); + + /* mess with compression parameters */ + cParams.windowLog += (FUZ_rand(&lseed) & 3) - 1; + cParams.hashLog += (FUZ_rand(&lseed) & 3) - 1; + cParams.chainLog += (FUZ_rand(&lseed) & 3) - 1; + cParams.searchLog += (FUZ_rand(&lseed) & 3) - 1; + cParams.searchLength += (FUZ_rand(&lseed) & 3) - 1; + cParams.targetLength = (U32)(cParams.targetLength * (0.5 + ((double)(FUZ_rand(&lseed) & 127) / 128))); + cParams = ZSTD_adjustCParams(cParams, 0, 0); + + if (FUZ_rand(&lseed) & 1) CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_windowLog, cParams.windowLog) ); + if (FUZ_rand(&lseed) & 1) CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_hashLog, cParams.hashLog) ); + if (FUZ_rand(&lseed) & 1) CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_chainLog, cParams.chainLog) ); + if (FUZ_rand(&lseed) & 1) CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_searchLog, cParams.searchLog) ); + if (FUZ_rand(&lseed) & 1) CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_minMatch, cParams.searchLength) ); + if (FUZ_rand(&lseed) & 1) CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_targetLength, cParams.targetLength) ); + + /* unconditionally set, to be sync with decoder */ + if (FUZ_rand(&lseed) & 1) CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_refDictContent, FUZ_rand(&lseed) & 1) ); + if (FUZ_rand(&lseed) & 1) { + CHECK_Z( ZSTD_CCtx_loadDictionary(zc, dict, dictSize) ); + if (dict && dictSize) { + /* test that compression parameters are rejected (correctly) after loading a non-NULL dictionary */ + size_t const setError = ZSTD_CCtx_setParameter(zc, ZSTD_p_windowLog, cParams.windowLog-1) ; + CHECK(!ZSTD_isError(setError), "ZSTD_CCtx_setParameter should have failed"); + } } else { + CHECK_Z( ZSTD_CCtx_refPrefix(zc, dict, dictSize) ); + } + + /* mess with frame parameters */ + if (FUZ_rand(&lseed) & 1) CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_checksumFlag, FUZ_rand(&lseed) & 1) ); + if (FUZ_rand(&lseed) & 1) CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_dictIDFlag, FUZ_rand(&lseed) & 1) ); + if (FUZ_rand(&lseed) & 1) CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_contentSizeFlag, FUZ_rand(&lseed) & 1) ); + if (FUZ_rand(&lseed) & 1) CHECK_Z( ZSTD_CCtx_setPledgedSrcSize(zc, pledgedSrcSize) ); + DISPLAYLEVEL(5, "pledgedSrcSize : %u \n", (U32)pledgedSrcSize); + + /* multi-threading parameters */ + { U32 const nbThreadsCandidate = (FUZ_rand(&lseed) & 4) + 1; + U32 const nbThreads = MIN(nbThreadsCandidate, nbThreadsMax); + CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_nbThreads, nbThreads) ); + if (nbThreads > 1) { + U32 const jobLog = FUZ_rand(&lseed) % (testLog+1); + CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_overlapSizeLog, FUZ_rand(&lseed) % 10) ); + CHECK_Z( ZSTD_CCtx_setParameter(zc, ZSTD_p_jobSize, (U32)FUZ_rLogLength(&lseed, jobLog)) ); + } } } } + + /* multi-segments compression test */ + XXH64_reset(&xxhState, 0); + { ZSTD_outBuffer outBuff = { cBuffer, cBufferSize, 0 } ; + U32 n; + for (n=0, cSize=0, totalTestSize=0 ; totalTestSize < maxTestSize ; n++) { + /* compress random chunks into randomly sized dst buffers */ + size_t const randomSrcSize = FUZ_randomLength(&lseed, maxSampleLog); + size_t const srcSize = MIN(maxTestSize-totalTestSize, randomSrcSize); + size_t const srcStart = FUZ_rand(&lseed) % (srcBufferSize - srcSize); + size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog); + size_t const dstBuffSize = MIN(cBufferSize - cSize, randomDstSize); + ZSTD_EndDirective const flush = (FUZ_rand(&lseed) & 15) ? ZSTD_e_continue : ZSTD_e_flush; + ZSTD_inBuffer inBuff = { srcBuffer+srcStart, srcSize, 0 }; + outBuff.size = outBuff.pos + dstBuffSize; + + CHECK_Z( ZSTD_compress_generic(zc, &outBuff, &inBuff, flush) ); + DISPLAYLEVEL(5, "compress consumed %u bytes (total : %u) \n", + (U32)inBuff.pos, (U32)(totalTestSize + inBuff.pos)); + + XXH64_update(&xxhState, srcBuffer+srcStart, inBuff.pos); + memcpy(copyBuffer+totalTestSize, srcBuffer+srcStart, inBuff.pos); + totalTestSize += inBuff.pos; + } + + /* final frame epilogue */ + { size_t remainingToFlush = (size_t)(-1); + while (remainingToFlush) { + ZSTD_inBuffer inBuff = { NULL, 0, 0 }; + size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog); + size_t const adjustedDstSize = MIN(cBufferSize - cSize, randomDstSize); + outBuff.size = outBuff.pos + adjustedDstSize; + DISPLAYLEVEL(5, "End-flush into dst buffer of size %u \n", (U32)adjustedDstSize); + remainingToFlush = ZSTD_compress_generic(zc, &outBuff, &inBuff, ZSTD_e_end); + CHECK(ZSTD_isError(remainingToFlush), + "ZSTD_compress_generic w/ ZSTD_e_end error : %s", + ZSTD_getErrorName(remainingToFlush) ); + } } + crcOrig = XXH64_digest(&xxhState); + cSize = outBuff.pos; + DISPLAYLEVEL(5, "Frame completed : %u bytes \n", (U32)cSize); + } + + /* multi - fragments decompression test */ + if (!dictSize /* don't reset if dictionary : could be different */ && (FUZ_rand(&lseed) & 1)) { + DISPLAYLEVEL(5, "resetting DCtx (dict:%08X) \n", (U32)(size_t)dict); + CHECK_Z( ZSTD_resetDStream(zd) ); + } else { + DISPLAYLEVEL(5, "using dict of size %u \n", (U32)dictSize); + CHECK_Z( ZSTD_initDStream_usingDict(zd, dict, dictSize) ); + } + { size_t decompressionResult = 1; + ZSTD_inBuffer inBuff = { cBuffer, cSize, 0 }; + ZSTD_outBuffer outBuff= { dstBuffer, dstBufferSize, 0 }; + for (totalGenSize = 0 ; decompressionResult ; ) { + size_t const readCSrcSize = FUZ_randomLength(&lseed, maxSampleLog); + size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog); + size_t const dstBuffSize = MIN(dstBufferSize - totalGenSize, randomDstSize); + inBuff.size = inBuff.pos + readCSrcSize; + outBuff.size = inBuff.pos + dstBuffSize; + DISPLAYLEVEL(5, "ZSTD_decompressStream input %u bytes (pos:%u/%u)\n", + (U32)readCSrcSize, (U32)inBuff.pos, (U32)cSize); + decompressionResult = ZSTD_decompressStream(zd, &outBuff, &inBuff); + CHECK (ZSTD_isError(decompressionResult), "decompression error : %s", ZSTD_getErrorName(decompressionResult)); + DISPLAYLEVEL(5, "inBuff.pos = %u \n", (U32)readCSrcSize); + } + CHECK (outBuff.pos != totalTestSize, "decompressed data : wrong size (%u != %u)", (U32)outBuff.pos, (U32)totalTestSize); + CHECK (inBuff.pos != cSize, "compressed data should be fully read (%u != %u)", (U32)inBuff.pos, (U32)cSize); + { U64 const crcDest = XXH64(dstBuffer, totalTestSize, 0); + if (crcDest!=crcOrig) findDiff(copyBuffer, dstBuffer, totalTestSize); + CHECK (crcDest!=crcOrig, "decompressed data corrupted"); + } } + + /*===== noisy/erroneous src decompression test =====*/ + + /* add some noise */ + { U32 const nbNoiseChunks = (FUZ_rand(&lseed) & 7) + 2; + U32 nn; for (nn=0; nn #include /* vsprintf */ #include /* va_list, for z_gzprintf */ #define NO_DUMMY_DECL #define ZLIB_CONST #include /* without #define Z_PREFIX */ #include "zstd_zlibwrapper.h" -#define ZSTD_STATIC_LINKING_ONLY /* ZSTD_MAGICNUMBER */ +#define ZSTD_STATIC_LINKING_ONLY /* ZSTD_isFrame, ZSTD_MAGICNUMBER */ #include "zstd.h" -#include "zstd_internal.h" /* defaultCustomMem */ +#include "zstd_internal.h" /* ZSTD_malloc, ZSTD_free */ +/* === Constants === */ #define Z_INFLATE_SYNC 8 #define ZLIB_HEADERSIZE 4 #define ZSTD_HEADERSIZE ZSTD_frameHeaderSize_min #define ZWRAP_DEFAULT_CLEVEL 3 /* Z_DEFAULT_COMPRESSION is translated to ZWRAP_DEFAULT_CLEVEL for zstd */ -#define LOG_WRAPPERC(...) /* printf(__VA_ARGS__) */ -#define LOG_WRAPPERD(...) /* printf(__VA_ARGS__) */ + +/* === Debug === */ +#define LOG_WRAPPERC(...) /* fprintf(stderr, __VA_ARGS__) */ +#define LOG_WRAPPERD(...) /* fprintf(stderr, __VA_ARGS__) */ #define FINISH_WITH_GZ_ERR(msg) { (void)msg; return Z_STREAM_ERROR; } #define FINISH_WITH_NULL_ERR(msg) { (void)msg; return NULL; } - -#ifndef ZWRAP_USE_ZSTD - #define ZWRAP_USE_ZSTD 0 -#endif - -static int g_ZWRAP_useZSTDcompression = ZWRAP_USE_ZSTD; /* 0 = don't use ZSTD */ +/* === Wrapper === */ +static int g_ZWRAP_useZSTDcompression = ZWRAP_USE_ZSTD; /* 0 = don't use ZSTD */ void ZWRAP_useZSTDcompression(int turn_on) { g_ZWRAP_useZSTDcompression = turn_on; } @@ -62,7 +69,7 @@ static void* ZWRAP_allocFunction(void* opaque, size_t size) { z_streamp strm = (z_streamp) opaque; void* address = strm->zalloc(strm->opaque, 1, (uInt)size); - /* printf("ZWRAP alloc %p, %d \n", address, (int)size); */ + /* LOG_WRAPPERC("ZWRAP alloc %p, %d \n", address, (int)size); */ return address; } @@ -70,12 +77,12 @@ static void ZWRAP_freeFunction(void* opaque, void* address) { z_streamp strm = (z_streamp) opaque; strm->zfree(strm->opaque, address); - /* if (address) printf("ZWRAP free %p \n", address); */ + /* if (address) LOG_WRAPPERC("ZWRAP free %p \n", address); */ } -/* *** Compression *** */ +/* === Compression === */ typedef enum { ZWRAP_useInit, ZWRAP_useReset, ZWRAP_streamEnd } ZWRAP_state_t; typedef struct { @@ -95,16 +102,16 @@ typedef ZWRAP_CCtx internal_state; -size_t ZWRAP_freeCCtx(ZWRAP_CCtx* zwc) +static size_t ZWRAP_freeCCtx(ZWRAP_CCtx* zwc) { if (zwc==NULL) return 0; /* support free on NULL */ - if (zwc->zbc) ZSTD_freeCStream(zwc->zbc); - zwc->customMem.customFree(zwc->customMem.opaque, zwc); + ZSTD_freeCStream(zwc->zbc); + ZSTD_free(zwc, zwc->customMem); return 0; } -ZWRAP_CCtx* ZWRAP_createCCtx(z_streamp strm) +static ZWRAP_CCtx* ZWRAP_createCCtx(z_streamp strm) { ZWRAP_CCtx* zwc; @@ -113,37 +120,36 @@ ZWRAP_CCtx* ZWRAP_createCCtx(z_streamp strm) if (zwc==NULL) return NULL; memset(zwc, 0, sizeof(ZWRAP_CCtx)); memcpy(&zwc->allocFunc, strm, sizeof(z_stream)); - { ZSTD_customMem ZWRAP_customMem = { ZWRAP_allocFunction, ZWRAP_freeFunction, &zwc->allocFunc }; - memcpy(&zwc->customMem, &ZWRAP_customMem, sizeof(ZSTD_customMem)); - } + { ZSTD_customMem const ZWRAP_customMem = { ZWRAP_allocFunction, ZWRAP_freeFunction, &zwc->allocFunc }; + zwc->customMem = ZWRAP_customMem; } } else { - zwc = (ZWRAP_CCtx*)defaultCustomMem.customAlloc(defaultCustomMem.opaque, sizeof(ZWRAP_CCtx)); + zwc = (ZWRAP_CCtx*)calloc(1, sizeof(*zwc)); if (zwc==NULL) return NULL; - memset(zwc, 0, sizeof(ZWRAP_CCtx)); - memcpy(&zwc->customMem, &defaultCustomMem, sizeof(ZSTD_customMem)); } return zwc; } -int ZWRAP_initializeCStream(ZWRAP_CCtx* zwc, const void* dict, size_t dictSize, unsigned long long pledgedSrcSize) +static int ZWRAP_initializeCStream(ZWRAP_CCtx* zwc, const void* dict, size_t dictSize, unsigned long long pledgedSrcSize) { LOG_WRAPPERC("- ZWRAP_initializeCStream=%p\n", zwc); if (zwc == NULL || zwc->zbc == NULL) return Z_STREAM_ERROR; if (!pledgedSrcSize) pledgedSrcSize = zwc->pledgedSrcSize; - { ZSTD_parameters const params = ZSTD_getParams(zwc->compressionLevel, pledgedSrcSize, dictSize); - size_t errorCode; - LOG_WRAPPERC("pledgedSrcSize=%d windowLog=%d chainLog=%d hashLog=%d searchLog=%d searchLength=%d strategy=%d\n", (int)pledgedSrcSize, params.cParams.windowLog, params.cParams.chainLog, params.cParams.hashLog, params.cParams.searchLog, params.cParams.searchLength, params.cParams.strategy); - errorCode = ZSTD_initCStream_advanced(zwc->zbc, dict, dictSize, params, pledgedSrcSize); - if (ZSTD_isError(errorCode)) return Z_STREAM_ERROR; } + { ZSTD_parameters const params = ZSTD_getParams(zwc->compressionLevel, pledgedSrcSize, dictSize); + size_t initErr; + LOG_WRAPPERC("pledgedSrcSize=%d windowLog=%d chainLog=%d hashLog=%d searchLog=%d searchLength=%d strategy=%d\n", + (int)pledgedSrcSize, params.cParams.windowLog, params.cParams.chainLog, params.cParams.hashLog, params.cParams.searchLog, params.cParams.searchLength, params.cParams.strategy); + initErr = ZSTD_initCStream_advanced(zwc->zbc, dict, dictSize, params, pledgedSrcSize); + if (ZSTD_isError(initErr)) return Z_STREAM_ERROR; + } return Z_OK; } -int ZWRAPC_finishWithError(ZWRAP_CCtx* zwc, z_streamp strm, int error) +static int ZWRAPC_finishWithError(ZWRAP_CCtx* zwc, z_streamp strm, int error) { LOG_WRAPPERC("- ZWRAPC_finishWithError=%d\n", error); if (zwc) ZWRAP_freeCCtx(zwc); @@ -152,7 +158,7 @@ int ZWRAPC_finishWithError(ZWRAP_CCtx* zwc, z_streamp strm, int error) } -int ZWRAPC_finishWithErrorMsg(z_streamp strm, char* message) +static int ZWRAPC_finishWithErrorMsg(z_streamp strm, char* message) { ZWRAP_CCtx* zwc = (ZWRAP_CCtx*) strm->state; strm->msg = message; @@ -219,7 +225,7 @@ int ZWRAP_deflateReset_keepDict(z_streamp strm) return deflateReset(strm); { ZWRAP_CCtx* zwc = (ZWRAP_CCtx*) strm->state; - if (zwc) { + if (zwc) { zwc->streamEnd = 0; zwc->totalInBytes = 0; } @@ -277,34 +283,36 @@ ZEXTERN int ZEXPORT z_deflate OF((z_streamp strm, int flush)) ZWRAP_CCtx* zwc; if (!g_ZWRAP_useZSTDcompression) { - int res; - LOG_WRAPPERC("- deflate1 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d\n", (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out); - res = deflate(strm, flush); - return res; + LOG_WRAPPERC("- deflate1 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d\n", + (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out); + return deflate(strm, flush); } zwc = (ZWRAP_CCtx*) strm->state; if (zwc == NULL) { LOG_WRAPPERC("zwc == NULL\n"); return Z_STREAM_ERROR; } if (zwc->zbc == NULL) { - int res; zwc->zbc = ZSTD_createCStream_advanced(zwc->customMem); if (zwc->zbc == NULL) return ZWRAPC_finishWithError(zwc, strm, 0); - res = ZWRAP_initializeCStream(zwc, NULL, 0, (flush == Z_FINISH) ? strm->avail_in : 0); - if (res != Z_OK) return ZWRAPC_finishWithError(zwc, strm, res); + { int const initErr = ZWRAP_initializeCStream(zwc, NULL, 0, (flush == Z_FINISH) ? strm->avail_in : 0); + if (initErr != Z_OK) return ZWRAPC_finishWithError(zwc, strm, initErr); } if (flush != Z_FINISH) zwc->comprState = ZWRAP_useReset; } else { if (zwc->totalInBytes == 0) { if (zwc->comprState == ZWRAP_useReset) { - size_t const errorCode = ZSTD_resetCStream(zwc->zbc, (flush == Z_FINISH) ? strm->avail_in : zwc->pledgedSrcSize); - if (ZSTD_isError(errorCode)) { LOG_WRAPPERC("ERROR: ZSTD_resetCStream errorCode=%s\n", ZSTD_getErrorName(errorCode)); return ZWRAPC_finishWithError(zwc, strm, 0); } + size_t const resetErr = ZSTD_resetCStream(zwc->zbc, (flush == Z_FINISH) ? strm->avail_in : zwc->pledgedSrcSize); + if (ZSTD_isError(resetErr)) { + LOG_WRAPPERC("ERROR: ZSTD_resetCStream errorCode=%s\n", + ZSTD_getErrorName(resetErr)); + return ZWRAPC_finishWithError(zwc, strm, 0); + } } else { - int res = ZWRAP_initializeCStream(zwc, NULL, 0, (flush == Z_FINISH) ? strm->avail_in : 0); + int const res = ZWRAP_initializeCStream(zwc, NULL, 0, (flush == Z_FINISH) ? strm->avail_in : 0); if (res != Z_OK) return ZWRAPC_finishWithError(zwc, strm, res); if (flush != Z_FINISH) zwc->comprState = ZWRAP_useReset; } - } - } + } /* (zwc->totalInBytes == 0) */ + } /* ! (zwc->zbc == NULL) */ LOG_WRAPPERC("- deflate2 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d\n", (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out); if (strm->avail_in > 0) { @@ -314,9 +322,9 @@ ZEXTERN int ZEXPORT z_deflate OF((z_streamp strm, int flush)) zwc->outBuffer.dst = strm->next_out; zwc->outBuffer.size = strm->avail_out; zwc->outBuffer.pos = 0; - { size_t const errorCode = ZSTD_compressStream(zwc->zbc, &zwc->outBuffer, &zwc->inBuffer); + { size_t const cErr = ZSTD_compressStream(zwc->zbc, &zwc->outBuffer, &zwc->inBuffer); LOG_WRAPPERC("deflate ZSTD_compressStream srcSize=%d dstCapacity=%d\n", (int)zwc->inBuffer.size, (int)zwc->outBuffer.size); - if (ZSTD_isError(errorCode)) return ZWRAPC_finishWithError(zwc, strm, 0); + if (ZSTD_isError(cErr)) return ZWRAPC_finishWithError(zwc, strm, 0); } strm->next_out += zwc->outBuffer.pos; strm->total_out += zwc->outBuffer.pos; @@ -346,8 +354,12 @@ ZEXTERN int ZEXPORT z_deflate OF((z_streamp strm, int flush)) strm->next_out += zwc->outBuffer.pos; strm->total_out += zwc->outBuffer.pos; strm->avail_out -= zwc->outBuffer.pos; - if (bytesLeft == 0) { zwc->streamEnd = 1; LOG_WRAPPERC("Z_STREAM_END2 strm->total_in=%d strm->avail_out=%d strm->total_out=%d\n", (int)strm->total_in, (int)strm->avail_out, (int)strm->total_out); return Z_STREAM_END; } - } + if (bytesLeft == 0) { + zwc->streamEnd = 1; + LOG_WRAPPERC("Z_STREAM_END2 strm->total_in=%d strm->avail_out=%d strm->total_out=%d\n", + (int)strm->total_in, (int)strm->avail_out, (int)strm->total_out); + return Z_STREAM_END; + } } else if (flush == Z_SYNC_FLUSH || flush == Z_PARTIAL_FLUSH) { size_t bytesLeft; @@ -410,12 +422,13 @@ ZEXTERN int ZEXPORT z_deflateParams OF((z_streamp strm, -/* *** Decompression *** */ +/* === Decompression === */ + typedef enum { ZWRAP_ZLIB_STREAM, ZWRAP_ZSTD_STREAM, ZWRAP_UNKNOWN_STREAM } ZWRAP_stream_type; typedef struct { ZSTD_DStream* zbd; - char headerBuf[16]; /* should be equal or bigger than ZSTD_frameHeaderSize_min */ + char headerBuf[16]; /* must be >= ZSTD_frameHeaderSize_min */ int errorCount; unsigned long long totalInBytes; /* we need it as strm->total_in can be reset by user */ ZWRAP_state_t decompState; @@ -427,10 +440,48 @@ typedef struct { char *version; int windowBits; ZSTD_customMem customMem; - z_stream allocFunc; /* copy of zalloc, zfree, opaque */ + z_stream allocFunc; /* just to copy zalloc, zfree, opaque */ } ZWRAP_DCtx; +static void ZWRAP_initDCtx(ZWRAP_DCtx* zwd) +{ + zwd->errorCount = 0; + zwd->outBuffer.pos = 0; + zwd->outBuffer.size = 0; +} + +static ZWRAP_DCtx* ZWRAP_createDCtx(z_streamp strm) +{ + ZWRAP_DCtx* zwd; + MEM_STATIC_ASSERT(sizeof(zwd->headerBuf) >= ZSTD_FRAMEHEADERSIZE_MIN); /* check static buffer size condition */ + + if (strm->zalloc && strm->zfree) { + zwd = (ZWRAP_DCtx*)strm->zalloc(strm->opaque, 1, sizeof(ZWRAP_DCtx)); + if (zwd==NULL) return NULL; + memset(zwd, 0, sizeof(ZWRAP_DCtx)); + zwd->allocFunc = *strm; /* just to copy zalloc, zfree & opaque */ + { ZSTD_customMem const ZWRAP_customMem = { ZWRAP_allocFunction, ZWRAP_freeFunction, &zwd->allocFunc }; + zwd->customMem = ZWRAP_customMem; } + } else { + zwd = (ZWRAP_DCtx*)calloc(1, sizeof(*zwd)); + if (zwd==NULL) return NULL; + } + + ZWRAP_initDCtx(zwd); + return zwd; +} + +static size_t ZWRAP_freeDCtx(ZWRAP_DCtx* zwd) +{ + if (zwd==NULL) return 0; /* support free on null */ + ZSTD_freeDStream(zwd->zbd); + ZSTD_free(zwd->version, zwd->customMem); + ZSTD_free(zwd, zwd->customMem); + return 0; +} + + int ZWRAP_isUsingZSTDdecompression(z_streamp strm) { if (strm == NULL) return 0; @@ -438,61 +489,17 @@ int ZWRAP_isUsingZSTDdecompression(z_streamp strm) } -void ZWRAP_initDCtx(ZWRAP_DCtx* zwd) -{ - zwd->errorCount = 0; - zwd->outBuffer.pos = 0; - zwd->outBuffer.size = 0; -} - - -ZWRAP_DCtx* ZWRAP_createDCtx(z_streamp strm) -{ - ZWRAP_DCtx* zwd; - - if (strm->zalloc && strm->zfree) { - zwd = (ZWRAP_DCtx*)strm->zalloc(strm->opaque, 1, sizeof(ZWRAP_DCtx)); - if (zwd==NULL) return NULL; - memset(zwd, 0, sizeof(ZWRAP_DCtx)); - memcpy(&zwd->allocFunc, strm, sizeof(z_stream)); - { ZSTD_customMem ZWRAP_customMem = { ZWRAP_allocFunction, ZWRAP_freeFunction, &zwd->allocFunc }; - memcpy(&zwd->customMem, &ZWRAP_customMem, sizeof(ZSTD_customMem)); - } - } else { - zwd = (ZWRAP_DCtx*)defaultCustomMem.customAlloc(defaultCustomMem.opaque, sizeof(ZWRAP_DCtx)); - if (zwd==NULL) return NULL; - memset(zwd, 0, sizeof(ZWRAP_DCtx)); - memcpy(&zwd->customMem, &defaultCustomMem, sizeof(ZSTD_customMem)); - } - - MEM_STATIC_ASSERT(sizeof(zwd->headerBuf) >= ZSTD_FRAMEHEADERSIZE_MIN); /* if compilation fails here, assertion is false */ - ZWRAP_initDCtx(zwd); - return zwd; -} - - -size_t ZWRAP_freeDCtx(ZWRAP_DCtx* zwd) -{ - if (zwd==NULL) return 0; /* support free on null */ - if (zwd->zbd) ZSTD_freeDStream(zwd->zbd); - if (zwd->version) zwd->customMem.customFree(zwd->customMem.opaque, zwd->version); - zwd->customMem.customFree(zwd->customMem.opaque, zwd); - return 0; -} - - -int ZWRAPD_finishWithError(ZWRAP_DCtx* zwd, z_streamp strm, int error) +static int ZWRAPD_finishWithError(ZWRAP_DCtx* zwd, z_streamp strm, int error) { LOG_WRAPPERD("- ZWRAPD_finishWithError=%d\n", error); - if (zwd) ZWRAP_freeDCtx(zwd); - if (strm) strm->state = NULL; + ZWRAP_freeDCtx(zwd); + strm->state = NULL; return (error) ? error : Z_STREAM_ERROR; } - -int ZWRAPD_finishWithErrorMsg(z_streamp strm, char* message) +static int ZWRAPD_finishWithErrorMsg(z_streamp strm, char* message) { - ZWRAP_DCtx* zwd = (ZWRAP_DCtx*) strm->state; + ZWRAP_DCtx* const zwd = (ZWRAP_DCtx*) strm->state; strm->msg = message; if (zwd == NULL) return Z_STREAM_ERROR; @@ -504,26 +511,25 @@ ZEXTERN int ZEXPORT z_inflateInit_ OF((z_streamp strm, const char *version, int stream_size)) { if (g_ZWRAPdecompressionType == ZWRAP_FORCE_ZLIB) { - strm->reserved = ZWRAP_ZLIB_STREAM; /* mark as zlib stream */ + strm->reserved = ZWRAP_ZLIB_STREAM; return inflateInit(strm); } - { - ZWRAP_DCtx* zwd = ZWRAP_createDCtx(strm); - LOG_WRAPPERD("- inflateInit\n"); - if (zwd == NULL) return ZWRAPD_finishWithError(zwd, strm, 0); + { ZWRAP_DCtx* const zwd = ZWRAP_createDCtx(strm); + LOG_WRAPPERD("- inflateInit\n"); + if (zwd == NULL) return ZWRAPD_finishWithError(zwd, strm, 0); - zwd->version = zwd->customMem.customAlloc(zwd->customMem.opaque, strlen(version) + 1); - if (zwd->version == NULL) return ZWRAPD_finishWithError(zwd, strm, 0); - strcpy(zwd->version, version); + zwd->version = ZSTD_malloc(strlen(version)+1, zwd->customMem); + if (zwd->version == NULL) return ZWRAPD_finishWithError(zwd, strm, 0); + strcpy(zwd->version, version); - zwd->stream_size = stream_size; - zwd->totalInBytes = 0; - strm->state = (struct internal_state*) zwd; /* use state which in not used by user */ - strm->total_in = 0; - strm->total_out = 0; - strm->reserved = ZWRAP_UNKNOWN_STREAM; /* mark as unknown steam */ - strm->adler = 0; + zwd->stream_size = stream_size; + zwd->totalInBytes = 0; + strm->state = (struct internal_state*) zwd; + strm->total_in = 0; + strm->total_out = 0; + strm->reserved = ZWRAP_UNKNOWN_STREAM; + strm->adler = 0; } return Z_OK; @@ -537,15 +543,14 @@ ZEXTERN int ZEXPORT z_inflateInit2_ OF((z_streamp strm, int windowBits, return inflateInit2_(strm, windowBits, version, stream_size); } - { - int ret = z_inflateInit_ (strm, version, stream_size); - LOG_WRAPPERD("- inflateInit2 windowBits=%d\n", windowBits); - if (ret == Z_OK) { - ZWRAP_DCtx* zwd = (ZWRAP_DCtx*)strm->state; - if (zwd == NULL) return Z_STREAM_ERROR; - zwd->windowBits = windowBits; - } - return ret; + { int const ret = z_inflateInit_ (strm, version, stream_size); + LOG_WRAPPERD("- inflateInit2 windowBits=%d\n", windowBits); + if (ret == Z_OK) { + ZWRAP_DCtx* const zwd = (ZWRAP_DCtx*)strm->state; + if (zwd == NULL) return Z_STREAM_ERROR; + zwd->windowBits = windowBits; + } + return ret; } } @@ -555,7 +560,7 @@ int ZWRAP_inflateReset_keepDict(z_streamp strm) if (g_ZWRAPdecompressionType == ZWRAP_FORCE_ZLIB || !strm->reserved) return inflateReset(strm); - { ZWRAP_DCtx* zwd = (ZWRAP_DCtx*) strm->state; + { ZWRAP_DCtx* const zwd = (ZWRAP_DCtx*) strm->state; if (zwd == NULL) return Z_STREAM_ERROR; ZWRAP_initDCtx(zwd); zwd->decompState = ZWRAP_useReset; @@ -574,10 +579,10 @@ ZEXTERN int ZEXPORT z_inflateReset OF((z_streamp strm)) if (g_ZWRAPdecompressionType == ZWRAP_FORCE_ZLIB || !strm->reserved) return inflateReset(strm); - { int ret = ZWRAP_inflateReset_keepDict(strm); + { int const ret = ZWRAP_inflateReset_keepDict(strm); if (ret != Z_OK) return ret; } - { ZWRAP_DCtx* zwd = (ZWRAP_DCtx*) strm->state; + { ZWRAP_DCtx* const zwd = (ZWRAP_DCtx*) strm->state; if (zwd == NULL) return Z_STREAM_ERROR; zwd->decompState = ZWRAP_useInit; } @@ -592,9 +597,9 @@ ZEXTERN int ZEXPORT z_inflateReset2 OF((z_streamp strm, if (g_ZWRAPdecompressionType == ZWRAP_FORCE_ZLIB || !strm->reserved) return inflateReset2(strm, windowBits); - { int ret = z_inflateReset (strm); + { int const ret = z_inflateReset (strm); if (ret == Z_OK) { - ZWRAP_DCtx* zwd = (ZWRAP_DCtx*)strm->state; + ZWRAP_DCtx* const zwd = (ZWRAP_DCtx*)strm->state; if (zwd == NULL) return Z_STREAM_ERROR; zwd->windowBits = windowBits; } @@ -612,11 +617,10 @@ ZEXTERN int ZEXPORT z_inflateSetDictionary OF((z_streamp strm, if (g_ZWRAPdecompressionType == ZWRAP_FORCE_ZLIB || !strm->reserved) return inflateSetDictionary(strm, dictionary, dictLength); - { size_t errorCode; - ZWRAP_DCtx* zwd = (ZWRAP_DCtx*) strm->state; + { ZWRAP_DCtx* const zwd = (ZWRAP_DCtx*) strm->state; if (zwd == NULL || zwd->zbd == NULL) return Z_STREAM_ERROR; - errorCode = ZSTD_initDStream_usingDict(zwd->zbd, dictionary, dictLength); - if (ZSTD_isError(errorCode)) return ZWRAPD_finishWithError(zwd, strm, 0); + { size_t const initErr = ZSTD_initDStream_usingDict(zwd->zbd, dictionary, dictLength); + if (ZSTD_isError(initErr)) return ZWRAPD_finishWithError(zwd, strm, 0); } zwd->decompState = ZWRAP_useReset; if (zwd->totalInBytes == ZSTD_HEADERSIZE) { @@ -626,14 +630,14 @@ ZEXTERN int ZEXPORT z_inflateSetDictionary OF((z_streamp strm, zwd->outBuffer.dst = strm->next_out; zwd->outBuffer.size = 0; zwd->outBuffer.pos = 0; - errorCode = ZSTD_decompressStream(zwd->zbd, &zwd->outBuffer, &zwd->inBuffer); - LOG_WRAPPERD("inflateSetDictionary ZSTD_decompressStream errorCode=%d srcSize=%d dstCapacity=%d\n", (int)errorCode, (int)zwd->inBuffer.size, (int)zwd->outBuffer.size); - if (zwd->inBuffer.pos < zwd->outBuffer.size || ZSTD_isError(errorCode)) { - LOG_WRAPPERD("ERROR: ZSTD_decompressStream %s\n", ZSTD_getErrorName(errorCode)); - return ZWRAPD_finishWithError(zwd, strm, 0); - } - } - } + { size_t const errorCode = ZSTD_decompressStream(zwd->zbd, &zwd->outBuffer, &zwd->inBuffer); + LOG_WRAPPERD("inflateSetDictionary ZSTD_decompressStream errorCode=%d srcSize=%d dstCapacity=%d\n", + (int)errorCode, (int)zwd->inBuffer.size, (int)zwd->outBuffer.size); + if (zwd->inBuffer.pos < zwd->outBuffer.size || ZSTD_isError(errorCode)) { + LOG_WRAPPERD("ERROR: ZSTD_decompressStream %s\n", + ZSTD_getErrorName(errorCode)); + return ZWRAPD_finishWithError(zwd, strm, 0); + } } } } return Z_OK; } @@ -642,156 +646,175 @@ ZEXTERN int ZEXPORT z_inflateSetDictionary OF((z_streamp strm, ZEXTERN int ZEXPORT z_inflate OF((z_streamp strm, int flush)) { ZWRAP_DCtx* zwd; - int res; + if (g_ZWRAPdecompressionType == ZWRAP_FORCE_ZLIB || !strm->reserved) { - LOG_WRAPPERD("- inflate1 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d\n", (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out); - res = inflate(strm, flush); - LOG_WRAPPERD("- inflate2 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d res=%d\n", (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out, res); - return res; + int const result = inflate(strm, flush); + LOG_WRAPPERD("- inflate2 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d res=%d\n", + (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out, result); + return result; } if (strm->avail_in <= 0) return Z_OK; - { size_t errorCode, srcSize; - zwd = (ZWRAP_DCtx*) strm->state; - LOG_WRAPPERD("- inflate1 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d\n", (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out); + zwd = (ZWRAP_DCtx*) strm->state; + LOG_WRAPPERD("- inflate1 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d\n", + (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out); - if (zwd == NULL) return Z_STREAM_ERROR; - if (zwd->decompState == ZWRAP_streamEnd) return Z_STREAM_END; + if (zwd == NULL) return Z_STREAM_ERROR; + if (zwd->decompState == ZWRAP_streamEnd) return Z_STREAM_END; - if (zwd->totalInBytes < ZLIB_HEADERSIZE) { - if (zwd->totalInBytes == 0 && strm->avail_in >= ZLIB_HEADERSIZE) { - if (MEM_readLE32(strm->next_in) != ZSTD_MAGICNUMBER) { - if (zwd->windowBits) - errorCode = inflateInit2_(strm, zwd->windowBits, zwd->version, zwd->stream_size); - else - errorCode = inflateInit_(strm, zwd->version, zwd->stream_size); - - strm->reserved = ZWRAP_ZLIB_STREAM; /* mark as zlib stream */ - errorCode = ZWRAP_freeDCtx(zwd); - if (ZSTD_isError(errorCode)) goto error; - - if (flush == Z_INFLATE_SYNC) res = inflateSync(strm); - else res = inflate(strm, flush); - LOG_WRAPPERD("- inflate3 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d res=%d\n", (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out, res); - return res; - } - } else { - srcSize = MIN(strm->avail_in, ZLIB_HEADERSIZE - zwd->totalInBytes); - memcpy(zwd->headerBuf+zwd->totalInBytes, strm->next_in, srcSize); - strm->total_in += srcSize; - zwd->totalInBytes += srcSize; - strm->next_in += srcSize; - strm->avail_in -= srcSize; - if (zwd->totalInBytes < ZLIB_HEADERSIZE) return Z_OK; - - if (MEM_readLE32(zwd->headerBuf) != ZSTD_MAGICNUMBER) { - z_stream strm2; - strm2.next_in = strm->next_in; - strm2.avail_in = strm->avail_in; - strm2.next_out = strm->next_out; - strm2.avail_out = strm->avail_out; - - if (zwd->windowBits) - errorCode = inflateInit2_(strm, zwd->windowBits, zwd->version, zwd->stream_size); - else - errorCode = inflateInit_(strm, zwd->version, zwd->stream_size); - LOG_WRAPPERD("ZLIB inflateInit errorCode=%d\n", (int)errorCode); - if (errorCode != Z_OK) return ZWRAPD_finishWithError(zwd, strm, (int)errorCode); - - /* inflate header */ - strm->next_in = (unsigned char*)zwd->headerBuf; - strm->avail_in = ZLIB_HEADERSIZE; - strm->avail_out = 0; - errorCode = inflate(strm, Z_NO_FLUSH); - LOG_WRAPPERD("ZLIB inflate errorCode=%d strm->avail_in=%d\n", (int)errorCode, (int)strm->avail_in); - if (errorCode != Z_OK) return ZWRAPD_finishWithError(zwd, strm, (int)errorCode); - if (strm->avail_in > 0) goto error; - - strm->next_in = strm2.next_in; - strm->avail_in = strm2.avail_in; - strm->next_out = strm2.next_out; - strm->avail_out = strm2.avail_out; - - strm->reserved = ZWRAP_ZLIB_STREAM; /* mark as zlib stream */ - errorCode = ZWRAP_freeDCtx(zwd); - if (ZSTD_isError(errorCode)) goto error; - - if (flush == Z_INFLATE_SYNC) res = inflateSync(strm); - else res = inflate(strm, flush); - LOG_WRAPPERD("- inflate2 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d res=%d\n", (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out, res); - return res; - } - } - } - - strm->reserved = ZWRAP_ZSTD_STREAM; /* mark as zstd steam */ - - if (flush == Z_INFLATE_SYNC) { strm->msg = "inflateSync is not supported!"; goto error; } - - if (!zwd->zbd) { - zwd->zbd = ZSTD_createDStream_advanced(zwd->customMem); - if (zwd->zbd == NULL) { LOG_WRAPPERD("ERROR: ZSTD_createDStream_advanced\n"); goto error; } - zwd->decompState = ZWRAP_useInit; - } - - if (zwd->totalInBytes < ZSTD_HEADERSIZE) - { - if (zwd->totalInBytes == 0 && strm->avail_in >= ZSTD_HEADERSIZE) { - if (zwd->decompState == ZWRAP_useInit) { - errorCode = ZSTD_initDStream(zwd->zbd); - if (ZSTD_isError(errorCode)) { LOG_WRAPPERD("ERROR: ZSTD_initDStream errorCode=%s\n", ZSTD_getErrorName(errorCode)); goto error; } - } else { - errorCode = ZSTD_resetDStream(zwd->zbd); - if (ZSTD_isError(errorCode)) goto error; - } - } else { - srcSize = MIN(strm->avail_in, ZSTD_HEADERSIZE - zwd->totalInBytes); - memcpy(zwd->headerBuf+zwd->totalInBytes, strm->next_in, srcSize); - strm->total_in += srcSize; - zwd->totalInBytes += srcSize; - strm->next_in += srcSize; - strm->avail_in -= srcSize; - if (zwd->totalInBytes < ZSTD_HEADERSIZE) return Z_OK; - - if (zwd->decompState == ZWRAP_useInit) { - errorCode = ZSTD_initDStream(zwd->zbd); - if (ZSTD_isError(errorCode)) { LOG_WRAPPERD("ERROR: ZSTD_initDStream errorCode=%s\n", ZSTD_getErrorName(errorCode)); goto error; } - } else { - errorCode = ZSTD_resetDStream(zwd->zbd); - if (ZSTD_isError(errorCode)) goto error; + if (zwd->totalInBytes < ZLIB_HEADERSIZE) { + if (zwd->totalInBytes == 0 && strm->avail_in >= ZLIB_HEADERSIZE) { + if (MEM_readLE32(strm->next_in) != ZSTD_MAGICNUMBER) { + { int const initErr = (zwd->windowBits) ? + inflateInit2_(strm, zwd->windowBits, zwd->version, zwd->stream_size) : + inflateInit_(strm, zwd->version, zwd->stream_size); + LOG_WRAPPERD("ZLIB inflateInit errorCode=%d\n", initErr); + if (initErr != Z_OK) return ZWRAPD_finishWithError(zwd, strm, initErr); } - zwd->inBuffer.src = zwd->headerBuf; - zwd->inBuffer.size = ZSTD_HEADERSIZE; - zwd->inBuffer.pos = 0; - zwd->outBuffer.dst = strm->next_out; - zwd->outBuffer.size = 0; - zwd->outBuffer.pos = 0; - errorCode = ZSTD_decompressStream(zwd->zbd, &zwd->outBuffer, &zwd->inBuffer); - LOG_WRAPPERD("inflate ZSTD_decompressStream1 errorCode=%d srcSize=%d dstCapacity=%d\n", (int)errorCode, (int)zwd->inBuffer.size, (int)zwd->outBuffer.size); - if (ZSTD_isError(errorCode)) { - LOG_WRAPPERD("ERROR: ZSTD_decompressStream1 %s\n", ZSTD_getErrorName(errorCode)); + strm->reserved = ZWRAP_ZLIB_STREAM; + { size_t const freeErr = ZWRAP_freeDCtx(zwd); + if (ZSTD_isError(freeErr)) goto error; } + + { int const result = (flush == Z_INFLATE_SYNC) ? + inflateSync(strm) : + inflate(strm, flush); + LOG_WRAPPERD("- inflate3 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d res=%d\n", + (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out, res); + return result; + } } + } else { /* ! (zwd->totalInBytes == 0 && strm->avail_in >= ZLIB_HEADERSIZE) */ + size_t const srcSize = MIN(strm->avail_in, ZLIB_HEADERSIZE - zwd->totalInBytes); + memcpy(zwd->headerBuf+zwd->totalInBytes, strm->next_in, srcSize); + strm->total_in += srcSize; + zwd->totalInBytes += srcSize; + strm->next_in += srcSize; + strm->avail_in -= srcSize; + if (zwd->totalInBytes < ZLIB_HEADERSIZE) return Z_OK; + + if (MEM_readLE32(zwd->headerBuf) != ZSTD_MAGICNUMBER) { + z_stream strm2; + strm2.next_in = strm->next_in; + strm2.avail_in = strm->avail_in; + strm2.next_out = strm->next_out; + strm2.avail_out = strm->avail_out; + + { int const initErr = (zwd->windowBits) ? + inflateInit2_(strm, zwd->windowBits, zwd->version, zwd->stream_size) : + inflateInit_(strm, zwd->version, zwd->stream_size); + LOG_WRAPPERD("ZLIB inflateInit errorCode=%d\n", initErr); + if (initErr != Z_OK) return ZWRAPD_finishWithError(zwd, strm, initErr); + } + + /* inflate header */ + strm->next_in = (unsigned char*)zwd->headerBuf; + strm->avail_in = ZLIB_HEADERSIZE; + strm->avail_out = 0; + { int const dErr = inflate(strm, Z_NO_FLUSH); + LOG_WRAPPERD("ZLIB inflate errorCode=%d strm->avail_in=%d\n", + dErr, (int)strm->avail_in); + if (dErr != Z_OK) + return ZWRAPD_finishWithError(zwd, strm, dErr); + } + if (strm->avail_in > 0) goto error; + + strm->next_in = strm2.next_in; + strm->avail_in = strm2.avail_in; + strm->next_out = strm2.next_out; + strm->avail_out = strm2.avail_out; + + strm->reserved = ZWRAP_ZLIB_STREAM; /* mark as zlib stream */ + { size_t const freeErr = ZWRAP_freeDCtx(zwd); + if (ZSTD_isError(freeErr)) goto error; } + + { int const result = (flush == Z_INFLATE_SYNC) ? + inflateSync(strm) : + inflate(strm, flush); + LOG_WRAPPERD("- inflate2 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d res=%d\n", + (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out, res); + return result; + } } } /* if ! (zwd->totalInBytes == 0 && strm->avail_in >= ZLIB_HEADERSIZE) */ + } /* (zwd->totalInBytes < ZLIB_HEADERSIZE) */ + + strm->reserved = ZWRAP_ZSTD_STREAM; /* mark as zstd steam */ + + if (flush == Z_INFLATE_SYNC) { strm->msg = "inflateSync is not supported!"; goto error; } + + if (!zwd->zbd) { + zwd->zbd = ZSTD_createDStream_advanced(zwd->customMem); + if (zwd->zbd == NULL) { LOG_WRAPPERD("ERROR: ZSTD_createDStream_advanced\n"); goto error; } + zwd->decompState = ZWRAP_useInit; + } + + if (zwd->totalInBytes < ZSTD_HEADERSIZE) { + if (zwd->totalInBytes == 0 && strm->avail_in >= ZSTD_HEADERSIZE) { + if (zwd->decompState == ZWRAP_useInit) { + size_t const initErr = ZSTD_initDStream(zwd->zbd); + if (ZSTD_isError(initErr)) { + LOG_WRAPPERD("ERROR: ZSTD_initDStream errorCode=%s\n", + ZSTD_getErrorName(initErr)); goto error; } - if (zwd->inBuffer.pos != zwd->inBuffer.size) goto error; /* not consumed */ + } else { + size_t const resetErr = ZSTD_resetDStream(zwd->zbd); + if (ZSTD_isError(resetErr)) goto error; } - } + } else { + size_t const srcSize = MIN(strm->avail_in, ZSTD_HEADERSIZE - zwd->totalInBytes); + memcpy(zwd->headerBuf+zwd->totalInBytes, strm->next_in, srcSize); + strm->total_in += srcSize; + zwd->totalInBytes += srcSize; + strm->next_in += srcSize; + strm->avail_in -= srcSize; + if (zwd->totalInBytes < ZSTD_HEADERSIZE) return Z_OK; - zwd->inBuffer.src = strm->next_in; - zwd->inBuffer.size = strm->avail_in; - zwd->inBuffer.pos = 0; - zwd->outBuffer.dst = strm->next_out; - zwd->outBuffer.size = strm->avail_out; - zwd->outBuffer.pos = 0; - errorCode = ZSTD_decompressStream(zwd->zbd, &zwd->outBuffer, &zwd->inBuffer); - LOG_WRAPPERD("inflate ZSTD_decompressStream2 errorCode=%d srcSize=%d dstCapacity=%d\n", (int)errorCode, (int)strm->avail_in, (int)strm->avail_out); - if (ZSTD_isError(errorCode)) { + if (zwd->decompState == ZWRAP_useInit) { + size_t const initErr = ZSTD_initDStream(zwd->zbd); + if (ZSTD_isError(initErr)) { + LOG_WRAPPERD("ERROR: ZSTD_initDStream errorCode=%s\n", + ZSTD_getErrorName(initErr)); + goto error; + } + } else { + size_t const resetErr = ZSTD_resetDStream(zwd->zbd); + if (ZSTD_isError(resetErr)) goto error; + } + + zwd->inBuffer.src = zwd->headerBuf; + zwd->inBuffer.size = ZSTD_HEADERSIZE; + zwd->inBuffer.pos = 0; + zwd->outBuffer.dst = strm->next_out; + zwd->outBuffer.size = 0; + zwd->outBuffer.pos = 0; + { size_t const dErr = ZSTD_decompressStream(zwd->zbd, &zwd->outBuffer, &zwd->inBuffer); + LOG_WRAPPERD("inflate ZSTD_decompressStream1 errorCode=%d srcSize=%d dstCapacity=%d\n", + (int)dErr, (int)zwd->inBuffer.size, (int)zwd->outBuffer.size); + if (ZSTD_isError(dErr)) { + LOG_WRAPPERD("ERROR: ZSTD_decompressStream1 %s\n", ZSTD_getErrorName(dErr)); + goto error; + } } + if (zwd->inBuffer.pos != zwd->inBuffer.size) goto error; /* not consumed */ + } + } /* (zwd->totalInBytes < ZSTD_HEADERSIZE) */ + + zwd->inBuffer.src = strm->next_in; + zwd->inBuffer.size = strm->avail_in; + zwd->inBuffer.pos = 0; + zwd->outBuffer.dst = strm->next_out; + zwd->outBuffer.size = strm->avail_out; + zwd->outBuffer.pos = 0; + { size_t const dErr = ZSTD_decompressStream(zwd->zbd, &zwd->outBuffer, &zwd->inBuffer); + LOG_WRAPPERD("inflate ZSTD_decompressStream2 errorCode=%d srcSize=%d dstCapacity=%d\n", + (int)dErr, (int)strm->avail_in, (int)strm->avail_out); + if (ZSTD_isError(dErr)) { zwd->errorCount++; - LOG_WRAPPERD("ERROR: ZSTD_decompressStream2 %s zwd->errorCount=%d\n", ZSTD_getErrorName(errorCode), zwd->errorCount); + LOG_WRAPPERD("ERROR: ZSTD_decompressStream2 %s zwd->errorCount=%d\n", + ZSTD_getErrorName(dErr), zwd->errorCount); if (zwd->errorCount<=1) return Z_NEED_DICT; else goto error; } - LOG_WRAPPERD("inflate inBuffer.pos=%d inBuffer.size=%d outBuffer.pos=%d outBuffer.size=%d o\n", (int)zwd->inBuffer.pos, (int)zwd->inBuffer.size, (int)zwd->outBuffer.pos, (int)zwd->outBuffer.size); + LOG_WRAPPERD("inflate inBuffer.pos=%d inBuffer.size=%d outBuffer.pos=%d outBuffer.size=%d o\n", + (int)zwd->inBuffer.pos, (int)zwd->inBuffer.size, (int)zwd->outBuffer.pos, (int)zwd->outBuffer.size); strm->next_out += zwd->outBuffer.pos; strm->total_out += zwd->outBuffer.pos; strm->avail_out -= zwd->outBuffer.pos; @@ -799,13 +822,16 @@ ZEXTERN int ZEXPORT z_inflate OF((z_streamp strm, int flush)) zwd->totalInBytes += zwd->inBuffer.pos; strm->next_in += zwd->inBuffer.pos; strm->avail_in -= zwd->inBuffer.pos; - if (errorCode == 0) { - LOG_WRAPPERD("inflate Z_STREAM_END1 avail_in=%d avail_out=%d total_in=%d total_out=%d\n", (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out); + if (dErr == 0) { + LOG_WRAPPERD("inflate Z_STREAM_END1 avail_in=%d avail_out=%d total_in=%d total_out=%d\n", + (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out); zwd->decompState = ZWRAP_streamEnd; return Z_STREAM_END; } - } - LOG_WRAPPERD("- inflate2 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d res=%d\n", (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out, Z_OK); + } /* dErr lifetime */ + + LOG_WRAPPERD("- inflate2 flush=%d avail_in=%d avail_out=%d total_in=%d total_out=%d res=%d\n", + (int)flush, (int)strm->avail_in, (int)strm->avail_out, (int)strm->total_in, (int)strm->total_out, Z_OK); return Z_OK; error: @@ -818,13 +844,13 @@ ZEXTERN int ZEXPORT z_inflateEnd OF((z_streamp strm)) if (g_ZWRAPdecompressionType == ZWRAP_FORCE_ZLIB || !strm->reserved) return inflateEnd(strm); - LOG_WRAPPERD("- inflateEnd total_in=%d total_out=%d\n", (int)(strm->total_in), (int)(strm->total_out)); - { size_t errorCode; - ZWRAP_DCtx* zwd = (ZWRAP_DCtx*) strm->state; + LOG_WRAPPERD("- inflateEnd total_in=%d total_out=%d\n", + (int)(strm->total_in), (int)(strm->total_out)); + { ZWRAP_DCtx* const zwd = (ZWRAP_DCtx*) strm->state; if (zwd == NULL) return Z_OK; /* structures are already freed */ + { size_t const freeErr = ZWRAP_freeDCtx(zwd); + if (ZSTD_isError(freeErr)) return Z_STREAM_ERROR; } strm->state = NULL; - errorCode = ZWRAP_freeDCtx(zwd); - if (ZSTD_isError(errorCode)) return Z_STREAM_ERROR; } return Z_OK; } @@ -841,8 +867,6 @@ ZEXTERN int ZEXPORT z_inflateSync OF((z_streamp strm)) - - /* Advanced compression functions */ ZEXTERN int ZEXPORT z_deflateCopy OF((z_streamp dest, z_streamp source)) @@ -982,7 +1006,7 @@ ZEXTERN uLong ZEXPORT z_zlibCompileFlags OF((void)) { return zlibCompileFlags(); - /* utility functions */ + /* === utility functions === */ #ifndef Z_SOLO ZEXTERN int ZEXPORT z_compress OF((Bytef *dest, uLongf *destLen, @@ -991,11 +1015,14 @@ ZEXTERN int ZEXPORT z_compress OF((Bytef *dest, uLongf *destLen, if (!g_ZWRAP_useZSTDcompression) return compress(dest, destLen, source, sourceLen); - { size_t dstCapacity = *destLen; - size_t const errorCode = ZSTD_compress(dest, dstCapacity, source, sourceLen, ZWRAP_DEFAULT_CLEVEL); - LOG_WRAPPERD("z_compress sourceLen=%d dstCapacity=%d\n", (int)sourceLen, (int)dstCapacity); - if (ZSTD_isError(errorCode)) return Z_STREAM_ERROR; - *destLen = errorCode; + { size_t dstCapacity = *destLen; + size_t const cSize = ZSTD_compress(dest, dstCapacity, + source, sourceLen, + ZWRAP_DEFAULT_CLEVEL); + LOG_WRAPPERD("z_compress sourceLen=%d dstCapacity=%d\n", + (int)sourceLen, (int)dstCapacity); + if (ZSTD_isError(cSize)) return Z_STREAM_ERROR; + *destLen = cSize; } return Z_OK; } @@ -1009,9 +1036,9 @@ ZEXTERN int ZEXPORT z_compress2 OF((Bytef *dest, uLongf *destLen, return compress2(dest, destLen, source, sourceLen, level); { size_t dstCapacity = *destLen; - size_t const errorCode = ZSTD_compress(dest, dstCapacity, source, sourceLen, level); - if (ZSTD_isError(errorCode)) return Z_STREAM_ERROR; - *destLen = errorCode; + size_t const cSize = ZSTD_compress(dest, dstCapacity, source, sourceLen, level); + if (ZSTD_isError(cSize)) return Z_STREAM_ERROR; + *destLen = cSize; } return Z_OK; } @@ -1029,13 +1056,13 @@ ZEXTERN uLong ZEXPORT z_compressBound OF((uLong sourceLen)) ZEXTERN int ZEXPORT z_uncompress OF((Bytef *dest, uLongf *destLen, const Bytef *source, uLong sourceLen)) { - if (sourceLen < 4 || MEM_readLE32(source) != ZSTD_MAGICNUMBER) + if (!ZSTD_isFrame(source, sourceLen)) return uncompress(dest, destLen, source, sourceLen); { size_t dstCapacity = *destLen; - size_t const errorCode = ZSTD_decompress(dest, dstCapacity, source, sourceLen); - if (ZSTD_isError(errorCode)) return Z_STREAM_ERROR; - *destLen = errorCode; + size_t const dSize = ZSTD_decompress(dest, dstCapacity, source, sourceLen); + if (ZSTD_isError(dSize)) return Z_STREAM_ERROR; + *destLen = dSize; } return Z_OK; }