Import Zstd 1.4.5
This commit is contained in:
parent
ea68403922
commit
bc64b5ce19
29
CHANGELOG
29
CHANGELOG
@ -1,3 +1,29 @@
|
||||
v1.4.5
|
||||
fix : Compression ratio regression on huge files (> 3 GB) using high levels (--ultra) and multithreading, by @terrelln
|
||||
perf: Improved decompression speed: x64 : +10% (clang) / +5% (gcc); ARM : from +15% to +50%, depending on SoC, by @terrelln
|
||||
perf: Automatically downsizes ZSTD_DCtx when too large for too long (#2069, by @bimbashreshta)
|
||||
perf: Improved fast compression speed on aarch64 (#2040, ~+3%, by @caoyzh)
|
||||
perf: Small level 1 compression speed gains (depending on compiler)
|
||||
cli : New --patch-from command, create and apply patches from files, by @bimbashreshta
|
||||
cli : New --filelist= : Provide a list of files to operate upon from a file
|
||||
cli : -b -d command can now benchmark decompression on multiple files
|
||||
cli : New --no-content-size command
|
||||
cli : New --show-default-cparams information command
|
||||
api : ZDICT_finalizeDictionary() is promoted to stable (#2111)
|
||||
api : new experimental parameter ZSTD_d_stableOutBuffer (#2094)
|
||||
build: Generate a single-file libzstd library (#2065, by @cwoffenden)
|
||||
build: Relative includes no longer require -I compiler flags for zstd lib subdirs (#2103, by @felixhandte)
|
||||
build: zstd now compiles cleanly under -pedantic (#2099)
|
||||
build: zstd now compiles with make-4.3
|
||||
build: Support mingw cross-compilation from Linux, by @Ericson2314
|
||||
build: Meson multi-thread build fix on windows
|
||||
build: Some misc icc fixes backed by new ci test on travis
|
||||
misc: bitflip analyzer tool, by @felixhandte
|
||||
misc: Extend largeNbDicts benchmark to compression
|
||||
misc: Edit-distance match finder in contrib/
|
||||
doc : Improved beginner CONTRIBUTING.md docs
|
||||
doc : New issue templates for zstd
|
||||
|
||||
v1.4.4
|
||||
perf: Improved decompression speed, by > 10%, by @terrelln
|
||||
perf: Better compression speed when re-using a context, by @felixhandte
|
||||
@ -14,7 +40,8 @@ cli: commands --stream-size=# and --size-hint=#, by @nmagerko
|
||||
cli: command --exclude-compressed, by @shashank0791
|
||||
cli: faster `-t` test mode
|
||||
cli: improved some error messages, by @vangyzen
|
||||
cli: rare deadlock condition within dictionary builder, by @terrelln
|
||||
cli: fix command `-D dictionary` on Windows, reported by @artyompetrov
|
||||
cli: fix rare deadlock condition within dictionary builder, by @terrelln
|
||||
build: single-file decoder with emscripten compilation script, by @cwoffenden
|
||||
build: fixed zlibWrapper compilation on Visual Studio, reported by @bluenlive
|
||||
build: fixed deprecation warning for certain gcc version, reported by @jasonma163
|
||||
|
352
CONTRIBUTING.md
352
CONTRIBUTING.md
@ -26,6 +26,356 @@ to do this once to work on any of Facebook's open source projects.
|
||||
|
||||
Complete your CLA here: <https://code.facebook.com/cla>
|
||||
|
||||
## Workflow
|
||||
Zstd uses a branch-based workflow for making changes to the codebase. Typically, zstd
|
||||
will use a new branch per sizable topic. For smaller changes, it is okay to lump multiple
|
||||
related changes into a branch.
|
||||
|
||||
Our contribution process works in three main stages:
|
||||
1. Local development
|
||||
* Update:
|
||||
* Checkout your fork of zstd if you have not already
|
||||
```
|
||||
git checkout https://github.com/<username>/zstd
|
||||
cd zstd
|
||||
```
|
||||
* Update your local dev branch
|
||||
```
|
||||
git pull https://github.com/facebook/zstd dev
|
||||
git push origin dev
|
||||
```
|
||||
* Topic and development:
|
||||
* Make a new branch on your fork about the topic you're developing for
|
||||
```
|
||||
# branch names should be consise but sufficiently informative
|
||||
git checkout -b <branch-name>
|
||||
git push origin <branch-name>
|
||||
```
|
||||
* Make commits and push
|
||||
```
|
||||
# make some changes =
|
||||
git add -u && git commit -m <message>
|
||||
git push origin <branch-name>
|
||||
```
|
||||
* Note: run local tests to ensure that your changes didn't break existing functionality
|
||||
* Quick check
|
||||
```
|
||||
make shortest
|
||||
```
|
||||
* Longer check
|
||||
```
|
||||
make test
|
||||
```
|
||||
2. Code Review and CI tests
|
||||
* Ensure CI tests pass:
|
||||
* Before sharing anything to the community, make sure that all CI tests pass on your local fork.
|
||||
See our section on setting up your CI environment for more information on how to do this.
|
||||
* Ensure that static analysis passes on your development machine. See the Static Analysis section
|
||||
below to see how to do this.
|
||||
* Create a pull request:
|
||||
* When you are ready to share you changes to the community, create a pull request from your branch
|
||||
to facebook:dev. You can do this very easily by clicking 'Create Pull Request' on your fork's home
|
||||
page.
|
||||
* From there, select the branch where you made changes as your source branch and facebook:dev
|
||||
as the destination.
|
||||
* Examine the diff presented between the two branches to make sure there is nothing unexpected.
|
||||
* Write a good pull request description:
|
||||
* While there is no strict template that our contributors follow, we would like them to
|
||||
sufficiently summarize and motivate the changes they are proposing. We recommend all pull requests,
|
||||
at least indirectly, address the following points.
|
||||
* Is this pull request important and why?
|
||||
* Is it addressing an issue? If so, what issue? (provide links for convenience please)
|
||||
* Is this a new feature? If so, why is it useful and/or necessary?
|
||||
* Are there background references and documents that reviewers should be aware of to properly assess this change?
|
||||
* Note: make sure to point out any design and architectural decisions that you made and the rationale behind them.
|
||||
* Note: if you have been working with a specific user and would like them to review your work, make sure you mention them using (@<username>)
|
||||
* Submit the pull request and iterate with feedback.
|
||||
3. Merge and Release
|
||||
* Getting approval:
|
||||
* You will have to iterate on your changes with feedback from other collaborators to reach a point
|
||||
where your pull request can be safely merged.
|
||||
* To avoid too many comments on style and convention, make sure that you have a
|
||||
look at our style section below before creating a pull request.
|
||||
* Eventually, someone from the zstd team will approve your pull request and not long after merge it into
|
||||
the dev branch.
|
||||
* Housekeeping:
|
||||
* Most PRs are linked with one or more Github issues. If this is the case for your PR, make sure
|
||||
the corresponding issue is mentioned. If your change 'fixes' or completely addresses the
|
||||
issue at hand, then please indicate this by requesting that an issue be closed by commenting.
|
||||
* Just because your changes have been merged does not mean the topic or larger issue is complete. Remember
|
||||
that the change must make it to an official zstd release for it to be meaningful. We recommend
|
||||
that contributers track the activity on their pull request and corresponding issue(s) page(s) until
|
||||
their change makes it to the next release of zstd. Users will often discover bugs in your code or
|
||||
suggest ways to refine and improve your initial changes even after the pull request is merged.
|
||||
|
||||
## Static Analysis
|
||||
Static analysis is a process for examining the correctness or validity of a program without actually
|
||||
executing it. It usually helps us find many simple bugs. Zstd uses clang's `scan-build` tool for
|
||||
static analysis. You can install it by following the instructions for your OS on https://clang-analyzer.llvm.org/scan-build.
|
||||
|
||||
Once installed, you can ensure that our static analysis tests pass on your local development machine
|
||||
by running:
|
||||
```
|
||||
make staticAnalyze
|
||||
```
|
||||
|
||||
In general, you can use `scan-build` to static analyze any build script. For example, to static analyze
|
||||
just `contrib/largeNbDicts` and nothing else, you can run:
|
||||
|
||||
```
|
||||
scan-build make -C contrib/largeNbDicts largeNbDicts
|
||||
```
|
||||
|
||||
## Performance
|
||||
Performance is extremely important for zstd and we only merge pull requests whose performance
|
||||
landscape and corresponding trade-offs have been adequately analyzed, reproduced, and presented.
|
||||
This high bar for performance means that every PR which has the potential to
|
||||
impact performance takes a very long time for us to properly review. That being said, we
|
||||
always welcome contributions to improve performance (or worsen performance for the trade-off of
|
||||
something else). Please keep the following in mind before submitting a performance related PR:
|
||||
|
||||
1. Zstd isn't as old as gzip but it has been around for time now and its evolution is
|
||||
very well documented via past Github issues and pull requests. It may be the case that your
|
||||
particular performance optimization has already been considered in the past. Please take some
|
||||
time to search through old issues and pull requests using keywords specific to your
|
||||
would-be PR. Of course, just because a topic has already been discussed (and perhaps rejected
|
||||
on some grounds) in the past, doesn't mean it isn't worth bringing up again. But even in that case,
|
||||
it will be helpful for you to have context from that topic's history before contributing.
|
||||
2. The distinction between noise and actual performance gains can unfortunately be very subtle
|
||||
especially when microbenchmarking extremely small wins or losses. The only remedy to getting
|
||||
something subtle merged is extensive benchmarking. You will be doing us a great favor if you
|
||||
take the time to run extensive, long-duration, and potentially cross-(os, platform, process, etc)
|
||||
benchmarks on your end before submitting a PR. Of course, you will not be able to benchmark
|
||||
your changes on every single processor and os out there (and neither will we) but do that best
|
||||
you can:) We've adding some things to think about when benchmarking below in the Benchmarking
|
||||
Performance section which might be helpful for you.
|
||||
3. Optimizing performance for a certain OS, processor vendor, compiler, or network system is a perfectly
|
||||
legitimate thing to do as long as it does not harm the overall performance health of Zstd.
|
||||
This is a hard balance to strike but please keep in mind other aspects of Zstd when
|
||||
submitting changes that are clang-specific, windows-specific, etc.
|
||||
|
||||
## Benchmarking Performance
|
||||
Performance microbenchmarking is a tricky subject but also essential for Zstd. We value empirical
|
||||
testing over theoretical speculation. This guide it not perfect but for most scenarios, it
|
||||
is a good place to start.
|
||||
|
||||
### Stability
|
||||
Unfortunately, the most important aspect in being able to benchmark reliably is to have a stable
|
||||
benchmarking machine. A virtual machine, a machine with shared resources, or your laptop
|
||||
will typically not be stable enough to obtain reliable benchmark results. If you can get your
|
||||
hands on a desktop, this is usually a better scenario.
|
||||
|
||||
Of course, benchmarking can be done on non-hyper-stable machines as well. You will just have to
|
||||
do a little more work to ensure that you are in fact measuring the changes you've made not and
|
||||
noise. Here are some things you can do to make your benchmarks more stable:
|
||||
|
||||
1. The most simple thing you can do to drastically improve the stability of your benchmark is
|
||||
to run it multiple times and then aggregate the results of those runs. As a general rule of
|
||||
thumb, the smaller the change you are trying to measure, the more samples of benchmark runs
|
||||
you will have to aggregate over to get reliable results. Here are some additional things to keep in
|
||||
mind when running multiple trials:
|
||||
* How you aggregate your samples are important. You might be tempted to use the mean of your
|
||||
results. While this is certainly going to be a more stable number than a raw single sample
|
||||
benchmark number, you might have more luck by taking the median. The mean is not robust to
|
||||
outliers whereas the median is. Better still, you could simply take the fastest speed your
|
||||
benchmark achieved on each run since that is likely the fastest your process will be
|
||||
capable of running your code. In our experience, this (aggregating by just taking the sample
|
||||
with the fastest running time) has been the most stable approach.
|
||||
* The more samples you have, the more stable your benchmarks should be. You can verify
|
||||
your improved stability by looking at the size of your confidence intervals as you
|
||||
increase your sample count. These should get smaller and smaller. Eventually hopefully
|
||||
smaller than the performance win you are expecting.
|
||||
* Most processors will take some time to get `hot` when running anything. The observations
|
||||
you collect during that time period will very different from the true performance number. Having
|
||||
a very large number of sample will help alleviate this problem slightly but you can also
|
||||
address is directly by simply not including the first `n` iterations of your benchmark in
|
||||
your aggregations. You can determine `n` by simply looking at the results from each iteration
|
||||
and then hand picking a good threshold after which the variance in results seems to stabilize.
|
||||
2. You cannot really get reliable benchmarks if your host machine is simultaneously running
|
||||
another cpu/memory-intensive application in the background. If you are running benchmarks on your
|
||||
personal laptop for instance, you should close all applications (including your code editor and
|
||||
browser) before running your benchmarks. You might also have invisible background applications
|
||||
running. You can see what these are by looking at either Activity Monitor on Mac or Task Manager
|
||||
on Windows. You will get more stable benchmark results of you end those processes as well.
|
||||
* If you have multiple cores, you can even run your benchmark on a reserved core to prevent
|
||||
pollution from other OS and user processes. There are a number of ways to do this depending
|
||||
on your OS:
|
||||
* On linux boxes, you have use https://github.com/lpechacek/cpuset.
|
||||
* On Windows, you can "Set Processor Affinity" using https://www.thewindowsclub.com/processor-affinity-windows
|
||||
* On Mac, you can try to use their dedicated affinity API https://developer.apple.com/library/archive/releasenotes/Performance/RN-AffinityAPI/#//apple_ref/doc/uid/TP40006635-CH1-DontLinkElementID_2
|
||||
3. To benchmark, you will likely end up writing a separate c/c++ program that will link libzstd.
|
||||
Dynamically linking your library will introduce some added variation (not a large amount but
|
||||
definitely some). Statically linking libzstd will be more stable. Static libraries should
|
||||
be enabled by default when building zstd.
|
||||
4. Use a profiler with a good high resolution timer. See the section below on profiling for
|
||||
details on this.
|
||||
5. Disable frequency scaling, turbo boost and address space randomization (this will vary by OS)
|
||||
6. Try to avoid storage. On some systems you can use tmpfs. Putting the program, inputs and outputs on
|
||||
tmpfs avoids touching a real storage system, which can have a pretty big variability.
|
||||
|
||||
Also check our LLVM's guide on benchmarking here: https://llvm.org/docs/Benchmarking.html
|
||||
|
||||
### Zstd benchmark
|
||||
The fastest signal you can get regarding your performance changes is via the in-build zstd cli
|
||||
bench option. You can run Zstd as you typically would for your scenario using some set of options
|
||||
and then additionally also specify the `-b#` option. Doing this will run our benchmarking pipeline
|
||||
for that options you have just provided. If you want to look at the internals of how this
|
||||
benchmarking script works, you can check out programs/benchzstd.c
|
||||
|
||||
For example: say you have made a change that you believe improves the speed of zstd level 1. The
|
||||
very first thing you should use to asses whether you actually achieved any sort of improvement
|
||||
is `zstd -b`. You might try to do something like this. Note: you can use the `-i` option to
|
||||
specify a running time for your benchmark in seconds (default is 3 seconds).
|
||||
Usually, the longer the running time, the more stable your results will be.
|
||||
|
||||
```
|
||||
$ git checkout <commit-before-your-change>
|
||||
$ make && cp zstd zstd-old
|
||||
$ git checkout <commit-after-your-change>
|
||||
$ make && cp zstd zstd-new
|
||||
$ zstd-old -i5 -b1 <your-test-data>
|
||||
1<your-test-data> : 8990 -> 3992 (2.252), 302.6 MB/s , 626.4 MB/s
|
||||
$ zstd-new -i5 -b1 <your-test-data>
|
||||
1<your-test-data> : 8990 -> 3992 (2.252), 302.8 MB/s , 628.4 MB/s
|
||||
```
|
||||
|
||||
Unless your performance win is large enough to be visible despite the intrinsic noise
|
||||
on your computer, benchzstd alone will likely not be enough to validate the impact of your
|
||||
changes. For example, the results of the example above indicate that effectively nothing
|
||||
changed but there could be a small <3% improvement that the noise on the host machine
|
||||
obscured. So unless you see a large performance win (10-15% consistently) using just
|
||||
this method of evaluation will not be sufficient.
|
||||
|
||||
### Profiling
|
||||
There are a number of great profilers out there. We're going to briefly mention how you can
|
||||
profile your code using `instruments` on mac, `perf` on linux and `visual studio profiler`
|
||||
on windows.
|
||||
|
||||
Say you have an idea for a change that you think will provide some good performance gains
|
||||
for level 1 compression on Zstd. Typically this means, you have identified a section of
|
||||
code that you think can be made to run faster.
|
||||
|
||||
The first thing you will want to do is make sure that the piece of code is actually taking up
|
||||
a notable amount of time to run. It is usually not worth optimzing something which accounts for less than
|
||||
0.0001% of the total running time. Luckily, there are tools to help with this.
|
||||
Profilers will let you see how much time your code spends inside a particular function.
|
||||
If your target code snippit is only part of a function, it might be worth trying to
|
||||
isolate that snippit by moving it to its own function (this is usually not necessary but
|
||||
might be).
|
||||
|
||||
Most profilers (including the profilers dicusssed below) will generate a call graph of
|
||||
functions for you. Your goal will be to find your function of interest in this call grapch
|
||||
and then inspect the time spent inside of it. You might also want to to look at the
|
||||
annotated assembly which most profilers will provide you with.
|
||||
|
||||
#### Instruments
|
||||
We will once again consider the scenario where you think you've identified a piece of code
|
||||
whose performance can be improved upon. Follow these steps to profile your code using
|
||||
Instruments.
|
||||
|
||||
1. Open Instruments
|
||||
2. Select `Time Profiler` from the list of standard templates
|
||||
3. Close all other applications except for your instruments window and your terminal
|
||||
4. Run your benchmarking script from your terminal window
|
||||
* You will want a benchmark that runs for at least a few seconds (5 seconds will
|
||||
usually be long enough). This way the profiler will have something to work with
|
||||
and you will have ample time to attach your profiler to this process:)
|
||||
* I will just use benchzstd as my bencharmking script for this example:
|
||||
```
|
||||
$ zstd -b1 -i5 <my-data> # this will run for 5 seconds
|
||||
```
|
||||
5. Once you run your benchmarking script, switch back over to instruments and attach your
|
||||
process to the time profiler. You can do this by:
|
||||
* Clicking on the `All Processes` drop down in the top left of the toolbar.
|
||||
* Selecting your process from the dropdown. In my case, it is just going to be labled
|
||||
`zstd`
|
||||
* Hitting the bright red record circle button on the top left of the toolbar
|
||||
6. You profiler will now start collecting metrics from your bencharking script. Once
|
||||
you think you have collected enough samples (usually this is the case after 3 seconds of
|
||||
recording), stop your profiler.
|
||||
7. Make sure that in toolbar of the bottom window, `profile` is selected.
|
||||
8. You should be able to see your call graph.
|
||||
* If you don't see the call graph or an incomplete call graph, make sure you have compiled
|
||||
zstd and your benchmarking scripg using debug flags. On mac and linux, this just means
|
||||
you will have to supply the `-g` flag alone with your build script. You might also
|
||||
have to provide the `-fno-omit-frame-pointer` flag
|
||||
9. Dig down the graph to find your function call and then inspect it by double clicking
|
||||
the list item. You will be able to see the annotated source code and the assembly side by
|
||||
side.
|
||||
|
||||
#### Perf
|
||||
|
||||
This wiki has a pretty detailed tutorial on getting started working with perf so we'll
|
||||
leave you to check that out of you're getting started:
|
||||
|
||||
https://perf.wiki.kernel.org/index.php/Tutorial
|
||||
|
||||
Some general notes on perf:
|
||||
* Use `perf stat -r # <bench-program>` to quickly get some relevant timing and
|
||||
counter statistics. Perf uses a high resolution timer and this is likely one
|
||||
of the first things your team will run when assessing your PR.
|
||||
* Perf has a long list of hardware counters that can be viewed with `perf --list`.
|
||||
When measuring optimizations, something worth trying is to make sure the handware
|
||||
counters you expect to be impacted by your change are in fact being so. For example,
|
||||
if you expect the L1 cache misses to decrease with your change, you can look at the
|
||||
counter `L1-dcache-load-misses`
|
||||
* Perf hardware counters will not work on a virtual machine.
|
||||
|
||||
#### Visual Studio
|
||||
|
||||
TODO
|
||||
|
||||
|
||||
## Setting up continuous integration (CI) on your fork
|
||||
Zstd uses a number of different continuous integration (CI) tools to ensure that new changes
|
||||
are well tested before they make it to an official release. Specifically, we use the platforms
|
||||
travis-ci, circle-ci, and appveyor.
|
||||
|
||||
Changes cannot be merged into the main dev branch unless they pass all of our CI tests.
|
||||
The easiest way to run these CI tests on your own before submitting a PR to our dev branch
|
||||
is to configure your personal fork of zstd with each of the CI platforms. Below, you'll find
|
||||
instructions for doing this.
|
||||
|
||||
### travis-ci
|
||||
Follow these steps to link travis-ci with your github fork of zstd
|
||||
|
||||
1. Make sure you are logged into your github account
|
||||
2. Go to https://travis-ci.org/
|
||||
3. Click 'Sign in with Github' on the top right
|
||||
4. Click 'Authorize travis-ci'
|
||||
5. Click 'Activate all repositories using Github Apps'
|
||||
6. Select 'Only select repositories' and select your fork of zstd from the drop down
|
||||
7. Click 'Approve and Install'
|
||||
8. Click 'Sign in with Github' again. This time, it will be for travis-pro (which will let you view your tests on the web dashboard)
|
||||
9. Click 'Authorize travis-pro'
|
||||
10. You should have travis set up on your fork now.
|
||||
|
||||
### circle-ci
|
||||
TODO
|
||||
|
||||
### appveyor
|
||||
Follow these steps to link circle-ci with your girhub fork of zstd
|
||||
|
||||
1. Make sure you are logged into your github account
|
||||
2. Go to https://www.appveyor.com/
|
||||
3. Click 'Sign in' on the top right
|
||||
4. Select 'Github' on the left panel
|
||||
5. Click 'Authorize appveyor'
|
||||
6. You might be asked to select which repositories you want to give appveyor permission to. Select your fork of zstd if you're prompted
|
||||
7. You should have appveyor set up on your fork now.
|
||||
|
||||
### General notes on CI
|
||||
CI tests run every time a pull request (PR) is created or updated. The exact tests
|
||||
that get run will depend on the destination branch you specify. Some tests take
|
||||
longer to run than others. Currently, our CI is set up to run a short
|
||||
series of tests when creating a PR to the dev branch and a longer series of tests
|
||||
when creating a PR to the master branch. You can look in the configuration files
|
||||
of the respective CI platform for more information on what gets run when.
|
||||
|
||||
Most people will just want to create a PR with the destination set to their local dev
|
||||
branch of zstd. You can then find the status of the tests on the PR's page. You can also
|
||||
re-run tests and cancel running tests from the PR page or from the respective CI's dashboard.
|
||||
|
||||
## Issues
|
||||
We use GitHub issues to track public bugs. Please ensure your description is
|
||||
clear and has sufficient instructions to be able to reproduce the issue.
|
||||
@ -34,7 +384,7 @@ Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe
|
||||
disclosure of security bugs. In those cases, please go through the process
|
||||
outlined on that page and do not file a public issue.
|
||||
|
||||
## Coding Style
|
||||
## Coding Style
|
||||
* 4 spaces for indentation rather than tabs
|
||||
|
||||
## License
|
||||
|
40
Makefile
40
Makefile
@ -1,10 +1,11 @@
|
||||
# ################################################################
|
||||
# Copyright (c) 2015-present, Yann Collet, Facebook, Inc.
|
||||
# Copyright (c) 2015-2020, Yann Collet, Facebook, Inc.
|
||||
# All rights reserved.
|
||||
#
|
||||
# This source code is licensed under both the BSD-style license (found in the
|
||||
# LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
# in the COPYING file in the root directory of this source tree).
|
||||
# You may select, at your option, one of the above-listed licenses.
|
||||
# ################################################################
|
||||
|
||||
PRGDIR = programs
|
||||
@ -17,7 +18,16 @@ FUZZDIR = $(TESTDIR)/fuzz
|
||||
# Define nul output
|
||||
VOID = /dev/null
|
||||
|
||||
ifneq (,$(filter Windows%,$(OS)))
|
||||
# When cross-compiling from linux to windows, you might
|
||||
# need to specify this as "Windows." Fedora build fails
|
||||
# without it.
|
||||
#
|
||||
# Note: mingw-w64 build from linux to windows does not
|
||||
# fail on other tested distros (ubuntu, debian) even
|
||||
# without manually specifying the TARGET_SYSTEM.
|
||||
TARGET_SYSTEM ?= $(OS)
|
||||
|
||||
ifneq (,$(filter Windows%,$(TARGET_SYSTEM)))
|
||||
EXT =.exe
|
||||
else
|
||||
EXT =
|
||||
@ -35,7 +45,7 @@ allmost: allzstd zlibwrapper
|
||||
|
||||
# skip zwrapper, can't build that on alternate architectures without the proper zlib installed
|
||||
.PHONY: allzstd
|
||||
allzstd: lib
|
||||
allzstd: lib-all
|
||||
$(MAKE) -C $(PRGDIR) all
|
||||
$(MAKE) -C $(TESTDIR) all
|
||||
|
||||
@ -45,7 +55,7 @@ all32:
|
||||
$(MAKE) -C $(TESTDIR) all32
|
||||
|
||||
.PHONY: lib lib-release libzstd.a
|
||||
lib lib-release :
|
||||
lib lib-release lib-all :
|
||||
@$(MAKE) -C $(ZSTDDIR) $@
|
||||
|
||||
.PHONY: zstd zstd-release
|
||||
@ -80,6 +90,13 @@ shortest:
|
||||
.PHONY: check
|
||||
check: shortest
|
||||
|
||||
.PHONY: automated_benchmarking
|
||||
automated_benchmarking:
|
||||
$(MAKE) -C $(TESTDIR) $@
|
||||
|
||||
.PHONY: benchmarking
|
||||
benchmarking: automated_benchmarking
|
||||
|
||||
## examples: build all examples in `/examples` directory
|
||||
.PHONY: examples
|
||||
examples: lib
|
||||
@ -101,7 +118,8 @@ contrib: lib
|
||||
$(MAKE) -C contrib/pzstd all
|
||||
$(MAKE) -C contrib/seekable_format/examples all
|
||||
$(MAKE) -C contrib/largeNbDicts all
|
||||
cd contrib/single_file_decoder/ ; ./build_test.sh
|
||||
cd contrib/single_file_libs/ ; ./build_decoder_test.sh
|
||||
cd contrib/single_file_libs/ ; ./build_library_test.sh
|
||||
|
||||
.PHONY: cleanTabs
|
||||
cleanTabs:
|
||||
@ -337,7 +355,7 @@ endif
|
||||
|
||||
ifneq (,$(filter MSYS%,$(shell uname)))
|
||||
HOST_OS = MSYS
|
||||
CMAKE_PARAMS = -G"MSYS Makefiles" -DZSTD_MULTITHREAD_SUPPORT:BOOL=OFF -DZSTD_BUILD_STATIC:BOOL=ON -DZSTD_BUILD_TESTS:BOOL=ON
|
||||
CMAKE_PARAMS = -G"MSYS Makefiles" -DCMAKE_BUILD_TYPE=Debug -DZSTD_MULTITHREAD_SUPPORT:BOOL=OFF -DZSTD_BUILD_STATIC:BOOL=ON -DZSTD_BUILD_TESTS:BOOL=ON
|
||||
endif
|
||||
|
||||
|
||||
@ -349,11 +367,15 @@ cmakebuild:
|
||||
cmake --version
|
||||
$(RM) -r $(BUILDIR)/cmake/build
|
||||
mkdir $(BUILDIR)/cmake/build
|
||||
cd $(BUILDIR)/cmake/build ; cmake -DCMAKE_INSTALL_PREFIX:PATH=~/install_test_dir $(CMAKE_PARAMS) .. ; $(MAKE) install ; $(MAKE) uninstall
|
||||
cd $(BUILDIR)/cmake/build; cmake -DCMAKE_INSTALL_PREFIX:PATH=~/install_test_dir $(CMAKE_PARAMS) ..
|
||||
$(MAKE) -C $(BUILDIR)/cmake/build -j4;
|
||||
$(MAKE) -C $(BUILDIR)/cmake/build install;
|
||||
$(MAKE) -C $(BUILDIR)/cmake/build uninstall;
|
||||
cd $(BUILDIR)/cmake/build; ctest -V -L Medium
|
||||
|
||||
c90build: clean
|
||||
c89build: clean
|
||||
$(CC) -v
|
||||
CFLAGS="-std=c90 -Werror" $(MAKE) allmost # will fail, due to missing support for `long long`
|
||||
CFLAGS="-std=c89 -Werror" $(MAKE) allmost # will fail, due to missing support for `long long`
|
||||
|
||||
gnu90build: clean
|
||||
$(CC) -v
|
||||
|
40
README.md
40
README.md
@ -31,10 +31,10 @@ a list of known ports and bindings is provided on [Zstandard homepage](http://ww
|
||||
## Benchmarks
|
||||
|
||||
For reference, several fast compression algorithms were tested and compared
|
||||
on a server running Arch Linux (`Linux version 5.0.5-arch1-1`),
|
||||
on a server running Arch Linux (`Linux version 5.5.11-arch1-1`),
|
||||
with a Core i9-9900K CPU @ 5.0GHz,
|
||||
using [lzbench], an open-source in-memory benchmark by @inikep
|
||||
compiled with [gcc] 8.2.1,
|
||||
compiled with [gcc] 9.3.0,
|
||||
on the [Silesia compression corpus].
|
||||
|
||||
[lzbench]: https://github.com/inikep/lzbench
|
||||
@ -43,18 +43,26 @@ on the [Silesia compression corpus].
|
||||
|
||||
| Compressor name | Ratio | Compression| Decompress.|
|
||||
| --------------- | ------| -----------| ---------- |
|
||||
| **zstd 1.4.0 -1** | 2.884 | 530 MB/s | 1360 MB/s |
|
||||
| zlib 1.2.11 -1 | 2.743 | 110 MB/s | 440 MB/s |
|
||||
| brotli 1.0.7 -0 | 2.701 | 430 MB/s | 470 MB/s |
|
||||
| quicklz 1.5.0 -1 | 2.238 | 600 MB/s | 800 MB/s |
|
||||
| lzo1x 2.09 -1 | 2.106 | 680 MB/s | 950 MB/s |
|
||||
| lz4 1.8.3 | 2.101 | 800 MB/s | 4220 MB/s |
|
||||
| snappy 1.1.4 | 2.073 | 580 MB/s | 2020 MB/s |
|
||||
| lzf 3.6 -1 | 2.077 | 440 MB/s | 930 MB/s |
|
||||
| **zstd 1.4.5 -1** | 2.884 | 500 MB/s | 1660 MB/s |
|
||||
| zlib 1.2.11 -1 | 2.743 | 90 MB/s | 400 MB/s |
|
||||
| brotli 1.0.7 -0 | 2.703 | 400 MB/s | 450 MB/s |
|
||||
| **zstd 1.4.5 --fast=1** | 2.434 | 570 MB/s | 2200 MB/s |
|
||||
| **zstd 1.4.5 --fast=3** | 2.312 | 640 MB/s | 2300 MB/s |
|
||||
| quicklz 1.5.0 -1 | 2.238 | 560 MB/s | 710 MB/s |
|
||||
| **zstd 1.4.5 --fast=5** | 2.178 | 700 MB/s | 2420 MB/s |
|
||||
| lzo1x 2.10 -1 | 2.106 | 690 MB/s | 820 MB/s |
|
||||
| lz4 1.9.2 | 2.101 | 740 MB/s | 4530 MB/s |
|
||||
| **zstd 1.4.5 --fast=7** | 2.096 | 750 MB/s | 2480 MB/s |
|
||||
| lzf 3.6 -1 | 2.077 | 410 MB/s | 860 MB/s |
|
||||
| snappy 1.1.8 | 2.073 | 560 MB/s | 1790 MB/s |
|
||||
|
||||
[zlib]: http://www.zlib.net/
|
||||
[LZ4]: http://www.lz4.org/
|
||||
|
||||
The negative compression levels, specified with `--fast=#`,
|
||||
offer faster compression and decompression speed in exchange for some loss in
|
||||
compression ratio compared to level 1, as seen in the table above.
|
||||
|
||||
Zstd can also offer stronger compression ratios at the cost of compression speed.
|
||||
Speed vs Compression trade-off is configurable by small increments.
|
||||
Decompression speed is preserved and remains roughly the same at all settings,
|
||||
@ -143,6 +151,18 @@ example about how Meson is used to build this project.
|
||||
|
||||
Note that default build type is **release**.
|
||||
|
||||
### VCPKG
|
||||
You can build and install zstd [vcpkg](https://github.com/Microsoft/vcpkg/) dependency manager:
|
||||
|
||||
git clone https://github.com/Microsoft/vcpkg.git
|
||||
cd vcpkg
|
||||
./bootstrap-vcpkg.sh
|
||||
./vcpkg integrate install
|
||||
./vcpkg install zstd
|
||||
|
||||
The zstd port in vcpkg is kept up to date by Microsoft team members and community contributors.
|
||||
If the version is out of date, please [create an issue or pull request](https://github.com/Microsoft/vcpkg) on the vcpkg repository.
|
||||
|
||||
### Visual Studio (Windows)
|
||||
|
||||
Going into `build` directory, you will find additional possibilities:
|
||||
|
@ -11,7 +11,7 @@ They consist of the following tests:
|
||||
- Compilation on all supported targets (x86, x86_64, ARM, AArch64, PowerPC, and PowerPC64)
|
||||
- Compilation on various versions of gcc, clang, and g++
|
||||
- `tests/playTests.sh` on x86_64, without the tests on long data (CLI tests)
|
||||
- Small tests (`tests/legacy.c`, `tests/longmatch.c`, `tests/symbols.c`) on x64_64
|
||||
- Small tests (`tests/legacy.c`, `tests/longmatch.c`) on x64_64
|
||||
|
||||
Medium Tests
|
||||
------------
|
||||
@ -19,7 +19,7 @@ Medium tests run on every commit and pull request to `dev` branch, on TravisCI.
|
||||
They consist of the following tests:
|
||||
- The following tests run with UBsan and Asan on x86_64 and x86, as well as with
|
||||
Msan on x86_64
|
||||
- `tests/playTests.sh --test-long-data`
|
||||
- `tests/playTests.sh --test-large-data`
|
||||
- Fuzzer tests: `tests/fuzzer.c`, `tests/zstreamtest.c`, and `tests/decodecorpus.c`
|
||||
- `tests/zstreamtest.c` under Tsan (streaming mode, including multithreaded mode)
|
||||
- Valgrind Test (`make -C tests valgrindTest`) (testing CLI and fuzzer under valgrind)
|
||||
|
27
appveyor.yml
27
appveyor.yml
@ -14,7 +14,7 @@
|
||||
- COMPILER: "gcc"
|
||||
HOST: "mingw"
|
||||
PLATFORM: "x64"
|
||||
SCRIPT: "make allzstd MOREFLAGS=-static && make -C tests test-symbols fullbench-lib"
|
||||
SCRIPT: "make allzstd MOREFLAGS=-static && make -C tests fullbench-lib"
|
||||
ARTIFACT: "true"
|
||||
BUILD: "true"
|
||||
- COMPILER: "gcc"
|
||||
@ -169,7 +169,8 @@
|
||||
- SET "FUZZERTEST=-T30s"
|
||||
- if [%HOST%]==[visual] if [%CONFIGURATION%]==[Release] (
|
||||
CD tests &&
|
||||
SET ZSTD=./zstd.exe &&
|
||||
SET ZSTD_BIN=./zstd.exe&&
|
||||
SET DATAGEN_BIN=./datagen.exe&&
|
||||
sh -e playTests.sh --test-large-data &&
|
||||
fullbench.exe -i1 &&
|
||||
fullbench.exe -i1 -P0 &&
|
||||
@ -187,6 +188,9 @@
|
||||
version: 1.0.{build}
|
||||
environment:
|
||||
matrix:
|
||||
- COMPILER: "gcc"
|
||||
HOST: "cygwin"
|
||||
PLATFORM: "x64"
|
||||
- COMPILER: "gcc"
|
||||
HOST: "mingw"
|
||||
PLATFORM: "x64"
|
||||
@ -220,6 +224,14 @@
|
||||
install:
|
||||
- ECHO Installing %COMPILER% %PLATFORM% %CONFIGURATION%
|
||||
- SET PATH_ORIGINAL=%PATH%
|
||||
- if [%HOST%]==[cygwin] (
|
||||
ECHO Installing Cygwin Packages &&
|
||||
C:\cygwin64\setup-x86_64.exe -qnNdO -R "C:\cygwin64" -g -P ^
|
||||
gcc-g++,^
|
||||
gcc,^
|
||||
cmake,^
|
||||
make
|
||||
)
|
||||
- if [%HOST%]==[mingw] (
|
||||
SET "PATH_MINGW32=C:\mingw-w64\i686-6.3.0-posix-dwarf-rt_v5-rev1\mingw32\bin" &&
|
||||
SET "PATH_MINGW64=C:\mingw-w64\x86_64-6.3.0-posix-seh-rt_v5-rev1\mingw64\bin" &&
|
||||
@ -232,6 +244,17 @@
|
||||
|
||||
build_script:
|
||||
- ECHO Building %COMPILER% %PLATFORM% %CONFIGURATION%
|
||||
- if [%HOST%]==[cygwin] (
|
||||
set CHERE_INVOKING=yes &&
|
||||
set CC=%COMPILER% &&
|
||||
C:\cygwin64\bin\bash --login -c "
|
||||
set -e;
|
||||
cd build/cmake;
|
||||
CFLAGS='-Werror' cmake -G 'Unix Makefiles' -DCMAKE_BUILD_TYPE=Debug -DZSTD_BUILD_TESTS:BOOL=ON -DZSTD_FUZZER_FLAGS=-T30s -DZSTD_ZSTREAM_FLAGS=-T30s .;
|
||||
make -j4;
|
||||
ctest -V -L Medium;
|
||||
"
|
||||
)
|
||||
- if [%HOST%]==[mingw] (
|
||||
( if [%PLATFORM%]==[x64] (
|
||||
SET "PATH=%PATH_MINGW64%;%PATH_ORIGINAL%"
|
||||
|
@ -1,2 +0,0 @@
|
||||
#!/bin/sh
|
||||
sed -i '' $'s/\t/ /g' ../lib/**/*.{h,c} ../programs/*.{h,c} ../tests/*.c ./**/*.{h,cpp} ../examples/*.c ../zlibWrapper/*.{h,c}
|
@ -1,20 +0,0 @@
|
||||
# Dockerfile
|
||||
# First image to build the binary
|
||||
FROM alpine as builder
|
||||
|
||||
RUN apk --no-cache add make gcc libc-dev
|
||||
COPY . /src
|
||||
RUN mkdir /pkg && cd /src && make && make DESTDIR=/pkg install
|
||||
|
||||
# Second minimal image to only keep the built binary
|
||||
FROM alpine
|
||||
|
||||
# Copy the built files
|
||||
COPY --from=builder /pkg /
|
||||
|
||||
# Copy the license as well
|
||||
RUN mkdir -p /usr/local/share/licenses/zstd
|
||||
COPY --from=builder /src/LICENSE /usr/local/share/licences/zstd/
|
||||
|
||||
# Just run `zstd` if no other command is given
|
||||
CMD ["/usr/local/bin/zstd"]
|
@ -1,20 +0,0 @@
|
||||
|
||||
## Requirement
|
||||
|
||||
The `Dockerfile` script requires a version of `docker` >= 17.05
|
||||
|
||||
## Installing docker
|
||||
|
||||
The official docker install docs use a ppa with a modern version available:
|
||||
https://docs.docker.com/install/linux/docker-ce/ubuntu/
|
||||
|
||||
## How to run
|
||||
|
||||
`docker build -t zstd .`
|
||||
|
||||
## test
|
||||
|
||||
```
|
||||
echo foo | docker run -i --rm zstd | docker run -i --rm zstd zstdcat
|
||||
foo
|
||||
```
|
@ -1,44 +0,0 @@
|
||||
ARG :=
|
||||
|
||||
CC ?= gcc
|
||||
CFLAGS ?= -O3
|
||||
INCLUDES := -I ../randomDictBuilder -I ../../../programs -I ../../../lib/common -I ../../../lib -I ../../../lib/dictBuilder
|
||||
|
||||
RANDOM_FILE := ../randomDictBuilder/random.c
|
||||
IO_FILE := ../randomDictBuilder/io.c
|
||||
|
||||
all: run clean
|
||||
|
||||
.PHONY: run
|
||||
run: benchmark
|
||||
echo "Benchmarking with $(ARG)"
|
||||
./benchmark $(ARG)
|
||||
|
||||
.PHONY: test
|
||||
test: benchmarkTest clean
|
||||
|
||||
.PHONY: benchmarkTest
|
||||
benchmarkTest: benchmark test.sh
|
||||
sh test.sh
|
||||
|
||||
benchmark: benchmark.o io.o random.o libzstd.a
|
||||
$(CC) $(CFLAGS) benchmark.o io.o random.o libzstd.a -o benchmark
|
||||
|
||||
benchmark.o: benchmark.c
|
||||
$(CC) $(CFLAGS) $(INCLUDES) -c benchmark.c
|
||||
|
||||
random.o: $(RANDOM_FILE)
|
||||
$(CC) $(CFLAGS) $(INCLUDES) -c $(RANDOM_FILE)
|
||||
|
||||
io.o: $(IO_FILE)
|
||||
$(CC) $(CFLAGS) $(INCLUDES) -c $(IO_FILE)
|
||||
|
||||
libzstd.a:
|
||||
$(MAKE) -C ../../../lib libzstd.a
|
||||
mv ../../../lib/libzstd.a .
|
||||
|
||||
.PHONY: clean
|
||||
clean:
|
||||
rm -f *.o benchmark libzstd.a
|
||||
$(MAKE) -C ../../../lib clean
|
||||
echo "Cleaning is completed"
|
@ -1,849 +0,0 @@
|
||||
Benchmarking Dictionary Builder
|
||||
|
||||
### Permitted Argument:
|
||||
Input File/Directory (in=fileName): required; file/directory used to build dictionary; if directory, will operate recursively for files inside directory; can include multiple files/directories, each following "in="
|
||||
|
||||
###Running Test:
|
||||
make test
|
||||
|
||||
###Usage:
|
||||
Benchmark given input files: make ARG= followed by permitted arguments
|
||||
|
||||
### Examples:
|
||||
make ARG="in=../../../lib/dictBuilder in=../../../lib/compress"
|
||||
|
||||
###Benchmarking Result:
|
||||
- First Cover is optimize cover, second Cover uses optimized d and k from first one.
|
||||
- For every f value of fastCover, the first one is optimize fastCover and the second one uses optimized d and k from first one. This is run for accel values from 1 to 10.
|
||||
- Fourth column is chosen d and fifth column is chosen k
|
||||
|
||||
github:
|
||||
NODICT 0.000004 2.999642
|
||||
RANDOM 0.024560 8.791189
|
||||
LEGACY 0.727109 8.173529
|
||||
COVER 40.565676 10.652243 8 1298
|
||||
COVER 3.608284 10.652243 8 1298
|
||||
FAST f=15 a=1 4.181024 10.570882 8 1154
|
||||
FAST f=15 a=1 0.040788 10.570882 8 1154
|
||||
FAST f=15 a=2 3.548352 10.574287 6 1970
|
||||
FAST f=15 a=2 0.035535 10.574287 6 1970
|
||||
FAST f=15 a=3 3.287364 10.613950 6 1010
|
||||
FAST f=15 a=3 0.032182 10.613950 6 1010
|
||||
FAST f=15 a=4 3.184976 10.573883 6 1058
|
||||
FAST f=15 a=4 0.029878 10.573883 6 1058
|
||||
FAST f=15 a=5 3.045513 10.580640 8 1154
|
||||
FAST f=15 a=5 0.022162 10.580640 8 1154
|
||||
FAST f=15 a=6 3.003296 10.583677 6 1010
|
||||
FAST f=15 a=6 0.028091 10.583677 6 1010
|
||||
FAST f=15 a=7 2.952655 10.622551 6 1106
|
||||
FAST f=15 a=7 0.02724 10.622551 6 1106
|
||||
FAST f=15 a=8 2.945674 10.614657 6 1010
|
||||
FAST f=15 a=8 0.027264 10.614657 6 1010
|
||||
FAST f=15 a=9 3.153439 10.564018 8 1154
|
||||
FAST f=15 a=9 0.020635 10.564018 8 1154
|
||||
FAST f=15 a=10 2.950416 10.511454 6 1010
|
||||
FAST f=15 a=10 0.026606 10.511454 6 1010
|
||||
FAST f=16 a=1 3.970029 10.681035 8 1154
|
||||
FAST f=16 a=1 0.038188 10.681035 8 1154
|
||||
FAST f=16 a=2 3.422892 10.484978 6 1874
|
||||
FAST f=16 a=2 0.034702 10.484978 6 1874
|
||||
FAST f=16 a=3 3.215836 10.632631 8 1154
|
||||
FAST f=16 a=3 0.026084 10.632631 8 1154
|
||||
FAST f=16 a=4 3.081353 10.626533 6 1106
|
||||
FAST f=16 a=4 0.030032 10.626533 6 1106
|
||||
FAST f=16 a=5 3.041241 10.545027 8 1922
|
||||
FAST f=16 a=5 0.022882 10.545027 8 1922
|
||||
FAST f=16 a=6 2.989390 10.638284 6 1874
|
||||
FAST f=16 a=6 0.028308 10.638284 6 1874
|
||||
FAST f=16 a=7 3.001581 10.797136 6 1106
|
||||
FAST f=16 a=7 0.027479 10.797136 6 1106
|
||||
FAST f=16 a=8 2.984107 10.658356 8 1058
|
||||
FAST f=16 a=8 0.021099 10.658356 8 1058
|
||||
FAST f=16 a=9 2.925788 10.523869 6 1010
|
||||
FAST f=16 a=9 0.026905 10.523869 6 1010
|
||||
FAST f=16 a=10 2.889605 10.745841 6 1874
|
||||
FAST f=16 a=10 0.026846 10.745841 6 1874
|
||||
FAST f=17 a=1 4.031953 10.672080 8 1202
|
||||
FAST f=17 a=1 0.040658 10.672080 8 1202
|
||||
FAST f=17 a=2 3.458107 10.589352 8 1106
|
||||
FAST f=17 a=2 0.02926 10.589352 8 1106
|
||||
FAST f=17 a=3 3.291189 10.662714 8 1154
|
||||
FAST f=17 a=3 0.026531 10.662714 8 1154
|
||||
FAST f=17 a=4 3.154950 10.549456 8 1346
|
||||
FAST f=17 a=4 0.024991 10.549456 8 1346
|
||||
FAST f=17 a=5 3.092271 10.541670 6 1202
|
||||
FAST f=17 a=5 0.038285 10.541670 6 1202
|
||||
FAST f=17 a=6 3.166146 10.729112 6 1874
|
||||
FAST f=17 a=6 0.038217 10.729112 6 1874
|
||||
FAST f=17 a=7 3.035467 10.810485 6 1106
|
||||
FAST f=17 a=7 0.036655 10.810485 6 1106
|
||||
FAST f=17 a=8 3.035668 10.530532 6 1058
|
||||
FAST f=17 a=8 0.037715 10.530532 6 1058
|
||||
FAST f=17 a=9 2.987917 10.589802 8 1922
|
||||
FAST f=17 a=9 0.02217 10.589802 8 1922
|
||||
FAST f=17 a=10 2.981647 10.722579 8 1106
|
||||
FAST f=17 a=10 0.021948 10.722579 8 1106
|
||||
FAST f=18 a=1 4.067144 10.634943 8 1154
|
||||
FAST f=18 a=1 0.041386 10.634943 8 1154
|
||||
FAST f=18 a=2 3.507377 10.546230 6 1970
|
||||
FAST f=18 a=2 0.037572 10.546230 6 1970
|
||||
FAST f=18 a=3 3.323015 10.648061 8 1154
|
||||
FAST f=18 a=3 0.028306 10.648061 8 1154
|
||||
FAST f=18 a=4 3.216735 10.705402 6 1010
|
||||
FAST f=18 a=4 0.030755 10.705402 6 1010
|
||||
FAST f=18 a=5 3.175794 10.588154 8 1874
|
||||
FAST f=18 a=5 0.025315 10.588154 8 1874
|
||||
FAST f=18 a=6 3.127459 10.751104 8 1106
|
||||
FAST f=18 a=6 0.023897 10.751104 8 1106
|
||||
FAST f=18 a=7 3.083017 10.780402 6 1106
|
||||
FAST f=18 a=7 0.029158 10.780402 6 1106
|
||||
FAST f=18 a=8 3.069700 10.547226 8 1346
|
||||
FAST f=18 a=8 0.024046 10.547226 8 1346
|
||||
FAST f=18 a=9 3.056591 10.674759 6 1010
|
||||
FAST f=18 a=9 0.028496 10.674759 6 1010
|
||||
FAST f=18 a=10 3.063588 10.737578 8 1106
|
||||
FAST f=18 a=10 0.023033 10.737578 8 1106
|
||||
FAST f=19 a=1 4.164041 10.650333 8 1154
|
||||
FAST f=19 a=1 0.042906 10.650333 8 1154
|
||||
FAST f=19 a=2 3.585409 10.577066 6 1058
|
||||
FAST f=19 a=2 0.038994 10.577066 6 1058
|
||||
FAST f=19 a=3 3.439643 10.639403 8 1154
|
||||
FAST f=19 a=3 0.028427 10.639403 8 1154
|
||||
FAST f=19 a=4 3.268869 10.554410 8 1298
|
||||
FAST f=19 a=4 0.026866 10.554410 8 1298
|
||||
FAST f=19 a=5 3.238225 10.615109 6 1010
|
||||
FAST f=19 a=5 0.03078 10.615109 6 1010
|
||||
FAST f=19 a=6 3.199558 10.609782 6 1874
|
||||
FAST f=19 a=6 0.030099 10.609782 6 1874
|
||||
FAST f=19 a=7 3.132395 10.794753 6 1106
|
||||
FAST f=19 a=7 0.028964 10.794753 6 1106
|
||||
FAST f=19 a=8 3.148446 10.554842 8 1298
|
||||
FAST f=19 a=8 0.024277 10.554842 8 1298
|
||||
FAST f=19 a=9 3.108324 10.668763 6 1010
|
||||
FAST f=19 a=9 0.02896 10.668763 6 1010
|
||||
FAST f=19 a=10 3.159863 10.757347 8 1106
|
||||
FAST f=19 a=10 0.023351 10.757347 8 1106
|
||||
FAST f=20 a=1 4.462698 10.661788 8 1154
|
||||
FAST f=20 a=1 0.047174 10.661788 8 1154
|
||||
FAST f=20 a=2 3.820269 10.678612 6 1106
|
||||
FAST f=20 a=2 0.040807 10.678612 6 1106
|
||||
FAST f=20 a=3 3.644955 10.648424 8 1154
|
||||
FAST f=20 a=3 0.031398 10.648424 8 1154
|
||||
FAST f=20 a=4 3.546257 10.559756 8 1298
|
||||
FAST f=20 a=4 0.029856 10.559756 8 1298
|
||||
FAST f=20 a=5 3.485248 10.646637 6 1010
|
||||
FAST f=20 a=5 0.033756 10.646637 6 1010
|
||||
FAST f=20 a=6 3.490438 10.775824 8 1106
|
||||
FAST f=20 a=6 0.028338 10.775824 8 1106
|
||||
FAST f=20 a=7 3.631289 10.801795 6 1106
|
||||
FAST f=20 a=7 0.035228 10.801795 6 1106
|
||||
FAST f=20 a=8 3.758936 10.545116 8 1346
|
||||
FAST f=20 a=8 0.027495 10.545116 8 1346
|
||||
FAST f=20 a=9 3.707024 10.677454 6 1010
|
||||
FAST f=20 a=9 0.031326 10.677454 6 1010
|
||||
FAST f=20 a=10 3.586593 10.756017 8 1106
|
||||
FAST f=20 a=10 0.027122 10.756017 8 1106
|
||||
FAST f=21 a=1 5.701396 10.655398 8 1154
|
||||
FAST f=21 a=1 0.067744 10.655398 8 1154
|
||||
FAST f=21 a=2 5.270542 10.650743 6 1106
|
||||
FAST f=21 a=2 0.052999 10.650743 6 1106
|
||||
FAST f=21 a=3 4.945294 10.652380 8 1154
|
||||
FAST f=21 a=3 0.052678 10.652380 8 1154
|
||||
FAST f=21 a=4 4.894079 10.543185 8 1298
|
||||
FAST f=21 a=4 0.04997 10.543185 8 1298
|
||||
FAST f=21 a=5 4.785417 10.630321 6 1010
|
||||
FAST f=21 a=5 0.045294 10.630321 6 1010
|
||||
FAST f=21 a=6 4.789381 10.664477 6 1874
|
||||
FAST f=21 a=6 0.046578 10.664477 6 1874
|
||||
FAST f=21 a=7 4.302955 10.805179 6 1106
|
||||
FAST f=21 a=7 0.041205 10.805179 6 1106
|
||||
FAST f=21 a=8 4.034630 10.551211 8 1298
|
||||
FAST f=21 a=8 0.040121 10.551211 8 1298
|
||||
FAST f=21 a=9 4.523868 10.799114 6 1010
|
||||
FAST f=21 a=9 0.043592 10.799114 6 1010
|
||||
FAST f=21 a=10 4.760736 10.750255 8 1106
|
||||
FAST f=21 a=10 0.043483 10.750255 8 1106
|
||||
FAST f=22 a=1 6.743064 10.640537 8 1154
|
||||
FAST f=22 a=1 0.086967 10.640537 8 1154
|
||||
FAST f=22 a=2 6.121739 10.626638 6 1970
|
||||
FAST f=22 a=2 0.066337 10.626638 6 1970
|
||||
FAST f=22 a=3 5.248851 10.640688 8 1154
|
||||
FAST f=22 a=3 0.054935 10.640688 8 1154
|
||||
FAST f=22 a=4 5.436579 10.588333 8 1298
|
||||
FAST f=22 a=4 0.064113 10.588333 8 1298
|
||||
FAST f=22 a=5 5.812815 10.652653 6 1010
|
||||
FAST f=22 a=5 0.058189 10.652653 6 1010
|
||||
FAST f=22 a=6 5.745472 10.666437 6 1874
|
||||
FAST f=22 a=6 0.057188 10.666437 6 1874
|
||||
FAST f=22 a=7 5.716393 10.806911 6 1106
|
||||
FAST f=22 a=7 0.056 10.806911 6 1106
|
||||
FAST f=22 a=8 5.698799 10.530784 8 1298
|
||||
FAST f=22 a=8 0.0583 10.530784 8 1298
|
||||
FAST f=22 a=9 5.710533 10.777391 6 1010
|
||||
FAST f=22 a=9 0.054945 10.777391 6 1010
|
||||
FAST f=22 a=10 5.685395 10.745023 8 1106
|
||||
FAST f=22 a=10 0.056526 10.745023 8 1106
|
||||
FAST f=23 a=1 7.836923 10.638828 8 1154
|
||||
FAST f=23 a=1 0.099522 10.638828 8 1154
|
||||
FAST f=23 a=2 6.627834 10.631061 6 1970
|
||||
FAST f=23 a=2 0.066769 10.631061 6 1970
|
||||
FAST f=23 a=3 5.602533 10.647288 8 1154
|
||||
FAST f=23 a=3 0.064513 10.647288 8 1154
|
||||
FAST f=23 a=4 6.005580 10.568747 8 1298
|
||||
FAST f=23 a=4 0.062022 10.568747 8 1298
|
||||
FAST f=23 a=5 5.481816 10.676921 6 1010
|
||||
FAST f=23 a=5 0.058959 10.676921 6 1010
|
||||
FAST f=23 a=6 5.460444 10.666194 6 1874
|
||||
FAST f=23 a=6 0.057687 10.666194 6 1874
|
||||
FAST f=23 a=7 5.659822 10.800377 6 1106
|
||||
FAST f=23 a=7 0.06783 10.800377 6 1106
|
||||
FAST f=23 a=8 6.826940 10.522167 8 1298
|
||||
FAST f=23 a=8 0.070533 10.522167 8 1298
|
||||
FAST f=23 a=9 6.804757 10.577799 8 1682
|
||||
FAST f=23 a=9 0.069949 10.577799 8 1682
|
||||
FAST f=23 a=10 6.774933 10.742093 8 1106
|
||||
FAST f=23 a=10 0.068395 10.742093 8 1106
|
||||
FAST f=24 a=1 8.444110 10.632783 8 1154
|
||||
FAST f=24 a=1 0.094357 10.632783 8 1154
|
||||
FAST f=24 a=2 7.289578 10.631061 6 1970
|
||||
FAST f=24 a=2 0.098515 10.631061 6 1970
|
||||
FAST f=24 a=3 8.619780 10.646289 8 1154
|
||||
FAST f=24 a=3 0.098041 10.646289 8 1154
|
||||
FAST f=24 a=4 8.508455 10.555199 8 1298
|
||||
FAST f=24 a=4 0.093885 10.555199 8 1298
|
||||
FAST f=24 a=5 8.471145 10.674363 6 1010
|
||||
FAST f=24 a=5 0.088676 10.674363 6 1010
|
||||
FAST f=24 a=6 8.426727 10.667228 6 1874
|
||||
FAST f=24 a=6 0.087247 10.667228 6 1874
|
||||
FAST f=24 a=7 8.356826 10.803027 6 1106
|
||||
FAST f=24 a=7 0.085835 10.803027 6 1106
|
||||
FAST f=24 a=8 6.756811 10.522049 8 1298
|
||||
FAST f=24 a=8 0.07107 10.522049 8 1298
|
||||
FAST f=24 a=9 6.548169 10.571882 8 1682
|
||||
FAST f=24 a=9 0.0713 10.571882 8 1682
|
||||
FAST f=24 a=10 8.238079 10.736453 8 1106
|
||||
FAST f=24 a=10 0.07004 10.736453 8 1106
|
||||
|
||||
|
||||
hg-commands:
|
||||
NODICT 0.000005 2.425276
|
||||
RANDOM 0.046332 3.490331
|
||||
LEGACY 0.720351 3.911682
|
||||
COVER 45.507731 4.132653 8 386
|
||||
COVER 1.868810 4.132653 8 386
|
||||
FAST f=15 a=1 4.561427 3.866894 8 1202
|
||||
FAST f=15 a=1 0.048946 3.866894 8 1202
|
||||
FAST f=15 a=2 3.574462 3.892119 8 1538
|
||||
FAST f=15 a=2 0.033677 3.892119 8 1538
|
||||
FAST f=15 a=3 3.230227 3.888791 6 1346
|
||||
FAST f=15 a=3 0.034312 3.888791 6 1346
|
||||
FAST f=15 a=4 3.042388 3.899739 8 1010
|
||||
FAST f=15 a=4 0.024307 3.899739 8 1010
|
||||
FAST f=15 a=5 2.800148 3.896220 8 818
|
||||
FAST f=15 a=5 0.022331 3.896220 8 818
|
||||
FAST f=15 a=6 2.706518 3.882039 8 578
|
||||
FAST f=15 a=6 0.020955 3.882039 8 578
|
||||
FAST f=15 a=7 2.701820 3.885430 6 866
|
||||
FAST f=15 a=7 0.026074 3.885430 6 866
|
||||
FAST f=15 a=8 2.604445 3.906932 8 1826
|
||||
FAST f=15 a=8 0.021789 3.906932 8 1826
|
||||
FAST f=15 a=9 2.598568 3.870324 6 1682
|
||||
FAST f=15 a=9 0.026004 3.870324 6 1682
|
||||
FAST f=15 a=10 2.575920 3.920783 8 1442
|
||||
FAST f=15 a=10 0.020228 3.920783 8 1442
|
||||
FAST f=16 a=1 4.630623 4.001430 8 770
|
||||
FAST f=16 a=1 0.047497 4.001430 8 770
|
||||
FAST f=16 a=2 3.674721 3.974431 8 1874
|
||||
FAST f=16 a=2 0.035761 3.974431 8 1874
|
||||
FAST f=16 a=3 3.338384 3.978703 8 1010
|
||||
FAST f=16 a=3 0.029436 3.978703 8 1010
|
||||
FAST f=16 a=4 3.004412 3.983035 8 1010
|
||||
FAST f=16 a=4 0.025744 3.983035 8 1010
|
||||
FAST f=16 a=5 2.881892 3.987710 8 770
|
||||
FAST f=16 a=5 0.023211 3.987710 8 770
|
||||
FAST f=16 a=6 2.807410 3.952717 8 1298
|
||||
FAST f=16 a=6 0.023199 3.952717 8 1298
|
||||
FAST f=16 a=7 2.819623 3.994627 8 770
|
||||
FAST f=16 a=7 0.021806 3.994627 8 770
|
||||
FAST f=16 a=8 2.740092 3.954032 8 1826
|
||||
FAST f=16 a=8 0.0226 3.954032 8 1826
|
||||
FAST f=16 a=9 2.682564 3.969879 6 1442
|
||||
FAST f=16 a=9 0.026324 3.969879 6 1442
|
||||
FAST f=16 a=10 2.657959 3.969755 8 674
|
||||
FAST f=16 a=10 0.020413 3.969755 8 674
|
||||
FAST f=17 a=1 4.729228 4.046000 8 530
|
||||
FAST f=17 a=1 0.049703 4.046000 8 530
|
||||
FAST f=17 a=2 3.764510 3.991519 8 1970
|
||||
FAST f=17 a=2 0.038195 3.991519 8 1970
|
||||
FAST f=17 a=3 3.416992 4.006296 6 914
|
||||
FAST f=17 a=3 0.036244 4.006296 6 914
|
||||
FAST f=17 a=4 3.145626 3.979182 8 1970
|
||||
FAST f=17 a=4 0.028676 3.979182 8 1970
|
||||
FAST f=17 a=5 2.995070 4.050070 8 770
|
||||
FAST f=17 a=5 0.025707 4.050070 8 770
|
||||
FAST f=17 a=6 2.911833 4.040024 8 770
|
||||
FAST f=17 a=6 0.02453 4.040024 8 770
|
||||
FAST f=17 a=7 2.894796 4.015884 8 818
|
||||
FAST f=17 a=7 0.023956 4.015884 8 818
|
||||
FAST f=17 a=8 2.789962 4.039303 8 530
|
||||
FAST f=17 a=8 0.023219 4.039303 8 530
|
||||
FAST f=17 a=9 2.787625 3.996762 8 1634
|
||||
FAST f=17 a=9 0.023651 3.996762 8 1634
|
||||
FAST f=17 a=10 2.754796 4.005059 8 1058
|
||||
FAST f=17 a=10 0.022537 4.005059 8 1058
|
||||
FAST f=18 a=1 4.779117 4.038214 8 242
|
||||
FAST f=18 a=1 0.048814 4.038214 8 242
|
||||
FAST f=18 a=2 3.829753 4.045768 8 722
|
||||
FAST f=18 a=2 0.036541 4.045768 8 722
|
||||
FAST f=18 a=3 3.495053 4.021497 8 770
|
||||
FAST f=18 a=3 0.032648 4.021497 8 770
|
||||
FAST f=18 a=4 3.221395 4.039623 8 770
|
||||
FAST f=18 a=4 0.027818 4.039623 8 770
|
||||
FAST f=18 a=5 3.059369 4.050414 8 530
|
||||
FAST f=18 a=5 0.026296 4.050414 8 530
|
||||
FAST f=18 a=6 3.019292 4.010714 6 962
|
||||
FAST f=18 a=6 0.031104 4.010714 6 962
|
||||
FAST f=18 a=7 2.949322 4.031439 6 770
|
||||
FAST f=18 a=7 0.030745 4.031439 6 770
|
||||
FAST f=18 a=8 2.876425 4.032088 6 386
|
||||
FAST f=18 a=8 0.027407 4.032088 6 386
|
||||
FAST f=18 a=9 2.850958 4.053372 8 674
|
||||
FAST f=18 a=9 0.023799 4.053372 8 674
|
||||
FAST f=18 a=10 2.884352 4.020148 8 1730
|
||||
FAST f=18 a=10 0.024401 4.020148 8 1730
|
||||
FAST f=19 a=1 4.815669 4.061203 8 674
|
||||
FAST f=19 a=1 0.051425 4.061203 8 674
|
||||
FAST f=19 a=2 3.951356 4.013822 8 1442
|
||||
FAST f=19 a=2 0.039968 4.013822 8 1442
|
||||
FAST f=19 a=3 3.554682 4.050425 8 722
|
||||
FAST f=19 a=3 0.032725 4.050425 8 722
|
||||
FAST f=19 a=4 3.242585 4.054677 8 722
|
||||
FAST f=19 a=4 0.028194 4.054677 8 722
|
||||
FAST f=19 a=5 3.105909 4.064524 8 818
|
||||
FAST f=19 a=5 0.02675 4.064524 8 818
|
||||
FAST f=19 a=6 3.059901 4.036857 8 1250
|
||||
FAST f=19 a=6 0.026396 4.036857 8 1250
|
||||
FAST f=19 a=7 3.016151 4.068234 6 770
|
||||
FAST f=19 a=7 0.031501 4.068234 6 770
|
||||
FAST f=19 a=8 2.962902 4.077509 8 530
|
||||
FAST f=19 a=8 0.023333 4.077509 8 530
|
||||
FAST f=19 a=9 2.899607 4.067328 8 530
|
||||
FAST f=19 a=9 0.024553 4.067328 8 530
|
||||
FAST f=19 a=10 2.950978 4.059901 8 434
|
||||
FAST f=19 a=10 0.023852 4.059901 8 434
|
||||
FAST f=20 a=1 5.259834 4.027579 8 1634
|
||||
FAST f=20 a=1 0.061123 4.027579 8 1634
|
||||
FAST f=20 a=2 4.382150 4.025093 8 1634
|
||||
FAST f=20 a=2 0.048009 4.025093 8 1634
|
||||
FAST f=20 a=3 4.104323 4.060842 8 530
|
||||
FAST f=20 a=3 0.040965 4.060842 8 530
|
||||
FAST f=20 a=4 3.853340 4.023504 6 914
|
||||
FAST f=20 a=4 0.041072 4.023504 6 914
|
||||
FAST f=20 a=5 3.728841 4.018089 6 1634
|
||||
FAST f=20 a=5 0.037469 4.018089 6 1634
|
||||
FAST f=20 a=6 3.683045 4.069138 8 578
|
||||
FAST f=20 a=6 0.028011 4.069138 8 578
|
||||
FAST f=20 a=7 3.726973 4.063160 8 722
|
||||
FAST f=20 a=7 0.028437 4.063160 8 722
|
||||
FAST f=20 a=8 3.555073 4.057690 8 386
|
||||
FAST f=20 a=8 0.027588 4.057690 8 386
|
||||
FAST f=20 a=9 3.551095 4.067253 8 482
|
||||
FAST f=20 a=9 0.025976 4.067253 8 482
|
||||
FAST f=20 a=10 3.490127 4.068518 8 530
|
||||
FAST f=20 a=10 0.025971 4.068518 8 530
|
||||
FAST f=21 a=1 7.343816 4.064945 8 770
|
||||
FAST f=21 a=1 0.085035 4.064945 8 770
|
||||
FAST f=21 a=2 5.930894 4.048206 8 386
|
||||
FAST f=21 a=2 0.067349 4.048206 8 386
|
||||
FAST f=21 a=3 6.770775 4.063417 8 578
|
||||
FAST f=21 a=3 0.077104 4.063417 8 578
|
||||
FAST f=21 a=4 6.889409 4.066761 8 626
|
||||
FAST f=21 a=4 0.0717 4.066761 8 626
|
||||
FAST f=21 a=5 6.714896 4.051813 8 914
|
||||
FAST f=21 a=5 0.071026 4.051813 8 914
|
||||
FAST f=21 a=6 6.539890 4.047263 8 1922
|
||||
FAST f=21 a=6 0.07127 4.047263 8 1922
|
||||
FAST f=21 a=7 6.511052 4.068373 8 482
|
||||
FAST f=21 a=7 0.065467 4.068373 8 482
|
||||
FAST f=21 a=8 6.458788 4.071597 8 482
|
||||
FAST f=21 a=8 0.063817 4.071597 8 482
|
||||
FAST f=21 a=9 6.377591 4.052905 8 434
|
||||
FAST f=21 a=9 0.063112 4.052905 8 434
|
||||
FAST f=21 a=10 6.360752 4.047773 8 530
|
||||
FAST f=21 a=10 0.063606 4.047773 8 530
|
||||
FAST f=22 a=1 10.523471 4.040812 8 962
|
||||
FAST f=22 a=1 0.14214 4.040812 8 962
|
||||
FAST f=22 a=2 9.454758 4.059396 8 914
|
||||
FAST f=22 a=2 0.118343 4.059396 8 914
|
||||
FAST f=22 a=3 9.043197 4.043019 8 1922
|
||||
FAST f=22 a=3 0.109798 4.043019 8 1922
|
||||
FAST f=22 a=4 8.716261 4.044819 8 770
|
||||
FAST f=22 a=4 0.099687 4.044819 8 770
|
||||
FAST f=22 a=5 8.529472 4.070576 8 530
|
||||
FAST f=22 a=5 0.093127 4.070576 8 530
|
||||
FAST f=22 a=6 8.424241 4.070565 8 722
|
||||
FAST f=22 a=6 0.093703 4.070565 8 722
|
||||
FAST f=22 a=7 8.403391 4.070591 8 578
|
||||
FAST f=22 a=7 0.089763 4.070591 8 578
|
||||
FAST f=22 a=8 8.285221 4.089171 8 530
|
||||
FAST f=22 a=8 0.087716 4.089171 8 530
|
||||
FAST f=22 a=9 8.282506 4.047470 8 722
|
||||
FAST f=22 a=9 0.089773 4.047470 8 722
|
||||
FAST f=22 a=10 8.241809 4.064151 8 818
|
||||
FAST f=22 a=10 0.090413 4.064151 8 818
|
||||
FAST f=23 a=1 12.389208 4.051635 6 530
|
||||
FAST f=23 a=1 0.147796 4.051635 6 530
|
||||
FAST f=23 a=2 11.300910 4.042835 6 914
|
||||
FAST f=23 a=2 0.133178 4.042835 6 914
|
||||
FAST f=23 a=3 10.879455 4.047415 8 626
|
||||
FAST f=23 a=3 0.129571 4.047415 8 626
|
||||
FAST f=23 a=4 10.522718 4.038269 6 914
|
||||
FAST f=23 a=4 0.118121 4.038269 6 914
|
||||
FAST f=23 a=5 10.348043 4.066884 8 434
|
||||
FAST f=23 a=5 0.112098 4.066884 8 434
|
||||
FAST f=23 a=6 10.238630 4.048635 8 1010
|
||||
FAST f=23 a=6 0.120281 4.048635 8 1010
|
||||
FAST f=23 a=7 10.213255 4.061809 8 530
|
||||
FAST f=23 a=7 0.1121 4.061809 8 530
|
||||
FAST f=23 a=8 10.107879 4.074104 8 818
|
||||
FAST f=23 a=8 0.116544 4.074104 8 818
|
||||
FAST f=23 a=9 10.063424 4.064811 8 674
|
||||
FAST f=23 a=9 0.109045 4.064811 8 674
|
||||
FAST f=23 a=10 10.035801 4.054918 8 530
|
||||
FAST f=23 a=10 0.108735 4.054918 8 530
|
||||
FAST f=24 a=1 14.963878 4.073490 8 722
|
||||
FAST f=24 a=1 0.206344 4.073490 8 722
|
||||
FAST f=24 a=2 13.833472 4.036100 8 962
|
||||
FAST f=24 a=2 0.17486 4.036100 8 962
|
||||
FAST f=24 a=3 13.404631 4.026281 6 1106
|
||||
FAST f=24 a=3 0.153961 4.026281 6 1106
|
||||
FAST f=24 a=4 13.041164 4.065448 8 674
|
||||
FAST f=24 a=4 0.155509 4.065448 8 674
|
||||
FAST f=24 a=5 12.879412 4.054636 8 674
|
||||
FAST f=24 a=5 0.148282 4.054636 8 674
|
||||
FAST f=24 a=6 12.773736 4.081376 8 530
|
||||
FAST f=24 a=6 0.142563 4.081376 8 530
|
||||
FAST f=24 a=7 12.711310 4.059834 8 770
|
||||
FAST f=24 a=7 0.149321 4.059834 8 770
|
||||
FAST f=24 a=8 12.635459 4.052050 8 1298
|
||||
FAST f=24 a=8 0.15095 4.052050 8 1298
|
||||
FAST f=24 a=9 12.558104 4.076516 8 722
|
||||
FAST f=24 a=9 0.144361 4.076516 8 722
|
||||
FAST f=24 a=10 10.661348 4.062137 8 818
|
||||
FAST f=24 a=10 0.108232 4.062137 8 818
|
||||
|
||||
|
||||
hg-changelog:
|
||||
NODICT 0.000017 1.377590
|
||||
RANDOM 0.186171 2.097487
|
||||
LEGACY 1.670867 2.058907
|
||||
COVER 173.561948 2.189685 8 98
|
||||
COVER 4.811180 2.189685 8 98
|
||||
FAST f=15 a=1 18.685906 2.129682 8 434
|
||||
FAST f=15 a=1 0.173376 2.129682 8 434
|
||||
FAST f=15 a=2 12.928259 2.131890 8 482
|
||||
FAST f=15 a=2 0.102582 2.131890 8 482
|
||||
FAST f=15 a=3 11.132343 2.128027 8 386
|
||||
FAST f=15 a=3 0.077122 2.128027 8 386
|
||||
FAST f=15 a=4 10.120683 2.125797 8 434
|
||||
FAST f=15 a=4 0.065175 2.125797 8 434
|
||||
FAST f=15 a=5 9.479092 2.127697 8 386
|
||||
FAST f=15 a=5 0.057905 2.127697 8 386
|
||||
FAST f=15 a=6 9.159523 2.127132 8 1682
|
||||
FAST f=15 a=6 0.058604 2.127132 8 1682
|
||||
FAST f=15 a=7 8.724003 2.129914 8 434
|
||||
FAST f=15 a=7 0.0493 2.129914 8 434
|
||||
FAST f=15 a=8 8.595001 2.127137 8 338
|
||||
FAST f=15 a=8 0.0474 2.127137 8 338
|
||||
FAST f=15 a=9 8.356405 2.125512 8 482
|
||||
FAST f=15 a=9 0.046126 2.125512 8 482
|
||||
FAST f=15 a=10 8.207111 2.126066 8 338
|
||||
FAST f=15 a=10 0.043292 2.126066 8 338
|
||||
FAST f=16 a=1 18.464436 2.144040 8 242
|
||||
FAST f=16 a=1 0.172156 2.144040 8 242
|
||||
FAST f=16 a=2 12.844825 2.148171 8 194
|
||||
FAST f=16 a=2 0.099619 2.148171 8 194
|
||||
FAST f=16 a=3 11.082568 2.140837 8 290
|
||||
FAST f=16 a=3 0.079165 2.140837 8 290
|
||||
FAST f=16 a=4 10.066749 2.144405 8 386
|
||||
FAST f=16 a=4 0.068411 2.144405 8 386
|
||||
FAST f=16 a=5 9.501121 2.140720 8 386
|
||||
FAST f=16 a=5 0.061316 2.140720 8 386
|
||||
FAST f=16 a=6 9.179332 2.139478 8 386
|
||||
FAST f=16 a=6 0.056322 2.139478 8 386
|
||||
FAST f=16 a=7 8.849438 2.142412 8 194
|
||||
FAST f=16 a=7 0.050493 2.142412 8 194
|
||||
FAST f=16 a=8 8.810919 2.143454 8 434
|
||||
FAST f=16 a=8 0.051304 2.143454 8 434
|
||||
FAST f=16 a=9 8.553900 2.140339 8 194
|
||||
FAST f=16 a=9 0.047285 2.140339 8 194
|
||||
FAST f=16 a=10 8.398027 2.143130 8 386
|
||||
FAST f=16 a=10 0.046386 2.143130 8 386
|
||||
FAST f=17 a=1 18.644657 2.157192 8 98
|
||||
FAST f=17 a=1 0.173884 2.157192 8 98
|
||||
FAST f=17 a=2 13.071242 2.159830 8 146
|
||||
FAST f=17 a=2 0.10388 2.159830 8 146
|
||||
FAST f=17 a=3 11.332366 2.153654 6 194
|
||||
FAST f=17 a=3 0.08983 2.153654 6 194
|
||||
FAST f=17 a=4 10.362413 2.156813 8 242
|
||||
FAST f=17 a=4 0.070389 2.156813 8 242
|
||||
FAST f=17 a=5 9.808159 2.155098 6 338
|
||||
FAST f=17 a=5 0.072661 2.155098 6 338
|
||||
FAST f=17 a=6 9.451165 2.153845 6 146
|
||||
FAST f=17 a=6 0.064959 2.153845 6 146
|
||||
FAST f=17 a=7 9.163097 2.155424 6 242
|
||||
FAST f=17 a=7 0.064323 2.155424 6 242
|
||||
FAST f=17 a=8 9.047276 2.156640 8 242
|
||||
FAST f=17 a=8 0.053382 2.156640 8 242
|
||||
FAST f=17 a=9 8.807671 2.152396 8 146
|
||||
FAST f=17 a=9 0.049617 2.152396 8 146
|
||||
FAST f=17 a=10 8.649827 2.152370 8 146
|
||||
FAST f=17 a=10 0.047849 2.152370 8 146
|
||||
FAST f=18 a=1 18.809502 2.168116 8 98
|
||||
FAST f=18 a=1 0.175226 2.168116 8 98
|
||||
FAST f=18 a=2 13.756502 2.170870 6 242
|
||||
FAST f=18 a=2 0.119507 2.170870 6 242
|
||||
FAST f=18 a=3 12.059748 2.163094 6 98
|
||||
FAST f=18 a=3 0.093912 2.163094 6 98
|
||||
FAST f=18 a=4 11.410294 2.172372 8 98
|
||||
FAST f=18 a=4 0.073048 2.172372 8 98
|
||||
FAST f=18 a=5 10.560297 2.166388 8 98
|
||||
FAST f=18 a=5 0.065136 2.166388 8 98
|
||||
FAST f=18 a=6 10.071390 2.162672 8 98
|
||||
FAST f=18 a=6 0.059402 2.162672 8 98
|
||||
FAST f=18 a=7 10.084214 2.166624 6 194
|
||||
FAST f=18 a=7 0.073276 2.166624 6 194
|
||||
FAST f=18 a=8 9.953226 2.167454 8 98
|
||||
FAST f=18 a=8 0.053659 2.167454 8 98
|
||||
FAST f=18 a=9 8.982461 2.161593 6 146
|
||||
FAST f=18 a=9 0.05955 2.161593 6 146
|
||||
FAST f=18 a=10 8.986092 2.164373 6 242
|
||||
FAST f=18 a=10 0.059135 2.164373 6 242
|
||||
FAST f=19 a=1 18.908277 2.176021 8 98
|
||||
FAST f=19 a=1 0.177316 2.176021 8 98
|
||||
FAST f=19 a=2 13.471313 2.176103 8 98
|
||||
FAST f=19 a=2 0.106344 2.176103 8 98
|
||||
FAST f=19 a=3 11.571406 2.172812 8 98
|
||||
FAST f=19 a=3 0.083293 2.172812 8 98
|
||||
FAST f=19 a=4 10.632775 2.177770 6 146
|
||||
FAST f=19 a=4 0.079864 2.177770 6 146
|
||||
FAST f=19 a=5 10.030190 2.175574 6 146
|
||||
FAST f=19 a=5 0.07223 2.175574 6 146
|
||||
FAST f=19 a=6 9.717818 2.169997 8 98
|
||||
FAST f=19 a=6 0.060049 2.169997 8 98
|
||||
FAST f=19 a=7 9.397531 2.172770 8 146
|
||||
FAST f=19 a=7 0.057188 2.172770 8 146
|
||||
FAST f=19 a=8 9.281061 2.175822 8 98
|
||||
FAST f=19 a=8 0.053711 2.175822 8 98
|
||||
FAST f=19 a=9 9.165242 2.169849 6 146
|
||||
FAST f=19 a=9 0.059898 2.169849 6 146
|
||||
FAST f=19 a=10 9.048763 2.173394 8 98
|
||||
FAST f=19 a=10 0.049757 2.173394 8 98
|
||||
FAST f=20 a=1 21.166917 2.183923 6 98
|
||||
FAST f=20 a=1 0.205425 2.183923 6 98
|
||||
FAST f=20 a=2 15.642753 2.182349 6 98
|
||||
FAST f=20 a=2 0.135957 2.182349 6 98
|
||||
FAST f=20 a=3 14.053730 2.173544 6 98
|
||||
FAST f=20 a=3 0.11266 2.173544 6 98
|
||||
FAST f=20 a=4 15.270019 2.183656 8 98
|
||||
FAST f=20 a=4 0.107892 2.183656 8 98
|
||||
FAST f=20 a=5 15.497927 2.174661 6 98
|
||||
FAST f=20 a=5 0.100305 2.174661 6 98
|
||||
FAST f=20 a=6 13.973505 2.172391 8 98
|
||||
FAST f=20 a=6 0.087565 2.172391 8 98
|
||||
FAST f=20 a=7 14.083296 2.172443 8 98
|
||||
FAST f=20 a=7 0.078062 2.172443 8 98
|
||||
FAST f=20 a=8 12.560048 2.175581 8 98
|
||||
FAST f=20 a=8 0.070282 2.175581 8 98
|
||||
FAST f=20 a=9 13.078645 2.173975 6 146
|
||||
FAST f=20 a=9 0.081041 2.173975 6 146
|
||||
FAST f=20 a=10 12.823328 2.177778 8 98
|
||||
FAST f=20 a=10 0.074522 2.177778 8 98
|
||||
FAST f=21 a=1 29.825370 2.183057 6 98
|
||||
FAST f=21 a=1 0.334453 2.183057 6 98
|
||||
FAST f=21 a=2 29.476474 2.182752 8 98
|
||||
FAST f=21 a=2 0.286602 2.182752 8 98
|
||||
FAST f=21 a=3 25.937186 2.175867 8 98
|
||||
FAST f=21 a=3 0.17626 2.175867 8 98
|
||||
FAST f=21 a=4 20.413865 2.179780 8 98
|
||||
FAST f=21 a=4 0.206085 2.179780 8 98
|
||||
FAST f=21 a=5 20.541889 2.178328 6 146
|
||||
FAST f=21 a=5 0.199157 2.178328 6 146
|
||||
FAST f=21 a=6 21.090670 2.174443 6 146
|
||||
FAST f=21 a=6 0.190645 2.174443 6 146
|
||||
FAST f=21 a=7 20.221569 2.177384 6 146
|
||||
FAST f=21 a=7 0.184278 2.177384 6 146
|
||||
FAST f=21 a=8 20.322357 2.179456 6 98
|
||||
FAST f=21 a=8 0.178458 2.179456 6 98
|
||||
FAST f=21 a=9 20.683912 2.174396 6 146
|
||||
FAST f=21 a=9 0.190829 2.174396 6 146
|
||||
FAST f=21 a=10 20.840865 2.174905 8 98
|
||||
FAST f=21 a=10 0.172515 2.174905 8 98
|
||||
FAST f=22 a=1 36.822827 2.181612 6 98
|
||||
FAST f=22 a=1 0.437389 2.181612 6 98
|
||||
FAST f=22 a=2 30.616902 2.183142 8 98
|
||||
FAST f=22 a=2 0.324284 2.183142 8 98
|
||||
FAST f=22 a=3 28.472482 2.178130 8 98
|
||||
FAST f=22 a=3 0.236538 2.178130 8 98
|
||||
FAST f=22 a=4 25.847028 2.181878 8 98
|
||||
FAST f=22 a=4 0.263744 2.181878 8 98
|
||||
FAST f=22 a=5 27.095881 2.180775 8 98
|
||||
FAST f=22 a=5 0.24988 2.180775 8 98
|
||||
FAST f=22 a=6 25.939172 2.170916 8 98
|
||||
FAST f=22 a=6 0.240033 2.170916 8 98
|
||||
FAST f=22 a=7 27.064194 2.177849 8 98
|
||||
FAST f=22 a=7 0.242383 2.177849 8 98
|
||||
FAST f=22 a=8 25.140221 2.178216 8 98
|
||||
FAST f=22 a=8 0.237601 2.178216 8 98
|
||||
FAST f=22 a=9 25.505283 2.177455 6 146
|
||||
FAST f=22 a=9 0.223217 2.177455 6 146
|
||||
FAST f=22 a=10 24.529362 2.176705 6 98
|
||||
FAST f=22 a=10 0.222876 2.176705 6 98
|
||||
FAST f=23 a=1 39.127310 2.183006 6 98
|
||||
FAST f=23 a=1 0.417338 2.183006 6 98
|
||||
FAST f=23 a=2 32.468161 2.183524 6 98
|
||||
FAST f=23 a=2 0.351645 2.183524 6 98
|
||||
FAST f=23 a=3 31.577620 2.172604 6 98
|
||||
FAST f=23 a=3 0.319659 2.172604 6 98
|
||||
FAST f=23 a=4 30.129247 2.183932 6 98
|
||||
FAST f=23 a=4 0.307239 2.183932 6 98
|
||||
FAST f=23 a=5 29.103376 2.183529 6 146
|
||||
FAST f=23 a=5 0.285533 2.183529 6 146
|
||||
FAST f=23 a=6 29.776045 2.174367 8 98
|
||||
FAST f=23 a=6 0.276846 2.174367 8 98
|
||||
FAST f=23 a=7 28.940407 2.178022 6 146
|
||||
FAST f=23 a=7 0.274082 2.178022 6 146
|
||||
FAST f=23 a=8 29.256009 2.179462 6 98
|
||||
FAST f=23 a=8 0.26949 2.179462 6 98
|
||||
FAST f=23 a=9 29.347312 2.170407 8 98
|
||||
FAST f=23 a=9 0.265034 2.170407 8 98
|
||||
FAST f=23 a=10 29.140081 2.171762 8 98
|
||||
FAST f=23 a=10 0.259183 2.171762 8 98
|
||||
FAST f=24 a=1 44.871179 2.182115 6 98
|
||||
FAST f=24 a=1 0.509433 2.182115 6 98
|
||||
FAST f=24 a=2 38.694867 2.180549 8 98
|
||||
FAST f=24 a=2 0.406695 2.180549 8 98
|
||||
FAST f=24 a=3 38.363769 2.172821 8 98
|
||||
FAST f=24 a=3 0.359581 2.172821 8 98
|
||||
FAST f=24 a=4 36.580797 2.184142 8 98
|
||||
FAST f=24 a=4 0.340614 2.184142 8 98
|
||||
FAST f=24 a=5 33.125701 2.183301 8 98
|
||||
FAST f=24 a=5 0.324874 2.183301 8 98
|
||||
FAST f=24 a=6 34.776068 2.173019 6 146
|
||||
FAST f=24 a=6 0.340397 2.173019 6 146
|
||||
FAST f=24 a=7 34.417625 2.176561 6 146
|
||||
FAST f=24 a=7 0.308223 2.176561 6 146
|
||||
FAST f=24 a=8 35.470291 2.182161 6 98
|
||||
FAST f=24 a=8 0.307724 2.182161 6 98
|
||||
FAST f=24 a=9 34.927252 2.172682 6 146
|
||||
FAST f=24 a=9 0.300598 2.172682 6 146
|
||||
FAST f=24 a=10 33.238355 2.173395 6 98
|
||||
FAST f=24 a=10 0.249916 2.173395 6 98
|
||||
|
||||
|
||||
hg-manifest:
|
||||
NODICT 0.000004 1.866377
|
||||
RANDOM 0.696346 2.309436
|
||||
LEGACY 7.064527 2.506977
|
||||
COVER 876.312865 2.582528 8 434
|
||||
COVER 35.684533 2.582528 8 434
|
||||
FAST f=15 a=1 76.618201 2.404013 8 1202
|
||||
FAST f=15 a=1 0.700722 2.404013 8 1202
|
||||
FAST f=15 a=2 49.213058 2.409248 6 1826
|
||||
FAST f=15 a=2 0.473393 2.409248 6 1826
|
||||
FAST f=15 a=3 41.753197 2.409677 8 1490
|
||||
FAST f=15 a=3 0.336848 2.409677 8 1490
|
||||
FAST f=15 a=4 38.648295 2.407996 8 1538
|
||||
FAST f=15 a=4 0.283952 2.407996 8 1538
|
||||
FAST f=15 a=5 36.144936 2.402895 8 1874
|
||||
FAST f=15 a=5 0.270128 2.402895 8 1874
|
||||
FAST f=15 a=6 35.484675 2.394873 8 1586
|
||||
FAST f=15 a=6 0.251637 2.394873 8 1586
|
||||
FAST f=15 a=7 34.280599 2.397311 8 1778
|
||||
FAST f=15 a=7 0.23984 2.397311 8 1778
|
||||
FAST f=15 a=8 32.122572 2.396089 6 1490
|
||||
FAST f=15 a=8 0.251508 2.396089 6 1490
|
||||
FAST f=15 a=9 29.909842 2.390092 6 1970
|
||||
FAST f=15 a=9 0.251233 2.390092 6 1970
|
||||
FAST f=15 a=10 30.102938 2.400086 6 1682
|
||||
FAST f=15 a=10 0.23688 2.400086 6 1682
|
||||
FAST f=16 a=1 67.750401 2.475460 6 1346
|
||||
FAST f=16 a=1 0.796035 2.475460 6 1346
|
||||
FAST f=16 a=2 52.812027 2.480860 6 1730
|
||||
FAST f=16 a=2 0.480384 2.480860 6 1730
|
||||
FAST f=16 a=3 44.179259 2.469304 8 1970
|
||||
FAST f=16 a=3 0.332657 2.469304 8 1970
|
||||
FAST f=16 a=4 37.612728 2.478208 6 1970
|
||||
FAST f=16 a=4 0.32498 2.478208 6 1970
|
||||
FAST f=16 a=5 35.056222 2.475568 6 1298
|
||||
FAST f=16 a=5 0.302824 2.475568 6 1298
|
||||
FAST f=16 a=6 34.713012 2.486079 8 1730
|
||||
FAST f=16 a=6 0.24755 2.486079 8 1730
|
||||
FAST f=16 a=7 33.713687 2.477180 6 1682
|
||||
FAST f=16 a=7 0.280358 2.477180 6 1682
|
||||
FAST f=16 a=8 31.571412 2.475418 8 1538
|
||||
FAST f=16 a=8 0.241241 2.475418 8 1538
|
||||
FAST f=16 a=9 31.608069 2.478263 8 1922
|
||||
FAST f=16 a=9 0.241764 2.478263 8 1922
|
||||
FAST f=16 a=10 31.358002 2.472263 8 1442
|
||||
FAST f=16 a=10 0.221661 2.472263 8 1442
|
||||
FAST f=17 a=1 66.185775 2.536085 6 1346
|
||||
FAST f=17 a=1 0.713549 2.536085 6 1346
|
||||
FAST f=17 a=2 50.365000 2.546105 8 1298
|
||||
FAST f=17 a=2 0.467846 2.546105 8 1298
|
||||
FAST f=17 a=3 42.712843 2.536250 8 1298
|
||||
FAST f=17 a=3 0.34047 2.536250 8 1298
|
||||
FAST f=17 a=4 39.514227 2.535555 8 1442
|
||||
FAST f=17 a=4 0.302989 2.535555 8 1442
|
||||
FAST f=17 a=5 35.189292 2.524925 8 1202
|
||||
FAST f=17 a=5 0.273451 2.524925 8 1202
|
||||
FAST f=17 a=6 35.791683 2.523466 8 1202
|
||||
FAST f=17 a=6 0.268261 2.523466 8 1202
|
||||
FAST f=17 a=7 37.416136 2.526625 6 1010
|
||||
FAST f=17 a=7 0.277558 2.526625 6 1010
|
||||
FAST f=17 a=8 37.084707 2.533274 6 1250
|
||||
FAST f=17 a=8 0.285104 2.533274 6 1250
|
||||
FAST f=17 a=9 34.183814 2.532765 8 1298
|
||||
FAST f=17 a=9 0.235133 2.532765 8 1298
|
||||
FAST f=17 a=10 31.149235 2.528722 8 1346
|
||||
FAST f=17 a=10 0.232679 2.528722 8 1346
|
||||
FAST f=18 a=1 72.942176 2.559857 6 386
|
||||
FAST f=18 a=1 0.718618 2.559857 6 386
|
||||
FAST f=18 a=2 51.690440 2.559572 8 290
|
||||
FAST f=18 a=2 0.403978 2.559572 8 290
|
||||
FAST f=18 a=3 45.344908 2.561040 8 962
|
||||
FAST f=18 a=3 0.357205 2.561040 8 962
|
||||
FAST f=18 a=4 39.804522 2.558446 8 1010
|
||||
FAST f=18 a=4 0.310526 2.558446 8 1010
|
||||
FAST f=18 a=5 38.134888 2.561811 8 626
|
||||
FAST f=18 a=5 0.273743 2.561811 8 626
|
||||
FAST f=18 a=6 35.091890 2.555518 8 722
|
||||
FAST f=18 a=6 0.260135 2.555518 8 722
|
||||
FAST f=18 a=7 34.639523 2.562938 8 290
|
||||
FAST f=18 a=7 0.234294 2.562938 8 290
|
||||
FAST f=18 a=8 36.076431 2.563567 8 1586
|
||||
FAST f=18 a=8 0.274075 2.563567 8 1586
|
||||
FAST f=18 a=9 36.376433 2.560950 8 722
|
||||
FAST f=18 a=9 0.240106 2.560950 8 722
|
||||
FAST f=18 a=10 32.624790 2.559340 8 578
|
||||
FAST f=18 a=10 0.234704 2.559340 8 578
|
||||
FAST f=19 a=1 70.513761 2.572441 8 194
|
||||
FAST f=19 a=1 0.726112 2.572441 8 194
|
||||
FAST f=19 a=2 59.263032 2.574560 8 482
|
||||
FAST f=19 a=2 0.451554 2.574560 8 482
|
||||
FAST f=19 a=3 51.509594 2.571546 6 194
|
||||
FAST f=19 a=3 0.393014 2.571546 6 194
|
||||
FAST f=19 a=4 55.393906 2.573386 8 482
|
||||
FAST f=19 a=4 0.38819 2.573386 8 482
|
||||
FAST f=19 a=5 43.201736 2.567589 8 674
|
||||
FAST f=19 a=5 0.292155 2.567589 8 674
|
||||
FAST f=19 a=6 42.911687 2.572666 6 434
|
||||
FAST f=19 a=6 0.303988 2.572666 6 434
|
||||
FAST f=19 a=7 44.687591 2.573613 6 290
|
||||
FAST f=19 a=7 0.308721 2.573613 6 290
|
||||
FAST f=19 a=8 37.372868 2.571039 6 194
|
||||
FAST f=19 a=8 0.287137 2.571039 6 194
|
||||
FAST f=19 a=9 36.074230 2.566473 6 482
|
||||
FAST f=19 a=9 0.280721 2.566473 6 482
|
||||
FAST f=19 a=10 33.731720 2.570306 8 194
|
||||
FAST f=19 a=10 0.224073 2.570306 8 194
|
||||
FAST f=20 a=1 79.670634 2.581146 6 290
|
||||
FAST f=20 a=1 0.899986 2.581146 6 290
|
||||
FAST f=20 a=2 58.827141 2.579782 8 386
|
||||
FAST f=20 a=2 0.602288 2.579782 8 386
|
||||
FAST f=20 a=3 51.289004 2.579627 8 722
|
||||
FAST f=20 a=3 0.446091 2.579627 8 722
|
||||
FAST f=20 a=4 47.711068 2.581508 8 722
|
||||
FAST f=20 a=4 0.473007 2.581508 8 722
|
||||
FAST f=20 a=5 47.402929 2.578062 6 434
|
||||
FAST f=20 a=5 0.497131 2.578062 6 434
|
||||
FAST f=20 a=6 54.797102 2.577365 8 482
|
||||
FAST f=20 a=6 0.515061 2.577365 8 482
|
||||
FAST f=20 a=7 51.370877 2.583050 8 386
|
||||
FAST f=20 a=7 0.402878 2.583050 8 386
|
||||
FAST f=20 a=8 51.437931 2.574875 6 242
|
||||
FAST f=20 a=8 0.453094 2.574875 6 242
|
||||
FAST f=20 a=9 44.105456 2.576700 6 242
|
||||
FAST f=20 a=9 0.456633 2.576700 6 242
|
||||
FAST f=20 a=10 44.447580 2.578305 8 338
|
||||
FAST f=20 a=10 0.409121 2.578305 8 338
|
||||
FAST f=21 a=1 113.031686 2.582449 6 242
|
||||
FAST f=21 a=1 1.456971 2.582449 6 242
|
||||
FAST f=21 a=2 97.700932 2.582124 8 194
|
||||
FAST f=21 a=2 1.072078 2.582124 8 194
|
||||
FAST f=21 a=3 96.563648 2.585479 8 434
|
||||
FAST f=21 a=3 0.949528 2.585479 8 434
|
||||
FAST f=21 a=4 90.597813 2.582366 6 386
|
||||
FAST f=21 a=4 0.76944 2.582366 6 386
|
||||
FAST f=21 a=5 86.815980 2.579043 8 434
|
||||
FAST f=21 a=5 0.858167 2.579043 8 434
|
||||
FAST f=21 a=6 91.235820 2.578378 8 530
|
||||
FAST f=21 a=6 0.684274 2.578378 8 530
|
||||
FAST f=21 a=7 84.392788 2.581243 8 386
|
||||
FAST f=21 a=7 0.814386 2.581243 8 386
|
||||
FAST f=21 a=8 82.052310 2.582547 8 338
|
||||
FAST f=21 a=8 0.822633 2.582547 8 338
|
||||
FAST f=21 a=9 74.696074 2.579319 8 194
|
||||
FAST f=21 a=9 0.811028 2.579319 8 194
|
||||
FAST f=21 a=10 76.211170 2.578766 8 290
|
||||
FAST f=21 a=10 0.809715 2.578766 8 290
|
||||
FAST f=22 a=1 138.976871 2.580478 8 194
|
||||
FAST f=22 a=1 1.748932 2.580478 8 194
|
||||
FAST f=22 a=2 120.164097 2.583633 8 386
|
||||
FAST f=22 a=2 1.333239 2.583633 8 386
|
||||
FAST f=22 a=3 111.986474 2.582566 6 194
|
||||
FAST f=22 a=3 1.305734 2.582566 6 194
|
||||
FAST f=22 a=4 108.548148 2.583068 6 194
|
||||
FAST f=22 a=4 1.314026 2.583068 6 194
|
||||
FAST f=22 a=5 103.173017 2.583495 6 290
|
||||
FAST f=22 a=5 1.228664 2.583495 6 290
|
||||
FAST f=22 a=6 108.421262 2.582349 8 530
|
||||
FAST f=22 a=6 1.076773 2.582349 8 530
|
||||
FAST f=22 a=7 103.284127 2.581022 8 386
|
||||
FAST f=22 a=7 1.112117 2.581022 8 386
|
||||
FAST f=22 a=8 96.330279 2.581073 8 290
|
||||
FAST f=22 a=8 1.109303 2.581073 8 290
|
||||
FAST f=22 a=9 97.651348 2.580075 6 194
|
||||
FAST f=22 a=9 0.933032 2.580075 6 194
|
||||
FAST f=22 a=10 101.660621 2.584886 8 194
|
||||
FAST f=22 a=10 0.796823 2.584886 8 194
|
||||
FAST f=23 a=1 159.322978 2.581474 6 242
|
||||
FAST f=23 a=1 2.015878 2.581474 6 242
|
||||
FAST f=23 a=2 134.331775 2.581619 8 194
|
||||
FAST f=23 a=2 1.545845 2.581619 8 194
|
||||
FAST f=23 a=3 127.724552 2.579888 6 338
|
||||
FAST f=23 a=3 1.444496 2.579888 6 338
|
||||
FAST f=23 a=4 126.077675 2.578137 6 242
|
||||
FAST f=23 a=4 1.364394 2.578137 6 242
|
||||
FAST f=23 a=5 124.914027 2.580843 8 338
|
||||
FAST f=23 a=5 1.116059 2.580843 8 338
|
||||
FAST f=23 a=6 122.874153 2.577637 6 338
|
||||
FAST f=23 a=6 1.164584 2.577637 6 338
|
||||
FAST f=23 a=7 123.099257 2.582715 6 386
|
||||
FAST f=23 a=7 1.354042 2.582715 6 386
|
||||
FAST f=23 a=8 122.026753 2.577681 8 194
|
||||
FAST f=23 a=8 1.210966 2.577681 8 194
|
||||
FAST f=23 a=9 121.164312 2.584599 6 290
|
||||
FAST f=23 a=9 1.174859 2.584599 6 290
|
||||
FAST f=23 a=10 117.462222 2.580358 8 194
|
||||
FAST f=23 a=10 1.075258 2.580358 8 194
|
||||
FAST f=24 a=1 169.539659 2.581642 6 194
|
||||
FAST f=24 a=1 1.916804 2.581642 6 194
|
||||
FAST f=24 a=2 160.539270 2.580421 6 290
|
||||
FAST f=24 a=2 1.71087 2.580421 6 290
|
||||
FAST f=24 a=3 155.455874 2.580449 6 242
|
||||
FAST f=24 a=3 1.60307 2.580449 6 242
|
||||
FAST f=24 a=4 147.630320 2.582953 6 338
|
||||
FAST f=24 a=4 1.396364 2.582953 6 338
|
||||
FAST f=24 a=5 133.767428 2.580589 6 290
|
||||
FAST f=24 a=5 1.19933 2.580589 6 290
|
||||
FAST f=24 a=6 146.437535 2.579453 8 194
|
||||
FAST f=24 a=6 1.385405 2.579453 8 194
|
||||
FAST f=24 a=7 147.227507 2.584155 8 386
|
||||
FAST f=24 a=7 1.48942 2.584155 8 386
|
||||
FAST f=24 a=8 138.005773 2.584115 8 194
|
||||
FAST f=24 a=8 1.352 2.584115 8 194
|
||||
FAST f=24 a=9 141.442625 2.582902 8 290
|
||||
FAST f=24 a=9 1.39647 2.582902 8 290
|
||||
FAST f=24 a=10 142.157446 2.582701 8 434
|
||||
FAST f=24 a=10 1.498889 2.582701 8 434
|
@ -1,442 +0,0 @@
|
||||
#include <stdio.h> /* fprintf */
|
||||
#include <stdlib.h> /* malloc, free, qsort */
|
||||
#include <string.h> /* strcmp, strlen */
|
||||
#include <errno.h> /* errno */
|
||||
#include <ctype.h>
|
||||
#include <time.h>
|
||||
#include "random.h"
|
||||
#include "dictBuilder.h"
|
||||
#include "zstd_internal.h" /* includes zstd.h */
|
||||
#include "io.h"
|
||||
#include "util.h"
|
||||
#include "zdict.h"
|
||||
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Console display
|
||||
***************************************/
|
||||
#define DISPLAY(...) fprintf(stderr, __VA_ARGS__)
|
||||
#define DISPLAYLEVEL(l, ...) if (displayLevel>=l) { DISPLAY(__VA_ARGS__); }
|
||||
|
||||
static const U64 g_refreshRate = SEC_TO_MICRO / 6;
|
||||
static UTIL_time_t g_displayClock = UTIL_TIME_INITIALIZER;
|
||||
|
||||
#define DISPLAYUPDATE(l, ...) { if (displayLevel>=l) { \
|
||||
if ((UTIL_clockSpanMicro(g_displayClock) > g_refreshRate) || (displayLevel>=4)) \
|
||||
{ g_displayClock = UTIL_getTime(); DISPLAY(__VA_ARGS__); \
|
||||
if (displayLevel>=4) fflush(stderr); } } }
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Exceptions
|
||||
***************************************/
|
||||
#ifndef DEBUG
|
||||
# define DEBUG 0
|
||||
#endif
|
||||
#define DEBUGOUTPUT(...) if (DEBUG) DISPLAY(__VA_ARGS__);
|
||||
#define EXM_THROW(error, ...) \
|
||||
{ \
|
||||
DEBUGOUTPUT("Error defined at %s, line %i : \n", __FILE__, __LINE__); \
|
||||
DISPLAY("Error %i : ", error); \
|
||||
DISPLAY(__VA_ARGS__); \
|
||||
DISPLAY("\n"); \
|
||||
exit(error); \
|
||||
}
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Constants
|
||||
***************************************/
|
||||
static const unsigned g_defaultMaxDictSize = 110 KB;
|
||||
#define DEFAULT_CLEVEL 3
|
||||
#define DEFAULT_DISPLAYLEVEL 2
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Struct
|
||||
***************************************/
|
||||
typedef struct {
|
||||
const void* dictBuffer;
|
||||
size_t dictSize;
|
||||
} dictInfo;
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Dictionary related operations
|
||||
***************************************/
|
||||
/** createDictFromFiles() :
|
||||
* Based on type of param given, train dictionary using the corresponding algorithm
|
||||
* @return dictInfo containing dictionary buffer and dictionary size
|
||||
*/
|
||||
dictInfo* createDictFromFiles(sampleInfo *info, unsigned maxDictSize,
|
||||
ZDICT_random_params_t *randomParams, ZDICT_cover_params_t *coverParams,
|
||||
ZDICT_legacy_params_t *legacyParams, ZDICT_fastCover_params_t *fastParams) {
|
||||
unsigned const displayLevel = randomParams ? randomParams->zParams.notificationLevel :
|
||||
coverParams ? coverParams->zParams.notificationLevel :
|
||||
legacyParams ? legacyParams->zParams.notificationLevel :
|
||||
fastParams ? fastParams->zParams.notificationLevel :
|
||||
DEFAULT_DISPLAYLEVEL; /* no dict */
|
||||
void* const dictBuffer = malloc(maxDictSize);
|
||||
|
||||
dictInfo* dInfo = NULL;
|
||||
|
||||
/* Checks */
|
||||
if (!dictBuffer)
|
||||
EXM_THROW(12, "not enough memory for trainFromFiles"); /* should not happen */
|
||||
|
||||
{ size_t dictSize;
|
||||
if(randomParams) {
|
||||
dictSize = ZDICT_trainFromBuffer_random(dictBuffer, maxDictSize, info->srcBuffer,
|
||||
info->samplesSizes, info->nbSamples, *randomParams);
|
||||
}else if(coverParams) {
|
||||
/* Run the optimize version if either k or d is not provided */
|
||||
if (!coverParams->d || !coverParams->k){
|
||||
dictSize = ZDICT_optimizeTrainFromBuffer_cover(dictBuffer, maxDictSize, info->srcBuffer,
|
||||
info->samplesSizes, info->nbSamples, coverParams);
|
||||
} else {
|
||||
dictSize = ZDICT_trainFromBuffer_cover(dictBuffer, maxDictSize, info->srcBuffer,
|
||||
info->samplesSizes, info->nbSamples, *coverParams);
|
||||
}
|
||||
} else if(legacyParams) {
|
||||
dictSize = ZDICT_trainFromBuffer_legacy(dictBuffer, maxDictSize, info->srcBuffer,
|
||||
info->samplesSizes, info->nbSamples, *legacyParams);
|
||||
} else if(fastParams) {
|
||||
/* Run the optimize version if either k or d is not provided */
|
||||
if (!fastParams->d || !fastParams->k) {
|
||||
dictSize = ZDICT_optimizeTrainFromBuffer_fastCover(dictBuffer, maxDictSize, info->srcBuffer,
|
||||
info->samplesSizes, info->nbSamples, fastParams);
|
||||
} else {
|
||||
dictSize = ZDICT_trainFromBuffer_fastCover(dictBuffer, maxDictSize, info->srcBuffer,
|
||||
info->samplesSizes, info->nbSamples, *fastParams);
|
||||
}
|
||||
} else {
|
||||
dictSize = 0;
|
||||
}
|
||||
if (ZDICT_isError(dictSize)) {
|
||||
DISPLAYLEVEL(1, "dictionary training failed : %s \n", ZDICT_getErrorName(dictSize)); /* should not happen */
|
||||
free(dictBuffer);
|
||||
return dInfo;
|
||||
}
|
||||
dInfo = (dictInfo *)malloc(sizeof(dictInfo));
|
||||
dInfo->dictBuffer = dictBuffer;
|
||||
dInfo->dictSize = dictSize;
|
||||
}
|
||||
return dInfo;
|
||||
}
|
||||
|
||||
|
||||
/** compressWithDict() :
|
||||
* Compress samples from sample buffer given dictionary stored on dictionary buffer and compression level
|
||||
* @return compression ratio
|
||||
*/
|
||||
double compressWithDict(sampleInfo *srcInfo, dictInfo* dInfo, int compressionLevel, int displayLevel) {
|
||||
/* Local variables */
|
||||
size_t totalCompressedSize = 0;
|
||||
size_t totalOriginalSize = 0;
|
||||
const unsigned hasDict = dInfo->dictSize > 0 ? 1 : 0;
|
||||
double cRatio;
|
||||
size_t dstCapacity;
|
||||
int i;
|
||||
|
||||
/* Pointers */
|
||||
ZSTD_CDict *cdict = NULL;
|
||||
ZSTD_CCtx* cctx = NULL;
|
||||
size_t *offsets = NULL;
|
||||
void* dst = NULL;
|
||||
|
||||
/* Allocate dst with enough space to compress the maximum sized sample */
|
||||
{
|
||||
size_t maxSampleSize = 0;
|
||||
for (i = 0; i < srcInfo->nbSamples; i++) {
|
||||
maxSampleSize = MAX(srcInfo->samplesSizes[i], maxSampleSize);
|
||||
}
|
||||
dstCapacity = ZSTD_compressBound(maxSampleSize);
|
||||
dst = malloc(dstCapacity);
|
||||
}
|
||||
|
||||
/* Calculate offset for each sample */
|
||||
offsets = (size_t *)malloc((srcInfo->nbSamples + 1) * sizeof(size_t));
|
||||
offsets[0] = 0;
|
||||
for (i = 1; i <= srcInfo->nbSamples; i++) {
|
||||
offsets[i] = offsets[i - 1] + srcInfo->samplesSizes[i - 1];
|
||||
}
|
||||
|
||||
/* Create the cctx */
|
||||
cctx = ZSTD_createCCtx();
|
||||
if(!cctx || !dst) {
|
||||
cRatio = -1;
|
||||
goto _cleanup;
|
||||
}
|
||||
|
||||
/* Create CDict if there's a dictionary stored on buffer */
|
||||
if (hasDict) {
|
||||
cdict = ZSTD_createCDict(dInfo->dictBuffer, dInfo->dictSize, compressionLevel);
|
||||
if(!cdict) {
|
||||
cRatio = -1;
|
||||
goto _cleanup;
|
||||
}
|
||||
}
|
||||
|
||||
/* Compress each sample and sum their sizes*/
|
||||
const BYTE *const samples = (const BYTE *)srcInfo->srcBuffer;
|
||||
for (i = 0; i < srcInfo->nbSamples; i++) {
|
||||
size_t compressedSize;
|
||||
if(hasDict) {
|
||||
compressedSize = ZSTD_compress_usingCDict(cctx, dst, dstCapacity, samples + offsets[i], srcInfo->samplesSizes[i], cdict);
|
||||
} else {
|
||||
compressedSize = ZSTD_compressCCtx(cctx, dst, dstCapacity,samples + offsets[i], srcInfo->samplesSizes[i], compressionLevel);
|
||||
}
|
||||
if (ZSTD_isError(compressedSize)) {
|
||||
cRatio = -1;
|
||||
goto _cleanup;
|
||||
}
|
||||
totalCompressedSize += compressedSize;
|
||||
}
|
||||
|
||||
/* Sum original sizes */
|
||||
for (i = 0; i<srcInfo->nbSamples; i++) {
|
||||
totalOriginalSize += srcInfo->samplesSizes[i];
|
||||
}
|
||||
|
||||
/* Calculate compression ratio */
|
||||
DISPLAYLEVEL(2, "original size is %lu\n", totalOriginalSize);
|
||||
DISPLAYLEVEL(2, "compressed size is %lu\n", totalCompressedSize);
|
||||
cRatio = (double)totalOriginalSize/(double)totalCompressedSize;
|
||||
|
||||
_cleanup:
|
||||
free(dst);
|
||||
free(offsets);
|
||||
ZSTD_freeCCtx(cctx);
|
||||
ZSTD_freeCDict(cdict);
|
||||
return cRatio;
|
||||
}
|
||||
|
||||
|
||||
/** FreeDictInfo() :
|
||||
* Free memory allocated for dictInfo
|
||||
*/
|
||||
void freeDictInfo(dictInfo* info) {
|
||||
if (!info) return;
|
||||
if (info->dictBuffer) free((void*)(info->dictBuffer));
|
||||
free(info);
|
||||
}
|
||||
|
||||
|
||||
|
||||
/*-********************************************************
|
||||
* Benchmarking functions
|
||||
**********************************************************/
|
||||
/** benchmarkDictBuilder() :
|
||||
* Measure how long a dictionary builder takes and compression ratio with the dictionary built
|
||||
* @return 0 if benchmark successfully, 1 otherwise
|
||||
*/
|
||||
int benchmarkDictBuilder(sampleInfo *srcInfo, unsigned maxDictSize, ZDICT_random_params_t *randomParam,
|
||||
ZDICT_cover_params_t *coverParam, ZDICT_legacy_params_t *legacyParam,
|
||||
ZDICT_fastCover_params_t *fastParam) {
|
||||
/* Local variables */
|
||||
const unsigned displayLevel = randomParam ? randomParam->zParams.notificationLevel :
|
||||
coverParam ? coverParam->zParams.notificationLevel :
|
||||
legacyParam ? legacyParam->zParams.notificationLevel :
|
||||
fastParam ? fastParam->zParams.notificationLevel:
|
||||
DEFAULT_DISPLAYLEVEL; /* no dict */
|
||||
const char* name = randomParam ? "RANDOM" :
|
||||
coverParam ? "COVER" :
|
||||
legacyParam ? "LEGACY" :
|
||||
fastParam ? "FAST":
|
||||
"NODICT"; /* no dict */
|
||||
const unsigned cLevel = randomParam ? randomParam->zParams.compressionLevel :
|
||||
coverParam ? coverParam->zParams.compressionLevel :
|
||||
legacyParam ? legacyParam->zParams.compressionLevel :
|
||||
fastParam ? fastParam->zParams.compressionLevel:
|
||||
DEFAULT_CLEVEL; /* no dict */
|
||||
int result = 0;
|
||||
|
||||
/* Calculate speed */
|
||||
const UTIL_time_t begin = UTIL_getTime();
|
||||
dictInfo* dInfo = createDictFromFiles(srcInfo, maxDictSize, randomParam, coverParam, legacyParam, fastParam);
|
||||
const U64 timeMicro = UTIL_clockSpanMicro(begin);
|
||||
const double timeSec = timeMicro / (double)SEC_TO_MICRO;
|
||||
if (!dInfo) {
|
||||
DISPLAYLEVEL(1, "%s does not train successfully\n", name);
|
||||
result = 1;
|
||||
goto _cleanup;
|
||||
}
|
||||
DISPLAYLEVEL(1, "%s took %f seconds to execute \n", name, timeSec);
|
||||
|
||||
/* Calculate compression ratio */
|
||||
const double cRatio = compressWithDict(srcInfo, dInfo, cLevel, displayLevel);
|
||||
if (cRatio < 0) {
|
||||
DISPLAYLEVEL(1, "Compressing with %s dictionary does not work\n", name);
|
||||
result = 1;
|
||||
goto _cleanup;
|
||||
|
||||
}
|
||||
DISPLAYLEVEL(1, "Compression ratio with %s dictionary is %f\n", name, cRatio);
|
||||
|
||||
_cleanup:
|
||||
freeDictInfo(dInfo);
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
|
||||
int main(int argCount, const char* argv[])
|
||||
{
|
||||
const int displayLevel = DEFAULT_DISPLAYLEVEL;
|
||||
const char* programName = argv[0];
|
||||
int result = 0;
|
||||
|
||||
/* Initialize arguments to default values */
|
||||
unsigned k = 200;
|
||||
unsigned d = 8;
|
||||
unsigned f;
|
||||
unsigned accel;
|
||||
unsigned i;
|
||||
const unsigned cLevel = DEFAULT_CLEVEL;
|
||||
const unsigned dictID = 0;
|
||||
const unsigned maxDictSize = g_defaultMaxDictSize;
|
||||
|
||||
/* Initialize table to store input files */
|
||||
const char** filenameTable = (const char**)malloc(argCount * sizeof(const char*));
|
||||
unsigned filenameIdx = 0;
|
||||
|
||||
char* fileNamesBuf = NULL;
|
||||
unsigned fileNamesNb = filenameIdx;
|
||||
const int followLinks = 0;
|
||||
const char** extendedFileList = NULL;
|
||||
|
||||
/* Parse arguments */
|
||||
for (i = 1; i < argCount; i++) {
|
||||
const char* argument = argv[i];
|
||||
if (longCommandWArg(&argument, "in=")) {
|
||||
filenameTable[filenameIdx] = argument;
|
||||
filenameIdx++;
|
||||
continue;
|
||||
}
|
||||
DISPLAYLEVEL(1, "benchmark: Incorrect parameters\n");
|
||||
return 1;
|
||||
}
|
||||
|
||||
/* Get the list of all files recursively (because followLinks==0)*/
|
||||
extendedFileList = UTIL_createFileList(filenameTable, filenameIdx, &fileNamesBuf,
|
||||
&fileNamesNb, followLinks);
|
||||
if (extendedFileList) {
|
||||
unsigned u;
|
||||
for (u=0; u<fileNamesNb; u++) DISPLAYLEVEL(4, "%u %s\n", u, extendedFileList[u]);
|
||||
free((void*)filenameTable);
|
||||
filenameTable = extendedFileList;
|
||||
filenameIdx = fileNamesNb;
|
||||
}
|
||||
|
||||
/* get sampleInfo */
|
||||
size_t blockSize = 0;
|
||||
sampleInfo* srcInfo= getSampleInfo(filenameTable,
|
||||
filenameIdx, blockSize, maxDictSize, displayLevel);
|
||||
|
||||
/* set up zParams */
|
||||
ZDICT_params_t zParams;
|
||||
zParams.compressionLevel = cLevel;
|
||||
zParams.notificationLevel = displayLevel;
|
||||
zParams.dictID = dictID;
|
||||
|
||||
/* with no dict */
|
||||
{
|
||||
const int noDictResult = benchmarkDictBuilder(srcInfo, maxDictSize, NULL, NULL, NULL, NULL);
|
||||
if(noDictResult) {
|
||||
result = 1;
|
||||
goto _cleanup;
|
||||
}
|
||||
}
|
||||
|
||||
/* for random */
|
||||
{
|
||||
ZDICT_random_params_t randomParam;
|
||||
randomParam.zParams = zParams;
|
||||
randomParam.k = k;
|
||||
const int randomResult = benchmarkDictBuilder(srcInfo, maxDictSize, &randomParam, NULL, NULL, NULL);
|
||||
DISPLAYLEVEL(2, "k=%u\n", randomParam.k);
|
||||
if(randomResult) {
|
||||
result = 1;
|
||||
goto _cleanup;
|
||||
}
|
||||
}
|
||||
|
||||
/* for legacy */
|
||||
{
|
||||
ZDICT_legacy_params_t legacyParam;
|
||||
legacyParam.zParams = zParams;
|
||||
legacyParam.selectivityLevel = 9;
|
||||
const int legacyResult = benchmarkDictBuilder(srcInfo, maxDictSize, NULL, NULL, &legacyParam, NULL);
|
||||
DISPLAYLEVEL(2, "selectivityLevel=%u\n", legacyParam.selectivityLevel);
|
||||
if(legacyResult) {
|
||||
result = 1;
|
||||
goto _cleanup;
|
||||
}
|
||||
}
|
||||
|
||||
/* for cover */
|
||||
{
|
||||
/* for cover (optimizing k and d) */
|
||||
ZDICT_cover_params_t coverParam;
|
||||
memset(&coverParam, 0, sizeof(coverParam));
|
||||
coverParam.zParams = zParams;
|
||||
coverParam.splitPoint = 1.0;
|
||||
coverParam.steps = 40;
|
||||
coverParam.nbThreads = 1;
|
||||
const int coverOptResult = benchmarkDictBuilder(srcInfo, maxDictSize, NULL, &coverParam, NULL, NULL);
|
||||
DISPLAYLEVEL(2, "k=%u\nd=%u\nsteps=%u\nsplit=%u\n", coverParam.k, coverParam.d, coverParam.steps, (unsigned)(coverParam.splitPoint * 100));
|
||||
if(coverOptResult) {
|
||||
result = 1;
|
||||
goto _cleanup;
|
||||
}
|
||||
|
||||
/* for cover (with k and d provided) */
|
||||
const int coverResult = benchmarkDictBuilder(srcInfo, maxDictSize, NULL, &coverParam, NULL, NULL);
|
||||
DISPLAYLEVEL(2, "k=%u\nd=%u\nsteps=%u\nsplit=%u\n", coverParam.k, coverParam.d, coverParam.steps, (unsigned)(coverParam.splitPoint * 100));
|
||||
if(coverResult) {
|
||||
result = 1;
|
||||
goto _cleanup;
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
/* for fastCover */
|
||||
for (f = 15; f < 25; f++){
|
||||
DISPLAYLEVEL(2, "current f is %u\n", f);
|
||||
for (accel = 1; accel < 11; accel++) {
|
||||
DISPLAYLEVEL(2, "current accel is %u\n", accel);
|
||||
/* for fastCover (optimizing k and d) */
|
||||
ZDICT_fastCover_params_t fastParam;
|
||||
memset(&fastParam, 0, sizeof(fastParam));
|
||||
fastParam.zParams = zParams;
|
||||
fastParam.f = f;
|
||||
fastParam.steps = 40;
|
||||
fastParam.nbThreads = 1;
|
||||
fastParam.accel = accel;
|
||||
const int fastOptResult = benchmarkDictBuilder(srcInfo, maxDictSize, NULL, NULL, NULL, &fastParam);
|
||||
DISPLAYLEVEL(2, "k=%u\nd=%u\nf=%u\nsteps=%u\nsplit=%u\naccel=%u\n", fastParam.k, fastParam.d, fastParam.f, fastParam.steps, (unsigned)(fastParam.splitPoint * 100), fastParam.accel);
|
||||
if(fastOptResult) {
|
||||
result = 1;
|
||||
goto _cleanup;
|
||||
}
|
||||
|
||||
/* for fastCover (with k and d provided) */
|
||||
for (i = 0; i < 5; i++) {
|
||||
const int fastResult = benchmarkDictBuilder(srcInfo, maxDictSize, NULL, NULL, NULL, &fastParam);
|
||||
DISPLAYLEVEL(2, "k=%u\nd=%u\nf=%u\nsteps=%u\nsplit=%u\naccel=%u\n", fastParam.k, fastParam.d, fastParam.f, fastParam.steps, (unsigned)(fastParam.splitPoint * 100), fastParam.accel);
|
||||
if(fastResult) {
|
||||
result = 1;
|
||||
goto _cleanup;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/* Free allocated memory */
|
||||
_cleanup:
|
||||
UTIL_freeFileList(extendedFileList, fileNamesBuf);
|
||||
freeSampleInfo(srcInfo);
|
||||
return result;
|
||||
}
|
@ -1,6 +0,0 @@
|
||||
/* ZDICT_trainFromBuffer_legacy() :
|
||||
* issue : samplesBuffer need to be followed by a noisy guard band.
|
||||
* work around : duplicate the buffer, and add the noise */
|
||||
size_t ZDICT_trainFromBuffer_legacy(void* dictBuffer, size_t dictBufferCapacity,
|
||||
const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
|
||||
ZDICT_legacy_params_t params);
|
@ -1,2 +0,0 @@
|
||||
echo "Benchmark with in=../../lib/common"
|
||||
./benchmark in=../../../lib/common
|
@ -1,54 +0,0 @@
|
||||
ARG :=
|
||||
|
||||
CC ?= gcc
|
||||
CFLAGS ?= -O3 -g
|
||||
INCLUDES := -I ../../../programs -I ../randomDictBuilder -I ../../../lib/common -I ../../../lib -I ../../../lib/dictBuilder
|
||||
|
||||
IO_FILE := ../randomDictBuilder/io.c
|
||||
|
||||
TEST_INPUT := ../../../lib
|
||||
TEST_OUTPUT := fastCoverDict
|
||||
|
||||
all: main run clean
|
||||
|
||||
.PHONY: test
|
||||
test: main testrun testshell clean
|
||||
|
||||
.PHONY: run
|
||||
run:
|
||||
echo "Building a fastCover dictionary with given arguments"
|
||||
./main $(ARG)
|
||||
|
||||
main: main.o io.o fastCover.o libzstd.a
|
||||
$(CC) $(CFLAGS) main.o io.o fastCover.o libzstd.a -o main
|
||||
|
||||
main.o: main.c
|
||||
$(CC) $(CFLAGS) $(INCLUDES) -c main.c
|
||||
|
||||
fastCover.o: fastCover.c
|
||||
$(CC) $(CFLAGS) $(INCLUDES) -c fastCover.c
|
||||
|
||||
io.o: $(IO_FILE)
|
||||
$(CC) $(CFLAGS) $(INCLUDES) -c $(IO_FILE)
|
||||
|
||||
libzstd.a:
|
||||
$(MAKE) MOREFLAGS=-g -C ../../../lib libzstd.a
|
||||
mv ../../../lib/libzstd.a .
|
||||
|
||||
.PHONY: testrun
|
||||
testrun: main
|
||||
echo "Run with $(TEST_INPUT) and $(TEST_OUTPUT) "
|
||||
./main in=$(TEST_INPUT) out=$(TEST_OUTPUT)
|
||||
zstd -be3 -D $(TEST_OUTPUT) -r $(TEST_INPUT) -q
|
||||
rm -f $(TEST_OUTPUT)
|
||||
|
||||
.PHONY: testshell
|
||||
testshell: test.sh
|
||||
sh test.sh
|
||||
echo "Finish running test.sh"
|
||||
|
||||
.PHONY: clean
|
||||
clean:
|
||||
rm -f *.o main libzstd.a
|
||||
$(MAKE) -C ../../../lib clean
|
||||
echo "Cleaning is completed"
|
@ -1,24 +0,0 @@
|
||||
FastCover Dictionary Builder
|
||||
|
||||
### Permitted Arguments:
|
||||
Input File/Directory (in=fileName): required; file/directory used to build dictionary; if directory, will operate recursively for files inside directory; can include multiple files/directories, each following "in="
|
||||
Output Dictionary (out=dictName): if not provided, default to fastCoverDict
|
||||
Dictionary ID (dictID=#): nonnegative number; if not provided, default to 0
|
||||
Maximum Dictionary Size (maxdict=#): positive number; in bytes, if not provided, default to 110KB
|
||||
Size of Selected Segment (k=#): positive number; in bytes; if not provided, default to 200
|
||||
Size of Dmer (d=#): either 6 or 8; if not provided, default to 8
|
||||
Number of steps (steps=#): positive number, if not provided, default to 32
|
||||
Percentage of samples used for training(split=#): positive number; if not provided, default to 100
|
||||
|
||||
|
||||
###Running Test:
|
||||
make test
|
||||
|
||||
|
||||
###Usage:
|
||||
To build a FASTCOVER dictionary with the provided arguments: make ARG= followed by arguments
|
||||
If k or d is not provided, the optimize version of FASTCOVER is run.
|
||||
|
||||
### Examples:
|
||||
make ARG="in=../../../lib/dictBuilder out=dict100 dictID=520"
|
||||
make ARG="in=../../../lib/dictBuilder in=../../../lib/compress"
|
@ -1,809 +0,0 @@
|
||||
/*-*************************************
|
||||
* Dependencies
|
||||
***************************************/
|
||||
#include <stdio.h> /* fprintf */
|
||||
#include <stdlib.h> /* malloc, free, qsort */
|
||||
#include <string.h> /* memset */
|
||||
#include <time.h> /* clock */
|
||||
#include "mem.h" /* read */
|
||||
#include "pool.h"
|
||||
#include "threading.h"
|
||||
#include "fastCover.h"
|
||||
#include "zstd_internal.h" /* includes zstd.h */
|
||||
#include "zdict.h"
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Constants
|
||||
***************************************/
|
||||
#define FASTCOVER_MAX_SAMPLES_SIZE (sizeof(size_t) == 8 ? ((U32)-1) : ((U32)1 GB))
|
||||
#define FASTCOVER_MAX_F 32
|
||||
#define DEFAULT_SPLITPOINT 1.0
|
||||
|
||||
/*-*************************************
|
||||
* Console display
|
||||
***************************************/
|
||||
static int g_displayLevel = 2;
|
||||
#define DISPLAY(...) \
|
||||
{ \
|
||||
fprintf(stderr, __VA_ARGS__); \
|
||||
fflush(stderr); \
|
||||
}
|
||||
#define LOCALDISPLAYLEVEL(displayLevel, l, ...) \
|
||||
if (displayLevel >= l) { \
|
||||
DISPLAY(__VA_ARGS__); \
|
||||
} /* 0 : no display; 1: errors; 2: default; 3: details; 4: debug */
|
||||
#define DISPLAYLEVEL(l, ...) LOCALDISPLAYLEVEL(g_displayLevel, l, __VA_ARGS__)
|
||||
|
||||
#define LOCALDISPLAYUPDATE(displayLevel, l, ...) \
|
||||
if (displayLevel >= l) { \
|
||||
if ((clock() - g_time > refreshRate) || (displayLevel >= 4)) { \
|
||||
g_time = clock(); \
|
||||
DISPLAY(__VA_ARGS__); \
|
||||
} \
|
||||
}
|
||||
#define DISPLAYUPDATE(l, ...) LOCALDISPLAYUPDATE(g_displayLevel, l, __VA_ARGS__)
|
||||
static const clock_t refreshRate = CLOCKS_PER_SEC * 15 / 100;
|
||||
static clock_t g_time = 0;
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Hash Functions
|
||||
***************************************/
|
||||
static const U64 prime6bytes = 227718039650203ULL;
|
||||
static size_t ZSTD_hash6(U64 u, U32 h) { return (size_t)(((u << (64-48)) * prime6bytes) >> (64-h)) ; }
|
||||
static size_t ZSTD_hash6Ptr(const void* p, U32 h) { return ZSTD_hash6(MEM_readLE64(p), h); }
|
||||
|
||||
static const U64 prime8bytes = 0xCF1BBCDCB7A56463ULL;
|
||||
static size_t ZSTD_hash8(U64 u, U32 h) { return (size_t)(((u) * prime8bytes) >> (64-h)) ; }
|
||||
static size_t ZSTD_hash8Ptr(const void* p, U32 h) { return ZSTD_hash8(MEM_readLE64(p), h); }
|
||||
|
||||
|
||||
/**
|
||||
* Hash the d-byte value pointed to by p and mod 2^f
|
||||
*/
|
||||
static size_t FASTCOVER_hashPtrToIndex(const void* p, U32 h, unsigned d) {
|
||||
if (d == 6) {
|
||||
return ZSTD_hash6Ptr(p, h) & ((1 << h) - 1);
|
||||
}
|
||||
return ZSTD_hash8Ptr(p, h) & ((1 << h) - 1);
|
||||
}
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Context
|
||||
***************************************/
|
||||
typedef struct {
|
||||
const BYTE *samples;
|
||||
size_t *offsets;
|
||||
const size_t *samplesSizes;
|
||||
size_t nbSamples;
|
||||
size_t nbTrainSamples;
|
||||
size_t nbTestSamples;
|
||||
size_t nbDmers;
|
||||
U32 *freqs;
|
||||
U16 *segmentFreqs;
|
||||
unsigned d;
|
||||
} FASTCOVER_ctx_t;
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Helper functions
|
||||
***************************************/
|
||||
/**
|
||||
* Returns the sum of the sample sizes.
|
||||
*/
|
||||
static size_t FASTCOVER_sum(const size_t *samplesSizes, unsigned nbSamples) {
|
||||
size_t sum = 0;
|
||||
unsigned i;
|
||||
for (i = 0; i < nbSamples; ++i) {
|
||||
sum += samplesSizes[i];
|
||||
}
|
||||
return sum;
|
||||
}
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* fast functions
|
||||
***************************************/
|
||||
/**
|
||||
* A segment is a range in the source as well as the score of the segment.
|
||||
*/
|
||||
typedef struct {
|
||||
U32 begin;
|
||||
U32 end;
|
||||
U32 score;
|
||||
} FASTCOVER_segment_t;
|
||||
|
||||
|
||||
/**
|
||||
* Selects the best segment in an epoch.
|
||||
* Segments of are scored according to the function:
|
||||
*
|
||||
* Let F(d) be the frequency of all dmers with hash value d.
|
||||
* Let S_i be hash value of the dmer at position i of segment S which has length k.
|
||||
*
|
||||
* Score(S) = F(S_1) + F(S_2) + ... + F(S_{k-d+1})
|
||||
*
|
||||
* Once the dmer with hash value d is in the dictionary we set F(d) = F(d)/2.
|
||||
*/
|
||||
static FASTCOVER_segment_t FASTCOVER_selectSegment(const FASTCOVER_ctx_t *ctx,
|
||||
U32 *freqs, U32 begin,U32 end,
|
||||
ZDICT_fastCover_params_t parameters) {
|
||||
/* Constants */
|
||||
const U32 k = parameters.k;
|
||||
const U32 d = parameters.d;
|
||||
const U32 dmersInK = k - d + 1;
|
||||
/* Try each segment (activeSegment) and save the best (bestSegment) */
|
||||
FASTCOVER_segment_t bestSegment = {0, 0, 0};
|
||||
FASTCOVER_segment_t activeSegment;
|
||||
/* Reset the activeDmers in the segment */
|
||||
/* The activeSegment starts at the beginning of the epoch. */
|
||||
activeSegment.begin = begin;
|
||||
activeSegment.end = begin;
|
||||
activeSegment.score = 0;
|
||||
{
|
||||
/* Slide the activeSegment through the whole epoch.
|
||||
* Save the best segment in bestSegment.
|
||||
*/
|
||||
while (activeSegment.end < end) {
|
||||
/* Get hash value of current dmer */
|
||||
const size_t index = FASTCOVER_hashPtrToIndex(ctx->samples + activeSegment.end, parameters.f, ctx->d);
|
||||
/* Add frequency of this index to score if this is the first occurrence of index in active segment */
|
||||
if (ctx->segmentFreqs[index] == 0) {
|
||||
activeSegment.score += freqs[index];
|
||||
}
|
||||
ctx->segmentFreqs[index] += 1;
|
||||
/* Increment end of segment */
|
||||
activeSegment.end += 1;
|
||||
/* If the window is now too large, drop the first position */
|
||||
if (activeSegment.end - activeSegment.begin == dmersInK + 1) {
|
||||
/* Get hash value of the dmer to be eliminated from active segment */
|
||||
const size_t delIndex = FASTCOVER_hashPtrToIndex(ctx->samples + activeSegment.begin, parameters.f, ctx->d);
|
||||
ctx->segmentFreqs[delIndex] -= 1;
|
||||
/* Subtract frequency of this index from score if this is the last occurrence of this index in active segment */
|
||||
if (ctx->segmentFreqs[delIndex] == 0) {
|
||||
activeSegment.score -= freqs[delIndex];
|
||||
}
|
||||
/* Increment start of segment */
|
||||
activeSegment.begin += 1;
|
||||
}
|
||||
/* If this segment is the best so far save it */
|
||||
if (activeSegment.score > bestSegment.score) {
|
||||
bestSegment = activeSegment;
|
||||
}
|
||||
}
|
||||
/* Zero out rest of segmentFreqs array */
|
||||
while (activeSegment.begin < end) {
|
||||
const size_t delIndex = FASTCOVER_hashPtrToIndex(ctx->samples + activeSegment.begin, parameters.f, ctx->d);
|
||||
ctx->segmentFreqs[delIndex] -= 1;
|
||||
activeSegment.begin += 1;
|
||||
}
|
||||
}
|
||||
{
|
||||
/* Trim off the zero frequency head and tail from the segment. */
|
||||
U32 newBegin = bestSegment.end;
|
||||
U32 newEnd = bestSegment.begin;
|
||||
U32 pos;
|
||||
for (pos = bestSegment.begin; pos != bestSegment.end; ++pos) {
|
||||
const size_t index = FASTCOVER_hashPtrToIndex(ctx->samples + pos, parameters.f, ctx->d);
|
||||
U32 freq = freqs[index];
|
||||
if (freq != 0) {
|
||||
newBegin = MIN(newBegin, pos);
|
||||
newEnd = pos + 1;
|
||||
}
|
||||
}
|
||||
bestSegment.begin = newBegin;
|
||||
bestSegment.end = newEnd;
|
||||
}
|
||||
{
|
||||
/* Zero the frequency of hash value of each dmer covered by the chosen segment. */
|
||||
U32 pos;
|
||||
for (pos = bestSegment.begin; pos != bestSegment.end; ++pos) {
|
||||
const size_t i = FASTCOVER_hashPtrToIndex(ctx->samples + pos, parameters.f, ctx->d);
|
||||
freqs[i] = 0;
|
||||
}
|
||||
}
|
||||
return bestSegment;
|
||||
}
|
||||
|
||||
/**
|
||||
* Check the validity of the parameters.
|
||||
* Returns non-zero if the parameters are valid and 0 otherwise.
|
||||
*/
|
||||
static int FASTCOVER_checkParameters(ZDICT_fastCover_params_t parameters,
|
||||
size_t maxDictSize) {
|
||||
/* k, d, and f are required parameters */
|
||||
if (parameters.d == 0 || parameters.k == 0 || parameters.f == 0) {
|
||||
return 0;
|
||||
}
|
||||
/* d has to be 6 or 8 */
|
||||
if (parameters.d != 6 && parameters.d != 8) {
|
||||
return 0;
|
||||
}
|
||||
/* 0 < f <= FASTCOVER_MAX_F */
|
||||
if (parameters.f > FASTCOVER_MAX_F) {
|
||||
return 0;
|
||||
}
|
||||
/* k <= maxDictSize */
|
||||
if (parameters.k > maxDictSize) {
|
||||
return 0;
|
||||
}
|
||||
/* d <= k */
|
||||
if (parameters.d > parameters.k) {
|
||||
return 0;
|
||||
}
|
||||
/* 0 < splitPoint <= 1 */
|
||||
if (parameters.splitPoint <= 0 || parameters.splitPoint > 1) {
|
||||
return 0;
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* Clean up a context initialized with `FASTCOVER_ctx_init()`.
|
||||
*/
|
||||
static void FASTCOVER_ctx_destroy(FASTCOVER_ctx_t *ctx) {
|
||||
if (!ctx) {
|
||||
return;
|
||||
}
|
||||
if (ctx->segmentFreqs) {
|
||||
free(ctx->segmentFreqs);
|
||||
ctx->segmentFreqs = NULL;
|
||||
}
|
||||
if (ctx->freqs) {
|
||||
free(ctx->freqs);
|
||||
ctx->freqs = NULL;
|
||||
}
|
||||
if (ctx->offsets) {
|
||||
free(ctx->offsets);
|
||||
ctx->offsets = NULL;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Calculate for frequency of hash value of each dmer in ctx->samples
|
||||
*/
|
||||
static void FASTCOVER_computeFrequency(U32 *freqs, unsigned f, FASTCOVER_ctx_t *ctx){
|
||||
size_t start; /* start of current dmer */
|
||||
for (unsigned i = 0; i < ctx->nbTrainSamples; i++) {
|
||||
size_t currSampleStart = ctx->offsets[i];
|
||||
size_t currSampleEnd = ctx->offsets[i+1];
|
||||
start = currSampleStart;
|
||||
while (start + ctx->d <= currSampleEnd) {
|
||||
const size_t dmerIndex = FASTCOVER_hashPtrToIndex(ctx->samples + start, f, ctx->d);
|
||||
freqs[dmerIndex]++;
|
||||
start++;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Prepare a context for dictionary building.
|
||||
* The context is only dependent on the parameter `d` and can used multiple
|
||||
* times.
|
||||
* Returns 1 on success or zero on error.
|
||||
* The context must be destroyed with `FASTCOVER_ctx_destroy()`.
|
||||
*/
|
||||
static int FASTCOVER_ctx_init(FASTCOVER_ctx_t *ctx, const void *samplesBuffer,
|
||||
const size_t *samplesSizes, unsigned nbSamples,
|
||||
unsigned d, double splitPoint, unsigned f) {
|
||||
const BYTE *const samples = (const BYTE *)samplesBuffer;
|
||||
const size_t totalSamplesSize = FASTCOVER_sum(samplesSizes, nbSamples);
|
||||
/* Split samples into testing and training sets */
|
||||
const unsigned nbTrainSamples = splitPoint < 1.0 ? (unsigned)((double)nbSamples * splitPoint) : nbSamples;
|
||||
const unsigned nbTestSamples = splitPoint < 1.0 ? nbSamples - nbTrainSamples : nbSamples;
|
||||
const size_t trainingSamplesSize = splitPoint < 1.0 ? FASTCOVER_sum(samplesSizes, nbTrainSamples) : totalSamplesSize;
|
||||
const size_t testSamplesSize = splitPoint < 1.0 ? FASTCOVER_sum(samplesSizes + nbTrainSamples, nbTestSamples) : totalSamplesSize;
|
||||
/* Checks */
|
||||
if (totalSamplesSize < MAX(d, sizeof(U64)) ||
|
||||
totalSamplesSize >= (size_t)FASTCOVER_MAX_SAMPLES_SIZE) {
|
||||
DISPLAYLEVEL(1, "Total samples size is too large (%u MB), maximum size is %u MB\n",
|
||||
(U32)(totalSamplesSize >> 20), (FASTCOVER_MAX_SAMPLES_SIZE >> 20));
|
||||
return 0;
|
||||
}
|
||||
/* Check if there are at least 5 training samples */
|
||||
if (nbTrainSamples < 5) {
|
||||
DISPLAYLEVEL(1, "Total number of training samples is %u and is invalid.", nbTrainSamples);
|
||||
return 0;
|
||||
}
|
||||
/* Check if there's testing sample */
|
||||
if (nbTestSamples < 1) {
|
||||
DISPLAYLEVEL(1, "Total number of testing samples is %u and is invalid.", nbTestSamples);
|
||||
return 0;
|
||||
}
|
||||
/* Zero the context */
|
||||
memset(ctx, 0, sizeof(*ctx));
|
||||
DISPLAYLEVEL(2, "Training on %u samples of total size %u\n", nbTrainSamples,
|
||||
(U32)trainingSamplesSize);
|
||||
DISPLAYLEVEL(2, "Testing on %u samples of total size %u\n", nbTestSamples,
|
||||
(U32)testSamplesSize);
|
||||
|
||||
ctx->samples = samples;
|
||||
ctx->samplesSizes = samplesSizes;
|
||||
ctx->nbSamples = nbSamples;
|
||||
ctx->nbTrainSamples = nbTrainSamples;
|
||||
ctx->nbTestSamples = nbTestSamples;
|
||||
ctx->nbDmers = trainingSamplesSize - d + 1;
|
||||
ctx->d = d;
|
||||
|
||||
/* The offsets of each file */
|
||||
ctx->offsets = (size_t *)malloc((nbSamples + 1) * sizeof(size_t));
|
||||
if (!ctx->offsets) {
|
||||
DISPLAYLEVEL(1, "Failed to allocate scratch buffers\n");
|
||||
FASTCOVER_ctx_destroy(ctx);
|
||||
return 0;
|
||||
}
|
||||
|
||||
/* Fill offsets from the samplesSizes */
|
||||
{
|
||||
U32 i;
|
||||
ctx->offsets[0] = 0;
|
||||
for (i = 1; i <= nbSamples; ++i) {
|
||||
ctx->offsets[i] = ctx->offsets[i - 1] + samplesSizes[i - 1];
|
||||
}
|
||||
}
|
||||
|
||||
/* Initialize frequency array of size 2^f */
|
||||
ctx->freqs = (U32 *)calloc((1 << f), sizeof(U32));
|
||||
ctx->segmentFreqs = (U16 *)calloc((1 << f), sizeof(U16));
|
||||
DISPLAYLEVEL(2, "Computing frequencies\n");
|
||||
FASTCOVER_computeFrequency(ctx->freqs, f, ctx);
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* Given the prepared context build the dictionary.
|
||||
*/
|
||||
static size_t FASTCOVER_buildDictionary(const FASTCOVER_ctx_t *ctx, U32 *freqs,
|
||||
void *dictBuffer,
|
||||
size_t dictBufferCapacity,
|
||||
ZDICT_fastCover_params_t parameters){
|
||||
BYTE *const dict = (BYTE *)dictBuffer;
|
||||
size_t tail = dictBufferCapacity;
|
||||
/* Divide the data up into epochs of equal size.
|
||||
* We will select at least one segment from each epoch.
|
||||
*/
|
||||
const U32 epochs = MAX(1, (U32)(dictBufferCapacity / parameters.k));
|
||||
const U32 epochSize = (U32)(ctx->nbDmers / epochs);
|
||||
size_t epoch;
|
||||
DISPLAYLEVEL(2, "Breaking content into %u epochs of size %u\n", epochs,
|
||||
epochSize);
|
||||
/* Loop through the epochs until there are no more segments or the dictionary
|
||||
* is full.
|
||||
*/
|
||||
for (epoch = 0; tail > 0; epoch = (epoch + 1) % epochs) {
|
||||
const U32 epochBegin = (U32)(epoch * epochSize);
|
||||
const U32 epochEnd = epochBegin + epochSize;
|
||||
size_t segmentSize;
|
||||
/* Select a segment */
|
||||
FASTCOVER_segment_t segment = FASTCOVER_selectSegment(
|
||||
ctx, freqs, epochBegin, epochEnd, parameters);
|
||||
|
||||
/* If the segment covers no dmers, then we are out of content */
|
||||
if (segment.score == 0) {
|
||||
break;
|
||||
}
|
||||
|
||||
/* Trim the segment if necessary and if it is too small then we are done */
|
||||
segmentSize = MIN(segment.end - segment.begin + parameters.d - 1, tail);
|
||||
if (segmentSize < parameters.d) {
|
||||
break;
|
||||
}
|
||||
|
||||
/* We fill the dictionary from the back to allow the best segments to be
|
||||
* referenced with the smallest offsets.
|
||||
*/
|
||||
tail -= segmentSize;
|
||||
memcpy(dict + tail, ctx->samples + segment.begin, segmentSize);
|
||||
DISPLAYUPDATE(
|
||||
2, "\r%u%% ",
|
||||
(U32)(((dictBufferCapacity - tail) * 100) / dictBufferCapacity));
|
||||
}
|
||||
DISPLAYLEVEL(2, "\r%79s\r", "");
|
||||
return tail;
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* FASTCOVER_best_t is used for two purposes:
|
||||
* 1. Synchronizing threads.
|
||||
* 2. Saving the best parameters and dictionary.
|
||||
*
|
||||
* All of the methods except FASTCOVER_best_init() are thread safe if zstd is
|
||||
* compiled with multithreaded support.
|
||||
*/
|
||||
typedef struct fast_best_s {
|
||||
ZSTD_pthread_mutex_t mutex;
|
||||
ZSTD_pthread_cond_t cond;
|
||||
size_t liveJobs;
|
||||
void *dict;
|
||||
size_t dictSize;
|
||||
ZDICT_fastCover_params_t parameters;
|
||||
size_t compressedSize;
|
||||
} FASTCOVER_best_t;
|
||||
|
||||
/**
|
||||
* Initialize the `FASTCOVER_best_t`.
|
||||
*/
|
||||
static void FASTCOVER_best_init(FASTCOVER_best_t *best) {
|
||||
if (best==NULL) return; /* compatible with init on NULL */
|
||||
(void)ZSTD_pthread_mutex_init(&best->mutex, NULL);
|
||||
(void)ZSTD_pthread_cond_init(&best->cond, NULL);
|
||||
best->liveJobs = 0;
|
||||
best->dict = NULL;
|
||||
best->dictSize = 0;
|
||||
best->compressedSize = (size_t)-1;
|
||||
memset(&best->parameters, 0, sizeof(best->parameters));
|
||||
}
|
||||
|
||||
/**
|
||||
* Wait until liveJobs == 0.
|
||||
*/
|
||||
static void FASTCOVER_best_wait(FASTCOVER_best_t *best) {
|
||||
if (!best) {
|
||||
return;
|
||||
}
|
||||
ZSTD_pthread_mutex_lock(&best->mutex);
|
||||
while (best->liveJobs != 0) {
|
||||
ZSTD_pthread_cond_wait(&best->cond, &best->mutex);
|
||||
}
|
||||
ZSTD_pthread_mutex_unlock(&best->mutex);
|
||||
}
|
||||
|
||||
/**
|
||||
* Call FASTCOVER_best_wait() and then destroy the FASTCOVER_best_t.
|
||||
*/
|
||||
static void FASTCOVER_best_destroy(FASTCOVER_best_t *best) {
|
||||
if (!best) {
|
||||
return;
|
||||
}
|
||||
FASTCOVER_best_wait(best);
|
||||
if (best->dict) {
|
||||
free(best->dict);
|
||||
}
|
||||
ZSTD_pthread_mutex_destroy(&best->mutex);
|
||||
ZSTD_pthread_cond_destroy(&best->cond);
|
||||
}
|
||||
|
||||
/**
|
||||
* Called when a thread is about to be launched.
|
||||
* Increments liveJobs.
|
||||
*/
|
||||
static void FASTCOVER_best_start(FASTCOVER_best_t *best) {
|
||||
if (!best) {
|
||||
return;
|
||||
}
|
||||
ZSTD_pthread_mutex_lock(&best->mutex);
|
||||
++best->liveJobs;
|
||||
ZSTD_pthread_mutex_unlock(&best->mutex);
|
||||
}
|
||||
|
||||
/**
|
||||
* Called when a thread finishes executing, both on error or success.
|
||||
* Decrements liveJobs and signals any waiting threads if liveJobs == 0.
|
||||
* If this dictionary is the best so far save it and its parameters.
|
||||
*/
|
||||
static void FASTCOVER_best_finish(FASTCOVER_best_t *best, size_t compressedSize,
|
||||
ZDICT_fastCover_params_t parameters, void *dict,
|
||||
size_t dictSize) {
|
||||
if (!best) {
|
||||
return;
|
||||
}
|
||||
{
|
||||
size_t liveJobs;
|
||||
ZSTD_pthread_mutex_lock(&best->mutex);
|
||||
--best->liveJobs;
|
||||
liveJobs = best->liveJobs;
|
||||
/* If the new dictionary is better */
|
||||
if (compressedSize < best->compressedSize) {
|
||||
/* Allocate space if necessary */
|
||||
if (!best->dict || best->dictSize < dictSize) {
|
||||
if (best->dict) {
|
||||
free(best->dict);
|
||||
}
|
||||
best->dict = malloc(dictSize);
|
||||
if (!best->dict) {
|
||||
best->compressedSize = ERROR(GENERIC);
|
||||
best->dictSize = 0;
|
||||
return;
|
||||
}
|
||||
}
|
||||
/* Save the dictionary, parameters, and size */
|
||||
memcpy(best->dict, dict, dictSize);
|
||||
best->dictSize = dictSize;
|
||||
best->parameters = parameters;
|
||||
best->compressedSize = compressedSize;
|
||||
}
|
||||
ZSTD_pthread_mutex_unlock(&best->mutex);
|
||||
if (liveJobs == 0) {
|
||||
ZSTD_pthread_cond_broadcast(&best->cond);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Parameters for FASTCOVER_tryParameters().
|
||||
*/
|
||||
typedef struct FASTCOVER_tryParameters_data_s {
|
||||
const FASTCOVER_ctx_t *ctx;
|
||||
FASTCOVER_best_t *best;
|
||||
size_t dictBufferCapacity;
|
||||
ZDICT_fastCover_params_t parameters;
|
||||
} FASTCOVER_tryParameters_data_t;
|
||||
|
||||
/**
|
||||
* Tries a set of parameters and updates the FASTCOVER_best_t with the results.
|
||||
* This function is thread safe if zstd is compiled with multithreaded support.
|
||||
* It takes its parameters as an *OWNING* opaque pointer to support threading.
|
||||
*/
|
||||
static void FASTCOVER_tryParameters(void *opaque) {
|
||||
/* Save parameters as local variables */
|
||||
FASTCOVER_tryParameters_data_t *const data = (FASTCOVER_tryParameters_data_t *)opaque;
|
||||
const FASTCOVER_ctx_t *const ctx = data->ctx;
|
||||
const ZDICT_fastCover_params_t parameters = data->parameters;
|
||||
size_t dictBufferCapacity = data->dictBufferCapacity;
|
||||
size_t totalCompressedSize = ERROR(GENERIC);
|
||||
/* Allocate space for hash table, dict, and freqs */
|
||||
BYTE *const dict = (BYTE * const)malloc(dictBufferCapacity);
|
||||
U32 *freqs = (U32*) malloc((1 << parameters.f) * sizeof(U32));
|
||||
if (!dict || !freqs) {
|
||||
DISPLAYLEVEL(1, "Failed to allocate buffers: out of memory\n");
|
||||
goto _cleanup;
|
||||
}
|
||||
/* Copy the frequencies because we need to modify them */
|
||||
memcpy(freqs, ctx->freqs, (1 << parameters.f) * sizeof(U32));
|
||||
/* Build the dictionary */
|
||||
{
|
||||
const size_t tail = FASTCOVER_buildDictionary(ctx, freqs, dict,
|
||||
dictBufferCapacity, parameters);
|
||||
|
||||
dictBufferCapacity = ZDICT_finalizeDictionary(
|
||||
dict, dictBufferCapacity, dict + tail, dictBufferCapacity - tail,
|
||||
ctx->samples, ctx->samplesSizes, (unsigned)ctx->nbTrainSamples,
|
||||
parameters.zParams);
|
||||
if (ZDICT_isError(dictBufferCapacity)) {
|
||||
DISPLAYLEVEL(1, "Failed to finalize dictionary\n");
|
||||
goto _cleanup;
|
||||
}
|
||||
}
|
||||
/* Check total compressed size */
|
||||
{
|
||||
/* Pointers */
|
||||
ZSTD_CCtx *cctx;
|
||||
ZSTD_CDict *cdict;
|
||||
void *dst;
|
||||
/* Local variables */
|
||||
size_t dstCapacity;
|
||||
size_t i;
|
||||
/* Allocate dst with enough space to compress the maximum sized sample */
|
||||
{
|
||||
size_t maxSampleSize = 0;
|
||||
i = parameters.splitPoint < 1.0 ? ctx->nbTrainSamples : 0;
|
||||
for (; i < ctx->nbSamples; ++i) {
|
||||
maxSampleSize = MAX(ctx->samplesSizes[i], maxSampleSize);
|
||||
}
|
||||
dstCapacity = ZSTD_compressBound(maxSampleSize);
|
||||
dst = malloc(dstCapacity);
|
||||
}
|
||||
/* Create the cctx and cdict */
|
||||
cctx = ZSTD_createCCtx();
|
||||
cdict = ZSTD_createCDict(dict, dictBufferCapacity,
|
||||
parameters.zParams.compressionLevel);
|
||||
if (!dst || !cctx || !cdict) {
|
||||
goto _compressCleanup;
|
||||
}
|
||||
/* Compress each sample and sum their sizes (or error) */
|
||||
totalCompressedSize = dictBufferCapacity;
|
||||
i = parameters.splitPoint < 1.0 ? ctx->nbTrainSamples : 0;
|
||||
for (; i < ctx->nbSamples; ++i) {
|
||||
const size_t size = ZSTD_compress_usingCDict(
|
||||
cctx, dst, dstCapacity, ctx->samples + ctx->offsets[i],
|
||||
ctx->samplesSizes[i], cdict);
|
||||
if (ZSTD_isError(size)) {
|
||||
totalCompressedSize = ERROR(GENERIC);
|
||||
goto _compressCleanup;
|
||||
}
|
||||
totalCompressedSize += size;
|
||||
}
|
||||
_compressCleanup:
|
||||
ZSTD_freeCCtx(cctx);
|
||||
ZSTD_freeCDict(cdict);
|
||||
if (dst) {
|
||||
free(dst);
|
||||
}
|
||||
}
|
||||
|
||||
_cleanup:
|
||||
FASTCOVER_best_finish(data->best, totalCompressedSize, parameters, dict,
|
||||
dictBufferCapacity);
|
||||
free(data);
|
||||
if (dict) {
|
||||
free(dict);
|
||||
}
|
||||
if (freqs) {
|
||||
free(freqs);
|
||||
}
|
||||
}
|
||||
|
||||
ZDICTLIB_API size_t ZDICT_trainFromBuffer_fastCover(
|
||||
void *dictBuffer, size_t dictBufferCapacity, const void *samplesBuffer,
|
||||
const size_t *samplesSizes, unsigned nbSamples, ZDICT_fastCover_params_t parameters) {
|
||||
BYTE* const dict = (BYTE*)dictBuffer;
|
||||
FASTCOVER_ctx_t ctx;
|
||||
parameters.splitPoint = 1.0;
|
||||
/* Initialize global data */
|
||||
g_displayLevel = parameters.zParams.notificationLevel;
|
||||
/* Checks */
|
||||
if (!FASTCOVER_checkParameters(parameters, dictBufferCapacity)) {
|
||||
DISPLAYLEVEL(1, "FASTCOVER parameters incorrect\n");
|
||||
return ERROR(GENERIC);
|
||||
}
|
||||
if (nbSamples == 0) {
|
||||
DISPLAYLEVEL(1, "FASTCOVER must have at least one input file\n");
|
||||
return ERROR(GENERIC);
|
||||
}
|
||||
if (dictBufferCapacity < ZDICT_DICTSIZE_MIN) {
|
||||
DISPLAYLEVEL(1, "dictBufferCapacity must be at least %u\n",
|
||||
ZDICT_DICTSIZE_MIN);
|
||||
return ERROR(dstSize_tooSmall);
|
||||
}
|
||||
/* Initialize context */
|
||||
if (!FASTCOVER_ctx_init(&ctx, samplesBuffer, samplesSizes, nbSamples,
|
||||
parameters.d, parameters.splitPoint, parameters.f)) {
|
||||
DISPLAYLEVEL(1, "Failed to initialize context\n");
|
||||
return ERROR(GENERIC);
|
||||
}
|
||||
/* Build the dictionary */
|
||||
DISPLAYLEVEL(2, "Building dictionary\n");
|
||||
{
|
||||
const size_t tail = FASTCOVER_buildDictionary(&ctx, ctx.freqs, dictBuffer,
|
||||
dictBufferCapacity, parameters);
|
||||
|
||||
const size_t dictionarySize = ZDICT_finalizeDictionary(
|
||||
dict, dictBufferCapacity, dict + tail, dictBufferCapacity - tail,
|
||||
samplesBuffer, samplesSizes, (unsigned)ctx.nbTrainSamples,
|
||||
parameters.zParams);
|
||||
if (!ZSTD_isError(dictionarySize)) {
|
||||
DISPLAYLEVEL(2, "Constructed dictionary of size %u\n",
|
||||
(U32)dictionarySize);
|
||||
}
|
||||
FASTCOVER_ctx_destroy(&ctx);
|
||||
return dictionarySize;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
|
||||
ZDICTLIB_API size_t ZDICT_optimizeTrainFromBuffer_fastCover(
|
||||
void *dictBuffer, size_t dictBufferCapacity, const void *samplesBuffer,
|
||||
const size_t *samplesSizes, unsigned nbSamples,
|
||||
ZDICT_fastCover_params_t *parameters) {
|
||||
/* constants */
|
||||
const unsigned nbThreads = parameters->nbThreads;
|
||||
const double splitPoint =
|
||||
parameters->splitPoint <= 0.0 ? DEFAULT_SPLITPOINT : parameters->splitPoint;
|
||||
const unsigned kMinD = parameters->d == 0 ? 6 : parameters->d;
|
||||
const unsigned kMaxD = parameters->d == 0 ? 8 : parameters->d;
|
||||
const unsigned kMinK = parameters->k == 0 ? 50 : parameters->k;
|
||||
const unsigned kMaxK = parameters->k == 0 ? 2000 : parameters->k;
|
||||
const unsigned kSteps = parameters->steps == 0 ? 40 : parameters->steps;
|
||||
const unsigned kStepSize = MAX((kMaxK - kMinK) / kSteps, 1);
|
||||
const unsigned kIterations =
|
||||
(1 + (kMaxD - kMinD) / 2) * (1 + (kMaxK - kMinK) / kStepSize);
|
||||
const unsigned f = parameters->f == 0 ? 23 : parameters->f;
|
||||
|
||||
/* Local variables */
|
||||
const int displayLevel = parameters->zParams.notificationLevel;
|
||||
unsigned iteration = 1;
|
||||
unsigned d;
|
||||
unsigned k;
|
||||
FASTCOVER_best_t best;
|
||||
POOL_ctx *pool = NULL;
|
||||
|
||||
/* Checks */
|
||||
if (splitPoint <= 0 || splitPoint > 1) {
|
||||
LOCALDISPLAYLEVEL(displayLevel, 1, "Incorrect splitPoint\n");
|
||||
return ERROR(GENERIC);
|
||||
}
|
||||
if (kMinK < kMaxD || kMaxK < kMinK) {
|
||||
LOCALDISPLAYLEVEL(displayLevel, 1, "Incorrect k\n");
|
||||
return ERROR(GENERIC);
|
||||
}
|
||||
if (nbSamples == 0) {
|
||||
DISPLAYLEVEL(1, "FASTCOVER must have at least one input file\n");
|
||||
return ERROR(GENERIC);
|
||||
}
|
||||
if (dictBufferCapacity < ZDICT_DICTSIZE_MIN) {
|
||||
DISPLAYLEVEL(1, "dictBufferCapacity must be at least %u\n",
|
||||
ZDICT_DICTSIZE_MIN);
|
||||
return ERROR(dstSize_tooSmall);
|
||||
}
|
||||
if (nbThreads > 1) {
|
||||
pool = POOL_create(nbThreads, 1);
|
||||
if (!pool) {
|
||||
return ERROR(memory_allocation);
|
||||
}
|
||||
}
|
||||
/* Initialization */
|
||||
FASTCOVER_best_init(&best);
|
||||
/* Turn down global display level to clean up display at level 2 and below */
|
||||
g_displayLevel = displayLevel == 0 ? 0 : displayLevel - 1;
|
||||
/* Loop through d first because each new value needs a new context */
|
||||
LOCALDISPLAYLEVEL(displayLevel, 2, "Trying %u different sets of parameters\n",
|
||||
kIterations);
|
||||
for (d = kMinD; d <= kMaxD; d += 2) {
|
||||
/* Initialize the context for this value of d */
|
||||
FASTCOVER_ctx_t ctx;
|
||||
LOCALDISPLAYLEVEL(displayLevel, 3, "d=%u\n", d);
|
||||
if (!FASTCOVER_ctx_init(&ctx, samplesBuffer, samplesSizes, nbSamples, d, splitPoint, f)) {
|
||||
LOCALDISPLAYLEVEL(displayLevel, 1, "Failed to initialize context\n");
|
||||
FASTCOVER_best_destroy(&best);
|
||||
POOL_free(pool);
|
||||
return ERROR(GENERIC);
|
||||
}
|
||||
/* Loop through k reusing the same context */
|
||||
for (k = kMinK; k <= kMaxK; k += kStepSize) {
|
||||
/* Prepare the arguments */
|
||||
FASTCOVER_tryParameters_data_t *data = (FASTCOVER_tryParameters_data_t *)malloc(
|
||||
sizeof(FASTCOVER_tryParameters_data_t));
|
||||
LOCALDISPLAYLEVEL(displayLevel, 3, "k=%u\n", k);
|
||||
if (!data) {
|
||||
LOCALDISPLAYLEVEL(displayLevel, 1, "Failed to allocate parameters\n");
|
||||
FASTCOVER_best_destroy(&best);
|
||||
FASTCOVER_ctx_destroy(&ctx);
|
||||
POOL_free(pool);
|
||||
return ERROR(GENERIC);
|
||||
}
|
||||
data->ctx = &ctx;
|
||||
data->best = &best;
|
||||
data->dictBufferCapacity = dictBufferCapacity;
|
||||
data->parameters = *parameters;
|
||||
data->parameters.k = k;
|
||||
data->parameters.d = d;
|
||||
data->parameters.f = f;
|
||||
data->parameters.splitPoint = splitPoint;
|
||||
data->parameters.steps = kSteps;
|
||||
data->parameters.zParams.notificationLevel = g_displayLevel;
|
||||
/* Check the parameters */
|
||||
if (!FASTCOVER_checkParameters(data->parameters, dictBufferCapacity)) {
|
||||
DISPLAYLEVEL(1, "fastCover parameters incorrect\n");
|
||||
free(data);
|
||||
continue;
|
||||
}
|
||||
/* Call the function and pass ownership of data to it */
|
||||
FASTCOVER_best_start(&best);
|
||||
if (pool) {
|
||||
POOL_add(pool, &FASTCOVER_tryParameters, data);
|
||||
} else {
|
||||
FASTCOVER_tryParameters(data);
|
||||
}
|
||||
/* Print status */
|
||||
LOCALDISPLAYUPDATE(displayLevel, 2, "\r%u%% ",
|
||||
(U32)((iteration * 100) / kIterations));
|
||||
++iteration;
|
||||
}
|
||||
FASTCOVER_best_wait(&best);
|
||||
FASTCOVER_ctx_destroy(&ctx);
|
||||
}
|
||||
LOCALDISPLAYLEVEL(displayLevel, 2, "\r%79s\r", "");
|
||||
/* Fill the output buffer and parameters with output of the best parameters */
|
||||
{
|
||||
const size_t dictSize = best.dictSize;
|
||||
if (ZSTD_isError(best.compressedSize)) {
|
||||
const size_t compressedSize = best.compressedSize;
|
||||
FASTCOVER_best_destroy(&best);
|
||||
POOL_free(pool);
|
||||
return compressedSize;
|
||||
}
|
||||
*parameters = best.parameters;
|
||||
memcpy(dictBuffer, best.dict, dictSize);
|
||||
FASTCOVER_best_destroy(&best);
|
||||
POOL_free(pool);
|
||||
return dictSize;
|
||||
}
|
||||
|
||||
}
|
@ -1,57 +0,0 @@
|
||||
#include <stdio.h> /* fprintf */
|
||||
#include <stdlib.h> /* malloc, free, qsort */
|
||||
#include <string.h> /* memset */
|
||||
#include <time.h> /* clock */
|
||||
#include "mem.h" /* read */
|
||||
#include "pool.h"
|
||||
#include "threading.h"
|
||||
#include "zstd_internal.h" /* includes zstd.h */
|
||||
#ifndef ZDICT_STATIC_LINKING_ONLY
|
||||
#define ZDICT_STATIC_LINKING_ONLY
|
||||
#endif
|
||||
#include "zdict.h"
|
||||
|
||||
|
||||
typedef struct {
|
||||
unsigned k; /* Segment size : constraint: 0 < k : Reasonable range [16, 2048+] */
|
||||
unsigned d; /* dmer size : constraint: 0 < d <= k : Reasonable range [6, 16] */
|
||||
unsigned f; /* log of size of frequency array */
|
||||
unsigned steps; /* Number of steps : Only used for optimization : 0 means default (32) : Higher means more parameters checked */
|
||||
unsigned nbThreads; /* Number of threads : constraint: 0 < nbThreads : 1 means single-threaded : Only used for optimization : Ignored if ZSTD_MULTITHREAD is not defined */
|
||||
double splitPoint; /* Percentage of samples used for training: the first nbSamples * splitPoint samples will be used to training, the last nbSamples * (1 - splitPoint) samples will be used for testing, 0 means default (1.0), 1.0 when all samples are used for both training and testing */
|
||||
ZDICT_params_t zParams;
|
||||
} ZDICT_fastCover_params_t;
|
||||
|
||||
|
||||
/*! ZDICT_optimizeTrainFromBuffer_fastCover():
|
||||
* Train a dictionary from an array of samples using a modified version of the COVER algorithm.
|
||||
* Samples must be stored concatenated in a single flat buffer `samplesBuffer`,
|
||||
* supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order.
|
||||
* The resulting dictionary will be saved into `dictBuffer`.
|
||||
* All of the parameters except for f are optional.
|
||||
* If d is non-zero then we don't check multiple values of d, otherwise we check d = {6, 8, 10, 12, 14, 16}.
|
||||
* if steps is zero it defaults to its default value.
|
||||
* If k is non-zero then we don't check multiple values of k, otherwise we check steps values in [16, 2048].
|
||||
*
|
||||
* @return: size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`)
|
||||
* or an error code, which can be tested with ZDICT_isError().
|
||||
* On success `*parameters` contains the parameters selected.
|
||||
*/
|
||||
ZDICTLIB_API size_t ZDICT_optimizeTrainFromBuffer_fastCover(
|
||||
void *dictBuffer, size_t dictBufferCapacity, const void *samplesBuffer,
|
||||
const size_t *samplesSizes, unsigned nbSamples,
|
||||
ZDICT_fastCover_params_t *parameters);
|
||||
|
||||
|
||||
/*! ZDICT_trainFromBuffer_fastCover():
|
||||
* Train a dictionary from an array of samples using a modified version of the COVER algorithm.
|
||||
* Samples must be stored concatenated in a single flat buffer `samplesBuffer`,
|
||||
* supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order.
|
||||
* The resulting dictionary will be saved into `dictBuffer`.
|
||||
* d, k, and f are required.
|
||||
* @return: size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`)
|
||||
* or an error code, which can be tested with ZDICT_isError().
|
||||
*/
|
||||
ZDICTLIB_API size_t ZDICT_trainFromBuffer_fastCover(
|
||||
void *dictBuffer, size_t dictBufferCapacity, const void *samplesBuffer,
|
||||
const size_t *samplesSizes, unsigned nbSamples, ZDICT_fastCover_params_t parameters);
|
@ -1,183 +0,0 @@
|
||||
#include <stdio.h> /* fprintf */
|
||||
#include <stdlib.h> /* malloc, free, qsort */
|
||||
#include <string.h> /* strcmp, strlen */
|
||||
#include <errno.h> /* errno */
|
||||
#include <ctype.h>
|
||||
#include "fastCover.h"
|
||||
#include "io.h"
|
||||
#include "util.h"
|
||||
#include "zdict.h"
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Console display
|
||||
***************************************/
|
||||
#define DISPLAY(...) fprintf(stderr, __VA_ARGS__)
|
||||
#define DISPLAYLEVEL(l, ...) if (displayLevel>=l) { DISPLAY(__VA_ARGS__); }
|
||||
|
||||
static const U64 g_refreshRate = SEC_TO_MICRO / 6;
|
||||
static UTIL_time_t g_displayClock = UTIL_TIME_INITIALIZER;
|
||||
|
||||
#define DISPLAYUPDATE(l, ...) { if (displayLevel>=l) { \
|
||||
if ((UTIL_clockSpanMicro(g_displayClock) > g_refreshRate) || (displayLevel>=4)) \
|
||||
{ g_displayClock = UTIL_getTime(); DISPLAY(__VA_ARGS__); \
|
||||
if (displayLevel>=4) fflush(stderr); } } }
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Exceptions
|
||||
***************************************/
|
||||
#ifndef DEBUG
|
||||
# define DEBUG 0
|
||||
#endif
|
||||
#define DEBUGOUTPUT(...) if (DEBUG) DISPLAY(__VA_ARGS__);
|
||||
#define EXM_THROW(error, ...) \
|
||||
{ \
|
||||
DEBUGOUTPUT("Error defined at %s, line %i : \n", __FILE__, __LINE__); \
|
||||
DISPLAY("Error %i : ", error); \
|
||||
DISPLAY(__VA_ARGS__); \
|
||||
DISPLAY("\n"); \
|
||||
exit(error); \
|
||||
}
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Constants
|
||||
***************************************/
|
||||
static const unsigned g_defaultMaxDictSize = 110 KB;
|
||||
#define DEFAULT_CLEVEL 3
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* FASTCOVER
|
||||
***************************************/
|
||||
int FASTCOVER_trainFromFiles(const char* dictFileName, sampleInfo *info,
|
||||
unsigned maxDictSize,
|
||||
ZDICT_fastCover_params_t *params) {
|
||||
unsigned const displayLevel = params->zParams.notificationLevel;
|
||||
void* const dictBuffer = malloc(maxDictSize);
|
||||
|
||||
int result = 0;
|
||||
|
||||
/* Checks */
|
||||
if (!dictBuffer)
|
||||
EXM_THROW(12, "not enough memory for trainFromFiles"); /* should not happen */
|
||||
|
||||
{ size_t dictSize;
|
||||
/* Run the optimize version if either k or d is not provided */
|
||||
if (!params->d || !params->k) {
|
||||
dictSize = ZDICT_optimizeTrainFromBuffer_fastCover(dictBuffer, maxDictSize, info->srcBuffer,
|
||||
info->samplesSizes, info->nbSamples, params);
|
||||
} else {
|
||||
dictSize = ZDICT_trainFromBuffer_fastCover(dictBuffer, maxDictSize, info->srcBuffer,
|
||||
info->samplesSizes, info->nbSamples, *params);
|
||||
}
|
||||
DISPLAYLEVEL(2, "k=%u\nd=%u\nf=%u\nsteps=%u\nsplit=%u\n", params->k, params->d, params->f, params->steps, (unsigned)(params->splitPoint*100));
|
||||
if (ZDICT_isError(dictSize)) {
|
||||
DISPLAYLEVEL(1, "dictionary training failed : %s \n", ZDICT_getErrorName(dictSize)); /* should not happen */
|
||||
result = 1;
|
||||
goto _done;
|
||||
}
|
||||
/* save dict */
|
||||
DISPLAYLEVEL(2, "Save dictionary of size %u into file %s \n", (U32)dictSize, dictFileName);
|
||||
saveDict(dictFileName, dictBuffer, dictSize);
|
||||
}
|
||||
|
||||
/* clean up */
|
||||
_done:
|
||||
free(dictBuffer);
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
|
||||
int main(int argCount, const char* argv[])
|
||||
{
|
||||
int displayLevel = 2;
|
||||
const char* programName = argv[0];
|
||||
int operationResult = 0;
|
||||
|
||||
/* Initialize arguments to default values */
|
||||
unsigned k = 0;
|
||||
unsigned d = 0;
|
||||
unsigned f = 23;
|
||||
unsigned steps = 32;
|
||||
unsigned nbThreads = 1;
|
||||
unsigned split = 100;
|
||||
const char* outputFile = "fastCoverDict";
|
||||
unsigned dictID = 0;
|
||||
unsigned maxDictSize = g_defaultMaxDictSize;
|
||||
|
||||
/* Initialize table to store input files */
|
||||
const char** filenameTable = (const char**)malloc(argCount * sizeof(const char*));
|
||||
unsigned filenameIdx = 0;
|
||||
|
||||
char* fileNamesBuf = NULL;
|
||||
unsigned fileNamesNb = filenameIdx;
|
||||
int followLinks = 0; /* follow directory recursively */
|
||||
const char** extendedFileList = NULL;
|
||||
|
||||
/* Parse arguments */
|
||||
for (int i = 1; i < argCount; i++) {
|
||||
const char* argument = argv[i];
|
||||
if (longCommandWArg(&argument, "k=")) { k = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "d=")) { d = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "f=")) { f = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "steps=")) { steps = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "split=")) { split = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "dictID=")) { dictID = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "maxdict=")) { maxDictSize = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "in=")) {
|
||||
filenameTable[filenameIdx] = argument;
|
||||
filenameIdx++;
|
||||
continue;
|
||||
}
|
||||
if (longCommandWArg(&argument, "out=")) {
|
||||
outputFile = argument;
|
||||
continue;
|
||||
}
|
||||
DISPLAYLEVEL(1, "Incorrect parameters\n");
|
||||
operationResult = 1;
|
||||
return operationResult;
|
||||
}
|
||||
|
||||
/* Get the list of all files recursively (because followLinks==0)*/
|
||||
extendedFileList = UTIL_createFileList(filenameTable, filenameIdx, &fileNamesBuf,
|
||||
&fileNamesNb, followLinks);
|
||||
if (extendedFileList) {
|
||||
unsigned u;
|
||||
for (u=0; u<fileNamesNb; u++) DISPLAYLEVEL(4, "%u %s\n", u, extendedFileList[u]);
|
||||
free((void*)filenameTable);
|
||||
filenameTable = extendedFileList;
|
||||
filenameIdx = fileNamesNb;
|
||||
}
|
||||
|
||||
size_t blockSize = 0;
|
||||
|
||||
/* Set up zParams */
|
||||
ZDICT_params_t zParams;
|
||||
zParams.compressionLevel = DEFAULT_CLEVEL;
|
||||
zParams.notificationLevel = displayLevel;
|
||||
zParams.dictID = dictID;
|
||||
|
||||
/* Set up fastCover params */
|
||||
ZDICT_fastCover_params_t params;
|
||||
params.zParams = zParams;
|
||||
params.k = k;
|
||||
params.d = d;
|
||||
params.f = f;
|
||||
params.steps = steps;
|
||||
params.nbThreads = nbThreads;
|
||||
params.splitPoint = (double)split/100;
|
||||
|
||||
/* Build dictionary */
|
||||
sampleInfo* info = getSampleInfo(filenameTable,
|
||||
filenameIdx, blockSize, maxDictSize, zParams.notificationLevel);
|
||||
operationResult = FASTCOVER_trainFromFiles(outputFile, info, maxDictSize, ¶ms);
|
||||
|
||||
/* Free allocated memory */
|
||||
UTIL_freeFileList(extendedFileList, fileNamesBuf);
|
||||
freeSampleInfo(info);
|
||||
|
||||
return operationResult;
|
||||
}
|
@ -1,15 +0,0 @@
|
||||
echo "Building fastCover dictionary with in=../../lib/common f=20 out=dict1"
|
||||
./main in=../../../lib/common f=20 out=dict1
|
||||
zstd -be3 -D dict1 -r ../../../lib/common -q
|
||||
echo "Building fastCover dictionary with in=../../lib/common k=500 d=6 f=24 out=dict2 dictID=100 maxdict=140000"
|
||||
./main in=../../../lib/common k=500 d=6 f=24 out=dict2 dictID=100 maxdict=140000
|
||||
zstd -be3 -D dict2 -r ../../../lib/common -q
|
||||
echo "Building fastCover dictionary with 2 sample sources"
|
||||
./main in=../../../lib/common in=../../../lib/compress out=dict3
|
||||
zstd -be3 -D dict3 -r ../../../lib/common -q
|
||||
echo "Removing dict1 dict2 dict3"
|
||||
rm -f dict1 dict2 dict3
|
||||
|
||||
echo "Testing with invalid parameters, should fail"
|
||||
! ./main in=../../../lib/common r=10
|
||||
! ./main in=../../../lib/common d=10
|
@ -1,52 +0,0 @@
|
||||
ARG :=
|
||||
|
||||
CC ?= gcc
|
||||
CFLAGS ?= -O3
|
||||
INCLUDES := -I ../../../programs -I ../../../lib/common -I ../../../lib -I ../../../lib/dictBuilder
|
||||
|
||||
TEST_INPUT := ../../../lib
|
||||
TEST_OUTPUT := randomDict
|
||||
|
||||
all: main run clean
|
||||
|
||||
.PHONY: test
|
||||
test: main testrun testshell clean
|
||||
|
||||
.PHONY: run
|
||||
run:
|
||||
echo "Building a random dictionary with given arguments"
|
||||
./main $(ARG)
|
||||
|
||||
main: main.o io.o random.o libzstd.a
|
||||
$(CC) $(CFLAGS) main.o io.o random.o libzstd.a -o main
|
||||
|
||||
main.o: main.c
|
||||
$(CC) $(CFLAGS) $(INCLUDES) -c main.c
|
||||
|
||||
random.o: random.c
|
||||
$(CC) $(CFLAGS) $(INCLUDES) -c random.c
|
||||
|
||||
io.o: io.c
|
||||
$(CC) $(CFLAGS) $(INCLUDES) -c io.c
|
||||
|
||||
libzstd.a:
|
||||
$(MAKE) -C ../../../lib libzstd.a
|
||||
mv ../../../lib/libzstd.a .
|
||||
|
||||
.PHONY: testrun
|
||||
testrun: main
|
||||
echo "Run with $(TEST_INPUT) and $(TEST_OUTPUT) "
|
||||
./main in=$(TEST_INPUT) out=$(TEST_OUTPUT)
|
||||
zstd -be3 -D $(TEST_OUTPUT) -r $(TEST_INPUT) -q
|
||||
rm -f $(TEST_OUTPUT)
|
||||
|
||||
.PHONY: testshell
|
||||
testshell: test.sh
|
||||
sh test.sh
|
||||
echo "Finish running test.sh"
|
||||
|
||||
.PHONY: clean
|
||||
clean:
|
||||
rm -f *.o main libzstd.a
|
||||
$(MAKE) -C ../../../lib clean
|
||||
echo "Cleaning is completed"
|
@ -1,20 +0,0 @@
|
||||
Random Dictionary Builder
|
||||
|
||||
### Permitted Arguments:
|
||||
Input File/Directory (in=fileName): required; file/directory used to build dictionary; if directory, will operate recursively for files inside directory; can include multiple files/directories, each following "in="
|
||||
Output Dictionary (out=dictName): if not provided, default to defaultDict
|
||||
Dictionary ID (dictID=#): nonnegative number; if not provided, default to 0
|
||||
Maximum Dictionary Size (maxdict=#): positive number; in bytes, if not provided, default to 110KB
|
||||
Size of Randomly Selected Segment (k=#): positive number; in bytes; if not provided, default to 200
|
||||
|
||||
###Running Test:
|
||||
make test
|
||||
|
||||
|
||||
###Usage:
|
||||
To build a random dictionary with the provided arguments: make ARG= followed by arguments
|
||||
|
||||
|
||||
### Examples:
|
||||
make ARG="in=../../../lib/dictBuilder out=dict100 dictID=520"
|
||||
make ARG="in=../../../lib/dictBuilder in=../../../lib/compress"
|
@ -1,284 +0,0 @@
|
||||
#include <stdio.h> /* fprintf */
|
||||
#include <stdlib.h> /* malloc, free, qsort */
|
||||
#include <string.h> /* strcmp, strlen */
|
||||
#include <errno.h> /* errno */
|
||||
#include <ctype.h>
|
||||
#include "io.h"
|
||||
#include "fileio.h" /* stdinmark, stdoutmark, ZSTD_EXTENSION */
|
||||
#include "platform.h" /* Large Files support */
|
||||
#include "util.h"
|
||||
#include "zdict.h"
|
||||
|
||||
/*-*************************************
|
||||
* Console display
|
||||
***************************************/
|
||||
#define DISPLAY(...) fprintf(stderr, __VA_ARGS__)
|
||||
#define DISPLAYLEVEL(l, ...) if (displayLevel>=l) { DISPLAY(__VA_ARGS__); }
|
||||
|
||||
static const U64 g_refreshRate = SEC_TO_MICRO / 6;
|
||||
static UTIL_time_t g_displayClock = UTIL_TIME_INITIALIZER;
|
||||
|
||||
#define DISPLAYUPDATE(l, ...) { if (displayLevel>=l) { \
|
||||
if ((UTIL_clockSpanMicro(g_displayClock) > g_refreshRate) || (displayLevel>=4)) \
|
||||
{ g_displayClock = UTIL_getTime(); DISPLAY(__VA_ARGS__); \
|
||||
if (displayLevel>=4) fflush(stderr); } } }
|
||||
|
||||
/*-*************************************
|
||||
* Exceptions
|
||||
***************************************/
|
||||
#ifndef DEBUG
|
||||
# define DEBUG 0
|
||||
#endif
|
||||
#define DEBUGOUTPUT(...) if (DEBUG) DISPLAY(__VA_ARGS__);
|
||||
#define EXM_THROW(error, ...) \
|
||||
{ \
|
||||
DEBUGOUTPUT("Error defined at %s, line %i : \n", __FILE__, __LINE__); \
|
||||
DISPLAY("Error %i : ", error); \
|
||||
DISPLAY(__VA_ARGS__); \
|
||||
DISPLAY("\n"); \
|
||||
exit(error); \
|
||||
}
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Constants
|
||||
***************************************/
|
||||
|
||||
#define SAMPLESIZE_MAX (128 KB)
|
||||
#define RANDOM_MAX_SAMPLES_SIZE (sizeof(size_t) == 8 ? ((U32)-1) : ((U32)1 GB))
|
||||
#define RANDOM_MEMMULT 9
|
||||
static const size_t g_maxMemory = (sizeof(size_t) == 4) ?
|
||||
(2 GB - 64 MB) : ((size_t)(512 MB) << sizeof(size_t));
|
||||
|
||||
#define NOISELENGTH 32
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Commandline related functions
|
||||
***************************************/
|
||||
unsigned readU32FromChar(const char** stringPtr){
|
||||
const char errorMsg[] = "error: numeric value too large";
|
||||
unsigned result = 0;
|
||||
while ((**stringPtr >='0') && (**stringPtr <='9')) {
|
||||
unsigned const max = (((unsigned)(-1)) / 10) - 1;
|
||||
if (result > max) exit(1);
|
||||
result *= 10, result += **stringPtr - '0', (*stringPtr)++ ;
|
||||
}
|
||||
if ((**stringPtr=='K') || (**stringPtr=='M')) {
|
||||
unsigned const maxK = ((unsigned)(-1)) >> 10;
|
||||
if (result > maxK) exit(1);
|
||||
result <<= 10;
|
||||
if (**stringPtr=='M') {
|
||||
if (result > maxK) exit(1);
|
||||
result <<= 10;
|
||||
}
|
||||
(*stringPtr)++; /* skip `K` or `M` */
|
||||
if (**stringPtr=='i') (*stringPtr)++;
|
||||
if (**stringPtr=='B') (*stringPtr)++;
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
unsigned longCommandWArg(const char** stringPtr, const char* longCommand){
|
||||
size_t const comSize = strlen(longCommand);
|
||||
int const result = !strncmp(*stringPtr, longCommand, comSize);
|
||||
if (result) *stringPtr += comSize;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/* ********************************************************
|
||||
* File related operations
|
||||
**********************************************************/
|
||||
/** loadFiles() :
|
||||
* load samples from files listed in fileNamesTable into buffer.
|
||||
* works even if buffer is too small to load all samples.
|
||||
* Also provides the size of each sample into sampleSizes table
|
||||
* which must be sized correctly, using DiB_fileStats().
|
||||
* @return : nb of samples effectively loaded into `buffer`
|
||||
* *bufferSizePtr is modified, it provides the amount data loaded within buffer.
|
||||
* sampleSizes is filled with the size of each sample.
|
||||
*/
|
||||
static unsigned loadFiles(void* buffer, size_t* bufferSizePtr, size_t* sampleSizes,
|
||||
unsigned sstSize, const char** fileNamesTable, unsigned nbFiles,
|
||||
size_t targetChunkSize, unsigned displayLevel) {
|
||||
char* const buff = (char*)buffer;
|
||||
size_t pos = 0;
|
||||
unsigned nbLoadedChunks = 0, fileIndex;
|
||||
|
||||
for (fileIndex=0; fileIndex<nbFiles; fileIndex++) {
|
||||
const char* const fileName = fileNamesTable[fileIndex];
|
||||
unsigned long long const fs64 = UTIL_getFileSize(fileName);
|
||||
unsigned long long remainingToLoad = (fs64 == UTIL_FILESIZE_UNKNOWN) ? 0 : fs64;
|
||||
U32 const nbChunks = targetChunkSize ? (U32)((fs64 + (targetChunkSize-1)) / targetChunkSize) : 1;
|
||||
U64 const chunkSize = targetChunkSize ? MIN(targetChunkSize, fs64) : fs64;
|
||||
size_t const maxChunkSize = (size_t)MIN(chunkSize, SAMPLESIZE_MAX);
|
||||
U32 cnb;
|
||||
FILE* const f = fopen(fileName, "rb");
|
||||
if (f==NULL) EXM_THROW(10, "zstd: dictBuilder: %s %s ", fileName, strerror(errno));
|
||||
DISPLAYUPDATE(2, "Loading %s... \r", fileName);
|
||||
for (cnb=0; cnb<nbChunks; cnb++) {
|
||||
size_t const toLoad = (size_t)MIN(maxChunkSize, remainingToLoad);
|
||||
if (toLoad > *bufferSizePtr-pos) break;
|
||||
{ size_t const readSize = fread(buff+pos, 1, toLoad, f);
|
||||
if (readSize != toLoad) EXM_THROW(11, "Pb reading %s", fileName);
|
||||
pos += readSize;
|
||||
sampleSizes[nbLoadedChunks++] = toLoad;
|
||||
remainingToLoad -= targetChunkSize;
|
||||
if (nbLoadedChunks == sstSize) { /* no more space left in sampleSizes table */
|
||||
fileIndex = nbFiles; /* stop there */
|
||||
break;
|
||||
}
|
||||
if (toLoad < targetChunkSize) {
|
||||
fseek(f, (long)(targetChunkSize - toLoad), SEEK_CUR);
|
||||
} } }
|
||||
fclose(f);
|
||||
}
|
||||
DISPLAYLEVEL(2, "\r%79s\r", "");
|
||||
*bufferSizePtr = pos;
|
||||
DISPLAYLEVEL(4, "loaded : %u KB \n", (U32)(pos >> 10))
|
||||
return nbLoadedChunks;
|
||||
}
|
||||
|
||||
#define rotl32(x,r) ((x << r) | (x >> (32 - r)))
|
||||
static U32 getRand(U32* src)
|
||||
{
|
||||
static const U32 prime1 = 2654435761U;
|
||||
static const U32 prime2 = 2246822519U;
|
||||
U32 rand32 = *src;
|
||||
rand32 *= prime1;
|
||||
rand32 ^= prime2;
|
||||
rand32 = rotl32(rand32, 13);
|
||||
*src = rand32;
|
||||
return rand32 >> 5;
|
||||
}
|
||||
|
||||
/* shuffle() :
|
||||
* shuffle a table of file names in a semi-random way
|
||||
* It improves dictionary quality by reducing "locality" impact, so if sample set is very large,
|
||||
* it will load random elements from it, instead of just the first ones. */
|
||||
static void shuffle(const char** fileNamesTable, unsigned nbFiles) {
|
||||
U32 seed = 0xFD2FB528;
|
||||
unsigned i;
|
||||
for (i = nbFiles - 1; i > 0; --i) {
|
||||
unsigned const j = getRand(&seed) % (i + 1);
|
||||
const char* const tmp = fileNamesTable[j];
|
||||
fileNamesTable[j] = fileNamesTable[i];
|
||||
fileNamesTable[i] = tmp;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*-********************************************************
|
||||
* Dictionary training functions
|
||||
**********************************************************/
|
||||
size_t findMaxMem(unsigned long long requiredMem) {
|
||||
size_t const step = 8 MB;
|
||||
void* testmem = NULL;
|
||||
|
||||
requiredMem = (((requiredMem >> 23) + 1) << 23);
|
||||
requiredMem += step;
|
||||
if (requiredMem > g_maxMemory) requiredMem = g_maxMemory;
|
||||
|
||||
while (!testmem) {
|
||||
testmem = malloc((size_t)requiredMem);
|
||||
requiredMem -= step;
|
||||
}
|
||||
|
||||
free(testmem);
|
||||
return (size_t)requiredMem;
|
||||
}
|
||||
|
||||
void saveDict(const char* dictFileName,
|
||||
const void* buff, size_t buffSize) {
|
||||
FILE* const f = fopen(dictFileName, "wb");
|
||||
if (f==NULL) EXM_THROW(3, "cannot open %s ", dictFileName);
|
||||
|
||||
{ size_t const n = fwrite(buff, 1, buffSize, f);
|
||||
if (n!=buffSize) EXM_THROW(4, "%s : write error", dictFileName) }
|
||||
|
||||
{ size_t const n = (size_t)fclose(f);
|
||||
if (n!=0) EXM_THROW(5, "%s : flush error", dictFileName) }
|
||||
}
|
||||
|
||||
/*! getFileStats() :
|
||||
* Given a list of files, and a chunkSize (0 == no chunk, whole files)
|
||||
* provides the amount of data to be loaded and the resulting nb of samples.
|
||||
* This is useful primarily for allocation purpose => sample buffer, and sample sizes table.
|
||||
*/
|
||||
static fileStats getFileStats(const char** fileNamesTable, unsigned nbFiles,
|
||||
size_t chunkSize, unsigned displayLevel) {
|
||||
fileStats fs;
|
||||
unsigned n;
|
||||
memset(&fs, 0, sizeof(fs));
|
||||
for (n=0; n<nbFiles; n++) {
|
||||
U64 const fileSize = UTIL_getFileSize(fileNamesTable[n]);
|
||||
U64 const srcSize = (fileSize == UTIL_FILESIZE_UNKNOWN) ? 0 : fileSize;
|
||||
U32 const nbSamples = (U32)(chunkSize ? (srcSize + (chunkSize-1)) / chunkSize : 1);
|
||||
U64 const chunkToLoad = chunkSize ? MIN(chunkSize, srcSize) : srcSize;
|
||||
size_t const cappedChunkSize = (size_t)MIN(chunkToLoad, SAMPLESIZE_MAX);
|
||||
fs.totalSizeToLoad += cappedChunkSize * nbSamples;
|
||||
fs.oneSampleTooLarge |= (chunkSize > 2*SAMPLESIZE_MAX);
|
||||
fs.nbSamples += nbSamples;
|
||||
}
|
||||
DISPLAYLEVEL(4, "Preparing to load : %u KB \n", (U32)(fs.totalSizeToLoad >> 10));
|
||||
return fs;
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
||||
sampleInfo* getSampleInfo(const char** fileNamesTable, unsigned nbFiles, size_t chunkSize,
|
||||
unsigned maxDictSize, const unsigned displayLevel) {
|
||||
fileStats const fs = getFileStats(fileNamesTable, nbFiles, chunkSize, displayLevel);
|
||||
size_t* const sampleSizes = (size_t*)malloc(fs.nbSamples * sizeof(size_t));
|
||||
size_t const memMult = RANDOM_MEMMULT;
|
||||
size_t const maxMem = findMaxMem(fs.totalSizeToLoad * memMult) / memMult;
|
||||
size_t loadedSize = (size_t) MIN ((unsigned long long)maxMem, fs.totalSizeToLoad);
|
||||
void* const srcBuffer = malloc(loadedSize+NOISELENGTH);
|
||||
|
||||
/* Checks */
|
||||
if ((!sampleSizes) || (!srcBuffer))
|
||||
EXM_THROW(12, "not enough memory for trainFromFiles"); /* should not happen */
|
||||
if (fs.oneSampleTooLarge) {
|
||||
DISPLAYLEVEL(2, "! Warning : some sample(s) are very large \n");
|
||||
DISPLAYLEVEL(2, "! Note that dictionary is only useful for small samples. \n");
|
||||
DISPLAYLEVEL(2, "! As a consequence, only the first %u bytes of each sample are loaded \n", SAMPLESIZE_MAX);
|
||||
}
|
||||
if (fs.nbSamples < 5) {
|
||||
DISPLAYLEVEL(2, "! Warning : nb of samples too low for proper processing ! \n");
|
||||
DISPLAYLEVEL(2, "! Please provide _one file per sample_. \n");
|
||||
DISPLAYLEVEL(2, "! Alternatively, split files into fixed-size blocks representative of samples, with -B# \n");
|
||||
EXM_THROW(14, "nb of samples too low"); /* we now clearly forbid this case */
|
||||
}
|
||||
if (fs.totalSizeToLoad < (unsigned long long)(8 * maxDictSize)) {
|
||||
DISPLAYLEVEL(2, "! Warning : data size of samples too small for target dictionary size \n");
|
||||
DISPLAYLEVEL(2, "! Samples should be about 100x larger than target dictionary size \n");
|
||||
}
|
||||
|
||||
/* init */
|
||||
if (loadedSize < fs.totalSizeToLoad)
|
||||
DISPLAYLEVEL(1, "Not enough memory; training on %u MB only...\n", (unsigned)(loadedSize >> 20));
|
||||
|
||||
/* Load input buffer */
|
||||
DISPLAYLEVEL(3, "Shuffling input files\n");
|
||||
shuffle(fileNamesTable, nbFiles);
|
||||
nbFiles = loadFiles(srcBuffer, &loadedSize, sampleSizes, fs.nbSamples,
|
||||
fileNamesTable, nbFiles, chunkSize, displayLevel);
|
||||
|
||||
sampleInfo *info = (sampleInfo *)malloc(sizeof(sampleInfo));
|
||||
|
||||
info->nbSamples = fs.nbSamples;
|
||||
info->samplesSizes = sampleSizes;
|
||||
info->srcBuffer = srcBuffer;
|
||||
|
||||
return info;
|
||||
}
|
||||
|
||||
|
||||
void freeSampleInfo(sampleInfo *info) {
|
||||
if (!info) return;
|
||||
if (info->samplesSizes) free((void*)(info->samplesSizes));
|
||||
if (info->srcBuffer) free((void*)(info->srcBuffer));
|
||||
free(info);
|
||||
}
|
@ -1,60 +0,0 @@
|
||||
#include <stdio.h> /* fprintf */
|
||||
#include <stdlib.h> /* malloc, free, qsort */
|
||||
#include <string.h> /* strcmp, strlen */
|
||||
#include <errno.h> /* errno */
|
||||
#include <ctype.h>
|
||||
#include "zstd_internal.h" /* includes zstd.h */
|
||||
#include "fileio.h" /* stdinmark, stdoutmark, ZSTD_EXTENSION */
|
||||
#include "platform.h" /* Large Files support */
|
||||
#include "util.h"
|
||||
#include "zdict.h"
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Structs
|
||||
***************************************/
|
||||
typedef struct {
|
||||
U64 totalSizeToLoad;
|
||||
unsigned oneSampleTooLarge;
|
||||
unsigned nbSamples;
|
||||
} fileStats;
|
||||
|
||||
typedef struct {
|
||||
const void* srcBuffer;
|
||||
const size_t *samplesSizes;
|
||||
size_t nbSamples;
|
||||
}sampleInfo;
|
||||
|
||||
|
||||
|
||||
/*! getSampleInfo():
|
||||
* Load from input files and add samples to buffer
|
||||
* @return: a sampleInfo struct containing infomation about buffer where samples are stored,
|
||||
* size of each sample, and total number of samples
|
||||
*/
|
||||
sampleInfo* getSampleInfo(const char** fileNamesTable, unsigned nbFiles, size_t chunkSize,
|
||||
unsigned maxDictSize, const unsigned displayLevel);
|
||||
|
||||
|
||||
|
||||
/*! freeSampleInfo():
|
||||
* Free memory allocated for info
|
||||
*/
|
||||
void freeSampleInfo(sampleInfo *info);
|
||||
|
||||
|
||||
|
||||
/*! saveDict():
|
||||
* Save data stored on buff to dictFileName
|
||||
*/
|
||||
void saveDict(const char* dictFileName, const void* buff, size_t buffSize);
|
||||
|
||||
|
||||
unsigned readU32FromChar(const char** stringPtr);
|
||||
|
||||
/** longCommandWArg() :
|
||||
* check if *stringPtr is the same as longCommand.
|
||||
* If yes, @return 1 and advances *stringPtr to the position which immediately follows longCommand.
|
||||
* @return 0 and doesn't modify *stringPtr otherwise.
|
||||
*/
|
||||
unsigned longCommandWArg(const char** stringPtr, const char* longCommand);
|
@ -1,161 +0,0 @@
|
||||
#include <stdio.h> /* fprintf */
|
||||
#include <stdlib.h> /* malloc, free, qsort */
|
||||
#include <string.h> /* strcmp, strlen */
|
||||
#include <errno.h> /* errno */
|
||||
#include <ctype.h>
|
||||
#include "random.h"
|
||||
#include "io.h"
|
||||
#include "util.h"
|
||||
#include "zdict.h"
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Console display
|
||||
***************************************/
|
||||
#define DISPLAY(...) fprintf(stderr, __VA_ARGS__)
|
||||
#define DISPLAYLEVEL(l, ...) if (displayLevel>=l) { DISPLAY(__VA_ARGS__); }
|
||||
|
||||
static const U64 g_refreshRate = SEC_TO_MICRO / 6;
|
||||
static UTIL_time_t g_displayClock = UTIL_TIME_INITIALIZER;
|
||||
|
||||
#define DISPLAYUPDATE(l, ...) { if (displayLevel>=l) { \
|
||||
if ((UTIL_clockSpanMicro(g_displayClock) > g_refreshRate) || (displayLevel>=4)) \
|
||||
{ g_displayClock = UTIL_getTime(); DISPLAY(__VA_ARGS__); \
|
||||
if (displayLevel>=4) fflush(stderr); } } }
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Exceptions
|
||||
***************************************/
|
||||
#ifndef DEBUG
|
||||
# define DEBUG 0
|
||||
#endif
|
||||
#define DEBUGOUTPUT(...) if (DEBUG) DISPLAY(__VA_ARGS__);
|
||||
#define EXM_THROW(error, ...) \
|
||||
{ \
|
||||
DEBUGOUTPUT("Error defined at %s, line %i : \n", __FILE__, __LINE__); \
|
||||
DISPLAY("Error %i : ", error); \
|
||||
DISPLAY(__VA_ARGS__); \
|
||||
DISPLAY("\n"); \
|
||||
exit(error); \
|
||||
}
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* Constants
|
||||
***************************************/
|
||||
static const unsigned g_defaultMaxDictSize = 110 KB;
|
||||
#define DEFAULT_CLEVEL 3
|
||||
#define DEFAULT_k 200
|
||||
#define DEFAULT_OUTPUTFILE "defaultDict"
|
||||
#define DEFAULT_DICTID 0
|
||||
|
||||
|
||||
|
||||
/*-*************************************
|
||||
* RANDOM
|
||||
***************************************/
|
||||
int RANDOM_trainFromFiles(const char* dictFileName, sampleInfo *info,
|
||||
unsigned maxDictSize,
|
||||
ZDICT_random_params_t *params) {
|
||||
unsigned const displayLevel = params->zParams.notificationLevel;
|
||||
void* const dictBuffer = malloc(maxDictSize);
|
||||
|
||||
int result = 0;
|
||||
|
||||
/* Checks */
|
||||
if (!dictBuffer)
|
||||
EXM_THROW(12, "not enough memory for trainFromFiles"); /* should not happen */
|
||||
|
||||
{ size_t dictSize;
|
||||
dictSize = ZDICT_trainFromBuffer_random(dictBuffer, maxDictSize, info->srcBuffer,
|
||||
info->samplesSizes, info->nbSamples, *params);
|
||||
DISPLAYLEVEL(2, "k=%u\n", params->k);
|
||||
if (ZDICT_isError(dictSize)) {
|
||||
DISPLAYLEVEL(1, "dictionary training failed : %s \n", ZDICT_getErrorName(dictSize)); /* should not happen */
|
||||
result = 1;
|
||||
goto _done;
|
||||
}
|
||||
/* save dict */
|
||||
DISPLAYLEVEL(2, "Save dictionary of size %u into file %s \n", (U32)dictSize, dictFileName);
|
||||
saveDict(dictFileName, dictBuffer, dictSize);
|
||||
}
|
||||
|
||||
/* clean up */
|
||||
_done:
|
||||
free(dictBuffer);
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
|
||||
int main(int argCount, const char* argv[])
|
||||
{
|
||||
int displayLevel = 2;
|
||||
const char* programName = argv[0];
|
||||
int operationResult = 0;
|
||||
|
||||
/* Initialize arguments to default values */
|
||||
unsigned k = DEFAULT_k;
|
||||
const char* outputFile = DEFAULT_OUTPUTFILE;
|
||||
unsigned dictID = DEFAULT_DICTID;
|
||||
unsigned maxDictSize = g_defaultMaxDictSize;
|
||||
|
||||
/* Initialize table to store input files */
|
||||
const char** filenameTable = (const char**)malloc(argCount * sizeof(const char*));
|
||||
unsigned filenameIdx = 0;
|
||||
|
||||
/* Parse arguments */
|
||||
for (int i = 1; i < argCount; i++) {
|
||||
const char* argument = argv[i];
|
||||
if (longCommandWArg(&argument, "k=")) { k = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "dictID=")) { dictID = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "maxdict=")) { maxDictSize = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "in=")) {
|
||||
filenameTable[filenameIdx] = argument;
|
||||
filenameIdx++;
|
||||
continue;
|
||||
}
|
||||
if (longCommandWArg(&argument, "out=")) {
|
||||
outputFile = argument;
|
||||
continue;
|
||||
}
|
||||
DISPLAYLEVEL(1, "Incorrect parameters\n");
|
||||
operationResult = 1;
|
||||
return operationResult;
|
||||
}
|
||||
|
||||
char* fileNamesBuf = NULL;
|
||||
unsigned fileNamesNb = filenameIdx;
|
||||
int followLinks = 0; /* follow directory recursively */
|
||||
const char** extendedFileList = NULL;
|
||||
extendedFileList = UTIL_createFileList(filenameTable, filenameIdx, &fileNamesBuf,
|
||||
&fileNamesNb, followLinks);
|
||||
if (extendedFileList) {
|
||||
unsigned u;
|
||||
for (u=0; u<fileNamesNb; u++) DISPLAYLEVEL(4, "%u %s\n", u, extendedFileList[u]);
|
||||
free((void*)filenameTable);
|
||||
filenameTable = extendedFileList;
|
||||
filenameIdx = fileNamesNb;
|
||||
}
|
||||
|
||||
size_t blockSize = 0;
|
||||
|
||||
ZDICT_random_params_t params;
|
||||
ZDICT_params_t zParams;
|
||||
zParams.compressionLevel = DEFAULT_CLEVEL;
|
||||
zParams.notificationLevel = displayLevel;
|
||||
zParams.dictID = dictID;
|
||||
params.zParams = zParams;
|
||||
params.k = k;
|
||||
|
||||
sampleInfo* info = getSampleInfo(filenameTable,
|
||||
filenameIdx, blockSize, maxDictSize, zParams.notificationLevel);
|
||||
operationResult = RANDOM_trainFromFiles(outputFile, info, maxDictSize, ¶ms);
|
||||
|
||||
/* Free allocated memory */
|
||||
UTIL_freeFileList(extendedFileList, fileNamesBuf);
|
||||
freeSampleInfo(info);
|
||||
|
||||
return operationResult;
|
||||
}
|
@ -1,163 +0,0 @@
|
||||
/*-*************************************
|
||||
* Dependencies
|
||||
***************************************/
|
||||
#include <stdio.h> /* fprintf */
|
||||
#include <stdlib.h> /* malloc, free, qsort */
|
||||
#include <string.h> /* memset */
|
||||
#include <time.h> /* clock */
|
||||
#include "random.h"
|
||||
#include "util.h" /* UTIL_getFileSize, UTIL_getTotalFileSize */
|
||||
#ifndef ZDICT_STATIC_LINKING_ONLY
|
||||
#define ZDICT_STATIC_LINKING_ONLY
|
||||
#endif
|
||||
#include "zdict.h"
|
||||
|
||||
/*-*************************************
|
||||
* Console display
|
||||
***************************************/
|
||||
#define DISPLAY(...) fprintf(stderr, __VA_ARGS__)
|
||||
#define DISPLAYLEVEL(l, ...) if (displayLevel>=l) { DISPLAY(__VA_ARGS__); }
|
||||
|
||||
#define LOCALDISPLAYUPDATE(displayLevel, l, ...) \
|
||||
if (displayLevel >= l) { \
|
||||
if ((clock() - g_time > refreshRate) || (displayLevel >= 4)) { \
|
||||
g_time = clock(); \
|
||||
DISPLAY(__VA_ARGS__); \
|
||||
} \
|
||||
}
|
||||
#define DISPLAYUPDATE(l, ...) LOCALDISPLAYUPDATE(displayLevel, l, __VA_ARGS__)
|
||||
static const clock_t refreshRate = CLOCKS_PER_SEC * 15 / 100;
|
||||
static clock_t g_time = 0;
|
||||
|
||||
|
||||
|
||||
/* ********************************************************
|
||||
* Random Dictionary Builder
|
||||
**********************************************************/
|
||||
/**
|
||||
* Returns the sum of the sample sizes.
|
||||
*/
|
||||
static size_t RANDOM_sum(const size_t *samplesSizes, unsigned nbSamples) {
|
||||
size_t sum = 0;
|
||||
unsigned i;
|
||||
for (i = 0; i < nbSamples; ++i) {
|
||||
sum += samplesSizes[i];
|
||||
}
|
||||
return sum;
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* A segment is an inclusive range in the source.
|
||||
*/
|
||||
typedef struct {
|
||||
U32 begin;
|
||||
U32 end;
|
||||
} RANDOM_segment_t;
|
||||
|
||||
|
||||
/**
|
||||
* Selects a random segment from totalSamplesSize - k + 1 possible segments
|
||||
*/
|
||||
static RANDOM_segment_t RANDOM_selectSegment(const size_t totalSamplesSize,
|
||||
ZDICT_random_params_t parameters) {
|
||||
const U32 k = parameters.k;
|
||||
RANDOM_segment_t segment;
|
||||
unsigned index;
|
||||
|
||||
/* Randomly generate a number from 0 to sampleSizes - k */
|
||||
index = rand()%(totalSamplesSize - k + 1);
|
||||
|
||||
/* inclusive */
|
||||
segment.begin = index;
|
||||
segment.end = index + k - 1;
|
||||
|
||||
return segment;
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* Check the validity of the parameters.
|
||||
* Returns non-zero if the parameters are valid and 0 otherwise.
|
||||
*/
|
||||
static int RANDOM_checkParameters(ZDICT_random_params_t parameters,
|
||||
size_t maxDictSize) {
|
||||
/* k is a required parameter */
|
||||
if (parameters.k == 0) {
|
||||
return 0;
|
||||
}
|
||||
/* k <= maxDictSize */
|
||||
if (parameters.k > maxDictSize) {
|
||||
return 0;
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* Given the prepared context build the dictionary.
|
||||
*/
|
||||
static size_t RANDOM_buildDictionary(const size_t totalSamplesSize, const BYTE *samples,
|
||||
void *dictBuffer, size_t dictBufferCapacity,
|
||||
ZDICT_random_params_t parameters) {
|
||||
BYTE *const dict = (BYTE *)dictBuffer;
|
||||
size_t tail = dictBufferCapacity;
|
||||
const int displayLevel = parameters.zParams.notificationLevel;
|
||||
while (tail > 0) {
|
||||
|
||||
/* Select a segment */
|
||||
RANDOM_segment_t segment = RANDOM_selectSegment(totalSamplesSize, parameters);
|
||||
|
||||
size_t segmentSize;
|
||||
segmentSize = MIN(segment.end - segment.begin + 1, tail);
|
||||
|
||||
tail -= segmentSize;
|
||||
memcpy(dict + tail, samples + segment.begin, segmentSize);
|
||||
DISPLAYUPDATE(
|
||||
2, "\r%u%% ",
|
||||
(U32)(((dictBufferCapacity - tail) * 100) / dictBufferCapacity));
|
||||
}
|
||||
|
||||
return tail;
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
||||
ZDICTLIB_API size_t ZDICT_trainFromBuffer_random(
|
||||
void *dictBuffer, size_t dictBufferCapacity,
|
||||
const void *samplesBuffer, const size_t *samplesSizes, unsigned nbSamples,
|
||||
ZDICT_random_params_t parameters) {
|
||||
const int displayLevel = parameters.zParams.notificationLevel;
|
||||
BYTE* const dict = (BYTE*)dictBuffer;
|
||||
/* Checks */
|
||||
if (!RANDOM_checkParameters(parameters, dictBufferCapacity)) {
|
||||
DISPLAYLEVEL(1, "k is incorrect\n");
|
||||
return ERROR(GENERIC);
|
||||
}
|
||||
if (nbSamples == 0) {
|
||||
DISPLAYLEVEL(1, "Random must have at least one input file\n");
|
||||
return ERROR(GENERIC);
|
||||
}
|
||||
if (dictBufferCapacity < ZDICT_DICTSIZE_MIN) {
|
||||
DISPLAYLEVEL(1, "dictBufferCapacity must be at least %u\n",
|
||||
ZDICT_DICTSIZE_MIN);
|
||||
return ERROR(dstSize_tooSmall);
|
||||
}
|
||||
const size_t totalSamplesSize = RANDOM_sum(samplesSizes, nbSamples);
|
||||
const BYTE *const samples = (const BYTE *)samplesBuffer;
|
||||
|
||||
DISPLAYLEVEL(2, "Building dictionary\n");
|
||||
{
|
||||
const size_t tail = RANDOM_buildDictionary(totalSamplesSize, samples,
|
||||
dictBuffer, dictBufferCapacity, parameters);
|
||||
const size_t dictSize = ZDICT_finalizeDictionary(
|
||||
dict, dictBufferCapacity, dict + tail, dictBufferCapacity - tail,
|
||||
samplesBuffer, samplesSizes, nbSamples, parameters.zParams);
|
||||
if (!ZSTD_isError(dictSize)) {
|
||||
DISPLAYLEVEL(2, "Constructed dictionary of size %u\n",
|
||||
(U32)dictSize);
|
||||
}
|
||||
return dictSize;
|
||||
}
|
||||
}
|
@ -1,29 +0,0 @@
|
||||
#include <stdio.h> /* fprintf */
|
||||
#include <stdlib.h> /* malloc, free, qsort */
|
||||
#include <string.h> /* memset */
|
||||
#include <time.h> /* clock */
|
||||
#include "zstd_internal.h" /* includes zstd.h */
|
||||
#ifndef ZDICT_STATIC_LINKING_ONLY
|
||||
#define ZDICT_STATIC_LINKING_ONLY
|
||||
#endif
|
||||
#include "zdict.h"
|
||||
|
||||
|
||||
|
||||
typedef struct {
|
||||
unsigned k; /* Segment size : constraint: 0 < k : Reasonable range [16, 2048+]; Default to 200 */
|
||||
ZDICT_params_t zParams;
|
||||
} ZDICT_random_params_t;
|
||||
|
||||
|
||||
/*! ZDICT_trainFromBuffer_random():
|
||||
* Train a dictionary from an array of samples.
|
||||
* Samples must be stored concatenated in a single flat buffer `samplesBuffer`,
|
||||
* supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order.
|
||||
* The resulting dictionary will be saved into `dictBuffer`.
|
||||
* @return: size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`)
|
||||
* or an error code, which can be tested with ZDICT_isError().
|
||||
*/
|
||||
ZDICTLIB_API size_t ZDICT_trainFromBuffer_random( void *dictBuffer, size_t dictBufferCapacity,
|
||||
const void *samplesBuffer, const size_t *samplesSizes, unsigned nbSamples,
|
||||
ZDICT_random_params_t parameters);
|
@ -1,14 +0,0 @@
|
||||
echo "Building random dictionary with in=../../lib/common k=200 out=dict1"
|
||||
./main in=../../../lib/common k=200 out=dict1
|
||||
zstd -be3 -D dict1 -r ../../../lib/common -q
|
||||
echo "Building random dictionary with in=../../lib/common k=500 out=dict2 dictID=100 maxdict=140000"
|
||||
./main in=../../../lib/common k=500 out=dict2 dictID=100 maxdict=140000
|
||||
zstd -be3 -D dict2 -r ../../../lib/common -q
|
||||
echo "Building random dictionary with 2 sample sources"
|
||||
./main in=../../../lib/common in=../../../lib/compress out=dict3
|
||||
zstd -be3 -D dict3 -r ../../../lib/common -q
|
||||
echo "Removing dict1 dict2 dict3"
|
||||
rm -f dict1 dict2 dict3
|
||||
|
||||
echo "Testing with invalid parameters, should fail"
|
||||
! ./main r=10
|
@ -1,51 +0,0 @@
|
||||
# ################################################################
|
||||
# Copyright (c) 2016-present, Facebook, Inc.
|
||||
# All rights reserved.
|
||||
#
|
||||
# This source code is licensed under both the BSD-style license (found in the
|
||||
# LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
# in the COPYING file in the root directory of this source tree).
|
||||
# ################################################################
|
||||
|
||||
CXXFLAGS ?= -O3
|
||||
CXXFLAGS += -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wstrict-aliasing=1 -Wswitch-enum -Wno-comment
|
||||
CXXFLAGS += $(MOREFLAGS)
|
||||
FLAGS = $(CPPFLAGS) $(CXXFLAGS) $(LDFLAGS)
|
||||
|
||||
ZSTDAPI = ../../lib/zstd.h
|
||||
ZSTDMANUAL = ../../doc/zstd_manual.html
|
||||
LIBVER_MAJOR_SCRIPT:=`sed -n '/define ZSTD_VERSION_MAJOR/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < $(ZSTDAPI)`
|
||||
LIBVER_MINOR_SCRIPT:=`sed -n '/define ZSTD_VERSION_MINOR/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < $(ZSTDAPI)`
|
||||
LIBVER_PATCH_SCRIPT:=`sed -n '/define ZSTD_VERSION_RELEASE/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < $(ZSTDAPI)`
|
||||
LIBVER_SCRIPT:= $(LIBVER_MAJOR_SCRIPT).$(LIBVER_MINOR_SCRIPT).$(LIBVER_PATCH_SCRIPT)
|
||||
LIBVER := $(shell echo $(LIBVER_SCRIPT))
|
||||
|
||||
|
||||
# Define *.exe as extension for Windows systems
|
||||
ifneq (,$(filter Windows%,$(OS)))
|
||||
EXT =.exe
|
||||
else
|
||||
EXT =
|
||||
endif
|
||||
|
||||
|
||||
.PHONY: default
|
||||
default: gen_html
|
||||
|
||||
.PHONY: all
|
||||
all: manual
|
||||
|
||||
gen_html: gen_html.cpp
|
||||
$(CXX) $(FLAGS) $^ -o $@$(EXT)
|
||||
|
||||
$(ZSTDMANUAL): gen_html $(ZSTDAPI)
|
||||
echo "Update zstd manual in /doc"
|
||||
./gen_html $(LIBVER) $(ZSTDAPI) $(ZSTDMANUAL)
|
||||
|
||||
.PHONY: manual
|
||||
manual: gen_html $(ZSTDMANUAL)
|
||||
|
||||
.PHONY: clean
|
||||
clean:
|
||||
@$(RM) gen_html$(EXT)
|
||||
@echo Cleaning completed
|
@ -1,31 +0,0 @@
|
||||
gen_html - a program for automatic generation of zstd manual
|
||||
============================================================
|
||||
|
||||
#### Introduction
|
||||
|
||||
This simple C++ program generates a single-page HTML manual from `zstd.h`.
|
||||
|
||||
The format of recognized comment blocks is following:
|
||||
- comments of type `/*!` mean: this is a function declaration; switch comments with declarations
|
||||
- comments of type `/**` and `/*-` mean: this is a comment; use a `<H2>` header for the first line
|
||||
- comments of type `/*=` and `/**=` mean: use a `<H3>` header and show also all functions until first empty line
|
||||
- comments of type `/*X` where `X` is different from above-mentioned are ignored
|
||||
|
||||
Moreover:
|
||||
- `ZSTDLIB_API` is removed to improve readability
|
||||
- `typedef` are detected and included even if uncommented
|
||||
- comments of type `/**<` and `/*!<` are detected and only function declaration is highlighted (bold)
|
||||
|
||||
|
||||
#### Usage
|
||||
|
||||
The program requires 3 parameters:
|
||||
```
|
||||
gen_html [zstd_version] [input_file] [output_html]
|
||||
```
|
||||
|
||||
To compile program and generate zstd manual we have used:
|
||||
```
|
||||
make
|
||||
./gen_html.exe 1.1.1 ../../lib/zstd.h zstd_manual.html
|
||||
```
|
@ -1,9 +0,0 @@
|
||||
#!/bin/sh
|
||||
|
||||
LIBVER_MAJOR_SCRIPT=`sed -n '/define ZSTD_VERSION_MAJOR/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < ../../lib/zstd.h`
|
||||
LIBVER_MINOR_SCRIPT=`sed -n '/define ZSTD_VERSION_MINOR/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < ../../lib/zstd.h`
|
||||
LIBVER_PATCH_SCRIPT=`sed -n '/define ZSTD_VERSION_RELEASE/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < ../../lib/zstd.h`
|
||||
LIBVER_SCRIPT=$LIBVER_MAJOR_SCRIPT.$LIBVER_MINOR_SCRIPT.$LIBVER_PATCH_SCRIPT
|
||||
|
||||
echo ZSTD_VERSION=$LIBVER_SCRIPT
|
||||
./gen_html $LIBVER_SCRIPT ../../lib/zstd.h ./zstd_manual.html
|
@ -1,224 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Przemyslaw Skibinski, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
|
||||
#include <iostream>
|
||||
#include <fstream>
|
||||
#include <sstream>
|
||||
#include <vector>
|
||||
using namespace std;
|
||||
|
||||
|
||||
/* trim string at the beginning and at the end */
|
||||
void trim(string& s, string characters)
|
||||
{
|
||||
size_t p = s.find_first_not_of(characters);
|
||||
s.erase(0, p);
|
||||
|
||||
p = s.find_last_not_of(characters);
|
||||
if (string::npos != p)
|
||||
s.erase(p+1);
|
||||
}
|
||||
|
||||
|
||||
/* trim C++ style comments */
|
||||
void trim_comments(string &s)
|
||||
{
|
||||
size_t spos, epos;
|
||||
|
||||
spos = s.find("/*");
|
||||
epos = s.find("*/");
|
||||
s = s.substr(spos+3, epos-(spos+3));
|
||||
}
|
||||
|
||||
|
||||
/* get lines until a given terminator */
|
||||
vector<string> get_lines(vector<string>& input, int& linenum, string terminator)
|
||||
{
|
||||
vector<string> out;
|
||||
string line;
|
||||
size_t epos;
|
||||
|
||||
while ((size_t)linenum < input.size()) {
|
||||
line = input[linenum];
|
||||
|
||||
if (terminator.empty() && line.empty()) { linenum--; break; }
|
||||
|
||||
epos = line.find(terminator);
|
||||
if (!terminator.empty() && epos!=string::npos) {
|
||||
out.push_back(line);
|
||||
break;
|
||||
}
|
||||
out.push_back(line);
|
||||
linenum++;
|
||||
}
|
||||
return out;
|
||||
}
|
||||
|
||||
|
||||
/* print line with ZSTDLIB_API removed and C++ comments not bold */
|
||||
void print_line(stringstream &sout, string line)
|
||||
{
|
||||
size_t spos;
|
||||
|
||||
if (line.substr(0,12) == "ZSTDLIB_API ") line = line.substr(12);
|
||||
spos = line.find("/*");
|
||||
if (spos!=string::npos) {
|
||||
sout << line.substr(0, spos);
|
||||
sout << "</b>" << line.substr(spos) << "<b>" << endl;
|
||||
} else {
|
||||
// fprintf(stderr, "lines=%s\n", line.c_str());
|
||||
sout << line << endl;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
int main(int argc, char *argv[]) {
|
||||
char exclam;
|
||||
int linenum, chapter = 1;
|
||||
vector<string> input, lines, comments, chapters;
|
||||
string line, version;
|
||||
size_t spos, l;
|
||||
stringstream sout;
|
||||
ifstream istream;
|
||||
ofstream ostream;
|
||||
|
||||
if (argc < 4) {
|
||||
cout << "usage: " << argv[0] << " [zstd_version] [input_file] [output_html]" << endl;
|
||||
return 1;
|
||||
}
|
||||
|
||||
version = "zstd " + string(argv[1]) + " Manual";
|
||||
|
||||
istream.open(argv[2], ifstream::in);
|
||||
if (!istream.is_open()) {
|
||||
cout << "Error opening file " << argv[2] << endl;
|
||||
return 1;
|
||||
}
|
||||
|
||||
ostream.open(argv[3], ifstream::out);
|
||||
if (!ostream.is_open()) {
|
||||
cout << "Error opening file " << argv[3] << endl;
|
||||
return 1;
|
||||
}
|
||||
|
||||
while (getline(istream, line)) {
|
||||
input.push_back(line);
|
||||
}
|
||||
|
||||
for (linenum=0; (size_t)linenum < input.size(); linenum++) {
|
||||
line = input[linenum];
|
||||
|
||||
/* typedefs are detected and included even if uncommented */
|
||||
if (line.substr(0,7) == "typedef" && line.find("{")!=string::npos) {
|
||||
lines = get_lines(input, linenum, "}");
|
||||
sout << "<pre><b>";
|
||||
for (l=0; l<lines.size(); l++) {
|
||||
print_line(sout, lines[l]);
|
||||
}
|
||||
sout << "</b></pre><BR>" << endl;
|
||||
continue;
|
||||
}
|
||||
|
||||
/* comments of type /**< and /*!< are detected and only function declaration is highlighted (bold) */
|
||||
if ((line.find("/**<")!=string::npos || line.find("/*!<")!=string::npos) && line.find("*/")!=string::npos) {
|
||||
sout << "<pre><b>";
|
||||
print_line(sout, line);
|
||||
sout << "</b></pre><BR>" << endl;
|
||||
continue;
|
||||
}
|
||||
|
||||
spos = line.find("/**=");
|
||||
if (spos==string::npos) {
|
||||
spos = line.find("/*!");
|
||||
if (spos==string::npos)
|
||||
spos = line.find("/**");
|
||||
if (spos==string::npos)
|
||||
spos = line.find("/*-");
|
||||
if (spos==string::npos)
|
||||
spos = line.find("/*=");
|
||||
if (spos==string::npos)
|
||||
continue;
|
||||
exclam = line[spos+2];
|
||||
}
|
||||
else exclam = '=';
|
||||
|
||||
comments = get_lines(input, linenum, "*/");
|
||||
if (!comments.empty()) comments[0] = line.substr(spos+3);
|
||||
if (!comments.empty()) comments[comments.size()-1] = comments[comments.size()-1].substr(0, comments[comments.size()-1].find("*/"));
|
||||
for (l=0; l<comments.size(); l++) {
|
||||
if (comments[l].find(" *")==0) comments[l] = comments[l].substr(2);
|
||||
else if (comments[l].find(" *")==0) comments[l] = comments[l].substr(3);
|
||||
trim(comments[l], "*-=");
|
||||
}
|
||||
while (!comments.empty() && comments[comments.size()-1].empty()) comments.pop_back(); // remove empty line at the end
|
||||
while (!comments.empty() && comments[0].empty()) comments.erase(comments.begin()); // remove empty line at the start
|
||||
|
||||
/* comments of type /*! mean: this is a function declaration; switch comments with declarations */
|
||||
if (exclam == '!') {
|
||||
if (!comments.empty()) comments.erase(comments.begin()); /* remove first line like "ZSTD_XXX() :" */
|
||||
linenum++;
|
||||
lines = get_lines(input, linenum, "");
|
||||
|
||||
sout << "<pre><b>";
|
||||
for (l=0; l<lines.size(); l++) {
|
||||
// fprintf(stderr, "line[%d]=%s\n", l, lines[l].c_str());
|
||||
string fline = lines[l];
|
||||
if (fline.substr(0, 12) == "ZSTDLIB_API " ||
|
||||
fline.substr(0, 12) == string(12, ' '))
|
||||
fline = fline.substr(12);
|
||||
print_line(sout, fline);
|
||||
}
|
||||
sout << "</b><p>";
|
||||
for (l=0; l<comments.size(); l++) {
|
||||
print_line(sout, comments[l]);
|
||||
}
|
||||
sout << "</p></pre><BR>" << endl << endl;
|
||||
} else if (exclam == '=') { /* comments of type /*= and /**= mean: use a <H3> header and show also all functions until first empty line */
|
||||
trim(comments[0], " ");
|
||||
sout << "<h3>" << comments[0] << "</h3><pre>";
|
||||
for (l=1; l<comments.size(); l++) {
|
||||
print_line(sout, comments[l]);
|
||||
}
|
||||
sout << "</pre><b><pre>";
|
||||
lines = get_lines(input, ++linenum, "");
|
||||
for (l=0; l<lines.size(); l++) {
|
||||
print_line(sout, lines[l]);
|
||||
}
|
||||
sout << "</pre></b><BR>" << endl;
|
||||
} else { /* comments of type /** and /*- mean: this is a comment; use a <H2> header for the first line */
|
||||
if (comments.empty()) continue;
|
||||
|
||||
trim(comments[0], " ");
|
||||
sout << "<a name=\"Chapter" << chapter << "\"></a><h2>" << comments[0] << "</h2><pre>";
|
||||
chapters.push_back(comments[0]);
|
||||
chapter++;
|
||||
|
||||
for (l=1; l<comments.size(); l++) {
|
||||
print_line(sout, comments[l]);
|
||||
}
|
||||
if (comments.size() > 1)
|
||||
sout << "<BR></pre>" << endl << endl;
|
||||
else
|
||||
sout << "</pre>" << endl << endl;
|
||||
}
|
||||
}
|
||||
|
||||
ostream << "<html>\n<head>\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=ISO-8859-1\">\n<title>" << version << "</title>\n</head>\n<body>" << endl;
|
||||
ostream << "<h1>" << version << "</h1>\n";
|
||||
|
||||
ostream << "<hr>\n<a name=\"Contents\"></a><h2>Contents</h2>\n<ol>\n";
|
||||
for (size_t i=0; i<chapters.size(); i++)
|
||||
ostream << "<li><a href=\"#Chapter" << i+1 << "\">" << chapters[i].c_str() << "</a></li>\n";
|
||||
ostream << "</ol>\n<hr>\n";
|
||||
|
||||
ostream << sout.str();
|
||||
ostream << "</html>" << endl << "</body>" << endl;
|
||||
|
||||
return 0;
|
||||
}
|
@ -1,58 +0,0 @@
|
||||
# ################################################################
|
||||
# Copyright (c) 2018-present, Yann Collet, Facebook, Inc.
|
||||
# All rights reserved.
|
||||
#
|
||||
# This source code is licensed under both the BSD-style license (found in the
|
||||
# LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
# in the COPYING file in the root directory of this source tree).
|
||||
# ################################################################
|
||||
|
||||
PROGDIR = ../../programs
|
||||
LIBDIR = ../../lib
|
||||
|
||||
LIBZSTD = $(LIBDIR)/libzstd.a
|
||||
|
||||
CPPFLAGS+= -I$(LIBDIR) -I$(LIBDIR)/common -I$(LIBDIR)/dictBuilder -I$(PROGDIR)
|
||||
|
||||
CFLAGS ?= -O3
|
||||
CFLAGS += -std=gnu99
|
||||
DEBUGFLAGS= -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow \
|
||||
-Wstrict-aliasing=1 -Wswitch-enum \
|
||||
-Wstrict-prototypes -Wundef -Wpointer-arith \
|
||||
-Wvla -Wformat=2 -Winit-self -Wfloat-equal -Wwrite-strings \
|
||||
-Wredundant-decls
|
||||
CFLAGS += $(DEBUGFLAGS) $(MOREFLAGS)
|
||||
|
||||
|
||||
default: largeNbDicts
|
||||
|
||||
all : largeNbDicts
|
||||
|
||||
largeNbDicts: util.o timefn.o benchfn.o datagen.o xxhash.o largeNbDicts.c $(LIBZSTD)
|
||||
$(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@
|
||||
|
||||
.PHONY: $(LIBZSTD)
|
||||
$(LIBZSTD):
|
||||
$(MAKE) -C $(LIBDIR) libzstd.a CFLAGS="$(CFLAGS)"
|
||||
|
||||
benchfn.o: $(PROGDIR)/benchfn.c
|
||||
$(CC) $(CPPFLAGS) $(CFLAGS) $^ -c
|
||||
|
||||
timefn.o: $(PROGDIR)/timefn.c
|
||||
$(CC) $(CPPFLAGS) $(CFLAGS) $^ -c
|
||||
|
||||
datagen.o: $(PROGDIR)/datagen.c
|
||||
$(CC) $(CPPFLAGS) $(CFLAGS) $^ -c
|
||||
|
||||
util.o: $(PROGDIR)/util.c
|
||||
$(CC) $(CPPFLAGS) $(CFLAGS) $^ -c
|
||||
|
||||
|
||||
xxhash.o : $(LIBDIR)/common/xxhash.c
|
||||
$(CC) $(CPPFLAGS) $(CFLAGS) $^ -c
|
||||
|
||||
|
||||
clean:
|
||||
$(RM) *.o
|
||||
$(MAKE) -C $(LIBDIR) clean > /dev/null
|
||||
$(RM) largeNbDicts
|
@ -1,25 +0,0 @@
|
||||
largeNbDicts
|
||||
=====================
|
||||
|
||||
`largeNbDicts` is a benchmark test tool
|
||||
dedicated to the specific scenario of
|
||||
dictionary decompression using a very large number of dictionaries.
|
||||
When dictionaries are constantly changing, they are always "cold",
|
||||
suffering from increased latency due to cache misses.
|
||||
|
||||
The tool is created in a bid to investigate performance for this scenario,
|
||||
and experiment mitigation techniques.
|
||||
|
||||
Command line :
|
||||
```
|
||||
largeNbDicts [Options] filename(s)
|
||||
|
||||
Options :
|
||||
-r : recursively load all files in subdirectories (default: off)
|
||||
-B# : split input into blocks of size # (default: no split)
|
||||
-# : use compression level # (default: 3)
|
||||
-D # : use # as a dictionary (default: create one)
|
||||
-i# : nb benchmark rounds (default: 6)
|
||||
--nbDicts=# : set nb of dictionaries to # (default: one per block)
|
||||
-h : help (this text)
|
||||
```
|
@ -1,817 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2018-present, Yann Collet, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
* You may select, at your option, one of the above-listed licenses.
|
||||
*/
|
||||
|
||||
/* largeNbDicts
|
||||
* This is a benchmark test tool
|
||||
* dedicated to the specific case of dictionary decompression
|
||||
* using a very large nb of dictionaries
|
||||
* thus suffering latency from lots of cache misses.
|
||||
* It's created in a bid to investigate performance and find optimizations. */
|
||||
|
||||
|
||||
/*--- Dependencies ---*/
|
||||
|
||||
#include <stddef.h> /* size_t */
|
||||
#include <stdlib.h> /* malloc, free, abort */
|
||||
#include <stdio.h> /* fprintf */
|
||||
#include <limits.h> /* UINT_MAX */
|
||||
#include <assert.h> /* assert */
|
||||
|
||||
#include "util.h"
|
||||
#include "benchfn.h"
|
||||
#define ZSTD_STATIC_LINKING_ONLY
|
||||
#include "zstd.h"
|
||||
#include "zdict.h"
|
||||
|
||||
|
||||
/*--- Constants --- */
|
||||
|
||||
#define KB *(1<<10)
|
||||
#define MB *(1<<20)
|
||||
|
||||
#define BLOCKSIZE_DEFAULT 0 /* no slicing into blocks */
|
||||
#define DICTSIZE (4 KB)
|
||||
#define CLEVEL_DEFAULT 3
|
||||
|
||||
#define BENCH_TIME_DEFAULT_S 6
|
||||
#define RUN_TIME_DEFAULT_MS 1000
|
||||
#define BENCH_TIME_DEFAULT_MS (BENCH_TIME_DEFAULT_S * RUN_TIME_DEFAULT_MS)
|
||||
|
||||
#define DISPLAY_LEVEL_DEFAULT 3
|
||||
|
||||
#define BENCH_SIZE_MAX (1200 MB)
|
||||
|
||||
|
||||
/*--- Macros ---*/
|
||||
|
||||
#define CONTROL(c) { if (!(c)) abort(); }
|
||||
#undef MIN
|
||||
#define MIN(a,b) ((a) < (b) ? (a) : (b))
|
||||
|
||||
|
||||
/*--- Display Macros ---*/
|
||||
|
||||
#define DISPLAY(...) fprintf(stdout, __VA_ARGS__)
|
||||
#define DISPLAYLEVEL(l, ...) { if (g_displayLevel>=l) { DISPLAY(__VA_ARGS__); } }
|
||||
static int g_displayLevel = DISPLAY_LEVEL_DEFAULT; /* 0 : no display, 1: errors, 2 : + result + interaction + warnings, 3 : + progression, 4 : + information */
|
||||
|
||||
|
||||
/*--- buffer_t ---*/
|
||||
|
||||
typedef struct {
|
||||
void* ptr;
|
||||
size_t size;
|
||||
size_t capacity;
|
||||
} buffer_t;
|
||||
|
||||
static const buffer_t kBuffNull = { NULL, 0, 0 };
|
||||
|
||||
/* @return : kBuffNull if any error */
|
||||
static buffer_t createBuffer(size_t capacity)
|
||||
{
|
||||
assert(capacity > 0);
|
||||
void* const ptr = malloc(capacity);
|
||||
if (ptr==NULL) return kBuffNull;
|
||||
|
||||
buffer_t buffer;
|
||||
buffer.ptr = ptr;
|
||||
buffer.capacity = capacity;
|
||||
buffer.size = 0;
|
||||
return buffer;
|
||||
}
|
||||
|
||||
static void freeBuffer(buffer_t buff)
|
||||
{
|
||||
free(buff.ptr);
|
||||
}
|
||||
|
||||
|
||||
static void fillBuffer_fromHandle(buffer_t* buff, FILE* f)
|
||||
{
|
||||
size_t const readSize = fread(buff->ptr, 1, buff->capacity, f);
|
||||
buff->size = readSize;
|
||||
}
|
||||
|
||||
|
||||
/* @return : kBuffNull if any error */
|
||||
static buffer_t createBuffer_fromFile(const char* fileName)
|
||||
{
|
||||
U64 const fileSize = UTIL_getFileSize(fileName);
|
||||
size_t const bufferSize = (size_t) fileSize;
|
||||
|
||||
if (fileSize == UTIL_FILESIZE_UNKNOWN) return kBuffNull;
|
||||
assert((U64)bufferSize == fileSize); /* check overflow */
|
||||
|
||||
{ FILE* const f = fopen(fileName, "rb");
|
||||
if (f == NULL) return kBuffNull;
|
||||
|
||||
buffer_t buff = createBuffer(bufferSize);
|
||||
CONTROL(buff.ptr != NULL);
|
||||
|
||||
fillBuffer_fromHandle(&buff, f);
|
||||
CONTROL(buff.size == buff.capacity);
|
||||
|
||||
fclose(f); /* do nothing specific if fclose() fails */
|
||||
return buff;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/* @return : kBuffNull if any error */
|
||||
static buffer_t
|
||||
createDictionaryBuffer(const char* dictionaryName,
|
||||
const void* srcBuffer,
|
||||
const size_t* srcBlockSizes, size_t nbBlocks,
|
||||
size_t requestedDictSize)
|
||||
{
|
||||
if (dictionaryName) {
|
||||
DISPLAYLEVEL(3, "loading dictionary %s \n", dictionaryName);
|
||||
return createBuffer_fromFile(dictionaryName); /* note : result might be kBuffNull */
|
||||
|
||||
} else {
|
||||
|
||||
DISPLAYLEVEL(3, "creating dictionary, of target size %u bytes \n",
|
||||
(unsigned)requestedDictSize);
|
||||
void* const dictBuffer = malloc(requestedDictSize);
|
||||
CONTROL(dictBuffer != NULL);
|
||||
|
||||
assert(nbBlocks <= UINT_MAX);
|
||||
size_t const dictSize = ZDICT_trainFromBuffer(dictBuffer, requestedDictSize,
|
||||
srcBuffer,
|
||||
srcBlockSizes, (unsigned)nbBlocks);
|
||||
CONTROL(!ZSTD_isError(dictSize));
|
||||
|
||||
buffer_t result;
|
||||
result.ptr = dictBuffer;
|
||||
result.capacity = requestedDictSize;
|
||||
result.size = dictSize;
|
||||
return result;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*! BMK_loadFiles() :
|
||||
* Loads `buffer`, with content from files listed within `fileNamesTable`.
|
||||
* Fills `buffer` entirely.
|
||||
* @return : 0 on success, !=0 on error */
|
||||
static int loadFiles(void* buffer, size_t bufferSize,
|
||||
size_t* fileSizes,
|
||||
const char* const * fileNamesTable, unsigned nbFiles)
|
||||
{
|
||||
size_t pos = 0, totalSize = 0;
|
||||
|
||||
for (unsigned n=0; n<nbFiles; n++) {
|
||||
U64 fileSize = UTIL_getFileSize(fileNamesTable[n]);
|
||||
if (UTIL_isDirectory(fileNamesTable[n])) {
|
||||
fileSizes[n] = 0;
|
||||
continue;
|
||||
}
|
||||
if (fileSize == UTIL_FILESIZE_UNKNOWN) {
|
||||
fileSizes[n] = 0;
|
||||
continue;
|
||||
}
|
||||
|
||||
FILE* const f = fopen(fileNamesTable[n], "rb");
|
||||
assert(f!=NULL);
|
||||
|
||||
assert(pos <= bufferSize);
|
||||
assert(fileSize <= bufferSize - pos);
|
||||
|
||||
{ size_t const readSize = fread(((char*)buffer)+pos, 1, (size_t)fileSize, f);
|
||||
assert(readSize == fileSize);
|
||||
pos += readSize;
|
||||
}
|
||||
fileSizes[n] = (size_t)fileSize;
|
||||
totalSize += (size_t)fileSize;
|
||||
fclose(f);
|
||||
}
|
||||
|
||||
assert(totalSize == bufferSize);
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
||||
|
||||
/*--- slice_collection_t ---*/
|
||||
|
||||
typedef struct {
|
||||
void** slicePtrs;
|
||||
size_t* capacities;
|
||||
size_t nbSlices;
|
||||
} slice_collection_t;
|
||||
|
||||
static const slice_collection_t kNullCollection = { NULL, NULL, 0 };
|
||||
|
||||
static void freeSliceCollection(slice_collection_t collection)
|
||||
{
|
||||
free(collection.slicePtrs);
|
||||
free(collection.capacities);
|
||||
}
|
||||
|
||||
/* shrinkSizes() :
|
||||
* downsizes sizes of slices within collection, according to `newSizes`.
|
||||
* every `newSizes` entry must be <= than its corresponding collection size */
|
||||
void shrinkSizes(slice_collection_t collection,
|
||||
const size_t* newSizes) /* presumed same size as collection */
|
||||
{
|
||||
size_t const nbSlices = collection.nbSlices;
|
||||
for (size_t blockNb = 0; blockNb < nbSlices; blockNb++) {
|
||||
assert(newSizes[blockNb] <= collection.capacities[blockNb]);
|
||||
collection.capacities[blockNb] = newSizes[blockNb];
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/* splitSlices() :
|
||||
* nbSlices : if == 0, nbSlices is automatically determined from srcSlices and blockSize.
|
||||
* otherwise, creates exactly nbSlices slices,
|
||||
* by either truncating input (when smaller)
|
||||
* or repeating input from beginning */
|
||||
static slice_collection_t
|
||||
splitSlices(slice_collection_t srcSlices, size_t blockSize, size_t nbSlices)
|
||||
{
|
||||
if (blockSize==0) blockSize = (size_t)(-1); /* means "do not cut" */
|
||||
size_t nbSrcBlocks = 0;
|
||||
for (size_t ssnb=0; ssnb < srcSlices.nbSlices; ssnb++) {
|
||||
size_t pos = 0;
|
||||
while (pos <= srcSlices.capacities[ssnb]) {
|
||||
nbSrcBlocks++;
|
||||
pos += blockSize;
|
||||
}
|
||||
}
|
||||
|
||||
if (nbSlices == 0) nbSlices = nbSrcBlocks;
|
||||
|
||||
void** const sliceTable = (void**)malloc(nbSlices * sizeof(*sliceTable));
|
||||
size_t* const capacities = (size_t*)malloc(nbSlices * sizeof(*capacities));
|
||||
if (sliceTable == NULL || capacities == NULL) {
|
||||
free(sliceTable);
|
||||
free(capacities);
|
||||
return kNullCollection;
|
||||
}
|
||||
|
||||
size_t ssnb = 0;
|
||||
for (size_t sliceNb=0; sliceNb < nbSlices; ) {
|
||||
ssnb = (ssnb + 1) % srcSlices.nbSlices;
|
||||
size_t pos = 0;
|
||||
char* const ptr = (char*)srcSlices.slicePtrs[ssnb];
|
||||
while (pos < srcSlices.capacities[ssnb] && sliceNb < nbSlices) {
|
||||
size_t const size = MIN(blockSize, srcSlices.capacities[ssnb] - pos);
|
||||
sliceTable[sliceNb] = ptr + pos;
|
||||
capacities[sliceNb] = size;
|
||||
sliceNb++;
|
||||
pos += blockSize;
|
||||
}
|
||||
}
|
||||
|
||||
slice_collection_t result;
|
||||
result.nbSlices = nbSlices;
|
||||
result.slicePtrs = sliceTable;
|
||||
result.capacities = capacities;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
static size_t sliceCollection_totalCapacity(slice_collection_t sc)
|
||||
{
|
||||
size_t totalSize = 0;
|
||||
for (size_t n=0; n<sc.nbSlices; n++)
|
||||
totalSize += sc.capacities[n];
|
||||
return totalSize;
|
||||
}
|
||||
|
||||
|
||||
/* --- buffer collection --- */
|
||||
|
||||
typedef struct {
|
||||
buffer_t buffer;
|
||||
slice_collection_t slices;
|
||||
} buffer_collection_t;
|
||||
|
||||
|
||||
static void freeBufferCollection(buffer_collection_t bc)
|
||||
{
|
||||
freeBuffer(bc.buffer);
|
||||
freeSliceCollection(bc.slices);
|
||||
}
|
||||
|
||||
|
||||
static buffer_collection_t
|
||||
createBufferCollection_fromSliceCollectionSizes(slice_collection_t sc)
|
||||
{
|
||||
size_t const bufferSize = sliceCollection_totalCapacity(sc);
|
||||
|
||||
buffer_t buffer = createBuffer(bufferSize);
|
||||
CONTROL(buffer.ptr != NULL);
|
||||
|
||||
size_t const nbSlices = sc.nbSlices;
|
||||
void** const slices = (void**)malloc(nbSlices * sizeof(*slices));
|
||||
CONTROL(slices != NULL);
|
||||
|
||||
size_t* const capacities = (size_t*)malloc(nbSlices * sizeof(*capacities));
|
||||
CONTROL(capacities != NULL);
|
||||
|
||||
char* const ptr = (char*)buffer.ptr;
|
||||
size_t pos = 0;
|
||||
for (size_t n=0; n < nbSlices; n++) {
|
||||
capacities[n] = sc.capacities[n];
|
||||
slices[n] = ptr + pos;
|
||||
pos += capacities[n];
|
||||
}
|
||||
|
||||
buffer_collection_t result;
|
||||
result.buffer = buffer;
|
||||
result.slices.nbSlices = nbSlices;
|
||||
result.slices.capacities = capacities;
|
||||
result.slices.slicePtrs = slices;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/* @return : kBuffNull if any error */
|
||||
static buffer_collection_t
|
||||
createBufferCollection_fromFiles(const char* const * fileNamesTable, unsigned nbFiles)
|
||||
{
|
||||
U64 const totalSizeToLoad = UTIL_getTotalFileSize(fileNamesTable, nbFiles);
|
||||
assert(totalSizeToLoad != UTIL_FILESIZE_UNKNOWN);
|
||||
assert(totalSizeToLoad <= BENCH_SIZE_MAX);
|
||||
size_t const loadedSize = (size_t)totalSizeToLoad;
|
||||
assert(loadedSize > 0);
|
||||
void* const srcBuffer = malloc(loadedSize);
|
||||
assert(srcBuffer != NULL);
|
||||
|
||||
assert(nbFiles > 0);
|
||||
size_t* const fileSizes = (size_t*)calloc(nbFiles, sizeof(*fileSizes));
|
||||
assert(fileSizes != NULL);
|
||||
|
||||
/* Load input buffer */
|
||||
int const errorCode = loadFiles(srcBuffer, loadedSize,
|
||||
fileSizes,
|
||||
fileNamesTable, nbFiles);
|
||||
assert(errorCode == 0);
|
||||
|
||||
void** sliceTable = (void**)malloc(nbFiles * sizeof(*sliceTable));
|
||||
assert(sliceTable != NULL);
|
||||
|
||||
char* const ptr = (char*)srcBuffer;
|
||||
size_t pos = 0;
|
||||
unsigned fileNb = 0;
|
||||
for ( ; (pos < loadedSize) && (fileNb < nbFiles); fileNb++) {
|
||||
sliceTable[fileNb] = ptr + pos;
|
||||
pos += fileSizes[fileNb];
|
||||
}
|
||||
assert(pos == loadedSize);
|
||||
assert(fileNb == nbFiles);
|
||||
|
||||
|
||||
buffer_t buffer;
|
||||
buffer.ptr = srcBuffer;
|
||||
buffer.capacity = loadedSize;
|
||||
buffer.size = loadedSize;
|
||||
|
||||
slice_collection_t slices;
|
||||
slices.slicePtrs = sliceTable;
|
||||
slices.capacities = fileSizes;
|
||||
slices.nbSlices = nbFiles;
|
||||
|
||||
buffer_collection_t bc;
|
||||
bc.buffer = buffer;
|
||||
bc.slices = slices;
|
||||
return bc;
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
||||
/*--- ddict_collection_t ---*/
|
||||
|
||||
typedef struct {
|
||||
ZSTD_DDict** ddicts;
|
||||
size_t nbDDict;
|
||||
} ddict_collection_t;
|
||||
|
||||
static const ddict_collection_t kNullDDictCollection = { NULL, 0 };
|
||||
|
||||
static void freeDDictCollection(ddict_collection_t ddictc)
|
||||
{
|
||||
for (size_t dictNb=0; dictNb < ddictc.nbDDict; dictNb++) {
|
||||
ZSTD_freeDDict(ddictc.ddicts[dictNb]);
|
||||
}
|
||||
free(ddictc.ddicts);
|
||||
}
|
||||
|
||||
/* returns .buffers=NULL if operation fails */
|
||||
static ddict_collection_t createDDictCollection(const void* dictBuffer, size_t dictSize, size_t nbDDict)
|
||||
{
|
||||
ZSTD_DDict** const ddicts = malloc(nbDDict * sizeof(ZSTD_DDict*));
|
||||
assert(ddicts != NULL);
|
||||
if (ddicts==NULL) return kNullDDictCollection;
|
||||
for (size_t dictNb=0; dictNb < nbDDict; dictNb++) {
|
||||
ddicts[dictNb] = ZSTD_createDDict(dictBuffer, dictSize);
|
||||
assert(ddicts[dictNb] != NULL);
|
||||
}
|
||||
ddict_collection_t ddictc;
|
||||
ddictc.ddicts = ddicts;
|
||||
ddictc.nbDDict = nbDDict;
|
||||
return ddictc;
|
||||
}
|
||||
|
||||
|
||||
/* mess with addresses, so that linear scanning dictionaries != linear address scanning */
|
||||
void shuffleDictionaries(ddict_collection_t dicts)
|
||||
{
|
||||
size_t const nbDicts = dicts.nbDDict;
|
||||
for (size_t r=0; r<nbDicts; r++) {
|
||||
size_t const d = rand() % nbDicts;
|
||||
ZSTD_DDict* tmpd = dicts.ddicts[d];
|
||||
dicts.ddicts[d] = dicts.ddicts[r];
|
||||
dicts.ddicts[r] = tmpd;
|
||||
}
|
||||
for (size_t r=0; r<nbDicts; r++) {
|
||||
size_t const d1 = rand() % nbDicts;
|
||||
size_t const d2 = rand() % nbDicts;
|
||||
ZSTD_DDict* tmpd = dicts.ddicts[d1];
|
||||
dicts.ddicts[d1] = dicts.ddicts[d2];
|
||||
dicts.ddicts[d2] = tmpd;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/* --- Compression --- */
|
||||
|
||||
/* compressBlocks() :
|
||||
* @return : total compressed size of all blocks,
|
||||
* or 0 if error.
|
||||
*/
|
||||
static size_t compressBlocks(size_t* cSizes, /* optional (can be NULL). If present, must contain at least nbBlocks fields */
|
||||
slice_collection_t dstBlockBuffers,
|
||||
slice_collection_t srcBlockBuffers,
|
||||
ZSTD_CDict* cdict, int cLevel)
|
||||
{
|
||||
size_t const nbBlocks = srcBlockBuffers.nbSlices;
|
||||
assert(dstBlockBuffers.nbSlices == srcBlockBuffers.nbSlices);
|
||||
|
||||
ZSTD_CCtx* const cctx = ZSTD_createCCtx();
|
||||
assert(cctx != NULL);
|
||||
|
||||
size_t totalCSize = 0;
|
||||
for (size_t blockNb=0; blockNb < nbBlocks; blockNb++) {
|
||||
size_t cBlockSize;
|
||||
if (cdict == NULL) {
|
||||
cBlockSize = ZSTD_compressCCtx(cctx,
|
||||
dstBlockBuffers.slicePtrs[blockNb], dstBlockBuffers.capacities[blockNb],
|
||||
srcBlockBuffers.slicePtrs[blockNb], srcBlockBuffers.capacities[blockNb],
|
||||
cLevel);
|
||||
} else {
|
||||
cBlockSize = ZSTD_compress_usingCDict(cctx,
|
||||
dstBlockBuffers.slicePtrs[blockNb], dstBlockBuffers.capacities[blockNb],
|
||||
srcBlockBuffers.slicePtrs[blockNb], srcBlockBuffers.capacities[blockNb],
|
||||
cdict);
|
||||
}
|
||||
CONTROL(!ZSTD_isError(cBlockSize));
|
||||
if (cSizes) cSizes[blockNb] = cBlockSize;
|
||||
totalCSize += cBlockSize;
|
||||
}
|
||||
return totalCSize;
|
||||
}
|
||||
|
||||
|
||||
/* --- Benchmark --- */
|
||||
|
||||
typedef struct {
|
||||
ZSTD_DCtx* dctx;
|
||||
size_t nbDicts;
|
||||
size_t dictNb;
|
||||
ddict_collection_t dictionaries;
|
||||
} decompressInstructions;
|
||||
|
||||
decompressInstructions createDecompressInstructions(ddict_collection_t dictionaries)
|
||||
{
|
||||
decompressInstructions di;
|
||||
di.dctx = ZSTD_createDCtx();
|
||||
assert(di.dctx != NULL);
|
||||
di.nbDicts = dictionaries.nbDDict;
|
||||
di.dictNb = 0;
|
||||
di.dictionaries = dictionaries;
|
||||
return di;
|
||||
}
|
||||
|
||||
void freeDecompressInstructions(decompressInstructions di)
|
||||
{
|
||||
ZSTD_freeDCtx(di.dctx);
|
||||
}
|
||||
|
||||
/* benched function */
|
||||
size_t decompress(const void* src, size_t srcSize, void* dst, size_t dstCapacity, void* payload)
|
||||
{
|
||||
decompressInstructions* const di = (decompressInstructions*) payload;
|
||||
|
||||
size_t const result = ZSTD_decompress_usingDDict(di->dctx,
|
||||
dst, dstCapacity,
|
||||
src, srcSize,
|
||||
di->dictionaries.ddicts[di->dictNb]);
|
||||
|
||||
di->dictNb = di->dictNb + 1;
|
||||
if (di->dictNb >= di->nbDicts) di->dictNb = 0;
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
static int benchMem(slice_collection_t dstBlocks,
|
||||
slice_collection_t srcBlocks,
|
||||
ddict_collection_t dictionaries,
|
||||
int nbRounds)
|
||||
{
|
||||
assert(dstBlocks.nbSlices == srcBlocks.nbSlices);
|
||||
|
||||
unsigned const ms_per_round = RUN_TIME_DEFAULT_MS;
|
||||
unsigned const total_time_ms = nbRounds * ms_per_round;
|
||||
|
||||
double bestSpeed = 0.;
|
||||
|
||||
BMK_timedFnState_t* const benchState =
|
||||
BMK_createTimedFnState(total_time_ms, ms_per_round);
|
||||
decompressInstructions di = createDecompressInstructions(dictionaries);
|
||||
BMK_benchParams_t const bp = {
|
||||
.benchFn = decompress,
|
||||
.benchPayload = &di,
|
||||
.initFn = NULL,
|
||||
.initPayload = NULL,
|
||||
.errorFn = ZSTD_isError,
|
||||
.blockCount = dstBlocks.nbSlices,
|
||||
.srcBuffers = (const void* const*) srcBlocks.slicePtrs,
|
||||
.srcSizes = srcBlocks.capacities,
|
||||
.dstBuffers = dstBlocks.slicePtrs,
|
||||
.dstCapacities = dstBlocks.capacities,
|
||||
.blockResults = NULL
|
||||
};
|
||||
|
||||
for (;;) {
|
||||
BMK_runOutcome_t const outcome = BMK_benchTimedFn(benchState, bp);
|
||||
CONTROL(BMK_isSuccessful_runOutcome(outcome));
|
||||
|
||||
BMK_runTime_t const result = BMK_extract_runTime(outcome);
|
||||
double const dTime_ns = result.nanoSecPerRun;
|
||||
double const dTime_sec = (double)dTime_ns / 1000000000;
|
||||
size_t const srcSize = result.sumOfReturn;
|
||||
double const dSpeed_MBps = (double)srcSize / dTime_sec / (1 MB);
|
||||
if (dSpeed_MBps > bestSpeed) bestSpeed = dSpeed_MBps;
|
||||
DISPLAY("Decompression Speed : %.1f MB/s \r", bestSpeed);
|
||||
fflush(stdout);
|
||||
if (BMK_isCompleted_TimedFn(benchState)) break;
|
||||
}
|
||||
DISPLAY("\n");
|
||||
|
||||
freeDecompressInstructions(di);
|
||||
BMK_freeTimedFnState(benchState);
|
||||
|
||||
return 0; /* success */
|
||||
}
|
||||
|
||||
|
||||
/*! bench() :
|
||||
* fileName : file to load for benchmarking purpose
|
||||
* dictionary : optional (can be NULL), file to load as dictionary,
|
||||
* if none provided : will be calculated on the fly by the program.
|
||||
* @return : 0 is success, 1+ otherwise */
|
||||
int bench(const char** fileNameTable, unsigned nbFiles,
|
||||
const char* dictionary,
|
||||
size_t blockSize, int clevel,
|
||||
unsigned nbDictMax, unsigned nbBlocks,
|
||||
int nbRounds)
|
||||
{
|
||||
int result = 0;
|
||||
|
||||
DISPLAYLEVEL(3, "loading %u files... \n", nbFiles);
|
||||
buffer_collection_t const srcs = createBufferCollection_fromFiles(fileNameTable, nbFiles);
|
||||
CONTROL(srcs.buffer.ptr != NULL);
|
||||
buffer_t srcBuffer = srcs.buffer;
|
||||
size_t const srcSize = srcBuffer.size;
|
||||
DISPLAYLEVEL(3, "created src buffer of size %.1f MB \n",
|
||||
(double)srcSize / (1 MB));
|
||||
|
||||
slice_collection_t const srcSlices = splitSlices(srcs.slices, blockSize, nbBlocks);
|
||||
nbBlocks = (unsigned)(srcSlices.nbSlices);
|
||||
DISPLAYLEVEL(3, "split input into %u blocks ", nbBlocks);
|
||||
if (blockSize)
|
||||
DISPLAYLEVEL(3, "of max size %u bytes ", (unsigned)blockSize);
|
||||
DISPLAYLEVEL(3, "\n");
|
||||
size_t const totalSrcSlicesSize = sliceCollection_totalCapacity(srcSlices);
|
||||
|
||||
|
||||
size_t* const dstCapacities = malloc(nbBlocks * sizeof(*dstCapacities));
|
||||
CONTROL(dstCapacities != NULL);
|
||||
size_t dstBufferCapacity = 0;
|
||||
for (size_t bnb=0; bnb<nbBlocks; bnb++) {
|
||||
dstCapacities[bnb] = ZSTD_compressBound(srcSlices.capacities[bnb]);
|
||||
dstBufferCapacity += dstCapacities[bnb];
|
||||
}
|
||||
|
||||
buffer_t dstBuffer = createBuffer(dstBufferCapacity);
|
||||
CONTROL(dstBuffer.ptr != NULL);
|
||||
|
||||
void** const sliceTable = malloc(nbBlocks * sizeof(*sliceTable));
|
||||
CONTROL(sliceTable != NULL);
|
||||
|
||||
{ char* const ptr = dstBuffer.ptr;
|
||||
size_t pos = 0;
|
||||
for (size_t snb=0; snb < nbBlocks; snb++) {
|
||||
sliceTable[snb] = ptr + pos;
|
||||
pos += dstCapacities[snb];
|
||||
} }
|
||||
|
||||
slice_collection_t dstSlices;
|
||||
dstSlices.capacities = dstCapacities;
|
||||
dstSlices.slicePtrs = sliceTable;
|
||||
dstSlices.nbSlices = nbBlocks;
|
||||
|
||||
|
||||
/* dictionary determination */
|
||||
buffer_t const dictBuffer = createDictionaryBuffer(dictionary,
|
||||
srcs.buffer.ptr,
|
||||
srcs.slices.capacities, srcs.slices.nbSlices,
|
||||
DICTSIZE);
|
||||
CONTROL(dictBuffer.ptr != NULL);
|
||||
|
||||
ZSTD_CDict* const cdict = ZSTD_createCDict(dictBuffer.ptr, dictBuffer.size, clevel);
|
||||
CONTROL(cdict != NULL);
|
||||
|
||||
size_t const cTotalSizeNoDict = compressBlocks(NULL, dstSlices, srcSlices, NULL, clevel);
|
||||
CONTROL(cTotalSizeNoDict != 0);
|
||||
DISPLAYLEVEL(3, "compressing at level %u without dictionary : Ratio=%.2f (%u bytes) \n",
|
||||
clevel,
|
||||
(double)totalSrcSlicesSize / cTotalSizeNoDict, (unsigned)cTotalSizeNoDict);
|
||||
|
||||
size_t* const cSizes = malloc(nbBlocks * sizeof(size_t));
|
||||
CONTROL(cSizes != NULL);
|
||||
|
||||
size_t const cTotalSize = compressBlocks(cSizes, dstSlices, srcSlices, cdict, clevel);
|
||||
CONTROL(cTotalSize != 0);
|
||||
DISPLAYLEVEL(3, "compressed using a %u bytes dictionary : Ratio=%.2f (%u bytes) \n",
|
||||
(unsigned)dictBuffer.size,
|
||||
(double)totalSrcSlicesSize / cTotalSize, (unsigned)cTotalSize);
|
||||
|
||||
/* now dstSlices contain the real compressed size of each block, instead of the maximum capacity */
|
||||
shrinkSizes(dstSlices, cSizes);
|
||||
|
||||
size_t const dictMem = ZSTD_estimateDDictSize(dictBuffer.size, ZSTD_dlm_byCopy);
|
||||
unsigned const nbDicts = nbDictMax ? nbDictMax : nbBlocks;
|
||||
size_t const allDictMem = dictMem * nbDicts;
|
||||
DISPLAYLEVEL(3, "generating %u dictionaries, using %.1f MB of memory \n",
|
||||
nbDicts, (double)allDictMem / (1 MB));
|
||||
|
||||
ddict_collection_t const dictionaries = createDDictCollection(dictBuffer.ptr, dictBuffer.size, nbDicts);
|
||||
CONTROL(dictionaries.ddicts != NULL);
|
||||
|
||||
shuffleDictionaries(dictionaries);
|
||||
|
||||
buffer_collection_t resultCollection = createBufferCollection_fromSliceCollectionSizes(srcSlices);
|
||||
CONTROL(resultCollection.buffer.ptr != NULL);
|
||||
|
||||
result = benchMem(resultCollection.slices, dstSlices, dictionaries, nbRounds);
|
||||
|
||||
/* free all heap objects in reverse order */
|
||||
freeBufferCollection(resultCollection);
|
||||
freeDDictCollection(dictionaries);
|
||||
free(cSizes);
|
||||
ZSTD_freeCDict(cdict);
|
||||
freeBuffer(dictBuffer);
|
||||
freeSliceCollection(dstSlices);
|
||||
freeBuffer(dstBuffer);
|
||||
freeSliceCollection(srcSlices);
|
||||
freeBufferCollection(srcs);
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
|
||||
/* --- Command Line --- */
|
||||
|
||||
/*! readU32FromChar() :
|
||||
* @return : unsigned integer value read from input in `char` format.
|
||||
* allows and interprets K, KB, KiB, M, MB and MiB suffix.
|
||||
* Will also modify `*stringPtr`, advancing it to position where it stopped reading.
|
||||
* Note : function will exit() program if digit sequence overflows */
|
||||
static unsigned readU32FromChar(const char** stringPtr)
|
||||
{
|
||||
unsigned result = 0;
|
||||
while ((**stringPtr >='0') && (**stringPtr <='9')) {
|
||||
unsigned const max = (((unsigned)(-1)) / 10) - 1;
|
||||
assert(result <= max); /* check overflow */
|
||||
result *= 10, result += **stringPtr - '0', (*stringPtr)++ ;
|
||||
}
|
||||
if ((**stringPtr=='K') || (**stringPtr=='M')) {
|
||||
unsigned const maxK = ((unsigned)(-1)) >> 10;
|
||||
assert(result <= maxK); /* check overflow */
|
||||
result <<= 10;
|
||||
if (**stringPtr=='M') {
|
||||
assert(result <= maxK); /* check overflow */
|
||||
result <<= 10;
|
||||
}
|
||||
(*stringPtr)++; /* skip `K` or `M` */
|
||||
if (**stringPtr=='i') (*stringPtr)++;
|
||||
if (**stringPtr=='B') (*stringPtr)++;
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
/** longCommandWArg() :
|
||||
* check if *stringPtr is the same as longCommand.
|
||||
* If yes, @return 1 and advances *stringPtr to the position which immediately follows longCommand.
|
||||
* @return 0 and doesn't modify *stringPtr otherwise.
|
||||
*/
|
||||
static unsigned longCommandWArg(const char** stringPtr, const char* longCommand)
|
||||
{
|
||||
size_t const comSize = strlen(longCommand);
|
||||
int const result = !strncmp(*stringPtr, longCommand, comSize);
|
||||
if (result) *stringPtr += comSize;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
int usage(const char* exeName)
|
||||
{
|
||||
DISPLAY (" \n");
|
||||
DISPLAY (" %s [Options] filename(s) \n", exeName);
|
||||
DISPLAY (" \n");
|
||||
DISPLAY ("Options : \n");
|
||||
DISPLAY ("-r : recursively load all files in subdirectories (default: off) \n");
|
||||
DISPLAY ("-B# : split input into blocks of size # (default: no split) \n");
|
||||
DISPLAY ("-# : use compression level # (default: %u) \n", CLEVEL_DEFAULT);
|
||||
DISPLAY ("-D # : use # as a dictionary (default: create one) \n");
|
||||
DISPLAY ("-i# : nb benchmark rounds (default: %u) \n", BENCH_TIME_DEFAULT_S);
|
||||
DISPLAY ("--nbBlocks=#: use # blocks for bench (default: one per file) \n");
|
||||
DISPLAY ("--nbDicts=# : create # dictionaries for bench (default: one per block) \n");
|
||||
DISPLAY ("-h : help (this text) \n");
|
||||
return 0;
|
||||
}
|
||||
|
||||
int bad_usage(const char* exeName)
|
||||
{
|
||||
DISPLAY (" bad usage : \n");
|
||||
usage(exeName);
|
||||
return 1;
|
||||
}
|
||||
|
||||
int main (int argc, const char** argv)
|
||||
{
|
||||
int recursiveMode = 0;
|
||||
int nbRounds = BENCH_TIME_DEFAULT_S;
|
||||
const char* const exeName = argv[0];
|
||||
|
||||
if (argc < 2) return bad_usage(exeName);
|
||||
|
||||
const char** nameTable = (const char**)malloc(argc * sizeof(const char*));
|
||||
assert(nameTable != NULL);
|
||||
unsigned nameIdx = 0;
|
||||
|
||||
const char* dictionary = NULL;
|
||||
int cLevel = CLEVEL_DEFAULT;
|
||||
size_t blockSize = BLOCKSIZE_DEFAULT;
|
||||
unsigned nbDicts = 0; /* determine nbDicts automatically: 1 dictionary per block */
|
||||
unsigned nbBlocks = 0; /* determine nbBlocks automatically, from source and blockSize */
|
||||
|
||||
for (int argNb = 1; argNb < argc ; argNb++) {
|
||||
const char* argument = argv[argNb];
|
||||
if (!strcmp(argument, "-h")) { free(nameTable); return usage(exeName); }
|
||||
if (!strcmp(argument, "-r")) { recursiveMode = 1; continue; }
|
||||
if (!strcmp(argument, "-D")) { argNb++; assert(argNb < argc); dictionary = argv[argNb]; continue; }
|
||||
if (longCommandWArg(&argument, "-i")) { nbRounds = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "--dictionary=")) { dictionary = argument; continue; }
|
||||
if (longCommandWArg(&argument, "-B")) { blockSize = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "--blockSize=")) { blockSize = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "--nbDicts=")) { nbDicts = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "--nbBlocks=")) { nbBlocks = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "--clevel=")) { cLevel = readU32FromChar(&argument); continue; }
|
||||
if (longCommandWArg(&argument, "-")) { cLevel = readU32FromChar(&argument); continue; }
|
||||
/* anything that's not a command is a filename */
|
||||
nameTable[nameIdx++] = argument;
|
||||
}
|
||||
|
||||
const char** filenameTable = nameTable;
|
||||
unsigned nbFiles = nameIdx;
|
||||
char* buffer_containing_filenames = NULL;
|
||||
|
||||
if (recursiveMode) {
|
||||
#ifndef UTIL_HAS_CREATEFILELIST
|
||||
assert(0); /* missing capability, do not run */
|
||||
#endif
|
||||
filenameTable = UTIL_createFileList(nameTable, nameIdx, &buffer_containing_filenames, &nbFiles, 1 /* follow_links */);
|
||||
}
|
||||
|
||||
int result = bench(filenameTable, nbFiles, dictionary, blockSize, cLevel, nbDicts, nbBlocks, nbRounds);
|
||||
|
||||
free(buffer_containing_filenames);
|
||||
free(nameTable);
|
||||
|
||||
return result;
|
||||
}
|
@ -1,6 +0,0 @@
|
||||
-- Include zstd.lua in your GENie or premake4 file, which exposes a project_zstd function
|
||||
dofile('zstd.lua')
|
||||
|
||||
solution 'example'
|
||||
configurations { 'Debug', 'Release' }
|
||||
project_zstd('../../lib/')
|
@ -1,80 +0,0 @@
|
||||
-- This GENie/premake file copies the behavior of the Makefile in the lib folder.
|
||||
-- Basic usage: project_zstd(ZSTD_DIR)
|
||||
|
||||
function project_zstd(dir, compression, decompression, deprecated, dictbuilder, legacy)
|
||||
if compression == nil then compression = true end
|
||||
if decompression == nil then decompression = true end
|
||||
if deprecated == nil then deprecated = false end
|
||||
if dictbuilder == nil then dictbuilder = false end
|
||||
|
||||
if legacy == nil then legacy = 0 end
|
||||
|
||||
if not compression then
|
||||
dictbuilder = false
|
||||
deprecated = false
|
||||
end
|
||||
|
||||
if not decompression then
|
||||
legacy = 0
|
||||
deprecated = false
|
||||
end
|
||||
|
||||
project 'zstd'
|
||||
kind 'StaticLib'
|
||||
language 'C'
|
||||
|
||||
files {
|
||||
dir .. 'zstd.h',
|
||||
dir .. 'common/**.c',
|
||||
dir .. 'common/**.h'
|
||||
}
|
||||
|
||||
if compression then
|
||||
files {
|
||||
dir .. 'compress/**.c',
|
||||
dir .. 'compress/**.h'
|
||||
}
|
||||
end
|
||||
|
||||
if decompression then
|
||||
files {
|
||||
dir .. 'decompress/**.c',
|
||||
dir .. 'decompress/**.h'
|
||||
}
|
||||
end
|
||||
|
||||
if dictbuilder then
|
||||
files {
|
||||
dir .. 'dictBuilder/**.c',
|
||||
dir .. 'dictBuilder/**.h'
|
||||
}
|
||||
end
|
||||
|
||||
if deprecated then
|
||||
files {
|
||||
dir .. 'deprecated/**.c',
|
||||
dir .. 'deprecated/**.h'
|
||||
}
|
||||
end
|
||||
|
||||
if legacy ~= 0 then
|
||||
if legacy >= 8 then
|
||||
files {
|
||||
dir .. 'legacy/zstd_v0' .. (legacy - 7) .. '.*'
|
||||
}
|
||||
end
|
||||
includedirs {
|
||||
dir .. 'legacy'
|
||||
}
|
||||
end
|
||||
|
||||
includedirs {
|
||||
dir,
|
||||
dir .. 'common'
|
||||
}
|
||||
|
||||
defines {
|
||||
'XXH_NAMESPACE=ZSTD_',
|
||||
'ZSTD_LEGACY_SUPPORT=' .. legacy
|
||||
}
|
||||
end
|
@ -1,72 +0,0 @@
|
||||
cxx_library(
|
||||
name='libpzstd',
|
||||
visibility=['PUBLIC'],
|
||||
header_namespace='',
|
||||
exported_headers=[
|
||||
'ErrorHolder.h',
|
||||
'Logging.h',
|
||||
'Pzstd.h',
|
||||
],
|
||||
headers=[
|
||||
'SkippableFrame.h',
|
||||
],
|
||||
srcs=[
|
||||
'Pzstd.cpp',
|
||||
'SkippableFrame.cpp',
|
||||
],
|
||||
deps=[
|
||||
':options',
|
||||
'//contrib/pzstd/utils:utils',
|
||||
'//lib:mem',
|
||||
'//lib:zstd',
|
||||
],
|
||||
)
|
||||
|
||||
cxx_library(
|
||||
name='options',
|
||||
visibility=['PUBLIC'],
|
||||
header_namespace='',
|
||||
exported_headers=['Options.h'],
|
||||
srcs=['Options.cpp'],
|
||||
deps=[
|
||||
'//contrib/pzstd/utils:scope_guard',
|
||||
'//lib:zstd',
|
||||
'//programs:util',
|
||||
],
|
||||
)
|
||||
|
||||
cxx_binary(
|
||||
name='pzstd',
|
||||
visibility=['PUBLIC'],
|
||||
srcs=['main.cpp'],
|
||||
deps=[
|
||||
':libpzstd',
|
||||
':options',
|
||||
],
|
||||
)
|
||||
|
||||
# Must run "make googletest" first
|
||||
cxx_library(
|
||||
name='gtest',
|
||||
srcs=glob([
|
||||
'googletest/googletest/src/gtest-all.cc',
|
||||
'googletest/googlemock/src/gmock-all.cc',
|
||||
'googletest/googlemock/src/gmock_main.cc',
|
||||
]),
|
||||
header_namespace='',
|
||||
exported_headers=subdir_glob([
|
||||
('googletest/googletest/include', '**/*.h'),
|
||||
('googletest/googlemock/include', '**/*.h'),
|
||||
]),
|
||||
headers=subdir_glob([
|
||||
('googletest/googletest', 'src/*.cc'),
|
||||
('googletest/googletest', 'src/*.h'),
|
||||
('googletest/googlemock', 'src/*.cc'),
|
||||
('googletest/googlemock', 'src/*.h'),
|
||||
]),
|
||||
platform_linker_flags=[
|
||||
('android', []),
|
||||
('', ['-lpthread']),
|
||||
],
|
||||
visibility=['PUBLIC'],
|
||||
)
|
@ -1,54 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include <atomic>
|
||||
#include <cassert>
|
||||
#include <stdexcept>
|
||||
#include <string>
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
// Coordinates graceful shutdown of the pzstd pipeline
|
||||
class ErrorHolder {
|
||||
std::atomic<bool> error_;
|
||||
std::string message_;
|
||||
|
||||
public:
|
||||
ErrorHolder() : error_(false) {}
|
||||
|
||||
bool hasError() noexcept {
|
||||
return error_.load();
|
||||
}
|
||||
|
||||
void setError(std::string message) noexcept {
|
||||
// Given multiple possibly concurrent calls, exactly one will ever succeed.
|
||||
bool expected = false;
|
||||
if (error_.compare_exchange_strong(expected, true)) {
|
||||
message_ = std::move(message);
|
||||
}
|
||||
}
|
||||
|
||||
bool check(bool predicate, std::string message) noexcept {
|
||||
if (!predicate) {
|
||||
setError(std::move(message));
|
||||
}
|
||||
return !hasError();
|
||||
}
|
||||
|
||||
std::string getError() noexcept {
|
||||
error_.store(false);
|
||||
return std::move(message_);
|
||||
}
|
||||
|
||||
~ErrorHolder() {
|
||||
assert(!hasError());
|
||||
}
|
||||
};
|
||||
}
|
@ -1,72 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include <cstdio>
|
||||
#include <mutex>
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
constexpr int ERROR = 1;
|
||||
constexpr int INFO = 2;
|
||||
constexpr int DEBUG = 3;
|
||||
constexpr int VERBOSE = 4;
|
||||
|
||||
class Logger {
|
||||
std::mutex mutex_;
|
||||
FILE* out_;
|
||||
const int level_;
|
||||
|
||||
using Clock = std::chrono::system_clock;
|
||||
Clock::time_point lastUpdate_;
|
||||
std::chrono::milliseconds refreshRate_;
|
||||
|
||||
public:
|
||||
explicit Logger(int level, FILE* out = stderr)
|
||||
: out_(out), level_(level), lastUpdate_(Clock::now()),
|
||||
refreshRate_(150) {}
|
||||
|
||||
|
||||
bool logsAt(int level) {
|
||||
return level <= level_;
|
||||
}
|
||||
|
||||
template <typename... Args>
|
||||
void operator()(int level, const char *fmt, Args... args) {
|
||||
if (level > level_) {
|
||||
return;
|
||||
}
|
||||
std::lock_guard<std::mutex> lock(mutex_);
|
||||
std::fprintf(out_, fmt, args...);
|
||||
}
|
||||
|
||||
template <typename... Args>
|
||||
void update(int level, const char *fmt, Args... args) {
|
||||
if (level > level_) {
|
||||
return;
|
||||
}
|
||||
std::lock_guard<std::mutex> lock(mutex_);
|
||||
auto now = Clock::now();
|
||||
if (now - lastUpdate_ > refreshRate_) {
|
||||
lastUpdate_ = now;
|
||||
std::fprintf(out_, "\r");
|
||||
std::fprintf(out_, fmt, args...);
|
||||
}
|
||||
}
|
||||
|
||||
void clear(int level) {
|
||||
if (level > level_) {
|
||||
return;
|
||||
}
|
||||
std::lock_guard<std::mutex> lock(mutex_);
|
||||
std::fprintf(out_, "\r%79s\r", "");
|
||||
}
|
||||
};
|
||||
|
||||
}
|
@ -1,271 +0,0 @@
|
||||
# ################################################################
|
||||
# Copyright (c) 2016-present, Facebook, Inc.
|
||||
# All rights reserved.
|
||||
#
|
||||
# This source code is licensed under both the BSD-style license (found in the
|
||||
# LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
# in the COPYING file in the root directory of this source tree).
|
||||
# ################################################################
|
||||
|
||||
# Standard variables for installation
|
||||
DESTDIR ?=
|
||||
PREFIX ?= /usr/local
|
||||
BINDIR := $(DESTDIR)$(PREFIX)/bin
|
||||
|
||||
ZSTDDIR = ../../lib
|
||||
PROGDIR = ../../programs
|
||||
|
||||
# External program to use to run tests, e.g. qemu or valgrind
|
||||
TESTPROG ?=
|
||||
# Flags to pass to the tests
|
||||
TESTFLAGS ?=
|
||||
|
||||
# We use gcc/clang to generate the header dependencies of files
|
||||
DEPFLAGS = -MMD -MP -MF $*.Td
|
||||
POSTCOMPILE = mv -f $*.Td $*.d
|
||||
|
||||
# CFLAGS, CXXFLAGS, CPPFLAGS, and LDFLAGS are for the users to override
|
||||
CFLAGS ?= -O3 -Wall -Wextra
|
||||
CXXFLAGS ?= -O3 -Wall -Wextra -pedantic
|
||||
CPPFLAGS ?=
|
||||
LDFLAGS ?=
|
||||
|
||||
# Include flags
|
||||
PZSTD_INC = -I$(ZSTDDIR) -I$(ZSTDDIR)/common -I$(PROGDIR) -I.
|
||||
GTEST_INC = -isystem googletest/googletest/include
|
||||
|
||||
PZSTD_CPPFLAGS = $(PZSTD_INC)
|
||||
PZSTD_CCXXFLAGS =
|
||||
PZSTD_CFLAGS = $(PZSTD_CCXXFLAGS)
|
||||
PZSTD_CXXFLAGS = $(PZSTD_CCXXFLAGS) -std=c++11
|
||||
PZSTD_LDFLAGS =
|
||||
EXTRA_FLAGS =
|
||||
ALL_CFLAGS = $(EXTRA_FLAGS) $(CPPFLAGS) $(PZSTD_CPPFLAGS) $(CFLAGS) $(PZSTD_CFLAGS)
|
||||
ALL_CXXFLAGS = $(EXTRA_FLAGS) $(CPPFLAGS) $(PZSTD_CPPFLAGS) $(CXXFLAGS) $(PZSTD_CXXFLAGS)
|
||||
ALL_LDFLAGS = $(EXTRA_FLAGS) $(CXXFLAGS) $(LDFLAGS) $(PZSTD_LDFLAGS)
|
||||
|
||||
|
||||
# gtest libraries need to go before "-lpthread" because they depend on it.
|
||||
GTEST_LIB = -L googletest/build/googlemock/gtest
|
||||
LIBS =
|
||||
|
||||
# Compilation commands
|
||||
LD_COMMAND = $(CXX) $^ $(ALL_LDFLAGS) $(LIBS) -pthread -o $@
|
||||
CC_COMMAND = $(CC) $(DEPFLAGS) $(ALL_CFLAGS) -c $< -o $@
|
||||
CXX_COMMAND = $(CXX) $(DEPFLAGS) $(ALL_CXXFLAGS) -c $< -o $@
|
||||
|
||||
# Get a list of all zstd files so we rebuild the static library when we need to
|
||||
ZSTDCOMMON_FILES := $(wildcard $(ZSTDDIR)/common/*.c) \
|
||||
$(wildcard $(ZSTDDIR)/common/*.h)
|
||||
ZSTDCOMP_FILES := $(wildcard $(ZSTDDIR)/compress/*.c) \
|
||||
$(wildcard $(ZSTDDIR)/compress/*.h)
|
||||
ZSTDDECOMP_FILES := $(wildcard $(ZSTDDIR)/decompress/*.c) \
|
||||
$(wildcard $(ZSTDDIR)/decompress/*.h)
|
||||
ZSTDPROG_FILES := $(wildcard $(PROGDIR)/*.c) \
|
||||
$(wildcard $(PROGDIR)/*.h)
|
||||
ZSTD_FILES := $(wildcard $(ZSTDDIR)/*.h) \
|
||||
$(ZSTDDECOMP_FILES) $(ZSTDCOMMON_FILES) $(ZSTDCOMP_FILES) \
|
||||
$(ZSTDPROG_FILES)
|
||||
|
||||
# List all the pzstd source files so we can determine their dependencies
|
||||
PZSTD_SRCS := $(wildcard *.cpp)
|
||||
PZSTD_TESTS := $(wildcard test/*.cpp)
|
||||
UTILS_TESTS := $(wildcard utils/test/*.cpp)
|
||||
ALL_SRCS := $(PZSTD_SRCS) $(PZSTD_TESTS) $(UTILS_TESTS)
|
||||
|
||||
|
||||
# Define *.exe as extension for Windows systems
|
||||
ifneq (,$(filter Windows%,$(OS)))
|
||||
EXT =.exe
|
||||
else
|
||||
EXT =
|
||||
endif
|
||||
|
||||
# Standard targets
|
||||
.PHONY: default
|
||||
default: all
|
||||
|
||||
.PHONY: test-pzstd
|
||||
test-pzstd: TESTFLAGS=--gtest_filter=-*ExtremelyLarge*
|
||||
test-pzstd: clean googletest pzstd tests check
|
||||
|
||||
.PHONY: test-pzstd32
|
||||
test-pzstd32: clean googletest32 all32 check
|
||||
|
||||
.PHONY: test-pzstd-tsan
|
||||
test-pzstd-tsan: LDFLAGS=-fuse-ld=gold
|
||||
test-pzstd-tsan: TESTFLAGS=--gtest_filter=-*ExtremelyLarge*
|
||||
test-pzstd-tsan: clean googletest tsan check
|
||||
|
||||
.PHONY: test-pzstd-asan
|
||||
test-pzstd-asan: LDFLAGS=-fuse-ld=gold
|
||||
test-pzstd-asan: TESTFLAGS=--gtest_filter=-*ExtremelyLarge*
|
||||
test-pzstd-asan: clean asan check
|
||||
|
||||
.PHONY: check
|
||||
check:
|
||||
$(TESTPROG) ./utils/test/BufferTest$(EXT) $(TESTFLAGS)
|
||||
$(TESTPROG) ./utils/test/RangeTest$(EXT) $(TESTFLAGS)
|
||||
$(TESTPROG) ./utils/test/ResourcePoolTest$(EXT) $(TESTFLAGS)
|
||||
$(TESTPROG) ./utils/test/ScopeGuardTest$(EXT) $(TESTFLAGS)
|
||||
$(TESTPROG) ./utils/test/ThreadPoolTest$(EXT) $(TESTFLAGS)
|
||||
$(TESTPROG) ./utils/test/WorkQueueTest$(EXT) $(TESTFLAGS)
|
||||
$(TESTPROG) ./test/OptionsTest$(EXT) $(TESTFLAGS)
|
||||
$(TESTPROG) ./test/PzstdTest$(EXT) $(TESTFLAGS)
|
||||
|
||||
.PHONY: install
|
||||
install: PZSTD_CPPFLAGS += -DNDEBUG
|
||||
install: pzstd$(EXT)
|
||||
install -d -m 755 $(BINDIR)/
|
||||
install -m 755 pzstd$(EXT) $(BINDIR)/pzstd$(EXT)
|
||||
|
||||
.PHONY: uninstall
|
||||
uninstall:
|
||||
$(RM) $(BINDIR)/pzstd$(EXT)
|
||||
|
||||
# Targets for many different builds
|
||||
.PHONY: all
|
||||
all: PZSTD_CPPFLAGS += -DNDEBUG
|
||||
all: pzstd$(EXT)
|
||||
|
||||
.PHONY: debug
|
||||
debug: EXTRA_FLAGS += -g
|
||||
debug: pzstd$(EXT) tests roundtrip
|
||||
|
||||
.PHONY: tsan
|
||||
tsan: PZSTD_CCXXFLAGS += -fsanitize=thread -fPIC
|
||||
tsan: PZSTD_LDFLAGS += -fsanitize=thread
|
||||
tsan: debug
|
||||
|
||||
.PHONY: asan
|
||||
asan: EXTRA_FLAGS += -fsanitize=address
|
||||
asan: debug
|
||||
|
||||
.PHONY: ubsan
|
||||
ubsan: EXTRA_FLAGS += -fsanitize=undefined
|
||||
ubsan: debug
|
||||
|
||||
.PHONY: all32
|
||||
all32: EXTRA_FLAGS += -m32
|
||||
all32: all tests roundtrip
|
||||
|
||||
.PHONY: debug32
|
||||
debug32: EXTRA_FLAGS += -m32
|
||||
debug32: debug
|
||||
|
||||
.PHONY: asan32
|
||||
asan32: EXTRA_FLAGS += -m32
|
||||
asan32: asan
|
||||
|
||||
.PHONY: tsan32
|
||||
tsan32: EXTRA_FLAGS += -m32
|
||||
tsan32: tsan
|
||||
|
||||
.PHONY: ubsan32
|
||||
ubsan32: EXTRA_FLAGS += -m32
|
||||
ubsan32: ubsan
|
||||
|
||||
# Run long round trip tests
|
||||
.PHONY: roundtripcheck
|
||||
roundtripcheck: roundtrip check
|
||||
$(TESTPROG) ./test/RoundTripTest$(EXT) $(TESTFLAGS)
|
||||
|
||||
# Build the main binary
|
||||
pzstd$(EXT): main.o $(PROGDIR)/util.o Options.o Pzstd.o SkippableFrame.o $(ZSTDDIR)/libzstd.a
|
||||
$(LD_COMMAND)
|
||||
|
||||
# Target that depends on all the tests
|
||||
.PHONY: tests
|
||||
tests: EXTRA_FLAGS += -Wno-deprecated-declarations
|
||||
tests: $(patsubst %,%$(EXT),$(basename $(PZSTD_TESTS) $(UTILS_TESTS)))
|
||||
|
||||
# Build the round trip tests
|
||||
.PHONY: roundtrip
|
||||
roundtrip: EXTRA_FLAGS += -Wno-deprecated-declarations
|
||||
roundtrip: test/RoundTripTest$(EXT)
|
||||
|
||||
# Use the static library that zstd builds for simplicity and
|
||||
# so we get the compiler options correct
|
||||
$(ZSTDDIR)/libzstd.a: $(ZSTD_FILES)
|
||||
CFLAGS="$(ALL_CFLAGS)" LDFLAGS="$(ALL_LDFLAGS)" $(MAKE) -C $(ZSTDDIR) libzstd.a
|
||||
|
||||
# Rules to build the tests
|
||||
test/RoundTripTest$(EXT): test/RoundTripTest.o $(PROGDIR)/datagen.o \
|
||||
$(PROGDIR)/util.o Options.o \
|
||||
Pzstd.o SkippableFrame.o $(ZSTDDIR)/libzstd.a
|
||||
$(LD_COMMAND)
|
||||
|
||||
test/%Test$(EXT): PZSTD_LDFLAGS += $(GTEST_LIB)
|
||||
test/%Test$(EXT): LIBS += -lgtest -lgtest_main
|
||||
test/%Test$(EXT): test/%Test.o $(PROGDIR)/datagen.o \
|
||||
$(PROGDIR)/util.o Options.o Pzstd.o \
|
||||
SkippableFrame.o $(ZSTDDIR)/libzstd.a
|
||||
$(LD_COMMAND)
|
||||
|
||||
utils/test/%Test$(EXT): PZSTD_LDFLAGS += $(GTEST_LIB)
|
||||
utils/test/%Test$(EXT): LIBS += -lgtest -lgtest_main
|
||||
utils/test/%Test$(EXT): utils/test/%Test.o
|
||||
$(LD_COMMAND)
|
||||
|
||||
|
||||
GTEST_CMAKEFLAGS =
|
||||
|
||||
# Install googletest
|
||||
.PHONY: googletest
|
||||
googletest: PZSTD_CCXXFLAGS += -fPIC
|
||||
googletest:
|
||||
@$(RM) -rf googletest
|
||||
@git clone https://github.com/google/googletest
|
||||
@mkdir -p googletest/build
|
||||
@cd googletest/build && cmake $(GTEST_CMAKEFLAGS) -DCMAKE_CXX_FLAGS="$(ALL_CXXFLAGS)" .. && $(MAKE)
|
||||
|
||||
.PHONY: googletest32
|
||||
googletest32: PZSTD_CCXXFLAGS += -m32
|
||||
googletest32: googletest
|
||||
|
||||
.PHONY: googletest-mingw64
|
||||
googletest-mingw64: GTEST_CMAKEFLAGS += -G "MSYS Makefiles"
|
||||
googletest-mingw64: googletest
|
||||
|
||||
.PHONY: clean
|
||||
clean:
|
||||
$(RM) -f *.o pzstd$(EXT) *.Td *.d
|
||||
$(RM) -f test/*.o test/*Test$(EXT) test/*.Td test/*.d
|
||||
$(RM) -f utils/test/*.o utils/test/*Test$(EXT) utils/test/*.Td utils/test/*.d
|
||||
$(RM) -f $(PROGDIR)/*.o $(PROGDIR)/*.Td $(PROGDIR)/*.d
|
||||
$(MAKE) -C $(ZSTDDIR) clean
|
||||
@echo Cleaning completed
|
||||
|
||||
|
||||
# Cancel implicit rules
|
||||
%.o: %.c
|
||||
%.o: %.cpp
|
||||
|
||||
# Object file rules
|
||||
%.o: %.c
|
||||
$(CC_COMMAND)
|
||||
$(POSTCOMPILE)
|
||||
|
||||
$(PROGDIR)/%.o: $(PROGDIR)/%.c
|
||||
$(CC_COMMAND)
|
||||
$(POSTCOMPILE)
|
||||
|
||||
%.o: %.cpp
|
||||
$(CXX_COMMAND)
|
||||
$(POSTCOMPILE)
|
||||
|
||||
test/%.o: PZSTD_CPPFLAGS += $(GTEST_INC)
|
||||
test/%.o: test/%.cpp
|
||||
$(CXX_COMMAND)
|
||||
$(POSTCOMPILE)
|
||||
|
||||
utils/test/%.o: PZSTD_CPPFLAGS += $(GTEST_INC)
|
||||
utils/test/%.o: utils/test/%.cpp
|
||||
$(CXX_COMMAND)
|
||||
$(POSTCOMPILE)
|
||||
|
||||
# Dependency file stuff
|
||||
.PRECIOUS: %.d test/%.d utils/test/%.d
|
||||
|
||||
# Include rules that specify header file dependencies
|
||||
-include $(patsubst %,%.d,$(basename $(ALL_SRCS)))
|
@ -1,428 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "Options.h"
|
||||
#include "util.h"
|
||||
#include "utils/ScopeGuard.h"
|
||||
|
||||
#include <algorithm>
|
||||
#include <cassert>
|
||||
#include <cstdio>
|
||||
#include <cstring>
|
||||
#include <iterator>
|
||||
#include <thread>
|
||||
#include <vector>
|
||||
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
namespace {
|
||||
unsigned defaultNumThreads() {
|
||||
#ifdef PZSTD_NUM_THREADS
|
||||
return PZSTD_NUM_THREADS;
|
||||
#else
|
||||
return std::thread::hardware_concurrency();
|
||||
#endif
|
||||
}
|
||||
|
||||
unsigned parseUnsigned(const char **arg) {
|
||||
unsigned result = 0;
|
||||
while (**arg >= '0' && **arg <= '9') {
|
||||
result *= 10;
|
||||
result += **arg - '0';
|
||||
++(*arg);
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
const char *getArgument(const char *options, const char **argv, int &i,
|
||||
int argc) {
|
||||
if (options[1] != 0) {
|
||||
return options + 1;
|
||||
}
|
||||
++i;
|
||||
if (i == argc) {
|
||||
std::fprintf(stderr, "Option -%c requires an argument, but none provided\n",
|
||||
*options);
|
||||
return nullptr;
|
||||
}
|
||||
return argv[i];
|
||||
}
|
||||
|
||||
const std::string kZstdExtension = ".zst";
|
||||
constexpr char kStdIn[] = "-";
|
||||
constexpr char kStdOut[] = "-";
|
||||
constexpr unsigned kDefaultCompressionLevel = 3;
|
||||
constexpr unsigned kMaxNonUltraCompressionLevel = 19;
|
||||
|
||||
#ifdef _WIN32
|
||||
const char nullOutput[] = "nul";
|
||||
#else
|
||||
const char nullOutput[] = "/dev/null";
|
||||
#endif
|
||||
|
||||
void notSupported(const char *option) {
|
||||
std::fprintf(stderr, "Operation not supported: %s\n", option);
|
||||
}
|
||||
|
||||
void usage() {
|
||||
std::fprintf(stderr, "Usage:\n");
|
||||
std::fprintf(stderr, " pzstd [args] [FILE(s)]\n");
|
||||
std::fprintf(stderr, "Parallel ZSTD options:\n");
|
||||
std::fprintf(stderr, " -p, --processes # : number of threads to use for (de)compression (default:<numcpus>)\n");
|
||||
|
||||
std::fprintf(stderr, "ZSTD options:\n");
|
||||
std::fprintf(stderr, " -# : # compression level (1-%d, default:%d)\n", kMaxNonUltraCompressionLevel, kDefaultCompressionLevel);
|
||||
std::fprintf(stderr, " -d, --decompress : decompression\n");
|
||||
std::fprintf(stderr, " -o file : result stored into `file` (only if 1 input file)\n");
|
||||
std::fprintf(stderr, " -f, --force : overwrite output without prompting, (de)compress links\n");
|
||||
std::fprintf(stderr, " --rm : remove source file(s) after successful (de)compression\n");
|
||||
std::fprintf(stderr, " -k, --keep : preserve source file(s) (default)\n");
|
||||
std::fprintf(stderr, " -h, --help : display help and exit\n");
|
||||
std::fprintf(stderr, " -V, --version : display version number and exit\n");
|
||||
std::fprintf(stderr, " -v, --verbose : verbose mode; specify multiple times to increase log level (default:2)\n");
|
||||
std::fprintf(stderr, " -q, --quiet : suppress warnings; specify twice to suppress errors too\n");
|
||||
std::fprintf(stderr, " -c, --stdout : force write to standard output, even if it is the console\n");
|
||||
#ifdef UTIL_HAS_CREATEFILELIST
|
||||
std::fprintf(stderr, " -r : operate recursively on directories\n");
|
||||
#endif
|
||||
std::fprintf(stderr, " --ultra : enable levels beyond %i, up to %i (requires more memory)\n", kMaxNonUltraCompressionLevel, ZSTD_maxCLevel());
|
||||
std::fprintf(stderr, " -C, --check : integrity check (default)\n");
|
||||
std::fprintf(stderr, " --no-check : no integrity check\n");
|
||||
std::fprintf(stderr, " -t, --test : test compressed file integrity\n");
|
||||
std::fprintf(stderr, " -- : all arguments after \"--\" are treated as files\n");
|
||||
}
|
||||
} // anonymous namespace
|
||||
|
||||
Options::Options()
|
||||
: numThreads(defaultNumThreads()), maxWindowLog(23),
|
||||
compressionLevel(kDefaultCompressionLevel), decompress(false),
|
||||
overwrite(false), keepSource(true), writeMode(WriteMode::Auto),
|
||||
checksum(true), verbosity(2) {}
|
||||
|
||||
Options::Status Options::parse(int argc, const char **argv) {
|
||||
bool test = false;
|
||||
bool recursive = false;
|
||||
bool ultra = false;
|
||||
bool forceStdout = false;
|
||||
bool followLinks = false;
|
||||
// Local copy of input files, which are pointers into argv.
|
||||
std::vector<const char *> localInputFiles;
|
||||
for (int i = 1; i < argc; ++i) {
|
||||
const char *arg = argv[i];
|
||||
// Protect against empty arguments
|
||||
if (arg[0] == 0) {
|
||||
continue;
|
||||
}
|
||||
// Everything after "--" is an input file
|
||||
if (!std::strcmp(arg, "--")) {
|
||||
++i;
|
||||
std::copy(argv + i, argv + argc, std::back_inserter(localInputFiles));
|
||||
break;
|
||||
}
|
||||
// Long arguments that don't have a short option
|
||||
{
|
||||
bool isLongOption = true;
|
||||
if (!std::strcmp(arg, "--rm")) {
|
||||
keepSource = false;
|
||||
} else if (!std::strcmp(arg, "--ultra")) {
|
||||
ultra = true;
|
||||
maxWindowLog = 0;
|
||||
} else if (!std::strcmp(arg, "--no-check")) {
|
||||
checksum = false;
|
||||
} else if (!std::strcmp(arg, "--sparse")) {
|
||||
writeMode = WriteMode::Sparse;
|
||||
notSupported("Sparse mode");
|
||||
return Status::Failure;
|
||||
} else if (!std::strcmp(arg, "--no-sparse")) {
|
||||
writeMode = WriteMode::Regular;
|
||||
notSupported("Sparse mode");
|
||||
return Status::Failure;
|
||||
} else if (!std::strcmp(arg, "--dictID")) {
|
||||
notSupported(arg);
|
||||
return Status::Failure;
|
||||
} else if (!std::strcmp(arg, "--no-dictID")) {
|
||||
notSupported(arg);
|
||||
return Status::Failure;
|
||||
} else {
|
||||
isLongOption = false;
|
||||
}
|
||||
if (isLongOption) {
|
||||
continue;
|
||||
}
|
||||
}
|
||||
// Arguments with a short option simply set their short option.
|
||||
const char *options = nullptr;
|
||||
if (!std::strcmp(arg, "--processes")) {
|
||||
options = "p";
|
||||
} else if (!std::strcmp(arg, "--version")) {
|
||||
options = "V";
|
||||
} else if (!std::strcmp(arg, "--help")) {
|
||||
options = "h";
|
||||
} else if (!std::strcmp(arg, "--decompress")) {
|
||||
options = "d";
|
||||
} else if (!std::strcmp(arg, "--force")) {
|
||||
options = "f";
|
||||
} else if (!std::strcmp(arg, "--stdout")) {
|
||||
options = "c";
|
||||
} else if (!std::strcmp(arg, "--keep")) {
|
||||
options = "k";
|
||||
} else if (!std::strcmp(arg, "--verbose")) {
|
||||
options = "v";
|
||||
} else if (!std::strcmp(arg, "--quiet")) {
|
||||
options = "q";
|
||||
} else if (!std::strcmp(arg, "--check")) {
|
||||
options = "C";
|
||||
} else if (!std::strcmp(arg, "--test")) {
|
||||
options = "t";
|
||||
} else if (arg[0] == '-' && arg[1] != 0) {
|
||||
options = arg + 1;
|
||||
} else {
|
||||
localInputFiles.emplace_back(arg);
|
||||
continue;
|
||||
}
|
||||
assert(options != nullptr);
|
||||
|
||||
bool finished = false;
|
||||
while (!finished && *options != 0) {
|
||||
// Parse the compression level
|
||||
if (*options >= '0' && *options <= '9') {
|
||||
compressionLevel = parseUnsigned(&options);
|
||||
continue;
|
||||
}
|
||||
|
||||
switch (*options) {
|
||||
case 'h':
|
||||
case 'H':
|
||||
usage();
|
||||
return Status::Message;
|
||||
case 'V':
|
||||
std::fprintf(stderr, "PZSTD version: %s.\n", ZSTD_VERSION_STRING);
|
||||
return Status::Message;
|
||||
case 'p': {
|
||||
finished = true;
|
||||
const char *optionArgument = getArgument(options, argv, i, argc);
|
||||
if (optionArgument == nullptr) {
|
||||
return Status::Failure;
|
||||
}
|
||||
if (*optionArgument < '0' || *optionArgument > '9') {
|
||||
std::fprintf(stderr, "Option -p expects a number, but %s provided\n",
|
||||
optionArgument);
|
||||
return Status::Failure;
|
||||
}
|
||||
numThreads = parseUnsigned(&optionArgument);
|
||||
if (*optionArgument != 0) {
|
||||
std::fprintf(stderr,
|
||||
"Option -p expects a number, but %u%s provided\n",
|
||||
numThreads, optionArgument);
|
||||
return Status::Failure;
|
||||
}
|
||||
break;
|
||||
}
|
||||
case 'o': {
|
||||
finished = true;
|
||||
const char *optionArgument = getArgument(options, argv, i, argc);
|
||||
if (optionArgument == nullptr) {
|
||||
return Status::Failure;
|
||||
}
|
||||
outputFile = optionArgument;
|
||||
break;
|
||||
}
|
||||
case 'C':
|
||||
checksum = true;
|
||||
break;
|
||||
case 'k':
|
||||
keepSource = true;
|
||||
break;
|
||||
case 'd':
|
||||
decompress = true;
|
||||
break;
|
||||
case 'f':
|
||||
overwrite = true;
|
||||
forceStdout = true;
|
||||
followLinks = true;
|
||||
break;
|
||||
case 't':
|
||||
test = true;
|
||||
decompress = true;
|
||||
break;
|
||||
#ifdef UTIL_HAS_CREATEFILELIST
|
||||
case 'r':
|
||||
recursive = true;
|
||||
break;
|
||||
#endif
|
||||
case 'c':
|
||||
outputFile = kStdOut;
|
||||
forceStdout = true;
|
||||
break;
|
||||
case 'v':
|
||||
++verbosity;
|
||||
break;
|
||||
case 'q':
|
||||
--verbosity;
|
||||
// Ignore them for now
|
||||
break;
|
||||
// Unsupported options from Zstd
|
||||
case 'D':
|
||||
case 's':
|
||||
notSupported("Zstd dictionaries.");
|
||||
return Status::Failure;
|
||||
case 'b':
|
||||
case 'e':
|
||||
case 'i':
|
||||
case 'B':
|
||||
notSupported("Zstd benchmarking options.");
|
||||
return Status::Failure;
|
||||
default:
|
||||
std::fprintf(stderr, "Invalid argument: %s\n", arg);
|
||||
return Status::Failure;
|
||||
}
|
||||
if (!finished) {
|
||||
++options;
|
||||
}
|
||||
} // while (*options != 0);
|
||||
} // for (int i = 1; i < argc; ++i);
|
||||
|
||||
// Set options for test mode
|
||||
if (test) {
|
||||
outputFile = nullOutput;
|
||||
keepSource = true;
|
||||
}
|
||||
|
||||
// Input file defaults to standard input if not provided.
|
||||
if (localInputFiles.empty()) {
|
||||
localInputFiles.emplace_back(kStdIn);
|
||||
}
|
||||
|
||||
// Check validity of input files
|
||||
if (localInputFiles.size() > 1) {
|
||||
const auto it = std::find(localInputFiles.begin(), localInputFiles.end(),
|
||||
std::string{kStdIn});
|
||||
if (it != localInputFiles.end()) {
|
||||
std::fprintf(
|
||||
stderr,
|
||||
"Cannot specify standard input when handling multiple files\n");
|
||||
return Status::Failure;
|
||||
}
|
||||
}
|
||||
if (localInputFiles.size() > 1 || recursive) {
|
||||
if (!outputFile.empty() && outputFile != nullOutput) {
|
||||
std::fprintf(
|
||||
stderr,
|
||||
"Cannot specify an output file when handling multiple inputs\n");
|
||||
return Status::Failure;
|
||||
}
|
||||
}
|
||||
|
||||
g_utilDisplayLevel = verbosity;
|
||||
// Remove local input files that are symbolic links
|
||||
if (!followLinks) {
|
||||
std::remove_if(localInputFiles.begin(), localInputFiles.end(),
|
||||
[&](const char *path) {
|
||||
bool isLink = UTIL_isLink(path);
|
||||
if (isLink && verbosity >= 2) {
|
||||
std::fprintf(
|
||||
stderr,
|
||||
"Warning : %s is symbolic link, ignoring\n",
|
||||
path);
|
||||
}
|
||||
return isLink;
|
||||
});
|
||||
}
|
||||
|
||||
// Translate input files/directories into files to (de)compress
|
||||
if (recursive) {
|
||||
char *scratchBuffer = nullptr;
|
||||
unsigned numFiles = 0;
|
||||
const char **files =
|
||||
UTIL_createFileList(localInputFiles.data(), localInputFiles.size(),
|
||||
&scratchBuffer, &numFiles, followLinks);
|
||||
if (files == nullptr) {
|
||||
std::fprintf(stderr, "Error traversing directories\n");
|
||||
return Status::Failure;
|
||||
}
|
||||
auto guard =
|
||||
makeScopeGuard([&] { UTIL_freeFileList(files, scratchBuffer); });
|
||||
if (numFiles == 0) {
|
||||
std::fprintf(stderr, "No files found\n");
|
||||
return Status::Failure;
|
||||
}
|
||||
inputFiles.resize(numFiles);
|
||||
std::copy(files, files + numFiles, inputFiles.begin());
|
||||
} else {
|
||||
inputFiles.resize(localInputFiles.size());
|
||||
std::copy(localInputFiles.begin(), localInputFiles.end(),
|
||||
inputFiles.begin());
|
||||
}
|
||||
localInputFiles.clear();
|
||||
assert(!inputFiles.empty());
|
||||
|
||||
// If reading from standard input, default to standard output
|
||||
if (inputFiles[0] == kStdIn && outputFile.empty()) {
|
||||
assert(inputFiles.size() == 1);
|
||||
outputFile = "-";
|
||||
}
|
||||
|
||||
if (inputFiles[0] == kStdIn && IS_CONSOLE(stdin)) {
|
||||
assert(inputFiles.size() == 1);
|
||||
std::fprintf(stderr, "Cannot read input from interactive console\n");
|
||||
return Status::Failure;
|
||||
}
|
||||
if (outputFile == "-" && IS_CONSOLE(stdout) && !(forceStdout && decompress)) {
|
||||
std::fprintf(stderr, "Will not write to console stdout unless -c or -f is "
|
||||
"specified and decompressing\n");
|
||||
return Status::Failure;
|
||||
}
|
||||
|
||||
// Check compression level
|
||||
{
|
||||
unsigned maxCLevel =
|
||||
ultra ? ZSTD_maxCLevel() : kMaxNonUltraCompressionLevel;
|
||||
if (compressionLevel > maxCLevel || compressionLevel == 0) {
|
||||
std::fprintf(stderr, "Invalid compression level %u.\n", compressionLevel);
|
||||
return Status::Failure;
|
||||
}
|
||||
}
|
||||
|
||||
// Check that numThreads is set
|
||||
if (numThreads == 0) {
|
||||
std::fprintf(stderr, "Invalid arguments: # of threads not specified "
|
||||
"and unable to determine hardware concurrency.\n");
|
||||
return Status::Failure;
|
||||
}
|
||||
|
||||
// Modify verbosity
|
||||
// If we are piping input and output, turn off interaction
|
||||
if (inputFiles[0] == kStdIn && outputFile == kStdOut && verbosity == 2) {
|
||||
verbosity = 1;
|
||||
}
|
||||
// If we are in multi-file mode, turn off interaction
|
||||
if (inputFiles.size() > 1 && verbosity == 2) {
|
||||
verbosity = 1;
|
||||
}
|
||||
|
||||
return Status::Success;
|
||||
}
|
||||
|
||||
std::string Options::getOutputFile(const std::string &inputFile) const {
|
||||
if (!outputFile.empty()) {
|
||||
return outputFile;
|
||||
}
|
||||
// Attempt to add/remove zstd extension from the input file
|
||||
if (decompress) {
|
||||
int stemSize = inputFile.size() - kZstdExtension.size();
|
||||
if (stemSize > 0 && inputFile.substr(stemSize) == kZstdExtension) {
|
||||
return inputFile.substr(0, stemSize);
|
||||
} else {
|
||||
return "";
|
||||
}
|
||||
} else {
|
||||
return inputFile + kZstdExtension;
|
||||
}
|
||||
}
|
||||
}
|
@ -1,68 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#define ZSTD_STATIC_LINKING_ONLY
|
||||
#include "zstd.h"
|
||||
#undef ZSTD_STATIC_LINKING_ONLY
|
||||
|
||||
#include <cstdint>
|
||||
#include <string>
|
||||
#include <vector>
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
struct Options {
|
||||
enum class WriteMode { Regular, Auto, Sparse };
|
||||
|
||||
unsigned numThreads;
|
||||
unsigned maxWindowLog;
|
||||
unsigned compressionLevel;
|
||||
bool decompress;
|
||||
std::vector<std::string> inputFiles;
|
||||
std::string outputFile;
|
||||
bool overwrite;
|
||||
bool keepSource;
|
||||
WriteMode writeMode;
|
||||
bool checksum;
|
||||
int verbosity;
|
||||
|
||||
enum class Status {
|
||||
Success, // Successfully parsed options
|
||||
Failure, // Failure to parse options
|
||||
Message // Options specified to print a message (e.g. "-h")
|
||||
};
|
||||
|
||||
Options();
|
||||
Options(unsigned numThreads, unsigned maxWindowLog, unsigned compressionLevel,
|
||||
bool decompress, std::vector<std::string> inputFiles,
|
||||
std::string outputFile, bool overwrite, bool keepSource,
|
||||
WriteMode writeMode, bool checksum, int verbosity)
|
||||
: numThreads(numThreads), maxWindowLog(maxWindowLog),
|
||||
compressionLevel(compressionLevel), decompress(decompress),
|
||||
inputFiles(std::move(inputFiles)), outputFile(std::move(outputFile)),
|
||||
overwrite(overwrite), keepSource(keepSource), writeMode(writeMode),
|
||||
checksum(checksum), verbosity(verbosity) {}
|
||||
|
||||
Status parse(int argc, const char **argv);
|
||||
|
||||
ZSTD_parameters determineParameters() const {
|
||||
ZSTD_parameters params = ZSTD_getParams(compressionLevel, 0, 0);
|
||||
params.fParams.contentSizeFlag = 0;
|
||||
params.fParams.checksumFlag = checksum;
|
||||
if (maxWindowLog != 0 && params.cParams.windowLog > maxWindowLog) {
|
||||
params.cParams.windowLog = maxWindowLog;
|
||||
params.cParams = ZSTD_adjustCParams(params.cParams, 0, 0);
|
||||
}
|
||||
return params;
|
||||
}
|
||||
|
||||
std::string getOutputFile(const std::string &inputFile) const;
|
||||
};
|
||||
}
|
@ -1,611 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "platform.h" /* Large Files support, SET_BINARY_MODE */
|
||||
#include "Pzstd.h"
|
||||
#include "SkippableFrame.h"
|
||||
#include "utils/FileSystem.h"
|
||||
#include "utils/Range.h"
|
||||
#include "utils/ScopeGuard.h"
|
||||
#include "utils/ThreadPool.h"
|
||||
#include "utils/WorkQueue.h"
|
||||
|
||||
#include <chrono>
|
||||
#include <cinttypes>
|
||||
#include <cstddef>
|
||||
#include <cstdio>
|
||||
#include <memory>
|
||||
#include <string>
|
||||
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
namespace {
|
||||
#ifdef _WIN32
|
||||
const std::string nullOutput = "nul";
|
||||
#else
|
||||
const std::string nullOutput = "/dev/null";
|
||||
#endif
|
||||
}
|
||||
|
||||
using std::size_t;
|
||||
|
||||
static std::uintmax_t fileSizeOrZero(const std::string &file) {
|
||||
if (file == "-") {
|
||||
return 0;
|
||||
}
|
||||
std::error_code ec;
|
||||
auto size = file_size(file, ec);
|
||||
if (ec) {
|
||||
size = 0;
|
||||
}
|
||||
return size;
|
||||
}
|
||||
|
||||
static std::uint64_t handleOneInput(const Options &options,
|
||||
const std::string &inputFile,
|
||||
FILE* inputFd,
|
||||
const std::string &outputFile,
|
||||
FILE* outputFd,
|
||||
SharedState& state) {
|
||||
auto inputSize = fileSizeOrZero(inputFile);
|
||||
// WorkQueue outlives ThreadPool so in the case of error we are certain
|
||||
// we don't accidentally try to call push() on it after it is destroyed
|
||||
WorkQueue<std::shared_ptr<BufferWorkQueue>> outs{options.numThreads + 1};
|
||||
std::uint64_t bytesRead;
|
||||
std::uint64_t bytesWritten;
|
||||
{
|
||||
// Initialize the (de)compression thread pool with numThreads
|
||||
ThreadPool executor(options.numThreads);
|
||||
// Run the reader thread on an extra thread
|
||||
ThreadPool readExecutor(1);
|
||||
if (!options.decompress) {
|
||||
// Add a job that reads the input and starts all the compression jobs
|
||||
readExecutor.add(
|
||||
[&state, &outs, &executor, inputFd, inputSize, &options, &bytesRead] {
|
||||
bytesRead = asyncCompressChunks(
|
||||
state,
|
||||
outs,
|
||||
executor,
|
||||
inputFd,
|
||||
inputSize,
|
||||
options.numThreads,
|
||||
options.determineParameters());
|
||||
});
|
||||
// Start writing
|
||||
bytesWritten = writeFile(state, outs, outputFd, options.decompress);
|
||||
} else {
|
||||
// Add a job that reads the input and starts all the decompression jobs
|
||||
readExecutor.add([&state, &outs, &executor, inputFd, &bytesRead] {
|
||||
bytesRead = asyncDecompressFrames(state, outs, executor, inputFd);
|
||||
});
|
||||
// Start writing
|
||||
bytesWritten = writeFile(state, outs, outputFd, options.decompress);
|
||||
}
|
||||
}
|
||||
if (!state.errorHolder.hasError()) {
|
||||
std::string inputFileName = inputFile == "-" ? "stdin" : inputFile;
|
||||
std::string outputFileName = outputFile == "-" ? "stdout" : outputFile;
|
||||
if (!options.decompress) {
|
||||
double ratio = static_cast<double>(bytesWritten) /
|
||||
static_cast<double>(bytesRead + !bytesRead);
|
||||
state.log(INFO, "%-20s :%6.2f%% (%6" PRIu64 " => %6" PRIu64
|
||||
" bytes, %s)\n",
|
||||
inputFileName.c_str(), ratio * 100, bytesRead, bytesWritten,
|
||||
outputFileName.c_str());
|
||||
} else {
|
||||
state.log(INFO, "%-20s: %" PRIu64 " bytes \n",
|
||||
inputFileName.c_str(),bytesWritten);
|
||||
}
|
||||
}
|
||||
return bytesWritten;
|
||||
}
|
||||
|
||||
static FILE *openInputFile(const std::string &inputFile,
|
||||
ErrorHolder &errorHolder) {
|
||||
if (inputFile == "-") {
|
||||
SET_BINARY_MODE(stdin);
|
||||
return stdin;
|
||||
}
|
||||
// Check if input file is a directory
|
||||
{
|
||||
std::error_code ec;
|
||||
if (is_directory(inputFile, ec)) {
|
||||
errorHolder.setError("Output file is a directory -- ignored");
|
||||
return nullptr;
|
||||
}
|
||||
}
|
||||
auto inputFd = std::fopen(inputFile.c_str(), "rb");
|
||||
if (!errorHolder.check(inputFd != nullptr, "Failed to open input file")) {
|
||||
return nullptr;
|
||||
}
|
||||
return inputFd;
|
||||
}
|
||||
|
||||
static FILE *openOutputFile(const Options &options,
|
||||
const std::string &outputFile,
|
||||
SharedState& state) {
|
||||
if (outputFile == "-") {
|
||||
SET_BINARY_MODE(stdout);
|
||||
return stdout;
|
||||
}
|
||||
// Check if the output file exists and then open it
|
||||
if (!options.overwrite && outputFile != nullOutput) {
|
||||
auto outputFd = std::fopen(outputFile.c_str(), "rb");
|
||||
if (outputFd != nullptr) {
|
||||
std::fclose(outputFd);
|
||||
if (!state.log.logsAt(INFO)) {
|
||||
state.errorHolder.setError("Output file exists");
|
||||
return nullptr;
|
||||
}
|
||||
state.log(
|
||||
INFO,
|
||||
"pzstd: %s already exists; do you wish to overwrite (y/n) ? ",
|
||||
outputFile.c_str());
|
||||
int c = getchar();
|
||||
if (c != 'y' && c != 'Y') {
|
||||
state.errorHolder.setError("Not overwritten");
|
||||
return nullptr;
|
||||
}
|
||||
}
|
||||
}
|
||||
auto outputFd = std::fopen(outputFile.c_str(), "wb");
|
||||
if (!state.errorHolder.check(
|
||||
outputFd != nullptr, "Failed to open output file")) {
|
||||
return nullptr;
|
||||
}
|
||||
return outputFd;
|
||||
}
|
||||
|
||||
int pzstdMain(const Options &options) {
|
||||
int returnCode = 0;
|
||||
SharedState state(options);
|
||||
for (const auto& input : options.inputFiles) {
|
||||
// Setup the shared state
|
||||
auto printErrorGuard = makeScopeGuard([&] {
|
||||
if (state.errorHolder.hasError()) {
|
||||
returnCode = 1;
|
||||
state.log(ERROR, "pzstd: %s: %s.\n", input.c_str(),
|
||||
state.errorHolder.getError().c_str());
|
||||
}
|
||||
});
|
||||
// Open the input file
|
||||
auto inputFd = openInputFile(input, state.errorHolder);
|
||||
if (inputFd == nullptr) {
|
||||
continue;
|
||||
}
|
||||
auto closeInputGuard = makeScopeGuard([&] { std::fclose(inputFd); });
|
||||
// Open the output file
|
||||
auto outputFile = options.getOutputFile(input);
|
||||
if (!state.errorHolder.check(outputFile != "",
|
||||
"Input file does not have extension .zst")) {
|
||||
continue;
|
||||
}
|
||||
auto outputFd = openOutputFile(options, outputFile, state);
|
||||
if (outputFd == nullptr) {
|
||||
continue;
|
||||
}
|
||||
auto closeOutputGuard = makeScopeGuard([&] { std::fclose(outputFd); });
|
||||
// (de)compress the file
|
||||
handleOneInput(options, input, inputFd, outputFile, outputFd, state);
|
||||
if (state.errorHolder.hasError()) {
|
||||
continue;
|
||||
}
|
||||
// Delete the input file if necessary
|
||||
if (!options.keepSource) {
|
||||
// Be sure that we are done and have written everything before we delete
|
||||
if (!state.errorHolder.check(std::fclose(inputFd) == 0,
|
||||
"Failed to close input file")) {
|
||||
continue;
|
||||
}
|
||||
closeInputGuard.dismiss();
|
||||
if (!state.errorHolder.check(std::fclose(outputFd) == 0,
|
||||
"Failed to close output file")) {
|
||||
continue;
|
||||
}
|
||||
closeOutputGuard.dismiss();
|
||||
if (std::remove(input.c_str()) != 0) {
|
||||
state.errorHolder.setError("Failed to remove input file");
|
||||
continue;
|
||||
}
|
||||
}
|
||||
}
|
||||
// Returns 1 if any of the files failed to (de)compress.
|
||||
return returnCode;
|
||||
}
|
||||
|
||||
/// Construct a `ZSTD_inBuffer` that points to the data in `buffer`.
|
||||
static ZSTD_inBuffer makeZstdInBuffer(const Buffer& buffer) {
|
||||
return ZSTD_inBuffer{buffer.data(), buffer.size(), 0};
|
||||
}
|
||||
|
||||
/**
|
||||
* Advance `buffer` and `inBuffer` by the amount of data read, as indicated by
|
||||
* `inBuffer.pos`.
|
||||
*/
|
||||
void advance(Buffer& buffer, ZSTD_inBuffer& inBuffer) {
|
||||
auto pos = inBuffer.pos;
|
||||
inBuffer.src = static_cast<const unsigned char*>(inBuffer.src) + pos;
|
||||
inBuffer.size -= pos;
|
||||
inBuffer.pos = 0;
|
||||
return buffer.advance(pos);
|
||||
}
|
||||
|
||||
/// Construct a `ZSTD_outBuffer` that points to the data in `buffer`.
|
||||
static ZSTD_outBuffer makeZstdOutBuffer(Buffer& buffer) {
|
||||
return ZSTD_outBuffer{buffer.data(), buffer.size(), 0};
|
||||
}
|
||||
|
||||
/**
|
||||
* Split `buffer` and advance `outBuffer` by the amount of data written, as
|
||||
* indicated by `outBuffer.pos`.
|
||||
*/
|
||||
Buffer split(Buffer& buffer, ZSTD_outBuffer& outBuffer) {
|
||||
auto pos = outBuffer.pos;
|
||||
outBuffer.dst = static_cast<unsigned char*>(outBuffer.dst) + pos;
|
||||
outBuffer.size -= pos;
|
||||
outBuffer.pos = 0;
|
||||
return buffer.splitAt(pos);
|
||||
}
|
||||
|
||||
/**
|
||||
* Stream chunks of input from `in`, compress it, and stream it out to `out`.
|
||||
*
|
||||
* @param state The shared state
|
||||
* @param in Queue that we `pop()` input buffers from
|
||||
* @param out Queue that we `push()` compressed output buffers to
|
||||
* @param maxInputSize An upper bound on the size of the input
|
||||
*/
|
||||
static void compress(
|
||||
SharedState& state,
|
||||
std::shared_ptr<BufferWorkQueue> in,
|
||||
std::shared_ptr<BufferWorkQueue> out,
|
||||
size_t maxInputSize) {
|
||||
auto& errorHolder = state.errorHolder;
|
||||
auto guard = makeScopeGuard([&] { out->finish(); });
|
||||
// Initialize the CCtx
|
||||
auto ctx = state.cStreamPool->get();
|
||||
if (!errorHolder.check(ctx != nullptr, "Failed to allocate ZSTD_CStream")) {
|
||||
return;
|
||||
}
|
||||
{
|
||||
auto err = ZSTD_resetCStream(ctx.get(), 0);
|
||||
if (!errorHolder.check(!ZSTD_isError(err), ZSTD_getErrorName(err))) {
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
// Allocate space for the result
|
||||
auto outBuffer = Buffer(ZSTD_compressBound(maxInputSize));
|
||||
auto zstdOutBuffer = makeZstdOutBuffer(outBuffer);
|
||||
{
|
||||
Buffer inBuffer;
|
||||
// Read a buffer in from the input queue
|
||||
while (in->pop(inBuffer) && !errorHolder.hasError()) {
|
||||
auto zstdInBuffer = makeZstdInBuffer(inBuffer);
|
||||
// Compress the whole buffer and send it to the output queue
|
||||
while (!inBuffer.empty() && !errorHolder.hasError()) {
|
||||
if (!errorHolder.check(
|
||||
!outBuffer.empty(), "ZSTD_compressBound() was too small")) {
|
||||
return;
|
||||
}
|
||||
// Compress
|
||||
auto err =
|
||||
ZSTD_compressStream(ctx.get(), &zstdOutBuffer, &zstdInBuffer);
|
||||
if (!errorHolder.check(!ZSTD_isError(err), ZSTD_getErrorName(err))) {
|
||||
return;
|
||||
}
|
||||
// Split the compressed data off outBuffer and pass to the output queue
|
||||
out->push(split(outBuffer, zstdOutBuffer));
|
||||
// Forget about the data we already compressed
|
||||
advance(inBuffer, zstdInBuffer);
|
||||
}
|
||||
}
|
||||
}
|
||||
// Write the epilog
|
||||
size_t bytesLeft;
|
||||
do {
|
||||
if (!errorHolder.check(
|
||||
!outBuffer.empty(), "ZSTD_compressBound() was too small")) {
|
||||
return;
|
||||
}
|
||||
bytesLeft = ZSTD_endStream(ctx.get(), &zstdOutBuffer);
|
||||
if (!errorHolder.check(
|
||||
!ZSTD_isError(bytesLeft), ZSTD_getErrorName(bytesLeft))) {
|
||||
return;
|
||||
}
|
||||
out->push(split(outBuffer, zstdOutBuffer));
|
||||
} while (bytesLeft != 0 && !errorHolder.hasError());
|
||||
}
|
||||
|
||||
/**
|
||||
* Calculates how large each independently compressed frame should be.
|
||||
*
|
||||
* @param size The size of the source if known, 0 otherwise
|
||||
* @param numThreads The number of threads available to run compression jobs on
|
||||
* @param params The zstd parameters to be used for compression
|
||||
*/
|
||||
static size_t calculateStep(
|
||||
std::uintmax_t size,
|
||||
size_t numThreads,
|
||||
const ZSTD_parameters ¶ms) {
|
||||
(void)size;
|
||||
(void)numThreads;
|
||||
return size_t{1} << (params.cParams.windowLog + 2);
|
||||
}
|
||||
|
||||
namespace {
|
||||
enum class FileStatus { Continue, Done, Error };
|
||||
/// Determines the status of the file descriptor `fd`.
|
||||
FileStatus fileStatus(FILE* fd) {
|
||||
if (std::feof(fd)) {
|
||||
return FileStatus::Done;
|
||||
} else if (std::ferror(fd)) {
|
||||
return FileStatus::Error;
|
||||
}
|
||||
return FileStatus::Continue;
|
||||
}
|
||||
} // anonymous namespace
|
||||
|
||||
/**
|
||||
* Reads `size` data in chunks of `chunkSize` and puts it into `queue`.
|
||||
* Will read less if an error or EOF occurs.
|
||||
* Returns the status of the file after all of the reads have occurred.
|
||||
*/
|
||||
static FileStatus
|
||||
readData(BufferWorkQueue& queue, size_t chunkSize, size_t size, FILE* fd,
|
||||
std::uint64_t *totalBytesRead) {
|
||||
Buffer buffer(size);
|
||||
while (!buffer.empty()) {
|
||||
auto bytesRead =
|
||||
std::fread(buffer.data(), 1, std::min(chunkSize, buffer.size()), fd);
|
||||
*totalBytesRead += bytesRead;
|
||||
queue.push(buffer.splitAt(bytesRead));
|
||||
auto status = fileStatus(fd);
|
||||
if (status != FileStatus::Continue) {
|
||||
return status;
|
||||
}
|
||||
}
|
||||
return FileStatus::Continue;
|
||||
}
|
||||
|
||||
std::uint64_t asyncCompressChunks(
|
||||
SharedState& state,
|
||||
WorkQueue<std::shared_ptr<BufferWorkQueue>>& chunks,
|
||||
ThreadPool& executor,
|
||||
FILE* fd,
|
||||
std::uintmax_t size,
|
||||
size_t numThreads,
|
||||
ZSTD_parameters params) {
|
||||
auto chunksGuard = makeScopeGuard([&] { chunks.finish(); });
|
||||
std::uint64_t bytesRead = 0;
|
||||
|
||||
// Break the input up into chunks of size `step` and compress each chunk
|
||||
// independently.
|
||||
size_t step = calculateStep(size, numThreads, params);
|
||||
state.log(DEBUG, "Chosen frame size: %zu\n", step);
|
||||
auto status = FileStatus::Continue;
|
||||
while (status == FileStatus::Continue && !state.errorHolder.hasError()) {
|
||||
// Make a new input queue that we will put the chunk's input data into.
|
||||
auto in = std::make_shared<BufferWorkQueue>();
|
||||
auto inGuard = makeScopeGuard([&] { in->finish(); });
|
||||
// Make a new output queue that compress will put the compressed data into.
|
||||
auto out = std::make_shared<BufferWorkQueue>();
|
||||
// Start compression in the thread pool
|
||||
executor.add([&state, in, out, step] {
|
||||
return compress(
|
||||
state, std::move(in), std::move(out), step);
|
||||
});
|
||||
// Pass the output queue to the writer thread.
|
||||
chunks.push(std::move(out));
|
||||
state.log(VERBOSE, "%s\n", "Starting a new frame");
|
||||
// Fill the input queue for the compression job we just started
|
||||
status = readData(*in, ZSTD_CStreamInSize(), step, fd, &bytesRead);
|
||||
}
|
||||
state.errorHolder.check(status != FileStatus::Error, "Error reading input");
|
||||
return bytesRead;
|
||||
}
|
||||
|
||||
/**
|
||||
* Decompress a frame, whose data is streamed into `in`, and stream the output
|
||||
* to `out`.
|
||||
*
|
||||
* @param state The shared state
|
||||
* @param in Queue that we `pop()` input buffers from. It contains
|
||||
* exactly one compressed frame.
|
||||
* @param out Queue that we `push()` decompressed output buffers to
|
||||
*/
|
||||
static void decompress(
|
||||
SharedState& state,
|
||||
std::shared_ptr<BufferWorkQueue> in,
|
||||
std::shared_ptr<BufferWorkQueue> out) {
|
||||
auto& errorHolder = state.errorHolder;
|
||||
auto guard = makeScopeGuard([&] { out->finish(); });
|
||||
// Initialize the DCtx
|
||||
auto ctx = state.dStreamPool->get();
|
||||
if (!errorHolder.check(ctx != nullptr, "Failed to allocate ZSTD_DStream")) {
|
||||
return;
|
||||
}
|
||||
{
|
||||
auto err = ZSTD_resetDStream(ctx.get());
|
||||
if (!errorHolder.check(!ZSTD_isError(err), ZSTD_getErrorName(err))) {
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
const size_t outSize = ZSTD_DStreamOutSize();
|
||||
Buffer inBuffer;
|
||||
size_t returnCode = 0;
|
||||
// Read a buffer in from the input queue
|
||||
while (in->pop(inBuffer) && !errorHolder.hasError()) {
|
||||
auto zstdInBuffer = makeZstdInBuffer(inBuffer);
|
||||
// Decompress the whole buffer and send it to the output queue
|
||||
while (!inBuffer.empty() && !errorHolder.hasError()) {
|
||||
// Allocate a buffer with at least outSize bytes.
|
||||
Buffer outBuffer(outSize);
|
||||
auto zstdOutBuffer = makeZstdOutBuffer(outBuffer);
|
||||
// Decompress
|
||||
returnCode =
|
||||
ZSTD_decompressStream(ctx.get(), &zstdOutBuffer, &zstdInBuffer);
|
||||
if (!errorHolder.check(
|
||||
!ZSTD_isError(returnCode), ZSTD_getErrorName(returnCode))) {
|
||||
return;
|
||||
}
|
||||
// Pass the buffer with the decompressed data to the output queue
|
||||
out->push(split(outBuffer, zstdOutBuffer));
|
||||
// Advance past the input we already read
|
||||
advance(inBuffer, zstdInBuffer);
|
||||
if (returnCode == 0) {
|
||||
// The frame is over, prepare to (maybe) start a new frame
|
||||
ZSTD_initDStream(ctx.get());
|
||||
}
|
||||
}
|
||||
}
|
||||
if (!errorHolder.check(returnCode <= 1, "Incomplete block")) {
|
||||
return;
|
||||
}
|
||||
// We've given ZSTD_decompressStream all of our data, but there may still
|
||||
// be data to read.
|
||||
while (returnCode == 1) {
|
||||
// Allocate a buffer with at least outSize bytes.
|
||||
Buffer outBuffer(outSize);
|
||||
auto zstdOutBuffer = makeZstdOutBuffer(outBuffer);
|
||||
// Pass in no input.
|
||||
ZSTD_inBuffer zstdInBuffer{nullptr, 0, 0};
|
||||
// Decompress
|
||||
returnCode =
|
||||
ZSTD_decompressStream(ctx.get(), &zstdOutBuffer, &zstdInBuffer);
|
||||
if (!errorHolder.check(
|
||||
!ZSTD_isError(returnCode), ZSTD_getErrorName(returnCode))) {
|
||||
return;
|
||||
}
|
||||
// Pass the buffer with the decompressed data to the output queue
|
||||
out->push(split(outBuffer, zstdOutBuffer));
|
||||
}
|
||||
}
|
||||
|
||||
std::uint64_t asyncDecompressFrames(
|
||||
SharedState& state,
|
||||
WorkQueue<std::shared_ptr<BufferWorkQueue>>& frames,
|
||||
ThreadPool& executor,
|
||||
FILE* fd) {
|
||||
auto framesGuard = makeScopeGuard([&] { frames.finish(); });
|
||||
std::uint64_t totalBytesRead = 0;
|
||||
|
||||
// Split the source up into its component frames.
|
||||
// If we find our recognized skippable frame we know the next frames size
|
||||
// which means that we can decompress each standard frame in independently.
|
||||
// Otherwise, we will decompress using only one decompression task.
|
||||
const size_t chunkSize = ZSTD_DStreamInSize();
|
||||
auto status = FileStatus::Continue;
|
||||
while (status == FileStatus::Continue && !state.errorHolder.hasError()) {
|
||||
// Make a new input queue that we will put the frames's bytes into.
|
||||
auto in = std::make_shared<BufferWorkQueue>();
|
||||
auto inGuard = makeScopeGuard([&] { in->finish(); });
|
||||
// Make a output queue that decompress will put the decompressed data into
|
||||
auto out = std::make_shared<BufferWorkQueue>();
|
||||
|
||||
size_t frameSize;
|
||||
{
|
||||
// Calculate the size of the next frame.
|
||||
// frameSize is 0 if the frame info can't be decoded.
|
||||
Buffer buffer(SkippableFrame::kSize);
|
||||
auto bytesRead = std::fread(buffer.data(), 1, buffer.size(), fd);
|
||||
totalBytesRead += bytesRead;
|
||||
status = fileStatus(fd);
|
||||
if (bytesRead == 0 && status != FileStatus::Continue) {
|
||||
break;
|
||||
}
|
||||
buffer.subtract(buffer.size() - bytesRead);
|
||||
frameSize = SkippableFrame::tryRead(buffer.range());
|
||||
in->push(std::move(buffer));
|
||||
}
|
||||
if (frameSize == 0) {
|
||||
// We hit a non SkippableFrame, so this will be the last job.
|
||||
// Make sure that we don't use too much memory
|
||||
in->setMaxSize(64);
|
||||
out->setMaxSize(64);
|
||||
}
|
||||
// Start decompression in the thread pool
|
||||
executor.add([&state, in, out] {
|
||||
return decompress(state, std::move(in), std::move(out));
|
||||
});
|
||||
// Pass the output queue to the writer thread
|
||||
frames.push(std::move(out));
|
||||
if (frameSize == 0) {
|
||||
// We hit a non SkippableFrame ==> not compressed by pzstd or corrupted
|
||||
// Pass the rest of the source to this decompression task
|
||||
state.log(VERBOSE, "%s\n",
|
||||
"Input not in pzstd format, falling back to serial decompression");
|
||||
while (status == FileStatus::Continue && !state.errorHolder.hasError()) {
|
||||
status = readData(*in, chunkSize, chunkSize, fd, &totalBytesRead);
|
||||
}
|
||||
break;
|
||||
}
|
||||
state.log(VERBOSE, "Decompressing a frame of size %zu", frameSize);
|
||||
// Fill the input queue for the decompression job we just started
|
||||
status = readData(*in, chunkSize, frameSize, fd, &totalBytesRead);
|
||||
}
|
||||
state.errorHolder.check(status != FileStatus::Error, "Error reading input");
|
||||
return totalBytesRead;
|
||||
}
|
||||
|
||||
/// Write `data` to `fd`, returns true iff success.
|
||||
static bool writeData(ByteRange data, FILE* fd) {
|
||||
while (!data.empty()) {
|
||||
data.advance(std::fwrite(data.begin(), 1, data.size(), fd));
|
||||
if (std::ferror(fd)) {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
std::uint64_t writeFile(
|
||||
SharedState& state,
|
||||
WorkQueue<std::shared_ptr<BufferWorkQueue>>& outs,
|
||||
FILE* outputFd,
|
||||
bool decompress) {
|
||||
auto& errorHolder = state.errorHolder;
|
||||
auto lineClearGuard = makeScopeGuard([&state] {
|
||||
state.log.clear(INFO);
|
||||
});
|
||||
std::uint64_t bytesWritten = 0;
|
||||
std::shared_ptr<BufferWorkQueue> out;
|
||||
// Grab the output queue for each decompression job (in order).
|
||||
while (outs.pop(out)) {
|
||||
if (errorHolder.hasError()) {
|
||||
continue;
|
||||
}
|
||||
if (!decompress) {
|
||||
// If we are compressing and want to write skippable frames we can't
|
||||
// start writing before compression is done because we need to know the
|
||||
// compressed size.
|
||||
// Wait for the compressed size to be available and write skippable frame
|
||||
SkippableFrame frame(out->size());
|
||||
if (!writeData(frame.data(), outputFd)) {
|
||||
errorHolder.setError("Failed to write output");
|
||||
return bytesWritten;
|
||||
}
|
||||
bytesWritten += frame.kSize;
|
||||
}
|
||||
// For each chunk of the frame: Pop it from the queue and write it
|
||||
Buffer buffer;
|
||||
while (out->pop(buffer) && !errorHolder.hasError()) {
|
||||
if (!writeData(buffer.range(), outputFd)) {
|
||||
errorHolder.setError("Failed to write output");
|
||||
return bytesWritten;
|
||||
}
|
||||
bytesWritten += buffer.size();
|
||||
state.log.update(INFO, "Written: %u MB ",
|
||||
static_cast<std::uint32_t>(bytesWritten >> 20));
|
||||
}
|
||||
}
|
||||
return bytesWritten;
|
||||
}
|
||||
}
|
@ -1,150 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include "ErrorHolder.h"
|
||||
#include "Logging.h"
|
||||
#include "Options.h"
|
||||
#include "utils/Buffer.h"
|
||||
#include "utils/Range.h"
|
||||
#include "utils/ResourcePool.h"
|
||||
#include "utils/ThreadPool.h"
|
||||
#include "utils/WorkQueue.h"
|
||||
#define ZSTD_STATIC_LINKING_ONLY
|
||||
#include "zstd.h"
|
||||
#undef ZSTD_STATIC_LINKING_ONLY
|
||||
|
||||
#include <cstddef>
|
||||
#include <cstdint>
|
||||
#include <memory>
|
||||
|
||||
namespace pzstd {
|
||||
/**
|
||||
* Runs pzstd with `options` and returns the number of bytes written.
|
||||
* An error occurred if `errorHandler.hasError()`.
|
||||
*
|
||||
* @param options The pzstd options to use for (de)compression
|
||||
* @returns 0 upon success and non-zero on failure.
|
||||
*/
|
||||
int pzstdMain(const Options& options);
|
||||
|
||||
class SharedState {
|
||||
public:
|
||||
SharedState(const Options& options) : log(options.verbosity) {
|
||||
if (!options.decompress) {
|
||||
auto parameters = options.determineParameters();
|
||||
cStreamPool.reset(new ResourcePool<ZSTD_CStream>{
|
||||
[this, parameters]() -> ZSTD_CStream* {
|
||||
this->log(VERBOSE, "%s\n", "Creating new ZSTD_CStream");
|
||||
auto zcs = ZSTD_createCStream();
|
||||
if (zcs) {
|
||||
auto err = ZSTD_initCStream_advanced(
|
||||
zcs, nullptr, 0, parameters, 0);
|
||||
if (ZSTD_isError(err)) {
|
||||
ZSTD_freeCStream(zcs);
|
||||
return nullptr;
|
||||
}
|
||||
}
|
||||
return zcs;
|
||||
},
|
||||
[](ZSTD_CStream *zcs) {
|
||||
ZSTD_freeCStream(zcs);
|
||||
}});
|
||||
} else {
|
||||
dStreamPool.reset(new ResourcePool<ZSTD_DStream>{
|
||||
[this]() -> ZSTD_DStream* {
|
||||
this->log(VERBOSE, "%s\n", "Creating new ZSTD_DStream");
|
||||
auto zds = ZSTD_createDStream();
|
||||
if (zds) {
|
||||
auto err = ZSTD_initDStream(zds);
|
||||
if (ZSTD_isError(err)) {
|
||||
ZSTD_freeDStream(zds);
|
||||
return nullptr;
|
||||
}
|
||||
}
|
||||
return zds;
|
||||
},
|
||||
[](ZSTD_DStream *zds) {
|
||||
ZSTD_freeDStream(zds);
|
||||
}});
|
||||
}
|
||||
}
|
||||
|
||||
~SharedState() {
|
||||
// The resource pools have references to this, so destroy them first.
|
||||
cStreamPool.reset();
|
||||
dStreamPool.reset();
|
||||
}
|
||||
|
||||
Logger log;
|
||||
ErrorHolder errorHolder;
|
||||
std::unique_ptr<ResourcePool<ZSTD_CStream>> cStreamPool;
|
||||
std::unique_ptr<ResourcePool<ZSTD_DStream>> dStreamPool;
|
||||
};
|
||||
|
||||
/**
|
||||
* Streams input from `fd`, breaks input up into chunks, and compresses each
|
||||
* chunk independently. Output of each chunk gets streamed to a queue, and
|
||||
* the output queues get put into `chunks` in order.
|
||||
*
|
||||
* @param state The shared state
|
||||
* @param chunks Each compression jobs output queue gets `pushed()` here
|
||||
* as soon as it is available
|
||||
* @param executor The thread pool to run compression jobs in
|
||||
* @param fd The input file descriptor
|
||||
* @param size The size of the input file if known, 0 otherwise
|
||||
* @param numThreads The number of threads in the thread pool
|
||||
* @param parameters The zstd parameters to use for compression
|
||||
* @returns The number of bytes read from the file
|
||||
*/
|
||||
std::uint64_t asyncCompressChunks(
|
||||
SharedState& state,
|
||||
WorkQueue<std::shared_ptr<BufferWorkQueue>>& chunks,
|
||||
ThreadPool& executor,
|
||||
FILE* fd,
|
||||
std::uintmax_t size,
|
||||
std::size_t numThreads,
|
||||
ZSTD_parameters parameters);
|
||||
|
||||
/**
|
||||
* Streams input from `fd`. If pzstd headers are available it breaks the input
|
||||
* up into independent frames. It sends each frame to an independent
|
||||
* decompression job. Output of each frame gets streamed to a queue, and
|
||||
* the output queues get put into `frames` in order.
|
||||
*
|
||||
* @param state The shared state
|
||||
* @param frames Each decompression jobs output queue gets `pushed()` here
|
||||
* as soon as it is available
|
||||
* @param executor The thread pool to run compression jobs in
|
||||
* @param fd The input file descriptor
|
||||
* @returns The number of bytes read from the file
|
||||
*/
|
||||
std::uint64_t asyncDecompressFrames(
|
||||
SharedState& state,
|
||||
WorkQueue<std::shared_ptr<BufferWorkQueue>>& frames,
|
||||
ThreadPool& executor,
|
||||
FILE* fd);
|
||||
|
||||
/**
|
||||
* Streams input in from each queue in `outs` in order, and writes the data to
|
||||
* `outputFd`.
|
||||
*
|
||||
* @param state The shared state
|
||||
* @param outs A queue of output queues, one for each
|
||||
* (de)compression job.
|
||||
* @param outputFd The file descriptor to write to
|
||||
* @param decompress Are we decompressing?
|
||||
* @returns The number of bytes written
|
||||
*/
|
||||
std::uint64_t writeFile(
|
||||
SharedState& state,
|
||||
WorkQueue<std::shared_ptr<BufferWorkQueue>>& outs,
|
||||
FILE* outputFd,
|
||||
bool decompress);
|
||||
}
|
@ -1,56 +0,0 @@
|
||||
# Parallel Zstandard (PZstandard)
|
||||
|
||||
Parallel Zstandard is a Pigz-like tool for Zstandard.
|
||||
It provides Zstandard format compatible compression and decompression that is able to utilize multiple cores.
|
||||
It breaks the input up into equal sized chunks and compresses each chunk independently into a Zstandard frame.
|
||||
It then concatenates the frames together to produce the final compressed output.
|
||||
Pzstandard will write a 12 byte header for each frame that is a skippable frame in the Zstandard format, which tells PZstandard the size of the next compressed frame.
|
||||
PZstandard supports parallel decompression of files compressed with PZstandard.
|
||||
When decompressing files compressed with Zstandard, PZstandard does IO in one thread, and decompression in another.
|
||||
|
||||
## Usage
|
||||
|
||||
PZstandard supports the same command line interface as Zstandard, but also provides the `-p` option to specify the number of threads.
|
||||
Dictionary mode is not currently supported.
|
||||
|
||||
Basic usage
|
||||
|
||||
pzstd input-file -o output-file -p num-threads -# # Compression
|
||||
pzstd -d input-file -o output-file -p num-threads # Decompression
|
||||
|
||||
PZstandard also supports piping and fifo pipes
|
||||
|
||||
cat input-file | pzstd -p num-threads -# -c > /dev/null
|
||||
|
||||
For more options
|
||||
|
||||
pzstd --help
|
||||
|
||||
PZstandard tries to pick a smart default number of threads if not specified (displayed in `pzstd --help`).
|
||||
If this number is not suitable, during compilation you can define `PZSTD_NUM_THREADS` to the number of threads you prefer.
|
||||
|
||||
## Benchmarks
|
||||
|
||||
As a reference, PZstandard and Pigz were compared on an Intel Core i7 @ 3.1 GHz, each using 4 threads, with the [Silesia compression corpus](http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia).
|
||||
|
||||
Compression Speed vs Ratio with 4 Threads | Decompression Speed with 4 Threads
|
||||
------------------------------------------|-----------------------------------
|
||||
![Compression Speed vs Ratio](images/Cspeed.png "Compression Speed vs Ratio") | ![Decompression Speed](images/Dspeed.png "Decompression Speed")
|
||||
|
||||
The test procedure was to run each of the following commands 2 times for each compression level, and take the minimum time.
|
||||
|
||||
time pzstd -# -p 4 -c silesia.tar > silesia.tar.zst
|
||||
time pzstd -d -p 4 -c silesia.tar.zst > /dev/null
|
||||
|
||||
time pigz -# -p 4 -k -c silesia.tar > silesia.tar.gz
|
||||
time pigz -d -p 4 -k -c silesia.tar.gz > /dev/null
|
||||
|
||||
PZstandard was tested using compression levels 1-19, and Pigz was tested using compression levels 1-9.
|
||||
Pigz cannot do parallel decompression, it simply does each of reading, decompression, and writing on separate threads.
|
||||
|
||||
## Tests
|
||||
|
||||
Tests require that you have [gtest](https://github.com/google/googletest) installed.
|
||||
Set `GTEST_INC` and `GTEST_LIB` in `Makefile` to specify the location of the gtest headers and libraries.
|
||||
Alternatively, run `make googletest`, which will clone googletest and build it.
|
||||
Run `make tests && make check` to run tests.
|
@ -1,30 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "SkippableFrame.h"
|
||||
#include "mem.h"
|
||||
#include "utils/Range.h"
|
||||
|
||||
#include <cstdio>
|
||||
|
||||
using namespace pzstd;
|
||||
|
||||
SkippableFrame::SkippableFrame(std::uint32_t size) : frameSize_(size) {
|
||||
MEM_writeLE32(data_.data(), kSkippableFrameMagicNumber);
|
||||
MEM_writeLE32(data_.data() + 4, kFrameContentsSize);
|
||||
MEM_writeLE32(data_.data() + 8, frameSize_);
|
||||
}
|
||||
|
||||
/* static */ std::size_t SkippableFrame::tryRead(ByteRange bytes) {
|
||||
if (bytes.size() < SkippableFrame::kSize ||
|
||||
MEM_readLE32(bytes.begin()) != kSkippableFrameMagicNumber ||
|
||||
MEM_readLE32(bytes.begin() + 4) != kFrameContentsSize) {
|
||||
return 0;
|
||||
}
|
||||
return MEM_readLE32(bytes.begin() + 8);
|
||||
}
|
@ -1,64 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include "utils/Range.h"
|
||||
|
||||
#include <array>
|
||||
#include <cstddef>
|
||||
#include <cstdint>
|
||||
#include <cstdio>
|
||||
|
||||
namespace pzstd {
|
||||
/**
|
||||
* We put a skippable frame before each frame.
|
||||
* It contains a skippable frame magic number, the size of the skippable frame,
|
||||
* and the size of the next frame.
|
||||
* Each skippable frame is exactly 12 bytes in little endian format.
|
||||
* The first 8 bytes are for compatibility with the ZSTD format.
|
||||
* If we have N threads, the output will look like
|
||||
*
|
||||
* [0x184D2A50|4|size1] [frame1 of size size1]
|
||||
* [0x184D2A50|4|size2] [frame2 of size size2]
|
||||
* ...
|
||||
* [0x184D2A50|4|sizeN] [frameN of size sizeN]
|
||||
*
|
||||
* Each sizeX is 4 bytes.
|
||||
*
|
||||
* These skippable frames should allow us to skip through the compressed file
|
||||
* and only load at most N pages.
|
||||
*/
|
||||
class SkippableFrame {
|
||||
public:
|
||||
static constexpr std::size_t kSize = 12;
|
||||
|
||||
private:
|
||||
std::uint32_t frameSize_;
|
||||
std::array<std::uint8_t, kSize> data_;
|
||||
static constexpr std::uint32_t kSkippableFrameMagicNumber = 0x184D2A50;
|
||||
// Could be improved if the size fits in less bytes
|
||||
static constexpr std::uint32_t kFrameContentsSize = kSize - 8;
|
||||
|
||||
public:
|
||||
// Write the skippable frame to data_ in LE format.
|
||||
explicit SkippableFrame(std::uint32_t size);
|
||||
|
||||
// Read the skippable frame from bytes in LE format.
|
||||
static std::size_t tryRead(ByteRange bytes);
|
||||
|
||||
ByteRange data() const {
|
||||
return {data_.data(), data_.size()};
|
||||
}
|
||||
|
||||
// Size of the next frame.
|
||||
std::size_t frameSize() const {
|
||||
return frameSize_;
|
||||
}
|
||||
};
|
||||
}
|
Binary file not shown.
Before Width: | Height: | Size: 68 KiB |
Binary file not shown.
Before Width: | Height: | Size: 26 KiB |
@ -1,27 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "ErrorHolder.h"
|
||||
#include "Options.h"
|
||||
#include "Pzstd.h"
|
||||
|
||||
using namespace pzstd;
|
||||
|
||||
int main(int argc, const char** argv) {
|
||||
Options options;
|
||||
switch (options.parse(argc, argv)) {
|
||||
case Options::Status::Failure:
|
||||
return 1;
|
||||
case Options::Status::Message:
|
||||
return 0;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
|
||||
return pzstdMain(options);
|
||||
}
|
@ -1,37 +0,0 @@
|
||||
cxx_test(
|
||||
name='options_test',
|
||||
srcs=['OptionsTest.cpp'],
|
||||
deps=['//contrib/pzstd:options'],
|
||||
)
|
||||
|
||||
cxx_test(
|
||||
name='pzstd_test',
|
||||
srcs=['PzstdTest.cpp'],
|
||||
deps=[
|
||||
':round_trip',
|
||||
'//contrib/pzstd:libpzstd',
|
||||
'//contrib/pzstd/utils:scope_guard',
|
||||
'//programs:datagen',
|
||||
],
|
||||
)
|
||||
|
||||
cxx_binary(
|
||||
name='round_trip_test',
|
||||
srcs=['RoundTripTest.cpp'],
|
||||
deps=[
|
||||
':round_trip',
|
||||
'//contrib/pzstd/utils:scope_guard',
|
||||
'//programs:datagen',
|
||||
]
|
||||
)
|
||||
|
||||
cxx_library(
|
||||
name='round_trip',
|
||||
header_namespace='test',
|
||||
exported_headers=['RoundTrip.h'],
|
||||
deps=[
|
||||
'//contrib/pzstd:libpzstd',
|
||||
'//contrib/pzstd:options',
|
||||
'//contrib/pzstd/utils:scope_guard',
|
||||
]
|
||||
)
|
@ -1,536 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "Options.h"
|
||||
|
||||
#include <array>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
using namespace pzstd;
|
||||
|
||||
namespace pzstd {
|
||||
bool operator==(const Options &lhs, const Options &rhs) {
|
||||
return lhs.numThreads == rhs.numThreads &&
|
||||
lhs.maxWindowLog == rhs.maxWindowLog &&
|
||||
lhs.compressionLevel == rhs.compressionLevel &&
|
||||
lhs.decompress == rhs.decompress && lhs.inputFiles == rhs.inputFiles &&
|
||||
lhs.outputFile == rhs.outputFile && lhs.overwrite == rhs.overwrite &&
|
||||
lhs.keepSource == rhs.keepSource && lhs.writeMode == rhs.writeMode &&
|
||||
lhs.checksum == rhs.checksum && lhs.verbosity == rhs.verbosity;
|
||||
}
|
||||
|
||||
std::ostream &operator<<(std::ostream &out, const Options &opt) {
|
||||
out << "{";
|
||||
{
|
||||
out << "\n\t"
|
||||
<< "numThreads: " << opt.numThreads;
|
||||
out << ",\n\t"
|
||||
<< "maxWindowLog: " << opt.maxWindowLog;
|
||||
out << ",\n\t"
|
||||
<< "compressionLevel: " << opt.compressionLevel;
|
||||
out << ",\n\t"
|
||||
<< "decompress: " << opt.decompress;
|
||||
out << ",\n\t"
|
||||
<< "inputFiles: {";
|
||||
{
|
||||
bool first = true;
|
||||
for (const auto &file : opt.inputFiles) {
|
||||
if (!first) {
|
||||
out << ",";
|
||||
}
|
||||
first = false;
|
||||
out << "\n\t\t" << file;
|
||||
}
|
||||
}
|
||||
out << "\n\t}";
|
||||
out << ",\n\t"
|
||||
<< "outputFile: " << opt.outputFile;
|
||||
out << ",\n\t"
|
||||
<< "overwrite: " << opt.overwrite;
|
||||
out << ",\n\t"
|
||||
<< "keepSource: " << opt.keepSource;
|
||||
out << ",\n\t"
|
||||
<< "writeMode: " << static_cast<int>(opt.writeMode);
|
||||
out << ",\n\t"
|
||||
<< "checksum: " << opt.checksum;
|
||||
out << ",\n\t"
|
||||
<< "verbosity: " << opt.verbosity;
|
||||
}
|
||||
out << "\n}";
|
||||
return out;
|
||||
}
|
||||
}
|
||||
|
||||
namespace {
|
||||
#ifdef _WIN32
|
||||
const char nullOutput[] = "nul";
|
||||
#else
|
||||
const char nullOutput[] = "/dev/null";
|
||||
#endif
|
||||
|
||||
constexpr auto autoMode = Options::WriteMode::Auto;
|
||||
} // anonymous namespace
|
||||
|
||||
#define EXPECT_SUCCESS(...) EXPECT_EQ(Options::Status::Success, __VA_ARGS__)
|
||||
#define EXPECT_FAILURE(...) EXPECT_EQ(Options::Status::Failure, __VA_ARGS__)
|
||||
#define EXPECT_MESSAGE(...) EXPECT_EQ(Options::Status::Message, __VA_ARGS__)
|
||||
|
||||
template <typename... Args>
|
||||
std::array<const char *, sizeof...(Args) + 1> makeArray(Args... args) {
|
||||
return {{nullptr, args...}};
|
||||
}
|
||||
|
||||
TEST(Options, ValidInputs) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("--processes", "5", "-o", "x", "y", "-f");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
Options expected = {5, 23, 3, false, {"y"}, "x",
|
||||
true, true, autoMode, true, 2};
|
||||
EXPECT_EQ(expected, options);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-p", "1", "input", "-19");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
Options expected = {1, 23, 19, false, {"input"}, "",
|
||||
false, true, autoMode, true, 2};
|
||||
EXPECT_EQ(expected, options);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args =
|
||||
makeArray("--ultra", "-22", "-p", "1", "-o", "x", "-d", "x.zst", "-f");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
Options expected = {1, 0, 22, true, {"x.zst"}, "x",
|
||||
true, true, autoMode, true, 2};
|
||||
EXPECT_EQ(expected, options);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("--processes", "100", "hello.zst", "--decompress",
|
||||
"--force");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
Options expected = {100, 23, 3, true, {"hello.zst"}, "", true,
|
||||
true, autoMode, true, 2};
|
||||
EXPECT_EQ(expected, options);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-dp", "1", "-c");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
Options expected = {1, 23, 3, true, {"x"}, "-",
|
||||
false, true, autoMode, true, 2};
|
||||
EXPECT_EQ(expected, options);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-dp", "1", "--stdout");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
Options expected = {1, 23, 3, true, {"x"}, "-",
|
||||
false, true, autoMode, true, 2};
|
||||
EXPECT_EQ(expected, options);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-p", "1", "x", "-5", "-fo", "-", "--ultra", "-d");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
Options expected = {1, 0, 5, true, {"x"}, "-",
|
||||
true, true, autoMode, true, 2};
|
||||
EXPECT_EQ(expected, options);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("silesia.tar", "-o", "silesia.tar.pzstd", "-p", "2");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
Options expected = {2,
|
||||
23,
|
||||
3,
|
||||
false,
|
||||
{"silesia.tar"},
|
||||
"silesia.tar.pzstd",
|
||||
false,
|
||||
true,
|
||||
autoMode,
|
||||
true,
|
||||
2};
|
||||
EXPECT_EQ(expected, options);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-p", "1");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-p", "1");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, GetOutputFile) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ("x.zst", options.getOutputFile(options.inputFiles[0]));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "y", "-o", nullOutput);
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(nullOutput, options.getOutputFile(options.inputFiles[0]));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x.zst", "-do", nullOutput);
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(nullOutput, options.getOutputFile(options.inputFiles[0]));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x.zst", "-d");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ("x", options.getOutputFile(options.inputFiles[0]));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("xzst", "-d");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ("", options.getOutputFile(options.inputFiles[0]));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("xzst", "-doxx");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ("xx", options.getOutputFile(options.inputFiles[0]));
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, MultipleFiles) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "y", "z");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
Options expected;
|
||||
expected.inputFiles = {"x", "y", "z"};
|
||||
expected.verbosity = 1;
|
||||
EXPECT_EQ(expected, options);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "y", "z", "-o", nullOutput);
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
Options expected;
|
||||
expected.inputFiles = {"x", "y", "z"};
|
||||
expected.outputFile = nullOutput;
|
||||
expected.verbosity = 1;
|
||||
EXPECT_EQ(expected, options);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "y", "-o-");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "y", "-o", "file");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-qqvd12qp4", "-f", "x", "--", "--rm", "-c");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
Options expected = {4, 23, 12, true, {"x", "--rm", "-c"},
|
||||
"", true, true, autoMode, true,
|
||||
0};
|
||||
EXPECT_EQ(expected, options);
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, NumThreads) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-dfo", "-");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-p", "0", "-fo", "-");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-f", "-p", "-o", "-");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, BadCompressionLevel) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-20");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "--ultra", "-23");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "--1"); // negative 1?
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, InvalidOption) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-x");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, BadOutputFile) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("notzst", "-d", "-p", "1");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ("", options.getOutputFile(options.inputFiles.front()));
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, BadOptionsWithArguments) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-pf");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-p", "10f");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-p");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-o");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-o");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, KeepSource) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "--rm", "-k");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(true, options.keepSource);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "--rm", "--keep");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(true, options.keepSource);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(true, options.keepSource);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "--rm");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(false, options.keepSource);
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, Verbosity) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(2, options.verbosity);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("--quiet", "-qq", "x");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(-1, options.verbosity);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "y");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(1, options.verbosity);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("--", "x", "y");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(1, options.verbosity);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-qv", "x", "y");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(1, options.verbosity);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-v", "x", "y");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(3, options.verbosity);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-v", "x");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(3, options.verbosity);
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, TestMode) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-t");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(true, options.keepSource);
|
||||
EXPECT_EQ(true, options.decompress);
|
||||
EXPECT_EQ(nullOutput, options.outputFile);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "--test", "--rm", "-ohello");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(true, options.keepSource);
|
||||
EXPECT_EQ(true, options.decompress);
|
||||
EXPECT_EQ(nullOutput, options.outputFile);
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, Checksum) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x.zst", "--no-check", "-Cd");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(true, options.checksum);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(true, options.checksum);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "--no-check", "--check");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(true, options.checksum);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "--no-check");
|
||||
EXPECT_SUCCESS(options.parse(args.size(), args.data()));
|
||||
EXPECT_EQ(false, options.checksum);
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, InputFiles) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-cd");
|
||||
options.parse(args.size(), args.data());
|
||||
EXPECT_EQ(1, options.inputFiles.size());
|
||||
EXPECT_EQ("-", options.inputFiles[0]);
|
||||
EXPECT_EQ("-", options.outputFile);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray();
|
||||
options.parse(args.size(), args.data());
|
||||
EXPECT_EQ(1, options.inputFiles.size());
|
||||
EXPECT_EQ("-", options.inputFiles[0]);
|
||||
EXPECT_EQ("-", options.outputFile);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-d");
|
||||
options.parse(args.size(), args.data());
|
||||
EXPECT_EQ(1, options.inputFiles.size());
|
||||
EXPECT_EQ("-", options.inputFiles[0]);
|
||||
EXPECT_EQ("-", options.outputFile);
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("x", "-");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, InvalidOptions) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-ibasdf");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("- ");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-n15");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-0", "x");
|
||||
EXPECT_FAILURE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Options, Extras) {
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-h");
|
||||
EXPECT_MESSAGE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-H");
|
||||
EXPECT_MESSAGE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("-V");
|
||||
EXPECT_MESSAGE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("--help");
|
||||
EXPECT_MESSAGE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
{
|
||||
Options options;
|
||||
auto args = makeArray("--version");
|
||||
EXPECT_MESSAGE(options.parse(args.size(), args.data()));
|
||||
}
|
||||
}
|
@ -1,149 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "Pzstd.h"
|
||||
extern "C" {
|
||||
#include "datagen.h"
|
||||
}
|
||||
#include "test/RoundTrip.h"
|
||||
#include "utils/ScopeGuard.h"
|
||||
|
||||
#include <cstddef>
|
||||
#include <cstdio>
|
||||
#include <gtest/gtest.h>
|
||||
#include <memory>
|
||||
#include <random>
|
||||
|
||||
using namespace std;
|
||||
using namespace pzstd;
|
||||
|
||||
TEST(Pzstd, SmallSizes) {
|
||||
unsigned seed = std::random_device{}();
|
||||
std::fprintf(stderr, "Pzstd.SmallSizes seed: %u\n", seed);
|
||||
std::mt19937 gen(seed);
|
||||
|
||||
for (unsigned len = 1; len < 256; ++len) {
|
||||
if (len % 16 == 0) {
|
||||
std::fprintf(stderr, "%u / 16\n", len / 16);
|
||||
}
|
||||
std::string inputFile = std::tmpnam(nullptr);
|
||||
auto guard = makeScopeGuard([&] { std::remove(inputFile.c_str()); });
|
||||
{
|
||||
static uint8_t buf[256];
|
||||
RDG_genBuffer(buf, len, 0.5, 0.0, gen());
|
||||
auto fd = std::fopen(inputFile.c_str(), "wb");
|
||||
auto written = std::fwrite(buf, 1, len, fd);
|
||||
std::fclose(fd);
|
||||
ASSERT_EQ(written, len);
|
||||
}
|
||||
for (unsigned numThreads = 1; numThreads <= 2; ++numThreads) {
|
||||
for (unsigned level = 1; level <= 4; level *= 4) {
|
||||
auto errorGuard = makeScopeGuard([&] {
|
||||
std::fprintf(stderr, "# threads: %u\n", numThreads);
|
||||
std::fprintf(stderr, "compression level: %u\n", level);
|
||||
});
|
||||
Options options;
|
||||
options.overwrite = true;
|
||||
options.inputFiles = {inputFile};
|
||||
options.numThreads = numThreads;
|
||||
options.compressionLevel = level;
|
||||
options.verbosity = 1;
|
||||
ASSERT_TRUE(roundTrip(options));
|
||||
errorGuard.dismiss();
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Pzstd, LargeSizes) {
|
||||
unsigned seed = std::random_device{}();
|
||||
std::fprintf(stderr, "Pzstd.LargeSizes seed: %u\n", seed);
|
||||
std::mt19937 gen(seed);
|
||||
|
||||
for (unsigned len = 1 << 20; len <= (1 << 24); len *= 2) {
|
||||
std::string inputFile = std::tmpnam(nullptr);
|
||||
auto guard = makeScopeGuard([&] { std::remove(inputFile.c_str()); });
|
||||
{
|
||||
std::unique_ptr<uint8_t[]> buf(new uint8_t[len]);
|
||||
RDG_genBuffer(buf.get(), len, 0.5, 0.0, gen());
|
||||
auto fd = std::fopen(inputFile.c_str(), "wb");
|
||||
auto written = std::fwrite(buf.get(), 1, len, fd);
|
||||
std::fclose(fd);
|
||||
ASSERT_EQ(written, len);
|
||||
}
|
||||
for (unsigned numThreads = 1; numThreads <= 16; numThreads *= 4) {
|
||||
for (unsigned level = 1; level <= 4; level *= 4) {
|
||||
auto errorGuard = makeScopeGuard([&] {
|
||||
std::fprintf(stderr, "# threads: %u\n", numThreads);
|
||||
std::fprintf(stderr, "compression level: %u\n", level);
|
||||
});
|
||||
Options options;
|
||||
options.overwrite = true;
|
||||
options.inputFiles = {inputFile};
|
||||
options.numThreads = std::min(numThreads, options.numThreads);
|
||||
options.compressionLevel = level;
|
||||
options.verbosity = 1;
|
||||
ASSERT_TRUE(roundTrip(options));
|
||||
errorGuard.dismiss();
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Pzstd, DISABLED_ExtremelyLargeSize) {
|
||||
unsigned seed = std::random_device{}();
|
||||
std::fprintf(stderr, "Pzstd.ExtremelyLargeSize seed: %u\n", seed);
|
||||
std::mt19937 gen(seed);
|
||||
|
||||
std::string inputFile = std::tmpnam(nullptr);
|
||||
auto guard = makeScopeGuard([&] { std::remove(inputFile.c_str()); });
|
||||
|
||||
{
|
||||
// Write 4GB + 64 MB
|
||||
constexpr size_t kLength = 1 << 26;
|
||||
std::unique_ptr<uint8_t[]> buf(new uint8_t[kLength]);
|
||||
auto fd = std::fopen(inputFile.c_str(), "wb");
|
||||
auto closeGuard = makeScopeGuard([&] { std::fclose(fd); });
|
||||
for (size_t i = 0; i < (1 << 6) + 1; ++i) {
|
||||
RDG_genBuffer(buf.get(), kLength, 0.5, 0.0, gen());
|
||||
auto written = std::fwrite(buf.get(), 1, kLength, fd);
|
||||
if (written != kLength) {
|
||||
std::fprintf(stderr, "Failed to write file, skipping test\n");
|
||||
return;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
Options options;
|
||||
options.overwrite = true;
|
||||
options.inputFiles = {inputFile};
|
||||
options.compressionLevel = 1;
|
||||
if (options.numThreads == 0) {
|
||||
options.numThreads = 1;
|
||||
}
|
||||
ASSERT_TRUE(roundTrip(options));
|
||||
}
|
||||
|
||||
TEST(Pzstd, ExtremelyCompressible) {
|
||||
std::string inputFile = std::tmpnam(nullptr);
|
||||
auto guard = makeScopeGuard([&] { std::remove(inputFile.c_str()); });
|
||||
{
|
||||
std::unique_ptr<uint8_t[]> buf(new uint8_t[10000]);
|
||||
std::memset(buf.get(), 'a', 10000);
|
||||
auto fd = std::fopen(inputFile.c_str(), "wb");
|
||||
auto written = std::fwrite(buf.get(), 1, 10000, fd);
|
||||
std::fclose(fd);
|
||||
ASSERT_EQ(written, 10000);
|
||||
}
|
||||
Options options;
|
||||
options.overwrite = true;
|
||||
options.inputFiles = {inputFile};
|
||||
options.numThreads = 1;
|
||||
options.compressionLevel = 1;
|
||||
ASSERT_TRUE(roundTrip(options));
|
||||
}
|
@ -1,86 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include "Options.h"
|
||||
#include "Pzstd.h"
|
||||
#include "utils/ScopeGuard.h"
|
||||
|
||||
#include <cstdio>
|
||||
#include <string>
|
||||
#include <cstdint>
|
||||
#include <memory>
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
inline bool check(std::string source, std::string decompressed) {
|
||||
std::unique_ptr<std::uint8_t[]> sBuf(new std::uint8_t[1024]);
|
||||
std::unique_ptr<std::uint8_t[]> dBuf(new std::uint8_t[1024]);
|
||||
|
||||
auto sFd = std::fopen(source.c_str(), "rb");
|
||||
auto dFd = std::fopen(decompressed.c_str(), "rb");
|
||||
auto guard = makeScopeGuard([&] {
|
||||
std::fclose(sFd);
|
||||
std::fclose(dFd);
|
||||
});
|
||||
|
||||
size_t sRead, dRead;
|
||||
|
||||
do {
|
||||
sRead = std::fread(sBuf.get(), 1, 1024, sFd);
|
||||
dRead = std::fread(dBuf.get(), 1, 1024, dFd);
|
||||
if (std::ferror(sFd) || std::ferror(dFd)) {
|
||||
return false;
|
||||
}
|
||||
if (sRead != dRead) {
|
||||
return false;
|
||||
}
|
||||
|
||||
for (size_t i = 0; i < sRead; ++i) {
|
||||
if (sBuf.get()[i] != dBuf.get()[i]) {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
} while (sRead == 1024);
|
||||
if (!std::feof(sFd) || !std::feof(dFd)) {
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
inline bool roundTrip(Options& options) {
|
||||
if (options.inputFiles.size() != 1) {
|
||||
return false;
|
||||
}
|
||||
std::string source = options.inputFiles.front();
|
||||
std::string compressedFile = std::tmpnam(nullptr);
|
||||
std::string decompressedFile = std::tmpnam(nullptr);
|
||||
auto guard = makeScopeGuard([&] {
|
||||
std::remove(compressedFile.c_str());
|
||||
std::remove(decompressedFile.c_str());
|
||||
});
|
||||
|
||||
{
|
||||
options.outputFile = compressedFile;
|
||||
options.decompress = false;
|
||||
if (pzstdMain(options) != 0) {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
{
|
||||
options.decompress = true;
|
||||
options.inputFiles.front() = compressedFile;
|
||||
options.outputFile = decompressedFile;
|
||||
if (pzstdMain(options) != 0) {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
return check(source, decompressedFile);
|
||||
}
|
||||
}
|
@ -1,86 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
extern "C" {
|
||||
#include "datagen.h"
|
||||
}
|
||||
#include "Options.h"
|
||||
#include "test/RoundTrip.h"
|
||||
#include "utils/ScopeGuard.h"
|
||||
|
||||
#include <cstddef>
|
||||
#include <cstdio>
|
||||
#include <cstdlib>
|
||||
#include <memory>
|
||||
#include <random>
|
||||
|
||||
using namespace std;
|
||||
using namespace pzstd;
|
||||
|
||||
namespace {
|
||||
string
|
||||
writeData(size_t size, double matchProba, double litProba, unsigned seed) {
|
||||
std::unique_ptr<uint8_t[]> buf(new uint8_t[size]);
|
||||
RDG_genBuffer(buf.get(), size, matchProba, litProba, seed);
|
||||
string file = tmpnam(nullptr);
|
||||
auto fd = std::fopen(file.c_str(), "wb");
|
||||
auto guard = makeScopeGuard([&] { std::fclose(fd); });
|
||||
auto bytesWritten = std::fwrite(buf.get(), 1, size, fd);
|
||||
if (bytesWritten != size) {
|
||||
std::abort();
|
||||
}
|
||||
return file;
|
||||
}
|
||||
|
||||
template <typename Generator>
|
||||
string generateInputFile(Generator& gen) {
|
||||
// Use inputs ranging from 1 Byte to 2^16 Bytes
|
||||
std::uniform_int_distribution<size_t> size{1, 1 << 16};
|
||||
std::uniform_real_distribution<> prob{0, 1};
|
||||
return writeData(size(gen), prob(gen), prob(gen), gen());
|
||||
}
|
||||
|
||||
template <typename Generator>
|
||||
Options generateOptions(Generator& gen, const string& inputFile) {
|
||||
Options options;
|
||||
options.inputFiles = {inputFile};
|
||||
options.overwrite = true;
|
||||
|
||||
std::uniform_int_distribution<unsigned> numThreads{1, 32};
|
||||
std::uniform_int_distribution<unsigned> compressionLevel{1, 10};
|
||||
|
||||
options.numThreads = numThreads(gen);
|
||||
options.compressionLevel = compressionLevel(gen);
|
||||
|
||||
return options;
|
||||
}
|
||||
}
|
||||
|
||||
int main() {
|
||||
std::mt19937 gen(std::random_device{}());
|
||||
|
||||
auto newlineGuard = makeScopeGuard([] { std::fprintf(stderr, "\n"); });
|
||||
for (unsigned i = 0; i < 10000; ++i) {
|
||||
if (i % 100 == 0) {
|
||||
std::fprintf(stderr, "Progress: %u%%\r", i / 100);
|
||||
}
|
||||
auto inputFile = generateInputFile(gen);
|
||||
auto inputGuard = makeScopeGuard([&] { std::remove(inputFile.c_str()); });
|
||||
for (unsigned i = 0; i < 10; ++i) {
|
||||
auto options = generateOptions(gen, inputFile);
|
||||
if (!roundTrip(options)) {
|
||||
std::fprintf(stderr, "numThreads: %u\n", options.numThreads);
|
||||
std::fprintf(stderr, "level: %u\n", options.compressionLevel);
|
||||
std::fprintf(stderr, "decompress? %u\n", (unsigned)options.decompress);
|
||||
std::fprintf(stderr, "file: %s\n", inputFile.c_str());
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
@ -1,75 +0,0 @@
|
||||
cxx_library(
|
||||
name='buffer',
|
||||
visibility=['PUBLIC'],
|
||||
header_namespace='utils',
|
||||
exported_headers=['Buffer.h'],
|
||||
deps=[':range'],
|
||||
)
|
||||
|
||||
cxx_library(
|
||||
name='file_system',
|
||||
visibility=['PUBLIC'],
|
||||
header_namespace='utils',
|
||||
exported_headers=['FileSystem.h'],
|
||||
deps=[':range'],
|
||||
)
|
||||
|
||||
cxx_library(
|
||||
name='likely',
|
||||
visibility=['PUBLIC'],
|
||||
header_namespace='utils',
|
||||
exported_headers=['Likely.h'],
|
||||
)
|
||||
|
||||
cxx_library(
|
||||
name='range',
|
||||
visibility=['PUBLIC'],
|
||||
header_namespace='utils',
|
||||
exported_headers=['Range.h'],
|
||||
deps=[':likely'],
|
||||
)
|
||||
|
||||
cxx_library(
|
||||
name='resource_pool',
|
||||
visibility=['PUBLIC'],
|
||||
header_namespace='utils',
|
||||
exported_headers=['ResourcePool.h'],
|
||||
)
|
||||
|
||||
cxx_library(
|
||||
name='scope_guard',
|
||||
visibility=['PUBLIC'],
|
||||
header_namespace='utils',
|
||||
exported_headers=['ScopeGuard.h'],
|
||||
)
|
||||
|
||||
cxx_library(
|
||||
name='thread_pool',
|
||||
visibility=['PUBLIC'],
|
||||
header_namespace='utils',
|
||||
exported_headers=['ThreadPool.h'],
|
||||
deps=[':work_queue'],
|
||||
)
|
||||
|
||||
cxx_library(
|
||||
name='work_queue',
|
||||
visibility=['PUBLIC'],
|
||||
header_namespace='utils',
|
||||
exported_headers=['WorkQueue.h'],
|
||||
deps=[':buffer'],
|
||||
)
|
||||
|
||||
cxx_library(
|
||||
name='utils',
|
||||
visibility=['PUBLIC'],
|
||||
deps=[
|
||||
':buffer',
|
||||
':file_system',
|
||||
':likely',
|
||||
':range',
|
||||
':resource_pool',
|
||||
':scope_guard',
|
||||
':thread_pool',
|
||||
':work_queue',
|
||||
],
|
||||
)
|
@ -1,99 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include "utils/Range.h"
|
||||
|
||||
#include <array>
|
||||
#include <cstddef>
|
||||
#include <memory>
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
/**
|
||||
* A `Buffer` has a pointer to a shared buffer, and a range of the buffer that
|
||||
* it owns.
|
||||
* The idea is that you can allocate one buffer, and write chunks into it
|
||||
* and break off those chunks.
|
||||
* The underlying buffer is reference counted, and will be destroyed when all
|
||||
* `Buffer`s that reference it are destroyed.
|
||||
*/
|
||||
class Buffer {
|
||||
std::shared_ptr<unsigned char> buffer_;
|
||||
MutableByteRange range_;
|
||||
|
||||
static void delete_buffer(unsigned char* buffer) {
|
||||
delete[] buffer;
|
||||
}
|
||||
|
||||
public:
|
||||
/// Construct an empty buffer that owns no data.
|
||||
explicit Buffer() {}
|
||||
|
||||
/// Construct a `Buffer` that owns a new underlying buffer of size `size`.
|
||||
explicit Buffer(std::size_t size)
|
||||
: buffer_(new unsigned char[size], delete_buffer),
|
||||
range_(buffer_.get(), buffer_.get() + size) {}
|
||||
|
||||
explicit Buffer(std::shared_ptr<unsigned char> buffer, MutableByteRange data)
|
||||
: buffer_(buffer), range_(data) {}
|
||||
|
||||
Buffer(Buffer&&) = default;
|
||||
Buffer& operator=(Buffer&&) & = default;
|
||||
|
||||
/**
|
||||
* Splits the data into two pieces: [begin, begin + n), [begin + n, end).
|
||||
* Their data both points into the same underlying buffer.
|
||||
* Modifies the original `Buffer` to point to only [begin + n, end).
|
||||
*
|
||||
* @param n The offset to split at.
|
||||
* @returns A buffer that owns the data [begin, begin + n).
|
||||
*/
|
||||
Buffer splitAt(std::size_t n) {
|
||||
auto firstPiece = range_.subpiece(0, n);
|
||||
range_.advance(n);
|
||||
return Buffer(buffer_, firstPiece);
|
||||
}
|
||||
|
||||
/// Modifies the buffer to point to the range [begin + n, end).
|
||||
void advance(std::size_t n) {
|
||||
range_.advance(n);
|
||||
}
|
||||
|
||||
/// Modifies the buffer to point to the range [begin, end - n).
|
||||
void subtract(std::size_t n) {
|
||||
range_.subtract(n);
|
||||
}
|
||||
|
||||
/// Returns a read only `Range` pointing to the `Buffer`s data.
|
||||
ByteRange range() const {
|
||||
return range_;
|
||||
}
|
||||
/// Returns a mutable `Range` pointing to the `Buffer`s data.
|
||||
MutableByteRange range() {
|
||||
return range_;
|
||||
}
|
||||
|
||||
const unsigned char* data() const {
|
||||
return range_.data();
|
||||
}
|
||||
|
||||
unsigned char* data() {
|
||||
return range_.data();
|
||||
}
|
||||
|
||||
std::size_t size() const {
|
||||
return range_.size();
|
||||
}
|
||||
|
||||
bool empty() const {
|
||||
return range_.empty();
|
||||
}
|
||||
};
|
||||
}
|
@ -1,94 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include "utils/Range.h"
|
||||
|
||||
#include <sys/stat.h>
|
||||
#include <cerrno>
|
||||
#include <cstdint>
|
||||
#include <system_error>
|
||||
|
||||
// A small subset of `std::filesystem`.
|
||||
// `std::filesystem` should be a drop in replacement.
|
||||
// See http://en.cppreference.com/w/cpp/filesystem for documentation.
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
// using file_status = ... causes gcc to emit a false positive warning
|
||||
#if defined(_MSC_VER)
|
||||
typedef struct ::_stat64 file_status;
|
||||
#else
|
||||
typedef struct ::stat file_status;
|
||||
#endif
|
||||
|
||||
/// http://en.cppreference.com/w/cpp/filesystem/status
|
||||
inline file_status status(StringPiece path, std::error_code& ec) noexcept {
|
||||
file_status status;
|
||||
#if defined(_MSC_VER)
|
||||
const auto error = ::_stat64(path.data(), &status);
|
||||
#else
|
||||
const auto error = ::stat(path.data(), &status);
|
||||
#endif
|
||||
if (error) {
|
||||
ec.assign(errno, std::generic_category());
|
||||
} else {
|
||||
ec.clear();
|
||||
}
|
||||
return status;
|
||||
}
|
||||
|
||||
/// http://en.cppreference.com/w/cpp/filesystem/is_regular_file
|
||||
inline bool is_regular_file(file_status status) noexcept {
|
||||
#if defined(S_ISREG)
|
||||
return S_ISREG(status.st_mode);
|
||||
#elif !defined(S_ISREG) && defined(S_IFMT) && defined(S_IFREG)
|
||||
return (status.st_mode & S_IFMT) == S_IFREG;
|
||||
#else
|
||||
static_assert(false, "No POSIX stat() support.");
|
||||
#endif
|
||||
}
|
||||
|
||||
/// http://en.cppreference.com/w/cpp/filesystem/is_regular_file
|
||||
inline bool is_regular_file(StringPiece path, std::error_code& ec) noexcept {
|
||||
return is_regular_file(status(path, ec));
|
||||
}
|
||||
|
||||
/// http://en.cppreference.com/w/cpp/filesystem/is_directory
|
||||
inline bool is_directory(file_status status) noexcept {
|
||||
#if defined(S_ISDIR)
|
||||
return S_ISDIR(status.st_mode);
|
||||
#elif !defined(S_ISDIR) && defined(S_IFMT) && defined(S_IFDIR)
|
||||
return (status.st_mode & S_IFMT) == S_IFDIR;
|
||||
#else
|
||||
static_assert(false, "NO POSIX stat() support.");
|
||||
#endif
|
||||
}
|
||||
|
||||
/// http://en.cppreference.com/w/cpp/filesystem/is_directory
|
||||
inline bool is_directory(StringPiece path, std::error_code& ec) noexcept {
|
||||
return is_directory(status(path, ec));
|
||||
}
|
||||
|
||||
/// http://en.cppreference.com/w/cpp/filesystem/file_size
|
||||
inline std::uintmax_t file_size(
|
||||
StringPiece path,
|
||||
std::error_code& ec) noexcept {
|
||||
auto stat = status(path, ec);
|
||||
if (ec) {
|
||||
return -1;
|
||||
}
|
||||
if (!is_regular_file(stat)) {
|
||||
ec.assign(ENOTSUP, std::generic_category());
|
||||
return -1;
|
||||
}
|
||||
ec.clear();
|
||||
return stat.st_size;
|
||||
}
|
||||
}
|
@ -1,28 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
|
||||
/**
|
||||
* Compiler hints to indicate the fast path of an "if" branch: whether
|
||||
* the if condition is likely to be true or false.
|
||||
*
|
||||
* @author Tudor Bosman (tudorb@fb.com)
|
||||
*/
|
||||
|
||||
#pragma once
|
||||
|
||||
#undef LIKELY
|
||||
#undef UNLIKELY
|
||||
|
||||
#if defined(__GNUC__) && __GNUC__ >= 4
|
||||
#define LIKELY(x) (__builtin_expect((x), 1))
|
||||
#define UNLIKELY(x) (__builtin_expect((x), 0))
|
||||
#else
|
||||
#define LIKELY(x) (x)
|
||||
#define UNLIKELY(x) (x)
|
||||
#endif
|
@ -1,131 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
|
||||
/**
|
||||
* A subset of `folly/Range.h`.
|
||||
* All code copied verbatim modulo formatting
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include "utils/Likely.h"
|
||||
|
||||
#include <cstddef>
|
||||
#include <cstring>
|
||||
#include <stdexcept>
|
||||
#include <string>
|
||||
#include <type_traits>
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
namespace detail {
|
||||
/*
|
||||
*Use IsCharPointer<T>::type to enable const char* or char*.
|
||||
*Use IsCharPointer<T>::const_type to enable only const char*.
|
||||
*/
|
||||
template <class T>
|
||||
struct IsCharPointer {};
|
||||
|
||||
template <>
|
||||
struct IsCharPointer<char*> {
|
||||
typedef int type;
|
||||
};
|
||||
|
||||
template <>
|
||||
struct IsCharPointer<const char*> {
|
||||
typedef int const_type;
|
||||
typedef int type;
|
||||
};
|
||||
|
||||
} // namespace detail
|
||||
|
||||
template <typename Iter>
|
||||
class Range {
|
||||
Iter b_;
|
||||
Iter e_;
|
||||
|
||||
public:
|
||||
using size_type = std::size_t;
|
||||
using iterator = Iter;
|
||||
using const_iterator = Iter;
|
||||
using value_type = typename std::remove_reference<
|
||||
typename std::iterator_traits<Iter>::reference>::type;
|
||||
using reference = typename std::iterator_traits<Iter>::reference;
|
||||
|
||||
constexpr Range() : b_(), e_() {}
|
||||
constexpr Range(Iter begin, Iter end) : b_(begin), e_(end) {}
|
||||
|
||||
constexpr Range(Iter begin, size_type size) : b_(begin), e_(begin + size) {}
|
||||
|
||||
template <class T = Iter, typename detail::IsCharPointer<T>::type = 0>
|
||||
/* implicit */ Range(Iter str) : b_(str), e_(str + std::strlen(str)) {}
|
||||
|
||||
template <class T = Iter, typename detail::IsCharPointer<T>::const_type = 0>
|
||||
/* implicit */ Range(const std::string& str)
|
||||
: b_(str.data()), e_(b_ + str.size()) {}
|
||||
|
||||
// Allow implicit conversion from Range<From> to Range<To> if From is
|
||||
// implicitly convertible to To.
|
||||
template <
|
||||
class OtherIter,
|
||||
typename std::enable_if<
|
||||
(!std::is_same<Iter, OtherIter>::value &&
|
||||
std::is_convertible<OtherIter, Iter>::value),
|
||||
int>::type = 0>
|
||||
constexpr /* implicit */ Range(const Range<OtherIter>& other)
|
||||
: b_(other.begin()), e_(other.end()) {}
|
||||
|
||||
Range(const Range&) = default;
|
||||
Range(Range&&) = default;
|
||||
|
||||
Range& operator=(const Range&) & = default;
|
||||
Range& operator=(Range&&) & = default;
|
||||
|
||||
constexpr size_type size() const {
|
||||
return e_ - b_;
|
||||
}
|
||||
bool empty() const {
|
||||
return b_ == e_;
|
||||
}
|
||||
Iter data() const {
|
||||
return b_;
|
||||
}
|
||||
Iter begin() const {
|
||||
return b_;
|
||||
}
|
||||
Iter end() const {
|
||||
return e_;
|
||||
}
|
||||
|
||||
void advance(size_type n) {
|
||||
if (UNLIKELY(n > size())) {
|
||||
throw std::out_of_range("index out of range");
|
||||
}
|
||||
b_ += n;
|
||||
}
|
||||
|
||||
void subtract(size_type n) {
|
||||
if (UNLIKELY(n > size())) {
|
||||
throw std::out_of_range("index out of range");
|
||||
}
|
||||
e_ -= n;
|
||||
}
|
||||
|
||||
Range subpiece(size_type first, size_type length = std::string::npos) const {
|
||||
if (UNLIKELY(first > size())) {
|
||||
throw std::out_of_range("index out of range");
|
||||
}
|
||||
|
||||
return Range(b_ + first, std::min(length, size() - first));
|
||||
}
|
||||
};
|
||||
|
||||
using ByteRange = Range<const unsigned char*>;
|
||||
using MutableByteRange = Range<unsigned char*>;
|
||||
using StringPiece = Range<const char*>;
|
||||
}
|
@ -1,96 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include <cassert>
|
||||
#include <functional>
|
||||
#include <memory>
|
||||
#include <mutex>
|
||||
#include <vector>
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
/**
|
||||
* An unbounded pool of resources.
|
||||
* A `ResourcePool<T>` requires a factory function that takes allocates `T*` and
|
||||
* a free function that frees a `T*`.
|
||||
* Calling `ResourcePool::get()` will give you a new `ResourcePool::UniquePtr`
|
||||
* to a `T`, and when it goes out of scope the resource will be returned to the
|
||||
* pool.
|
||||
* The `ResourcePool<T>` *must* survive longer than any resources it hands out.
|
||||
* Remember that `ResourcePool<T>` hands out mutable `T`s, so make sure to clean
|
||||
* up the resource before or after every use.
|
||||
*/
|
||||
template <typename T>
|
||||
class ResourcePool {
|
||||
public:
|
||||
class Deleter;
|
||||
using Factory = std::function<T*()>;
|
||||
using Free = std::function<void(T*)>;
|
||||
using UniquePtr = std::unique_ptr<T, Deleter>;
|
||||
|
||||
private:
|
||||
std::mutex mutex_;
|
||||
Factory factory_;
|
||||
Free free_;
|
||||
std::vector<T*> resources_;
|
||||
unsigned inUse_;
|
||||
|
||||
public:
|
||||
/**
|
||||
* Creates a `ResourcePool`.
|
||||
*
|
||||
* @param factory The function to use to create new resources.
|
||||
* @param free The function to use to free resources created by `factory`.
|
||||
*/
|
||||
ResourcePool(Factory factory, Free free)
|
||||
: factory_(std::move(factory)), free_(std::move(free)), inUse_(0) {}
|
||||
|
||||
/**
|
||||
* @returns A unique pointer to a resource. The resource is null iff
|
||||
* there are no available resources and `factory()` returns null.
|
||||
*/
|
||||
UniquePtr get() {
|
||||
std::lock_guard<std::mutex> lock(mutex_);
|
||||
if (!resources_.empty()) {
|
||||
UniquePtr resource{resources_.back(), Deleter{*this}};
|
||||
resources_.pop_back();
|
||||
++inUse_;
|
||||
return resource;
|
||||
}
|
||||
UniquePtr resource{factory_(), Deleter{*this}};
|
||||
++inUse_;
|
||||
return resource;
|
||||
}
|
||||
|
||||
~ResourcePool() noexcept {
|
||||
assert(inUse_ == 0);
|
||||
for (const auto resource : resources_) {
|
||||
free_(resource);
|
||||
}
|
||||
}
|
||||
|
||||
class Deleter {
|
||||
ResourcePool *pool_;
|
||||
public:
|
||||
explicit Deleter(ResourcePool &pool) : pool_(&pool) {}
|
||||
|
||||
void operator() (T *resource) {
|
||||
std::lock_guard<std::mutex> lock(pool_->mutex_);
|
||||
// Make sure we don't put null resources into the pool
|
||||
if (resource) {
|
||||
pool_->resources_.push_back(resource);
|
||||
}
|
||||
assert(pool_->inUse_ > 0);
|
||||
--pool_->inUse_;
|
||||
}
|
||||
};
|
||||
};
|
||||
|
||||
}
|
@ -1,50 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include <utility>
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
/**
|
||||
* Dismissable scope guard.
|
||||
* `Function` must be callable and take no parameters.
|
||||
* Unless `dissmiss()` is called, the callable is executed upon destruction of
|
||||
* `ScopeGuard`.
|
||||
*
|
||||
* Example:
|
||||
*
|
||||
* auto guard = makeScopeGuard([&] { cleanup(); });
|
||||
*/
|
||||
template <typename Function>
|
||||
class ScopeGuard {
|
||||
Function function;
|
||||
bool dismissed;
|
||||
|
||||
public:
|
||||
explicit ScopeGuard(Function&& function)
|
||||
: function(std::move(function)), dismissed(false) {}
|
||||
|
||||
void dismiss() {
|
||||
dismissed = true;
|
||||
}
|
||||
|
||||
~ScopeGuard() noexcept {
|
||||
if (!dismissed) {
|
||||
function();
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
/// Creates a scope guard from `function`.
|
||||
template <typename Function>
|
||||
ScopeGuard<Function> makeScopeGuard(Function&& function) {
|
||||
return ScopeGuard<Function>(std::forward<Function>(function));
|
||||
}
|
||||
}
|
@ -1,58 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include "utils/WorkQueue.h"
|
||||
|
||||
#include <cstddef>
|
||||
#include <functional>
|
||||
#include <thread>
|
||||
#include <vector>
|
||||
|
||||
namespace pzstd {
|
||||
/// A simple thread pool that pulls tasks off its queue in FIFO order.
|
||||
class ThreadPool {
|
||||
std::vector<std::thread> threads_;
|
||||
|
||||
WorkQueue<std::function<void()>> tasks_;
|
||||
|
||||
public:
|
||||
/// Constructs a thread pool with `numThreads` threads.
|
||||
explicit ThreadPool(std::size_t numThreads) {
|
||||
threads_.reserve(numThreads);
|
||||
for (std::size_t i = 0; i < numThreads; ++i) {
|
||||
threads_.emplace_back([this] {
|
||||
std::function<void()> task;
|
||||
while (tasks_.pop(task)) {
|
||||
task();
|
||||
}
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
/// Finishes all tasks currently in the queue.
|
||||
~ThreadPool() {
|
||||
tasks_.finish();
|
||||
for (auto& thread : threads_) {
|
||||
thread.join();
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Adds `task` to the queue of tasks to execute. Since `task` is a
|
||||
* `std::function<>`, it cannot be a move only type. So any lambda passed must
|
||||
* not capture move only types (like `std::unique_ptr`).
|
||||
*
|
||||
* @param task The task to execute.
|
||||
*/
|
||||
void add(std::function<void()> task) {
|
||||
tasks_.push(std::move(task));
|
||||
}
|
||||
};
|
||||
}
|
@ -1,181 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#pragma once
|
||||
|
||||
#include "utils/Buffer.h"
|
||||
|
||||
#include <atomic>
|
||||
#include <cassert>
|
||||
#include <cstddef>
|
||||
#include <condition_variable>
|
||||
#include <cstddef>
|
||||
#include <functional>
|
||||
#include <mutex>
|
||||
#include <queue>
|
||||
|
||||
namespace pzstd {
|
||||
|
||||
/// Unbounded thread-safe work queue.
|
||||
template <typename T>
|
||||
class WorkQueue {
|
||||
// Protects all member variable access
|
||||
std::mutex mutex_;
|
||||
std::condition_variable readerCv_;
|
||||
std::condition_variable writerCv_;
|
||||
std::condition_variable finishCv_;
|
||||
|
||||
std::queue<T> queue_;
|
||||
bool done_;
|
||||
std::size_t maxSize_;
|
||||
|
||||
// Must have lock to call this function
|
||||
bool full() const {
|
||||
if (maxSize_ == 0) {
|
||||
return false;
|
||||
}
|
||||
return queue_.size() >= maxSize_;
|
||||
}
|
||||
|
||||
public:
|
||||
/**
|
||||
* Constructs an empty work queue with an optional max size.
|
||||
* If `maxSize == 0` the queue size is unbounded.
|
||||
*
|
||||
* @param maxSize The maximum allowed size of the work queue.
|
||||
*/
|
||||
WorkQueue(std::size_t maxSize = 0) : done_(false), maxSize_(maxSize) {}
|
||||
|
||||
/**
|
||||
* Push an item onto the work queue. Notify a single thread that work is
|
||||
* available. If `finish()` has been called, do nothing and return false.
|
||||
* If `push()` returns false, then `item` has not been moved from.
|
||||
*
|
||||
* @param item Item to push onto the queue.
|
||||
* @returns True upon success, false if `finish()` has been called. An
|
||||
* item was pushed iff `push()` returns true.
|
||||
*/
|
||||
bool push(T&& item) {
|
||||
{
|
||||
std::unique_lock<std::mutex> lock(mutex_);
|
||||
while (full() && !done_) {
|
||||
writerCv_.wait(lock);
|
||||
}
|
||||
if (done_) {
|
||||
return false;
|
||||
}
|
||||
queue_.push(std::move(item));
|
||||
}
|
||||
readerCv_.notify_one();
|
||||
return true;
|
||||
}
|
||||
|
||||
/**
|
||||
* Attempts to pop an item off the work queue. It will block until data is
|
||||
* available or `finish()` has been called.
|
||||
*
|
||||
* @param[out] item If `pop` returns `true`, it contains the popped item.
|
||||
* If `pop` returns `false`, it is unmodified.
|
||||
* @returns True upon success. False if the queue is empty and
|
||||
* `finish()` has been called.
|
||||
*/
|
||||
bool pop(T& item) {
|
||||
{
|
||||
std::unique_lock<std::mutex> lock(mutex_);
|
||||
while (queue_.empty() && !done_) {
|
||||
readerCv_.wait(lock);
|
||||
}
|
||||
if (queue_.empty()) {
|
||||
assert(done_);
|
||||
return false;
|
||||
}
|
||||
item = std::move(queue_.front());
|
||||
queue_.pop();
|
||||
}
|
||||
writerCv_.notify_one();
|
||||
return true;
|
||||
}
|
||||
|
||||
/**
|
||||
* Sets the maximum queue size. If `maxSize == 0` then it is unbounded.
|
||||
*
|
||||
* @param maxSize The new maximum queue size.
|
||||
*/
|
||||
void setMaxSize(std::size_t maxSize) {
|
||||
{
|
||||
std::lock_guard<std::mutex> lock(mutex_);
|
||||
maxSize_ = maxSize;
|
||||
}
|
||||
writerCv_.notify_all();
|
||||
}
|
||||
|
||||
/**
|
||||
* Promise that `push()` won't be called again, so once the queue is empty
|
||||
* there will never any more work.
|
||||
*/
|
||||
void finish() {
|
||||
{
|
||||
std::lock_guard<std::mutex> lock(mutex_);
|
||||
assert(!done_);
|
||||
done_ = true;
|
||||
}
|
||||
readerCv_.notify_all();
|
||||
writerCv_.notify_all();
|
||||
finishCv_.notify_all();
|
||||
}
|
||||
|
||||
/// Blocks until `finish()` has been called (but the queue may not be empty).
|
||||
void waitUntilFinished() {
|
||||
std::unique_lock<std::mutex> lock(mutex_);
|
||||
while (!done_) {
|
||||
finishCv_.wait(lock);
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
/// Work queue for `Buffer`s that knows the total number of bytes in the queue.
|
||||
class BufferWorkQueue {
|
||||
WorkQueue<Buffer> queue_;
|
||||
std::atomic<std::size_t> size_;
|
||||
|
||||
public:
|
||||
BufferWorkQueue(std::size_t maxSize = 0) : queue_(maxSize), size_(0) {}
|
||||
|
||||
void push(Buffer buffer) {
|
||||
size_.fetch_add(buffer.size());
|
||||
queue_.push(std::move(buffer));
|
||||
}
|
||||
|
||||
bool pop(Buffer& buffer) {
|
||||
bool result = queue_.pop(buffer);
|
||||
if (result) {
|
||||
size_.fetch_sub(buffer.size());
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
void setMaxSize(std::size_t maxSize) {
|
||||
queue_.setMaxSize(maxSize);
|
||||
}
|
||||
|
||||
void finish() {
|
||||
queue_.finish();
|
||||
}
|
||||
|
||||
/**
|
||||
* Blocks until `finish()` has been called.
|
||||
*
|
||||
* @returns The total number of bytes of all the `Buffer`s currently in the
|
||||
* queue.
|
||||
*/
|
||||
std::size_t size() {
|
||||
queue_.waitUntilFinished();
|
||||
return size_.load();
|
||||
}
|
||||
};
|
||||
}
|
@ -1,35 +0,0 @@
|
||||
cxx_test(
|
||||
name='buffer_test',
|
||||
srcs=['BufferTest.cpp'],
|
||||
deps=['//contrib/pzstd/utils:buffer'],
|
||||
)
|
||||
|
||||
cxx_test(
|
||||
name='range_test',
|
||||
srcs=['RangeTest.cpp'],
|
||||
deps=['//contrib/pzstd/utils:range'],
|
||||
)
|
||||
|
||||
cxx_test(
|
||||
name='resource_pool_test',
|
||||
srcs=['ResourcePoolTest.cpp'],
|
||||
deps=['//contrib/pzstd/utils:resource_pool'],
|
||||
)
|
||||
|
||||
cxx_test(
|
||||
name='scope_guard_test',
|
||||
srcs=['ScopeGuardTest.cpp'],
|
||||
deps=['//contrib/pzstd/utils:scope_guard'],
|
||||
)
|
||||
|
||||
cxx_test(
|
||||
name='thread_pool_test',
|
||||
srcs=['ThreadPoolTest.cpp'],
|
||||
deps=['//contrib/pzstd/utils:thread_pool'],
|
||||
)
|
||||
|
||||
cxx_test(
|
||||
name='work_queue_test',
|
||||
srcs=['RangeTest.cpp'],
|
||||
deps=['//contrib/pzstd/utils:work_queue'],
|
||||
)
|
@ -1,89 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "utils/Buffer.h"
|
||||
#include "utils/Range.h"
|
||||
|
||||
#include <gtest/gtest.h>
|
||||
#include <memory>
|
||||
|
||||
using namespace pzstd;
|
||||
|
||||
namespace {
|
||||
void deleter(const unsigned char* buf) {
|
||||
delete[] buf;
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Buffer, Constructors) {
|
||||
Buffer empty;
|
||||
EXPECT_TRUE(empty.empty());
|
||||
EXPECT_EQ(0, empty.size());
|
||||
|
||||
Buffer sized(5);
|
||||
EXPECT_FALSE(sized.empty());
|
||||
EXPECT_EQ(5, sized.size());
|
||||
|
||||
Buffer moved(std::move(sized));
|
||||
EXPECT_FALSE(sized.empty());
|
||||
EXPECT_EQ(5, sized.size());
|
||||
|
||||
Buffer assigned;
|
||||
assigned = std::move(moved);
|
||||
EXPECT_FALSE(sized.empty());
|
||||
EXPECT_EQ(5, sized.size());
|
||||
}
|
||||
|
||||
TEST(Buffer, BufferManagement) {
|
||||
std::shared_ptr<unsigned char> buf(new unsigned char[10], deleter);
|
||||
{
|
||||
Buffer acquired(buf, MutableByteRange(buf.get(), buf.get() + 10));
|
||||
EXPECT_EQ(2, buf.use_count());
|
||||
Buffer moved(std::move(acquired));
|
||||
EXPECT_EQ(2, buf.use_count());
|
||||
Buffer assigned;
|
||||
assigned = std::move(moved);
|
||||
EXPECT_EQ(2, buf.use_count());
|
||||
|
||||
Buffer split = assigned.splitAt(5);
|
||||
EXPECT_EQ(3, buf.use_count());
|
||||
|
||||
split.advance(1);
|
||||
assigned.subtract(1);
|
||||
EXPECT_EQ(3, buf.use_count());
|
||||
}
|
||||
EXPECT_EQ(1, buf.use_count());
|
||||
}
|
||||
|
||||
TEST(Buffer, Modifiers) {
|
||||
Buffer buf(10);
|
||||
{
|
||||
unsigned char i = 0;
|
||||
for (auto& byte : buf.range()) {
|
||||
byte = i++;
|
||||
}
|
||||
}
|
||||
|
||||
auto prefix = buf.splitAt(2);
|
||||
|
||||
ASSERT_EQ(2, prefix.size());
|
||||
EXPECT_EQ(0, *prefix.data());
|
||||
|
||||
ASSERT_EQ(8, buf.size());
|
||||
EXPECT_EQ(2, *buf.data());
|
||||
|
||||
buf.advance(2);
|
||||
EXPECT_EQ(4, *buf.data());
|
||||
|
||||
EXPECT_EQ(9, *(buf.range().end() - 1));
|
||||
|
||||
buf.subtract(2);
|
||||
EXPECT_EQ(7, *(buf.range().end() - 1));
|
||||
|
||||
EXPECT_EQ(4, buf.size());
|
||||
}
|
@ -1,82 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "utils/Range.h"
|
||||
|
||||
#include <gtest/gtest.h>
|
||||
#include <string>
|
||||
|
||||
using namespace pzstd;
|
||||
|
||||
// Range is directly copied from folly.
|
||||
// Just some sanity tests to make sure everything seems to work.
|
||||
|
||||
TEST(Range, Constructors) {
|
||||
StringPiece empty;
|
||||
EXPECT_TRUE(empty.empty());
|
||||
EXPECT_EQ(0, empty.size());
|
||||
|
||||
std::string str = "hello";
|
||||
{
|
||||
Range<std::string::const_iterator> piece(str.begin(), str.end());
|
||||
EXPECT_EQ(5, piece.size());
|
||||
EXPECT_EQ('h', *piece.data());
|
||||
EXPECT_EQ('o', *(piece.end() - 1));
|
||||
}
|
||||
|
||||
{
|
||||
StringPiece piece(str.data(), str.size());
|
||||
EXPECT_EQ(5, piece.size());
|
||||
EXPECT_EQ('h', *piece.data());
|
||||
EXPECT_EQ('o', *(piece.end() - 1));
|
||||
}
|
||||
|
||||
{
|
||||
StringPiece piece(str);
|
||||
EXPECT_EQ(5, piece.size());
|
||||
EXPECT_EQ('h', *piece.data());
|
||||
EXPECT_EQ('o', *(piece.end() - 1));
|
||||
}
|
||||
|
||||
{
|
||||
StringPiece piece(str.c_str());
|
||||
EXPECT_EQ(5, piece.size());
|
||||
EXPECT_EQ('h', *piece.data());
|
||||
EXPECT_EQ('o', *(piece.end() - 1));
|
||||
}
|
||||
}
|
||||
|
||||
TEST(Range, Modifiers) {
|
||||
StringPiece range("hello world");
|
||||
ASSERT_EQ(11, range.size());
|
||||
|
||||
{
|
||||
auto hello = range.subpiece(0, 5);
|
||||
EXPECT_EQ(5, hello.size());
|
||||
EXPECT_EQ('h', *hello.data());
|
||||
EXPECT_EQ('o', *(hello.end() - 1));
|
||||
}
|
||||
{
|
||||
auto hello = range;
|
||||
hello.subtract(6);
|
||||
EXPECT_EQ(5, hello.size());
|
||||
EXPECT_EQ('h', *hello.data());
|
||||
EXPECT_EQ('o', *(hello.end() - 1));
|
||||
}
|
||||
{
|
||||
auto world = range;
|
||||
world.advance(6);
|
||||
EXPECT_EQ(5, world.size());
|
||||
EXPECT_EQ('w', *world.data());
|
||||
EXPECT_EQ('d', *(world.end() - 1));
|
||||
}
|
||||
|
||||
std::string expected = "hello world";
|
||||
EXPECT_EQ(expected, std::string(range.begin(), range.end()));
|
||||
EXPECT_EQ(expected, std::string(range.data(), range.size()));
|
||||
}
|
@ -1,72 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "utils/ResourcePool.h"
|
||||
|
||||
#include <gtest/gtest.h>
|
||||
#include <atomic>
|
||||
#include <thread>
|
||||
|
||||
using namespace pzstd;
|
||||
|
||||
TEST(ResourcePool, FullTest) {
|
||||
unsigned numCreated = 0;
|
||||
unsigned numDeleted = 0;
|
||||
{
|
||||
ResourcePool<int> pool(
|
||||
[&numCreated] { ++numCreated; return new int{5}; },
|
||||
[&numDeleted](int *x) { ++numDeleted; delete x; });
|
||||
|
||||
{
|
||||
auto i = pool.get();
|
||||
EXPECT_EQ(5, *i);
|
||||
*i = 6;
|
||||
}
|
||||
{
|
||||
auto i = pool.get();
|
||||
EXPECT_EQ(6, *i);
|
||||
auto j = pool.get();
|
||||
EXPECT_EQ(5, *j);
|
||||
*j = 7;
|
||||
}
|
||||
{
|
||||
auto i = pool.get();
|
||||
EXPECT_EQ(6, *i);
|
||||
auto j = pool.get();
|
||||
EXPECT_EQ(7, *j);
|
||||
}
|
||||
}
|
||||
EXPECT_EQ(2, numCreated);
|
||||
EXPECT_EQ(numCreated, numDeleted);
|
||||
}
|
||||
|
||||
TEST(ResourcePool, ThreadSafe) {
|
||||
std::atomic<unsigned> numCreated{0};
|
||||
std::atomic<unsigned> numDeleted{0};
|
||||
{
|
||||
ResourcePool<int> pool(
|
||||
[&numCreated] { ++numCreated; return new int{0}; },
|
||||
[&numDeleted](int *x) { ++numDeleted; delete x; });
|
||||
auto push = [&pool] {
|
||||
for (int i = 0; i < 100; ++i) {
|
||||
auto x = pool.get();
|
||||
++*x;
|
||||
}
|
||||
};
|
||||
std::thread t1{push};
|
||||
std::thread t2{push};
|
||||
t1.join();
|
||||
t2.join();
|
||||
|
||||
auto x = pool.get();
|
||||
auto y = pool.get();
|
||||
EXPECT_EQ(200, *x + *y);
|
||||
}
|
||||
EXPECT_GE(2, numCreated);
|
||||
EXPECT_EQ(numCreated, numDeleted);
|
||||
}
|
@ -1,28 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "utils/ScopeGuard.h"
|
||||
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
using namespace pzstd;
|
||||
|
||||
TEST(ScopeGuard, Dismiss) {
|
||||
{
|
||||
auto guard = makeScopeGuard([&] { EXPECT_TRUE(false); });
|
||||
guard.dismiss();
|
||||
}
|
||||
}
|
||||
|
||||
TEST(ScopeGuard, Executes) {
|
||||
bool executed = false;
|
||||
{
|
||||
auto guard = makeScopeGuard([&] { executed = true; });
|
||||
}
|
||||
EXPECT_TRUE(executed);
|
||||
}
|
@ -1,71 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "utils/ThreadPool.h"
|
||||
|
||||
#include <gtest/gtest.h>
|
||||
#include <atomic>
|
||||
#include <iostream>
|
||||
#include <thread>
|
||||
#include <vector>
|
||||
|
||||
using namespace pzstd;
|
||||
|
||||
TEST(ThreadPool, Ordering) {
|
||||
std::vector<int> results;
|
||||
|
||||
{
|
||||
ThreadPool executor(1);
|
||||
for (int i = 0; i < 10; ++i) {
|
||||
executor.add([ &results, i ] { results.push_back(i); });
|
||||
}
|
||||
}
|
||||
|
||||
for (int i = 0; i < 10; ++i) {
|
||||
EXPECT_EQ(i, results[i]);
|
||||
}
|
||||
}
|
||||
|
||||
TEST(ThreadPool, AllJobsFinished) {
|
||||
std::atomic<unsigned> numFinished{0};
|
||||
std::atomic<bool> start{false};
|
||||
{
|
||||
std::cerr << "Creating executor" << std::endl;
|
||||
ThreadPool executor(5);
|
||||
for (int i = 0; i < 10; ++i) {
|
||||
executor.add([ &numFinished, &start ] {
|
||||
while (!start.load()) {
|
||||
std::this_thread::yield();
|
||||
}
|
||||
++numFinished;
|
||||
});
|
||||
}
|
||||
std::cerr << "Starting" << std::endl;
|
||||
start.store(true);
|
||||
std::cerr << "Finishing" << std::endl;
|
||||
}
|
||||
EXPECT_EQ(10, numFinished.load());
|
||||
}
|
||||
|
||||
TEST(ThreadPool, AddJobWhileJoining) {
|
||||
std::atomic<bool> done{false};
|
||||
{
|
||||
ThreadPool executor(1);
|
||||
executor.add([&executor, &done] {
|
||||
while (!done.load()) {
|
||||
std::this_thread::yield();
|
||||
}
|
||||
// Sleep for a second to be sure that we are joining
|
||||
std::this_thread::sleep_for(std::chrono::seconds(1));
|
||||
executor.add([] {
|
||||
EXPECT_TRUE(false);
|
||||
});
|
||||
});
|
||||
done.store(true);
|
||||
}
|
||||
}
|
@ -1,282 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
#include "utils/Buffer.h"
|
||||
#include "utils/WorkQueue.h"
|
||||
|
||||
#include <gtest/gtest.h>
|
||||
#include <iostream>
|
||||
#include <memory>
|
||||
#include <mutex>
|
||||
#include <thread>
|
||||
#include <vector>
|
||||
|
||||
using namespace pzstd;
|
||||
|
||||
namespace {
|
||||
struct Popper {
|
||||
WorkQueue<int>* queue;
|
||||
int* results;
|
||||
std::mutex* mutex;
|
||||
|
||||
void operator()() {
|
||||
int result;
|
||||
while (queue->pop(result)) {
|
||||
std::lock_guard<std::mutex> lock(*mutex);
|
||||
results[result] = result;
|
||||
}
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
TEST(WorkQueue, SingleThreaded) {
|
||||
WorkQueue<int> queue;
|
||||
int result;
|
||||
|
||||
queue.push(5);
|
||||
EXPECT_TRUE(queue.pop(result));
|
||||
EXPECT_EQ(5, result);
|
||||
|
||||
queue.push(1);
|
||||
queue.push(2);
|
||||
EXPECT_TRUE(queue.pop(result));
|
||||
EXPECT_EQ(1, result);
|
||||
EXPECT_TRUE(queue.pop(result));
|
||||
EXPECT_EQ(2, result);
|
||||
|
||||
queue.push(1);
|
||||
queue.push(2);
|
||||
queue.finish();
|
||||
EXPECT_TRUE(queue.pop(result));
|
||||
EXPECT_EQ(1, result);
|
||||
EXPECT_TRUE(queue.pop(result));
|
||||
EXPECT_EQ(2, result);
|
||||
EXPECT_FALSE(queue.pop(result));
|
||||
|
||||
queue.waitUntilFinished();
|
||||
}
|
||||
|
||||
TEST(WorkQueue, SPSC) {
|
||||
WorkQueue<int> queue;
|
||||
const int max = 100;
|
||||
|
||||
for (int i = 0; i < 10; ++i) {
|
||||
queue.push(int{i});
|
||||
}
|
||||
|
||||
std::thread thread([ &queue, max ] {
|
||||
int result;
|
||||
for (int i = 0;; ++i) {
|
||||
if (!queue.pop(result)) {
|
||||
EXPECT_EQ(i, max);
|
||||
break;
|
||||
}
|
||||
EXPECT_EQ(i, result);
|
||||
}
|
||||
});
|
||||
|
||||
std::this_thread::yield();
|
||||
for (int i = 10; i < max; ++i) {
|
||||
queue.push(int{i});
|
||||
}
|
||||
queue.finish();
|
||||
|
||||
thread.join();
|
||||
}
|
||||
|
||||
TEST(WorkQueue, SPMC) {
|
||||
WorkQueue<int> queue;
|
||||
std::vector<int> results(50, -1);
|
||||
std::mutex mutex;
|
||||
std::vector<std::thread> threads;
|
||||
for (int i = 0; i < 5; ++i) {
|
||||
threads.emplace_back(Popper{&queue, results.data(), &mutex});
|
||||
}
|
||||
|
||||
for (int i = 0; i < 50; ++i) {
|
||||
queue.push(int{i});
|
||||
}
|
||||
queue.finish();
|
||||
|
||||
for (auto& thread : threads) {
|
||||
thread.join();
|
||||
}
|
||||
|
||||
for (int i = 0; i < 50; ++i) {
|
||||
EXPECT_EQ(i, results[i]);
|
||||
}
|
||||
}
|
||||
|
||||
TEST(WorkQueue, MPMC) {
|
||||
WorkQueue<int> queue;
|
||||
std::vector<int> results(100, -1);
|
||||
std::mutex mutex;
|
||||
std::vector<std::thread> popperThreads;
|
||||
for (int i = 0; i < 4; ++i) {
|
||||
popperThreads.emplace_back(Popper{&queue, results.data(), &mutex});
|
||||
}
|
||||
|
||||
std::vector<std::thread> pusherThreads;
|
||||
for (int i = 0; i < 2; ++i) {
|
||||
auto min = i * 50;
|
||||
auto max = (i + 1) * 50;
|
||||
pusherThreads.emplace_back(
|
||||
[ &queue, min, max ] {
|
||||
for (int i = min; i < max; ++i) {
|
||||
queue.push(int{i});
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
for (auto& thread : pusherThreads) {
|
||||
thread.join();
|
||||
}
|
||||
queue.finish();
|
||||
|
||||
for (auto& thread : popperThreads) {
|
||||
thread.join();
|
||||
}
|
||||
|
||||
for (int i = 0; i < 100; ++i) {
|
||||
EXPECT_EQ(i, results[i]);
|
||||
}
|
||||
}
|
||||
|
||||
TEST(WorkQueue, BoundedSizeWorks) {
|
||||
WorkQueue<int> queue(1);
|
||||
int result;
|
||||
queue.push(5);
|
||||
queue.pop(result);
|
||||
queue.push(5);
|
||||
queue.pop(result);
|
||||
queue.push(5);
|
||||
queue.finish();
|
||||
queue.pop(result);
|
||||
EXPECT_EQ(5, result);
|
||||
}
|
||||
|
||||
TEST(WorkQueue, BoundedSizePushAfterFinish) {
|
||||
WorkQueue<int> queue(1);
|
||||
int result;
|
||||
queue.push(5);
|
||||
std::thread pusher([&queue] {
|
||||
queue.push(6);
|
||||
});
|
||||
// Dirtily try and make sure that pusher has run.
|
||||
std::this_thread::sleep_for(std::chrono::seconds(1));
|
||||
queue.finish();
|
||||
EXPECT_TRUE(queue.pop(result));
|
||||
EXPECT_EQ(5, result);
|
||||
EXPECT_FALSE(queue.pop(result));
|
||||
|
||||
pusher.join();
|
||||
}
|
||||
|
||||
TEST(WorkQueue, SetMaxSize) {
|
||||
WorkQueue<int> queue(2);
|
||||
int result;
|
||||
queue.push(5);
|
||||
queue.push(6);
|
||||
queue.setMaxSize(1);
|
||||
std::thread pusher([&queue] {
|
||||
queue.push(7);
|
||||
});
|
||||
// Dirtily try and make sure that pusher has run.
|
||||
std::this_thread::sleep_for(std::chrono::seconds(1));
|
||||
queue.finish();
|
||||
EXPECT_TRUE(queue.pop(result));
|
||||
EXPECT_EQ(5, result);
|
||||
EXPECT_TRUE(queue.pop(result));
|
||||
EXPECT_EQ(6, result);
|
||||
EXPECT_FALSE(queue.pop(result));
|
||||
|
||||
pusher.join();
|
||||
}
|
||||
|
||||
TEST(WorkQueue, BoundedSizeMPMC) {
|
||||
WorkQueue<int> queue(10);
|
||||
std::vector<int> results(200, -1);
|
||||
std::mutex mutex;
|
||||
std::cerr << "Creating popperThreads" << std::endl;
|
||||
std::vector<std::thread> popperThreads;
|
||||
for (int i = 0; i < 4; ++i) {
|
||||
popperThreads.emplace_back(Popper{&queue, results.data(), &mutex});
|
||||
}
|
||||
|
||||
std::cerr << "Creating pusherThreads" << std::endl;
|
||||
std::vector<std::thread> pusherThreads;
|
||||
for (int i = 0; i < 2; ++i) {
|
||||
auto min = i * 100;
|
||||
auto max = (i + 1) * 100;
|
||||
pusherThreads.emplace_back(
|
||||
[ &queue, min, max ] {
|
||||
for (int i = min; i < max; ++i) {
|
||||
queue.push(int{i});
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
std::cerr << "Joining pusherThreads" << std::endl;
|
||||
for (auto& thread : pusherThreads) {
|
||||
thread.join();
|
||||
}
|
||||
std::cerr << "Finishing queue" << std::endl;
|
||||
queue.finish();
|
||||
|
||||
std::cerr << "Joining popperThreads" << std::endl;
|
||||
for (auto& thread : popperThreads) {
|
||||
thread.join();
|
||||
}
|
||||
|
||||
std::cerr << "Inspecting results" << std::endl;
|
||||
for (int i = 0; i < 200; ++i) {
|
||||
EXPECT_EQ(i, results[i]);
|
||||
}
|
||||
}
|
||||
|
||||
TEST(WorkQueue, FailedPush) {
|
||||
WorkQueue<std::unique_ptr<int>> queue;
|
||||
std::unique_ptr<int> x(new int{5});
|
||||
EXPECT_TRUE(queue.push(std::move(x)));
|
||||
EXPECT_EQ(nullptr, x);
|
||||
queue.finish();
|
||||
x.reset(new int{6});
|
||||
EXPECT_FALSE(queue.push(std::move(x)));
|
||||
EXPECT_NE(nullptr, x);
|
||||
EXPECT_EQ(6, *x);
|
||||
}
|
||||
|
||||
TEST(BufferWorkQueue, SizeCalculatedCorrectly) {
|
||||
{
|
||||
BufferWorkQueue queue;
|
||||
queue.finish();
|
||||
EXPECT_EQ(0, queue.size());
|
||||
}
|
||||
{
|
||||
BufferWorkQueue queue;
|
||||
queue.push(Buffer(10));
|
||||
queue.finish();
|
||||
EXPECT_EQ(10, queue.size());
|
||||
}
|
||||
{
|
||||
BufferWorkQueue queue;
|
||||
queue.push(Buffer(10));
|
||||
queue.push(Buffer(5));
|
||||
queue.finish();
|
||||
EXPECT_EQ(15, queue.size());
|
||||
}
|
||||
{
|
||||
BufferWorkQueue queue;
|
||||
queue.push(Buffer(10));
|
||||
queue.push(Buffer(5));
|
||||
queue.finish();
|
||||
Buffer buffer;
|
||||
queue.pop(buffer);
|
||||
EXPECT_EQ(5, queue.size());
|
||||
}
|
||||
}
|
@ -1,53 +0,0 @@
|
||||
# ################################################################
|
||||
# Copyright (c) 2017-present, Facebook, Inc.
|
||||
# All rights reserved.
|
||||
#
|
||||
# This source code is licensed under both the BSD-style license (found in the
|
||||
# LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
# in the COPYING file in the root directory of this source tree).
|
||||
# ################################################################
|
||||
|
||||
# This Makefile presumes libzstd is built, using `make` in / or /lib/
|
||||
|
||||
ZSTDLIB_PATH = ../../../lib
|
||||
ZSTDLIB_NAME = libzstd.a
|
||||
ZSTDLIB = $(ZSTDLIB_PATH)/$(ZSTDLIB_NAME)
|
||||
|
||||
CPPFLAGS += -I../ -I../../../lib -I../../../lib/common
|
||||
|
||||
CFLAGS ?= -O3
|
||||
CFLAGS += -g
|
||||
|
||||
SEEKABLE_OBJS = ../zstdseek_compress.c ../zstdseek_decompress.c $(ZSTDLIB)
|
||||
|
||||
.PHONY: default all clean test
|
||||
|
||||
default: all
|
||||
|
||||
all: seekable_compression seekable_decompression seekable_decompression_mem \
|
||||
parallel_processing
|
||||
|
||||
$(ZSTDLIB):
|
||||
make -C $(ZSTDLIB_PATH) $(ZSTDLIB_NAME)
|
||||
|
||||
seekable_compression : seekable_compression.c $(SEEKABLE_OBJS)
|
||||
$(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@
|
||||
|
||||
seekable_decompression : seekable_decompression.c $(SEEKABLE_OBJS)
|
||||
$(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@
|
||||
|
||||
seekable_decompression_mem : seekable_decompression_mem.c $(SEEKABLE_OBJS)
|
||||
$(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@
|
||||
|
||||
parallel_processing : parallel_processing.c $(SEEKABLE_OBJS)
|
||||
$(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@ -pthread
|
||||
|
||||
parallel_compression : parallel_compression.c $(SEEKABLE_OBJS)
|
||||
$(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@ -pthread
|
||||
|
||||
clean:
|
||||
@rm -f core *.o tmp* result* *.zst \
|
||||
seekable_compression seekable_decompression \
|
||||
seekable_decompression_mem \
|
||||
parallel_processing parallel_compression
|
||||
@echo Cleaning completed
|
@ -1,215 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2017-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
|
||||
#include <stdlib.h> // malloc, free, exit, atoi
|
||||
#include <stdio.h> // fprintf, perror, feof, fopen, etc.
|
||||
#include <string.h> // strlen, memset, strcat
|
||||
#define ZSTD_STATIC_LINKING_ONLY
|
||||
#include <zstd.h> // presumes zstd library is installed
|
||||
#include <zstd_errors.h>
|
||||
#if defined(WIN32) || defined(_WIN32)
|
||||
# include <windows.h>
|
||||
# define SLEEP(x) Sleep(x)
|
||||
#else
|
||||
# include <unistd.h>
|
||||
# define SLEEP(x) usleep(x * 1000)
|
||||
#endif
|
||||
|
||||
#define XXH_NAMESPACE ZSTD_
|
||||
#include "xxhash.h"
|
||||
|
||||
#include "pool.h" // use zstd thread pool for demo
|
||||
|
||||
#include "zstd_seekable.h"
|
||||
|
||||
static void* malloc_orDie(size_t size)
|
||||
{
|
||||
void* const buff = malloc(size);
|
||||
if (buff) return buff;
|
||||
/* error */
|
||||
perror("malloc:");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
static FILE* fopen_orDie(const char *filename, const char *instruction)
|
||||
{
|
||||
FILE* const inFile = fopen(filename, instruction);
|
||||
if (inFile) return inFile;
|
||||
/* error */
|
||||
perror(filename);
|
||||
exit(3);
|
||||
}
|
||||
|
||||
static size_t fread_orDie(void* buffer, size_t sizeToRead, FILE* file)
|
||||
{
|
||||
size_t const readSize = fread(buffer, 1, sizeToRead, file);
|
||||
if (readSize == sizeToRead) return readSize; /* good */
|
||||
if (feof(file)) return readSize; /* good, reached end of file */
|
||||
/* error */
|
||||
perror("fread");
|
||||
exit(4);
|
||||
}
|
||||
|
||||
static size_t fwrite_orDie(const void* buffer, size_t sizeToWrite, FILE* file)
|
||||
{
|
||||
size_t const writtenSize = fwrite(buffer, 1, sizeToWrite, file);
|
||||
if (writtenSize == sizeToWrite) return sizeToWrite; /* good */
|
||||
/* error */
|
||||
perror("fwrite");
|
||||
exit(5);
|
||||
}
|
||||
|
||||
static size_t fclose_orDie(FILE* file)
|
||||
{
|
||||
if (!fclose(file)) return 0;
|
||||
/* error */
|
||||
perror("fclose");
|
||||
exit(6);
|
||||
}
|
||||
|
||||
static void fseek_orDie(FILE* file, long int offset, int origin)
|
||||
{
|
||||
if (!fseek(file, offset, origin)) {
|
||||
if (!fflush(file)) return;
|
||||
}
|
||||
/* error */
|
||||
perror("fseek");
|
||||
exit(7);
|
||||
}
|
||||
|
||||
static long int ftell_orDie(FILE* file)
|
||||
{
|
||||
long int off = ftell(file);
|
||||
if (off != -1) return off;
|
||||
/* error */
|
||||
perror("ftell");
|
||||
exit(8);
|
||||
}
|
||||
|
||||
struct job {
|
||||
const void* src;
|
||||
size_t srcSize;
|
||||
void* dst;
|
||||
size_t dstSize;
|
||||
|
||||
unsigned checksum;
|
||||
|
||||
int compressionLevel;
|
||||
int done;
|
||||
};
|
||||
|
||||
static void compressFrame(void* opaque)
|
||||
{
|
||||
struct job* job = opaque;
|
||||
|
||||
job->checksum = XXH64(job->src, job->srcSize, 0);
|
||||
|
||||
size_t ret = ZSTD_compress(job->dst, job->dstSize, job->src, job->srcSize, job->compressionLevel);
|
||||
if (ZSTD_isError(ret)) {
|
||||
fprintf(stderr, "ZSTD_compress() error : %s \n", ZSTD_getErrorName(ret));
|
||||
exit(20);
|
||||
}
|
||||
|
||||
job->dstSize = ret;
|
||||
job->done = 1;
|
||||
}
|
||||
|
||||
static void compressFile_orDie(const char* fname, const char* outName, int cLevel, unsigned frameSize, int nbThreads)
|
||||
{
|
||||
POOL_ctx* pool = POOL_create(nbThreads, nbThreads);
|
||||
if (pool == NULL) { fprintf(stderr, "POOL_create() error \n"); exit(9); }
|
||||
|
||||
FILE* const fin = fopen_orDie(fname, "rb");
|
||||
FILE* const fout = fopen_orDie(outName, "wb");
|
||||
|
||||
if (ZSTD_compressBound(frameSize) > 0xFFFFFFFFU) { fprintf(stderr, "Frame size too large \n"); exit(10); }
|
||||
unsigned dstSize = ZSTD_compressBound(frameSize);
|
||||
|
||||
|
||||
fseek_orDie(fin, 0, SEEK_END);
|
||||
long int length = ftell_orDie(fin);
|
||||
fseek_orDie(fin, 0, SEEK_SET);
|
||||
|
||||
size_t numFrames = (length + frameSize - 1) / frameSize;
|
||||
|
||||
struct job* jobs = malloc_orDie(sizeof(struct job) * numFrames);
|
||||
|
||||
size_t i;
|
||||
for(i = 0; i < numFrames; i++) {
|
||||
void* in = malloc_orDie(frameSize);
|
||||
void* out = malloc_orDie(dstSize);
|
||||
|
||||
size_t inSize = fread_orDie(in, frameSize, fin);
|
||||
|
||||
jobs[i].src = in;
|
||||
jobs[i].srcSize = inSize;
|
||||
jobs[i].dst = out;
|
||||
jobs[i].dstSize = dstSize;
|
||||
jobs[i].compressionLevel = cLevel;
|
||||
jobs[i].done = 0;
|
||||
POOL_add(pool, compressFrame, &jobs[i]);
|
||||
}
|
||||
|
||||
ZSTD_frameLog* fl = ZSTD_seekable_createFrameLog(1);
|
||||
if (fl == NULL) { fprintf(stderr, "ZSTD_seekable_createFrameLog() failed \n"); exit(11); }
|
||||
for (i = 0; i < numFrames; i++) {
|
||||
while (!jobs[i].done) SLEEP(5); /* wake up every 5 milliseconds to check */
|
||||
fwrite_orDie(jobs[i].dst, jobs[i].dstSize, fout);
|
||||
free((void*)jobs[i].src);
|
||||
free(jobs[i].dst);
|
||||
|
||||
size_t ret = ZSTD_seekable_logFrame(fl, jobs[i].dstSize, jobs[i].srcSize, jobs[i].checksum);
|
||||
if (ZSTD_isError(ret)) { fprintf(stderr, "ZSTD_seekable_logFrame() error : %s \n", ZSTD_getErrorName(ret)); }
|
||||
}
|
||||
|
||||
{ unsigned char seekTableBuff[1024];
|
||||
ZSTD_outBuffer out = {seekTableBuff, 1024, 0};
|
||||
while (ZSTD_seekable_writeSeekTable(fl, &out) != 0) {
|
||||
fwrite_orDie(seekTableBuff, out.pos, fout);
|
||||
out.pos = 0;
|
||||
}
|
||||
fwrite_orDie(seekTableBuff, out.pos, fout);
|
||||
}
|
||||
|
||||
ZSTD_seekable_freeFrameLog(fl);
|
||||
free(jobs);
|
||||
fclose_orDie(fout);
|
||||
fclose_orDie(fin);
|
||||
}
|
||||
|
||||
static const char* createOutFilename_orDie(const char* filename)
|
||||
{
|
||||
size_t const inL = strlen(filename);
|
||||
size_t const outL = inL + 5;
|
||||
void* outSpace = malloc_orDie(outL);
|
||||
memset(outSpace, 0, outL);
|
||||
strcat(outSpace, filename);
|
||||
strcat(outSpace, ".zst");
|
||||
return (const char*)outSpace;
|
||||
}
|
||||
|
||||
int main(int argc, const char** argv) {
|
||||
const char* const exeName = argv[0];
|
||||
if (argc!=4) {
|
||||
printf("wrong arguments\n");
|
||||
printf("usage:\n");
|
||||
printf("%s FILE FRAME_SIZE NB_THREADS\n", exeName);
|
||||
return 1;
|
||||
}
|
||||
|
||||
{ const char* const inFileName = argv[1];
|
||||
unsigned const frameSize = (unsigned)atoi(argv[2]);
|
||||
int const nbThreads = atoi(argv[3]);
|
||||
|
||||
const char* const outFileName = createOutFilename_orDie(inFileName);
|
||||
compressFile_orDie(inFileName, outFileName, 5, frameSize, nbThreads);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
@ -1,194 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2017-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
|
||||
/*
|
||||
* A simple demo that sums up all the bytes in the file in parallel using
|
||||
* seekable decompression and the zstd thread pool
|
||||
*/
|
||||
|
||||
#include <stdlib.h> // malloc, exit
|
||||
#include <stdio.h> // fprintf, perror, feof
|
||||
#include <string.h> // strerror
|
||||
#include <errno.h> // errno
|
||||
#define ZSTD_STATIC_LINKING_ONLY
|
||||
#include <zstd.h> // presumes zstd library is installed
|
||||
#include <zstd_errors.h>
|
||||
#if defined(WIN32) || defined(_WIN32)
|
||||
# include <windows.h>
|
||||
# define SLEEP(x) Sleep(x)
|
||||
#else
|
||||
# include <unistd.h>
|
||||
# define SLEEP(x) usleep(x * 1000)
|
||||
#endif
|
||||
|
||||
#include "pool.h" // use zstd thread pool for demo
|
||||
|
||||
#include "zstd_seekable.h"
|
||||
|
||||
#define MIN(a, b) ((a) < (b) ? (a) : (b))
|
||||
|
||||
static void* malloc_orDie(size_t size)
|
||||
{
|
||||
void* const buff = malloc(size);
|
||||
if (buff) return buff;
|
||||
/* error */
|
||||
perror("malloc");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
static void* realloc_orDie(void* ptr, size_t size)
|
||||
{
|
||||
ptr = realloc(ptr, size);
|
||||
if (ptr) return ptr;
|
||||
/* error */
|
||||
perror("realloc");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
static FILE* fopen_orDie(const char *filename, const char *instruction)
|
||||
{
|
||||
FILE* const inFile = fopen(filename, instruction);
|
||||
if (inFile) return inFile;
|
||||
/* error */
|
||||
perror(filename);
|
||||
exit(3);
|
||||
}
|
||||
|
||||
static size_t fread_orDie(void* buffer, size_t sizeToRead, FILE* file)
|
||||
{
|
||||
size_t const readSize = fread(buffer, 1, sizeToRead, file);
|
||||
if (readSize == sizeToRead) return readSize; /* good */
|
||||
if (feof(file)) return readSize; /* good, reached end of file */
|
||||
/* error */
|
||||
perror("fread");
|
||||
exit(4);
|
||||
}
|
||||
|
||||
static size_t fwrite_orDie(const void* buffer, size_t sizeToWrite, FILE* file)
|
||||
{
|
||||
size_t const writtenSize = fwrite(buffer, 1, sizeToWrite, file);
|
||||
if (writtenSize == sizeToWrite) return sizeToWrite; /* good */
|
||||
/* error */
|
||||
perror("fwrite");
|
||||
exit(5);
|
||||
}
|
||||
|
||||
static size_t fclose_orDie(FILE* file)
|
||||
{
|
||||
if (!fclose(file)) return 0;
|
||||
/* error */
|
||||
perror("fclose");
|
||||
exit(6);
|
||||
}
|
||||
|
||||
static void fseek_orDie(FILE* file, long int offset, int origin) {
|
||||
if (!fseek(file, offset, origin)) {
|
||||
if (!fflush(file)) return;
|
||||
}
|
||||
/* error */
|
||||
perror("fseek");
|
||||
exit(7);
|
||||
}
|
||||
|
||||
struct sum_job {
|
||||
const char* fname;
|
||||
unsigned long long sum;
|
||||
unsigned frameNb;
|
||||
int done;
|
||||
};
|
||||
|
||||
static void sumFrame(void* opaque)
|
||||
{
|
||||
struct sum_job* job = (struct sum_job*)opaque;
|
||||
job->done = 0;
|
||||
|
||||
FILE* const fin = fopen_orDie(job->fname, "rb");
|
||||
|
||||
ZSTD_seekable* const seekable = ZSTD_seekable_create();
|
||||
if (seekable==NULL) { fprintf(stderr, "ZSTD_seekable_create() error \n"); exit(10); }
|
||||
|
||||
size_t const initResult = ZSTD_seekable_initFile(seekable, fin);
|
||||
if (ZSTD_isError(initResult)) { fprintf(stderr, "ZSTD_seekable_init() error : %s \n", ZSTD_getErrorName(initResult)); exit(11); }
|
||||
|
||||
size_t const frameSize = ZSTD_seekable_getFrameDecompressedSize(seekable, job->frameNb);
|
||||
unsigned char* data = malloc_orDie(frameSize);
|
||||
|
||||
size_t result = ZSTD_seekable_decompressFrame(seekable, data, frameSize, job->frameNb);
|
||||
if (ZSTD_isError(result)) { fprintf(stderr, "ZSTD_seekable_decompressFrame() error : %s \n", ZSTD_getErrorName(result)); exit(12); }
|
||||
|
||||
unsigned long long sum = 0;
|
||||
size_t i;
|
||||
for (i = 0; i < frameSize; i++) {
|
||||
sum += data[i];
|
||||
}
|
||||
job->sum = sum;
|
||||
job->done = 1;
|
||||
|
||||
fclose(fin);
|
||||
ZSTD_seekable_free(seekable);
|
||||
free(data);
|
||||
}
|
||||
|
||||
static void sumFile_orDie(const char* fname, int nbThreads)
|
||||
{
|
||||
POOL_ctx* pool = POOL_create(nbThreads, nbThreads);
|
||||
if (pool == NULL) { fprintf(stderr, "POOL_create() error \n"); exit(9); }
|
||||
|
||||
FILE* const fin = fopen_orDie(fname, "rb");
|
||||
|
||||
ZSTD_seekable* const seekable = ZSTD_seekable_create();
|
||||
if (seekable==NULL) { fprintf(stderr, "ZSTD_seekable_create() error \n"); exit(10); }
|
||||
|
||||
size_t const initResult = ZSTD_seekable_initFile(seekable, fin);
|
||||
if (ZSTD_isError(initResult)) { fprintf(stderr, "ZSTD_seekable_init() error : %s \n", ZSTD_getErrorName(initResult)); exit(11); }
|
||||
|
||||
unsigned const numFrames = ZSTD_seekable_getNumFrames(seekable);
|
||||
struct sum_job* jobs = (struct sum_job*)malloc(numFrames * sizeof(struct sum_job));
|
||||
|
||||
unsigned fnb;
|
||||
for (fnb = 0; fnb < numFrames; fnb++) {
|
||||
jobs[fnb] = (struct sum_job){ fname, 0, fnb, 0 };
|
||||
POOL_add(pool, sumFrame, &jobs[fnb]);
|
||||
}
|
||||
|
||||
unsigned long long total = 0;
|
||||
|
||||
for (fnb = 0; fnb < numFrames; fnb++) {
|
||||
while (!jobs[fnb].done) SLEEP(5); /* wake up every 5 milliseconds to check */
|
||||
total += jobs[fnb].sum;
|
||||
}
|
||||
|
||||
printf("Sum: %llu\n", total);
|
||||
|
||||
POOL_free(pool);
|
||||
ZSTD_seekable_free(seekable);
|
||||
fclose(fin);
|
||||
free(jobs);
|
||||
}
|
||||
|
||||
|
||||
int main(int argc, const char** argv)
|
||||
{
|
||||
const char* const exeName = argv[0];
|
||||
|
||||
if (argc!=3) {
|
||||
fprintf(stderr, "wrong arguments\n");
|
||||
fprintf(stderr, "usage:\n");
|
||||
fprintf(stderr, "%s FILE NB_THREADS\n", exeName);
|
||||
return 1;
|
||||
}
|
||||
|
||||
{
|
||||
const char* const inFilename = argv[1];
|
||||
int const nbThreads = atoi(argv[2]);
|
||||
sumFile_orDie(inFilename, nbThreads);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
@ -1,133 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2017-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
|
||||
#include <stdlib.h> // malloc, free, exit, atoi
|
||||
#include <stdio.h> // fprintf, perror, feof, fopen, etc.
|
||||
#include <string.h> // strlen, memset, strcat
|
||||
#define ZSTD_STATIC_LINKING_ONLY
|
||||
#include <zstd.h> // presumes zstd library is installed
|
||||
|
||||
#include "zstd_seekable.h"
|
||||
|
||||
static void* malloc_orDie(size_t size)
|
||||
{
|
||||
void* const buff = malloc(size);
|
||||
if (buff) return buff;
|
||||
/* error */
|
||||
perror("malloc:");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
static FILE* fopen_orDie(const char *filename, const char *instruction)
|
||||
{
|
||||
FILE* const inFile = fopen(filename, instruction);
|
||||
if (inFile) return inFile;
|
||||
/* error */
|
||||
perror(filename);
|
||||
exit(3);
|
||||
}
|
||||
|
||||
static size_t fread_orDie(void* buffer, size_t sizeToRead, FILE* file)
|
||||
{
|
||||
size_t const readSize = fread(buffer, 1, sizeToRead, file);
|
||||
if (readSize == sizeToRead) return readSize; /* good */
|
||||
if (feof(file)) return readSize; /* good, reached end of file */
|
||||
/* error */
|
||||
perror("fread");
|
||||
exit(4);
|
||||
}
|
||||
|
||||
static size_t fwrite_orDie(const void* buffer, size_t sizeToWrite, FILE* file)
|
||||
{
|
||||
size_t const writtenSize = fwrite(buffer, 1, sizeToWrite, file);
|
||||
if (writtenSize == sizeToWrite) return sizeToWrite; /* good */
|
||||
/* error */
|
||||
perror("fwrite");
|
||||
exit(5);
|
||||
}
|
||||
|
||||
static size_t fclose_orDie(FILE* file)
|
||||
{
|
||||
if (!fclose(file)) return 0;
|
||||
/* error */
|
||||
perror("fclose");
|
||||
exit(6);
|
||||
}
|
||||
|
||||
static void compressFile_orDie(const char* fname, const char* outName, int cLevel, unsigned frameSize)
|
||||
{
|
||||
FILE* const fin = fopen_orDie(fname, "rb");
|
||||
FILE* const fout = fopen_orDie(outName, "wb");
|
||||
size_t const buffInSize = ZSTD_CStreamInSize(); /* can always read one full block */
|
||||
void* const buffIn = malloc_orDie(buffInSize);
|
||||
size_t const buffOutSize = ZSTD_CStreamOutSize(); /* can always flush a full block */
|
||||
void* const buffOut = malloc_orDie(buffOutSize);
|
||||
|
||||
ZSTD_seekable_CStream* const cstream = ZSTD_seekable_createCStream();
|
||||
if (cstream==NULL) { fprintf(stderr, "ZSTD_seekable_createCStream() error \n"); exit(10); }
|
||||
size_t const initResult = ZSTD_seekable_initCStream(cstream, cLevel, 1, frameSize);
|
||||
if (ZSTD_isError(initResult)) { fprintf(stderr, "ZSTD_seekable_initCStream() error : %s \n", ZSTD_getErrorName(initResult)); exit(11); }
|
||||
|
||||
size_t read, toRead = buffInSize;
|
||||
while( (read = fread_orDie(buffIn, toRead, fin)) ) {
|
||||
ZSTD_inBuffer input = { buffIn, read, 0 };
|
||||
while (input.pos < input.size) {
|
||||
ZSTD_outBuffer output = { buffOut, buffOutSize, 0 };
|
||||
toRead = ZSTD_seekable_compressStream(cstream, &output , &input); /* toRead is guaranteed to be <= ZSTD_CStreamInSize() */
|
||||
if (ZSTD_isError(toRead)) { fprintf(stderr, "ZSTD_seekable_compressStream() error : %s \n", ZSTD_getErrorName(toRead)); exit(12); }
|
||||
if (toRead > buffInSize) toRead = buffInSize; /* Safely handle case when `buffInSize` is manually changed to a value < ZSTD_CStreamInSize()*/
|
||||
fwrite_orDie(buffOut, output.pos, fout);
|
||||
}
|
||||
}
|
||||
|
||||
while (1) {
|
||||
ZSTD_outBuffer output = { buffOut, buffOutSize, 0 };
|
||||
size_t const remainingToFlush = ZSTD_seekable_endStream(cstream, &output); /* close stream */
|
||||
if (ZSTD_isError(remainingToFlush)) { fprintf(stderr, "ZSTD_seekable_endStream() error : %s \n", ZSTD_getErrorName(remainingToFlush)); exit(13); }
|
||||
fwrite_orDie(buffOut, output.pos, fout);
|
||||
if (!remainingToFlush) break;
|
||||
}
|
||||
|
||||
ZSTD_seekable_freeCStream(cstream);
|
||||
fclose_orDie(fout);
|
||||
fclose_orDie(fin);
|
||||
free(buffIn);
|
||||
free(buffOut);
|
||||
}
|
||||
|
||||
static char* createOutFilename_orDie(const char* filename)
|
||||
{
|
||||
size_t const inL = strlen(filename);
|
||||
size_t const outL = inL + 5;
|
||||
void* outSpace = malloc_orDie(outL);
|
||||
memset(outSpace, 0, outL);
|
||||
strcat(outSpace, filename);
|
||||
strcat(outSpace, ".zst");
|
||||
return (char*)outSpace;
|
||||
}
|
||||
|
||||
int main(int argc, const char** argv) {
|
||||
const char* const exeName = argv[0];
|
||||
if (argc!=3) {
|
||||
printf("wrong arguments\n");
|
||||
printf("usage:\n");
|
||||
printf("%s FILE FRAME_SIZE\n", exeName);
|
||||
return 1;
|
||||
}
|
||||
|
||||
{ const char* const inFileName = argv[1];
|
||||
unsigned const frameSize = (unsigned)atoi(argv[2]);
|
||||
|
||||
char* const outFileName = createOutFilename_orDie(inFileName);
|
||||
compressFile_orDie(inFileName, outFileName, 5, frameSize);
|
||||
free(outFileName);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
@ -1,138 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2017-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
|
||||
|
||||
#include <stdlib.h> // malloc, exit
|
||||
#include <stdio.h> // fprintf, perror, feof
|
||||
#include <string.h> // strerror
|
||||
#include <errno.h> // errno
|
||||
#define ZSTD_STATIC_LINKING_ONLY
|
||||
#include <zstd.h> // presumes zstd library is installed
|
||||
#include <zstd_errors.h>
|
||||
|
||||
#include "zstd_seekable.h"
|
||||
|
||||
#define MIN(a, b) ((a) < (b) ? (a) : (b))
|
||||
|
||||
static void* malloc_orDie(size_t size)
|
||||
{
|
||||
void* const buff = malloc(size);
|
||||
if (buff) return buff;
|
||||
/* error */
|
||||
perror("malloc");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
static void* realloc_orDie(void* ptr, size_t size)
|
||||
{
|
||||
ptr = realloc(ptr, size);
|
||||
if (ptr) return ptr;
|
||||
/* error */
|
||||
perror("realloc");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
static FILE* fopen_orDie(const char *filename, const char *instruction)
|
||||
{
|
||||
FILE* const inFile = fopen(filename, instruction);
|
||||
if (inFile) return inFile;
|
||||
/* error */
|
||||
perror(filename);
|
||||
exit(3);
|
||||
}
|
||||
|
||||
static size_t fread_orDie(void* buffer, size_t sizeToRead, FILE* file)
|
||||
{
|
||||
size_t const readSize = fread(buffer, 1, sizeToRead, file);
|
||||
if (readSize == sizeToRead) return readSize; /* good */
|
||||
if (feof(file)) return readSize; /* good, reached end of file */
|
||||
/* error */
|
||||
perror("fread");
|
||||
exit(4);
|
||||
}
|
||||
|
||||
static size_t fwrite_orDie(const void* buffer, size_t sizeToWrite, FILE* file)
|
||||
{
|
||||
size_t const writtenSize = fwrite(buffer, 1, sizeToWrite, file);
|
||||
if (writtenSize == sizeToWrite) return sizeToWrite; /* good */
|
||||
/* error */
|
||||
perror("fwrite");
|
||||
exit(5);
|
||||
}
|
||||
|
||||
static size_t fclose_orDie(FILE* file)
|
||||
{
|
||||
if (!fclose(file)) return 0;
|
||||
/* error */
|
||||
perror("fclose");
|
||||
exit(6);
|
||||
}
|
||||
|
||||
static void fseek_orDie(FILE* file, long int offset, int origin) {
|
||||
if (!fseek(file, offset, origin)) {
|
||||
if (!fflush(file)) return;
|
||||
}
|
||||
/* error */
|
||||
perror("fseek");
|
||||
exit(7);
|
||||
}
|
||||
|
||||
|
||||
static void decompressFile_orDie(const char* fname, off_t startOffset, off_t endOffset)
|
||||
{
|
||||
FILE* const fin = fopen_orDie(fname, "rb");
|
||||
FILE* const fout = stdout;
|
||||
size_t const buffOutSize = ZSTD_DStreamOutSize(); /* Guarantee to successfully flush at least one complete compressed block in all circumstances. */
|
||||
void* const buffOut = malloc_orDie(buffOutSize);
|
||||
|
||||
ZSTD_seekable* const seekable = ZSTD_seekable_create();
|
||||
if (seekable==NULL) { fprintf(stderr, "ZSTD_seekable_create() error \n"); exit(10); }
|
||||
|
||||
size_t const initResult = ZSTD_seekable_initFile(seekable, fin);
|
||||
if (ZSTD_isError(initResult)) { fprintf(stderr, "ZSTD_seekable_init() error : %s \n", ZSTD_getErrorName(initResult)); exit(11); }
|
||||
|
||||
while (startOffset < endOffset) {
|
||||
size_t const result = ZSTD_seekable_decompress(seekable, buffOut, MIN(endOffset - startOffset, buffOutSize), startOffset);
|
||||
|
||||
if (ZSTD_isError(result)) {
|
||||
fprintf(stderr, "ZSTD_seekable_decompress() error : %s \n",
|
||||
ZSTD_getErrorName(result));
|
||||
exit(12);
|
||||
}
|
||||
fwrite_orDie(buffOut, result, fout);
|
||||
startOffset += result;
|
||||
}
|
||||
|
||||
ZSTD_seekable_free(seekable);
|
||||
fclose_orDie(fin);
|
||||
fclose_orDie(fout);
|
||||
free(buffOut);
|
||||
}
|
||||
|
||||
|
||||
int main(int argc, const char** argv)
|
||||
{
|
||||
const char* const exeName = argv[0];
|
||||
|
||||
if (argc!=4) {
|
||||
fprintf(stderr, "wrong arguments\n");
|
||||
fprintf(stderr, "usage:\n");
|
||||
fprintf(stderr, "%s FILE START END\n", exeName);
|
||||
return 1;
|
||||
}
|
||||
|
||||
{
|
||||
const char* const inFilename = argv[1];
|
||||
off_t const startOffset = atoll(argv[2]);
|
||||
off_t const endOffset = atoll(argv[3]);
|
||||
decompressFile_orDie(inFilename, startOffset, endOffset);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
@ -1,144 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2017-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
|
||||
|
||||
#include <stdlib.h> // malloc, exit
|
||||
#include <stdio.h> // fprintf, perror, feof
|
||||
#include <string.h> // strerror
|
||||
#include <errno.h> // errno
|
||||
#define ZSTD_STATIC_LINKING_ONLY
|
||||
#include <zstd.h> // presumes zstd library is installed
|
||||
#include <zstd_errors.h>
|
||||
|
||||
#include "zstd_seekable.h"
|
||||
|
||||
#define MIN(a, b) ((a) < (b) ? (a) : (b))
|
||||
|
||||
#define MAX_FILE_SIZE (8 * 1024 * 1024)
|
||||
|
||||
static void* malloc_orDie(size_t size)
|
||||
{
|
||||
void* const buff = malloc(size);
|
||||
if (buff) return buff;
|
||||
/* error */
|
||||
perror("malloc");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
static void* realloc_orDie(void* ptr, size_t size)
|
||||
{
|
||||
ptr = realloc(ptr, size);
|
||||
if (ptr) return ptr;
|
||||
/* error */
|
||||
perror("realloc");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
static FILE* fopen_orDie(const char *filename, const char *instruction)
|
||||
{
|
||||
FILE* const inFile = fopen(filename, instruction);
|
||||
if (inFile) return inFile;
|
||||
/* error */
|
||||
perror(filename);
|
||||
exit(3);
|
||||
}
|
||||
|
||||
static size_t fread_orDie(void* buffer, size_t sizeToRead, FILE* file)
|
||||
{
|
||||
size_t const readSize = fread(buffer, 1, sizeToRead, file);
|
||||
if (readSize == sizeToRead) return readSize; /* good */
|
||||
if (feof(file)) return readSize; /* good, reached end of file */
|
||||
/* error */
|
||||
perror("fread");
|
||||
exit(4);
|
||||
}
|
||||
|
||||
static size_t fwrite_orDie(const void* buffer, size_t sizeToWrite, FILE* file)
|
||||
{
|
||||
size_t const writtenSize = fwrite(buffer, 1, sizeToWrite, file);
|
||||
if (writtenSize == sizeToWrite) return sizeToWrite; /* good */
|
||||
/* error */
|
||||
perror("fwrite");
|
||||
exit(5);
|
||||
}
|
||||
|
||||
static size_t fclose_orDie(FILE* file)
|
||||
{
|
||||
if (!fclose(file)) return 0;
|
||||
/* error */
|
||||
perror("fclose");
|
||||
exit(6);
|
||||
}
|
||||
|
||||
static void fseek_orDie(FILE* file, long int offset, int origin) {
|
||||
if (!fseek(file, offset, origin)) {
|
||||
if (!fflush(file)) return;
|
||||
}
|
||||
/* error */
|
||||
perror("fseek");
|
||||
exit(7);
|
||||
}
|
||||
|
||||
|
||||
static void decompressFile_orDie(const char* fname, off_t startOffset, off_t endOffset)
|
||||
{
|
||||
FILE* const fin = fopen_orDie(fname, "rb");
|
||||
FILE* const fout = stdout;
|
||||
// Just for demo purposes, assume file is <= MAX_FILE_SIZE
|
||||
void* const buffIn = malloc_orDie(MAX_FILE_SIZE);
|
||||
size_t const inSize = fread_orDie(buffIn, MAX_FILE_SIZE, fin);
|
||||
size_t const buffOutSize = ZSTD_DStreamOutSize(); /* Guarantee to successfully flush at least one complete compressed block in all circumstances. */
|
||||
void* const buffOut = malloc_orDie(buffOutSize);
|
||||
|
||||
ZSTD_seekable* const seekable = ZSTD_seekable_create();
|
||||
if (seekable==NULL) { fprintf(stderr, "ZSTD_seekable_create() error \n"); exit(10); }
|
||||
|
||||
size_t const initResult = ZSTD_seekable_initBuff(seekable, buffIn, inSize);
|
||||
if (ZSTD_isError(initResult)) { fprintf(stderr, "ZSTD_seekable_init() error : %s \n", ZSTD_getErrorName(initResult)); exit(11); }
|
||||
|
||||
while (startOffset < endOffset) {
|
||||
size_t const result = ZSTD_seekable_decompress(seekable, buffOut, MIN(endOffset - startOffset, buffOutSize), startOffset);
|
||||
|
||||
if (ZSTD_isError(result)) {
|
||||
fprintf(stderr, "ZSTD_seekable_decompress() error : %s \n",
|
||||
ZSTD_getErrorName(result));
|
||||
exit(12);
|
||||
}
|
||||
fwrite_orDie(buffOut, result, fout);
|
||||
startOffset += result;
|
||||
}
|
||||
|
||||
ZSTD_seekable_free(seekable);
|
||||
fclose_orDie(fin);
|
||||
fclose_orDie(fout);
|
||||
free(buffIn);
|
||||
free(buffOut);
|
||||
}
|
||||
|
||||
|
||||
int main(int argc, const char** argv)
|
||||
{
|
||||
const char* const exeName = argv[0];
|
||||
|
||||
if (argc!=4) {
|
||||
fprintf(stderr, "wrong arguments\n");
|
||||
fprintf(stderr, "usage:\n");
|
||||
fprintf(stderr, "%s FILE START END\n", exeName);
|
||||
return 1;
|
||||
}
|
||||
|
||||
{
|
||||
const char* const inFilename = argv[1];
|
||||
off_t const startOffset = atoll(argv[2]);
|
||||
off_t const endOffset = atoll(argv[3]);
|
||||
decompressFile_orDie(inFilename, startOffset, endOffset);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
@ -1,186 +0,0 @@
|
||||
#ifndef SEEKABLE_H
|
||||
#define SEEKABLE_H
|
||||
|
||||
#if defined (__cplusplus)
|
||||
extern "C" {
|
||||
#endif
|
||||
|
||||
#include <stdio.h>
|
||||
#include "zstd.h" /* ZSTDLIB_API */
|
||||
|
||||
|
||||
#define ZSTD_seekTableFooterSize 9
|
||||
|
||||
#define ZSTD_SEEKABLE_MAGICNUMBER 0x8F92EAB1
|
||||
|
||||
#define ZSTD_SEEKABLE_MAXFRAMES 0x8000000U
|
||||
|
||||
/* Limit the maximum size to avoid any potential issues storing the compressed size */
|
||||
#define ZSTD_SEEKABLE_MAX_FRAME_DECOMPRESSED_SIZE 0x80000000U
|
||||
|
||||
/*-****************************************************************************
|
||||
* Seekable Format
|
||||
*
|
||||
* The seekable format splits the compressed data into a series of "frames",
|
||||
* each compressed individually so that decompression of a section in the
|
||||
* middle of an archive only requires zstd to decompress at most a frame's
|
||||
* worth of extra data, instead of the entire archive.
|
||||
******************************************************************************/
|
||||
|
||||
typedef struct ZSTD_seekable_CStream_s ZSTD_seekable_CStream;
|
||||
typedef struct ZSTD_seekable_s ZSTD_seekable;
|
||||
|
||||
/*-****************************************************************************
|
||||
* Seekable compression - HowTo
|
||||
* A ZSTD_seekable_CStream object is required to tracking streaming operation.
|
||||
* Use ZSTD_seekable_createCStream() and ZSTD_seekable_freeCStream() to create/
|
||||
* release resources.
|
||||
*
|
||||
* Streaming objects are reusable to avoid allocation and deallocation,
|
||||
* to start a new compression operation call ZSTD_seekable_initCStream() on the
|
||||
* compressor.
|
||||
*
|
||||
* Data streamed to the seekable compressor will automatically be split into
|
||||
* frames of size `maxFrameSize` (provided in ZSTD_seekable_initCStream()),
|
||||
* or if none is provided, will be cut off whenever ZSTD_seekable_endFrame() is
|
||||
* called or when the default maximum frame size (2GB) is reached.
|
||||
*
|
||||
* Use ZSTD_seekable_initCStream() to initialize a ZSTD_seekable_CStream object
|
||||
* for a new compression operation.
|
||||
* `maxFrameSize` indicates the size at which to automatically start a new
|
||||
* seekable frame. `maxFrameSize == 0` implies the default maximum size.
|
||||
* `checksumFlag` indicates whether or not the seek table should include frame
|
||||
* checksums on the uncompressed data for verification.
|
||||
* @return : a size hint for input to provide for compression, or an error code
|
||||
* checkable with ZSTD_isError()
|
||||
*
|
||||
* Use ZSTD_seekable_compressStream() repetitively to consume input stream.
|
||||
* The function will automatically update both `pos` fields.
|
||||
* Note that it may not consume the entire input, in which case `pos < size`,
|
||||
* and it's up to the caller to present again remaining data.
|
||||
* @return : a size hint, preferred nb of bytes to use as input for next
|
||||
* function call or an error code, which can be tested using
|
||||
* ZSTD_isError().
|
||||
* Note 1 : it's just a hint, to help latency a little, any other
|
||||
* value will work fine.
|
||||
*
|
||||
* At any time, call ZSTD_seekable_endFrame() to end the current frame and
|
||||
* start a new one.
|
||||
*
|
||||
* ZSTD_seekable_endStream() will end the current frame, and then write the seek
|
||||
* table so that decompressors can efficiently find compressed frames.
|
||||
* ZSTD_seekable_endStream() may return a number > 0 if it was unable to flush
|
||||
* all the necessary data to `output`. In this case, it should be called again
|
||||
* until all remaining data is flushed out and 0 is returned.
|
||||
******************************************************************************/
|
||||
|
||||
/*===== Seekable compressor management =====*/
|
||||
ZSTDLIB_API ZSTD_seekable_CStream* ZSTD_seekable_createCStream(void);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_freeCStream(ZSTD_seekable_CStream* zcs);
|
||||
|
||||
/*===== Seekable compression functions =====*/
|
||||
ZSTDLIB_API size_t ZSTD_seekable_initCStream(ZSTD_seekable_CStream* zcs, int compressionLevel, int checksumFlag, unsigned maxFrameSize);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_compressStream(ZSTD_seekable_CStream* zcs, ZSTD_outBuffer* output, ZSTD_inBuffer* input);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_endFrame(ZSTD_seekable_CStream* zcs, ZSTD_outBuffer* output);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_endStream(ZSTD_seekable_CStream* zcs, ZSTD_outBuffer* output);
|
||||
|
||||
/*= Raw seek table API
|
||||
* These functions allow for the seek table to be constructed directly.
|
||||
* This table can then be appended to a file of concatenated frames.
|
||||
* This allows the frames to be compressed independently, even in parallel,
|
||||
* and compiled together afterward into a seekable archive.
|
||||
*
|
||||
* Use ZSTD_seekable_createFrameLog() to allocate and initialize a tracking
|
||||
* structure.
|
||||
*
|
||||
* Call ZSTD_seekable_logFrame() once for each frame in the archive.
|
||||
* checksum is optional, and will not be used if checksumFlag was 0 when the
|
||||
* frame log was created. If present, it should be the least significant 32
|
||||
* bits of the XXH64 hash of the uncompressed data.
|
||||
*
|
||||
* Call ZSTD_seekable_writeSeekTable to serialize the data into a seek table.
|
||||
* If the entire table was written, the return value will be 0. Otherwise,
|
||||
* it will be equal to the number of bytes left to write. */
|
||||
typedef struct ZSTD_frameLog_s ZSTD_frameLog;
|
||||
ZSTDLIB_API ZSTD_frameLog* ZSTD_seekable_createFrameLog(int checksumFlag);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_freeFrameLog(ZSTD_frameLog* fl);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_logFrame(ZSTD_frameLog* fl, unsigned compressedSize, unsigned decompressedSize, unsigned checksum);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_writeSeekTable(ZSTD_frameLog* fl, ZSTD_outBuffer* output);
|
||||
|
||||
/*-****************************************************************************
|
||||
* Seekable decompression - HowTo
|
||||
* A ZSTD_seekable object is required to tracking the seekTable.
|
||||
*
|
||||
* Call ZSTD_seekable_init* to initialize a ZSTD_seekable object with the
|
||||
* the seek table provided in the input.
|
||||
* There are three modes for ZSTD_seekable_init:
|
||||
* - ZSTD_seekable_initBuff() : An in-memory API. The data contained in
|
||||
* `src` should be the entire seekable file, including the seek table.
|
||||
* `src` should be kept alive and unmodified until the ZSTD_seekable object
|
||||
* is freed or reset.
|
||||
* - ZSTD_seekable_initFile() : A simplified file API using stdio. fread and
|
||||
* fseek will be used to access the required data for building the seek
|
||||
* table and doing decompression operations. `src` should not be closed
|
||||
* or modified until the ZSTD_seekable object is freed or reset.
|
||||
* - ZSTD_seekable_initAdvanced() : A general API allowing the client to
|
||||
* provide its own read and seek callbacks.
|
||||
* + ZSTD_seekable_read() : read exactly `n` bytes into `buffer`.
|
||||
* Premature EOF should be treated as an error.
|
||||
* + ZSTD_seekable_seek() : seek the read head to `offset` from `origin`,
|
||||
* where origin is either SEEK_SET (beginning of
|
||||
* file), or SEEK_END (end of file).
|
||||
* Both functions should return a non-negative value in case of success, and a
|
||||
* negative value in case of failure. If implementing using this API and
|
||||
* stdio, be careful with files larger than 4GB and fseek. All of these
|
||||
* functions return an error code checkable with ZSTD_isError().
|
||||
*
|
||||
* Call ZSTD_seekable_decompress to decompress `dstSize` bytes at decompressed
|
||||
* offset `offset`. ZSTD_seekable_decompress may have to decompress the entire
|
||||
* prefix of the frame before the desired data if it has not already processed
|
||||
* this section. If ZSTD_seekable_decompress is called multiple times for a
|
||||
* consecutive range of data, it will efficiently retain the decompressor object
|
||||
* and avoid redecompressing frame prefixes. The return value is the number of
|
||||
* bytes decompressed, or an error code checkable with ZSTD_isError().
|
||||
*
|
||||
* The seek table access functions can be used to obtain the data contained
|
||||
* in the seek table. If frameIndex is larger than the value returned by
|
||||
* ZSTD_seekable_getNumFrames(), they will return error codes checkable with
|
||||
* ZSTD_isError(). Note that since the offset access functions return
|
||||
* unsigned long long instead of size_t, in this case they will instead return
|
||||
* the value ZSTD_SEEKABLE_FRAMEINDEX_TOOLARGE.
|
||||
******************************************************************************/
|
||||
|
||||
/*===== Seekable decompressor management =====*/
|
||||
ZSTDLIB_API ZSTD_seekable* ZSTD_seekable_create(void);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_free(ZSTD_seekable* zs);
|
||||
|
||||
/*===== Seekable decompression functions =====*/
|
||||
ZSTDLIB_API size_t ZSTD_seekable_initBuff(ZSTD_seekable* zs, const void* src, size_t srcSize);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_initFile(ZSTD_seekable* zs, FILE* src);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_decompress(ZSTD_seekable* zs, void* dst, size_t dstSize, unsigned long long offset);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_decompressFrame(ZSTD_seekable* zs, void* dst, size_t dstSize, unsigned frameIndex);
|
||||
|
||||
#define ZSTD_SEEKABLE_FRAMEINDEX_TOOLARGE (0ULL-2)
|
||||
/*===== Seek Table access functions =====*/
|
||||
ZSTDLIB_API unsigned ZSTD_seekable_getNumFrames(ZSTD_seekable* const zs);
|
||||
ZSTDLIB_API unsigned long long ZSTD_seekable_getFrameCompressedOffset(ZSTD_seekable* const zs, unsigned frameIndex);
|
||||
ZSTDLIB_API unsigned long long ZSTD_seekable_getFrameDecompressedOffset(ZSTD_seekable* const zs, unsigned frameIndex);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_getFrameCompressedSize(ZSTD_seekable* const zs, unsigned frameIndex);
|
||||
ZSTDLIB_API size_t ZSTD_seekable_getFrameDecompressedSize(ZSTD_seekable* const zs, unsigned frameIndex);
|
||||
ZSTDLIB_API unsigned ZSTD_seekable_offsetToFrameIndex(ZSTD_seekable* const zs, unsigned long long offset);
|
||||
|
||||
/*===== Seekable advanced I/O API =====*/
|
||||
typedef int(ZSTD_seekable_read)(void* opaque, void* buffer, size_t n);
|
||||
typedef int(ZSTD_seekable_seek)(void* opaque, long long offset, int origin);
|
||||
typedef struct {
|
||||
void* opaque;
|
||||
ZSTD_seekable_read* read;
|
||||
ZSTD_seekable_seek* seek;
|
||||
} ZSTD_seekable_customFile;
|
||||
ZSTDLIB_API size_t ZSTD_seekable_initAdvanced(ZSTD_seekable* zs, ZSTD_seekable_customFile src);
|
||||
|
||||
#if defined (__cplusplus)
|
||||
}
|
||||
#endif
|
||||
|
||||
#endif
|
@ -1,116 +0,0 @@
|
||||
# Zstandard Seekable Format
|
||||
|
||||
### Notices
|
||||
|
||||
Copyright (c) 2017-present Facebook, Inc.
|
||||
|
||||
Permission is granted to copy and distribute this document
|
||||
for any purpose and without charge,
|
||||
including translations into other languages
|
||||
and incorporation into compilations,
|
||||
provided that the copyright notice and this notice are preserved,
|
||||
and that any substantive changes or deletions from the original
|
||||
are clearly marked.
|
||||
Distribution of this document is unlimited.
|
||||
|
||||
### Version
|
||||
0.1.0 (11/04/17)
|
||||
|
||||
## Introduction
|
||||
This document defines a format for compressed data to be stored so that subranges of the data can be efficiently decompressed without requiring the entire document to be decompressed.
|
||||
This is done by splitting up the input data into frames,
|
||||
each of which are compressed independently,
|
||||
and so can be decompressed independently.
|
||||
Decompression then takes advantage of a provided 'seek table', which allows the decompressor to immediately jump to the desired data. This is done in a way that is compatible with the original Zstandard format by placing the seek table in a Zstandard skippable frame.
|
||||
|
||||
### Overall conventions
|
||||
In this document:
|
||||
- square brackets i.e. `[` and `]` are used to indicate optional fields or parameters.
|
||||
- the naming convention for identifiers is `Mixed_Case_With_Underscores`
|
||||
- All numeric fields are little-endian unless specified otherwise
|
||||
|
||||
## Format
|
||||
|
||||
The format consists of a number of frames (Zstandard compressed frames and skippable frames), followed by a final skippable frame at the end containing the seek table.
|
||||
|
||||
### Seek Table Format
|
||||
The structure of the seek table frame is as follows:
|
||||
|
||||
|`Skippable_Magic_Number`|`Frame_Size`|`[Seek_Table_Entries]`|`Seek_Table_Footer`|
|
||||
|------------------------|------------|----------------------|-------------------|
|
||||
| 4 bytes | 4 bytes | 8-12 bytes each | 9 bytes |
|
||||
|
||||
__`Skippable_Magic_Number`__
|
||||
|
||||
Value : 0x184D2A5E.
|
||||
This is for compatibility with [Zstandard skippable frames].
|
||||
Since it is legal for other Zstandard skippable frames to use the same
|
||||
magic number, it is not recommended for a decoder to recognize frames
|
||||
solely on this.
|
||||
|
||||
__`Frame_Size`__
|
||||
|
||||
The total size of the skippable frame, not including the `Skippable_Magic_Number` or `Frame_Size`.
|
||||
This is for compatibility with [Zstandard skippable frames].
|
||||
|
||||
[Zstandard skippable frames]: https://github.com/facebook/zstd/blob/master/doc/zstd_compression_format.md#skippable-frames
|
||||
|
||||
#### `Seek_Table_Footer`
|
||||
The seek table footer format is as follows:
|
||||
|
||||
|`Number_Of_Frames`|`Seek_Table_Descriptor`|`Seekable_Magic_Number`|
|
||||
|------------------|-----------------------|-----------------------|
|
||||
| 4 bytes | 1 byte | 4 bytes |
|
||||
|
||||
__`Seekable_Magic_Number`__
|
||||
|
||||
Value : 0x8F92EAB1.
|
||||
This value must be the last bytes present in the compressed file so that decoders
|
||||
can efficiently find it and determine if there is an actual seek table present.
|
||||
|
||||
__`Number_Of_Frames`__
|
||||
|
||||
The number of stored frames in the data.
|
||||
|
||||
__`Seek_Table_Descriptor`__
|
||||
|
||||
A bitfield describing the format of the seek table.
|
||||
|
||||
| Bit number | Field name |
|
||||
| ---------- | ---------- |
|
||||
| 7 | `Checksum_Flag` |
|
||||
| 6-2 | `Reserved_Bits` |
|
||||
| 1-0 | `Unused_Bits` |
|
||||
|
||||
While only `Checksum_Flag` currently exists, there are 7 other bits in this field that can be used for future changes to the format,
|
||||
for example the addition of inline dictionaries.
|
||||
|
||||
__`Checksum_Flag`__
|
||||
|
||||
If the checksum flag is set, each of the seek table entries contains a 4 byte checksum of the uncompressed data contained in its frame.
|
||||
|
||||
`Reserved_Bits` are not currently used but may be used in the future for breaking changes, so a compliant decoder should ensure they are set to 0. `Unused_Bits` may be used in the future for non-breaking changes, so a compliant decoder should not interpret these bits.
|
||||
|
||||
#### __`Seek_Table_Entries`__
|
||||
|
||||
`Seek_Table_Entries` consists of `Number_Of_Frames` (one for each frame in the data, not including the seek table frame) entries of the following form, in sequence:
|
||||
|
||||
|`Compressed_Size`|`Decompressed_Size`|`[Checksum]`|
|
||||
|-----------------|-------------------|------------|
|
||||
| 4 bytes | 4 bytes | 4 bytes |
|
||||
|
||||
__`Compressed_Size`__
|
||||
|
||||
The compressed size of the frame.
|
||||
The cumulative sum of the `Compressed_Size` fields of frames `0` to `i` gives the offset in the compressed file of frame `i+1`.
|
||||
|
||||
__`Decompressed_Size`__
|
||||
|
||||
The size of the decompressed data contained in the frame. For skippable or otherwise empty frames, this value is 0.
|
||||
|
||||
__`Checksum`__
|
||||
|
||||
Only present if `Checksum_Flag` is set in the `Seek_Table_Descriptor`. Value : the least significant 32 bits of the XXH64 digest of the uncompressed data, stored in little-endian format.
|
||||
|
||||
## Version Changes
|
||||
- 0.1.0: initial version
|
@ -1,369 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2017-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
*/
|
||||
|
||||
#include <stdlib.h> /* malloc, free */
|
||||
#include <limits.h> /* UINT_MAX */
|
||||
#include <assert.h>
|
||||
|
||||
#define XXH_STATIC_LINKING_ONLY
|
||||
#define XXH_NAMESPACE ZSTD_
|
||||
#include "xxhash.h"
|
||||
|
||||
#define ZSTD_STATIC_LINKING_ONLY
|
||||
#include "zstd.h"
|
||||
#include "zstd_errors.h"
|
||||
#include "mem.h"
|
||||
#include "zstd_seekable.h"
|
||||
|
||||
#define CHECK_Z(f) { size_t const ret = (f); if (ret != 0) return ret; }
|
||||
|
||||
#undef ERROR
|
||||
#define ERROR(name) ((size_t)-ZSTD_error_##name)
|
||||
|
||||
#undef MIN
|
||||
#undef MAX
|
||||
#define MIN(a, b) ((a) < (b) ? (a) : (b))
|
||||
#define MAX(a, b) ((a) > (b) ? (a) : (b))
|
||||
|
||||
typedef struct {
|
||||
U32 cSize;
|
||||
U32 dSize;
|
||||
U32 checksum;
|
||||
} framelogEntry_t;
|
||||
|
||||
struct ZSTD_frameLog_s {
|
||||
framelogEntry_t* entries;
|
||||
U32 size;
|
||||
U32 capacity;
|
||||
|
||||
int checksumFlag;
|
||||
|
||||
/* for use when streaming out the seek table */
|
||||
U32 seekTablePos;
|
||||
U32 seekTableIndex;
|
||||
} framelog_t;
|
||||
|
||||
struct ZSTD_seekable_CStream_s {
|
||||
ZSTD_CStream* cstream;
|
||||
ZSTD_frameLog framelog;
|
||||
|
||||
U32 frameCSize;
|
||||
U32 frameDSize;
|
||||
|
||||
XXH64_state_t xxhState;
|
||||
|
||||
U32 maxFrameSize;
|
||||
|
||||
int writingSeekTable;
|
||||
};
|
||||
|
||||
size_t ZSTD_seekable_frameLog_allocVec(ZSTD_frameLog* fl)
|
||||
{
|
||||
/* allocate some initial space */
|
||||
size_t const FRAMELOG_STARTING_CAPACITY = 16;
|
||||
fl->entries = (framelogEntry_t*)malloc(
|
||||
sizeof(framelogEntry_t) * FRAMELOG_STARTING_CAPACITY);
|
||||
if (fl->entries == NULL) return ERROR(memory_allocation);
|
||||
fl->capacity = FRAMELOG_STARTING_CAPACITY;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_frameLog_freeVec(ZSTD_frameLog* fl)
|
||||
{
|
||||
if (fl != NULL) free(fl->entries);
|
||||
return 0;
|
||||
}
|
||||
|
||||
ZSTD_frameLog* ZSTD_seekable_createFrameLog(int checksumFlag)
|
||||
{
|
||||
ZSTD_frameLog* fl = malloc(sizeof(ZSTD_frameLog));
|
||||
if (fl == NULL) return NULL;
|
||||
|
||||
if (ZSTD_isError(ZSTD_seekable_frameLog_allocVec(fl))) {
|
||||
free(fl);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
fl->checksumFlag = checksumFlag;
|
||||
fl->seekTablePos = 0;
|
||||
fl->seekTableIndex = 0;
|
||||
fl->size = 0;
|
||||
|
||||
return fl;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_freeFrameLog(ZSTD_frameLog* fl)
|
||||
{
|
||||
ZSTD_seekable_frameLog_freeVec(fl);
|
||||
free(fl);
|
||||
return 0;
|
||||
}
|
||||
|
||||
ZSTD_seekable_CStream* ZSTD_seekable_createCStream()
|
||||
{
|
||||
ZSTD_seekable_CStream* zcs = malloc(sizeof(ZSTD_seekable_CStream));
|
||||
|
||||
if (zcs == NULL) return NULL;
|
||||
|
||||
memset(zcs, 0, sizeof(*zcs));
|
||||
|
||||
zcs->cstream = ZSTD_createCStream();
|
||||
if (zcs->cstream == NULL) goto failed1;
|
||||
|
||||
if (ZSTD_isError(ZSTD_seekable_frameLog_allocVec(&zcs->framelog))) goto failed2;
|
||||
|
||||
return zcs;
|
||||
|
||||
failed2:
|
||||
ZSTD_freeCStream(zcs->cstream);
|
||||
failed1:
|
||||
free(zcs);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_freeCStream(ZSTD_seekable_CStream* zcs)
|
||||
{
|
||||
if (zcs == NULL) return 0; /* support free on null */
|
||||
ZSTD_freeCStream(zcs->cstream);
|
||||
ZSTD_seekable_frameLog_freeVec(&zcs->framelog);
|
||||
free(zcs);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_initCStream(ZSTD_seekable_CStream* zcs,
|
||||
int compressionLevel,
|
||||
int checksumFlag,
|
||||
unsigned maxFrameSize)
|
||||
{
|
||||
zcs->framelog.size = 0;
|
||||
zcs->frameCSize = 0;
|
||||
zcs->frameDSize = 0;
|
||||
|
||||
/* make sure maxFrameSize has a reasonable value */
|
||||
if (maxFrameSize > ZSTD_SEEKABLE_MAX_FRAME_DECOMPRESSED_SIZE) {
|
||||
return ERROR(frameParameter_unsupported);
|
||||
}
|
||||
|
||||
zcs->maxFrameSize = maxFrameSize
|
||||
? maxFrameSize
|
||||
: ZSTD_SEEKABLE_MAX_FRAME_DECOMPRESSED_SIZE;
|
||||
|
||||
zcs->framelog.checksumFlag = checksumFlag;
|
||||
if (zcs->framelog.checksumFlag) {
|
||||
XXH64_reset(&zcs->xxhState, 0);
|
||||
}
|
||||
|
||||
zcs->framelog.seekTablePos = 0;
|
||||
zcs->framelog.seekTableIndex = 0;
|
||||
zcs->writingSeekTable = 0;
|
||||
|
||||
return ZSTD_initCStream(zcs->cstream, compressionLevel);
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_logFrame(ZSTD_frameLog* fl,
|
||||
unsigned compressedSize,
|
||||
unsigned decompressedSize,
|
||||
unsigned checksum)
|
||||
{
|
||||
if (fl->size == ZSTD_SEEKABLE_MAXFRAMES)
|
||||
return ERROR(frameIndex_tooLarge);
|
||||
|
||||
/* grow the buffer if required */
|
||||
if (fl->size == fl->capacity) {
|
||||
/* exponential size increase for constant amortized runtime */
|
||||
size_t const newCapacity = fl->capacity * 2;
|
||||
framelogEntry_t* const newEntries = realloc(fl->entries,
|
||||
sizeof(framelogEntry_t) * newCapacity);
|
||||
|
||||
if (newEntries == NULL) return ERROR(memory_allocation);
|
||||
|
||||
fl->entries = newEntries;
|
||||
assert(newCapacity <= UINT_MAX);
|
||||
fl->capacity = (U32)newCapacity;
|
||||
}
|
||||
|
||||
fl->entries[fl->size] = (framelogEntry_t){
|
||||
compressedSize, decompressedSize, checksum
|
||||
};
|
||||
fl->size++;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_endFrame(ZSTD_seekable_CStream* zcs, ZSTD_outBuffer* output)
|
||||
{
|
||||
size_t const prevOutPos = output->pos;
|
||||
/* end the frame */
|
||||
size_t ret = ZSTD_endStream(zcs->cstream, output);
|
||||
|
||||
zcs->frameCSize += output->pos - prevOutPos;
|
||||
|
||||
/* need to flush before doing the rest */
|
||||
if (ret) return ret;
|
||||
|
||||
/* frame done */
|
||||
|
||||
/* store the frame data for later */
|
||||
ret = ZSTD_seekable_logFrame(
|
||||
&zcs->framelog, zcs->frameCSize, zcs->frameDSize,
|
||||
zcs->framelog.checksumFlag
|
||||
? XXH64_digest(&zcs->xxhState) & 0xFFFFFFFFU
|
||||
: 0);
|
||||
if (ret) return ret;
|
||||
|
||||
/* reset for the next frame */
|
||||
zcs->frameCSize = 0;
|
||||
zcs->frameDSize = 0;
|
||||
|
||||
ZSTD_resetCStream(zcs->cstream, 0);
|
||||
if (zcs->framelog.checksumFlag)
|
||||
XXH64_reset(&zcs->xxhState, 0);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_compressStream(ZSTD_seekable_CStream* zcs, ZSTD_outBuffer* output, ZSTD_inBuffer* input)
|
||||
{
|
||||
const BYTE* const inBase = (const BYTE*) input->src + input->pos;
|
||||
size_t inLen = input->size - input->pos;
|
||||
|
||||
inLen = MIN(inLen, (size_t)(zcs->maxFrameSize - zcs->frameDSize));
|
||||
|
||||
/* if we haven't finished flushing the last frame, don't start writing a new one */
|
||||
if (inLen > 0) {
|
||||
ZSTD_inBuffer inTmp = { inBase, inLen, 0 };
|
||||
size_t const prevOutPos = output->pos;
|
||||
|
||||
size_t const ret = ZSTD_compressStream(zcs->cstream, output, &inTmp);
|
||||
|
||||
if (zcs->framelog.checksumFlag) {
|
||||
XXH64_update(&zcs->xxhState, inBase, inTmp.pos);
|
||||
}
|
||||
|
||||
zcs->frameCSize += output->pos - prevOutPos;
|
||||
zcs->frameDSize += inTmp.pos;
|
||||
|
||||
input->pos += inTmp.pos;
|
||||
|
||||
if (ZSTD_isError(ret)) return ret;
|
||||
}
|
||||
|
||||
if (zcs->maxFrameSize == zcs->frameDSize) {
|
||||
/* log the frame and start over */
|
||||
size_t const ret = ZSTD_seekable_endFrame(zcs, output);
|
||||
if (ZSTD_isError(ret)) return ret;
|
||||
|
||||
/* get the client ready for the next frame */
|
||||
return (size_t)zcs->maxFrameSize;
|
||||
}
|
||||
|
||||
return (size_t)(zcs->maxFrameSize - zcs->frameDSize);
|
||||
}
|
||||
|
||||
static inline size_t ZSTD_seekable_seekTableSize(const ZSTD_frameLog* fl)
|
||||
{
|
||||
size_t const sizePerFrame = 8 + (fl->checksumFlag?4:0);
|
||||
size_t const seekTableLen = ZSTD_SKIPPABLEHEADERSIZE +
|
||||
sizePerFrame * fl->size +
|
||||
ZSTD_seekTableFooterSize;
|
||||
|
||||
return seekTableLen;
|
||||
}
|
||||
|
||||
static inline size_t ZSTD_stwrite32(ZSTD_frameLog* fl,
|
||||
ZSTD_outBuffer* output, U32 const value,
|
||||
U32 const offset)
|
||||
{
|
||||
if (fl->seekTablePos < offset + 4) {
|
||||
BYTE tmp[4]; /* so that we can work with buffers too small to write a whole word to */
|
||||
size_t const lenWrite =
|
||||
MIN(output->size - output->pos, offset + 4 - fl->seekTablePos);
|
||||
MEM_writeLE32(tmp, value);
|
||||
memcpy((BYTE*)output->dst + output->pos,
|
||||
tmp + (fl->seekTablePos - offset), lenWrite);
|
||||
output->pos += lenWrite;
|
||||
fl->seekTablePos += lenWrite;
|
||||
|
||||
if (lenWrite < 4) return ZSTD_seekable_seekTableSize(fl) - fl->seekTablePos;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_writeSeekTable(ZSTD_frameLog* fl, ZSTD_outBuffer* output)
|
||||
{
|
||||
/* seekTableIndex: the current index in the table and
|
||||
* seekTableSize: the amount of the table written so far
|
||||
*
|
||||
* This function is written this way so that if it has to return early
|
||||
* because of a small buffer, it can keep going where it left off.
|
||||
*/
|
||||
|
||||
size_t const sizePerFrame = 8 + (fl->checksumFlag?4:0);
|
||||
size_t const seekTableLen = ZSTD_seekable_seekTableSize(fl);
|
||||
|
||||
CHECK_Z(ZSTD_stwrite32(fl, output, ZSTD_MAGIC_SKIPPABLE_START | 0xE, 0));
|
||||
assert(seekTableLen <= (size_t)UINT_MAX);
|
||||
CHECK_Z(ZSTD_stwrite32(fl, output, (U32)seekTableLen - ZSTD_SKIPPABLEHEADERSIZE, 4));
|
||||
|
||||
while (fl->seekTableIndex < fl->size) {
|
||||
unsigned long long const start = ZSTD_SKIPPABLEHEADERSIZE + sizePerFrame * fl->seekTableIndex;
|
||||
assert(start + 8 <= UINT_MAX);
|
||||
CHECK_Z(ZSTD_stwrite32(fl, output,
|
||||
fl->entries[fl->seekTableIndex].cSize,
|
||||
(U32)start + 0));
|
||||
|
||||
CHECK_Z(ZSTD_stwrite32(fl, output,
|
||||
fl->entries[fl->seekTableIndex].dSize,
|
||||
(U32)start + 4));
|
||||
|
||||
if (fl->checksumFlag) {
|
||||
CHECK_Z(ZSTD_stwrite32(
|
||||
fl, output, fl->entries[fl->seekTableIndex].checksum,
|
||||
(U32)start + 8));
|
||||
}
|
||||
|
||||
fl->seekTableIndex++;
|
||||
}
|
||||
|
||||
assert(seekTableLen <= UINT_MAX);
|
||||
CHECK_Z(ZSTD_stwrite32(fl, output, fl->size,
|
||||
(U32)seekTableLen - ZSTD_seekTableFooterSize));
|
||||
|
||||
if (output->size - output->pos < 1) return seekTableLen - fl->seekTablePos;
|
||||
if (fl->seekTablePos < seekTableLen - 4) {
|
||||
BYTE sfd = 0;
|
||||
sfd |= (fl->checksumFlag) << 7;
|
||||
|
||||
((BYTE*)output->dst)[output->pos] = sfd;
|
||||
output->pos++;
|
||||
fl->seekTablePos++;
|
||||
}
|
||||
|
||||
CHECK_Z(ZSTD_stwrite32(fl, output, ZSTD_SEEKABLE_MAGICNUMBER,
|
||||
(U32)seekTableLen - 4));
|
||||
|
||||
if (fl->seekTablePos != seekTableLen) return ERROR(GENERIC);
|
||||
return 0;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_endStream(ZSTD_seekable_CStream* zcs, ZSTD_outBuffer* output)
|
||||
{
|
||||
if (!zcs->writingSeekTable && zcs->frameDSize) {
|
||||
const size_t endFrame = ZSTD_seekable_endFrame(zcs, output);
|
||||
if (ZSTD_isError(endFrame)) return endFrame;
|
||||
/* return an accurate size hint */
|
||||
if (endFrame) return endFrame + ZSTD_seekable_seekTableSize(&zcs->framelog);
|
||||
}
|
||||
|
||||
zcs->writingSeekTable = 1;
|
||||
|
||||
return ZSTD_seekable_writeSeekTable(&zcs->framelog, output);
|
||||
}
|
@ -1,467 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2017-present, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
* You may select, at your option, one of the above-listed licenses.
|
||||
*/
|
||||
|
||||
/* *********************************************************
|
||||
* Turn on Large Files support (>4GB) for 32-bit Linux/Unix
|
||||
***********************************************************/
|
||||
#if !defined(__64BIT__) || defined(__MINGW32__) /* No point defining Large file for 64 bit but MinGW-w64 requires it */
|
||||
# if !defined(_FILE_OFFSET_BITS)
|
||||
# define _FILE_OFFSET_BITS 64 /* turn off_t into a 64-bit type for ftello, fseeko */
|
||||
# endif
|
||||
# if !defined(_LARGEFILE_SOURCE) /* obsolete macro, replaced with _FILE_OFFSET_BITS */
|
||||
# define _LARGEFILE_SOURCE 1 /* Large File Support extension (LFS) - fseeko, ftello */
|
||||
# endif
|
||||
# if defined(_AIX) || defined(__hpux)
|
||||
# define _LARGE_FILES /* Large file support on 32-bits AIX and HP-UX */
|
||||
# endif
|
||||
#endif
|
||||
|
||||
/* ************************************************************
|
||||
* Avoid fseek()'s 2GiB barrier with MSVC, macOS, *BSD, MinGW
|
||||
***************************************************************/
|
||||
#if defined(_MSC_VER) && _MSC_VER >= 1400
|
||||
# define LONG_SEEK _fseeki64
|
||||
#elif !defined(__64BIT__) && (PLATFORM_POSIX_VERSION >= 200112L) /* No point defining Large file for 64 bit */
|
||||
# define LONG_SEEK fseeko
|
||||
#elif defined(__MINGW32__) && !defined(__STRICT_ANSI__) && !defined(__NO_MINGW_LFS) && defined(__MSVCRT__)
|
||||
# define LONG_SEEK fseeko64
|
||||
#elif defined(_WIN32) && !defined(__DJGPP__)
|
||||
# include <windows.h>
|
||||
static int LONG_SEEK(FILE* file, __int64 offset, int origin) {
|
||||
LARGE_INTEGER off;
|
||||
DWORD method;
|
||||
off.QuadPart = offset;
|
||||
if (origin == SEEK_END)
|
||||
method = FILE_END;
|
||||
else if (origin == SEEK_CUR)
|
||||
method = FILE_CURRENT;
|
||||
else
|
||||
method = FILE_BEGIN;
|
||||
|
||||
if (SetFilePointerEx((HANDLE) _get_osfhandle(_fileno(file)), off, NULL, method))
|
||||
return 0;
|
||||
else
|
||||
return -1;
|
||||
}
|
||||
#else
|
||||
# define LONG_SEEK fseek
|
||||
#endif
|
||||
|
||||
#include <stdlib.h> /* malloc, free */
|
||||
#include <stdio.h> /* FILE* */
|
||||
#include <limits.h> /* UNIT_MAX */
|
||||
#include <assert.h>
|
||||
|
||||
#define XXH_STATIC_LINKING_ONLY
|
||||
#define XXH_NAMESPACE ZSTD_
|
||||
#include "xxhash.h"
|
||||
|
||||
#define ZSTD_STATIC_LINKING_ONLY
|
||||
#include "zstd.h"
|
||||
#include "zstd_errors.h"
|
||||
#include "mem.h"
|
||||
#include "zstd_seekable.h"
|
||||
|
||||
#undef ERROR
|
||||
#define ERROR(name) ((size_t)-ZSTD_error_##name)
|
||||
|
||||
#define CHECK_IO(f) { int const errcod = (f); if (errcod < 0) return ERROR(seekableIO); }
|
||||
|
||||
#undef MIN
|
||||
#undef MAX
|
||||
#define MIN(a, b) ((a) < (b) ? (a) : (b))
|
||||
#define MAX(a, b) ((a) > (b) ? (a) : (b))
|
||||
|
||||
/* Special-case callbacks for FILE* and in-memory modes, so that we can treat
|
||||
* them the same way as the advanced API */
|
||||
static int ZSTD_seekable_read_FILE(void* opaque, void* buffer, size_t n)
|
||||
{
|
||||
size_t const result = fread(buffer, 1, n, (FILE*)opaque);
|
||||
if (result != n) {
|
||||
return -1;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int ZSTD_seekable_seek_FILE(void* opaque, long long offset, int origin)
|
||||
{
|
||||
int const ret = LONG_SEEK((FILE*)opaque, offset, origin);
|
||||
if (ret) return ret;
|
||||
return fflush((FILE*)opaque);
|
||||
}
|
||||
|
||||
typedef struct {
|
||||
const void *ptr;
|
||||
size_t size;
|
||||
size_t pos;
|
||||
} buffWrapper_t;
|
||||
|
||||
static int ZSTD_seekable_read_buff(void* opaque, void* buffer, size_t n)
|
||||
{
|
||||
buffWrapper_t* buff = (buffWrapper_t*) opaque;
|
||||
if (buff->pos + n > buff->size) return -1;
|
||||
memcpy(buffer, (const BYTE*)buff->ptr + buff->pos, n);
|
||||
buff->pos += n;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int ZSTD_seekable_seek_buff(void* opaque, long long offset, int origin)
|
||||
{
|
||||
buffWrapper_t* const buff = (buffWrapper_t*) opaque;
|
||||
unsigned long long newOffset;
|
||||
switch (origin) {
|
||||
case SEEK_SET:
|
||||
newOffset = offset;
|
||||
break;
|
||||
case SEEK_CUR:
|
||||
newOffset = (unsigned long long)buff->pos + offset;
|
||||
break;
|
||||
case SEEK_END:
|
||||
newOffset = (unsigned long long)buff->size + offset;
|
||||
break;
|
||||
default:
|
||||
assert(0); /* not possible */
|
||||
}
|
||||
if (newOffset > buff->size) {
|
||||
return -1;
|
||||
}
|
||||
buff->pos = newOffset;
|
||||
return 0;
|
||||
}
|
||||
|
||||
typedef struct {
|
||||
U64 cOffset;
|
||||
U64 dOffset;
|
||||
U32 checksum;
|
||||
} seekEntry_t;
|
||||
|
||||
typedef struct {
|
||||
seekEntry_t* entries;
|
||||
size_t tableLen;
|
||||
|
||||
int checksumFlag;
|
||||
} seekTable_t;
|
||||
|
||||
#define SEEKABLE_BUFF_SIZE ZSTD_BLOCKSIZE_MAX
|
||||
|
||||
struct ZSTD_seekable_s {
|
||||
ZSTD_DStream* dstream;
|
||||
seekTable_t seekTable;
|
||||
ZSTD_seekable_customFile src;
|
||||
|
||||
U64 decompressedOffset;
|
||||
U32 curFrame;
|
||||
|
||||
BYTE inBuff[SEEKABLE_BUFF_SIZE]; /* need to do our own input buffering */
|
||||
BYTE outBuff[SEEKABLE_BUFF_SIZE]; /* so we can efficiently decompress the
|
||||
starts of chunks before we get to the
|
||||
desired section */
|
||||
ZSTD_inBuffer in; /* maintain continuity across ZSTD_seekable_decompress operations */
|
||||
buffWrapper_t buffWrapper; /* for `src.opaque` in in-memory mode */
|
||||
|
||||
XXH64_state_t xxhState;
|
||||
};
|
||||
|
||||
ZSTD_seekable* ZSTD_seekable_create(void)
|
||||
{
|
||||
ZSTD_seekable* zs = malloc(sizeof(ZSTD_seekable));
|
||||
|
||||
if (zs == NULL) return NULL;
|
||||
|
||||
/* also initializes stage to zsds_init */
|
||||
memset(zs, 0, sizeof(*zs));
|
||||
|
||||
zs->dstream = ZSTD_createDStream();
|
||||
if (zs->dstream == NULL) {
|
||||
free(zs);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
return zs;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_free(ZSTD_seekable* zs)
|
||||
{
|
||||
if (zs == NULL) return 0; /* support free on null */
|
||||
ZSTD_freeDStream(zs->dstream);
|
||||
free(zs->seekTable.entries);
|
||||
free(zs);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/** ZSTD_seekable_offsetToFrameIndex() :
|
||||
* Performs a binary search to find the last frame with a decompressed offset
|
||||
* <= pos
|
||||
* @return : the frame's index */
|
||||
unsigned ZSTD_seekable_offsetToFrameIndex(ZSTD_seekable* const zs, unsigned long long pos)
|
||||
{
|
||||
U32 lo = 0;
|
||||
U32 hi = (U32)zs->seekTable.tableLen;
|
||||
assert(zs->seekTable.tableLen <= UINT_MAX);
|
||||
|
||||
if (pos >= zs->seekTable.entries[zs->seekTable.tableLen].dOffset) {
|
||||
return (U32)zs->seekTable.tableLen;
|
||||
}
|
||||
|
||||
while (lo + 1 < hi) {
|
||||
U32 const mid = lo + ((hi - lo) >> 1);
|
||||
if (zs->seekTable.entries[mid].dOffset <= pos) {
|
||||
lo = mid;
|
||||
} else {
|
||||
hi = mid;
|
||||
}
|
||||
}
|
||||
return lo;
|
||||
}
|
||||
|
||||
unsigned ZSTD_seekable_getNumFrames(ZSTD_seekable* const zs)
|
||||
{
|
||||
assert(zs->seekTable.tableLen <= UINT_MAX);
|
||||
return (unsigned)zs->seekTable.tableLen;
|
||||
}
|
||||
|
||||
unsigned long long ZSTD_seekable_getFrameCompressedOffset(ZSTD_seekable* const zs, unsigned frameIndex)
|
||||
{
|
||||
if (frameIndex >= zs->seekTable.tableLen) return ZSTD_SEEKABLE_FRAMEINDEX_TOOLARGE;
|
||||
return zs->seekTable.entries[frameIndex].cOffset;
|
||||
}
|
||||
|
||||
unsigned long long ZSTD_seekable_getFrameDecompressedOffset(ZSTD_seekable* const zs, unsigned frameIndex)
|
||||
{
|
||||
if (frameIndex >= zs->seekTable.tableLen) return ZSTD_SEEKABLE_FRAMEINDEX_TOOLARGE;
|
||||
return zs->seekTable.entries[frameIndex].dOffset;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_getFrameCompressedSize(ZSTD_seekable* const zs, unsigned frameIndex)
|
||||
{
|
||||
if (frameIndex >= zs->seekTable.tableLen) return ERROR(frameIndex_tooLarge);
|
||||
return zs->seekTable.entries[frameIndex + 1].cOffset -
|
||||
zs->seekTable.entries[frameIndex].cOffset;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_getFrameDecompressedSize(ZSTD_seekable* const zs, unsigned frameIndex)
|
||||
{
|
||||
if (frameIndex > zs->seekTable.tableLen) return ERROR(frameIndex_tooLarge);
|
||||
return zs->seekTable.entries[frameIndex + 1].dOffset -
|
||||
zs->seekTable.entries[frameIndex].dOffset;
|
||||
}
|
||||
|
||||
static size_t ZSTD_seekable_loadSeekTable(ZSTD_seekable* zs)
|
||||
{
|
||||
int checksumFlag;
|
||||
ZSTD_seekable_customFile src = zs->src;
|
||||
/* read the footer, fixed size */
|
||||
CHECK_IO(src.seek(src.opaque, -(int)ZSTD_seekTableFooterSize, SEEK_END));
|
||||
CHECK_IO(src.read(src.opaque, zs->inBuff, ZSTD_seekTableFooterSize));
|
||||
|
||||
if (MEM_readLE32(zs->inBuff + 5) != ZSTD_SEEKABLE_MAGICNUMBER) {
|
||||
return ERROR(prefix_unknown);
|
||||
}
|
||||
|
||||
{ BYTE const sfd = zs->inBuff[4];
|
||||
checksumFlag = sfd >> 7;
|
||||
|
||||
/* check reserved bits */
|
||||
if ((checksumFlag >> 2) & 0x1f) {
|
||||
return ERROR(corruption_detected);
|
||||
}
|
||||
}
|
||||
|
||||
{ U32 const numFrames = MEM_readLE32(zs->inBuff);
|
||||
U32 const sizePerEntry = 8 + (checksumFlag?4:0);
|
||||
U32 const tableSize = sizePerEntry * numFrames;
|
||||
U32 const frameSize = tableSize + ZSTD_seekTableFooterSize + ZSTD_SKIPPABLEHEADERSIZE;
|
||||
|
||||
U32 remaining = frameSize - ZSTD_seekTableFooterSize; /* don't need to re-read footer */
|
||||
{
|
||||
U32 const toRead = MIN(remaining, SEEKABLE_BUFF_SIZE);
|
||||
|
||||
CHECK_IO(src.seek(src.opaque, -(S64)frameSize, SEEK_END));
|
||||
CHECK_IO(src.read(src.opaque, zs->inBuff, toRead));
|
||||
|
||||
remaining -= toRead;
|
||||
}
|
||||
|
||||
if (MEM_readLE32(zs->inBuff) != (ZSTD_MAGIC_SKIPPABLE_START | 0xE)) {
|
||||
return ERROR(prefix_unknown);
|
||||
}
|
||||
if (MEM_readLE32(zs->inBuff+4) + ZSTD_SKIPPABLEHEADERSIZE != frameSize) {
|
||||
return ERROR(prefix_unknown);
|
||||
}
|
||||
|
||||
{ /* Allocate an extra entry at the end so that we can do size
|
||||
* computations on the last element without special case */
|
||||
seekEntry_t* entries = (seekEntry_t*)malloc(sizeof(seekEntry_t) * (numFrames + 1));
|
||||
|
||||
U32 idx = 0;
|
||||
U32 pos = 8;
|
||||
|
||||
|
||||
U64 cOffset = 0;
|
||||
U64 dOffset = 0;
|
||||
|
||||
if (!entries) {
|
||||
free(entries);
|
||||
return ERROR(memory_allocation);
|
||||
}
|
||||
|
||||
/* compute cumulative positions */
|
||||
for (; idx < numFrames; idx++) {
|
||||
if (pos + sizePerEntry > SEEKABLE_BUFF_SIZE) {
|
||||
U32 const offset = SEEKABLE_BUFF_SIZE - pos;
|
||||
U32 const toRead = MIN(remaining, SEEKABLE_BUFF_SIZE - offset);
|
||||
memmove(zs->inBuff, zs->inBuff + pos, offset); /* move any data we haven't read yet */
|
||||
CHECK_IO(src.read(src.opaque, zs->inBuff+offset, toRead));
|
||||
remaining -= toRead;
|
||||
pos = 0;
|
||||
}
|
||||
entries[idx].cOffset = cOffset;
|
||||
entries[idx].dOffset = dOffset;
|
||||
|
||||
cOffset += MEM_readLE32(zs->inBuff + pos);
|
||||
pos += 4;
|
||||
dOffset += MEM_readLE32(zs->inBuff + pos);
|
||||
pos += 4;
|
||||
if (checksumFlag) {
|
||||
entries[idx].checksum = MEM_readLE32(zs->inBuff + pos);
|
||||
pos += 4;
|
||||
}
|
||||
}
|
||||
entries[numFrames].cOffset = cOffset;
|
||||
entries[numFrames].dOffset = dOffset;
|
||||
|
||||
zs->seekTable.entries = entries;
|
||||
zs->seekTable.tableLen = numFrames;
|
||||
zs->seekTable.checksumFlag = checksumFlag;
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_initBuff(ZSTD_seekable* zs, const void* src, size_t srcSize)
|
||||
{
|
||||
zs->buffWrapper = (buffWrapper_t){src, srcSize, 0};
|
||||
{ ZSTD_seekable_customFile srcFile = {&zs->buffWrapper,
|
||||
&ZSTD_seekable_read_buff,
|
||||
&ZSTD_seekable_seek_buff};
|
||||
return ZSTD_seekable_initAdvanced(zs, srcFile); }
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_initFile(ZSTD_seekable* zs, FILE* src)
|
||||
{
|
||||
ZSTD_seekable_customFile srcFile = {src, &ZSTD_seekable_read_FILE,
|
||||
&ZSTD_seekable_seek_FILE};
|
||||
return ZSTD_seekable_initAdvanced(zs, srcFile);
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_initAdvanced(ZSTD_seekable* zs, ZSTD_seekable_customFile src)
|
||||
{
|
||||
zs->src = src;
|
||||
|
||||
{ const size_t seekTableInit = ZSTD_seekable_loadSeekTable(zs);
|
||||
if (ZSTD_isError(seekTableInit)) return seekTableInit; }
|
||||
|
||||
zs->decompressedOffset = (U64)-1;
|
||||
zs->curFrame = (U32)-1;
|
||||
|
||||
{ const size_t dstreamInit = ZSTD_initDStream(zs->dstream);
|
||||
if (ZSTD_isError(dstreamInit)) return dstreamInit; }
|
||||
return 0;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_decompress(ZSTD_seekable* zs, void* dst, size_t len, unsigned long long offset)
|
||||
{
|
||||
U32 targetFrame = ZSTD_seekable_offsetToFrameIndex(zs, offset);
|
||||
do {
|
||||
/* check if we can continue from a previous decompress job */
|
||||
if (targetFrame != zs->curFrame || offset != zs->decompressedOffset) {
|
||||
zs->decompressedOffset = zs->seekTable.entries[targetFrame].dOffset;
|
||||
zs->curFrame = targetFrame;
|
||||
|
||||
CHECK_IO(zs->src.seek(zs->src.opaque,
|
||||
zs->seekTable.entries[targetFrame].cOffset,
|
||||
SEEK_SET));
|
||||
zs->in = (ZSTD_inBuffer){zs->inBuff, 0, 0};
|
||||
XXH64_reset(&zs->xxhState, 0);
|
||||
ZSTD_resetDStream(zs->dstream);
|
||||
}
|
||||
|
||||
while (zs->decompressedOffset < offset + len) {
|
||||
size_t toRead;
|
||||
ZSTD_outBuffer outTmp;
|
||||
size_t prevOutPos;
|
||||
if (zs->decompressedOffset < offset) {
|
||||
/* dummy decompressions until we get to the target offset */
|
||||
outTmp = (ZSTD_outBuffer){zs->outBuff, MIN(SEEKABLE_BUFF_SIZE, offset - zs->decompressedOffset), 0};
|
||||
} else {
|
||||
outTmp = (ZSTD_outBuffer){dst, len, zs->decompressedOffset - offset};
|
||||
}
|
||||
|
||||
prevOutPos = outTmp.pos;
|
||||
toRead = ZSTD_decompressStream(zs->dstream, &outTmp, &zs->in);
|
||||
if (ZSTD_isError(toRead)) {
|
||||
return toRead;
|
||||
}
|
||||
|
||||
if (zs->seekTable.checksumFlag) {
|
||||
XXH64_update(&zs->xxhState, (BYTE*)outTmp.dst + prevOutPos,
|
||||
outTmp.pos - prevOutPos);
|
||||
}
|
||||
zs->decompressedOffset += outTmp.pos - prevOutPos;
|
||||
|
||||
if (toRead == 0) {
|
||||
/* frame complete */
|
||||
|
||||
/* verify checksum */
|
||||
if (zs->seekTable.checksumFlag &&
|
||||
(XXH64_digest(&zs->xxhState) & 0xFFFFFFFFU) !=
|
||||
zs->seekTable.entries[targetFrame].checksum) {
|
||||
return ERROR(corruption_detected);
|
||||
}
|
||||
|
||||
if (zs->decompressedOffset < offset + len) {
|
||||
/* go back to the start and force a reset of the stream */
|
||||
targetFrame = ZSTD_seekable_offsetToFrameIndex(zs, zs->decompressedOffset);
|
||||
}
|
||||
break;
|
||||
}
|
||||
|
||||
/* read in more data if we're done with this buffer */
|
||||
if (zs->in.pos == zs->in.size) {
|
||||
toRead = MIN(toRead, SEEKABLE_BUFF_SIZE);
|
||||
CHECK_IO(zs->src.read(zs->src.opaque, zs->inBuff, toRead));
|
||||
zs->in.size = toRead;
|
||||
zs->in.pos = 0;
|
||||
}
|
||||
}
|
||||
} while (zs->decompressedOffset != offset + len);
|
||||
|
||||
return len;
|
||||
}
|
||||
|
||||
size_t ZSTD_seekable_decompressFrame(ZSTD_seekable* zs, void* dst, size_t dstSize, unsigned frameIndex)
|
||||
{
|
||||
if (frameIndex >= zs->seekTable.tableLen) {
|
||||
return ERROR(frameIndex_tooLarge);
|
||||
}
|
||||
|
||||
{
|
||||
size_t const decompressedSize =
|
||||
zs->seekTable.entries[frameIndex + 1].dOffset -
|
||||
zs->seekTable.entries[frameIndex].dOffset;
|
||||
if (dstSize < decompressedSize) {
|
||||
return ERROR(dstSize_tooSmall);
|
||||
}
|
||||
return ZSTD_seekable_decompress(
|
||||
zs, dst, decompressedSize,
|
||||
zs->seekTable.entries[frameIndex].dOffset);
|
||||
}
|
||||
}
|
@ -1,28 +0,0 @@
|
||||
name: zstd
|
||||
version: git
|
||||
summary: Zstandard - Fast real-time compression algorithm
|
||||
description: |
|
||||
Zstandard, or zstd as short version, is a fast lossless compression
|
||||
algorithm, targeting real-time compression scenarios at zlib-level and better
|
||||
compression ratios. It's backed by a very fast entropy stage, provided by
|
||||
Huff0 and FSE library
|
||||
|
||||
grade: devel # must be 'stable' to release into candidate/stable channels
|
||||
confinement: devmode # use 'strict' once you have the right plugs and slots
|
||||
|
||||
apps:
|
||||
zstd:
|
||||
command: usr/local/bin/zstd
|
||||
plugs: [home, removable-media]
|
||||
zstdgrep:
|
||||
command: usr/local/bin/zstdgrep
|
||||
plugs: [home, removable-media]
|
||||
zstdless:
|
||||
command: usr/local/bin/zstdless
|
||||
plugs: [home, removable-media]
|
||||
|
||||
parts:
|
||||
zstd:
|
||||
source: .
|
||||
plugin: make
|
||||
build-packages: [g++]
|
@ -1,10 +1,11 @@
|
||||
# ################################################################
|
||||
# Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
||||
# Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
|
||||
# All rights reserved.
|
||||
#
|
||||
# This source code is licensed under both the BSD-style license (found in the
|
||||
# LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
# in the COPYING file in the root directory of this source tree).
|
||||
# You may select, at your option, one of the above-listed licenses.
|
||||
# ################################################################
|
||||
|
||||
ZSTD ?= zstd # note: requires zstd installation on local system
|
||||
@ -36,7 +37,7 @@ harness: $(HARNESS_FILES)
|
||||
$(CC) $(FLAGS) $^ -o $@
|
||||
|
||||
clean:
|
||||
@$(RM) harness
|
||||
@$(RM) harness *.o
|
||||
@$(RM) -rf harness.dSYM # MacOS specific
|
||||
|
||||
test: harness
|
||||
@ -59,4 +60,3 @@ test: harness
|
||||
@./harness tmp.zst tmp dictionary
|
||||
@$(DIFF) -s tmp README.md
|
||||
@$(RM) tmp* dictionary
|
||||
@$(MAKE) clean
|
||||
|
@ -13,6 +13,13 @@ It also contains implementations of Huffman and FSE table decoding.
|
||||
[Zstandard format specification]: https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md
|
||||
[format specification]: https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md
|
||||
|
||||
While the library's primary objective is code clarity,
|
||||
it also happens to compile into a small object file.
|
||||
The object file can be made even smaller by removing error messages,
|
||||
using the macro directive `ZDEC_NO_MESSAGE` at compilation time.
|
||||
This can be reduced even further by foregoing dictionary support,
|
||||
by defining `ZDEC_NO_DICTIONARY`.
|
||||
|
||||
`harness.c` provides a simple test harness around the decoder:
|
||||
|
||||
harness <input-file> <output-file> [dictionary]
|
||||
|
@ -1,10 +1,11 @@
|
||||
/*
|
||||
* Copyright (c) 2017-present, Facebook, Inc.
|
||||
* Copyright (c) 2017-2020, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
* You may select, at your option, one of the above-listed licenses.
|
||||
*/
|
||||
|
||||
#include <stdio.h>
|
||||
@ -21,108 +22,98 @@ typedef unsigned char u8;
|
||||
// Protect against allocating too much memory for output
|
||||
#define MAX_OUTPUT_SIZE ((size_t)1024 * 1024 * 1024)
|
||||
|
||||
static size_t read_file(const char *path, u8 **ptr)
|
||||
// Error message then exit
|
||||
#define ERR_OUT(...) { fprintf(stderr, __VA_ARGS__); exit(1); }
|
||||
|
||||
|
||||
typedef struct {
|
||||
u8* address;
|
||||
size_t size;
|
||||
} buffer_s;
|
||||
|
||||
static void freeBuffer(buffer_s b) { free(b.address); }
|
||||
|
||||
static buffer_s read_file(const char *path)
|
||||
{
|
||||
FILE* const f = fopen(path, "rb");
|
||||
if (!f) {
|
||||
fprintf(stderr, "failed to open file %s \n", path);
|
||||
exit(1);
|
||||
}
|
||||
if (!f) ERR_OUT("failed to open file %s \n", path);
|
||||
|
||||
fseek(f, 0L, SEEK_END);
|
||||
size_t const size = (size_t)ftell(f);
|
||||
rewind(f);
|
||||
|
||||
*ptr = malloc(size);
|
||||
if (!ptr) {
|
||||
fprintf(stderr, "failed to allocate memory to hold %s \n", path);
|
||||
exit(1);
|
||||
}
|
||||
void* const ptr = malloc(size);
|
||||
if (!ptr) ERR_OUT("failed to allocate memory to hold %s \n", path);
|
||||
|
||||
size_t const read = fread(*ptr, 1, size, f);
|
||||
if (read != size) { /* must read everything in one pass */
|
||||
fprintf(stderr, "error while reading file %s \n", path);
|
||||
exit(1);
|
||||
}
|
||||
size_t const read = fread(ptr, 1, size, f);
|
||||
if (read != size) ERR_OUT("error while reading file %s \n", path);
|
||||
|
||||
fclose(f);
|
||||
|
||||
return read;
|
||||
buffer_s const b = { ptr, size };
|
||||
return b;
|
||||
}
|
||||
|
||||
static void write_file(const char *path, const u8 *ptr, size_t size)
|
||||
static void write_file(const char* path, const u8* ptr, size_t size)
|
||||
{
|
||||
FILE* const f = fopen(path, "wb");
|
||||
if (!f) {
|
||||
fprintf(stderr, "failed to open file %s \n", path);
|
||||
exit(1);
|
||||
}
|
||||
if (!f) ERR_OUT("failed to open file %s \n", path);
|
||||
|
||||
size_t written = 0;
|
||||
while (written < size) {
|
||||
written += fwrite(ptr+written, 1, size, f);
|
||||
if (ferror(f)) {
|
||||
fprintf(stderr, "error while writing file %s\n", path);
|
||||
exit(1);
|
||||
} }
|
||||
if (ferror(f)) ERR_OUT("error while writing file %s\n", path);
|
||||
}
|
||||
|
||||
fclose(f);
|
||||
}
|
||||
|
||||
int main(int argc, char **argv)
|
||||
{
|
||||
if (argc < 3) {
|
||||
fprintf(stderr, "usage: %s <file.zst> <out_path> [dictionary] \n",
|
||||
argv[0]);
|
||||
if (argc < 3)
|
||||
ERR_OUT("usage: %s <file.zst> <out_path> [dictionary] \n", argv[0]);
|
||||
|
||||
return 1;
|
||||
}
|
||||
buffer_s const input = read_file(argv[1]);
|
||||
|
||||
u8* input;
|
||||
size_t const input_size = read_file(argv[1], &input);
|
||||
|
||||
u8* dict = NULL;
|
||||
size_t dict_size = 0;
|
||||
buffer_s dict = { NULL, 0 };
|
||||
if (argc >= 4) {
|
||||
dict_size = read_file(argv[3], &dict);
|
||||
dict = read_file(argv[3]);
|
||||
}
|
||||
|
||||
size_t out_capacity = ZSTD_get_decompressed_size(input, input_size);
|
||||
size_t out_capacity = ZSTD_get_decompressed_size(input.address, input.size);
|
||||
if (out_capacity == (size_t)-1) {
|
||||
out_capacity = MAX_COMPRESSION_RATIO * input_size;
|
||||
out_capacity = MAX_COMPRESSION_RATIO * input.size;
|
||||
fprintf(stderr, "WARNING: Compressed data does not contain "
|
||||
"decompressed size, going to assume the compression "
|
||||
"ratio is at most %d (decompressed size of at most "
|
||||
"%u) \n",
|
||||
MAX_COMPRESSION_RATIO, (unsigned)out_capacity);
|
||||
}
|
||||
if (out_capacity > MAX_OUTPUT_SIZE) {
|
||||
fprintf(stderr,
|
||||
"Required output size too large for this implementation \n");
|
||||
return 1;
|
||||
}
|
||||
if (out_capacity > MAX_OUTPUT_SIZE)
|
||||
ERR_OUT("Required output size too large for this implementation \n");
|
||||
|
||||
u8* const output = malloc(out_capacity);
|
||||
if (!output) {
|
||||
fprintf(stderr, "failed to allocate memory \n");
|
||||
return 1;
|
||||
}
|
||||
if (!output) ERR_OUT("failed to allocate memory \n");
|
||||
|
||||
dictionary_t* const parsed_dict = create_dictionary();
|
||||
if (dict) {
|
||||
parse_dictionary(parsed_dict, dict, dict_size);
|
||||
if (dict.size) {
|
||||
#if defined (ZDEC_NO_DICTIONARY)
|
||||
printf("dict.size = %zu \n", dict.size);
|
||||
ERR_OUT("no dictionary support \n");
|
||||
#else
|
||||
parse_dictionary(parsed_dict, dict.address, dict.size);
|
||||
#endif
|
||||
}
|
||||
size_t const decompressed_size =
|
||||
ZSTD_decompress_with_dict(output, out_capacity,
|
||||
input, input_size,
|
||||
input.address, input.size,
|
||||
parsed_dict);
|
||||
|
||||
free_dictionary(parsed_dict);
|
||||
|
||||
write_file(argv[2], output, decompressed_size);
|
||||
|
||||
free(input);
|
||||
freeBuffer(input);
|
||||
freeBuffer(dict);
|
||||
free(output);
|
||||
free(dict);
|
||||
return 0;
|
||||
}
|
||||
|
@ -1,34 +1,52 @@
|
||||
/*
|
||||
* Copyright (c) 2017-present, Facebook, Inc.
|
||||
* Copyright (c) 2017-2020, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
* You may select, at your option, one of the above-listed licenses.
|
||||
*/
|
||||
|
||||
/// Zstandard educational decoder implementation
|
||||
/// See https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md
|
||||
|
||||
#include <stdint.h>
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include <stdint.h> // uint8_t, etc.
|
||||
#include <stdlib.h> // malloc, free, exit
|
||||
#include <stdio.h> // fprintf
|
||||
#include <string.h> // memset, memcpy
|
||||
#include "zstd_decompress.h"
|
||||
|
||||
/******* UTILITY MACROS AND TYPES *********************************************/
|
||||
// Max block size decompressed size is 128 KB and literal blocks can't be
|
||||
// larger than their block
|
||||
#define MAX_LITERALS_SIZE ((size_t)128 * 1024)
|
||||
|
||||
/******* IMPORTANT CONSTANTS *********************************************/
|
||||
|
||||
// Zstandard frame
|
||||
// "Magic_Number
|
||||
// 4 Bytes, little-endian format. Value : 0xFD2FB528"
|
||||
#define ZSTD_MAGIC_NUMBER 0xFD2FB528U
|
||||
|
||||
// The size of `Block_Content` is limited by `Block_Maximum_Size`,
|
||||
#define ZSTD_BLOCK_SIZE_MAX ((size_t)128 * 1024)
|
||||
|
||||
// literal blocks can't be larger than their block
|
||||
#define MAX_LITERALS_SIZE ZSTD_BLOCK_SIZE_MAX
|
||||
|
||||
|
||||
/******* UTILITY MACROS AND TYPES *********************************************/
|
||||
#define MAX(a, b) ((a) > (b) ? (a) : (b))
|
||||
#define MIN(a, b) ((a) < (b) ? (a) : (b))
|
||||
|
||||
#if defined(ZDEC_NO_MESSAGE)
|
||||
#define MESSAGE(...)
|
||||
#else
|
||||
#define MESSAGE(...) fprintf(stderr, "" __VA_ARGS__)
|
||||
#endif
|
||||
|
||||
/// This decoder calls exit(1) when it encounters an error, however a production
|
||||
/// library should propagate error codes
|
||||
#define ERROR(s) \
|
||||
do { \
|
||||
fprintf(stderr, "Error: %s\n", s); \
|
||||
MESSAGE("Error: %s\n", s); \
|
||||
exit(1); \
|
||||
} while (0)
|
||||
#define INP_SIZE() \
|
||||
@ -39,12 +57,12 @@
|
||||
#define BAD_ALLOC() ERROR("Memory allocation error")
|
||||
#define IMPOSSIBLE() ERROR("An impossibility has occurred")
|
||||
|
||||
typedef uint8_t u8;
|
||||
typedef uint8_t u8;
|
||||
typedef uint16_t u16;
|
||||
typedef uint32_t u32;
|
||||
typedef uint64_t u64;
|
||||
|
||||
typedef int8_t i8;
|
||||
typedef int8_t i8;
|
||||
typedef int16_t i16;
|
||||
typedef int32_t i32;
|
||||
typedef int64_t i64;
|
||||
@ -176,10 +194,6 @@ static void HUF_init_dtable_usingweights(HUF_dtable *const table,
|
||||
|
||||
/// Free the malloc'ed parts of a decoding table
|
||||
static void HUF_free_dtable(HUF_dtable *const dtable);
|
||||
|
||||
/// Deep copy a decoding table, so that it can be used and free'd without
|
||||
/// impacting the source table.
|
||||
static void HUF_copy_dtable(HUF_dtable *const dst, const HUF_dtable *const src);
|
||||
/*** END HUFFMAN PRIMITIVES ***********/
|
||||
|
||||
/*** FSE PRIMITIVES *******************/
|
||||
@ -241,10 +255,6 @@ static void FSE_init_dtable_rle(FSE_dtable *const dtable, const u8 symb);
|
||||
|
||||
/// Free the malloc'ed parts of a decoding table
|
||||
static void FSE_free_dtable(FSE_dtable *const dtable);
|
||||
|
||||
/// Deep copy a decoding table, so that it can be used and free'd without
|
||||
/// impacting the source table.
|
||||
static void FSE_copy_dtable(FSE_dtable *const dst, const FSE_dtable *const src);
|
||||
/*** END FSE PRIMITIVES ***************/
|
||||
|
||||
/******* END IMPLEMENTATION PRIMITIVE PROTOTYPES ******************************/
|
||||
@ -373,7 +383,7 @@ static void execute_match_copy(frame_context_t *const ctx, size_t offset,
|
||||
|
||||
size_t ZSTD_decompress(void *const dst, const size_t dst_len,
|
||||
const void *const src, const size_t src_len) {
|
||||
dictionary_t* uninit_dict = create_dictionary();
|
||||
dictionary_t* const uninit_dict = create_dictionary();
|
||||
size_t const decomp_size = ZSTD_decompress_with_dict(dst, dst_len, src,
|
||||
src_len, uninit_dict);
|
||||
free_dictionary(uninit_dict);
|
||||
@ -417,12 +427,7 @@ static void decompress_data(frame_context_t *const ctx, ostream_t *const out,
|
||||
static void decode_frame(ostream_t *const out, istream_t *const in,
|
||||
const dictionary_t *const dict) {
|
||||
const u32 magic_number = (u32)IO_read_bits(in, 32);
|
||||
// Zstandard frame
|
||||
//
|
||||
// "Magic_Number
|
||||
//
|
||||
// 4 Bytes, little-endian format. Value : 0xFD2FB528"
|
||||
if (magic_number == 0xFD2FB528U) {
|
||||
if (magic_number == ZSTD_MAGIC_NUMBER) {
|
||||
// ZSTD frame
|
||||
decode_data_frame(out, in, dict);
|
||||
|
||||
@ -576,43 +581,6 @@ static void parse_frame_header(frame_header_t *const header,
|
||||
}
|
||||
}
|
||||
|
||||
/// A dictionary acts as initializing values for the frame context before
|
||||
/// decompression, so we implement it by applying it's predetermined
|
||||
/// tables and content to the context before beginning decompression
|
||||
static void frame_context_apply_dict(frame_context_t *const ctx,
|
||||
const dictionary_t *const dict) {
|
||||
// If the content pointer is NULL then it must be an empty dict
|
||||
if (!dict || !dict->content)
|
||||
return;
|
||||
|
||||
// If the requested dictionary_id is non-zero, the correct dictionary must
|
||||
// be present
|
||||
if (ctx->header.dictionary_id != 0 &&
|
||||
ctx->header.dictionary_id != dict->dictionary_id) {
|
||||
ERROR("Wrong dictionary provided");
|
||||
}
|
||||
|
||||
// Copy the dict content to the context for references during sequence
|
||||
// execution
|
||||
ctx->dict_content = dict->content;
|
||||
ctx->dict_content_len = dict->content_size;
|
||||
|
||||
// If it's a formatted dict copy the precomputed tables in so they can
|
||||
// be used in the table repeat modes
|
||||
if (dict->dictionary_id != 0) {
|
||||
// Deep copy the entropy tables so they can be freed independently of
|
||||
// the dictionary struct
|
||||
HUF_copy_dtable(&ctx->literals_dtable, &dict->literals_dtable);
|
||||
FSE_copy_dtable(&ctx->ll_dtable, &dict->ll_dtable);
|
||||
FSE_copy_dtable(&ctx->of_dtable, &dict->of_dtable);
|
||||
FSE_copy_dtable(&ctx->ml_dtable, &dict->ml_dtable);
|
||||
|
||||
// Copy the repeated offsets
|
||||
memcpy(ctx->previous_offsets, dict->previous_offsets,
|
||||
sizeof(ctx->previous_offsets));
|
||||
}
|
||||
}
|
||||
|
||||
/// Decompress the data from a frame block by block
|
||||
static void decompress_data(frame_context_t *const ctx, ostream_t *const out,
|
||||
istream_t *const in) {
|
||||
@ -1411,7 +1379,7 @@ size_t ZSTD_get_decompressed_size(const void *src, const size_t src_len) {
|
||||
{
|
||||
const u32 magic_number = (u32)IO_read_bits(&in, 32);
|
||||
|
||||
if (magic_number == 0xFD2FB528U) {
|
||||
if (magic_number == ZSTD_MAGIC_NUMBER) {
|
||||
// ZSTD frame
|
||||
frame_header_t header;
|
||||
parse_frame_header(&header, &in);
|
||||
@ -1431,17 +1399,33 @@ size_t ZSTD_get_decompressed_size(const void *src, const size_t src_len) {
|
||||
/******* END OUTPUT SIZE COUNTING *********************************************/
|
||||
|
||||
/******* DICTIONARY PARSING ***************************************************/
|
||||
#define DICT_SIZE_ERROR() ERROR("Dictionary size cannot be less than 8 bytes")
|
||||
#define NULL_SRC() ERROR("Tried to create dictionary with pointer to null src");
|
||||
|
||||
dictionary_t* create_dictionary() {
|
||||
dictionary_t* dict = calloc(1, sizeof(dictionary_t));
|
||||
dictionary_t* const dict = calloc(1, sizeof(dictionary_t));
|
||||
if (!dict) {
|
||||
BAD_ALLOC();
|
||||
}
|
||||
return dict;
|
||||
}
|
||||
|
||||
/// Free an allocated dictionary
|
||||
void free_dictionary(dictionary_t *const dict) {
|
||||
HUF_free_dtable(&dict->literals_dtable);
|
||||
FSE_free_dtable(&dict->ll_dtable);
|
||||
FSE_free_dtable(&dict->of_dtable);
|
||||
FSE_free_dtable(&dict->ml_dtable);
|
||||
|
||||
free(dict->content);
|
||||
|
||||
memset(dict, 0, sizeof(dictionary_t));
|
||||
|
||||
free(dict);
|
||||
}
|
||||
|
||||
|
||||
#if !defined(ZDEC_NO_DICTIONARY)
|
||||
#define DICT_SIZE_ERROR() ERROR("Dictionary size cannot be less than 8 bytes")
|
||||
#define NULL_SRC() ERROR("Tried to create dictionary with pointer to null src");
|
||||
|
||||
static void init_dictionary_content(dictionary_t *const dict,
|
||||
istream_t *const in);
|
||||
|
||||
@ -1513,19 +1497,93 @@ static void init_dictionary_content(dictionary_t *const dict,
|
||||
memcpy(dict->content, content, dict->content_size);
|
||||
}
|
||||
|
||||
/// Free an allocated dictionary
|
||||
void free_dictionary(dictionary_t *const dict) {
|
||||
HUF_free_dtable(&dict->literals_dtable);
|
||||
FSE_free_dtable(&dict->ll_dtable);
|
||||
FSE_free_dtable(&dict->of_dtable);
|
||||
FSE_free_dtable(&dict->ml_dtable);
|
||||
static void HUF_copy_dtable(HUF_dtable *const dst,
|
||||
const HUF_dtable *const src) {
|
||||
if (src->max_bits == 0) {
|
||||
memset(dst, 0, sizeof(HUF_dtable));
|
||||
return;
|
||||
}
|
||||
|
||||
free(dict->content);
|
||||
const size_t size = (size_t)1 << src->max_bits;
|
||||
dst->max_bits = src->max_bits;
|
||||
|
||||
memset(dict, 0, sizeof(dictionary_t));
|
||||
dst->symbols = malloc(size);
|
||||
dst->num_bits = malloc(size);
|
||||
if (!dst->symbols || !dst->num_bits) {
|
||||
BAD_ALLOC();
|
||||
}
|
||||
|
||||
free(dict);
|
||||
memcpy(dst->symbols, src->symbols, size);
|
||||
memcpy(dst->num_bits, src->num_bits, size);
|
||||
}
|
||||
|
||||
static void FSE_copy_dtable(FSE_dtable *const dst, const FSE_dtable *const src) {
|
||||
if (src->accuracy_log == 0) {
|
||||
memset(dst, 0, sizeof(FSE_dtable));
|
||||
return;
|
||||
}
|
||||
|
||||
size_t size = (size_t)1 << src->accuracy_log;
|
||||
dst->accuracy_log = src->accuracy_log;
|
||||
|
||||
dst->symbols = malloc(size);
|
||||
dst->num_bits = malloc(size);
|
||||
dst->new_state_base = malloc(size * sizeof(u16));
|
||||
if (!dst->symbols || !dst->num_bits || !dst->new_state_base) {
|
||||
BAD_ALLOC();
|
||||
}
|
||||
|
||||
memcpy(dst->symbols, src->symbols, size);
|
||||
memcpy(dst->num_bits, src->num_bits, size);
|
||||
memcpy(dst->new_state_base, src->new_state_base, size * sizeof(u16));
|
||||
}
|
||||
|
||||
/// A dictionary acts as initializing values for the frame context before
|
||||
/// decompression, so we implement it by applying it's predetermined
|
||||
/// tables and content to the context before beginning decompression
|
||||
static void frame_context_apply_dict(frame_context_t *const ctx,
|
||||
const dictionary_t *const dict) {
|
||||
// If the content pointer is NULL then it must be an empty dict
|
||||
if (!dict || !dict->content)
|
||||
return;
|
||||
|
||||
// If the requested dictionary_id is non-zero, the correct dictionary must
|
||||
// be present
|
||||
if (ctx->header.dictionary_id != 0 &&
|
||||
ctx->header.dictionary_id != dict->dictionary_id) {
|
||||
ERROR("Wrong dictionary provided");
|
||||
}
|
||||
|
||||
// Copy the dict content to the context for references during sequence
|
||||
// execution
|
||||
ctx->dict_content = dict->content;
|
||||
ctx->dict_content_len = dict->content_size;
|
||||
|
||||
// If it's a formatted dict copy the precomputed tables in so they can
|
||||
// be used in the table repeat modes
|
||||
if (dict->dictionary_id != 0) {
|
||||
// Deep copy the entropy tables so they can be freed independently of
|
||||
// the dictionary struct
|
||||
HUF_copy_dtable(&ctx->literals_dtable, &dict->literals_dtable);
|
||||
FSE_copy_dtable(&ctx->ll_dtable, &dict->ll_dtable);
|
||||
FSE_copy_dtable(&ctx->of_dtable, &dict->of_dtable);
|
||||
FSE_copy_dtable(&ctx->ml_dtable, &dict->ml_dtable);
|
||||
|
||||
// Copy the repeated offsets
|
||||
memcpy(ctx->previous_offsets, dict->previous_offsets,
|
||||
sizeof(ctx->previous_offsets));
|
||||
}
|
||||
}
|
||||
|
||||
#else // ZDEC_NO_DICTIONARY is defined
|
||||
|
||||
static void frame_context_apply_dict(frame_context_t *const ctx,
|
||||
const dictionary_t *const dict) {
|
||||
(void)ctx;
|
||||
if (dict && dict->content) ERROR("dictionary not supported");
|
||||
}
|
||||
|
||||
#endif
|
||||
/******* END DICTIONARY PARSING ***********************************************/
|
||||
|
||||
/******* IO STREAM OPERATIONS *************************************************/
|
||||
@ -1945,26 +2003,6 @@ static void HUF_free_dtable(HUF_dtable *const dtable) {
|
||||
free(dtable->num_bits);
|
||||
memset(dtable, 0, sizeof(HUF_dtable));
|
||||
}
|
||||
|
||||
static void HUF_copy_dtable(HUF_dtable *const dst,
|
||||
const HUF_dtable *const src) {
|
||||
if (src->max_bits == 0) {
|
||||
memset(dst, 0, sizeof(HUF_dtable));
|
||||
return;
|
||||
}
|
||||
|
||||
const size_t size = (size_t)1 << src->max_bits;
|
||||
dst->max_bits = src->max_bits;
|
||||
|
||||
dst->symbols = malloc(size);
|
||||
dst->num_bits = malloc(size);
|
||||
if (!dst->symbols || !dst->num_bits) {
|
||||
BAD_ALLOC();
|
||||
}
|
||||
|
||||
memcpy(dst->symbols, src->symbols, size);
|
||||
memcpy(dst->num_bits, src->num_bits, size);
|
||||
}
|
||||
/******* END HUFFMAN PRIMITIVES ***********************************************/
|
||||
|
||||
/******* FSE PRIMITIVES *******************************************************/
|
||||
@ -2279,25 +2317,4 @@ static void FSE_free_dtable(FSE_dtable *const dtable) {
|
||||
free(dtable->new_state_base);
|
||||
memset(dtable, 0, sizeof(FSE_dtable));
|
||||
}
|
||||
|
||||
static void FSE_copy_dtable(FSE_dtable *const dst, const FSE_dtable *const src) {
|
||||
if (src->accuracy_log == 0) {
|
||||
memset(dst, 0, sizeof(FSE_dtable));
|
||||
return;
|
||||
}
|
||||
|
||||
size_t size = (size_t)1 << src->accuracy_log;
|
||||
dst->accuracy_log = src->accuracy_log;
|
||||
|
||||
dst->symbols = malloc(size);
|
||||
dst->num_bits = malloc(size);
|
||||
dst->new_state_base = malloc(size * sizeof(u16));
|
||||
if (!dst->symbols || !dst->num_bits || !dst->new_state_base) {
|
||||
BAD_ALLOC();
|
||||
}
|
||||
|
||||
memcpy(dst->symbols, src->symbols, size);
|
||||
memcpy(dst->num_bits, src->num_bits, size);
|
||||
memcpy(dst->new_state_base, src->new_state_base, size * sizeof(u16));
|
||||
}
|
||||
/******* END FSE PRIMITIVES ***************************************************/
|
||||
|
@ -1,10 +1,11 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Facebook, Inc.
|
||||
* Copyright (c) 2016-2020, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
* in the COPYING file in the root directory of this source tree).
|
||||
* You may select, at your option, one of the above-listed licenses.
|
||||
*/
|
||||
|
||||
#include <stddef.h> /* size_t */
|
||||
|
@ -16,7 +16,7 @@ Distribution of this document is unlimited.
|
||||
|
||||
### Version
|
||||
|
||||
0.3.4 (16/08/19)
|
||||
0.3.5 (13/11/19)
|
||||
|
||||
|
||||
Introduction
|
||||
@ -341,6 +341,8 @@ The structure of a block is as follows:
|
||||
|:--------------:|:---------------:|
|
||||
| 3 bytes | n bytes |
|
||||
|
||||
__`Block_Header`__
|
||||
|
||||
`Block_Header` uses 3 bytes, written using __little-endian__ convention.
|
||||
It contains 3 fields :
|
||||
|
||||
@ -385,17 +387,30 @@ There are 4 block types :
|
||||
__`Block_Size`__
|
||||
|
||||
The upper 21 bits of `Block_Header` represent the `Block_Size`.
|
||||
|
||||
When `Block_Type` is `Compressed_Block` or `Raw_Block`,
|
||||
`Block_Size` is the size of `Block_Content`, hence excluding `Block_Header`.
|
||||
When `Block_Type` is `RLE_Block`, `Block_Content`’s size is always 1,
|
||||
and `Block_Size` represents the number of times this byte must be repeated.
|
||||
A block can contain and decompress into any number of bytes (even zero),
|
||||
up to `Block_Maximum_Decompressed_Size`, which is the smallest of:
|
||||
- Window_Size
|
||||
`Block_Size` is the size of `Block_Content` (hence excluding `Block_Header`).
|
||||
|
||||
When `Block_Type` is `RLE_Block`, since `Block_Content`’s size is always 1,
|
||||
`Block_Size` represents the number of times this byte must be repeated.
|
||||
|
||||
`Block_Size` is limited by `Block_Maximum_Size` (see below).
|
||||
|
||||
__`Block_Content`__ and __`Block_Maximum_Size`__
|
||||
|
||||
The size of `Block_Content` is limited by `Block_Maximum_Size`,
|
||||
which is the smallest of:
|
||||
- `Window_Size`
|
||||
- 128 KB
|
||||
|
||||
If this condition cannot be respected when generating a `Compressed_Block`,
|
||||
the block must be sent uncompressed instead (`Raw_Block`).
|
||||
`Block_Maximum_Size` is constant for a given frame.
|
||||
This maximum is applicable to both the decompressed size
|
||||
and the compressed size of any block in the frame.
|
||||
|
||||
The reasoning for this limit is that a decoder can read this information
|
||||
at the beginning of a frame and use it to allocate buffers.
|
||||
The guarantees on the size of blocks ensure that
|
||||
the buffers will be large enough for any following block of the valid frame.
|
||||
|
||||
|
||||
Compressed Blocks
|
||||
@ -1658,6 +1673,7 @@ or at least provide a meaningful error code explaining for which reason it canno
|
||||
|
||||
Version changes
|
||||
---------------
|
||||
- 0.3.5 : clarifications for Block_Maximum_Size
|
||||
- 0.3.4 : clarifications for FSE decoding table
|
||||
- 0.3.3 : clarifications for field Block_Size
|
||||
- 0.3.2 : remove additional block size restriction on compressed blocks
|
||||
|
@ -1,10 +1,10 @@
|
||||
<html>
|
||||
<head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
|
||||
<title>zstd 1.4.4 Manual</title>
|
||||
<title>zstd 1.4.5 Manual</title>
|
||||
</head>
|
||||
<body>
|
||||
<h1>zstd 1.4.4 Manual</h1>
|
||||
<h1>zstd 1.4.5 Manual</h1>
|
||||
<hr>
|
||||
<a name="Contents"></a><h2>Contents</h2>
|
||||
<ol>
|
||||
@ -217,7 +217,10 @@ size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx);
|
||||
* Default level is ZSTD_CLEVEL_DEFAULT==3.
|
||||
* Special: value 0 means default, which is controlled by ZSTD_CLEVEL_DEFAULT.
|
||||
* Note 1 : it's possible to pass a negative compression level.
|
||||
* Note 2 : setting a level resets all other compression parameters to default */
|
||||
* Note 2 : setting a level does not automatically set all other compression parameters
|
||||
* to default. Setting this will however eventually dynamically impact the compression
|
||||
* parameters which have not been manually set. The manually set
|
||||
* ones will 'stick'. */
|
||||
</b>/* Advanced compression parameters :<b>
|
||||
* It's possible to pin down compression parameters to some specific values.
|
||||
* In which case, these values are no longer dynamically selected by the compressor */
|
||||
@ -451,11 +454,13 @@ size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx);
|
||||
</b>/* note : additional experimental parameters are also available<b>
|
||||
* within the experimental section of the API.
|
||||
* At the time of this writing, they include :
|
||||
* ZSTD_c_format
|
||||
* ZSTD_d_format
|
||||
* ZSTD_d_stableOutBuffer
|
||||
* Because they are not stable, it's necessary to define ZSTD_STATIC_LINKING_ONLY to access them.
|
||||
* note : never ever use experimentalParam? names directly
|
||||
*/
|
||||
ZSTD_d_experimentalParam1=1000
|
||||
ZSTD_d_experimentalParam1=1000,
|
||||
ZSTD_d_experimentalParam2=1001
|
||||
|
||||
} ZSTD_dParameter;
|
||||
</b></pre><BR>
|
||||
@ -1055,23 +1060,28 @@ size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict);
|
||||
size_t ZSTD_estimateCCtxSize_usingCParams(ZSTD_compressionParameters cParams);
|
||||
size_t ZSTD_estimateCCtxSize_usingCCtxParams(const ZSTD_CCtx_params* params);
|
||||
size_t ZSTD_estimateDCtxSize(void);
|
||||
</b><p> These functions make it possible to estimate memory usage of a future
|
||||
{D,C}Ctx, before its creation.
|
||||
</b><p> These functions make it possible to estimate memory usage
|
||||
of a future {D,C}Ctx, before its creation.
|
||||
|
||||
ZSTD_estimateCCtxSize() will provide a budget large enough for any
|
||||
compression level up to selected one. Unlike ZSTD_estimateCStreamSize*(),
|
||||
this estimate does not include space for a window buffer, so this estimate
|
||||
is guaranteed to be enough for single-shot compressions, but not streaming
|
||||
compressions. It will however assume the input may be arbitrarily large,
|
||||
which is the worst case. If srcSize is known to always be small,
|
||||
ZSTD_estimateCCtxSize_usingCParams() can provide a tighter estimation.
|
||||
ZSTD_estimateCCtxSize_usingCParams() can be used in tandem with
|
||||
ZSTD_getCParams() to create cParams from compressionLevel.
|
||||
ZSTD_estimateCCtxSize_usingCCtxParams() can be used in tandem with
|
||||
ZSTD_CCtxParams_setParameter().
|
||||
ZSTD_estimateCCtxSize() will provide a memory budget large enough
|
||||
for any compression level up to selected one.
|
||||
Note : Unlike ZSTD_estimateCStreamSize*(), this estimate
|
||||
does not include space for a window buffer.
|
||||
Therefore, the estimation is only guaranteed for single-shot compressions, not streaming.
|
||||
The estimate will assume the input may be arbitrarily large,
|
||||
which is the worst case.
|
||||
|
||||
Note: only single-threaded compression is supported. This function will
|
||||
return an error code if ZSTD_c_nbWorkers is >= 1.
|
||||
When srcSize can be bound by a known and rather "small" value,
|
||||
this fact can be used to provide a tighter estimation
|
||||
because the CCtx compression context will need less memory.
|
||||
This tighter estimation can be provided by more advanced functions
|
||||
ZSTD_estimateCCtxSize_usingCParams(), which can be used in tandem with ZSTD_getCParams(),
|
||||
and ZSTD_estimateCCtxSize_usingCCtxParams(), which can be used in tandem with ZSTD_CCtxParams_setParameter().
|
||||
Both can be used to estimate memory using custom compression parameters and arbitrary srcSize limits.
|
||||
|
||||
Note 2 : only single-threaded compression is supported.
|
||||
ZSTD_estimateCCtxSize_usingCCtxParams() will return an error code if ZSTD_c_nbWorkers is >= 1.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>size_t ZSTD_estimateCStreamSize(int compressionLevel);
|
||||
|
@ -1,10 +1,11 @@
|
||||
# ################################################################
|
||||
# Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
||||
# Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
|
||||
# All rights reserved.
|
||||
#
|
||||
# This source code is licensed under both the BSD-style license (found in the
|
||||
# LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
||||
# in the COPYING file in the root directory of this source tree).
|
||||
# You may select, at your option, one of the above-listed licenses.
|
||||
# ################################################################
|
||||
|
||||
# This Makefile presumes libzstd is installed, using `sudo make install`
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
||||
* Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
||||
* Copyright (c) 2016-2020 Yann Collet, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
||||
* Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
||||
* Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
||||
* Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
||||
* Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
||||
* Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
||||
* Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
||||
* Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
|
||||
* All rights reserved.
|
||||
*
|
||||
* This source code is licensed under both the BSD-style license (found in the
|
||||
|
Some files were not shown because too many files have changed in this diff Show More
Loading…
Reference in New Issue
Block a user