freebsd-nq/lib
Gvozden Neskovic fc897b24b2 Rework of fletcher_4 module
- The benchmark memory block is increased to 128 KiB to reflect real block sizes more
accurately. Measurements include all three stages needed for checksum generation,
i.e. `init()/compute()/fini()`. The inner loop is repeated multiple times to offset
the overhead of the timing function.

- The benchmark now selects the fastest implementation independently for the native
and byteswap methods. To support this, new function pointers
`init_byteswap()/fini_byteswap()` are introduced.

- The implementation mutex lock is replaced by an atomic variable.

- To save time, the benchmark is not executed in userspace. Instead, the highest
supported implementation is used for fastest. The default userspace selector is still 'cycle'.

- `fletcher_4_native/byteswap()` methods use incremental methods to finish the
calculation if the data size is not a multiple of the vector stride (currently 64B).

- Added `fletcher_4_native_varsize()`, a special-purpose method for use when the buffer
size is not known in advance. The method does not enforce 4B alignment on the buffer
size, and will ignore the last (size % 4) bytes of the data buffer.

- The benchmark `kstat` is changed to match that of vdev_raidz. It now shows
throughput for all supported implementations (in B/s), native and byteswap,
as well as which implementation [fastest] is running.
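
The incremental and variable-size behaviour described in the list above can be sketched in scalar C. This is a minimal illustration, not the code from this commit: the type `fletcher4_sketch_t` and the functions `fletcher4_update_sketch()` and `fletcher4_varsize_sketch()` are hypothetical names, and the sketch assumes the usual Fletcher-4 recurrence over four 64-bit sums of host-order 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Four 64-bit running sums (hypothetical type, for illustration only). */
typedef struct {
	uint64_t a, b, c, d;
} fletcher4_sketch_t;

/*
 * Incremental step: fold `size` bytes into an existing state.
 * `size` must be a multiple of 4; words are read in host byte order.
 * Because the state is carried in `s`, a caller can process a
 * stride-aligned head with a vector routine and then finish the
 * remaining tail by calling a scalar step like this one.
 */
void
fletcher4_update_sketch(fletcher4_sketch_t *s, const void *buf, size_t size)
{
	const unsigned char *p = buf;

	assert(s != NULL);
	assert(size % 4 == 0);

	for (size_t off = 0; off < size; off += 4) {
		uint32_t w;

		memcpy(&w, p + off, sizeof (w));	/* alignment-safe load */
		s->a += w;
		s->b += s->a;
		s->c += s->b;
		s->d += s->c;
	}
}

/*
 * Variable-size variant: no 4B requirement on `size`; the trailing
 * (size % 4) bytes are ignored, as the commit message describes.
 */
fletcher4_sketch_t
fletcher4_varsize_sketch(const void *buf, size_t size)
{
	fletcher4_sketch_t s = { 0, 0, 0, 0 };

	fletcher4_update_sketch(&s, buf, size - (size % 4));
	return (s);
}
```

Under these assumptions, checksumming a buffer in several `fletcher4_update_sketch()` calls gives the same sums as a single pass, and passing a length of 10 to `fletcher4_varsize_sketch()` gives the same result as passing 8.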

Example of `fletcher_4_bench` running on `Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz`:
implementation   native         byteswap
scalar           4768120823     3426105750
sse2             7947841777     4318964249
ssse3            7951922722     6112191941
avx2             13269714358    11043200912
fastest          avx2           avx2

Example of `fletcher_4_bench` running on `Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz`:
implementation   native         byteswap
scalar           1291115967     1031555336
sse2             2539571138     1280970926
ssse3            2537778746     1080016762
avx2             4950749767     1078493449
avx512f          9581379998     4010029046
fastest          avx512f        avx512f
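
As a rough illustration of independent selection, the sketch below picks the highest-throughput row separately for the native and byteswap columns, using the Xeon Phi figures quoted above. The names `bench_row_t`, `xeon_phi_rows` and `pick_fastest_sketch()` are hypothetical and are not taken from the module. The actual selection logic in the commit may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One benchmark row: implementation name and throughput in B/s. */
typedef struct {
	const char *name;
	uint64_t native_bps;
	uint64_t byteswap_bps;
} bench_row_t;

/* Figures copied from the Xeon Phi 7210 example above. */
const bench_row_t xeon_phi_rows[] = {
	{ "scalar",  1291115967ULL, 1031555336ULL },
	{ "sse2",    2539571138ULL, 1280970926ULL },
	{ "ssse3",   2537778746ULL, 1080016762ULL },
	{ "avx2",    4950749767ULL, 1078493449ULL },
	{ "avx512f", 9581379998ULL, 4010029046ULL },
};

/*
 * Choose the row with the highest throughput for each column
 * independently. The two winners need not be the same row.
 */
void
pick_fastest_sketch(const bench_row_t *rows, size_t n,
    size_t *native_idx, size_t *byteswap_idx)
{
	assert(rows != NULL && n > 0);
	assert(native_idx != NULL && byteswap_idx != NULL);

	*native_idx = 0;
	*byteswap_idx = 0;
	for (size_t i = 1; i < n; i++) {
		if (rows[i].native_bps > rows[*native_idx].native_bps)
			*native_idx = i;
		if (rows[i].byteswap_bps > rows[*byteswap_idx].byteswap_bps)
			*byteswap_idx = i;
	}
}
```

With the rows above, both indices point at the `avx512f` entry, which matches the `fastest` line of the quoted output. Tracking the two columns separately matters on machines where one implementation leads for native data and another leads for byteswapped data.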

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4952
2016-08-16 14:11:55 -07:00
libavl       Support parallel build trees (VPATH builds)                              2015-07-17 13:42:51 -07:00
libefi       Fixes for issues found with cppcheck tool                                2016-07-27 13:31:22 -07:00
libicp       Illumos Crypto Port module added to enable native encryption in zfs      2016-07-20 10:43:30 -07:00
libnvpair    Add support for libtirpc                                                 2016-04-28 09:27:40 -07:00
libshare     Support parallel build trees (VPATH builds)                              2015-07-17 13:42:51 -07:00
libspl       OpenZFS 5997 - FRU field not set during pool creation and never updated  2016-08-12 13:06:48 -07:00
libunicode   Support parallel build trees (VPATH builds)                              2015-07-17 13:42:51 -07:00
libuutil     Fixes for issues found with cppcheck tool                                2016-07-27 13:31:22 -07:00
libzfs       Rework of fletcher_4 module                                              2016-08-16 14:11:55 -07:00
libzfs_core  Fix indefinite article                                                   2016-08-11 11:23:49 -07:00
libzpool     Fletcher4 implementation using avx512f instruction set                   2016-08-16 14:11:14 -07:00
Makefile.am  Illumos Crypto Port module added to enable native encryption in zfs      2016-07-20 10:43:30 -07:00