freebsd-dev

History

Conrad Meyer 6be2ff7d3e calculate_crc32c: Add SSE4.2 implementation on x86 Derived from an implementation by Mark Adler. The fast loop performs three simultaneous CRCs over subsets of the data before composing them. This takes advantage of certain properties of the CRC32 implementation in Intel hardware. (The CRC instruction takes 1 cycle but has 2-3 cycles of latency.) The CRC32 instruction does not manipulate FPU state. i386 does not have the crc32q instruction, so avoid it there. Otherwise the implementation is identical to amd64. Add basic userland tests to verify correctness on a variety of inputs. PR: 216467 Reported by: Ben RUBSON <ben.rubson at gmail.com> Reviewed by: kib@, markj@ (earlier version) Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D9342	2017-01-31 03:26:32 +00:00
..
crc32_sse42.c

Conrad Meyer 6be2ff7d3e calculate_crc32c: Add SSE4.2 implementation on x86

Derived from an implementation by Mark Adler.

The fast loop performs three simultaneous CRCs over subsets of the data
before composing them.  This takes advantage of certain properties of
the CRC32 implementation in Intel hardware.  (The CRC instruction takes 1
cycle but has 2-3 cycles of latency.)

The CRC32 instruction does not manipulate FPU state.

i386 does not have the crc32q instruction, so avoid it there.  Otherwise
the implementation is identical to amd64.

Add basic userland tests to verify correctness on a variety of inputs.

PR:		216467
Reported by:	Ben RUBSON <ben.rubson at gmail.com>
Reviewed by:	kib@, markj@ (earlier version)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D9342

2017-01-31 03:26:32 +00:00

crc32_sse42.c