Mateusz Guzik f0ddecd745 amd64: revamp memcmp
Borrow the trick from memset and memmove and use the scale/index/base addressing
to avoid branches.

If a mismatch is found, the routine has to calculate the difference. Make sure
there is always up to 8 bytes to inspect. This replaces the previous loop which
would operate over up to 16 bytes with an unrolled list of 8 tests.

Speed varies a lot, but this is a net win over the previous routine with probably
a lot more to gain.

Validated with glibc test suite.
2020-01-28 17:48:17 +00:00
..
2019-03-29 20:21:28 +00:00
2020-01-28 17:48:17 +00:00
2019-11-28 02:32:17 +00:00
2020-01-21 17:28:36 +00:00