Commit Graph

15 Commits

Author SHA1 Message Date
Conrad Meyer
f20b149b45 sort(1): Memoize MD5 computation to reduce repeated computation
Experimentally, reduces sort -R time of a 148160 line corpus from about
3.15s to about 0.93s on this particular system.

There's probably room for improvement using some digest other than md5, but
I don't want to look at sort(1) anymore.  Some discussion of other possible
improvements in the Test Plan section of the Differential.

PR:		230792
Reviewed by:	jhb (earlier version)
Differential Revision:	https://reviews.freebsd.org/D19885
2019-04-13 04:42:17 +00:00
Conrad Meyer
fff4eaebbf sort(1): randomcoll: Skip the memory allocation entirely
There's no reason to order based on strcmp of ASCII digests instead of
memcmp of the raw digests.

While here, remove collision fallback.  If you collide two MD5s, they're
probably the same string anyway.  If robustness against MD5 collisions is
desired, maybe we shouldn't use MD5.

None of the behavior of sort -R is specified by POSIX, so we're free to
implement this however we like.  E.g., using a 128-bit counter and block cipher
to generate unique indices for each line of input.

PR:		230792 (2/many)
Relnotes:	This will change the sort order for a given dataset with a
		given seed.  Other similarly breaking changes are planned.
Sponsored by:	Dell EMC Isilon
2019-04-04 23:32:27 +00:00
Conrad Meyer
e667e2a480 sort(1): randomcoll: Don't sort on ENOMEM
PR:		230792 (1/many)
Sponsored by:	Dell EMC Isilon
2019-04-04 20:27:13 +00:00
Pedro F. Giffuni
1de7b4b805 various: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:37:16 +00:00
Pedro F. Giffuni
759a9a9d24 sort(1): Remove unneeded initializations.
Found by:	Clang static analyzer
2017-02-17 19:53:20 +00:00
Marius Strobl
ed7aec1e45 - Use correct offsets into the keys set array. As the elements of this
zero-length array are dynamically sized at run-time based on the use
  of hints, compilers can't be expected to figure out these offsets on
  their own. [1]
- Fix incorrect comparison in cmp_nans(). [2]

PR:		204571 [1], 202301 [2]
Submitted by:	David Binderman [2]
MFC after:	3 days
2016-12-28 17:13:03 +00:00
Pedro F. Giffuni
80c7cc1c8f Cleanup unnecessary semicolons from utilities we all love. 2016-04-15 22:31:22 +00:00
Pedro F. Giffuni
e5f71a07e4 Revert (partial) r281123, r281125:
sort: style knits / cleanups.

Our style guide(9) specifies that in absence of local variables
an empty line must be inserted.

Pointed out by:	eadler
2015-04-06 02:35:55 +00:00
Pedro F. Giffuni
db8026c7bb sort: style knits / cleanups.
Obtained from:	OpenBSD
2015-04-05 23:06:42 +00:00
Pedro F. Giffuni
f79477ebd5 sort: Cleanup small issues with spaces.
Obtained from:	OpenBSD
2015-04-05 22:22:43 +00:00
Gabor Kovesdan
c859c6dd54 - Update Oleg Moskalenko's email address
Requested by:	Oleg Moskalenko <mom040267@gmail.com>
2013-06-02 09:43:48 +00:00
Gabor Kovesdan
e8da8c744b - Portability changes for ARM
- Allow larger sort memory on 64-bit platforms

Submitted by:	Oleg Moskalenko <oleg.moskalenko@citrix.com>
2012-11-01 11:38:34 +00:00
Gabor Kovesdan
89b7a5879e - Remove the UNUSED_ARG macro and use __unused in argument lists
Reviewed by:	dim
MFC after:	3 days
2012-06-08 19:21:49 +00:00
Gabor Kovesdan
ce1e997f54 - Eliminate initializations if global variables. Compilers are not
required to optimize these so it may result in larger binary size.

Pointed out by:	kib
2012-05-14 10:06:49 +00:00
Gabor Kovesdan
c66bbc9143 Add a BSD-licensed sort rewrite that was started by me and later completed
with the major functionality and optimizations by Oleg Moskalenko.
It is compatible with the latest version of POSIX and the current GNU sort
version that we have in base.  Beside this, it implements all the
functionality introduced in later versions of GNU sort.  For now, it will
be installed as "bsdsort", keeping GNU sort as the default sort
implementation.
2012-05-11 12:37:16 +00:00