15 Commits

Author SHA1 Message Date
cem
7474baafb6 sort(1): Memoize MD5 computation to reduce repeated computation
Experimentally, this reduces sort -R time on a 148160-line corpus from about
3.15s to about 0.93s on this particular system.

There's probably room for improvement using some digest other than md5, but
I don't want to look at sort(1) anymore.  Some discussion of other possible
improvements in the Test Plan section of the Differential.

PR:		230792
Reviewed by:	jhb (earlier version)
Differential Revision:	https://reviews.freebsd.org/D19885
2019-04-13 04:42:17 +00:00
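
The memoization described in 7474baafb6 can be illustrated with a minimal sketch. It is not the code from the commit: `struct line_key`, `toy_digest`, `key_digest`, `random_cmp` and the `digest_calls` counter are hypothetical names, the random seed is left out, and a 64-bit FNV-1a hash stands in for MD5 so that the example needs no external library. The idea shown is only that each line's digest is computed on first use and then reused by every later comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one input line plus a lazily computed digest. */
struct line_key {
	const char *text;
	size_t len;
	uint64_t digest;	/* valid only when have_digest is true */
	bool have_digest;
};

/* Counts digest computations so the effect of caching can be observed. */
static unsigned long digest_calls;

/* Stand-in for MD5 (assumption): 64-bit FNV-1a over the line bytes. */
static uint64_t
toy_digest(const char *s, size_t len)
{
	uint64_t h = 14695981039346656037ULL;

	for (size_t i = 0; i < len; i++) {
		h ^= (unsigned char)s[i];
		h *= 1099511628211ULL;
	}
	digest_calls++;
	return (h);
}

/* Compute the digest on first use, then return the cached value. */
static uint64_t
key_digest(struct line_key *k)
{
	if (!k->have_digest) {
		k->digest = toy_digest(k->text, k->len);
		k->have_digest = true;
	}
	return (k->digest);
}

/* Order two lines by digest; repeated calls never rehash a line. */
static int
random_cmp(struct line_key *a, struct line_key *b)
{
	uint64_t da = key_digest(a);
	uint64_t db = key_digest(b);

	return ((da > db) - (da < db));
}
```

A comparison sort calls its comparator on the order of n log n times, so without the cache each line would be hashed many times over; with it, each line is hashed once.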
cem
0ebf68b84e sort(1): randomcoll: Skip the memory allocation entirely
There's no reason to order based on strcmp of ASCII digests instead of
memcmp of the raw digests.

While here, remove collision fallback.  If you collide two MD5s, they're
probably the same string anyway.  If robustness against MD5 collisions is
desired, maybe we shouldn't use MD5.

None of the behavior of sort -R is specified by POSIX, so we're free to
implement this however we like.  E.g., using a 128-bit counter and block cipher
to generate unique indices for each line of input.

PR:		230792 (2/many)
Relnotes:	This will change the sort order for a given dataset with a
		given seed.  Other similarly breaking changes are planned.
Sponsored by:	Dell EMC Isilon
2019-04-04 23:32:27 +00:00
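
As a rough illustration of the point made in 0ebf68b84e, the sketch below compares two fixed-size raw digests with memcmp and, for contrast, formats them as ASCII hex for strcmp. The names `DIGEST_LEN`, `digest_cmp` and `digest_to_hex` are assumptions made for this example and are not taken from sort(1). In this sketch the raw comparison needs no formatting step and no extra buffer at all.

```c
#include <stddef.h>
#include <string.h>

#define DIGEST_LEN 16	/* an MD5 digest is 128 bits */

/* Compare raw digests directly: no formatting, no allocation. */
static int
digest_cmp(const unsigned char a[DIGEST_LEN],
    const unsigned char b[DIGEST_LEN])
{
	return (memcmp(a, b, DIGEST_LEN));
}

/*
 * For contrast only: render a digest as lowercase hex so that it could
 * be compared with strcmp.  The caller supplies the output buffer.
 */
static void
digest_to_hex(const unsigned char d[DIGEST_LEN],
    char out[2 * DIGEST_LEN + 1])
{
	static const char hexdigits[] = "0123456789abcdef";

	for (size_t i = 0; i < DIGEST_LEN; i++) {
		out[2 * i] = hexdigits[d[i] >> 4];
		out[2 * i + 1] = hexdigits[d[i] & 0x0f];
	}
	out[2 * DIGEST_LEN] = '\0';
}
```

With lowercase hex as produced here, strcmp on the formatted strings and memcmp on the raw bytes yield the same sign, so the hex step adds work without adding information.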
cem
d8f1dd2350 sort(1): randomcoll: Don't sort on ENOMEM
PR:		230792 (1/many)
Sponsored by:	Dell EMC Isilon
2019-04-04 20:27:13 +00:00
pfg
7551d83c35 various: general adoption of SPDX licensing ID tags.
Mainly focused on files that use the BSD 2-Clause license; however, the tool I
was using misidentified many licenses, so this was mostly a manual (and
error-prone) task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well-known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

No functional change intended.
2017-11-27 15:37:16 +00:00
pfg
a69458844e sort(1): Remove unneeded initializations.
Found by:	Clang static analyzer
2017-02-17 19:53:20 +00:00
marius
367a1c5d16 - Use correct offsets into the keys set array. As the elements of this
  zero-length array are dynamically sized at run-time based on the use
  of hints, compilers can't be expected to figure out these offsets on
  their own. [1]
- Fix incorrect comparison in cmp_nans(). [2]

PR:		204571 [1], 202301 [2]
Submitted by:	David Binderman [2]
MFC after:	3 days
2016-12-28 17:13:03 +00:00
pfg
26c891f034 Cleanup unnecessary semicolons from utilities we all love.
2016-04-15 22:31:22 +00:00
pfg
4a1d849efc Revert (partial) r281123, r281125:
sort: style nits / cleanups.

Our style guide, style(9), specifies that in the absence of local variables
an empty line must be inserted.

Pointed out by:	eadler
2015-04-06 02:35:55 +00:00
pfg
cf28b0945a sort: style nits / cleanups.
Obtained from:	OpenBSD
2015-04-05 23:06:42 +00:00
pfg
ed77a0cd5f sort: Cleanup small issues with spaces.
Obtained from:	OpenBSD
2015-04-05 22:22:43 +00:00
gabor
f4ae49737b - Update Oleg Moskalenko's email address
Requested by:	Oleg Moskalenko <mom040267@gmail.com>
2013-06-02 09:43:48 +00:00
gabor
40317d46a3 - Portability changes for ARM
- Allow larger sort memory on 64-bit platforms

Submitted by:	Oleg Moskalenko <oleg.moskalenko@citrix.com>
2012-11-01 11:38:34 +00:00
gabor
671916033f - Remove the UNUSED_ARG macro and use __unused in argument lists
Reviewed by:	dim
MFC after:	3 days
2012-06-08 19:21:49 +00:00
gabor
1b903beae4 - Eliminate initializations of global variables. Compilers are not
required to optimize these away, so they may result in a larger binary.

Pointed out by:	kib
2012-05-14 10:06:49 +00:00
gabor
3c7b03ea74 Add a BSD-licensed sort rewrite that was started by me and later completed
with the major functionality and optimizations by Oleg Moskalenko.
It is compatible with the latest version of POSIX and the current GNU sort
version that we have in base.  Besides this, it implements all the
functionality introduced in later versions of GNU sort.  For now, it will
be installed as "bsdsort", keeping GNU sort as the default sort
implementation.
2012-05-11 12:37:16 +00:00