Commit Graph

103 Commits

Author SHA1 Message Date
Conrad Meyer
f20b149b45 sort(1): Memoize MD5 computation to reduce repeated computation
Experimentally, reduces sort -R time of a 148160 line corpus from about
3.15s to about 0.93s on this particular system.

There's probably room for improvement using some digest other than md5, but
I don't want to look at sort(1) anymore.  Some discussion of other possible
improvements in the Test Plan section of the Differential.

PR:		230792
Reviewed by:	jhb (earlier version)
Differential Revision:	https://reviews.freebsd.org/D19885
2019-04-13 04:42:17 +00:00
Conrad Meyer
7a590a370a sort(1): Simplify and bound random seeding
Bound input file processing length to avoid the issue reported in [1].  For
simplicity, only allow regular file and character device inputs.  For
character devices, only allow /dev/random (and /dev/urandom symblink).

32 bytes of random is perfectly sufficient to seed MD5; we don't need any
more.  Users that want to use large files as seeds are encouraged to truncate
those files down to an appropriate input file via tools like sha256(1).

(This does not change the sort algorithm of sort -R.)

[1]: https://lists.freebsd.org/pipermail/freebsd-hackers/2018-August/053152.html

PR:		230792
Reported by:	Ali Abdallah <aliovx AT gmail.com>
Relnotes:	yes
2019-04-11 05:08:49 +00:00
Conrad Meyer
74504eefa1 sort(1): Whitespace and style cleanup
No functional change.

Sponsored by:	Dell EMC Isilon
2019-04-11 00:39:06 +00:00
Conrad Meyer
fff4eaebbf sort(1): randomcoll: Skip the memory allocation entirely
There's no reason to order based on strcmp of ASCII digests instead of
memcmp of the raw digests.

While here, remove collision fallback.  If you collide two MD5s, they're
probably the same string anyway.  If robustness against MD5 collisions is
desired, maybe we shouldn't use MD5.

None of the behavior of sort -R is specified by POSIX, so we're free to
implement this however we like.  E.g., using a 128-bit counter and block cipher
to generate unique indices for each line of input.

PR:		230792 (2/many)
Relnotes:	This will change the sort order for a given dataset with a
		given seed.  Other similarly breaking changes are planned.
Sponsored by:	Dell EMC Isilon
2019-04-04 23:32:27 +00:00
Conrad Meyer
e667e2a480 sort(1): randomcoll: Don't sort on ENOMEM
PR:		230792 (1/many)
Sponsored by:	Dell EMC Isilon
2019-04-04 20:27:13 +00:00
Alex Richardson
101db63b42 Don't use absolute path to sed when building usr.bin/join
This is required to build sort on Linux hosts since sed is in /bin there.

Approved By:	jhb (mentor)
2018-08-23 18:18:43 +00:00
Kyle Evans
7137597e15 sort(1): Fix -m when only implicit stdin is used for input
Observe:

printf "a\nb\nc\n" > /tmp/foo
# Next command results in no output
cat /tmp/foo | sort -m
# Next command results in proper output
cat /tmp/foo | sort -m -
# Also works:
sort -m /tmp/foo

Some const'ification was done to simplify the actual solution of adding "-"
explicitly to the file list if we didn't have any file arguments left over.

PR:		190099
MFC after:	1 week
2018-06-20 03:31:19 +00:00
Kyle Evans
36180cd53d sort(1): Add bits to allow easy checking against NetBSD tests
I'm looking at sort(1) failures, for better or worse.
2018-06-20 03:10:49 +00:00
Mark Johnston
7f180c0f80 Fix the WITH_SORT_THREADS build.
PR:		201664
MFC after:	1 week
2018-02-07 20:36:37 +00:00
Pedro F. Giffuni
1de7b4b805 various: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:37:16 +00:00
Bryan Drewery
ea825d0274 DIRDEPS_BUILD: Update dependencies.
Sponsored by:	Dell EMC Isilon
2017-10-31 00:07:04 +00:00
Pedro F. Giffuni
759a9a9d24 sort(1): Remove unneeded initializations.
Found by:	Clang static analyzer
2017-02-17 19:53:20 +00:00
Pedro F. Giffuni
692cd1a3b2 sort - Don't live-loop threads.
Worker threads now use a pthread_cond_t to wait for work instead of
burning the cpu up.

Obtained from:	DragonflyBSD (07774aea0ccf64a48fcfad8899e3bf7c8f18277a)
MFC after:	2 weeks
2017-01-23 15:39:51 +00:00
Marius Strobl
ed7aec1e45 - Use correct offsets into the keys set array. As the elements of this
zero-length array are dynamically sized at run-time based on the use
  of hints, compilers can't be expected to figure out these offsets on
  their own. [1]
- Fix incorrect comparison in cmp_nans(). [2]

PR:		204571 [1], 202301 [2]
Submitted by:	David Binderman [2]
MFC after:	3 days
2016-12-28 17:13:03 +00:00
Xin LI
3611de44ef pages and psize are always assigned, so there is no need to initialize
them as zero.

MFC after:	2 weeks
2016-11-28 06:38:41 +00:00
Xin LI
c514c3ed4f Eliminate variables that are computed, assigned but never
used.

MFC after:	2 weeks
2016-11-28 06:36:10 +00:00
Xin LI
665d2db378 Fix an obvious typo.
MFC after:	2 weeks
2016-11-28 06:32:05 +00:00
Gabor Kovesdan
a6be469014 - Fix typo
PR:		211245
Submitted by:	Christoph Schonweiler <public2016@hauptsignal.at>
MFC after:	5 days
2016-09-08 14:50:23 +00:00
Pedro F. Giffuni
80c7cc1c8f Cleanup unnecessary semicolons from utilities we all love. 2016-04-15 22:31:22 +00:00
Baptiste Daroussin
902b9f79f7 Fix some mdoc(7) issues
Obtained from:	DragonflyBSD
2015-10-24 13:43:10 +00:00
Gabor Kovesdan
a7bc18929d -C and -c allow at most one input file. Ensure this is the case when the
input files are specified through --files0-from.

Submitted by:	tim@OpenBSD
Obtained from:	OpenBSD
MFC after:	1 week
2015-10-22 10:57:15 +00:00
Simon J. Gerraty
2ef6d5a7b9 new depends 2015-06-16 23:37:19 +00:00
Simon J. Gerraty
ccfb965433 Add META_MODE support.
Off by default, build behaves normally.
WITH_META_MODE we get auto objdir creation, the ability to
start build from anywhere in the tree.

Still need to add real targets under targets/ to build packages.

Differential Revision:       D2796
Reviewed by: brooks imp
2015-06-13 19:20:56 +00:00
Simon J. Gerraty
44d314f704 dirdeps.mk now sets DEP_RELDIR 2015-06-08 23:35:17 +00:00
Simon J. Gerraty
98e0ffaefb Merge sync of head 2015-05-27 01:19:58 +00:00
Pedro F. Giffuni
0f4b9a9057 Remove custom getdelim(3) and fix a small memory leak.
Originally from Andre Smagin.

Obtained from:	OpenBSD
MFC after:	1 week
2015-04-07 01:17:49 +00:00
Pedro F. Giffuni
b904ab99bc sort(1): Cleanups and a small memory leak.
Remove useless check for leading blanks in the month name.  The
code didn't adjust len after stripping blanks so even if a month
*did* start with a blank we'd end up copying garbage at the end.
Also convert a malloc + memcpy to strdup and fix a memory leak in
the wide char version if mbstowcs() fails.
Originally from Andre Smagin.

Obtained from:  OpenBSD (CVS rev. 1.2, 1.3)
MFC after:	1 week
2015-04-07 01:17:29 +00:00
Pedro F. Giffuni
45e151e97d sort: style knits / cleanups.
Minor cleanups that got accidentally reverted.

Obtained from:	OpenBSD
2015-04-06 03:02:20 +00:00
Pedro F. Giffuni
e5f71a07e4 Revert (partial) r281123, r281125:
sort: style knits / cleanups.

Our style guide(9) specifies that in absence of local variables
an empty line must be inserted.

Pointed out by:	eadler
2015-04-06 02:35:55 +00:00
Pedro F. Giffuni
db8026c7bb sort: style knits / cleanups.
Obtained from:	OpenBSD
2015-04-05 23:06:42 +00:00
Pedro F. Giffuni
bd0f80c667 sort: Fix a comment.
Obtained from:	OpenBSD
2015-04-05 22:34:03 +00:00
Pedro F. Giffuni
f79477ebd5 sort: Cleanup small issues with spaces.
Obtained from:	OpenBSD
2015-04-05 22:22:43 +00:00
Joel Dahl
385385fb1c mdoc: use An macro. 2015-01-04 12:42:08 +00:00
Baptiste Daroussin
3e11bd9e2a Convert to usr.bin/ to LIBADD
Reduce overlinking
2014-11-25 14:29:10 +00:00
Simon J. Gerraty
9268022b74 Merge from head@274682 2014-11-19 01:07:58 +00:00
Sean Bruno
e4684c7895 Change LDFLAGS to LDADD in order to allow static builds. This is more
proper way to ensure that the command line compile works the way we intend.

Add explicity DPADD statemens on LIBMD and LIBPTHREAD depending on which
options are used in the build.

Reviewed by:	andrew
MFC after:	2 weeks
2014-11-15 18:03:38 +00:00
Baptiste Daroussin
3e16491d77 Make sure to not skip any argument when converting from deprecated
+POS1, -POS2 to -kPOS1,POS2, so that sort +0n gets translated to sort -k1,1n
as it is expected

PR:		193994
Submitted by:	rodrigo
MFC after:	3 days
2014-10-02 06:29:49 +00:00
Simon J. Gerraty
ee7b0571c2 Merge head from 7/28 2014-08-19 06:50:54 +00:00
Glen Barber
96c566eaf9 Remove trailing '.' from See Also section.
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2014-07-30 04:40:50 +00:00
Simon J. Gerraty
fae50821ae Updated dependencies 2014-05-16 14:09:51 +00:00
Simon J. Gerraty
76b28ad6ab Updated dependencies 2014-05-10 05:16:28 +00:00
Simon J. Gerraty
cc3f4b9965 Merge from head 2014-05-08 23:54:15 +00:00
Warner Losh
c6063d0da8 Use src.opts.mk in preference to bsd.own.mk except where we need stuff
from the latter.
2014-05-06 04:22:01 +00:00
Simon J. Gerraty
3b8f084595 Merge head 2014-04-28 07:50:45 +00:00
Bryan Drewery
459d6434a8 Fix spelling error.
MFC after:	3 days
2014-04-25 15:27:19 +00:00
Pedro F. Giffuni
b1a409863f Various style(9) fixes and typos in grep, sort and patch.
MFC after:	3 days
2014-04-21 22:52:18 +00:00
Warner Losh
cf57243a46 Convert sort to using newer MK_ convention. 2014-04-05 18:01:49 +00:00
Dimitry Andric
77f98fbb7d In usr.bin/sort/radixsort.c, pop_ls_mt() is only referenced if
SORT_THREADS is defined, so make the whole function conditional, instead
of just the pthread calls in it.

MFC after:	3 days
2013-12-22 20:46:31 +00:00
Simon J. Gerraty
d1d0158641 Merge from head 2013-09-05 20:18:59 +00:00
Eitan Adler
702a1889cc Fix header guards.
This was ready about the same time as r251862 so just make one final cleanup

Submitted by:	dt71@gmx.com
2013-06-17 20:15:39 +00:00