Commit Graph

106 Commits

Author SHA1 Message Date
sevan
47dd39213b Adjust history, info source from v1's manuals
https://www.bell-labs.com/usr/dmr/www/1stEdman.html

MFC after:	5 days
2019-09-04 13:44:46 +00:00
cem
7474baafb6 sort(1): Memoize MD5 computation to reduce repeated computation
Experimentally, reduces sort -R time of a 148160 line corpus from about
3.15s to about 0.93s on this particular system.

There's probably room for improvement using some digest other than md5, but
I don't want to look at sort(1) anymore.  Some discussion of other possible
improvements in the Test Plan section of the Differential.

PR:		230792
Reviewed by:	jhb (earlier version)
Differential Revision:	https://reviews.freebsd.org/D19885
2019-04-13 04:42:17 +00:00
cem
04b7883bc0 sort(1): Simplify and bound random seeding
Bound input file processing length to avoid the issue reported in [1].  For
simplicity, only allow regular file and character device inputs.  For
character devices, only allow /dev/random (and /dev/urandom symblink).

32 bytes of random is perfectly sufficient to seed MD5; we don't need any
more.  Users that want to use large files as seeds are encouraged to truncate
those files down to an appropriate input file via tools like sha256(1).

(This does not change the sort algorithm of sort -R.)

[1]: https://lists.freebsd.org/pipermail/freebsd-hackers/2018-August/053152.html

PR:		230792
Reported by:	Ali Abdallah <aliovx AT gmail.com>
Relnotes:	yes
2019-04-11 05:08:49 +00:00
cem
b3cf0de890 sort(1): Whitespace and style cleanup
No functional change.

Sponsored by:	Dell EMC Isilon
2019-04-11 00:39:06 +00:00
cem
0ebf68b84e sort(1): randomcoll: Skip the memory allocation entirely
There's no reason to order based on strcmp of ASCII digests instead of
memcmp of the raw digests.

While here, remove collision fallback.  If you collide two MD5s, they're
probably the same string anyway.  If robustness against MD5 collisions is
desired, maybe we shouldn't use MD5.

None of the behavior of sort -R is specified by POSIX, so we're free to
implement this however we like.  E.g., using a 128-bit counter and block cipher
to generate unique indices for each line of input.

PR:		230792 (2/many)
Relnotes:	This will change the sort order for a given dataset with a
		given seed.  Other similarly breaking changes are planned.
Sponsored by:	Dell EMC Isilon
2019-04-04 23:32:27 +00:00
cem
d8f1dd2350 sort(1): randomcoll: Don't sort on ENOMEM
PR:		230792 (1/many)
Sponsored by:	Dell EMC Isilon
2019-04-04 20:27:13 +00:00
arichardson
21f8254f2b Don't use absolute path to sed when building usr.bin/join
This is required to build sort on Linux hosts since sed is in /bin there.

Approved By:	jhb (mentor)
2018-08-23 18:18:43 +00:00
kevans
4a742cbb82 sort(1): Fix -m when only implicit stdin is used for input
Observe:

printf "a\nb\nc\n" > /tmp/foo
# Next command results in no output
cat /tmp/foo | sort -m
# Next command results in proper output
cat /tmp/foo | sort -m -
# Also works:
sort -m /tmp/foo

Some const'ification was done to simplify the actual solution of adding "-"
explicitly to the file list if we didn't have any file arguments left over.

PR:		190099
MFC after:	1 week
2018-06-20 03:31:19 +00:00
kevans
ebe66850c9 sort(1): Add bits to allow easy checking against NetBSD tests
I'm looking at sort(1) failures, for better or worse.
2018-06-20 03:10:49 +00:00
markj
fe0ada269c Fix the WITH_SORT_THREADS build.
PR:		201664
MFC after:	1 week
2018-02-07 20:36:37 +00:00
pfg
7551d83c35 various: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:37:16 +00:00
bdrewery
a598c4b809 DIRDEPS_BUILD: Update dependencies.
Sponsored by:	Dell EMC Isilon
2017-10-31 00:07:04 +00:00
pfg
a69458844e sort(1): Remove unneeded initializations.
Found by:	Clang static analyzer
2017-02-17 19:53:20 +00:00
pfg
44311318dc sort - Don't live-loop threads.
Worker threads now use a pthread_cond_t to wait for work instead of
burning the cpu up.

Obtained from:	DragonflyBSD (07774aea0ccf64a48fcfad8899e3bf7c8f18277a)
MFC after:	2 weeks
2017-01-23 15:39:51 +00:00
marius
367a1c5d16 - Use correct offsets into the keys set array. As the elements of this
zero-length array are dynamically sized at run-time based on the use
  of hints, compilers can't be expected to figure out these offsets on
  their own. [1]
- Fix incorrect comparison in cmp_nans(). [2]

PR:		204571 [1], 202301 [2]
Submitted by:	David Binderman [2]
MFC after:	3 days
2016-12-28 17:13:03 +00:00
delphij
fe3a3910b9 pages and psize are always assigned, so there is no need to initialize
them as zero.

MFC after:	2 weeks
2016-11-28 06:38:41 +00:00
delphij
6269645ad5 Eliminate variables that are computed, assigned but never
used.

MFC after:	2 weeks
2016-11-28 06:36:10 +00:00
delphij
8c1593bea2 Fix an obvious typo.
MFC after:	2 weeks
2016-11-28 06:32:05 +00:00
gabor
6fc6025343 - Fix typo
PR:		211245
Submitted by:	Christoph Schonweiler <public2016@hauptsignal.at>
MFC after:	5 days
2016-09-08 14:50:23 +00:00
pfg
26c891f034 Cleanup unnecessary semicolons from utilities we all love. 2016-04-15 22:31:22 +00:00
bapt
c18da30003 Fix some mdoc(7) issues
Obtained from:	DragonflyBSD
2015-10-24 13:43:10 +00:00
gabor
ef2ee73fde -C and -c allow at most one input file. Ensure this is the case when the
input files are specified through --files0-from.

Submitted by:	tim@OpenBSD
Obtained from:	OpenBSD
MFC after:	1 week
2015-10-22 10:57:15 +00:00
sjg
852129abd1 new depends 2015-06-16 23:37:19 +00:00
sjg
008d7c831f Add META_MODE support.
Off by default, build behaves normally.
WITH_META_MODE we get auto objdir creation, the ability to
start build from anywhere in the tree.

Still need to add real targets under targets/ to build packages.

Differential Revision:       D2796
Reviewed by: brooks imp
2015-06-13 19:20:56 +00:00
sjg
75a137820d dirdeps.mk now sets DEP_RELDIR 2015-06-08 23:35:17 +00:00
sjg
65145fa4c8 Merge sync of head 2015-05-27 01:19:58 +00:00
pfg
11828bfdd2 Remove custom getdelim(3) and fix a small memory leak.
Originally from Andre Smagin.

Obtained from:	OpenBSD
MFC after:	1 week
2015-04-07 01:17:49 +00:00
pfg
afe0cfa5d0 sort(1): Cleanups and a small memory leak.
Remove useless check for leading blanks in the month name.  The
code didn't adjust len after stripping blanks so even if a month
*did* start with a blank we'd end up copying garbage at the end.
Also convert a malloc + memcpy to strdup and fix a memory leak in
the wide char version if mbstowcs() fails.
Originally from Andre Smagin.

Obtained from:  OpenBSD (CVS rev. 1.2, 1.3)
MFC after:	1 week
2015-04-07 01:17:29 +00:00
pfg
23c3b77477 sort: style knits / cleanups.
Minor cleanups that got accidentally reverted.

Obtained from:	OpenBSD
2015-04-06 03:02:20 +00:00
pfg
4a1d849efc Revert (partial) r281123, r281125:
sort: style knits / cleanups.

Our style guide(9) specifies that in absence of local variables
an empty line must be inserted.

Pointed out by:	eadler
2015-04-06 02:35:55 +00:00
pfg
cf28b0945a sort: style knits / cleanups.
Obtained from:	OpenBSD
2015-04-05 23:06:42 +00:00
pfg
b99203e9f2 sort: Fix a comment.
Obtained from:	OpenBSD
2015-04-05 22:34:03 +00:00
pfg
ed77a0cd5f sort: Cleanup small issues with spaces.
Obtained from:	OpenBSD
2015-04-05 22:22:43 +00:00
joel
31f04f6cc4 mdoc: use An macro. 2015-01-04 12:42:08 +00:00
bapt
8d6c7a49a6 Convert to usr.bin/ to LIBADD
Reduce overlinking
2014-11-25 14:29:10 +00:00
sjg
b137080f19 Merge from head@274682 2014-11-19 01:07:58 +00:00
sbruno
85036d1f0b Change LDFLAGS to LDADD in order to allow static builds. This is more
proper way to ensure that the command line compile works the way we intend.

Add explicity DPADD statemens on LIBMD and LIBPTHREAD depending on which
options are used in the build.

Reviewed by:	andrew
MFC after:	2 weeks
2014-11-15 18:03:38 +00:00
bapt
34ba09d82b Make sure to not skip any argument when converting from deprecated
+POS1, -POS2 to -kPOS1,POS2, so that sort +0n gets translated to sort -k1,1n
as it is expected

PR:		193994
Submitted by:	rodrigo
MFC after:	3 days
2014-10-02 06:29:49 +00:00
sjg
d7cd1d425c Merge head from 7/28 2014-08-19 06:50:54 +00:00
gjb
02e29b8a29 Remove trailing '.' from See Also section.
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2014-07-30 04:40:50 +00:00
sjg
5860f0d106 Updated dependencies 2014-05-16 14:09:51 +00:00
sjg
1a7e48acf1 Updated dependencies 2014-05-10 05:16:28 +00:00
sjg
ed3fc70bf5 Merge from head 2014-05-08 23:54:15 +00:00
imp
2118f42afd Use src.opts.mk in preference to bsd.own.mk except where we need stuff
from the latter.
2014-05-06 04:22:01 +00:00
sjg
5e568154a0 Merge head 2014-04-28 07:50:45 +00:00
bdrewery
e199bfd4cc Fix spelling error.
MFC after:	3 days
2014-04-25 15:27:19 +00:00
pfg
2a89efd87b Various style(9) fixes and typos in grep, sort and patch.
MFC after:	3 days
2014-04-21 22:52:18 +00:00
imp
0160f80f72 Convert sort to using newer MK_ convention. 2014-04-05 18:01:49 +00:00
dim
35d70075c9 In usr.bin/sort/radixsort.c, pop_ls_mt() is only referenced if
SORT_THREADS is defined, so make the whole function conditional, instead
of just the pthread calls in it.

MFC after:	3 days
2013-12-22 20:46:31 +00:00
sjg
62bb106222 Merge from head 2013-09-05 20:18:59 +00:00