Every usage of MB_CUR_MAX results in a call to __mb_cur_max. This is
inefficient and redundant. Caching the value of MB_CUR_MAX in a global
variable removes these calls and speeds up the runtime of sort. For
numeric sorting, runtime is almost halved in some tests.
PR: 255551
PR: 255840
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30170
Experimentally, reduces sort -R time of a 148160 line corpus from about
3.15s to about 0.93s on this particular system.
There's probably room for improvement using some digest other than md5, but
I don't want to look at sort(1) anymore. Some discussion of other possible
improvements in the Test Plan section of the Differential.
PR: 230792
Reviewed by: jhb (earlier version)
Differential Revision: https://reviews.freebsd.org/D19885
Bound input file processing length to avoid the issue reported in [1]. For
simplicity, only allow regular file and character device inputs. For
character devices, only allow /dev/random (and /dev/urandom symblink).
32 bytes of random is perfectly sufficient to seed MD5; we don't need any
more. Users that want to use large files as seeds are encouraged to truncate
those files down to an appropriate input file via tools like sha256(1).
(This does not change the sort algorithm of sort -R.)
[1]: https://lists.freebsd.org/pipermail/freebsd-hackers/2018-August/053152.html
PR: 230792
Reported by: Ali Abdallah <aliovx AT gmail.com>
Relnotes: yes
Observe:
printf "a\nb\nc\n" > /tmp/foo
# Next command results in no output
cat /tmp/foo | sort -m
# Next command results in proper output
cat /tmp/foo | sort -m -
# Also works:
sort -m /tmp/foo
Some const'ification was done to simplify the actual solution of adding "-"
explicitly to the file list if we didn't have any file arguments left over.
PR: 190099
MFC after: 1 week
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
No functional change intended.
In addition to adding `static' where possible:
- bin/date: Move `retval' into extern.h to make it visible to date.c.
- bin/ed: Move globally used variables into ed.h.
- sbin/camcontrol: Move `verbose' into camcontrol.h and fix shadow warnings.
- usr.bin/calendar: Remove unneeded variables.
- usr.bin/chat: Make `line' local instead of global.
- usr.bin/elfdump: Comment out unneeded function.
- usr.bin/rlogin: Use _Noreturn instead of __dead2.
- usr.bin/tset: Pull `Ospeed' into extern.h.
- usr.sbin/mfiutil: Put global variables in mfiutil.h.
- usr.sbin/pkg: Remove unused `os_corres'.
- usr.sbin/quotaon, usr.sbin/repquota: Remove unused `qfname'.
- Change default sort method to mergesort, which has a better worst case
performance than qsort
Submitted by: Oleg Moskalenko <oleg.moskalenko@citrix.com>
- Do not use mmap() by default; it can be enabled by --mmap
- Add some minor optimizations for -u
- Update manual page according to the changes
Submitted by: Oleg Moskalenko <oleg.moskalenko@citrix.com>
with the major functionality and optimizations by Oleg Moskalenko.
It is compatible with the latest version of POSIX and the current GNU sort
version that we have in base. Beside this, it implements all the
functionality introduced in later versions of GNU sort. For now, it will
be installed as "bsdsort", keeping GNU sort as the default sort
implementation.