Commit Graph

251789 Commits

Author SHA1 Message Date
mjg
fe4195ffb4 malloc: try to use builtins for zeroing at the callsite
Plenty of allocation sites pass M_ZERO and sizes which are small and known
at compilation time. Handling them internally in malloc loses this information
and results in avoidable calls to memset.

Instead, let the compiler take the advantage of it whenever possible.

Discussed with:	jeff
2018-06-02 22:20:09 +00:00
eadler
7f99483fb2 top(1): Fix two speeling errors I introduced 2018-06-02 22:12:57 +00:00
eadler
4bce4f59c4 top(1): chdir to / as init; remove unneeded comment
- chdir to / to allow unmounting of wd
- remove warning about running top(1) as setuid. If this is a concern we
should just drop privs instead.
2018-06-02 22:06:27 +00:00
eadler
0296b74c55 top(1): cleanup memory allocation and warnings
- Prefer calloc over malloc. This is more predicable and we're not in a
performance sensitive context. [1]
- Remove bogus comment (obsolete from prior commit). [2]
- Remove void casts and type casts of NULL
- Remove redundant declaration of 'quit'
- Add additional const

Reported by:	kib [1], vangyzen [2]
2018-06-02 21:40:45 +00:00
jhibbits
45450d16d7 Included VSX registers in powerpc core dumps
Summary: Included VSX registers in powerpc core dumps (both kernel and gcore)

Submitted by:	Luis Pires
Differential Revision: https://reviews.freebsd.org/D15512
2018-06-02 20:28:58 +00:00
mjg
9b8d5f13a3 amd64: add a mild depessimization to rep mov/stos users
Currently all the primitives are waiting for a rewrite, tidy them up in the
meantime.

Vast majority of cases pass sizes which are multiple of 8. Which means the
following rep stosb/movb has nothing to do. Turns out testing first if there
is anything to do is a big win across the board (cpus with and without ERMS,
Intel and AMD) while not pessimizing the case where there is work to do.

Sample results for zeroing 64 bytes (ops/second):
Ryzen Threadripper 1950X		91433212 -> 147265741
Intel(R) Xeon(R) CPU X5675 @ 3.07GHz	90714044 -> 121992888

bzero and bcopy are on their way out and were not modified. Nothing in the
tree uses them.
2018-06-02 20:14:43 +00:00
jhibbits
729cc88801 Added ptrace support for reading/writing powerpc VSX registers
Summary:
Added ptrace support for getting/setting the remaining part of the VSX registers
(the part that's not already covered by FPR or VR registers).

This is necessary to add support for VSX registers in debuggers.

Submitted by:	Luis Pires
Differential Revision: https://reviews.freebsd.org/D15458
2018-06-02 19:17:11 +00:00
mjg
49d287e33d Use __builtin for various mem* and b* (e.g. bzero) routines.
Some of the routines were using artificially limited builtin already,
drop the explicit limit.

The use of builtins allows quite often allows the compiler to elide the call
or most zeroing to begin with. For instance, if the target object is 32 bytes
in size and gets zeroed + has 16 bytes initialized, the compiler can just
add code to zero out the rest.

Note not all the primites have asm variants and some of the existing ones
are not optimized. Maintaines are strongly encourage to take a look
(regardless of this change).
2018-06-02 18:03:35 +00:00
mjg
46b1558c0c libkern: tidy up memset
1. Remove special-casing of 0 as it just results in an extra function call.
This is clearly pessimal.
2. Drop the inline stuff. For the most part it is much better served with
__builtin_memset (coming later).
3. Move the declaration to systm.h to match other funcs.

Archs are encouraged to implement the variant for their own platform so that
this implementation can be dropped.
2018-06-02 17:57:09 +00:00
tuexen
6189cf7c65 Don't overflow a buffer if we receive an INIT or INIT-ACK chunk
without a RANDOM parameter but with a CHUNKS or HMAC-ALGO parameter.
Please note that sending this combination violates the specification.

Thnanks to Ronald E. Crane for reporting the issue for the userland
stack.

MFC after:	3 days
2018-06-02 16:28:10 +00:00
novel
f2fe0a1957 top: add -p option and p command to only show a single process
Allow to show only a single process specified by PID. This could
be done either by running top like 'top -p PID' or using the 'p' command
inside top.

Reviewed by:	eadler
Approved by:	eadler
Obtained from:	OpenBSD
Differential Revision:	https://reviews.freebsd.org/D15501
2018-06-02 15:52:18 +00:00
bde
107463a1fd Improve defaults for per-CPU kernel console colors, especially with 2
or 4 CPUs.  Add a compile-time option SC_KERNEL_CONS_ATTRS to control the
defaults.

Default to color numbers in reverse order to CPU numbers (instead of
in the same order with white first and wrapping to dark grey), so that
the brightest bright colors are used first.  Don't use dark grey at all;
replace it by dark green.

Syscons has too many compile-time options, but this one is needed in
in case the defaults give something like white on white, or the user
really hates this feature and can't wait to turn it off in rc.

MFC after:	next release?
2018-06-02 14:07:27 +00:00
bde
070d6e2563 Use per-CPU attributes earlier.
The per-CPU ts is not initialized early, so the global kernel ts is used
early, but it ony has 1 (normal) attribute.  Switch this to the per-CPU
attribute.

The difference is most visible with EARLY_AP_STARTUP.

Change to using the curcpu macro instead of PCPU_GET(cpuid) in 2 places for
the above and in 1 other place in my old code in syscons.  The function-like
spelling is perhaps better for indicating that curcpu is volatile (unlike
curthread), but for CPU attributes volatility is a feature.
2018-06-02 10:36:30 +00:00
bde
48738e7c73 Oops, the last minute reduction in the clobber list for i386
MCOUNT_OVERHEAD() in r334522 was too agressive.  Only mcount exit
preserves %eax and %edx.
2018-06-02 09:59:27 +00:00
eadler
aa8c5e3456 Use stpcpy instead of home grown solution 2018-06-02 08:46:09 +00:00
bde
14f65346d3 Fix low-level locking during panics.
The SCHEDULER_STOPPED() hack breaks locking generally, and
mtx_trylock_*() especially.  When mtx_trylock_*() returns nonzero,
naive code version here trusts it to have worked.  But when
SCHEDULER_STOPPED() is true, mtx_trylock_*() returns 1 without doing
anything.  Then mtx_unlock_*() crashes especially badly attempting to
unlock iff the error is detected, since mutex unlocking functions don't
check SCHEDULER_STOPPED().

syscons already didn't trust mtx_trylock_spin(), but it was missing the
logic to turn on sp->kdb_locked when turning off sp->mtx_locked during
panics.  It also used panicstr instead of SCHEDULER_LOCKED because I
thought that panicstr was more fragile.  They only differ for a window
of lines in panic(), and in broken cases where stop_cpus_hard() in panic()
didn't work.
2018-06-02 08:38:59 +00:00
eadler
a8efb181f4 top(1): remove wrapper around putchar
This appears to have been written for portability which we no longer
need.
2018-06-02 07:44:53 +00:00
eadler
1bb7502e75 top(1): const poison part 2
Further reduce the number of warnings emitted by gcc.
2018-06-02 07:44:50 +00:00
bde
1cd9b8b732 Finish COMPAT_AOUT support for amd64. It wasn't in any amd64 or MI
file in /sys/conf, so was unavailable in configurations that don't use
modules, and was not testable or notable in NOTES.  Its normal
configuration (not using a module) is still silently deprecated in
aout(4) by not mentioning it there.

Update i386 NOTES for COMPAT_AOUT.  It is not i386-only, or even very MD.
Sort its entry better.

Finish gzip configuration (but not support) for amd64.  gzip is really
gzipped aout.  It is currently broken even for i386 (a call to vm fails).
amd64 has always attempted to configure and test it, but it depends on
COMPAT_AOUT (as noted).  The bug that it depends on unconfigured files
was not detected since it is configured as a device.  All other optional
image activators are configured properly using an option.
2018-06-02 06:40:15 +00:00
bde
0409ad53fa Fix high resolution kernel profiling just enough to not crash at boot
time, especially for SMP.  If configured, it turns itself on at boot
time for calibration, so is fragile even if never otherwise used.

Both types of kernel profiling were supposed to use a global spinlock
in the SMP case.  If hi-res profiling is configured (but not necessarily
used), this was supposed to be optimized by only using it when
necessary, and slightly more efficiently, in asm.  But it was not done
at all for mcount entry where it is necessary.  This caused crashes
in the SMP case when either type of profiling was enabled.  For mcount
exit, it only caused wrong times.  The times were wrongest with an
i8254 timer since using that requires exclusive access to the hardware.
The i8254 timer was too slow to use here 20 years ago and is much less
usable now, but it is the default for the SMP case since TSCs weren't
invariant when SMP was new.  Do the locking in all hi-res SMP cases for
simplicity.

Calibration uses special asms, and the clobber lists in these were sort
of inverted.  They contained the arg and return registers which are not
clobbered, but on amd64 they didn't contain the residue of the call-used
registers which may be clobbered (%r10 and %r11).  This usually caused
hangs at boot time.  This usually affected even the UP case.
2018-06-02 05:48:44 +00:00
eadler
c60e3080a1 top(1): const poison
top(1) has a number of issues with writing to const strings. Begin
helping this along by marking easy cases as const.
2018-06-02 04:37:37 +00:00
bde
d4e99d50f0 Fix recent breakages of kernel profiling, mostly on i386 (high resolution
kernel profiling remains broken).

memmove() was broken using ALTENTRY().  ALTENTRY() is only different from
ENTRY() in the profiling case, and its use in that case was sort of
backwards.  The backwardness magically turned memmove() into memcpy()
instead of completely breaking it.  Only the high resolution parts of
profiling itself were broken.  Use ordinary ENTRY() for memmove().
Turn bcopy() into a tail call to memmove() to reduce complications.
This gives slightly different pessimizations and profiling lossage.
The pessimizations are minimized by not using a frame pointer() for
bcopy().

Calls to profiling functions from exception trampolines were not
relocated.  This caused crashes on the first exception.  Fix this using
function pointers.

Addresses of exception handlers in trampolines were not relocated.  This
caused unknown offsets in the profiling data.  Relocate by abusing
setidt_disp as for pmc although this is slower than necessary and
requires namespace pollution.  pmc seems to be missing some relocations.
Stack traces and lots of other things in debuggers need similar relocations.

Most user addresses were misclassified as unknown kernel addresses and
then ignored.  Treat all unknown addresses as user. Now only user
addresses in the kernel text range are significantly misclassified (as
known kernel addresses).

The ibrs functions didn't preserve enough registers.  This is the only
recent breakage on amd64.  Although these functions are written in
asm, in the profiling case they call profiling functions which are
mostly for the C ABI, so they only have to save call-used registers.
They also have to save arg and return registers in some cases and
actually save them in all cases to reduce complications.  They end up
saving all registers except %ecx on i386 and %r10 and %r11 on amd64.
Saving these is only needed for 1 caller on each of amd64 and i386.
Save them there.  This is slightly simpler.

Remove saving %ecx in handle_ibrs_exit on i386.  Both handle_ibrs_entry
and handle_ibrs_exit use %ecx, but only the latter needed to or did
save it.  But saving it there doesn't work for the profiling case.

amd64 has more automatic saving of the most common scratch registers
%rax, %rcx and %rdx (its complications for %r10 are from unusual use
of %r10 by SYSCALL).  Thus profiling of handle_ibrs_exit_rs() was not
broken, and I didn't simplify the saving by moving the saving of these
registers from it to the caller.
2018-06-02 04:25:09 +00:00
eadler
dfdbe4a30f top(1): clean up a bit
- remove unused defines
- use standard defines for STDOUT
- don't cast for memset
- avoid using (void) cast
2018-06-02 04:20:42 +00:00
eadler
ff370bfd90 top(1): help scan-build along a bit
Teach scan-build that some arrays are larger than zero, and thus not to
warn.
2018-06-02 04:08:52 +00:00
eadler
d08be95677 top(1): Use uid_t for uid rather than 'int'
Remove unneeded define while here.
2018-06-02 03:54:50 +00:00
eadler
5af3d07de3 top(1): Remove now-invalid NOTE 2018-06-02 03:33:02 +00:00
eadler
f085a59180 top(1): avoid casting malloc 2018-06-02 03:31:14 +00:00
eadler
28565af2eb top(1): Use standard boolean rather than homegrown alternative 2018-06-02 03:25:15 +00:00
rmacklem
f996124d10 Fix the default number of threads for Flex File layout pNFS client I/O.
The intent was that the default would be based on number of CPUs, but the
code disabled using taskqueue() by default.
This code is only executed when mounting a NFSv4.1 server that supports the
Flexible File layout for pNFS and, since such servers are rare, this change
shouldn't result in a POLA violation.
(The FreeBSD pNFS server is still a project and the only other one that
 uses Flexible File layout is being developed by Primary Data and I don't
 know if they have even shipped any to customers yet.)
Found while testing the pNFS server.
2018-06-02 00:11:26 +00:00
eadler
1a8e246518 top(1): remove two unneeded headers 2018-06-02 00:02:27 +00:00
eadler
72a0a627b4 top(1): ansify, style(9). and nits
- Prefer using ansi prototypes rather than C prototypes
- Keep type on separate line from name of function
- Try to keep things const where possible. This will help get to WARNS=6
- switch to "bool" where it makes sense
2018-06-02 00:02:15 +00:00
markj
1dfe3a57b4 Remove the "pass" variable from the page daemon control loop.
It serves little purpose after r308474 and r329882.  As a side
effect, the removal fixes a bug in r329882 which caused the
page daemon to periodically invoke lowmem handlers even in the
absence of memory pressure.

Reviewed by:	jeff
Differential Revision:	https://reviews.freebsd.org/D15491
2018-06-02 00:01:07 +00:00
kib
df59ae728a Only check for MAP_32BIT when available.
Reported by:	mmacy
Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
2018-06-01 23:50:51 +00:00
markj
746978a065 Avoid completing I/O when dumping core after a panic.
Filesystem or pager completion callbacks are generally non-functional
after a panic and may trigger deadlocks if invoked in this context
(e.g., by attempting to destroying a buffer mapping).  To avoid this
situation, short-circuit I/O completion in biodone().

Reviewed by:	imp
Discussed with:	mav
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D15592
2018-06-01 23:49:32 +00:00
markj
84011a1282 Don't export _end on arm64 and riscv.
These platforms don't support brk() and sbrk(), which are the reason
for exporting _end in the first place.

MFC after:	1 week
2018-06-01 23:42:10 +00:00
markj
3cabb7d089 Remove an inaccuracy from mincore.2.
Super pages are supported on non-x86 architectures, so just remove the
incorrect note.  While here, change terminology to be consistent with
mmap.2.

MFC after:	1 week
2018-06-01 23:40:43 +00:00
cem
47461dc44c at.man: Bump .Dd missed in r334502
Sponsored by:	Dell EMC Isilon
2018-06-01 22:57:19 +00:00
cem
18c29dd505 Update other man pages to match leap second reality
Missed these in r334501; see justification there:

https://svnweb.freebsd.org/base?view=revision&revision=334501

Sponsored by:	Dell EMC Isilon
2018-06-01 22:37:59 +00:00
cem
497a7b01e3 touch.1: Update to conform to POSIX 2004
POSIX borrowed the "double leap second" bug from C89.  Double leap seconds can
never happen.  This mistake was present in at least POSIX 1997 and fixed by
POSIX 2004.  I can't find a copy of 2001 online to determine if the bug was
present in that revision.

While here, remove duplicate language between -d and -t.  A few other minor
enhancements and an igor (lint) bugfix.

Further reading:

2018 POSIX (documents -d):
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/touch.html

2004 POSIX (documents SS from 0-60):
http://pubs.opengroup.org/onlinepubs/009695399/utilities/touch.html

1997 POSIX/SUSv2 (historical interest, 0-61):
http://pubs.opengroup.org/onlinepubs/007908799/xcu/touch.html

More on this subject (start at "Unix system time and the POSIX standard")
https://www.ucolick.org/~sla/leapsecs/onlinebib.html

And: https://marc.info/?l=openbsd-tech&m=92682843416159&w=2

Reported by:	Vishal Sahu <vsahu AT isilon.com>
Sponsored by:	Dell EMC Isilon
2018-06-01 22:34:59 +00:00
brooks
9c13b943ee Remove support for SYS_sys_exit in favor of SYS_exit.
SYS_exit has been defined in the repo since 1994 except for a brief
window when SYS_sys_exit was defined in 2000.
2018-06-01 22:09:27 +00:00
alc
718d3e777c Only a small subset of mmap(2)'s flags should be used in combination with
the flag MAP_GUARD.  Rather than enumerating the flags that are not
allowed, enumerate the flags that are allowed.  The list of allowed flags
is much shorter and less likely to change.  (As an aside, one of the
previously enumerated flags, MAP_PREFAULT, was not even a legal flag for
mmap(2).  However, because of an earlier check within kern_mmap(), this
misuse of MAP_PREFAULT was harmless.)

Reviewed by:	kib
MFC after:	10 days
2018-06-01 21:37:42 +00:00
jhibbits
47bc3a3d85 Increase powerpc64 KVA from ~7.25GB to 32GB
This will let us use much more KVA for ZFS ARC where needed.  This may be
incresed in the future if memory requirements increase.

Discussed with:	nwhitehorn
2018-06-01 21:37:20 +00:00
tuexen
feabe856c7 Limit the retransmission timer for SYN-ACKs by TCPTV_REXMTMAX.
Use the same logic to handle the SYN-ACK retransmission when sent from
the syn cache code as when sent from the main code.

MFC after:	3 days
Sponsored by:	Netflix, Inc.
2018-06-01 21:24:27 +00:00
asomers
f7cc577f1a audit(4): add tests for the fd audit class
The only syscalls in this class are rmdir, unlink, unlinkat, rename, and
renameat.  Also, set is_exclusive for all audit(4) tests, because they can
start and stop auditd.

Submitted by:	aniketp
MFC after:	2 weeks
Sponsored by:	Google, Inc. (GSoC 2018)
Differential Revision:	https://reviews.freebsd.org/D15647
2018-06-01 21:24:10 +00:00
pstef
a5429afc55 indent(1): improve an error message
When producing a "[...] requires a parameter" error, provide the recognized
name of the option instead of argument provided.
2018-06-01 20:45:35 +00:00
tuexen
d87639476c Ensure net.inet.tcp.syncache.rexmtlimit is limited by TCP_MAXRXTSHIFT.
If the sysctl variable is set to a value larger than TCP_MAXRXTSHIFT+1,
the array tcp_syn_backoff[] is accessed out of bounds.

Discussed with: jtl@
MFC after:	3 days
Sponsored by:	Netflix, Inc.
2018-06-01 19:58:19 +00:00
pstef
b1f55fa0b2 indent(1): restore working -pcs
My previous indent(1) commit accidentally broke the -pcs option (which adds
space between function name and opening parenthesis in function calls) by
copying all but one of a few conditions in an if clause. Reinstate the
condition.

Add a regression test to lower the chances of breaking it again.

Correct a comment with description of what the option does.
2018-06-01 19:56:41 +00:00
rmacklem
3cc36cf922 Add the BindConnectiontoSession operation to the NFSv4.1 server.
Under some fairly unusual circumstances, the Linux NFSv4.1 client is
doing a BindConnectiontoSession operation for TCP connections.
It is also used by the ESXi6.5 NFSv4.1 client.
This patch adds this operation to the NFSv4.1 server.

Reported by:	andreas.nagy@frequentis.com
Tested by:	andreas.nagy@frequentis.com
MFC after:	2 weeks
2018-06-01 19:47:41 +00:00
imp
7d46287b05 Add PNP_INFO to aac
Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-01 19:42:59 +00:00
jtl
4b5cc954df Update the sysctl(9) manpage to indicate that <sys/param.h> is required
instead of <sys/types.h>.  (<sys/sysctl.h> includes NULL, which is defined
with <sys/param.h> and not <sys/types.h>.)

Sponsored by:	Netflix
2018-06-01 16:47:39 +00:00