Commit Graph

122850 Commits

Author SHA1 Message Date
Andrew Turner
8e5d76e654 Make cpu_set_syscall_retval common between the existing FreeBSD ABI and
the Linuxulator. We need to translate error values onto Linux errno values
and return them to userspace when a syscall fails. We also need to preserve
x1 as all registers are preserved other than the return value.

Reviewed by:	emaste
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16008
2018-06-25 22:36:25 +00:00
Justin Hibbits
4a28eaada4 Expose stopped cpu contexts to ddb on PowerPC
Summary: In r220638, stoppcbs started being tracked. This never got exposed to
ddb though, so kdb_thr_ctx() didn't know how to look them up.

This allows switching to threads on stopped CPUs in kdb.

Submitted by:	Brandon Bergren <git_bdragon.rkt0.net>
Differential Revision: https://reviews.freebsd.org/D15986
2018-06-25 22:05:33 +00:00
Ed Maste
d3b03d746b linux64: add arm64 linuxulator build details
The arm64 linuxulator needs different arguments for the objcopy
invocation used to build the linux VDSO.  These arguments are both arch-
and OS-dependent, so I did not try to use some common setting for them.

Reviewed by:	imp
Sponsored by:	Turing Robotic Industries
Differential Revision:	https://reviews.freebsd.org/D16011
2018-06-25 20:33:04 +00:00
Ed Maste
9c42fa94a6 Quiet unused fn warning for linuxulator w/o legacy syscalls
Sponsored by:	Turing Robotic Industries
2018-06-25 19:24:50 +00:00
Ed Maste
3911ee2c92 Initial arm64 linuxulator linux_sysvec
This is sufficient to run Linux arm64 'hello world' and other simple
binaries.

Reviewed by:	andrew
Sponsored by:	Turing Robotic Industries
Differential Revision:	https://reviews.freebsd.org/D15834
2018-06-25 14:12:33 +00:00
Konstantin Belousov
7f12ebe583 Do not leave stray qword on top of stack for interrupts and exceptions
without error code.  Doing so it mis-aligned the stack.

Since the only consumer of the SSE instructions with the alignment
requirements is AES-NI module, and since the FPU context cannot be
accessed in interrupts, the only situation where the alignment matter
are the compat32 syscalls, as reported in the PR.

PR:	229222
Reported and tested by:	 dewayne@heuristicsystems.com.au
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-06-25 11:29:04 +00:00
Konstantin Belousov
ce3bf75015 Do not access ISA timer if BIOS reports that there is no legacy
devices present.

On at least one machine where it would matter since the ISA timer is
power gated when booted in the UEFI mode, BIOS still reports that the
legacy devices are present.  That is, user still have to manually
disable TSC calibration on such machines.  Hopefully it will be more
useful in the future.

Discussed with:	Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed by:	royger
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D16004
MFC after:	1 week
2018-06-25 11:24:26 +00:00
Konstantin Belousov
28ebccd5fa Fix compilation.
Pointy hat to:	me
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-06-25 11:12:21 +00:00
Konstantin Belousov
7705dd4df0 Provide a helper function acpi_get_fadt_bootflags() to fetch the FADT
x86 boot flags.

Reviewed by:	royger
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D16004
MFC after:	1 week
2018-06-25 11:01:12 +00:00
Konstantin Belousov
120186ad8c Always initialize the ignore local variable.
Reviewed by:	royger
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D16004
2018-06-25 10:52:41 +00:00
Roger Pau Monné
8f62926e03 vt: add option to ignore NO_VGA flag in ACPI
To workaround buggy firmware that sets this flag when there's actually
a VGA present.

Reported and tested by:	Yasuhiro KIMURA <yasu@utahime.org>
Sponsored by:		Citrix Systems R&D
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D16003
2018-06-25 09:39:16 +00:00
Oleksandr Tymoshenko
adcd95974d [rpi] Fix compatiblity with upstream DTB for RPi 3B and 3B+
Upstream dtb switched to using brcm,bcm2837 for platform
compatibility string. Patch platfrom and cpufreq compatiblity
data accordingly.

Submitted by:	Sylvain Garrigues <sylgar@gmail.com>
Tested by:	db@
Differential Revision:	https://reviews.freebsd.org/D15998
2018-06-24 23:19:31 +00:00
Sean Bruno
af4da58655 Enable TCP_FASTOPEN by default for FreeBSD 12.
Submitted by:	kbowling
Reviewed by:	tuexen
Differential Revision:	https://reviews.freebsd.org/D15959
2018-06-24 21:46:29 +00:00
Sean Bruno
45fc0718d8 Reap unused variable and assignment that had no effect. Noted by cross
compiling with gcc on mips.

Reviewed by:	mmacy
2018-06-24 21:36:37 +00:00
Warner Losh
b7841bd0eb Don't use generic PCI_VENDOR and PCI_PRODUCT macros. Prefix them with
BKTR_ to avoid possible conflicts.
2018-06-24 19:01:01 +00:00
Matt Macy
74333b3dee fix assert and conditionally allow mutexes to be held across epoch_wait_preempt 2018-06-24 18:57:06 +00:00
Mateusz Guzik
a3d799fbb5 vm: stop passing M_ZERO when allocating radix nodes
Allocation explicitely initialized the 3 leading fields. The rest is an
array which is supposed to be NULL-ed prior to deallocation.

Delegate zeroing to the infrequently called object initializator.

This gets rid of one of the most common memset consumers.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D15989
2018-06-24 13:08:05 +00:00
Ian Lepore
57ec63b396 Retrieve the bus clock speed and mode (polarity/phase) from the child device
and set up the hardware accordingly on each transfer.  This replaces the old
configuration done via sysctl, and allows both fdt configuration data and
userland control via the spigen device to work.

Submitted by:	Bob Frazier
Differential Revision:	https://reviews.freebsd.org/D15031
2018-06-23 23:44:36 +00:00
Ian Lepore
95e390b666 Add spi-max-frequency properties to all spigen nodes. This is a required
property for spi devices, although in the spigen case it's expected that
the speed will be overridden at runtime via the ioctl interface.  A very
conservative 500khz speed is used (I've never seen a spi device that
couldn't run at 1mhz).
2018-06-23 22:55:22 +00:00
Conrad Meyer
e50f10b5a4 aesni(4): Fix {de,en}crypt operations that allocated a buffer
aesni(4) allocates a contiguous buffer for the data it processes if the
provided input was not already virtually contiguous, and copies the input
there.  It performs encryption or decryption in-place.

r324037 removed the logic that then copied the processed data back to the
user-provided input buffer, breaking {de,enc}crypt for mbuf chains or
iovecs with more than a single descriptor.

PR:		228094 (probably, not confirmed)
Submitted by:	Sean Fagan <kithrup AT me.com>
Reported by:	Emeric POUPON <emeric.poupon AT stormshield.eu>
X-MFC-With:	324037
Security:	could result in plaintext being output by "encrypt"
		operation
2018-06-23 18:20:17 +00:00
Conrad Meyer
7d0ffa388e aesni(4): Support CRD_F_KEY_EXPLICIT OCF mode
PR:		227788
Reported by:	eadler@
2018-06-23 17:24:19 +00:00
Emmanuel Vadot
7e807acfee aw_mmc: Fix style(9) after r335476 2018-06-23 15:05:21 +00:00
Emmanuel Vadot
8b3bb1afb1 allwinner: clkng: Correct mux width and flags
The test for checking if the clock have a mux was inverted and the mask
to calculate the parent index was wrong was wrong too.
It means that upon creation the incorrect parent was resolved as the current
one and upon reparent the switch was never made.

Pointy hat (lots of them): manu
2018-06-23 15:03:54 +00:00
Mark Johnston
a8be239d69 Re-count available PV entries after reclaiming a PV chunk.
The call to reclaim_pv_chunk() in reserve_pv_entries() may free a
PV chunk with free entries belonging to the current pmap.  In this
case we must account for the free entries that were reclaimed, or
reserve_pv_entries() may return without having reserved the requested
number of entries.

Reviewed by:	alc, kib
Tested by:	pho (previous version)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D15911
2018-06-23 10:41:52 +00:00
Jeff Roberson
63b5557b2f Sort uma_zone fields according to 64 byte cache line with adjacent line
prefetch on 64bit architectures.  Prior to this, two lines were needed
for the fast path and each line may fetch an unused adjacent neighbor.
 - Move fields used by the fast path into a single line.
 - Move constants into the adjacent line which is mostly used for
   the spare bucket alloc 'medium path'.
 - Unpad the mtx which is only used by the fast path and place it in
   a line with rarely used data.  This aligns the cachelines better and
   eliminates 128 bytes of wasted space.

This gives a 45% improvement on a will-it-scale test on a 24 core machine.

Reviewed by:	mmacy
2018-06-23 08:10:09 +00:00
Matt Macy
0bcfb47363 epoch(9): Don't trigger taskq enqueue before the grouptaskqs are setup
If EARLY_AP_STARTUP is not defined it is possible for an epoch to be
allocated prior to it being possible to call epoch_call without
issue.

Based on patch by andrew@

PR:		229014
Reported by:	andrew
2018-06-23 07:14:08 +00:00
Gleb Smirnoff
a00f4ac22f Revert r334843, and partially revert r335180.
tcp_outflags[] were defined since 4BSD and are defined nowadays in
all its descendants. Removing them breaks third party application.
2018-06-23 06:53:53 +00:00
Justin Hibbits
ab42fbe2e9 powerpc64: Fix stack setup in dbtrap
r330610 relocated the DMAP from the base of memory to the base of the fourth
quadrant of memory.  This broke synthetic traps, such as KDB forced
breakpoints.  Use GET_TOCBASE() so the DMAP offset is handled.

Submitted by:	git_bdragon.rkt0.net
Differential Revision:	https://reviews.freebsd.org/D15973
2018-06-23 01:42:34 +00:00
Rick Macklem
9f4c522e6b Set the slotid and ND_HASSLOTID flag for NFSv4.1 sequenced operations.
Most NFSv4.1 compound RPCs start with a Sequence operation. For these
cases, save the slotid and note that it is saved by setting ND_HASSLOTID.
This is used by r335568 to free up the session slot and disable it.

MFC after:	2 weeks
2018-06-23 00:48:45 +00:00
Rick Macklem
b18130d330 Define ND_HASSLOTID needed by r335568.
r335568 uses a flag called ND_HASSLOTID to indicate that the slotid is set,
so it can free and invalidate it.
This flag needs to be set, which will be done in a subsequent commit.

MFC after:	2 weeks
2018-06-23 00:37:15 +00:00
Kristof Provost
150182e309 pf: Support "return" statements in passing rules when they fail.
Normally pf rules are expected to do one of two things: pass the traffic or
block it. Blocking can be silent - "drop", or loud - "return", "return-rst",
"return-icmp". Yet there is a 3rd category of traffic passing through pf:
Packets matching a "pass" rule but when applying the rule fails. This happens
when redirection table is empty or when src node or state creation fails. Such
rules always fail silently without notifying the sender.

Allow users to configure this behaviour too, so that pf returns an error packet
in these cases.

PR:		226850
Submitted by:	Kajetan Staszkiewicz <vegeta tuxpowered.net>
MFC after:	1 week
Sponsored by:	InnoGames GmbH
2018-06-22 21:59:30 +00:00
Rick Macklem
ba6cce3aea Fix the handling of NFSv4.1 sessions for "soft" mounts.
When a "soft" mount is used for NFSv4.1, an RPC that fails without completing
will leave a slot in the NFSv4.1 session in an indeterminate state.
As such, all that can be done is free up the slot while making is no longer
usable.
A "soft" NFSv4.1 mount is not recommended in general, since it will leave
Open/Lock state in an indeterminate state. An exception is a pNFS mount of
a DS, since there are no Opens/Locks done for them except file creates
where loss of the Open state does not matter.
The patch also makes connections to DSs soft, so that they will fail when
a DS is non-functional or network partitioned, allowing the pNFS MDS to disable
the DS for a mirrored configuration.
This patch should not affect normal "hard" NFSv4.1 mounts.

MFC after:	2 weeks
2018-06-22 21:37:20 +00:00
Rick Macklem
2e35b8fe24 Change the NFSv4.1 pNFS client so that it returns the DS error in layoutreturn.
When the NFSv4.1 pNFS client gets an error for a DS I/O operation using a
Flexible File layout, it returns the layout with an error.
This patch changes the code slightly, so that it returns the layout for all
errors except EACCES and lets the MDS decide what to do based on the error.
It also makes a couple of changes to nfscl_layoutrecall() to ensure that
the first layoutreturn(s) will have the error in the reply.
Plus, the patch adds a wakeup() so that the "nfscl" thread won't wait 1sec
before doing the LayoutReturn.
Tested against the pNFS service.
This patch should not affect non-pNFS use of the client.
The unused "dsp" argument will be used by a future patch that disables the
connection to the DS when possible.

MFC after:	2 weeks
2018-06-22 21:25:27 +00:00
Ian Lepore
ce0b0df7ab Add spigen(4) fdt data overlays for RPI-B, RPI-2.
By adding spigen-rpi{2,-b}.dtso to fdt_overlays= in loader.conf, the fdt data
will set up the correct pinmux and device nodes to create a spigen(4) device
for each available chipselect pin.

Submitted by:	Bob Frazier
Differential Revision:	https://reviews.freebsd.org/D15067
2018-06-22 20:45:40 +00:00
Ian Lepore
c5b7751fa2 Eliminate a spurious panic on non-SMP systems (occurred on shutdown/reboot). 2018-06-22 20:22:26 +00:00
Colin Percival
7e8db78116 Improve the accuracy of the POSIX "process CPU-time" clocks by adding the
used portion of the current thread's time slice if the current thread
belongs to the process being queried (i.e., if clock_gettime is invoked
with a clock ID of CLOCK_PROCESS_CPUTIME_ID or the value provided by
passing getpid(2) to clock_getcpuclockid(3)).

The CLOCK_VIRTUAL and CLOCK_PROF timers already make this adjustment via
long-standing code in calcru(), but since those timers are not specified
by POSIX it seems useful to add it here so that the higher accuracy is
available to code which aims to be portable.

PR:		228669
Reported by:	Graham Percival
Reviewed by:	kib
MFC after:	1 week
2018-06-22 10:23:32 +00:00
Rick Macklem
c16f407e31 Add a counter to limit the number of disabled DSs for a mirrored pNFS MDS.
This patch adds a counter that limits the number of disabled mirrored DSs
to mirror level - 1.  It also makes a small change that keeps a Write that
has failed with EACCES when attempted by a client to a DS from disabling
the DS.
This patch only affects the pNFS server.
2018-06-22 00:55:39 +00:00
Matt Macy
ae25f40b72 epoch(9): make non-preemptible variant work early boot 2018-06-22 00:47:18 +00:00
Chuck Tuffli
4c4317193b Fix output of linprocfs stat entry
The Linux /proc/stat entry has grown over time

 v2.5.41 <
   user, nice, system, idle
 v2.5.41
   user, nice, system, idle, iowait, irq
 v2.6.11
   user, nice, system, idle, iowait, irq, softirq, steal
 v2.6.24
   user, nice, system, idle, iowait, irq, softirq, steal, guest
 v2.6.32 >
   user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice

Some applications (e.g. nodejs) depend on the correct number of entries
and will abort otherwise.

Fix is to print the correct number of entries based on the value of
osrelease set either in sysctl or the jail settings. Change is similar
to approach used by illumos.

Reviewed by: emaste, imp (mentor)
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15858
2018-06-22 00:02:05 +00:00
Chuck Tuffli
3575504976 Fix the Linux kernel version number calculation
The Linux compatibility code was converting the version number (e.g.
2.6.32) in two different ways and then comparing the results.

The linux_map_osrel() function converted MAJOR.MINOR.PATCH similar to
what FreeBSD does natively. I.e. where major=v0, minor=v1, and patch=v2
    v = v0 * 1000000 + v1 * 1000 + v2;

The LINUX_KERNVER() macro, on the other hand, converted the value with
bit shifts. I.e. where major=a, minor=b, and patch=c
    v = (((a) << 16) + ((b) << 8) + (c))

The Linux kernel uses the later format via the KERNEL_VERSION() macro in
include/generated/uapi/linux/version.h

Fix is to use the LINUX_KERNVER() macro in linux_map_osrel() as well as
in the .trans_osrel functions.

PR: 229209
Reviewed by: emaste, cem, imp (mentor)
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15952
2018-06-22 00:02:03 +00:00
Kyle Evans
03d7aee8a7 subr_hints: Fix acpi unit hinting (at the very least)
The refactoring in r335479 overlooked the fact that the dynamic kenv can
also be switched to if hintmode == 0. This is problematic because the
checkmethod bits are only ever ran once, but it worked previously because
the use_kenv was a global state and the first lookup would enable it if
occurring after the dynamic environment has been setup.

Extending our local definition of use_kenv to include all non-STATIC
hintmodes as long as the dynamic_kenv is setup fixes this. We still have
potential issues if the dynamic kenv comes up while we're doing an anchored
search through the environment, but this is not much of a concern right now
because:

1.) The dynamic environment comes up super early in boot, just after kmem

2.) This is going to get rewritten to provide a safer mechanism for the
anchored searches, ensuring that we continue using the same environment
chain (dynamic env or static fallback) for all anchored search invocations

Reported by:	mmamcy
X-MFC-With: r335479
2018-06-21 21:50:00 +00:00
Ian Lepore
1fcf4de055 Incorporate bus and chip select numbers into spigen(4) cdev names. Rather
than assigning spigen device names in order of creation, this uses a device
name that corresponds to the owning spibus and chip-select index.

Example: /dev/spigen0.1 would be a child of spibus0, and use cs = 1

The intent is for systems like Raspberry Pi to have a consistent way of
using an SPI interface with a specific cs value from a user application.
Otherwise, there is no consistent way of knowing which cs pin will be
assigned to a particular spigen device. The alternative is to specify
everything in "the right order" in an overlay file, which is less than
ideal. Additionally, this duplicates (to some extent) the way Linux handles
a similar situation with their 'spidev' device, so it would be somewhat
familiar to those who also use Linux.

A new kernel config option, SPIGEN_LEGACY_CDEVNAME, causes the driver to
also create /dev/spigenN device name aliases, with N incrementing in the
order of device instantiation.  This is provided to ease the transition
for existing systems using the original naming convention (particularly
when these changes are MFC'd to stable branches).

Differential Revision:	https://reviews.freebsd.org/D15301
2018-06-21 21:16:26 +00:00
Konstantin Belousov
31665c1abb linux_clone_thread: mark new thread as TDB_BORN.
So that the ptrace code will catch it and report it to attached
debugger.  Enables debugging of threaded Linux binaries with FreeBSD
debugger.

Submitted by:	Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15880
2018-06-21 21:15:04 +00:00
Konstantin Belousov
6e22bbf66e fork: avoid endless wait with PTRACE_FORK and RFSTOPPED.
An RFSTOPPED thread can't clean TDB_STOPATFORK, which is done in the
fork_return() in its context, so parent is stuck forever.  Triggered
when trying to ptrace linux process.  Instead of waiting for the new
thread to clear TDB_STOPATFORK, tag it as traced and reparent to the
debugger in do_fork(), and let it only notify the debugger when run.

Submitted by:	Yanko Yankulov <yanko.yankulov@gmail.com>
Reviewed by:	jhb
MFC after:	1 week
X-MFC-Note:	keep p_dbgwait placeholder intact
Differential revision:	https://reviews.freebsd.org/D15857
2018-06-21 21:12:49 +00:00
Konstantin Belousov
ac4bc0c171 Update proc->p_ptevents annotation to reflect the actual locking.
Submitted by:	Yanko Yankulov <yanko.yankulov@gmail.com>
Reviewed by:	jhb
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15954
2018-06-21 21:07:25 +00:00
Randall Stewart
581a046a8b This adds in an optimization so that we only walk one
time through the mbuf chain during copy and TSO limiting.
It is used by both Rack and now the FreeBSD stack.
Sponsored by:	Netflix Inc
Differential Revision: https://reviews.freebsd.org/D15937
2018-06-21 21:03:58 +00:00
Matt Macy
e93fdbe212 raw_ip: validate inp in both loops
Continuation of r335497. Also move the lock acquisition up to
validate before referencing inp_cred.

Reported by:	pho
2018-06-21 20:18:23 +00:00
Matt Macy
3d348772e7 in_pcblookup_hash: validate inp before return
Post r335356 it is possible to have an inpcb on the hash lists that is
partially torn down. Validate before using. Also as a side effect of this
change the lock ordering issue between hash lock and inpcb no longer exists
allowing some simplification.

Reported by:	pho@
2018-06-21 18:40:15 +00:00
Conrad Meyer
e38b830780 Sync strlcpy with userland version, again
No functional change.

Please remember to update libkern copies of libc functions when you update
libc.

Sponsored by:	Dell EMC Isilon
2018-06-21 17:35:13 +00:00
Matt Macy
e5c331cf78 raw_ip: validate inp
Post r335356 it is possible to have an inpcb on the hash lists that is
partially torn down. Validate before using.

Reported by:	pho
2018-06-21 17:24:10 +00:00