Commit Graph

129821 Commits

Author SHA1 Message Date
jhibbits
589a67838d Minimal change to build linuxkpi on architectures with physical addresses larger
than virtual

Summary:
Some architectures have physical/bus addresses that are much larger
than virtual addresses.  This change just quiets a warning, as DMAP is not used
on those architectures, and on 64-bit platforms uintptr_t is the same size as
vm_paddr_t and void *.

Reviewed By:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D14043
2018-01-26 00:56:09 +00:00
np
afe3fb1404 cxgbe(4): Accept old names of a couple of tunables. 2018-01-26 00:45:40 +00:00
np
42322cf25b cxgbe(4): Do not display harmless warning in non-debug builds.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2018-01-26 00:03:14 +00:00
cem
e83cbe4f78 nfs: Remove NFSSOCKADDRALLOC, NFSSOCKADDRFREE macros
They were just thin wrappers over malloc(9) w/ M_ZERO and free(9).

Discussed with:	rmacklem, markj
Sponsored by:	Dell EMC Isilon
2018-01-25 22:38:39 +00:00
cem
c060d198e3 style: Remove remaining deprecated MALLOC/FREE macros
Mechanically replace uses of MALLOC/FREE with appropriate invocations of
malloc(9) / free(9) (a series of sed expressions).  Something like:

* MALLOC(a, b, ... -> a = malloc(...
* FREE( -> free(
* free((caddr_t) -> free(

No functional change.

For now, punt on modifying contrib ipfilter code, leaving a definition of
the macro in its KMALLOC().

Reported by:	jhb
Reviewed by:	cy, imp, markj, rmacklem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14035
2018-01-25 22:25:13 +00:00
imp
848a9a5ec4 Add new opt_da.h for stand-alone build.
Sponsored by: Netflix
2018-01-25 21:48:07 +00:00
imp
03a94857f9 Track Ref / DeRef and Hold / Unhold that da is doing to track down
leaks. We assume each source can be taken / dropped only once and
don't recurse. These are only enabled via DA_TRACK_REFS or
INVARIANTS. There appreas to be a reference leak under extreme load,
and these should help us colaberatively work it out. It also documents
better the reference / holding protocol better.

Reviewed by: ken@, scottl@
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14040
2018-01-25 21:38:30 +00:00
imp
296548866a When devices are invalidated, there's some cases where ccbs for that
device still wind up in xpt_done after the path has been
invalidated. Since we don't always need sim or devq, add some guard
rails to only fail if we have to use them.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14040
2018-01-25 21:38:09 +00:00
nwhitehorn
b2d5bf8667 Avoid all SLB operations in trap handling if the process is not using a
software-managed SLB.
2018-01-25 18:10:33 +00:00
nwhitehorn
f4e081a8c4 Treat DSE exceptions like DSI exceptions when generating signinfo.
Both can generate SIGSEGV, but DSEs would have put the wrong address
into the siginfo structure when the signal was delivered.

MFC after:	1 week
2018-01-25 18:09:26 +00:00
ian
28c319f3c1 Fix return style in RD2. Remove bogus return value from a void function
in WR2 (I have no idea why that didn't result in a compile error).
2018-01-25 18:08:56 +00:00
pfg
08d8a954a8 Minor style issue introduced in r328346.
Pointed by:	bde
2018-01-25 18:01:46 +00:00
ian
4e9942e668 Minor cleanups... Move DRIVER_MODULE() and other boilerplate stuff to the
bottom of the file, where it is in most imx5/6 drivers.  Switch from an RD2
macro using bus_space_read_2() to an inline function using bus_read_2();
likewise for WR2.  Use RESOURCE_SPEC_END to end the resource_spec list.

Net effect should be no functional changes.
2018-01-25 17:53:33 +00:00
br
ee8ff9b500 o Move sdhci_fdt to the generic files list.
o Include Qualcomm EHCI and UART drivers to the build.

Sponsored by:	DARPA, AFRL
2018-01-25 17:16:29 +00:00
br
85a51c91ee Add support for SDHCI controller found in Qualcomm Snapdragon 410e.
Tested on DragonBoard 410c.

Sponsored by:	DARPA, AFRL
2018-01-25 17:00:35 +00:00
br
d2383ccfab Add basic driver for Qualcomm USB 2.0 EHCI controller.
This driver relies on system initialization in u-boot.

Tested on DragonBoard 410c.

Sponsored by:	DARPA, AFRL
2018-01-25 16:58:23 +00:00
markj
19e9e67650 Use tcpinfoh_t for TCP headers in the tcp:::debug-{drop,input} probes.
The header passed to these probes has some fields converted to host
order by tcp_fields_to_host(), so the tcpinfo_t translator doesn't do
what we want.

Submitted by:	Hannes Mehnert
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D12647
2018-01-25 15:35:34 +00:00
wma
d14bbbd48e BPF: Switch to 32 bit compatible mode only when thread is 32 bit
Sometimes 32 bit and 64 bit ioctls are represented by the same number.
It causes unnecessary switch to 32 bit commpatible mode.

This patch prevents switching when we are dealing with 64 bit executable.
It fixes issue mentioned here

Authored by:           Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Reviewed by:           andrew, wma
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14023
2018-01-25 12:13:41 +00:00
lwhsu
344189d5c0 Fix build for architectures where size_t is not unsigned long
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D14045
2018-01-25 06:37:14 +00:00
mizhka
ceb9715f37 [etherswitch] fix LINT build for rtl8366rb
Build with rtl8366rb has been broken due to incorrect retrieval of pointer
to device_t.

Reported by:	lwhsu
Differential Revision:	https://reviews.freebsd.org/D14044
2018-01-25 05:48:42 +00:00
imp
f1c12a8a9e Minor whitespace cleanup to remove leading space before tab. No
functional changes.
2018-01-25 02:52:44 +00:00
manu
0016f32650 arm: lpc: Remove support
Code hasn't been touch this it's original commit in 2012 beside api changes.

Reviewed by:	ian
Differential Revision:	https://reviews.freebsd.org/D13625
Discussed with:		freebsd-arm@freebsd.org (no reply)
2018-01-24 22:04:16 +00:00
mizhka
447acd1dc7 [etherswitch] check if_alloc returns NULL
This patch is cosmetic. It checks if allocation of ifnet structure failed.
It's better to have this check rather than assume positive scenario.

Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
Reported by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
2018-01-24 21:33:18 +00:00
jhb
7094ca6ab7 Store IV in output buffer in GCM software fallback when requested.
Properly honor the lack of the CRD_F_IV_PRESENT flag in the GCM
software fallback case for encryption requests.

Submitted by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-01-24 20:16:48 +00:00
jhb
698fd0363b Don't read or generate an IV until all error checking is complete.
In particular, this avoids edge cases where a generated IV might be
written into the output buffer even though the request is failed with
an error.

Sponsored by:	Chelsio Communications
2018-01-24 20:15:49 +00:00
jhb
fdcfcaeba5 Expand the software fallback for GCM to cover more cases.
- Extend ccr_gcm_soft() to handle requests with a non-empty payload.
  While here, switch to allocating the GMAC context instead of placing
  it on the stack since it is over 1KB in size.
- Allow ccr_gcm() to return a special error value (EMSGSIZE) which
  triggers a fallback to ccr_gcm_soft().  Move the existing empty
  payload check into ccr_gcm() and change a few other cases
  (e.g. large AAD) to fallback to software via EMSGSIZE as well.
- Add a new 'sw_fallback' stat to count the number of requests
  processed via the software fallback.

Submitted by:	Harsh Jain @ Chelsio (original version)
Sponsored by:	Chelsio Communications
2018-01-24 20:14:57 +00:00
jhb
c7c9c36785 Clamp DSGL entries to a length of 2KB.
This works around an issue in the T6 that can result in DMA engine
stalls if an error occurs while processing a DSGL entry with a length
larger than 2KB.

Submitted by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-01-24 20:13:07 +00:00
jhb
75449ebc4c Fail crypto requests when the resulting work request is too large.
Most crypto requests will not trigger this condition, but a request
with a highly-fragmented data buffer (and a resulting "large" S/G
list) could trigger it.

Sponsored by:	Chelsio Communications
2018-01-24 20:12:00 +00:00
jhb
d66616766d Don't discard AAD and IV output data for AEAD requests.
The T6 can hang when processing certain AEAD requests if the request
sets a flag asking the crypto engine to discard the input IV and AAD
rather than copying them into the output buffer.  The existing driver
always discards the IV and AAD as we do not need it.  As a workaround,
allocate a single "dummy" buffer when the ccr driver attaches and
change all AEAD requests to write the IV and AAD to this scratch
buffer.  The contents of the scratch buffer are never used (similar to
"bogus_page"), and it is ok for multiple in-flight requests to share
this dummy buffer.

Submitted by:	Harsh Jain @ Chelsio (original version)
Sponsored by:	Chelsio Communications
2018-01-24 20:11:00 +00:00
jhb
5507c02b91 Reject requests with AAD and IV larger than 511 bytes.
The T6 crypto engine's control messages only support a total AAD
length (including the prefixed IV) of 511 bytes.  Reject requests with
large AAD rather than returning incorrect results.

Sponsored by:	Chelsio Communications
2018-01-24 20:08:10 +00:00
jhb
4745a6e4af Always set the IV location to IV_NOP.
The firmware ignores this field in the FW_CRYPTO_LOOKASIDE_WR work
request.

Submitted by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-01-24 20:06:02 +00:00
jhb
30b7fd6812 Always store the IV in the immediate portion of a work request.
Combined authentication-encryption and GCM requests already stored the
IV in the immediate explicitly.  This extends this behavior to block
cipher requests to work around a firmware bug.  While here, simplify
the AEAD and GCM handlers to not include always-true conditions.

Submitted by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-01-24 20:04:08 +00:00
ae
9ecab3344c Adopt revision 1.76 and 1.77 from NetBSD:
Fix a vulnerability in IPsec-IPv6-AH, that allows an attacker to remotely
  crash the kernel with a single packet.

  In this loop we need to increment 'ad' by two, because the length field
  of the option header does not count the size of the option header itself.

  If the length is zero, then 'count' is incremented by zero, and there's
  an infinite loop. Beyond that, this code was written with the assumption
  that since the IPv6 packet already went through the generic IPv6 option
  parser, several fields are guaranteed to be valid; but this assumption
  does not hold because of the missing '+2', and there's as a result a
  triggerable buffer overflow (write zeros after the end of the mbuf,
  potentially to the next mbuf in memory since it's a pool).

  Add the missing '+2', this place will be reinforced in separate commits.

Reported by:	Maxime Villard <maxv at NetBSD.org>
MFC after:	1 week
2018-01-24 19:48:25 +00:00
cem
2b1bc6707d malloc(9): Change nominal size to size_t to match standard C
No functional change -- size_t matches unsigned long on all platforms.

Reported by:	bde
Discussed with:	jhb
Sponsored by:	Dell EMC Isilon
2018-01-24 19:37:18 +00:00
ae
733b094ecd Merge revision 1.35 from NetBSD:
fix pointer/offset mistakes in handling of IPv4 options

Reported by:	Maxime Villard <maxv at NetBSD.org>
MFC after:	1 week
2018-01-24 19:06:44 +00:00
ian
f30683076e Make the trivial imx_soc_family() function an inline in imx_machdep.h.
The imx_machdep.c file is on the fast path to non-existance and this would
be the only thing left in it after some watchdog changes are completed.
2018-01-24 18:10:11 +00:00
pfg
944d693f04 ext2fs|ufs:Unsign some values related to allocation.
When allocating memory through malloc(9), we always expect the amount of
memory requested to be unsigned as a negative value would either stand for
an error or an overflow.
Unsign some values, found when considering the use of mallocarray(9), to
avoid unnecessary casting. Also consider that indexes should be of
at least the same size/type as the upper limit they pretend to index.

MFC after:	2 weeks
2018-01-24 17:58:48 +00:00
ian
d1ff67731c Reformat indentation to match other imx5/6 register definition headers, and
tweak some comments.  No functional changes.
2018-01-24 17:52:06 +00:00
trasz
1da0bebce4 Add SPDX identifiers to linux_ptrace.c and cfumass.c.
MFC after:	2 weeks
2018-01-24 17:04:01 +00:00
trasz
db1fff314f Add SPDX tags to iscsi(4).
MFC after:	2 weeks
2018-01-24 16:58:26 +00:00
pfg
ca690ecdf9 Revert r327781, r328093, r328056:
ufs|ext2fs: Revert uses of mallocarray(9).

These aren't really useful: drop them.
Variable unsigning will be brought again later.
2018-01-24 16:44:57 +00:00
trasz
cf8b777c32 Add SPDX tags to autofs(5).
MFC after:	2 weeks
2018-01-24 16:40:26 +00:00
wma
86a31a6bb4 Reverting r328320 2018-01-24 13:57:01 +00:00
hselasky
af27af9521 Properly implement the "id" callback argument in the "idr_for_each" function
in the LinuxKPI. The old implementation assumed only one IDR layer was present.
Take additional IDR layers into account when computing the "id" value.

MFC after:	1 week
Found by:	Karthik Palanichamy <karthikp@chelsio.com>
Tested by:	Karthik Palanichamy <karthikp@chelsio.com>
Sponsored by:	Mellanox Technologies
2018-01-24 13:37:07 +00:00
ae
ba9f1438e7 When IPv6 packet is handled by O_REJECT opcode, convert ICMP code
specified in the arg1 into ICMPv6 destination unreachable code according
to RFC7915.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2018-01-24 12:40:28 +00:00
wma
0b4385ba23 PPC: Add KASSERT in intrcnt_add which checks for buffer overflow
Authored by:           Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-24 12:01:32 +00:00
smh
c62edd09fd Added missing CTLFLAG_VNET to lacp default_strict_mode
Added CTLFLAG_VNET to net.link.lagg.lacp.default_strict_mode which was missed
in r290450.

Reported by:	julian@
MFC after:	1 week
Sponsored by:	Multiplay
2018-01-24 10:13:14 +00:00
wma
d8d083c4f2 ULE: provide defaults to ts_cpu
Fix a bug when the system has no CPU 0. When created, threads were implicitly assigned to CPU 0.
This had no practical effect since a real CPU was chosen immediately by the scheduler. However,
on systems without a CPU 0, sched_ule attempted to access the scheduler queue of the "old" CPU
when assigned the initial choice of the old one. This caused an attempt to use illegal memory
and a crash (or, more usually, a deadlock). Fix this by assigned new threads to the BSP
explicitly and add some asserts to see that this problem does not recur.

Authored by:           Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Differential revision: https://reviews.freebsd.org/D13932
2018-01-24 07:54:05 +00:00
np
a2781c4db2 cxgb(4): Validate offset/len in the GET_EEPROM ioctl.
Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
2018-01-24 05:16:11 +00:00
np
af35a0e296 Do not generate illegal mbuf chains during IP fragment reassembly. Only
the first mbuf of the reassembled datagram should have a pkthdr.

This was discovered with cxgbe(4) + IPSEC + ping with payload more than
interface MTU.  cxgbe can generate !M_WRITEABLE mbufs and this results
in m_unshare being called on the reassembled datagram, and it complains:

panic: m_unshare: m0 0xfffff80020f82600, m 0xfffff8005d054100 has M_PKTHDR

PR:		224922
Reviewed by:	ae@
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14009
2018-01-24 05:09:21 +00:00
kp
d610e605bf pf: States have at least two references
pf_unlink_state() releases a reference to the state without checking if
this is the last reference. It can't be, because pf_state_insert()
initialises it to two. KASSERT() that this is always the case.

CID:	1347140
2018-01-24 04:29:16 +00:00
ian
d645c0dfb2 Follow changes in r328307 by using new IIC_RECURSIVE flag.
The driver now ensures only one thread at a time is running in the API
functions (clock_gettime() and clock_settime()) by specifically requesting
ownership of the i2c bus without using IIC_RECURSIVE, then it does all IO
using IIC_RECURSIVE so that each individual IO operation doesn't try to
re-acquire the bus.

The other IO done by the driver happens at attach or intr_config_hooks time,
when there can't be multiple threads running with the same device instance.
So, the IIC_RECURSIVE flag can be safely ORed into the wait flags for all IO
done by the driver, because it's all either done in a single-threaded
environment, or protected within a block bounded by explict
iicbus_acquire_bus() and iicbus_release_bus() calls.
2018-01-24 03:09:56 +00:00
ian
ce92c66763 Follow changes in r328307 by using new IIC_RECURSIVE flag.
The driver now ensures only one thread at a time is running in the API
functions (clock_gettime() and clock_settime()) by specifically requesting
ownership of the i2c bus without using IIC_RECURSIVE, then it does all IO
using IIC_RECURSIVE so that each individual IO operation doesn't try to
re-acquire the bus.

The other IO done by the driver happens at attach or intr_config_hooks time,
when there can't be multiple threads running with the same device instance.
So, the IIC_RECURSIVE flag can be safely ORed into the wait flags for all IO
done by the driver, because it's all either done in a single-threaded
environment, or protected within a block bounded by explict
iicbus_acquire_bus() and iicbus_release_bus() calls.
2018-01-24 03:09:41 +00:00
ian
010441d9d3 Fix a bug introduced with recursive bus ownership support in r321584.
The recursive ownership support added in r321584 was unconditionally in
effect all the time -- whenever a given i2c slave device instance tried to
lock the i2c bus for exclusive use when it already owned the bus, the call
returned immediately without waiting.  However, many i2c slave drivers use
bus ownership to enforce that only a single thread at a time can be using
the slave device.  The recursive locking changes broke this use case.

Now there is a new flag, IIC_RECURSIVE, which can be mixed in with the
other flags passed to iicbus_acquire_bus() to allow drivers to indicate
when recursive locking is desired.  Using the flag implies that the driver
is managing concurrent access to the device by different threads in some way.

This immediately fixes all existing i2c slave drivers except for the two
i2c RTC drivers which use the recursive locking feature; those will be
fixed in a followup commit.
2018-01-23 23:30:19 +00:00
ian
b766c8cc68 Switch to using the bcd_clocktime conversion functions that validate the BCD
data without panicking, and have common code for handling AM/PM mode.
2018-01-23 21:36:26 +00:00
ian
1479ef0fb1 Switch to using the bcd_clocktime conversion functions that validate the BCD
data without panicking, and have common code for handling AM/PM mode.
2018-01-23 21:31:43 +00:00
ian
64f4ab4c2b Switch to using the bcd_clocktime conversion functinos that validate the BCD
data without panicking, and have common code for handling AM/PM mode.
2018-01-23 21:18:15 +00:00
emaste
afaeca80d6 copyright.h: Update license text to 'THE AUTHOR'
This matches the license text at
https://www.freebsd.org/copyright/freebsd-license.html

Sponsored by:	The FreeBSD Foundation
2018-01-23 20:38:03 +00:00
emaste
ae11f64597 Use BSD-2-Clause-FreeBSD license on linux_support.s
These files previously had a 3-clause license and 'THE REGENTS' text.
Switch to standard 2-clause text with kib's approval, and add the SPDX
tag.

Approved by:	kib
2018-01-23 20:35:43 +00:00
asomers
c9a63ac910 sys/netinet6: fix typos in comments. No functional change.
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2018-01-23 19:40:05 +00:00
pfg
341f8038fe extfs: Remove unused variables.
Found by:	scan-build
Reviewed by:	fsu
Differential Revision:	https://reviews.freebsd.org/D14017
2018-01-23 14:17:04 +00:00
wma
46822a0e45 PowerNV: send MSI_EOI always after MSI unmask
MSI/MSI-x interrupts are edge-triggered. If an interrupt
arrives when IRQ line is masked, it will be lost and will
never recover. Perform MSI_EOI always after unmask to give
a chance for PHB/XICS to send an interrupt again if MSI/MSI-x
pending bit is set in MSI/MSI-x BAR space.

Submitted by:          Wojciech Macek <wma@semihalf.org>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-23 08:07:00 +00:00
rstone
9c794ac899 Increment the route table gen count after a modify
Increment the route table generation count after modifying a
route.  This signals back to TCP connections that they need to
update their L2 caches as the gateway for their route may have
changed.  This is a heavier hammer than is needed, strictly
speaking, but route changes will be unlikely enough that the
performance effects of invalidating all connection route caches
should be negligible.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13990
Reviewed by:	karels
2018-01-23 03:15:44 +00:00
rstone
c0c5474ab0 Reduce code duplication for inpcb route caching
Add a new macro to clear both the L3 and L2 route caches, to
hopefully prevent future instances where only the L3 cache was
cleared when both should have been.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13989
Reviewed by:	karels
2018-01-23 03:15:39 +00:00
rstone
4fb0175a26 Invalidate inpcb LLE cache if cached route is invalidated
When the inpcb route cache is invalidated after a change to the
routing tables, we need to invalidate the LLE cache as well.
Previous to this change packets for the connection would continue
to use the old L2 information from the old L3 gateway, and the
packets for the connection would likely be blackholed.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13988
Reviewed by:	karels
2018-01-23 03:15:39 +00:00
jhibbits
08a5b74395 Fix 64-bit booke kernel builds after the ldscript changes
Commits r326203 and r326978 broke 64-bit booke kernels by introducing a 1MB
zero-pad between the ELF header and the start of the kernel.  This didn't
cause a build failure, but caused kernels to need to be loaded into memory
1MB lower, which could easily break scripts expecting previous behavior.
This change matches the similar change made to AIM in r327358.
2018-01-23 02:52:12 +00:00
erj
0475b09ef6 ixv(4): Stop setting editing ifnet flags in ixv_if_init()
In iflib, the device-specific init() function isn't supposed to edit
the struct ifnet driver flags. If it does, it'll cause an MPASS() assert
in iflib to fail.

PR:		225312
Reported by:	bhughes@
2018-01-22 20:56:21 +00:00
kib
027c7f4d66 Fix compat32 for sysctl net.PF_ROUTE...NET_RT_IFLISTL.
Route messages are aligned to the host long type alignment, which
breaks 32bit.

Reported and tested by:	lwhsu
Diagnosed by:	Yuri Pankov <yuripv@icloud.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-01-22 20:49:17 +00:00
imp
f300c8e371 This comment is bogus. This is a legit release.
Reviewed by: scottl@, ken@
Sponsored by: Netflix
2018-01-22 17:47:49 +00:00
pfg
1deb03ba9c drm2: Basic use of mallocarray(9).
These functions deal the same type of overflows we do with mallocarray(9).
Using our mallocarray will panic, which different from the previous
behavior (returning NULL), but neither behavior is more correct.

As a sidenote, drm_calloc_large() is not currently used at all.

Reviewed by:	dumbbell
Differential Revision:	https://reviews.freebsd.org/D13835
2018-01-22 15:55:51 +00:00
phk
0833ef388c Forgot to add the skeleton BCM283x Clock Manager
Reminded by:	lwhsu
2018-01-22 08:33:59 +00:00
phk
4acf4a5e01 Add skeleton manual page for bcm283x_pwm
(Feel free to improve this)
2018-01-22 07:43:54 +00:00
phk
c80c647d0e Forgot to edit copy&pasted copyright blurb. 2018-01-22 07:15:24 +00:00
phk
15663d7fc1 Add a skeleton Clock Manager for RPi2/3, and use that from pwm
instead of frobbing the registers directly.

As a hack the bcm2835_pwm kmod presently ignores the 'status="disabled"'
in the RPI3 DTB, assuming that if you load the kld you probably
want the PWM to work.
2018-01-22 07:10:30 +00:00
mav
47ae44b999 MFV r328253: 8835 Speculative prefetch in ZFS not working for misaligned reads
illumos/illumos-gate@5cb8d943bc

https://www.illumos.org/issues/8835:
Sequential reads not aligned to block size are not detected by ZFS
prefetcher as sequential, killing prefetch and severely hurting
performance.  It is caused by dmu_zfetch() in case of misaligned
sequential accesses being called with overlap of one block.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Allan Jude <allanjude@freebsd.org>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Alexander Motin <mav@FreeBSD.org>
2018-01-22 05:57:14 +00:00
mav
2dd60f22d7 MFV r328251: 8652 Tautological comparisons with ZPROP_INVAL
illumos/illumos-gate@4ae5f5f06c

https://www.illumos.org/issues/8652:
Clang and GCC prefer to use unsigned ints to store enums. With Clang, that
causes tautological comparison warnings when comparing a zfs_prop_t or
zpool_prop_t variable to the macro ZPROP_INVAL. It's likely that error
handling code is being silently removed as a result.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Alan Somers <asomers@gmail.com>
2018-01-22 05:52:39 +00:00
mav
27fedeb8ad MFV r328247: 8959 Add notifications when a scrub is paused or resumed
illumos/illumos-gate@301fd1d6f2

Reviewed by: Alek Pinchuk <pinchuk.alek@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Sean Eric Fagan <sef@ixsystems.com>
2018-01-22 04:31:48 +00:00
mav
84b8a477fb MFV r328245: 8856 arc_cksum_is_equal() doesn't take into account ABD-logic
illumos/illumos-gate@01a059ee0c

https://www.illumos.org/issues/8856:
arc_cksum_is_equal() calls zio_push_transform() that requires abd_t*
(second arg), but a void* is passed.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Roman Strashkin <roman.strashkin@nexenta.com>
2018-01-22 04:23:48 +00:00
pfg
6f66677652 Forgot to sort here in r328238. 2018-01-22 02:26:10 +00:00
pfg
f0c6025eb6 Unsign some values related to allocation.
When allocating memory through malloc(9), we always expect the amount of
memory requested to be unsigned as a negative value would either stand for
an error or an overflow.
Unsign some values, found when considering the use of mallocarray(9), to
avoid unnecessary casting. Also consider that indexes should be of
at least the same size/type as the upper limit they pretend to index.

MFC after:	3 weeks
2018-01-22 02:08:10 +00:00
pfg
35335c1009 Use the __alloc_size2 attribute where relevant.
This follows the documented use in GCC. It is basically only relevant for
calloc(3), reallocarray(3) and  mallocarray(9).

Suggested by:	Mark Millard

Reference:
https://docs.freebsd.org/cgi/mid.cgi?9DE674C6-EAA3-4E8A-906F-446E74D82FC4
2018-01-22 01:50:10 +00:00
mav
428df4ba9a MFV r328229:
8930 zfs_zinactive: do not remove the node if the filesystem is readonly

illumos/illumos-gate@93c618e0f4

https://www.illumos.org/issues/8930:
We normally remove an unlinked node when its last user goes away and the
node becomes inactive. However, we should not do that if the filesystem
is mounted read-only including the case where it has its readonly
property set. The node will remain on the unlinked queue, so it will
not be leaked.

One particular scenario is when we receive an incremental stream into a
mounted read-only filesystem and that stream contains an unlinked file
(still on the unlinked queue). If that file is opened before the
receive and some time later after the receive it becomes inactive we
would remove it and, thus, modify the read-only filesystem. As a
result, the filesystem would diverge from its source and further
incremental receives would not be possible (without forcing a rollback).

Another related scenario, that may or may not be possible depending on an
OS / VFS policy, is when an open file is unlinked, then the filesystem is
remounted read-only, and then the file is closed.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Andriy Gapon <avg@FreeBSD.org>
2018-01-21 23:49:17 +00:00
mav
c8d77253f9 MFV r328227: 8909 8585 can cause a use-after-free kernel panic
illumos/illumos-gate@94ddd0900a

https://www.illumos.org/issues/8909:
There's a race condition that exists if `zil_free_lwb` races with either
`zil_commit_waiter_timeout` and/or `zil_lwb_flush_vdevs_done`.

Here's an example panic due to this bug:

> ::status
    debugging crash dump vmcore.0 (64-bit) from ip-10-110-205-40
    operating system: 5.11 dlpx-5.2.2.0_2017-12-04-17-28-32b6ba51fb (i86pc)
    image uuid: 4af0edfb-e58e-6ed8-cafc-d3e9167c7513
    panic message:
    BAD TRAP: type=e (#pf Page fault) rp=ffffff0010555970 addr=60 occurred in mo
dule "zfs" due to a NULL pointer dereference
    dump content: kernel pages only

> $c
    zio_shrink+0x12()
    zil_lwb_write_issue+0x30d(ffffff03dcd15cc0, ffffff03e0730e20)
    zil_commit_waiter_timeout+0xa2(ffffff03dcd15cc0, ffffff03d97ffcf8)
    zil_commit_waiter+0xf3(ffffff03dcd15cc0, ffffff03d97ffcf8)
    zil_commit+0x80(ffffff03dcd15cc0, 9a9)
    zfs_write+0xc34(ffffff03dc38b140, ffffff0010555e60, 40, ffffff03e00fb758, 0)
    fop_write+0x5b(ffffff03dc38b140, ffffff0010555e60, 40, ffffff03e00fb758, 0)
    write+0x250(42, fffffd7ff4832000, 2000)
    sys_syscall+0x177()

If there's an outstanding lwb that's in `zil_commit_waiter_timeout`
waiting to timeout, waiting on it's waiter's CV, we must be sure not to
call `zil_free_lwb`. If we end up calling `zil_free_lwb`, then that LWB
may be freed and can result in a use-after-free situation where the
stale lwb pointer stored in the `zil_commit_waiter_t` structure of the
thread waiting on the waiter's CV is used.

A similar situation can occur if an lwb is issued to disk, and thus in
the `LWB_STATE_ISSUED` state, and `zil_free_lwb` is called while the
disk is servicing that lwb. In this situation, the lwb will be freed by
`zil_free_lwb`, which will result in a use-after-free situation when the
lwb's zio completes, and `zil_lwb_flush_vdevs_done` is called.

This race condition is prevented in `zil_close` by calling `zil_commit`
before `zil_free_lwb` is called, which will ensure all outstanding (i.e.
all lwb's in the `LWB_STATE_OPEN` and/or `LWB_STATE_ISSUED` states)
reach the `LWB_STATE_DONE` state before the lwb's are freed
(`zil_commit` will not return untill all the lwb's are
`LWB_STATE_DONE`).

Further, this race condition is prevented in `zil_sync` by only calling
`zil_free_lwb` for lwb's that do not have their `lwb_buf` pointer set.
All lwb's not in the `LWB_STATE_DONE` state will have a non-null value
for this pointer; the pointer is only cleared in
`zil_lwb_flush_vdevs_done`, at which point the lwb's state will be
changed to `LWB_STATE_DONE`.

This race is present in `zil_suspend`, leading to this bug.

At first glance, it would appear as though this would not be true
because `zil_suspend` will call `zil_commit`, just like `zil_close`, but
the problem is that `zil_suspend` will set the zilog's `zl_suspend`
field prior to calling `zil_commit`. Further, in `zil_commit`, if
`zl_suspend` is set, `zil_commit` will take a special branch of logic
and use `txg_wait_synced` instead of performing the normal `zil_commit`
logic.

This call to `txg_wait_synced` might be good enough for the data to
reach disk safely before it returns, but it does not ensure that all
outstanding lwb's reach the `LWB_STATE_DONE` state before it returns.
This is because, if there's an lwb "stuck" in
`zil_commit_waiter_timeout`, waiting for it's lwb to timeout, it will
maintain a non-null value for it's `lwb_buf` field and thus `zil_sync`
will not free that lwb. Thus, even though the lwb's data is already on
disk, the lwb will be left lingering, waiting on the CV, and will
eventually timeout and be issued to disk even though the write is
unnesseary.

So, after `zil_commit` is called from `zil_suspend`, we incorrectly
assume that there are not outstanding lwb's, and proceed to free all
lwb's found on the zilog's lwb list. As a result, we free the lwb that
will later be used `zil_commit_waiter_timeout`.

Reviewed by: John Kennedy <jwk404@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Prakash Surya <prakash.surya@delphix.com>
2018-01-21 23:18:42 +00:00
mav
46f172e5a8 MFV r328225: 8603 rename zilog's "zl_writer_lock" to "zl_issuer_lock"
illumos/illumos-gate@cf07d3da99

https://www.illumos.org/issues/8603:
  To help make the ZIL's code more understandable, it was suggested that
  the zilog_t's "zl_writer_lock" field should be renamed to "zl_issuer_lock".

Reviewed by: C Fraire <cfraire@me.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Prakash Surya <prakash.surya@delphix.com>
2018-01-21 23:11:20 +00:00
mav
2700f9ece1 MFV r328220: 8677 Open-Context Channel Programs
illumos/illumos-gate@a3b2868063

https://www.illumos.org/issues/8677
  We want to be able to run channel programs outside of synching context.
  This would greatly improve performance of channel program that just gather
  information, as we won't have to wait for synching context anymore.

  This feature should introduce the following:
  - A new command line flag in "zfs program" to specify our intention to
  run in open context.
  - A new flag/option within the channel program ioctl which selects the
  context.
  - Appropriate error handling whenever we try a channel program in
  open-context that contains zfs.sync* expressions.
  - Documentation for the new feature in the manual pages.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
2018-01-21 23:02:05 +00:00
phk
0f399435e7 Rename rpi_pwm to bcm283x_pwm, and build it on armv[67] and arm64.
Truncate ratio if period is lowered.

Tested on Rpi2 and Rpi3.

Rpi3 requires DTB->DTS->edit->DTB hack
2018-01-21 21:27:41 +00:00
pfg
b48800bac3 Define a new __alloc_size2 attribute to complement the exiting support.
At least on GCC7 calling __alloc_size(x) twice is not equivalent to
calling using the attribute once with two arguments. The later is the
documented use in GCC documentation so add a new alloc_size(n, x)
alternative to cover for the few places where it is used: basically:
calloc(3), reallocarray(3) and  mallocarray(9).

Submitted by:	Mark Millard
MFC after:	3 days
Reference:
http://docs.freebsd.org/cgi/mid.cgi?F227842D-6BE2-4680-82E7-07906AF61CD7
2018-01-21 20:27:47 +00:00
trasz
45726aab09 Add missing manufacturer/serial number string descriptors.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2018-01-21 17:31:31 +00:00
pfg
ced875130d Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
avg
ed9760cef2 zfs: no need to check that size of zfs_cmd_t is not greater than IOCPARM_MAX
Nowadays we do not pass zfs_cmd_t directly through the ioctl interface.
Instead a small zfs_iocparm_t object is passed and the command is
explicitly copied in and out.  So, the check has become irrelevant.

MFC after:	3 weeks
Sponsored by:	Panzura
2018-01-21 11:19:18 +00:00
dumbbell
c94a7b65c2 psm: Log syncmask[1], not syncmask[0] twice
MFC after:	1 week
2018-01-20 19:04:21 +00:00
kib
db37b511ae Use correct symbol name in r328202.
Sponsored by:	The FreeBSD Foundation
MFC after:	11 days
2018-01-20 18:05:14 +00:00
kib
14962b8ee9 Use predefined symbol for the CR3.PCID mask.
Sponsored by:	The FreeBSD Foundation
MFC after:	11 days
2018-01-20 17:46:09 +00:00
mmel
bcd3d3b8e5 Convert extres/phy to kobj model.
Similarly as other extres pseudo-drivers, implement phy by using kobj model.
This detaches it from provider device, so single device driver can export
multiple different phys. Additionally, this  allows phy to be subclassed to
more specialized drivers, like is USB OTG phy, or PCIe phy with hot-plug
capability.

Tested by:	manu (previous version, on Allwinner board)
MFC after:	1 month
2018-01-20 17:02:17 +00:00
royger
774ffaf6a8 xen: fix IDT setup after PTI
On amd64 the IDT handler was not set correctly when using PTI.

While there also fix the selectors to SEL_KPL.

Obtained from:	kib
MFC with:	r328083
2018-01-20 14:59:37 +00:00
manu
9f54e35383 clk: Get new parent freq after set_freq
During set_freq a clknode might have reparent (using a better parent that
have a higher frequency for example), before refreshing the cache, re-get
the parent frequency.

Reviewed by:	mmel
2018-01-20 14:47:27 +00:00
trasz
3510b11ee9 Remove unused index.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2018-01-20 14:05:55 +00:00
trasz
addb3664f7 Add missing SPDX tags; the rest of the license text is the same as in other
USB templates.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2018-01-20 14:03:55 +00:00
trasz
0914656890 Add usb_template(4) to RPI-B kernel config. This is to support the USB OTG
functionality on Raspberry Pi 0.

Reviewed by:	hselasky@
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D13924
2018-01-20 14:00:07 +00:00
trasz
f3e0d3d831 Add sysctls to control device side USB identifiers. This makes it
possible to change string and numeric vendor and product identifiers,
as well as anything else there might be to change for a particular
device side template, eg the MAC address.

Reviewed by:	hselasky@
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D13920
2018-01-20 13:58:34 +00:00
kib
93e18e3197 Assign map->header values to avoid boundary checks.
In several places, entry start and end field are checked, after
excluding the possibility that the entry is map->header.  By assigning
max and min values to the start and end fields of map->header in
vm_map_init, the explicit map->header checks become unnecessary.

Submitted by:	Doug Moore <dougm@rice.edu>
Reviewed by:	alc, kib, markj (previous version)
Tested by:	pho (previous version)
MFC after:	1 week
Differential Revision:  https://reviews.freebsd.org/D13735
2018-01-20 12:19:02 +00:00
dumbbell
60b7ee2b97 psm: Don't try to detect trackpoint packets if the Elantech device has none
This fixes a panic when `EVDEV_SUPPORT` is enabled: if a trackpoint
packet was detected but there was no trackpoint, we still tried to emit an
evdev event even though the associated relative evdev device (`evdev_r`)
was not initialized.

PR:		225339
MFC after:	1 week
2018-01-20 11:21:22 +00:00
dumbbell
77f731af0b psm: Skip sync check when PSM_CONFIG_NOCHECKSYNC is set
In psmprobe(), we set the initial `syncmask` to the vendor default value
if the `PSM_CONFIG_NOCHECKSYNC` bit is unset. However, we currently only
set it for the Elantech touchpad later in psmattach(), thus `syncmask`
is always configured.

Now, we check `PSM_CONFIG_NOCHECKSYNC` and skip sync check if it is set.
This fixes Elantech touchpad support for units which have `hascrc` set.

To clarify that, when we log the `syncmask` and `syncbits` fields, also
mention if they are actually used.

Finally, when we set `PSM_CONFIG_NOCHECKSYNC`, clear `PSM_NEED_SYNCBITS`
flag.

PR:		225338
MFC after:	1 week
2018-01-20 11:02:18 +00:00
landonf
9fd2ae12db bhnd_chipc(4): Fix leak of child device ivars by explicitly deleting
any children prior to detach.

With the newbus child deletion ordering changes introduced in r307518,
parent devices are now detached (and their driver set to NULL) prior to
detaching and deleting child devices; child-related bus methods (e.g.
BUS_CHILD_DETACHED, BUS_CHILD_DELETED) are no longer be dispatched to the
parent device driver after it returns 0 (success) from DEVICE_DETACH.

Sponsored by:   The FreeBSD Foundation
2018-01-20 01:55:34 +00:00
landonf
e23dd6b815 bhnd/bwn(4): Define a bhnd(4) softmodem device class for the v.90 modem
codec core, and mark the core as unpopulated on all BCM4306 bwn(4) devices.

Sponsored by:	The FreeBSD Foundation
2018-01-19 22:43:08 +00:00
landonf
9987a17348 bwn(4): Add missing BCM4306 PCI IDs.
Sponsored by:	The FreeBSD Foundation
2018-01-19 22:37:48 +00:00
landonf
ef589a6b08 bwn(4): Fix DMA translation lookup on devices limited to 30-bit host
addressing. The host addressing constraint does not apply to device address
space, and shouldn't be passed to bhnd_get_dma_translation() as the
maximum supported device address width.

Sponsored by:	The FreeBSD Foundation
2018-01-19 22:33:25 +00:00
landonf
ad7d50eb63 bhndb_pci(4): Implement bridge support for CardBus-attached devices.
- Extend the probe method to accept devclasses that inherit from the pci
   devclass (e.g. cardbus).
 - Some BCM4306-based CardBus adapters appear to advertise 4K SPROM, but
   only the first 2K is mapped into BAR0. We can safely assume that the
   SPROM data fits within the first 2K of the SPROM, rather than rejecting
   the SPROM mapping as invalid.

Sponsored by:	The FreeBSD Foundation
2018-01-19 22:22:02 +00:00
nwhitehorn
8e759d45c7 On AIM systems without a software-managed SLB, such as POWER9 systems using
either hardware segment tables or radix-tree-based page tables, do not try
to install SLB entries at trap boundaries.
2018-01-19 22:19:50 +00:00
nwhitehorn
49a1f46412 Define PHYS_TO_DMAP() and DMAP_TO_PHYS() as panics on the architectures
(i386 and arm) that never implement them. This allows the removal of
#ifdef PHYS_TO_DMAP on code otherwise protected by a runtime check on
PMAP_HAS_DMAP. It also fixes the build on ARM and i386 after I forgot an
#ifdef in r328168.

Reported by:	Milan Obuch
Pointy hat to:	me
2018-01-19 22:17:13 +00:00
kib
b4c82b3b07 PTI: Trap if we returned to userspace with kernel (full) page table
still active.

Map userspace portion of VA in the PTI kernel-mode page table as
non-executable. This way, if we ever miss reloading ucr3 into %cr3 on
the return to usermode, the process traps instead of executing in
potentially vulnerable setup.  Catch the condition of such trap and
verify user-mode %cr3, which is saved by page fault handler.

I peek this trick in some article about Linux implementation.

Reviewed by:	alc, markj (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	12 days
DIfferential revision:	https://reviews.freebsd.org/D13956
2018-01-19 22:10:29 +00:00
landonf
cfbe0c6679 bhnd(4): fix a few bugs in pwrctl/fixed-clock device support.
- Do not panic on siba(4) detach when the bhnd(4) bus calls
   bhnd_get_pmu_info() on a PMU-less device.
 - Fix bhnd_pwrctl attach/detach on fixed-clock devices:
    - Treat bhnd_pwrctl_updateclk() as a no-op on fixed-clock devices.
    - Use bhnd_pwrctl_updateclk() to perform the appropriate clock
      transition on detach.

Sponsored by:	The FreeBSD Foundation
2018-01-19 21:58:48 +00:00
landonf
80792d29e7 bhnd_chipc(4): Fix the assignment of non-wildcard child unit numbers
introduced in r326102 and r326109; all chipc children should be added with
a wildcard unit (-1).

Sponsored by:	The FreeBSD Foundation
2018-01-19 21:36:28 +00:00
scottl
5da4f0f640 Fix compile errors in r328165
Reported by:	O. Hartmann
Sponsored by:	Netflix
2018-01-19 19:18:14 +00:00
nwhitehorn
e79f2b9178 Remove SFBUF_OPTIONAL_DIRECT_MAP and such hacks, replacing them across the
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.

As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.

Reviewed by:		kib
Suggestions from:	marius, alc, kib
Runtime tested on:	amd64, powerpc64, powerpc, mips64
2018-01-19 17:46:31 +00:00
emaste
1cf1c6c06d Enable KPTI by default on amd64 for non-AMD CPUs
Kernel Page Table Isolation (KPTI) was introduced in r328083 as a
mitigation for the 'Meltdown' vulnerability.  AMD CPUs are not affected,
per https://www.amd.com/en/corporate/speculative-execution:

    We believe AMD processors are not susceptible due to our use of
    privilege level protections within paging architecture and no
    mitigation is required.

Thus default KPTI to off for AMD CPUs, and to on for others.  This may
be refined later as we obtain more specific information on the sets of
CPUs that are and are not affected.

Submitted by:	Mitchell Horne
Reviewed by:	cem
Relnotes:	Yes
Security:	CVE-2017-5754
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D13971
2018-01-19 15:42:34 +00:00
scottl
01b92c372f Revert ABI breakage to CAM that came in with MMC/SD support in r320844.
Make it possible to retrieve mmc parameters via the XPT_GET_ADVINFO
call instead.  Convert camcontrol to the new scheme.

Reviewed by:	imp. kibab
Sponsored by:	Netflix
Differential Revision:	D13868
2018-01-19 15:32:27 +00:00
pfg
293a141389 libnv: Use mallocarray(9) for the nv_calloc. 2018-01-19 14:50:53 +00:00
hselasky
a2393284a2 Add new USB ID to U3G driver.
PR:		134299
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-01-19 13:06:36 +00:00
hselasky
629ac703ce Improve support for USB based 3G/4G/5G dongles from Huawei.
PR:		192345
Sponsored by:	Mellanox Technologies
2018-01-19 12:59:14 +00:00
ae
77c1143027 Add UDPLite support to ipfw(4).
Now it is possible to use UDPLite's port numbers in rules,
create dynamic states for UDPLite packets and see "UDPLite" for matched
packets in log.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2018-01-19 12:50:03 +00:00
cem
036832f8e3 Unbreak i386 build
The logical result of a right shift >= the width of a type is zero, but our
compiler decides this is a warning (and thus, error).  Just remove ccp(4)
from i386.

Reported by:	cy
Sponsored by:	Dell EMC Isilon
2018-01-19 04:34:06 +00:00
jhb
e2ed91ad09 Use a dedicated per-CPU stack for machine check exceptions.
Similar to NMIs, machine check exceptions can fire at any time and are
not masked by IF.  This means that machine checks can fire when the
kstack is too deep to hold a trap frame, or at critical sections in
trap handlers when a user %gs is used with a kernel %cs.  Use the same
strategy used for NMIs of using a dedicated per-CPU stack configured
in IST 3.  Store the CPU's pcpu pointer at the stop of the stack so
that the machine check handler can reliably find the proper value for
%gs (also borrowed from NMIs).

This should also fix a similar issue with PTI with a MC# occurring
while the CPU is executing on the trampoline stack.

While here, bypass trap() entirely and just call mca_intr().  This
avoids a bogus call to kdb_reenter() (there's no reason to try to
reenter kdb if a MC# is raised).

Reviewed by:	kib
Tested by:	avg (on AMD without PTI)
Differential Revision:	https://reviews.freebsd.org/D13962
2018-01-18 23:50:21 +00:00
jhb
efbbc35271 Remove two no-longer-used labels from the NMI interrupt handler.
Reviewed by:	kib
2018-01-18 22:13:53 +00:00
cem
d67e92fd24 Add ccp(4): experimental driver for AMD Crypto Co-Processor
* Registers TRNG source for random(4)
* Finds available queues, LSBs; allocates static objects
* Allocates a shared MSI-X for all queues.  The hardware does not have
  separate interrupts per queue.  Working interrupt mode driver.
* Computes SHA hashes, HMAC.  Passes cryptotest.py, cryptocheck tests.
* Does AES-CBC, CTR mode, and XTS.  cryptotest.py and cryptocheck pass.
* Support for "authenc" (AES + HMAC).  (SHA1 seems to result in
  "unaligned" cleartext inputs from cryptocheck -- which the engine
  cannot handle.  SHA2 seems to work fine.)
* GCM passes for block-multiple AAD, input lengths

Largely based on ccr(4), part of cxgbe(4).

Rough performance averages on AMD Ryzen 1950X (4kB buffer):
aesni:      SHA1: ~8300 Mb/s    SHA256: ~8000 Mb/s
ccp:               ~630 Mb/s    SHA256:  ~660 Mb/s  SHA512:  ~700 Mb/s
cryptosoft:       ~1800 Mb/s    SHA256: ~1800 Mb/s  SHA512: ~2700 Mb/s

As you can see, performance is poor in comparison to aesni(4) and even
cryptosoft (due to high setup cost).  At a larger buffer size (128kB),
throughput is a little better (but still worse than aesni(4)):

aesni:      SHA1:~10400 Mb/s    SHA256: ~9950 Mb/s
ccp:              ~2200 Mb/s    SHA256: ~2600 Mb/s  SHA512: ~3800 Mb/s
cryptosoft:       ~1750 Mb/s    SHA256: ~1800 Mb/s  SHA512: ~2700 Mb/s

AES performance has a similar story:

aesni:      4kB: ~11250 Mb/s    128kB: ~11250 Mb/s
ccp:               ~350 Mb/s    128kB:  ~4600 Mb/s
cryptosoft:       ~1750 Mb/s    128kB:  ~1700 Mb/s

This driver is EXPERIMENTAL.  You should verify cryptographic results on
typical and corner case inputs from your application against a known- good
implementation.

Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12723
2018-01-18 22:01:30 +00:00
cem
f629be4b46 Add Elf_Nhdr definition to match NetBSD, OpenBSD, Linux
The mesa port started to use this type and fails to build without it.

NetBSD: http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/sys/exec_elf.h.diff?r1=1.26&r2=1.27&f=h
OpenBSD: http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/sys/exec_elf.h.diff?r1=1.21&r2=1.22&f=h

PR:		225302
Reported by:	Greg V <greg AT unrelenting.technology>
Sponsored by:	Dell EMC Isilon
2018-01-18 21:19:57 +00:00
jhb
8fb0491b83 Adjust branch target in NMI handler for the !PTI case.
In the !PTI case the NMI handler jumped past the instructions that set
%rdi to point to the current PCB, but the target instructions assumed %rdi
were set.

Reviewed by:	kib
Tested by:	pho
2018-01-18 20:12:12 +00:00
jhb
286c310205 Update various statements in vmstat(8) to match reality.
- The process stats are actually thread counts rather than process
  counts.
- Simplify various descriptions to remove mention of stats that are
  updated every 5 seconds (all VM related stats are now "instant",
  only the load average is updated every 5 seconds).
- Don't make any mention of special treatment for processes that have
  been active in the last 20 seconds.  We don't track that stat.
- Rework the description of active virtual memory.  Call it mapped
  virtual memory and explicitly point out it is not the same as the
  active page queue (which corresponds to "Active" in top(1)), and
  also hint at the possible bogusness of the value (e.g. if a process
  maps a single page out of a multiple GB file, the entire file's size
  is considered mapped).
- Simplify a few descriptions that implied their output was a value
  per interval.  All of the "rate" values are per-second rates scaled
  across the interval.
- Update a few comments for 'struct vmtotal' along similar lines.

Reported by:	mwlucas (indirectly)
Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D13905
2018-01-18 19:43:02 +00:00
br
afe48fcda0 UART Clock Selection Register holds a divider value for a supplied clock,
not a final baud rate. The value for this register has to be calculated.

Sponsored by:	DARPA, AFRL
2018-01-18 18:19:31 +00:00
br
317d4fe843 Support for UART device found in Qualcomm Snapdragon 410E SoC.
Tested on DragonBoard 410c.

Reviewed by:	andrew
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D13972
2018-01-18 17:43:32 +00:00
br
8673b42398 Set the base address of translation table 0.
This fixes operation on Qualcomm Snapdragon and some other platforms.

During boot time on subsystems initialization we have some amount of
kernel threads created, then scheduler gives CPU time to each thread.
Eventually scheduler returns CPU execution back to thread 0. In this
case writing zero to ttbr0 in cpu_switch leads Qualcomm board to
reboot (asynchronously, CPU continues execution).

Similar to other kernel threads install a valid physical address
(kernel pmap) to user page table base register ttbr0.

Reviewed by:	andrew
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D13536
2018-01-18 16:20:09 +00:00
manu
77258f226e nfs: Do not printf each time a lock structure is freed during module unload
There can be a lot of those structures and printing a line each time we free
one on module unload.

MFC after:	3 days
2018-01-18 15:28:49 +00:00
kib
af04296ad4 Move the kernphys declaration to machine/md_var.h.
Apparently machinde/cpu.h is supposed to contain MD implementations of
MI interfaces.  Also, remove kernphys declaration from machdep.c,
since it is already provided by md_var.h.

Requested and reviewed by:	bde
MFC after:	13 days
2018-01-18 15:15:35 +00:00
avg
d57a913b89 correct read-ahead calculations in vfs_bio_getpages
Previously the calculations were done as if the requested region
ended at the start of the last requested page, not its end.
The problem as actually quite minor as it affected only stats and
page prefaulting, not the actual page data, and only with specific
parameters.

Reviewed by:	kib (previous version)
MFC after:	2 weeks
2018-01-18 12:59:04 +00:00
kib
10c1564cbe Fix compilation with gcc.
etext is already declared in machine/cpu.h, move kernphys declaration
there too.

Based on the patch by:	bde
MFC after:	13 days
2018-01-18 11:21:03 +00:00
kib
911f28f4eb Fix compilation with gas.
Submitted by:	bde
MFC after:	13 days
2018-01-18 11:19:58 +00:00
kib
e24bdf2ac4 Remove the 'last' argument from the pmap_pti_free_page().
It is in fact unused.

Noted and reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
2018-01-18 11:01:41 +00:00
andrew
35d8f24a88 Add a pmap invalidate that doesn't call sched_pin.
When demoting DMAP pages curthread may be pointing to data within the
page we are demoting. Create a new invalidate that doesn't pin and use
it in the demote case.

As the demote has both interrupts disabled, and is within a critical section
this is safe from having the scheduler from switching to another CPU.

Reported by:	loos
Reviewed by:	loos
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D13955
2018-01-18 10:52:31 +00:00
wma
f4720d5de9 Call platform_smp_ap_init before decr_ap_init
In platform_smp_ap_init we are doing some crucial code (eg. set LPCR register)
    which have influence over further execution.

    Practiculary in PowerNV platform we have experienced Data Storage Interrupt
    before we set apropriate LPCR. It caused code execution from location which was
    legal in bootloader (petitboot based on linux) but illegal in FreeBSD
2018-01-18 08:34:20 +00:00
wma
b6338c1a07 PPC64: fix TOC behavior on process initialization
Set stack pointer to correct value after thread's stack pointer restore

Restoring new thread's stack pointer caused stack corruption because
restored stack pointer didn't point to callee (cpu_switch) stack frame but
caller stack frame.

As a result we had mysterious errors in caller function (sched_switch).

Solution: simply set stack pointer to correct value

Also, initialize TOC to a valid pointer once the thread is being
created.

Created by:            Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           nwhitehorn
Differential revision: https://reviews.freebsd.org/D13947
Sponsored by:          QCM Technologies
2018-01-18 07:42:51 +00:00
wma
6362070dc8 PPC: machdep, zero BSS always but BookE
Zero BSS always. The only case when this operation is
ommitted is when booting on BookE.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           imp, nwhitehorn
Differential revision: https://reviews.freebsd.org/D13948
Sponsored by:          QCM Technologies
2018-01-18 07:41:04 +00:00
wma
e93d0fe0cf KDB: restart only CPUs stopped by KDB
There is a case when not all CPUs went online. In that situation,
restart only APs which were operational before entering KDB.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           nwhitehorn
Differential revision: https://reviews.freebsd.org/D13949
Sponsored by:          QCM Technologies
2018-01-18 07:38:54 +00:00
wma
b75d04af72 PPC64: add AHCI back to GENERIC64
> Description of fields to fill in above:                     76 columns --|
> PR:                       If a GNATS PR is affected by the change.
> Submitted by:             If someone else sent in the change.
> Reviewed by:              If someone else reviewed your modification.
> Approved by:              If you needed approval for this commit.
> Obtained from:            If the change is from a third party.
> MFC after:                N [day[s]|week[s]|month[s]].  Request a reminder email.
> MFH:                      Ports tree branch name.  Request approval for merge.
> Relnotes:                 Set to 'yes' for mention in release notes.
> Security:                 Vulnerability reference (one per line) or description.
> Sponsored by:             If the change was sponsored by an organization.
> Differential Revision:    https://reviews.freebsd.org/D### (*full* phabric URL needed).
> Empty fields above will be automatically removed.

M    sys/powerpc/conf/GENERIC64
2018-01-18 06:28:21 +00:00
asomers
2efe3d3999 gnop(8): add the ability to set a nop provider's physical path
While I'm here, expand the existing tests a bit.

MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D13579
2018-01-18 05:57:10 +00:00
kevans
49720cebf3 libfdt: Update to 1.4.6, switch to using libfdt for overlay support
libfdt highlights since 1.4.3:

- fdt_property_placeholder added to create a property without specifying its
value at creation time
- stringlist helper functions added to libfdt
- Improved overlay support
- Various internal cleanup

Also switch stand/fdt over to using libfdt for overlay support with this
update. Our current overlay implementation works only for limited use cases
with overlays generated only by some specific versions of our dtc(1). Swap
it out for the libfdt implementation, which supports any properly generated
overlay being applied to a properly generated base.

This will be followed up fairly soon with an update to dtc(1) in tree to
properly generate overlays.

MFC note: the <stdlib.h> include this update introduces in libfdt_env.h is
apparently not necessary in the context we use this in. It's not immediately
clear to me the motivation for it being introduced, but it came in with
overlay support. I've left it in for the sake of accuracy and because it's
not harmful here on HEAD, but MFC'ing this to stable/11 will require
wrapping the #include in an `#ifndef _STANDALONE` block or else it will
cause build failures.

Tested on:	Banana Pi-M3 (ARMv7)
Tested on:	Pine64 (aarch64)
Tested on:	PowerPC [nwhitehorn]
Reviewed by:	manu, nwhitehorn
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D13893
2018-01-18 04:39:09 +00:00
jhb
391a83c86b Save and restore guest debug registers.
Currently most of the debug registers are not saved and restored
during VM transitions allowing guest and host debug register values to
leak into the opposite context.  One result is that hardware
watchpoints do not work reliably within a guest under VT-x.

Due to differences in SVM and VT-x, slightly different approaches are
used.

For VT-x:

- Enable debug register save/restore for VM entry/exit in the VMCS for
  DR7 and MSR_DEBUGCTL.
- Explicitly save DR0-3,6 of the guest.
- Explicitly save DR0-3,6-7, MSR_DEBUGCTL, and the trap flag from
  %rflags for the host.  Note that because DR6 is "software" managed
  and not stored in the VMCS a kernel debugger which single steps
  through VM entry could corrupt the guest DR6 (since a single step
  trap taken after loading the guest DR6 could alter the DR6
  register).  To avoid this, explicitly disable single-stepping via
  the trace flag before loading the guest DR6.  A determined debugger
  could still defeat this by setting a breakpoint after the guest DR6
  was loaded and then single-stepping.

For SVM:
- Enable debug register caching in the VMCB for DR6/DR7.
- Explicitly save DR0-3 of the guest.
- Explicitly save DR0-3,6-7, and MSR_DEBUGCTL for the host.  Since SVM
  saves the guest DR6 in the VMCB, the race with single-stepping
  described for VT-x does not exist.

For both platforms, expose all of the guest DRx values via --get-drX
and --set-drX flags to bhyvectl.

Discussed with:	avg, grehan
Tested by:	avg (SVM), myself (VT-x)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D13229
2018-01-17 23:11:25 +00:00
jhb
1248834695 Require the SHF_ALLOC flag for program sections from kernel object modules.
ELF object files can contain program sections which are not supposed
to be loaded into memory (e.g. .comment).  Normally the static linker
uses these flags to decide which sections are allocated to loadable
program segments in ELF binaries and shared objects (including kernels
on all architectures and kernel modules on architectures other than
amd64).

Mapping ELF object files (such as amd64 kernel modules) into memory
directly is a bit of a grey area.  ELF object files are intended to be
used as inputs to the static linker.  As a result, there is not a
standardized definition for what the memory layout of an ELF object
should be (none of the section headers have valid virtual memory
addresses for example).

The kernel and loader were not checking the SHF_ALLOC flag but loading
any program sections with certain types such as SHT_PROGBITS.  As a
result, the kernel and loader would load into RAM some sections that
weren't marked with SHF_ALLOC such as .comment that are not loaded
into RAM for kernel modules on other architectures (which are
implemented as ELF shared objects).  Aside from possibly requiring
slightly more RAM to hold a kernel module this does not affect runtime
correctness as the kernel relocates symbols based on the layout it
uses.

Debuggers such as gdb and lldb do not extract symbol tables from a
running process or kernel.  Instead, they replicate the memory layout
of ELF executables and shared objects and use that to construct their
own symbol tables.  For executables and shared objects this works
fine.  For ELF objects the current logic in kgdb (and probably lldb
based on a simple reading) assumes that only sections with SHF_ALLOC
are memory resident when constructing a memory layout.  If the
debugger constructs a different memory layout than the kernel, then it
will compute different addresses for symbols causing symbols in the
debugger to appear to have the wrong values (though the kernel itself
is working fine).  The current port of mdb does not check SHF_ALLOC as
it replicates the kernel's logic in its existing kernel support.

The bfd linker sorts the sections in ELF object files such that all of
the allocated sections (sections with SHF_ALLOCATED) are placed first
followed by unallocated sections.  As a result, when kgdb composed a
memory layout using only the allocated sections, this layout happened
to match the layout used by the kernel and loader.  The lld linker
does not sort the sections in ELF object files and mixed allocated and
unallocated sections.  This resulted in kgdb composing a different
memory layout than the kernel and loader.

We could either patch kgdb (and possibly in the future lldb) to use
custom handling when generating memory layouts for kernel modules that
are ELF objects, or we could change the kernel and loader to check
SHF_ALLOCATED.  I chose the latter as I feel we shouldn't be loading
things into RAM that the module won't use.  This should mostly be a
NOP when linking with bfd but will allow the existing kgdb to work
with amd64 kernel modules linked with lld.

Note that we only require SHF_ALLOC for "program" sections for types
like SHT_PROGBITS and SHT_NOBITS.  Other section types such as symbol
tables, string tables, and relocations must also be loaded and are not
marked with SHF_ALLOC.

Reported by:	np
Reviewed by:	kib, emaste
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D13926
2018-01-17 22:51:59 +00:00
jhb
0aaf564f94 Use long for the last argument to VOP_PATHCONF rather than a register_t.
pathconf(2) and fpathconf(2) both return a long.  The kern_[f]pathconf()
functions now accept a pointer to a long value rather than modifying
td_retval directly.  Instead, the system calls explicitly store the
returned long value in td_retval[0].

Requested by:	bde
Reviewed by:	kib
Sponsored by:	Chelsio Communications
2018-01-17 22:36:58 +00:00
landonf
d49b03f3fe bwn(4): Enable, by default, the opt-in support for bhnd(4) introduced in
r326454.

bwn(4)/bhnd(4) has been tested with most chipsets currently supported by
bwn(4), and this change should be transparent to existing bwn(4) users;
please report any regressions that you do encounter.

To revert to using siba_bwn(4) instead of bhnd(4), place the following
lines in loader.conf(5):

  hw.bwn_pci.preferred="0"

Once we're satisfied that the switch to bhnd(4) has seen sufficient broader
testing, bwn(4) will be migrated to use the native bhnd(9) interface
directly, and support for siba_bwn(4) will be dropped (see D13518).

Sponsored by:	The FreeBSD Foundation
2018-01-17 22:33:19 +00:00
markj
0c26afc4c5 Annotate a couple of changes from r328083.
Reviewed by:	kib
X-MFC with:	r328083
2018-01-17 21:52:12 +00:00
pfg
036ebddf97 ufs: use mallocarray(9).
Basic use of mallocarray to prevent overflows: static analyzers are also
likely to perform additional checks.

Since mallocarray expects unsigned parameters, unsign some
related variables to minimize sign conversions.

Reviewed by:	mckusick
2018-01-17 18:18:33 +00:00
dim
bbd1562a49 Revert r327340, as the workaround for rep prefixes followed by .byte
directives is no longer needed after r328090.
2018-01-17 17:14:19 +00:00
imp
5af8ebb2f8 Move setting of CAM_SIM_QUEUED to before we actually submit it to the
hardware. Setting it after is racy, and we can lose the race on a
heavily loaded system.

Reviewed by: scottl@, gallatin@
Sponsored by: Netflix
2018-01-17 17:08:26 +00:00
fabient
326b96a88c Fix pmcstat exit from kernel introduced by r325275.
pmcstat request for close will generate a close event.
This event will be in turn received by pmcstat to close the file.

Reviewed by:	kib
Tested by:	pho
MFC after:	1 week
Sponsored by: Stormshield
2018-01-17 16:41:22 +00:00
kib
c35d24e497 PTI for amd64.
The implementation of the Kernel Page Table Isolation (KPTI) for
amd64, first version. It provides a workaround for the 'meltdown'
vulnerability.  PTI is turned off by default for now, enable with the
loader tunable vm.pmap.pti=1.

The pmap page table is split into kernel-mode table and user-mode
table. Kernel-mode table is identical to the non-PTI table, while
usermode table is obtained from kernel table by leaving userspace
mappings intact, but only leaving the following parts of the kernel
mapped:

    kernel text (but not modules text)
    PCPU
    GDT/IDT/user LDT/task structures
    IST stacks for NMI and doublefault handlers.

Kernel switches to user page table before returning to usermode, and
restores full kernel page table on the entry. Initial kernel-mode
stack for PTI trampoline is allocated in PCPU, it is only 16
qwords.  Kernel entry trampoline switches page tables. then the
hardware trap frame is copied to the normal kstack, and execution
continues.

IST stacks are kept mapped and no trampoline is needed for
NMI/doublefault, but of course page table switch is performed.

On return to usermode, the trampoline is used again, iret frame is
copied to the trampoline stack, page tables are switched and iretq is
executed.  The case of iretq faulting due to the invalid usermode
context is tricky, since the frame for fault is appended to the
trampoline frame.  Besides copying the fault frame and original
(corrupted) frame to kstack, the fault frame must be patched to make
it look as if the fault occured on the kstack, see the comment in
doret_iret detection code in trap().

Currently kernel pages which are mapped during trampoline operation
are identical for all pmaps.  They are registered using
pmap_pti_add_kva().  Besides initial registrations done during boot,
LDT and non-common TSS segments are registered if user requested their
use.  In principle, they can be installed into kernel page table per
pmap with some work.  Similarly, PCPU can be hidden from userspace
mapping using trampoline PCPU page, but again I do not see much
benefits besides complexity.

PDPE pages for the kernel half of the user page tables are
pre-allocated during boot because we need to know pml4 entries which
are copied to the top-level paging structure page, in advance on a new
pmap creation.  I enforce this to avoid iterating over the all
existing pmaps if a new PDPE page is needed for PTI kernel mappings.
The iteration is a known problematic operation on i386.

The need to flush hidden kernel translations on the switch to user
mode make global tables (PG_G) meaningless and even harming, so PG_G
use is disabled for PTI case.  Our existing use of PCID is
incompatible with PTI and is automatically disabled if PTI is
enabled.  PCID can be forced on only for developer's benefit.

MCE is known to be broken, it requires IST stack to operate completely
correctly even for non-PTI case, and absolutely needs dedicated IST
stack because MCE delivery while trampoline did not switched from PTI
stack is fatal.  The fix is pending.

Reviewed by:	markj (partially)
Tested by:	pho (previous version)
Discussed with:	jeff, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2018-01-17 11:44:21 +00:00
kib
72f9a98571 Amd64 user_ldt_deref() is not used outside sys_machdep.c. Mark it as
static.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-01-17 11:21:03 +00:00
wma
687a4fd55c PPC64: implement missing busdma ops
Add missing little-endian 64-bit read and write. Since there
is no direct ASM opcode for this, perform byte swap if
necessary.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:45:18 +00:00
wma
0ae6682f6b PPC64: fix copyinout ranges
Use current userspace address for segment mapping. Previously,
there was a bug which made the funciton constantly using the userspace
base address which could cause data integrity issues.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:36:48 +00:00
wma
225d71d135 PPC64: add CXGBE and remove AHCI from GENERIC64
Add CXGBE driver which is required for PowerNV system.
Also, remove AHCI which does not work in BigEndian.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:33:16 +00:00
wma
842790678f PowerNV: workaround console on OPAL 5.4
FreeBSD prints text char-by-char, which is not what OPAL
is designed to. Poll events more frequently to avoid buffer
overflow and loosing data.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 08:01:51 +00:00
wma
4b5982d8ab PowerNV: make PowerNV PCIe working on a real hardware
Fixes:
- map all devices to PE0
- use 1:1 TCE mapping
- provide the same TCE mapping for all PEs (not only PE0)
- add TCE reset and alignment (required by OPAL)

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 07:39:11 +00:00
landonf
2789832d3c bhndb_pci(4): fix incorrect BHND_PCI_SRSH_PI workaround
On a SPROM-less device, the PCI(e) bridge core will be initialized with its
power-on-reset defaults; this can leave the SPROM-derived BHND_PCI_SRSH_PI
value pointing to the wrong backplane address. This value is used by the
PCI core when performing address translation between the static register
windows in BAR0 that map the PCI core's register block, and backplane
address space.

Previously, bhndb_pci(4) incorrectly used the potentially invalid static
BAR0 PCI register windows when attempting to correct the BHND_PCI_SRSH_PI
value in the PCI core's SPROM shadow.

Instead, we now read/update BHND_PCI_SRSH_PI by fetching the PCI core's
backplane address from the core enumeration table, and then using a dynamic
register window to explicitly map the PCI core's register block into BAR0.

Sponsored by:	The FreeBSD Foundation
2018-01-17 03:34:26 +00:00
pfg
d32751c6b7 SPDX: finish tagging sys/cam. 2018-01-16 23:19:57 +00:00
ian
f0c14bee67 Remove redundant critical_enter/exit() calls. The block of code delimited
by these calls is now protected by a spin mutex (obscured within the
RTC_LOCK/RTC_UNLOCK macros).

Reported by:	bde@
2018-01-16 23:18:52 +00:00
ian
2fcaa5e746 Move some code around and rename a couple variables; no functional changes.
The static atrtc_set() function was called only from clock_settime(), so
just move its contents entirely into clock_settime() and delete atrtc_set().

Rename the struct bcd_clocktime variables from 'ct' to 'bct'.  I had
originally wanted to emphasize how identical the clocktime and bcd_clocktime
structs were, but things evolved to the point where the structs are not at
all identical anymore, so now emphasizing the difference seems better.
2018-01-16 23:14:12 +00:00
pfg
bf3a218e8b scsi_ch.c: Small cleanups to the comments.
Move the the NetBSD tag near to the related licence. Update it to reflect
better the point where we started diverging.

Use grouping parenthesis for the SPDX tag.

No functional change.
2018-01-16 23:08:25 +00:00
tuexen
b096e231df Fix a bug related to fast retransmissions.
When processing a SACK advancing the cumtsn-ack in fast recovery,
increment the miss-indications for all TSN's reported as missing.

Thanks to Fabian Ising for finding the bug and to Timo Voelker
for provinding a fix.

This fix moves also CMT related initialisation of some variables
to a more appropriate place.

MFC after:	1 week
2018-01-16 21:58:38 +00:00
arichardson
1c21ef5dad Use ln -n instead of -h to allow building the kernel on Linux
Both flags do the same thing but -n is more widely supported.

Reviewed By:	jhb, emaste
Approved By:	jhb (mentor)
Differential Revision: https://reviews.freebsd.org/D13936
2018-01-16 21:43:57 +00:00
jhb
9c89db2019 Split crp_buf into a union.
This adds explicit crp_mbuf and crp_uio pointers of the right type to
replace casts of crp_buf.  This does not sweep through changing existing
code, but new code should use the correct fields instead of casts.

Reviewed by:	kib
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D13927
2018-01-16 19:41:18 +00:00
pfg
0c49a087dc ext2fs: use mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). These
are not likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.
2018-01-16 19:29:32 +00:00
wma
b76b1a3176 PowerNV: XICS support for PowerNV/OPAL
Make XICS to be OPAL-aware.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Sponsored by:          FreeBSD Foundation
2018-01-16 06:24:19 +00:00
pfg
d3692a35af Fix build after r328020.
Should have noticed earlier but the build was already broken by another
change.

Reported by:	Ravi Pokala
2018-01-16 06:04:39 +00:00
jhibbits
7c500b69ee Make fsl_sata driver work on P1022
P1022 SATA controller may set the wrong CCR bit for a command completion.
This would previously cause an interrupt storm.  Solve this by marking all
commands complete, and letting the end_transaction deal with the successes.
Causes no problems on P5020.

While here, fix a minor bug in collision detection.  The Freescale SATA
controller only has 16 slots, not 32.
2018-01-16 04:50:23 +00:00
ian
6ac58f6094 Add static inline rtcin_locked() and rtcout_locked() functions for doing a
related series of operations without doing a lock/unlock for each byte.
Use them when reading and writing the entire set of time registers.

The original rtcin() and writertc() functions which do lock/unlock on each
byte still exist, because they are public and called by outside code.
2018-01-16 03:02:41 +00:00
cem
3bac5f1698 random(4): Add CCP random source definitions
The implementation will follow (D12723).  For now, get the changes to
commit-protected files out of the way.

Approved by:	secteam (gordon)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13925
2018-01-16 02:56:27 +00:00
tuexen
8524b21106 Don't provide a (meaningless) cmsg when proving a notification
in a recvmsg() call.

MFC after:	1 week
2018-01-15 21:59:20 +00:00
pfg
067d5edba6 misc geom and gnu: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:23:16 +00:00
pfg
bf156bc88c net*: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:21:51 +00:00
pfg
d5d345aa9e netgraph: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:19:21 +00:00
pfg
50d0301ca5 kern: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:18:04 +00:00
pfg
c24c2d4c02 cam: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:15:25 +00:00
pfg
7bb0460e2e nfsclient: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:14:56 +00:00
pfg
5426a059d6 mips: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:13:30 +00:00
pfg
f35549b5ed ndis: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:11:38 +00:00
pfg
e1b1b7bd96 powerpc: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:10:40 +00:00
pfg
af8b614ef3 arm: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:09:58 +00:00
pfg
a7c6776f59 x86: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:08:22 +00:00
tychon
03cbc447ee Provide some mitigation against CVE-2017-5715 by clearing registers
upon returning from the guest which aren't immediately clobbered by
the host.  This eradicates any remaining guest contents limiting their
usefulness in an exploit gadget.

This was inspired by this linux commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5b6c02f38315b720c593c6079364855d276886aa

Reviewed by:	grehan, rgrimes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13573
2018-01-15 18:37:03 +00:00
ian
3f5e0fe8f4 Convert the x86 RTC driver to use new validated BCD<->timespec conversions.
New common routines were added to kern/subr_clock.c for converting between
calendrical time expressed in BCD and struct timespec. The new functions
return EINVAL on error, as expected when the clock hardware does not provide
valid time.

PR:		224813
Differential Revision:	https://reviews.freebsd.org/D13731 (no reviewers)
2018-01-15 16:40:43 +00:00
nwhitehorn
3f80ebc5ec Install the SLB miss trap-handling code in the SLB-based MMU driver set up,
to which it is specific, rather than in the generic AIM startup code. This
will be required to support the radix-table-based MMU introduced with POWER9.
2018-01-15 16:08:34 +00:00
avg
e3bb7b0fbf geom_disk / scsi_da: deny opening write-protected disks for writing
Ths change consists of two parts.

geom_disk: deny opening a disk for writing if it's marked as
write-protected.  A new disk(9) flag is added to mark write protected
disks.  A possible alternative could be to add another parameter to d_open,
so that the open mode could be passed to it and the disk drivers could
make the decision internally, but the flag required less churn.

scsi_da: add a new phase of disk probing to query the all pages mode
sense page.  We can determine if the disk is write protected using bit 7
of the device specific field in the mode parameter header returned by
MODE SENSE.

PR:		224037
Reviewed by:	mav
MFC after:	4 weeks
Differential Revision: https://reviews.freebsd.org/D13360
2018-01-15 11:20:00 +00:00
nwhitehorn
a1ff6e907c Move the pmap-specific code in copyinout.c that gets pointers to userland
buffers into a new pmap-module function pmap_map_user_ptr() that can
be implemented by the respective modules. This is required to implement
non-segment-based AIM-ish MMU systems such as the radix-tree page tables
introduced by POWER ISA 3.0 and present on POWER9.

Reviewed by:	jhibbits
2018-01-15 06:46:33 +00:00
manu
a09fcfef11 allwinner: mmc: Multiple improvement
- Add a per compatible configuration struct
  - Not all SoC uses the same size for DMA transfert, add this into the
    configuration data
  - Use new timing mode for some SoC (A64 mmc)
  - Auto calibrate clock for A64 mmc/emmc
  - A64 mmc controller need masking of data0
  - Add support for vmmc/vqmmc regulator
  - Add more capabilities, r/w speed is better for eMMC
  - MMC_CAP_SIGNALING_180 gives weird result so do not enable it for now.
  - Add new register documented in H3/A64 user manual

Tested-On: Pine64-LTS (A64), eMMC still doesn't work
Tested-On: A64-Olinuxino (A64), sd and eMMC are working
Tested-On: NanoPi Neo Plus2 (H5), sd and eMMC are working
Tested-On: OrangePi PC2 (H5), sd only (no eMMC)
Tested-On: OrangePi One (H3), sd only (no eMMC)
Tested-On: BananaPi M2 (A31s), sd only (no eMMC)
2018-01-14 22:05:29 +00:00
fsu
1c4853adf6 Add metadata_csum feature support.
Reviewed by:   pfg (mentor)
Approved by:   pfg (mentor)
MFC after:     6 months

Differential Revision:    https://reviews.freebsd.org/D13810
2018-01-14 20:46:39 +00:00
phk
7765038ea7 Add a rudimentary PWM driver for the RaspberryPi.
Control is through sysctl, only GPIO12 supported.

bootverbose creates sysctls for direct mangling of relevant registers.

Only tested on RPI2
2018-01-14 20:36:21 +00:00
markj
f1eb0fc41a Use the thread's ucred struct when fetching jid or jailname.
Reported by:	mjg
X-MFC with:	r327888
2018-01-14 17:55:40 +00:00
ian
9c7cac637a Add RTC clock conversions for BCD values, with non-panic validation.
RTC clock hardware frequently uses BCD numbers.  Currently the low-level
bcd2bin() and bin2bcd() functions will KASSERT if given out-of-range BCD
values.  Every RTC driver must implement its own code for validating the
unreliable data coming from the hardware to avoid a potential kernel panic.

This change introduces two new functions, clock_bcd_to_ts() and
clock_ts_to_bcd().  The former validates its inputs and returns EINVAL if any
values are out of range. The latter guarantees the returned data will be
valid BCD in a known format (4-digit years, etc).

A new bcd_clocktime structure is used with the new functions.  It is similar
to the original clocktime structure, but defines the fields holding BCD
values as uint8_t (uint16_t for year), and adds a PM flag for handling hours
using AM/PM mode.

PR:		224813
Differential Revision:	https://reviews.freebsd.org/D13730 (no reviewers)
2018-01-14 17:01:37 +00:00
emaste
48fbddaae8 Enable VIMAGE in i386 GENERIC (revert r327840)
We've switched back to ld.bfd on i386 for now.

PR:		225077
Sponsored by:	The FreeBSD Foundation
2018-01-14 16:04:51 +00:00
bz
8d499a89c5 Remove trailing whitespace.
No functional change.
2018-01-14 15:01:25 +00:00
kib
f98ceb5bd2 Add STAC and CLAC instructions wrappers.
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13838
2018-01-14 12:39:50 +00:00
kib
dcd37bb111 Enumerate and print Intel CPU features for Speculative Execution Side
Channel Mitigations.

The definitions are taken from the document 336996-001.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-01-14 12:36:23 +00:00
kib
1d4c150060 When re-evaluating cpu_features, also re-print CPU identification.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-01-14 12:33:05 +00:00
bryanv
adfee38d71 Sync VirtIO IDs with Linux 2018-01-14 06:03:40 +00:00
jeff
cc3d6a3370 Move VM_NUMA_ALLOC and DEVICE_NUMA under the single global config option NUMA.
Sponsored by:	Netflix, Dell/EMC Isilon
Discussed with:	jhb
2018-01-14 03:36:03 +00:00
pfg
9026a8ae0a Fix build after r327949.
Reported by:	Cy Schubert
2018-01-14 00:31:34 +00:00
dim
7dcc49151a Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to
6.0.0 (branches/release_60 r321788).  Upstream has branched for the
6.0.0 release, which should be in about 6 weeks.  Please report bugs and
regressions, so we can get them into the release.

Please note that from 3.5.0 onwards, clang, llvm and lldb require C++11
support to build; see UPDATING for more information.

MFC after:	3 months
2018-01-14 00:08:34 +00:00
n_hibma
22f32bbbfb Add support for Quectel EC25.
Submitted by:	Samuel Crookes
MFC after:	3 days
2018-01-13 23:31:21 +00:00
nwhitehorn
f92ca972e6 Document places we assume that physical memory is direct-mapped at zero by
using a new macro PHYS_TO_DMAP, which deliberately has the same name as the
equivalent macro on amd64. This also sets the stage for moving the direct
map to another base address.
2018-01-13 23:14:53 +00:00
pfg
86c1e7ab7b dev: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these is likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.
2018-01-13 22:30:30 +00:00
bryanv
c4995464a5 Fix possible panic when creating VirtIO console dev aliases
Since we have no control over the name, the MAKEDEV_CHECKNAME flag must be
used to return an error on an invalid (to devfs) name instead of panicing.

r305900 that originally added this feature also introduced a few other bugs:
  - Proper locking not performed
  - Theoretically broke the expectation that the control event buffer would
    not span more than one pages, but did not update the CTASSERT that was
    in place to prevent this. However, since the struct virtio_console_control
    and the bulk buffer together were quite small, this could not have happened.

Also workaround an QEMU VirtIO spec violation in that it includes the NUL
terminator in the buffer length when the spec says it is not included.

PR:		223531
MFC after:	1 week
2018-01-13 21:39:46 +00:00
jhibbits
1c6e6537d4 Include only the headers needed
The extra headers came through evolution of the file.
2018-01-13 21:10:42 +00:00
pfg
405bd7b74e zstd: Use mallocarray(9) for calloc macro.
This is in contrib code but since we only have mallocarray(9) in current
we will not upstream this.

This effectively brings back r327934, which was reverted to correct the
log message.
2018-01-13 19:02:51 +00:00
kevans
9265bdb095 Add SPDX tag to aw_syscon(4) 2018-01-13 19:02:08 +00:00
kevans
58786377c9 Add SPDX tags to syscon bits, correct inconsistency in Copyright line. 2018-01-13 19:00:41 +00:00
pfg
75aad63139 Revert r327934 to fix the log message. 2018-01-13 18:56:42 +00:00
kevans
85798e7755 Introduce aw_syscon(4) for earlier attachment
Attaching syscon_generic earlier than BUS_PASS_DEFAULT makes it more
difficult for specific syscon drivers to attach to the syscon node and to
get ordering right. Further discussion yielded the following set of
decisions:

- Move syscon_generic to BUS_PASS_DEFAULT
- If a platform needs a syscon with different attach order or probe
behavior, it should subclass syscon_generic and match on the SoC specific
compat string
- When we come across a need for a syscon that attaches earlier but only
specifies compatible = "syscon", we should create a syscon_exclusive driver
that provides generic access but probes earlier and only matches if "syscon"
is the only compatible. Such fdt nodes do exist in the wild right now, but
we don't really use them at the moment.

Additionally:

- Any syscon provider that has needs any more complex than a spinlock solely
for syscon access and a single memory resource should subclass syscon
directly rather than attempting to subclass syscon_generic or add complexity
to it. syscon_generic's attach/detach methods may be made public should the
need arise to subclass it with additional attach/detach behavior.

We introduce aw_syscon(4) that just subclasses syscon_generic but probes
earlier to meet our requirements for if_awg and implements #2 above for this
specific situation. It currently only matches a64/a83t/h3 since these are
the only platforms that really need it at the time being.

Discussed with:	ian
Reviewed by:	manu, andrew, bcr (manpages, content unchanged since review)
Differential Revision:	https://reviews.freebsd.org/D13793
2018-01-13 18:46:31 +00:00
pfg
d78380c6a4 zstd: Use memalloc(9) for calloc macro.
This is in contrib code but since we only have memalloc(9) in current we
will not upstream this.
2018-01-13 18:09:09 +00:00
cem
d1b1083a47 amd64: Add a 48-bit MAXADDR constant
Some devices (e.g., ccp(4) -- to be committed) can only access the low 48
bits of physical memory.

Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
2018-01-13 17:55:22 +00:00
dim
d91380862c Merge ^/head r327886 through r327930. 2018-01-13 17:52:55 +00:00
marius
983127b606 Use the correct revision specifier (EXT_CSD revision rather than
system specification version) for deciding whether the EXT_CSD
register includes the EXT_CSD_GEN_CMD6_TIME field.

Submitted by:	Masanobu SAITOH
2018-01-13 17:36:11 +00:00
jhibbits
afa3517cfd Add SPDX identifier to header
Reported by:	pfg
2018-01-13 17:25:48 +00:00
marius
a80489fe28 Fix a bug introduced in r327355; in mmcsd_ioctl_cmd() when ensuring
that userland doesn't switch partitions on its own, compare against
the partition mmcsd_ioctl_cmd() is going to switch to (based on the
device node used) rather than the currently selected partition.
2018-01-13 16:32:09 +00:00
mav
f9c58dac63 Add IDs for Nuvoton NCT6793/NCT6795.
MFC after:	2 weeks
2018-01-13 16:31:07 +00:00
marius
3d1d28421e Fix a bug introduced in r327339; at the point in time re-tuning is
executed, the interrupt aggregation code might have disabled the
SDHCI_INT_DMA_END and/or SDHCI_INT_RESPONSE bits in slot->intmask
and the SDHCI_SIGNAL_ENABLE register respectively. So when restoring
the interrupt masks based on the previous contents of slot->intmask
in sdhci_exec_tuning(), ensure that the SDHCI_INT_ENABLE register
doesn't lose these two bits.
While at it and in the spirit of r327339, let sdhci_tuning_intmask()
set the tuning error and re-tuning interrupt bits based on the
SDHCI_TUNING_ENABLED rather than the SDHCI_TUNING_SUPPORTED flag
being set, i. e. only when (re-)tuning is actually used. Currently,
this changes makes no net difference, though.
2018-01-13 16:21:13 +00:00
manu
e6c2708832 dwmmc_hisi: Fix build when option MMCCAM is defined 2018-01-13 14:10:45 +00:00
kib
78904c70bb Add sysctl debug.kdb.stack_overflow to conveniently test kernel
handling of the kstack overflow.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-01-13 11:59:49 +00:00
mjg
00f02d857e sx: retry hard shared unlock just like in r327905 for rwlocks 2018-01-13 09:26:24 +00:00
cy
e79b07245c Remove redundant variable.
MFC after:	1 week
2018-01-13 08:28:46 +00:00
cy
297cff2ef7 Though this block of code is not used by FreeBSD, correct a call to
sprintf() with a macro call to SNPRINTF similar to other calls to
SNPRINTF within this same block.

MFC after:	1 week
2018-01-13 08:16:10 +00:00
nwhitehorn
845db5ffdf Chase removal of FDT fixup code on PowerPC in r327907. 2018-01-13 03:09:05 +00:00
jhibbits
57c3e69c29 Remove fdt fixups for powerpc, they are no longer needed.
If a fixup really is needed, it should be fixed in u-boot, not in FreeBSD.

Suggested by:	nwhitehorn
2018-01-13 02:56:09 +00:00
jhibbits
958d24c6d5 Enable L2 cache on supported PowerQUICC and QorIQ platforms
Some PowerQUICC and QorIQ platforms have a L2 cache managed via the
memory-mapped configuration registers, and appear as a node in the device
tree.  This adds basic support to enable the cache.
2018-01-13 01:36:37 +00:00
mjg
6f29c630d9 rwlock: try regular read unlock even in the hard path
Saves on turnstile trips if the lock got more readers.
2018-01-13 00:05:31 +00:00
np
4bda86983b cxgbe/iw_cxgbe: Remove duplicates to fix compilation with recent gcc. 2018-01-13 00:04:11 +00:00
jeff
56eac8c497 Fix compile error from r327900 2018-01-12 23:41:12 +00:00
jeff
bc9177f3a2 Add support for NUMA domains to bus dma tags. This causes all memory
allocated with a tag to come from the specified domain if it meets the
other constraints provided by the tag.  Automatically create a tag at
the root of each bus specifying the domain local to that bus if
available.

Reviewed by:	jhb, kib
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13545
2018-01-12 23:34:16 +00:00
jeff
f375b4dd66 Implement NUMA support in uma(9) and malloc(9). Allocations from specific
domains can be done by the _domain() API variants.  UMA also supports a
first-touch policy via the NUMA zone flag.

The slab layer is now segregated by VM domains and is precise.  It handles
iteration for round-robin directly.  The per-cpu cache layer remains
a mix of domains according to where memory is allocated and freed.  Well
behaved clients can achieve perfect locality with no performance penalty.

The direct domain allocation functions have to visit the slab layer and
so require per-zone locks which come at some expense.

Reviewed by:	Attilio (a slightly older version)
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
2018-01-12 23:25:05 +00:00
jeff
e7c9f84113 Implement NUMA policy for kmem_*(9). This maintains compatibility with
reservations by giving each memory domain its own KVA space in vmem that
is naturally aligned on superpage boundaries.

Reviewed by:	alc, markj, kib  (some objections)
Sponsored by:	Netflix, Dell/EMC Isilon
Tested by;	pho
Differential Revision:	https://reviews.freebsd.org/D13289
2018-01-12 23:13:55 +00:00
pfg
f48ea5d543 libalias: small memory allocation cleanups.
Make the calloc wrappers behave as expected by using mallocarray.
It is rather weird that the malloc wrappers also zeroes the memory: update
a comment to reflect at least two cases where it is expected.

Reviewed by:	tuexen
2018-01-12 23:12:30 +00:00
jeff
058d03378a Regenerate auto-generated files 2018-01-12 23:06:35 +00:00
jeff
01c8e28f80 Add files for r327895
Implement 'domainset', a cpuset based NUMA policy mechanism.  This allows
userspace to control NUMA policy administratively and programmatically.

Implement domainset based iterators in the page layer.

Remove the now legacy numa_* syscalls.

Cleanup some header polution created by having seq.h in proc.h.

Reviewed by:  markj, kib
Discussed with:       alc
Tested by:    pho
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision:        https://reviews.freebsd.org/D13403
2018-01-12 22:57:57 +00:00
jeff
94c7af8ca2 Implement 'domainset', a cpuset based NUMA policy mechanism. This allows
userspace to control NUMA policy administratively and programmatically.

Implement domainset based iterators in the page layer.

Remove the now legacy numa_* syscalls.

Cleanup some header polution created by having seq.h in proc.h.

Reviewed by:	markj, kib
Discussed with:	alc
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13403
2018-01-12 22:48:23 +00:00
kevans
c6e172b277 allwinner/a83t_padconf: Rename "emac" function to "gmac" as per upstream DTS
Although these should have been 'emac', upstream DTS is going with using
'gmac' as the function name for the emac RGMII pins. Rename here to
accommodate.

emac support for the a83t should come in with the 4.16 DTS update, in
another couple of months.
2018-01-12 20:35:27 +00:00
markj
1bfc3a6a76 Add "jid" and "jailname" variables to DTrace.
These return the jail ID and jail name for the traced process,
respectively, and are analogous to "zonename" on Solaris/illumos.
"zonename" is now aliased to "jailname".

Also add some stress tests for the new variables.

Submitted by:	Domagoj Stolfa <domagoj.stolfa@gmail.com>
Reviewed by:	dteske (previous version)
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D13877
2018-01-12 19:59:46 +00:00
dim
b64d96a23d Merge ^/head r327624 through r327885. 2018-01-12 18:23:35 +00:00
andrew
fe961bddea Workaround Spectre Variant 2 on arm64.
We need to handle two cases:

1. One process attacking another process.
2. A process attacking the kernel.

For the first case we clear the branch predictor state on context switch
between different processes. For the second we do this when taking an
instruction abort on a non-userspace address.

To clear the branch predictor state a per-CPU function pointer has been
added. This is set by the new cpu errata code based on if the CPU is
known to be affected.

On Cortex-A57, A72, A73, and A75 we call into the PSCI firmware as newer
versions of this will clear the branch predictor state for us.

It has been reported the ThunderX is unaffected, however the ThunderX2 is
vulnerable. The Qualcomm Falkor core is also affected. As FreeBSD doesn't
yet run on the ThunderX2 or Falkor no workaround is included for these CPUs.

MFC after:	3 days
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D13812
2018-01-12 14:01:38 +00:00
mjg
ddfd5797d3 mtx: use fcmpset to cover setting MTX_CONTESTED 2018-01-12 13:40:50 +00:00
mjg
749ff5c610 vfs: tidy up vdrop
Skip vfs_refcount_release_if_not_last if the interlock is held and just
go straight to refcount_release.

While here do cosmetic rearrangement of _vhold to better show it contains
equivalent behaviour.
2018-01-12 13:39:02 +00:00
wma
6c1a57398d PowerNV: update OPAL driver
Update OPAL driver with:
- better console support
- proper AP configuration
- enhanced IRQ/OFW mapping
- RTC support

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Sponsored by:          FreeBSD Foundation
2018-01-12 12:14:52 +00:00
lwhsu
b6379674b8 - Fix make in sys/modules
Reviewed by:	gonzo, landonf, br
Differential Revision:	https://reviews.freebsd.org/D13856
2018-01-12 12:14:14 +00:00