Commit Graph

127681 Commits

Author SHA1 Message Date
Konstantin Belousov
4b8b28e130 Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-07-02 19:32:48 +00:00
Konstantin Belousov
5dc7e31a09 Control implicit PROT_MAX() using procctl(2) and the FreeBSD note
feature bit.

In particular, allocate the bit to opt-out the image from implicit
PROTMAX enablement.  Provide procctl(2) verbs to set and query
implicit PROTMAX handling.  The knobs mimic the same per-image flag
and per-process controls for ASLR.

Reviewed by:	emaste, markj (previous version)
Discussed with:	brooks
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D20795
2019-07-02 19:07:17 +00:00
Konstantin Belousov
3730695151 Use traditional 'p' local to designate td->td_proc in kern_mmap.
Reviewed by:	emaste, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D20795
2019-07-02 19:01:14 +00:00
Ed Maste
91c33ba3a3 if_muge: set IFCAP_VLAN_MTU to maintain 1500 MTU with vlan use
PR:		238665
Submitted by:	Ralf <iz-rpi03@hs-karlsruhe.de>
MFC after:	1 week
2019-07-02 16:44:04 +00:00
Alexander Motin
3a76d901d6 Include sys/lock.h, as told by man page.
MFC after:	1 week
2019-07-02 15:01:54 +00:00
Mark Johnston
6d958292f3 Fix handling of errors from sblock() in soreceive_stream().
Previously we would attempt to unlock the socket buffer despite having
failed to lock it.  Simply return an error instead: no resources need
to be released at this point, and doing so is consistent with
soreceive_generic().

PR:		238789
Submitted by:	Greg Becker <greg@codeconcepts.com>
MFC after:	1 week
2019-07-02 14:24:42 +00:00
Ganbold Tsagaankhuu
73155b4327 Extend simple_mfd driver to expose a syscon interface if
that node is also compatible with syscon. For instance,
Rockchip RK3399's GRF (General Register Files) is compatible
with simple-mfd as well as syscon and has devices like
usb2-phy, emmc-phy and pcie-phy etc. under it.

Reviewed by:	manu
2019-07-02 08:47:18 +00:00
Alexander Motin
7b96ad44dd Fix i386 LINT after r349594.
MFC after:	1 month
2019-07-02 07:47:11 +00:00
Alexander Motin
6683132d54 Add driver for NTB in AMD SoC.
This patch is the driver for NTB hardware in AMD SoCs (ported from Linux)
and enables the NTB infrastructure like Doorbells, Scratchpads and Memory
window in AMD SoC. This driver has been validated using ntb_transport and
if_ntb driver already available in FreeBSD.

Submitted by:	Rajesh Kumar <rajesh1.kumar@amd.com>
MFC after:	1 month
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D18774
2019-07-02 05:25:18 +00:00
Landon J. Fuller
ecb278f2e6 bwn(4): Include SROM revision when printing device identification. 2019-07-02 02:52:05 +00:00
Kirk McKusick
daba4da81d Add a new "untrusted" option to the mount command. Its purpose
is to notify the kernel that the file system is untrusted and it
should use more extensive checks on the file-system's metadata
before using it. This option is intended to be used when mounting
file systems from untrusted media such as USB memory sticks or other
externally-provided media.

It will initially be used by the UFS/FFS file system, but should
likely be expanded to be used by other file systems that may appear
on external media like msdosfs, exfat, and ext2fs.

Reviewed by:  kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20786
2019-07-01 23:22:26 +00:00
Emmanuel Vadot
6c4395e3b5 arm64: efi: Map memory IO region as device
Reviewed by:	andrew
Sponsored by:	Ampere Computing, LLC
2019-07-01 22:11:56 +00:00
Ryan Libby
9167705c8c g_mirror_taste: avoid deadlock, always clear tasting flag
If g_mirror_taste encountered an error at g_mirror_add_disk, it might
try to g_mirror_destroy the device with the G_MIRROR_DEVICE_FLAG_TASTING
flag still set.  This would wait on a worker to complete the destruction
with g_mirror_try_destroy, but that function bails out if the tasting
flag is set, resulting in a deadlock.  Clear the tasting flag before
trying to destroy the device.

Test Plan:
sysctl debug.fail_point.mnowait="1%return"
kyua test -k /usr/tests/sys/geom/class/mirror/Kyuafile

Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20744
2019-07-01 22:06:36 +00:00
Ryan Libby
3bb6e0f0c7 g_eli_create: only dec g_access acw if we inc'd it
Reviewed by:	cem, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20743
2019-07-01 22:06:16 +00:00
Alan Cox
b6ce9ba9c3 Tidy up pmap_copy(). Notably, deindent the innermost loop by making a
simple change to the control flow.  Replace an unnecessary test by a
KASSERT.  Add a comment explaining an obscure test.

Reviewed by:	kib, markj
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D20812
2019-07-01 22:00:42 +00:00
Emmanuel Vadot
a4e0b5a471 Since r349571 we need all the accessor to be present for set or get
otherwise we panic.
dwmmc don't handle VCCQ (voltage for the IO line of the SD/eMMC) or
TIMING.
Add the needed accessor in the {read,write}_ivar functions.

Reviewed by:	imp (previous version)
2019-07-01 21:50:53 +00:00
Rick Macklem
555d8f2859 Factor out the code that does a VOP_SETATTR(size) from vn_truncate().
This patch factors the code in vn_truncate() that does the actual
VOP_SETATTR() of size into a separate function called vn_truncate_locked().
This will allow the NFS server and the patch that adds a
copy_file_range(2) syscall to call this function instead of duplicating
the code and carrying over changes, such as the recent r347151.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D20808
2019-07-01 20:41:43 +00:00
Vincenzo Maffione
23ced94451 netmap: fix two panics with emulated adapter
This patch fixes 2 panics. The first one is due to the current VNET not
being set in the emulated adapter transmission path. The second one
is caused by the M_PKTHDR flag not being set when preallocated mbufs
are recycled in the transmit path.

Submitted by:	aleksandr.fedorov@itglobal.com
Reviewed by:	vmaffione
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20824
2019-07-01 20:37:35 +00:00
Andriy Gapon
e3722b788e add superio driver
The goal of this driver is consolidate information about SuperIO chips
and to provide for peaceful coexistence of drivers that need to access
SuperIO configuration registers.

While SuperIO chips can host various functions most of them are
discoverable and accessible without any knowledge of the SuperIO.
Examples are: keyboard and mouse controllers, UARTs, floppy disk
controllers.  SuperIO-s also provide non-standard functions such as
GPIO, watchdog timers and hardware monitoring.  Such functions do
require drivers with a knowledge of a specific SuperIO.

At this time the driver supports a number of ITE and Nuvoton (fka
Winbond) SuperIO chips.
There is a single driver for all devices.  So, I have not done the usual
split between the hardware driver and the bus functionality.  Although,
superio does act as a bus for devices that represent known non-standard
functions of a SuperIO chip.  The bus provides enumeration of child
devices based on the hardcoded knowledge of such functions.  The
knowledge as extracted from datasheets and other drivers.
As there is a single driver, I have not defined a kobj interface for it.
So, its interface is currently made of simple functions.
I think that we can the flexibility (and complications) when we actually
need it.

I am planning to convert nctgpio and wbwd to superio bus very soon.
Also, I am working on itwd driver (watchdog in ITE SuperIO-s).
Additionally, there is ithwm driver based on the reverted sensors
import, but I am not sure how to integrate it given that we still lack
any sensors interface.

Discussed with:	imp, jhb
MFC after:	7 weeks
Differential Revision: https://reviews.freebsd.org/D8175
2019-07-01 17:05:41 +00:00
Andriy Gapon
0222625608 nctgpio: change default pin names to those used by the datasheet(s)
That is, instead of the current GPIO00 - GPIO15 the names will be GPIO00
- GPIO07, GPIO10 - GPIO17.  The first digit is a GPIO "bank" / group
number and the second one is a pin number within the bank.  Alternative
view is that the pin names are changed from decimal numbering scheme to
octal one (as there are 8 pins per bank).

Discussed with:	cem, gonzo
MFC after:	2 weeks
2019-07-01 15:43:48 +00:00
Luiz Otavio O Souza
9aba06377d Add support for the Marvell 88E6190 11 ports switch.
With more ports, some of the registers are shifted a bit to accommodate.

This switch also adds two high speed Serdes/SGMII interfaces (2.5 Gb/s).

Sponsored by:	Rubicon Communications, LLC (Netgate)
2019-07-01 13:41:37 +00:00
Andriy Gapon
3e7bae0821 upgrade the warning printf-s in bus accessors to KASSERT-s, take 2
After this change sys/bus.h includes sys/systm.h when _KERNEL is
defined.
This brings back r349459 but with systm.h hidden from userland.

MFC after:	2 weeks
2019-07-01 06:22:41 +00:00
Cy Schubert
23cfb1b256 The RFC 3128 test should be made after the offset mask has been applied.
Reported by:	christos@NetBSD.org
X-MFC with:	r349399
2019-06-30 22:32:33 +00:00
Cy Schubert
a9a131902d Revert r349400. It has uintended effects.
Reported by:	christos@NetBSD.org
X-MFC with:	r349400.
2019-06-30 22:27:58 +00:00
Navdeep Parhar
57f317e60a Display the approximate space needed when a minidump fails due to lack
of space.

Reviewed by:	kib@
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D20801
2019-06-30 03:14:04 +00:00
Doug Moore
5201cbabf5 Remove a call to vm_map_simplify_entry from _vm_map_clip_start.
Recent changes to vm_map_protect have made it unnecessary.

Reviewed by: alc
Approved by: kib (mentor)
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D20633
2019-06-30 02:08:13 +00:00
Mark Johnston
7c3703a694 Use a consistent snapshot of the fd's rights in fget_mmap().
fget_mmap() translates rights on the descriptor to a VM protection
mask.  It was doing so without holding any locks on the descriptor
table, so a writer could simultaneously be modifying those rights.
Such a situation would be detected using a sequence counter, but
not before an inconsistency could trigger assertion failures in
the capability code.

Fix the problem by copying the fd's rights to a structure on the stack,
and perform the translation only once we know that that snapshot is
consistent.

Reported by:	syzbot+ae359438769fda1840f8@syzkaller.appspotmail.com
Reviewed by:	brooks, mjg
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20800
2019-06-29 16:11:09 +00:00
Mark Johnston
02476c44c5 Fix mutual exclusion in pipe_direct_write().
We use PIPE_DIRECTW as a semaphore for direct writes to a pipe, where
the reader copies data directly from pages mapped into the writer.
However, when a reader finishes such a copy, it previously cleared
PIPE_DIRECTW, allowing multiple writers to race and corrupt the state
used to track wired pages belonging to the writer.

Fix this by having the writer clear PIPE_DIRECTW and instead use the
count of unread bytes to determine whether a write is finished.

Reported by:	syzbot+21811cc0a89b2a87a9e7@syzkaller.appspotmail.com
Reviewed by:	kib, mjg
Tested by:	pho
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20784
2019-06-29 16:05:52 +00:00
John Baldwin
e37240f9f3 Add support for IFCAP_NOMAP to mlx5(4).
Since mlx5 uses bus_dma, this only required adding the capability
flag.

Submitted by:	gallatin
Reviewed by:	gallatin, hselasky, rrs
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:53:07 +00:00
John Baldwin
d76bbe175a Add support for IFCAP_NOMAP to cxgbe(4).
Since cxgbe(4) uses sglist instead of bus_dma, this required updates
to the code that generates scatter/gather lists for packets.  Also,
unmapped mbufs are always sent via DMA and never as immediate data in
the payload of a work request.

Submitted by:	gallatin (earlier version)
Reviewed by:	gallatin, hselasky, rrs
Discussed with:	np
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:52:21 +00:00
John Baldwin
66d0c056be Support IFCAP_NOMAP in vlan(4).
Enable IFCAP_NOMAP for a vlan interface if it is supported by the
underlying trunk device.

Reviewed by:	gallatin, hselasky, rrs
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:51:38 +00:00
John Baldwin
3807631b8e Compress pending socket buffer data once it is marked ready.
Apply similar logic from sbcompress to pending data in the socket
buffer once it is marked ready via sbready.  Normally sbcompress
merges small mbufs to reduce the length of mbuf chains in the socket
buffer.  However, sbcompress cannot do this for mbufs marked
M_NOTREADY.  sbcompress_ready is now called from sbready when mbufs
are marked ready to merge small mbuf chains once the data is available
to copy.

Submitted by:	gallatin (earlier version)
Reviewed by:	gallatin, hselasky, rrs
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:50:25 +00:00
John Baldwin
cec06a3edc Add support for using unmapped mbufs with sendfile(2).
This can be enabled at runtime via the kern.ipc.mb_use_ext_pgs sysctl.
It is disabled by default.

Submitted by:	gallatin (earlier version)
Reviewed by:	gallatin, hselasky, rrs
Relnotes:	yes
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:49:35 +00:00
John Baldwin
82334850ea Add an external mbuf buffer type that holds multiple unmapped pages.
Unmapped mbufs allow sendfile to carry multiple pages of data in a
single mbuf, without mapping those pages.  It is a requirement for
Netflix's in-kernel TLS, and provides a 5-10% CPU savings on heavy web
serving workloads when used by sendfile, due to effectively
compressing socket buffers by an order of magnitude, and hence
reducing cache misses.

For this new external mbuf buffer type (EXT_PGS), the ext_buf pointer
now points to a struct mbuf_ext_pgs structure instead of a data
buffer.  This structure contains an array of physical addresses (this
reduces cache misses compared to an earlier version that stored an
array of vm_page_t pointers).  It also stores additional fields needed
for in-kernel TLS such as the TLS header and trailer data that are
currently unused.  To more easily detect these mbufs, the M_NOMAP flag
is set in m_flags in addition to M_EXT.

Various functions like m_copydata() have been updated to safely access
packet contents (using uiomove_fromphys()), to make things like BPF
safe.

NIC drivers advertise support for unmapped mbufs on transmit via a new
IFCAP_NOMAP capability.  This capability can be toggled via the new
'nomap' and '-nomap' ifconfig(8) commands.  For NIC drivers that only
transmit packet contents via DMA and use bus_dma, adding the
capability to if_capabilities and if_capenable should be all that is
required.

If a NIC does not support unmapped mbufs, they are converted to a
chain of mapped mbufs (using sf_bufs to provide the mapping) in
ip_output or ip6_output.  If an unmapped mbuf requires software
checksums, it is also converted to a chain of mapped mbufs before
computing the checksum.

Submitted by:	gallatin (earlier version)
Reviewed by:	gallatin, hselasky, rrs
Discussed with:	ae, kp (firewalls)
Relnotes:	yes
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:48:33 +00:00
Alan Cox
c134ef742f When we protect PTEs (as opposed to PDEs), we only call vm_page_dirty()
when, in fact, we are write protecting the page and the PTE has PG_M set.
However, pmap_protect_pde() was always calling vm_page_dirty() when the PDE
has PG_M set.  So, adding PG_NX to a writeable PDE could result in
unnecessary (but harmless) calls to vm_page_dirty().

Simplify the loop calling vm_page_dirty() in pmap_protect_pde().

Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D20793
2019-06-28 22:40:34 +00:00
Hans Petter Selasky
f48c41accd Need to apply the PCIM_BAR_MEM_BASE mask to the physical memory
address before returning it to the user. Some of the least significant
bits have special meaning and should be masked away.

Discussed with:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-06-28 22:28:51 +00:00
Luiz Otavio O Souza
d7cecbd179 Add the 802.1q support for the Marvell e6000 series of ethernet switches.
Tested on:	espressobin, Clearfog, SG-3100 and others
Sponsored by:	Rubicon Communications, LLC (Netgate)
2019-06-28 22:19:50 +00:00
Luiz Otavio O Souza
4e4cedb00b Add the 'drop tagged' flag support for ethernet switch ports.
This is intended to drop all 802.1q tagged packets on a port.

Sponsored by:	 Rubicon Communications, LLC (Netgate)
2019-06-28 22:12:43 +00:00
Konstantin Belousov
2d7a555294 Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-06-28 20:40:54 +00:00
Navdeep Parhar
8674e626c6 cxgbe/t4_tom: Tweaks to some of the AIO related CTRs.
Reviewed by:	jhb@
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-06-28 19:57:42 +00:00
Navdeep Parhar
74a155edb0 cxgbe/t4_tom: the AIO tx job queue must be empty by the time the driver
releases the offload resources associated with the tid.

Reviewed by:	jhb@
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D20798
2019-06-28 19:27:45 +00:00
Hans Petter Selasky
0dbdf04125 Need to wait for epoch callbacks to complete before detaching a
network interface.

This particularly manifests itself when an INP has multicast options
attached during a network interface detach. Then the IPv4 and IPv6
leave group call which results from freeing the multicast address, may
access a freed ifnet structure. These are the steps to reproduce:

service mdnsd onestart # installed from ports

ifconfig epair create
ifconfig epair0a 0/24 up
ifconfig epair0a destroy

Tested by:	pho @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-28 10:49:04 +00:00
Hans Petter Selasky
131b2b7658 Implement API for draining EPOCH(9) callbacks.
The epoch_drain_callbacks() function is used to drain all pending
callbacks which have been invoked by prior epoch_call() function calls
on the same epoch. This function is useful when there are shared
memory structure(s) referred to by the epoch callback(s) which are not
refcounted and are rarely freed. The typical place for calling this
function is right before freeing or invalidating the shared
resource(s) used by the epoch callback(s). This function can sleep and
is not optimized for performance.

Differential Revision: https://reviews.freebsd.org/D20109
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-28 10:38:56 +00:00
Navdeep Parhar
d49be2a696 cxgbe/t4_tom: Mark the socket's receive as done before calling
handle_ddp_close.

This eliminates a bad race where an aio_ddp_requeue that happened to run
after handle_ddp_close could bump up the active count.

Discussed with:	jhb@
MFC after:	3 days
Sponsored by:	Chelsio Communications
2019-06-28 04:02:56 +00:00
Navdeep Parhar
b7acf27c2e cxgbe/t4_tom: Fix regression in t_maxseg usage within t4_tom.
t_maxseg was changed in r293284 to not have any adjustment for TCP
timestamps.  t4_tom inadvertently went back to pre-r293284 semantics
in r332506.

Sponsored by:	Chelsio Communications
2019-06-28 02:41:17 +00:00
Navdeep Parhar
24a508820c cxgbe/iw_cxgbe: Remove unused field from the endpoint structure.
MFC after:	3 days
2019-06-28 02:21:42 +00:00
Doug Moore
a72dce340d If vm_map_protect fails with KERN_RESOURCE_SHORTAGE, be sure to
simplify modified entries before returning.

Reviewed by: alc, markj (earlier version), kib (earlier version)
Approved by: kib, markj (mentors, implicit)
Differential Revision: https://reviews.freebsd.org/D20753
2019-06-28 02:14:54 +00:00
Rebecca Cran
a852cb9596 Add ACPI entries for Synopsys Designware UARTs used on ARM platforms
This fixes (userspace) console on the Marvell MACCHIATObin in ACPI mode with
latest TianoCore EDK2 firmware.

Submitted by:	Greg V <greg@unrelenting.technology>
Reviewed by:	mw, bcran
Differential Revision:	https://reviews.freebsd.org/D20765
2019-06-28 01:19:08 +00:00
Rebecca Cran
b879578268 Add missing ACPI GICv2 MSI/MSI-X attachment
This lets PCIe MSI-X device interrupts work on the MACCHIATObin
(Marvell Armada 8k), which allows e.g. the Intel igb NIC to fully work.

Submitted by:	Greg V <greg@unrelenting.technology>
Reviewed by:	mw, bcran
Differential Revision:	https://reviews.freebsd.org/D20775
2019-06-28 01:17:33 +00:00
Mitchell Horne
c9207d3d11 Add some missing RISC-V ELF defines
This adds defines for the RISC-V specific e_flags values, and some of
the missing static relocations.

Reviewed by:	markj
Approved by:	markj (mentor)
Differential Revision:	https://reviews.freebsd.org/D20766
2019-06-28 00:03:29 +00:00
Alan Somers
0cfc1ef38d FIOBMAP2: inline vn_ioc_bmap2
Reported by:	kib
Reviewed by:	kib
MFC after:	2 weeks
MFC-With:	349238
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20783
2019-06-27 23:39:06 +00:00
Rick Macklem
e368095437 Add non-blocking trylock variants for the rangelock functions.
A future patch that will add a Linux compatible copy_file_range(2) syscall
needs to be able to lock the byte ranges of two files concurrently.
To do this without a risk of deadlock, a non-blocking variant of
vn_rangelock_rlock() called vn_rangelock_tryrlock() was needed.
This patch adds this, along with vn_rangelock_trywlock(), in order to
do this.
The patch also adds a couple of comments, that I hope clarify how the
algorithm used in kern_rangelock.c works.

Reviewed by:	kib, asomers (previous version)
Differential Revision:	https://reviews.freebsd.org/D20645
2019-06-27 23:10:40 +00:00
John Baldwin
1db2626a9b Fix comment in sofree() to reference sbdestroy().
r160875 added sbdestroy() as a wrapper around sbrelease_internal to be
called from sofree(), yet the comment added in the same revision to
sofree() still mentions sbrelease_internal().

Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20488
2019-06-27 22:50:11 +00:00
John Baldwin
6b69072acc Reject attempts to register a TCP stack being unloaded.
Reviewed by:	gallatin
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20617
2019-06-27 22:34:05 +00:00
Li-Wen Hsu
404e646960 Follow r349460 to complete removing "flags" in struct gpiobus_ivar
MFC with:	r349460
Sponsored by:	The FreeBSD Foundation
2019-06-27 22:18:21 +00:00
John Baldwin
7f63b888c7 Hold an explicit reference on the socket for the aiotx task.
Previously, the aiotx task relied on the aio jobs in the queue to hold
a reference on the socket.  However, when the last job is completed,
there is nothing left to hold a reference to the socket buffer lock
used to check if the queue is empty.  In addition, if the last job on
the queue is cancelled, the task can run with no queued jobs holding a
reference to the socket buffer lock the task uses to notice the queue
is empty.

Fix these races by holding an explicit reference on the socket when
the task is queued and dropping that reference when the task
completes.

Reviewed by:	np
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D20539
2019-06-27 19:36:30 +00:00
Ruslan Bukin
2593e9dcb2 Add support for extended descriptor format to Altera mSGDMA driver.
The format to use depends on hardware configuration (synthesis-time),
so make it compile-time kernel option.

Extended format allows DMA engine to operate with 64-bit memory addresses.

Sponsored by:	DARPA, AFRL
2019-06-27 18:08:18 +00:00
Andriy Gapon
f45e9414cf revert r349460, printf -> KASSERT in bus.h, until I can fix it
I tested only kernel builds naively assuming that sys/bus.h cannot
affect userland builds.

Pointyhat to:	me
2019-06-27 15:51:50 +00:00
Andriy Gapon
061b38cdcc gpiobus: provide a new hint, pin_list
"pin_list" allows to specify child pins as a list of pin numbers.
Existing hint "pins" serves the same purpose but with a 32-bit wide bit
mask.  One problem with that is that a controller can have more than 32
pins.  One example is amdgpio.  Also, a list of numbers is a little bit
more human friendly than a matching bit mask.  As a side note, it seems
that in FDT pins are typically specified by their numbers as well.

This commit also adds accessors for instance variables (IVARs) that
define the child pins.  My primary goal is to allow a child to be
configured programmatically rather than via hints (assuming that FDT is
not supported on a platform).  Also, while a child should not care about
specific pin numbers that are allocated to it, it could be interested in
how many were actually assigned to it.

While there, I removed "flags" instance variable.  It was unused.

Reviewed by:	mizhka
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D20459
2019-06-27 15:46:06 +00:00
Andriy Gapon
d55fcc487e upgrade the warning printf-s in bus accessors to KASSERT-s
After this change sys/bus.h includes sys/systm.h.

Discussed with:	cem, imp
MFC after:	2 weeks
2019-06-27 15:07:06 +00:00
Olivier Houchard
84322e3ee3 In get_fpcontext32() and set_fpcontext32(), we can't just use memcpy() to
copy the VFP registers.
arvm7 VFP uses 32 64bits fp registers (but those could be used in pairs to
make 16 128bits registers), while aarch64 uses 32 128bits fp registers, so
we have to copy the value of each register.
2019-06-26 22:06:40 +00:00
Alan Cox
1d3423d914 Revert one of the changes from r349323. Specifically, undo the change
that replaced a pmap_invalidate_page() with a dsb(ishst) in
pmap_enter_quick_locked().  Even though this change is in principle
correct, I am seeing occasional, spurious bus errors that are only
reproducible without this pmap_invalidate_page().  (None of adding an
isb, "upgrading" the dsb to wait on loads as well as stores, or
disabling superpage mappings eliminates the bus errors.)  Add an XXX
comment explaining why the pmap_invalidate_page() is being performed.

Discussed with:	     andrew, markj
2019-06-26 21:43:41 +00:00
Rodney W. Grimes
e4da41f932 Emulate the "TEST r/m{16,32,64}, imm{16,32,32}" instructions (opcode F7H).
This adds emulation for:
	test r/m16, imm16
	test r/m32, imm32
	test r/m64, imm32 sign-extended to 64

OpenBSD guests compiled with clang 8.0.0 use TEST directly against a
Local APIC register instead of separate read via MOV followed by a
TEST against the register.

PR:		238794
Submitted by:	jhb
Reported by:	Jason Tubnor jason@tubnor.net
Tested by:	Jason Tubnor jason@tubnor.net
Reviewed by:	markj, Patrick Mooney patrick.mooney@joyent.com
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D20755
2019-06-26 21:19:43 +00:00
Andriy Gapon
b66ed8ee28 fix up r349428, fix a typo made during "fdt" removal
Reported by:	ian
MFC after:	11 days
2019-06-26 17:38:38 +00:00
Mark Johnston
0fd977b3fa Add a return value to vm_page_remove().
Use it to indicate whether the page may be safely freed following
its removal from the object.  Also change vm_page_remove() to assume
that the page's object pointer is non-NULL, and have callers perform
this check instead.

This is a step towards an implementation of an atomic reference counter
for each physical page structure.

Reviewed by:	alc, dougm, kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20758
2019-06-26 17:37:51 +00:00
Andriy Gapon
926c3367c8 owc_gpiobus: clean / fix up the driver module things
"fdt" is removed from the driver module name as the driver does not
require FDT and can work very well on hints based systems.

A module dependency is added for gpiobus.  Without that owc cannot
resolve symbols in gpiobus if both are loaded as kernel modules.

Finally, a driver module module version is added.

Reviewed by:	imp
MFC after:	11 days
2019-06-26 17:17:33 +00:00
Konstantin Belousov
7256d0fcfd amd64 pmap: Fix pkru handling in pmap_remove().
When pmap_pkru_on_remove() is called, the sva argument value was
advanced.  Clear PKRU earlier when sva still specifies the start of
the region.

Noted and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-06-26 17:16:26 +00:00
Olivier Houchard
b726d74fce Fix debugging of 32bits arm binaries on arm64.
In set_regs32()/fill_regs32(), we have to get/set SP and LR from/to
tf_x[13] and tf_x[14].
set_regs() and fill_regs() may be called for a 32bits process, if the process
is ptrace'd from a 64bits debugger. So, in set_regs() and fill_regs(), get
or set PC and SPSR from where the debugger expects it, from tf_x[15] and
tf_x[16].
2019-06-26 16:56:56 +00:00
Mark Johnston
6137883ff3 Remove references to splbio in ffs_softdep.c.
Assert that the per-mountpoint softdep mutex is held in modified
functions that do not already have this assertion.  No functional
change intended.

Reviewed by:	kib, mckusick (previous version)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20741
2019-06-26 16:28:42 +00:00
Alexander Motin
c0c317d203 Fix qlxgbe(4) static build.
MFC after:	2 weeks
2019-06-26 16:23:24 +00:00
Marius Strobl
c2c5d1e787 o In iflib_txq_drain():
- Remove desc_used, which is only ever written to.
  - Remove a dead store to reclaimed.
  - Don't recycle avail.
  - Sort variables according to style(9).
  These changes will make a subsequent commit easier to read.
o In iflib_tx_credits_update(), don't bother checking whether the
  ift_txd_credits_update method pointer is NULL; _iflib_pre_assert()
  asserts upfront that this method has been assigned and functions
  like iflib_{fast_intr_rxtx,netmap_timer_adjust,txq_can_drain}()
  and _task_fn_tx() were already unconditionally relying on the
  method being callable.
2019-06-26 15:28:21 +00:00
Doug Moore
d1d3f7e1d1 Revert r349393, which leads to an assertion failure on bootup, in vm_map_stack_locked.
Reported by: ler@lerctr.org
Approved by: kib, markj (mentors, implicit)
2019-06-26 03:12:57 +00:00
Justin Hibbits
088c26aee8 powerpc/booke: Handle misaligned floating point loads/stores as on AIM
Misaligned floating point loads and stores are already handled for AIM, but
use the DSISR to obtain the necessary data.  Book-E does not have the DSISR,
so these fixups are not performed, leading to a SIGBUS on misaligned FP
loads or stores.  Obtain the necessary data on the Book-E side, similar to
how is done for SPE.

MFC after:	1 week
2019-06-26 01:14:39 +00:00
Cy Schubert
65f07d9976 While working on PR/238796 I discovered an unused variable in frdest,
the next hop structure. It is likely this contributes to PR/238796
though other factors remain to be investigated.

PR:		238796
MFC after:	1 week
2019-06-26 00:53:49 +00:00
Cy Schubert
2637412cbc Remove a tautological compare for offset != 0.
MFC after:	1 week
2019-06-26 00:53:46 +00:00
Cy Schubert
7f39a7e492 Prompted by r349366, ipfilter is also does not conform to RFC 3128
by dropping TCP fragments with offset = 1.

In addition to dropping these fragments, add a DTrace probe to allow
for more detailed monitoring and diagnosis if required.

MFC after:	1 week
2019-06-26 00:53:43 +00:00
Doug Moore
52499d1739 Eliminate some uses of the prev and next fields of vm_map_entry_t.
Since the only caller to vm_map_splay is vm_map_lookup_entry, move the
implementation of vm_map_splay into vm_map_lookup_helper, called by
vm_map_lookup_entry.

vm_map_lookup_entry returns the greatest entry less than or equal to a
given address, but in many cases the caller wants the least entry
greater than or equal to the address and uses the next pointer to get
to it. Provide an alternative interface to lookup,
vm_map_lookup_entry_ge, to provide the latter behavior, and let
callers use one or the other rather than having them use the next
pointer after a lookup miss to get what they really want.

In vm_map_growstack, the caller wants an entry that includes a given
address, and either the preceding or next entry depending on the value
of eflags in the first entry. Incorporate that behavior into
vm_map_lookup_helper, the function that implements all of these
lookups.

Eliminate some temporary variables used with vm_map_lookup_entry, but
inessential.

Reviewed by: markj (earlier version)
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20664
2019-06-25 20:25:16 +00:00
Julian Elischer
eb2b51ffda Fix annoying whitespace issue.
NO real change
2019-06-25 19:55:42 +00:00
Alan Somers
4f53d57e8c fcntl: style changes to r349248
Reported by:	bde
MFC after:	2 weeks
MFC-With:	349248
Sponsored by:	The FreeBSD Foundation
2019-06-25 19:44:22 +00:00
Alexander Motin
419110374a Avoid extra taskq_dispatch() calls by DMU.
DMU sync code calls taskq_dispatch() for each sublist of os_dirty_dnodes
and os_synced_dnodes.  Since the number of sublists by default is equal
to number of CPUs, it will dispatch equal, potentially large, number of
tasks, waking up many CPUs to handle them, even if only one or few of
sublists actually have any work to do.

This change adds check for empty sublists to avoid this.
2019-06-25 18:35:23 +00:00
Leandro Lupori
e2edff4167 [PowerPC64] Don't mark module data as static
Fixes panic when loading ipfw.ko and if_epair.ko built with modern compiler.

Similar to arm64 and riscv, when using a modern compiler (!gcc4.2), code
generated tries to access data in the wrong location, causing kernel panic
(data storage interrupt trap) when loading if_epair and ipfw.

Issue was reproduced with kernel/module compiled using gcc8 and clang8. It
affects both ELFv1 and ELFv2 ABI environments.

PR:		232387
Submitted by:	alfredo.junior_eldorado.org.br
Reported by:	Mark Millard
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20461
2019-06-25 17:15:44 +00:00
Warner Losh
7a3e3a2859 Remove a couple of harmless stray references to nandfs.
Submitted by: tsoome@
2019-06-25 16:39:25 +00:00
Ryan Libby
0e2464ea18 netipsec key_register: check for M_NOWAIT alloc failure
Reviewed by:	ae, cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20742
2019-06-25 15:43:52 +00:00
Hans Petter Selasky
59854ecf55 Convert all IPv4 and IPv6 multicast memberships into using a STAILQ
instead of a linear array.

The multicast memberships for the inpcb structure are protected by a
non-sleepable lock, INP_WLOCK(), which needs to be dropped when
calling the underlying possibly sleeping if_ioctl() method. When using
a linear array to keep track of multicast memberships, the computed
memory location of the multicast filter may suddenly change, due to
concurrent insertion or removal of elements in the linear array. This
in turn leads to various invalid memory access issues and kernel
panics.

To avoid this problem, put all multicast memberships on a STAILQ based
list. Then the memory location of the IPv4 and IPv6 multicast filters
become fixed during their lifetime and use after free and memory leak
issues are easier to track, for example by: vmstat -m | grep multi

All list manipulation has been factored into inline functions
including some macros, to easily allow for a future hash-list
implementation, if needed.

This patch has been tested by pho@ .

Differential Revision: https://reviews.freebsd.org/D20080
Reviewed by:	markj @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-25 11:54:41 +00:00
Hans Petter Selasky
43a9329e1b Free all allocated unit IDs in cuse(3) after the client character
devices have been destroyed to avoid creating character devices with
identical name.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-25 11:46:01 +00:00
Hans Petter Selasky
c7ffaed92e Fix for deadlock situation in cuse(3)
The final server unref should be done by the server thread to prevent
deadlock in the client cdevpriv destructor, which cannot destroy
itself.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-25 11:42:53 +00:00
Andrey V. Elsukov
019c8c9330 Follow the RFC 3128 and drop short TCP fragments with offset = 1.
Reported by:	emaste
MFC after:	1 week
2019-06-25 11:40:37 +00:00
Andrey V. Elsukov
7d4b2d5244 Mark default rule with IPFW_RULE_NOOPT flag, so it can be showed in
compact form.

MFC after:	1 week
2019-06-25 09:11:22 +00:00
Doug Moore
18cd8bb800 vm_map_protect may return an INVALID_ARGUMENT or PROTECTION_FAILURE
error response after clipping the first map entry in the region to be
reserved. This creates a pair of matching entries that should have
been "simplified" back into one, or never created. This change defers
the clipping of that entry until those two vm_map_protect failure
cases have been ruled out.

Reviewed by: alc
Approved by: markj (mentor)
Differential Revision: https://reviews.freebsd.org/D20711
2019-06-25 07:44:37 +00:00
Cy Schubert
c964c98793 The definition of icmptypes in ip_compt.h is dead code as it already
use the icmptypes in ip_icmp.h.

MFC after:	1 week
2019-06-25 07:04:47 +00:00
Warner Losh
a9154c1c83 Replay r349342 by imp accidentally reverted by r349352
Use the cam_ed copy of ata_params rather than malloc and freeing
memory for it. This reaches into internal bits of xpt a little, and
I'll clean that up later.
2019-06-25 06:14:31 +00:00
Warner Losh
296218d4cf Replay r349340 by imp accidentally reverted by r349352
Create ata_param_fixup

Create a common fixup routine to do the canonical fixup of the
ata_param fixup. Call it from both the ATA and the ATA over SCSI
paths.
2019-06-25 06:14:21 +00:00
Warner Losh
76769dc108 Replay r349339 by imp accidentally reverted by r349352
Go ahead and completely fix the ata_params before calling the veto
function. This breaks nothing that uses it in the tree since
ata_params is ignored in storvsc_ada_probe_veto which is the only
in-tree consumer.
2019-06-25 06:14:16 +00:00
Warner Losh
e5500f1efa Replay r349334 by markj accidentally reverted by r349352
Remove a lingering use of splbio().

The buffer must be locked by the caller.  No functional change
intended.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2019-06-25 06:14:00 +00:00
Warner Losh
f5a95d9a07 Remove NAND and NANDFS support
NANDFS has been broken for years. Remove it. The NAND drivers that
remain are for ancient parts that are no longer relevant. They are
polled, have terrible performance and just for ancient arm
hardware. NAND parts have evolved significantly from this early work
and little to none of it would be relevant should someone need to
update to support raw nand. This code has been off by default for
years and has violated the vnode protocol leading to panics since it
was committed.

Numerous posts to arch@ and other locations have found no actual users
for this software.

Relnotes:	Yes
No Objection From: arch@
Differential Revision: https://reviews.freebsd.org/D20745
2019-06-25 04:50:09 +00:00
Justin Hibbits
f62da49b2f powerpc: Transition to Secure-PLT, like most other OSs
Summary:
PowerPC has two PLT models: BSS-PLT and Secure-PLT.  BSS-PLT uses runtime
code generation to generate the PLT stubs.  Secure-PLT was introduced with
GCC 4.1 and Binutils 2.17 (base has GCC 4.2.1 and Binutils 2.17), and is a
more secure PLT format, using a read-only linkage table, with the dynamic
linker populating a non-executable index table.

This is the libc, rtld, and kernel support only.  The toolchain and build
parts will be updated separately.

Reviewed By: nwhitehorn, bdragon, pfg
Differential Revision: https://reviews.freebsd.org/D20598
MFC after:	1 month
2019-06-25 00:40:44 +00:00
Jayachandran C.
b0f79e328e arm64 acpi_iort: add some error handling
Print warnings for some bad kernel configurations (like NUMA disabled
with multiple domains). Check and report some firmware errors (like
incorrect proximity domain entries).

Differential Revision:	https://reviews.freebsd.org/D20416
2019-06-24 21:24:55 +00:00
Jayachandran C.
c66524f07d arm64 gicv3_its: enable all ITS blocks for a CPU
We now support multiple ITS blocks raising interrupts to a CPU.
Add all available CPUs to the ITS when no NUMA information is
available.

This reverts the check added in r340602, at that tim we did not
suppport multiple ITS blocks for a CPU.

Differential Revision:	https://reviews.freebsd.org/D20417
2019-06-24 21:13:45 +00:00
Jayachandran C.
893caf588a arm64 gic: Drop unused GICV3_IVAR_REDIST_VADDR
Now that GICV3_IVAR_REDIST is available, GICV3_IVAR_REDIST_VADDR
is unused and can be removed. Drop the define and add a comment.

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D20454
2019-06-24 21:00:28 +00:00
Warner Losh
af9727f618 Add missing include of sys/boot.h
This change was dropped out in a rebase and I didn't catch that before
I committed.
2019-06-24 20:52:21 +00:00
Warner Losh
ec9abc1843 Move to using a common kernel path between the boot / laoder bits and
the kernel.
2019-06-24 20:34:53 +00:00
Warner Losh
97ad52ca4c Use the cam_ed copy of ata_params rather than malloc and freeing
memory for it. This reaches into internal bits of xpt a little, and
I'll clean that up later.
2019-06-24 20:23:19 +00:00
Warner Losh
2afaed2d0f Create ata_param_fixup
Create a common fixup routine to do the canonical fixup of the
ata_param fixup. Call it from both the ATA and the ATA over SCSI
paths.
2019-06-24 20:18:58 +00:00
Warner Losh
161d2a1796 Go ahead and completely fix the ata_params before calling the veto
function. This breaks nothing that uses it in the tree since
ata_params is ignored in storvsc_ada_probe_veto which is the only
in-tree consumer.
2019-06-24 20:18:49 +00:00
Mark Johnston
673c1c2944 Remove a lingering use of splbio().
The buffer must be locked by the caller.  No functional change
intended.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2019-06-24 19:19:37 +00:00
Cy Schubert
51a7230a18 Clean out duplicate definitions of TCP macros also found in netinet/tcp.h.
MFC after:	1 week
2019-06-24 02:58:02 +00:00
Ian Lepore
0bab2b6e6f Add pwm devices to NOTES. 2019-06-24 02:39:56 +00:00
Ian Lepore
6e36309d83 Add gpio(4) and related drivers to NOTES. 2019-06-24 02:30:05 +00:00
Ian Lepore
2973d38a49 The gpiopps(4) driver currently has probe and attach code only for FDT based
systems, so conditionalize it accordingly in conf/files.
2019-06-24 02:27:17 +00:00
Ian Lepore
5364951d98 Build an armv7 LINT kernel in addition to armv5 LINT. You might think this
had been done years ago.  I did.  All this time we've only compiled a LINT
kernel for TARGET_ARCH=arm.  Now separate LINT-V5 and LINT-V7 configs are
generated and built.

There are two new files in arm/conf, NOTES.armv5 and NOTES.armv7, containing
some of what used to be in the arm NOTES file.  That file now contains only
the bits that are common to v5 and v7.

The makeLINT.mk file now creates the LINT-V5 and LINT-V7 files by concatening
sys/conf/NOTES, arm/conf/NOTES, and arm/conf/NOTES.armv{5,7} in that order.
2019-06-24 01:42:09 +00:00
Konstantin Belousov
d8ddb98a5e amd64 pmap: block on turnstile for lock-less DI.
Port the code to block on turnstile instead of yielding, to lock-less
delayed invalidation. The yield might cause tight loop due to priority
inversion.

Since it is impossible to avoid race between block and wake-up, arm
1-tick callout to wakeup when thread blocks itself.

Reported and tested by:	mjg
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 months
Differential revision:	https://reviews.freebsd.org/D20636
2019-06-23 21:21:11 +00:00
Ian Lepore
7a3a48426e Allow compiling ukbdmap.h on arm, since it appears to work fine. 2019-06-23 21:17:41 +00:00
Konstantin Belousov
89f2ab0608 Switch to check for effective user id in r349320, and disable dumping
into existing files for sugid processes.

Despite using real user id pronounces the intent, it actually breaks
suid coredumps, while not making any difference for non-sugid
processes.  The reason for the breakage is that non-existent core file
is created with the effective uid (unless weird hacks like SUIDDIR are
configured).

Then, if user enabled kern.sugid_coredump, core dumping should not
overwrite core files owned by effective uid, but we cannot pretend to
use real uid for dumping.

PR:	68905
admbugs:	358
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-06-23 21:15:31 +00:00
Alan Cox
22c7bcb842 pmap_enter_quick_locked() never replaces a valid mapping, so it need not
perform a TLB invalidation.  A barrier suffices.  (See r343876.)

Add a comment to pmap_enter_quick_locked() in order to highlight the
fact that it does not replace valid mappings.

Correct a typo in one of pmap_enter()'s comments.

MFC after:	1 week
2019-06-23 21:06:56 +00:00
Alexander Motin
53f5ac1310 Improve AHCI Enclosure Management and SES interoperation.
Since SES specs do not define mechanism to map enclosure slots to SATA
disks, AHCI EM code I written many years ago appeared quite useless,
that always bugged me.  I was thinking whether it was a good idea, but
if LSI HBAs do that, why I shouldn't?

This change introduces simple non-standard mechanism for the mapping
into both AHCI EM and SES code, that makes AHCI EM on capable controllers
(most of Intel's) a first-class SES citizen, allowing it to report disk
physical path to GEOM, show devices inserted into each enclosure slot in
`sesutil map` and `getencstat`, control locate and fault LEDs for specific
devices with `sesutil locate adaX on` and `sesutil fault adaX on`, etc.

I've successfully tested this on Supermicro X10DRH-i motherboard connected
with sideband cable of its S-SATA Mini-SAS connector to SAS815TQ backplane.
It can indicate with LEDs Locate, Fault and Rebuild/Remap SES statuses for
each disk identical to real SES of Supermicro SAS2 backplanes.

MFC after:	2 weeks
2019-06-23 19:05:01 +00:00
Konstantin Belousov
7a29e0bf96 coredump: avoid writing to core files not owned by the real user.
Reported by: blake frantz <trew@hick.org>
PR:	68905
admbugs:	358
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-06-23 18:35:11 +00:00
Ian Lepore
ac6a9e474f Add some i2c slave-device drivers that were missing from NOTES. 2019-06-23 17:39:13 +00:00
Ian Lepore
9026f4b86d The sy8106a and syr827 drviers require FDT and the ext_resources subsystem. 2019-06-23 17:38:30 +00:00
Ian Lepore
48fedd0960 Add the rtc8583 driver to conf/files. Also, move sy8106a from
file.allwinner to conf/files... it's not allwinner-specific, some day
other platforms could use the same regulator chip.
2019-06-23 17:23:56 +00:00
Ian Lepore
5cafc16207 Remove some unused header files from the ad7418 driver. 2019-06-23 17:20:39 +00:00
Alexander Motin
6d4d657360 Decouple enc/ses verbosity from bootverbose.
I don't want to be regularly notified that my enclosure violates standards
until there is some real problem I want to debug.

MFC after:	2 weeks
2019-06-22 19:09:10 +00:00
Alan Cox
36c5a4cb3f Introduce pmap_remove_l3_range() and use it in two places:
(1) pmap_remove(), where it eliminates redundant TLB invalidations by
pmap_remove() and pmap_remove_l3(), and (2) pmap_enter_l2(), where it may
optimize the TLB invalidations by batching them.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D12725
2019-06-22 16:26:38 +00:00
Ryan Libby
a6d2a24c3e ddb show proc typo 2019-06-22 05:35:23 +00:00
Alexander Motin
b8038d7827 Remove ancient SCSI-2/3 mentioning.
MFC after:	2 weeks
2019-06-22 03:50:43 +00:00
Eric van Gyzen
df8406543f VirtIO SCSI: validate seg_max on attach
Until r349278, bhyve presented a seg_max to the guest that was too large.
Detect this case and clamp it to the virtqueue size.  Otherwise, we would
fail the "too many segments to enqueue" assertion in virtqueue_enqueue().

I hit this by running a guest with a MAXPHYS of 256 KB.

Reviewed by:	bryanv cem
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20703
2019-06-22 01:20:45 +00:00
Alexander Motin
6805c9b74d Make ELEMENT INDEX validation more strict.
SES specifications tell: "The Additional Element Status descriptors shall
be in the same order as the status elements in the Enclosure Status
diagnostic page".  It allows us to question ELEMENT INDEX that is lower
then values we already processed.  There are many SAS2 enclosures with
this kind of problem.

While there, add more specific error messages for cases when ELEMENT INDEX
is obviously wrong.  Also skip elements with INVALID bit set.

MFC after:	2 weeks
2019-06-22 01:06:41 +00:00
Scott Long
0feb46b0c6 Refactor xpt_getattr() to make it more readable. No outwardly
visible functional changes, though code flow was modified a bit
internally to lessen the need for goto jumps and chained if
conditionals.
2019-06-21 23:40:26 +00:00
Alexander Motin
7318fcb51d Fix individual_element_index when some type has 0 elements.
When some type has 0 elements, saved_individual_element_index was set
to -1 on second type bump, since individual_element_index was not
restored after the first.  To me it looks easier just to increment
saved_individual_element_index separately than think when to save it.

MFC after:	2 weeks
2019-06-21 23:29:16 +00:00
Alan Somers
1bb957296b Reduce namespace pollution from r349233
Define __daddr_t in _types.h and use it in filio.h

Reported by:	ian, bde
Reviewed by:	ian, imp, cem
MFC after:	2 weeks
MFC-With:	349233
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20715
2019-06-21 21:50:14 +00:00
Johannes Lundberg
6425fed7e6 LinuxKPI: Additions to rcu list.
- Add rcu list functions.
- Make rcu hlist's foreach macro use rcu calls instead of the non-rcu macro.
- Bump FreeBSD version so we have a checkpoint for the vboxvideo drm driver.

Reviewed by:	hps
Approved by:	imp (mentor), hps
MFC after:	1 week
Differential Revision:	D20719
2019-06-21 18:48:07 +00:00
Johannes Lundberg
62260f68b4 LinuxKPI: Add atomic_long_sub macro.
Reviewed by:	imp (mentor), hps
Approved by:	imp (mentor), hps
MFC after:	1 week
Differential Revision:	D20718
2019-06-21 16:43:16 +00:00
Ian Lepore
83b319101f Add pwm to the armv7 GENERIC kernel, it's now used by TI and Allwinner. 2019-06-21 15:44:58 +00:00
Ian Lepore
ef558a1078 Add support for the PWM(9) API. This allows configuring the pwm output using
pwm(9), but also maintains the historical sysctl config interface for
compatiblity with existing apps.  The two config systems are not compatible
with each other; if you use both interfaces to change configurations you're
likely to end up with incorrect output or none at all.
2019-06-21 14:24:33 +00:00
Ian Lepore
3103a7eef7 Some mundane tweaks and cleanups to help de-clutter the diffs of some
upcoming functional changes.

Add an ofw_compat_data table for probing compat strings, and use it to add
PNP data.  Remove some stray semicolons at the end of macro definitions,
and add a PWM_LOCK_ASSERT macro to round out the usual suite.  Move the
device_t and driver_methods structs to the end of the file.  Tweak comments.
2019-06-21 14:01:02 +00:00
Ed Maste
b72236b407 nandsim: correct test to avoid out-of-bounds access
Previously nandsim_chip_status returned EINVAL iff both of user-provided
chip->ctrl_num and chip->num were out of bounds.  If only one failed the
bounds check arbitrary memory would be read and returned.

The NAND framework is not built by default, nandsim is not intended for
production use (it is a simulator), and the nandsim device has root-only
permissions.

admbugs:	827
Reported by:	Daniel Hodson of elttam
MFC after:	3 days
Security:	kernel information leak or DoS
Sponsored by:	The FreeBSD Foundation
2019-06-21 13:42:40 +00:00
Andrey V. Elsukov
978f2d1728 Add "tcpmss" opcode to match the TCP MSS value.
With this opcode it is possible to match TCP packets with specified
MSS option, whose value corresponds to configured in opcode value.
It is allowed to specify single value, range of values, or array of
specific values or ranges. E.g.

 # ipfw add deny log tcp from any to any tcpmss 0-500

Reviewed by:	melifaro,bcr
Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2019-06-21 10:54:51 +00:00
Kristof Provost
05fc9d78d7 ip_output: pass PFIL_FWD in the slow path
If we take the slow path for forwarding we should still tell our
firewalls (hooked through pfil(9)) that we're forwarding. Pass the
ip_output() flags to ip_output_pfil() so it can set the PFIL_FWD flag
when we're forwarding.

MFC after:	1 week
Sponsored by:	Axiado
2019-06-21 07:58:08 +00:00
Conrad Meyer
c363b16c63 sys: Remove DEV_RANDOM device option
Remove 'device random' from kernel configurations that reference it (most).
Replace perhaps mistaken 'nodevice random' in two MIPS configs with 'options
RANDOM_LOADABLE' instead.  Document removal in UPDATING; update NOTES and
random.4.

Reviewed by:	delphij, markm (previous version)
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D19918
2019-06-21 00:16:30 +00:00
Takanori Watanabe
a809abd44a Fix the case where no root hub object while host controller object exist in ACPI namespace.
Also you can disable ACPI support for USB by setting
debug.acpi.disabled="usb"

PR:	238711
2019-06-20 23:52:33 +00:00
Alan Somers
38b06f8ac4 fcntl: fix overflow when setting F_READAHEAD
VOP_READ and VOP_WRITE take the seqcount in blocks in a 16-bit field.
However, fcntl allows you to set the seqcount in bytes to any nonnegative
31-bit value. The result can be a 16-bit overflow, which will be
sign-extended in functions like ffs_read. Fix this by sanitizing the
argument in kern_fcntl. As a matter of policy, limit to IO_SEQMAX rather
than INT16_MAX.

Also, fifos have overloaded the f_seqcount field for a completely different
purpose ever since r238936.  Formalize that by using a union type.

Reviewed by:	cem
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20710
2019-06-20 23:07:20 +00:00
Alexander Motin
68035f6381 SPC-3 and up require some UAs to be returned as fixed.
MFC after:	2 weeks
2019-06-20 22:20:30 +00:00
Alexander Motin
35a9ffc350 Optimize xpt_getattr().
Do not allocate temporary buffer for attributes we are going to return
as-is, just make sure to NUL-terminate them.  Do not zero temporary 64KB
buffer for CDAI_TYPE_SCSI_DEVID, XPT tells us how much data it filled
and there are also length fields inside the returned data also.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-06-20 20:29:42 +00:00
Navdeep Parhar
17795d8234 cxgbe/t4_tom: DDP_DEAD is a ddp flag and not a toepcb flag.
The driver was in effect setting TPF_ABORT_SHUTDOWN on the toepcb
instead of what was intended.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-06-20 20:06:19 +00:00
Brooks Davis
74a1b66cf4 Extend mmap/mprotect API to specify the max page protections.
A new macro PROT_MAX() alters a protection value so it can be OR'd with
a regular protection value to specify the maximum permissions.  If
present, these flags specify the maximum permissions.

While these flags are non-portable, they can be used in portable code
with simple ifdefs to expand PROT_MAX() to 0.

This change allows (e.g.) a region that must be writable during run-time
linking or JIT code generation to be made permanently read+execute after
writes are complete.  This complements W^X protections allowing more
precise control by the programmer.

This change alters mprotect argument checking and returns an error when
unhandled protection flags are set.  This differs from POSIX (in that
POSIX only specifies an error), but is the documented behavior on Linux
and more closely matches historical mmap behavior.

In addition to explicit setting of the maximum permissions, an
experimental sysctl vm.imply_prot_max causes mmap to assume that the
initial permissions requested should be the maximum when the sysctl is
set to 1.  PROT_NONE mappings are excluded from this for compatibility
with rtld and other consumers that use such mappings to reserve
address space before mapping contents into part of the reservation.  A
final version this is expected to provide per-binary and per-process
opt-in/out options and this sysctl will go away in its current form.
As such it is undocumented.

Reviewed by:	emaste, kib (prior version), markj
Additional suggestions from:	alc
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D18880
2019-06-20 18:24:16 +00:00
Alan Somers
9de14ff010 #include <sys/types.h> from sys/filio.h
This fixes world build after r349231

Reported by:	Jenkins
MFC after:	2 weeks
MFC-With:	349231
Sponsored by:	The FreeBSD Foundation
2019-06-20 14:35:28 +00:00
Alan Somers
d49b446bfb Add FIOBMAP2 ioctl
This ioctl exposes VOP_BMAP information to userland. It can be used by
programs like fragmentation analyzers and optimized cp implementations. But
I'm using it to test fusefs's VOP_BMAP implementation. The "2" in the name
distinguishes it from the similar but incompatible FIBMAP ioctls in NetBSD
and Linux.  FIOBMAP2 differs from FIBMAP in that it uses a 64-bit block
number instead of 32-bit, and it also returns runp and runb.

Reviewed by:	mckusick
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20705
2019-06-20 14:13:10 +00:00
Alan Somers
d01752c703 Add a VOP_BMAP(9) man page
Reviewed by:	mckusick
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20704
2019-06-20 13:59:46 +00:00
Alexander Motin
f91aa773be Add wakeup_any(), cheaper wakeup_one() for taskqueue(9).
wakeup_one() and underlying sleepq_signal() spend additional time trying
to be fair, waking thread with highest priority, sleeping longest time.
But in case of taskqueue there are many absolutely identical threads, and
any fairness between them is quite pointless.  It makes even worse, since
round-robin wakeups not only make previous CPU affinity in scheduler quite
useless, but also hide from user chance to see CPU bottlenecks, when
sequential workload with one request at a time looks evenly distributed
between multiple threads.

This change adds new SLEEPQ_UNFAIR flag to sleepq_signal(), making it wakeup
thread that went to sleep last, but no longer in context switch (to avoid
immediate spinning on the thread lock).  On top of that new wakeup_any()
function is added, equivalent to wakeup_one(), but setting the flag.
On top of that taskqueue(9) is switchied to wakeup_any() to wakeup its
threads.

As result, on 72-core Xeon v4 machine sequential ZFS write to 12 ZVOLs
with 16KB block size spend 34% less time in wakeup_any() and descendants
then it was spending in wakeup_one(), and total write throughput increased
by ~10% with the same as before CPU usage.

Reviewed by:	markj, mmacy
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D20669
2019-06-20 01:15:33 +00:00
Mark Johnston
ee1f168540 Group vm_page_activate()'s definition with other related functions.
No functional change intended.

MFC after:	3 days
2019-06-19 21:36:00 +00:00
Alexander Motin
49ee0fcea5 Use sbuf_cat() in GEOM confxml generation.
When it comes to megabytes of text, difference between sbuf_printf() and
sbuf_cat() becomes substantial.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-06-19 15:36:02 +00:00