Commit Graph

127718 Commits

Author SHA1 Message Date
Ryan Libby
3bb6e0f0c7 g_eli_create: only dec g_access acw if we inc'd it
Reviewed by:	cem, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20743
2019-07-01 22:06:16 +00:00
Alan Cox
b6ce9ba9c3 Tidy up pmap_copy(). Notably, deindent the innermost loop by making a
simple change to the control flow.  Replace an unnecessary test by a
KASSERT.  Add a comment explaining an obscure test.

Reviewed by:	kib, markj
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D20812
2019-07-01 22:00:42 +00:00
Emmanuel Vadot
a4e0b5a471 Since r349571 we need all the accessor to be present for set or get
otherwise we panic.
dwmmc don't handle VCCQ (voltage for the IO line of the SD/eMMC) or
TIMING.
Add the needed accessor in the {read,write}_ivar functions.

Reviewed by:	imp (previous version)
2019-07-01 21:50:53 +00:00
Rick Macklem
555d8f2859 Factor out the code that does a VOP_SETATTR(size) from vn_truncate().
This patch factors the code in vn_truncate() that does the actual
VOP_SETATTR() of size into a separate function called vn_truncate_locked().
This will allow the NFS server and the patch that adds a
copy_file_range(2) syscall to call this function instead of duplicating
the code and carrying over changes, such as the recent r347151.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D20808
2019-07-01 20:41:43 +00:00
Vincenzo Maffione
23ced94451 netmap: fix two panics with emulated adapter
This patch fixes 2 panics. The first one is due to the current VNET not
being set in the emulated adapter transmission path. The second one
is caused by the M_PKTHDR flag not being set when preallocated mbufs
are recycled in the transmit path.

Submitted by:	aleksandr.fedorov@itglobal.com
Reviewed by:	vmaffione
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20824
2019-07-01 20:37:35 +00:00
Andriy Gapon
e3722b788e add superio driver
The goal of this driver is consolidate information about SuperIO chips
and to provide for peaceful coexistence of drivers that need to access
SuperIO configuration registers.

While SuperIO chips can host various functions most of them are
discoverable and accessible without any knowledge of the SuperIO.
Examples are: keyboard and mouse controllers, UARTs, floppy disk
controllers.  SuperIO-s also provide non-standard functions such as
GPIO, watchdog timers and hardware monitoring.  Such functions do
require drivers with a knowledge of a specific SuperIO.

At this time the driver supports a number of ITE and Nuvoton (fka
Winbond) SuperIO chips.
There is a single driver for all devices.  So, I have not done the usual
split between the hardware driver and the bus functionality.  Although,
superio does act as a bus for devices that represent known non-standard
functions of a SuperIO chip.  The bus provides enumeration of child
devices based on the hardcoded knowledge of such functions.  The
knowledge as extracted from datasheets and other drivers.
As there is a single driver, I have not defined a kobj interface for it.
So, its interface is currently made of simple functions.
I think that we can the flexibility (and complications) when we actually
need it.

I am planning to convert nctgpio and wbwd to superio bus very soon.
Also, I am working on itwd driver (watchdog in ITE SuperIO-s).
Additionally, there is ithwm driver based on the reverted sensors
import, but I am not sure how to integrate it given that we still lack
any sensors interface.

Discussed with:	imp, jhb
MFC after:	7 weeks
Differential Revision: https://reviews.freebsd.org/D8175
2019-07-01 17:05:41 +00:00
Andriy Gapon
0222625608 nctgpio: change default pin names to those used by the datasheet(s)
That is, instead of the current GPIO00 - GPIO15 the names will be GPIO00
- GPIO07, GPIO10 - GPIO17.  The first digit is a GPIO "bank" / group
number and the second one is a pin number within the bank.  Alternative
view is that the pin names are changed from decimal numbering scheme to
octal one (as there are 8 pins per bank).

Discussed with:	cem, gonzo
MFC after:	2 weeks
2019-07-01 15:43:48 +00:00
Luiz Otavio O Souza
9aba06377d Add support for the Marvell 88E6190 11 ports switch.
With more ports, some of the registers are shifted a bit to accommodate.

This switch also adds two high speed Serdes/SGMII interfaces (2.5 Gb/s).

Sponsored by:	Rubicon Communications, LLC (Netgate)
2019-07-01 13:41:37 +00:00
Andriy Gapon
3e7bae0821 upgrade the warning printf-s in bus accessors to KASSERT-s, take 2
After this change sys/bus.h includes sys/systm.h when _KERNEL is
defined.
This brings back r349459 but with systm.h hidden from userland.

MFC after:	2 weeks
2019-07-01 06:22:41 +00:00
Cy Schubert
23cfb1b256 The RFC 3128 test should be made after the offset mask has been applied.
Reported by:	christos@NetBSD.org
X-MFC with:	r349399
2019-06-30 22:32:33 +00:00
Cy Schubert
a9a131902d Revert r349400. It has uintended effects.
Reported by:	christos@NetBSD.org
X-MFC with:	r349400.
2019-06-30 22:27:58 +00:00
Navdeep Parhar
57f317e60a Display the approximate space needed when a minidump fails due to lack
of space.

Reviewed by:	kib@
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D20801
2019-06-30 03:14:04 +00:00
Doug Moore
5201cbabf5 Remove a call to vm_map_simplify_entry from _vm_map_clip_start.
Recent changes to vm_map_protect have made it unnecessary.

Reviewed by: alc
Approved by: kib (mentor)
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D20633
2019-06-30 02:08:13 +00:00
Mark Johnston
7c3703a694 Use a consistent snapshot of the fd's rights in fget_mmap().
fget_mmap() translates rights on the descriptor to a VM protection
mask.  It was doing so without holding any locks on the descriptor
table, so a writer could simultaneously be modifying those rights.
Such a situation would be detected using a sequence counter, but
not before an inconsistency could trigger assertion failures in
the capability code.

Fix the problem by copying the fd's rights to a structure on the stack,
and perform the translation only once we know that that snapshot is
consistent.

Reported by:	syzbot+ae359438769fda1840f8@syzkaller.appspotmail.com
Reviewed by:	brooks, mjg
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20800
2019-06-29 16:11:09 +00:00
Mark Johnston
02476c44c5 Fix mutual exclusion in pipe_direct_write().
We use PIPE_DIRECTW as a semaphore for direct writes to a pipe, where
the reader copies data directly from pages mapped into the writer.
However, when a reader finishes such a copy, it previously cleared
PIPE_DIRECTW, allowing multiple writers to race and corrupt the state
used to track wired pages belonging to the writer.

Fix this by having the writer clear PIPE_DIRECTW and instead use the
count of unread bytes to determine whether a write is finished.

Reported by:	syzbot+21811cc0a89b2a87a9e7@syzkaller.appspotmail.com
Reviewed by:	kib, mjg
Tested by:	pho
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20784
2019-06-29 16:05:52 +00:00
John Baldwin
e37240f9f3 Add support for IFCAP_NOMAP to mlx5(4).
Since mlx5 uses bus_dma, this only required adding the capability
flag.

Submitted by:	gallatin
Reviewed by:	gallatin, hselasky, rrs
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:53:07 +00:00
John Baldwin
d76bbe175a Add support for IFCAP_NOMAP to cxgbe(4).
Since cxgbe(4) uses sglist instead of bus_dma, this required updates
to the code that generates scatter/gather lists for packets.  Also,
unmapped mbufs are always sent via DMA and never as immediate data in
the payload of a work request.

Submitted by:	gallatin (earlier version)
Reviewed by:	gallatin, hselasky, rrs
Discussed with:	np
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:52:21 +00:00
John Baldwin
66d0c056be Support IFCAP_NOMAP in vlan(4).
Enable IFCAP_NOMAP for a vlan interface if it is supported by the
underlying trunk device.

Reviewed by:	gallatin, hselasky, rrs
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:51:38 +00:00
John Baldwin
3807631b8e Compress pending socket buffer data once it is marked ready.
Apply similar logic from sbcompress to pending data in the socket
buffer once it is marked ready via sbready.  Normally sbcompress
merges small mbufs to reduce the length of mbuf chains in the socket
buffer.  However, sbcompress cannot do this for mbufs marked
M_NOTREADY.  sbcompress_ready is now called from sbready when mbufs
are marked ready to merge small mbuf chains once the data is available
to copy.

Submitted by:	gallatin (earlier version)
Reviewed by:	gallatin, hselasky, rrs
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:50:25 +00:00
John Baldwin
cec06a3edc Add support for using unmapped mbufs with sendfile(2).
This can be enabled at runtime via the kern.ipc.mb_use_ext_pgs sysctl.
It is disabled by default.

Submitted by:	gallatin (earlier version)
Reviewed by:	gallatin, hselasky, rrs
Relnotes:	yes
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:49:35 +00:00
John Baldwin
82334850ea Add an external mbuf buffer type that holds multiple unmapped pages.
Unmapped mbufs allow sendfile to carry multiple pages of data in a
single mbuf, without mapping those pages.  It is a requirement for
Netflix's in-kernel TLS, and provides a 5-10% CPU savings on heavy web
serving workloads when used by sendfile, due to effectively
compressing socket buffers by an order of magnitude, and hence
reducing cache misses.

For this new external mbuf buffer type (EXT_PGS), the ext_buf pointer
now points to a struct mbuf_ext_pgs structure instead of a data
buffer.  This structure contains an array of physical addresses (this
reduces cache misses compared to an earlier version that stored an
array of vm_page_t pointers).  It also stores additional fields needed
for in-kernel TLS such as the TLS header and trailer data that are
currently unused.  To more easily detect these mbufs, the M_NOMAP flag
is set in m_flags in addition to M_EXT.

Various functions like m_copydata() have been updated to safely access
packet contents (using uiomove_fromphys()), to make things like BPF
safe.

NIC drivers advertise support for unmapped mbufs on transmit via a new
IFCAP_NOMAP capability.  This capability can be toggled via the new
'nomap' and '-nomap' ifconfig(8) commands.  For NIC drivers that only
transmit packet contents via DMA and use bus_dma, adding the
capability to if_capabilities and if_capenable should be all that is
required.

If a NIC does not support unmapped mbufs, they are converted to a
chain of mapped mbufs (using sf_bufs to provide the mapping) in
ip_output or ip6_output.  If an unmapped mbuf requires software
checksums, it is also converted to a chain of mapped mbufs before
computing the checksum.

Submitted by:	gallatin (earlier version)
Reviewed by:	gallatin, hselasky, rrs
Discussed with:	ae, kp (firewalls)
Relnotes:	yes
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:48:33 +00:00
Alan Cox
c134ef742f When we protect PTEs (as opposed to PDEs), we only call vm_page_dirty()
when, in fact, we are write protecting the page and the PTE has PG_M set.
However, pmap_protect_pde() was always calling vm_page_dirty() when the PDE
has PG_M set.  So, adding PG_NX to a writeable PDE could result in
unnecessary (but harmless) calls to vm_page_dirty().

Simplify the loop calling vm_page_dirty() in pmap_protect_pde().

Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D20793
2019-06-28 22:40:34 +00:00
Hans Petter Selasky
f48c41accd Need to apply the PCIM_BAR_MEM_BASE mask to the physical memory
address before returning it to the user. Some of the least significant
bits have special meaning and should be masked away.

Discussed with:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-06-28 22:28:51 +00:00
Luiz Otavio O Souza
d7cecbd179 Add the 802.1q support for the Marvell e6000 series of ethernet switches.
Tested on:	espressobin, Clearfog, SG-3100 and others
Sponsored by:	Rubicon Communications, LLC (Netgate)
2019-06-28 22:19:50 +00:00
Luiz Otavio O Souza
4e4cedb00b Add the 'drop tagged' flag support for ethernet switch ports.
This is intended to drop all 802.1q tagged packets on a port.

Sponsored by:	 Rubicon Communications, LLC (Netgate)
2019-06-28 22:12:43 +00:00
Konstantin Belousov
2d7a555294 Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-06-28 20:40:54 +00:00
Navdeep Parhar
8674e626c6 cxgbe/t4_tom: Tweaks to some of the AIO related CTRs.
Reviewed by:	jhb@
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-06-28 19:57:42 +00:00
Navdeep Parhar
74a155edb0 cxgbe/t4_tom: the AIO tx job queue must be empty by the time the driver
releases the offload resources associated with the tid.

Reviewed by:	jhb@
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D20798
2019-06-28 19:27:45 +00:00
Hans Petter Selasky
0dbdf04125 Need to wait for epoch callbacks to complete before detaching a
network interface.

This particularly manifests itself when an INP has multicast options
attached during a network interface detach. Then the IPv4 and IPv6
leave group call which results from freeing the multicast address, may
access a freed ifnet structure. These are the steps to reproduce:

service mdnsd onestart # installed from ports

ifconfig epair create
ifconfig epair0a 0/24 up
ifconfig epair0a destroy

Tested by:	pho @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-28 10:49:04 +00:00
Hans Petter Selasky
131b2b7658 Implement API for draining EPOCH(9) callbacks.
The epoch_drain_callbacks() function is used to drain all pending
callbacks which have been invoked by prior epoch_call() function calls
on the same epoch. This function is useful when there are shared
memory structure(s) referred to by the epoch callback(s) which are not
refcounted and are rarely freed. The typical place for calling this
function is right before freeing or invalidating the shared
resource(s) used by the epoch callback(s). This function can sleep and
is not optimized for performance.

Differential Revision: https://reviews.freebsd.org/D20109
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-28 10:38:56 +00:00
Navdeep Parhar
d49be2a696 cxgbe/t4_tom: Mark the socket's receive as done before calling
handle_ddp_close.

This eliminates a bad race where an aio_ddp_requeue that happened to run
after handle_ddp_close could bump up the active count.

Discussed with:	jhb@
MFC after:	3 days
Sponsored by:	Chelsio Communications
2019-06-28 04:02:56 +00:00
Navdeep Parhar
b7acf27c2e cxgbe/t4_tom: Fix regression in t_maxseg usage within t4_tom.
t_maxseg was changed in r293284 to not have any adjustment for TCP
timestamps.  t4_tom inadvertently went back to pre-r293284 semantics
in r332506.

Sponsored by:	Chelsio Communications
2019-06-28 02:41:17 +00:00
Navdeep Parhar
24a508820c cxgbe/iw_cxgbe: Remove unused field from the endpoint structure.
MFC after:	3 days
2019-06-28 02:21:42 +00:00
Doug Moore
a72dce340d If vm_map_protect fails with KERN_RESOURCE_SHORTAGE, be sure to
simplify modified entries before returning.

Reviewed by: alc, markj (earlier version), kib (earlier version)
Approved by: kib, markj (mentors, implicit)
Differential Revision: https://reviews.freebsd.org/D20753
2019-06-28 02:14:54 +00:00
Rebecca Cran
a852cb9596 Add ACPI entries for Synopsys Designware UARTs used on ARM platforms
This fixes (userspace) console on the Marvell MACCHIATObin in ACPI mode with
latest TianoCore EDK2 firmware.

Submitted by:	Greg V <greg@unrelenting.technology>
Reviewed by:	mw, bcran
Differential Revision:	https://reviews.freebsd.org/D20765
2019-06-28 01:19:08 +00:00
Rebecca Cran
b879578268 Add missing ACPI GICv2 MSI/MSI-X attachment
This lets PCIe MSI-X device interrupts work on the MACCHIATObin
(Marvell Armada 8k), which allows e.g. the Intel igb NIC to fully work.

Submitted by:	Greg V <greg@unrelenting.technology>
Reviewed by:	mw, bcran
Differential Revision:	https://reviews.freebsd.org/D20775
2019-06-28 01:17:33 +00:00
Mitchell Horne
c9207d3d11 Add some missing RISC-V ELF defines
This adds defines for the RISC-V specific e_flags values, and some of
the missing static relocations.

Reviewed by:	markj
Approved by:	markj (mentor)
Differential Revision:	https://reviews.freebsd.org/D20766
2019-06-28 00:03:29 +00:00
Alan Somers
0cfc1ef38d FIOBMAP2: inline vn_ioc_bmap2
Reported by:	kib
Reviewed by:	kib
MFC after:	2 weeks
MFC-With:	349238
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20783
2019-06-27 23:39:06 +00:00
Rick Macklem
e368095437 Add non-blocking trylock variants for the rangelock functions.
A future patch that will add a Linux compatible copy_file_range(2) syscall
needs to be able to lock the byte ranges of two files concurrently.
To do this without a risk of deadlock, a non-blocking variant of
vn_rangelock_rlock() called vn_rangelock_tryrlock() was needed.
This patch adds this, along with vn_rangelock_trywlock(), in order to
do this.
The patch also adds a couple of comments, that I hope clarify how the
algorithm used in kern_rangelock.c works.

Reviewed by:	kib, asomers (previous version)
Differential Revision:	https://reviews.freebsd.org/D20645
2019-06-27 23:10:40 +00:00
John Baldwin
1db2626a9b Fix comment in sofree() to reference sbdestroy().
r160875 added sbdestroy() as a wrapper around sbrelease_internal to be
called from sofree(), yet the comment added in the same revision to
sofree() still mentions sbrelease_internal().

Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20488
2019-06-27 22:50:11 +00:00
John Baldwin
6b69072acc Reject attempts to register a TCP stack being unloaded.
Reviewed by:	gallatin
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20617
2019-06-27 22:34:05 +00:00
Li-Wen Hsu
404e646960 Follow r349460 to complete removing "flags" in struct gpiobus_ivar
MFC with:	r349460
Sponsored by:	The FreeBSD Foundation
2019-06-27 22:18:21 +00:00
John Baldwin
7f63b888c7 Hold an explicit reference on the socket for the aiotx task.
Previously, the aiotx task relied on the aio jobs in the queue to hold
a reference on the socket.  However, when the last job is completed,
there is nothing left to hold a reference to the socket buffer lock
used to check if the queue is empty.  In addition, if the last job on
the queue is cancelled, the task can run with no queued jobs holding a
reference to the socket buffer lock the task uses to notice the queue
is empty.

Fix these races by holding an explicit reference on the socket when
the task is queued and dropping that reference when the task
completes.

Reviewed by:	np
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D20539
2019-06-27 19:36:30 +00:00
Ruslan Bukin
2593e9dcb2 Add support for extended descriptor format to Altera mSGDMA driver.
The format to use depends on hardware configuration (synthesis-time),
so make it compile-time kernel option.

Extended format allows DMA engine to operate with 64-bit memory addresses.

Sponsored by:	DARPA, AFRL
2019-06-27 18:08:18 +00:00
Andriy Gapon
f45e9414cf revert r349460, printf -> KASSERT in bus.h, until I can fix it
I tested only kernel builds naively assuming that sys/bus.h cannot
affect userland builds.

Pointyhat to:	me
2019-06-27 15:51:50 +00:00
Andriy Gapon
061b38cdcc gpiobus: provide a new hint, pin_list
"pin_list" allows to specify child pins as a list of pin numbers.
Existing hint "pins" serves the same purpose but with a 32-bit wide bit
mask.  One problem with that is that a controller can have more than 32
pins.  One example is amdgpio.  Also, a list of numbers is a little bit
more human friendly than a matching bit mask.  As a side note, it seems
that in FDT pins are typically specified by their numbers as well.

This commit also adds accessors for instance variables (IVARs) that
define the child pins.  My primary goal is to allow a child to be
configured programmatically rather than via hints (assuming that FDT is
not supported on a platform).  Also, while a child should not care about
specific pin numbers that are allocated to it, it could be interested in
how many were actually assigned to it.

While there, I removed "flags" instance variable.  It was unused.

Reviewed by:	mizhka
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D20459
2019-06-27 15:46:06 +00:00
Andriy Gapon
d55fcc487e upgrade the warning printf-s in bus accessors to KASSERT-s
After this change sys/bus.h includes sys/systm.h.

Discussed with:	cem, imp
MFC after:	2 weeks
2019-06-27 15:07:06 +00:00
Olivier Houchard
84322e3ee3 In get_fpcontext32() and set_fpcontext32(), we can't just use memcpy() to
copy the VFP registers.
arvm7 VFP uses 32 64bits fp registers (but those could be used in pairs to
make 16 128bits registers), while aarch64 uses 32 128bits fp registers, so
we have to copy the value of each register.
2019-06-26 22:06:40 +00:00
Alan Cox
1d3423d914 Revert one of the changes from r349323. Specifically, undo the change
that replaced a pmap_invalidate_page() with a dsb(ishst) in
pmap_enter_quick_locked().  Even though this change is in principle
correct, I am seeing occasional, spurious bus errors that are only
reproducible without this pmap_invalidate_page().  (None of adding an
isb, "upgrading" the dsb to wait on loads as well as stores, or
disabling superpage mappings eliminates the bus errors.)  Add an XXX
comment explaining why the pmap_invalidate_page() is being performed.

Discussed with:	     andrew, markj
2019-06-26 21:43:41 +00:00
Rodney W. Grimes
e4da41f932 Emulate the "TEST r/m{16,32,64}, imm{16,32,32}" instructions (opcode F7H).
This adds emulation for:
	test r/m16, imm16
	test r/m32, imm32
	test r/m64, imm32 sign-extended to 64

OpenBSD guests compiled with clang 8.0.0 use TEST directly against a
Local APIC register instead of separate read via MOV followed by a
TEST against the register.

PR:		238794
Submitted by:	jhb
Reported by:	Jason Tubnor jason@tubnor.net
Tested by:	Jason Tubnor jason@tubnor.net
Reviewed by:	markj, Patrick Mooney patrick.mooney@joyent.com
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D20755
2019-06-26 21:19:43 +00:00
Andriy Gapon
b66ed8ee28 fix up r349428, fix a typo made during "fdt" removal
Reported by:	ian
MFC after:	11 days
2019-06-26 17:38:38 +00:00
Mark Johnston
0fd977b3fa Add a return value to vm_page_remove().
Use it to indicate whether the page may be safely freed following
its removal from the object.  Also change vm_page_remove() to assume
that the page's object pointer is non-NULL, and have callers perform
this check instead.

This is a step towards an implementation of an atomic reference counter
for each physical page structure.

Reviewed by:	alc, dougm, kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20758
2019-06-26 17:37:51 +00:00
Andriy Gapon
926c3367c8 owc_gpiobus: clean / fix up the driver module things
"fdt" is removed from the driver module name as the driver does not
require FDT and can work very well on hints based systems.

A module dependency is added for gpiobus.  Without that owc cannot
resolve symbols in gpiobus if both are loaded as kernel modules.

Finally, a driver module module version is added.

Reviewed by:	imp
MFC after:	11 days
2019-06-26 17:17:33 +00:00
Konstantin Belousov
7256d0fcfd amd64 pmap: Fix pkru handling in pmap_remove().
When pmap_pkru_on_remove() is called, the sva argument value was
advanced.  Clear PKRU earlier when sva still specifies the start of
the region.

Noted and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-06-26 17:16:26 +00:00
Olivier Houchard
b726d74fce Fix debugging of 32bits arm binaries on arm64.
In set_regs32()/fill_regs32(), we have to get/set SP and LR from/to
tf_x[13] and tf_x[14].
set_regs() and fill_regs() may be called for a 32bits process, if the process
is ptrace'd from a 64bits debugger. So, in set_regs() and fill_regs(), get
or set PC and SPSR from where the debugger expects it, from tf_x[15] and
tf_x[16].
2019-06-26 16:56:56 +00:00
Mark Johnston
6137883ff3 Remove references to splbio in ffs_softdep.c.
Assert that the per-mountpoint softdep mutex is held in modified
functions that do not already have this assertion.  No functional
change intended.

Reviewed by:	kib, mckusick (previous version)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20741
2019-06-26 16:28:42 +00:00
Alexander Motin
c0c317d203 Fix qlxgbe(4) static build.
MFC after:	2 weeks
2019-06-26 16:23:24 +00:00
Marius Strobl
c2c5d1e787 o In iflib_txq_drain():
- Remove desc_used, which is only ever written to.
  - Remove a dead store to reclaimed.
  - Don't recycle avail.
  - Sort variables according to style(9).
  These changes will make a subsequent commit easier to read.
o In iflib_tx_credits_update(), don't bother checking whether the
  ift_txd_credits_update method pointer is NULL; _iflib_pre_assert()
  asserts upfront that this method has been assigned and functions
  like iflib_{fast_intr_rxtx,netmap_timer_adjust,txq_can_drain}()
  and _task_fn_tx() were already unconditionally relying on the
  method being callable.
2019-06-26 15:28:21 +00:00
Doug Moore
d1d3f7e1d1 Revert r349393, which leads to an assertion failure on bootup, in vm_map_stack_locked.
Reported by: ler@lerctr.org
Approved by: kib, markj (mentors, implicit)
2019-06-26 03:12:57 +00:00
Justin Hibbits
088c26aee8 powerpc/booke: Handle misaligned floating point loads/stores as on AIM
Misaligned floating point loads and stores are already handled for AIM, but
use the DSISR to obtain the necessary data.  Book-E does not have the DSISR,
so these fixups are not performed, leading to a SIGBUS on misaligned FP
loads or stores.  Obtain the necessary data on the Book-E side, similar to
how is done for SPE.

MFC after:	1 week
2019-06-26 01:14:39 +00:00
Cy Schubert
65f07d9976 While working on PR/238796 I discovered an unused variable in frdest,
the next hop structure. It is likely this contributes to PR/238796
though other factors remain to be investigated.

PR:		238796
MFC after:	1 week
2019-06-26 00:53:49 +00:00
Cy Schubert
2637412cbc Remove a tautological compare for offset != 0.
MFC after:	1 week
2019-06-26 00:53:46 +00:00
Cy Schubert
7f39a7e492 Prompted by r349366, ipfilter is also does not conform to RFC 3128
by dropping TCP fragments with offset = 1.

In addition to dropping these fragments, add a DTrace probe to allow
for more detailed monitoring and diagnosis if required.

MFC after:	1 week
2019-06-26 00:53:43 +00:00
Doug Moore
52499d1739 Eliminate some uses of the prev and next fields of vm_map_entry_t.
Since the only caller to vm_map_splay is vm_map_lookup_entry, move the
implementation of vm_map_splay into vm_map_lookup_helper, called by
vm_map_lookup_entry.

vm_map_lookup_entry returns the greatest entry less than or equal to a
given address, but in many cases the caller wants the least entry
greater than or equal to the address and uses the next pointer to get
to it. Provide an alternative interface to lookup,
vm_map_lookup_entry_ge, to provide the latter behavior, and let
callers use one or the other rather than having them use the next
pointer after a lookup miss to get what they really want.

In vm_map_growstack, the caller wants an entry that includes a given
address, and either the preceding or next entry depending on the value
of eflags in the first entry. Incorporate that behavior into
vm_map_lookup_helper, the function that implements all of these
lookups.

Eliminate some temporary variables used with vm_map_lookup_entry, but
inessential.

Reviewed by: markj (earlier version)
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20664
2019-06-25 20:25:16 +00:00
Julian Elischer
eb2b51ffda Fix annoying whitespace issue.
NO real change
2019-06-25 19:55:42 +00:00
Alan Somers
4f53d57e8c fcntl: style changes to r349248
Reported by:	bde
MFC after:	2 weeks
MFC-With:	349248
Sponsored by:	The FreeBSD Foundation
2019-06-25 19:44:22 +00:00
Alexander Motin
419110374a Avoid extra taskq_dispatch() calls by DMU.
DMU sync code calls taskq_dispatch() for each sublist of os_dirty_dnodes
and os_synced_dnodes.  Since the number of sublists by default is equal
to number of CPUs, it will dispatch equal, potentially large, number of
tasks, waking up many CPUs to handle them, even if only one or few of
sublists actually have any work to do.

This change adds check for empty sublists to avoid this.
2019-06-25 18:35:23 +00:00
Leandro Lupori
e2edff4167 [PowerPC64] Don't mark module data as static
Fixes panic when loading ipfw.ko and if_epair.ko built with modern compiler.

Similar to arm64 and riscv, when using a modern compiler (!gcc4.2), code
generated tries to access data in the wrong location, causing kernel panic
(data storage interrupt trap) when loading if_epair and ipfw.

Issue was reproduced with kernel/module compiled using gcc8 and clang8. It
affects both ELFv1 and ELFv2 ABI environments.

PR:		232387
Submitted by:	alfredo.junior_eldorado.org.br
Reported by:	Mark Millard
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20461
2019-06-25 17:15:44 +00:00
Warner Losh
7a3e3a2859 Remove a couple of harmless stray references to nandfs.
Submitted by: tsoome@
2019-06-25 16:39:25 +00:00
Ryan Libby
0e2464ea18 netipsec key_register: check for M_NOWAIT alloc failure
Reviewed by:	ae, cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20742
2019-06-25 15:43:52 +00:00
Hans Petter Selasky
59854ecf55 Convert all IPv4 and IPv6 multicast memberships into using a STAILQ
instead of a linear array.

The multicast memberships for the inpcb structure are protected by a
non-sleepable lock, INP_WLOCK(), which needs to be dropped when
calling the underlying possibly sleeping if_ioctl() method. When using
a linear array to keep track of multicast memberships, the computed
memory location of the multicast filter may suddenly change, due to
concurrent insertion or removal of elements in the linear array. This
in turn leads to various invalid memory access issues and kernel
panics.

To avoid this problem, put all multicast memberships on a STAILQ based
list. Then the memory location of the IPv4 and IPv6 multicast filters
become fixed during their lifetime and use after free and memory leak
issues are easier to track, for example by: vmstat -m | grep multi

All list manipulation has been factored into inline functions
including some macros, to easily allow for a future hash-list
implementation, if needed.

This patch has been tested by pho@ .

Differential Revision: https://reviews.freebsd.org/D20080
Reviewed by:	markj @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-25 11:54:41 +00:00
Hans Petter Selasky
43a9329e1b Free all allocated unit IDs in cuse(3) after the client character
devices have been destroyed to avoid creating character devices with
identical name.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-25 11:46:01 +00:00
Hans Petter Selasky
c7ffaed92e Fix for deadlock situation in cuse(3)
The final server unref should be done by the server thread to prevent
deadlock in the client cdevpriv destructor, which cannot destroy
itself.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-25 11:42:53 +00:00
Andrey V. Elsukov
019c8c9330 Follow the RFC 3128 and drop short TCP fragments with offset = 1.
Reported by:	emaste
MFC after:	1 week
2019-06-25 11:40:37 +00:00
Andrey V. Elsukov
7d4b2d5244 Mark default rule with IPFW_RULE_NOOPT flag, so it can be showed in
compact form.

MFC after:	1 week
2019-06-25 09:11:22 +00:00
Doug Moore
18cd8bb800 vm_map_protect may return an INVALID_ARGUMENT or PROTECTION_FAILURE
error response after clipping the first map entry in the region to be
reserved. This creates a pair of matching entries that should have
been "simplified" back into one, or never created. This change defers
the clipping of that entry until those two vm_map_protect failure
cases have been ruled out.

Reviewed by: alc
Approved by: markj (mentor)
Differential Revision: https://reviews.freebsd.org/D20711
2019-06-25 07:44:37 +00:00
Cy Schubert
c964c98793 The definition of icmptypes in ip_compt.h is dead code as it already
use the icmptypes in ip_icmp.h.

MFC after:	1 week
2019-06-25 07:04:47 +00:00
Warner Losh
a9154c1c83 Replay r349342 by imp accidentally reverted by r349352
Use the cam_ed copy of ata_params rather than malloc and freeing
memory for it. This reaches into internal bits of xpt a little, and
I'll clean that up later.
2019-06-25 06:14:31 +00:00
Warner Losh
296218d4cf Replay r349340 by imp accidentally reverted by r349352
Create ata_param_fixup

Create a common fixup routine to do the canonical fixup of the
ata_param fixup. Call it from both the ATA and the ATA over SCSI
paths.
2019-06-25 06:14:21 +00:00
Warner Losh
76769dc108 Replay r349339 by imp accidentally reverted by r349352
Go ahead and completely fix the ata_params before calling the veto
function. This breaks nothing that uses it in the tree since
ata_params is ignored in storvsc_ada_probe_veto which is the only
in-tree consumer.
2019-06-25 06:14:16 +00:00
Warner Losh
e5500f1efa Replay r349334 by markj accidentally reverted by r349352
Remove a lingering use of splbio().

The buffer must be locked by the caller.  No functional change
intended.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2019-06-25 06:14:00 +00:00
Warner Losh
f5a95d9a07 Remove NAND and NANDFS support
NANDFS has been broken for years. Remove it. The NAND drivers that
remain are for ancient parts that are no longer relevant. They are
polled, have terrible performance and just for ancient arm
hardware. NAND parts have evolved significantly from this early work
and little to none of it would be relevant should someone need to
update to support raw nand. This code has been off by default for
years and has violated the vnode protocol leading to panics since it
was committed.

Numerous posts to arch@ and other locations have found no actual users
for this software.

Relnotes:	Yes
No Objection From: arch@
Differential Revision: https://reviews.freebsd.org/D20745
2019-06-25 04:50:09 +00:00
Justin Hibbits
f62da49b2f powerpc: Transition to Secure-PLT, like most other OSs
Summary:
PowerPC has two PLT models: BSS-PLT and Secure-PLT.  BSS-PLT uses runtime
code generation to generate the PLT stubs.  Secure-PLT was introduced with
GCC 4.1 and Binutils 2.17 (base has GCC 4.2.1 and Binutils 2.17), and is a
more secure PLT format, using a read-only linkage table, with the dynamic
linker populating a non-executable index table.

This is the libc, rtld, and kernel support only.  The toolchain and build
parts will be updated separately.

Reviewed By: nwhitehorn, bdragon, pfg
Differential Revision: https://reviews.freebsd.org/D20598
MFC after:	1 month
2019-06-25 00:40:44 +00:00
Jayachandran C.
b0f79e328e arm64 acpi_iort: add some error handling
Print warnings for some bad kernel configurations (like NUMA disabled
with multiple domains). Check and report some firmware errors (like
incorrect proximity domain entries).

Differential Revision:	https://reviews.freebsd.org/D20416
2019-06-24 21:24:55 +00:00
Jayachandran C.
c66524f07d arm64 gicv3_its: enable all ITS blocks for a CPU
We now support multiple ITS blocks raising interrupts to a CPU.
Add all available CPUs to the ITS when no NUMA information is
available.

This reverts the check added in r340602, at that tim we did not
suppport multiple ITS blocks for a CPU.

Differential Revision:	https://reviews.freebsd.org/D20417
2019-06-24 21:13:45 +00:00
Jayachandran C.
893caf588a arm64 gic: Drop unused GICV3_IVAR_REDIST_VADDR
Now that GICV3_IVAR_REDIST is available, GICV3_IVAR_REDIST_VADDR
is unused and can be removed. Drop the define and add a comment.

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D20454
2019-06-24 21:00:28 +00:00
Warner Losh
af9727f618 Add missing include of sys/boot.h
This change was dropped out in a rebase and I didn't catch that before
I committed.
2019-06-24 20:52:21 +00:00
Warner Losh
ec9abc1843 Move to using a common kernel path between the boot / laoder bits and
the kernel.
2019-06-24 20:34:53 +00:00
Warner Losh
97ad52ca4c Use the cam_ed copy of ata_params rather than malloc and freeing
memory for it. This reaches into internal bits of xpt a little, and
I'll clean that up later.
2019-06-24 20:23:19 +00:00
Warner Losh
2afaed2d0f Create ata_param_fixup
Create a common fixup routine to do the canonical fixup of the
ata_param fixup. Call it from both the ATA and the ATA over SCSI
paths.
2019-06-24 20:18:58 +00:00
Warner Losh
161d2a1796 Go ahead and completely fix the ata_params before calling the veto
function. This breaks nothing that uses it in the tree since
ata_params is ignored in storvsc_ada_probe_veto which is the only
in-tree consumer.
2019-06-24 20:18:49 +00:00
Mark Johnston
673c1c2944 Remove a lingering use of splbio().
The buffer must be locked by the caller.  No functional change
intended.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2019-06-24 19:19:37 +00:00
Cy Schubert
51a7230a18 Clean out duplicate definitions of TCP macros also found in netinet/tcp.h.
MFC after:	1 week
2019-06-24 02:58:02 +00:00
Ian Lepore
0bab2b6e6f Add pwm devices to NOTES. 2019-06-24 02:39:56 +00:00
Ian Lepore
6e36309d83 Add gpio(4) and related drivers to NOTES. 2019-06-24 02:30:05 +00:00
Ian Lepore
2973d38a49 The gpiopps(4) driver currently has probe and attach code only for FDT based
systems, so conditionalize it accordingly in conf/files.
2019-06-24 02:27:17 +00:00
Ian Lepore
5364951d98 Build an armv7 LINT kernel in addition to armv5 LINT. You might think this
had been done years ago.  I did.  All this time we've only compiled a LINT
kernel for TARGET_ARCH=arm.  Now separate LINT-V5 and LINT-V7 configs are
generated and built.

There are two new files in arm/conf, NOTES.armv5 and NOTES.armv7, containing
some of what used to be in the arm NOTES file.  That file now contains only
the bits that are common to v5 and v7.

The makeLINT.mk file now creates the LINT-V5 and LINT-V7 files by concatening
sys/conf/NOTES, arm/conf/NOTES, and arm/conf/NOTES.armv{5,7} in that order.
2019-06-24 01:42:09 +00:00
Konstantin Belousov
d8ddb98a5e amd64 pmap: block on turnstile for lock-less DI.
Port the code to block on turnstile instead of yielding, to lock-less
delayed invalidation. The yield might cause tight loop due to priority
inversion.

Since it is impossible to avoid race between block and wake-up, arm
1-tick callout to wakeup when thread blocks itself.

Reported and tested by:	mjg
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 months
Differential revision:	https://reviews.freebsd.org/D20636
2019-06-23 21:21:11 +00:00
Ian Lepore
7a3a48426e Allow compiling ukbdmap.h on arm, since it appears to work fine. 2019-06-23 21:17:41 +00:00
Konstantin Belousov
89f2ab0608 Switch to check for effective user id in r349320, and disable dumping
into existing files for sugid processes.

Despite using real user id pronounces the intent, it actually breaks
suid coredumps, while not making any difference for non-sugid
processes.  The reason for the breakage is that non-existent core file
is created with the effective uid (unless weird hacks like SUIDDIR are
configured).

Then, if user enabled kern.sugid_coredump, core dumping should not
overwrite core files owned by effective uid, but we cannot pretend to
use real uid for dumping.

PR:	68905
admbugs:	358
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-06-23 21:15:31 +00:00
Alan Cox
22c7bcb842 pmap_enter_quick_locked() never replaces a valid mapping, so it need not
perform a TLB invalidation.  A barrier suffices.  (See r343876.)

Add a comment to pmap_enter_quick_locked() in order to highlight the
fact that it does not replace valid mappings.

Correct a typo in one of pmap_enter()'s comments.

MFC after:	1 week
2019-06-23 21:06:56 +00:00
Alexander Motin
53f5ac1310 Improve AHCI Enclosure Management and SES interoperation.
Since SES specs do not define mechanism to map enclosure slots to SATA
disks, AHCI EM code I written many years ago appeared quite useless,
that always bugged me.  I was thinking whether it was a good idea, but
if LSI HBAs do that, why I shouldn't?

This change introduces simple non-standard mechanism for the mapping
into both AHCI EM and SES code, that makes AHCI EM on capable controllers
(most of Intel's) a first-class SES citizen, allowing it to report disk
physical path to GEOM, show devices inserted into each enclosure slot in
`sesutil map` and `getencstat`, control locate and fault LEDs for specific
devices with `sesutil locate adaX on` and `sesutil fault adaX on`, etc.

I've successfully tested this on Supermicro X10DRH-i motherboard connected
with sideband cable of its S-SATA Mini-SAS connector to SAS815TQ backplane.
It can indicate with LEDs Locate, Fault and Rebuild/Remap SES statuses for
each disk identical to real SES of Supermicro SAS2 backplanes.

MFC after:	2 weeks
2019-06-23 19:05:01 +00:00
Konstantin Belousov
7a29e0bf96 coredump: avoid writing to core files not owned by the real user.
Reported by: blake frantz <trew@hick.org>
PR:	68905
admbugs:	358
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-06-23 18:35:11 +00:00
Ian Lepore
ac6a9e474f Add some i2c slave-device drivers that were missing from NOTES. 2019-06-23 17:39:13 +00:00
Ian Lepore
9026f4b86d The sy8106a and syr827 drviers require FDT and the ext_resources subsystem. 2019-06-23 17:38:30 +00:00
Ian Lepore
48fedd0960 Add the rtc8583 driver to conf/files. Also, move sy8106a from
file.allwinner to conf/files... it's not allwinner-specific, some day
other platforms could use the same regulator chip.
2019-06-23 17:23:56 +00:00
Ian Lepore
5cafc16207 Remove some unused header files from the ad7418 driver. 2019-06-23 17:20:39 +00:00
Alexander Motin
6d4d657360 Decouple enc/ses verbosity from bootverbose.
I don't want to be regularly notified that my enclosure violates standards
until there is some real problem I want to debug.

MFC after:	2 weeks
2019-06-22 19:09:10 +00:00
Alan Cox
36c5a4cb3f Introduce pmap_remove_l3_range() and use it in two places:
(1) pmap_remove(), where it eliminates redundant TLB invalidations by
pmap_remove() and pmap_remove_l3(), and (2) pmap_enter_l2(), where it may
optimize the TLB invalidations by batching them.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D12725
2019-06-22 16:26:38 +00:00
Ryan Libby
a6d2a24c3e ddb show proc typo 2019-06-22 05:35:23 +00:00
Alexander Motin
b8038d7827 Remove ancient SCSI-2/3 mentioning.
MFC after:	2 weeks
2019-06-22 03:50:43 +00:00
Eric van Gyzen
df8406543f VirtIO SCSI: validate seg_max on attach
Until r349278, bhyve presented a seg_max to the guest that was too large.
Detect this case and clamp it to the virtqueue size.  Otherwise, we would
fail the "too many segments to enqueue" assertion in virtqueue_enqueue().

I hit this by running a guest with a MAXPHYS of 256 KB.

Reviewed by:	bryanv cem
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20703
2019-06-22 01:20:45 +00:00
Alexander Motin
6805c9b74d Make ELEMENT INDEX validation more strict.
SES specifications tell: "The Additional Element Status descriptors shall
be in the same order as the status elements in the Enclosure Status
diagnostic page".  It allows us to question ELEMENT INDEX that is lower
then values we already processed.  There are many SAS2 enclosures with
this kind of problem.

While there, add more specific error messages for cases when ELEMENT INDEX
is obviously wrong.  Also skip elements with INVALID bit set.

MFC after:	2 weeks
2019-06-22 01:06:41 +00:00
Scott Long
0feb46b0c6 Refactor xpt_getattr() to make it more readable. No outwardly
visible functional changes, though code flow was modified a bit
internally to lessen the need for goto jumps and chained if
conditionals.
2019-06-21 23:40:26 +00:00
Alexander Motin
7318fcb51d Fix individual_element_index when some type has 0 elements.
When some type has 0 elements, saved_individual_element_index was set
to -1 on second type bump, since individual_element_index was not
restored after the first.  To me it looks easier just to increment
saved_individual_element_index separately than think when to save it.

MFC after:	2 weeks
2019-06-21 23:29:16 +00:00
Alan Somers
1bb957296b Reduce namespace pollution from r349233
Define __daddr_t in _types.h and use it in filio.h

Reported by:	ian, bde
Reviewed by:	ian, imp, cem
MFC after:	2 weeks
MFC-With:	349233
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20715
2019-06-21 21:50:14 +00:00
Johannes Lundberg
6425fed7e6 LinuxKPI: Additions to rcu list.
- Add rcu list functions.
- Make rcu hlist's foreach macro use rcu calls instead of the non-rcu macro.
- Bump FreeBSD version so we have a checkpoint for the vboxvideo drm driver.

Reviewed by:	hps
Approved by:	imp (mentor), hps
MFC after:	1 week
Differential Revision:	D20719
2019-06-21 18:48:07 +00:00
Johannes Lundberg
62260f68b4 LinuxKPI: Add atomic_long_sub macro.
Reviewed by:	imp (mentor), hps
Approved by:	imp (mentor), hps
MFC after:	1 week
Differential Revision:	D20718
2019-06-21 16:43:16 +00:00
Ian Lepore
83b319101f Add pwm to the armv7 GENERIC kernel, it's now used by TI and Allwinner. 2019-06-21 15:44:58 +00:00
Ian Lepore
ef558a1078 Add support for the PWM(9) API. This allows configuring the pwm output using
pwm(9), but also maintains the historical sysctl config interface for
compatiblity with existing apps.  The two config systems are not compatible
with each other; if you use both interfaces to change configurations you're
likely to end up with incorrect output or none at all.
2019-06-21 14:24:33 +00:00
Ian Lepore
3103a7eef7 Some mundane tweaks and cleanups to help de-clutter the diffs of some
upcoming functional changes.

Add an ofw_compat_data table for probing compat strings, and use it to add
PNP data.  Remove some stray semicolons at the end of macro definitions,
and add a PWM_LOCK_ASSERT macro to round out the usual suite.  Move the
device_t and driver_methods structs to the end of the file.  Tweak comments.
2019-06-21 14:01:02 +00:00
Ed Maste
b72236b407 nandsim: correct test to avoid out-of-bounds access
Previously nandsim_chip_status returned EINVAL iff both of user-provided
chip->ctrl_num and chip->num were out of bounds.  If only one failed the
bounds check arbitrary memory would be read and returned.

The NAND framework is not built by default, nandsim is not intended for
production use (it is a simulator), and the nandsim device has root-only
permissions.

admbugs:	827
Reported by:	Daniel Hodson of elttam
MFC after:	3 days
Security:	kernel information leak or DoS
Sponsored by:	The FreeBSD Foundation
2019-06-21 13:42:40 +00:00
Andrey V. Elsukov
978f2d1728 Add "tcpmss" opcode to match the TCP MSS value.
With this opcode it is possible to match TCP packets with specified
MSS option, whose value corresponds to configured in opcode value.
It is allowed to specify single value, range of values, or array of
specific values or ranges. E.g.

 # ipfw add deny log tcp from any to any tcpmss 0-500

Reviewed by:	melifaro,bcr
Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2019-06-21 10:54:51 +00:00
Kristof Provost
05fc9d78d7 ip_output: pass PFIL_FWD in the slow path
If we take the slow path for forwarding we should still tell our
firewalls (hooked through pfil(9)) that we're forwarding. Pass the
ip_output() flags to ip_output_pfil() so it can set the PFIL_FWD flag
when we're forwarding.

MFC after:	1 week
Sponsored by:	Axiado
2019-06-21 07:58:08 +00:00
Conrad Meyer
c363b16c63 sys: Remove DEV_RANDOM device option
Remove 'device random' from kernel configurations that reference it (most).
Replace perhaps mistaken 'nodevice random' in two MIPS configs with 'options
RANDOM_LOADABLE' instead.  Document removal in UPDATING; update NOTES and
random.4.

Reviewed by:	delphij, markm (previous version)
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D19918
2019-06-21 00:16:30 +00:00
Takanori Watanabe
a809abd44a Fix the case where no root hub object while host controller object exist in ACPI namespace.
Also you can disable ACPI support for USB by setting
debug.acpi.disabled="usb"

PR:	238711
2019-06-20 23:52:33 +00:00
Alan Somers
38b06f8ac4 fcntl: fix overflow when setting F_READAHEAD
VOP_READ and VOP_WRITE take the seqcount in blocks in a 16-bit field.
However, fcntl allows you to set the seqcount in bytes to any nonnegative
31-bit value. The result can be a 16-bit overflow, which will be
sign-extended in functions like ffs_read. Fix this by sanitizing the
argument in kern_fcntl. As a matter of policy, limit to IO_SEQMAX rather
than INT16_MAX.

Also, fifos have overloaded the f_seqcount field for a completely different
purpose ever since r238936.  Formalize that by using a union type.

Reviewed by:	cem
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20710
2019-06-20 23:07:20 +00:00
Alexander Motin
68035f6381 SPC-3 and up require some UAs to be returned as fixed.
MFC after:	2 weeks
2019-06-20 22:20:30 +00:00
Alexander Motin
35a9ffc350 Optimize xpt_getattr().
Do not allocate temporary buffer for attributes we are going to return
as-is, just make sure to NUL-terminate them.  Do not zero temporary 64KB
buffer for CDAI_TYPE_SCSI_DEVID, XPT tells us how much data it filled
and there are also length fields inside the returned data also.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-06-20 20:29:42 +00:00
Navdeep Parhar
17795d8234 cxgbe/t4_tom: DDP_DEAD is a ddp flag and not a toepcb flag.
The driver was in effect setting TPF_ABORT_SHUTDOWN on the toepcb
instead of what was intended.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-06-20 20:06:19 +00:00
Brooks Davis
74a1b66cf4 Extend mmap/mprotect API to specify the max page protections.
A new macro PROT_MAX() alters a protection value so it can be OR'd with
a regular protection value to specify the maximum permissions.  If
present, these flags specify the maximum permissions.

While these flags are non-portable, they can be used in portable code
with simple ifdefs to expand PROT_MAX() to 0.

This change allows (e.g.) a region that must be writable during run-time
linking or JIT code generation to be made permanently read+execute after
writes are complete.  This complements W^X protections allowing more
precise control by the programmer.

This change alters mprotect argument checking and returns an error when
unhandled protection flags are set.  This differs from POSIX (in that
POSIX only specifies an error), but is the documented behavior on Linux
and more closely matches historical mmap behavior.

In addition to explicit setting of the maximum permissions, an
experimental sysctl vm.imply_prot_max causes mmap to assume that the
initial permissions requested should be the maximum when the sysctl is
set to 1.  PROT_NONE mappings are excluded from this for compatibility
with rtld and other consumers that use such mappings to reserve
address space before mapping contents into part of the reservation.  A
final version this is expected to provide per-binary and per-process
opt-in/out options and this sysctl will go away in its current form.
As such it is undocumented.

Reviewed by:	emaste, kib (prior version), markj
Additional suggestions from:	alc
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D18880
2019-06-20 18:24:16 +00:00
Alan Somers
9de14ff010 #include <sys/types.h> from sys/filio.h
This fixes world build after r349231

Reported by:	Jenkins
MFC after:	2 weeks
MFC-With:	349231
Sponsored by:	The FreeBSD Foundation
2019-06-20 14:35:28 +00:00
Alan Somers
d49b446bfb Add FIOBMAP2 ioctl
This ioctl exposes VOP_BMAP information to userland. It can be used by
programs like fragmentation analyzers and optimized cp implementations. But
I'm using it to test fusefs's VOP_BMAP implementation. The "2" in the name
distinguishes it from the similar but incompatible FIBMAP ioctls in NetBSD
and Linux.  FIOBMAP2 differs from FIBMAP in that it uses a 64-bit block
number instead of 32-bit, and it also returns runp and runb.

Reviewed by:	mckusick
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20705
2019-06-20 14:13:10 +00:00
Alan Somers
d01752c703 Add a VOP_BMAP(9) man page
Reviewed by:	mckusick
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20704
2019-06-20 13:59:46 +00:00
Alexander Motin
f91aa773be Add wakeup_any(), cheaper wakeup_one() for taskqueue(9).
wakeup_one() and underlying sleepq_signal() spend additional time trying
to be fair, waking thread with highest priority, sleeping longest time.
But in case of taskqueue there are many absolutely identical threads, and
any fairness between them is quite pointless.  It makes even worse, since
round-robin wakeups not only make previous CPU affinity in scheduler quite
useless, but also hide from user chance to see CPU bottlenecks, when
sequential workload with one request at a time looks evenly distributed
between multiple threads.

This change adds new SLEEPQ_UNFAIR flag to sleepq_signal(), making it wakeup
thread that went to sleep last, but no longer in context switch (to avoid
immediate spinning on the thread lock).  On top of that new wakeup_any()
function is added, equivalent to wakeup_one(), but setting the flag.
On top of that taskqueue(9) is switchied to wakeup_any() to wakeup its
threads.

As result, on 72-core Xeon v4 machine sequential ZFS write to 12 ZVOLs
with 16KB block size spend 34% less time in wakeup_any() and descendants
then it was spending in wakeup_one(), and total write throughput increased
by ~10% with the same as before CPU usage.

Reviewed by:	markj, mmacy
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D20669
2019-06-20 01:15:33 +00:00
Mark Johnston
ee1f168540 Group vm_page_activate()'s definition with other related functions.
No functional change intended.

MFC after:	3 days
2019-06-19 21:36:00 +00:00
Alexander Motin
49ee0fcea5 Use sbuf_cat() in GEOM confxml generation.
When it comes to megabytes of text, difference between sbuf_printf() and
sbuf_cat() becomes substantial.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-06-19 15:36:02 +00:00
Jonathan T. Looney
5e02b277a4 Add the ability to limit how much the code will fragment the RACK send map
in response to SACKs. The default behavior is unchanged; however, the limit
can be activated by changing the new net.inet.tcp.rack.split_limit sysctl.

Submitted by:	Peter Lei <peterlei@netflix.com>
Reported by:	jtl
Reviewed by:	lstewart (earlier version)
Security:	CVE-2019-5599
2019-06-19 13:55:00 +00:00
Alexander Motin
eeea0fcf0f Fix typo in r349178.
Reported by:	ae
MFC after:	1 week
2019-06-19 13:30:50 +00:00
Marko Zec
188adcb7e4 V_ip6_forwarding and V_ipforwarding have been defined in ip6_var.h /
ip_var.h since at least 2008, so make use of those definitions here.

MFC after:	3 days
2019-06-19 08:49:24 +00:00
Marko Zec
6aee0bfa85 Evaluating htons() at compile time is more efficient than doing ntohs()
at runtime.  This change removes a dependency on a barrel shifter pass
before branch resolution, while reducing the instruction stream size
by 9 bytes on amd64.

MFC after:	3 days
2019-06-19 08:39:19 +00:00
Scott Long
da761f3b1f Implement VT-d capability detection on chipsets that have multiple
translation units with differing capabilities

From the author via Bugzilla:
---
When an attempt is made to passthrough a PCI device to a bhyve VM
(causing initialisation of IOMMU) on certain Intel chipsets using
VT-d the PCI bus stops working entirely. This issue occurs on the
E3-1275 v5 processor on C236 chipset and has also been encountered
by others on the forums with different hardware in the Skylake
series.

The chipset has two VT-d translation units. The issue is caused by
an attempt to use the VT-d device-IOTLB capability that is
supported by only the first unit for devices attached to the
second unit which lacks that capability. Only the capabilities of
the first unit are checked and are assumed to be the same for all
units.

Attached is a patch to rectify this issue by determining which
unit is responsible for the device being added to a domain and
then checking that unit's device-IOTLB capability. In addition to
this a few fixes have been made to other instances where the first
unit's capabilities are assumed for all units for domains they
share. In these cases a mutual set of capabilities is determined.
The patch should hopefully fix any bugs for current/future
hardware with multiple translation units supporting different
capabilities.

A description is on the forums at
https://forums.freebsd.org/threads/pci-passthrough-bhyve-usb-xhci.65235
The thread includes observations by other users of the bug
occurring, and description as well as confirmation of the fix.
I'd also like to thank Ordoban for their help.

---
Personally tested on a Skylake laptop, Skylake Xeon server, and
a Xeon-D-1541, passing through XHCI and NVMe functions.  Passthru
is hit-or-miss to the point of being unusable without this
patch.

PR: 229852
Submitted by: callum@aitchison.org
MFC after: 1 week
2019-06-19 06:41:07 +00:00
Alan Cox
0b3c5f1bc9 Correct an error in r349122. pmap_unwire() should update the pmap's wired
count, not its resident count.

X-MFC with:	r349122
2019-06-19 03:33:00 +00:00
Alexander Motin
5c32e9fcb2 Optimize kern.geom.conf* sysctls.
On large systems those sysctls may generate megabytes of output.  Before
this change sbuf(9) code was resizing buffer by 4KB each time many times,
generating tons of TLB shootdowns.  Unfortunately in this case existing
sbuf_new_for_sysctl() mechanism, supposed to help with this issue, is not
applicable, since all the sbuf writes are done in different kernel thread.

This change improves situation in two ways:
 - on first sysctl call, not providing any output buffer, it sets special
sbuf drain function, just counting the data and so not needing big buffer;
 - on second sysctl call it uses as initial buffer size value saved on
previous call, so that in most cases there will be no reallocation, unless
GEOM topology changed significantly.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-06-18 21:05:10 +00:00
Conrad Meyer
22eedc9722 random(4): Fix a regression in short AES mode reads
In r349154, random device reads of size < 16 bytes (AES block size) were
accidentally broken to loop forever.  Correct the loop condition for small
reads.

Reported by:	pho
Reviewed by:	delphij
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D20686
2019-06-18 18:50:58 +00:00
Ian Lepore
edd96b9fb9 Handle labels specified with hints even on FDT systems. Hints are the
easiest thing for a user to control (via loader.conf or kenv+kldload), so
handle them in addition to any label specified via the FDT data.
2019-06-18 17:05:05 +00:00
Ed Maste
05918954d8 Remove sys/capability.h for the third time
In all supported (and most unsupported) FreeBSD versions the appropriate
header for Capsicum is sys/capsicum.h.  Software including sys/capability.h
is most likely looking for Linux capabilities based on the withdrawn
POSIX.1e draft.

This header was previously removed in r334929 and r340156, but reverted
each time due to ports failures.  These issues have now (broadly) been
addressed.

PR:		228878 [exp-run]
Submitted by:	eadler (r334929)
Relnotes:	Yes
Sponsored by:	The FreeBSD Foundation
2019-06-18 14:13:52 +00:00
Ian Lepore
780c3de886 Remove everything related to channels from the pwmc public interface, now
that there is a pwmc(4) instance per channel and the channel number is
maintained as a driver ivar rather than being passed in from userland.
2019-06-18 00:11:00 +00:00
Takanori Watanabe
e68fcc8875 Add ACPI support for USB driver.
This adds ACPI device path on devinfo(8) output and
show  value of _UPC(usb port capabilities), _PLD (physical location of device)
when hw.usb.debug >= 1 .

Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D20630
2019-06-17 23:03:30 +00:00
Conrad Meyer
179f62805c random(4): Fortuna: allow increased concurrency
Add experimental feature to increase concurrency in Fortuna.  As this
diverges slightly from canonical Fortuna, and due to the security
sensitivity of random(4), it is off by default.  To enable it, set the
tunable kern.random.fortuna.concurrent_read="1".  The rest of this commit
message describes the behavior when enabled.

Readers continue to update shared Fortuna state under global mutex, as they
do in the status quo implementation of the algorithm, but shift the actual
PRF generation out from under the global lock.  This massively reduces the
CPU time readers spend holding the global lock, allowing for increased
concurrency on SMP systems and less bullying of the harvestq kthread.

It is somewhat of a deviation from FS&K.  I think the primary difference is
that the specific sequence of AES keys will differ if READ_RANDOM_UIO is
accessed concurrently (as the 2nd thread to take the mutex will no longer
receive a key derived from rekeying the first thread).  However, I believe
the goals of rekeying AES are maintained: trivially, we continue to rekey
every 1MB for the statistical property; and each consumer gets a
forward-secret, independent AES key for their PRF.

Since Chacha doesn't need to rekey for sequences of any length, this change
makes no difference to the sequence of Chacha keys and PRF generated when
Chacha is used in place of AES.

On a GENERIC 4-thread VM (so, INVARIANTS/WITNESS, numbers not necessarily
representative), 3x concurrent AES performance jumped from ~55 MiB/s per
thread to ~197 MB/s per thread.  Concurrent Chacha20 at 3 threads went from
roughly ~113 MB/s per thread to ~430 MB/s per thread.

Prior to this change, the system was extremely unresponsive with 3-4
concurrent random readers; each thread had high variance in latency and
throughput, depending on who got lucky and won the lock.  "rand_harvestq"
thread CPU use was high (double digits), seemingly due to spinning on the
global lock.

After the change, concurrent random readers and the system in general are
much more responsive, and rand_harvestq CPU use dropped to basically zero.

Tests are added to the devrandom suite to ensure the uint128_add64 primitive
utilized by unlocked read functions to specification.

Reviewed by:	markm
Approved by:	secteam(delphij)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D20313
2019-06-17 20:29:13 +00:00
Cy Schubert
b8358917db Make ipf_objbytes a constant. ipf_objbytes is a table of internal data
structures that are saved across reboots by ipfs(8). The table is not
changed at runtime.

MFC after:	3 days
2019-06-17 20:10:55 +00:00
Xin LI
f89d207279 Separate kernel crc32() implementation to its own header (gsb_crc32.h) and
rename the source to gsb_crc32.c.

This is a prerequisite of unifying kernel zlib instances.

PR:		229763
Submitted by:	Yoshihiro Ota <ota at j.email.ne.jp>
Differential Revision:	https://reviews.freebsd.org/D20193
2019-06-17 19:49:08 +00:00
Ian Lepore
b5d67730ee Put the pwmc cdev filenames under the pwm directory along with any label
names.  I.e., everything related to pwm now goes in /dev/pwm.  This will
make it easier for userland tools to turn an unqualified name into a fully
qualified pathname, whether it's the base pwmcX.Y name or a label name.
2019-06-17 16:26:43 +00:00
Conrad Meyer
d0d71d818c random(4): Generalize algorithm-independent APIs
At a basic level, remove assumptions about the underlying algorithm (such as
output block size and reseeding requirements) from the algorithm-independent
logic in randomdev.c.  Chacha20 does not have many of the restrictions that
AES-ICM does as a PRF (Pseudo-Random Function), because it has a cipher
block size of 512 bits.  The motivation is that by generalizing the API,
Chacha is not penalized by the limitations of AES.

In READ_RANDOM_UIO, first attempt to NOWAIT allocate a large enough buffer
for the entire user request, or the maximal input we'll accept between
signal checking, whichever is smaller.  The idea is that the implementation
of any randomdev algorithm is then free to divide up large requests in
whatever fashion it sees fit.

As part of this, two responsibilities from the "algorithm-generic" randomdev
code are pushed down into the Fortuna ra_read implementation (and any other
future or out-of-tree ra_read implementations):

  1. If an algorithm needs to rekey every N bytes, it is responsible for
  handling that in ra_read(). (I.e., Fortuna's 1MB rekey interval for AES
  block generation.)

  2. If an algorithm uses a block cipher that doesn't tolerate partial-block
  requests (again, e.g., AES), it is also responsible for handling that in
  ra_read().

Several APIs are changed from u_int buffer length to the more canonical
size_t.  Several APIs are changed from taking a blockcount to a bytecount,
to permit PRFs like Chacha20 to directly generate quantities of output that
are not multiples of RANDOM_BLOCKSIZE (AES block size).

The Fortuna algorithm is changed to NOT rekey every 1MiB when in Chacha20
mode (kern.random.use_chacha20_cipher="1").  This is explicitly supported by
the math in FS&K §9.4 (Ferguson, Schneier, and Kohno; "Cryptography
Engineering"), as well as by their conclusion: "If we had a block cipher
with a 256-bit [or greater] block size, then the collisions would not
have been an issue at all."

For now, continue to break up reads into PAGE_SIZE chunks, as they were
before.  So, no functional change, mostly.

Reviewed by:	markm
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D20312
2019-06-17 15:09:12 +00:00
Conrad Meyer
403c041316 random(4): Add regression tests for uint128 implementation, Chacha CTR
Add some basic regression tests to verify behavior of both uint128
implementations at typical boundary conditions, to run on all architectures.

Test uint128 increment behavior of Chacha in keystream mode, as used by
'kern.random.use_chacha20_cipher=1' (r344913) to verify assumptions at edge
cases.  These assumptions are critical to the safety of using Chacha as a
PRF in Fortuna (as implemented).

(Chacha's use in arc4random is safe regardless of these tests, as it is
limited to far less than 4 billion blocks of output in that API.)

Reviewed by:	markm
Approved by:	secteam(gordon)
Differential Revision:	https://reviews.freebsd.org/D20392
2019-06-17 14:59:45 +00:00
Ian Lepore
2c6c030ce2 Add back a const qualifier I somehow fumbled away between test-building
and committing recent changes.
2019-06-17 03:48:44 +00:00
Ian Lepore
0a06f11da3 Implement the ofw_bus_get_node method in aw_pwm(4) so that ofw_pwmbus can
find its metadata for instantiating children.
2019-06-17 03:40:00 +00:00
Ian Lepore
b43e2c8b56 Add ofw_pwmbus to enumerate pwmbus devices on systems configured with fdt
data.  Also, add fdt support to pwmc.
2019-06-17 03:32:05 +00:00
Alan Cox
29c25bd781 Eliminate a redundant call to pmap_invalidate_page() from
pmap_ts_referenced().

MFC after:	14 days
Differential Revision:	https://reviews.freebsd.org/D12725
2019-06-17 01:58:25 +00:00
Alan Cox
c8f59059d8 Three changes to arm64's pmap_unwire():
Implement wiring changes on superpage mappings.  Previously, a superpage
mapping was unconditionally demoted by pmap_unwire(), even if the wiring
change applied to the entire superpage mapping.

Rewrite a comment to use the arm64 names for bits in a page table entry.
Previously, the bits were referred to by their x86 names.

Use atomic_"op"_64() instead of atomic_"op"_long() to update a page table
entry in order to match the prevailing style in this file.

MFC after:	10 days
2019-06-16 22:13:27 +00:00
Nathan Whitehorn
4d210a60c3 Fix bug on newbus device deletion: we should delete the child's devinfo
on deletion, not the parent's.

MFC after:	3 weeks
2019-06-16 21:56:45 +00:00
Ian Lepore
0af7a9a451 Rework pwmbus and pwmc so that each child will handle a single PWM channel.
Previously, there was a pwmc instance for each instance of pwm hardware
regardless of how many pwm channels that hardware supported.  Now there
will be a pwmc instance for each channel when the hardware supports
multiple channels.  With a separate instance for each channel, we can have
"named channels" in userland by making devfs alias entries in /dev/pwm.

These changes add support for ivars to pwmbus, and use an ivar to track the
channel number for each child.  It also adds support for hinted children.

In pwmc, the driver checks for a label hint, and if present, it's used to
create an alias for the cdev in /dev/pwm.  It's not anticipated that hints
will be heavily used, but it's easy to do and allows quick ad-hoc creation
of named channels from userland by using kenv to create hint.pwmc.N.label=
hints.  Upcoming changes will add FDT support, and most labels will
probably be specified that way.
2019-06-16 19:44:42 +00:00
Alan Cox
bf13f9d279 Three enhancements to arm64's pmap_protect():
Implement protection changes on superpage mappings.  Previously, a superpage
mapping was unconditionally demoted by pmap_protect(), even if the
protection change applied to the entire superpage mapping.

Precompute the bit mask describing the protection changes rather than
recomputing it for every page table entry that is changed.

Skip page table entries that already have the requested protection changes
in place.

Reviewed by:	andrew, kib
MFC after:	10 days
Differential Revision:	https://reviews.freebsd.org/D20657
2019-06-16 16:45:01 +00:00
Ian Lepore
b71764df96 In detach(), call bus_generic_detach() before deleting the iicbus child.
This gives the bus and its children the chance to return EBUSY to abort
the detach if they're in the middle of doing some IO.
2019-06-16 16:02:50 +00:00
Ian Lepore
b93539730b Rename pwmbus.h to ofw_pwm.h, because after all the recent changes, there
is nothing left in the file that related to pwmbus at all.  It just contains
prototypes for the functions implemented in dev/pwm.ofw_pwm.c, so name it
accordingly and fix the include protect wrappers to match.

A new pwmbus.h will be coming along in a future commit.
2019-06-16 15:56:59 +00:00
Philip Paeps
5a037b1197 Add macOS-like three finger drag trackpad gesture to psm(4)
Submitted by:	Yan Ka Chiu <nyan@myuji.xyz>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D20648
2019-06-16 03:06:05 +00:00
Ian Lepore
6e14f601aa Build SoC-specific modules with GENERIC for the SoCs that have them. 2019-06-16 01:23:45 +00:00
Ian Lepore
b9f654b163 Add module makefiles for Texas Instruments ARM SoCs.
The natural place to look for them based on how other SoCs are organized
would be sys/modules/ti, but that's already taken.  Drop a clue into
modules/ti/Makefile directing people to modules/arm_ti if they're looking
for ARM modules.
2019-06-16 01:22:44 +00:00
Ian Lepore
5935e64693 Split the dtb MODULES_EXTRA line to a series of += lines, making it easier
to maintain and keep in alphabetical order, and paving the way for adding
some other modules that aren't dtb-related.
2019-06-16 01:05:53 +00:00
Ian Lepore
e108c3df04 Add module makefiles for pwm. 2019-06-16 00:53:09 +00:00
Ian Lepore
e3384e8c44 This code no longer uses fdt/ofw stuff, no need to include ofw headers. 2019-06-16 00:43:05 +00:00
Ian Lepore
09ebe549ae Make channel number unsigned, and spell unsigned int u_int. This should
have been part of r349088.
2019-06-16 00:32:19 +00:00
Ian Lepore
aee0e20139 The pwm interface was replaced with pwmbus, include the right header file. 2019-06-16 00:27:11 +00:00
Ian Lepore
6cdbe2bf20 Make pwm channel numbers unsigned. 2019-06-15 23:02:09 +00:00
Ian Lepore
f8f8d87cd9 Restructure the pwm device hirearchy and interfaces.
The pwm and pwmbus interfaces were nearly identical, this merges them into a
single pwmbus interface.  The pwmbus driver now implements the pwmbus
interface by simply passing all calls through to its parent (the hardware
driver).  The channel_count method moves from pwm to pwmbus, and the
get_bus method is deleted (just no longer needed).

The net effect is that the interface for doing pwm stuff is now the same
regardless of whether you're a child of pwmbus, or some random driver
elsewhere in the hierarchy that is bypassing the pwmbus layer and is talking
directly to the hardware driver via cross-hierarchy connections established
using fdt data.

The pwmc driver is now a child of pwmbus, instead of being its sibling
(that's why the get_bus method is no longer needed; pwmc now gets the
device_t of the bus using device_get_parent()).
2019-06-15 22:25:39 +00:00
Ian Lepore
6bb8042535 Destroy the cdev on device detach. Also, make the driver and devclass
static, because nothing outside this file needs them.
2019-06-15 21:51:55 +00:00
Ian Lepore
9878710395 Rename the channel_max method to channel_count, because that's what it's
returning.  (If the channel count is 2, then the max channel number is 1.)
2019-06-15 21:36:14 +00:00
Ian Lepore
47e17a1b1a Give the aw_pwm driver a module version. 2019-06-15 21:31:04 +00:00
Ian Lepore
59d8a61ca7 Spell unsigned int as u_int and channel as chan; eliminates the need to wrap
some long lines.
2019-06-15 21:19:23 +00:00
Ian Lepore
cd6e47c168 Unwrap prototype lines so that return type and function name are on the
same line.  No functional changes.
2019-06-15 20:54:33 +00:00
Ian Lepore
968e5efcca Make pwmbus driver and devclass vars static; they're not mentioned in any
header file, so they can't be used outside this file anyway.
2019-06-15 20:53:26 +00:00
Ian Lepore
2d5913e4f4 Add a missing #include. I suspect this used to get included via some header
pollution that was cleaned up recently, and this file got missed in the
cleanup because it's not attached to the build unless you specifically
request this device in a custom kernel config.
2019-06-15 20:20:36 +00:00
Ian Lepore
1e76aee880 Use device_delete_children() instead of a locally-rolled copy of it that
leaks the device-list memory.
2019-06-15 20:17:00 +00:00
Ian Lepore
3cee44bc88 Remove pwmbus_attach_bus(), it no longer has any callers. Also remove a
couple prototypes for functions that never existed (and never will).
2019-06-15 20:13:42 +00:00
Ian Lepore
71fb373934 Move/rename the sys/pwm.h header file to dev/pwm/pwmc.h. The file contains
ioctl definitions and related datatypes that allow userland control of pwm
hardware via the pwmc device.  The new name and location better reflects its
assocation with a single device driver.
2019-06-15 19:46:59 +00:00
Ian Lepore
f9d8090ea8 Do not include pwm.h here, it is purely a userland interface file containing
ioctl defintions for the pwmc driver. It is not part of the pwmbus interface.
2019-06-15 19:43:33 +00:00
Alan Cox
3b97569c31 Previously, when pmap_remove_pages() destroyed a dirty superpage mapping,
it only called vm_page_dirty() on the first of the superpage's constituent
4KB pages.  This revision corrects that error, calling vm_page_dirty() on
all of superpage's constituent 4KB pages.

MFC after:	3 days
2019-06-15 17:26:42 +00:00
Ian Lepore
4d7ef8a2be Handle failure to enable the clock or obtain its frequency. 2019-06-15 16:59:03 +00:00
Ian Lepore
1bf4afc527 Don't call pwmbus_attach_bus(), because it may not be present if this
driver is compiled into the kernel but pwmbus will be loaded as a module
when needed (and because of that, pwmbus_attach_bus() is going away in
the near future).  Instead, just directly do what that function did:
register the fdt xfef handle, and attach the pwmbus.
2019-06-15 16:56:00 +00:00
Ian Lepore
5ef9c1079e In detach(), check for failure of bus_generic_detach(), only release
resources if they got allocated (because detach() gets called from attach()
to handle various failures), and delete the pwmbus child if it got created.
2019-06-15 16:36:29 +00:00
Ian Lepore
dd47326c82 Allow pwm(9) components to be selected individually, while 'device pwm'
still includes it all.
2019-06-15 16:16:29 +00:00
Marius Strobl
d49e83eac3 - Replace unused and only ever written to members of public iflib(9)
structs with placeholders (in the latter case, IFLIB_MAX_TX_BYTES
  etc. are also only ever used for these write-only members if at all,
  so both these macros and members can just go). Using these spares
  may render it possible to merge certain iflib(9) fixes to stable/12.
  Otherwise, changes extending struct if_irq or struct if_shared_ctx
  in any way would break KBI as instances of these are allocated by
  the driver front-ends (by contrast, struct if_pkt_info as well as
  struct if_softc_ctx instances are provided by iflib(9) and, thus,
  may grow at least at the end without breaking KBI).
- Make the pvi_name in struct pci_vendor_info const char * as device
  identifiers in hardware lookup tables aren't to be expected to ever
  change at runtime.
- Similarly, make the pci_vendor_info_t of struct if_shared_ctx which
  is used to point to the struct pci_vendor_info arrays provided by
  the driver front-ends const.
- Remove the ETH_ADDR_LEN macro from iflib.h; this was duplicating
  ETHER_ADDR_LEN of <net/ethernet.h> with iflib(9) actually only
  consuming the latter macro.
- Make the name argument of iflib_io_tqg_attach(9) const, matching
  the taskqgroup_attach_cpu(9) this function wraps as well as e. g.
  iflib_config_gtask_init(9).
- Remove the orphaned iflib_qset_lock_get() prototype.
- Remove some extraneous empty lines.
2019-06-15 11:07:41 +00:00
Doug Moore
4766eba1df Critical comments were lost in r349203. This patch seeks to restore
the lost information in new comments.

Reported by: alc
Reviewed by: alc
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20632
2019-06-15 04:30:13 +00:00
Julian Elischer
c749d68596 Lightly hide the 'var' inside the macros to read the arm special registers.
I just happenned to have 3rd party code using 'var' as the output variable
which drew my attention to this. variables defined inside macros should be
prefixed to avoid getting shadowed varable wanrings from clang.
2019-06-15 00:47:39 +00:00
Alan Cox
26066aef10 Batch the TLB invalidations that are performed by pmap_protect() rather
than performing them one at a time.

MFC after:	10 days
2019-06-14 22:06:43 +00:00
Alexander Motin
3bae917061 Minimize aggsum_compare(&arc_size, arc_c) calls.
For busy ARC situation when arc_size close to arc_c is desired.  But
then it is quite likely that aggsum_compare(&arc_size, arc_c) will need
to flush per-CPU buckets to find exact comparison result.  Doing that
often in a hot path penalizes whole idea of aggsum usage there, since it
replaces few simple atomic additions with dozens of lock acquisitions.

Replacing aggsum_compare() with aggsum_upper_bound() in code increasing
arc_p when ARC is growing (arc_size < arc_c) according to PMC profiles
allows to save ~5% of CPU time in aggsum code during sequential write
to 12 ZVOLs with 16KB block size on large dual-socket system.

I suppose there some minor arc_p behavior change due to lower precision
of the new code, but I don't think it is a big deal, since it should
affect only very small window in time (aggsum buckets are flushed every
second) and in ARC size (buckets are limited to 10 average ARC blocks
per CPU).

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-06-14 20:04:28 +00:00
Alexander Motin
b3b3aa2e29 Alike to ZoL disable metaslab allocation tracing code.
It is too generous to collect in production debug traces that can only
be read with kernel debugger.  Illumos includes special code in their
mdb debugger to read it, we don't.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-06-14 19:57:32 +00:00
Alexander Motin
284e53a401 Properly align struct multilist_sublist to cache line.
Manual Illumos alignment does not fit us due to different kmutex_t size.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-06-14 17:09:39 +00:00
Alan Cox
aa4a1d3a55 Change the arm64 pmap so that updates to the global count of wired pages are
not performed directly by the pmap.  Instead, they are performed by
vm_page_free_pages_toq().  (This is the same approach that we use on x86.)

Reviewed by:	kib, markj
MFC after:	10 days
Differential Revision:	https://reviews.freebsd.org/D20627
2019-06-14 04:01:08 +00:00
Doug Moore
771315283b Avoid using the prev field of vm_map_entry_t in two functions that
iterate over consecutive vm_map entries, and that can easily just
'remember' the prev value instead of looking it up.

Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20628
2019-06-14 03:15:54 +00:00