Commit Graph

134152 Commits

Author SHA1 Message Date
Emmanuel Vadot
1c62664f24 arm: allwinner: aw_nmi: Fix wrong logic when we disable the nmi
MFC after:	1 week
2020-09-20 16:11:38 +00:00
Michal Meloun
3182062142 Add missing assignment forgotten in r365899
Noticed by:	mav
MFC after:	1 month
MFC with:	r365899
2020-09-20 15:11:52 +00:00
Alexander V. Chernikov
c4bcfe98e2 Fix gw updates / flag updates during route changes.
* Zero gw_sdl if switching to interface route - the assumption
 that underlying storage is zeroed is incorrect with route changes.
* Apply proper flag mask to rte.

Reported by:	vangyzen
2020-09-20 12:31:48 +00:00
Hans Petter Selasky
a29c0348f0 Fix for use of the XHCI driver on Cortex-A72 by adding a missing cache
flush operation before writing to the XHCI_ERSTBA_LO/HI register(s).

PR:		237666
Discussed with:	Mark Millard <marklmi@yahoo.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies // Nvidia
2020-09-19 22:37:45 +00:00
Mark Johnston
d26ab2bec0 Fix some nits in 1G page support in the amd64 pmap.
- Move assertions out of the main loop to avoid duplicate conditional
  expressions, and improve assertion messages.
- Fix va_next updates.  In some cases we were not doing the wraparound
  check before continuing the loop.
- Use the right va_next.  In pmap_advise() and pmap_copy() we would step
  through 1G pages 2M at a time.
- Copy 1G mappings in pmap_copy().

Reviewed by:	alc, kib
MFC with:	r365518
Sponsored by:	Juniper Networks, Inc., Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26463
2020-09-19 15:22:04 +00:00
Michal Meloun
b8bfffc1b6 Implement workaround for broken access to configuration space.
Due to a HW bug in the RockChip PCIe implementation, attempting to access
a non-existent register in the configuration space will throw an exception.
Use new bus functions bus_peek() and bus_poke() to overcomme this limitation.
2020-09-19 11:27:16 +00:00
Michal Meloun
95a85c125d Add NetBSD compatible bus_space_peek_N() and bus_space_poke_N() functions.
One problem with the bus_space_read_N() and bus_space_write_N() family of
functions is that they provide no protection against exceptions which can
occur when no physical hardware or device responds to the read or write
cycles. In such a situation, the system typically would panic due to a
kernel-mode bus error. The bus_space_peek_N() and bus_space_poke_N() family
of functions provide a mechanism to handle these exceptions gracefully
without the risk of crashing the system.

Typical example is access to PCI(e) configuration space in bus enumeration
function on badly implemented PCI(e) root complexes (RK3399 or Neoverse
N1 N1SDP and/or access to PCI(e) register when device is in deep sleep state.

This commit adds a real implementation for arm64 only. The remaining
architectures have bus_space_peek()/bus_space_poke() emulated by using
bus_space_read()/bus_space_write() (without exception handling).

MFC after:	1 month
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D25371
2020-09-19 11:06:41 +00:00
Rick Macklem
58dd2b52cb Fix a LOR between the NFS server and server side krpc.
Recent testing of the NFS-over-TLS code found a LOR between the mutex lock
used for sessions and the sleep lock used for server side krpc socket
structures in nfsrv_checksequence().  This was fixed by r365789.
A similar bug exists in nfsrv_bindconnsess(), where SVC_RELEASE() is called
while mutexes are held.
This patch applies a fix similar to r365789, moving the SVC_RELEASE() call
down to after the mutexes are released.

This patch fixes the problem by moving the SVC_RELEASE() call in
nfsrv_checksequence() down a few lines to below where the mutex is released.

MFC after:	1 week
2020-09-18 23:52:56 +00:00
Matt Macy
2c48331d28 MFV 2.0-rc2
- Fixes divide by zero for unusual hz
- remove cryptodev dependency
2020-09-18 23:21:24 +00:00
Eric van Gyzen
d8d2dda141 amd64 pmap_pkru_same: prev_ppr was always NULL
Fix the logic so it works as it appears.

Reported by:	Coverity
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	D26211 (in progress, so omitting full URL)
2020-09-18 20:53:40 +00:00
Ed Maste
11224884f2 ys/contrib/dev/ath: remove unintentional double semicolon
Approved by:	adrian
2020-09-18 18:35:18 +00:00
Eric van Gyzen
f9cc8410e1 vm_ooffset_t is now unsigned
vm_ooffset_t is now unsigned. Remove some tests for negative values,
or make other adjustments accordingly.

Reported by:	Coverity
Reviewed by:	kib markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D26214
2020-09-18 16:48:08 +00:00
Mitchell Horne
374ce2488a Initialize some local variables earlier
Move the initialization of these variables to the beginning of their
respective functions.

On our end this creates a small amount of unneeded churn, as these
variables are properly initialized before their first use in all cases.
However, changing this benefits at least one downstream consumer
(NetApp) by allowing local and future modifications to these functions
to be made without worrying about where the initialization occurs.

Reviewed by:	melifaro, rscheff
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26454
2020-09-18 14:01:10 +00:00
Mark Johnston
d99cb9802b Assert we are not traversing through superpages in the arm64 pmap.
Reviewed by:	alc, andrew
MFC after:	1 week
Sponsored by:	Juniper Networks, Inc., Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26465
2020-09-18 12:37:41 +00:00
Mark Johnston
04636a71c6 Ensure that a protection key is selected in pmap_enter_largepage().
Reviewed by:	alc, kib
Reported by:	Coverity
MFC with:	r365518
Differential Revision:	https://reviews.freebsd.org/D26464
2020-09-18 12:30:39 +00:00
Navdeep Parhar
3b8506ae30 cxgbe(4): add the firmware binaries instead of the empty files that were added
in r365861.

Obtained from:	Chelsio Communications
MFC after:	3 days
Sponsored by:	Chelsio Communications
2020-09-18 03:11:47 +00:00
Navdeep Parhar
a4a4ad2dd9 cxgbe(4): add support for stateless offloads for VXLAN traffic.
Hardware assistance includes checksumming (tx and rx), TSO, and RSS on
the inner traffic in a VXLAN tunnel.

Relnotes:	Yes
Sponsored by:	Chelsio Communications
2020-09-18 03:01:47 +00:00
Navdeep Parhar
b092fd6c97 if_vxlan(4): add support for hardware assisted checksumming, TSO, and RSS.
This lets a VXLAN pseudo-interface take advantage of hardware checksumming (tx
and rx), TSO, and RSS if the NIC is capable of performing these operations on
inner VXLAN traffic.

A VXLAN interface inherits the capabilities of its vxlandev interface if one is
specified or of the interface that hosts the vxlanlocal address. If other
interfaces will carry traffic for that VXLAN then they must have the same
hardware capabilities.

On transmit, if_vxlan verifies that the outbound interface has the required
capabilities and then translates the CSUM_ flags to their inner equivalents.
This tells the hardware ifnet that it needs to operate on the inner frame and
not the outer VXLAN headers.

An event is generated when a VXLAN ifnet starts. This allows hardware drivers to
configure their devices to expect VXLAN traffic on the specified incoming port.

On receive, the hardware does RSS and checksum verification on the inner frame.
if_vxlan now does a direct netisr dispatch to take full advantage of RSS. It is
not very clear why it didn't do this already.

Future work:
Rx: it should be possible to avoid the first trip up the protocol stack to get
the frame to if_vxlan just so it can decapsulate and requeue for a second trip
up the stack. The hardware NIC driver could directly call an if_vxlan receive
routine for VXLAN traffic instead.

Rx: LRO. depends on what happens with the previous item. There will have to to
be a mechanism to indicate that it's time for if_vxlan to flush its LRO state.

Reviewed by:	kib@
Relnotes:	Yes
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25873
2020-09-18 02:37:57 +00:00
Navdeep Parhar
72cc43df17 Add a knob to allow zero UDP checksums for UDP/IPv6 traffic on the given UDP port.
This will be used by some upcoming changes to if_vxlan(4).  RFC 7348 (VXLAN)
says that the UDP checksum "SHOULD be transmitted as zero.  When a packet is
received with a UDP checksum of zero, it MUST be accepted for decapsulation."
But the original IPv6 RFCs did not allow zero UDP checksum.  RFC 6935 attempts
to resolve this.

Reviewed by:	kib@
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25873
2020-09-18 02:21:15 +00:00
Navdeep Parhar
830edb4561 Add two new ifnet capabilities for hw checksumming and TSO for VXLAN traffic.
These are similar to the existing VLAN capabilities.

Reviewed by:	kib@
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25873
2020-09-18 02:10:28 +00:00
Navdeep Parhar
1f7313861b mbuf checksum flags and fields to support tunneling protocols.
These are being added to support VXLAN but will work for GENEVE as well.
ENCAP_RSVD1 will likely become ENCAP_GENEVE in the future.

The size of struct mbuf does not change and that means this change can be MFC'd.
If size wasn't a constraint a cleaner way may have been to add inner_csum_flags
and inner_csum_data to go with csum_flags and csum_data.

Reviewed by:	kib@
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25873
2020-09-18 01:38:47 +00:00
Konstantin Belousov
294c24b194 State kgssapi dependency on xdr.
Submitted by:	Dmitry Afanasiev
PR:	249378
MFC after:	3 days
2020-09-17 22:29:38 +00:00
Navdeep Parhar
88c9c3f4dd cxgbe(4): Update T4/5/6 firmwares to 1.25.0.0.
Obtained from:	Chelsio Communications
MFC after:	3 days
Sponsored by:	Chelsio Communications
2020-09-17 22:14:11 +00:00
Warner Losh
fd0a41d241 Move to a more robust and conservative alloation scheme for devctl messages
Change the zone setup:
- Allow slabs to be returned to the OS
- Set the number of slots to the max devctl will queue before discarding
- Reserve 2% of the max (capped at 100) for low memory allocations
- Disable per-cpu caching since we don't need it and we avoid some pathologies

Change the alloation strategiy a bit:
- If a normal allocation fails, try to get the reserve
- If a reserve allocation fails, re-use the oldest-queued entry for storage
- If there's a weird race/failure and nothing on the queue to steal, return NULL

This addresses two main issues in the old code:
- If devd had died, and we're generating a lot of messages, we have an
  unbounded leak. This new scheme avoids the issue that lead to this.
- The MPASS that was 'sure' the allocation couldn't have failed turned out
  to be wrong in some rare cases. The new code doesn't make this assumption.

Since we reserve only 2% of the space, we go from about 1MB of
allocation all the time to more like 50kB for the reserve.

Reviewed by: markj@
Differential Revision: https://reviews.freebsd.org/D26448
2020-09-17 17:29:33 +00:00
Mark Johnston
97458520cc Increase the default vm.max_user_wired value.
Since r347532 (merged to stable/12) we only count user-wired pages
towards the system limit.  However, we now also treat pages wired by
hypervisors (bhyve and virtualbox) as user-wired, so starting VMs with
large amounts of RAM tends to fail due to the low limit.

The purpose of the limit is to provide a seatbelt, not to impose some
policy on the use of wired memory.  Thus, increase the default limit to
allow reasonable VM configurations to work without tuning.

Reviewed by:	kib
Discussed with:	dougm
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26424
2020-09-17 16:49:28 +00:00
Mitchell Horne
003470c31a Add dtb/sifive module
This allows building the HiFive Unleashed device tree blob.

Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D26459
2020-09-17 14:58:30 +00:00
Edward Tomasz Napierala
106a784b35 Reduce code duplication by introducing linux_copyout_sockaddr()
helper function.  No functional changes.

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25804
2020-09-17 12:14:24 +00:00
Edward Tomasz Napierala
79e3da0602 Add support for SOUND_MIXER_WRITE_MONITOR ioctl. Fixes alsamixer(1)
on my x220.

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25806
2020-09-17 11:44:45 +00:00
Edward Tomasz Napierala
70890254b3 Get rid of sv_errtbl and SV_ABI_ERRNO().
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26388
2020-09-17 11:39:33 +00:00
Eugene Grosbein
b2b5d4c07d geom_part: make it possible recovering broken GPT after some LBAs cut off
This is followup to r365477.

If pre-formatted device has GPT and a partition covering
last available LBAs and the device is attached using
a bridge reducing amount of LBAs, then it could be not enough
forcing GEOM to use primary GPT. Also, we should make it possible
to recover GPT and this requires either deleting or resizing the partition.

This change enables "gpart delete" and "gpart resize" commands
on corrupted GPT with following "gpart recover".

It still does not allow modifying corrupted GPT without
preliminary setting sysctl kern.geom.part.check_integrity=0

For example:

# gpart show da0
=>        34  3906963389  da0  GPT  (1.8T) [CORRUPT]
          34      262144    1  ms-reserved  (128M)
      262178        2014       - free -  (1.0M)
      264192  3906764943    2  freebsd-swap  (1.8T)
# gpart resize -i 2 -s 3900000000 da0
# gpart recover da0

Reported by:	Alex Korchmar
MFC after:	3 days
2020-09-17 04:39:39 +00:00
Konstantin Belousov
dd90d96342 Put calls to check_pgrp_jobc() in fixjobc_kill() under INVARIANTS.
Reported by:	Michael Butler <imb@protected-networks.net>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-09-17 00:07:15 +00:00
Konstantin Belousov
182cfe6ff4 Add check_pgrp_jobc() calls into process exit path.
Both before and after job control adjustments.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26416
2020-09-16 21:49:19 +00:00
Konstantin Belousov
2f5f11f533 Fix fixjobc+orhpanage.
Orphans affect job control state, we must account for them when
changing pg_jobc.

Instead of p_pptr, use proc_realparent() to get parent relevant for
job control.

Use correct calculation of the parent for exiting process.  For jobc
purposes, we must use realparent, but if it is also exiting, we should
fall to reaper, then recursively find non-exiting reaper.

Reported by:	trasz
PR:	249257
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26416
2020-09-16 21:46:57 +00:00
Konstantin Belousov
928b85384a Assert that P_TREE_GRPEXITED is set only once.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26416
2020-09-16 21:40:32 +00:00
Konstantin Belousov
844219f471 proc_realparent: if p_oppid does not match pid of the current parent
for non-orphaned process, return reaper instead of init.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26416
2020-09-16 21:38:24 +00:00
Konstantin Belousov
82207cd246 Improve ddb 'show pgrpdump' command.
Use ddb pager.
Make lines more compact.
Eliminate unneeded casts.
Print more job-control related info when reporting process group.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26416
2020-09-16 21:34:18 +00:00
Konstantin Belousov
016b7c7e39 tmpfs: restore atime updates for reads from page cache.
Split TMPFS_NODE_ACCCESSED bit into dedicated byte that can be updated
atomically without locks or (locked) atomics.

tn_update_getattr() change also contains unrelated bug fix.

Reported by:	lwhsu
PR:	249362
Reviewed by:	markj (previous version)
Discussed with:	mjg
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26451
2020-09-16 21:28:18 +00:00
Konstantin Belousov
23f9071466 Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2020-09-16 21:24:34 +00:00
Mitchell Horne
ceff9b9d25 if_media: definitions for 40GE LM4 ethernet media type
Reviewed by:	erj
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26276
2020-09-16 14:45:16 +00:00
Mark Johnston
e12492164a Move PLTs to the beginning of amd64 kernel modules.
As with .text, the aim is to ensure that executable sections are
segregated from the rest, to avoid creation of writeable and executable
mappings.  Recent versions of LLVM emit a PLT in firmware modules.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26444
2020-09-16 13:51:47 +00:00
Warner Losh
9ea860660f Use standard bool type, instead of non-standard boolean_t 2020-09-16 06:02:30 +00:00
Rick Macklem
a5c55410b3 Fix a LOR between the NFS server and server side krpc.
Recent testing of the NFS-over-TLS code found a LOR between the mutex lock
used for sessions and the sleep lock used for server side krpc socket
structures.
The code in nfsrv_checksequence() would call SVC_RELEASE() with the mutex
held.  Normally this is ok, since all that happens is SVC_RELEASE()
decrements a reference count.  However, if the socket has just been shut
down, SVC_RELEASE() drops the reference count to 0 and acquires a sleep
lock during destruction of the server side krpc structure.

This patch fixes the problem by moving the SVC_RELEASE() call in
nfsrv_checksequence() down a few lines to below where the mutex is released.

MFC after:	1 week
2020-09-16 02:25:18 +00:00
Mark Johnston
b4e07e3da5 Fix locking in uipc_accept().
Reported by:	cy
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-09-15 23:03:56 +00:00
Konstantin Belousov
081e36e760 Add tmpfs page cache read support.
Or it could be explained as lockless (for vnode lock) reads.  Reads
are performed from the node tn_obj object.  Tmpfs regular vnode object
lifecycle is significantly different from the normal OBJT_VNODE: it is
alive as far as ref_count > 0.

Ensure liveness of the tmpfs VREG node and consequently v_object
inside VOP_READ_PGCACHE by referencing tmpfs node in tmpfs_open().
Provide custom tmpfs fo_close() method on file, to ensure that close
is paired with open.

Add tmpfs VOP_READ_PGCACHE that takes advantage of all tmpfs quirks.
It is quite cheap in code size sense to support page-ins for read for
tmpfs even if we do not own tmpfs vnode lock.  Also, we can handle
holes in tmpfs node without additional efforts, and do not have
limitation of the transfer size.

Reviewed by:	markj
Discussed with and benchmarked by:	mjg (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 22:19:16 +00:00
Konstantin Belousov
4601f5f5ee Microoptimize tmpfs node ref/unref by using atomics.
Avoid tmpfs mount and node locks when ref count is greater than zero,
which is the case until node is being destroyed by unlink or unmount.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 22:13:21 +00:00
Konstantin Belousov
3c484f325e Convert page cache read to VOP.
There are several negative side-effects of not calling into VOP layer
at all for page cache reads.  The biggest is the missed activation of
EVFILT_READ knotes.

Also, it allows filesystem to make more fine grained decision to
refuse read from page cache.

Keep VIRF_PGREAD flag around, it is still useful for nullfs, and for
asserts.

Reviewed by:	markj
Tested by:	pho
Discussed with:	mjg
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 22:06:36 +00:00
Konstantin Belousov
888636655d vfs_subr.c: export io_hold_cnt and vn_read_from_obj().
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 22:00:58 +00:00
Konstantin Belousov
96474d2a3f Do not copy vp into f_data for DTYPE_VNODE files.
The pointer to vnode is already stored into f_vnode, so f_data can be
reused.  Fix all found users of f_data for DTYPE_VNODE.

Provide finit_vnode() helper to initialize file of DTYPE_VNODE type.

Reviewed by:	markj (previous version)
Discussed with:	freqlabs (openzfs chunk)
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 21:55:21 +00:00
Eric Joyner
a3b9a7366e e1000: Properly retain promisc flag
From Franco:
The iflib rewrite forced the promisc flag but it was not reported
to the system.  Noticed on a stock VM that went into unsolicited
promisc mode when dhclient was started during bootup.

PR:		248869
Submitted by:	Franco Fichtner <franco@opnsense.org>
Reviewed by:	erj@
MFC after:	3 days
2020-09-15 21:07:30 +00:00
Ed Maste
09860d44e4 bhyve: do not permit write access to VMCB / VMCS
Reported by:	Patrick Mooney
Submitted by:	jhb
Security:	CVE-2020-24718
2020-09-15 21:04:27 +00:00
Eric Joyner
467515a494 igb(4): Fix define and includes with RSS option enabled
This re-adds the opt_rss.h header to the driver and includes some
RSS-specific headers when RSS is defined.

PR:		249191
Submitted by:	Milosz Kaniewski <milosz.kaniewski@gmail.com>
MFC after:	3 days
2020-09-15 21:00:25 +00:00
Brandon Bergren
9673f30503 [PowerPC64LE] Use correct in_masks table on LE to fix checksumming
Due to a check that should have been an endian check being an #if 0,
the wrong checksum mask table was being used on LE, which was causing
extreme strangeness in DNS resolution -- *some* hosts would be resolvable,
but most would not.

This fixes DNS resolution.

(I am committing some parts of the LE patchset ahead of time to reduce the
amount of work I have to do while committing the main patchset.)

Sponsored by:	Tag1 Consulting, Inc.
2020-09-15 20:47:33 +00:00
Brandon Bergren
1e936efbce [PowerPC64LE] Set up the powernv partition table correctly.
The partition table is always big endian.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-15 20:25:38 +00:00
Konstantin Belousov
101d5b527a bhyve: intercept AMD SVM instructions.
Intercept and report #UD to VM on SVM/AMD in case VM tried to execute an
SVM instruction.  Otherwise, SVM allows execution of them, and instructions
operate on host physical addresses despite being executed in guest mode.

Reported by:	Maxime Villard <max@m00nbsd.net>
admbug:	972
CVE:	CVE-2020-7467
Reviewed by:	grehan, markj
Differential revision:	https://reviews.freebsd.org/D26313
2020-09-15 20:22:50 +00:00
Mark Johnston
448000279e Fix locking in uipc_accept().
This function wasn't converted to use the new locking protocol in
r333744.  Make it use the PCB lock for synchronizing connection state.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26300
2020-09-15 19:23:42 +00:00
Mark Johnston
ccdadf1a9b Simplify unix socket connection peer locking.
unp_pcb_owned_lock2() has some sharp edges and forces callers to deal
with a bunch of cases.  Simplify it:

- Rename to unp_pcb_lock_peer().
- Return the connected peer instead of forcing callers to load it
  beforehand.
- Handle self-connected sockets.
- In unp_connectat(), just lock the accept socket directly.  It should
  not be possible for the nascent socket to participate in any other
  lock orders.
- Get rid of connect_internal().  It does not provide any useful
  checking anymore.
- Block in unp_connectat() when a different thread is concurrently
  attempting to lock both sides of a connection.  This provides simpler
  semantics for callers of unp_pcb_lock_peer().
- Make unp_connectat() return EISCONN if the socket is already
  connected.  This fixes a race[1] when multiple threads attempt to
  connect() to different addresses using the same datagram socket.
  Upper layers will disconnect a connected datagram socket before
  calling the protocol connect's method, but there is no synchronization
  between this and protocol-layer code.

Reported by:	syzkaller [1]
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26299
2020-09-15 19:23:22 +00:00
Mark Johnston
ed92e1c78c Avoid an unnecessary malloc() when connecting dgram sockets.
The allocated memory is only required for SOCK_STREAM and SOCK_SEQPACKET
sockets.

Reviewed by:	kevans
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26298
2020-09-15 19:23:01 +00:00
Mark Johnston
f0317f868b Simplify unp_disconnect() callers.
In all cases, PCBs are unlocked after unp_disconnect() returns.  Since
unp_disconnect() may release the last PCB reference, callers may have to
bump the refcount before the call just so that they can release them
again.

Change unp_disconnect() to release PCB locks as well as connection
references; this lets us remove several refcount manipulations.  Tighten
assertions.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26297
2020-09-15 19:22:37 +00:00
Mark Johnston
4820bf6ac2 Rename unp_pcb_lock2().
unp_pcb_lock_pair() seems like a better name.  Also make it handle the
case where the two sockets are the same instead of making callers do it.
No functional change intended.

Reviewed by:	glebius, kevans, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26296
2020-09-15 19:22:16 +00:00
Mark Johnston
5362170da7 Improve unix socket PCB refcounting.
- Use refcount_init().
- Define an INVARIANTS-only zone destructor to assert that various
  bits of PCB state aren't left dangling.
- Annotate unp_pcb_rele() with __result_use_check.
- Simplify control flow.

Reviewed by:	glebius, kevans, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26295
2020-09-15 19:21:58 +00:00
Mark Johnston
d5cbccecd8 Update unix domain socket locking comments.
- Define a locking key for unpcb members.
- Rewrite some of the locking protocol description to make it less
  verbose and avoid referencing some subroutines which will be renamed.
- Reorder includes.

Reviewed by:	glebius, kevans, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26294
2020-09-15 19:21:33 +00:00
Edward Tomasz Napierala
c26391f4dd Move SV_ABI_ERRNO translation into linux-specific code, to simplify
the syscall path and declutter it a bit.  No functional changes intended.

Reviewed by:	kib (earlier version)
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26378
2020-09-15 16:41:21 +00:00
Warner Losh
2215adc5ff Include sys/types.h here
It's included by header pollution in most of the compile
environments. However, in the standalone envirnment, it's not
included. Go ahead and include it always since the overhead is low and
it is simpler that way.

MFC After: 3 days
2020-09-15 15:21:29 +00:00
Andrew Turner
d6aa5fe5eb Use ATTR_DEFAULT in the arm64 locore.S
We can use ATTR_DEFAULT directly in locore.S as it fits within an orr
instruction operand.

Sponsored by:	Innovate UK
2020-09-15 14:15:04 +00:00
Warner Losh
0c97af56a7 We don't need the sc_ekeys_lock in standalone environment.
When we bring in geli into the boot loader, we are single threaded so
we don't have to worry about locking. We have no mutexes, and don't need
to use them, so comment it out.

MFC After: 3 days
2020-09-14 23:51:14 +00:00
Warner Losh
44a8153672 Don't do the busy dance in icee_open/close
We don't need to do the busy dance for this driver. It's handled by
destroy_dev() entirely. Since all we did was busy/unbusy in
open/close, just delete them. We therefore don't need to track closes
either.

Reviewed by: ian@
Differential Revision: https://reviews.freebsd.org/D26431
2020-09-14 23:30:04 +00:00
Warner Losh
45472826b9 Tweak what's visible in the standalone environment. We define offsetof
in stand.h typically, but when this is included we can define it
multiple times. However, we don't define bool in stand.h at the
moment, so allow it to be defined inside types.h when we're building
for the standalone environment.

MFC After: 3 days
2020-09-14 23:27:51 +00:00
Navdeep Parhar
bb60ba7e22 cxgbe(4): Get the count of FCS errors from the MAC and not MPS for T6 ports.
The MPS register on the T6 counts something other than FCS errors despite its
name.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2020-09-14 22:15:54 +00:00
Ian Lepore
3c41bcdf29 Add product ID strings for a couple Microchip usb hubs. Also, update the
vendor ID string to say just "Microchip Technology" -- the buyout of
Standard Microsystems happened in 2012 and the SMC/SMSC names are pretty
much retired at this point.

PR:		241406
2020-09-14 17:33:28 +00:00
Andrew Turner
2a6803de1c Use MACHINE_CPUARCH when checking for arm64
Use MACHINE_CPUARCH with arm64 (aarch64) when we build code that could run
on any 64-bit Arm instruction set. This will simplify checks in downstream
consumers targeting prototype instruction sets.

The only place we check for MACHINE_ARCH == aarch64 is when building the
device tree blobs. As these are targeting current generation ISAs.

Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D26370
2020-09-14 16:12:28 +00:00
Brandon Bergren
115c987b3f [PowerPC] Make cpu frequency detection endian-independent
On ibm,extended-clock-frequency, ensure we be64toh() the value.

On clock-frequency, remove the right-shifting hack (which was needed due to
reading a 32 bit value into a 64 bit variable) and switch to OF_getencprop()
for reading (which will handle endian conversion internally.)

Reviewed by:	jhibbits (in irc)
Sponsored by:	Tag1 Consulting, Inc.
2020-09-14 15:20:37 +00:00
Gordon Tetlow
bfdd0a34ad Partially revert r346018 and use the if/then construct instead of shell.
There are a couple of places in the tree that directly parse the newvers.sh
script looking for the BRANCH variable. I found two locations, one in
release/Makefile and the other in bin/freebsd-version/Makefile.

While there is a good argument that BRANCH_OVERRIDE should properly
propagate in those circumstances and the new behavior is thus better, the
reality is this change broke freebsd-update's ability to find timestamps in
binaries and resulted in a large number of gratuitous changes.

Reported by:	freebsd-update
Discussed with:	cperciva
MFC after:	1 day
2020-09-14 14:45:30 +00:00
Hans Petter Selasky
39b0f9c389 Poll statistics more frequently in mlx5en(4).
This makes traffic steering algorithms more accurate.

MFC after:	1 week
Submitted by:	gallatin @
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-09-14 14:24:54 +00:00
Edward Tomasz Napierala
fecb19e431 Move td_softdep_cleanup() from userret() to ast(); it's infrequent
at best.  The schedule_cleanup() function already sets TDF_ASTPENDING.

Reviewed by:	kib, mckusick
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26375
2020-09-14 10:17:07 +00:00
Edward Tomasz Napierala
60f083efe2 Move TDP_GEOM check from userret() to ast(); this code path is quite
infrequent.

Reviewed by:	kib
No objections:	mav
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26374
2020-09-14 10:14:03 +00:00
Edward Tomasz Napierala
30d158eecc Move racct/rctl throttling from userret() to ast(). There's no reason
for it to sit in the syscall fast path.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26368
2020-09-14 09:44:24 +00:00
Andrew Turner
d729904a4f Allow for interrupts on pl061 children
Add enough infrastructure for interrupts on children of the pl061 GPIO
controller. As gpiobus already provided these the pl061 driver also needs
to pass requests up the newbus hierarchy.

Currently there are no children that expect to configure interrupts, however
this is expected to change to support the ACPI Event Information interface.

Sponsored by:	Innovate UK
2020-09-14 08:59:16 +00:00
Scott Long
74c781ed91 Refine the busdma template interface. Provide tools for filling in fields
that can be extended, but also ensure compile-time type checking.  Refactor
common code out of arch-specific implementations.  Move the mpr and mps
drivers to this new API.  The template type remains visible to the consumer
so that it can be allocated on the stack, but should be considered opaque.
2020-09-14 05:58:12 +00:00
Kyle Evans
7c9b0046d4 __FreeBSD_version bump for r365605 (crunchgen producing WARNS-clean)
The change in D26397 will need a __FreeBSD_version to base off of for
bootstrapping crunchgen, to avoid avoidable build failures just because the
host has an outdated crunchgen.
2020-09-14 01:56:29 +00:00
Rick Macklem
2848d6d4de Fix a case where the NFSv4.0 server might crash if delegations are enabled.
asomers@ reported a crash on an NFSv4.0 server with a backtrace of:
kdb_backtrace
vpanic
panic
nfsrv_docallback
nfsrv_checkgetattr
nfsrvd_getattr
nfsrvd_dorpc
nfssvc_program
svc_run_internal
svc_thread_start
fork_exit
fork_trampoline
where the panic message was "docallb", which indicates that a callback
was attempted when the ClientID is unconfirmed.
This would not normally occur, but it is possible to have an unconfirmed
ClientID structure with delegation structure(s) chained off it if the
client were to issue a SetClientID with the same "id" but different
"verifier" after acquiring delegations on the previously confirmed ClientID.

The bug appears to be that nfsrv_checkgetattr() failed to check for
this uncommon case of an unconfirmed ClientID with a delegation structure
that no longer refers to a delegation the client knows about.

This patch adds a check for this case, handling it as if no delegation
exists, which is the case when the above occurs.
Although difficult to reproduce, this change should avoid the panic().

PR:		249127
Reported by:	asomers
Reviewed by:	asomers
MFC after:	1 week
Differential Revision:	https://reviews.freebbsd.org/D26342
2020-09-14 00:44:50 +00:00
Brandon Bergren
e44d86731e [PowerPC] bus_space cleanup part 2: Convert to c99 initializers.
To make it easier to work with this in the future, convert to c99
designated initializer syntax.

Tested on powerpc, powerpc64, and powerpc64le. No functional change.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-13 21:34:32 +00:00
Brandon Bergren
88e3d5df6a [PowerPC] bus_space cleanup part 1 - rename bs_be / bs_le functions
The intention of the bus_be naming was for those to be the no-endian-swapping
and for the bus_le to be endian-swapping in all the functions.

This naming breaks down when we're actually are running in LE and need to
use the opposite sense.

As such, rename bs_be_* to native_bs_* and rename bs_le_* to swapped_bs_*.

No functional change.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-13 21:27:30 +00:00
Brandon Bergren
edf215199e [PowerPC64LE] Bus space prep for LE
Swap the BE and LE bus_space tags when on LE, and adjust the nexus tag
to match.

This is prep for a a followup that makes the powerpc bus_space macros easier
to maintain in the future.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-13 21:22:39 +00:00
Brandon Bergren
b963e10d68 [PowerPC64LE] Ensure nvram is built on powerpc64le.
Fix some cases where conditionals that were trying to exclude powerpcspe
were also excluding powerpc64le.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-13 18:24:15 +00:00
Brandon Bergren
81472778e8 [PowerPC64LE] Adjust ELF definitions for LE.
Set ELF_TARG_DATA correctly on PowerPC64LE.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-13 17:36:43 +00:00
Brandon Bergren
43d3fc803c [PowerPC] Implement pmap_mincore() for moea
Do the same as previous for moea.

Tested on G4.
2020-09-13 16:46:03 +00:00
Brandon Bergren
96f57c313d [PowerPC64] Implement pmap_mincore() for moea64
Implement pmap_mincore() for moea64.

This will need some slight tweaks when large page support in HPT lands.

Submitted by:	Fernando Eckhardt Valle <fernando.valle@eldorado.org.br>
Reviewed by:	bdragon
Differential Revision:	https://reviews.freebsd.org/D26314
2020-09-13 16:42:49 +00:00
Michael Tuexen
42d7560796 Export the name of the congestion control. This will be used by sockstat
and netstat.

Reviewed by:		rscheff
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D26412
2020-09-13 09:06:50 +00:00
Brandon Bergren
7be655c2b4 [PowerPC] Add PVO_PADDR macro to mmu_oea.c to match mmu_oea64.c changes
Use a PVO_PADDR macro on 32 bit as well, to reduce the difference between
mmu_oea.c and mmu_oea64.c.

Equivilent to the changes in r363222.
2020-09-12 23:54:57 +00:00
Mike Karels
ac89220b05 bcm2838_pci.c: Respect DMA limits of controller.
Fixes for Raspberry Pi 4B PCIe / USB:
- Pass through a DMA tag for the controller.
- In theory the controller can access the lower 3 GB, but testing found
  that unreliable. OpenBSD also restricts DMA to the lowest 960 MiB.
- Rename some constants to be a bit more meaningful.

Submitted by:	Robert Crowston, crowston at protonmail.com
Reviewed by:	mkarels, outside reviewers
Differential Revision:	https://reviews.freebsd.org/D26344
2020-09-12 23:49:43 +00:00
Jason A. Harmening
3966af52f3 amd64: prevent KCSan false positives on LAPIC mapping
For configurations without x2APIC support (guests, older hardware), the global
LAPIC MMIO mapping will trigger false-positive KCSan reports as it will appear
that multiple CPUs are concurrently reading and writing the same address.
This isn't actually true, as the underlying physical access will be performed
on the local CPU's APIC. Additionally, because LAPIC access can happen during
event timer configuration, the resulting KCSan printf can produce a panic due
to attempted recursion on event timer resources.

Add a __nosanitizethread preprocessor define to prevent the compiler from
inserting TSan hooks, and apply it to the x86 LAPIC accessors.

PR:		249149
Reported by:	gbe
Reviewed by:	andrew, kib
Tested by:	gbe
Differential Revision:	https://reviews.freebsd.org/D26354
2020-09-12 07:04:00 +00:00
John-Mark Gurney
7d5522e16a A major update to the ure driver.
This update adds support for:
HW VLAN tagging
HW checksum offload for IPv4 and IPv6
tx and rx aggreegation (for full gige speeds)
multiple transactions

In my testing, I am able to get 900-950Mbps depending upon
TCP or UDP, which is a significant improvement over the previous
91Mbps (~8kint/sec*1500bytes/packet*1packet/int).

Reviewed by:	hselasky
MFC after:	2 months
Differential Revision:	https://reviews.freebsd.org/D25809
2020-09-12 00:33:11 +00:00
Glen Barber
038fe1e3ef Enclose BRANCH_OVERRIDE in quotes in order to fix an issue with
freebsd-update(8) builds, where BRANCH is suffixed with -p0 for
builds.

Noticed by:	gordon
With help from:	cperciva
MFC after:	3 days
MFC note:	before 12.2-BETA2
Sponsored by:	Rubicon Communications, LLC (netgate.com)
2020-09-12 00:06:45 +00:00
Scott Long
1002529ea9 Convert the mps driver to use busdma templates 2020-09-11 22:27:35 +00:00
John Baldwin
d8dc46f6e9 Add constant for the DE_CFG MSR on AMD CPUs.
Reported by:	Patrick Mooney <pmooney@pfmooney.com>
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25885
2020-09-11 20:32:40 +00:00
Bjoern A. Zeeb
aaa1fdb87b iwm: fix regression from r365419 (ieee80211_media_change())
In r365419 ieee80211_media_change() callers were updated to not longer
act on the obselete ENETRESET return code.
While in the old days iwm has done a stop/init cycle in these cases,
this was not executed since r193340.
As a consequence simplify iwm code as well by passing ieee80211_media_change()
right to ieee80211_vap_attach() as there is no more need for a local
implementation.

Reported by:	Tomoaki AOKI (junchoon dec.sakura.ne.jp)
Tested by:	Tomoaki AOKI (junchoon dec.sakura.ne.jp)
MFC after:	3 days
X-MFC:		fix is already in stable/12
PR:		248955
2020-09-11 14:18:47 +00:00
Kristof Provost
effd82ca70 dtrace: fix fbt return probes on RISC-V
Return values are passed in a0, so read it from there. We also pass a1 through
to userspace, as the ABI allows small structs to be returned in registers
a0/a1. While here read the register values directly from the trapframe rather
than rtval, and remove the now unneeded argument from dtrace_invop().

Set fbtp_roffset so that we get the correct return location in arg0.

Reviewed by:	markj
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D26389
2020-09-11 09:15:49 +00:00
John-Mark Gurney
0b39e3448a Don't clear reserved bits per RealTek
MFC after:	3 days
2020-09-11 02:02:13 +00:00
John Baldwin
39585a4c10 Disable WITNESS for spin locks by default.
This matches all other architectures and removes substantial overhead.

Reported by:	arichardson (indirectly)
Reviewed by:	imp, arichardson
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26403
2020-09-11 00:06:16 +00:00
Eric Joyner
7d7af7f85b ice(4): Update to 0.26.16
Summary of changes:

- Assorted bug fixes
- Support for newer versions of the device firmware
- Suspend/resume support
- Support for Lenient Link Mode for E82X devices (e.g. can try to link with
  SFP/QSFP modules with bad EEPROMs)
- Adds port-level rx_discards sysctl, similar to ixl(4)'s

This version of the driver is intended to be used with DDP package 1.3.16.0,
which has already been updated in a previous commit.

Tested by:	Jeffrey Pieper <jeffrey.e.pieper@intel.com>
MFC after:	3 days
MFC with:	r365332, r365550
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D26322
2020-09-10 23:46:13 +00:00
John Baldwin
385f4a5ac8 Use vmcb_read/write for the vmcb snapshot functions.
This avoids some unnecessary layers of indirection.
2020-09-10 22:22:23 +00:00
Konstantin Belousov
7978363417 Fix interaction between largepages and seals/writes.
On write with SHM_GROW_ON_WRITE, use proper truncate.
Do not allow to grow largepage shm if F_SEAL_GROW is set. Note that
shrinks are not supported at all due to unmanaged mappings.
Call to vm_pager_update_writecount() is only valid for swap objects,
skip it for unmanaged largepages.
Largepages cannot support write sealing.
Do not writecnt largepage mappings.

Reported by:	kevans
Reviewed by:	kevans, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26394
2020-09-10 20:54:44 +00:00
Brandon Bergren
691b35f344 [PowerPC64LE] Add LOAD_LR_NIA and RETURN_TO_NATIVE_ENDIAN defines.
* Add LOAD_LR_NIA define. This is preferred to "bl 1f; 1:" because it
doesn't pollute the branch predictor.

* Add magic sequence to return the CPU to the correct endianness after
jumping to cross-endian code, similar to the sequence from Linux.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-10 18:41:15 +00:00
Li-Wen Hsu
9f917148e0 urndis(4): Add support of Inseego/Novatel Wireless MiFi 8800/8000
PR:		245152
Submitted by:	rootless@gmail.com
Reviewed by:	hselasky
MFC after:	3 days
2020-09-10 18:27:52 +00:00
Andrew Turner
15fe2adacb Move the pl061 acpi attachment earlier
As the pl061 driver can be an interrupt controller attach it earlier in the
boot so other drivers can use it.

Use a new GPIO xref to not conflict with the existing root interrupt
controller.

Sponsored by:	Innovate UK
2020-09-10 14:58:46 +00:00
Ruslan Bukin
cb9050dd21 Move the rid variable to the generic iommu context.
It could be used in various IOMMU platforms, not only DMAR.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D26373
2020-09-10 14:12:25 +00:00
Andrew Turner
128e746c6a Switch the name of the pl061 driver to gpio
We need it to be named gpio for gpiobus to work.

Sponsored by:	Innovate UK
2020-09-10 09:50:43 +00:00
Andrew Turner
f5e4e9153c Only manage ofw gpio providers on ofw systems
On arm64 we may boot via ACPI. In this case we will still try to manage the
gpio providers as if we are using FDT. Fix this by checking if the FDT node
is valid before registering a cross reference.

Sponsored by:	Innovate UK
2020-09-10 09:42:37 +00:00
Andrew Turner
365ed84f28 Use the correct variable to check which interrupt mode to use
In the PL061 driver we incorrectly used the mask rather than mode to find
how to configure the interrupt.

Sponsored by:	Innovate UK
2020-09-10 09:37:30 +00:00
Alexander V. Chernikov
2b32d93e55 Fix RADIX_MPATH build broken by r365521.
Reported by:	jenkins, Hartmann, O. <ohartmann at walstatt.org>
2020-09-10 07:05:31 +00:00
Eric Joyner
cf08d707d4 ice_ddp: Fix 1.3.16.0 package
The version uploaded in the previous commit was far too small. This one
should be the right size.

MFC after:	1 day
Sponsored by:	Intel Corporation
2020-09-10 04:00:13 +00:00
Brandon Bergren
5c74d551d2 [PowerPC] Fix setting of time in OPAL
There were multiple bugs in the OPAL RTC code which had never been
discovered, as the default configuration of OPAL machines is to
have the BMC / FSP control the RTC.

* Fix calling convention for setting the time -- the variables are passed
directly in CPU registers, not via memory.

* Fix bug in the bcd encoding routines. (from jhibbits)

Tested on POWER9 Talos II (BE) and POWER9 Blackbird (LE).

Reviewed by:	jhibbits (in irc)
Sponsored by:	Tag1 Consulting, Inc.
2020-09-10 01:49:53 +00:00
Richard Scheffenegger
e74e64a191 cc_mod: remove unused CCF_DELACK definition
During the DCTCP improvements, use of CCF_DELACK was
removed. This change is just to rename the unused flag
bit to prevent use of it, without also re-implementing
the tcp_input and tcp_output interfaces.

No functional change.

Reviewed by:	chengc_netapp.com, tuexen
MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D26181
2020-09-10 00:46:38 +00:00
Konstantin Belousov
d301b3580f Support for userspace non-transparent superpages (largepages).
Created with shm_open2(SHM_LARGEPAGE) and then configured with
FIOSSHMLPGCNF ioctl, largepages posix shared memory objects guarantee
that all userspace mappings of it are served by superpage non-managed
mappings.

Only amd64 for now, both 2M and 1G superpages can be requested, the
later requires CPU feature.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-09 22:12:51 +00:00
Alexander V. Chernikov
aa8f9f90ff Update nexthop handling for route addition/deletion in preparation for mpath.
Currently kernel requests deletion for the certain routes with specified gateway,
 but this gateway is not actually checked. With multipath routes, internal gateway
 checking becomes mandatory. Add the logic performing this check.

Generalise RTF_PINNED routes to the generic route priorities, simplifying the logic.

Add lookup_prefix() function to perform exact match search based on data in @info.

Differential Revision:	https://reviews.freebsd.org/D26356
2020-09-09 22:07:54 +00:00
Konstantin Belousov
e2e80fb3de vm_map: Add a map entry kind that can only be clipped at specific boundary.
The entries and their clip boundaries must be aligned on supported
superpages sizes from pagesizes[].  vm_map operations return Mach
error KERN_INVALID_ARGUMENT, which is usually translated to EINVAL, if
it would require clip not at the boundary.

In other words, entries force preserving virtual addresses superpage
properties.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-09 22:02:30 +00:00
Konstantin Belousov
6cadbcd203 Add pmap_enter(9) PMAP_ENTER_LARGEPAGE flag and implement it on amd64.
The flag requests entry of non-managed superpage mapping of size
pagesizes[psind] into the page table.

Pmap supports fake wiring of the largepage mappings.  Only attributes
of the largepage mapping can be changed by calling pmap_enter(9) over
existing mapping, physical address of the page must be unchanged.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-09 21:50:24 +00:00
Alexander V. Chernikov
cd6298d5c5 Retain marking net.fibs sysctl as a tunable.
Suggested by:	avg
2020-09-09 21:45:18 +00:00
Konstantin Belousov
7a9f2da33c Add vm_map_find_aligned(9).
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-09 21:44:59 +00:00
Konstantin Belousov
60cd9c95c5 Move MAP_32BIT_MAX_ADDR definition to sys/mman.h.
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-09 21:39:06 +00:00
Konstantin Belousov
fceae0fd3c Fix assert.
Noted by:	alc
MFC after:	1 week
2020-09-09 21:35:44 +00:00
Konstantin Belousov
e8f77c204b Prepare to handle non-trivial errors from vm_map_delete().
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-09 21:34:31 +00:00
Konstantin Belousov
4ebfc4edaf amd64 pmap: teach functions walking user page tables about PG_PS bit in PDPE.
Only unmanaged 1G superpages are handled.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-09 21:08:45 +00:00
Konstantin Belousov
6e64bebb6f amd64: report support for 1G superpages in getpagesizes(2).
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-09 21:01:36 +00:00
Konstantin Belousov
25f44824ba uipc_shm.c: Move comment where it belongs.
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-09 21:00:11 +00:00
Gleb Smirnoff
022c2f5570 In r354148 the goal was to check THREAD_CAN_SLEEP() only once for the
purpose of epoch_trace() and for calling subsequent panic, but to keep
code fully under INVARIANTS, so don't use bare function call to panic().
However, at the last stage of review a true value slipped in, while
always false was assumed.  I checked that in email archive with kib@.

Noticed by:	trasz
2020-09-09 16:13:33 +00:00
Randall Stewart
285385ba56 So it turns out that syzkaller hit another crash. It has to do with switching
stacks with a SENT_FIN outstanding. Both rack and bbr will only send a
FIN if all data is ack'd so this must be enforced. Also if the previous stack
sent the FIN we need to make sure in rack that when we manufacture the
"unknown" sends that we include the proper HAS_FIN bits.

Note for BBR we take a simpler approach and just refuse to switch.

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D26269
2020-09-09 11:11:50 +00:00
Konstantin Belousov
a720b31c2a Allow consumer to customize physical pager.
Add support for user-supplied callbacks into phys pager operations,
providing custom getpages(), haspage(), and populate() methods
implementations.  Pager stores user data ptr/val in the object to
provide context.

Add phys_pager_allocate() helper that takes user ops table as one of
the arguments.

Current code for these methods is moved to the 'default' ops table,
assigned automatically when vm_pager_alloc() is used.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-09 00:00:43 +00:00
Brandon Bergren
6957645145 [PowerPC64] Fix xive order calculation in qemu TCG
When emulating a single thread system for testing reasons, mp_maxid can
be 0. This trips up our math for calculating the order.

Account for this to fix xive attachment when emulating a single-thread
core on qemu powernv (a configuration that doesn't exist in the real world.)

Sponsored by:	Tag1 Consulting, Inc.
2020-09-08 23:48:49 +00:00
Konstantin Belousov
67a659d282 Add kern_mmap_racct_check(), a helper to verify limits in vm_mmap*().
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-08 23:48:19 +00:00
Konstantin Belousov
fbf2a77876 Convert allocations of the phys pager to vm_pager_allocate().
Future changes would require additional initialization of OBJT_PHYS
objects, and vm_object_allocate() is not suitable for it.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-08 23:38:49 +00:00
Konstantin Belousov
89d2fb14d5 Add interruptible variant of vm_wait(9), vm_wait_intr(9).
Also add msleep flags argument to vm_wait_doms(9).

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-08 23:28:09 +00:00
Brandon Bergren
95090cd024 [PowerPC64] Hide dssall instruction from llvm assembler
When doing a build for a modern CPUTYPE, llvm will throw errors if obsolete
instructions are used, even if they will never run due to runtime checks.

Hiding the dssall instruction from the assembler fixes kernel build when
overriding CPUTYPE, without having any effect on the generated binary.

This has been in my local tree for over a year and is well tested across
a variety of machines.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-08 22:59:43 +00:00
Brandon Bergren
d20ae9cca2 [PowerPC] Add root_pic assertion.
When enabling an interrupt, assert that we do in fact have a root PIC.

This would have saved me some debugging effort.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-08 22:42:41 +00:00
John Baldwin
895c98cc29 Don't return errors from the cryptodev_process() method.
The cryptodev_process() method should either return 0 if it has
completed a request, or ERESTART to defer the request until later.  If
a request encounters an error, the error should be reported via
crp_etype before completing the request via crypto_done().

Fix a few more drivers noticed by asomers@ similar to the fix in
r365389.  This is an old bug, but went unnoticed since crypto requests
did not start failing as a normal part of operation until digest
verification was introduced which can fail requests with EBADMSG.

PR:		247986
Reported by:	asomers
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D26361
2020-09-08 22:41:35 +00:00
Eugene Grosbein
cea05ed9a9 geom_part: extend kern.geom.part.check_integrity to work on GPT
There are multiple USB/SATA bridges on the market that unconditionally
cut some LBAs off connected media. This could be a problem
for pre-partitioned drives so GEOM complains and does not create
devices in /dev for slices/partitions preventing access to existing data.

We have kern.geom.part.check_integrity that allows us to correct
partitioning if changed from default 1 to 0 but it works for MBR only.
If backup copy of GPT is unavailable due to decreases number of LBAs,
kernel still does not give access to partitions and prints to dmesg:

GEOM: md0: corrupt or invalid GPT detected.
GEOM: md0: GPT rejected -- may not be recoverable.

This change makes it work for GPT too, so it created partitions in /dev
and prints to dmesg this instead:

GEOM: md0: the secondary GPT table is corrupt or invalid.
GEOM: md0: using the primary only -- recovery suggested.

Then "gpart recover" re-created backup copy of GPT
and allows further manipulations with partitions.

This change is no-op for default configuration having
kern.geom.part.check_integrity=1

Reported by:	Alex Korchmar
MFC after:	3 days.
2020-09-08 22:23:53 +00:00
Alexander V. Chernikov
4a8201c13a Fix panic with net.fibs tunable set in loader.conf.
Fix by removing forgotten CTLFLAG_RWTUN flag from the sysctl,
 loader variable will be read later in vnet_rtables_init().

Reported by:	mav
2020-09-08 21:39:34 +00:00
Matt Macy
692aa83d53 ZFS: remove some extra defines
When merging a number of defines that are needed in the standalone
build made it in to the module makefile.

Reported by:	markj@
2020-09-08 17:47:30 +00:00
Mateusz Guzik
54052edaa0 fd: fix fhold on an uninitialized var in fdcopy_remapped
Reported by:	gcc9
2020-09-08 16:07:47 +00:00
Mateusz Guzik
da62ed4f1a cache: drop write-only tvp_seqc vars 2020-09-08 16:06:46 +00:00
Mateusz Guzik
2bcfa5ba6f vfs: drop a write-only var in vfs_periodic_msync_inactive 2020-09-08 16:06:26 +00:00
Mitchell Horne
66245bc536 arm64: check for CRC32 support via HWCAP
Doing it this way eliminates the assumption about homogeneous support
for the feature, since HWCAP values are only set if support is present
on all CPUs.

Reviewed by:	tuexen, markj
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26032
2020-09-08 15:39:19 +00:00
Mitchell Horne
752eb6a995 arm64: export new HWCAP features
Expose some of the new HWCAP features added in r65304. This includes the
addition of elf_hwcap2 into the sysvec, and a separate function to parse
for those features.

This only exposes features which require no further configuration, e.g.
indicating the presence of certain instructions. Larger features (SVE)
will not be advertised until we actually support them. The exact list of
features/extensions this patch exposes is:
  - ARMv8.0-DGH
  - ARMv8.0-SB
  - ARMv8.2-BF16
  - ARMv8.2-DCCVADP
  - ARMv8.2-I8MM
  - ARMv8.4-LRCPC
  - ARMv8.5-CondM
  - ARMv8.5-FRINT
  - ARMv8.5-RNG
  - PSTATE.SSBS

While here, annotate elf_hwcap and elf_hwcap2 as __read_frequently, and
move the declarations to the machine/md_var.h header.

Submitted by:	mikael@ (D22314 portion)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26031
Differential Revision:	https://reviews.freebsd.org/D22314
2020-09-08 15:36:38 +00:00
Mitchell Horne
d81d009c6c arm64: fix incorrect HWCAP definitions
FreeBSD exports CPU features as bits in the AT_HWCAP and AT_HWCAP2
vectors via elf_aux_info(3). This interface is similar to getauxval(3)
on Linux, and for simplicity to consumers we try to maintain an
identical set of feature flags on arm64.

The first batch of AT_HWCAP flags were added in r350166, corresponding
to definitions that already existed in Linux. Unfortunately, one flag
was missed, and a portion of the values are shifted one bit to the right
as a result.

Add the missing definition for HWCAP_ASIMDHP, and adjust the affected
values to match their Linux counterparts.

Although this is an ABI-breaking change, there is no plan to provide
compat code for old binaries. An audit of our ports tree and other
software via Debian code search indicates that there are not yet any
consumers of this interface for FreeBSD/arm64.

Bump __FreeBSD_version to be on the safe side, in case compat code needs
to be added in the future.

Reviewed by:	emaste, manu
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26329
2020-09-08 15:08:20 +00:00
Kristof Provost
a969635b83 net: mitigate vnet / epair cleanup races
There's a race where dying vnets move their interfaces back to their original
vnet, and if_epair cleanup (where deleting one interface also deletes the other
end of the epair). This is commonly triggered by the pf tests, but also by
cleanup of vnet jails.

As we've not yet been able to fix the root cause of the issue work around the
panic by not dereferencing a NULL softc in epair_qflush() and by not
re-attaching DYING interfaces.

This isn't a full fix, but makes a very common panic far less likely.

PR:		244703, 238870
Reviewed by:	lutz_donnerhacke.de
MFC after:	4 days
Differential Revision:	https://reviews.freebsd.org/D26324
2020-09-08 14:54:10 +00:00
Mitchell Horne
bc42dacb0e RISC-V: enable MK_FORMAT_EXTENSIONS
This option was marked as broken because our riscv64-xtoolchain-gcc
package lacked support. Since we are moving away from xtoolchain gcc in
favor of freebsd-gcc9, there should be no issue in enabling this option
by default.

Notably, this enables -Wformat errors.

Reviewed by:	kp, jhb
Differential Revision:	https://reviews.freebsd.org/D26320
2020-09-08 13:24:44 +00:00
Mitchell Horne
7c7b8f577e RISC-V: fix some mismatched format specifiers
RISC-V is currently built with -Wno-format, which is how these went
undetected. Address them now before re-enabling those warnings.

Differential Revision:	https://reviews.freebsd.org/D26319
2020-09-08 13:21:13 +00:00
Andrew Turner
88e43c7ca4 Move gpio and hwpmc to the correct place in files.arm64
Sponsored by:	Innovate UK
2020-09-08 11:46:33 +00:00
Andrew Turner
1fc1a22868 Add a GPIO driver for the Arm pl061 controller
A PL061 is a simple 8 pin GPIO controller. This GPIO device is used to
signal an internal request for shutdown on some virtual machines including
Arm-based Amazon EC2 instances.

Submitted by:	Ali Saidi <alisaidi_amazon.com> (previouss version)
Reviewed by:	Ali Saidi, manu
Differential Revision:	https://reviews.freebsd.org/D24065
2020-09-08 11:35:35 +00:00
Andriy Gapon
15f4848ab7 mmc_da: universally use uint8_t for the partition index
Also, assert in sdda_init_switch_part() that the index is within the
defined range.

MFC after:	1 week
2020-09-08 06:19:23 +00:00