Commit Graph

134030 Commits

Author SHA1 Message Date
Navdeep Parhar
88c9c3f4dd cxgbe(4): Update T4/5/6 firmwares to 1.25.0.0.
Obtained from:	Chelsio Communications
MFC after:	3 days
Sponsored by:	Chelsio Communications
2020-09-17 22:14:11 +00:00
Warner Losh
fd0a41d241 Move to a more robust and conservative alloation scheme for devctl messages
Change the zone setup:
- Allow slabs to be returned to the OS
- Set the number of slots to the max devctl will queue before discarding
- Reserve 2% of the max (capped at 100) for low memory allocations
- Disable per-cpu caching since we don't need it and we avoid some pathologies

Change the alloation strategiy a bit:
- If a normal allocation fails, try to get the reserve
- If a reserve allocation fails, re-use the oldest-queued entry for storage
- If there's a weird race/failure and nothing on the queue to steal, return NULL

This addresses two main issues in the old code:
- If devd had died, and we're generating a lot of messages, we have an
  unbounded leak. This new scheme avoids the issue that lead to this.
- The MPASS that was 'sure' the allocation couldn't have failed turned out
  to be wrong in some rare cases. The new code doesn't make this assumption.

Since we reserve only 2% of the space, we go from about 1MB of
allocation all the time to more like 50kB for the reserve.

Reviewed by: markj@
Differential Revision: https://reviews.freebsd.org/D26448
2020-09-17 17:29:33 +00:00
Mark Johnston
97458520cc Increase the default vm.max_user_wired value.
Since r347532 (merged to stable/12) we only count user-wired pages
towards the system limit.  However, we now also treat pages wired by
hypervisors (bhyve and virtualbox) as user-wired, so starting VMs with
large amounts of RAM tends to fail due to the low limit.

The purpose of the limit is to provide a seatbelt, not to impose some
policy on the use of wired memory.  Thus, increase the default limit to
allow reasonable VM configurations to work without tuning.

Reviewed by:	kib
Discussed with:	dougm
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26424
2020-09-17 16:49:28 +00:00
Mitchell Horne
003470c31a Add dtb/sifive module
This allows building the HiFive Unleashed device tree blob.

Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D26459
2020-09-17 14:58:30 +00:00
Edward Tomasz Napierala
106a784b35 Reduce code duplication by introducing linux_copyout_sockaddr()
helper function.  No functional changes.

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25804
2020-09-17 12:14:24 +00:00
Edward Tomasz Napierala
79e3da0602 Add support for SOUND_MIXER_WRITE_MONITOR ioctl. Fixes alsamixer(1)
on my x220.

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25806
2020-09-17 11:44:45 +00:00
Edward Tomasz Napierala
70890254b3 Get rid of sv_errtbl and SV_ABI_ERRNO().
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26388
2020-09-17 11:39:33 +00:00
Eugene Grosbein
b2b5d4c07d geom_part: make it possible recovering broken GPT after some LBAs cut off
This is followup to r365477.

If pre-formatted device has GPT and a partition covering
last available LBAs and the device is attached using
a bridge reducing amount of LBAs, then it could be not enough
forcing GEOM to use primary GPT. Also, we should make it possible
to recover GPT and this requires either deleting or resizing the partition.

This change enables "gpart delete" and "gpart resize" commands
on corrupted GPT with following "gpart recover".

It still does not allow modifying corrupted GPT without
preliminary setting sysctl kern.geom.part.check_integrity=0

For example:

# gpart show da0
=>        34  3906963389  da0  GPT  (1.8T) [CORRUPT]
          34      262144    1  ms-reserved  (128M)
      262178        2014       - free -  (1.0M)
      264192  3906764943    2  freebsd-swap  (1.8T)
# gpart resize -i 2 -s 3900000000 da0
# gpart recover da0

Reported by:	Alex Korchmar
MFC after:	3 days
2020-09-17 04:39:39 +00:00
Konstantin Belousov
dd90d96342 Put calls to check_pgrp_jobc() in fixjobc_kill() under INVARIANTS.
Reported by:	Michael Butler <imb@protected-networks.net>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-09-17 00:07:15 +00:00
Konstantin Belousov
182cfe6ff4 Add check_pgrp_jobc() calls into process exit path.
Both before and after job control adjustments.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26416
2020-09-16 21:49:19 +00:00
Konstantin Belousov
2f5f11f533 Fix fixjobc+orhpanage.
Orphans affect job control state, we must account for them when
changing pg_jobc.

Instead of p_pptr, use proc_realparent() to get parent relevant for
job control.

Use correct calculation of the parent for exiting process.  For jobc
purposes, we must use realparent, but if it is also exiting, we should
fall to reaper, then recursively find non-exiting reaper.

Reported by:	trasz
PR:	249257
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26416
2020-09-16 21:46:57 +00:00
Konstantin Belousov
928b85384a Assert that P_TREE_GRPEXITED is set only once.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26416
2020-09-16 21:40:32 +00:00
Konstantin Belousov
844219f471 proc_realparent: if p_oppid does not match pid of the current parent
for non-orphaned process, return reaper instead of init.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26416
2020-09-16 21:38:24 +00:00
Konstantin Belousov
82207cd246 Improve ddb 'show pgrpdump' command.
Use ddb pager.
Make lines more compact.
Eliminate unneeded casts.
Print more job-control related info when reporting process group.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26416
2020-09-16 21:34:18 +00:00
Konstantin Belousov
016b7c7e39 tmpfs: restore atime updates for reads from page cache.
Split TMPFS_NODE_ACCCESSED bit into dedicated byte that can be updated
atomically without locks or (locked) atomics.

tn_update_getattr() change also contains unrelated bug fix.

Reported by:	lwhsu
PR:	249362
Reviewed by:	markj (previous version)
Discussed with:	mjg
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26451
2020-09-16 21:28:18 +00:00
Konstantin Belousov
23f9071466 Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2020-09-16 21:24:34 +00:00
Mitchell Horne
ceff9b9d25 if_media: definitions for 40GE LM4 ethernet media type
Reviewed by:	erj
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26276
2020-09-16 14:45:16 +00:00
Mark Johnston
e12492164a Move PLTs to the beginning of amd64 kernel modules.
As with .text, the aim is to ensure that executable sections are
segregated from the rest, to avoid creation of writeable and executable
mappings.  Recent versions of LLVM emit a PLT in firmware modules.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26444
2020-09-16 13:51:47 +00:00
Warner Losh
9ea860660f Use standard bool type, instead of non-standard boolean_t 2020-09-16 06:02:30 +00:00
Rick Macklem
a5c55410b3 Fix a LOR between the NFS server and server side krpc.
Recent testing of the NFS-over-TLS code found a LOR between the mutex lock
used for sessions and the sleep lock used for server side krpc socket
structures.
The code in nfsrv_checksequence() would call SVC_RELEASE() with the mutex
held.  Normally this is ok, since all that happens is SVC_RELEASE()
decrements a reference count.  However, if the socket has just been shut
down, SVC_RELEASE() drops the reference count to 0 and acquires a sleep
lock during destruction of the server side krpc structure.

This patch fixes the problem by moving the SVC_RELEASE() call in
nfsrv_checksequence() down a few lines to below where the mutex is released.

MFC after:	1 week
2020-09-16 02:25:18 +00:00
Mark Johnston
b4e07e3da5 Fix locking in uipc_accept().
Reported by:	cy
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-09-15 23:03:56 +00:00
Konstantin Belousov
081e36e760 Add tmpfs page cache read support.
Or it could be explained as lockless (for vnode lock) reads.  Reads
are performed from the node tn_obj object.  Tmpfs regular vnode object
lifecycle is significantly different from the normal OBJT_VNODE: it is
alive as far as ref_count > 0.

Ensure liveness of the tmpfs VREG node and consequently v_object
inside VOP_READ_PGCACHE by referencing tmpfs node in tmpfs_open().
Provide custom tmpfs fo_close() method on file, to ensure that close
is paired with open.

Add tmpfs VOP_READ_PGCACHE that takes advantage of all tmpfs quirks.
It is quite cheap in code size sense to support page-ins for read for
tmpfs even if we do not own tmpfs vnode lock.  Also, we can handle
holes in tmpfs node without additional efforts, and do not have
limitation of the transfer size.

Reviewed by:	markj
Discussed with and benchmarked by:	mjg (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 22:19:16 +00:00
Konstantin Belousov
4601f5f5ee Microoptimize tmpfs node ref/unref by using atomics.
Avoid tmpfs mount and node locks when ref count is greater than zero,
which is the case until node is being destroyed by unlink or unmount.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 22:13:21 +00:00
Konstantin Belousov
3c484f325e Convert page cache read to VOP.
There are several negative side-effects of not calling into VOP layer
at all for page cache reads.  The biggest is the missed activation of
EVFILT_READ knotes.

Also, it allows filesystem to make more fine grained decision to
refuse read from page cache.

Keep VIRF_PGREAD flag around, it is still useful for nullfs, and for
asserts.

Reviewed by:	markj
Tested by:	pho
Discussed with:	mjg
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 22:06:36 +00:00
Konstantin Belousov
888636655d vfs_subr.c: export io_hold_cnt and vn_read_from_obj().
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 22:00:58 +00:00
Konstantin Belousov
96474d2a3f Do not copy vp into f_data for DTYPE_VNODE files.
The pointer to vnode is already stored into f_vnode, so f_data can be
reused.  Fix all found users of f_data for DTYPE_VNODE.

Provide finit_vnode() helper to initialize file of DTYPE_VNODE type.

Reviewed by:	markj (previous version)
Discussed with:	freqlabs (openzfs chunk)
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 21:55:21 +00:00
Eric Joyner
a3b9a7366e e1000: Properly retain promisc flag
From Franco:
The iflib rewrite forced the promisc flag but it was not reported
to the system.  Noticed on a stock VM that went into unsolicited
promisc mode when dhclient was started during bootup.

PR:		248869
Submitted by:	Franco Fichtner <franco@opnsense.org>
Reviewed by:	erj@
MFC after:	3 days
2020-09-15 21:07:30 +00:00
Ed Maste
09860d44e4 bhyve: do not permit write access to VMCB / VMCS
Reported by:	Patrick Mooney
Submitted by:	jhb
Security:	CVE-2020-24718
2020-09-15 21:04:27 +00:00
Eric Joyner
467515a494 igb(4): Fix define and includes with RSS option enabled
This re-adds the opt_rss.h header to the driver and includes some
RSS-specific headers when RSS is defined.

PR:		249191
Submitted by:	Milosz Kaniewski <milosz.kaniewski@gmail.com>
MFC after:	3 days
2020-09-15 21:00:25 +00:00
Brandon Bergren
9673f30503 [PowerPC64LE] Use correct in_masks table on LE to fix checksumming
Due to a check that should have been an endian check being an #if 0,
the wrong checksum mask table was being used on LE, which was causing
extreme strangeness in DNS resolution -- *some* hosts would be resolvable,
but most would not.

This fixes DNS resolution.

(I am committing some parts of the LE patchset ahead of time to reduce the
amount of work I have to do while committing the main patchset.)

Sponsored by:	Tag1 Consulting, Inc.
2020-09-15 20:47:33 +00:00
Brandon Bergren
1e936efbce [PowerPC64LE] Set up the powernv partition table correctly.
The partition table is always big endian.

Sponsored by:	Tag1 Consulting, Inc.
2020-09-15 20:25:38 +00:00
Konstantin Belousov
101d5b527a bhyve: intercept AMD SVM instructions.
Intercept and report #UD to VM on SVM/AMD in case VM tried to execute an
SVM instruction.  Otherwise, SVM allows execution of them, and instructions
operate on host physical addresses despite being executed in guest mode.

Reported by:	Maxime Villard <max@m00nbsd.net>
admbug:	972
CVE:	CVE-2020-7467
Reviewed by:	grehan, markj
Differential revision:	https://reviews.freebsd.org/D26313
2020-09-15 20:22:50 +00:00
Mark Johnston
448000279e Fix locking in uipc_accept().
This function wasn't converted to use the new locking protocol in
r333744.  Make it use the PCB lock for synchronizing connection state.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26300
2020-09-15 19:23:42 +00:00
Mark Johnston
ccdadf1a9b Simplify unix socket connection peer locking.
unp_pcb_owned_lock2() has some sharp edges and forces callers to deal
with a bunch of cases.  Simplify it:

- Rename to unp_pcb_lock_peer().
- Return the connected peer instead of forcing callers to load it
  beforehand.
- Handle self-connected sockets.
- In unp_connectat(), just lock the accept socket directly.  It should
  not be possible for the nascent socket to participate in any other
  lock orders.
- Get rid of connect_internal().  It does not provide any useful
  checking anymore.
- Block in unp_connectat() when a different thread is concurrently
  attempting to lock both sides of a connection.  This provides simpler
  semantics for callers of unp_pcb_lock_peer().
- Make unp_connectat() return EISCONN if the socket is already
  connected.  This fixes a race[1] when multiple threads attempt to
  connect() to different addresses using the same datagram socket.
  Upper layers will disconnect a connected datagram socket before
  calling the protocol connect's method, but there is no synchronization
  between this and protocol-layer code.

Reported by:	syzkaller [1]
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26299
2020-09-15 19:23:22 +00:00
Mark Johnston
ed92e1c78c Avoid an unnecessary malloc() when connecting dgram sockets.
The allocated memory is only required for SOCK_STREAM and SOCK_SEQPACKET
sockets.

Reviewed by:	kevans
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26298
2020-09-15 19:23:01 +00:00
Mark Johnston
f0317f868b Simplify unp_disconnect() callers.
In all cases, PCBs are unlocked after unp_disconnect() returns.  Since
unp_disconnect() may release the last PCB reference, callers may have to
bump the refcount before the call just so that they can release them
again.

Change unp_disconnect() to release PCB locks as well as connection
references; this lets us remove several refcount manipulations.  Tighten
assertions.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26297
2020-09-15 19:22:37 +00:00
Mark Johnston
4820bf6ac2 Rename unp_pcb_lock2().
unp_pcb_lock_pair() seems like a better name.  Also make it handle the
case where the two sockets are the same instead of making callers do it.
No functional change intended.

Reviewed by:	glebius, kevans, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26296
2020-09-15 19:22:16 +00:00
Mark Johnston
5362170da7 Improve unix socket PCB refcounting.
- Use refcount_init().
- Define an INVARIANTS-only zone destructor to assert that various
  bits of PCB state aren't left dangling.
- Annotate unp_pcb_rele() with __result_use_check.
- Simplify control flow.

Reviewed by:	glebius, kevans, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26295
2020-09-15 19:21:58 +00:00
Mark Johnston
d5cbccecd8 Update unix domain socket locking comments.
- Define a locking key for unpcb members.
- Rewrite some of the locking protocol description to make it less
  verbose and avoid referencing some subroutines which will be renamed.
- Reorder includes.

Reviewed by:	glebius, kevans, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26294
2020-09-15 19:21:33 +00:00
Edward Tomasz Napierala
c26391f4dd Move SV_ABI_ERRNO translation into linux-specific code, to simplify
the syscall path and declutter it a bit.  No functional changes intended.

Reviewed by:	kib (earlier version)
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26378
2020-09-15 16:41:21 +00:00
Warner Losh
2215adc5ff Include sys/types.h here
It's included by header pollution in most of the compile
environments. However, in the standalone envirnment, it's not
included. Go ahead and include it always since the overhead is low and
it is simpler that way.

MFC After: 3 days
2020-09-15 15:21:29 +00:00
Andrew Turner
d6aa5fe5eb Use ATTR_DEFAULT in the arm64 locore.S
We can use ATTR_DEFAULT directly in locore.S as it fits within an orr
instruction operand.

Sponsored by:	Innovate UK
2020-09-15 14:15:04 +00:00
Warner Losh
0c97af56a7 We don't need the sc_ekeys_lock in standalone environment.
When we bring in geli into the boot loader, we are single threaded so
we don't have to worry about locking. We have no mutexes, and don't need
to use them, so comment it out.

MFC After: 3 days
2020-09-14 23:51:14 +00:00
Warner Losh
44a8153672 Don't do the busy dance in icee_open/close
We don't need to do the busy dance for this driver. It's handled by
destroy_dev() entirely. Since all we did was busy/unbusy in
open/close, just delete them. We therefore don't need to track closes
either.

Reviewed by: ian@
Differential Revision: https://reviews.freebsd.org/D26431
2020-09-14 23:30:04 +00:00
Warner Losh
45472826b9 Tweak what's visible in the standalone environment. We define offsetof
in stand.h typically, but when this is included we can define it
multiple times. However, we don't define bool in stand.h at the
moment, so allow it to be defined inside types.h when we're building
for the standalone environment.

MFC After: 3 days
2020-09-14 23:27:51 +00:00
Navdeep Parhar
bb60ba7e22 cxgbe(4): Get the count of FCS errors from the MAC and not MPS for T6 ports.
The MPS register on the T6 counts something other than FCS errors despite its
name.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2020-09-14 22:15:54 +00:00
Ian Lepore
3c41bcdf29 Add product ID strings for a couple Microchip usb hubs. Also, update the
vendor ID string to say just "Microchip Technology" -- the buyout of
Standard Microsystems happened in 2012 and the SMC/SMSC names are pretty
much retired at this point.

PR:		241406
2020-09-14 17:33:28 +00:00
Andrew Turner
2a6803de1c Use MACHINE_CPUARCH when checking for arm64
Use MACHINE_CPUARCH with arm64 (aarch64) when we build code that could run
on any 64-bit Arm instruction set. This will simplify checks in downstream
consumers targeting prototype instruction sets.

The only place we check for MACHINE_ARCH == aarch64 is when building the
device tree blobs. As these are targeting current generation ISAs.

Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D26370
2020-09-14 16:12:28 +00:00
Brandon Bergren
115c987b3f [PowerPC] Make cpu frequency detection endian-independent
On ibm,extended-clock-frequency, ensure we be64toh() the value.

On clock-frequency, remove the right-shifting hack (which was needed due to
reading a 32 bit value into a 64 bit variable) and switch to OF_getencprop()
for reading (which will handle endian conversion internally.)

Reviewed by:	jhibbits (in irc)
Sponsored by:	Tag1 Consulting, Inc.
2020-09-14 15:20:37 +00:00
Gordon Tetlow
bfdd0a34ad Partially revert r346018 and use the if/then construct instead of shell.
There are a couple of places in the tree that directly parse the newvers.sh
script looking for the BRANCH variable. I found two locations, one in
release/Makefile and the other in bin/freebsd-version/Makefile.

While there is a good argument that BRANCH_OVERRIDE should properly
propagate in those circumstances and the new behavior is thus better, the
reality is this change broke freebsd-update's ability to find timestamps in
binaries and resulted in a large number of gratuitous changes.

Reported by:	freebsd-update
Discussed with:	cperciva
MFC after:	1 day
2020-09-14 14:45:30 +00:00