Commit Graph

130809 Commits

Author SHA1 Message Date
Kristof Provost
b02fd8b790 epair: Do not abuse params to register the second interface
if_epair used the 'params' argument to pass a pointer to the b interface
through if_clone_create().
This pointer can be controlled by userspace, which means it could be abused to
trigger a panic. While this requires PRIV_NET_IFCREATE
privileges those are assigned to vnet jails, which means that vnet jails
could panic the system.

Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after:	3 days
2020-01-28 22:44:24 +00:00
Mateusz Guzik
f0ddecd745 amd64: revamp memcmp
Borrow the trick from memset and memmove and use the scale/index/base addressing
to avoid branches.

If a mismatch is found, the routine has to calculate the difference. Make sure
there is always up to 8 bytes to inspect. This replaces the previous loop which
would operate over up to 16 bytes with an unrolled list of 8 tests.

Speed varies a lot, but this is a net win over the previous routine with probably
a lot more to gain.

Validated with glibc test suite.
2020-01-28 17:48:17 +00:00
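The memcmp commit above describes avoiding branches by addressing loads relative to the end of the buffer, and locating the mismatch inside a window of at most 8 bytes. A rough sketch of that idea in Python follows; it is only an illustration of one reading of the message, not the amd64 assembly, and the function names are hypothetical.

```python
def first_diff_in_word(a: bytes, b: bytes) -> int:
    """Return a[i] - b[i] at the first mismatching byte of two 8-byte words, or 0."""
    x = int.from_bytes(a, "little") ^ int.from_bytes(b, "little")
    if x == 0:
        return 0
    # With a little-endian load, the lowest set bit of the XOR sits in the
    # first differing byte, so no per-byte loop is needed to find it.
    i = ((x & -x).bit_length() - 1) // 8
    return a[i] - b[i]


def memcmp_8_to_16(a: bytes, b: bytes, n: int) -> int:
    """Compare n bytes (8 <= n <= 16) with two possibly overlapping 8-byte loads."""
    assert 8 <= n <= 16
    d = first_diff_in_word(a[0:8], b[0:8])
    if d:
        return d
    # The tail load is addressed from the end (base + n - 8), so the same code
    # handles every size in the range without branching on n.
    return first_diff_in_word(a[n - 8:n], b[n - 8:n])
```

Because the head already compared equal, any overlap between the two loads is known to match, so the first differing byte of the tail load is also the first differing byte overall.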
Edward Tomasz Napierala
c2d4745705 Add TCP_CORK support to linux(4). This fixes one of the things Nginx
trips over.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23171
2020-01-28 13:57:24 +00:00
Edward Tomasz Napierala
da6d8ae6d8 Add compat.linux.ignore_ip_recverr sysctl. This is a workaround
for missing IP_RECVERR setsockopt(2) support. Without it, DNS
resolution is broken for glibc >= 2.30 (glibc BZ #24047).

From the user point of view this fixes "yum update" on recent
CentOS 8.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23234
2020-01-28 13:51:53 +00:00
Konstantin Belousov
5fc9e11c42 Save lower root vnode in nullfs mnt data instead of upper.
Nullfs needs to know the root vnode of the lower fs during the
operation.  Currently it caches the upper vnode of it, which is also
the root of the nullfs mount.  On unmount, nullfs calls vflush() with
rootrefs == 1, and aborts non-forced unmount if there are any more
vnodes instantiated during vflush().  This means that the reference to
the root vnode after failed non-forced unmount could be lost and
nullm_rootvp points to the freed memory.

Fix it by storing the reference for lower vnode instead, which is kept
intact during vflush().  nullfs_root() now instantiates the upper
vnode of lower root.  Care about VV_ROOT flag in null_nodeget().

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-01-28 11:29:06 +00:00
Konstantin Belousov
2a3529df1d Provide support for fdevname(3) on linuxkpi-backed devices.
Reported and tested by:	manu
Reviewed by:	hselasky, manu
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D23386
2020-01-28 11:22:20 +00:00
Michael Tuexen
dc13edbc7d Fix build issues for the userland stack on 32-bit platforms.
Reported by:		Felix Weinrank
MFC after:		1 week
2020-01-28 10:09:05 +00:00
Conrad Meyer
cc3b01385b amdtemp(4): Remove dead code that snuck in with r357190
I intended to remove this before committing, but neglected to.
2020-01-28 03:27:06 +00:00
Conrad Meyer
c59b9a4f8d amdtemp(4): Add support for Family 17h CCD sensors
Probe Family 17h CPUs for up to 4 (Zen, Zen+) or 8 (Zen2) CCD temperature
sensors.  These were discovered by Ondrej Čerman
(https://github.com/ocerman) and collaborators experimentally, and are not
currently documented in any datasheet I have access to.
2020-01-28 01:39:50 +00:00
Conrad Meyer
02f7000293 amdtemp(4): Refactor shared temperature calculation logic
No functional change intended.
2020-01-28 01:38:51 +00:00
Conrad Meyer
d9591f0c2a x86: identcpu: Decode new Intel Structured Extended feature bits
2020-01-28 01:37:20 +00:00
Conrad Meyer
4799e1997a x86: identcpu: Decode new Zen2 AMD Feature2 bit
2020-01-28 01:36:45 +00:00
Warner Losh
42ec4f05a3 Make mqueue objects work across a fork again.
In r110908 (2003) alfred added DFLAG_PASSABLE to tag those types of FD
that can be passed via unix pipes, but mqueuefs didn't exist
yet. Later, in r152825 (2005) davidxu neglected to include
DFLAG_PASSABLE since people don't normally pass these things via unix
sockets (it's a FreeBSD implementation detail that it's a file
descriptor, nobody noticed). Then r223866 (2011) by jonathan used the
new flag in fdcopy, which fork uses. Due to that, mqueuefs actually
broke mqueue objects being propagated by fork. No mention of mqueuefs
was made in r223866, so I think it was an unintended consequence.

Fix this by tagging mqueuefs as passable as well. They were prior to
alfred's change (and it's clear there's no intent in his change to
change this behavior), and POSIX requires this to be the case as well.

PR: 243103
Reviewed by: kib@, jiles@
Differential Revision: https://reviews.freebsd.org/D23038
2020-01-27 22:36:54 +00:00
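The mqueue commit above can be pictured as a filter applied when the descriptor table is copied at fork time. The sketch below is a simplified model, assuming fdcopy keeps only descriptor types tagged DFLAG_PASSABLE as the message implies; the FileOps class and the numeric flag value are illustrative, not kernel definitions.

```python
from dataclasses import dataclass

DFLAG_PASSABLE = 0x08  # flag name from the commit; numeric value is illustrative


@dataclass(frozen=True)
class FileOps:
    """Hypothetical stand-in for a per-type file operations table."""
    name: str
    flags: int


def fdcopy(fdtable):
    """Copy only descriptors whose type is tagged passable (assumed behaviour)."""
    return {fd: ops for fd, ops in fdtable.items() if ops.flags & DFLAG_PASSABLE}


socket_ops = FileOps("socket", DFLAG_PASSABLE)
mqueue_before = FileOps("mqueue", 0)               # before the fix: not tagged
mqueue_after = FileOps("mqueue", DFLAG_PASSABLE)   # after the fix: tagged

child_before = fdcopy({3: socket_ops, 4: mqueue_before})
child_after = fdcopy({3: socket_ops, 4: mqueue_after})
```

In this model the child loses descriptor 4 before the fix and keeps it afterwards, which matches the regression the message describes.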
Warner Losh
160799c691 No need to have an extra layer of indirection here. Call the sdhci_cam_request
routine directly when handling a MMIO request.
2020-01-27 22:20:02 +00:00
Warner Losh
8c7cd14adf Create a convenience wrapper to fill in a CAM_PATH_INQ request for MMC sims. Pass
in the parameters needed for the different sims, but it's almost all identical.
2020-01-27 22:19:55 +00:00
Doug Moore
f886c4ba71 Correct the use of RB_AUGMENT in the RB_TREE macros so that it is invoked
at the root of every subtree that changes in an insert or delete, and
only once, and ordered from the bottom of the tree to the top.  For
intel_gas.c, the only user of RB_AUGMENT I can find, change the
augmenting routine so that it does not climb from entry to tree root
on every call, and remove a 'tree correcting' function that can be
supplanted by proper tree augmentation.

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D23189
2020-01-27 15:09:13 +00:00
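The ordering described in the RB_AUGMENT commit above (augment each changed subtree root once, bottom to top) can be illustrated on a plain binary search tree. This sketch deliberately omits red-black rotations and uses subtree size as the augmented value; both choices are assumptions made to keep the example short, and none of the names come from the RB_TREE macros.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.left = None
        self.right = None
        self.parent = parent
        self.size = 1  # augmented data: number of nodes in this subtree


def augment(node, calls):
    """Recompute one node's value from its children only; it never climbs."""
    calls.append(node.key)
    left = node.left.size if node.left else 0
    right = node.right.size if node.right else 0
    node.size = 1 + left + right


def insert(root, key, calls):
    """Insert key, then augment each ancestor exactly once, bottom-up."""
    if root is None:
        node = Node(key)
        augment(node, calls)
        return node
    cur = root
    while True:
        side = "left" if key < cur.key else "right"
        child = getattr(cur, side)
        if child is None:
            node = Node(key, cur)
            setattr(cur, side, node)
            break
        cur = child
    while node is not None:  # the tree walks upward, not the augment routine
        augment(node, calls)
        node = node.parent
    return root
```

Keeping the upward walk in the tree code rather than in the augment routine is what lets each call stay constant-time.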
Konstantin Belousov
fd99699d7e Fix aggregating geoms for BIO_SPEEDUP.
If the bio was split into several bios going down, completion computes
bio_completed of the original bio as sum of the bio_completed of the
splits.  For BIO_SPEEDUP, bio_length means something different than the
length: it is the requested speedup amount, and is duplicated into the
splits, which is in fact reasonable, since we cannot know how the
previous activity was distributed among subordinate geoms.  Obviously,
the sum of n bio_length is greater than bio_length for n > 1, which
triggers assert that bio_length >= bio_completed for e.g. geom_stripe
and geom_raid3.

Fix this by reassigning bio_completed from bio_length for completed
BIO_SPEEDUP. I do not think it really matters what we return in
bio_completed.

Reported and tested by:	pho
Reviewed by:	imp
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D23380
2020-01-27 13:15:16 +00:00
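The arithmetic behind the BIO_SPEEDUP commit above is easy to reproduce with a toy model. The Bio class and complete_parent function below are hypothetical simplifications of the GEOM completion path, written only to show why the sum of the splits overshoots and how the reassignment restores the invariant.

```python
from dataclasses import dataclass

BIO_READ = "BIO_READ"
BIO_SPEEDUP = "BIO_SPEEDUP"


@dataclass
class Bio:
    cmd: str
    length: int
    completed: int = 0


def complete_parent(parent, splits, apply_fix=True):
    """Aggregate the splits' completion counts into the original bio."""
    parent.completed = sum(s.completed for s in splits)
    if apply_fix and parent.cmd == BIO_SPEEDUP:
        # For BIO_SPEEDUP the length is a requested amount copied into every
        # split, so the sum is meaningless; report the requested amount.
        parent.completed = parent.length
    return parent


# Three stripes each receive the full 4096-byte speedup request.
splits = [Bio(BIO_SPEEDUP, 4096, 4096) for _ in range(3)]
broken = complete_parent(Bio(BIO_SPEEDUP, 4096), splits, apply_fix=False)
fixed = complete_parent(Bio(BIO_SPEEDUP, 4096), splits)
```

Without the reassignment the parent reports 12288 completed bytes against a length of 4096, which is the condition the assert in geom_stripe and geom_raid3 rejects.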
Alex Richardson
162ae9c834 Allow bootstrapping makefs on older FreeBSD hosts and Linux/macOS
In order to do so we need to install the msdosfs headers to the bootstrap
sysroot and avoid includes of kernel headers that may not exist on every
host (e.g. sys/lockmgr.h). This change should allow bootstrapping of makefs
on FreeBSD 11+ as well as Linux and macOS.

We also have to avoid using the IO_SYNC macro since that may not be
available. In makefs it is only used to switch between calling
bwrite() and bdwrite() which both call the same function. Therefore we
can simply always call bwrite().

For our CheriBSD builds we always bootstrap makefs by setting
LOCAL_XTOOL_DIRS='lib/libnetbsd usr.sbin/makefs' and use the makefs binary
from the build tree to create a bootable disk image.

Reviewed By:	brooks
Differential Revision: https://reviews.freebsd.org/D23201
2020-01-27 12:02:41 +00:00
Conrad Meyer
9ea85092d9 hwpstate(4): Log a debug line when throttled
If we're going to throttle user requested P-states, we should at least produce
a debug log line indicating the surprising behavior.

PR:		inspired by 234733
2020-01-27 06:04:32 +00:00
Alexander Kabaev
8227d65b72 Enter net epoch in msk_tick.
One more instance of if_input being called outside of
interrupt, by means of msk_handle_events.

Differential Revision:  https://reviews.freebsd.org/D23379
2020-01-27 00:14:51 +00:00
Vincenzo Maffione
de27b30340 netmap_mem_unmap: fix NULL pointer dereference
MFC after:	3 days
2020-01-26 21:34:46 +00:00
Rick Macklem
60a09a94cf Fix a crash in the NFSv4 server.
The PR reported a crash that occurred when a file was removed while
client(s) were actively doing lock operations on it.
Since nfsvno_getvp() will return NULL when the file does not exist,
the bug was obvious and easy to fix via this patch. It is a little
surprising that this wasn't found sooner, but I guess the above
case rarely occurs.

Tested by:	iron.udjin@gmail.com
PR:		242768
Reported by:	iron.udjin@gmail.com
MFC after:	2 weeks
2020-01-26 17:59:05 +00:00
John Baldwin
425e5f9dcf Revert accidental change from r357146.
2020-01-26 14:23:27 +00:00
John Baldwin
c73222d0e6 Fix some misleading indentation warnings reported by recent clang.
These should not be any functional change.  While the change in
emul10kx-pcm.c looks like a real bug fix (as opposed to inconsistent
whitespace), the extra statements were not harmful.

Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23363
2020-01-26 14:20:57 +00:00
John Baldwin
1207cda961 Compile hack.c with normal CFLAGS + -shared -nostdlib.
Originally, hack.c was compiled into a shared object with just -shared
-nostdlib.  This assumed that ${CC} did not require any additional
flags for ABIs, cross-building, etc.

When kern.post.mk was created in r89509 by reducing duplication in
kernel Makefile.<arch> files, the -shared flag was moved into a
HACK_EXTRA_FLAGS variable so that sparc64 could override it with
-Wl,-shared.  The sparc64 hack was removed in r111650, but
HACK_EXTRA_FLAGS was left in place.  Over time, we have started
support toolchains that require flags to support alternate ABIs on
MIPS and PowerPC and started (ab)using HACK_EXTRA_FLAGS to set only
those flags.

I need to fix risc-v to pass -mno-relax to the hack.c build for lld in
llvm 10, and the patches to support cross-build from non-FreeBSD hosts
need to include -target for clang in CFLAGS for hack.c.  Rather than
adding more hacks into HACK_EXTRA_FLAGS, just use the full set of
CFLAGS with hack.c.

Reviewed by:	kib, arichardson
MFC after:	1 month
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23362
2020-01-26 14:19:08 +00:00
Alexander V. Chernikov
75831a1c95 Fix NOINET6 build after r357038.
Reported by:	AN <andy at neu.net>
2020-01-26 11:54:21 +00:00
Mateusz Guzik
1513f80391 vfs: do an unlocked check before iterating the lazy list
For most filesystems it is expected to be empty most of the time.
2020-01-26 07:06:18 +00:00
Mateusz Guzik
cd0e46c66b vfs: remove vop loop from vop_sigdefer
All ops are guaranteed to be present since r357131.
2020-01-26 07:05:06 +00:00
Mateusz Guzik
8a6f5fd50c vfs: stop null checking routines in vop wrappers
Calls to vop_bypass pass the same argument, but type casted to something else.
Thus by replacing NULL routines with vop_bypass we avoid a runtime check.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D23357
2020-01-26 00:41:38 +00:00
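The vop wrapper commit above trades a per-call NULL check for a one-time fill of the operations table. A small Python model of that trade is shown below; the VopVector class and the three operation names are hypothetical, and vop_bypass here is only a stand-in for the kernel routine of the same name.

```python
def vop_bypass(args):
    """Generic fallback; stands in for the kernel's bypass routine."""
    return ("bypass", args)


class VopVector:
    OPS = ("vop_open", "vop_read", "vop_close")  # hypothetical subset of ops

    def __init__(self, **ops):
        # Fill every missing slot once, when the vector is registered.
        self.table = {name: ops.get(name) or vop_bypass for name in self.OPS}

    def call(self, name, args):
        # No "is this routine NULL?" branch on the hot path.
        return self.table[name](args)


vec = VopVector(vop_read=lambda args: ("read", args))
```

Here `vec.call("vop_read", 1)` reaches the supplied routine and `vec.call("vop_open", 2)` falls through to the bypass, with no conditional in `call` itself.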
Mateusz Guzik
6d69e665dd vfs: fix freevnodes count update race against preemption
vdbatch_process leaves the critical section too early, opening a time
window where another thread can get scheduled and modify vd->freevnodes.
Once the preempted thread gets back it overrides the value with 0.

Just move critical_exit to the end of the function.
2020-01-26 00:40:27 +00:00
Mateusz Guzik
6c44a3e019 ufs: add vgone calls for unconstructed vnodes in the error path
This mostly eliminates the requirement that vput never unlocks the vnode
before calling VOP_INACTIVE. Note it may still be present for other
filesystems.

See r356126 for an example bug.

Note vput stopped doing early unlock in r357070 thus this change does
not affect correctness as it is.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D23215
2020-01-26 00:38:06 +00:00
Mateusz Guzik
dc9a1cb60b vfs: predict vn_lock failure as unlikely in vget
2020-01-26 00:34:57 +00:00
Michael Tuexen
9cc711c9ff According to RFC 3168, sending CWR after an RTO is generally required,
and not only for the DCTCP congestion control.

Submitted by:		Richard Scheffenegger
Reviewed by:		rgrimes, tuexen@, Cheng Cui
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D23119
2020-01-25 13:45:10 +00:00
Michael Tuexen
47e2c17c12 Don't set the ECT codepoint on retransmitted packets during SACK loss
recovery. This is required by RFC 3168.

Submitted by:		Richard Scheffenegger
Reviewed by:		rgrimes@, tuexen@, Cheng Cui
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D23118
2020-01-25 13:34:29 +00:00
Michael Tuexen
a2d59694be As a TCP client only enable ECN when the corresponding sysctl variable
indicates that ECN should be negotiated for the client side.

Submitted by:		Richard Scheffenegger
Reviewed by:		rgrimes@, tuexen@
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D23228
2020-01-25 13:11:14 +00:00
Jason A. Harmening
a9aa06f7b1 Implement cycle-detecting garbage collector for AF_UNIX sockets
The existing AF_UNIX socket garbage collector destroys any socket
which may potentially be in a cycle, as indicated by its file reference
count being equal to its enqueue count. However, this can produce false
positives for in-flight sockets which aren't part of a cycle but are
part of one or more SCM_RIGHTS messages and which have been closed
on the sending side. If the garbage collector happens to run at
exactly the wrong time, destruction of these sockets will render them
unusable on the receiving side, such that no previously-written data
may be read.

This change rewrites the garbage collector to precisely detect cycles:

1. The existing check of msgcount==f_count is still used to determine
   whether the socket is potentially in a cycle.
2. The socket is now placed on a local "dead list", which is used to
   reduce iteration time (and therefore contention on the global
   unp_link_rwlock).
3. The first pass through the dead list removes each potentially-dead
   socket's outgoing references from the graph of potentially-dead
   sockets, using a gc-specific copy of the original reference count.
4. The second series of passes through the dead list removes from the
   list any socket whose remaining gc refcount is non-zero, as this
   indicates the socket is actually accessible outside of any possible
   cycle.  Iteration is repeated until no further sockets are removed
   from the dead list.
5. Sockets remaining in the dead list are destroyed as before.

PR:		227285
Submitted by:	jan.kokemueller@gmail.com (prior version)
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D23142
2020-01-25 08:57:26 +00:00
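The five steps listed in the AF_UNIX garbage collector commit above can be sketched as a small graph algorithm. This is a simplified model under stated assumptions: the Sock class and its holds list are hypothetical, and the step that gives references back when a socket leaves the dead list is this sketch's interpretation of why the passes must repeat, not something the message spells out.

```python
class Sock:
    def __init__(self, name, f_count, msgcount):
        self.name = name
        self.f_count = f_count    # total file references
        self.msgcount = msgcount  # references held by in-flight SCM_RIGHTS messages
        self.holds = []           # sockets whose references sit in our receive buffer


def unp_gc_sketch(sockets):
    """Return the names of sockets that are only reachable through a cycle."""
    # Steps 1-2: collect potentially dead sockets on a local dead list.
    dead = [s for s in sockets if s.f_count == s.msgcount]
    gc_refs = {s: s.f_count for s in dead}  # gc-specific copy of the refcount

    # Step 3: drop references that come from other potentially dead sockets.
    for s in dead:
        for held in s.holds:
            if held in gc_refs:
                gc_refs[held] -= 1

    # Step 4: anything still referenced from outside is alive. Assumption:
    # a socket found alive restores the references it holds, which can in
    # turn revive others, so the pass repeats until nothing changes.
    changed = True
    while changed:
        changed = False
        for s in list(dead):
            if gc_refs[s] > 0:
                dead.remove(s)
                changed = True
                for held in s.holds:
                    if held in dead:
                        gc_refs[held] += 1

    # Step 5: whatever remains would be destroyed.
    return [s.name for s in dead]
```

For example, two closed sockets that hold each other's descriptors stay on the dead list, while a closed socket sitting in the buffer of an open socket is removed from it, along with anything it holds in turn.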
Gleb Smirnoff
d35738c38d Enter the network epoch in RX processing taskqueue.
2020-01-25 00:06:18 +00:00
Michael Tuexen
ee97681e5c Don't delay the ACK for a TCP segment with the CWR flag set.
This allows the data sender to increase the CWND faster.

Submitted by:		Richard Scheffenegger
Reviewed by:		rgrimes@, tuexen@, Cheng Cui
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D22670
2020-01-24 22:50:23 +00:00
Michael Tuexen
8f63a52bdb The server side of TCP fast open relies on the delayed ACK timer to allow
including user data in the SYN-ACK. When DSACK support was added in
r347382, an immediate ACK was sent even for the received SYN with
user data. This patch fixes that and again allows sending user data with
the SYN-ACK.

Reported by:		Jeremy Harris
Reviewed by:		Richard Scheffenegger, rrs@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D23212
2020-01-24 22:37:53 +00:00
Gleb Smirnoff
e1d2b46953 Enter the network epoch when rack_output() is called in setsockopt(2).
2020-01-24 21:56:10 +00:00
Gleb Smirnoff
17c328b6ae Enter the network epoch in USB WiFi drivers when processing input
mbuf queues.

Submitted by:	Idwer Vollering <vidwer gmail.com>
2020-01-24 21:04:33 +00:00
Alexander V. Chernikov
75b893375f Add support for RFC 6598/Carrier Grade NAT subnets to libalias and ipfw.
In libalias, a new flag PKT_ALIAS_UNREGISTERED_RFC6598 is added.
 This is like PKT_ALIAS_UNREGISTERED_ONLY, but also is RFC 6598 aware.
Also, we add a new NAT option to ipfw called unreg_cgn, which is like
 unreg_only, but also is RFC 6598-aware.  The reason for the new
 flags/options is to avoid breaking existing networks, especially those
 which rely on RFC 6598 as an external address.

Submitted by:	Neel Chauhan <neel AT neelc DOT org>
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D22877
2020-01-24 20:35:41 +00:00
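The difference between the two modes in the RFC 6598 commit above comes down to one extra prefix check. The sketch below assumes the existing "unregistered only" mode matches the RFC 1918 private ranges and that the new mode additionally matches 100.64.0.0/10; the function name and its cgn_aware parameter are hypothetical and do not correspond to the libalias API.

```python
import ipaddress

RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
RFC6598 = ipaddress.ip_network("100.64.0.0/10")  # Carrier Grade NAT shared space


def is_unregistered(addr: str, cgn_aware: bool) -> bool:
    """Would this source address be aliased in 'unregistered' mode?"""
    ip = ipaddress.ip_address(addr)
    if any(ip in net for net in RFC1918):
        return True
    # Only the RFC 6598-aware mode treats CGN space as unregistered, so
    # networks that use it as an external address keep the old behaviour.
    return cgn_aware and ip in RFC6598
```

With `cgn_aware=False` an address such as 100.64.1.1 is left alone, which is the compatibility concern the message gives for adding a separate flag instead of changing the existing one.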
Konstantin Belousov
cd0047f3a9 Handle a race of collapse with a retrying fault.
Both vm_object_scan_all_shadowed() and vm_object_collapse_scan() might
observe an invalid page left in the default backing object by the
fault handler that retried.  Check for the condition and refuse to collapse.

Reported and tested by:	pho
Reviewed by:	jeff
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D23331
2020-01-24 19:42:53 +00:00
Gleb Smirnoff
2f0e17b7de re(4) uses taskqueue to process input packets. Enter network epoch
in there.
2020-01-24 17:24:02 +00:00
Gleb Smirnoff
5ab0c8434a ath(4) processes input packets in taskqueue. Enter network epoch
before calling ieee80211_input_mimo().
2020-01-24 17:11:54 +00:00
Ruslan Bukin
7106b618d2 Include the PCI stack in the riscv GENERIC kernel.
It will be used by an upcoming PCI root complex driver.

Sponsored by:	DARPA, AFRL
2020-01-24 17:10:21 +00:00
Ruslan Bukin
79a6ce8b41 Enable NEW_PCIB on riscv.
Sponsored by:	DARPA, AFRL
2020-01-24 16:50:51 +00:00
Ruslan Bukin
c344a95134 o Move the software context struct to a header file.
o Make the pci_host_generic_acpi_attach() globally visible.
o Declare a new driver class.

These will be used by a new PCI root complex driver.

Sponsored by:	DARPA, AFRL
2020-01-24 16:43:49 +00:00
Ruslan Bukin
9a82a56bee Move the ECAM macros to the header file.
These will be used by other PCI root complex drivers.

Sponsored by:	DARPA, AFRL
2020-01-24 16:08:06 +00:00
Mark Johnston
a89c2c8c34 Revert r357050.
It seems to have introduced a couple of regressions.

Reported by:	cy, pho
2020-01-24 14:58:02 +00:00