Commit Graph

137204 Commits

Author SHA1 Message Date
Lutz Donnerhacke
f6e0c47169 netgraph/ng_bridge: move MACs via control message
Use the new control message to move ethernet addresses from a link to
a new link in ng_bridge(4).  Send this message instead of doing the
work directly requires to move the loop detection into the control
message processing.  This will delay the loop detection by a few
frames.

This decouples the read-only activity from the modification under a
more strict writer lock.

Reviewed by:	manpages (gbe)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D28559
2021-05-13 17:27:01 +02:00
Mark Johnston
8b3c4231ab posix timers: Check for overflow when converting to ns
Disallow a time or timer period value when the conversion to nanoseconds
would overflow.  Otherwise it is possible to trigger a divison by zero
in realtime_expire_l(), where we compute the number of overruns by
dividing by the timer interval.

Fixes:	7995dae9 ("posix timers: Improve the overrun calculation")
Reported by:	syzbot+5ab360bd3d3e3c5a6e0e@syzkaller.appspotmail.com
Reported by:	syzbot+157b74ff493140d86eac@syzkaller.appspotmail.com
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30233
2021-05-13 08:34:03 -04:00
Mark Johnston
9246b3090c fork: Suspend other threads if both RFPROC and RFMEM are not set
Otherwise, a multithreaded parent process may trigger races in
vm_forkproc() if one thread calls rfork() with RFMEM set and another
calls rfork() without RFMEM.

Also simplify vm_forkproc() a bit, vmspace_unshare() already checks to
see if the address space is shared.

Reported by:	syzbot+0aa7c2bec74c4066c36f@syzkaller.appspotmail.com
Reported by:	syzbot+ea84cb06937afeae609d@syzkaller.appspotmail.com
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30220
2021-05-13 08:33:23 -04:00
Randall Stewart
02cffbc250 tcp: Incorrect KASSERT causes a panic in rack
Skyzall found an interesting panic in rack. When a SYN and FIN are
both sent together a KASSERT gets tripped where it is validating that
a mbuf pointer is in the sendmap. But a SYN and FIN often will not
have a mbuf pointer. So the fix is two fold a) make sure that the
SYN and FIN split the right way when cloning an RSM SYN on left
edge and FIN on right. And also make sure the KASSERT properly
accounts for the case that we have a SYN or FIN so we don't
panic.

Reviewed by: mtuexen
Sponsored by: Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D30241
2021-05-13 07:36:04 -04:00
Mateusz Guzik
cef8a95acb vfs: fix vnode use count leak in O_EMPTY_PATH support
The vnode returned by namei_setup is already referenced.

Reported by:	pho
2021-05-13 09:39:27 +00:00
Konstantin Belousov
6de3cf14c4 vn_open_cred(): disallow O_CREAT | O_EMPTY_PATH
This combination does not make sense, and cannot be satisfied by lookup.
In particular, lookup cannot supply dvp, it only can directly return vp.

Reported and reviewed by:	markj using syzkaller
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-05-13 02:32:04 +03:00
Michael Tuexen
eec6aed5b8 sctp: fix another locking bug in COOKIE handling
Thanks to Tolya Korniltsev for reporting the issue for
the userland stack and testing the fix.

MFC after:	3 days
2021-05-12 23:05:28 +02:00
Mark Johnston
d8acd2681b Fix mbuf leaks in various pru_send implementations
The various protocol implementations are not very consistent about
freeing mbufs in error paths.  In general, all protocols must free both
"m" and "control" upon an error, except if PRUS_NOTREADY is specified
(this is only implemented by TCP and unix(4) and requires further work
not handled in this diff), in which case "control" still must be freed.

This diff plugs various leaks in the pru_send implementations.

Reviewed by:	tuexen
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30151
2021-05-12 13:00:09 -04:00
Mark Johnston
c1dd4d642f nd6: Avoid using an uninitialized sockaddr in nd6_prefix_offlink()
Commit 81728a538 ("Split rtinit() into multiple functions.") removed
the initialization of sa6, but not one of its uses.  This meant that we
were passing an uninitialized sockaddr as the address to
lltable_prefix_free().  Remove the variable outright to fix the problem.
The caller is expected to hold a reference on pr.

Fixes:		81728a538 ("Split rtinit() into multiple functions.")
Reported by:	KMSAN
Reviewed by:	donner
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30166
2021-05-12 12:52:06 -04:00
Mark Johnston
ad22ba2b9f if: Remove unnecessary validation in the SIOCSIFNAME handler
A successful copyinstr() call guarantees that the returned string is
nul-terminated.  Furthermore, the removed check would harmlessly compare
an uninitialized byte with '\0' if the new name is shorter than
IFNAMESIZ - 1.

Reported by:	KMSAN
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-05-12 12:52:06 -04:00
Mark Johnston
06d1fd9f42 swap_pager: Zero swap info before exporting to userspace
Otherwise padding bytes are leaked.

Reported by:	KMSAN
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-05-12 12:52:05 -04:00
Michael Tuexen
251842c639 tcp rack: improve initialisation of retransmit timeout
When the TCP is in the front states, don't take the slop variable
into account. This improves consistency with the base stack.

Reviewed by:		rrs@
Differential Revision:	https://reviews.freebsd.org/D30230
MFC after:		1 week
Sponsored by:		Netflix, Inc.
2021-05-12 18:02:21 +02:00
Michael Tuexen
12dda000ed sctp: fix locking in case of error handling during a restart
Thanks to Taylor Brandstetter for finding the issue and providing
a patch for the userland stack.

MFC after:	3 days
2021-05-12 15:29:06 +02:00
John Baldwin
ed93deba11 Remove a write-only variable.
While refactoring an earlier series of changes during review, the
'saved_data' variable stopped being used at the bottom of if_ioctl().

Suggested by:	brooks
Reviewed by:	brooks, imp, kib
Fixes:		d17e0940f7 Rework compat shims in ifioctl().
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D30197
2021-05-11 14:56:23 -07:00
Mark Johnston
1a04f0156c cryptodev: Fix some input validation bugs
- When we do not have a separate IV, make sure that the IV length
  specified by the session is not larger than the payload size.
- Disallow AEAD requests without a separate IV.  crp_sanity() asserts
  that CRYPTO_F_IV_SEPARATE is set for AEAD requests, and some (but not
  all) drivers require it.
- Return EINVAL for AEAD requests if an IV is specified but the
  transform does not expect one.

Reported by:	syzbot+c9e8f6ff5cb7fa6a1250@syzkaller.appspotmail.com
Reported by:	syzbot+007341439ae295cee74f@syzkaller.appspotmail.com
Reported by:	syzbot+46e0cc42a428b3b0a40d@syzkaller.appspotmail.com
Reported by:	syzbot+2c4d670173b8bdb947df@syzkaller.appspotmail.com
Reported by:	syzbot+220faa5eeb4d47b23877@syzkaller.appspotmail.com
Reported by:	syzbot+e83434b40f05843722f7@syzkaller.appspotmail.com
Reviewed by:	jhb
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30154
2021-05-11 17:36:12 -04:00
Hans Petter Selasky
b8f113cab9 Implement cdev_device_add() and cdev_device_del() in the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-11 21:00:23 +02:00
Hans Petter Selasky
67807f5066 cdev_del() should only put it's kernel object in the LinuxKPI.
The destructor takes care of the rest.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-11 21:00:23 +02:00
Hans Petter Selasky
904390b478 Implement read-only VM_SHARED flag in the LinuxKPI.
For use by mmap(2) callbacks.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-11 21:00:14 +02:00
Roger Pau Monné
4772e86beb xen/blkback: fix reconnection of backend
The hotplug script will be executed only once for each backend,
regardless of the frontend triggering reconnections. Fix blkback to
deal with the hotplug script being executed only once, so that
reconnections don't stall waiting for a hotplug script execution
that will never happen.

As a result of the fix move the initialization of dev_mode, dev_type
and dev_name to the watch callback, as they should be set only once
the first time the backend connects.

This fix is specially relevant for guests wanting to use UEFI OVMF
firmware, because OVMF will use Xen PV block devices and disconnect
afterwards, thus allowing them to be used by the guest OS. Without
this change the guest OS will stall waiting for the block backed to
attach.

Fixes: de0bad0001 ('blkback: add support for hotplug scripts')
MFC after: 1 week
Sponsored by: Citrix Systems R&D
2021-05-11 15:43:42 +02:00
Randall Stewart
4b86a24a76 tcp: In rack, we must only convert restored rtt when the hostcache does restore them.
Rack now after the previous commit is very careful to translate any
value in the hostcache for srtt/rttvar into its proper format. However
there is a snafu here in that if tp->srtt is 0 is the only time that
the HC will actually restore the srtt. We need to then only convert
the srtt restored when it is actually restored. We do this by making
sure it was zero before the call to cc_conn_init and it is non-zero
afterwards.

Reviewed by:	Michael Tuexen
Sponsored by: Netflix Inc
Differential Revision:	https://reviews.freebsd.org/D30213
2021-05-11 08:15:05 -04:00
Wojciech Macek
0b103f7237 mrouter: do not loopback packets unconditionally
Looping back router multicast traffic signifficantly
stresses network stack. Add possibility to disable or enable
loopbacked based on sysctl value.

Reported by:    Daniel Deville
Reviewed by:	mw
Differential Revision:	https://reviews.freebsd.org/D29947
2021-05-11 12:36:07 +02:00
Wojciech Macek
65634ae748 mroute: fix race condition during mrouter shutting down
There is a race condition between V_ip_mrouter de-init
    and ip_mforward handling. It might happen that mrouted
    is cleaned up after V_ip_mrouter check and before
    processing packet in ip_mforward.
    Use epoch call aproach, similar to IPSec which also handles
    such case.

Reported by:    Damien Deville
Obtained from:	Stormshield
Reviewed by:	mw
Differential Revision:	https://reviews.freebsd.org/D29946
2021-05-11 12:34:20 +02:00
Mateusz Guzik
12288bd999 cache: fix lockless absolute symlink traversal to non-fp mounts
Said lookups would incorrectly fail with EOPNOTSUP.

Reported by:	kib
2021-05-11 04:30:12 +00:00
Justin Hibbits
a436e66531 powerpc/radix pmap: Convert stat counters from ulongs to counters
This should help performance a hair, for concurrent stat updates, by
reducing contention on cache lines.
2021-05-10 21:26:14 -05:00
Justin Hibbits
31c3770ee5 powerpc/mmu: Actually use the Radix pmap_align_superpage function
This was missed in the conversion to ifuncs.  It might help improve
promotion rates.
2021-05-10 21:26:14 -05:00
Rick Macklem
cb07628d9e nfscl: Delete unneeded redundant MODULE_DEPEND() calls
There are two module declarations in the nfscl.ko module for "nfscl"
and "nfs".  Both of these declarations had MODULE_DEPEND() calls.
This patch deletes the MODULE_DEPEND() calls for "nfs" to avoid
confusion with respect to what modules this module is dependent upon.

The patch also adds comments explaining why there are two module
declarations within the module.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D30102
2021-05-10 17:34:29 -07:00
Mark Johnston
c8bbb1272c vfs: Fix error handling in vn_fullpath_hardlink()
vn_fullpath_any_smr() will return a positive error number if the
caller-supplied buffer isn't big enough.  In this case the error must be
propagated up, otherwise we may copy out uninitialized bytes.

Reported by:	syzkaller+KMSAN
Reviewed by:	mjg, kib
MFC aftr:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30198
2021-05-10 20:22:27 -04:00
Konstantin Belousov
5e7cdf1817 openat(2): add O_EMPTY_PATH
It reopens the passed file descriptor, checking the file backing vnode'
current access rights against open mode. In particular, this flag allows
to convert file descriptor opened with O_PATH, into operable file
descriptor, assuming permissions allow that.

Reviewed by:	markj
Tested by:	Andrew Walker <awalker@ixsystems.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30148
2021-05-11 02:39:24 +03:00
Richard Scheffenegger
0471a8c734 tcp: SACK Lost Retransmission Detection (LRD)
Recover from excessive losses without reverting to a
retransmission timeout (RTO). Disabled by default, enable
with sysctl net.inet.tcp.do_lrd=1

Reviewed By: #transport, rrs, tuexen, #manpages
Sponsored by: Netapp, Inc.
Differential Revision: https://reviews.freebsd.org/D28931
2021-05-10 19:06:20 +02:00
Randall Stewart
9867224bab tcp:Host cache and rack ending up with incorrect values.
The hostcache up to now as been updated in the discard callback
but without checking if we are all done (the race where there are
more than one calls and the counter has not yet reached zero). This
means that when the race occurs, we end up calling the hc_upate
more than once. Also alternate stacks can keep there srtt/rttvar
in different formats (example rack keeps its values in microseconds).
Since we call the hc_update *before* the stack fini() then the
values will be in the wrong format.

Rack on the other hand, needs to convert items pulled from the
hostcache into its internal format else it may end up with
very much incorrect values from the hostcache. In the process
lets commonize the update mechanism for srtt/rttvar since we
now have more than one place that needs to call it.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc
Differential Revision:	https://reviews.freebsd.org/D30172
2021-05-10 11:25:51 -04:00
Kristof Provost
2ef5d803e3 in6_mcast: Return EADDRINUSE when we've already joined the group
Distinguish between truly invalid requests and those that fail because
we've already joined the group. Both cases fail, but differentiating
them allows userspace to make more informed decisions about what the
error means.

For example. radvd tries to join the all-routers group on every SIGHUP.
This fails, because it's already joined it, but this failure should be
ignored (rather than treated as a sign that the interface's multicast is
broken).

This puts us in line with OpenBSD, NetBSD and Linux.

Reviewed by:	donner
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30111
2021-05-10 09:48:51 +02:00
Juraj Lutter
c2c9ef3ced rpi_ft5406: Recognize raspberrypi,firmware-ts touchscreen
- Recognize raspberrypi,firmware-ts touchscreen
- Move the driver from ofwbus to simplebus

Reviewed by:	manu
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D30169
2021-05-09 12:13:19 +02:00
Ruslan Bukin
9146c6240d ofw: support for a single 'port' DTS property.
On rk3399 the VOP-little node has a single 'port' property (not a
collection of 'ports' or indexed ports).

Reviewed by:	manu
Sponsored by:	UKRI
Differential Revision:	https://reviews.freebsd.org/D30165
2021-05-08 15:41:57 +01:00
Fedor Uporov
2a984c2b49 Make encode/decode extra time functions inline.
Mentioned by:   pfg
MFC after:      2 weeks
2021-05-08 06:42:20 +03:00
Rick Macklem
dd02d9d605 nfscl: Add support for va_birthtime to NFSv4
There is a NFSv4 file attribute called TimeCreate
that can be used for va_birthtime.
r362175 added some support for use of TimeCreate.
This patch completes support of va_birthtime by adding
support for setting this attribute to the server.
It also eanbles the client to
acquire and set the attribute for a NFSv4
server that supports the attribute.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30156
2021-05-07 17:30:56 -07:00
Randall Stewart
5a4333a537 This takes Warners suggested approach to making it so that
platforms that for whatever reason cannot include the RATELIMIT option
can still work with rack. It adds two dummy functions that rack will
call and find out that the highest hw supported b/w is 0 (which
kinda makes sense and rack is already prepared to handle).

Reviewed by: Michael Tuexen, Warner Losh
Sponsored by: Netflix Inc
Differential Revision:	https://reviews.freebsd.org/D30163
2021-05-07 17:32:32 -04:00
Alexander V. Chernikov
aad59c79f5 Fix panic when trying to delete non-existent gateway in multipath route.
IF non-existend gateway was specified, the code responsible for calculating
 an updated nexthop group, returned the same already-used nexthop group.
After the route table update, the operation result contained the same
 old & new nexthop groups. Thus, the code responsible for decomposing
 the notification to the list of simple nexthop-level notifications,
 was not able to find any differences. As a result, it hasn't updated any
  of the "simple" notification fields, resulting in empty rtentry pointer.
This empty pointer was the direct reason of a panic.

Fix the problem by returning ESRCH when the new nexthop group is the same
 as the old one after applying gateway filter.

Reported by:	Michael <michael.adm at gmail.com>
PR:		255665
MFC after:	3 days
2021-05-07 20:41:31 +00:00
Kristof Provost
93abcf17e6 pf: Support killing 'matching' states
Optionally also kill states that match (i.e. are the NATed state or
opposite direction state entry for) the state we're killing.

See also https://redmine.pfsense.org/issues/8555

Submitted by:	Steven Brown
Reviewed by:	bcr (man page)
Obtained from:	https://github.com/pfsense/FreeBSD-src/pull/11/
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30092
2021-05-07 22:13:31 +02:00
Kristof Provost
abbcba9cf5 pf: Allow states to by killed per 'gateway'
This allows us to kill states created from a rule with route-to/reply-to
set.  This is particularly useful in multi-wan setups, where one of the
WAN links goes down.

Submitted by:	Steven Brown
Obtained from:	https://github.com/pfsense/FreeBSD-src/pull/11/
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30058
2021-05-07 22:13:31 +02:00
Kristof Provost
e989530a09 pf: Introduce DIOCKILLSTATESNV
Introduce an nvlist based alternative to DIOCKILLSTATES.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30054
2021-05-07 22:13:30 +02:00
Kristof Provost
7606a45dcc pf: Introduce DIOCCLRSTATESNV
Introduce an nvlist variant of DIOCCLRSTATES.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30052
2021-05-07 22:13:30 +02:00
Mark Johnston
a1fadf7de2 divert: Fix mbuf ownership confusion in div_output()
div_output_outbound() and div_output_inbound() relied on the caller to
free the mbuf if an error occurred.  However, this is contrary to the
semantics of their callees, ip_output(), ip6_output() and
netisr_queue_src(), which always consume the mbuf.  So, if one of these
functions returned an error, that would get propagated up to
div_output(), resulting in a double free.

Fix the problem by making div_output_outbound() and div_output_inbound()
responsible for freeing the mbuf in all cases.

Reported by:	Michael Schmiedgen <schmiedgen@gmx.net>
Tested by:	Michael Schmiedgen
Reviewed by:	donner
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30129
2021-05-07 14:31:08 -04:00
Mark Johnston
831850d8b0 stack(9): Disable KASAN in stack_capture()
When unwinding the stack, we may encounter a stack frame in a poisoned
region of the stack, triggering a false positive.

Reviewed by:	andrew, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30126
2021-05-07 14:31:08 -04:00
Mark Johnston
cfad8bd24f cdefs: Make __nosanitizeaddress work for KASAN as well
Add __nosanitizememory while I'm here.

Reviewed by:	andrew, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30126
2021-05-07 14:31:08 -04:00
Mark Johnston
2d499d5052 linker_set: Disable ASAN only in userspace
KASAN does not insert redzones around global variables and so is not
susceptible to the problem that led to us disabling ASAN for linker set
elements in the first place (see commit fe3d8086fb).

Reviewed by:	andrew, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30126
2021-05-07 14:31:08 -04:00
Randall Stewart
a16cee0218 Fix a UDP tunneling issue with rack. Basically there are two
issues.
A) Not enough hdrlen was being calculated when a UDP tunnel is
   in place.
and
B) Not enough memory is allocated in racks fsb. We need to
   overbook the fsb to include a udphdr just in case.

Submitted by: Peter Lei
Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30157
2021-05-07 14:06:43 -04:00
Konstantin Belousov
d474440ab3 Constify vm_pager-related virtual tables.
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00
Konstantin Belousov
4b8365d752 Add OBJT_SWAP_TMPFS pager
This is OBJT_SWAP pager, specialized for tmpfs.  Right now, both swap pager
and generic vm code have to explicitly handle swap objects which are tmpfs
vnode v_object, in the special ways.  Replace (almost) all such places with
proper methods.

Since VM still needs a notion of the 'swap object', regardless of its
use, add yet another type-classification flag OBJ_SWAP. Set it in
vm_object_allocate() where other type-class flags are set.

This change almost completely eliminates the knowledge of tmpfs from VM,
and opens a way to make OBJT_SWAP_TMPFS loadable from tmpfs.ko.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00
Konstantin Belousov
0d2dfc6fed pagertab: use designated initializers
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00
Konstantin Belousov
838adc533f Style enum obj_type
Put each type into dedicated line, which makes addition of new
types cleaner.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00