Commit Graph

137355 Commits

Author SHA1 Message Date
Mateusz Guzik
a0842e69aa lockprof: add contested-only profiling
This allows tracking all wait times with much smaller runtime impact.

For example when doing -j 104 buildkernel on tmpfs:

no profiling:	2921.70s user 282.72s system 6598% cpu 48.562 total
all acquires:	2926.87s user 350.53s system 6656% cpu 49.237 total
contested only:	2919.64s user 290.31s system 6583% cpu 48.756 total
2021-05-22 19:28:37 +00:00
Mateusz Guzik
fca5cfd584 lockprof: retire lock_prof_skipcount
The implementation uses a global variable for *ALL* calls, defeating the
point of sampling in the first place. Remove it as it clearly remains
unused.
2021-05-22 19:28:37 +00:00
Mateusz Guzik
cf74b2be53 vfs: retire the now unused vnlru_free routine 2021-05-22 18:42:30 +00:00
Mark Johnston
5c7ef43e96 ktls.h: Guard includes behind _KERNEL
These are not needed when including ktls.h to get sockopt definitions.

Reviewed by:	gallatin, jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30392
2021-05-22 12:12:19 -04:00
Mark Johnston
e4b16f2fb1 ktrace: Avoid recursion in namei()
sys_ktrace() calls namei(), which may call ktrnamei().  But sys_ktrace()
also calls ktrace_enter() first, so if the caller is itself being
traced, the assertion in ktrace_enter() is triggered.  And, ktrnamei()
does not check for recursion like most other ktrace ops do.

Fix the bug by simply deferring the ktrace_enter() call.

Also make the parameter to ktrnamei() const and convert to ANSI.

Reported by:	syzbot+d0a4de45e58d3c08af4b@syzkaller.appspotmail.com
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30340
2021-05-22 12:07:32 -04:00
Michael Tuexen
8923ce6304 tcp: Handle stack switch while processing socket options
Handle the case where during socket option processing, the user
switches a stack such that processing the stack specific socket
option does not make sense anymore. Return an error in this case.

MFC after:		1 week
Reviewed by:		markj
Reported by:		syzbot+a6e1d91f240ad5d72cd1@syzkaller.appspotmail.com
Sponsored by:		Netflix, Inc.
Differential revision:	https://reviews.freebsd.org/D30395
2021-05-22 14:39:36 +02:00
Konstantin Belousov
f784da883f Move mnt_maxsymlinklen into appropriate fs mount data structures
Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
X-MFC-Note:	struct mount layout
Differential revision:	https://reviews.freebsd.org/D30325
2021-05-22 15:16:09 +03:00
Konstantin Belousov
ea2b64c241 ktrace: add a kern.ktrace.filesize_limit_signal knob
When enabled, writes to ktrace.out that exceed the max file size limit
cause SIGXFSZ as it should be, but note that the limit is taken from
the process that initiated ktrace.   When disabled, write is blocked,
but signal is not send.

Note that in either case ktrace for the affected process is stopped.

Requested and reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30257
2021-05-22 15:16:09 +03:00
Konstantin Belousov
02645b886b ktrace: use the limit of the trace initiator for file size limit on writes
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30257
2021-05-22 15:16:09 +03:00
Konstantin Belousov
1762f674cc ktrace: pack all ktrace parameters into allocated structure ktr_io_params
Ref-count the ktr_io_params structure instead of vnode/cred.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30257
2021-05-22 15:16:08 +03:00
Konstantin Belousov
a6144f713c ktrace: do not stop tracing other processes if our cannot write to this vnode
Other processes might still be able to write, make the decision to stop
based on the per-process situation.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30257
2021-05-22 15:16:08 +03:00
Konstantin Belousov
9bb84c23e7 accounting: explicitly mark the exiting thread as doing accounting
and use the mark to stop applying file size limits on the write of
the accounting record.  This allows to remove hack to clear process
limits in acct_process(), and avoids the bug with the clearing being
ineffective because limits are also cached in the thread structure.

Reported and reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30257
2021-05-22 15:16:08 +03:00
Konstantin Belousov
70c05850e2 kern_descrip.c: Style
Wrap too long lines.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30257
2021-05-22 15:16:08 +03:00
Dmitry Chagin
d6fd321ef6 run(4): add support for ASUS USB-N14 wireless adaptor.
PR:		255759
Submitted by:	john.lmurdoch at gmail.com
MFC After:	1 week
2021-05-22 13:52:12 +03:00
Konstantin Belousov
42881526d4 nullfs: dirty v_object must imply the need for inactivation
Otherwise pages are cleaned some time later when the lower fs decides
that it is time to do it.  This mostly manifests itself as delayed
mtime update, e.g. breaking make-like programs.

Reported by:	mav
Tested by:	mav, pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-05-22 12:30:17 +03:00
Konstantin Belousov
d713bf7927 vn_need_pageq_flush(): simplify
There is no need to own vnode interlock, since v_object is type stable
and can only change to/from NULL, and no other checks in the function
access fields protected by the interlock.  Remove the need variable, the
result of the test is directly usable as return value.

Tested by:	mav, pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-05-22 12:29:44 +03:00
Edward Tomasz Napierala
33621dfc19 Refactor core dumping code a bit
This makes it possible to use core_write(), core_output(),
and sbuf_drain_core_output(), in Linux coredump code.  Moving
them out of imgact_elf.c is necessary because of the weird way
it's being built.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30369
2021-05-22 09:59:00 +01:00
Navdeep Parhar
ffbb373c5a cxgbe(4): Fix build warnings with NOINET kernels.
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D26334
2021-05-21 20:42:04 -07:00
Richard Scheffenegger
3975688563 rack: honor prior socket buffer lock when doing the upcall
While partially reverting D24237 with D29690, due to introducing some
unintended effects for in-kernel TCP consumers, the preexisting lock
on the socket send buffer was not considered properly.

Found by: markj
MFC after: 2 weeks
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D30390
2021-05-22 00:09:59 +02:00
Mark Johnston
916c61a5ed Fix handling of errors from pru_send(PRUS_NOTREADY)
PRUS_NOTREADY indicates that the caller has not yet populated the chain
with data, and so it is not ready for transmission.  This is used by
sendfile (for async I/O) and KTLS (for encryption).  In particular, if
pru_send returns an error, the caller is responsible for freeing the
chain since other implicit references to the data buffers exist.

For async sendfile, it happens that an error will only be returned if
the connection was dropped, in which case tcp_usr_ready() will handle
freeing the chain.  But since KTLS can be used in conjunction with the
regular socket I/O system calls, many more error cases - which do not
result in the connection being dropped - are reachable.  In these cases,
KTLS was effectively assuming success.

So:
- Change sosend_generic() to free the mbuf chain if
  pru_send(PRUS_NOTREADY) fails.  Nothing else owns a reference to the
  chain at that point.
- Similarly, in vn_sendfile() change the !async I/O && KTLS case to free
  the chain.
- If async I/O is still outstanding when pru_send fails in
  vn_sendfile(), set an error in the sfio structure so that the
  connection is aborted and the mbuf chain is freed.

Reviewed by:	gallatin, tuexen
Discussed with:	jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30349
2021-05-21 17:45:19 -04:00
Mark Johnston
7d2608a5d2 tcp: Make error handling in tcp_usr_send() more consistent
- Free the input mbuf in a single place instead of in every error path.
- Handle PRUS_NOTREADY consistently.
- Flush the socket's send buffer if an implicit connect fails.  At that
  point the mbuf has already been enqueued but we don't want to keep it
  in the send buffer.

Reviewed by:	gallatin, tuexen
Discussed with:	jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30349
2021-05-21 17:45:18 -04:00
Emmanuel Vadot
80e645dcdb mmc: Only build mmc_fdt_helper and mmc_pwrseq for arch that uses ext_resources
This is now a needed requirement and fixes powerpc* build
2021-05-21 19:35:20 +02:00
Emmanuel Vadot
c99d887ca8 dwmmc: Add bus_generic_add_child in the methods
Otherwise sdiob cannot add it's children.

Sponsored by:	Diablotin Systems
Differential Revision:	https://reviews.freebsd.org/D30295
2021-05-21 17:40:14 +02:00
Emmanuel Vadot
115e71a457 arm: allwinner: aw_mmc: Check regulators status before enabling/disabling them
Sponsored by:	Diablotin Systems
Differential Revision:	https://reviews.freebsd.org/D30294
2021-05-21 17:39:47 +02:00
Emmanuel Vadot
f52072b06d extres: regulator: Fix regulator_status for already enable regulators
If a regulator hasn't been enable by a driver but is enabled in hardware
(most likely enabled by U-Boot), regulator_status will returns that it
is enabled and so any call to regulator_disable will panic as it wasn't
enabled by one of our drivers.

Sponsored by:	Diablotin Systems
Differential Revision:	https://reviews.freebsd.org/D30293
2021-05-21 17:39:07 +02:00
Emmanuel Vadot
ce41765c21 mmc: dwmmc: Call mmc_fdt_set_power
This allow us to powerup/down the card and enabling/disabling the
regulators if any.

Sponsored by:	Diablotin Systems
Differential Revision:	https://reviews.freebsd.org/D30292
2021-05-21 17:38:35 +02:00
Emmanuel Vadot
03d4e8bb65 mmc_fdt_helper: Add mmc_fdt_set_power
This helper can be used to enable/disable the regulator and starting
the power sequence of sd/sdio/eMMC cards.

Sponsored by:	Diablotin Systems
Differential Revision:	https://reviews.freebsd.org/D30291
2021-05-21 17:38:05 +02:00
Emmanuel Vadot
182717da88 arm64: allwinner: axp81x: Add support for regnode_status
This method is used to know if a regulator is enabled or not.

Sponsored by:	Diablotin Systems
Differential Revision:	https://reviews.freebsd.org/D30290
2021-05-21 17:37:37 +02:00
Emmanuel Vadot
b0387990a7 mmc_fdt_helpers: Parse the optional pwrseq element.
If a sd/emmc node have a pwrseq property parse it and get the corresponding
driver.
This can later be used to powerup/powerdown the SDIO card or eMMC.

Sponsored by:	Diablotin Systems
Differential Revision:	https://reviews.freebsd.org/D30289
2021-05-21 17:36:58 +02:00
Emmanuel Vadot
5b2a81f58d mmc: Add mmc-pwrseq driver
This driver is used to power up sdio card or eMMC.
It handle the reset-gpio, clocks and needed delays for powerup/powerdown.

Sponsored by:	Diablotin Systems
Differential Revision:	https://reviews.freebsd.org/D30288
2021-05-21 17:36:20 +02:00
Emmanuel Vadot
bc1bb80564 arm64: rockchip: gpio: Give friendlier name to gpio
By default name the gpio P<bank><bankpin>
This make it easier to find the gpio when reading schematics or DTS.

Sponsored by:	Diablotin Systems
Differential Revision:	https://reviews.freebsd.org/D30287
2021-05-21 17:35:43 +02:00
Emmanuel Vadot
af2253f61c mmccam: Add two new XPT for MMC and use them in mmc_sim and sdhci
For the discovery phase of SD/eMMC we need to do some transaction in a async
way.
The classic CAM XPT_{GET,SET}_TRAN_SETTING cannot be used in a async way.
This also allow us to split the discovery phase into a more complete state
machine and we don't mtx_sleep with a random number to wait for completion
of the tasks.
For mmc_sim we now do the SET_TRAN_SETTING in a taskqueue so we can call
the needed function for regulators/clocks without the cam lock(s). This part is
still needed to be done for sdhci.
We also now save the host OCR in the discovery phase as it wasn't done before and
only worked because the same ccb was reused.

Reviewed by:	imp, kibab, bz
Differential Revision:	https://reviews.freebsd.org/D30038
2021-05-21 17:34:05 +02:00
Hans Petter Selasky
4eac63af23 Fix for use-after-free by if_ioctl() calls from user-space in USB drivers by
detaching the ifnet before the miibus.

PR:		252608
Suggested by:	jhb@
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-21 14:59:19 +02:00
Hans Petter Selasky
b764a42653 There is a window where threads are removed from the process list and where
the thread destructor is invoked. Catch that window by waiting for all
task_struct allocations to be returned before freeing the UMA zone in the
LinuxKPI. Else UMA may fail to release the zone due to concurrent access
and panic:

panic() - Bad link element prev->next != elm
zone_release()
bucket_drain()
bucket_free()
zone_dtor()
zone_free_item()
uma_zdestroy()
linux_current_uninit()

This failure can be triggered by loading and unloading the LinuxKPI module
in a loop:

while true
do
kldload linuxkpi
kldunload linuxkpi
done

Discussed with:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-21 13:18:41 +02:00
Hans Petter Selasky
c82c200622 Accessing the epoch structure should happen after the INIT_CHECK().
Else the epoch pointer may be NULL.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-21 11:21:32 +02:00
Hans Petter Selasky
f33168351b Properly define EPOCH(9) function macro.
No functional change intended.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-21 11:21:32 +02:00
Hans Petter Selasky
cc9bb7a9b8 Rework for-loop in EPOCH(9) to reduce indentation level.
No functional change intended.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-21 11:21:32 +02:00
Hans Petter Selasky
209d4919c5 Make sure all tasklets are drained before unloading the LinuxKPI.
Else use-after-free may happen.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-21 11:21:32 +02:00
Richard Scheffenegger
032bf749fd [tcp] Keep socket buffer locked until upcall
r367492 would unlock the socket buffer before eventually calling the upcall.
This leads to problematic interaction with NFS kernel server/client components
(MP threads) accessing the socket buffer with potentially not correctly updated
state.

Reported by: rmacklem
Reviewed By: tuexen, #transport
Tested by: rmacklem, otis
MFC after: 2 weeks
Sponsored By: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D29690
2021-05-21 11:07:51 +02:00
Michael Tuexen
500eb6dd80 tcp: Fix sending of TCP segments with IP level options
When bringing in TCP over UDP support in
https://cgit.FreeBSD.org/src/commit/?id=9e644c23000c2f5028b235f6263d17ffb24d3605,
the length of IP level options was considered when locating the
transport header. This was incorrect and is fixed by this patch.

X-MFC with:		https://cgit.FreeBSD.org/src/commit/?id=9e644c23000c2f5028b235f6263d17ffb24d3605
MFC after:		3 days
Reviewed by:		markj, rscheff
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D30358
2021-05-21 09:49:45 +02:00
Edward Tomasz Napierala
8dc96b74ed cam: clear on-stack CCBs in last few drivers
This changes ahc(4), ahd(4), hptiop(4), hptnr(4), hptrr(4),
and ps3cdrom(4).

Reviewed By:	imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D30305
2021-05-21 08:53:59 +01:00
Edward Tomasz Napierala
45f57ce122 arcmsr: clear CCB allocated on the stack
Reviewed By:	delphij, imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D30304
2021-05-21 08:22:13 +01:00
Edward Tomasz Napierala
b9353e0b44 isci: clear CCBs allocated on the stack
Reviewed By:	gallatin, imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D30303
2021-05-21 08:10:22 +01:00
Edward Tomasz Napierala
de992eed78 mpt: clear CCBs allocated on the stack
Reviewed By:	imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D30302
2021-05-21 07:59:02 +01:00
Edward Tomasz Napierala
7608b98c43 mpr, mps: clear CCBs allocated on the stack
Reviewed By:	imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D30301
2021-05-21 07:42:13 +01:00
Edward Tomasz Napierala
d39aac796b pms(4): clear CCBs allocated on the stack
Reviewed By:	imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D30300
2021-05-21 07:29:23 +01:00
Edward Tomasz Napierala
95c19e1d65 linux: refactor bsd_to_linux_regset() out of linux_ptrace.c
This will be used for Linux coredump support.

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30365
2021-05-21 07:26:07 +01:00
Wojciech Macek
eedbbec3fd ip_mroute: remove unused declarations
fix build for non-x86 targets
2021-05-21 08:01:26 +02:00
Wojciech Macek
741afc6233 ip_mroute: refactor bw_meter API
API should work as following:
- periodicaly report Lower-or-EQual bandwidth (LEQ) connections
  over kernel socket, if user application registered for such
  per-flow notifications
- report Grater-or-EQual (GEQ) bandwidth as soon as it reaches
  specified value in configured time window

Custom implementation of callouts was removed. There is no
point of doing calout-wheel here as generic callouts are
doing exactly the same. The performance is not critical
for such reporting, so the biggest concern should be
to have a code which can be easily maintained.

This is ia preparation for locking rework which is highly inefficient.

Approved by:    mw
Sponsored by:   Stormshield
Obtained from:  Semihalf
Differential Revision:  https://reviews.freebsd.org/D30210
2021-05-21 06:43:41 +02:00
Rick Macklem
d80a903a1c nfsd: Add support for CLAIM_DELEG_PREV_FH to the NFSv4.1/4.2 Open
Commit b3d4c70dc6 added support for CLAIM_DELEG_CUR_FH to Open.
While doing this, I noticed that CLAIM_DELEG_PREV_FH support
could be added the same way.  Although I am not aware of any extant
NFSv4.1/4.2 client that uses this claim type, it seems prudent to add
support for this variant of Open to the NFSv4.1/4.2 server.

This patch does not affect mounts from extant NFSv4.1/4.2 clients,
as far as I know.

MFC after:	2 weeks
2021-05-20 18:37:40 -07:00