Commit Graph

268944 Commits

Author SHA1 Message Date
Warner Losh
9e1dc7bec3 loader: create separate man pages for each of the loaders
Create a man page per loader. Loader(8) will have information common to
all of them, while loader_${INTERP}(8) will have information relevant to
that specific loader. Rewrite loader(8) to give an overview and point to
the appropriate man page. Rewrite each of the loader_${INTERP}(8) man
pages to contain only the relevant information to that loader. Put all
the common commands, environment variables, etc in loader_simp(8) and
reference that from the loader_lua or loader_4th man pages. The
loader_lua(8) could use more details about the Lua
integration. Additional organization may be beneficial.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D31340
2021-09-29 09:24:47 -06:00
Ed Maste
09e4502d5c Revert "mgb: Use MGB_DEBUG instead of DEBUG"
This reverts commit 5aa9f8dae3.

We might as well get coverage of this code via LINT.

Reported by:	mhorne
2021-09-29 11:07:11 -04:00
Mitchell Horne
440c645b8f sdhci: add a missing newline 2021-09-29 11:38:56 -03:00
Bartlomiej Grzesik
adbce5ff74 sdhci_xenon: add ACPI support
Add support for ACPI device probing for SDHCI controller found on Marvell chips.

Reviewed by: mw
Sponsored by: Semihalf
Differential revision: https://reviews.freebsd.org/D31600
2021-09-29 16:19:28 +02:00
Bartlomiej Grzesik
d78e464d23 sdhci_xenon: split driver file into generic file and fdt parts
This patch splits driver code into two separate files sdhci_xenon.c
and sdhci_xenon_fdt.c. This will allow future implementation of ACPI
discovery of sdhci on Xenon chips.

Reviewed by: mw
Sponsored by: Semihalf
Differential revision: https://reviews.freebsd.org/D31599
2021-09-29 16:19:28 +02:00
Ed Maste
5aa9f8dae3 mgb: Use MGB_DEBUG instead of DEBUG
The debug register dump routine is not hooked up and is really only
useful to driver developers, so put it under an mgb-specific MGB_DEBUG
rather than general DEBUG.

MFC after:	1 week
Fixes:		8890ab7758 ("Introduce if_mgb driver...")
Sponsored by:	The FreeBSD Foundation
2021-09-29 10:00:55 -04:00
Bjoern A. Zeeb
1269873159 LinuxKPI: fix build
Add a missing "static" for non-{i386,amd64,arm64} which was missed in
c39eefe715. This should fix the builds.

Sponsored by:	The FreeBSD Foundation
MFC after:	7 days
X-MFC with:	c39eefe715
2021-09-29 13:50:12 +00:00
jfranklin13
9589362bc9 syslogd: Fix bug that caused -N to drop SecureMode if specified after -s
MFC after:	2 weeks
Pull Request:	https://github.com/freebsd/freebsd-src/pull/541
2021-09-29 09:44:11 -04:00
Kristof Provost
2f20d80692 pf tests: Basic adaptive mode syncookie test
MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D32139
2021-09-29 15:42:01 +02:00
Kristof Provost
dc0636636b pf tests: Basic syncookie test
MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D32138
2021-09-29 15:42:01 +02:00
Kristof Provost
20f015f08d pf.conf: document syncookies
Reviewed by:	bcr
Obtained from:	OpenBSD
MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D32137
2021-09-29 15:41:49 +02:00
Kristof Provost
5062afff9d pfctl: userspace adaptive syncookies configuration
Hook up the userspace bits to configure syncookies in adaptive mode.

MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D32136
2021-09-29 15:11:54 +02:00
Kristof Provost
955460d41e pf: hook up adaptive mode configuration
The kernel side of pf syncookie adaptive mode configuration.

MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D32135
2021-09-29 15:11:54 +02:00
Kristof Provost
bf8637181a pf: implement adaptive mode
Use atomic counters to ensure that we correctly track the number of half
open states and syncookie responses in-flight.
This determines if we activate or deactivate syncookies in adaptive
mode.
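
As a rough userland illustration of the idea, the sketch below tracks a half-open state count with C11 atomics and flips an "active" flag around two thresholds. Everything here is an assumption made for illustration: the struct, the function names, and the high/low watermark scheme are hypothetical and are not taken from the pf sources, and the in-flight syncookie response count mentioned above is left out for brevity.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical adaptive-mode state; not the real pf data structures. */
struct adaptive_sketch {
	atomic_uint_fast32_t half_open;	/* current half-open states */
	uint32_t hiwat;			/* assumed: activate at or above this */
	uint32_t lowat;			/* assumed: deactivate below this */
	atomic_bool active;		/* syncookies currently in use */
};

static void
adaptive_init(struct adaptive_sketch *s, uint32_t hiwat, uint32_t lowat)
{
	atomic_init(&s->half_open, 0);
	atomic_init(&s->active, false);
	s->hiwat = hiwat;
	s->lowat = lowat;
}

/* Called when a half-open state is created. */
static void
half_open_inc(struct adaptive_sketch *s)
{
	atomic_fetch_add(&s->half_open, 1);
}

/* Called when a half-open state completes or is removed. */
static void
half_open_dec(struct adaptive_sketch *s)
{
	atomic_fetch_sub(&s->half_open, 1);
}

/* Re-evaluate the mode from the counter and return the resulting state. */
static bool
adaptive_update(struct adaptive_sketch *s)
{
	uint_fast32_t n = atomic_load(&s->half_open);

	if (!atomic_load(&s->active) && n >= s->hiwat)
		atomic_store(&s->active, true);
	else if (atomic_load(&s->active) && n < s->lowat)
		atomic_store(&s->active, false);
	return (atomic_load(&s->active));
}
```

Using two thresholds rather than one gives hysteresis, so the mode does not flap when the count hovers near a single limit.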

MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D32134
2021-09-29 15:11:54 +02:00
Jessica Clarke
4a331971d2 mmc: Fix regression in 8a8166e5bc breaking Stratix 10 boot
The refactoring in 8a8166e5bc introduced a functional change that
breaks booting on the Stratix 10, hanging when it should be attaching
da0. Previously OF_getencprop was called with a pointer to host->f_max,
so if it wasn't present then the existing value was left untouched, but
after that commit it will instead clobber the value with 0. The dwmmc
driver, as used on the Stratix 10, sets a default value before calling
mmc_fdt_parse and so was broken by this functional change. It appears
that aw_mmc also does the same thing, so was presumably also broken on
some boards.
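
The failure mode can be illustrated with a small userland sketch. The names `lookup_prop`, `parse_keep_default`, and `parse_clobber` are hypothetical stand-ins, not the real OF_getencprop or mmc_fdt_parse code; the sketch only assumes a lookup that leaves its output untouched when the property is absent, and shows one way a refactoring can end up overwriting a driver-set default with 0.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a device-tree lookup: on a missing property it
 * returns -1 and does not write to *out.
 */
static int
lookup_prop(bool present, uint32_t value, uint32_t *out)
{
	if (!present)
		return (-1);
	*out = value;
	return (0);
}

/*
 * Earlier pattern: pass the field itself, so a default set by the driver
 * survives when the property is absent.
 */
static void
parse_keep_default(bool present, uint32_t value, uint32_t *f_max)
{
	(void)lookup_prop(present, value, f_max);
}

/*
 * Regressed pattern (assumed shape): read into a zeroed temporary and store
 * it unconditionally, which replaces the default with 0 when the property
 * is absent.
 */
static void
parse_clobber(bool present, uint32_t value, uint32_t *f_max)
{
	uint32_t tmp = 0;

	(void)lookup_prop(present, value, &tmp);
	*f_max = tmp;
}
```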

Fixes:	8a8166e5bc ("mmc: switch mmc_helper to device_ api")
Reviewed by:	manu, mw
Differential Revision:	https://reviews.freebsd.org/D32209
2021-09-29 13:59:13 +01:00
Bjoern A. Zeeb
c39eefe715 LinuxKPI: implement dma_set_coherent_mask()
Coherent is lower 32bit only by default in Linux and our only default
dma mask is 64bit currently which violates expectations unless
dma_set_coherent_mask() was called explicitly with a different mask.

Implement coherent by creating a second tag, and storing the tags in the
objects and use the tag from the object wherever possible.
This currently does not update the scatterlist or pool (both could be
converted but S/G cannot be MFCed as easily).

There is a 2nd change embedded in the updated logic of
linux_dma_alloc_coherent() to always zero the allocation as
otherwise some drivers get cranky on uninitialised garbage.

Sponsored by:	The FreeBSD Foundation
MFC after:	7 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D32164
2021-09-29 12:41:28 +00:00
Bjoern A. Zeeb
25adbd0b8c neta: cleanup warning
mvneta_find_ethernet_prop_switch() is file-local static to
if_mvneta_fdt.c.  Normally we would not need a function declaration
but in case MVNETA_DEBUG is set it becomes public.  Move the
function declaration from if_mvneta.c to if_mvneta_fdt.c to avoid
a warning during each compile.
2021-09-29 12:37:16 +00:00
Li-Wen Hsu
5f07d7fe40 mgb: Fix DEBUG (and LINT) build
Sponsored by:	The FreeBSD Foundation
2021-09-29 16:34:59 +08:00
Warner Losh
36a87d0c6f nvme: Sanity check completion id
Make sure the completion ID is in the range of [0..num_trackers) since
the values past the end of the act_tr array are never going to be valid
trackers and will lead to pain and suffering if we try to dereference
them to get the tracker or to set the tracker back to NULL as we
complete the I/O.
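
A minimal sketch of such a range check, assuming a simplified queue pair. The struct and function names are hypothetical; only `num_trackers` and `act_tr` come from the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified queue pair: only the fields the check needs. */
struct qpair_sketch {
	uint32_t num_trackers;	/* number of slots in act_tr */
	void **act_tr;		/* active trackers, indexed by completion ID */
};

/*
 * Return the tracker for cid, or NULL if cid is outside [0, num_trackers)
 * or no tracker is active in that slot. The array is never indexed with an
 * out-of-range ID.
 */
static void *
lookup_tracker(const struct qpair_sketch *q, uint16_t cid)
{
	if (cid >= q->num_trackers)
		return (NULL);
	return (q->act_tr[cid]);
}
```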

Sponsored by:		Netflix
Reviewed by:		mav, chs, chuck
Differential Revision:	https://reviews.freebsd.org/D32088
2021-09-28 21:21:50 -06:00
Warner Losh
587aa25525 nvme: count number of ignored interrupts
Count the number of times we're asked to process completions, but that
we ignore because the state of the qpair isn't in RECOVERY_NONE.
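
The counting can be sketched as follows. The enum is reduced to two states and all names ending in `_sketch` are hypothetical; this is an illustration of the early-return-and-count pattern, not the driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced set of recovery states, for illustration only. */
enum recovery_sketch {
	RECOVERY_NONE,
	RECOVERY_WAITING
};

/* Hypothetical per-qpair bookkeeping. */
struct qpair_state_sketch {
	enum recovery_sketch recovery_state;
	uint64_t num_ignored;	/* completion requests skipped */
	uint64_t num_processed;	/* completion requests handled */
};

/*
 * Handle a request to process completions. Returns true if completions
 * were processed, false if the request was ignored and counted.
 */
static bool
process_completions_sketch(struct qpair_state_sketch *q)
{
	if (q->recovery_state != RECOVERY_NONE) {
		q->num_ignored++;
		return (false);
	}
	q->num_processed++;
	return (true);
}
```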

Sponsored by:		Netflix
Reviewed by:		mav, chuck
Differential Revision:	https://reviews.freebsd.org/D32212
2021-09-28 21:18:00 -06:00
Warner Losh
7d5eebe0f4 nvme: Add sanity check for phase on startup.
The proper phase for the qpair right after reset in the first interrupt
is 1. On that interrupt, make sure that we're not still in phase 0. This
is an illegal state to be processing interrupts in and indicates that
we've failed to properly protect against a race between initializing our
state and processing interrupts. Modify the stat resetting code so it
resets the number of interrupts to 1 instead of 0 so we don't trigger a
false positive panic.

Sponsored by:		Netflix
Reviewed by:		cperciva, mav (prior version)
Differential Revision:	https://reviews.freebsd.org/D32211
2021-09-28 21:18:00 -06:00
Warner Losh
fa81f3731d nvme: start qpair in state RECOVERY_WAITING
An interrupt happens on the admin queue right away after the reset, so
as soon as we enable interrupts, we'll get a call to our interrupt
handler. It is safe to ignore this interrupt if we're not yet
initialized, or to process it if we are. If we are initialized, we'll
see there's no completion records and return. If we're not, we'll
process no completion records and return. Either way, nothing is
processed and nothing is lost.

Until we've completely setup the qpair, we need to avoid processing
completion records. Start the qpair in the waiting recovery state so we
return immediately when we try to process completions. The code already
sets it to 'NONE' when our initialization is complete. It's safe to
defer completion processing here because we don't send any commands
before the initialization of the software state of the qpair is
complete. And even if we were to somehow send a command prior to that
completing, the completion record for that command would be processed
when we send commands to the admin qpair after we've setup the software
state. There's no good central point to add an assert for this last
condition.

This fixes a KASSERT "received completion for unknown cmd" panic on
boot.

Fixes:			502dc84a8b
Sponsored by:		Netflix
Reviewed by:		mav, cperciva, gallatin
Differential Revision:	https://reviews.freebsd.org/D32210
2021-09-28 21:16:19 -06:00
Ed Maste
543df60907 mgb: Connect if_mgb module to the build
It supports the following Microchip devices:

LAN7430 PCIe Gigabit Ethernet controller with PHY
LAN7431 PCIe Gigabit Ethernet controller with RGMII interface

The driver has a number of caveats and limitations, but is functional.

Relnotes:	Yes
Sponsored by:	The FreeBSD Foundation
2021-09-28 21:16:40 -04:00
Michael Tuexen
28ea947078 sctp: provide a specific stream scheduler function for FCFS
A KASSERT in the generic routine does not apply and triggers
incorrectly.

Reported by:	syzbot+8435af157238c6a11430@syzkaller.appspotmail.com
MFC after:	1 week
2021-09-29 02:08:37 +02:00
Colin Percival
7457840230 loader: Set twiddle globaldiv to 16 by default
Booting FreeBSD on an EC2 c5.xlarge instance, the loader "twiddles"
810 times over the course of 510 ms, a rate of 1.59 kHz. Even accepting
that many systems are slower than this particular VM and will take
longer to boot (especially if using spinning-rust disks), this seems
like an unhelpfully large amount of twiddling when compared to the
~60 Hz frame rate of many displays; printing the twiddles also consumes
roughly 10% of the boot time on the aforementioned VM.

Setting the default globaldiv to 16 dramatically reduces the time spent
printing twiddles to the console while still twiddling at roughly 100
Hz; this should be ample even for systems which take longer to boot and
consequently twiddle slower.
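
The figures above can be checked with a short calculation. The helper names are hypothetical; the inputs (810 twiddles, 510 ms, divisor 16) are the ones quoted in this message.

```c
/* Twiddle rate in Hz, given a twiddle count over an interval in ms. */
static double
twiddle_rate_hz(unsigned int twiddles, unsigned int interval_ms)
{
	return ((double)twiddles * 1000.0 / (double)interval_ms);
}

/* Rate after printing only every divisor-th twiddle. */
static double
divided_rate_hz(double rate_hz, unsigned int divisor)
{
	return (rate_hz / (double)divisor);
}
```

With the quoted inputs, `twiddle_rate_hz(810, 510)` is a little under 1.59 kHz, and dividing that by 16 gives a little under 100 Hz, which matches the "roughly 100 Hz" stated above.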

Note that this can be adjusted via the twiddle_divisor variable in
loader.conf, but that file is not processed until nearly halfway
through the loader's runtime.

Reviewed by:	allanjude, jrtc27, kevans
MFC after:	1 week
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	<https://reviews.freebsd.org/D32163>
2021-09-28 15:24:02 -07:00
Ed Maste
667ea7385d mgb: Update man page wrt state of the driver
Be explicit that the driver has caveats and limitations, and remove the
note about not being connected to the build: I plan to connect it soon.
(Also the note serves no real purpose in a man page that is not
installed.)

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-09-28 16:28:37 -04:00
Ed Maste
820da5820e mgb: Apply some style(9)
Add parens around return values, rewrap lines

MFC after:	1 week
Fixes:		8890ab7758 ("Introduce if_mgb driver...")
Sponsored by:	The FreeBSD Foundation
2021-09-28 16:17:16 -04:00
Li-Wen Hsu
0b159faaca Temporarily skip flaky test cases under sys.aio.aio_test in CI
- sys.aio.aio_test.vectored_unaligned
- sys.aio.aio_test.vectored_zvol_poll

PR:		258766
Sponsored by:	The FreeBSD Foundation
2021-09-29 03:32:47 +08:00
Ian Lepore
dc91a9715f Fix busdma resource leak on usb device detach.
When a usb device is detached, usb_pc_dmamap_destroy() called
bus_dmamap_destroy() while the map was still loaded. That's harmless on x86
architectures, but on all other platforms it causes bus_dmamap_destroy() to
return EBUSY and leak away any memory resources (including bounce buffers)
associated with the mapping, as well as any allocated map structure itself.

This change introduces a new is_loaded flag to the usb_page_cache struct to
track whether a map is loaded or not. If the map is loaded,
bus_dmamap_unload() is called before bus_dmamap_destroy() to avoid leaking
away resources.
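
The unload-before-destroy pattern can be sketched in userland as below. The types and functions are hypothetical stand-ins that only model the behaviour described here (destroying a loaded map fails with EBUSY); they are not the busdma or USB APIs.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a DMA map; not the real busdma type. */
struct map_sketch {
	bool loaded;	/* a buffer is currently loaded into the map */
	bool destroyed;	/* resources have been released */
};

/*
 * Models the behaviour described above for non-x86 platforms: destroying a
 * loaded map fails with EBUSY and nothing is released.
 */
static int
map_destroy_sketch(struct map_sketch *m)
{
	if (m->loaded)
		return (EBUSY);
	m->destroyed = true;
	return (0);
}

static void
map_unload_sketch(struct map_sketch *m)
{
	m->loaded = false;
}

/* Hypothetical page-cache wrapper that tracks the load state itself. */
struct page_cache_sketch {
	struct map_sketch map;
	bool is_loaded;
};

/* Unload first if needed, then destroy, so destroy cannot hit EBUSY. */
static int
page_cache_destroy_sketch(struct page_cache_sketch *pc)
{
	if (pc->is_loaded) {
		map_unload_sketch(&pc->map);
		pc->is_loaded = false;
	}
	return (map_destroy_sketch(&pc->map));
}
```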

MFC after:	7 days
Differential Revision:	https://reviews.freebsd.org/D32208
2021-09-28 13:29:10 -06:00
Ed Maste
c83ae596f3 mgb: Staticize devclass and iflib structs (as is typical)
MFC after:	1 week
Fixes:		8890ab7758 ("Introduce if_mgb driver...")
Sponsored by:	The FreeBSD Foundation
2021-09-28 15:11:01 -04:00
Li-Wen Hsu
b9b5a4dd59 gmultipath failloop test: Put the dtrace sanity checker in right place
Check if dtrace execution is successful or not right after execution.

Sponsored by:	The FreeBSD Foundation
2021-09-29 02:38:34 +08:00
Michael Tuexen
fa947a3687 sctp: cleanup and adding KASSERT()s, no functional change
MFC after:	1 week
2021-09-28 20:31:12 +02:00
Li-Wen Hsu
38dac71d0a Fix typo
Reported by:	swills
Sponsored by:	The FreeBSD Foundation
2021-09-29 02:28:01 +08:00
Gleb Smirnoff
2dbc9a388e Fix memory deadlock when GELI partition is used for swap.
When we get low on memory, the VM system tries to free some by swapping
pages. However, if we are so low on free pages that GELI allocations block,
then the swapout operation cannot complete. This keeps the VM system from
being able to free enough memory so the allocation can complete.

To alleviate this, keep a UMA pool at the GELI layer which is used for data
buffer allocation in the fast path, and reserve some of that memory for swap
operations. If an IO operation is a swap, then use the reserved memory. If
the allocation still fails, return ENOMEM instead of blocking.

For non-swap allocations, change the default to using M_NOWAIT. In general,
this *should* be better, since it gives upper layers a signal of the memory
pressure and a chance to manage their failure strategy appropriately. However,
a user can set the kern.geom.eli.blocking_malloc sysctl/tunable to restore
the previous M_WAITOK strategy.
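
The allocation policy can be modelled with a toy pool. This sketch is an assumption-laden illustration: it does not use UMA, and the struct, field, and function names are hypothetical. It only shows the rule that non-swap requests may not dip into the reserve, swap requests may, and failure is reported as ENOMEM rather than by blocking.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer pool with part of it held back for swap I/O. */
struct pool_sketch {
	size_t free_items;	/* buffers currently available */
	size_t reserve;		/* buffers only swap I/O may consume */
};

/*
 * Take one buffer without blocking. Non-swap requests must leave the
 * reserve intact; swap requests may use it. Returns 0 or ENOMEM.
 */
static int
pool_alloc_sketch(struct pool_sketch *p, bool is_swap)
{
	size_t floor = is_swap ? 0 : p->reserve;

	if (p->free_items <= floor)
		return (ENOMEM);
	p->free_items--;
	return (0);
}
```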

Submitted by:		jtl
Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D24400
2021-09-28 11:23:52 -07:00
Gleb Smirnoff
183f8e1e57 Externalize nsw_cluster_max and initialize it early.
GEOM_ELI needs to know the value, cause it will soon have special
memory handling for I/O operations associated with swap.

Move initialization to swap_pager_init(), which is executed at
SI_SUB_VM, unlike swap_pager_swap_init(), which would be executed
only when a swap is configured. GEOM_ELI might need the value at
SI_SUB_DRIVERS, when disks are tasted by GEOM.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D24400
2021-09-28 11:23:52 -07:00
Gleb Smirnoff
c6213beff4 Add flag BIO_SWAP to mark IOs that are associated with swap.
Submitted by:		jtl
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D24400
2021-09-28 11:23:51 -07:00
Ian Lepore
1acf73d544 qoriq_therm.c: avoid a segfault on the error exit path.
If anything goes wrong during attach() it is handled with a 'goto fail'
which calls sysctl_ctx_free().  But the sysctl context doesn't get
initialized until very late in attach(), so almost any error just results
in a segfault.  Move the sysctl_ctx_init() call to the beginning of the
attach() function, so that it is done before any errors can happen that
will lead to freeing the context.
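
The ordering problem can be sketched with a toy context. All names are hypothetical, and a -1 return from the free routine stands in for the crash; the real sysctl context API is not used.

```c
#include <stdbool.h>

/* Hypothetical context that must be initialized before it is freed. */
struct ctx_sketch {
	bool initialized;
};

static void
ctx_init_sketch(struct ctx_sketch *c)
{
	c->initialized = true;
}

/* Returns 0 on success, -1 if the context was never initialized. */
static int
ctx_free_sketch(struct ctx_sketch *c)
{
	if (!c->initialized)
		return (-1);
	c->initialized = false;
	return (0);
}

/*
 * Toy attach routine with a shared error exit. With init_first set, the
 * context is always valid by the time the fail path frees it.
 */
static int
attach_sketch(struct ctx_sketch *c, bool init_first, bool fail_early,
    int *cleanup_status)
{
	if (init_first)
		ctx_init_sketch(c);
	if (fail_early)
		goto fail;
	if (!init_first)
		ctx_init_sketch(c);	/* old ordering: initialized late */
	return (0);
fail:
	*cleanup_status = ctx_free_sketch(c);
	return (-1);
}
```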
2021-09-28 12:19:44 -06:00
Li-Wen Hsu
819961c580 Temporarily skip sys.geom.class.multipath.failloop.failloop in CI
This test case uses `dtrace -c` but it has some issues at the moment.

While here, add a check for whether dtrace executes successfully, to provide
a more informative error message.

PR:             258763
Sponsored by:   The FreeBSD Foundation
2021-09-29 02:02:27 +08:00
Ed Maste
8b889b8953 mgb: Do not KASSERT on error in mgb_init
There's not much we can do if mii_mediachg() fails, but KASSERT is not
appropriate.

MFC after:	1 week
Fixes:		8890ab7758 ("Introduce if_mgb driver...")
Sponsored by:	The FreeBSD Foundation
2021-09-28 13:57:36 -04:00
Ian Lepore
ea5c0b7b14 Add the clock for the imx8 thermal monitoring unit. 2021-09-28 11:51:57 -06:00
Ian Lepore
5e6f76f370 Add ethernet to the standard drivers for imx8. 2021-09-28 11:18:51 -06:00
Ed Maste
ecac5c2928 mgb: enable multicast in mgb_init
Receive Filtering Engine (RFE) configuration is not yet implemented,
and mgb intended to enable all broadcast, multicast, and unicast.
However, MGB_RFE_ALLOW_MULTICAST was missed (MGB_RFE_ALLOW_UNICAST was
included twice).
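
A duplicated flag in a bitwise OR is easy to miss because it compiles cleanly. The sketch below uses hypothetical macro names and bit positions (the real register layout is not given here) to show the effect.

```c
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define	RFE_ALLOW_BROADCAST_SKETCH	(UINT32_C(1) << 0)
#define	RFE_ALLOW_MULTICAST_SKETCH	(UINT32_C(1) << 1)
#define	RFE_ALLOW_UNICAST_SKETCH	(UINT32_C(1) << 2)

/* Unicast listed twice: ORing a bit with itself is a no-op. */
static uint32_t
rfe_flags_buggy(void)
{
	return (RFE_ALLOW_BROADCAST_SKETCH | RFE_ALLOW_UNICAST_SKETCH |
	    RFE_ALLOW_UNICAST_SKETCH);
}

/* Each of broadcast, multicast, and unicast listed once. */
static uint32_t
rfe_flags_fixed(void)
{
	return (RFE_ALLOW_BROADCAST_SKETCH | RFE_ALLOW_MULTICAST_SKETCH |
	    RFE_ALLOW_UNICAST_SKETCH);
}
```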

MFC after:	1 week
Fixes:		8890ab7758 ("Introduce if_mgb driver...")
Sponsored by:	The FreeBSD Foundation
2021-09-28 12:32:44 -04:00
Mitchell Horne
800e74955d boot(9): update to match reality
This function was renamed to kern_reboot() in 2010, but the man page has
failed to keep in sync. Bring it up to date on the rename, add the
shutdown hooks to the synopsis, and document the (obvious) fact that
kern_reboot() does not return.

Fix an outdated reference to the old name in kern_reboot(), and leave a
reference to the man page so future readers might find it before any
large changes.

Reviewed by:	imp, markj
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32085
2021-09-28 11:36:09 -03:00
Michael Tuexen
5b53e749a9 sctp: fix usage of stream scheduler functions
sctp_ss_scheduled() should only be called for streams that are
scheduled. So call sctp_ss_remove_from_stream() before it.
This bug was uncovered by the earlier cleanup.

Reported by:	syzbot+bbf739922346659df4b2@syzkaller.appspotmail.com
Reported by:	syzbot+0a0857458f4a7b0507c8@syzkaller.appspotmail.com
Reported by:	syzbot+a0b62c6107b34a04e54d@syzkaller.appspotmail.com
Reported by:	syzbot+0aa0d676429ebcd53299@syzkaller.appspotmail.com
Reported by:	syzbot+104cc0c1d3ccf2921c1d@syzkaller.appspotmail.com
MFC after:	1 week
2021-09-28 05:25:58 +02:00
Michael Tuexen
171633765c sctp: avoid locking an already locked mutex
Reported by:	syzbot+f048680690f2e8d7ddad@syzkaller.appspotmail.com
Reported by:	syzbot+0725c712ba89d123c2e9@syzkaller.appspotmail.com
MFC after:	1 week
2021-09-28 05:17:03 +02:00
Andrew Turner
f3aa0098a8 Use mtx_lock_spin in the gic driver
The mutex was changed to a spin lock when the MSI/MSI-X handling was
moved from the gicv2m to the gic driver. Update the calls to lock
and unlock the mutex to the spin variant.

Submitted by:	jrtc27 ("Change all the mtx_(un)lock(&sc->mutex) to be the _spin versions.")
Reported by:	mw, antranigv@freebsd.am
Sponsored by:	The FreeBSD Foundation
2021-09-28 12:42:06 +01:00
Hans Petter Selasky
3984400149 mixer(3): Add support for controlling mixer mute and volume on feeder channels.
PR:	258711
Reported by:	jbeich@FreeBSD.org
Differential Revision:	https://reviews.freebsd.org/D31636
Sponsored by:	NVIDIA Networking
2021-09-28 11:20:23 +02:00
Hans Petter Selasky
4a83ca1078 sound(4): Implement mixer mute control for feeder channels.
PR:	258711
Differential Revision:	https://reviews.freebsd.org/D31636
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-09-28 11:20:09 +02:00
Navdeep Parhar
45d6fbaec2 cxgbe(4): Update firmwares to 1.26.2.0.
The firmwares and the following changelog are from the "Chelsio Unified
Wire v3.15.0.0 for Linux."

Version : 1.26.2.0
Date    : 09/24/2021
====================

FIXES
-----

BASE:
- Added support for SFP+ RJ45 (0x1C).
- Fixing backward compatibility issue with older drivers when multiple
  speeds are passed to firmware.

OFLD:
- Do not touch tp_plen_max if driver is supplying tp_plen_max. This
  fixes a connection reset issue in iscsi.

ENHANCEMENTS
------------

BASE:
- Firmware header modified to add firmware binary signature.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2021-09-27 23:52:51 -07:00
Kirk McKusick
4a365e863f Avoid "consumer not attached in g_io_request" panic when disk lost
while using a UFS snapshot.

The UFS filesystem supports snapshots. Each snapshot is a file whose
contents are a frozen image of the disk partition on which the filesystem
resides. Each time an existing block in the filesystem is modified,
the filesystem checks whether that block was in use at the time that
the snapshot was taken. If so, and if it has not already been copied,
a new block is allocated from among the blocks that were not in use
at the time that the snapshot was taken and placed in the snapshot file
to replace the entry that has not yet been copied. The previous contents
of the block are copied to the newly allocated snapshot file block,
and the write to the original is then allowed to proceed.

The block allocation is done using the usual UFS_BALLOC() routine
which allocates the needed block in the snapshot and returns a
buffer that is set up to write data into the newly allocated block.
In usual filesystem operation, the contents for the new block are
copied from user space into the buffer and the buffer is then written
to the file using bwrite(), bawrite(), or bdwrite(). In the case of a
snapshot the new block must be filled from the disk block that is about
to be rewritten. The snapshot routine has a function readblock() that
it uses to read the `about to be rewritten' disk block.

/*
 * Read the specified block into the given buffer.
 */
static int
readblock(snapvp, bp, lbn)
	struct vnode *snapvp;
	struct buf *bp;
	ufs2_daddr_t lbn;
{
	struct inode *ip;
	struct bio *bip;
	struct fs *fs;

	ip = VTOI(snapvp);
	fs = ITOFS(ip);

	bip = g_alloc_bio();
	bip->bio_cmd = BIO_READ;
	bip->bio_offset = dbtob(fsbtodb(fs, blkstofrags(fs, lbn)));
	bip->bio_data = bp->b_data;
	bip->bio_length = bp->b_bcount;
	bip->bio_done = NULL;

	g_io_request(bip, ITODEVVP(ip)->v_bufobj.bo_private);
	bp->b_error = biowait(bip, "snaprdb");
	g_destroy_bio(bip);
	return (bp->b_error);
}

When the underlying disk fails, its GEOM module is removed.
Subsequent attempts to access it should return the ENXIO error.
The functionality of checking for the lost disk and returning
ENXIO is handled by the g_vfs_strategy() routine:

void
g_vfs_strategy(struct bufobj *bo, struct buf *bp)
{
	struct g_vfs_softc *sc;
	struct g_consumer *cp;
	struct bio *bip;

	cp = bo->bo_private;
	sc = cp->geom->softc;

	/*
	 * If the provider has orphaned us, just return ENXIO.
	 */
	mtx_lock(&sc->sc_mtx);
	if (sc->sc_orphaned || sc->sc_enxio_active) {
		mtx_unlock(&sc->sc_mtx);
		bp->b_error = ENXIO;
		bp->b_ioflags |= BIO_ERROR;
		bufdone(bp);
		return;
	}
	sc->sc_active++;
	mtx_unlock(&sc->sc_mtx);

	bip = g_alloc_bio();
	bip->bio_cmd = bp->b_iocmd;
	bip->bio_offset = bp->b_iooffset;
	bip->bio_length = bp->b_bcount;
	bdata2bio(bp, bip);
	if ((bp->b_flags & B_BARRIER) != 0) {
		bip->bio_flags |= BIO_ORDERED;
		bp->b_flags &= ~B_BARRIER;
	}
	if (bp->b_iocmd == BIO_SPEEDUP)
		bip->bio_flags |= bp->b_ioflags;
	bip->bio_done = g_vfs_done;
	bip->bio_caller2 = bp;
	g_io_request(bip, cp);
}

Only after checking that the device is present does it construct
the "bio" request and call g_io_request(). When readblock()
constructs its own "bio" request and calls g_io_request() directly
it panics with "consumer not attached in g_io_request" when the
underlying device no longer exists.

The fix is to have readblock() call g_vfs_strategy() rather than
constructing its own "bio" request:

/*
 * Read the specified block into the given buffer.
 */
static int
readblock(snapvp, bp, lbn)
	struct vnode *snapvp;
	struct buf *bp;
	ufs2_daddr_t lbn;
{
	struct inode *ip;
	struct fs *fs;

	ip = VTOI(snapvp);
	fs = ITOFS(ip);

	bp->b_iocmd = BIO_READ;
	bp->b_iooffset = dbtob(fsbtodb(fs, blkstofrags(fs, lbn)));
	bp->b_iodone = bdone;
	g_vfs_strategy(&ITODEVVP(ip)->v_bufobj, bp);
	bufwait(bp);
	return (bp->b_error);
}

Here it uses the buffer that will eventually be written to the disk.
The g_vfs_strategy() routine uses four parts of the buffer: b_bcount,
b_iocmd, b_iooffset, and b_data.

The b_bcount field is already correctly set for the buffer. It is
safe to set the b_iocmd and b_iooffset fields as they are set
correctly when the later write is done. The write path will also
clear the B_DONE flag that our use of the buffer will set. The
b_iodone callback has to be set to bdone(), which does just the
notification that the I/O is done in bufdone(). The rest of
bufdone(), which includes things like processing the softdeps
associated with the buffer, should not be done until the buffer has
been written. Bufdone() will set b_iodone back to NULL after using it,
so the full bufdone() processing will be done when the buffer is
written. The final change from the previous version of readblock()
is that it used the b_data for the destination of the read while
g_vfs_strategy() uses the bdata2bio() function to take advantage
of VMIO when it is available.

Differential revision: https://reviews.freebsd.org/D32150
Reviewed by:  kib, chs
MFC after:    1 week
Sponsored by: Netflix
2021-09-27 20:04:51 -07:00