Commit Graph

2395 Commits

Author SHA1 Message Date
Ed Maste
9c296a2105 geom: Add HiFive boot partitions
As documented in the HiFive Unmatched Software Reference Manual.

Reviewed by:	imp, mhorne
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34010
2022-01-26 10:54:45 -05:00
Mark Johnston
d91d2b513e geom: Handle partial I/O in g_{read,write,delete}_data()
These routines are used internally by GEOM to dispatch I/O requests to a
provider, typically for tasting or for updating GEOM class metadata
blocks.

These routines assumed that partial I/O did not occur without setting
BIO_ERROR, but this is possible in at least two cases:
- Some or all of the I/O range is beyond the provider's mediasize.
  In this scenario g_io_check() truncates the bounds of the request
  before it is handed to the target provider.
- A read from vnode-backed md(4) device returns EOF (the backing vnode
  is allowed to be smaller than the device itself) or partial vnode I/O
  occurs.
In these scenarios g_read_data() could return a partially uninitialized
buffer.  Many consumers are not affected by the first case, since the
offsets used for provider metadata or tasting are relative to the
provider's mediasize, but in some cases metadata is read at fixed
offsets, such as when searching for a UFS superblock using the offsets
defined by SBLOCKSEARCH.

Thus, modify the routines to explicitly check for a non-zero residual
and return EIO in that case.  Remove a related check from the
DIOCGDELETE ioctl handler, it is handled within g_delete_data() now.

Reviewed by:	mav, imp, kib
Reported by:	KMSAN
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31293
2022-01-20 08:29:39 -05:00
Robert Wing
a50e92cc20 geom: add kqfilter support for geom dev
The only event hooked up is NOTE_ATTRIB, which is triggered when the
device is resized. Support for other NOTE_* events to follow.

Reviewed by:	kib, jhb
Differential Revision:	https://reviews.freebsd.org/D33402
2022-01-18 10:54:59 -09:00
John Baldwin
d61effd38b Use G_ELI_IVKEYLEN as the size of IV in the user test code.
IVs are not the size of keys as a general case.  Most often they are
the size of a single block.

Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33885
2022-01-13 17:22:06 -08:00
Konstantin Belousov
9f4073d446 geom label msdosfs: sanity check BPB before using it for io request
It must be greater than zero, and be multiple of the device block size.

In collaboration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Alan Somers
f284bed200 geom_gate: ensure readprov is null-terminated
With crafted input to the G_GATE_CMD_CREATE ioctl, geom_gate can be made
to print kernel memory to the system console, potentially revealing
sensitive data from whatever was previously in that memory page.

But but but: this is a case of the sys admin misconfiguring, and you'd
need root privileges to do this.

Submitted By:	Johannes Totz <jo@bruelltuete.com>
MFC after:	2 weeks
Reviewed By:	asomers
Differential Revision: https://reviews.freebsd.org/D31727
2022-01-02 18:01:23 -07:00
John Baldwin
0ff783dc15 sys/geom: Use C99 fixed-width integer types.
No functional change.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D33635
2021-12-28 09:41:51 -08:00
Alexander Motin
f4bf48c25c GEOM: Minor polishing in geom_event.
- Remove timeouts from msleep()'s.  Those should always be woken up.
 - Move wakeup() under the lock to not call on possibly freed pointer.
 - Remove some dead code.

MFC after:	2 weeks
2021-12-27 21:01:08 -05:00
Edward Tomasz Napierala
739a9c51b0 geom(4): Fix some of the "set but not used" warnings
The few I've left in place look like potential bugs.

Sponsored By:	EPSRC
2021-12-18 11:42:34 +00:00
Mateusz Guzik
8ad5b9498e Revert "geom_bde: plug set-but-not-used vars"
The commit at hand happens to break userspace build as the header ins
included by sbin/gbde/gbde.c and the __diagused macro is not provided to
userspace.

Revert until this gets sorted out.

This reverts commit 26e837e2d4.
2021-12-09 19:23:05 +00:00
Mateusz Guzik
2cc5a480a6 geom_raid3: plug set-but-not-unused vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-09 18:08:03 +00:00
Mateusz Guzik
c904812018 geom_eli: mostly plug set-but-not-unused vars
The remaining case is an ignored error.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-09 18:05:06 +00:00
Mateusz Guzik
0d81fba680 geom_mirror: plug set-but-not-unused vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-09 18:00:27 +00:00
Mateusz Guzik
26e837e2d4 geom_bde: plug set-but-not-used vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-09 17:53:48 +00:00
Scott Long
2d5d242406 Fix "set but not used" for geom
Sponsored by: Rubicon Communications, LLC ("Netgate")
2021-12-03 23:40:24 -07:00
Mitchell Horne
0d2224733e Implement GET_STACK_USAGE on remaining archs
This definition enables callers to estimate remaining space on the
kstack, and take action on it. Notably, it enables optimizations in the
GEOM and netgraph subsystems to directly dispatch work items when there
is sufficient stack space, rather than queuing them for a worker thread.

Implement it for riscv, arm, and mips. Remove the #ifdefs, so it will
not go unimplemented elsewhere.

PR:		259157
Reviewed by:	mav, kib, markj (previous version)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32580
2021-11-30 11:15:56 -04:00
Mateusz Guzik
b74fdaaf1c geom_multipath: plug set-but-not-used vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-25 11:31:50 +00:00
Mateusz Guzik
cb2bfd3ecb geom_journal: plug set-but-not-unused vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-24 21:21:59 +00:00
Alexander Motin
06bd74e1e3 GEOM: Switch g_io_deliver() locking from cp to pp.
Single provider may have multiple consumers, and locking one of consumers
is not sufficient to protect the provider.  Though the only part of the
provider this locking protects now is its statistics.

Reported by:	Arka Sharma <arka.sw1988@gmail.com>
MFC after:	2 weeks
2021-11-21 18:50:59 -05:00
Wuyang Chung
9cb485d18f geom: Remove g_class.config
g_class.config is write only, remove it.
2021-11-18 23:17:07 -07:00
Konstantin Belousov
4fdc5b8494 g_vfs_close(): vp is unused
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-11-18 05:02:59 +02:00
Konstantin Belousov
c34a5148e8 ffs: fix newly introduced LOR between mntfs vnode lock and topology lock
The mntfs vnode lock should be before topology, as established in
ffs_mountfs().  Extend the locked region in ffs_unmount().

Reported and reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33013
2021-11-16 20:01:31 +02:00
Kirk McKusick
728f2c6131 Suppress UFS/FFS superblock check-hash failure messages when identifying
disk labels.

When the geom label subsystem is checking labels to discover if they
are UFS/FFS filesystems, do not print a kernel error message if a
superblock is found with a check-hash error. That issue is best
handled later if an attempt is made to actually use the filesystem.

Sponsored by: Netflix
2021-11-15 09:26:21 -08:00
Konstantin Belousov
8db7d16526 geom_vfs: lock devvp in g_vfs_close()
It is needed for g_vfs_close() invalidating the buffers.  We rely on the
vnode lock for correctness.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32761
2021-11-13 01:00:13 +02:00
Gordon Bergling
9d2e51884e gjournal(8): Fix a typo in a source code comment
- s/writting/writing/

MFC after:	3 days
2021-11-03 17:14:00 +01:00
Warner Losh
edfbbfd541 gpart: Move MBR efimedia reporting to a separate routine
Move the efimedia reporting to g_part_mbr_efimedia and use that from
g_part_mbr_dumpconf to report it.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D32781
2021-11-02 17:09:17 -06:00
Warner Losh
e3ab141fda gpart: Move GPT efimedia reporting to a separate routine
Move the efimedia reporting to g_part_gpt_efimedia and use that from
g_part_gpt_dumpconf to report it.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D32780
2021-11-02 17:09:17 -06:00
Mateusz Guzik
627d5d1966 geli: eli data -> eli_data for consistency with other geom classes
PR:	259392
Reported by:	dewayne@heuristicsystems.com.au
MFC after:	1 week
2021-10-31 20:36:51 +00:00
Jessica Clarke
63d24336fd Fix off-by-one error in msdosfs FAT32 volume label copying
I dropped the + 1 from the other two instances in each file but failed
to do so for this one, resulting in a more egregious buffer overread
than the one I was fixing (since the read character ended up in the
output if there was space).

Reported by:	Jenkins
Fixes:	34fb1c133c ("Fix intra-object buffer overread for labeled msdosfs volumes")
2021-10-28 01:01:00 +01:00
Jessica Clarke
34fb1c133c Fix intra-object buffer overread for labeled msdosfs volumes
Volume labels, like directory entries, are padded with spaces and so
have no NUL terminator. Whilst the MIN for the dsize argument to strlcpy
ensures that the copy does not overflow the destination, strlcpy is
defined to return the number of characters in the source string,
regardless of the provided dsize, and so keeps reading until it finds a
NUL, which likely exists somewhere within the following fields, but On
CHERI with the subobject bounds enabled in the compiler this buffer
overread will be detected and trap with a bounds violation.

Found by:	CHERI
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D32579
2021-10-27 18:38:37 +01:00
Mark Johnston
f0a08fa9f5 geom_label: Add more validation for NTFS volume tasting
- Ensure that the computed MFT record size isn't negative or larger than
  maxphys before trying to read $Volume.
- Guard against truncated records in volume metadata.
- Ensure that the record length is large enough to contain the volume
  name.
- Verify that the (UTF-16-encoded) volume name's length is a multiple of
  two.

PR:		258833, 258914
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2021-10-04 18:15:06 -04:00
Gleb Smirnoff
b984d153e0 Don't set GELI UMA zone as UMA_ZONE_NOFREE.
That fixes memory leak on last GELI provider destroyed, introduced
in 2dbc9a388e. This patch was originally developed late 2019 and
the flag was necessary to prevent zone drainage under memory pressure.
Today, with f09cbea31a the UMA is fixed not to drain into reserves.

Discussed with:	jtl, markj
Fixes:		2dbc9a388e
PR:		258787
2021-10-01 10:31:17 -07:00
Gleb Smirnoff
2dbc9a388e Fix memory deadlock when GELI partition is used for swap.
When we get low on memory, the VM system tries to free some by swapping
pages. However, if we are so low on free pages that GELI allocations block,
then the swapout operation cannot complete. This keeps the VM system from
being able to free enough memory so the allocation can complete.

To alleviate this, keep a UMA pool at the GELI layer which is used for data
buffer allocation in the fast path, and reserve some of that memory for swap
operations. If an IO operation is a swap, then use the reserved memory. If
the allocation still fails, return ENOMEM instead of blocking.

For non-swap allocations, change the default to using M_NOWAIT. In general,
this *should* be better, since it gives upper layers a signal of the memory
pressure and a chance to manage their failure strategy appropriately. However,
a user can set the kern.geom.eli.blocking_malloc sysctl/tunable to restore
the previous M_WAITOK strategy.

Submitted by:		jtl
Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D24400
2021-09-28 11:23:52 -07:00
Gleb Smirnoff
c6213beff4 Add flag BIO_SWAP to mark IOs that are associated with swap.
Submitted by:		jtl
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D24400
2021-09-28 11:23:51 -07:00
Mark Johnston
5402baa5b5 g_label: Handle small sector sizes when tasting
Make sure that the provider sector size is large enough to contain a
valid label before trying to read it.  We performed this check already
for most label types, but not for several filesystem labels.

Reported by:	syzbot+f52918174cdf193ae29c@syzkaller.appspotmail.com
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-09-07 11:19:29 -04:00
Mark Johnston
9e9ba9c73d graid: Avoid tasting devices with small sector sizes
The RAID metadata parsers effectively assume a sector size of 512 bytes
or larger, but md(4) devices can be created with a sector size that's
any power of 2.  Add some seatbelts to graid tasting routines to ensure
that the requested sector(s) are large enough for the device to
plausibly contain RAID metadata.

Reported by:	syzbot+f43583c9bf8357c8b56f@syzkaller.appspotmail.com
Reported by:	syzbot+537dd9f22b91b698e161@syzkaller.appspotmail.com
Reported by:	syzbot+51509dd48871c57c6e47@syzkaller.appspotmail.com
Reported by:	syzbot+c882a31037ea2a54ff63@syzkaller.appspotmail.com
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-08-31 17:09:52 -04:00
Gordon Bergling
5bdf58e196 Fix some common typos in source code comments
- s/priviledged/privileged/
- s/funtion/function/
- s/doens't/doesn't/
- s/sychronization/synchronization/

MFC after:	3 days
2021-08-28 18:57:23 +02:00
Mark Johnston
645b7efd49 geom_disk: Add KMSAN checks
- In g_disk_start(), verify that the data to be written is initialized
  according to KMSAN shadow state.
- In g_disk_done(), verify that the block driver updated shadow state as
  expected, so as to catch sources of false positives early.

Sponsored by:	The FreeBSD Foundation
2021-08-11 16:33:41 -04:00
Alexander Motin
c2da954203 geom(4): Mark all sysctls as CTLFLAG_MPSAFE.
This code does not use Giant lock for very long time.

MFC after:	2 weeks
2021-08-10 20:18:46 -04:00
John Baldwin
419d406e4e geom_vfs: Pre-allocate event for g_vfs_destroy.
When an active g_vfs is orphaned due to an underlying disk going away
the destroy is deferred until the filesystem is unmounted in
g_vfs_done().  However, g_vfs_done() is invoked from a non-sleepable
context and cannot use M_WAITOK to allocate the event.  Instead,
allocate the event in g_vfs_orphan() and save it in the softc to be
retrieved by the last call to g_vfs_done().

Reported by:	Jithesh Arakkan @ Chelsio
Reviewed by:	imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D31354
2021-07-29 17:09:23 -07:00
John Baldwin
5b5d78897c Use a more specific type for geom_disk.d_event.
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D31353
2021-07-29 16:34:46 -07:00
Warner Losh
47aeda7b70 geom_disk: use a preallocated geom_event for disk destruction.
Preallocate a geom_event (using the new geom_alloc_event) when we create
a disk. When we create the disk, we're going to be in a sleepable
context, so we can always allocate this extra bit of memory. Then use
this preallocated memory to free the disk. CAM can try to free the disk
from an unsleepable context if there was I/O outstanding when the disk
was destroyted (say because the SIM said it had gone away). The I/O
context isn't sleepable. Rather than trying to invent a retry mechanism
and making sure all the other geom_disk consumers did it properly,
preallocating the event ensure that the geom_disk will be properly torn
down, even when there's memory pressure when the disk departs.

Reviewd by:		jhb
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D30544
2021-07-23 18:08:52 -06:00
Warner Losh
380710a5c8 geom: create an API to allocate events, and use that storage to send them
g_alloc_event will allocate storage for an opaque event. g_post_event_ep
can use memory returned by g_alloc_event to send an event from a context
that might not be able to allocate the event. Occasionally, we can
alloate memory when we create an object, but not while we're destroy
it. This allows one to allocate at creation time memory to use when
destorying the object.

Reviewed by:		jhb
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D30544
2021-07-23 18:08:45 -06:00
Jessica Clarke
e77ef47d36 geom_label: Partially reinstate old sysinstall(8) workaround
This partially reverts commit af433832f7.
Since such bogus disklabels still exist in the wild, we now probe for a
disklabel to decide whether to ignore the UFS partition or not; if there
is a label then we use the old behaviour, and if there isn't one then we
use the new behaviour.

Reviewed by:	cy, mckusick
Differential Revision:	https://reviews.freebsd.org/D31068
2021-07-21 02:51:25 +01:00
Mark Johnston
0fcafe8516 eli: Zero pad bytes that arise when certain auth algorithms are used
When authentication is configured, GELI ensures that the amount of data
per sector is a multiple of 16 bytes.  This is done in
eli_metadata_softc().  When the digest size is not a multiple of 16
bytes, this leaves some extra pad bytes at the end of every sector, and
they were not being zeroed before being written to disk.  In particular,
this happens with the HMAC/SHA1, HMAC/RIPEMD160 and HMAC/SHA384 data
authentication algorithms.

This change ensures that they are zeroed before being written to disk.

Reported by:	KMSAN
Reviewed by:	delphij, asomers
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31170
2021-07-15 12:23:04 -04:00
Mark Johnston
39552dff7b graid3: Zero the metadata block before writing
Ensure that string buffers and pad bytes are zero-filled before writing
graid3 metadata.

Reported by:	KMSAN
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2021-07-13 17:46:02 -04:00
Mark Johnston
0f09ab89cc gconcat: Zero the metadata block before writing
Ensure that string buffers and pad bytes are zero-filled before writing
gconcat metadata.  Also make sure to zero the full block buffer before
encoding the metadata and writing.

Fix some style bugs in g_concat_write_metadata() while here.

Reported by:	KMSAN
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2021-07-13 17:45:59 -04:00
Mark Johnston
7f053a44ae gmirror: Zero the metadata block before writing
The mirror metadata fields contain string buffers and pad bytes, neither
were being zeroed before metadata was written to disk.  Also, the
metadata structure is smaller than the sector size, and in one case
gmirror was failing to zero-fill the full buffer before writing.

Fix these problems by pre-zeroing the metadata structure and the sector
buffer.

Reported by:	KMSAN
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2021-07-13 17:45:57 -04:00
Jessica Clarke
af433832f7 geom_label: Remove an old sysinstall(8) workaround
We removed sysinstall(8) back in 2011, so this workaround should be long
since unnecessary. This workaround can end up breaking cases that are
hit in the real world, such as dd'ing a small pre-built disk image to a
large partition that you intend to grow on first boot and uses a UFS
disk label for / in its /etc/fstab (as the only reliable thing a raw UFS
image can reference).

Reviewed by:	imp, mckusick
Differential Revision:	https://reviews.freebsd.org/D30825
2021-07-05 16:15:32 +01:00
Noah Bergbauer
d575e81fbc gconcat: Implement new online append feature
Implement the "gconcat append" command which can be used
to append a disk to the end of an existing gconcat device
without unmounting.

If the gconcat device is using the "automatic" method, i.e.,
stores metadata on the devices, new metadata is written
to all existing components, as well as to the newly added one.

Pull Request:	https://github.com/freebsd/freebsd-src/pull/472
Reviewed by:	imp@
2021-06-14 11:42:03 -06:00