Expect that drivers call into the network stack with the net epoch
entered. This has already been the fact since early 2020. The net
interrupts, that are marked with INTR_TYPE_NET, were entering epoch
since 511d1afb6b. For the taskqueues there is NET_TASK_INIT() and
all drivers that were known back in 2020 we marked with it in
6c3e93cb5a. However in e87c494015 we took conservative approach
and preferred to opt-in rather than opt-out for the epoch.
This change not only reverts e87c494015 but adds a safety belt to
avoid panicing with INVARIANTS if there is a missed driver. With
INVARIANTS we will run in_epoch() check, print a warning and enter
the net epoch. A driver that prints can be quickly fixed with the
IFF_NEEDSEPOCH flag, but better be augmented to properly enter the
epoch itself.
Note on TCP LRO: it is a backdoor to enter the TCP stack bypassing
some layers of net stack, ignoring either old IFF_KNOWSEPOCH or the
new IFF_NEEDSEPOCH. But the tcp_lro_flush_all() asserts the presence
of network epoch. Indeed, all NIC drivers that support LRO already
provide the epoch, either with help of INTR_TYPE_NET or just running
NET_EPOCH_ENTER() in their code.
Reviewed by: zlei, gallatin, erj
Differential Revision: https://reviews.freebsd.org/D39510
The code paths while dumping do not got through busdma. As such,
safeguard against calling bus_dmamap_sync() with a NULL map. The x86
implementation tolerates this but others do not, resulting in a
NULL dereference panic when dumping to a vtblk device on arm64, riscv,
etc.
Fixes: 782105f7c8 ("vtblk: Use busdma")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37990
Virtio operates with physical addresses, while busdma is designed to
map these to produce bus addresses. On most supported platforms,
these two are interchangeable; on powerpc platforms, they are not.
When on powerpc, set an IOMMU of NULL, which causes the powerpc busdma
code to bypass the iommu mapping; this leaves us with the physical
buffer addresses which the virtio host expects to see.
Tested by: alfredo
Fixes: 782105f7c8 ("vtblk: Use busdma")
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D37891
Now that vtblk uses busdma, it keeps important information inside its
request structures. The functions used for kernel dumps synthesize
their own request structures rather than using structures initialized
with the necessary bits for busdma.
Add busdma-bypass paths. Since dumping writes contiguous blocks of
physical memory, vtblk doesn't need busdma in that case.
Reported by: glebius
Tested by: glebius
Differential Revision: https://reviews.freebsd.org/D37243
We assume that bus addresses from busdma are the same thing as
"physical" addresses in the Virtio specification; this seems to
be true for the platforms on which Virtio is currently supported.
For block devices which are limited to a single data segment per
I/O, we request PAGE_SIZE alignment from busdma; this is necessary
in order to support unaligned I/Os from userland, which may cross
a boundary between two non-physically-contiguous pages. On devices
which support more data segments per I/O, we retain the existing
behaviour of limiting I/Os to (max data segs - 1) * PAGE_SIZE.
Reviewed by: bryanv
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D36667
Most virtio_blk requests are launched from vtblk_startio; prior to this
commit, if vtblk_request_execute failed (e.g. due to a lack of space on
the virtio queue) vtblk_startio would requeue the request to be
reattempted later.
Add a flag "vbr_requeue_on_error" to requests and perform the requeuing
from inside vtblk_request_execute instead.
No functional change intended.
Reviewed by: bryanv, imp
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D36665
The error, if any, now gets stashed in the request structure. (Step 1
of reworking this driver to use busdma.)
No functional change intended.
Reviewed by: bryanv, imp
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D36664
The Virtio MMIO bus driver was added in 2014 with support for devices
exposed via FDT; in 2018 support was added to discover Virtio MMIO
devices via ACPI tables, as in QEMU. The Firecracker VMM eschews both
FDT and ACPI, instead presenting device information via kernel command
line arguments of the form virtio_mmio.device=<parameters>.
These command line parameters get converted into kernel environment
variables; this adds support for parsing those variables and attaching
virtio_mmio children to nexus.
There is a case to be made that it would be cleaner to have a new
"cmdlinebus" attached to nexus and virtio_mmio children attached to
that. A future commit might do that.
Discussed with: imp, jrtc27
Sponsored by: https://patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D36189
Implement the virtio_bus_finalize_features method so the FEATURES_OK
status bit is set. Implement the virtio_bus_config_generation method
to ensure larger than 4-byte reads are consistent.
Reviewed by: cperciva
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D36150
The physical address argument is essentially ignored by every dumper
method. In addition, the dump routines don't actually pass a real
address; every call to dump_append() passes a value of zero for
physical.
Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D35173
Disable software LRO during kernel dumping, because having it enabled
requires to be in a network epoch, which might or might not be the
case depending on the code path resulting in the panic.
Reviewed by: markj
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D34787
FreeBSD 13+ running as virtual guest may load virtio_random(8) driver
by means of devd(8) unless the driver is blacklisted or disabled
via device.hints(5). Currently, the driver may prevent
the system from rebooting or shutting down correctly.
This change deactivates virtio_random at very late stage
during system shutdown sequence to avoid deadlock
that results in kernel hang.
PR: 253175
Tested by: tom
MFC after: 3 days
Remove page zeroing code from consumers and stop specifying
VM_ALLOC_NOOBJ. In a few places, also convert an allocation loop to
simply use VM_ALLOC_WAITOK.
Similarly, convert vm_page_alloc_domain() callers.
Note that callers are now responsible for assigning the pindex.
Reviewed by: alc, hselasky, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31986
No functional change intended, but noticed that we could add const here
while adding linuxkpi support for virtio.
Reviewed By: bryanv, imp
Differential Revision: https://reviews.freebsd.org/D32370
This allows one to be able to map the sglist entries passed into the
vq_ring_enqueue_segments() function to the segment indexes used in
the virtqueue.
This function normally gets inlined and may not available via FBT
probes.
Differential Revision: https://reviews.freebsd.org/D31620
Some simulators don't implement arbitrary sized memory access to the
virtio PCI registers. Follow Linux and use single byte accesses to read
and write to these registers.
Reviewed by: bryanv, emaste (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31424
ALTQ only works on network drivers which use if_start (rather than
if_transmit). vtnet uses if_start if built with VTNET_LEGACY_TX. Default
to that the kernel is built with ALTQ enabled, to reduce user surprise.
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Now that the upper layers all go through a layer to tie into these
information functions that translates an sbuf into char * and len. The
current interface suffers issues of what to do in cases of truncation,
etc. Instead, migrate all these functions to using struct sbuf and these
issues go away. The caller is also in charge of any memory allocation
and/or expansion that's needed during this process.
Create a bus_generic_child_{pnpinfo,location} and make it default. It
just returns success. This is for those busses that have no information
for these items. Migrate the now-empty routines to using this as
appropriate.
Document these new interfaces with man pages, and oversight from before.
Reviewed by: jhb, bcr
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29937
Virtio modern has the common data organized in little endian, but
on powerpc64 BE it was reading and writing in the wrong endian.
Submitted by: Leonardo Bianconi <leonardo.bianconi@eldorado.org.br>
Reviewed by: bryanv, alfredo
Sponsored by: Eldorado Research Institute (eldorado.org.br)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28947
DRIVER_OK status is set after device_attach() succeeds. For now postpone
disk_create to attach_completed() method.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Reviewed by: grehan
Approved by: lwhsu (mentor)
Differential Revision: https://reviews.freebsd.org/D30049
For guests running under some kind of VMMs, configuration structure is
available in memory space but not I/O space.
Reported by: Yuan Rui <number201724@me.com>
MFC after: 2 weeks
Reviewed by: rpokala, bryanv, jhb
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D28818
The MSI-X resource shouldn't be assumed to be always on BAR1.
The Virtio v1.1 Spec did not specify that MSI-X table and PBA BAR has to
be BAR1 either.
Reported by: Yuan Rui <number201724@me.com>
MFC after: 2 weeks
Reviewed by: bryanv, jhb
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D28817
The decision whether a TCP packet is sent over IPv4 or IPv6 was
based on ethertype, which works correctly. In D27926 the criteria
was changed to checking if the CSUM_IP_TSO flag is set in the
csum-flags and then considering it to be TCP/IPv4.
However, the TCP stack sets the flag to CSUM_TSO for IPv4 and IPv6,
where CSUM_TSO is defined as CSUM_IP_TSO|CSUM_IP6_TSO.
Therefore TCP/IPv6 packets gets mis-classified as TCP/IPv4,
which breaks TSO for TCP/IPv6.
This patch bases the check again on the ethertype.
This fix will be MFC instantly as discussed with re(gjb).
MFC after: instantly
PR: 254366
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D29331
Definitions inside usr.sbin/bhyve/virtio.h are thrown away.
Definitions in sys/dev/virtio are used instead.
This reduces code duplication.
Sponsored by: The FreeBSD Foundation
Reviewed by: grehan
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D29084
This was blindly moved in r360722 but the variable being printed is not
yet initialised. It's of little use and can easily be added back in the
right place if needed by someone debugging, so just delete it.
Reported by: bryanv
Rather than have every device register itself for both virtio_pci and
virtio_mmio, provide a VIRTIO_DRIVER_MODULE wrapper to declare both,
merge VIRTIO_SIMPLE_PNPTABLE with VIRTIO_SIMPLE_PNPINFO and make the
latter register for both buses. This also has the benefit of abstracting
away the available transports and their names.
Reviewed by: bryanv
Differential Revision: https://reviews.freebsd.org/D28073
We must check MagicValue not just Version before anything else, and then
we must check DeviceID and immediately abort if zero (and this must not
be an error).
Do all this when probing rather than at the start of attaching as that's
where this belongs, and provides a clear boundary between the device
detection and device initialisation parts of the specified driver
initialisation process. This also means we don't create empty device
instances for placeholder devices, reducing clutter on systems that
pre-allocate a large number, such as QEMU's AArch64 virt machine (which
provides 32).
Reviewed by: bryanv
Differential Revision: https://reviews.freebsd.org/D28070
Try to standardize how drivers negotiate feature and the
function names
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27930