Modern virtio lays out the common configuration data in little endian,
but on powerpc64 BE the driver was reading and writing it in the wrong
byte order.
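
To illustrate the byte-order issue, here is a minimal sketch (not the
code from this change). The ex_* helper names are hypothetical. They
access little-endian fields byte by byte, so the result is the same on
little-endian and big-endian hosts.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not taken from the driver. */

/* Read a 16-bit little-endian field from raw bytes on any host. */
static uint16_t
ex_le16_load(const uint8_t *p)
{
	return ((uint16_t)(p[0] | ((uint16_t)p[1] << 8)));
}

/* Write a 16-bit value as little endian, least significant byte first. */
static void
ex_le16_store(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

/* Read a 32-bit little-endian field from raw bytes on any host. */
static uint32_t
ex_le32_load(const uint8_t *p)
{
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}
```

A native-endian load of the same bytes on a big-endian host would
return the bytes swapped, which is the class of bug described above.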
Submitted by: Leonardo Bianconi <leonardo.bianconi@eldorado.org.br>
Reviewed by: bryanv, alfredo
Sponsored by: Eldorado Research Institute (eldorado.org.br)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28947
The DRIVER_OK status is set after device_attach() succeeds. For now,
postpone disk_create to the attach_completed() method.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Reviewed by: grehan
Approved by: lwhsu (mentor)
Differential Revision: https://reviews.freebsd.org/D30049
For guests running under some kinds of VMMs, the configuration structure
is available in memory space but not in I/O space.
Reported by: Yuan Rui <number201724@me.com>
MFC after: 2 weeks
Reviewed by: rpokala, bryanv, jhb
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D28818
The MSI-X resource shouldn't be assumed to be always on BAR1.
The Virtio v1.1 spec does not require the MSI-X table and PBA to be in
BAR1 either.
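
As a rough sketch of why the BAR should be discovered rather than
assumed: the MSI-X capability's Table Offset/BIR register encodes the
BAR index in its low three bits and the 8-byte aligned offset in the
remaining bits. The ex_* names below are hypothetical and this is not
the code from this change.

```c
#include <stdint.h>

#define EX_MSIX_BIR_MASK	0x7u	/* bits 2:0 select the BAR */
#define EX_PCI_BAR0_REG		0x10u	/* config offset of BAR0 */

/* Hypothetical result type for illustration. */
struct ex_msix_loc {
	unsigned int	bar;		/* BAR index taken from the BIR field */
	unsigned int	bar_reg;	/* config-space offset of that BAR */
	uint32_t	offset;		/* table offset inside the BAR */
};

/* Split a Table Offset/BIR register value into its two fields. */
static struct ex_msix_loc
ex_msix_decode(uint32_t reg)
{
	struct ex_msix_loc loc;

	loc.bar = reg & EX_MSIX_BIR_MASK;
	loc.bar_reg = EX_PCI_BAR0_REG + 4 * loc.bar;
	loc.offset = reg & ~EX_MSIX_BIR_MASK;
	return (loc);
}
```

For example, a register value of 0x00002004 decodes to BAR4 at offset
0x2000, so hard-coding BAR1 would map the wrong resource.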
Reported by: Yuan Rui <number201724@me.com>
MFC after: 2 weeks
Reviewed by: bryanv, jhb
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D28817
The decision whether a TCP packet is sent over IPv4 or IPv6 was
based on the ethertype, which works correctly. In D27926 the criterion
was changed to checking whether the CSUM_IP_TSO flag is set in the
csum-flags and, if so, treating the packet as TCP/IPv4.
However, the TCP stack sets the flag to CSUM_TSO for IPv4 and IPv6,
where CSUM_TSO is defined as CSUM_IP_TSO|CSUM_IP6_TSO.
Therefore TCP/IPv6 packets get misclassified as TCP/IPv4,
which breaks TSO for TCP/IPv6.
This patch bases the check again on the ethertype.
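
The misclassification can be sketched as follows. This is a simplified
model, not the driver code: the EX_CSUM_* values are placeholders (the
real flags live in sys/mbuf.h) and the function names are hypothetical.
The GSO type values match the VirtIO network header definitions.

```c
#include <stdint.h>

/* Placeholder flag values for illustration only. */
#define EX_CSUM_IP_TSO		0x0010u
#define EX_CSUM_IP6_TSO		0x0400u
#define EX_CSUM_TSO		(EX_CSUM_IP_TSO | EX_CSUM_IP6_TSO)

#define EX_ETHERTYPE_IP		0x0800u
#define EX_ETHERTYPE_IPV6	0x86DDu

#define EX_GSO_NONE		0
#define EX_GSO_TCPV4		1
#define EX_GSO_TCPV6		4

/* Flag-based check: wrong when both TSO bits are set on every packet. */
static int
ex_gso_by_flags(uint32_t csum_flags)
{
	return ((csum_flags & EX_CSUM_IP_TSO) != 0 ?
	    EX_GSO_TCPV4 : EX_GSO_TCPV6);
}

/* Ethertype-based check: independent of how the stack sets the flags. */
static int
ex_gso_by_ethertype(uint16_t etype)
{
	switch (etype) {
	case EX_ETHERTYPE_IP:
		return (EX_GSO_TCPV4);
	case EX_ETHERTYPE_IPV6:
		return (EX_GSO_TCPV6);
	default:
		return (EX_GSO_NONE);
	}
}
```

With the combined flag, the flag-based check reports TCPv4 even for an
IPv6 packet, while the ethertype check does not.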
This fix will be MFCed instantly, as discussed with re(gjb).
MFC after: instantly
PR: 254366
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D29331
Definitions inside usr.sbin/bhyve/virtio.h are thrown away.
Definitions in sys/dev/virtio are used instead.
This reduces code duplication.
Sponsored by: The FreeBSD Foundation
Reviewed by: grehan
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D29084
This was blindly moved in r360722 but the variable being printed is not
yet initialised. It's of little use and can easily be added back in the
right place if needed by someone debugging, so just delete it.
Reported by: bryanv
Rather than have every device register itself for both virtio_pci and
virtio_mmio, provide a VIRTIO_DRIVER_MODULE wrapper to declare both,
merge VIRTIO_SIMPLE_PNPTABLE with VIRTIO_SIMPLE_PNPINFO and make the
latter register for both buses. This also has the benefit of abstracting
away the available transports and their names.
Reviewed by: bryanv
Differential Revision: https://reviews.freebsd.org/D28073
We must check MagicValue not just Version before anything else, and then
we must check DeviceID and immediately abort if zero (and this must not
be an error).
Do all this when probing rather than at the start of attaching as that's
where this belongs, and doing so provides a clear boundary between the device
detection and device initialisation parts of the specified driver
initialisation process. This also means we don't create empty device
instances for placeholder devices, reducing clutter on systems that
pre-allocate a large number, such as QEMU's AArch64 virt machine (which
provides 32).
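
The probe ordering described above can be sketched like this. It is a
minimal model, not the driver code: the struct and function names are
hypothetical, and the register values are passed in rather than read
from the device. The magic value and the meaning of a zero DeviceID
come from the VirtIO MMIO specification.

```c
#include <stdint.h>

#define EX_MMIO_MAGIC	0x74726976u	/* "virt" as a little-endian word */

enum ex_probe {
	EX_PROBE_NOT_VIRTIO,	/* MagicValue mismatch */
	EX_PROBE_BAD_VERSION,	/* unknown Version */
	EX_PROBE_PLACEHOLDER,	/* DeviceID 0: no device, but not an error */
	EX_PROBE_OK
};

/* Hypothetical snapshot of the first three MMIO registers. */
struct ex_mmio_regs {
	uint32_t	magic;
	uint32_t	version;
	uint32_t	device_id;
};

/* Check MagicValue first, then Version, then DeviceID. */
static enum ex_probe
ex_mmio_probe(const struct ex_mmio_regs *r)
{
	if (r->magic != EX_MMIO_MAGIC)
		return (EX_PROBE_NOT_VIRTIO);
	if (r->version != 1 && r->version != 2)
		return (EX_PROBE_BAD_VERSION);
	if (r->device_id == 0)
		return (EX_PROBE_PLACEHOLDER);
	return (EX_PROBE_OK);
}
```

A caller would decline to attach on EX_PROBE_PLACEHOLDER without
logging an error, so no empty device instance is created.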
Reviewed by: bryanv
Differential Revision: https://reviews.freebsd.org/D28070
Try to standardize how drivers negotiate features and the
function names.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27930
- Add and fix a few error path counters
- Improve sysctl descriptions
- Use flags consistently to determine IPv4 vs IPv6
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27926
Prior to V1, the driver would enable interrupts and then notify the
host by setting DRIVER_OK. Since V1 requires DRIVER_OK to be set before
notifying the virtqueues, there may be items in the queues waiting
to be processed by the time interrupts are enabled.
This fixes a bug where the Rx queue would appear stuck, only being
usable after an interface down/up cycle.
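
One way to picture the fix is the usual "enable, then re-check" loop.
The sketch below uses a toy queue model with hypothetical names; it is
not the driver code and only illustrates why work that arrived before
interrupts were enabled must be drained explicitly.

```c
#include <stdint.h>

/* Toy virtqueue model for illustration only. */
struct ex_vq {
	uint16_t	used_idx;	/* entries completed by the host */
	uint16_t	last_seen;	/* entries consumed by the driver */
	int		intr_enabled;
};

/* Enable interrupts; return nonzero if entries are already pending. */
static int
ex_vq_enable_intr(struct ex_vq *vq)
{
	vq->intr_enabled = 1;
	return (vq->used_idx != vq->last_seen);
}

/* Consume every pending entry and return how many were processed. */
static unsigned int
ex_vq_drain(struct ex_vq *vq)
{
	unsigned int n = 0;

	while (vq->last_seen != vq->used_idx) {
		vq->last_seen++;
		n++;
	}
	return (n);
}

/* Keep draining until interrupts are enabled with nothing pending. */
static unsigned int
ex_vq_start(struct ex_vq *vq)
{
	unsigned int n = 0;

	while (ex_vq_enable_intr(vq) != 0) {
		vq->intr_enabled = 0;
		n += ex_vq_drain(vq);
	}
	return (n);
}
```

Without the re-check, entries queued before the enable would sit
unprocessed until some later event, which matches the stuck Rx queue
symptom.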
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27922
For multiqueue, we may use fewer than the provided maximum number of
queues. Try to limit allocations of the unused queues: no interrupts,
no indirect descriptors, and no taskqueues.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27921
Verify the max_virtqueue_pairs is within the range allowed by
the spec.
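
A minimal sketch of such a range check, assuming the limits given in
the VirtIO specification (1 to 0x8000 queue pairs). The macro and
function names are hypothetical, not the ones used in the driver.

```c
#include <stdint.h>

#define EX_MQ_PAIRS_MIN	1
#define EX_MQ_PAIRS_MAX	0x8000

/* Return nonzero if the device-provided value is within the spec range. */
static int
ex_max_pairs_valid(uint16_t max_virtqueue_pairs)
{
	return (max_virtqueue_pairs >= EX_MQ_PAIRS_MIN &&
	    max_virtqueue_pairs <= EX_MQ_PAIRS_MAX);
}
```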
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27920
This is useful when running on hosts that support checksum offloading
but not the GUEST_TSO (LRO) feature. It may also provide some GRO-like
support when doing forwarding.
Only enable SW LRO when the host LRO is not available, since enabling
both tends to be harmful and is difficult to enable/disable selectively
with only a single IFCAP_LRO flag.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27919
This allows the Rx checksum and LRO to be modified without a full
reinit of the device.
Remove IFCAP_RXCSUM_IPV6 from the interface capabilities since in
VirtIO Rx checksums are just enabled or disabled for all protocols.
Properly update IFCAP_LRO if LRO becomes disabled when Rx
checksums are disabled.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27916
In modern VirtIO, the virtqueues cannot be notified before setting
DRIVER_OK status.
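
The ordering rule can be sketched with a small guard. The status bit
values are the ones defined by the VirtIO specification; the device
model and function names are hypothetical and are not the driver code.

```c
#include <stdint.h>

/* Device status bits from the VirtIO specification. */
#define EX_STATUS_ACK		0x01
#define EX_STATUS_DRIVER	0x02
#define EX_STATUS_DRIVER_OK	0x04
#define EX_STATUS_FEATURES_OK	0x08

/* Toy device model for illustration only. */
struct ex_dev {
	uint8_t		status;
	unsigned int	notifies;
};

/* Mark the driver as ready; notifications are allowed afterwards. */
static void
ex_set_driver_ok(struct ex_dev *d)
{
	d->status |= EX_STATUS_DRIVER_OK;
}

/* Refuse to notify a virtqueue until DRIVER_OK has been set. */
static int
ex_notify(struct ex_dev *d)
{
	if ((d->status & EX_STATUS_DRIVER_OK) == 0)
		return (-1);
	d->notifies++;
	return (0);
}
```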
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27932
Defer the ether_ifattach until the interface capabilities
are configured
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27913
This improves spec compliance because the driver is not supposed
to notify the device prior to setting the DRIVER_OK status, which
could happen with VIRTIO_NET_F_CTRL_MAC_ADDR.
The VIRTIO_NET_F_MAC feature should always be negotiated, so this
would be a rare situation.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27910
This may have been required in an early, early, early version of the
specification but I cannot find any reference to it, and a promiscuous
default seems very odd so remove this code.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27909
This feature lets the guest driver know the speed and duplex of
the "link". Instead of trying to support many media types based
on the possible/likely speeds/duplexes, only use the speed to
set the interface baudrate.
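
A rough sketch of the speed-to-baudrate conversion, assuming the
specification's encoding: the speed field is in units of 1 Mbit/s,
valid values are 0 to 0x7fffffff, and 0xffffffff means unknown. The
function name and the fallback parameter are hypothetical.

```c
#include <stdint.h>

#define EX_SPEED_MAX		0x7fffffffu
#define EX_SPEED_UNKNOWN	0xffffffffu

/*
 * Convert the device-reported speed (Mbit/s) to bits per second.
 * Out-of-range or unknown speeds yield the caller's fallback value.
 */
static uint64_t
ex_speed_to_baudrate(uint32_t speed_mbps, uint64_t fallback)
{
	if (speed_mbps == EX_SPEED_UNKNOWN || speed_mbps > EX_SPEED_MAX)
		return (fallback);
	return ((uint64_t)speed_mbps * 1000000u);
}
```

The 64-bit multiplication matters here: a 10 Gbit/s link does not fit
in 32 bits once expressed in bits per second.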
Cleanup ifmedia code to match other drivers.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27908
This feature lets the guest driver know the maximum MTU size
supported by the host device. If set, use this to limit the
acceptable MTUs, and improve how the receive mbuf cluster size
is then selected.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27907
- Fix the NEEDS_CSUM and DATA_VALID checksum flags. The NEEDS_CSUM
checksum is incomplete (partial) so offer a fallback for the driver
to calculate the checksum. Simplify DATA_VALID because we know
the host has validated the checksum.
- Default to 4K mbuf clusters for mergeable buffers. May need to
scale this down to 2K clusters in certain configurations, such
as many queue pairs, big queues (like 4096 in GCP), and low memory.
- Use the MTU when calculating the receive mbuf cluster size
when not doing TSO/LRO. This will need more adjustment once
the MTU feature is supported.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27906
Very basic support to get packets flowing on modern QEMU, but
several conformance issues still remain that will be addressed in
later commits.
First of many passes at cleaning up various accumulated cruft.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27904
Rework the header file changes from 2cc8a52 to use our
canonical upstream, Linux.
geom_disk already checks DISKFLAG_CANDELETE for BIO_DELETE
so remove an unnecessary check.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27902
This only supports the legacy virtqueue format that is now called
"Split Virtqueues". Support for the new "Packed Virtqueues" described
in v1.1 is left for a later date.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27857
Use the existing legacy PCI driver as the basis for shared code
between the legacy and modern PCI drivers. The existing virtio_pci
kernel module will contain both the legacy and modern drivers.
Changes to the virtqueue and each device driver (network, block, etc)
for V1 support come in later commits.
Update the MMIO driver to reflect the VirtIO bus method changes, but
the modern compliance can be improved on later.
Note that the modern PCI driver requires bus_map_resource() to be
implemented, which is not the case on all archs.
The hw.virtio.pci.transitional tunable default value is zero so
transitional devices will continue to be driven via the legacy
driver.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27856
And subsequent fix 576b099a.
By adding the mergeable header to the vtnet_rx_header structure, the size
was increased by 2 bytes, breaking the alignment of this structure as
described in the preceding comments.
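
The size change can be illustrated with the two header layouts from the
VirtIO specification. The struct names are hypothetical, and the 4-byte
pad and the alignment check are assumptions made for this sketch; they
are not copied from the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Basic network header layout per the VirtIO specification: 10 bytes. */
struct ex_net_hdr {
	uint8_t		flags;
	uint8_t		gso_type;
	uint16_t	hdr_len;
	uint16_t	gso_size;
	uint16_t	csum_start;
	uint16_t	csum_offset;
};

/* Mergeable variant: the same header plus num_buffers, 12 bytes. */
struct ex_net_hdr_mrg {
	struct ex_net_hdr	hdr;
	uint16_t		num_buffers;
};

#define EX_ETHER_HDR_LEN	14
#define EX_RX_PAD		4	/* assumed pad, for illustration */

/* Is the payload after header, pad and Ethernet header 4-byte aligned? */
static int
ex_payload_aligned(size_t hdr_size)
{
	return ((hdr_size + EX_RX_PAD + EX_ETHER_HDR_LEN) % 4 == 0);
}
```

Under these assumptions the 10-byte header keeps the payload aligned
(10 + 4 + 14 = 28), while the 12-byte header does not (30).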
Furthermore, the mergeable header does not belong in the structure. With
the mergeable feature, the header is placed in line with the data, so
there is no need for a separate segment, and it is misleading to follow
the mergeable header with any padding.
The V1 header is effectively identical to the mergeable header, and the
driver has long supported the mergeable feature. Revert this so the later
changes that add V1 support can show how V1 is derived from the existing
mergeable buffers support, and to facilitate a later MFC.
Reviewed by: grehan (mentor)
Differential Revision: https://reviews.freebsd.org/D27855
This caused us to write to the low half of the feature word twice, once with
the high bits and once with the low bits. Common legacy device implementations
seem to be fairly lenient about being able to write to the feature bits
multiple times, but Arm's models use a stricter implementation that will ignore
the second write. This fixes using vtnet(4) on those models.
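
The failure mode can be modelled with a toy device that latches each
32-bit feature word once and ignores later writes to it. All names
below are hypothetical, and the register model is a simplification for
illustration rather than the actual transport code.

```c
#include <stdint.h>

/* Toy strict device: each feature word accepts only its first write. */
struct ex_dev {
	uint32_t	sel;		/* which 32-bit word is selected */
	uint32_t	features[2];	/* [0] low word, [1] high word */
	int		written[2];
};

static void
ex_write_sel(struct ex_dev *d, uint32_t sel)
{
	d->sel = sel & 1;
}

static void
ex_write_features(struct ex_dev *d, uint32_t val)
{
	if (d->written[d->sel])
		return;		/* strict model: ignore repeated writes */
	d->features[d->sel] = val;
	d->written[d->sel] = 1;
}

/* Buggy sequence: the selector never moves, so both writes hit word 0. */
static void
ex_set_features_buggy(struct ex_dev *d, uint64_t features)
{
	ex_write_sel(d, 0);
	ex_write_features(d, (uint32_t)(features >> 32));
	ex_write_features(d, (uint32_t)features);
}

/* Fixed sequence: select the high word, write it, then the low word. */
static void
ex_set_features_fixed(struct ex_dev *d, uint64_t features)
{
	ex_write_sel(d, 1);
	ex_write_features(d, (uint32_t)(features >> 32));
	ex_write_sel(d, 0);
	ex_write_features(d, (uint32_t)features);
}
```

A lenient device that accepts the second write would end up with the
correct low word, which is why the bug went unnoticed on common
implementations.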
Reported by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Pointy hat: jrtc27