There are a number of assumptions about legacy interrupts always
being available in virtio so don't allow back-ends to make the
decision to support them.
This fixes the issue seen with virtio-rnd on OpenBSD. MSI-x vectors
were not being used, and the virtio-rnd backend wasn't allocating a
legacy interrupt resulting in a bhyve assert and guest exit.
Reported by: Julian Hsiao, madoka at nyanisore dot net
Reviewed by: neel
MFC after: 1 week
I've missed that network driver sometimes returns taken request back to
available queue without processing. Add new helper function for that case.
Reported by: flo
MFC after: 2 weeks
Original virtqueue design allows queued and out-of-order processing, but
helpers added in r253440 suppose only direct blocking in-order one.
It could be fine for network, etc., but it is a huge limitation for storage
devices.
NetBSD's virtio-net implementation doesn't negotiate
the merged rx-buffers feature. To support this, check
to see if the feature was negotiated, and then adjust
the operation of the receive path accordingly by using
a larger iovec, and a smaller rx header.
In addition, ignore writes to the (read-only) status byte.
Tested with NetBSD/amd64 5.2.2, 6.1.4 and 7-beta.
Reviewed by: neel, tychon
Phabric: D745
MFC after: 3 days
the virtio backends.
- Add a new ioctl to export the count of pins on the I/O APIC from vmm
to the hypervisor.
- Use pins on the I/O APIC >= 16 for PCI interrupts leaving 0-15 for
ISA interrupts.
- Populate the MP Table with I/O interrupt entries for any PCI INTx
interrupts.
- Create a _PRT table under the PCI root bridge in ACPI to route any
PCI INTx interrupts appropriately.
- Track which INTx interrupts are in use per-slot so that functions
that share a slot attempt to distribute their INTx interrupts across
the four available pins.
- Implicitly mask INTx interrupts if either MSI or MSI-X is enabled
and when the INTx DIS bit is set in a function's PCI command register.
Either assert or deassert the associated I/O APIC pin when the
state of one of those conditions changes.
- Add INTx support to the virtio backends.
- Always advertise the MSI capability in the virtio backends.
Submitted by: neel (7)
Reviewed by: neel
MFC after: 2 weeks
Remove the VM name from some of the thread-naming calls
since it is now in the proc title.
Slightly modify the thread-naming for the net and block
threads.
This improves readability when using top/ps with the -a
and -H options on a system with a large number of bhyve VMs.
Requested by: Michael Dexter
Reviewed by: neel
MFC after: 4 weeks
- Allow a hostbridge to be created with AMD as a vendor.
This passes the OpenBSD check to allow the use of MSI
on a PCI bus.
- Enable the i/o interrupt section of the mptable, and
populate it with unity ISA mappings. This allows the
'legacy' IRQ mappings of the PCI serial port to be
set up. Delete unused print routine that was obscuring code.
- Use the '-W' option to enable virtio single-vector MSI
rather than an environment variable. Update the virtio
net/block drivers to query this flag when setting up
interrupts.: bhyverun.c
- Fix the arithmetic used to derive the century byte in
RTC CMOS, as well as encoding it in BCD.
Reviewed by: neel
MFC after: 3 days
users to set the MAC address for a device.
Clean up some obsolete code in pci_virtio_net.c
Allow an error return from a PCI device emulation's init routine
to be propagated all the way back to the top-level and result in
the process exiting.
Submitted by: Dinakar Medavaram dinnu sun at gmail (original version)
If this capability is negotiated by the guest then the device will
generate an interrupt when it runs out of available tx/rx descriptors.
Reviewed by: grehan
Obtained from: NetApp
drop any frames that arrive while the device is starved for receive buffers.
This makes the receive path to only execute in context of the receive thread
and allows for further simplification.
Reviewed by: grehan
than blocking the vCPU thread. This improves bulk data performance
by ~30-40% and doesn't harm req/resp time for stock netperf runs.
Future work will use a thread pool rather than a thread per tx queue.
Submitted by: Dinakar Medavaram
Reviewed by: neel, grehan
Obtained from: NetApp
command line option "-m <memsize in MB>" to specify the memory size.
Prior to this change the user needed to explicitly specify the amount of
memory allocated below 4G (-m <lowmem>) and the amount above 4G (-M <highmem>).
The "-M" option is no longer supported by 'bhyveload' and 'bhyve'.
The start of the PCI hole is fixed at 3GB and cannot be directly changed
using command line options. However it is still possible to change this in
special circumstances via the 'vm_set_lowmem_limit()' API provided by
libvmmapi.
Submitted by: Dinakar Medavaram (initial version)
Reviewed by: grehan
Obtained from: NetApp
This seems prudent to do in its own right but it also opens up the possibility
of not having to mmap the entire guest address space in the 'bhyve' process
context.
Discussed with: grehan
Obtained from: NetApp
devices are MSI-X capable. This in turn would lead it to treat bar 0 as
the MSI-X table bar even if the underlying device did not support MSI-X.
Fix this by providing an API to query the MSI-X table index of the emulated
device. If the underlying device does not support MSI-X then this API will
return -1.
Obtained from: NetApp
the default.
The current behavior of advertising a single MSI vector can be requested by
setting the environment variable "BHYVE_USE_MSI" to "true". The use of MSI
is not compliant with the virtio specification and will be eventually phased
out.
Submitted by: Gopakumar T
Obtained from: NetApp
bhyve is intended to be a generic hypervisor, and not FreeBSD-specific.
(renaming internal routines will come later)
Reviewed by: neel
Obtained from: NetApp
- New memory region interface. An RB tree holds the regions,
with a last-found per-vCPU cache to deal with the common case
of repeated guest accesses to MMIO registers in the same page.
- Support memory-mapped BARs in PCI emulation.
mem.c/h - memory region interface
instruction_emul.c/h - remove old region interface.
Use gpa from EPT exit to avoid a tablewalk to
determine operand address. Determine operand size
and use when calling through to region handler.
fbsdrun.c - call into region interface on paging
exit. Distinguish between instruction emul error
and region not found
pci_emul.c/h - implement new BAR callback api.
Split BAR alloc routine into routines that
require/don't require the BAR phys address.
ioapic.c
pci_passthru.c
pci_virtio_block.c
pci_virtio_net.c
pci_uart.c - update to new BAR callback i/f
Reviewed by: neel
Obtained from: NetApp
vmm.ko - kernel module for VT-x, VT-d and hypervisor control
bhyve - user-space sequencer and i/o emulation
vmmctl - dump of hypervisor register state
libvmm - front-end to vmm.ko chardev interface
bhyve was designed and implemented by Neel Natu.
Thanks to the following folk from NetApp who helped to make this available:
Joe CaraDonna
Peter Snyder
Jeff Heller
Sandeep Mann
Steve Miller
Brian Pawlowski