We want to allow the UEFI firmware to enumerate and assign
addresses to PCI devices so we can boot from NVMe[1]. Address
assignment of PCI BARs is properly handled by the PCI emulation
code in general, but a few specific cases need additional support.
fbuf and passthru map additional objects into the guest physical
address space and so need to handle address updates. Here we add a
callback to emulated PCI devices to inform them of a BAR
configuration change. fbuf and passthru then watch for these BAR
changes and relocate the frame buffer memory segment and passthru
device mmio area respectively.
We also add new VM_MUNMAP_MEMSEG and VM_UNMAP_PPTDEV_MMIO ioctls
to vmm(4) to facilitate the unmapping needed for addres updates.
[1]: https://github.com/freebsd/uefi-edk2/pull/9/
Originally by: scottph
MFC After: 1 week
Sponsored by: Intel Corporation
Reviewed by: grehan
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D24066
Replace the existing ad-hoc configuration via various global variables
with a small database of key-value pairs. The database supports
heirarchical keys using a MIB-like syntax to name the path to a given
key. Values are always stored as strings. The API used to manage
configuation values does include wrappers to handling boolean values.
Other values use non-string types require parsing by consumers.
The configuration values are stored in a tree using nvlists. Leaf
nodes hold string values. Configuration values are permitted to
reference other configuration values using '%(name)'. This permits
constructing template configurations.
All existing command line arguments now set configuration values. For
devices, the "-s" option parses its option argument to generate a list
of key-value pairs for the given device.
A new '-o' command line option permits setting an individual
configuration variable. The key name is always given as a full path
of dot-separated components.
A new '-k' command line option parses a simple configuration file.
This configuration file holds a flat list of 'key=value' lines where
the 'key' is the full path of a configuration variable. Lines
starting with a '#' are comments.
In general, bhyve starts by parsing command line options in sequence
and applying those settings to configuration values. Once this is
complete, bhyve then begins initializing its state based on the
configuration values. This means that subsequent configuration
options or files may override or supplement previously given settings.
A special 'config.dump' configuration value can be set to true to help
debug configuration issues. When this value is set, bhyve will print
out the configuration variables as a flat list of 'key=value' lines.
Most command line argments map to a single configuration variable,
e.g. '-w' sets the 'x86.strictmsr' value to false. A few command
line arguments have less obvious effects:
- Multiple '-p' options append their values (as a comma-seperated
list) to "vcpu.N.cpuset" values (where N is a decimal vcpu number).
- For '-s' options, a pci.<bus>.<slot>.<function> node is created.
The first argument to '-s' (the device type) is used as the value of
a "device" variable. Additional comma-separated arguments are then
parsed into 'key=value' pairs and used to set additional variables
under the device node. A PCI device emulation driver can provide
its own hook to override the parsing of the additonal '-s' arguments
after the device type.
After the configuration phase as completed, the init_pci hook
then walks the "pci.<bus>.<slot>.<func>" nodes. It uses the
"device" value to find the device model to use. The device
model's init routine is passed a reference to its nvlist node
in the configuration tree which it can query for specific
variables.
The result is that a lot of the string parsing is removed from
the device models and centralized. In addition, adding a new
variable just requires teaching the model to look for the new
variable.
- For '-l' options, a similar model is used where the string is
parsed into values that are later read during initialization.
One key note here is that the serial ports use the commonly
used lowercase names from existing documentation and examples
(e.g. "lpc.com1") instead of the uppercase names previously
used internally in bhyve.
Reviewed by: grehan
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D26035
Definitions inside usr.sbin/bhyve/virtio.h are thrown away.
Definitions in sys/dev/virtio are used instead.
This reduces code duplication.
Sponsored by: The FreeBSD Foundation
Reviewed by: grehan
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D29084
The save/restore feature uses a unix domain socket to send messages
from bhyvectl(8) to a bhyve(8) process. A datagram socket will suffice
for this.
An added benefit of using a datagram socket is simplified code. For
bhyve, the listen/accept calls are dropped; and for bhyvectl, the
connect() call is dropped.
EPRINTLN handles raw mode for bhyve(8), use it to print error messages.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D28983
MAX_SNAPSHOT_VMNAME is a macro used to set the size of a character
buffer that stores a filename or the path to a file - this file is used
by the save/restore feature.
Since the file doesn't have anything to do with a vm name, rename
MAX_SNAPSHOT_VMNAME to MAX_SNAPSHOT_FILENAME. Bump the size to PATH_MAX
while here.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D28879
Generalize the naming here since the domain socket that uses these codes
might be used for purposes other than the save/restore feature.
- rename checkpoint_opcodes to ipc_opcode
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D28877
Add /var/run/bhyve/ to BSD.var.dist so we don't have to call mkdir when
creating the unix domain socket for a given bhyve vm.
The path to the unix domain socket for a bhyve vm will now be
/var/run/bhyve/vmname instead of /var/run/bhyve/checkpoint/vmname
Move BHYVE_RUN_DIR from snapshot.c to snapshot.h so it can be shared
to bhyvectl(8).
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D28783
struct checkpoint_op, enum checkpoint_opcodes, and
MAX_SNAPSHOT_VMNAME are not vmm specific, move them out of the vmmapi
header.
They are used for the save/restore functionality that bhyve(8)
provides and are better suited in usr.sbin/bhyve/snapshot.h
Since bhyvectl(8) requires these, the Makefile for bhyvectl has been
modified to include usr.sbin/bhyve/snapshot.h
Reviewed by: kevans, grehan
Differential Revision: https://reviews.freebsd.org/D28410
Now that bhyve(8) supports UART, bvmconsole and bvmdebug are no longer needed.
This also removes the '-b' and '-g' flag from bhyve(8). These two flags were
marked deprecated in r368519.
Reviewed by: grehan, kevans
Approved by: kevans (mentor)
Differential Revision: https://reviews.freebsd.org/D27490
- support VNC version 3.3 (macos "Screen Sharing" builtin client)
- wait until client has requested an update prior to sending framebuffer data
- don't send an update if no framebuffer updates detected
- increase framebuffer poll frequency to 30Hz, and double that when
kbd/mouse input detected
- zero uninitialized array elements in rfb_send_server_init_msg()
- fix overly large allocation in rfb_init()
- use atomics for flags shared between input and output threads
- use #defines for constants
This work was contributed by Marko Kiiskila, with reuse of some earlier
work by Henrik Gulbrandsen.
Clients tested :
FreeBSD-current
- tightvnc
- tigervnc
- krdc
- vinagre
Linux (Ubuntu)
- krdc
- vinagre
- tigervnc
- xtightvncviewer
- remmina
MacOS
- VNC Viewer
- TigerVNC
- Screen Sharing (builtin client)
Windows 10
- Tiger VNC
- VNC Viewer (cursor lag)
- UltraVNC (cursor lag)
o/s independent
- noVNC (browser) using websockify relay
PR: 250795
Submitted by: Marko Kiiskila <marko@apache.org>
Reviewed by: jhb (bhyve)
MFC after: 3 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D27605
Enforce the requirement that the RX callback cannot be called after a reset until the features have been negotiated.
This fixes a race condition where the receive callback is called during a device reset.
Reviewed by: vmaffione, grehan
Approved by: vmaffione (mentor)
Sponsored by: vstack.com
Differential Revision: https://reviews.freebsd.org/D27381
Now that bhyve(8) supports UART, bvmconsole and bvmdebug are no longer needed.
Mark the '-b' and '-g' flag as deprecated for bhyve(8).
These will be removed in 13.
Reviewed by: jhb, grehan
Approved by: kevans (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D27519
This uses the same snapshot routine as other VirtIO devices.
Submitted by: Vitaliy Gusev <gusev.vitaliy@gmail.com>
Differential Revision: https://reviews.freebsd.org/D26265
Permit suspend/resume of a XHCI device model that has not been
attached to by a driver in a guest OS.
Submitted by: Vitaliy Gusev <gusev.vitaliy@gmail.com>
Differential Revision: https://reviews.freebsd.org/D26264
This fixes the amount of memory displayed in the EDK2 UiApp to be the same
as passed on the bhyve command line. Otherwise, 8GB is displayed as 4GB,
32GB as 28GB etc.
Reviewed by: jhb, kib, rgrimes
Differential Revision: https://reviews.freebsd.org/D27348
Fix a couple of style issues introduced in my previous commit.
Add a comment explaining that the SMBIOS specification defines the date
format to be mm/dd/yyyy, which is why we don't use ISO 8601.
Add a new ioctl to disable all MSI-X interrupts for a PCI passthrough
device and invoke it if a write to the MSI-X capability registers
disables MSI-X. This avoids leaving MSI-X interrupts enabled on the
host if a guest device driver has disabled them (e.g. as part of
detaching a guest device driver).
This was found by Chelsio QA when testing that a Linux guest could
switch from MSI-X to MSI interrupts when using the cxgb4vf driver.
While here, explicitly fail requests to enable MSI on a passthrough
device if MSI-X is enabled and vice versa.
Reported by: Sony Arpita Das @ Chelsio
Reviewed by: grehan, markj
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D27212
Add update to RIP after a userspace instruction decode (as is done for
the in-kernel counterpart of this case).
Submitted by: adam_fenn.io
Reviewed by: cem, markj
Approved by: grehan (bhyve)
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D27243
When the AHCI code was reworked to use FreeBSD struct
definitions, the valid element was mis-transcribed resulting
in the UMDA capability being hidden. This prevented Illumos
from using AHCI disk/cdrom drives.
Fix by using definitions that match the code pre-rework.
PR: 250924
Submitted by: Rolf Stalder
Reported by: Rolf Stalder
MFC after: 3 days
Since lots of work has been done on bhyve since 2014, increase the version
to 13.0 to match 13-CURRENT, and update the release date.
Reviewed by: grehan
Differential Revision: https://reviews.freebsd.org/D27147
Read CPUID leaf 0x8000008 to determine max supported phys address and
create BAR region right below it, reserving 1/4 of the supported guest
physical address space to the 64bit BARs mappings.
PR: 250802 (although the issue from PR is not fixed by the change)
Noted and reviewed by: grehan
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D27095
If bhyve is used to emulate 512e access in guest OS, then discard addresses should be properly aligned.
Otherwise ioctl DIOCGDELETE fails for 512b requires on devices with 4K sector size.
see g_dev_ioctl() in sys/geom/geom_dev.c
Submitted by: Vitaliy Gusev <gusev.vitaliy@gmail.com>
MFC after: 1 week
Sponsored by: vStack.com
Differential Revision: https://reviews.freebsd.org/D27075
"smbios.system.family" as " ".
This presents challenges for both humans and tools when trying to parse output
that uses those results.
The new values reported are now:
smbios.system.family="Virtual Machine"
smbios.system.maker="FreeBSD"
PR: 250728
Approved by: grehan@FreeBSD.org
Sponsored by: Netflix
bhyve sometimes segfaults when using an e1000 NIC with a Windows guest.
We are only updating our tdba and cached host mapping when the low address
register is written and when tx is set enabled, but not when the high address
or length registers are written. It is observed that Windows 10 is occasionally
enabling tx first then writing the registers in the order low, high, len. This
leaves us with a bogus base address and mapping, which causes a segfault later
when we try to copy from a descriptor that has unpredictable garbage in a
pointer.
Updating the address and mapping when any of those registers change seems to fix
that particular issue.
Reviewed by: mav, grehan (bhyve)
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D26798
VirtFS allows sharing an arbitrary directory tree between bhyve virtual
machine and the host. Current implementation has a fairly complete support
for 9P2000.L protocol, except for the extended attribute support. It has
been verified to work with the qemu-kvm hypervisor.
Reviewed by: rgrimes, emaste, jhb, trasz
Approved by: trasz (mentor)
MFC after: 1 month
Relnotes: yes
Sponsored by: Conclusive Engineering (development), vStack.com (funding)
Differential Revision: https://reviews.freebsd.org/D10335
'ident' was replaced with 'ata_ident' in revision r363596.
Submitted by: Vitaliy Gusev <gusev.vitaliy_gmail.com>
Reviewed by: Darius Mihai
Differential Revision: https://reviews.freebsd.org/D26263
The language id of String Descriptors in usb mouse is
0x0904, while the spec require 0x0409 (English - United States)
Submitted by: Wanpeng Qian
Reviewed by: grehan
Approved by: grehan (#bhyve)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D26472
The NVMe emulation code did not explicitly initialize queue head and
tail pointers on queue creation. As these pointers are part of
calloc()'ed memory, this only becomes a problem if the queues are
deleted and then recreated.
This error can manifest with messages about completions not matching a
command.
Some operating systems believe bhyve's emulated NVMe drive is failing
based on certain values in the SMART / Health Information log page being
zero. Fix is to set the reported temperature and available spare values
to reasonable defaults.
Submitted by: wanpengqian@gmail.com
Reviewed by: grehan
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24202
Allow the serial number, firmware revision, model number and nominal media
rotation rate (nmrr) parameters to be set from the command line.
Note that setting the nmrr value can be used to indicate the AHCI
device is an SSD.
Submitted by: Wanpeng Qian
Reviewed by: jhb, grehan (#bhyve)
Approved by: jhb, grehan
MFC after: 3 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D24174
This fixes a coredump with NetBSD guests when XHCI is configured.
On seeing the AC64 flag clear, the NetBSD XHCI driver was only writing
to the lower 32-bits of 64-bit physical address registers. The emulation
relies on a write to the hi 32-bits to calculate a host virtual address
for internal use, and has always supported 64-bit addressing.
All other guests were seen to write to both the lo- and hi- address
registers, regardless of the AC64 setting.
Discussed with: Leon Dang (author)
Tested with: Ubuntu 16/18/20, Windows10, OpenBSD UEFI guests.
MFC after: 2 weeks.
Allow guests to set the RTC bit in the ACPI PM control register.
This eliminates an annoying (and harmless) Linux kernel boot message.
PR: 244721
Submitted by: Jose Luis Duran
MFC after: 1 week
The NVMe specification requires unused entries in the Identify, Active
Namespace ID data to be zero. Fix is bzero the provided page, similar to
what is done for the Namespace Descriptors list.
Fixes UNH Tests 2.6 and 2.9
Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24901
Dataset Management range specifications may have a zero length (a.k.a.
an empty range definition). Handle the case of all ranges being empty by
completing with Success (DSM commands are advisory only). For
Deallocate, skip empty range definitions when sending TRIM's to the
backing storage.
Fixes UNH Test 2.2.4
Reviewed by: imp
Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24900
If the Predictable Latency Mode is not supported, NVMe Controllers must
return Invalid Field in Command status for the Get Features command
with IDs:
- Predictable Latency Mode Config
- Predictable Latency Mode Window
Fixes UNH Tests 3.6
Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24899
This adds support for NVMe Get Features, Interrupt Vector Config
parameter error checking done by the UNH compliance tests.
Fixes UNH Tests 1.6.8 and 5.5.6
Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24898
This commit updates the Identify Controller data to advertise the
Controller supports a single firmware slot and that firmware slot 1 is
read-only. Additionally, it returns an "Invalid Firmware Slot" error
when the host issues any Firmware Commit command (a.k.a. Firmware
Activate).
Fixes UNH Test 5.5.3
Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24897
This adds support to bhyve's NVMe device emulation for processing Async
Event Requests but not returning them (i.e. Async Event Notifications).
Fixes UNH Test 5.5.2
Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24896