The UEFI driver included with Rocky Linux 8.4 uncovered an existing bug
in the NVMe emulation's construction of iovec's.
By default, NVMe data transfer operations use a scatter-gather list in
which all entries point to a fixed size memory region. For example, if
the Memory Page Size is 4KiB, a 2MiB IO requires 512 entries. Lists
themselves are also fixed size (default is 512 entries).
Because the list size is fixed, the last entry is special. If the IO
requires more than 512 entries, the last entry in the list contains the
address of the next list of entries. But if the IO requires exactly 512
entries, the last entry points to data.
The NVMe emulation missed this logic and unconditionally treated the
last entry as a pointer to the next list. Fix is to check if the
remaining data is greater than the page size before using the last entry
as a pointer to the next list.
PR: 256422
Reported by: dave@syix.com
Tested by: jason@tubnor.net
MFC after: 5 days
Relnotes: yes
Reviewed by: imp, grehan
Differential Revision: https://reviews.freebsd.org/D30897
Register a resize callback with the blockif interface. When the
callback fires, update the size of the disk and notify the guest via a
configuration change interrupt.
Reviewed by: grehan, markj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D30506
This allows device models to assert VirtIO interrupts for reasons
other than publishing changes to a VirtIO ring such as configuration
changes.
Reviewed by: grehan, markj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D30505
Allow clients of blockif to register a resize callback handler. When
a callback is registered, register an EVFILT_VNODE kevent watching the
backing store for a change in the file's attributes. If the size has
changed when the kevent fires, invoke the clients' callback.
Currently resize detection is limited to backing stores that support
EVFILT_VNODE kevents such as regular files.
Reviewed by: grehan, markj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D30504
This allows registering an event to watch for changes to a file's
attributes. This is a bit imperfect as it would be nice to have a way
to determine if an fd can use EVFILT_VNODE successfully. mevent's
current structure does not permit that and a failure to register a
single kevent impacts several other kevents.
Reviewed by: grehan, markj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D30503
Change mevent_add*() to synchronously add the new kevent. This
permits reporting event registration failures to the caller and avoids
failing the registration of other, unrelated events queued up in the
same batch.
Reviewed by: grehan, markj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D30502
This will be used to inject keyboard/mouse input events into a guest.
The command line syntax is:
-s <slot>,virtio-input,/dev/input/eventX
Reviewed by: jhb (bhyve), grehan
Obtained from: Corvin Köhne <C.Koehne@beckhoff.com>
MFC after: 3 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D30020
Move initialization of the mutex/condition variables required by the
save/restore feature to their own function.
The unix domain socket that facilitates communication between bhyvectl
and bhyve doesn't rely on these variables in order to be functional.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D30281
This is a step towards sending messages (other than suspend/checkpoint)
from bhyvectl to bhyve.
Introduce a new struct, ipc_message - this struct stores the type of
message and a union containing message specific structures for the type
of message being sent.
Reviewed by: grehan
Differential Revision: https://reviews.freebsd.org/D30221
Fixes segfault with the command `bhyve -s 0,virtio-scsi`, which is used
by some third party software to probe bhyve for virtio-scsi support.
Reviewed by: jhb
MFC after: 1 day
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D29926
- Use appropriate mdoc macros
- Document that tcp= is a synonym to rfb= (tcp is used in the examples,
but never mentioned)
- Clarify the IP address specification
MFC after: 2 weeks
- Set width of the list to the longest key word for readability.
- Separate descriptions of amd_hostbridge and hostbridge emulations.
Also, wordsmith their descriptions for consistency with other entries.
- Use Cm instead of Li for command modifiers.
- Do not stylize AMD with Li, there's no need to do it.
- Mention COM3 and COM4 in the definition of lpc.
- Fix a typo in the definition of ahci-hd ("hard drive" instead of
"hard-drive").
MFC after: 2 weeks
Also, remove the macros of the nested list which contained slot,
emulation and conf. This decreases the indention of the -s description.
It was necessary to clean up the slot description.
MFC after: 2 weeks
- Describe "-l help" separately for readability.
- List all the supported comX devices explicitly
- Use Cm instead of Ar for command modifiers (i.e., literal values a
user can specify as an argument to the command).
- Explain where to get more information about the possible values of the
conf argument.
MFC after: 2 weeks
In particular:
- Sort short options to align with style(9)
- Add two missing flags: -G and -r
- Drop unnecessary angle brackets for consistency
- Rename the "vm" argument to vmname for consistency with the manual
page
MFC after: 2 weeks
There is no need to squeeze all the possible options into one synopsis
entry. Let "-l help" and "-s help" be listed separately.
While here, keep -s and its arguments on the same line.
MFC after: 2 weeks
Without the -w option, Windows guests crash on boot. This is caused by a rdmsr
of MSR_IA32_FEATURE_CONTROL. Windows checks this MSR to determine enabled VMX
features. This MSR isn't emulated in bhyve, so a #GP exception is injected
which causes Windows to crash.
Fix by returning a rdmsr of MSR_IA32_FEATURE_CONTROL with Lock Bit set and
VMX disabled to informWindows that VMX isn't available.
Reviewed by: jhb, grehan (bhyve)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29665
The check needs to be in the public routine (gdb_cpu_suspend()), not
in the internal routine called from various places
(_gdb_cpu_suspend()). All the other callers of _gdb_cpu_suspend()
already check gdb_active, and this breaks the use of snapshots when
the debug server is not enabled since gdb_cpu_suspend() tries to lock
an uninitialized mutex.
Reported by: Darius Mihai, Elena Mihailescu
Reviewed by: elenamihailescu22_gmail.com
Fixes: 621b5090487de9fed1b503769702a9a2a27cc7bb
Differential Revision: https://reviews.freebsd.org/D29538
Add the System Management BIOS Baseboard (or Module) Information
a.k.a. Type 2 structure to the SMBIOS emulation.
Reviewed by: rgrimes, bcran, grehan
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29657
Commit 621b5090487de9fed1b503769702a9a2a27cc7bb introduced a regression
in legacy virtio-9p config parsing by not initializing *sharename to
NULL. As a result, "sharename != NULL" check in the first iteration fails
and bhyve exits with "virtio-9p: more than one share name given".
Fix by adding NULL back.
Approved by: grehan
This allows the xhci tablet device to be recognized and a PCI device
instantiated.
Reviewed by: jhb
Fixes: 621b5090487d Refactor configuration management in bhyve.
MFC after: 3 months.
The old prototype requires callers to inspect flags of each descriptors
to get the starting position of host-writable iovecs.
vq_getchain() is changed to return a virtio request with the number of
host-readable iovecs and host-writable iovecs instead. Callers can avoid
boilerplate code of getting the start offset of host-writable iovecs.
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Reviewed by: afedorov
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D29433
The previous commit added the handler to parse the command line
options for virtio-scsi devices but forgot to set the correct function
pointer to point to the handler.
Reported by: vangyzen
Reviewed by: vangyzen
Fixes: 621b5090487de9fed1b503769702a9a2a27cc7bb
Differential Revision: https://reviews.freebsd.org/D29438
"device" is already used as the generic PCI-level name of the device
model to use (e.g. "hostbridge"). The result was that parsing
"hostbridge" as an integer failed and the host bridge used a device ID
of 0. The EFI ROM asserts that the device ID of the hostbridge is not
0, so booting with the current EFI ROM was failing during the ROM
boot.
Fixes: 621b5090487de9fed1b503769702a9a2a27cc7bb
We want to allow the UEFI firmware to enumerate and assign
addresses to PCI devices so we can boot from NVMe[1]. Address
assignment of PCI BARs is properly handled by the PCI emulation
code in general, but a few specific cases need additional support.
fbuf and passthru map additional objects into the guest physical
address space and so need to handle address updates. Here we add a
callback to emulated PCI devices to inform them of a BAR
configuration change. fbuf and passthru then watch for these BAR
changes and relocate the frame buffer memory segment and passthru
device mmio area respectively.
We also add new VM_MUNMAP_MEMSEG and VM_UNMAP_PPTDEV_MMIO ioctls
to vmm(4) to facilitate the unmapping needed for addres updates.
[1]: https://github.com/freebsd/uefi-edk2/pull/9/
Originally by: scottph
MFC After: 1 week
Sponsored by: Intel Corporation
Reviewed by: grehan
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D24066
Replace the existing ad-hoc configuration via various global variables
with a small database of key-value pairs. The database supports
heirarchical keys using a MIB-like syntax to name the path to a given
key. Values are always stored as strings. The API used to manage
configuation values does include wrappers to handling boolean values.
Other values use non-string types require parsing by consumers.
The configuration values are stored in a tree using nvlists. Leaf
nodes hold string values. Configuration values are permitted to
reference other configuration values using '%(name)'. This permits
constructing template configurations.
All existing command line arguments now set configuration values. For
devices, the "-s" option parses its option argument to generate a list
of key-value pairs for the given device.
A new '-o' command line option permits setting an individual
configuration variable. The key name is always given as a full path
of dot-separated components.
A new '-k' command line option parses a simple configuration file.
This configuration file holds a flat list of 'key=value' lines where
the 'key' is the full path of a configuration variable. Lines
starting with a '#' are comments.
In general, bhyve starts by parsing command line options in sequence
and applying those settings to configuration values. Once this is
complete, bhyve then begins initializing its state based on the
configuration values. This means that subsequent configuration
options or files may override or supplement previously given settings.
A special 'config.dump' configuration value can be set to true to help
debug configuration issues. When this value is set, bhyve will print
out the configuration variables as a flat list of 'key=value' lines.
Most command line argments map to a single configuration variable,
e.g. '-w' sets the 'x86.strictmsr' value to false. A few command
line arguments have less obvious effects:
- Multiple '-p' options append their values (as a comma-seperated
list) to "vcpu.N.cpuset" values (where N is a decimal vcpu number).
- For '-s' options, a pci.<bus>.<slot>.<function> node is created.
The first argument to '-s' (the device type) is used as the value of
a "device" variable. Additional comma-separated arguments are then
parsed into 'key=value' pairs and used to set additional variables
under the device node. A PCI device emulation driver can provide
its own hook to override the parsing of the additonal '-s' arguments
after the device type.
After the configuration phase as completed, the init_pci hook
then walks the "pci.<bus>.<slot>.<func>" nodes. It uses the
"device" value to find the device model to use. The device
model's init routine is passed a reference to its nvlist node
in the configuration tree which it can query for specific
variables.
The result is that a lot of the string parsing is removed from
the device models and centralized. In addition, adding a new
variable just requires teaching the model to look for the new
variable.
- For '-l' options, a similar model is used where the string is
parsed into values that are later read during initialization.
One key note here is that the serial ports use the commonly
used lowercase names from existing documentation and examples
(e.g. "lpc.com1") instead of the uppercase names previously
used internally in bhyve.
Reviewed by: grehan
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D26035
Definitions inside usr.sbin/bhyve/virtio.h are thrown away.
Definitions in sys/dev/virtio are used instead.
This reduces code duplication.
Sponsored by: The FreeBSD Foundation
Reviewed by: grehan
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D29084