This structure isn't used for anything, and only counts a subset of
vmexit types. Moreover, it is not accurate since there is no
synchronization between vcpu threads. Simply remove it.
No functional change intended.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40245
Commit 0bda8d3e9f ("vmm: permit some IPIs to be handled by userspace")
embedded cpuset_t into the vmm(4) ioctl ABI. This was a mistake since
we otherwise have some leeway to change the cpuset_t for the whole
system, but we want to keep the vmm ioctl ABI stable.
Rework IPI reporting to avoid this problem. Along the way, make VM_RUN
a bit more efficient:
- Split vmexit metadata out of the main VM_RUN structure. This data is
only written by the kernel.
- Have userspace pass a cpuset_t pointer and cpusetsize in the VM_RUN
structure, as is done for cpuset syscalls.
- Have the destination CPU mask for VM_EXITCODE_IPIs live outside the
vmexit info structure, and make VM_RUN copy it out separately. Zero
out any extra bytes in the CPU mask, like cpuset syscalls do.
- Modify the vmexit handler prototype to take a full VM_RUN structure.
PR: 271330
Reviewed by: corvink, jhb (previous versions)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40113
At the moment, fwcfg reads the file once at startup and passes these
data to the guest. Therefore, we should always read the whole file.
Otherwise we should error out.
Additionally, GCC12 complains that the comparison whether
fwcfg_file->size is lower than 0 is always false due to the limited
range of data type.
Reviewed by: markj
Fixes: ca14781c81 ("bhyve: add cmdline option for user defined fw_cfg items")
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D40076
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
Those definitions are required for the GVT-d emulation to parse the
OpRegion.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D40037
Intel GPUs have two special memory regions. They are called Graphics
Stolen Memory and OpRegion. bhyve has to emulate both of them. In order
to keep track of those special regions, add generic mmio ranges to the
passthru emulation.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D40036
The GVT-d emulation tries to allocate some specific memory. It could
happen that this address doesn't exist. In that case, GVT-d will fall
back to allocate any address. Nevertheless, this only works if the e820
fails with an error instead of exiting on an assertion.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D40034
The GVT-d emulation requires access to this selector to read from the
device.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D40035
Some guest allow to configure themself by fw_cfg. E.g. Fedora CoreOs can
be provisioned by adding a JSON file as fw_cfg item.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D38338
This patch fixes virtual machine single stepping on VMX hosts.
Currently, when using bhyve's gdb stub, each attempt at single-stepping
a vCPU lands in a timer interrupt. The current single-stepping mechanism
uses the Monitor Trap Flag feature to cause VMEXIT after a single
instruction is executed. Unfortunately, the SDM states that MTF causes
VMEXITs for the next instruction that gets executed, which is often not
what the person using the debugger expects. [1]
This patch adds a new VM capability that masks interrupts on a vCPU by
blocking interrupt injection and modifies the gdb stub to use the newly
added capability while single-stepping a vCPU.
[1] Intel SDM 26.5.2 Vol. 3C
Reviewed by: corvink, jbh
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39949
OPENSSL_API_COMPAT can be used to specify the OpenSSL API version in
use for the purpose of hiding deprecated interfaces and enabling
the appropriate deprecation notices.
This change is a NFC while we're still using OpenSSL 1.1.1 but will
avoid deprecation warnings upon the switch to OpenSSL 3.0. A future
change can then switch bhyve to use OpenSSL 3.0 APIs.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39998
This is required to enable capsicum for the snapshot code.
Reviewed by: corvink
Sponsored by: vStack
Differential Revision: https://reviews.freebsd.org/D38858
Adapt hda_print_cmd_ctl_data() to not generate compiler warnings
when DEBUG_HDA is off.
Reviewed by: corvink
Approved by: corvink
Differential Revision: https://reviews.freebsd.org/D39826
E820 table will be used to report valid RAM ranges and reserve special
memory areas like graphics memory for GPU passthrough.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39550
For debugging purposes it is helpful to dump the E820 table.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39549
This function makes it easy to allocate new E820 entries. It will be
used to allocate graphics memory for Intel integrated graphic devices.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39547
The VGA and the ROM memory ranges can't be used as system memory. For
that reason, remove them from the E820 table.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39546
There are some use cases where bhyve has to prepare some special memory
regions. E.g. GPU passthrough for Intel integrated graphic devices needs
to reserve some memory for the graphic device. So, bhyve has to inform
the guest about those memory regions. This information can be passed by
the qemu fwcfg interface. As qemu creates an E820 table, we can reuse
the existing fwcfg item "etc/e820".
This commit is the first one of a series. It only adds a basic
implementation for the creation of the E820 table. Some subsequent
commits will add more items to the E820 table and register it as fwcfg
item.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39545
Add all acpi tables to qemus acpi table loader. This passes the acpi
tables by fwcfg to the guest.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D38439
The hypervisor is aware of all system properties. For the guest bios
it's hard and complex to detect all system properties. For that reason,
it would be better if the hypervisor creates acpi tables instead of the
guest. Therefore, the hypervisor has to send the acpi tables to the
guest. At the moment, bhyve just copies the acpi tables into the guest
memory. This approach has some restrictions. You have to keep sure that
the guest doesn't overwrite them accidentally. Additionally, the size of
acpi tables is limited.
Providing a plain copy of all acpi tables by fwcfg isn't possible. Acpi
tables have to point to each other. So, if the guest copies the acpi
tables into memory by it's own, it has to patch the tables. Due to
different layouts for different acpi tables, there's no generic way to
do that. For that reason, qemu created a table loader interface. It
contains commands for the guest for loading specific blobs into guest
memory and patching those blobs.
This commit adds a qemu_loader class which handles the creation of qemu
loader commands. At the moment, the WRITE_POINTER command isn't
implement. It won't be required by bhyve's acpi table generation yet.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D38438
This will be useful for writing device specific ACPI tables or DSDT
methods.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39322
Most register of the PCI header are either constant values or require
emulation anyway. The command and status register are the only exception which
require hardware access. So, we're adding an emulation handler for all
other register.
As this emulation handler will be reused by some future features like
GPU passthrough, we directly export it.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D33010
GPU passthrough requires a special handling of some PCI config register.
Therefore, we need a flexible approach for implementing it. Adding an
array of handler meets this condition.
Start by using the default handler for all accesses to the PCI config
space. In upcoming commits, we can start to split the default handler
into several handler for each register that requires emulation.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39291
This feature will be used by future commits to implement a device
specific method (_DSM) for TPM devices.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39321
At the moment, this function can't fail. This behaviour will change in
the future. In preparation to that, convert the return type to int in
order to be able to check for errors.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39422
Some ACPI devices require a device specific acpi table. E.g. a TPM2
device requires a TPM2 table. Use the acpi_device_emul struct to define
such a device specific table.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39320
The host selector is only required when the user likes to use the same
LPC device IDs as the physical LPC device. This is an uncommon use case.
For that reason, it makes no sense to exit when we don't find the host
selector.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39311
It'll be easier to add new properties to the ACPI device emulation if we
have a struct which holds all device specific properties. In some future
commits the acpi_device_emul struct will be expanded to include some
device specific functions to build ACPI tables.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39319
At least on some AMD devices the host LPC bridge could be located as
seperate function of another PCI device.
Fixes: f4ceaff56d
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39310
Those const qualifier declare that the function doesn't change the
values internally. It makes no sense to add them in the header file.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39318
The Intel GOP driver checks the LPC IDs to detect the platform it's
running on. The GOP driver only works on the platforms it's written for.
Maybe other Intel driver have the same behaviour. For that reason, we
should use the LPC IDs of the FreeBSD host for GPU passthrough to work
properly.
We don't know if setting different LPC IDs have any side effect.
Therefore, don't use the host LPC IDs by default on Intel system. Give
the user the opportunity to modify the LPC IDs.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D28280
For compatibilty reasons, the old config values are still supported.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D38403
Changing the PCI IDs is valuable in some situations. The Intel GOP
driver requires that some PCI IDs of the LPC bridge are aligned with the
physical values of the host LPC bridge. Another use case are oracles
virtio driver. They require different subvendor ID than the default one.
For that reason, create a helper which makes it easy to read PCI IDs
from bhyve config. Additionally, this helper ensures that all emulation
devices are using the same config keys.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D38402
This is a userland-only pointer that isn't relevant to the kernel and
doesn't belong in the ioctl structure shared between userland and the
kernel. For the kernel, the old structure for the ioctl is still
supported under COMPAT_FREEBSD13.
This changes vm_snapshot_req() in libvmmapi to accept an explicit
vmctx argument.
It also changes vm_snapshot_guest2host_addr to take an explicit vmctx
argument. As part of this change, move the declaration for this
function and its wrapper macro from vmm_snapshot.h to snapshot.h as it
is a userland-only API.
Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D38125
This replaces the 'struct vm, int vcpuid' tuple passed to most API
calls and is similar to the changes recently made in vmm(4) in the
kernel.
struct vcpu is an opaque type managed by libvmmapi. For now it stores
a pointer to the VM context and an integer id.
As an immediate effect this removes the divergence between the kernel
and userland for the instruction emulation code introduced by the
recent vmm(4) changes.
Since this is a major change to the vmmapi API, bump VMMAPI_VERSION to
0x200 (2.0) and the shared library major version.
While here (and since the major version is bumped), remove unused
vcpu argument from vm_setup_pptdev_msi*().
Add new functions vm_suspend_all_cpus() and vm_resume_all_cpus() for
use by the debug server. The underyling ioctl (which uses a vcpuid of
-1) remains unchanged, but the userlevel API now uses separate
functions for global CPU suspend/resume.
Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D38124
It is illegal (UB?) to pass a shorter array to a function argument
that takes a fixed-length array. Do a runtime check for names that
are too long via strlen() instead.
Reviewed by: markj
Reported by: GCC -Wstringop-overread
Differential Revision: https://reviews.freebsd.org/D39211
As of commit 0bda8d3e9f ("vmm: permit some IPIs to be handled by
userspace") and commit 9cc9abf409 ("bhyve: create all vcpus on
startup"), we have a misbehaviour where AP vCPU threads spin until they
receive a SIPI. In particular, since they are "suspended", they simply
call the VMEXIT_DEBUG handler in a loop, but the handler is a no-op by
default.
This is tricky to fix since the gdb stub isn't aware of whether a given
vCPU is supposed to be running. For 13.2's sake, introduce a simple
workaround wherein the VMEXIT_DEBUG handler sleeps for a short period.
This ensures that host CPU usage remains sane when VMs are starting
without penalizing users of VMEXIT_DEBUG too much.
Reviewed by: corvink, jhb
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39174
Let the user decide if he wants to use bhyve's fwctl or qemu's fwcfg. He
can set the interface by adding a fwcfg option to bootrom:
-l bootrom,<path/to/rom>,fwcfg=bhyve
-l bootrom,<path/to/rom>,fwcfg=qemu
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D38337
Fwcfg items without a fixed index are reported by the file_dir. They
have an index of 0x20 and above. This helper simplifies the addition of
such fwcfg items. It selects a new free index, assigns it to the fwcfg
items and creates an proper entry in the file_dir.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D38336