975 Commits

Author SHA1 Message Date
Corvin Köhne
fe453891d7 bhyve: add nvlist functions for setting unset nodes
If an emulation uses those functions instead of set_config_value_node
or set_config_value, it allows the config values to get
overwritten. Introducing new functions is much more readable than
if else statements in the emulation code.

Reviewed by:	khng
MFC after:	2 weeks
Sponsored by:	Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D33770
2022-01-14 12:41:44 +01:00
Mark Johnston
4558c11f1b bhyve: Correct unmapping of the MSI-X table BAR
The starting address passed to mprotect was wrong, so in the case where
the last page containing the table is not the last page of the BAR, the
wrong region would be unmapped.

Reported by:	Andy Fiddaman <andy@omniosce.org>
Reviewed by:	jhb
Fixes:		7fa233534736 ("bhyve: Map the MSI-X table unconditionally for passthrough")
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33739
2022-01-05 10:12:09 -05:00
Mark Johnston
76b45e688a bhyve: Map the right BAR in init_msix_table()
The PBA and MSI-X table can reside in different BARs.

Reported by:	Andy Fiddaman <andy@omniosce.org>
Reviewed by:	jhb
Fixes:		7fa233534736 ("bhyve: Map the MSI-X table unconditionally for passthrough")
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33739
2022-01-05 10:12:09 -05:00
Corvin Köhne
9fe79f2f2b bhyve: dynamically register FwCtl ports
Qemu's FwCfg uses the same ports as Bhyve's FwCtl. Static allocated
ports wouldn't allow to switch between Qemu's FwCfg and Bhyve's
FwCtl.

Reviewed by:    markj
MFC after:      2 weeks
Sponsored by:   Beckhoff Automation GmbH & Co. KG
Differential Revision:  https://reviews.freebsd.org/D33496
2022-01-03 16:32:55 +01:00
Corvin Köhne
7d55d29508 bhyve: add more slop to 64 bit BARs
Bhyve allocates small 64 bit BARs below 4 GB and generates ACPI tables
based on this allocation. If the guest decides to relocate those BARs
above 4 GB, it could lead to mismatching ACPI tables. Especially
when using OVMF with enabled bus enumeration it could cause
issues. OVMF relocates all 64 bit BARs above 4 GB. The guest OS
may be unable to recover from this situation and disables some PCI
devices because their BARs are located outside of the MMIO space
reported by ACPI. Avoid this situation by giving the guest more
space for relocating BARs.

Let's be paranoid. The available space for BARs below 4 GB is 512 MB
large. Use a slop of 512 MB. It'll allow the guest to relocate all
BARs below 4 GB to an address above 4 GB. We could run into issues
when we exceeding the memlimit above 4 GB. However, this space has
a size of 32 GB. Even when using many PCI device with large BARs
like framebuffer or when using multiple PCI busses, it's very
unlikely that we run out of space due to the large slop.
Additionally, this situation will occur on startup and not at runtime
which is much better.

Reviewed by:    markj
MFC after:      2 weeks
Sponsored by:   Beckhoff Automation GmbH & Co. KG
Differential Revision:  https://reviews.freebsd.org/D33118
2022-01-03 16:32:55 +01:00
Corvin Köhne
8ec366ec6c bhyve: allow reading of fwctl signature multiple times
At the moment, you only have one single chance to read the fwctl
signature. At boot bhyve is in the state IDENT_WAIT. It's then
possible to switch to IDENT_SEND. After bhyve sends the signature,
it switches to REQ. From now on it's impossible to switch back to
IDENT_SEND to read the signature. For that reason, only a single
driver can read the signature. A guest can't use two drivers to
identify that fwctl is present. It gets even worse when using
OVMF. OVMF uses a library to access fwctl. Therefore, every single
OVMF driver would try to read the signature. Currently, only a
single OVMF driver accesses the fwctl. So, there's no issue with
it yet. However, no OS driver would have a chance to detect fwctl when
using OVMF because it's signature was already consumed by OVMF.

Reviewed by:    markj
MFC after:      2 weeks
Sponsored by:   Beckhoff Automation GmbH & Co. KG
Differential Revision:  https://reviews.freebsd.org/D31981
2022-01-03 16:32:55 +01:00
Corvin Köhne
01f9362ef4 bhyve: enumerate BARs by size
E.g. Framebuffers can require large space and BARs need to be aligned
by their size. If BARs aren't allocated by size, it'll cause much
fragmentation of the MMIO space. Reduce fragmentation by ordering
the BAR allocation on their size to reduce the risk of
OUT_OF_MMIO_SPACE issues.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D28278
2022-01-03 16:32:55 +01:00
Corvin Köhne
338a1be836 bhyve: only init MSI-X table if passthru device supports it
Some passthru devices only support MSI instead of MSI-X. For those
devices the initialization of MSI-X table will fail. Re-add the
check erroneously removed in f1442847c9404d4bc5f5524a0c3362dd39cb14f9.

MFC after:	3 days
X-MFC with:	f1442847c9404d4bc5f5524a0c3362dd39cb14f9
PR:		260148
Reviewed by:	manu, bz
Differential Revision:	https://reviews.freebsd.org/D33728
2022-01-03 14:55:10 +00:00
Toomas Soome
04f55b5b0e bhyve smbios type 3 structure is incorrect
If you look at the SMBIOS specification, we'll find something is
missing. In particular at offset 0Dh is supposed to be the OEM-defined
field. This should go between security and height. It is not legal to
actually skip this and will lead to other folks not properly
interpreting later parts of the table.

https://www.illumos.org/issues/14312

Reviewed by:	jhb
Submitted by:	Robert Mustacchi <rm@fingolfin.org>
Obtained from:	ilumos
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D33682
2021-12-27 20:06:33 +02:00
Toomas Soome
c2fa905cf6 bhyve: clean up trailing whitespaces
Clean up trailing whitespaces. No functional changes.

Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D33681
2021-12-27 19:58:10 +02:00
Bjoern A. Zeeb
f1442847c9 bhyve: passthru: enable BARs before possibly mmap(2)ing them
The first time we start bhyve with a passthru device everything is fine
as on boot we do enable BARs.  If a driver (unload) inside bhyve disables
the BAR(s) as some Linux drivers do, we need to make sure we re-enable
them on next bhyve start.

If we are trying to mmap a disabled BAR for MSI-X (PCIOCBARMMAP)
the kernel will give us an EBUSY.
While we were re-enabling the BAR(s) in the current code loop
cfginit() was writing the changes out too late to the real hardware.

Move the call to init_msix_table() after the register on the real
hardware was updated.  That way the kernel will be happy and the
mmap will succeed and bhyve will start.
Also simplify the code given the last argument to init_msix_table()
is unused we do not need to do checks for each bar. [1]

MFC after:	3 days
PR:		260148
Pointed out by:	markj [1]
Sponsored by:	The FreeBSD Foundation
Reviewed by:	markj
Differential Revision: https://reviews.freebsd.org/D33628
2021-12-29 17:01:05 +00:00
Vitaliy Gusev
d079fc197a bhyve: Only snapshot initialized VirtIO queues
If the virtio device is not fully initialized, then suspend fails with:

  vi_pci_snapshot_queues: invalid address: vq->vq_desc
  Failed to snapshot virtio-rnd; ret=14

MFC after:	1 week
Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D26268
2021-12-17 13:06:53 -05:00
Chuck Tuffli
cf3ed8e0cd bhyve nvme: Inform guests of namespace resize
Register a "block resize" callback to be notified of changes to the
backing storage for the Namespace. Use this to generate an Asynchronous
Event Notification, Namespace Attributes Changed when the guest OS
provides an Asynchronous Event Request.

MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D32953
2021-12-14 23:17:55 -08:00
Chuck Tuffli
9f1fa1a461 bhyve nvme: Add AEN support to NVMe emulation
Add Asynchronous Event Notification infrastructure to the NVMe
emulation.

Reviewed by:	imp, grehan
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D32952
2021-12-14 23:16:49 -08:00
Robert Wing
0b29683b32 bhyve: set EV_CLEAR for EVFILT_VNODE mevents
When an EVFILT_VNODE filter event is triggered, reset it.

This fixes the issue where a virtio-blk resize event would cause the
mevent thread to consume 100% of the cpu.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D33326
2021-12-12 12:39:40 -09:00
Rebecca Cran
866036f46c bhyve: Support a _VARS.fd file for bootrom
OVMF creates two separate .fd files, a _CODE.fd file containing
the UEFI code, and a _VARS.fd file containing a template of an
empty UEFI variable store.

OVMF decides to write variables to the memory range just below the
boot rom code if it detects a CFI flash device. So here we add
just the barest facsimile of CFI command handling to bootrom.c
that is needed to placate OVMF.

Submitted by: D Scott Phillips <d.scott.phillips@intel.com>
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D19976
MFC After: 1 week
2021-12-12 08:07:27 -07:00
Robert Wing
2616ee608c bhyve: fix -Wunused-but-set-variable warning
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D33306
2021-12-06 10:52:08 -09:00
Chuck Tuffli
d8c1d7b652 bhyve blockif: fix blockif_candelete with Capsicum
NVMe conformance tests for the Format command failed if the
backing-storage for the bhyve device was a file instead of a Zvol. The
tests (and the specification) expect a Format to destroy all previously
written data. The bhyve NVMe emulation implements this by trimming /
deallocating all data from the backing-storage.

The blockif_candelete() function indicated the file did not support
deallocation (i.e. fpathconf(..., _PC_DEALLOC_PRESENT) returned FALSE)
even though the kernel supported file hole punching. This occurs on
builds with Capsicum enabled because blockif did not allow the
fpathconf(2) right.

Fix is to add CAP_FPATHCONF to the cap_rights_init(3) call.

PR:		260081
Reviewed by:	allanjude, markj, jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33203
2021-11-30 21:49:34 -08:00
Emmanuel Vadot
fc7207c877 bhyve: Fix compile
We need err.h

Fixes:	5cf21e48ccf11 ("bhyve: use a fixed 32 bit BAR base address")
Sponsored by:	Bechoff Automation GmbH & Co. KG
2021-11-22 17:13:09 +01:00
Corvin Köhne
fe66bcf9ff bhyve: emulate reads of MSI-X capabilities for passthru devices
Reads of the MSI-X capabilites aren't emulated by passthru devices
yet. The guest will read the host MSI-X capabilites which could
cause issues.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D32686
Sponsored by:	Beckhoff Automation GmbH & Co. KG
2021-11-22 16:27:33 +01:00
Corvin Köhne
2eb2079554 bhyve: keep physical and virtual COMMAND reg in sync
On startup all virtual BARs are registered.
Additionally, the encoding bit in the virtual cmd register is set.
After that, the passthru emulation overwrites the virtual cmd register with
the physical one.
This could lead to a mismatch between registered BARs and the encoding
bits in the cmd register.
Instead of writing the physical to the virtual cmd register,
write the virtual to the physical cmd register to solve this issue.

Reviewed by:	  markj
Differential Revision:	https://reviews.freebsd.org/D32687
Sponsored by:	Beckhoff Automation GmbH & Co. KG
2021-11-22 16:26:03 +01:00
Corvin Köhne
5cf21e48cc bhyve: use a fixed 32 bit BAR base address
OVMF always uses 0xC0000000 as base address for 32 bit PCI MMIO space.
For that reason, we should use that address too.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D31051
Sponsored by:	Beckhoff Automation GmbH & Co. KG
2021-11-22 16:24:47 +01:00
Corvin Köhne
4a4053e1b0 bhyve: move 64 bit BAR location to match OVMF assumptions
OVMF will fail, if large 64 bit BARs are used. GCD-Map doesn't cover
64 bit addresses of BARs.
OVMF assumes that 64 bit addresses of BARS are located on next 32 GB
boundary behind Top of High RAM.

This patch moves 64 bit BARs on next 32 GB boundary behind Top of High
RAM to match OVMF assumptions.

Differential Revision:	https://reviews.freebsd.org/D27970
Sponsored by: Beckhoff Automation GmbH & Co. KG
2021-11-22 16:22:48 +01:00
Corvin Köhne
5085153ae4 bhyve: do not explicitly map fbuf framebuffer
Allocating a BAR will call baraddr which maps the framebuffer. No need
to allocate it explicitly on init.

Reviewed by:     grehan
Sponsored by:    Beckhoff Autmation GmbH & Co. KG
Differential Revision:    https://reviews.freebsd.org/D32596
2021-11-18 16:26:34 +01:00
Corvin Köhne
e87a6f3ef2 bhyve: use physical lobits for BARs of passthru devices
Tell the guest whether a BAR uses prefetched memory or not for
passthru devices by using the same lobits as the physical device.

Reviewed by:	 grehan
Sponsored by:	 Beckhoff Autmation GmbH & Co. KG
Differential Revision:	  https://reviews.freebsd.org/D32685
2021-11-18 16:25:09 +01:00
Rebecca Cran
35175e100a bhyve: Bump the SMBIOS firmware version to 14.0 for 14-CURRENT
Bump the firmware version to 14.0 and set the firmware release date
to today.

Reviewed by: jhb, bz, imp
Differential Revision: https://reviews.freebsd.org/D32534
2021-10-20 22:10:33 -06:00
Mark Johnston
77bc75c7ab bhyve: Fix the WITH_BHYVE_SNAPSHOT build
Note, this breaks compatibility with snapshots generated by older builds
of bhyve(8).

Fixes: 7fa233534736 ("bhyve: Map the MSI-X table unconditionally for passthrough")
Reported by:	Greg V <greg@unrelenting.technology>
Reviewed by:	grehan, bz
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32523
2021-10-18 09:56:59 -04:00
Corvin Köhne
1b0e2f0b60 bhyve: ignore low bits of CFGADR
Bhyve could emulate wrong PCI registers.
In the best case, the guest reads wrong registers and the device driver would
report some errors.
In the worst case, the guest writes to wrong PCI registers and could brick
hardware when using PCI passthrough.

According to Intels specification, low bits of CFGADR should be
ignored. Some OS like linux may rely on it. Otherwise, bhyve could
emulate a wrong PCI register.

E.g.
If linux would like to read 2 bytes from offset 0x02, following would
happen.
linux:
	outl 0x80000002 at CFGADR
	inw  at CFGDAT + 2
bhyve:
	cfgoff = 0x80000002 & 0xFF = 0x02
	coff   = cfgoff + (port - CFGDAT) = 0x02 + 0x02 = 0x04
Bhyve would emulate the register at offset 0x04 not 0x02.

Reviewed By: #bhyve, grehan
Differential Revision: https://reviews.freebsd.org/D31819
Sponsored by:	       Beckhoff Automation GmbH & Co. KG
2021-10-15 09:29:45 +02:00
Mateusz Piotrowski
f656df586a bhyve: Update usage and synopsis for the -k flag
Let's make it clear to users that -k is for configuration files.
Also, point to bhyve_config(5) in the paragraph describing the flag.

Reviewed by:	jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D32467
2021-10-13 08:39:57 +02:00
Mateusz Piotrowski
775f6f4595 bhyve.8: Fix markup of the -G flag 2021-10-12 16:09:28 +02:00
Mark Johnston
7fa2335347 bhyve: Map the MSI-X table unconditionally for passthrough
It is possible for the PBA to reside in the same page as the MSI-X
table.  And, while devices are not supposed to do this, at least some
Intel wifi devices place registers in a page shared with the MSI-X
table.  To handle the first case we currently map the PBA page using
/dev/mem, and the second case is not handled.

Kill two birds with one stone: map the MSI-X table BAR using the
PCIOCBARMMAP ioctl instead of /dev/mem, and map the entire table so that
accesses beyond the bounds of the table can be emulated.  Regions of the
BAR not containing the table are left unmapped.

Reviewed by:	bz, grehan, jhb
MFC after:	3 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32359
2021-10-09 11:36:19 -04:00
John Baldwin
7ecdfc8237 bhyve: Add an empty case for event types in mevent_kq_fflags().
This fixes a -Wswitch error raised by GCC 9.

Differential Revision:	https://reviews.freebsd.org/D31938
2021-09-25 11:25:25 -07:00
John Baldwin
48759c4ed7 bhyve_config.5: Document gdb.address. 2021-09-25 10:07:18 -07:00
John Baldwin
b70b050ab5 bhyve: Update the -G description in the SYNPOSIS.
It was missing both the 'w' flag and 'bind_address'.
2021-09-25 10:01:43 -07:00
John Baldwin
c6efcb1281 bhyve: Support setting the disk serial number for VirtIO block devices.
Reviewed by:	allanjude
Obtained from:	illumos
Differential Revision:	https://reviews.freebsd.org/D31983
2021-09-17 09:55:48 -07:00
Ka Ho Ng
e31cc1d526 bhyve: Fix pci device node key in bhyve_config.5
PCI device node key in the manual page is wrong. It should be
pci.bus.slot.function.

MFC after:	3 days
2021-09-13 04:35:03 +08:00
Elliott Mitchell
e76c0e4f45 bhyve: Nuke double-semicolons
A distinct number of double-semicolons ended up in bhyve. Take a pass at
getting rid of many of these harmless typos.

MFC after:	3 days
2021-08-30 15:31:04 +08:00
Mark Johnston
71fbc6faed bhyve: Fix vq_getchain() error handling bugs in various device models
Reviewed by:	grehan, khng
Approved by:	so
Security:	CVE-2021-29631
Security:	FreeBSD-SA-21:13.bhyve
2021-08-24 14:29:13 -04:00
Mariusz Zaborski
3a92927bb6 bhyve: change a default address from ANY to localhost
Discussed with:     grehan, jhb
2021-08-21 19:43:17 +02:00
Mariusz Zaborski
2cdff9918e byhve: add option to specify IP address for gdb
Allow user to specify the IP address available for gdb debugger.

Reviewed by:	jhb, grehan, rgrimes, bcr (man pages)
Differential Revision:	https://reviews.freebsd.org/D29607
2021-08-21 19:43:17 +02:00
Mark Johnston
42375556e5 bhyve: Use pci(4) to access I/O port BARs
This removes the dependency on /dev/io.

PR:		251046
Reviewed by:	jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31308
2021-08-14 10:59:04 -04:00
Ka Ho Ng
3676512b60 bhyve: Use fspacectl(2) for BOP_DELETE on regular file images
bhyve can also make use of fspacectl(2) to implement BOP_DELETE with
hole-punching. Since it is not desirable to do zero-filling for large
DEALLOCATE/UNMAP range, candelete is not set if pathconf(2) indicates
that the underlying file system does not support native
VOP_DEALLOCATE(9).

Sponsored by:	The FreeBSD Foundation
Reviewed by:	grehan
Differential Revision:	https://reviews.freebsd.org/D28880
2021-08-07 17:10:30 +08:00
Bjoern A. Zeeb
56be282bc9 bhyve: net_backends, automatically IFF_UP tap devices
If you want communications with the outside world and tell bhyve to
create an interfaces then it should be usable as well.
Rather than relying on the sysctl net.link.tap.up_on_open automatically
try to IFF_UP the opened tap device.

MFC after:	10 days
Reviewed by:	markj, grehan
Differential Revision: https://reviews.freebsd.org/D31342
2021-08-01 20:50:53 +00:00
Chuck Tuffli
91064841d7 bhyve: Fix NVMe iovec construction for large IOs
The UEFI driver included with Rocky Linux 8.4 uncovered an existing bug
in the NVMe emulation's construction of iovec's.

By default, NVMe data transfer operations use a scatter-gather list in
which all entries point to a fixed size memory region. For example, if
the Memory Page Size is 4KiB, a 2MiB IO requires 512 entries. Lists
themselves are also fixed size (default is 512 entries).

Because the list size is fixed, the last entry is special. If the IO
requires more than 512 entries, the last entry in the list contains the
address of the next list of entries. But if the IO requires exactly 512
entries, the last entry points to data.

The NVMe emulation missed this logic and unconditionally treated the
last entry as a pointer to the next list. Fix is to check if the
remaining data is greater than the page size before using the last entry
as a pointer to the next list.

PR:		256422
Reported by:	dave@syix.com
Tested by:	jason@tubnor.net
MFC after:	5 days
Relnotes:	yes
Reviewed by:	imp, grehan
Differential Revision:	https://reviews.freebsd.org/D30897
2021-06-27 15:14:52 -07:00
Chuck Tuffli
a11ca79cd9 bhyve: fix NVMe MDTS comment
Removes an obsolete comment and adds parenthesis around the macro while
in the area. No functional change.
2021-06-25 08:02:28 -07:00
Chuck Tuffli
3a4ab18377 bhyve: Fix cli regression with NVMe ram
The configuration management refactoring inadvertently removed support
for a RAM-backed NVMe Namespace (i.e. -s X,nvme,ram=16384). This adds it
back.

Reported by:	andy@omniosce.org
Reviewed by:	jhb, andy@omniosce.org
Fixes:		621b5090487d
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D30717
2021-06-16 14:19:01 -07:00
Robert Crowston
efec757b20 bhyve: enhance debug info for memory range clash
Explain what the two clashing regions are.

Reivewed by:		grehan, jhb
Differential Revision:	https://reviews.freebsd.org/D29696
Pull Request:		https://github.com/freebsd/freebsd-src/pull/463
2021-06-13 16:41:45 -06:00
John Baldwin
2349cda44f bhyve vtblk: Inform guests of disk resize events.
Register a resize callback with the blockif interface.  When the
callback fires, update the size of the disk and notify the guest via a
configuration change interrupt.

Reviewed by:	grehan, markj
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D30506
2021-06-11 18:00:25 -07:00
John Baldwin
c06676bee3 bhyve: Split out a lower-level helper for VirtIO interrupts.
This allows device models to assert VirtIO interrupts for reasons
other than publishing changes to a VirtIO ring such as configuration
changes.

Reviewed by:	grehan, markj
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D30505
2021-06-11 18:00:25 -07:00
John Baldwin
8794846a91 bhyve: Add support for handling disk resize events to block_if.
Allow clients of blockif to register a resize callback handler.  When
a callback is registered, register an EVFILT_VNODE kevent watching the
backing store for a change in the file's attributes.  If the size has
changed when the kevent fires, invoke the clients' callback.

Currently resize detection is limited to backing stores that support
EVFILT_VNODE kevents such as regular files.

Reviewed by:	grehan, markj
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D30504
2021-06-11 18:00:24 -07:00