Commit Graph

77 Commits

Author SHA1 Message Date
D Scott Phillips
f8a6ec2d57 bhyve: support relocating fbuf and passthru data BARs
We want to allow the UEFI firmware to enumerate and assign
addresses to PCI devices so we can boot from NVMe[1]. Address
assignment of PCI BARs is properly handled by the PCI emulation
code in general, but a few specific cases need additional support.
fbuf and passthru map additional objects into the guest physical
address space and so need to handle address updates. Here we add a
callback to emulated PCI devices to inform them of a BAR
configuration change. fbuf and passthru then watch for these BAR
changes and relocate the frame buffer memory segment and passthru
device mmio area respectively.

We also add new VM_MUNMAP_MEMSEG and VM_UNMAP_PPTDEV_MMIO ioctls
to vmm(4) to facilitate the unmapping needed for addres updates.

[1]: https://github.com/freebsd/uefi-edk2/pull/9/

Originally by:	scottph
MFC After:	1 week
Sponsored by:	Intel Corporation
Reviewed by:	grehan
Approved by:	philip (mentor)
Differential Revision:	https://reviews.freebsd.org/D24066
2021-03-19 11:04:36 +08:00
John Baldwin
621b509048 Refactor configuration management in bhyve.
Replace the existing ad-hoc configuration via various global variables
with a small database of key-value pairs.  The database supports
heirarchical keys using a MIB-like syntax to name the path to a given
key.  Values are always stored as strings.  The API used to manage
configuation values does include wrappers to handling boolean values.
Other values use non-string types require parsing by consumers.

The configuration values are stored in a tree using nvlists.  Leaf
nodes hold string values.  Configuration values are permitted to
reference other configuration values using '%(name)'.  This permits
constructing template configurations.

All existing command line arguments now set configuration values.  For
devices, the "-s" option parses its option argument to generate a list
of key-value pairs for the given device.

A new '-o' command line option permits setting an individual
configuration variable.  The key name is always given as a full path
of dot-separated components.

A new '-k' command line option parses a simple configuration file.
This configuration file holds a flat list of 'key=value' lines where
the 'key' is the full path of a configuration variable.  Lines
starting with a '#' are comments.

In general, bhyve starts by parsing command line options in sequence
and applying those settings to configuration values.  Once this is
complete, bhyve then begins initializing its state based on the
configuration values.  This means that subsequent configuration
options or files may override or supplement previously given settings.

A special 'config.dump' configuration value can be set to true to help
debug configuration issues.  When this value is set, bhyve will print
out the configuration variables as a flat list of 'key=value' lines.

Most command line argments map to a single configuration variable,
e.g.  '-w' sets the 'x86.strictmsr' value to false.  A few command
line arguments have less obvious effects:

- Multiple '-p' options append their values (as a comma-seperated
  list) to "vcpu.N.cpuset" values (where N is a decimal vcpu number).

- For '-s' options, a pci.<bus>.<slot>.<function> node is created.
  The first argument to '-s' (the device type) is used as the value of
  a "device" variable.  Additional comma-separated arguments are then
  parsed into 'key=value' pairs and used to set additional variables
  under the device node.  A PCI device emulation driver can provide
  its own hook to override the parsing of the additonal '-s' arguments
  after the device type.

  After the configuration phase as completed, the init_pci hook
  then walks the "pci.<bus>.<slot>.<func>" nodes.  It uses the
  "device" value to find the device model to use.  The device
  model's init routine is passed a reference to its nvlist node
  in the configuration tree which it can query for specific
  variables.

  The result is that a lot of the string parsing is removed from
  the device models and centralized.  In addition, adding a new
  variable just requires teaching the model to look for the new
  variable.

- For '-l' options, a similar model is used where the string is
  parsed into values that are later read during initialization.
  One key note here is that the serial ports use the commonly
  used lowercase names from existing documentation and examples
  (e.g. "lpc.com1") instead of the uppercase names previously
  used internally in bhyve.

Reviewed by:	grehan
MFC after:	3 months
Differential Revision:	https://reviews.freebsd.org/D26035
2021-03-18 16:30:26 -07:00
Konstantin Belousov
038f5c7bfe bhyve: remove a hack to map all 8G BARs 1:1
Suggested and reviewed by:	grehan
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D27186
2020-11-12 02:52:01 +00:00
Konstantin Belousov
670b364b76 bhyve: increase allowed size for 64bit BAR allocation below 4G from 32 to 128 MB.
Reviewed by:	grehan
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D27095
2020-11-12 00:51:53 +00:00
Konstantin Belousov
9922872ba2 bhyve: avoid allocating BARs above the end of supported physical addresses.
Read CPUID leaf 0x8000008 to determine max supported phys address and
create BAR region right below it, reserving 1/4 of the supported guest
physical address space to the 64bit BARs mappings.

PR:    250802 (although the issue from PR is not fixed by the change)
Noted and reviewed by:	grehan
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D27095
2020-11-12 00:46:53 +00:00
Peter Grehan
2136849868 Fix pci-passthru MSI issues with OpenBSD guests
- Return 2 x 16-bit registers in the correct byte order
 for a 4-byte read that spans the CMD/STATUS register.
  This reversal was hiding the capabilities-list, which prevented
 the MSI capability from being found for XHCI passthru.

- Reorganize MSI/MSI-x config writes so that a 4-byte write at the
 capability offset would have the read-only portion skipped.
  This prevented MSI interrupts from being enabled.

 Reported and extensively tested by Anatoli (me at anatoli dot ws)

PR:	245392
Reported by:	Anatoli (me at anatoli dot ws)
Reviewed by:	jhb (bhyve)
Approved by:	jhb, bz (mentor)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D24951
2020-05-25 06:25:31 +00:00
John Baldwin
483d953a86 Initial support for bhyve save and restore.
Save and restore (also known as suspend and resume) permits a snapshot
to be taken of a guest's state that can later be resumed.  In the
current implementation, bhyve(8) creates a UNIX domain socket that is
used by bhyvectl(8) to send a request to save a snapshot (and
optionally exit after the snapshot has been taken).  A snapshot
currently consists of two files: the first holds a copy of guest RAM,
and the second file holds other guest state such as vCPU register
values and device model state.

To resume a guest, bhyve(8) must be started with a matching pair of
command line arguments to instantiate the same set of device models as
well as a pointer to the saved snapshot.

While the current implementation is useful for several uses cases, it
has a few limitations.  The file format for saving the guest state is
tied to the ABI of internal bhyve structures and is not
self-describing (in that it does not communicate the set of device
models present in the system).  In addition, the state saved for some
device models closely matches the internal data structures which might
prove a challenge for compatibility of snapshot files across a range
of bhyve versions.  The file format also does not currently support
versioning of individual chunks of state.  As a result, the current
file format is not a fixed binary format and future revisions to save
and restore will break binary compatiblity of snapshot files.  The
goal is to move to a more flexible format that adds versioning,
etc. and at that point to commit to providing a reasonable level of
compatibility.  As a result, the current implementation is not enabled
by default.  It can be enabled via the WITH_BHYVE_SNAPSHOT=yes option
for userland builds, and the kernel option BHYVE_SHAPSHOT.

Submitted by:	Mihai Tiganus, Flavius Anton, Darius Mihai
Submitted by:	Elena Mihailescu, Mihai Carabas, Sergiu Weisz
Relnotes:	yes
Sponsored by:	University Politehnica of Bucharest
Sponsored by:	Matthew Grooms (student scholarships)
Sponsored by:	iXsystems
Differential Revision:	https://reviews.freebsd.org/D19495
2020-05-05 00:02:04 +00:00
John Baldwin
7840d1c45f Update the cached MSI state when any MSI capability register is written.
bhyve uses cached copies of the MSI capability registers to generate
MSI interrupts for device models.  Previously, these cached fields
were only set when the MSI capability control register was updated.
The Linux kernel recently adopted a change to deal with races in MSI
interrupt delivery that writes to the MSI capability address and data
registers to alter the destination of MSI interrupts without writing
to the MSI capability control register.  bhyve was not updating its
cached registers for these writes and continued to send interrupts
with the old data value to the old address.  Fix this by recomputing
the cached values for every write to any MSI capability register.

Reported by:	Jason Tubnor, Ryan Moeller
Reported by:	Marc Dionne (bisected the Linux kernel commit)
Reviewed by:	grehan
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D24593
2020-04-27 22:27:35 +00:00
Vincenzo Maffione
332eff95e3 bhyve: add wrapper for debug printf statements
Add printf() wrapper to use CR/CRLF terminators depending on whether
stdio is mapped to a tty open in raw mode.
Try to use the wrapper everywhere.
For now we leave the custom DPRINTF/WPRINTF defined by device
models, but we may remove them in the future.

Reviewed by:	grehan, jhb
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D22657
2020-01-08 22:55:22 +00:00
Jung-uk Kim
412d13d559 Catch up with ACPICA 20191018.
PR:		241467
XMFC with:	r353764
2019-10-24 22:33:46 +00:00
John Baldwin
0026d8ccb7 Remove a spurious break when setting up a 64-bit memory BAR.
This was causing 'enbit' to not be initialized in this case.

CID:		1401924
Reported by:	Coverity
MFC after:	1 week
2019-06-12 16:49:01 +00:00
Chuck Tuffli
129f93c5a7 bhyve: Add PCIe Integrated Endpoint capability
The NVMe CAM driver reports the PCIe Link Capability and Status for
devices. For emulated bhyve NVMe devices, this looks like:

nda0: nvme version 1.3 x63 (max x63) lanes PCIe Gen15 (max Gen15) link

The driver outputs this because the emulated device doesn't include the
PCIe Capability structure. The NVMe specification requires these
registers, so the fix is to add this set of capability registers to the
emulated device.

Note that PCI Express devices that are integrated into the Root Complex
(i.e. Bus 0x0) do not have to support the Link Capability or Status
registers. Windows will fail to start (i.e. Code 10) devices that appear
to be part of the Root Complex but report being a PCI Express Endpoint.
So also add a check to pci_emul_add_pciecap() to check if the device is
integrated and change the device type.

Reviewed by:	imp, ken, araujo, jhb, rgrimes
Approved by:	imp (mentor), ken (mentor), jhb (maintainer)
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D19904
2019-06-07 17:09:49 +00:00
John Baldwin
5628267505 Keep the shadow PCIR_COMMAND synced with the real one for pass through.
This ensures that bhyve properly recognizes when decoding is disabled
for BARs on passthru devices.  To properly handle writes to the
register, export a pci_emul_cmd_changed function from pci_emul.c that
the pass through device model invokes for config writes that change
PCIR_COMMAND.

Reviewed by:	rgrimes
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20531
2019-06-07 15:53:27 +00:00
John Baldwin
2729c9bbc7 Enable memory and I/O decoding in PCI devices on demand.
Rather than uncoditionally setting the MEMEN and PORTEN bits in
PCIR_COMMAND for PCI devices, set the respective bit when the first
BAR of a given type is added to the device.  This more closely matches
what firmware does on bare metal.

BUSMASTEREN is still set unconditionally.  Eventually this bit should
move into the device models as not all device models need this set.

Reviewed by:	rgrimes
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20530
2019-06-07 15:48:12 +00:00
John Baldwin
df61066e8b Whitespace cleanups, no functional change. 2019-05-31 18:00:44 +00:00
Chuck Tuffli
f0dfbcccf4 Revert r345171 pending review
Backing out commit pending further discussion on the PCIe version
supported by pseudo (i.e. emulated) devices. See Differential for
details.

Reviewed by:	imp
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D19580
2019-04-13 23:37:27 +00:00
Chuck Tuffli
2ba640758d Fix bhyve PCIe capability emulation
PCIe devices starting with version 1.1 must set the Role-Based Error
Reporting bit.

And while we're in the neighborhood, generalize the code assigning the
device type.

Reviewed by:	imp, araujo, rgrimes
Approved by:	imp (mentor)
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D19580
2019-03-15 02:11:28 +00:00
Marcelo Araujo
657d21589e Add -s "help" and -l "help" to print a list of supported PCI and LPC devices.
For tools that uses bhyve such like libvirt, it is important to be able to
probe what features are supported by the given bhyve binary.

To give more context, libvirt probes bhyve's capabilities in a not very
effective way:
- Running 'bhyve -h' and parsing output.
- To detect devices, it runs 'bhyve -s 0,dev' for every each device and
  parses error output to identify if the device is supported or not.

PR:		2101111
Submitted by:	novel
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	iXsystems Inc.
2018-08-22 20:23:08 +00:00
Marcelo Araujo
f7224b709f Fix style(9) space vs tab.
Reviewed by:	jhb
MFC after:	3 weeks.
Sponsored by:	iXsystems Inc.
Differential Revision:	https://reviews.freebsd.org/D15768
2018-06-14 01:34:53 +00:00
Marcelo Araujo
92046bf113 Revert: r334016
Revert for now this change, it in somehow breaks init_pci.
2018-05-22 06:02:11 +00:00
Marcelo Araujo
b5e3928d6d We must free the variable str.
Spotted by:	clang's static analyzer
Submitted by:	Tom Rix <trix_juniper.net>
Reviewed by:	grehan
MFC after:	4 weeks
Sponsored by:	iXsystems Inc.
Differential Revision:	https://reviews.freebsd.org/D10009
2018-05-22 04:08:08 +00:00
Pedro F. Giffuni
1de7b4b805 various: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:37:16 +00:00
Alexander Motin
1b4496d043 Make PCI interupts allocation static when using bootrom (UEFI).
This makes factual interrupt routing match one shipped with UEFI firmware.
With old firmware this make legacy interrupts work reliable for functions 0
of PCI slots 3-6.  Updated UEFI image fixes problem completely.
2016-07-14 17:16:10 +00:00
Enji Cooper
7e12dfe5ef Fix CTASSERT issue in a more clean way
- Replace all CTASSERT macro instances with static_assert's.
- Remove the WRAPPED_CTASSERT macro; it's now an unnecessary obfuscation.
- Localize all static_assert's to the structures being tested.
- Sort some headers per-style(9).

Approved by: re (hrs)
Differential Revision: https://reviews.freebsd.org/D7130
MFC after: 1 week
X-MFC with: r302364
Reviewed by: ed, grehan (maintainer)
Submitted by: ed
Sponsored by: EMC / Isilon Storage Division
2016-07-06 16:02:15 +00:00
Enji Cooper
edb603345b Fix gcc warnings
Add `WRAPPED_CTASSERT` macro by annotating CTASSERTs with __unused
to deal with -Wunused-local-typedefs warnings from gcc 4.8+.
All other compilers (clang, etc) use CTASSERT as-is. A more generic
solution for this issue will be proposed after ^/stable/11 is forked.

Consolidate all CTASSERTs under one block instead of inlining them in
functions.

Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D7119
MFC after: 1 week
Reported by: Jenkins
Reviewed by: grehan (maintainer)
Sponsored by: EMC / Isilon Storage Division
2016-07-06 05:02:59 +00:00
Pedro F. Giffuni
9f3dba686c bhyve: consider the bogus case of a negative bar idx.
This is a followup to r297472 to squelch Coverity.

CID:	1194319
2016-05-13 14:59:02 +00:00
Pedro F. Giffuni
6e43f3ed6d pci_emul_dior(): fix uninitialized scalar variable.
Prevent from returning an unitialized value in case the
ior size is unknown.

CID:		1194319
Reviewed by:	grehan
2016-03-31 19:07:03 +00:00
Marcelo Araujo
d74fdc6a35 Clean up unused-but-set-variable spotted by gcc-4.9.
Reviewed by:	grehan
Approved by:	bapt (mentor)
Differential Revision:	https://reviews.freebsd.org/D4735
2015-12-31 01:55:51 +00:00
Eitan Adler
463a577b27 Fix a ton of speelling errors
arc lint is helpful

Reviewed By: allanjude, wblock, #manpages, chris@bsdjunk.com
Differential Revision: https://reviews.freebsd.org/D3337
2015-10-21 05:37:09 +00:00
Neel Natu
fd4e0d4c52 Advertise an additional memory BAR in the "dummy" device emulation.
This is useful for testing the MOVS emulation when both the source and
destination addresses are in the MMIO space.

MFC after:	1 week
2015-05-02 03:25:24 +00:00
Neel Natu
54335630a7 Don't allow guest to modify readonly bits in the PCI config 'status' register.
Reported by:	Leon Dang (ldang@nahannisys.com)
MFC after:	2 weeks
2015-04-24 19:15:38 +00:00
Neel Natu
12a6eb99a1 Support PCI extended config space in bhyve.
Add the ACPI MCFG table to advertise the extended config memory window.

Introduce a new flag MEM_F_IMMUTABLE for memory ranges that cannot be deleted
or moved in the guest's address space. The PCI extended config space is an
example of an immutable memory range.

Add emulation for the "movzw" instruction. This instruction is used by FreeBSD
to read a 16-bit extended config space register.

CR:		https://phabric.freebsd.org/D505
Reviewed by:	jhb, grehan
Requested by:	tychon
2014-08-08 03:49:01 +00:00
Neel Natu
be679db4cd Provide APIs to directly get 'lowmem' and 'highmem' size directly.
Previously the sizes were inferred indirectly based on the size of the mappings
at 0 and 4GB respectively. This works fine as long as size of the allocation is
identical to the size of the mapping in the guest's address space. However, if
the mapping is disjoint then this assumption falls apart (e.g., due to the
legacy BIOS hole between 640KB and 1MB).
2014-06-24 02:02:51 +00:00
Tycho Nightingale
67b6ffaad6 r267169 should apply to 64-bit BARs as well.
Reviewed by:	neel
2014-06-09 19:55:50 +00:00
Tycho Nightingale
b6ae8b050b Some devices (e.g. Intel AHCI and NICs) support quad-word access to
register pairs where two 32-bit registers make up a larger logical
size.  Support those access by splitting the quad-word into two
double-words.

Reviewed by:	grehan
2014-06-06 16:18:37 +00:00
John Baldwin
b3e9732a76 Implement a PCI interrupt router to route PCI legacy INTx interrupts to
the legacy 8259A PICs.
- Implement an ICH-comptabile PCI interrupt router on the lpc device with
  8 steerable pins configured via config space access to byte-wide
  registers at 0x60-63 and 0x68-6b.
- For each configured PCI INTx interrupt, route it to both an I/O APIC
  pin and a PCI interrupt router pin.  When a PCI INTx interrupt is
  asserted, ensure that both pins are asserted.
- Provide an initial routing of PCI interrupt router (PIRQ) pins to
  8259A pins (ISA IRQs) and initialize the interrupt line config register
  for the corresponding PCI function with the ISA IRQ as this matches
  existing hardware.
- Add a global _PIC method for OSPM to select the desired interrupt routing
  configuration.
- Update the _PRT methods for PCI bridges to provide both APIC and legacy
  PRT tables and return the appropriate table based on the configured
  routing configuration.  Note that if the lpc device is not configured, no
  routing information is provided.
- When the lpc device is enabled, provide ACPI PCI link devices corresponding
  to each PIRQ pin.
- Add a VMM ioctl to adjust the trigger mode (edge vs level) for 8259A
  pins via the ELCR.
- Mark the power management SCI as level triggered.
- Don't hardcode the number of elements in Packages in the source for
  the DSDT.  iasl(8) will fill in the actual number of elements, and
  this makes it simpler to generate a Package with a variable number of
  elements.

Reviewed by:	tycho
2014-05-15 14:16:55 +00:00
Neel Natu
b100acf254 Don't allow MPtable generation if there are multiple PCI hierarchies. This is
because there isn't a standard way to relay this information to the guest OS.

Add a command line option "-Y" to bhyve(8) to inhibit MPtable generation.

If the virtual machine is using PCI devices on buses other than 0 then it can
still use ACPI tables to convey this information to the guest.

Discussed with:	grehan@
2014-05-02 04:51:31 +00:00
Peter Grehan
fcbec69157 Respect and track the enable bit in the PCI configuration address word.
Ignore writes, and return 0xff's, on config accesses when not set.
Behaviour now matches that seen on h/w.

Found with a NetBSD/amd64 guest.

Reviewed by:	tychon
MFC after:	3 weeks
2014-04-25 17:35:34 +00:00
Xin LI
994f858a8b Use calloc() in favor of malloc + memset.
Reviewed by:	neel
2014-04-22 18:55:21 +00:00
Neel Natu
7a902ec0ec Add a check to validate that memory BARs of passthru devices are 4KB aligned.
Also, the MSI-x table offset is not required to be 4KB aligned so take this
into account when computing the pages occupied by the MSI-x tables.
2014-02-18 19:00:15 +00:00
John Baldwin
a96b8b801a Tweak the handling of PCI capabilities in emulated devices to remove
the non-standard zero capability list terminator.   Instead, track
the start and end of the most recently added capability and use that
to adjust the previous capability's next pointer when a capability is
added and to determine the range of config registers belonging to
PCI capability registers.

Reviewed by:	neel
2014-02-18 03:00:20 +00:00
Neel Natu
d84882ca8f Allow PCI devices to be configured on all valid bus numbers from 0 to 255.
This is done by representing each bus as root PCI device in ACPI. The device
implements the _BBN method to return the PCI bus number to the guest OS.

Each PCI bus keeps track of the resources that is decodes for devices
configured on the bus: i/o, mmio (32-bit) and mmio (64-bit). These windows
are advertised to the guest via the _CRS object of the root device.

Bus 0 is treated specially since it consumes the I/O ports to access the
PCI config space [0xcf8-0xcff]. It also decodes the legacy I/O ports that
are consumed by devices on the LPC bus. For this reason the LPC bridge can
be configured only on bus 0.

The bus number can be specified using the following command line option
to bhyve(8): "-s <bus>:<slot>:<func>,<emul>[,<config>]"

Discussed with:	grehan@
Reviewed by:	jhb@
2014-02-14 21:34:08 +00:00
John Baldwin
3cbf3585cb Enhance the support for PCI legacy INTx interrupts and enable them in
the virtio backends.
- Add a new ioctl to export the count of pins on the I/O APIC from vmm
  to the hypervisor.
- Use pins on the I/O APIC >= 16 for PCI interrupts leaving 0-15 for
  ISA interrupts.
- Populate the MP Table with I/O interrupt entries for any PCI INTx
  interrupts.
- Create a _PRT table under the PCI root bridge in ACPI to route any
  PCI INTx interrupts appropriately.
- Track which INTx interrupts are in use per-slot so that functions
  that share a slot attempt to distribute their INTx interrupts across
  the four available pins.
- Implicitly mask INTx interrupts if either MSI or MSI-X is enabled
  and when the INTx DIS bit is set in a function's PCI command register.
  Either assert or deassert the associated I/O APIC pin when the
  state of one of those conditions changes.
- Add INTx support to the virtio backends.
- Always advertise the MSI capability in the virtio backends.

Submitted by:	neel (7)
Reviewed by:	neel
MFC after:	2 weeks
2014-01-29 14:56:48 +00:00
John Baldwin
d2bc4816c5 Remove support for legacy PCI devices. These haven't been needed since
support for LPC uart devices was added and it conflicts with upcoming
patches to add PCI INTx support.

Reviewed by:	neel
2014-01-27 22:26:15 +00:00
John Baldwin
e6c8bc291a Rework the DSDT generation code a bit to generate more accurate info about
LPC devices.  Among other things, the LPC serial ports now appear as
ACPI devices.
- Move the info for the top-level PCI bus into the PCI emulation code and
  add ResourceProducer entries for the memory ranges decoded by the bus
  for memory BARs.
- Add a framework to allow each PCI emulation driver to optionally write
  an entry into the DSDT under the \_SB_.PCI0 namespace.  The LPC driver
  uses this to write a node for the LPC bus (\_SB_.PCI0.ISA).
- Add a linker set to allow any LPC devices to write entries into the
  DSDT below the LPC node.
- Move the existing DSDT block for the RTC to the RTC driver.
- Add DSDT nodes for the AT PIC, the 8254 ISA timer, and the LPC UART
  devices.
- Add a "SuperIO" device under the LPC node to claim "system resources"
  aling with a linker set to allow various drivers to add IO or memory
  ranges that should be claimed as a system resource.
- Add system resource entries for the extended RTC IO range, the registers
  used for ACPI power management, the ELCR, PCI interrupt routing register,
  and post data register.
- Add various helper routines for generating DSDT entries.

Reviewed by:	neel (earlier version)
2014-01-02 21:26:59 +00:00
Neel Natu
4f8be175d5 Add an API to deliver message signalled interrupts to vcpus. This allows
callers treat the MSI 'addr' and 'data' fields as opaque and also lets
bhyve implement multiple destination modes: physical, flat and clustered.

Submitted by:	Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
Reviewed by:	grehan@
2013-12-16 19:59:31 +00:00
Neel Natu
ac7304a758 Add an ioctl to assert and deassert an ioapic pin atomically. This will be used
to inject edge triggered legacy interrupts into the guest.

Start using the new API in device models that use edge triggered interrupts:
viz. the 8254 timer and the LPC/uart device emulation.

Submitted by:	Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
2013-11-23 03:56:03 +00:00
Neel Natu
565bbb8698 Move the ioapic device model from userspace into vmm.ko. This is needed for
upcoming in-kernel device emulations like the HPET.

The ioctls VM_IOAPIC_ASSERT_IRQ and VM_IOAPIC_DEASSERT_IRQ are used to
manipulate the ioapic pin state.

Discussed with:	grehan@
Submitted by:	Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
2013-11-12 22:51:03 +00:00
Neel Natu
c8afb9bc3f Fix an off-by-one error when iterating over the emulated PCI BARs.
Submitted by:	Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
2013-11-06 22:35:52 +00:00
Neel Natu
ea7f1c8cd2 Add support for PCI-to-ISA LPC bridge emulation. If the LPC bus is attached
to a virtual machine then we implicitly create COM1 and COM2 ISA devices.

Prior to this change the only way of attaching a COM port to the virtual
machine was by presenting it as a PCI device that is mapped at the legacy
I/O address 0x3F8 or 0x2F8.

There were some issues with the original approach:
- It did not work at all with UEFI because UEFI will reprogram the PCI device
  BARs and remap the COM1/COM2 ports at non-legacy addresses.
- OpenBSD GENERIC kernel does not create a /dev/console because it expects
  the uart device at the legacy 0x3F8/0x2F8 address to be an ISA device.
- It was functional with a FreeBSD guest but caused the console to appear
  on /dev/ttyu2 which was not intuitive.

The uart emulation is now independent of the bus on which it resides. Thus it
is possible to have uart devices on the PCI bus in addition to the legacy
COM1/COM2 devices behind the LPC bus.

The command line option to attach ISA COM1/COM2 ports to a virtual machine is
"-s <bus>,lpc -l com1,stdio".

The command line option to create a PCI-attached uart device is:
"-s <bus>,uart[,stdio]"

The command line option to create PCI-attached COM1/COM2 device is:
"-S <bus>,uart[,stdio]". This style of creating COM ports is deprecated.

Discussed with:	grehan
Reviewed by:	grehan
Submitted by:	Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)

M    share/examples/bhyve/vmrun.sh
AM   usr.sbin/bhyve/legacy_irq.c
AM   usr.sbin/bhyve/legacy_irq.h
M    usr.sbin/bhyve/Makefile
AM   usr.sbin/bhyve/uart_emul.c
M    usr.sbin/bhyve/bhyverun.c
AM   usr.sbin/bhyve/uart_emul.h
M    usr.sbin/bhyve/pci_uart.c
M    usr.sbin/bhyve/pci_emul.c
M    usr.sbin/bhyve/inout.c
M    usr.sbin/bhyve/pci_emul.h
M    usr.sbin/bhyve/inout.h
AM   usr.sbin/bhyve/pci_lpc.c
AM   usr.sbin/bhyve/pci_lpc.h
2013-10-29 00:18:11 +00:00