Commit Graph

94 Commits

Author SHA1 Message Date
grehan
88b044ba33 Implement support for the interrupt-on-terminal-count and
s/w-strobe timer modes. These are commonly used by non-FreeBSD
o/s's.

Approved by:	re@ (blanket)
2013-09-19 04:59:44 +00:00
grehan
9239c077f4 Add simplistic periodic timer support to mevent using kqueue's
timer support. This should be enough for the emulation of
h/w periodic timers (and no more) e.g. some of the 8254's
more esoteric modes that happen to be used by non-FreeBSD o/s's.

Approved by:	re@ (blanket)
2013-09-19 04:48:26 +00:00
grehan
bc1d6700f2 Allow the alarm hours/mins/seconds registers to be read/written,
though without any action. This avoids a hypervisor exit when
o/s's access these regs (Linux).

Reviewed by:	neel
Approved by:	re@ (blanket)
2013-09-19 04:29:03 +00:00
grehan
6f44e4d05b Use correct offset for the high byte of high memory written to
RTC NVRAM.

Submitted by:	Bela Lubkin   bela dot lubkin at tidalscale dot com
Approved by:	re@ (blanket)
2013-09-19 04:20:18 +00:00
grehan
6f381c7c57 Pass the number of supported vectors to pci_emul_add_msicap() and
not the actual PCI BAR number.

Reviewed by:	neel
Approved by:	re@ (blanket)
2013-09-17 18:42:13 +00:00
grehan
3d2b366a36 Go way past 11 and bump bhyve's max vCPUs to 16.
This should be sufficient for 10.0 and will do
until forthcoming work to avoid limitations
in this area is complete.

Thanks to Bela Lubkin at tidalscale for the
headsup on the apic/cpu id/io apic ASL parameters
that are actually hex values and broke when
written as decimal when 11 vCPUs were configured.

Approved by:	re@
2013-09-10 03:48:18 +00:00
grehan
32d186cf83 Fix spelling. 2013-09-06 05:58:10 +00:00
grehan
b669765ba2 Allow level-triggered interrupt sources. While this isn't
precisely emulated, it is good enough for the single consumer
i.e. irq4, the serial port on Linux.
2013-09-06 05:55:43 +00:00
neel
236964c12d Allow single byte reads of the emulated MSI-X tables. This is not required
by the PCI specification but needed to dump MMIO space from "ddb" in the
guest.
2013-08-27 16:50:48 +00:00
grehan
68734fc2a7 Fix off-by-1 error in assert.
Submitted by:	Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
2013-08-27 03:49:47 +00:00
grehan
fe073d00ee Fix ordering of legacy IRQ reservations.
Submitted by:	Jeremiah Lott   jlott at averesystems dot com
2013-08-16 00:35:20 +00:00
grehan
dc702c2d98 Sanity-check the vm exitcode, and exit the process if it's out-of-bounds
or there is no registered handler.

Submitted by:	Bela Lubkin   bela dot lubkin at tidalscale dot com
2013-07-18 18:40:54 +00:00
grehan
a6cf66c6cf Major rework of the virtio code. Split out common parts, and modify
the net/block devices accordingly.

Submitted by:	Chris Torek   torek at torek dot net
Reviewed by:	grehan
2013-07-17 23:37:33 +00:00
grehan
db9a28132c Implement RTC CMOS nvram. Init some fields that are used
by FreeBSD and UEFI.
Tested with nvram(4).

Reviewed by:	neel
2013-07-11 03:54:35 +00:00
grehan
4afc69bc76 Support an optional "mac=" parameter to virtio-net config, to allow
users to set the MAC address for a device.

Clean up some obsolete code in pci_virtio_net.c

Allow an error return from a PCI device emulation's init routine
to be propagated all the way back to the top-level and result in
the process exiting.

Submitted by:	Dinakar Medavaram    dinnu sun at gmail (original version)
2013-07-04 05:35:56 +00:00
grehan
dcadb4f390 Fix up option parsing to allow a colon in the config section.
Clean up some other unnecessary code.

Submitted by:	Dinakar Medavaram    dinnu sun at gmail
Reviewed by:	neel
2013-07-01 23:53:22 +00:00
grehan
be00d9ee47 Allow 8259 registers to be read. This is a transient condition
during Linux boot.

Submitted by:	tycho nightingale at pluribusnetworks com
Reviewed by:	neel
2013-06-28 06:25:04 +00:00
grehan
535e6386b1 Allow the PCI config address register to be read. The Linux
kernel does this. Also remove an unused header file.

Submitted by:	tycho nightingale at pluribusnetworks com
Reviewed by:	neel
2013-06-28 05:01:25 +00:00
neel
2ae63ff174 Implement the NOTIFY_ON_EMPTY capability in the virtio-net device.
If this capability is negotiated by the guest then the device will
generate an interrupt when it runs out of available tx/rx descriptors.

Reviewed by:	grehan
Obtained from:	NetApp
2013-05-03 01:16:18 +00:00
neel
7f49c6dcce Reset some more softc state when the guest resets the virtio network device.
Obtained from:	NetApp
2013-04-30 01:14:54 +00:00
neel
773a4e04f4 Use a separate mutex for the receive path instead of overloading the softc
mutex for this purpose.

Reviewed by:	grehan
2013-04-30 00:36:16 +00:00
neel
7b0846b1c8 Get rid of the 'vsc_rxpend' state - it doesn't serve any purpose because we
drop any frames that arrive while the device is starved for receive buffers.

This makes the receive path to only execute in context of the receive thread
and allows for further simplification.

Reviewed by:	grehan
2013-04-28 01:02:59 +00:00
grehan
771ee43899 Use a thread for the processing of virtio tx descriptors rather
than blocking the vCPU thread. This improves bulk data performance
by ~30-40% and doesn't harm req/resp time for stock netperf runs.

Future work will use a thread pool rather than a thread per tx queue.

Submitted by:	Dinakar Medavaram
Reviewed by:	neel, grehan
Obtained from:	NetApp
2013-04-26 05:13:48 +00:00
neel
e682d80073 Gripe if some <slot,function> tuple is specified more than once instead of
silently overwriting the previous assignment.

Gripe if the emulation is not recognized instead of silently ignoring the
emulated device.

If an error is detected by pci_parse_slot() then exit from the command line
parsing loop in main().

Submitted by (initial version):	Chris Torek (chris.torek@gmail.com)
2013-04-26 02:24:50 +00:00
neel
826be154b0 Teach the virtio block device to deal with direct as well as indirect
descriptors. Prior to this change the device would only work with guests
that chose to use indirect descriptors.

Modify the device reset callback to actually reset the device state.

Submitted by:	Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
2013-04-23 16:40:39 +00:00
neel
3ab51705b7 Setup accesses to the memory hole below 4GB to return all 1's on read and
consume all writes without any side effects.

Obtained from:	NetApp
2013-04-17 02:03:12 +00:00
neel
48f321f98e Need to call init_mem() to really initialize the MMIO range lookups.
This was working by accident because:
- the RB_HEADs were being initialized to zero as part of BSS
- the pthread_rwlock functions were implicitly initializing the lock object

Obtained from:	NetApp
2013-04-10 18:59:20 +00:00
neel
6ca7b417ee Remove obsolete comment about lack of locking for MMIO range lookup.
Pointed out by:	Tycho Nightingale (tycho.nightingale@plurisbusnetworks.com)
2013-04-10 18:53:14 +00:00
neel
3bab173b64 Unsynchronized TSCs on the host require special handling in bhyve:
- use clock_gettime(2) as the time base for the emulated ACPI timer instead
  of directly using rdtsc().

- don't advertise the invariant TSC capability to the guest to discourage it
  from using the TSC as its time base.

Discussed with:	jhb@ (about making 'smp_tsc' a global)
Reported by:	Dan Mack on freebsd-virtualization@
Obtained from:	NetApp
2013-04-10 05:59:07 +00:00
neel
fc5a92987d Change name of variable from 'rwlock' to more descriptive 'mmio_rwlock'
Requested by:	grehan
Obtained from:	NetApp
2013-04-10 02:18:17 +00:00
neel
c350fded20 Improve PCI BAR emulation:
- Respect the MEMEN and PORTEN bits in the command register
- Allow the guest to reprogram the address decoded by the BAR

Submitted by:	Gopakumar T
Obtained from:	NetApp
2013-04-10 02:12:39 +00:00
grehan
f220c315bb Remove dangling ISA uart stubs.
Obtained from:	NetApp
2013-04-05 22:19:02 +00:00
grehan
9b0ba3c866 config checksum is over the entire fixed portion, not just the
config header. FreeBSD doesn't check this but other o/s's do.

Obtained from:	NetApp
2013-04-05 22:14:07 +00:00
neel
8d05d984e8 Simplify the assignment of memory to virtual machines by requiring a single
command line option "-m <memsize in MB>" to specify the memory size.

Prior to this change the user needed to explicitly specify the amount of
memory allocated below 4G (-m <lowmem>) and the amount above 4G (-M <highmem>).

The "-M" option is no longer supported by 'bhyveload' and 'bhyve'.

The start of the PCI hole is fixed at 3GB and cannot be directly changed
using command line options. However it is still possible to change this in
special circumstances via the 'vm_set_lowmem_limit()' API provided by
libvmmapi.

Submitted by:	Dinakar Medavaram (initial version)
Reviewed by:	grehan
Obtained from:	NetApp
2013-03-18 22:38:30 +00:00
neel
6a88d4ed82 Change the type of 'ndesc' from 'int' to 'uint16_t' so that descriptor index
wraparound is handled correctly.

The gory details are available here:
http://lists.freebsd.org/pipermail/freebsd-virtualization/2013-March/001119.html

This fixes a regression introduced in r247871.

Pointed out by:	Bruce Evans, Chris Torek
2013-03-16 05:40:29 +00:00
neel
c92442accd Convert the offset into the bar that contains the MSI-X table to an offset
into the MSI-X table before using it to calculate the table index.

In the common case where the MSI-X table is located at the begining of the
BAR these two offsets are identical and thus the code was working by accident.

This change will fix the case where the MSI-X table is located in the middle
or at the end of the BAR that contains it.

Obtained from:	NetApp
2013-03-11 17:36:37 +00:00
grehan
2527cbb869 Simplify virtio ring num-available calculation.
Submitted by:	Chris Torek, torek at torek dot net
2013-03-06 07:28:20 +00:00
grehan
f5b9af8949 Reorder code to avoid the stat buffer being used uninitialized.
Obtained from:	NetApp
2013-03-06 06:24:09 +00:00
neel
9c2aecf6da Specify the length of the mapping requested from 'paddr_guest2host()'.
This seems prudent to do in its own right but it also opens up the possibility
of not having to mmap the entire guest address space in the 'bhyve' process
context.

Discussed with:	grehan
Obtained from:	NetApp
2013-03-01 02:26:28 +00:00
neel
97e63956d8 Ignore the BARRIER flag in the virtio block header.
This capability is not advertised by the host so ignore it even if the guest
insists on setting the flag.

Reviewed by:	grehan
Obtained from:	NetApp
2013-02-26 20:02:17 +00:00
neel
7025a86fb3 Get rid of unused struct member.
Pointed out by:	Gopakumar T
Obtained from:	NetApp
2013-02-25 20:31:47 +00:00
grehan
8da5ddc4a5 Add the ability to have a 'fallback' search for memory ranges.
These set of ranges will be looked at if a standard memory
range isn't found, and won't be installed in the cache.
Use this to implement the memory behaviour of the PCI hole on
x86 systems, where writes are ignored and reads always return -1.
This allows breakpoints to be set when issuing a 'boot -d', which
has the side effect of accessing the PCI hole when changing the
PTE protection on kernel code, since the pmap layer hasn't been
initialized (a bug, but present in existing FreeBSD releases so
has to be handled).

Reviewed by:	neel
Obtained from:	NetApp
2013-02-22 00:46:32 +00:00
neel
d8ccba8d32 Advertise PCI-E capability in the hostbridge device presented to the guest.
FreeBSD wants to see this capability in at least one device in the PCI
hierarchy before it allows use of MSI or MSI-X.

Obtained from:	NetApp
2013-02-15 18:41:36 +00:00
neel
3a9eeaa765 Implement guest vcpu pinning using 'pthread_setaffinity_np(3)'.
Prior to this change pinning was implemented via an ioctl (VM_SET_PINNING)
that called 'sched_bind()' on behalf of the user thread.

The ULE implementation of 'sched_bind()' bumps up 'td_pinned' which in turn
runs afoul of the assertion '(td_pinned == 0)' in userret().

Using the cpuset affinity to implement pinning of the vcpu threads works with
both 4BSD and ULE schedulers and has the happy side-effect of getting rid
of a bunch of code in vmm.ko.

Discussed with:	grehan
2013-02-11 20:36:07 +00:00
jhb
b313f550e1 Install <dev/agp/agpreg.h> and <dev/pci/pcireg.h> as userland headers
in /usr/include.

MFC after:	2 weeks
2013-02-05 18:55:09 +00:00
neel
da1ce8990c Add support for MSI-X interrupts in the virtio block device and make that
the default.

The current behavior of advertising a single MSI vector can be requested by
setting the environment variable "BHYVE_USE_MSI" to "yes". The use of MSI
is not compliant with the virtio specification and will be eventually phased
out.

Submitted by:	Gopakumar T
Obtained from:	NetApp
2013-02-01 16:58:59 +00:00
neel
81de6f5cc4 Fix a broken assumption in the passthru implementation that the MSI-X table
can only be located at the beginning or the end of the BAR.

If the MSI-table is located in the middle of a BAR then we will split the
BAR into two and create two mappings - one before the table and one after
the table - leaving a hole in place of the table so accesses to it can be
trapped and emulated.

Obtained from:	NetApp
2013-02-01 03:49:09 +00:00
neel
803db8c37c Fix a bug in the passthru implementation where it would assume that all
devices are MSI-X capable. This in turn would lead it to treat bar 0 as
the MSI-X table bar even if the underlying device did not support MSI-X.

Fix this by providing an API to query the MSI-X table index of the emulated
device. If the underlying device does not support MSI-X then this API will
return -1.

Obtained from:	NetApp
2013-02-01 02:41:47 +00:00
neel
6b8dd85cc6 Add support for MSI-X interrupts in the virtio network device and make that
the default.

The current behavior of advertising a single MSI vector can be requested by
setting the environment variable "BHYVE_USE_MSI" to "true". The use of MSI
is not compliant with the virtio specification and will be eventually phased
out.

Submitted by:	Gopakumar T
Obtained from:	NetApp
2013-01-30 04:30:36 +00:00
grehan
6112ba9b18 Improve correctness of rtc register implementation.
Submitted by:	tycho nightingale at pluribusnetworks com
2013-01-25 22:43:20 +00:00