183447 Commits

Author SHA1 Message Date
neel
1e4c5ce626 Probe for existence of the bvm debug port instead of just assuming that it is
always present.

Suggested by:	grehan
Obtained from:	NetApp
2012-10-27 22:54:23 +00:00
neel
86d868af7f Present the bvm console device to the guest only when explicitly requested via
the "-b" command line option.

Reviewed by:	grehan
Obtained from:	NetApp
2012-10-27 22:33:23 +00:00
neel
bd0ca87b04 Ignore PCI configuration accesses to all bus numbers other than PCI bus 0.
Obtained from:	NetApp
2012-10-27 02:39:08 +00:00
grehan
ed9d132c4e Rename vmmctl to bhyvectl. 'vmmctl' came from a pre-bhyve
internal codebase at NetApp. No need for it to have an
unrelated name to the other userspace utils.

Reviewed by:	neel
Obtained from:	NetApp
2012-10-27 02:10:45 +00:00
grehan
dc37578ed2 Set the valid field of the newly allocated page as all other
vm page allocators do. This fixes a panic when a virtio block
device is mounted as root, with the host system dying in
vm_page_dirty with invalid bits.

Reviewed by:	neel
Obtained from:	NetApp
2012-10-26 22:32:26 +00:00
grehan
1372a368e0 Remove mptable generation code from libvmmapi and move it to bhyve.
Firmware tables require too much knowledge of system configuration,
and it's difficult to pass that information in general terms to a library.
The upcoming ACPI work exposed this - it will also live in bhyve.

Also, remove code specific to NetApp from the mptable name, and remove
the -n option from bhyve.

Reviewed by:	neel
Obtained from:	NetApp
2012-10-26 13:40:12 +00:00
neel
cbd59fc940 Unconditionally enable fpu emulation by setting CR0.TS in the host after the
guest does a vm exit.

This allows us to trap any fpu access in the host context while the fpu still
has "dirty" state belonging to the guest.

Reported by: "s vas" on freebsd-virtualization@
Obtained from:	NetApp
2012-10-26 03:12:40 +00:00
neel
bcb3589583 If the guest vcpu wants to idle then use that opportunity to relinquish the
host cpu to the scheduler until the guest is ready to run again.

This implies that the host cpu utilization will now closely mirror the actual
load imposed by the guest vcpu.

Also, the vcpu mutex now needs to be of type MTX_SPIN since we need to acquire
it inside a critical section.

Obtained from:	NetApp
2012-10-25 04:29:21 +00:00
neel
80aee5fb8a Hide the monitor/mwait instruction capability from the guest until we know how
to properly intercept it.

Obtained from:	NetApp
2012-10-25 04:08:26 +00:00
neel
f5d9223df5 Fix typo: host_rip -> host_rsp
Obtained from:	NetApp
2012-10-25 03:39:36 +00:00
neel
583a9ef76d Maintain state regarding NMI delivery to guest vcpu in VT-x independent manner.
Also add a stats counter to count the number of NMIs delivered per vcpu.

Obtained from:	NetApp
2012-10-24 02:54:21 +00:00
neel
a74007510a Test for AST pending with interrupts disabled right before entering the guest.
If an IPI was delivered to this cpu before interrupts were disabled
then return right away via vmx_setjmp() with a return value of VMX_RETURN_AST.

Obtained from:	NetApp
2012-10-23 02:20:42 +00:00
neel
26dd051c2c Calculate the number of host ticks until the next guest timer interrupt.
This information will be used in conjunction with guest "HLT exiting" to
yield the thread hosting the virtual cpu.

Obtained from:	NetApp
2012-10-20 08:23:05 +00:00
grehan
beaad57fa0 Rework how guest MMIO regions are dealt with.
- New memory region interface. An RB tree holds the regions,
with a last-found per-vCPU cache to deal with the common case
of repeated guest accesses to MMIO registers in the same page.

- Support memory-mapped BARs in PCI emulation.

 mem.c/h - memory region interface

 instruction_emul.c/h - remove old region interface.
 Use gpa from EPT exit to avoid a tablewalk to
 determine operand address. Determine operand size
 and use when calling through to region handler.

 fbsdrun.c - call into region interface on paging
  exit. Distinguish between instruction emul error
  and region not found

 pci_emul.c/h - implement new BAR callback api.
 Split BAR alloc routine into routines that
 require/don't require the BAR phys address.

 ioapic.c
 pci_passthru.c
 pci_virtio_block.c
 pci_virtio_net.c
 pci_uart.c  - update to new BAR callback i/f

Reviewed by:	neel
Obtained from:	NetApp
2012-10-19 18:11:17 +00:00
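A minimal sketch of the lookup pattern described in beaad57fa0: check a per-vCPU "last found" region first, and fall back to an ordered search only on a miss. This is not the code in mem.c. A sorted array with binary search stands in for the RB tree to keep the example short, and every name here (mmio_region, region_table, vcpu_cache, region_lookup) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor: covers [base, base + size). */
struct mmio_region {
	uint64_t base;
	uint64_t size;
};

/* Stand-in for the RB tree: regions sorted by base, non-overlapping. */
struct region_table {
	const struct mmio_region *regions;
	size_t count;
};

/* One of these per vCPU; remembers the most recent hit. */
struct vcpu_cache {
	const struct mmio_region *last;
};

static int
region_contains(const struct mmio_region *r, uint64_t gpa)
{
	return (gpa >= r->base && gpa - r->base < r->size);
}

static const struct mmio_region *
region_lookup(const struct region_table *t, struct vcpu_cache *c, uint64_t gpa)
{
	size_t lo = 0, hi = t->count;

	/* Fast path: repeated accesses to the same region skip the search. */
	if (c->last != NULL && region_contains(c->last, gpa))
		return (c->last);

	/* Slow path: ordered search, then refresh the cache on a hit. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct mmio_region *r = &t->regions[mid];

		if (region_contains(r, gpa)) {
			c->last = r;
			return (r);
		}
		if (gpa < r->base)
			hi = mid;
		else
			lo = mid + 1;
	}
	return (NULL);	/* no region claims this address */
}
```

A miss leaves the cache untouched, so an access to an unmapped address does not evict the entry that serves the common case.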
grehan
8fb5b5f8de Add the guest physical address and r/w/x bits to
the paging exit in preparation for a rework of
bhyve MMIO handling.

Reviewed by:	neel
Obtained from:	NetApp
2012-10-12 23:12:19 +00:00
neel
4650e5d776 Deal with transient EBUSY error return from vm_run() by retrying the operation. 2012-10-12 18:49:07 +00:00
neel
e3e8a520e2 Provide per-vcpu locks instead of relying on a single big lock.
This also gets rid of all the witness.watch warnings related to calling
malloc(M_WAITOK) while holding a mutex.

Reviewed by:	grehan
2012-10-12 18:32:44 +00:00
neel
4829fce72f Output the value of all capabilities when the "--getcap" option is used without
a "--capname=<capname>". Do the same for the "--get-all" option.
2012-10-12 18:14:54 +00:00
neel
a62f9562ca Add an api to map a vm capability type into a string to be used for display
purposes.
2012-10-12 17:39:28 +00:00
neel
97c20149fa Fix warnings generated by 'debug.witness.watch' during VM creation and
destruction for calling malloc() with M_WAITOK while holding a mutex.

Do not allow vmm.ko to be unloaded until all virtual machines are destroyed.
2012-10-11 19:39:54 +00:00
neel
d09cf38e25 Deliver the MSI to the correct guest virtual cpu.
Prior to this change the MSI was being delivered unconditionally to vcpu 0
regardless of how the guest programmed the MSI delivery.
2012-10-11 19:28:07 +00:00
neel
364c9ec6f9 Grab the softc from the ACPI host-pci bridge device instead of from the pci
endpoint device.

Reviewed by:	jhb
2012-10-10 00:11:06 +00:00
neel
ca6e3cf930 Allocate memory pages for the guest from the host's free page queue.
It is no longer necessary to hard-partition the memory between the host
and guests at boot time.
2012-10-08 23:41:26 +00:00
grehan
89c25d5adf Clarify comment about default number of FICL dictionary cells.
Suggested by:	peterj
2012-10-04 03:59:45 +00:00
neel
09939583a7 The ioctl VM_GET_MEMORY_SEG is no longer able to return the host physical
address associated with the guest memory segment. This is because there is
no longer a 1:1 mapping between GPA and HPA.

As a result 'vmmctl' can only display the guest physical address and the
length of the lowmem and highmem segments.
2012-10-04 03:07:05 +00:00
neel
18dd2c0d51 Change vm_malloc() to map pages in the guest physical address space in 4KB
chunks. This breaks the assumption that the entire memory segment is
contiguously allocated in the host physical address space.

This also paves the way to satisfy the 4KB page allocations by requesting
free pages from the VM subsystem as opposed to hard-partitioning host memory
at boot time.
2012-10-04 02:27:14 +00:00
grehan
cdb0dba22b Allow the number of FICL dictionary cells to be overridden.
Loading a 7.3 ISO with userboot/amd64 takes up 10035 cells,
overflowing the long-standing default of 10000.

Bump userboot's value up to 15000 cells.
2012-10-03 04:22:39 +00:00
grehan
edf1984a8f Rework the GPT/MBR/raw policy so that it actually works, and navigates
around disk_open's current handling of falling back from GPT to MBR.

As in the previous commit, this should all be fixed in CURRENT.
2012-10-03 03:00:37 +00:00
grehan
8ad01bb5de Restore the ability to boot partitioned disks. The previous submit
broke that by forcing raw disks, due to the use of error returns
by userboot's initial disk opens.
2012-10-03 02:58:55 +00:00
neel
77ab4804ac Get rid of assumptions in the hypervisor that the host physical memory
associated with guest physical memory is contiguous.

Add check to vm_gpa2hpa() that the range indicated by [gpa,gpa+len) is all
contained within a single 4KB page.
2012-10-03 01:18:51 +00:00
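The single-page check that 77ab4804ac adds to vm_gpa2hpa() could look roughly like the sketch below. It is an illustration only, not the committed code. The helper name gpa_range_in_one_page and the GUEST_PAGE_SIZE macro are assumptions, and rejecting a zero-length range is a choice made here rather than something the commit message states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUEST_PAGE_SIZE 4096ULL	/* assumption: 4KB pages, per the commit text */

/*
 * Hypothetical helper: true if [gpa, gpa + len) lies within one 4KB page.
 * The test works on the offset within the page, so gpa + len is never
 * computed and cannot overflow.
 */
static bool
gpa_range_in_one_page(uint64_t gpa, size_t len)
{
	uint64_t offset = gpa & (GUEST_PAGE_SIZE - 1);

	if (len == 0 || len > GUEST_PAGE_SIZE)
		return (false);
	return (offset + len <= GUEST_PAGE_SIZE);
}
```

With such a check in place, a caller that needs a larger range has to translate it one page at a time, which matches the move away from contiguous host memory.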
neel
3e50e0220b Get rid of assumptions in the hypervisor that the host physical memory
associated with guest physical memory is contiguous.

Rewrite vm_gpa2hpa() to get the GPA to HPA mapping by querying the nested
page tables.
2012-10-03 00:46:30 +00:00
grehan
c17623c804 Fix the error return in disk_readslicetab() when an MBR/GPT partition
wasn't found, and use that in userdisk_open() to allow raw disks
and ISO images to be read.

This is a temporary fix - disk.c has changed a lot in CURRENT so this
code may be reworked or made redundant on the next IFC. It is useful
to be able to boot from CD in the meantime.
2012-10-02 04:41:43 +00:00
grehan
dd517bc793 Add cd9660 support to userboot to allow CD boot. 2012-10-02 04:36:37 +00:00
neel
bc87f08e98 Get rid of assumptions in the hypervisor that the host physical memory
associated with guest physical memory is contiguous.

In this case vm_malloc() was using vm_gpa2hpa() to indirectly infer whether
or not the address range had already been allocated.

Replace this instead with an explicit API 'vm_gpa_available()' that returns
TRUE if a page is available for allocation in guest physical address space.
2012-09-29 01:15:45 +00:00
neel
b65259b285 Intel VT-x provides the length of the instruction at the time of the nested
page table fault. Use this when fetching the instruction bytes from the guest
memory.

Also modify the lapic_mmio() API so that a decoded instruction is fed into it
instead of having it fetch the instruction bytes from the guest. This is
useful for hardware assists like SVM that provide the faulting instruction
as part of the vmexit.
2012-09-27 00:27:58 +00:00
neel
5dbc1ca26a Add an option "-a" to present the local apic in the XAPIC mode instead of the
default X2APIC mode to the guest.
2012-09-26 00:06:17 +00:00
neel
bc269b51af Add support for trapping MMIO writes to local apic registers and emulating them.
The default behavior is still to present the local apic to the guest in the
x2apic mode.
2012-09-25 22:31:35 +00:00
neel
ebdd69568d Add ioctls to control the X2APIC capability exposed by the virtual machine to
the guest.

At the moment this simply sets the state in the 'vcpu' instance but there is
no code that acts upon these settings.
2012-09-25 19:08:51 +00:00
neel
c34be7b811 Add an explicit exit code 'SPINUP_AP' to tell the controlling process that an
AP needs to be activated by spinning up an execution context for it.

The local apic emulation is now completely done in the hypervisor and it will
detect writes to the ICR_LO register that try to bring up the AP. In response
to such writes it will return to userspace with an exit code of SPINUP_AP.

Reviewed by: grehan
2012-09-25 02:33:25 +00:00
neel
34b672cc8a Stash the 'vm_exit' information in each 'struct vcpu'.
There is no functional change at this time but this paves the way for vm exit
handler functions to easily modify the exit reason going forward.
2012-09-24 19:32:24 +00:00
neel
c0caea8c2f Restructure the x2apic access code in preparation for supporting memory mapped
access to the local apic.

The vlapic code is now aware of the mode that the guest is using to access the
local apic.

Reviewed by: grehan@
2012-09-21 03:09:23 +00:00
grehan
6c5ad005be Add sysctls to display the total and free amount of hard-wired mem for VMs
# sysctl hw.vmm
   hw.vmm.mem_free: 2145386496
   hw.vmm.mem_total: 2145386496

Submitted by:	Takeshi HASEGAWA hasegaw at gmail com
2012-08-26 01:41:41 +00:00
neel
75106bd298 Fix a bug in how a 64-bit bar in a pci passthru device would be presented to
the guest. Prior to the fix it was possible for such a bar to appear as a
32-bit bar as long as it was allocated from the region below 4GB.

This had the potential to confuse some drivers that were particular about
the size of the bars.

Obtained from:	NetApp
2012-08-06 07:20:25 +00:00
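For context on 75106bd298: in the PCI specification, a memory BAR has bit 0 clear and carries its type in bits 2:1, where 10b marks a 64-bit BAR that also occupies the next BAR slot for the upper 32 bits. The sketch below shows that encoding only. It is not the pci_passthru.c code, and the names bar64_encode and bar_is_64bit are hypothetical.

```c
#include <stdint.h>

#define BAR_MEM_TYPE_MASK	0x6u	/* bits 2:1 of a memory BAR */
#define BAR_MEM_TYPE_64		0x4u	/* 10b: 64-bit memory BAR */

/*
 * Split a 64-bit BAR address into the two dwords the guest reads.
 * The type bits stay 64-bit even when the address is below 4GB.
 */
static void
bar64_encode(uint64_t addr, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)(addr & 0xfffffff0u) | BAR_MEM_TYPE_64;
	*hi = (uint32_t)(addr >> 32);
}

/* True if the low dword describes a 64-bit memory BAR. */
static int
bar_is_64bit(uint32_t lo)
{
	return ((lo & 0x1u) == 0 &&
	    (lo & BAR_MEM_TYPE_MASK) == BAR_MEM_TYPE_64);
}
```

The point of the sketch is that the type bits describe the BAR itself, not where it happens to be allocated, so an address below 4GB must not turn a 64-bit BAR into a 32-bit one.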
neel
d4dec74190 Add support for emulating PCI multi-function devices.
The function number is specified by an optional [:<func>] after the slot
number: -s 1:0,virtio-net,tap0

Ditto for the mptable naming: -n 1:0,e0a

Obtained from:	NetApp
2012-08-06 06:51:27 +00:00
neel
5bfba16f73 Device model for ioapic emulation.
With this change the uart emulation is entirely interrupt driven.

Obtained from: NetApp
2012-08-05 00:00:52 +00:00
neel
9f954dc599 The displacement field in the decoded instruction should be treated as an 8-bit
or 32-bit signed integer.

Simplify the handling of indirect addressing with displacement by
unconditionally adding the 'instruction->disp' to the target address.
This is alright since 'instruction->disp' is non-zero only for the
addressing modes that specify a displacement.

Obtained from: NetApp
2012-08-04 23:51:21 +00:00
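The displacement handling in 9f954dc599 can be illustrated with the sketch below: sign-extend the 8-bit or 32-bit displacement once at decode time, then add it unconditionally, since it is zero for addressing modes without a displacement. The struct and function names are hypothetical and this is not the instruction_emul.c code.

```c
#include <stdint.h>

/* Hypothetical decoded-instruction record; only the field used here. */
struct decoded_insn {
	int64_t disp;	/* sign-extended; 0 if the mode has no displacement */
};

/*
 * Sign-extend a little-endian displacement of 1 or 4 bytes.
 * Any other length is treated as "no displacement".
 */
static int64_t
disp_from_bytes(const uint8_t *p, int nbytes)
{
	if (nbytes == 1)
		return ((int8_t)p[0]);
	if (nbytes == 4) {
		uint32_t v = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
		    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
		return ((int32_t)v);
	}
	return (0);
}

/* Unconditional add: a negative displacement wraps to a subtraction. */
static uint64_t
effective_address(uint64_t base, const struct decoded_insn *insn)
{
	return (base + (uint64_t)insn->disp);
}
```

Treating the byte 0xf8 as unsigned would add 248 to the base, whereas the signed reading subtracts 8, which is the difference the commit is about.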
neel
72240bed65 Add the "-I" option to control whether or not an ioapic is visible to the guest.
Obtained from: NetApp
2012-08-04 22:48:04 +00:00
neel
ac88464d24 Allow the 'bhyve' process to control whether or not the virtual machine sees an
ioapic.

Obtained from: NetApp
2012-08-04 22:46:29 +00:00
neel
66c8120152 Include 'device uart' in the guest kernel. 2012-08-04 04:30:26 +00:00
neel
5b0a521317 Use the correct variable to index into the 'lirq[]' array to check the legacy
IRQ ownership.
2012-08-04 04:26:17 +00:00