freebsd-skq

Author	SHA1	Message	Date
robak	541b0a86fd	Capsicum support for bhyve(8). Adds Capsicum sandboxing to bhyve. Submitted by: Pawel Biernacki <pawel.biernacki@gmail.com> Reviewed by: grehan, oshogbo Approved by: emaste, grehan Sponsored by: Mysterious Code Ltd. Differential Revision: https://reviews.freebsd.org/D8290	2017-02-14 13:35:59 +00:00
glebius	224897e572	Fix possible integer overflow in guest memory bounds checking, which could lead to access from the virtual machine to the heap of the bhyve(8) process. Submitted by: Felix Wilhelm <fwilhelm ernw.de> Patch by: grehan Security: FreeBSD-SA-16:38.bhyve	2016-12-06 18:50:44 +00:00
skra	812447f90a	As <machine/param.h> is included from <sys/param.h>, there is no need to include it explicitly when <sys/param.h> is already included. Reviewed by: alc, kib Differential Revision: https://reviews.freebsd.org/D5378	2016-02-22 09:04:36 +00:00
neel	e9213dd046	Move the 'devmem' device nodes from /dev/vmm to /dev/vmm.io Some external tools just do a 'ls /dev/vmm' to figure out the bhyve virtual machines on the host. These tools break if the devmem device nodes also appear in /dev/vmm. Requested by: grehan	2015-07-06 19:41:43 +00:00
neel	e419539610	Fix a regression in "movs" emulation after r284539. The regression was caused due to a change in behavior of the 'vm_map_gpa()'. Prior to r284539 if 'vm_map_gpa()' was called to map an address range in the guest MMIO region then it would return NULL. This was used by the "movs" emulation to detect if the 'src' or 'dst' operand was in MMIO space. Post r284539 'vm_map_gpa()' started returning a non-NULL pointer even when mapping the guest MMIO region. Fix this by returning non-NULL only if [gaddr, gaddr+len) is entirely within the 'lowmem' or 'highmem' regions and NULL otherwise. Pointy hat to: neel Reviewed by: grehan Reported by: tychon, Ben Perrault (ben.perrault@gmail.com) MFC after: 1 week	2015-06-22 00:30:34 +00:00
neel	8c70d6c7af	Restructure memory allocation in bhyve to support "devmem". devmem is used to represent MMIO devices like the boot ROM or a VESA framebuffer where doing a trap-and-emulate for every access is impractical. devmem is a hybrid of system memory (sysmem) and emulated device models. devmem is mapped in the guest address space via nested page tables similar to sysmem. However the address range where devmem is mapped may be changed by the guest at runtime (e.g. by reprogramming a PCI BAR). Also devmem is usually mapped RO or RW as compared to RWX mappings for sysmem. Each devmem segment is named (e.g. "bootrom") and this name is used to create a device node for the devmem segment (e.g. /dev/vmm/testvm.bootrom). The device node supports mmap(2) and this decouples the host mapping of devmem from its mapping in the guest address space (which can change). Reviewed by: tychon Discussed with: grehan Differential Revision: https://reviews.freebsd.org/D2762 MFC after: 4 weeks	2015-06-18 06:00:17 +00:00
neel	7776059e98	Deprecate the 3-way return values from vm_gla2gpa() and vm_copy_setup(). Prior to this change both functions returned 0 for success, -1 for failure and +1 to indicate that an exception was injected into the guest. The numerical value of ERESTART also happens to be -1 so when these functions returned -1 it had to be translated to a positive errno value to prevent the VM_RUN ioctl from being inadvertently restarted. This made it easy to introduce bugs when writing emulation code. Fix this by adding an 'int *guest_fault' parameter and setting it to '1' if an exception was delivered to the guest. The return value is 0 or EFAULT so no additional translation is needed. Reviewed by: tychon MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D2428	2015-05-06 16:25:20 +00:00
tychon	b3c521e852	Fix "MOVS" instruction memory to MMIO emulation. Currently updates to %rdi, %rsi, etc are inadvertently bypassed along with the check to see if the instruction needs to be repeated per the 'rep' prefix. Add "MOVS" instruction support for the 'MMIO to MMIO' case. Reviewed by: neel	2015-04-01 00:15:31 +00:00
neel	9d0b530c89	Fix a bug in libvmmapi 'vm_copy_setup()' where it would return success even if the 'gpa' was in the guest MMIO region. This would manifest as a segmentation fault in 'vm_map_copyin()' or 'vm_map_copyout()' because 'vm_map_gpa()' would return NULL for this 'gpa'. Fix this by calling 'vm_map_gpa()' in 'vm_copy_setup' and returning a failure if the 'gpa' cannot be mapped. This matches the behavior of 'vm_copy_setup()' in vmm.ko. MFC after: 1 week	2015-01-19 06:51:04 +00:00
neel	d9f07f9841	Simplify instruction restart logic in bhyve. Keep track of the next instruction to be executed by the vcpu as 'nextrip'. As a result the VM_RUN ioctl no longer takes the %rip where a vcpu should start execution. Also, instruction restart happens implicitly via 'vm_inject_exception()' or explicitly via 'vm_restart_instruction()'. The APIs behave identically in both kernel and userspace contexts. The main beneficiary is the instruction emulation code that executes in both contexts. bhyve(8) VM exit handlers now treat 'vmexit->rip' and 'vmexit->inst_length' as readonly: - Restarting an instruction is now done by calling 'vm_restart_instruction()' as opposed to setting 'vmexit->inst_length' to 0 (e.g. emulate_inout()) - Resuming vcpu at an arbitrary %rip is now done by setting VM_REG_GUEST_RIP as opposed to changing 'vmexit->rip' (e.g. vmexit_task_switch()) Differential Revision: https://reviews.freebsd.org/D1526 Reviewed by: grehan MFC after: 2 weeks	2015-01-18 03:08:30 +00:00
neel	7aa6460c48	Replace bhyve's minimal RTC emulation with a fully featured one in vmm.ko. The new RTC emulation supports all interrupt modes: periodic, update ended and alarm. It is also capable of maintaining the date/time and NVRAM contents across virtual machine reset. Also, the date/time fields can now be modified by the guest. Since bhyve now emulates both the PIT and the RTC there is no need for "Legacy Replacement Routing" in the HPET so get rid of it. The RTC device state can be inspected via bhyvectl as follows: bhyvectl --vm=vm --get-rtc-time bhyvectl --vm=vm --set-rtc-time=<unix_time_secs> bhyvectl --vm=vm --rtc-nvram-offset=<offset> --get-rtc-nvram bhyvectl --vm=vm --rtc-nvram-offset=<offset> --set-rtc-nvram=<value> Reviewed by: tychon Discussed with: grehan Differential Revision: https://reviews.freebsd.org/D1385 MFC after: 2 weeks	2014-12-30 22:19:34 +00:00
neel	4535fa67c4	Fix fault injection in bhyve. The faulting instruction needs to be restarted when the exception handler is done handling the fault. bhyve now does this correctly by setting 'vmexit[vcpu].inst_length' to zero so the %rip is not advanced. A minor complication is that the fault injection APIs are used by instruction emulation code that is shared by vmm.ko and bhyve. Thus the argument that refers to 'struct vm ' in kernel or 'struct vmctx ' in userspace needs to be loosely typed as a 'void *'.	2014-07-24 01:38:11 +00:00
neel	e972917c13	Emulate instructions emitted by OpenBSD/i386 version 5.5: - CMP REG, r/m - MOV AX/EAX/RAX, moffset - MOV moffset, AX/EAX/RAX - PUSH r/m	2014-07-23 04:28:51 +00:00
neel	1f15eea2e0	Handle nested exceptions in bhyve. A nested exception condition arises when a second exception is triggered while delivering the first exception. Most nested exceptions can be handled serially but some are converted into a double fault. If an exception is generated during delivery of a double fault then the virtual machine shuts down as a result of a triple fault. vm_exit_intinfo() is used to record that a VM-exit happened while an event was being delivered through the IDT. If an exception is triggered while handling the VM-exit it will be treated like a nested exception. vm_entry_intinfo() is used by processor-specific code to get the event to be injected into the guest on the next VM-entry. This function is responsible for deciding the disposition of nested exceptions.	2014-07-19 20:59:08 +00:00
neel	921bfc7679	Provide APIs to directly get 'lowmem' and 'highmem' size directly. Previously the sizes were inferred indirectly based on the size of the mappings at 0 and 4GB respectively. This works fine as long as size of the allocation is identical to the size of the mapping in the guest's address space. However, if the mapping is disjoint then this assumption falls apart (e.g., due to the legacy BIOS hole between 640KB and 1MB).	2014-06-24 02:02:51 +00:00
neel	80a67d54c4	Add ioctl(VM_REINIT) to reinitialize the virtual machine state maintained by vmm.ko. This allows the virtual machine to be restarted without having to destroy it first. Reviewed by: grehan	2014-06-07 21:36:52 +00:00
neel	9c2a942387	Activate vcpus from bhyve(8) using the ioctl VM_ACTIVATE_CPU instead of doing it implicitly in vmm.ko. Add ioctl VM_GET_CPUS to get the current set of 'active' and 'suspended' cpus and display them via /usr/sbin/bhyvectl using the "--get-active-cpus" and "--get-suspended-cpus" options. This is in preparation for being able to reset virtual machine state without having to destroy and recreate it.	2014-05-31 23:37:34 +00:00
neel	31a1b31ef2	Fix issue with restarting an "insb/insw/insl" instruction because of a page fault on the destination buffer. Prior to this change a page fault would be detected in vm_copyout(). This was done after the I/O port access was done. If the I/O port access had side-effects (e.g. reading the uart FIFO) then restarting the instruction would result in incorrect behavior. Fix this by validating the guest linear address before doing the I/O port emulation. If the validation results in a page fault exception being injected into the guest then the instruction can now be restarted without any side-effects.	2014-05-26 18:21:08 +00:00
neel	51a05acc08	Add libvmmapi functions vm_copyin() and vm_copyout() to copy into and out of the guest linear address space. These APIs in turn use a new ioctl 'VM_GLA2GPA' to convert the guest linear address to guest physical. Use the new copyin/copyout APIs when emulating ins/outs instruction in bhyve(8).	2014-05-24 23:12:30 +00:00
jhb	f558af85b7	Implement a PCI interrupt router to route PCI legacy INTx interrupts to the legacy 8259A PICs. - Implement an ICH-comptabile PCI interrupt router on the lpc device with 8 steerable pins configured via config space access to byte-wide registers at 0x60-63 and 0x68-6b. - For each configured PCI INTx interrupt, route it to both an I/O APIC pin and a PCI interrupt router pin. When a PCI INTx interrupt is asserted, ensure that both pins are asserted. - Provide an initial routing of PCI interrupt router (PIRQ) pins to 8259A pins (ISA IRQs) and initialize the interrupt line config register for the corresponding PCI function with the ISA IRQ as this matches existing hardware. - Add a global _PIC method for OSPM to select the desired interrupt routing configuration. - Update the _PRT methods for PCI bridges to provide both APIC and legacy PRT tables and return the appropriate table based on the configured routing configuration. Note that if the lpc device is not configured, no routing information is provided. - When the lpc device is enabled, provide ACPI PCI link devices corresponding to each PIRQ pin. - Add a VMM ioctl to adjust the trigger mode (edge vs level) for 8259A pins via the ELCR. - Mark the power management SCI as level triggered. - Don't hardcode the number of elements in Packages in the source for the DSDT. iasl(8) will fill in the actual number of elements, and this makes it simpler to generate a Package with a variable number of elements. Reviewed by: tycho	2014-05-15 14:16:55 +00:00
neel	f14c076ec7	Don't include the guest memory segments in the bhyve(8) process core dump. This has not added a lot of value when debugging bhyve issues while greatly increasing the time and space required to store the core file. Passing the "-C" option to bhyve(8) will change the default and dump guest memory in the core dump. Requested by: grehan Reviewed by: grehan	2014-05-13 16:40:27 +00:00
neel	b616a9a2e4	Allow a virtual machine to be forcibly reset or powered off. This is done by adding an argument to the VM_SUSPEND ioctl that specifies how the virtual machine should be suspended, viz. VM_SUSPEND_RESET or VM_SUSPEND_POWEROFF. The disposition of VM_SUSPEND is also made available to the exit handler via the 'u.suspended' member of 'struct vm_exit'. This capability is exposed via the '--force-reset' and '--force-poweroff' arguments to /usr/sbin/bhyvectl. Discussed with: grehan@	2014-04-28 22:06:40 +00:00
neel	3e49998fdf	Add an ioctl to suspend a virtual machine (VM_SUSPEND). The ioctl can be called from any context i.e., it is not required to be called from a vcpu thread. The ioctl simply sets a state variable 'vm->suspend' to '1' and returns. The vcpus inspect 'vm->suspend' in the run loop and if it is set to '1' the vcpu breaks out of the loop with a reason of 'VM_EXITCODE_SUSPENDED'. The suspend handler waits until all 'vm->active_cpus' have transitioned to 'vm->suspended_cpus' before returning to userspace. Discussed with: grehan	2014-03-26 23:34:27 +00:00
tychon	25c8b61cfd	Replace the userspace atpic stub with a more functional vmm.ko model. New ioctls VM_ISA_ASSERT_IRQ, VM_ISA_DEASSERT_IRQ and VM_ISA_PULSE_IRQ can be used to manipulate the pic, and optionally the ioapic, pin state. Reviewed by: jhb, neel Approved by: neel (co-mentor)	2014-03-11 16:56:00 +00:00
neel	e01c440dae	Queue pending exceptions in the 'struct vcpu' instead of directly updating the processor-specific VMCS or VMCB. The pending exception will be delivered right before entering the guest. The order of event injection into the guest is: - hardware exception - NMI - maskable interrupt In the Intel VT-x case, a pending NMI or interrupt will enable the interrupt window-exiting and inject it as soon as possible after the hardware exception is injected. Also since interrupts are inherently asynchronous, injecting them after the hardware exception should not affect correctness from the guest perspective. Rename the unused ioctl VM_INJECT_EVENT to VM_INJECT_EXCEPTION and restrict it to only deliver x86 hardware exceptions. This new ioctl is now used to inject a protection fault when the guest accesses an unimplemented MSR. Discussed with: grehan, jhb Reviewed by: jhb	2014-02-26 00:52:05 +00:00
jhb	3f6ca218c6	Enhance the support for PCI legacy INTx interrupts and enable them in the virtio backends. - Add a new ioctl to export the count of pins on the I/O APIC from vmm to the hypervisor. - Use pins on the I/O APIC >= 16 for PCI interrupts leaving 0-15 for ISA interrupts. - Populate the MP Table with I/O interrupt entries for any PCI INTx interrupts. - Create a _PRT table under the PCI root bridge in ACPI to route any PCI INTx interrupts appropriately. - Track which INTx interrupts are in use per-slot so that functions that share a slot attempt to distribute their INTx interrupts across the four available pins. - Implicitly mask INTx interrupts if either MSI or MSI-X is enabled and when the INTx DIS bit is set in a function's PCI command register. Either assert or deassert the associated I/O APIC pin when the state of one of those conditions changes. - Add INTx support to the virtio backends. - Always advertise the MSI capability in the virtio backends. Submitted by: neel (7) Reviewed by: neel MFC after: 2 weeks	2014-01-29 14:56:48 +00:00
jhb	8ab82a5fe1	Extend the support for local interrupts on the local APIC: - Add a generic routine to trigger an LVT interrupt that supports both fixed and NMI delivery modes. - Add an ioctl and bhyvectl command to trigger local interrupts inside a guest. In particular, a global NMI similar to that raised by SERR# or PERR# can be simulated by asserting LINT1 on all vCPUs. - Extend the LVT table in the vCPU local APIC to support CMCI. - Flesh out the local APIC error reporting a bit to cache errors and report them via ESR when ESR is written to. Add support for asserting the error LVT when an error occurs. Raise illegal vector errors when attempting to signal an invalid vector for an interrupt or when sending an IPI. - Ignore writes to reserved bits in LVT entries. - Export table entries the MADT and MP Table advertising the stock x86 config of LINT0 set to ExtInt and LINT1 wired to NMI. Reviewed by: neel (earlier version)	2013-12-23 19:29:07 +00:00
neel	e24d040187	Rename the ambiguously named 'vm_setup_msi()' and 'vm_setup_msix()' to 'vm_setup_pptdev_msi()' and 'vm_setup_pptdev_msix()' respectively. It should now be clear that these functions operate on passthru devices.	2013-12-18 03:58:51 +00:00
neel	e62c100b90	Add an API to deliver message signalled interrupts to vcpus. This allows callers treat the MSI 'addr' and 'data' fields as opaque and also lets bhyve implement multiple destination modes: physical, flat and clustered. Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com) Reviewed by: grehan@	2013-12-16 19:59:31 +00:00
neel	89dbc92028	Add HPET device emulation to bhyve. bhyve supports a single timer block with 8 timers. The timers are all 32-bit and capable of being operated in periodic mode. All timers support interrupt delivery using MSI. Timers 0 and 1 also support legacy interrupt routing. At the moment the timers are not connected to any ioapic pins but that will be addressed in a subsequent commit. This change is based on a patch from Tycho Nightingale (tycho.nightingale@pluribusnetworks.com).	2013-11-25 19:04:51 +00:00
neel	3b87354d1e	Add an ioctl to assert and deassert an ioapic pin atomically. This will be used to inject edge triggered legacy interrupts into the guest. Start using the new API in device models that use edge triggered interrupts: viz. the 8254 timer and the LPC/uart device emulation. Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)	2013-11-23 03:56:03 +00:00
neel	384d86e888	Move the ioapic device model from userspace into vmm.ko. This is needed for upcoming in-kernel device emulations like the HPET. The ioctls VM_IOAPIC_ASSERT_IRQ and VM_IOAPIC_DEASSERT_IRQ are used to manipulate the ioapic pin state. Discussed with: grehan@ Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)	2013-11-12 22:51:03 +00:00
neel	75369cb181	Add a new capability, VM_CAP_ENABLE_INVPCID, that can be enabled to expose 'invpcid' instruction to the guest. Currently bhyve will try to enable this capability unconditionally if it is available. Consolidate code in bhyve to set the capabilities so it is no longer duplicated in BSP and AP bringup. Add a sysctl 'vm.pmap.invpcid_works' to display whether the 'invpcid' instruction is available. Reviewed by: grehan MFC after: 3 days	2013-10-16 18:20:27 +00:00
neel	f9f9a7e617	Parse the memory size parameter using expand_number() to allow specifying the memory size more intuitively (e.g. 512M, 4G etc). Submitted by: rodrigc Reviewed by: grehan Approved by: re (blanket)	2013-10-09 03:56:07 +00:00
neel	aed205d5cd	Merge projects/bhyve_npt_pmap into head. Make the amd64/pmap code aware of nested page table mappings used by bhyve guests. This allows bhyve to associate each guest with its own vmspace and deal with nested page faults in the context of that vmspace. This also enables features like accessed/dirty bit tracking, swapping to disk and transparent superpage promotions of guest memory. Guest vmspace: Each bhyve guest has a unique vmspace to represent the physical memory allocated to the guest. Each memory segment allocated by the guest is mapped into the guest's address space via the 'vmspace->vm_map' and is backed by an object of type OBJT_DEFAULT. pmap types: The amd64/pmap now understands two types of pmaps: PT_X86 and PT_EPT. The PT_X86 pmap type is used by the vmspace associated with the host kernel as well as user processes executing on the host. The PT_EPT pmap is used by the vmspace associated with a bhyve guest. Page Table Entries: The EPT page table entries as mostly similar in functionality to regular page table entries although there are some differences in terms of what bits are used to express that functionality. For e.g. the dirty bit is represented by bit 9 in the nested PTE as opposed to bit 6 in the regular x86 PTE. Therefore the bitmask representing the dirty bit is now computed at runtime based on the type of the pmap. Thus PG_M that was previously a macro now becomes a local variable that is initialized at runtime using 'pmap_modified_bit(pmap)'. An additional wrinkle associated with EPT mappings is that older Intel processors don't have hardware support for tracking accessed/dirty bits in the PTE. This means that the amd64/pmap code needs to emulate these bits to provide proper accounting to the VM subsystem. This is achieved by using the following mapping for EPT entries that need emulation of A/D bits: Bit Position Interpreted By PG_V 52 software (accessed bit emulation handler) PG_RW 53 software (dirty bit emulation handler) PG_A 0 hardware (aka EPT_PG_RD) PG_M 1 hardware (aka EPT_PG_WR) The idea to use the mapping listed above for A/D bit emulation came from Alan Cox (alc@). The final difference with respect to x86 PTEs is that some EPT implementations do not support superpage mappings. This is recorded in the 'pm_flags' field of the pmap. TLB invalidation: The amd64/pmap code has a number of ways to do invalidation of mappings that may be cached in the TLB: single page, multiple pages in a range or the entire TLB. All of these funnel into a single EPT invalidation routine called 'pmap_invalidate_ept()'. This routine bumps up the EPT generation number and sends an IPI to the host cpus that are executing the guest's vcpus. On a subsequent entry into the guest it will detect that the EPT has changed and invalidate the mappings from the TLB. Guest memory access: Since the guest memory is no longer wired we need to hold the host physical page that backs the guest physical page before we can access it. The helper functions 'vm_gpa_hold()/vm_gpa_release()' are available for this purpose. PCI passthru: Guest's with PCI passthru devices will wire the entire guest physical address space. The MMIO BAR associated with the passthru device is backed by a vm_object of type OBJT_SG. An IOMMU domain is created only for guest's that have one or more PCI passthru devices attached to them. Limitations: There isn't a way to map a guest physical page without execute permissions. This is because the amd64/pmap code interprets the guest physical mappings as user mappings since they are numerically below VM_MAXUSER_ADDRESS. Since PG_U shares the same bit position as EPT_PG_EXECUTE all guest mappings become automatically executable. Thanks to Alan Cox and Konstantin Belousov for their rigorous code reviews as well as their support and encouragement. Thanks for John Baldwin for reviewing the use of OBJT_SG as the backing object for pci passthru mmio regions. Special thanks to Peter Holm for testing the patch on short notice. Approved by: re Discussed with: grehan Reviewed by: alc, kib Tested by: pho	2013-10-05 21:22:35 +00:00
neel	4f5865691e	Remove deprecated APIs to get the total and free memory available to vmm.ko. These APIs were relevant when memory for virtual machine allocation was hard partitioned away from the rest of the system but that is no longer the case. The sysctls that provided this information were garbage collected a while back. Obtained from: NetApp	2013-04-25 20:42:21 +00:00
neel	8d05d984e8	Simplify the assignment of memory to virtual machines by requiring a single command line option "-m <memsize in MB>" to specify the memory size. Prior to this change the user needed to explicitly specify the amount of memory allocated below 4G (-m <lowmem>) and the amount above 4G (-M <highmem>). The "-M" option is no longer supported by 'bhyveload' and 'bhyve'. The start of the PCI hole is fixed at 3GB and cannot be directly changed using command line options. However it is still possible to change this in special circumstances via the 'vm_set_lowmem_limit()' API provided by libvmmapi. Submitted by: Dinakar Medavaram (initial version) Reviewed by: grehan Obtained from: NetApp	2013-03-18 22:38:30 +00:00
neel	3a9eeaa765	Implement guest vcpu pinning using 'pthread_setaffinity_np(3)'. Prior to this change pinning was implemented via an ioctl (VM_SET_PINNING) that called 'sched_bind()' on behalf of the user thread. The ULE implementation of 'sched_bind()' bumps up 'td_pinned' which in turn runs afoul of the assertion '(td_pinned == 0)' in userret(). Using the cpuset affinity to implement pinning of the vcpu threads works with both 4BSD and ULE schedulers and has the happy side-effect of getting rid of a bunch of code in vmm.ko. Discussed with: grehan	2013-02-11 20:36:07 +00:00
grehan	1372a368e0	Remove mptable generation code from libvmmapi and move it to bhyve. Firmware tables require too much knowledge of system configuration, and it's difficult to pass that information in general terms to a library. The upcoming ACPI work exposed this - it will also livein bhyve. Also, remove code specific to NetApp from the mptable name, and remove the -n option from bhyve. Reviewed by: neel Obtained from: NetApp	2012-10-26 13:40:12 +00:00
neel	a62f9562ca	Add an api to map a vm capability type into a string to be used for display purposes.	2012-10-12 17:39:28 +00:00
neel	09939583a7	The ioctl VM_GET_MEMORY_SEG is no longer able to return the host physical address associated with the guest memory segment. This is because there is no longer a 1:1 mapping between GPA and HPA. As a result 'vmmctl' can only display the guest physical address and the length of the lowmem and highmem segments.	2012-10-04 03:07:05 +00:00
neel	18dd2c0d51	Change vm_malloc() to map pages in the guest physical address space in 4KB chunks. This breaks the assumption that the entire memory segment is contiguously allocated in the host physical address space. This also paves the way to satisfy the 4KB page allocations by requesting free pages from the VM subsystem as opposed to hard-partitioning host memory at boot time.	2012-10-04 02:27:14 +00:00
neel	ebdd69568d	Add ioctls to control the X2APIC capability exposed by the virtual machine to the guest. At the moment this simply sets the state in the 'vcpu' instance but there is no code that acts upon these settings.	2012-09-25 19:08:51 +00:00
grehan	6c5ad005be	Add sysctls to display the total and free amount of hard-wired mem for VMs # sysctl hw.vmm hw.vmm.mem_free: 2145386496 hw.vmm.mem_total: 2145386496 Submitted by: Takeshi HASEGAWA hasegaw at gmail com	2012-08-26 01:41:41 +00:00
neel	ac88464d24	Allow the 'bhyve' process to control whether or not the virtual machine sees an ioapic. Obtained from: NetApp	2012-08-04 22:46:29 +00:00
neel	f07527ef60	API to map an apic id to the vcpu. At the moment this is a simple mapping because the numerical values are identical.	2012-08-04 02:38:05 +00:00
neel	19e3bb399c	There is no need to explicitly specify the CR4_VMXE bit when writing to guest CR4. This bit is specific to the Intel VTX and removing it makes the library more portable to AMD/SVM. In the Intel VTX implementation, the hypervisor will ensure that this bit is always set. See vmx_fix_cr4() for details. Suggested by: grehan	2012-08-04 02:14:27 +00:00
grehan	9f0c999f81	MSI-x interrupt support for PCI pass-thru devices. Includes instruction emulation for memory r/w access. This opens the door for io-apic, local apic, hpet timer, and legacy device emulation. Submitted by: ryan dot berryhill at sandvine dot com Reviewed by: grehan Obtained from: Sandvine	2012-04-28 16:28:00 +00:00
jhb	dc43ebc8b3	First cut to port bhyve, vmmctl, and libvmmapi to HEAD.	2011-05-15 04:03:11 +00:00
grehan	d45b7f14ae	Import of bhyve hypervisor and utilities, part 1. vmm.ko - kernel module for VT-x, VT-d and hypervisor control bhyve - user-space sequencer and i/o emulation vmmctl - dump of hypervisor register state libvmm - front-end to vmm.ko chardev interface bhyve was designed and implemented by Neel Natu. Thanks to the following folk from NetApp who helped to make this available: Joe CaraDonna Peter Snyder Jeff Heller Sandeep Mann Steve Miller Brian Pawlowski	2011-05-13 04:54:01 +00:00

50 Commits