freebsd-nq

Author	SHA1	Message	Date
Neel Natu	c17d4a83b8	Add a comment explaining the intent behind the I/O reservation [0x72-0x77].	2014-10-26 21:17:44 +00:00
Neel Natu	be679db4cd	Provide APIs to directly get 'lowmem' and 'highmem' size directly. Previously the sizes were inferred indirectly based on the size of the mappings at 0 and 4GB respectively. This works fine as long as size of the allocation is identical to the size of the mapping in the guest's address space. However, if the mapping is disjoint then this assumption falls apart (e.g., due to the legacy BIOS hole between 640KB and 1MB).	2014-06-24 02:02:51 +00:00
John Baldwin	e6c8bc291a	Rework the DSDT generation code a bit to generate more accurate info about LPC devices. Among other things, the LPC serial ports now appear as ACPI devices. - Move the info for the top-level PCI bus into the PCI emulation code and add ResourceProducer entries for the memory ranges decoded by the bus for memory BARs. - Add a framework to allow each PCI emulation driver to optionally write an entry into the DSDT under the \_SB_.PCI0 namespace. The LPC driver uses this to write a node for the LPC bus (\_SB_.PCI0.ISA). - Add a linker set to allow any LPC devices to write entries into the DSDT below the LPC node. - Move the existing DSDT block for the RTC to the RTC driver. - Add DSDT nodes for the AT PIC, the 8254 ISA timer, and the LPC UART devices. - Add a "SuperIO" device under the LPC node to claim "system resources" aling with a linker set to allow various drivers to add IO or memory ranges that should be claimed as a system resource. - Add system resource entries for the extended RTC IO range, the registers used for ACPI power management, the ELCR, PCI interrupt routing register, and post data register. - Add various helper routines for generating DSDT entries. Reviewed by: neel (earlier version)	2014-01-02 21:26:59 +00:00
Peter Grehan	062b878f58	Changes required for OpenBSD/amd64: - Allow a hostbridge to be created with AMD as a vendor. This passes the OpenBSD check to allow the use of MSI on a PCI bus. - Enable the i/o interrupt section of the mptable, and populate it with unity ISA mappings. This allows the 'legacy' IRQ mappings of the PCI serial port to be set up. Delete unused print routine that was obscuring code. - Use the '-W' option to enable virtio single-vector MSI rather than an environment variable. Update the virtio net/block drivers to query this flag when setting up interrupts.: bhyverun.c - Fix the arithmetic used to derive the century byte in RTC CMOS, as well as encoding it in BCD. Reviewed by: neel MFC after: 3 days	2013-10-17 22:01:17 +00:00
Neel Natu	318224bbe6	Merge projects/bhyve_npt_pmap into head. Make the amd64/pmap code aware of nested page table mappings used by bhyve guests. This allows bhyve to associate each guest with its own vmspace and deal with nested page faults in the context of that vmspace. This also enables features like accessed/dirty bit tracking, swapping to disk and transparent superpage promotions of guest memory. Guest vmspace: Each bhyve guest has a unique vmspace to represent the physical memory allocated to the guest. Each memory segment allocated by the guest is mapped into the guest's address space via the 'vmspace->vm_map' and is backed by an object of type OBJT_DEFAULT. pmap types: The amd64/pmap now understands two types of pmaps: PT_X86 and PT_EPT. The PT_X86 pmap type is used by the vmspace associated with the host kernel as well as user processes executing on the host. The PT_EPT pmap is used by the vmspace associated with a bhyve guest. Page Table Entries: The EPT page table entries as mostly similar in functionality to regular page table entries although there are some differences in terms of what bits are used to express that functionality. For e.g. the dirty bit is represented by bit 9 in the nested PTE as opposed to bit 6 in the regular x86 PTE. Therefore the bitmask representing the dirty bit is now computed at runtime based on the type of the pmap. Thus PG_M that was previously a macro now becomes a local variable that is initialized at runtime using 'pmap_modified_bit(pmap)'. An additional wrinkle associated with EPT mappings is that older Intel processors don't have hardware support for tracking accessed/dirty bits in the PTE. This means that the amd64/pmap code needs to emulate these bits to provide proper accounting to the VM subsystem. This is achieved by using the following mapping for EPT entries that need emulation of A/D bits: Bit Position Interpreted By PG_V 52 software (accessed bit emulation handler) PG_RW 53 software (dirty bit emulation handler) PG_A 0 hardware (aka EPT_PG_RD) PG_M 1 hardware (aka EPT_PG_WR) The idea to use the mapping listed above for A/D bit emulation came from Alan Cox (alc@). The final difference with respect to x86 PTEs is that some EPT implementations do not support superpage mappings. This is recorded in the 'pm_flags' field of the pmap. TLB invalidation: The amd64/pmap code has a number of ways to do invalidation of mappings that may be cached in the TLB: single page, multiple pages in a range or the entire TLB. All of these funnel into a single EPT invalidation routine called 'pmap_invalidate_ept()'. This routine bumps up the EPT generation number and sends an IPI to the host cpus that are executing the guest's vcpus. On a subsequent entry into the guest it will detect that the EPT has changed and invalidate the mappings from the TLB. Guest memory access: Since the guest memory is no longer wired we need to hold the host physical page that backs the guest physical page before we can access it. The helper functions 'vm_gpa_hold()/vm_gpa_release()' are available for this purpose. PCI passthru: Guest's with PCI passthru devices will wire the entire guest physical address space. The MMIO BAR associated with the passthru device is backed by a vm_object of type OBJT_SG. An IOMMU domain is created only for guest's that have one or more PCI passthru devices attached to them. Limitations: There isn't a way to map a guest physical page without execute permissions. This is because the amd64/pmap code interprets the guest physical mappings as user mappings since they are numerically below VM_MAXUSER_ADDRESS. Since PG_U shares the same bit position as EPT_PG_EXECUTE all guest mappings become automatically executable. Thanks to Alan Cox and Konstantin Belousov for their rigorous code reviews as well as their support and encouragement. Thanks for John Baldwin for reviewing the use of OBJT_SG as the backing object for pci passthru mmio regions. Special thanks to Peter Holm for testing the patch on short notice. Approved by: re Discussed with: grehan Reviewed by: alc, kib Tested by: pho	2013-10-05 21:22:35 +00:00
Peter Grehan	4458253e97	Allow the alarm hours/mins/seconds registers to be read/written, though without any action. This avoids a hypervisor exit when o/s's access these regs (Linux). Reviewed by: neel Approved by: re@ (blanket)	2013-09-19 04:29:03 +00:00
Peter Grehan	c20d3f633a	Use correct offset for the high byte of high memory written to RTC NVRAM. Submitted by: Bela Lubkin bela dot lubkin at tidalscale dot com Approved by: re@ (blanket)	2013-09-19 04:20:18 +00:00
Peter Grehan	9d6be09f8a	Implement RTC CMOS nvram. Init some fields that are used by FreeBSD and UEFI. Tested with nvram(4). Reviewed by: neel	2013-07-11 03:54:35 +00:00
Peter Grehan	2838f343c8	Improve correctness of rtc register implementation. Submitted by: tycho nightingale at pluribusnetworks com	2013-01-25 22:43:20 +00:00
Peter Grehan	1f3025e133	Changes to allow the GENERIC+bhye kernel built from this branch to run as a 1/2 CPU guest on an 8.1 bhyve host. bhyve/inout.c inout.h fbsdrun.c - Rather than exiting on accesses to unhandled i/o ports, emulate hardware by returning -1 on reads and ignoring writes to unhandled ports. Support the previous mode by allowing a 'strict' parameter to be set from the command line. The 8.1 guest kernel was vastly cut down from GENERIC and had no ISA devices. Booting GENERIC exposes a massive amount of random touching of i/o ports (hello syscons/vga/atkbdc). bhyve/consport.c dev/bvm/bvm_console.c - implement a simplistic signature for the bvm console by returning 'bv' for an inw on the port. Also, set the priority of the console to CN_REMOTE if the signature was returned. This works better in an environment where multiple consoles are in the kernel (hello syscons) bhyve/rtc.c - return 0 for the access to RTC_EQUIPMENT (yes, you syscons) amd64/vmm/x86.c x86.h - hide a bunch more CPUID leaf 1 bits from the guest to prevent cpufreq drivers from probing. The next step will be to move CPUID handling completely into user-space. This will allow the full spectrum of changes from presenting a lowest-common-denominator CPU type/feature set, to exposing (almost) everything that the host can support. Reviewed by: neel Obtained from: NetApp	2011-05-19 21:53:25 +00:00
Peter Grehan	366f60834f	Import of bhyve hypervisor and utilities, part 1. vmm.ko - kernel module for VT-x, VT-d and hypervisor control bhyve - user-space sequencer and i/o emulation vmmctl - dump of hypervisor register state libvmm - front-end to vmm.ko chardev interface bhyve was designed and implemented by Neel Natu. Thanks to the following folk from NetApp who helped to make this available: Joe CaraDonna Peter Snyder Jeff Heller Sandeep Mann Steve Miller Brian Pawlowski	2011-05-13 04:54:01 +00:00

11 Commits