the queue where to enqueue pages that are going to be unwired.
- Add stronger checks to the enqueue/dequeue for the pagequeues when
adding and removing pages to them.
Of course, for unmanaged pages the queue parameter of vm_page_unwire() will
be ignored, just as the active parameter today.
This makes adding new pagequeues quicker.
This change effectively modifies the KPI. __FreeBSD_version will be,
however, bumped just when the full cache of free pages will be
evicted.
Sponsored by: EMC / Isilon storage division
Reviewed by: alc
Tested by: pho
This is needed for Xen PV(H) guests, since there's no hardware lapic
available on this kind of domains. This commit should not change
functionality.
Sponsored by: Citrix Systems R&D
Reviewed by: jhb
Approved by: gibbs
amd64/include/cpu.h:
amd64/amd64/mp_machdep.c:
i386/include/cpu.h:
i386/i386/mp_machdep.c:
- Remove lapic_ipi_vectored hook from cpu_ops, since it's now
implemented in the lapic hooks.
amd64/amd64/mp_machdep.c:
i386/i386/mp_machdep.c:
- Use lapic_ipi_vectored directly, since it's now an inline function
that will call the appropiate hook.
x86/x86/local_apic.c:
- Prefix bare metal public lapic functions with native_ and mark them
as static.
- Define default implementation of apic_ops.
x86/include/apicvar.h:
- Declare the apic_ops structure and create inline functions to
access the hooks, so the change is transparent to existing users of
the lapic_ functions.
x86/xen/hvm.c:
- Switch to use the new apic_ops.
injected into the guest. This allows the hypervisor to inject another
ExtINT or APIC vector as soon as the guest is able to process interrupts.
This change is not to address any correctness issue but to guarantee that
any pending APIC vector that was preempted by the ExtINT will be injected
as soon as possible. Prior to this change such pending interrupts could be
delayed until the next VM exit.
a partially populated reservation becomes fully populated, and decrease this
field when a fully populated reservation becomes partially populated.
Use this field to simplify the implementation of pmap_enter_object() on
amd64, arm, and i386.
On all architectures where we support superpages, the cost of creating a
superpage mapping is roughly the same as creating a base page mapping. For
example, both kinds of mappings entail the creation of a single PTE and PV
entry. With this in mind, use the page size field to make the
implementation of vm_map_pmap_enter(..., MAP_PREFAULT_PARTIAL) a little
smarter. Previously, if MAP_PREFAULT_PARTIAL was specified to
vm_map_pmap_enter(), that function would only map base pages. Now, it will
create up to 96 base page or superpage mappings.
Reviewed by: kib
Sponsored by: EMC / Isilon Storage Division
it implicitly in vmm.ko.
Add ioctl VM_GET_CPUS to get the current set of 'active' and 'suspended' cpus
and display them via /usr/sbin/bhyvectl using the "--get-active-cpus" and
"--get-suspended-cpus" options.
This is in preparation for being able to reset virtual machine state without
having to destroy and recreate it.
correctly prepare KGSBASE msr to restore the user descriptor base on
the last swapgs during return to usermode.
Reported and tested by: peterj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
guest for which the rules regarding xsetbv emulation are known. In
particular future extensions like AVX-512 have interdependencies among
feature bits that could allow a guest to trigger a GP# in the host with
the current approach of allowing anything the host supports.
- Add proper checking of Intel MPX and AVX-512 XSAVE features in the
xsetbv emulation and allow these features to be exposed to the guest if
they are enabled in the host.
- Expose a subset of known-safe features from leaf 0 of the structured
extended features to guests if they are supported on the host including
RDFSBASE/RDGSBASE, BMI1/2, AVX2, AVX-512, HLE, ERMS, and RTM. Aside
from AVX-512, these features are all new instructions available for use
in ring 3 with no additional hypervisor changes needed.
Reviewed by: neel
API function 'vie_calculate_gla()'.
While the current implementation is simplistic it forms the basis of doing
segmentation checks if the guest is in 32-bit protected mode.
of the guest linear address space. These APIs in turn use a new ioctl
'VM_GLA2GPA' to convert the guest linear address to guest physical.
Use the new copyin/copyout APIs when emulating ins/outs instruction in
bhyve(8).
'struct vm_guest_paging'.
Check for canonical addressing in vmm_gla2gpa() and inject a protection
fault into the guest if a violation is detected.
If the page table walk is restarted in vmm_gla2gpa() then reset 'ptpphys' to
point to the root of the page tables.
indicate the faulting linear address.
If the guest PML4 entry has the PG_PS bit set then inject a page fault into
the guest with the PGEX_RSV bit set in the error_code.
Get rid of redundant checks for the PG_RW violations when walking the page
tables.
the UART FIFO.
The emulation is constrained in a number of ways: 64-bit only, doesn't check
for all exception conditions, limited to i/o ports emulated in userspace.
Some of these constraints will be relaxed in followup commits.
Requested by: grehan
Reviewed by: tychon (partially and a much earlier version)
the proper ICWx initialization sequence. It assumes, probably correctly, that
the boot firmware has done the 8259 initialization.
Since grub-bhyve does not initialize the 8259 this write to the mask register
takes a code path in which 'error' remains uninitialized (ready=0,icw_num=0).
Fix this by initializing 'error' at the start of the function.
These masks are documented in the Intel Architecture Instruction Set
Extensions Programming Reference (March 2014).
Reviewed by: kib
MFC after: 1 month
Set the accessed and dirty bits in the page table entry. If it fails then
restart the page table walk from the beginning. This might happen if another
vcpu modifies the page tables simultaneously.
Reviewed by: alc, kib
to a guest physical address.
PG_PS (page size) field is valid only in a PDE or a PDPTE so it is now
checked only in non-terminal paging entries.
Ignore the upper 32-bits of the CR3 for PAE paging.
- inserting frame enter/leave sequences
- restructuring the vmx_enter_guest routine so that it subsumes
the vm_exit_guest block, which was the #vmexit RIP and not a
callable routine.
Reviewed by: neel
MFC after: 3 weeks
XSAVE Extended Features for AVX512 and MPX (Memory Protection Extensions).
Obtained from: Intel's Instruction Set Extensions Programming Reference
(March 2014)
the legacy 8259A PICs.
- Implement an ICH-comptabile PCI interrupt router on the lpc device with
8 steerable pins configured via config space access to byte-wide
registers at 0x60-63 and 0x68-6b.
- For each configured PCI INTx interrupt, route it to both an I/O APIC
pin and a PCI interrupt router pin. When a PCI INTx interrupt is
asserted, ensure that both pins are asserted.
- Provide an initial routing of PCI interrupt router (PIRQ) pins to
8259A pins (ISA IRQs) and initialize the interrupt line config register
for the corresponding PCI function with the ISA IRQ as this matches
existing hardware.
- Add a global _PIC method for OSPM to select the desired interrupt routing
configuration.
- Update the _PRT methods for PCI bridges to provide both APIC and legacy
PRT tables and return the appropriate table based on the configured
routing configuration. Note that if the lpc device is not configured, no
routing information is provided.
- When the lpc device is enabled, provide ACPI PCI link devices corresponding
to each PIRQ pin.
- Add a VMM ioctl to adjust the trigger mode (edge vs level) for 8259A
pins via the ELCR.
- Mark the power management SCI as level triggered.
- Don't hardcode the number of elements in Packages in the source for
the DSDT. iasl(8) will fill in the actual number of elements, and
this makes it simpler to generate a Package with a variable number of
elements.
Reviewed by: tycho
with all bits set to 1 beyond the I/O permission bitmap.
Prior to this change accessing I/O ports [0xFFF8-0xFFFF] would trigger a
#GP fault even though the I/O bitmap allowed access to those ports.
For more details see section "I/O Permission Bit Map" in the Intel SDM, Vol 1.
Reviewed by: kib
freebsd-current. These CPU speed control techniques are usually unhelpful
at best. For now, continue building the relevant code into GENERIC so that
it can trivially be re-enabled at runtime if anyone wants it.
MFC after: 1 month
This is derived from the mps(4) driver, but it supports only the 12Gb
IT and IR hardware including the SAS 3004, SAS 3008 and SAS 3108.
Some notes about this driver:
o The 12Gb hardware can do "FastPath" I/O, and that capability is included in
this driver.
o WarpDrive functionality has been removed, since it isn't supported in
the 12Gb driver interface.
o The Scatter/Gather list handling code is significantly different between
the 6Gb and 12Gb hardware. The 12Gb boards support IEEE Scatter/Gather
lists.
Thanks to LSI for developing and testing this driver for FreeBSD.
share/man/man4/mpr.4:
mpr(4) man page.
sys/dev/mpr/*:
mpr(4) driver files.
sys/modules/Makefile,
sys/modules/mpr/Makefile:
Add a module Makefile for the mpr(4) driver.
sys/conf/files:
Add the mpr(4) driver.
sys/amd64/conf/GENERIC,
sys/i386/conf/GENERIC,
sys/mips/conf/OCTEON1,
sys/sparc64/conf/GENERIC:
Add the mpr(4) driver to all config files that currently
have the mps(4) driver.
sys/ia64/conf/GENERIC:
Add the mps(4) and mpr(4) drivers to the ia64 GENERIC
config file.
sys/i386/conf/XEN:
Exclude the mpr module from building here.
Submitted by: Steve McConnell <Stephen.McConnell@lsi.com>
MFC after: 3 days
Tested by: Chris Reeves <chrisr@spectralogic.com>
Sponsored by: LSI, Spectra Logic
Relnotes: LSI 12Gb SAS driver mpr(4) added
lindev(4) was only used to provide /dev/full which is now a standard feature of
FreeBSD. /dev/full was never linux-specific and provides a generally useful
feature.
Document this in UPDATING and bump __FreeBSD_version. This will be documented
in the PH shortly.
Reported by: jkim
to sleep permanently by executing a HLT with interrupts disabled.
When this condition is detected the guest with be suspended with a reason of
VM_SUSPEND_HALT and the bhyve(8) process will exit.
Tested by executing "halt" inside a RHEL7-beta guest.
Discussed with: grehan@
Reviewed by: jhb@, tychon@
the 'HLT' instruction. This condition was detected by 'vm_handle_hlt()' and
converted into the SPINDOWN_CPU exitcode . The bhyve(8) process would exit
the vcpu thread in response to a SPINDOWN_CPU and when the last vcpu was
spun down it would reset the virtual machine via vm_suspend(VM_SUSPEND_RESET).
This functionality was broken in r263780 in a way that made it impossible
to kill the bhyve(8) process because it would loop forever in
vm_handle_suspend().
Unbreak this by removing the code to spindown vcpus. Thus a 'halt' from
a Linux guest will appear to be hung but this is consistent with the
behavior on bare metal. The guest can be rebooted by using the bhyvectl
options '--force-reset' or '--force-poweroff'.
Reviewed by: grehan@
by adding an argument to the VM_SUSPEND ioctl that specifies how the virtual
machine should be suspended, viz. VM_SUSPEND_RESET or VM_SUSPEND_POWEROFF.
The disposition of VM_SUSPEND is also made available to the exit handler
via the 'u.suspended' member of 'struct vm_exit'.
This capability is exposed via the '--force-reset' and '--force-poweroff'
arguments to /usr/sbin/bhyvectl.
Discussed with: grehan@