Commit Graph

201531 Commits

Author SHA1 Message Date
neel
eefb10a184 Rework vNMI injection.
Keep track of NMI blocking by enabling the IRET intercept on a successful
vNMI injection. The NMI blocking condition is cleared when the handler
executes an IRET and traps back into the hypervisor.

Don't inject NMI if the processor is in an interrupt shadow to preserve the
atomic nature of "STI;HLT". Take advantage of this and artificially set the
interrupt shadow to prevent NMI injection when restarting the "iret".

Reviewed by:	Anish Gupta (akgupt3@gmail.com), grehan
2014-09-17 00:30:25 +00:00
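
The NMI blocking described in eefb10a184 can be read as a small state machine. The following is a minimal sketch under stated assumptions, not the code from svm.c: the structure, field names, and function names are all hypothetical, and only the three rules from the commit message are modeled (block after injection, unblock on the IRET intercept, never inject inside an interrupt shadow).

```c
#include <stdbool.h>

/* Hypothetical per-vcpu state; the real code keeps this in the VMCB. */
struct vcpu_nmi_sketch {
	bool nmi_pending;	/* an NMI is waiting to be injected */
	bool iret_intercept;	/* set while the guest NMI handler runs */
	bool intr_shadow;	/* guest is in an interrupt shadow */
};

/* Returns true if an NMI was injected on this pass. */
static bool
try_inject_nmi(struct vcpu_nmi_sketch *v)
{
	if (!v->nmi_pending)
		return (false);
	if (v->iret_intercept)		/* previous NMI still being handled */
		return (false);
	if (v->intr_shadow)		/* keep "STI;HLT" atomic */
		return (false);

	v->nmi_pending = false;
	v->iret_intercept = true;	/* track NMI blocking until IRET */
	return (true);
}

/* Called when the guest traps on the intercepted IRET. */
static void
handle_iret_exit(struct vcpu_nmi_sketch *v)
{
	v->iret_intercept = false;	/* NMI blocking is cleared */
	/*
	 * The "iret" is restarted after the exit, so set an artificial
	 * interrupt shadow to hold off the next NMI until then.
	 */
	v->intr_shadow = true;
}
```

In this sketch the artificial shadow has to be cleared by whatever models the guest making forward progress, which is left out here.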
neel
16169f746d Minor cleanup.
Get rid of unused 'svm_feature' from the softc.

Get rid of the redundant 'vcpu_cnt' checks in svm.c. There is a similar check
in vmm.c against 'vm->active_cpus' before the AMD-specific code is called.

Submitted by:	Anish Gupta (akgupt3@gmail.com)
2014-09-16 04:01:55 +00:00
neel
97bffd44f6 Use V_IRQ, V_INTR_VECTOR and V_TPR to offload APIC interrupt delivery to the
processor. Briefly, the hypervisor sets V_INTR_VECTOR to the APIC vector
and sets V_IRQ to 1 to indicate a pending interrupt. The hardware then takes
care of injecting this vector when the guest is able to receive it.

Legacy PIC interrupts are still delivered via the event injection mechanism.
This is because the vector injected by the PIC must reflect the state of its
pins at the time the CPU is ready to accept the interrupt.

Accesses to the TPR via %CR8 are handled entirely in hardware. This requires
that the emulated TPR must be synced to V_TPR after a #VMEXIT.

The guest can also modify the TPR via the memory mapped APIC. This requires
that the V_TPR must be synced with the emulated TPR before a VMRUN.

Reviewed by:	Anish Gupta (akgupt3@gmail.com)
2014-09-16 03:31:40 +00:00
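
The two TPR sync points in 97bffd44f6 can be sketched as follows. This is an illustration only: the structure and function names are hypothetical, and it assumes the architectural relationship in which %CR8 (and therefore V_TPR) holds the upper four bits of the 8-bit APIC TPR.

```c
#include <stdint.h>

/* Hypothetical stand-in for the emulated APIC state and the VMCB field. */
struct tpr_sketch {
	uint8_t apic_tpr;	/* emulated 8-bit APIC TPR */
	uint8_t v_tpr;		/* 4-bit V_TPR as seen by the hardware */
};

/*
 * The guest may have written the TPR through the memory mapped APIC,
 * so push the emulated value into V_TPR before a VMRUN.
 */
static void
sync_before_vmrun(struct tpr_sketch *s)
{
	s->v_tpr = (s->apic_tpr >> 4) & 0xf;
}

/*
 * The guest may have written %CR8 without exiting, so pull V_TPR back
 * into the emulated TPR after a #VMEXIT.
 */
static void
sync_after_vmexit(struct tpr_sketch *s)
{
	s->apic_tpr = (uint8_t)((s->v_tpr & 0xf) << 4);
}
```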
neel
cbc92dc709 Set the 'vmexit->inst_length' field properly depending on the type of the
VM-exit and ultimately on whether nRIP is valid. This allows us to update
the %rip after the emulation is finished so any exceptions triggered during
the emulation will point to the right instruction.

Don't attempt to handle INS/OUTS VM-exits unless the DecodeAssist capability
is available. The effective segment field in EXITINFO1 is not valid without
this capability.

Add VM_EXITCODE_SVM to flag SVM VM-exits that cannot be handled. Provide the
VMCB fields exitinfo1 and exitinfo2 as collateral to help with debugging.

Provide a SVM VM-exit handler to dump the exitcode, exitinfo1 and exitinfo2
fields in bhyve(8).

Reviewed by:	Anish Gupta (akgupt3@gmail.com)
Reviewed by:	grehan
2014-09-14 04:39:04 +00:00
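
One plausible reading of the 'inst_length' rule in cbc92dc709 is sketched below. The function name and parameters are hypothetical, and the choice of 0 for "length unknown" is an assumption rather than something the commit message states.

```c
#include <stdint.h>

/*
 * If the hardware saved a valid nRIP for this exit, the instruction
 * length is the distance from the current %rip to nRIP. Otherwise the
 * length is not known and 0 is reported (an assumption in this sketch).
 */
static int
svm_inst_length_sketch(int nrip_valid, uint64_t rip, uint64_t nrip)
{
	if (!nrip_valid)
		return (0);
	return ((int)(nrip - rip));
}
```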
neel
3e1af2f123 Bug fixes.
- Don't enable the HLT intercept by default. It will be enabled by bhyve(8)
  if required. Prior to this change HLT exiting was always enabled making
  the "-H" option to bhyve(8) meaningless.

- Recognize a VM exit triggered by a non-maskable interrupt. Prior to this
  change the exit would be punted to userspace and the virtual machine would
  terminate.
2014-09-13 23:48:43 +00:00
neel
a38d07e455 style(9): insert an empty line if the function has no local variables
Pointed out by:	grehan
2014-09-13 22:45:04 +00:00
neel
32e0378b35 AMD processors that have the SVM decode assist capability will store the
instruction bytes in the VMCB on a nested page fault. This is useful because
it saves having to walk the guest page tables to fetch the instruction.

vie_init() now takes two additional parameters 'inst_bytes' and 'inst_len'
that map directly to 'vie->inst[]' and 'vie->num_valid'.

The instruction emulation handler skips calling 'vmm_fetch_instruction()'
if 'vie->num_valid' is non-zero.

The use of this capability can be turned off by setting the sysctl/tunable
'hw.vmm.svm.disable_npf_assist' to '1'.

Reviewed by:	Anish Gupta (akgupt3@gmail.com)
Discussed with:	grehan
2014-09-13 22:16:40 +00:00
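
The vie_init() change in 32e0378b35 can be sketched roughly as below. Only the names 'inst_bytes', 'inst_len', 'vie->inst[]' and 'vie->num_valid' come from the commit message; the parameter types, the buffer size, and the need_fetch() helper are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

#define	VIE_INST_SIZE_SKETCH	15	/* assumed: max x86 instruction length */

/* Trimmed-down, hypothetical version of 'struct vie'. */
struct vie {
	uint8_t	inst[VIE_INST_SIZE_SKETCH];
	uint8_t	num_valid;	/* number of valid bytes in inst[] */
};

/* Seed the decoder with bytes the CPU already saved, if there are any. */
static void
vie_init(struct vie *vie, const uint8_t *inst_bytes, int inst_len)
{
	memset(vie, 0, sizeof(*vie));
	if (inst_bytes != NULL && inst_len > 0 &&
	    inst_len <= VIE_INST_SIZE_SKETCH) {
		memcpy(vie->inst, inst_bytes, (size_t)inst_len);
		vie->num_valid = (uint8_t)inst_len;
	}
}

/* The guest page table walk is only needed when nothing was supplied. */
static int
need_fetch(const struct vie *vie)
{
	return (vie->num_valid == 0);
}
```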
neel
45bb1086d6 style(9): indent the switch, don't indent the case, indent case body one tab. 2014-09-11 06:17:56 +00:00
neel
e108397c4a Repurpose the V_IRQ interrupt injection to implement VMX-style interrupt
window exiting. This simply involves setting V_IRQ and enabling the VINTR
intercept. This instructs the CPU to trap back into the hypervisor as soon
as an interrupt can be injected into the guest. The pending interrupt is
then injected via the traditional event injection mechanism.

Rework vcpu interrupt injection so that Linux guests now idle with host cpu
utilization close to 0%.

Reviewed by:	Anish Gupta (earlier version)
Discussed with:	grehan
2014-09-11 02:37:02 +00:00
neel
907c0aec17 Allow intercepts and irq fields to be cached by the VMCB.
Provide APIs svm_enable_intercept()/svm_disable_intercept() to add/delete
VMCB intercepts. These APIs ensure that the VMCB state cache is invalidated
when intercepts are modified.

Each intercept is identified as a (index,bitmask) tuple. For example, the
VINTR intercept is identified as (VMCB_CTRL1_INTCPT,VMCB_INTCPT_VINTR).
The first 20 bytes in control area that are used to enable intercepts
are represented as 'uint32_t intercept[5]' in 'struct vmcb_ctrl'.

Modify svm_setcap() and svm_getcap() to use the new APIs.

Discussed with:	Anish Gupta (akgupt3@gmail.com)
2014-09-10 03:13:40 +00:00
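
A minimal sketch of the (index,bitmask) scheme from 907c0aec17 follows. The 'uint32_t intercept[5]' array is from the commit message; the 'vmcb_clean' field, the numeric values of the constants, and the set_intercept() helper are assumptions for illustration and should be checked against the AMD manual and the tree.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down control area. */
struct vmcb_ctrl {
	uint32_t intercept[5];	/* first 20 bytes of the control area */
	uint32_t vmcb_clean;	/* set bits: state the CPU may keep cached */
};

#define	VMCB_CTRL1_INTCPT	3		/* assumed array index */
#define	VMCB_INTCPT_VINTR	(1U << 4)	/* assumed bit position */
#define	VMCB_CACHE_I		(1U << 0)	/* assumed "intercepts" clean bit */

/* Add or delete one intercept, dirtying the cache only on a real change. */
static void
set_intercept(struct vmcb_ctrl *ctrl, int idx, uint32_t bitmask, int enable)
{
	uint32_t oldval;

	oldval = ctrl->intercept[idx];
	if (enable)
		ctrl->intercept[idx] |= bitmask;
	else
		ctrl->intercept[idx] &= ~bitmask;

	if (ctrl->intercept[idx] != oldval)
		ctrl->vmcb_clean &= ~VMCB_CACHE_I;
}
```

Funnelling every change through one helper is what lets the rest of the code treat the intercept fields as cacheable.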
neel
3c61bfaac6 Move the VMCB initialization into svm.c in preparation for changes to the
interrupt injection logic.

Discussed with:	Anish Gupta (akgupt3@gmail.com)
2014-09-10 02:35:19 +00:00
neel
bbc4ff544f Move the event injection function into svm.c and add KTR logging for
every event injection.

This is in preparation for changes to SVM guest interrupt injection.

Discussed with:	Anish Gupta (akgupt3@gmail.com)
2014-09-10 02:20:32 +00:00
neel
5b8dd1ad82 Remove a bogus check that flagged an error if the guest %rip was zero.
An AP begins execution with %rip set to 0 after a startup IPI.

Discussed with:	Anish Gupta (akgupt3@gmail.com)
2014-09-10 01:46:22 +00:00
neel
77b2918286 Make the KTR tracepoints uniform and ensure that every VM-exit is logged.
Discussed with:	Anish Gupta (akgupt3@gmail.com)
2014-09-10 01:37:32 +00:00
neel
d07afdc371 Allow guest read access to MSR_EFER without hypervisor intervention.
Dirty the VMCB_CACHE_CR state cache when MSR_EFER is modified.
2014-09-10 01:10:53 +00:00
neel
d6f50ad39f Remove gratuitous forward declarations.
Remove tabs on empty lines.
2014-09-09 23:39:43 +00:00
neel
5824b2ab21 Do proper ASID management for guest vcpus.
Prior to this change an ASID was hard allocated to a guest and shared by all
its vcpus. This meant that the number of VMs that could be created was limited
to the number of ASIDs supported by the CPU. It was also inefficient because
it forced a TLB flush on every VMRUN.

With this change the number of guests that can be created is independent of
the number of available ASIDs. Also, the TLB is flushed only when a new ASID
is allocated.

Discussed with:	grehan
Reviewed by:	Anish Gupta (akgupt3@gmail.com)
2014-09-06 19:02:52 +00:00
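
One common way to get the behavior described in 5824b2ab21 is a generation counter per host cpu. The sketch below illustrates that general technique and is not taken from the commit: all names are hypothetical, ASID 0 is assumed to be reserved for the host, and vcpu migration between host cpus is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-host-cpu allocator state. */
struct host_asid {
	uint64_t gen;	/* bumped each time the ASID space wraps */
	uint32_t next;	/* next ASID to hand out */
	uint32_t nasid;	/* ASIDs supported by the CPU (valid: 1..nasid-1) */
};

/* Hypothetical per-vcpu ASID. */
struct vcpu_asid {
	uint64_t gen;	/* generation in which 'num' was allocated */
	uint32_t num;
};

/*
 * Called before VMRUN. Returns true when a new ASID was allocated and
 * the caller must therefore request a TLB flush for this VMRUN.
 */
static bool
check_asid(struct host_asid *cpu, struct vcpu_asid *vcpu)
{
	if (vcpu->gen == cpu->gen)
		return (false);		/* ASID still valid, no flush */

	if (cpu->next >= cpu->nasid) {	/* ASID space exhausted: wrap */
		cpu->next = 1;
		cpu->gen++;
	}
	vcpu->num = cpu->next++;
	vcpu->gen = cpu->gen;
	return (true);
}
```

Because stale vcpus are detected by comparing generations, the number of guests is not bounded by 'nasid'; running out of ASIDs only invalidates the ones handed out earlier.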
neel
8decf07cf3 Merge svm_set_vmcb() and svm_init_vmcb() into a single function that is called
just once when a vcpu is initialized.

Discussed with:	Anish Gupta (akgupt3@gmail.com)
2014-09-05 03:33:16 +00:00
neel
9c9ff7e250 Remove unused header file.
Discussed with:	Anish Gupta (akgupt3@gmail.com)
2014-09-04 06:07:32 +00:00
neel
4038b1d8f9 Consolidate the code to restore the host TSS after a #VMEXIT into a single
function restore_host_tss().

Don't bother to restore MSR_KGSBASE after a #VMEXIT since it is not used in
the kernel. It will be restored on return to userspace.

Discussed with:	Anish Gupta (akgupt3@gmail.com)
2014-09-04 06:00:18 +00:00
neel
e5d2f8730c IFC @r269962
Submitted by:	Anish Gupta (akgupt3@gmail.com)
2014-09-02 04:22:42 +00:00
neel
37c34582e9 An exception is allowed to be injected even if the vcpu is in an interrupt
shadow, so move the check for pending exception before bailing out due to
an interrupt shadow.

Change return type of 'vmcb_eventinject()' to a void and convert all error
returns into KASSERTs.

Fix VMCB_EXITINTINFO_EC(x) and VMCB_EXITINTINFO_TYPE(x) to do the shift
before masking the result.

Reviewed by:    Anish Gupta (akgupt3@gmail.com)
2014-08-25 00:58:20 +00:00
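
The shift-before-mask fix in 37c34582e9 can be illustrated as below. This is a sketch: the field positions (type in bits 10:8, error code in bits 63:32) follow the EXITINTINFO layout as documented by AMD, the mask names are made up for the example, and the "mask first" macro only shows one way such a bug can look, not necessarily the code that was in the tree.

```c
#include <stdint.h>

/* Illustrative mask names, not the ones used in the tree. */
#define	EXITINTINFO_TYPE_MASK	0x7ULL
#define	EXITINTINFO_EC_MASK	0xffffffffULL

/* Broken order: the mask keeps bits 2:0 and the shift then discards them. */
#define	EXITINTINFO_TYPE_MASK_FIRST(x)	(((x) & EXITINTINFO_TYPE_MASK) >> 8)

/* Fixed order: move the field down to bit 0, then mask it. */
#define	VMCB_EXITINTINFO_TYPE(x)	(((x) >> 8) & EXITINTINFO_TYPE_MASK)
#define	VMCB_EXITINTINFO_EC(x)		(((x) >> 32) & EXITINTINFO_EC_MASK)
```

For a value with vector 13, type 3 and error code 0x10, the fixed macros return 3 and 0x10, while the mask-first form always evaluates to 0.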
neel
53369652fd Use the max guest memory address when creating its iommu domain.
Also, assert that the GPA being mapped in the domain is less than its maxaddr.

Reviewed by:	grehan
Pointed out by:	Anish Gupta (akgupt3@gmail.com)
2014-08-14 05:00:45 +00:00
ache
844b1fe2e7 Bump version because challenge buffer size changed
MFC after:      1 week
2014-08-14 04:42:09 +00:00
imp
dc346017bf Add AIC to at91sam9260 support, now that it is needed for multipass to
work. This gets my AT91SAM9260-based boards almost booting with
current in multi pass. The MCI driver is broken, but it is equally
broken before multi-pass.
2014-08-14 04:21:31 +00:00
imp
84d638ac06 Add support for multipass to Atmel, for both FDT and !FDT cases. 2014-08-14 04:21:25 +00:00
imp
7603d1d8c5 Start to add FDT support. 2014-08-14 04:21:20 +00:00
imp
a2466e5139 Add support for FDT and !FDT configs on Atmel, though FDT isn't
working yet.
Bump rev on arm Makefile since files.at91 uses new '!' operator.
2014-08-14 04:21:14 +00:00
imp
79c623db6a From https://sourceware.org/ml/newlib/2014/msg00113.html
By Richard Earnshaw at ARM
>
>GCC has for a number of years provided a set of pre-defined macros for
>use with determining the ISA and features of the target during
>pre-processing.  However, the design was always somewhat cumbersome in
>that each new architecture revision created a new define and then
>removed the previous one.  This meant that it was necessary to keep
>updating the support code simply to recognise a new architecture being
>added.
>
>The ACLE specification (ARM C Language Extensions)
>(http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.set.swdev/index.html)
>provides a much more suitable interface and GCC has supported this
>since gcc-4.8.
>
>This patch makes use of the ACLE pre-defines to map to the internal
>feature definitions.  To support older versions of GCC a compatibility
>header is provided that maps the traditional pre-defines onto the new
>ACLE ones.

Stop using __FreeBSD_ARCH_armv6__ and switch to __ARM_ARCH >= 6 in the
couple of places in tree. clang already implements ACLE. Add a define
that says we implement version 1.1, even though the implementation
isn't quite complete.
2014-08-14 04:20:13 +00:00
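
The switch to the ACLE pre-define in 79c623db6a amounts to a test of the following shape. The macro SKETCH_HAVE_ARMV6 and the helper function are hypothetical and exist only to make the example self-contained; '__ARM_ARCH' is the ACLE macro named in the commit message.

```c
/*
 * One numeric comparison covers ARMv6 and every later revision, so the
 * check does not need updating when a new architecture is added.
 */
#if defined(__ARM_ARCH) && __ARM_ARCH >= 6
#define	SKETCH_HAVE_ARMV6	1
#else
#define	SKETCH_HAVE_ARMV6	0	/* pre-v6 ARM or a non-ARM target */
#endif

static int
have_armv6(void)
{
	return (SKETCH_HAVE_ARMV6);
}
```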
dim
3e2e431c78 Stop telling people to directly report llvm or clang bugs upstream,
point them to the FreeBSD bug tracker instead, since we use our own
patches.

MFC after:	3 days
2014-08-13 21:38:29 +00:00
pfg
b7d1d90916 Use "NO NAME" as the default unnamed label.
Microsoft recommends avoiding the use of spaces in the
string structures for FAT. Unfortunately they do just
that by default in the case of unlabeled filesystems.

Follow the default MS behavior to avoid confusion in
common tools like file(1). This was actually the
default behavior before r203868.

Obtained from:	NetBSD (CVS rev. 1.39)
MFC after:	3 days
2014-08-13 21:18:31 +00:00
emaste
8e895517a8 Remove trailing whitespace 2014-08-13 19:55:14 +00:00
adrian
a3da4574a5 Make the libbsdstat useful again. 2014-08-13 19:43:22 +00:00
emaste
929929f045 Copy country-code .iso syscons keymaps for vt(4)
Existing syscons ISO 8859-1 keymaps (??.iso.kbd) are usable without
change as Unicode keymaps for vt(4).

Sponsored by:	The FreeBSD Foundation
2014-08-13 19:06:29 +00:00
dim
0dce8ed0a3 Supplement r259111 by also using correct casts in gcc's emmintrin.h for
the first argument of the following builtin function:

* __builtin_ia32_psrlqi128() takes __v2di instead of __v4si

This should fix the following errors when building the graphics/webp
port with base gcc:

lossless_sse2.c:403: error: incompatible type for argument 1 of '__builtin_ia32_psrlqi128'
lossless_sse2.c:404: error: incompatible type for argument 1 of '__builtin_ia32_psrlqi128'

Reported by:	Jos Chrispijn <ports@webrz.net>
MFC after:	3 days
2014-08-13 16:42:44 +00:00
tuexen
4feb6f37e3 Add support for the SCTP_PR_STREAM_STATUS and SCTP_PR_ASSOC_STATUS
socket options. This includes managing the corresponding stat counters.
Add the SCTP_DETAILED_STR_STATS kernel option to control per policy
counters on every stream. The default is off and only an aggregated
counter is available. This is sufficient for the RTCWeb usecase.

MFC after: 1 week
2014-08-13 15:50:16 +00:00
pluknet
4227b97b79 Fixed ENOMEM description.
MFC after:	1 week
Sponsored by:	Nginx, Inc.
2014-08-13 14:49:51 +00:00
kib
96b731501d Add a knob LIBPTHREAD_BIGSTACK_MAIN, which instructs libthr to leave
the whole RLIMIT_STACK-sized region of the kernel-allocated stack as
the stack of main thread.

By default, the main thread stack is clamped at 2MB (4MB on 64bit
ABIs) and the rest is used for other threads stack allocation.  There
is no programmatic way to adjust the size of the main thread stack,
since pthread_attr_setstacksize() is too late.  The knob allows the
user to manage the main stack size with the rlimit, both for
single-threaded and multi-threaded processes.

Reported by:	"Ivan A. Kosarev" <ivan@ivan-labs.com>
Tested by:	dim
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2014-08-13 05:53:41 +00:00
kib
628dc68fb5 Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2014-08-13 05:47:49 +00:00
kib
065b68902b If vm_page_grab() allocates a new page, the page is not inserted into
a page queue even when the allocation is not wired.  It is the
responsibility of the vm_page_grab() caller to ensure that the page
does not end up on the vm_object queue without also being on a
pagedaemon queue, which would effectively create an unpageable unwired page.

In exec_map_first_page() and vm_imgact_hold_page(), activate the page
immediately after unbusying it, to avoid leak.

In uiomove_object_page(), deactivate the page before the object is
unlocked.  There is no leak, since the page is deactivated after
uiomove_fromphys() has finished.  But allowing a non-queued non-wired page
in the unlocked object queue makes it impossible to assert that a leak
does not happen in other places.

Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-08-13 05:44:08 +00:00
ngie
0285c8234c Add missing BSD.tests.dist entry for lib/libutil to unbreak installworld with
MK_TESTS == no

Phabric: D555
X-MFC with: r269904
Approved by: jmmv (mentor, implicit)
Pointyhat to: ngie
2014-08-13 05:15:28 +00:00
ngie
cba459dcfd Integrate lib/libutil into the build/kyua
Remove the .t wrappers

Rename all of the TAP test applications from test-<test> to
<test>_test to match the convention described in the TestSuite
wiki page

humanize_number_test.c:

- Fix -Wformat warnings with counter variables
- Fix minor style(9) issues:
-- Header sorting
-- Variable declaration alignment/sorting in main(..)
-- Fit the lines in <80 columns
- Fix an off by one index error in the testcase output [*]
- Remove unnecessary `extern char * optarg;` (this is already provided by
  unistd.h)

Phabric: D555
Approved by: jmmv (mentor)
MFC after: 2 weeks
Obtained from: EMC / Isilon Storage Division [*]
Submitted by: Casey Peel <cpeel@isilon.com> [*]
Sponsored by: EMC / Isilon Storage Division
2014-08-13 04:56:27 +00:00
ngie
c904689011 Port date/bin/tests to ATF
Phabric: D545
Approved by: jmmv (mentor)
Submitted by: keramida (earlier version)
MFC after: 2 weeks
Sponsored by: Google, Inc
Sponsored by: EMC / Isilon Storage Division
2014-08-13 04:43:29 +00:00
ngie
45e6755fc1 Convert bin/sh/tests to ATF
The new code uses a "test discovery mechanism" to determine
what tests are available for execution

The test shell can be specified via:

  kyua test -v test_suites.FreeBSD.bin.sh.test_shell=/path/to/test/sh

Sponsored by: EMC / Isilon Storage Division
Approved by: jmmv (mentor)
Reviewed by: jilles (maintainer)
2014-08-13 04:14:50 +00:00
pfg
2db0fe8cf8 Minor style tweaks.
Obtained from:	OpenBSD (CVS rev. 1.7)
MFC after:	3 days
2014-08-13 03:44:30 +00:00
rpaulo
abf7fb0281 Make sure the DTrace header files are built before depend and before
the build starts.

This adds a new variable DHDRS that contains a list of all DTrace
header files.  Then, we use the beforedepend hook to make sure the
header files are built.

Introduce a beforebuild dependency (from projects/bmake) based on
feedback from Simon J. Gerraty.  This lets us generate the header
files without running make depend.

Reviewed by:	sjg, imp
MFC after:	3 days
2014-08-13 01:27:51 +00:00
neel
ccce21b061 Fix typo when displaying the HPET timer unit number. 2014-08-13 00:18:16 +00:00
neel
253faed0cb Minor cleanup:
- Set 'pirq_cold' to '0' on the first PIRQ allocation.
- Make assertions stronger.

Reviewed by:	jhb
CR:		https://phabric.freebsd.org/D592
2014-08-13 00:14:26 +00:00
imp
1a87dca760 Truncate the ctfmerge command line, like we do with SYSTEM_LD. 2014-08-12 23:48:37 +00:00
gjb
3a0070cced Fix a typo in a comment: s/interprete/interpret/
Submitted by:	Sam Fourman Jr.
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2014-08-12 19:37:49 +00:00