Commit Graph

7704 Commits

Author SHA1 Message Date
John Baldwin
9e2154ff1c Cleanups related to debug exceptions on x86.
- Add constants for fields in DR6 and the reserved fields in DR7.  Use
  these constants instead of magic numbers in most places that use DR6
  and DR7.
- Refer to T_TRCTRAP as "debug exception" rather than a "trace trap"
  as it is not just for trace exceptions.
- Always read DR6 for debug exceptions and only clear TF in the flags
  register for user exceptions where DR6.BS is set.
- Clear DR6 before returning from a debug exception handler as
  recommended by the SDM dating all the way back to the 386.  This
  allows debuggers to determine the cause of each exception.  For
  kernel traps, clear DR6 in the T_TRCTRAP case and pass DR6 by value
  to other parts of the handler (namely, user_dbreg_trap()).  For user
  traps, wait until after trapsignal to clear DR6 so that userland
  debuggers can read DR6 via PT_GETDBREGS while the thread is stopped
  in trapsignal().

Reviewed by:	kib, rgrimes
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D15189
2018-05-22 00:45:00 +00:00
Konstantin Belousov
3621ba1ede Add Intel Spec Store Bypass Disable control.
Speculative Store Bypass (SSB) is a speculative execution side channel
vulnerability identified by Jann Horn of Google Project Zero (GPZ) and
Ken Johnson of the Microsoft Security Response Center (MSRC)
https://bugs.chromium.org/p/project-zero/issues/detail?id=1528.
Updated Intel microcode introduces a MSR bit to disable SSB as a
mitigation for the vulnerability.

Introduce a sysctl hw.spec_store_bypass_disable to provide global
control over the SSBD bit, akin to the existing sysctl that controls
IBRS. The sysctl can be set to one of three values:
0: off
1: on
2: auto

Future work will enable applications to control SSBD on a per-process
basis (when it is not enabled globally).

SSBD bit detection and control was verified with prerelease microcode.

Security:	CVE-2018-3639
Tested by:	emaste (previous version, without updated microcode)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2018-05-21 21:08:19 +00:00
Konstantin Belousov
2320153fcc Preserve other bits in IA32_SPEC_CTL MSR when changing the IBRS and
STIBP states.

Tested by:	emaste (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2018-05-21 21:05:55 +00:00
Konstantin Belousov
5988464ec4 Fix grammar.
Submitted by:	alc
MFC after:	1 week
2018-05-21 19:15:05 +00:00
Konstantin Belousov
0a4b04a616 Add missed barrier for pm_gen/pm_active interaction.
When we issue shootdown IPIs, we first assign zero to pm_gens to
indicate the need to flush on the next context switch in case our IPI
misses the context, next we read pm_active. On context switch we set
our bit in pm_active, then we read pm_gen. It is crucial that both
threads see the memory in the program order, otherwise invalidation
thread might read pm_active bit as zero and the context switching
thread might read pm_gen as zero.

IA32 allows CPU for both reads to see zero. We must use the barriers
between write and read. The pm_active bit set is already locked, so
only the invalidation functions need it.

I never saw it in real life, or at least I do not have a good
reproduction case. I found this during code inspection when hunting
for the Xen TLB issue reported by cperciva.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15506
2018-05-21 18:41:16 +00:00
Mateusz Guzik
edacda736b amd64: annotate pti with __read_frequently 2018-05-21 05:20:23 +00:00
Mark Johnston
892bdccca0 Enable kernel dump features in GENERIC for most platforms.
This turns on support for kernel dump encryption and compression, and
netdump. arm and mips platforms are omitted for now, since they are more
constrained and don't benefit as much from these features.

Reviewed by:	cem, manu, rgrimes
Tested by:	manu (arm64)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D15465
2018-05-19 19:53:23 +00:00
Matt Macy
f5ad6b4b00 pmap: silence warnings 2018-05-19 05:58:05 +00:00
Ed Maste
3dc3b1235a amd64 GENERIC: correct whitespace on smartpqi entry 2018-05-18 17:51:42 +00:00
Antoine Brodin
147d12a7d3 vmmdev: return EFAULT when trying to read beyond VM system memory max address
Currently, when using dd(1) to take a VM memory image, the capture never ends,
reading zeroes when it's beyond VM system memory max address.
Return EFAULT when trying to read beyond VM system memory max address.

Reviewed by:	imp, grehan, anish
Approved by:	grehan
Differential Revision:	https://reviews.freebsd.org/D15156
2018-05-15 17:20:58 +00:00
John Baldwin
0b3e6e4c50 Make the common interrupt entry point labels local labels.
Kernel debuggers depend on symbol names to find stack frames with a
trapframe rather than a normal stack frame.  The labels used for the
shared interrupt entry point for the PTI and non-PTI cases did not
match the existing patterns confusing debuggers.  Add the '.L' prefix
to mark these symbols as local so they are not visible in the symbol
table.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-05-14 17:27:53 +00:00
Konstantin Belousov
8b4fc8b11c Make fpusave() and fpurestore() on amd64 ifuncs.
From now on, linking amd64 kernel requires either lld or newer ld.bfd.

Reviewed by:	jhb (as part of the large patch)
Discussed with:	emaste
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D13838
2018-05-10 15:01:43 +00:00
Mateusz Guzik
20ca271fdd amd64: depessimize bcmp for small buffers
Adapt assembly generated by clang for memcmp and use it for <= 64 sized
compares (which are the vast majority).

Sample result of doing stats on Broadwell (% of samples):
before: 4.0 kernel     bcmp                 cache_lookup
after : 0.7 kernel     bcmp                 cache_lookup

The routine is most definitely still not optimal. Anyone interested in
spending time improving it is welcome to take over.

Reviewed by:	kib
2018-05-09 15:16:25 +00:00
Konstantin Belousov
55c9d75e6b Avoid calls to bzero() before ireloc.
Evaluate cpu_stdext_feature early to have moved link_elf_ireloc() see
correct flags, most important is SMAP.

Tested by:	mjg
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D15367
2018-05-09 14:39:24 +00:00
Konstantin Belousov
71d1bbce91 Remove PG_U from the rest of the kernel pmap ptes.
Supposedly, they PG_U bits there were set to easier making some kernel
page accessible to userspace in-place.  Since it was not used for the
whole existence of the amd64 pmap.c and current design of the shared
pages prefers double-mapping over the in-place access, remove PG_U
both from the direct map and KVA slots.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-05-09 12:09:08 +00:00
Konstantin Belousov
5aaa5bc3d6 Remove PG_U from the recursive pte for kernel pmap' PML4 page.
This PML4 page is never used for the userspace process, so there is no
security implications.  But the configuration trips SMAP check, which
should be corrected.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-05-09 12:03:40 +00:00
Konstantin Belousov
053641bb1c Prepare DB# handler for deferred trigger of watchpoints.
Since pop %ss/mov %ss instructions defer all interrupts and exceptions
for the next instruction, it is possible that the userspace watchpoint
trap executes on the first instruction of the kernel entry for
syscall/bpt.

In this case, DB# should be treated similarly to NMI: on amd64 we must
always load GSBASE even if the trap comes from kernel mode, and load
the kernel page table root into %cr3.  Moreover, the trap must
use the dedicated stack, because we are still on the user stack when
trapped on syscall entry.

For i386, we must reload %cr3.  The syscall instruction is not configured,
so there is no issue with executing on user stack when trapping.

Due to some CPU erratas it is not always possible to detect that the
userspace watchpoint triggered by inspecting %dr6.  In trap(), compare the
trap %rip with the known unsafe entry points and if matched pretend that
the watchpoint did not fire at all.

Thank you to the MSRC Incident Response Team, and in particular Greg
Lenti and Nate Warfield, for coordinating the response to this issue
across multiple vendors.

Thanks to Computer Recycling at The Working Center of Kitchener for
making hardware available to allow us to test the patch on additional
CPU families.

Reviewed by:	jhb
Discussed with:	Matthew Dillon
Tested by:	emaste
Sponsored by:	The FreeBSD Foundation
Security:	CVE-2018-8897
Security:	FreeBSD-SA-18:06.debugreg
2018-05-08 17:00:34 +00:00
Mateusz Guzik
a9456603f2 amd64: stop asserting params != NULL in the syscall path
The parameter is effectively controllable by userspace. It does not matter
what it is set to as it is being passed to copyin - worst case the operation
will just fail.

While here stop computing it unless it is going to be used.

Noted by:	dillon@backplane.com
2018-05-07 21:32:08 +00:00
Mateusz Guzik
bed34b0b04 amd64: fix up memset added in r333324
There was a missing trick expanding the passed pattern to a full word
by multiplication. As a side effect non-zero patterns would be
incorrectly laid down.

This stems from the use of rep stosq which is word-sized, while the passed
argument is byte-sized.

I initially repurposed memcpy into memset without taking this into account.
All but non-bzero testing was performed with a variant utilizing ERMS, i.e.
using only stosb which happens to not into the problem whatsoever. So my bad
twice.

Thanks to Oliver Pinter for noting the problem and providing a testcase.
2018-05-07 20:54:42 +00:00
Mateusz Guzik
f185a3dc33 amd64: tweak the memmove comment regarding authorship
To make it clear the mentioned author did not write memmove.
2018-05-07 17:37:07 +00:00
Mateusz Guzik
6a909b9680 amd64: replace libkern's memset and memmove with assembly variants
memmove is repurposed bcopy (arguments swapped, return value added)
The libkern variant is a wrapper around bcopy, so this is a big
improvement.

memset is repurposed memcpy. The librkern variant is doing fishy stuff,
including branching on 0 and calling bzero.

Both functions are rather crude and subject to partial depessimization.

This is a soft prerequisite to adding variants utilizing the
'Enhanced REP MOVSB/STOSB' bit and let the kernel patch at runtime.
2018-05-07 15:07:28 +00:00
Mateusz Guzik
ac7edb45e1 amd64: syscall path bcopy -> memcpy 2018-05-04 22:41:12 +00:00
Mateusz Guzik
f0648bcc04 amd64: get rid of the pessimized bcopy in syscall arg copy
The code was unnecessarily conditionally copying either 5 or 6 args.
It can blindly copy 6, which also means the size is known at compilation
time and the operation can be depessimized.

Note the entire syscall handling code is rather slow.

Tested on Skylake, sample result for getppid (calls/s):
without pti: 7310106 -> 10653569
with pti: 3304843 -> 4148306

Some syscalls (like read) did not note any difference, other have typically
very modest wins.
2018-05-04 04:05:07 +00:00
Konstantin Belousov
7035cf14ee Implement support for ifuncs in the kernel linker.
Required MD bits are only provided for x86.

Reviewed by:	jhb (previous version, as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D13838
2018-05-03 21:37:46 +00:00
Konstantin Belousov
9ea6332090 Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D13838
2018-05-03 10:17:37 +00:00
Peter Grehan
adb947a67a Use PCI power-mgmt to reset a device if FLR fails.
A large number of devices don't support PCIe FLR, in particular
graphics adapters. Use PCI power management to perform the
reset if FLR fails or isn't available, by cycling the device
through the D3 state.

This has been tested by a number of users with Nvidia and AMD GPUs.

Submitted and tested by: Matt Macy
Reviewed by:	jhb, imp, rgrimes
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D15268
2018-05-02 17:41:00 +00:00
Mark Johnston
20f85b1ddd Print the dump progress indicator after calling dump_start().
Dumpers may wish to print messages from an initialization hook; this
change ensures that such messages aren't mixed with output from the
generic dump code.

MFC after:	1 week
2018-05-01 17:32:43 +00:00
Conrad Meyer
538184fa2f amd64/mp_machdep.c: Fix GCC build after r333059
GCC warns about the potentially confusing use of the binary AND ('&')
operator with a left operand containing an addition expression.  (The
confusion would be around the operator precedence between the + and & infix
operators.)  The warning is converted into an error with -Werror.

No functional change.

This construct was actually introduced in r328083, but r333059 (re)moved the
closing parentheses.

For reference, see http://en.cppreference.com/w/c/language/operator_precedence .
2018-04-28 17:55:28 +00:00
Tycho Nightingale
27275f8a52 Expand the checks for UCR3 == PMAP_NO_CR3 to enable processes to be
excluded from PTI.

Reviewed by:	kib
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D15100
2018-04-27 12:44:20 +00:00
Sean Bruno
14ec0f3a3b move smartpqi(4) controller out of NOTES and into sys/amd64/NOTES to
appease LINT

Submitted by:	rpokala
Reported by:	npn
2018-04-26 22:43:25 +00:00
Sean Bruno
1e66f787c8 martpqi(4):
- Microsemi SCSI driver for PQI controllers.
- Found on newer model HP servers.
- Restrict to AMD64 only as per developer request.

The driver provides support for the new generation of PQI controllers
from Microsemi. This driver is the first SCSI driver to implement the PQI
queuing model and it will replace the aacraid driver for Adaptec Series 9
controllers.  HARDWARE Controllers supported by the driver include:

    HPE Gen10 Smart Array Controller Family
    OEM Controllers based on the Microsemi Chipset.

Submitted by:   deepak.ukey@microsemi.com
Relnotes:       yes
Sponsored by:   Microsemi
Differential Revision:   https://reviews.freebsd.org/D14514
2018-04-26 16:59:06 +00:00
Tycho Nightingale
19c5cea336 If a trap is encountered upon executing iretq from within doreti() the
hardware will ensure the stack pointer is aligned to a 16-byte
boundary before saving the fault state on the stack.

In the PTI case, handle this potential alignment adjustment by copying
both frames independently while unwinding the stack in between.

Reviewed by:	kib
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D15183
2018-04-25 14:21:13 +00:00
Mark Johnston
5cd29d0f3c Improve VM page queue scalability.
Currently both the page lock and a page queue lock must be held in
order to enqueue, dequeue or requeue a page in a given page queue.
The queue locks are a scalability bottleneck in many workloads. This
change reduces page queue lock contention by batching queue operations.
To detangle the page and page queue locks, per-CPU batch queues are
used to reference pages with pending queue operations. The requested
operation is encoded in the page's aflags field with the page lock
held, after which the page is enqueued for a deferred batch operation.
Page queue scans are similarly optimized to minimize the amount of
work performed with a page queue lock held.

Reviewed by:	kib, jeff (previous versions)
Tested by:	pho
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14893
2018-04-24 21:15:54 +00:00
Konstantin Belousov
b7941dc91e Correct undesirable interaction between caching of %cr4 in bhyve and
invltlb_glob().

Reviewed by:	grehan, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15138
2018-04-24 13:44:19 +00:00
John Baldwin
73c8686e91 Simplify the code to allocate stack for auxv, argv[], and environment vectors.
Remove auxarg_size as it was only used once right after a confusing
assignment in each of the variants of exec_copyout_strings().

Reviewed by:	emaste
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D15123
2018-04-19 16:00:34 +00:00
Andriy Gapon
f3f6ecb450 set kdb_why to "trap" when calling kdb_trap from trap_fatal
This will allow to hook a ddb script to "kdb.enter.trap" event.
Previously there was no specific name for this event, so it could only
be handled by either "kdb.enter.unknown" or "kdb.enter.default" hooks.
Both are very unspecific.

Having a specific event is useful because the fatal trap condition is
very similar to panic but it has an additional property that the current
stack frame is the frame where the trap occurred.  So, both a register
dump and a stack bottom dump have additional information that can help
analyze the problem.

I have added the event only on architectures that have trap_fatal()
function defined.  I haven't looked at other architectures.  Their
maintainers can add support for the event later.

Sample script:
kdb.enter.trap=bt; show reg; x/aS $rsp,20; x/agx $rsp,20

Reviewed by:	kib, jhb, markj
MFC after:	11 days
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D15093
2018-04-19 05:06:56 +00:00
Andriy Gapon
6d83b2e971 don't check for kdb reentry in trap_fatal(), it's impossible
trap() checks for it earlier and calls kdb_reentry().

Discussed with:	jhb
MFC after:	12 days
Sponsored by:	Panzura
2018-04-18 15:44:54 +00:00
Brooks Davis
9c11d8d483 Remove the unused fuwintr() and suiwintr() functions.
Half of implementations always failed (returned (-1)) and they were
previously used in only one place.

Reviewed by:	kib, andrew
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15102
2018-04-17 18:04:28 +00:00
Konstantin Belousov
23084818ff Set PG_G global mapping bit on the trampoline ptes.
Trampoline mappings are better treated as global since they are valid
in all address spaces, even for PTI.  pmap_invalidate_range() must work
on global mappings for pti since kernel_pmap invalidations are really
same as for non-PTI.

Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D15052
2018-04-14 17:33:16 +00:00
Tycho Nightingale
6ac73777ea Add SDT probes to vmexit on Intel.
Submitted by:	domagoj.stolfa_gmail.com
Reviewed by:	grehan, tychon
Sponsored by:	DARPA/AFRL
Differential Revision:	https://reviews.freebsd.org/D14656
2018-04-13 17:23:05 +00:00
Konstantin Belousov
7c5d1690e9 Fix PSL_T inheritance on exec for x86.
The miscellaneous x86 sysent->sv_setregs() implementations tried to
migrate PSL_T from the previous program to the new executed one, but
they evaluated regs->tf_eflags after the whole regs structure was
bzeroed.  Make this functional by saving PSL_T value before zeroing.

Note that if the debugger is not attached, executing the first
instruction in the new program with PSL_T set results in SIGTRAP, and
since all intercepted signals are reset to default dispostion on
exec(2), this means that non-debugged process gets killed immediately
if PSL_T is inherited.  In particular, since suid images drop
P_TRACED, attempt to set PSL_T for execution of such program would
kill the process.

Another issue with userspace PSL_T handling is that it is reset by
trap().  It is reasonable to clear PSL_T when entering SIGTRAP
handler, to allow the signal to be handled without recursion or
delivery of blocked fault.  But it is not reasonable to return back to
the normal flow with PSL_T cleared.  This is too late to change, I
think.

Discussed with:	bde, Ali Mashtizadeh
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
Differential revision:	https://reviews.freebsd.org/D14995
2018-04-12 20:43:39 +00:00
Konstantin Belousov
b7dbf1132e Optimize context switch for PTI on PCID pmap.
In pti-enabled pmap, the PCID allocation scheme assigns temporal id
for the kernel page table, and user page table twin PCID is
calculating by setting high bit in the kernel PCID.  So the kernel AS
is mapped with per-vmspace PCID, and we must completely shut down all
mappings in KVA when switching contexts, so that newly switched thread
would see all changes in KVA occured while it was not executing.
After all, KVA is same between all threads.

Currently the pti context switch for the user part of the page table
gets its TLB entries flushed too. It is excessive. The same PCID
flushing algorithm that is used for non-pti pmap, correctly works for
the UVA mappings.  The only shared TLB entries are the pages from KVA
accessed by the kernel entry trampoline.  All of them are static
except per-thread TSS and LDT. For TSS and LDT, the lifetime of newly
allocated entries is the whole thread life, so it is fine as well. If
not fine, then explicit shutdowns for current pmap of the newly
allocated LDT and TSS pages would be enough.

Also restore the constant value for the pm_pcid for the kernel_pmap.
Before, for PTI pmap, pm_pcid was erronously rolled same as user
pmap's pm_pcid, but it was not used.

Reviewed by:	markj (previous version)
Discussed with:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D14961
2018-04-12 19:59:36 +00:00
Ed Maste
c7fb0e1ddf linuxulator: add else case braces to reduce diffs between archs
Sponsored by:	Turing Robotic Industries Inc.
2018-04-09 19:11:24 +00:00
Ed Maste
b267239d4b linuxulator: deduplicate linux_exec_imgact_try
Previously linuxulator had three identical copies of
linux_exec_imgact_try.  Deduplicate before adding another arch to
linuxulator.

Sponsored by:	Turing Robotic Industries Inc
Differential Revision:	https://reviews.freebsd.org/D14856
2018-04-09 17:24:01 +00:00
Rodney W. Grimes
01d822d33b Add the ability to control the CPU topology of created VMs
from userland without the need to use sysctls, it allows the old
sysctls to continue to function, but deprecates them at
FreeBSD_version 1200060 (Relnotes for deprecate).

The command line of bhyve is maintained in a backwards compatible way.
The API of libvmmapi is maintained in a backwards compatible way.
The sysctl's are maintained in a backwards compatible way.

Added command option looks like:
bhyve -c [[cpus=]n][,sockets=n][,cores=n][,threads=n][,maxcpus=n]
The optional parts can be specified in any order, but only a single
integer invokes the backwards compatible parse.  [,maxcpus=n] is
hidden by #ifdef until kernel support is added, though the api
is put in place.

bhyvectl --get-cpu-topology option added.

Reviewed by:	grehan (maintainer, earlier version),
Reviewed by:	bcr (manpages)
Approved by:	bde (mentor), phk (mentor)
Tested by:	Oleg Ginzburg <olevole@olevole.ru> (cbsd)
MFC after:	1 week
Relnotes:	Y
Differential Revision:	https://reviews.freebsd.org/D9930
2018-04-08 19:24:49 +00:00
Brooks Davis
1a449272a3 Fix LINT (and static COMPAT_LINUX32) after r332122. 2018-04-08 17:10:32 +00:00
Konstantin Belousov
e55d32b7b3 Handle Skylake-X errata SKZ63.
SKZ63 Processor May Hang When Executing Code In an HLE Transaction
Region

Problem: Under certain conditions, if the processor acquires an HLE
(Hardware Lock Elision) lock via the XACQUIRE instruction in the Host
Physical Address range between 40000000H and 403FFFFFH, it may hang
with an internal timeout error (MCACOD 0400H) logged into
IA32_MCi_STATUS.

Move the pages from the range into the blacklist.  Add a tunable to
not waste 4M if local DoS is not the issue.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15001
2018-04-07 17:06:13 +00:00
John Baldwin
fc276d92ae Add a way to temporarily suspend and resume virtual CPUs.
This is used as part of implementing run control in bhyve's debug
server.  The hypervisor now maintains a set of "debugged" CPUs.
Attempting to run a debugged CPU will fail to execute any guest
instructions and will instead report a VM_EXITCODE_DEBUG exit to
the userland hypervisor.  Virtual CPUs are placed into the debugged
state via vm_suspend_cpu() (implemented via a new VM_SUSPEND_CPU ioctl).
Virtual CPUs can be resumed via vm_resume_cpu() (VM_RESUME_CPU ioctl).

The debug server suspends virtual CPUs when it wishes them to stop
executing in the guest (for example, when a debugger attaches to the
server).  The debug server can choose to resume only a subset of CPUs
(for example, when single stepping) or it can choose to resume all
CPUs.  The debug server must explicitly mark a CPU as resumed via
vm_resume_cpu() before the virtual CPU will successfully execute any
guest instructions.

Reviewed by:	avg, grehan
Tested on:	Intel (jhb), AMD (avg)
Differential Revision:	https://reviews.freebsd.org/D14466
2018-04-06 22:03:43 +00:00
Brooks Davis
6469bdcdb6 Move most of the contents of opt_compat.h to opt_global.h.
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compabibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.

Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c.  A fake _COMPAT_LINUX option ensure opt_compat.h
is created on all architectures.

Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.

Reviewed by:	kib, cem, jhb, jtl
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14941
2018-04-06 17:35:35 +00:00
Jonathan T. Looney
6a740d0bcf Pat the watchdog less while producing a coredump. Prior to this change,
we patted the watchdog approximately once per 4KB page of memory.  After
this change, we pat the watchdog approximately once per 128MB of memory.
On a sample machine, this translated to patting the watchdog approximately
every 5.4 seconds, which "seems reasonable". We can choose a different
value in the future, if warranted.

This has extensive field experience. It is a performance improvement, and
has not caused any known problems.

Reviewed by:	imp, kib
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D14988
2018-04-06 17:06:22 +00:00