6915 Commits

Author SHA1 Message Date
trasz
e1055c772b MFC r282213:
Add kern.racct.enable tunable and RACCT_DISABLED config option.
The point of this is to be able to add RACCT (with RACCT_DISABLED)
to GENERIC, to avoid having to rebuild the kernel to use rctl(8).

MFC r282901:

Build GENERIC with RACCT/RCTL support by default.  Note that it still
needs to be enabled by adding "kern.racct.enable=1" to /boot/loader.conf.

Note those two are MFC-ed together, because the latter one changes the
name of RACCT_DISABLED option to RACCT_DEFAULT_TO_DISABLED.  Should have
committed the renaming separately...

Relnotes:	yes
Sponsored by:	The FreeBSD Foundation
2015-06-21 06:28:26 +00:00
kib
f014bfc33c MFC r284104:
Updates from SDM rev. 55.
2015-06-13 07:31:50 +00:00
dim
1824c628a6 MFC r283870:
Remove unneeded NULL checks in amd64's trap_fatal().

Since td_name is an array member of struct thread, it can never be NULL,
so the check can be removed.  In addition, curproc can never be NULL,
so remove the if statement, and splice the two printfs() together.

While here, remove the u_long cast, and use the correct printf format
specifier for curproc->p_pid.

Reviewed by:	kib
Differential Revision: https://reviews.freebsd.org/D2695
2015-06-08 19:44:04 +00:00
kib
8adebdc1fb MFC r283735:
Remove several write-only variables.
2015-06-05 08:36:25 +00:00
jhb
017f11d1f3 MFC 281887:
Reassign copyright statements on several files from Advanced
Computing Technologies LLC to Hudson River Trading LLC.
2015-06-02 19:20:39 +00:00
erj
e12c5d1ed6 MFC ixgbe commits for 10.2:
- r280182 - Split the driver into independent pf/vf loadables
- r280197 - Resolve build issues
- r280204 - Fix multiple same-name devclasses
- r280228 - Fix i386 LINT build issues / remove unused variable
- r280252 - Fix building ixgbe with gcc
- r280962 - Make changes to busdma code similar to r257541
- r281772 & r281773 - Remove unused variable
- partial r282280 - stats counter update (ix-only)
- r282289 - Add X550 support
- r282290 - Add X550 makefile updates
- r282293 - Add ixgbe_x550.c to conf/files
- r282299 - Fix gcc compile (extraneous extern declaration)

Finally, add ix_txrx.c to conf/files because it's required for compile in stable/10.

Approved by:	jfv (mentor)
2015-05-27 17:44:11 +00:00
kib
0d8ee7566b MFC r282708:
On exec, single-threading must be enforced before arguments space is
allocated from exec_map.
2015-05-24 07:32:02 +00:00
jhb
b92445bf9a MFC 266852,270223:
- Fix pf(4) to build with MAXCPU set to 256.  MAXCPU is actually a count,
  not a maximum ID value (so it is a cap on mp_ncpus, not mp_maxid).
- Bump MAXCPU on amd64 from 64 to 256.  In practice APIC only permits 255
  CPUs (IDs 0 through 254).  Getting above that limit requires x2APIC.
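
The count-versus-ID distinction in the first item can be illustrated with a minimal sketch. The names below (EXAMPLE_MAXCPU, example_percpu, and the helper functions) are hypothetical and are not taken from the pf(4) or kernel sources; the only fact borrowed from the commit is that MAXCPU is a count, so the highest usable CPU ID is one less than it.

```c
#include <assert.h>

#define	EXAMPLE_MAXCPU	256	/* hypothetical stand-in: a count of CPUs */

/* Hypothetical per-CPU table: sized by the count, indexed by CPU ID. */
static int example_percpu[EXAMPLE_MAXCPU];

/* Highest CPU ID that fits in a table sized by a count. */
static inline int
example_highest_id(int count)
{
	assert(count > 0);
	return (count - 1);
}

/* Returns 1 if cpu_id can index example_percpu[], 0 otherwise. */
static inline int
example_id_is_valid(int cpu_id)
{
	return (cpu_id >= 0 && cpu_id < EXAMPLE_MAXCPU);
}

/* Marks a CPU as seen; returns 0 on success, -1 for an out-of-range ID. */
static inline int
example_mark_cpu(int cpu_id)
{
	if (!example_id_is_valid(cpu_id))
		return (-1);
	example_percpu[cpu_id] = 1;
	return (0);
}
```

Code that treats the constant as a maximum ID would accept an ID equal to EXAMPLE_MAXCPU and index one element past the end of the table, which is the class of mistake the distinction guards against.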
2015-05-22 21:51:36 +00:00
whu
30cd3b9808 MFC r282212:
Microsoft vmbus, storage and other related driver enhancements for HyperV.
    - Vmbus multi channel support.
    - Vector interrupt support.
    - Signal optimization.
    - Storvsc driver performance improvement.
    - Scatter and gather support for storvsc driver.
    - Minor bug fix for KVP driver.
Thanks to royger, jhb and delphij from the FreeBSD community for the reviews
and comments. Thanks also to Hovy Xu from NetApp for the contributions to
the storvsc driver.

PR:     195238
Submitted by:   whu
Reviewed by:    royger
Approved by:    royger
Relnotes:       yes
Sponsored by:   Microsoft OSTC
Differential Revision:  https://reviews.freebsd.org/D2575
2015-05-22 09:03:55 +00:00
emaste
60bdd635db MFC r258431: Disable amd64 boot time memory test by default
The page presence memory test takes a long time on large memory systems
  and has little value on contemporary amd64 hardware.

Relnotes:	Yes
Reviewed by:	jhb, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D1544
2015-05-21 19:40:31 +00:00
jimharris
1901626e01 MFC r282921:
Add nvme and nvd drivers to GENERIC for amd64 and i386.

Sponsored by:	Intel
2015-05-18 19:48:41 +00:00
kib
9749d5f34e MFC r282680:
Remove unused define.
2015-05-12 08:52:50 +00:00
kib
e86f81bbc7 MFC r281762:
Remove duplicate definitions of MWAIT_CX hints.  Identical defines in
specialreg.h are enough.
2015-04-27 08:06:33 +00:00
jhb
5967aacd0b MFC 278325,280866:
278325:
Revert the IPI startup sequence to match what is described in the
Intel Multiprocessor Specification v1.4.  The Intel SDM claims that
the INIT IPIs here are invalid, but other systems follow the MP
spec instead.

While here, fix the IPI wait routine to accept a timeout in microseconds
instead of a raw spin count, and don't spin forever during AP startup.
Instead, panic if a STARTUP IPI is not delivered after 20 us.

280866:
Wait 100 microseconds for a local APIC to dispatch each startup-related IPI
rather than 20.  The MP 1.4 specification states in Appendix B.2:

  "A period of 20 microseconds should be sufficient for IPI dispatch to
   complete under normal operating conditions".

(Note that this appears to be separate from the 10 millisecond (INIT) and
200 microsecond (STARTUP) waits after the IPIs are dispatched.)  The
Intel SDM is silent on this issue as far as I can tell.

At least some hardware requires 60 microseconds as noted in the PR, so
bump this to 100 to be on the safe side.

PR:		196542, 197756
2015-04-15 16:52:34 +00:00
jhb
b09b758bf2 MFC 276724:
On some Intel CPUs with a P-state but not C-state invariant TSC the TSC
may also halt in C2 and not just C3 (it seems that in some cases the BIOS
advertises its C3 state as a C2 state in _CST).  Just play it safe and
disable both C2 and C3 states if a user forces the use of the TSC as the
timecounter on such CPUs.

PR:		192316
2015-04-02 01:02:42 +00:00
jhb
5fdf8ec777 MFC 261790:
Add support for managing PCI bus numbers.  As with BARs and PCI-PCI bridge
I/O windows, the default is to preserve the firmware-assigned resources.
PCI bus numbers are only managed if NEW_PCIB is enabled and the architecture
defines a PCI_RES_BUS resource type.
- Add a helper API to create top-level PCI bus resource managers for each
  PCI domain/segment.  Host-PCI bridge drivers use this API to allocate
  bus numbers from their associated domain.
- Change the PCI bus and CardBus drivers to allocate a bus resource for
  their bus number from the parent PCI bridge device.
- Change the PCI-PCI and PCI-CardBus bridge drivers to allocate the
  full range of bus numbers from secbus to subbus from their parent bridge.
  The drivers also always program their primary bus register.  The bridge
  drivers also support growing their bus range by extending the bus resource
  and updating subbus to match the larger range.
- Add support for managing PCI bus resources to the Host-PCI bridge drivers
  used for amd64 and i386 (acpi_pcib, mptable_pcib, legacy_pcib, and qpi_pcib).
- Define a PCI_RES_BUS resource type for amd64 and i386.

PR:		197076
2015-04-01 21:48:54 +00:00
kib
4826fa033b MFC r280781:
Make it possible for the signal handler to act on #ss.  Load the
canonical user data segment selector into %ss when calling the
handler.
2015-03-31 01:05:34 +00:00
kib
ae182701d0 MFC r280780:
The #ss fault handler erroneously does not check whether the fault
originated from the return to usermode.  #ss must be handled the same
as #np.
2015-03-31 00:59:30 +00:00
jhb
ec839daca2 Revert accidental(?) change in r280455 and do not compile hwpmc statically
into GENERIC by default.  This change is not present in HEAD and was not
made in the two commits to HEAD that r280455 merged.
2015-03-30 16:28:04 +00:00
mav
3e11976044 MFC r280134:
Report ARAT (APIC-Timer-always-running) feature for virtual CPU.

This stops a FreeBSD guest from avoiding the LAPIC timer in favor of HPET
out of concern about deep sleep states, which do not exist for virtual CPUs.

Benchmarks of usleep(1) on guest and host show such extra latencies:
 - 51us for virtual HPET,
 - 22us for virtual LAPIC timer,
 - 22us for host HPET and
 - 3us for host LAPIC timer.
2015-03-30 07:11:49 +00:00
rrs
0444d8258d MFC of r277177 and r279894 with the fixes for the PMC for Haswell.
Sponsored by:	Netflix Inc.
2015-03-24 20:00:11 +00:00
markj
bf5cc57879 MFC r278655:
Add support for decoding multibyte NOPs.
2015-03-19 23:13:19 +00:00
rwatson
6102a34d38 Merge r263233 from HEAD to stable/10:
Update kernel inclusions of capability.h to use capsicum.h instead; some
  further refinement is required as some device drivers intended to be
  portable over FreeBSD versions rely on __FreeBSD_version to decide whether
  to include capability.h.

Sponsored by:	Google, Inc.
2015-03-19 13:37:36 +00:00
jhb
fa9df7fb76 MFC 277713:
If the boot-time memory test is enabled, output a dot ('.') for
each GB of RAM tested so people watching the console can see that
the machine is making progress and not hung.

PR:		196650
2015-03-12 15:08:23 +00:00
rstone
0b55a8c80a MFC r264007,r264008,r264009,r264011,r264012,r264013
MFC support for PCI Alternate RID Interpretation.  ARI is an optional PCIe
feature that allows PCI devices to present up to 256 functions on a bus.
This is effectively a prerequisite for PCI SR-IOV support.

r264007:
   Add a method to get the PCI RID for a device.

   Reviewed by:  kib
   MFC after:    2 months
   Sponsored by: Sandvine Inc.

r264008:
   Re-implement the DMAR I/O MMU code in terms of PCI RIDs

   Under the hood the VT-d spec is really implemented in terms of
   PCI RIDs instead of bus/slot/function, even though the spec takes
   pains to convert back to bus/slot/function in examples.  However
   working with bus/slot/function is not correct when PCI ARI is
   in use, so convert to using RIDs in most cases.  bus/slot/function
   will only be used when reporting errors to a user.

   Reviewed by:  kib
   MFC after:    2 months
   Sponsored by: Sandvine Inc.

r264009:
   Re-write bhyve's I/O MMU handling in terms of PCI RID.

   Reviewed by:  neel
   MFC after:    2 months
   Sponsored by: Sandvine Inc.

r264011:
   Add support for PCIe ARI

   PCIe Alternate RID Interpretation (ARI) is an optional feature that
   allows devices to have up to 256 different functions.  It is
   implemented by always setting the PCI slot number to 0 and
   re-purposing the 5 bits used to encode the slot number to instead
   contain the function number.  Combined with the original 3 bits
   allocated for the function number, this allows for 256 functions.
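
   The bit arithmetic described above can be sketched as follows.  This is
   a minimal illustration, not the code from r264011: the helper names are
   hypothetical, and the layout assumed is the usual 16-bit RID with the
   bus number in the upper 8 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names; not taken from the FreeBSD tree. */

/* Conventional RID layout: bus[15:8], slot[7:3], function[2:0]. */
static inline uint16_t
rid_from_bsf(unsigned bus, unsigned slot, unsigned func)
{
	assert(bus < 256 && slot < 32 && func < 8);
	return ((uint16_t)((bus << 8) | (slot << 3) | func));
}

/* ARI layout: bus[15:8], function[7:0]; the slot is implicitly 0. */
static inline uint16_t
rid_from_ari(unsigned bus, unsigned func)
{
	assert(bus < 256 && func < 256);
	return ((uint16_t)((bus << 8) | func));
}

/* Extract the 8-bit ARI function number from a RID. */
static inline unsigned
ari_func_from_rid(uint16_t rid)
{
	return (rid & 0xff);
}
```

   Under these assumptions, slot 31 function 7 in the conventional layout
   and ARI function 255 produce the same RID on a given bus, which shows
   why bus/slot/function is ambiguous once ARI is in use.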

   This is enabled by default, but it's expected to be a no-op on currently
   supported hardware.  It's a prerequisite for supporting PCI SR-IOV, and
   I want the ARI support to go in early to help shake out any bugs in it.
   ARI can be disabled by setting the tunable hw.pci.enable_ari=0.

   Reviewed by:  kib
   MFC after:    2 months
   Sponsored by: Sandvine Inc.

r264012:
   Print status of ARI capability in pciconf -c

   Teach pciconf how to print out the status (enabled/disabled) of the ARI
   capability on PCI Root Complexes and Downstream Ports.

   MFC after:    2 months
   Sponsored by: Sandvine Inc.

r264013:
   Add missing copyright date.

   MFC after:    2 months
2015-03-01 04:22:06 +00:00
jhb
4ee9c49971 MFC 274817,274878,276801,276840,278976:
Improve support for XSAVE with debuggers.
- Dump an NT_X86_XSTATE note if XSAVE is in use. This note is designed
  to match what Linux does in that 1) it dumps the entire XSAVE area
  including the fxsave state, and 2) it stashes a copy of the current
  xsave mask in the unused padding between the fxsave state and the
  xstate header at the same location used by Linux.
- Teach readelf(1) to recognize NT_X86_XSTATE notes.
- Change PT_GET/SETXSTATE to take the entire XSAVE state instead of
  only the extra portion. This avoids having to always make two
  ptrace() calls to get or set the full XSAVE state.
- Add a PT_GET_XSTATE_INFO which returns the length of the current
  XSTATE save area (so the size of the buffer needed for PT_GETXSTATE)
  and the current XSAVE mask (%xcr0).
2015-02-23 18:38:41 +00:00
jhb
9809511c44 MFC 273800:
Rework virtual machine hypervisor detection.
- Move the existing code to x86/x86/identcpu.c since it is x86-specific.
- If the CPUID2_HV flag is set, assume a hypervisor is present and query
  the 0x40000000 leaf to determine the hypervisor vendor ID.  Export the
  vendor ID and the highest supported hypervisor CPUID leaf via
  hv_vendor[] and hv_high variables, respectively.  The hv_vendor[]
  array is also exported via the hw.hv_vendor sysctl.
- Merge the VMWare detection code from tsc.c into the new probe in
  identcpu.c.  Add a VM_GUEST_VMWARE to identify vmware and use that in
  the TSC code to identify VMWare.
2015-02-10 16:34:42 +00:00
kib
9dc38ea5dc MFC r278001:
Do not qualify the mcontext_t *mcp argument for set_mcontext(9) as const.
2015-02-07 08:47:15 +00:00
kib
44ff2e7b62 MFC r277055:
Revert r263475: TDP_DEVMEMIO no longer needed.
2015-01-19 11:07:29 +00:00
kib
762486d18f MFC r277051:
Fix several issues with /dev/mem and /dev/kmem devices on amd64.
2015-01-19 11:02:23 +00:00
kib
da566e85be MFC r277047:
For x86, read MAXPHYADDR into variable cpu_maxphyaddr.
2015-01-19 10:52:55 +00:00
nwhitehorn
d454042db9 MFC r265329:
Disable ACPI and P4TCC throttling by default, following discussion on
freebsd-current. These CPU speed control techniques are usually unhelpful
at best. For now, continue building the relevant code into GENERIC so that
it can trivially be re-enabled at runtime if anyone wants it.

Relnotes:	yes
2015-01-11 17:10:07 +00:00
kib
a47e5e5116 MFC r276523:
Restore access to the page at zero through /dev/mem after r263475.
2015-01-09 02:35:19 +00:00
kib
b085354527 MFC r276522:
Actually remove GIANT_REQUIRED, declared but not done in r263475.
Style.
2015-01-09 02:33:12 +00:00
dchagin
c9ac1e485b Regen for r276810.
2015-01-08 06:24:43 +00:00
dchagin
8d0bd37b09 MFC r276508, r276509:
Correct an argument status of wait4 syscall for Linuxulator.
2015-01-08 06:23:11 +00:00
kib
5823747ac5 MFC r276322:
Change the way the lcall $7,$0 is reflected to usermode.  Instead of
setting call gate, which must be 64 bit, put a code segment descriptor
into ldt slot 0.
2015-01-03 01:41:10 +00:00
alc
afa6861080 MFC r270961
Update a comment to reflect the changes in r213408.
2015-01-02 18:50:18 +00:00
alc
d5a13901bf MFC r273701, r274556
By the time that pmap_init() runs, vm_phys_segs[] has been initialized.
  Obtaining the end of memory address from vm_phys_segs[] is a little
  easier than obtaining it from phys_avail[].

  Enable the use of VM_PHYSSEG_SPARSE on amd64 and i386, making it the
  default on i386 PAE.  (The use of VM_PHYSSEG_SPARSE on i386 PAE saves
  us some precious kernel virtual address space that would have been
  wasted on unused vm_page structures.)
2015-01-02 17:45:52 +00:00
neel
eb8065a6a6 MFC r276323
Implement "special mask mode" in vatpic.
2014-12-31 04:12:38 +00:00
neel
10c6be06b4 MFC r273683
Move the ACPI PM timer emulation into vmm.ko.

MFC r273706
Change the type of the first argument to the I/O emulation handlers to
'struct vm *'.

MFC r273710
Add a comment explaining the intent behind the I/O reservation [0x72-0x77].

MFC r273744
Add foo_genassym.c files to DPSRCS so dependencies for them are generated.
This ensures these objects are rebuilt to generate an updated header of
assembly constants if needed.

MFC r274045
If the start bit, PxCMD.ST, is cleared and nothing is in-flight then
PxCI, PxSACT, PxCMD.CCS and PxCMD.CR should be 0.

MFC r274076
Improve the ability to cancel an in-flight request by using an interrupt,
via SIGCONT, to force the read or write system call to return prematurely.

MFC r274330
To allow a request to be submitted from within the callback routine of
a completing one increase the total by 1 but don't advertise it.

MFC r274931
Change the lower bound for guest vmspace allocation to 0 instead of using
the VM_MIN_ADDRESS constant.

MFC r275817
For level triggered interrupts clear the PIC IRR bit when the interrupt pin
is deasserted.

MFC r275850
Fix 8259 IRQ priority resolver.

MFC r275952
Various 8259 device model improvements.

MFC r275965
Emulate writes to the IA32_MISC_ENABLE MSR.
2014-12-30 22:22:46 +00:00
neel
9a7db864f7 MFC r273375
Add support for AMD processors with the SVM/AMD-V hardware extensions.

MFC r273749
Remove bhyve SVM feature printf's now that they are available in the general
CPU feature detection code.

MFC r273766
Add missing 'break' pointed out by Coverity CID 1249760.

MFC r276098
Allow ktr(4) tracing of all guest exceptions via the tunable "hw.vmm.trace_guest_exceptions"

MFC r276392
Inject #UD into the guest when it executes either 'MONITOR' or 'MWAIT' on an
AMD/SVM host.

MFC r276402
Remove "svn:mergeinfo" property that was dragged along when these files were
svn copied in r273375.
2014-12-30 08:24:14 +00:00
neel
3b591af2d9 MFC 261321
Rename the AMD MSR_PERFCTR[0-3] so the Pentium Pro MSR_PERFCTR[0-1] aren't
redefined.

MFC r273214
Fix build to not bogusly always rebuild vmm.ko.

MFC r273338
Add support for AMD's nested page tables in pmap.c:
- Provide the correct bit mask for various bit fields in a PTE (e.g. valid bit)
  for a pmap of type PT_RVI.
- Add a function 'pmap_type_guest(pmap)' that returns TRUE if the pmap is of
  type PT_EPT or PT_RVI.

Add CPU_SET_ATOMIC_ACQ(num, cpuset):
This is used when activating a vcpu in the nested pmap. Using the 'acquire'
variant guarantees that the load of the 'pm_eptgen' will happen only after
the vcpu is activated in 'pm_active'.

Add defines for various AMD-specific MSRs.

Discussed with:	kib (r261321)
2014-12-30 00:00:42 +00:00
neel
88c1adb417 MFC r270326
Fix a recursive lock acquisition in vi_reset_dev().

MFC r270434
Return the spurious interrupt vector (IRQ7 or IRQ15) if the atpic cannot find
any unmasked pin with an interrupt asserted.

MFC r270436
Fix a bug in the emulation of CPUID leaf 0x4.

MFC r270437
Add "hw.vmm.topology.threads_per_core" and "hw.vmm.topology.cores_per_package"
tunables to modify the default cpu topology advertised by bhyve.

MFC r270855
Set the 'inst_length' to '0' early on before any error conditions are detected
in the emulation of the task switch. If any exceptions are triggered then the
guest %rip should point to the instruction that caused the task switch as
opposed to the one after it.

MFC r270857
The "SUB" instruction used in getcc() actually does 'x -= y' so use the
proper constraint for 'x'. The "+r" constraint indicates that 'x' is an
input and output register operand.

While here generate code for different variants of getcc() using a macro
GETCC(sz) where 'sz' indicates the operand size.

Update the status bits in %rflags when emulating AND and OR opcodes.

MFC r271439
Initialize 'bc_rdonly' to the right value.

MFC r271451
Optimize the common case of injecting an interrupt into a vcpu after a HLT
by explicitly moving it out of the interrupt shadow.

MFC r271888
Restructure the MSR handling so it is entirely handled by processor-specific
code.

MFC r271890
MSR_KGSBASE is no longer saved and restored from the guest MSR save area. This
behavior was changed in r271888 so update the comment block to reflect this.

MFC r271891
Add some more KTR events to help debugging.

MFC r272197
mmap(2) requires either MAP_PRIVATE or MAP_SHARED for non-anonymous mappings.

MFC r272395
Get rid of code that dealt with the hardware not being able to save/restore
the PAT MSR on guest exit/entry. This workaround was done for a beta release
of VMware Fusion 5 but is no longer needed in later versions.

All Intel CPUs since Nehalem have supported saving and restoring MSR_PAT
in the VM exit and entry controls.

MFC r272670
Inject #UD into the guest when it executes either 'MONITOR' or 'MWAIT'.

MFC r272710
Implement the FLUSH operation in the virtio-block emulation.

MFC r272838
iasl(8) expects integer fields in data tables to be specified as hexadecimal
values. Therefore the bit width of the "PM Timer Block" was actually being
interpreted as 50 bits instead of the expected 32 bits.

This eliminates an error message emitted by a Linux 3.17 guest during boot:
"Invalid length for FADT/PmTimerBlock: 50, using default 32"
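
The arithmetic behind the 50-versus-32 mismatch can be shown with a short sketch. It assumes the field text was the decimal-looking string "32", and it uses strtoul() only as a stand-in for a parser that reads every integer as hexadecimal; it says nothing about how iasl(8) itself is implemented. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a table field the way a hex-only parser
 * would, regardless of how the author meant the digits to be read.
 */
static unsigned long
parse_field_as_hex(const char *text)
{
	assert(text != NULL);
	return (strtoul(text, NULL, 16));
}
```

Read this way, "32" yields 0x32, which is 50 in decimal, matching the value in the guest's error message. A field meant to describe 32 bits has to be written as "20" (0x20).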

MFC r272839
Support Intel-specific MSRs that are accessed when booting Linux in bhyve:
 - MSR_PLATFORM_INFO
 - MSR_TURBO_RATIO_LIMITx
 - MSR_RAPL_POWER_UNIT

MFC r273108
Emulate "POP r/m". This is needed to boot OpenBSD/i386 MP kernel in bhyve.

MFC r273212
Support stopping and restarting the AHCI command list via toggling PxCMD.ST
from '1' to '0' and back.  This allows the driver a chance to recover if
for instance a timeout occurred due to activity on the host.
2014-12-28 21:27:13 +00:00
jhb
5ae50f92a8 MFC 273988,273989,273995,274057:
MFamd64: Add support for extended FPU states on i386.  This includes
support for AVX on i386.
2014-12-22 21:32:39 +00:00
jhb
71f9e38fa2 MFC 271405,271408,271409,272658:
MFamd64: Use initializecpu() to set various model-specific registers on
AP startup and AP resume (it was already used for BSP startup and BSP
resume).
2014-12-22 19:53:55 +00:00
jhb
2b345a08ed MFC 260557,271076,271077,271082,271083,271098:
- Remove spaces from boot messages when we print the CPU ID/Family/Stepping
- Move prototypes for various functions out of C files and into
  <machine/md_var.h>.
- Reduce diffs between i386 and amd64 initcpu.c and identcpu.c files.
- Move blacklists of broken TSCs out of the printcpuinfo() function
  and into the TSC probe routine.
- Merge the amd64 and i386 identcpu.c into a single x86 implementation.
2014-12-22 18:40:59 +00:00
kib
492a1d38b5 MFC r275833:
The iret instruction may generate #np and #ss faults, besides #gp.
When returning to usermode, the handlers for those exceptions are also
executed with the wrong gs base.  Handle all three possible faults in the
same way, checking for iret fault, and performing full iret.
2014-12-19 09:36:59 +00:00
bryanv
f9a98c5bdd MFC r273515, r274055, r274063, r274215, r274065, r274502:
Add VirtIO console driver.
2014-11-29 22:48:40 +00:00
kib
93517beb67 MFC r274555:
Fix END()s for fueword and fueword64, match the name in END() with
entry.
2014-11-22 09:38:18 +00:00