12877 Commits

Author SHA1 Message Date
ian
8e2dd9b5e3 MFC r257854 (discussed with alc@)
As of r257209, all architectures have defined
  VM_KMEM_SIZE_SCALE.  In other words, every architecture is now
  auto-sizing the kmem arena.  This revision changes kmeminit() so
  that the definition of VM_KMEM_SIZE_SCALE becomes mandatory and
  the definition of VM_KMEM_SIZE becomes optional.

  Replace or eliminate all existing definitions of VM_KMEM_SIZE.
  With auto-sizing enabled, VM_KMEM_SIZE effectively became an
  alternate spelling for VM_KMEM_SIZE_MIN on most architectures.
  Use VM_KMEM_SIZE_MIN for clarity.
2014-05-16 01:30:30 +00:00
scottl
96be897ce1 Merge r264984
Retire smp_active.  It was racey and caused demonstrated problems with
the cpufreq code.  Replace its use with smp_started.  There's at least
one userland tool that still looks at the kern.smp.active sysctl, so
preserve it but point it to smp_started as well.

Obtained from:	Netflix, Inc.
2014-05-07 20:28:27 +00:00
alc
8be11d4db2 MFC r262338
When the kernel is running in a virtual machine, it cannot rely upon the
  processor family to determine if the workaround for AMD Family 10h Erratum
  383 should be enabled.  To enable virtual machine migration among a
  heterogeneous collection of physical machines, the hypervisor may have
  been configured to report an older processor family with a reduced feature
  set.  Effectively, the reported processor family and its features are like
  a "least common denominator" for the collection of machines.

  Therefore, when the kernel is running in a virtual machine, instead of
  relying upon the processor family, we now test for features that prove
  that the underlying processor is not affected by the erratum.  (The
  features that we test for are unlikely to ever be emulated in software
  on an affected physical processor.)

PR:		186061
2014-05-07 00:32:49 +00:00
ken
a354f057f0 MFC the mpr(4) driver for LSI's 12Gb SAS cards.
This includes r265236, r265237, r265241 and r265261:

  ------------------------------------------------------------------------
  r265236 | ken | 2014-05-02 14:25:09 -0600 (Fri, 02 May 2014) | 51 lines

  Bring in the mpr(4) driver for LSI's MPT3 12Gb SAS controllers.

  This is derived from the mps(4) driver, but it supports only the 12Gb
  IT and IR hardware including the SAS 3004, SAS 3008 and SAS 3108.

  Some notes about this driver:
   o The 12Gb hardware can do "FastPath" I/O, and that capability is included in
     this driver.

   o WarpDrive functionality has been removed, since it isn't supported in
     the 12Gb driver interface.

   o The Scatter/Gather list handling code is significantly different between
     the 6Gb and 12Gb hardware.  The 12Gb boards support IEEE Scatter/Gather
     lists.

  Thanks to LSI for developing and testing this driver for FreeBSD.

  share/man/man4/mpr.4:
  	mpr(4) man page.

  sys/dev/mpr/*:
  	mpr(4) driver files.

  sys/modules/Makefile,
  sys/modules/mpr/Makefile:
  	Add a module Makefile for the mpr(4) driver.

  sys/conf/files:
  	Add the mpr(4) driver.

  sys/amd64/conf/GENERIC,
  sys/i386/conf/GENERIC,
  sys/mips/conf/OCTEON1,
  sys/sparc64/conf/GENERIC:
  	Add the mpr(4) driver to all config files that currently
  	have the mps(4) driver.

  sys/ia64/conf/GENERIC:
  	Add the mps(4) and mpr(4) drivers to the ia64 GENERIC
  	config file.

  sys/i386/conf/XEN:
  	Exclude the mpr module from building here.

  Submitted by:	Steve McConnell <Stephen.McConnell@lsi.com>
  Tested by:	Chris Reeves <chrisr@spectralogic.com>
  Sponsored by:	LSI, Spectra Logic
  Relnotes:	LSI 12Gb SAS driver mpr(4) added

  ------------------------------------------------------------------------
  ------------------------------------------------------------------------
  r265237 | ken | 2014-05-02 14:36:20 -0600 (Fri, 02 May 2014) | 8 lines

  Add the mpr(4) man page to the man4 Makefile.

  This should have been included in r265236.

  Submitted by:	Steve McConnell <Stephen.McConnell@lsi.com>
  MFC after:	3 days
  Sponsored by:	LSI, Spectra Logic

  ------------------------------------------------------------------------
  ------------------------------------------------------------------------
  r265241 | brueffer | 2014-05-02 15:14:28 -0600 (Fri, 02 May 2014) | 2 lines

  Use our standard SYNOPSIS wording; perform some cleanup while here.

  ------------------------------------------------------------------------
  ------------------------------------------------------------------------
  r265261 | brueffer | 2014-05-03 05:15:28 -0600 (Sat, 03 May 2014) | 2 lines

  Add a missing colon.

  ------------------------------------------------------------------------

Submitted by:	Steve McConnell <Stephen.McConnell@lsi.com>
Tested by:	Chris Reeves <chrisr@spectralogic.com>
Sponsored by:	LSI, Spectra Logic
Relnotes:	LSI 12Gb SAS driver mpr(4) added
2014-05-05 20:35:35 +00:00
kib
660fbf80db MFC r263912:
Clear the kernel grab of the FPU state on fork.
2014-04-05 14:24:29 +00:00
royger
382b727d12 MFC r263001
Move asm IPIs handlers to C code, so both Xen and native IPI handlers
share the same code.

Approved by: gibbs
Sponsored by: Citrix Systems R&D
2014-04-04 14:54:54 +00:00
dim
9cedb8bb69 MFC 261991:
Upgrade our copy of llvm/clang to 3.4 release.  This version supports
all of the features in the current working draft of the upcoming C++
standard, provisionally named C++1y.

The code generator's performance is greatly increased, and the loop
auto-vectorizer is now enabled at -Os and -O2 in addition to -O3.  The
PowerPC backend has made several major improvements to code generation
quality and compile time, and the X86, SPARC, ARM32, Aarch64 and SystemZ
backends have all seen major feature work.

Release notes for llvm and clang can be found here:
<http://llvm.org/releases/3.4/docs/ReleaseNotes.html>
<http://llvm.org/releases/3.4/tools/clang/docs/ReleaseNotes.html>

MFC 262121 (by emaste):

Update lldb for clang/llvm 3.4 import

This commit largely restores the lldb source to the upstream r196259
snapshot with the addition of threaded inferior support and a few bug
fixes.

Specific upstream lldb revisions restored include:
   SVN      git
  181387  779e6ac
  181703  7bef4e2
  182099  b31044e
  182650  f2dcf35
  182683  0d91b80
  183862  15c1774
  183929  99447a6
  184177  0b2934b
  184948  4dc3761
  184954  007e7bc
  186990  eebd175

Sponsored by:	DARPA, AFRL

MFC 262186 (by emaste):

Fix mismerge in r262121

A break statement was lost in the merge.  The error had no functional
impact, but restore it to reduce the diff against upstream.

MFC 262303:

Pull in r197521 from upstream clang trunk (by rdivacky):

  Use the integrated assembler by default on FreeBSD/ppc and ppc64.

Requested by:	jhibbits

MFC 262611:

Pull in r196874 from upstream llvm trunk:

  Fix a crash that occurs when PWD is invalid.

  MCJIT needs to be able to run in hostile environments, even when PWD
  is invalid. There's no need to crash MCJIT in this case.

  The obvious fix is to simply leave MCContext's CompilationDir empty
  when PWD can't be determined. This way, MCJIT clients,
  and other clients that link with LLVM don't need a valid working directory.

  If we do want to guarantee valid CompilationDir, that should be done
  only for clients of getCompilationDir(). This is as simple as checking
  for an empty string.

  The only current use of getCompilationDir is EmitGenDwarfInfo, which
  won't conceivably run with an invalid working dir. However, in the
  purely hypothetically and untestable case that this happens, the
  AT_comp_dir will be omitted from the compilation_unit DIE.

This should help fix assertions occurring with ports-mgmt/tinderbox,
when it is using jails, and sometimes invalidates clang's current
working directory.

Reported by:	decke

MFC 262809:

Pull in r203007 from upstream clang trunk:

  Don't produce an alias between destructors with different calling conventions.

  Fixes pr19007.

(Please note that is an LLVM PR identifier, not a FreeBSD one.)

This should fix Firefox and/or libxul crashes (due to problems with
regparm/stdcall calling conventions) on i386.

Reported by:	multiple users on freebsd-current
PR:		bin/187103

MFC 263048:

Repair recognition of "CC" as an alias for the C++ compiler, since it
was silently broken by upstream for a Windows-specific use-case.

Apparently some versions of CMake still rely on this archaic feature...

Reported by:	rakuco

MFC 263049:

Garbage collect the old way of adding the libstdc++ include directories
in clang's InitHeaderSearch.cpp.  This has been superseded by David
Chisnall's commit in r255321.

Moreover, if libc++ is used, the libstdc++ include directories should
not be in the search path at all.  These directories are now only used
if you pass -stdlib=libstdc++.
2014-03-21 17:53:59 +00:00
jhb
59b6242e90 MFC 259016,259019,259049,259071,259102,259110,259129,259130,259178,259179,
259203,259221,259261,259532,259615,259650,259651,259667,259680,259727,
259761,259772,259776,259777,259830,259882,259915,260160,260449,260450,
260688,260888,260953,261269,261547,261551,261552,261553,261585:
Merge the vt(4) driver (newcons) to stable/10.

Approved by:	ray
2014-03-06 18:30:56 +00:00
jhb
5b0bab6d24 MFC 261517,261520:
Convert the license on files where I am the sole copyright holder to
2 clause BSD licenses.
2014-02-18 20:27:17 +00:00
jhb
8e8bf4982f MFC 259140:
Move constants for indices in the local APIC's local vector table from
apicvar.h to apicreg.h.
2014-02-18 01:15:32 +00:00
avg
b46715eb45 MFC r257417: Remove references to an unused fasttrap probe hook 2014-02-17 12:57:13 +00:00
eadler
ec294fd7f5 MFC r258779,r258780,r258787,r258822:
Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this
shifts into the sign bit.  Instead use (1U << 31) which gets the
expected result.

Similar to the (1 << 31) case it is not defined to do (2 << 30).

This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.

A similar change was made in OpenBSD.
2014-02-04 03:36:42 +00:00
kib
2c4d831850 MFC DMAR busdma implementation.
MFC r257251:
Import the driver for VT-d DMAR hardware.  Implement the busdma(9) using DMARs.

MFC r257512:
Add support for queued invalidation.

MFC miscellaneous follow-ups to r257251.

MFC r257266:
Remove redundand assignment to error variable and check for its value.

MFC r257308:
Remove redundand declaration.

MFC r257511:
Return BUS_PROBE_NOWILDCARD from the DMAR probe method.

MFC r257860,r257896,r257900,r257902,r257903 (by dim):
Fixes for gcc compilation.
2013-12-17 13:49:35 +00:00
delphij
a12532878e MFC r258950:
Enable Hyper-V support in i386 GENERIC.
2013-12-16 06:41:31 +00:00
delphij
4490be87f5 MFC r258948:
Support Hyper-V on i386:

 - Add 'hyperv' module into build;
 - Allow building Hyper-V support as part of the kernel;
 - Hook Hyper-V build into NOTES.

Approved by:	re (rodrigc)
2013-12-06 23:40:50 +00:00
royger
00083cffc4 MFC 258176:
Fix accounting for hw.realmem on the i386 and amd64 platforms.

sys/i386/i386/machdep.c:
sys/amd64/amd64/machdep.c:
	The value reported by FreeBSD as "real memory" when booting
	doesn't match what is later reported by sysctl as hw.realmem.
	This is due to the fact that the value printed during the
	boot process is fetched from smbios data (when possible),
	and accounts for holes in physical memory. On the other
	hand, the value of hw.realmem is unconditionally set to be
	one larger than the highest page of the physical address
	space.

	Fix this by setting hw.realmem to the same value printed
	during boot, this makes hw.realmem honour it's name and
	account properly for physical memory present in the system.

Submitted by:	Roger Pau Monné
Reviewed by:	gibbs
Approved by:	gibbs (mentor)
Approved by:	re (gjb)
2013-12-05 18:08:05 +00:00
emaste
b0519089ed MFC r258135: x86: Allow users to change PSL_RF via ptrace(PT_SETREGS...)
Debuggers may need to change PSL_RF. Note that tf_eflags is already stored
  in the signal context during signal handling and PSL_RF previously could
  be modified via sigreturn, so this change should not provide any new
  ability to userspace.

  For background see the thread at:
  http://lists.freebsd.org/pipermail/freebsd-i386/2007-September/005910.html

  Reviewed by:	jhb, kib

Sponsored by:	DARPA, AFRL
Approved by:	re (gjb)
2013-11-25 15:58:48 +00:00
dim
fe09c80878 MFC r258016:
Disable building the ctl module for the i386 XEN kernel configuration
for now, since it causes gcc warnings about casting 64 bit bus_addr_t's
to 32 bit pointers, and vice versa.

Reviewed by:	ken
Approved by:	re (gjb)
2013-11-18 15:13:58 +00:00
kib
5b26108b2e MFC r257856:
Add bits for the AMD features from CPUID function 0x80000001 ECX,
described in the rev. 3.0 of the Kabini BKDG, document 48751.pdf.

Approved by:	re (gjb)
2013-11-15 07:10:42 +00:00
kib
bc035b6554 MFC r257858:
Fix signal delivery for the iBCS2 binaries.

Approved by:	re (gjb)
2013-11-15 07:09:24 +00:00
gjb
3595c37915 MFC r256328:
Document XENHVM and xenpci are mutually inclusive.

Approved by:    re (delphij)
Sponsored by:   The FreeBSD Foundation
2013-10-11 19:43:37 +00:00
gjb
cfff9c715e - Remove debugging from GENERIC* kernel configurations
- Enable MALLOC_PRODUCTION
- Default dumpdev=NO
- Remove UPDATING entry regarding debugging features
- Bump __FreeBSD_version to 1000500

Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2013-10-10 17:59:44 +00:00
dim
d09f35c8d5 Remove redundant declarations of szsigcode and sigcode in
sys/i386/ibcs2/ibcs2_sysvec.c, to silence two gcc warnings.

Approved by:	re (gjb)
MFC after:	3 days
2013-10-07 16:57:48 +00:00
dim
7eedec3f1c Remove redundant declaration of force_evtchn_callback() in the
i386-specific xen-os.h, to silence a gcc warning.

Approved by:	re (gjb)
MFC after:	3 days
2013-10-07 16:53:26 +00:00
gibbs
9c8c76f921 Formalize the concept of virtual CPU ids by adding a per-cpu vcpu_id
field.  Perform vcpu enumeration for Xen PV and HVM environments
and convert all Xen drivers to use vcpu_id instead of a hard coded
assumption of the mapping algorithm (acpi or apic ID) in use.

Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)

amd64/include/pcpu.h:
i386/include/pcpu.h:
	Add vcpu_id to the amd64 and i386 pcpu structures.

dev/xen/timer/timer.c
x86/xen/xen_intr.c
	Use new vcpu_id instead of assuming acpi_id == vcpu_id.

i386/xen/mp_machdep.c:
i386/xen/mptable.c
x86/xen/hvm.c:
	Perform Xen HVM and Xen full PV vcpu_id mapping.

x86/xen/hvm.c:
x86/acpica/madt.c
	Change SYSINIT ordering of acpi CPU enumeration so that it
	is guaranteed to be available at the time of Xen HVM vcpu
	id mapping.
2013-10-05 23:11:01 +00:00
jmg
cb6017acba add aesni module to i386 and amd64 NOTES...
Approved by:	re (gjb)
2013-10-04 17:21:01 +00:00
kib
776b37339a Free both KVA and backing pages when freeing TSS memory.
Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
Approved by:	re (marius)
2013-09-23 20:14:15 +00:00
glebius
12fb397079 - Create kern.ipc.sendfile namespace, and put the new "readhead" OID
there as "kern.ipc.sendfile.readahead".
- Push all nsfbuf related tunables into MD code. Don't move them
  to new namespace in favor of POLA.

Reviewed by:	scottl
Approved by:	re (gjb)
2013-09-22 13:36:52 +00:00
gibbs
d9f8e5d07b Fix compilation of the i386 PAE kernel config.
sys/i386/include/xen/xenvar.h:
	Provide vtomach() when PAE is defined.

Approved by:	re (blanket Xen)
2013-09-22 00:54:22 +00:00
gibbs
7ed30adae7 Merge Xen PVHVM support into the GENERIC kernel config for both
amd64 and i386.

Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)
MFC after:	2 weeks

sys/amd64/amd64/mp_machdep.c:
sys/amd64/include/cpu.h:
sys/i386/i386/mp_machdep.c:
sys/i386/include/cpu.h:
	- Introduce two new CPU hooks for initialization and resume
	  purposes. This allows us to get rid of the XENHVM ifdefs in
	  mp_machdep, and also sets some hooks into common code that can be
	  used by other hypervisor implementations.

sys/amd64/conf/XENHVM:
sys/i386/conf/XENHVM:
	- Remove these configs now that GENERIC has builtin support for Xen
	  HVM.

sys/kern/subr_smp.c:
	- Make sure there are no pending IPIs when suspending a system.

sys/x86/xen/hvm.c:
	- Add cpu init and resume vectors that are called from mp_machdep
	  using the new hooks.
	- Only clear the vcpu_info mapping data on resume.  It is already
	  clear for the BSP on a cold boot and is set correctly as APs
	  are started.
	- Gate xen_hvm_init_cpu only to systems running under Xen.

sys/x86/xen/xen_intr.c:
	 - Gate the setup of event channels only to systems running under Xen.
2013-09-20 22:59:22 +00:00
davidch
e226cdd9b8 Substantial rewrite of bxe(4) to add support for the BCM57712 and
BCM578XX controllers.

Approved by:	re
MFC after:	4 weeks
2013-09-20 20:18:49 +00:00
gibbs
a9c07a6f67 Add support for suspend/resume/migration operations when running as a
Xen PVHVM guest.

Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)
MFC after:	2 weeks

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
	- Make sure that are no MMU related IPIs pending on migration.
	- Reset pending IPI_BITMAP on resume.
	- Init vcpu_info on resume.

sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
sys/x86/acpica/acpi_wakeup.c:
sys/x86/x86/intr_machdep.c:
sys/x86/isa/atpic.c:
sys/x86/x86/io_apic.c:
sys/x86/x86/local_apic.c:
	- Add a "suspend_cancelled" parameter to pic_resume().  For the
	  Xen PIC, restoration of interrupt services differs between
	  the aborted suspend and normal resume cases, so we must provide
	  this information.

sys/dev/acpica/acpi_timer.c:
sys/dev/xen/timer/timer.c:
sys/timetc.h:
	- Don't swap out "suspend safe" timers across a suspend/resume
	  cycle.  This includes the Xen PV and ACPI timers.

sys/dev/xen/control/control.c:
	- Perform proper suspend/resume process for PVHVM:
		- Suspend all APs before going into suspension, this allows us
		  to reset the vcpu_info on resume for each AP.
		- Reset shared info page and callback on resume.

sys/dev/xen/timer/timer.c:
	- Implement suspend/resume support for the PV timer. Since FreeBSD
	  doesn't perform a per-cpu resume of the timer, we need to call
	  smp_rendezvous in order to correctly resume the timer on each CPU.

sys/dev/xen/xenpci/xenpci.c:
	- Don't reset the PCI interrupt on each suspend/resume.

sys/kern/subr_smp.c:
	- When suspending a PVHVM domain make sure there are no MMU IPIs
	  in-flight, or we will get a lockup on resume due to the fact that
	  pending event channels are not carried over on migration.
	- Implement a generic version of restart_cpus that can be used by
	  suspended and stopped cpus.

sys/x86/xen/hvm.c:
	- Implement resume support for the hypercall page and shared info.
	- Clear vcpu_info so it can be reset by APs when resuming from
	  suspension.

sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/x86/xen/xen_intr.c:
	- Support UP kernel configurations.

sys/x86/xen/xen_intr.c:
	- Properly rebind per-cpus VIRQs and IPIs on resume.
2013-09-20 05:06:03 +00:00
gibbs
45812e0f97 sys/i386/xen/mp_machdep.c:
sys/i386/xen/mptable.c:
	Set PCPU apic_id and acpi_id fields in a fasion compatible with
	both UP and SMP configurations.

Suggested by:	jhb
Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)
MFC after:	2 weeks
2013-09-20 04:35:09 +00:00
alc
88a4d0f31a The pmap function pmap_clear_reference() is no longer used. Remove it.
pmap_clear_reference() has had exactly one caller in the kernel for
several years, more precisely, since FreeBSD 8.  Now, that call no
longer exists.

Approved by:	re (kib)
Sponsored by:	EMC / Isilon Storage Division
2013-09-20 04:30:18 +00:00
gibbs
6f13d59af6 sys/i386/xen_mp_machdep.c:
Set a 'fake' acpi_id for the i386 PV port, it is needed in
	order to use VIRQs or IPI event channels.

Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)
MFC after:	2 weeks
2013-09-19 14:41:10 +00:00
pjd
667d7255be Fix panic in ktrcapfail() when no capability rights are passed.
While here, correct all consumers to pass NULL instead of 0 as we pass
capability rights as pointers now, not uint64_t.

Reported by:	Daniel Peyrolon
Tested by:	Daniel Peyrolon
Approved by:	re (marius)
2013-09-18 19:26:08 +00:00
rdivacky
9e8e4eba85 Regen.
Approved by:	re (delphij)
2013-09-18 18:49:26 +00:00
rdivacky
fae003a069 Revert r255672, it has some serious flaws, leaking file references etc.
Approved by:	re (delphij)
2013-09-18 18:48:33 +00:00
rdivacky
b320aa22a6 Regen.
Approved by:    re (delphij)
2013-09-18 17:58:03 +00:00
rdivacky
d57db3eead Implement epoll support in Linuxulator. This is a tiny wrapper around kqueue
to implement epoll subset of functionality. The kqueue user data are 32bit
on i386 which is not enough for epoll user data so this patch overrides
kqueue fileops to maintain enough space in struct file.

Initial patch developed by me in 2007 and then extended and finished
by Yuri Victorovich.

Approved by:    re (delphij)
Sponsored by:   Google Summer of Code
Submitted by:   Yuri Victorovich <yuri at rawbw dot com>
Tested by:      Yuri Victorovich <yuri at rawbw dot com>
2013-09-18 17:56:04 +00:00
bryanv
5dc84522d4 Add vmx(4) to i386 and amd64 GENERIC
Approved by:	re (gjb)
2013-09-17 01:54:13 +00:00
kib
d1fc4ed2e1 Merge the change r255607 from amd64 to i386.
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (gjb)
2013-09-16 19:58:37 +00:00
alc
11b50ccc0e Prior to r254304, we only began scanning the active page queue when the
amount of free memory was close to the point at which we would begin
reclaiming pages.  Now, we continuously scan the active page queue,
regardless of the amount of free memory.  Consequently, we are continuously
calling pmap_ts_referenced() on active pages.

Prior to this change, pmap_ts_referenced() would always demote superpage
mappings in order to obtain finer-grained reference information.  This made
sense because we were coming under memory pressure and would soon have to
begin reclaiming pages.  Now, however, with continuous scanning of the
active page queue, these demotions are taking a toll on performance.  To
address this problem, I have replaced the demotion with a heuristic for
periodically clearing the reference flag on superpage mappings.

Approved by:	re (kib)
Sponsored by:	EMC / Isilon Storage Division
2013-09-11 17:23:42 +00:00
jhb
04bb6e10cd Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use
an address in the first 2GB of the process's address space.  This flag should
have the same semantics as the same flag on Linux.

To facilitate this, add a new parameter to vm_map_find() that specifies an
optional maximum virtual address.  While here, fix several callers of
vm_map_find() to use a VMFS_* constant for the findspace argument instead of
TRUE and FALSE.

Reviewed by:	alc
Approved by:	re (kib)
2013-09-09 18:11:59 +00:00
gibbs
437790b349 Implement PV IPIs for PVHVM guests and further converge PV and HVM
IPI implmementations.

Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
Submitted by: gibbs (misc cleanup, table driven config)
Reviewed by:  gibbs
MFC after: 2 weeks

sys/amd64/include/cpufunc.h:
sys/amd64/amd64/pmap.c:
	Move invltlb_globpcid() into cpufunc.h so that it can be
	used by the Xen HVM version of tlb shootdown IPI handlers.

sys/x86/xen/xen_intr.c:
sys/xen/xen_intr.h:
	Rename xen_intr_bind_ipi() to xen_intr_alloc_and_bind_ipi(),
	and remove the ipi vector parameter.  This api allocates
	an event channel port that can be used for ipi services,
	but knows nothing of the actual ipi for which that port
	will be used.  Removing the unused argument and cleaning
	up the comments surrounding its declaration helps clarify
	its actual role.

sys/amd64/amd64/mp_machdep.c:
sys/amd64/include/cpu.h:
sys/i386/i386/mp_machdep.c:
sys/i386/include/cpu.h:
	Implement a generic framework for amd64 and i386 that allows
	the implementation of certain CPU management functions to
	be selected at runtime.  Currently this is only used for
	the ipi send function, which we optimize for Xen when running
	on a Xen hypervisor, but can easily be expanded to support
	more operations.

sys/x86/xen/hvm.c:
	Implement Xen PV IPI handlers and operations, replacing native
	send IPI.

sys/amd64/include/pcpu.h:
sys/i386/include/pcpu.h:
sys/i386/include/smp.h:
	Remove NR_VIRQS and NR_IPIS from FreeBSD headers.  NR_VIRQS
	is defined already for us in the xen interface files.
	NR_IPIS is only needed in one file per Xen platform and is
	easily inferred by the IPI vector table that is defined in
	those files.

sys/i386/xen/mp_machdep.c:
	Restructure to more closely match the HVM implementation by
	performing table driven IPI setup.
2013-09-06 22:17:02 +00:00
bryanv
f46bf10bc5 Add vmx device to the i386 and amd64 NOTES files 2013-09-06 20:24:21 +00:00
glebius
cf37135185 Fix build with gcc. Move sf_buf_alloc()/sf_buf_free() declarations
to MD headers.
2013-09-06 17:44:13 +00:00
pjd
029a6f5d92 Change the cap_rights_t type from uint64_t to a structure that we can extend
in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

	struct cap_rights {
		uint64_t	cr_rights[CAP_RIGHTS_VERSION + 2];
	};

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.

The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.

To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.

	#define	CAP_PDKILL	CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:

	#define	CAP_LOOKUP	CAPRIGHT(0, 0x0000000000000400ULL)
	#define	CAP_FCHMOD	CAPRIGHT(0, 0x0000000000002000ULL)

	#define	CAP_FCHMODAT	(CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

	cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
	void cap_rights_set(cap_rights_t *rights, ...);
	void cap_rights_clear(cap_rights_t *rights, ...);
	bool cap_rights_is_set(const cap_rights_t *rights, ...);

	bool cap_rights_is_valid(const cap_rights_t *rights);
	void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
	void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
	bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:

	cap_rights_t rights;

	cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:

	#define	cap_rights_set(rights, ...)				\
		__cap_rights_set((rights), __VA_ARGS__, 0ULL)
	void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:

	cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.

This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.

Sponsored by:	The FreeBSD Foundation
2013-09-05 00:09:56 +00:00
gibbs
770a4ce79b Better conformance to style(9) and organizational cleanup.
No functional changes.

sys/i386/xen/mp_machdep.c:
	Remove extra newlines.

	Group externs, forward delarations, local types, and pcpu data.

	Wrap at 80 columns.

	Use parens in return statements.

	Tab indent members of array initializers.

MFC after:	2 weeks
2013-09-02 22:22:56 +00:00
gibbs
f5ff33730e Introduce a new, HVM compatible, paravirtualized timer driver for Xen.
Use this new driver for both PV and HVM instances.

This driver requires a Xen hypervisor that supports vector callbacks,
VCPUOP hypercalls, and reports that it has a "safe PV clock".

New timer driver:
Submitted by: will
Sponsored by: Spectra Logic Corporation

PV port to new driver, and bug fixes:
Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D

sys/dev/xen/timer/timer.c:
	- Register a PV timer device driver which (currently)
	  implements device_{identify,probe,attach} and stubs
	  device_detach.  The detach routine requires functionality
	  not provided by timecounters(4).  The suspend and resume
	  routines need additional work (due to Xen requiring that
	  the hypercalls be executed on the target VCPU), and aren't
	  needed for our purposes.

	- Make sure there can only be one device instance of this
	  driver, and that it only registers one eventtimers(4) and
	  one timecounters(4) device interface.  Make both interfaces
	  use PCPU data as needed.

	- Match, with a few style cleanups & API differences, the
	  Xen versions of the "fetch time" functions.

	- Document the magic scale_delta() better for the i386 version.

	- When registering the event timer, bind a separate event
	  channel for the timer VIRQ to the device's event timer
	  interrupt handler for each active VCPU.  Describe each
	  interrupt as "xen_et:c%d", so they can be identified per
	  CPU in "vmstat -i" or "show intrcnt" in KDB.

	- When scheduling a timer into the hypervisor, try up to
	  60 times if the hypervisor rejects the time as being in
	  the past.  In the common case, this retry shouldn't happen,
	  and if it does, it should only happen once.  This is
	  because the event timer advertises a minimum period of
	  100usec, which is only less than the usual hypercall round
	  trip time about 1 out of every 100 tries.  (Unlike other
	  similar drivers, this one actually checks whether the
	  hypervisor accepted the singleshot timer set hypercall.)

	- Implement a RTC PV clock based on the hypervisor wallclock.

sys/conf/files:
	- Add dev/xen/timer/timer.c if the kernel configuration
	  includes either the XEN or XENHVM options.

sys/conf/files.i386:
sys/i386/include/xen/xen_clock_util.h:
sys/i386/xen/clock.c:
sys/i386/xen/xen_clock_util.c:
sys/i386/xen/mp_machdep.c:
sys/i386/xen/xen_rtc.c:
	- Remove previous PV timer used in i386 XEN PV kernels, the
	  new timer introduced in this change is used instead (so
	  we share the same code between PVHVM and PV).

MFC after: 2 weeks
2013-08-29 23:11:58 +00:00