Commit Graph

193786 Commits

Author SHA1 Message Date
gibbs
9c8c76f921 Formalize the concept of virtual CPU ids by adding a per-cpu vcpu_id
field.  Perform vcpu enumeration for Xen PV and HVM environments
and convert all Xen drivers to use vcpu_id instead of a hard coded
assumption of the mapping algorithm (acpi or apic ID) in use.

Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)

amd64/include/pcpu.h:
i386/include/pcpu.h:
	Add vcpu_id to the amd64 and i386 pcpu structures.

dev/xen/timer/timer.c
x86/xen/xen_intr.c
	Use new vcpu_id instead of assuming acpi_id == vcpu_id.

i386/xen/mp_machdep.c:
i386/xen/mptable.c
x86/xen/hvm.c:
	Perform Xen HVM and Xen full PV vcpu_id mapping.

x86/xen/hvm.c:
x86/acpica/madt.c
	Change SYSINIT ordering of acpi CPU enumeration so that it
	is guaranteed to be available at the time of Xen HVM vcpu
	id mapping.
2013-10-05 23:11:01 +00:00
neel
aed205d5cd Merge projects/bhyve_npt_pmap into head.
Make the amd64/pmap code aware of nested page table mappings used by bhyve
guests. This allows bhyve to associate each guest with its own vmspace and
deal with nested page faults in the context of that vmspace. This also
enables features like accessed/dirty bit tracking, swapping to disk and
transparent superpage promotions of guest memory.

Guest vmspace:
Each bhyve guest has a unique vmspace to represent the physical memory
allocated to the guest. Each memory segment allocated by the guest is
mapped into the guest's address space via the 'vmspace->vm_map' and is
backed by an object of type OBJT_DEFAULT.

pmap types:
The amd64/pmap now understands two types of pmaps: PT_X86 and PT_EPT.

The PT_X86 pmap type is used by the vmspace associated with the host kernel
as well as user processes executing on the host. The PT_EPT pmap is used by
the vmspace associated with a bhyve guest.

Page Table Entries:
The EPT page table entries are mostly similar in functionality to regular
page table entries although there are some differences in terms of what
bits are used to express that functionality. For example, the dirty bit is
represented by bit 9 in the nested PTE as opposed to bit 6 in the regular
x86 PTE. Therefore the bitmask representing the dirty bit is now computed
at runtime based on the type of the pmap. Thus PG_M that was previously a
macro now becomes a local variable that is initialized at runtime using
'pmap_modified_bit(pmap)'.
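
A minimal sketch of the runtime selection described above, assuming the
non-emulated case (dirty bit 6 for x86, bit 9 for EPT). The struct, the
constants and the '_sketch' names are hypothetical stand-ins, not the
kernel's definitions:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's pmap types; not the real definitions. */
enum pmap_type { PT_X86, PT_EPT };

struct pmap_sketch {
	enum pmap_type pm_type;
};

#define X86_PG_M_SKETCH	(UINT64_C(1) << 6)	/* dirty bit in a regular x86 PTE */
#define EPT_PG_M_SKETCH	(UINT64_C(1) << 9)	/* dirty bit in a nested (EPT) PTE */

/* Return the dirty-bit mask that applies to this pmap's page tables. */
static uint64_t
modified_bit_sketch(const struct pmap_sketch *pmap)
{
	return (pmap->pm_type == PT_EPT ? EPT_PG_M_SKETCH : X86_PG_M_SKETCH);
}

/* Test whether a PTE is dirty, using a mask chosen at runtime. */
static int
pte_is_dirty_sketch(const struct pmap_sketch *pmap, uint64_t pte)
{
	uint64_t PG_M = modified_bit_sketch(pmap);	/* local variable, not a macro */

	return ((pte & PG_M) != 0);
}
```

The same PTE value can therefore be dirty under one pmap type and clean
under the other, which is why a compile-time macro no longer suffices.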

An additional wrinkle associated with EPT mappings is that older Intel
processors don't have hardware support for tracking accessed/dirty bits in
the PTE. This means that the amd64/pmap code needs to emulate these bits to
provide proper accounting to the VM subsystem. This is achieved by using
the following mapping for EPT entries that need emulation of A/D bits:
               Bit Position           Interpreted By
PG_V               52                 software (accessed bit emulation handler)
PG_RW              53                 software (dirty bit emulation handler)
PG_A               0                  hardware (aka EPT_PG_RD)
PG_M               1                  hardware (aka EPT_PG_WR)

The idea to use the mapping listed above for A/D bit emulation came from
Alan Cox (alc@).

The final difference with respect to x86 PTEs is that some EPT implementations
do not support superpage mappings. This is recorded in the 'pm_flags' field
of the pmap.

TLB invalidation:
The amd64/pmap code has a number of ways to do invalidation of mappings
that may be cached in the TLB: single page, multiple pages in a range or the
entire TLB. All of these funnel into a single EPT invalidation routine called
'pmap_invalidate_ept()'. This routine bumps up the EPT generation number and
sends an IPI to the host cpus that are executing the guest's vcpus. On a
subsequent entry into the guest it will detect that the EPT has changed and
invalidate the mappings from the TLB.
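
The generation-number scheme can be sketched roughly as follows. This is a
single-threaded illustration only: the IPI is omitted, real code would need
atomic operations, and every name ending in '_sketch' is hypothetical:

```c
#include <stdint.h>

/* Hypothetical per-guest and per-vcpu state; names are illustrative only. */
struct ept_sketch {
	uint64_t generation;		/* bumped on every invalidation request */
};

struct vcpu_sketch {
	uint64_t seen_generation;	/* generation observed at last guest entry */
	unsigned flushes;		/* number of TLB flushes performed */
};

/* Invalidation side: record that mappings changed (IPI delivery omitted). */
static void
invalidate_ept_sketch(struct ept_sketch *ept)
{
	ept->generation++;
}

/* Guest-entry side: flush only if the EPT changed since the last entry. */
static void
guest_enter_sketch(const struct ept_sketch *ept, struct vcpu_sketch *vcpu)
{
	if (vcpu->seen_generation != ept->generation) {
		vcpu->flushes++;	/* stands in for the real TLB invalidation */
		vcpu->seen_generation = ept->generation;
	}
}
```

Several invalidation requests between two guest entries collapse into a
single flush, since only the latest generation matters at entry time.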

Guest memory access:
Since the guest memory is no longer wired we need to hold the host physical
page that backs the guest physical page before we can access it. The helper
functions 'vm_gpa_hold()/vm_gpa_release()' are available for this purpose.

PCI passthru:
Guests with PCI passthru devices will wire the entire guest physical address
space. The MMIO BAR associated with the passthru device is backed by a
vm_object of type OBJT_SG. An IOMMU domain is created only for guests that
have one or more PCI passthru devices attached to them.

Limitations:
There isn't a way to map a guest physical page without execute permissions.
This is because the amd64/pmap code interprets the guest physical mappings as
user mappings since they are numerically below VM_MAXUSER_ADDRESS. Since PG_U
shares the same bit position as EPT_PG_EXECUTE all guest mappings become
automatically executable.

Thanks to Alan Cox and Konstantin Belousov for their rigorous code reviews
as well as their support and encouragement.

Thanks to John Baldwin for reviewing the use of OBJT_SG as the backing
object for pci passthru mmio regions.

Special thanks to Peter Holm for testing the patch on short notice.

Approved by:	re
Discussed with:	grehan
Reviewed by:	alc, kib
Tested by:	pho
2013-10-05 21:22:35 +00:00
gibbs
716c2031c7 Correct panic caused by attaching both Xen PV and HyperV virtualization
aware drivers on Xen hypervisors that advertise support for some
HyperV features.

x86/xen/hvm.c:
	When running in HVM mode on a Xen hypervisor, set vm_guest
	to VM_GUEST_XEN so other virtualization aware components in
	the FreeBSD kernel can detect this mode is active.

dev/hyperv/vmbus/hv_hv.c:
	Use vm_guest to ignore Xen's HyperV emulation when Xen is
	detected and Xen PV drivers are active.

Reported by:	Shanker Balan
Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (Xen blanket)
2013-10-05 19:51:09 +00:00
hiren
e6885256cd Expose system level ixgbe sysctls.
Device level sysctls are already exposed as dev.ix.<device>

Fix the case where the number of queues for igb is auto-tuned and
hw.igb.num_queues does not return the current/updated value.

Reviewed by:	jfv
Approved by:	re (delphij)
MFC after:	2 weeks
2013-10-05 19:17:56 +00:00
alc
4c5162818b Tidy up kmeminit(): Since r245575, 'nmbclusters' is calculated after
kmeminit() runs, so it contributes nothing to 'vm_kmem_size'; update a
comment to reflect that r254025 replaced the kmem submap with the kmem
arena.

Reviewed by:	kib
Approved by:	re (gjb)
Sponsored by:	EMC / Isilon Storage Division
2013-10-05 18:53:03 +00:00
bryanv
181aa517b6 Do not hold the vtnet Rx queue lock when calling up into the stack
This matches other similar drivers and avoids various LOR warnings.

Approved by:	re (marius)
2013-10-05 18:07:24 +00:00
trasz
e19bd87874 Split cfiscsi_datamove() in two; no functional changes.
Approved by:	re (glebius)
Sponsored by:	FreeBSD Foundation
2013-10-05 16:22:33 +00:00
grehan
26b6c9487a Remove obsolete cmd-line options and code associated with
these.
The mux-vcpus option may return at some point, given its utility
in finding bhyve (and FreeBSD) bugs.

Approved by:	re@ (blanket)
Discussed with:	neel@
2013-10-04 23:29:07 +00:00
kib
677c1f8ce9 Add padding to match the compat32 struct stat32 definition to the real
struct stat on 32bit architectures.

Debugged and tested by:	bsam
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (marius)
2013-10-04 22:05:23 +00:00
jilles
64b622034f kldxref: Do not depend on the directory order.
Sort the filenames to get a consistent result between machines of the same
architecture.

Also, sort FTS_D entries after other entries so kldxref -R works properly in
the uncommon case that a directory contains both subdirectories and modules.
Previously, this may have happened to work, depending on the order of files
in the directory.
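
The ordering described above might look roughly like the comparator below.
It is a sketch over a hypothetical 'entry_sketch' struct, not the FTSENT
based comparison kldxref actually uses:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical directory entry; the real code compares FTSENT structures. */
struct entry_sketch {
	const char *name;
	int is_dir;	/* nonzero for directories (FTS_D in fts(3) terms) */
};

/* Order non-directories first, then directories; sort by name within each group. */
static int
compare_entries_sketch(const void *a, const void *b)
{
	const struct entry_sketch *ea = a;
	const struct entry_sketch *eb = b;

	if ((ea->is_dir != 0) != (eb->is_dir != 0))
		return (ea->is_dir != 0 ? 1 : -1);
	return (strcmp(ea->name, eb->name));
}

/* Sort a directory listing so the result does not depend on on-disk order. */
static void
sort_entries_sketch(struct entry_sketch *entries, size_t n)
{
	qsort(entries, n, sizeof(entries[0]), compare_entries_sketch);
}
```

With this ordering the modules of a directory are all visited before any of
its subdirectories, whatever order the file system returns them in.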

PR:		bin/182098
Submitted by:	Derek Schrock (original version)
Tested by:	Derek Schrock
Approved by:	re (delphij)
MFC after:	1 week
2013-10-04 21:25:55 +00:00
trasz
cf59042d5a Don't leak memory when removing an unconnected session, and remove useless
UMA_ZONE_NOFREE that caused another leak when unloading the module.

Approved by:	re (glebius)
Sponsored by:	FreeBSD Foundation
2013-10-04 19:31:41 +00:00
grehan
56fd486581 Hook up the AHCI and blockif code to the build.
Approved by:	re@ (blanket)
2013-10-04 18:44:47 +00:00
grehan
00cc733e89 Import Zhixiang Yu's GSoC'13 AHCI emulation:
https://wiki.freebsd.org/SummerOfCode2013/bhyveAHCI

This provides ICH8 SATA disk and ATAPI ports, selectable
via the bhyve slot command-line parameter:

SATA
  -s <slot>,ahci-hd,<image-file>

ATAPI
  -s <slot>,ahci-cd,<image-file>

Slight modifications by:	grehan@
Approved by:	re@ (blanket)
Obtained from:	FreeBSD GSoC'13
2013-10-04 18:31:38 +00:00
nwhitehorn
91400d13ad Disable use of compiler atomic builtins. For APR, this is limited to
architectures where they are known not to work. For SVN itself, use
the least common denominator and disable them across the board. This
allows svnlite to build and run on all FreeBSD architectures.

Approved by:	re (gjb)
2013-10-04 18:27:02 +00:00
jmg
cb6017acba Add aesni module to i386 and amd64 NOTES...
Approved by:	re (gjb)
2013-10-04 17:21:01 +00:00
grehan
6d6539c1e2 Block-layer backend interface for bhyve block-io device emulations.
Approved by:	re@ (blanket)
2013-10-04 16:52:03 +00:00
joel
1906ec25d3 mdoc: remove EOL whitespace.
Approved by:	re (blanket)
2013-10-04 16:44:24 +00:00
trasz
8dd526065c Remove useless check - ki_loginclass is an array; can't be NULL.
CID:		1006559
Approved by:	re (kib)
MFC after:	2 weeks
Sponsored by:	FreeBSD Foundation
2013-10-04 16:08:44 +00:00
uqs
d5bf9d0507 Fix make depend.
Approved by:	re (glebius)
2013-10-04 11:55:20 +00:00
jchandra
4be21635da Fixes for the Netlogic XLP on-chip RSA block driver
The changes are to:
* Use contigmalloc/contigfree when handling the microcode buffer
* Use a different buffer to send microcode to each engine
* Swap microcode in little-endian compilation
* Fix freeback message queue id field
* Simplify xlp_get_rsa_opsize() to remove unnecessary checks
* Fix NULL check after use in xlp_free_cmd_params()
* Do better error handling when the hardware returns error
* Fix error codes in a few cases

Submitted by:	Venkatesh J. V. <venkatesh.vivekanandan@broadcom.com>
Approved by:	re (hrs)
2013-10-04 11:11:51 +00:00
jchandra
0b7496df03 Style fixes for the Netlogic XLP RSA driver
Updates to the Netlogic XLP on-chip RSA block driver. The changes are
to follow style(9) guidelines, to improve readability and to remove
unnecessary initialization.

No changes to logic have been introduced by this commit.

Submitted by:	Venkatesh J. V. <venkatesh.vivekanandan@broadcom.com>
Approved by:	re (hrs)
2013-10-04 10:01:20 +00:00
hrs
1673ef1797 Do not attempt to do AF-specific configurations on an interface when
noafif() is true.  The following warning message was displayed when
pflog0 interface existed, for example:

 ifconfig: ioctl(SIOCGIFINFO_IN6): Protocol family not supported

Reported by:	bz
Approved by:	re (gjb)
2013-10-04 04:15:18 +00:00
hrs
1315bdcd1c Add epair(4) support in $cloned_interfaces. One should be specified
as "epair0" in $cloned_interfaces and "epair0[ab]" in the others in
rc.conf like the following:

 cloned_interfaces="epair0"
 ifconfig_epair0a="inet 192.168.1.1/24"
 ifconfig_epair0b="inet 192.168.2.1/24"

/etc/rc.d/netif now accepts both "netif start epair0" and "netif start
epair0a".

Approved by:	re (kib)
2013-10-04 02:44:04 +00:00
yongari
e1fc8db7f4 Fix clearing MAC stats registers. Previously it cleared every
fourth register.
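
The commit message does not say how the bug arose. One common way to end up
clearing only every fourth 32-bit register is to scale the offset by the
register size twice; the sketch below shows that pattern against a simulated
register file. All names and the register count are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

#define NSTATS_SKETCH 16	/* hypothetical number of 32-bit stat registers */

/* Simulated register file, indexed by register number (byte offset / 4). */
static uint32_t regs_sketch[NSTATS_SKETCH];

/* Write one 32-bit register addressed by its byte offset. */
static void
write_reg_sketch(size_t byte_offset, uint32_t value)
{
	regs_sketch[byte_offset / sizeof(uint32_t)] = value;
}

/* Buggy: the step and the offset both scale by 4, so only registers 0, 4, 8, 12 are hit. */
static void
clear_stats_buggy_sketch(void)
{
	for (size_t i = 0; i < NSTATS_SKETCH; i += sizeof(uint32_t))
		write_reg_sketch(i * sizeof(uint32_t), 0);
}

/* Fixed: step by one register and scale to a byte offset exactly once. */
static void
clear_stats_fixed_sketch(void)
{
	for (size_t i = 0; i < NSTATS_SKETCH; i++)
		write_reg_sketch(i * sizeof(uint32_t), 0);
}
```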

Submitted by:	Paul A. Patience <paul-a.patience@polymtl.ca>
Approved by:	re (gjb)
2013-10-04 02:21:39 +00:00
sbruno
6b64ef7266 Change len checks for fstypelen and fspathlen to be against absolute len
not strlen as they are *not* strings.
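
To illustrate why the distinction matters, the sketch below contrasts the
two checks on a counted buffer. The limit and function names are
hypothetical and are not taken from the patched code:

```c
#include <stddef.h>
#include <string.h>

#define NAME_MAX_SKETCH 16	/* hypothetical limit, standing in for the kernel's */

/*
 * Check based on strlen(): stops at the first NUL byte, so it can
 * under-report a counted buffer (and would over-read an unterminated one).
 */
static int
fits_by_strlen_sketch(const char *buf)
{
	return (strlen(buf) <= NAME_MAX_SKETCH);
}

/* Check based on the recorded length: never inspects the bytes at all. */
static int
fits_by_len_sketch(size_t len)
{
	return (len <= NAME_MAX_SKETCH);
}
```

For a 20-byte buffer whose third byte is NUL, the strlen()-based check
accepts the buffer while the length-based check correctly rejects it.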

Discovered by GSOC student, Mike Ma <mikemandarine@gmail.com> during his
fuse.glusterfs port to FreeBSD.

Final patch from mckusick@

Submitted by:	mckusick@
Approved by:	re (hrs)
MFC after:	2 weeks
2013-10-03 22:52:03 +00:00
dim
aae6234255 Pull in r189644 from upstream llvm trunk:
Add ms_abi and sysv_abi attribute handling.

  Based on a patch by Benno Rice!

This will help to develop EFI support.

Approved by:	re (kib)
Verified by:	benno
MFC after:	1 week
2013-10-03 20:38:57 +00:00
dim
5dc4bb5bd3 Pull in r186338 from upstream llvm trunk:
Remove invalid assert in DAGTypeLegalizer::RemapValue

  There is a comment at the top of DAGTypeLegalizer::PerformExpensiveChecks
  which, in part, says:

   // Note that these invariants may not hold momentarily when processing a node:
   // the node being processed may be put in a map before being marked Processed.

  Unfortunately, this assert would be valid only if the above-mentioned invariant
  held unconditionally. This was causing llc to assert when, in fact,
  everything was fine.

  Thanks to Richard Sandiford for investigating this issue!

  Fixes PR16562.

This fixes assertions which could occur in the multimedia/ffmpeg1 and
multimedia/ffmpeg2 ports.

Approved by:	re (hrs)
Reported by:	Matthias Apitz <guru@unixarea.de>
MFC after:	3 days
2013-10-03 17:50:14 +00:00
gjb
b941afa162 Do not install bluetooth rc(8) scripts if MK_BLUETOOTH = no.
Approved by:	re (glebius)
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2013-10-03 15:19:16 +00:00
glebius
ade139e19b Refresh the tips for the new pkg system.
Reviewed by:	bapt
Approved by:	re (hrs)
2013-10-03 11:51:15 +00:00
rpaulo
0755e798ae Append the Git branch to the version string.
Approved by:	re (gjb)
2013-10-03 01:53:17 +00:00
mdf
3839ce4080 Fix up typos from r255963 in mtree Makefile. BSD.debug.dist should be
iterated if present, and remove a stray .endif.

Approved by:	re (gjb)
2013-10-03 01:18:06 +00:00
roberto
960b6b30ed Meinberg clocks support was inadvertently removed during the last vendor
import.  Add it back.

PR:		bin/182545
Submitted by:	Joerg Pulz <Joerg.Pulz@frm2.tum.de>
Approved by:	re (delphij)
MFC after:	1 week
2013-10-02 21:47:25 +00:00
glebius
515e096f72 Clear knlist before destroying it in tap(4) and tun(4). This fixes a later
crash when a kqueue descriptor tries to dereference the appropriate knotes.

Approved by:	re (kib)
2013-10-02 20:44:36 +00:00
nwhitehorn
3f1d7fea0e Implement GET_STACK_USAGE() on PowerPC. This implementation is identical
to that on x86 and sparc64.

Approved by:	re (kib)
2013-10-02 20:40:21 +00:00
markj
8ff2d52009 Add a separate translator for headers passed to the TCP probes in the
input path. These probes get some of the fields in host order, whereas the
output probes get them in network order, so a single translator isn't
enough. This workaround ensures that the problem is essentially invisible
to users: none of the probe arguments or their fields have changed.

Approved by:	re (hrs)
2013-10-02 17:14:12 +00:00
sbruno
44a6c311ba Set ROOTDEVNAME to ada0 with no partitions. This makes it much more functional
with makefs and other tools for testing and ports building.

Approved by:    re (gjb)
MFC after:      2 weeks
2013-10-02 14:43:17 +00:00
nwhitehorn
0846c5c8dd Only build the POWER hypervisor UART driver if device uart is included in
the kernel config.

Approved by:	re (gjb)
2013-10-02 13:33:10 +00:00
kib
031d824a49 When helping the bufdaemon from the buffer allocation context, there
is no sense in walking the whole dirty buffer queue.  We are only
interested in, and can operate on, the buffers owned by the current
vnode [1].  Instead of calling generic queue flush routine, do
VOP_FSYNC() if possible.

Holding the dirty buffer queue lock in the bufdaemon, without dropping
it, can cause starvation of buffer writes from other threads. This is
especially easy to reproduce on big-memory machines where large files
are written, so that almost all dirty buffers accumulate in several
big files whose vnodes are locked by writers. Bufdaemon cannot flush
any buffer, but iterates over the whole dirty queue
continuously. Since the dirty queue mutex is not dropped, bufdone() in
the g_up thread is starved, usually deadlocking the machine [2]. Mitigate
this by dropping the queue lock after the vnode is locked, allowing
other queue lock contenders to make progress.

Discussed with:	Jeff [1]
Reported by:	pho [2]
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Approved by:	re (hrs)
2013-10-02 06:00:34 +00:00
gjb
b029856426 Add FreeBSD 9.2-RELEASE to the BSD Family Tree
Approved by:	re (hrs)
Sponsored by:	The FreeBSD Foundation
2013-10-02 04:40:46 +00:00
emaste
207f0bc65b Populate .rld_map on MIPS for debuggers
On MIPS the .dynamic section is read-only, so the pointer to rtld
information for debuggers cannot be stored there (in DT_DEBUG).
Instead, a special section .rld_map is used.

Sponsored by:	DARPA, AFRL
Approved by:	re (delphij)
2013-10-02 02:32:58 +00:00
emaste
7d2bbf6ce3 Use correct size for MIPS .rld_map section
On MIPS .dynamic is read-only and so a special section .rld_map is used
to store the pointer to the rtld information for debuggers.  This
section had a hard coded size of 4 bytes which is not correct for
mips64.  (Note that FreeBSD's rtld does not yet populate .rld_map.)

Sponsored by:   DARPA, AFRL
Approved by:	re (delphij)
2013-10-02 00:50:27 +00:00
delphij
b9aa7441da Revert-and-redo r255955: the sort -r should be added to delete-old-dirs.
Approved by:	re (gjb)
2013-10-01 22:53:27 +00:00
jilles
b6f424e548 accept(2): Update portability note for accept4().
The accept(2) man page warns that O_NONBLOCK and other properties on the
new socket may vary across implementations. However, this issue only
applies to accept() and not to accept4(). On the other hand, accept4()
is not commonly available yet.

Reported by:	pluknet
Reviewed by:	bjk
Approved by:	re (kib)
2013-10-01 21:17:18 +00:00
kib
2f645d31a9 When printing the vnode information from ddb, print the lengths of the
dirty and clean buffer queues.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (gjb)
2013-10-01 20:18:33 +00:00
dim
c637526317 Pull in r191711 from upstream llvm trunk:
The X86FixupLEAs pass for Intel Atom must not call
  convertToThreeAddress on ADD16rr opcodes, if src1 != src, since that
  would cause convertToThreeAddress to try to create a virtual register.
  This is not permitted after register allocation, which is when the
  X86FixupLEAs pass runs.

  This patch fixes PR16785.

Pull in r191715 from upstream llvm trunk:

  Forgot to add a break statement.

This should enable building the x11-toolkits/libXaw port with
CPUTYPE=atom.

Approved by:	re (gjb)
Reported by:	Kenta Suzumoto <kentas@hush.com>
MFC after:	3 days
2013-10-01 19:14:24 +00:00
pluknet
3f9b259642 Sweep man pages replacing ad -> ada.
Approved by:	re (blackend)
MFC after:	1 week
X-MFC note:	stable/9 only
2013-10-01 18:41:53 +00:00
emaste
a5307f0fc0 Also remove GNU ar and ranlib man pages
This was missed in r255974.

Approved by:	re (implicit)
2013-10-01 17:51:04 +00:00
emaste
e9dd0037dc Regen.
Approved by:	re (implicit)
2013-10-01 17:46:04 +00:00
emaste
2d5f7eb23b Remove long-unused GNU ar and ranlib
The libarchive-based replacements have been used since 2009; the GNU
ones were kept to support source upgrades from FreeBSD 6.

Approved by:	re@ (delphij)
2013-10-01 17:40:56 +00:00
alfred
9e3370c119 Fixed kernel crash when running devinfo
When calling to ib_uverbs_cleanup_ucontext, there is a call to
mutex_lock of xrcd_table_mutex, which was not initialized.
Added missing initialization for xrcd_table_mutex.

Submitted by: Orit Moskovich (oritm mellanox.com)

Approved by:	re
2013-10-01 15:43:23 +00:00