Commit Graph

95100 Commits

Author SHA1 Message Date
Edward Tomasz Napierala
ac873bb350 Add some spare fields to structs used by the new iSCSI stack - some just
in case, some for future MC/S support.

This requires kernel and world rebuild.

Approved by:	re (blanket)
Sponsored by:	FreeBSD Foundation
2013-09-20 21:26:51 +00:00
Zbigniew Bodek
e4b318d69c Fix GCC build for all ARMs. Revert bug introduced in r255613.
Previous change applied in r255613 fixed build for ARMv6 but
broke it for previous architecture revisions. This commit
eventually fixes GCC build for all ARM revisions.

Approved by:	cognet (mentor)
Approved by:	re (kib)
2013-09-20 20:44:32 +00:00
David Christensen
4e4007688c Substantial rewrite of bxe(4) to add support for the BCM57712 and
BCM578XX controllers.

Approved by:	re
MFC after:	4 weeks
2013-09-20 20:18:49 +00:00
Neel Natu
74d1d2b7cc Merge the following changes from projects/bhyve_npt_pmap:
- add fields to 'struct pmap' that are required to manage nested page tables.
- add a parameter to 'vmspace_alloc()' that can be used to override the
  default pmap initialization routine 'pmap_pinit()'.

These changes are pushed ahead of the remaining changes in 'bhyve_npt_pmap'
in anticipation of the upcoming KBI freeze for 10.0.

Reviewed by:	kib@, alc@
Approved by:	re (glebius)
2013-09-20 17:06:49 +00:00
Justin T. Gibbs
428b7ca290 Add support for suspend/resume/migration operations when running as a
Xen PVHVM guest.

Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)
MFC after:	2 weeks

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
	- Make sure there are no MMU related IPIs pending on migration.
	- Reset pending IPI_BITMAP on resume.
	- Init vcpu_info on resume.

sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
sys/x86/acpica/acpi_wakeup.c:
sys/x86/x86/intr_machdep.c:
sys/x86/isa/atpic.c:
sys/x86/x86/io_apic.c:
sys/x86/x86/local_apic.c:
	- Add a "suspend_cancelled" parameter to pic_resume().  For the
	  Xen PIC, restoration of interrupt services differs between
	  the aborted suspend and normal resume cases, so we must provide
	  this information.

sys/dev/acpica/acpi_timer.c:
sys/dev/xen/timer/timer.c:
sys/timetc.h:
	- Don't swap out "suspend safe" timers across a suspend/resume
	  cycle.  This includes the Xen PV and ACPI timers.

sys/dev/xen/control/control.c:
	- Perform proper suspend/resume process for PVHVM:
		- Suspend all APs before going into suspension, this allows us
		  to reset the vcpu_info on resume for each AP.
		- Reset shared info page and callback on resume.

sys/dev/xen/timer/timer.c:
	- Implement suspend/resume support for the PV timer. Since FreeBSD
	  doesn't perform a per-cpu resume of the timer, we need to call
	  smp_rendezvous in order to correctly resume the timer on each CPU.

sys/dev/xen/xenpci/xenpci.c:
	- Don't reset the PCI interrupt on each suspend/resume.

sys/kern/subr_smp.c:
	- When suspending a PVHVM domain make sure there are no MMU IPIs
	  in-flight, or we will get a lockup on resume due to the fact that
	  pending event channels are not carried over on migration.
	- Implement a generic version of restart_cpus that can be used by
	  suspended and stopped cpus.

sys/x86/xen/hvm.c:
	- Implement resume support for the hypercall page and shared info.
	- Clear vcpu_info so it can be reset by APs when resuming from
	  suspension.

sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/x86/xen/xen_intr.c:
	- Support UP kernel configurations.

sys/x86/xen/xen_intr.c:
	- Properly rebind per-cpus VIRQs and IPIs on resume.
2013-09-20 05:06:03 +00:00
Justin T. Gibbs
e96ca45522 sys/i386/xen/mp_machdep.c:
sys/i386/xen/mptable.c:
	Set PCPU apic_id and acpi_id fields in a fashion compatible with
	both UP and SMP configurations.

Suggested by:	jhb
Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)
MFC after:	2 weeks
2013-09-20 04:35:09 +00:00
Alan Cox
deb179bb4c The pmap function pmap_clear_reference() is no longer used. Remove it.
pmap_clear_reference() has had exactly one caller in the kernel for
several years, more precisely, since FreeBSD 8.  Now, that call no
longer exists.

Approved by:	re (kib)
Sponsored by:	EMC / Isilon Storage Division
2013-09-20 04:30:18 +00:00
Xin LI
1e7d660af4 Update arcmsr(4) driver to 1.20.00.28 which fixes mutex recursion in
CCB abort codepath.

Many thanks to Areca for continuing to support FreeBSD.

Submitted by:	黃清隆 <ching2048 areca com tw>
MFC after:	2 weeks
Approved by:	re (?)
2013-09-19 20:30:35 +00:00
John Baldwin
a566e8e3c5 Regen.
Approved by:	re (delphij)
2013-09-19 18:56:00 +00:00
John Baldwin
55648840de Extend the support for exempting processes from being killed when swap is
exhausted.
- Add a new protect(1) command that can be used to set or revoke protection
  from arbitrary processes.  Similar to ktrace it can apply a change to all
  existing descendants of a process as well as future descendants.
- Add a new procctl(2) system call that provides a generic interface for
  control operations on processes (as opposed to the debugger-specific
  operations provided by ptrace(2)).  procctl(2) uses a combination of
  idtype_t and an id to identify the set of processes on which to operate
  similar to wait6().
- Add a PROC_SPROTECT control operation to manage the protection status
  of a set of processes.  MADV_PROTECT still works for backwards
  compatibility.
- Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc)
  the first bit of which is used to track if P_PROTECT should be inherited
  by new child processes.

Reviewed by:	kib, jilles (earlier version)
Approved by:	re (delphij)
MFC after:	1 month
2013-09-19 18:53:42 +00:00
Justin T. Gibbs
8a21c7fbe8 sys/i386/xen_mp_machdep.c:
	Set a 'fake' acpi_id for the i386 PV port, it is needed in
	order to use VIRQs or IPI event channels.

Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)
MFC after:	2 weeks
2013-09-19 14:41:10 +00:00
Peter Grehan
d83d73618f Reconnect the hyperv drivers back into GENERIC now that the
disengage driver issue has been resolved.

Approved by:	re@ (gjb)
2013-09-19 05:07:51 +00:00
Peter Grehan
4a67483f2e Reorder the hypervisor presence test to avoid claiming ATA disks
on non hyperv systems.

Reviewed by:	neel, abgupta at microsoft dot com
Approved by:	re@ (hrs)
2013-09-19 02:34:52 +00:00
Edward Tomasz Napierala
7843bd031a Fix several problems in the new iSCSI stack; this includes interoperability
fix for LIO (Linux target), removing possibility for the target to avoid mutual
CHAP by choosing to skip authentication altogether, and fixing truncated error
messages in iscsictl(8) output.  This also fixes several of the problems found
with Coverity.

Note that this change requires world rebuild.

Coverity CID:	1088038, 1087998, 1087990, 1088004, 1088044, 1088041, 1088040
Approved by:	re (blanket)
Sponsored by:	FreeBSD Foundation
2013-09-18 21:15:21 +00:00
Pawel Jakub Dawidek
3fded357af Fix panic in ktrcapfail() when no capability rights are passed.
While here, correct all consumers to pass NULL instead of 0 as we pass
capability rights as pointers now, not uint64_t.

Reported by:	Daniel Peyrolon
Tested by:	Daniel Peyrolon
Approved by:	re (marius)
2013-09-18 19:26:08 +00:00
Roman Divacky
69d912af45 Regen.
Approved by:	re (delphij)
2013-09-18 18:49:26 +00:00
Roman Divacky
b12698e1a1 Revert r255672, it has some serious flaws, leaking file references etc.
Approved by:	re (delphij)
2013-09-18 18:48:33 +00:00
Roman Divacky
70ccaaf58e Regen.
Approved by:    re (delphij)
2013-09-18 17:58:03 +00:00
Roman Divacky
253c75c0de Implement epoll support in Linuxulator. This is a tiny wrapper around kqueue
to implement epoll subset of functionality. The kqueue user data are 32bit
on i386 which is not enough for epoll user data so this patch overrides
kqueue fileops to maintain enough space in struct file.

Initial patch developed by me in 2007 and then extended and finished
by Yuri Victorovich.

Approved by:    re (delphij)
Sponsored by:   Google Summer of Code
Submitted by:   Yuri Victorovich <yuri at rawbw dot com>
Tested by:      Yuri Victorovich <yuri at rawbw dot com>
2013-09-18 17:56:04 +00:00
Sean Bruno
7995e29931 Bring in configuration for Buffalo Airstation WZR-300HP, Atheros based
wireless home router.

Notable things:
2x 16 MB flash devices
Atheros Wireless
Atheros Switching

Many thanks to adrian@ for his guidance on this and keeping the drivers in
the base system up to date

Approved by:    re (delphij)
2013-09-17 22:26:07 +00:00
Jilles Tjoelker
9fdb497cd0 Regenerate for freebsd32_cap_enter().
Approved by:	re (hrs)
2013-09-17 20:49:05 +00:00
Jilles Tjoelker
529411c369 Disallow cap_enter() in freebsd32 compatibility mode.
The freebsd32 compatibility mode (for running 32-bit binaries on 64-bit
kernels) does not currently allow any system calls in capability mode, but
still permits cap_enter(). As a result, 32-bit binaries on 64-bit kernels
that use capability mode do not work (they crash after being disallowed to
call sys_exit()). Affected binaries include dhclient and uniq. The latter's
crashes cause obscure build failures.

This commit makes freebsd32 cap_enter() fail with [ENOSYS], as if capability
mode was not compiled in. Applications deal with this by doing their work
without capability mode.

This commit does not fix the uncommon situation where a 64-bit process
enters capability mode and then executes a 32-bit binary using fexecve().

This commit should be reverted when allowing the necessary freebsd32 system
calls in capability mode.

Reviewed by:	pjd
Approved by:	re (hrs)
2013-09-17 20:48:19 +00:00
Hiren Panchasara
7e06ee8383 We have grown a bit too big lately. Shrinking the kernel for TP-Link
TL-WR1043ND.

Submitted by:   loos (initial version)
Reviewed by:    adrian
Approved by:    sbruno (mentor, implicit)
Approved by:	re (delphij)
Tested by:      hiren
2013-09-17 20:33:42 +00:00
Xin LI
040f9b1e84 Fix a typo when accounting for tx_broadcast statistics.
Submitted by:	Paul A. Patience <paul-a patience polymtl ca>
MFC after:	2 weeks
Approved by:	re (hrs)
2013-09-17 18:46:10 +00:00
Peter Grehan
517e21d3e7 Hide TSC-deadline APIC timer support from guests. This mode
isn't yet implemented in bhyve's APIC emulation.

Reviewed by:	neel
Approved by:	re@ (blanket)
2013-09-17 17:56:53 +00:00
Nathan Whitehorn
7a8d25c037 Merge in support for PAPR-compliant (Power Architecture Platform
Requirements) systems from the projects/pseries branch. This in principle
includes all IBM POWER hardware released in the last 15 years with the
exception of POWER3-based systems when run in 64-bit mode. The main
development target, however, has been the PAPR logical partition support
that is the default target in KVM on POWER and QEMU -- mileage may vary
on actual hardware at present. Much of the heavy lifting here was done
by Andreas Tobler.

Approved by:	re (kib)
2013-09-17 17:37:04 +00:00
Nathan Whitehorn
982b134610 Only attach if properties we need (address, in particular) are present.
This is the correct version of r255420.

Approved by:	re (kib)
2013-09-17 17:31:53 +00:00
Nathan Whitehorn
5d548e66ff Add POWER7+ and POWER8 to the CPU ID table.
Approved by:	re (kib)
2013-09-17 17:29:56 +00:00
Nathan Whitehorn
58aa4de0aa Make sure to copy segments back to the segs array if non-NULL. This is
relied upon by bus_dmamap_load_mbuf_sg() (i.e. all network drivers).

Approved by:	re (kib)
MFC after:	2 weeks
2013-09-17 17:29:07 +00:00
Neel Natu
0f9d5dc758 Fix a bug in decoding an instruction that has an SIB byte as well as an
immediate operand. The presence of an SIB byte in decoding the ModR/M field
would cause 'imm_bytes' to not be set to the correct value.

Fix this by initializing 'imm_bytes' independent of the ModR/M decoding.

Reported by: grehan@
Approved by: re@
2013-09-17 16:06:07 +00:00
Konstantin Belousov
9eab548476 PG_SLAB no longer serves a useful purpose, since m->object is no
longer abused to store pointer to slab. Remove it.

Reviewed by:    alc
Sponsored by:   The FreeBSD Foundation
Approved by:	re (hrs)
2013-09-17 07:35:26 +00:00
Gleb Smirnoff
85fdd534cf Fix assertion in sendfile_readpage() to assert only the validity
of requested amount of data in a page. Move assertion down below
object unlock.

Approved by:	re (kib)
Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2013-09-17 06:37:21 +00:00
Bryan Venteicher
03c6abfd1c Add vmx(4) to i386 and amd64 GENERIC
Approved by:	re (gjb)
2013-09-17 01:54:13 +00:00
Konstantin Belousov
06646d663a Merge the change r255607 from amd64 to i386.
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (gjb)
2013-09-16 19:58:37 +00:00
Glen Barber
163fd5eca2 Update head/ to -ALPHA2 status.
Approved by:	re (implicit)
2013-09-16 19:29:18 +00:00
Nathan Whitehorn
1c5fc51cdf Add a loader tunable to use only device tree-provided PCI devices. This is
needed on some more fragile systems to avoid machine checks when blindly
probing the PCI bus. Also reduce ofw_pcibus's priority slightly so that it
can be overridden.

Approved by:	re (gjb)
2013-09-16 15:10:11 +00:00
Nathan Whitehorn
1aff10b99e Fix bug in busdma: if segs is a preexisting buffer, we memcpy it
into the DMA map. The length of the buffer had not yet been
initialized, however, so this would copy gibberish unless it
happened to be right by chance. This bug mostly only affected
systems with IOMMUs.

Approved by:	re (gjb)
MFC after:	3 days
2013-09-16 14:32:56 +00:00
Zbigniew Bodek
e478f35505 Fix GCC build error when building for ARMv6
Apply theravens's idea to move __strong_reference
macros into the proper ifdef section.

Approved by:	cognet (mentor)
Approved by:	re
2013-09-16 10:46:58 +00:00
Zbigniew Bodek
760488b93c Implement pmap_advise() for ARMv6/v7 pmap module
Apply the given advice to the specified range of addresses within the
given pmap. Depending on the advice, clear the referenced and/or
modified flags in each mapping. Superpages within the given range will
be demoted or destroyed.

Reviewed by:	alc
Approved by:	cognet (mentor)
Approved by:	re
2013-09-16 10:39:35 +00:00
Zbigniew Bodek
8b78ad43bc Write protect base page after superpage demotion so that it may repromote
When clearing the modification status of the superpage, one of the
base pages produced during demotion should be marked as write disabled.
The intention is that subsequent write access may repromote.
In the current implementation this was done wrong as write permission was
granted instead of forbidden.

Approved by:	cognet (mentor)
Approved by:	re
2013-09-16 10:34:44 +00:00
Konstantin Belousov
3846a82284 Remove zero-copy sockets code. It only worked for anonymous memory,
and the equivalent functionality is now provided by sendfile(2) over
posix shared memory filedescriptor.

Remove the cow member of struct vm_page, and rearrange the remaining
members.  While there, make hold_count unsigned.

Requested and reviewed by:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Approved by:	re (delphij)
2013-09-16 06:25:54 +00:00
Konstantin Belousov
70b9173019 In pmap_copy(), when the copied region is mapped with superpage but does
not cover entire superpage, avoid copying.  Doing partial copy would
require demotion, which is incompatible with the already held locks.

Reported by:    cperciva
Reviewed by:    alc
Sponsored by:	The FreeBSD Foundation
MFC after:      1 week
Approved by:	re (delphij)
2013-09-16 06:15:15 +00:00
Nathan Whitehorn
c088841850 Add a kernel interface (OF_xref_phandle()) for systems where phandles
used as cross-references in the device tree and phandles as used by the
Open Firmware client interface are in different namespaces. This includes
IBM pSeries hardware as well as FDT systems. FDT certainly abuses
ihandles for this purpose and should be modified to use this API
eventually. This changes no behavior on systems where FreeBSD already
worked.

Reviewed by:	marius
Approved by:	re (kib)
MFC after:	2 weeks
2013-09-15 14:19:17 +00:00
Jean-Sébastien Pédron
c8b8d6b96e drm/radeon: Add missing "return false" after unmapping invalid BIOS
Without that, we would try to copy the unmapped BIOS.

Submitted by:	Christoph Mallon <christoph.mallon@gmx.de>
Approved by:	re (blanket)
2013-09-15 07:48:42 +00:00
Peter Grehan
b90fcf02f2 Pull the hyperv drivers from GENERIC until the fix to the disengage
driver to make it only probe when running on hyperv is reviewed and
tested.

Approved by:	re (rodrigc)
2013-09-14 20:38:22 +00:00
Jean-Sébastien Pédron
02969dd063 drm/radeon: Fix usage of pci_save_state() and pci_restore_state()
Calling those functions with the drmn device as argument causes a panic,
because it's not a direct child of pci$N. They must be called with the
vgapci device instead.

This fix is not enough to make suspend/resume work reliably.

Approved by:	re (blanket)
2013-09-14 17:24:41 +00:00
Jean-Sébastien Pédron
f4bb978a66 drm/radeon: Fix usage of vga_pci_map_bios()
vga_pci_(un)map_bios() takes a vgapci device as argument, not a drmn
one. This fixes a bug where the BIOS couldn't be mapped if the device
wasn't the boot display.

Approved by:	re (kib; blanket for following drm2/radeon commits)
2013-09-14 17:22:34 +00:00
Jean-Sébastien Pédron
a38326da80 vgapci: Use vga_pci_alloc_resource() to map PCI Expansion ROM
This is cleaner and fixes Video BIOS mapping when the given device isn't
the boot display.

Submitted by:	jhb@
Approved by:	re (kib)
2013-09-14 17:17:32 +00:00
Edward Tomasz Napierala
009ea47eb2 Bring in the new iSCSI target and initiator.
Reviewed by:	ken (parts)
Approved by:	re (delphij)
Sponsored by:	FreeBSD Foundation
2013-09-14 15:29:06 +00:00
Konstantin Belousov
196beb5359 If the last page of the file is partially full and whole valid
portion is invalidated, invalidate the whole page.  Otherwise,
partially valid page appears on a page queue, which is wrong.  This
could only happen for the last page, because only then buffer which
triggered invalidation could not cover the whole page.

Reported and tested by:	pho (previous version)
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
Approved by:	re (delphij)
MFC after:	2 weeks
2013-09-14 10:11:38 +00:00
Konstantin Belousov
77e306c5e0 Fix module build when device ata is not in kernel config.
Sponsored by:	The FreeBSD Foundation
Build-tested by:	gjb
Approved by:	re (delphij)
2013-09-14 09:53:57 +00:00
Konstantin Belousov
e8de242d3a Use TAILQ instead of STAILQ for kqueue filedescriptors to ensure constant
time removal on kqueue close.

Reported and tested by:	pho
Reviewed by:	jmg
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (delphij)
2013-09-13 19:50:50 +00:00
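The constant-time removal that e8de242d3a relies on can be sketched with the sys/queue.h macros. This is a minimal illustration, not the kernel code: the names demo_kq, demo_kq_list, demo_kq_close and demo_kq_count are hypothetical. A TAILQ entry stores a back link, so TAILQ_REMOVE can unlink an element directly, whereas STAILQ_REMOVE has to walk the list from the head to find the predecessor.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical list element; not the kernel's actual kqueue structure. */
struct demo_kq {
	int id;
	TAILQ_ENTRY(demo_kq) link;	/* holds both forward and back linkage */
};

TAILQ_HEAD(demo_kq_list, demo_kq);

/* Unlink one element without walking the list. */
void
demo_kq_close(struct demo_kq_list *head, struct demo_kq *kq)
{
	assert(head != NULL && kq != NULL);
	TAILQ_REMOVE(head, kq, link);	/* O(1): uses kq's own back link */
}

/* Count the remaining elements (walks the list; used only for checking). */
int
demo_kq_count(struct demo_kq_list *head)
{
	struct demo_kq *kq;
	int n = 0;

	TAILQ_FOREACH(kq, head, link)
		n++;
	return (n);
}
```

A caller would initialise the head with TAILQ_INIT, add elements with TAILQ_INSERT_TAIL, and call demo_kq_close() on any element regardless of its position in the list.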
Peter Grehan
ab7fb3bca7 Import Hyper-V paravirtualized drivers from projects/hyperv
branch into head.

Approved by:	re@ (hrs)
Obtained from:	Microsoft, NetApp, and Citrix.
2013-09-13 18:47:58 +00:00
Mikolaj Golub
4d3dfd450a Unregister inet/inet6 pfil hooks on vnet destroy.
Discussed with:	andre
Approved by:	re (rodrigc)
2013-09-13 18:45:10 +00:00
Konstantin Belousov
9bec6325ad When opening or closing fifo, ensure that the vnode is locked
exclusively.  Filesystems are assumed to disable shared locking for
the fifo vnode locks, but some do not.

Reported and tested by:	olgeni
Discussed with:	avg
Sponsored by:   The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (glebius)
2013-09-13 06:52:23 +00:00
Konstantin Belousov
8740a7112e Reduce the scope of the proctree_lock. If several processes cause
continuous calls to the uprintf(9), the proctree_lock could be
shared-locked for indefinite amount of time, starving exclusive
requests. Since proctree_lock is needed for fork() and exit(), this
effectively stops the machine.

While there, do the similar reduction for tprintf(9).

Reported and tested by: pho
Reviewed by:    ed
Sponsored by:   The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (glebius)
2013-09-13 06:39:10 +00:00
Nathan Whitehorn
1330c354c5 Change VM object lock assertion to match locking higher in the call
chain. This repairs a panic observed during pageout on some 64-bit
PowerPC systems.

Submitted by:	grehan
Approved by:	re (kib)
MFC after:	2 weeks
Revisit after:	10.0
2013-09-13 01:12:45 +00:00
Kenneth D. Merry
cd04f04fb5 Fix an issue that caused Integrated RAID volumes on LSI mps(4) controllers
to not get scanned on boot.

The problem originated in change 253549.  With the change to the mps(4)
driver to scan only targets that it knows it has (as opposed to scanning
the entire bus), scanning RAID volumes on boot was omitted.

So, for versions of FreeBSD that have the scanning changes
(__FreeBSD_version 1000039 and higher), scan RAID volumes that are added
whether or not we're booting.

PR:		kern/181784
Reported by:	Xiguang Wang <kurapica@gmail.com>
Tested by:	Dennis Glatting <dg@pki2.com>
Sponsored by:	Spectra Logic
Approved by:	re (delphij)
MFC After:	3 days
2013-09-12 22:06:12 +00:00
John Baldwin
6a87d217e2 Fix an off-by-one error when populating mincore(2) entries for
skipped entries.  lastvecindex references the last valid byte,
so the new bytes should come after it.

Approved by:	re (kib)
MFC after:	1 week
2013-09-12 20:46:32 +00:00
John Baldwin
514a6e6167 Fix a typo.
Approved by:	re (gjb)
2013-09-12 19:52:23 +00:00
John Baldwin
eb2e5544d3 Regen.
Approved by:	re (kib)
2013-09-12 18:03:51 +00:00
John Baldwin
ed749cf183 Fix the type of the idtype argument to wait6() in syscalls.master.
(Accidentally missed this in the previous commit)

Approved by:	re (kib)
MFC after:	1 week
2013-09-12 18:01:13 +00:00
John Baldwin
84c21af119 Fix the type of the idtype argument to wait6() in syscalls.master.
Approved by:	re (kib)
MFC after:	1 week
2013-09-12 17:52:18 +00:00
Glen Barber
99f54f8fd0 Update head/ to -ALPHA1 status, as part of the 10.0-RELEASE
cycle.

Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2013-09-12 17:51:18 +00:00
Hans Petter Selasky
418b87f8e6 Don't issue USB resume signalling in USB device mode, if the USB power
mode is ON and suspend is detected. This confuses iPads running in USB
host mode at least.

MFC after:	1 week
Approved by:	re (hrs)
2013-09-12 10:39:38 +00:00
Gleb Smirnoff
e06432800f Provide pr_ctloutput method for AF_LOCAL/SOCK_SEQPACKET sockets.
This makes setsockopt() work on them.

Reported by:	Yuri <yuri rawbw.com>
Approved by:	re (kib)
2013-09-11 18:22:30 +00:00
Konstantin Belousov
64c5de5483 Fix build with gcc.
Build-tested by:	gjb
Approved by:	re (glebius)
2013-09-11 17:31:22 +00:00
Alan Cox
87ee6303e5 Prior to r254304, we only began scanning the active page queue when the
amount of free memory was close to the point at which we would begin
reclaiming pages.  Now, we continuously scan the active page queue,
regardless of the amount of free memory.  Consequently, we are continuously
calling pmap_ts_referenced() on active pages.

Prior to this change, pmap_ts_referenced() would always demote superpage
mappings in order to obtain finer-grained reference information.  This made
sense because we were coming under memory pressure and would soon have to
begin reclaiming pages.  Now, however, with continuous scanning of the
active page queue, these demotions are taking a toll on performance.  To
address this problem, I have replaced the demotion with a heuristic for
periodically clearing the reference flag on superpage mappings.

Approved by:	re (kib)
Sponsored by:	EMC / Isilon Storage Division
2013-09-11 17:23:42 +00:00
Hans Petter Selasky
3dc1e567e5 Clear correct data structure.
MFC after:	1 week
Approved by:	re (hrs)
2013-09-11 10:18:36 +00:00
Gleb Smirnoff
540b1a7238 Clean up SIOCSIFDSTADDR usage from ifnet drivers. The ioctl itself is
extremely outdated, and I doubt that it was ever used for ifnet drivers.
It was used for AF_INET sockets in pre-FreeBSD time.

Approved by:	re (hrs)
Sponsored by:	Nginx, Inc.
2013-09-11 09:19:44 +00:00
Neel Natu
0f1ef0ec80 Fix a limitation in bhyve that would limit the number of virtual machines to
the maximum number of VT-d domains (256 on a Sandybridge). We now allocate a
VT-d domain for a guest only if the administrator has explicitly configured
one or more PCI passthru device(s).

If there are no PCI passthru devices configured (the common case) then the
number of virtual machines is no longer limited by the maximum number of
VT-d domains.

Reviewed by: grehan@
Approved by: re@
2013-09-11 07:11:14 +00:00
Konstantin Belousov
227aaa86ed Implement sendfile(2) for the posix shared memory segment file descriptor,
in addition to the regular files.

Requested by:	alc
Discussed with:	emaste
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
Approved by:	re (hrs)
2013-09-11 06:41:15 +00:00
Peter Grehan
47823319c3 IFC @ r255459
2013-09-11 00:19:16 +00:00
David E. O'Brien
a74e05dd2e Back out r255440. /usr/bin/gcc @r255185 (2013-09-03) can build this.
Approved by:	re (kib)
2013-09-10 16:50:13 +00:00
Gleb Smirnoff
2402d97614 Make a bump for r255426.
Approved by:	re (gjb)
2013-09-10 10:38:15 +00:00
Dag-Erling Smørgrav
1a05c762b9 Fix the length calculation for the final block of a sendfile(2)
transmission which could be tricked into rounding up to the nearest
page size, leaking up to a page of kernel memory.  [SA-13:11]

In IPv6 and NetATM, stop SIOCSIFADDR, SIOCSIFBRDADDR, SIOCSIFDSTADDR
and SIOCSIFNETMASK at the socket layer rather than pass them on to the
link layer without validation or credential checks.  [SA-13:12]

Prevent cross-mount hardlinks between different nullfs mounts of the
same underlying filesystem.  [SA-13:13]

Security:	CVE-2013-5666
Security:	FreeBSD-SA-13:11.sendfile
Security:	CVE-2013-5691
Security:	FreeBSD-SA-13:12.ifioctl
Security:	CVE-2013-5710
Security:	FreeBSD-SA-13:13.nullfs
Approved by:	re
2013-09-10 10:05:59 +00:00
David E. O'Brien
9dc29a3cf0 Only use a clang'ism if ${CC} is clang.
Reviewed by:	sjg
Approved by:	re (kib)
2013-09-10 05:49:31 +00:00
Konstantin Belousov
f79abb0476 Call free() on the pointer returned from malloc().
Reported and tested by:	Oliver Pinter <oliver.pntr@gmail.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Approved by:	re (delphij)
2013-09-10 05:17:53 +00:00
Peter Grehan
8d39ed16c2 Go way past 11 and bump bhyve's max vCPUs to 16.
This should be sufficient for 10.0 and will do
until forthcoming work to avoid limitations
in this area is complete.

Thanks to Bela Lubkin at tidalscale for the
headsup on the apic/cpu id/io apic ASL parameters
that are actually hex values and broke when
written as decimal when 11 vCPUs were configured.

Approved by:	re@
2013-09-10 03:48:18 +00:00
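The decimal-versus-hex breakage mentioned in 8d39ed16c2 can be shown with a small C sketch. It assumes only what the message states: the consumer reads the id digits as hexadecimal. The function names demo_format_id and demo_parse_as_hex are hypothetical and not part of bhyve. Ids 0 through 9 read the same either way, so the mismatch only appears at id 10, which is the eleventh vCPU.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Render an id as text; use_hex selects hexadecimal instead of decimal. */
void
demo_format_id(char *buf, size_t len, int id, int use_hex)
{
	assert(buf != NULL && len > 0);
	if (use_hex)
		snprintf(buf, len, "%X", id);
	else
		snprintf(buf, len, "%d", id);
}

/* Read the text back the way a consumer expecting hex digits would. */
long
demo_parse_as_hex(const char *text)
{
	return (strtol(text, NULL, 16));
}
```

Formatting id 10 as decimal yields the text "10", which a hex reader interprets as 16; formatting it as hex yields "A", which reads back as 10.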
Xin LI
e8de677c74 MFV r247844 (illumos-gate 13975:ef6409bc370f)
Illumos ZFS issues:
  3582 zfs_delay() should support a variable resolution
  3584 DTrace sdt probes for ZFS txg states

Provide a compatibility shim for Solaris's cv_timedwait_hires
to help aid future porting.

Approved by:	re (ZFS blanket)
2013-09-10 01:46:47 +00:00
Michael Tuexen
5dc80df9c5 Fix the aborting of association with the iterator using an empty
user initiated error cause (using SCTP_ABORT|SCTP_SENDALL).

Approved by: re (delphij)
MFC after: 1 week
2013-09-09 21:40:07 +00:00
Peter Grehan
2ee2dc6fd6 Revert the kvp code - there's still some work that
needs to be done for that.

Discussed with:	Microsoft hyper-v devs
2013-09-09 19:27:44 +00:00
John Baldwin
edb572a38c Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use
an address in the first 2GB of the process's address space.  This flag should
have the same semantics as the same flag on Linux.

To facilitate this, add a new parameter to vm_map_find() that specifies an
optional maximum virtual address.  While here, fix several callers of
vm_map_find() to use a VMFS_* constant for the findspace argument instead of
TRUE and FALSE.

Reviewed by:	alc
Approved by:	re (kib)
2013-09-09 18:11:59 +00:00
Nathan Whitehorn
22b256dfcb Make the primary name of the OF console device /dev/ofwcons, and only
alias it to the contents of the output property if it is defined. This
avoids a panic when booting machines (QEMU) where the output-device
property is not defined.

Since output-device is free-form and potentially conflicts with other
entries in /dev, I also am not sure we should be doing the aliasing at
all, but this at least makes things work again.

Approved by:	re (kib)
2013-09-09 16:51:35 +00:00
Nathan Whitehorn
32fa1ceff1 Revert r255420. This seems to break some Powermac systems and will be
revisited much later.

Pointy hat to:		me
Approved by:		re (kib, implicit due to breakage 10 minutes ago)
2013-09-09 13:40:53 +00:00
Nathan Whitehorn
5d46492ddc Attach only on hardware that is actually supported as opposed to hardware
that seems like it has some of the problems we might want.

Approved by:	re (kib)
2013-09-09 12:54:08 +00:00
Nathan Whitehorn
c84bb047d4 Raise artificial limits on number of CPUs and number of interrupts.
Approved by:	re (kib)
2013-09-09 12:52:34 +00:00
Nathan Whitehorn
c5915fdc44 Add POWER CPUs to the kernel's knowledge. This does not imply we currently
actually run on any machines with POWER CPUs but avoids closing that door
unnecessarily.

Approved by:	re (kib)
2013-09-09 12:51:24 +00:00
Nathan Whitehorn
0658fe8ce1 Add hook called when every new processor is brought online -- including the
BSP -- so that platform modules have a chance to add the new CPU to any
internal bookkeeping.

Approved by:	re (kib)
2013-09-09 12:49:19 +00:00
Nathan Whitehorn
e52f055d23 Use a spin lock instead of a mutex to gate RTAS. This is required if RTAS
calls are involved in interrupt handling.

Approved by:	re (kib)
2013-09-09 12:45:41 +00:00
Nathan Whitehorn
c2f2553784 Use the canonical bits for wired, etc. in the PTE. This is important for
interactions with certain kinds of hypervisors that look into the PTEs
more closely than they should.

Approved by:	re (kib)
2013-09-09 12:44:48 +00:00
Peter Grehan
d940bfec8c Latest update from Microsoft.
Obtained from:	Microsoft Hyper-v dev team
2013-09-09 08:07:46 +00:00
Xin LI
22ecadc03b In r243868, the error message buffer errmsg has been changed from
an on-stack array to a pointer and therefore sizeof(errmsg) would
become 4 or 8 bytes depending on the architecture.

Fix this by using ERRMSGL in place of sizeof().

Submitted by:	J David <j.david.lists@gmail.com>
MFC after:	3 days
Approved by:	re (kib)
2013-09-09 05:01:18 +00:00
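The pitfall fixed in 22ecadc03b can be illustrated with a minimal C sketch. The names ERRMSGL_DEMO, array_buffer_size, pointer_buffer_size and format_error are hypothetical and not taken from the FreeBSD source. The point is only that sizeof applied to a pointer yields the pointer size, so an explicit length constant is needed once the buffer is no longer an array.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer length, standing in for the ERRMSGL constant. */
#define ERRMSGL_DEMO 255

/* Size reported by sizeof when the buffer is an on-stack array. */
size_t
array_buffer_size(void)
{
	char errmsg[ERRMSGL_DEMO];

	return (sizeof(errmsg));	/* whole array: ERRMSGL_DEMO bytes */
}

/* Size reported by sizeof when the buffer is reached through a pointer. */
size_t
pointer_buffer_size(void)
{
	char *errmsg = malloc(ERRMSGL_DEMO);
	size_t n = sizeof(errmsg);	/* only the pointer: 4 or 8 bytes */

	free(errmsg);
	return (n);
}

/* Format into a caller-supplied buffer using the explicit length constant. */
size_t
format_error(char *errmsg, const char *what)
{
	assert(errmsg != NULL && what != NULL);
	snprintf(errmsg, ERRMSGL_DEMO, "operation failed: %s", what);
	return (strlen(errmsg));
}
```

Passing sizeof(errmsg) to snprintf() in the pointer case would truncate every message to a few bytes, which is why the explicit constant is the safer bound.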
Navdeep Parhar
eb22728291 Rework the tx credit mechanism between the cxgbe/tom driver
and the card.  This helps smooth out some burstiness in the
exchange.

Approved by:	re (glebius)
2013-09-09 04:38:57 +00:00
Navdeep Parhar
c81d56a0aa Fix a miscalculation that caused cxgbe/tom to auto-increment
a TOE socket's tx buffer size too aggressively.

Approved by:	re (delphij)
2013-09-09 00:16:59 +00:00
Alan Cox
70c4180f1c Prior to r254304, we only began scanning the active page queue when the
amount of free memory was close to the point at which we would begin
reclaiming pages.  Now, we continuously scan the active page queue,
regardless of the amount of free memory.  Consequently, we are continuously
calling pmap_ts_referenced() on active pages.

Prior to this change, pmap_ts_referenced() would always demote superpage
mappings in order to obtain finer-grained reference information.  This made
sense because we were coming under memory pressure and would soon have to
begin reclaiming pages.  Now, however, with continuous scanning of the
active page queue, these demotions are taking a toll on performance.  For
example, on one of my test machines, the running time for the HPCC Random
Access benchmark (also known as GUPS) has increased by 54%.  To address this
problem, I have replaced the demotion with a heuristic for periodically
clearing the reference flag on superpage mappings.

Reviewed by:	kib
Approved by:	re (glebius)
Sponsored by:	EMC / Isilon Storage Division
2013-09-08 21:30:53 +00:00
Bryan Venteicher
c02d19b6b6 Use correct type for the vmx vlan filter table
Approved by:	re (glebius, gjb)
2013-09-08 19:13:06 +00:00
Mikolaj Golub
1f6addd92c Release the interface last.
Reviewed by:	glebius
Approved by:	re (kib)
2013-09-08 18:19:40 +00:00
Konstantin Belousov
3aaea6efd5 Drain for the xbusy state for two places which potentially do
pmap_remove_all(). Not doing the drain allows the pmap_enter() to
proceed in parallel, making the pmap_remove_all() effects void.

The race results in an invalidated page mapped wired by usermode.

Reported and tested by:	pho
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
Approved by:	re (glebius)
2013-09-08 17:51:22 +00:00
Mark Murray
9365bad8a2 Fix verbose output line; needs <NL>
Submitted by:	Sean Bruno <sean_bruno@yahoo.com>
Approved by:	re (glebius)
2013-09-08 16:48:03 +00:00
Mark Murray
7c2af6212d Fix the build; Certain linkable symbols need to always be present.
Pass the pointy hat please.

Also unblock the software (Yarrow) generator for now. This will be
reverted; Yarrow needs to block until secure, not this behaviour
of serving as soon as asked.

Folks with specific requirements will be able to (can!) unblock this
device with any write, and are encouraged to do so in /etc/rc.d/*
scripting. ("Any" in this case could be "echo '' > /dev/random" as
root).
2013-09-07 22:07:36 +00:00
Nathan Whitehorn
4eb54166aa Fix error in r252115: space for the softc needs to be allocated. This
seemed to be working by chance on most systems.
2013-09-07 20:52:31 +00:00
Pawel Jakub Dawidek
013075d557 Sort properly. 2013-09-07 19:16:02 +00:00
Pawel Jakub Dawidek
5a1983cc41 Fix panic in cap_rights_is_valid() when invalid rights are provided -
the right_to_index() function should assert correctness in this case.

Improve other assertions.

Reported by:	pho
Tested by:	pho
2013-09-07 19:03:16 +00:00
Luiz Otavio O Souza
44d06d8d9a Export a function to allow BCM2835's peripheral devices to enable their
alternate pin function (from GPIO pins) as needed.

Approved by:	adrian (mentor)
2013-09-07 18:48:15 +00:00
Jayachandran C.
485a81908b Netlogic XLP network driver update
Changes are to
- update board and network interface detection logic
- fix reading onboard CPLD in little-endian config
- print NAE frequency correctly for Bx chips
- update XAUI config to disable Rx/Tx until interface is up

Submitted by:	Venkatesh J V <venkatesh.vivekanandan@broadcom.com>
2013-09-07 18:26:16 +00:00
Jayachandran C.
cbd49bff46 Use a better version of memcpy/bcopy for mips kernel.
Use a variant of mips libc memcpy for kernel. This implementation uses
64-bit operations when compiled for 64-bit, and is significantly faster
in that case.

Submitted by:	Tanmay Jagdale <tanmayj@broadcom.com>
2013-09-07 16:31:30 +00:00
Alexander Motin
58909b74b9 Micro-optimize cpu_search(), allowing compiler to use more efficient inline
ffsl() implementation, when it is available, instead of homegrown iteration.

On dual-E5645 amd64 system (2x6x2 cores) under heavy I/O load that reduces
time spent inside cpu_search() from 19% to 13%, while IOPS increased by 5%.
2013-09-07 15:16:30 +00:00
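The kind of micro-optimization described in 58909b74b9 can be sketched as follows. This is not the cpu_search() code itself: the function names are hypothetical, and the GCC/Clang builtin __builtin_ffsl() stands in for the kernel's ffsl().

```c
#include <assert.h>
#include <limits.h>

/* Homegrown iteration: test each bit of the mask in turn. */
static int
first_set_loop(unsigned long mask)
{
	for (int i = 0; i < (int)(sizeof(mask) * CHAR_BIT); i++)
		if (mask & (1UL << i))
			return (i);
	return (-1);
}

/*
 * Same answer from a find-first-set primitive.  The builtin returns a
 * 1-based bit position, or 0 for an empty mask, hence the "- 1".
 */
static int
first_set_ffsl(unsigned long mask)
{
	return (__builtin_ffsl((long)mask) - 1);
}

/* Sanity helper: both versions must agree for a given mask. */
static void
check_same(unsigned long mask)
{
	assert(first_set_loop(mask) == first_set_ffsl(mask));
}
```

On CPUs with a bit-scan instruction the compiler can usually lower the builtin to a single instruction, which is the kind of saving the commit message refers to.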
Mark Murray
a40c2646a4 Bring in some behind-the-scenes development, mainly by Arthur Mesh,
the rest by me.

o Namespace cleanup; the Yarrow name is now restricted to where it
  really applies; this is in anticipation of being augmented or
  replaced by Fortuna in the future. Fortuna is mentioned, but behind
  #if logic, and is ignorable for now.

o The harvest queue is pulled out into its own modules.

o Entropy harvesting is improved, both by being made more conservative,
  and by separating (a bit!) the sources. Available entropy crumbs are
  marginally improved.

o Selection of sources is made clearer. With recent revelations,
  this will receive more work in the weeks and months to come.

Submitted by:	 Arthur Mesh (partly) <arthurmesh@gmail.com>
2013-09-07 14:15:13 +00:00
Andrew Turner
0a10f22a30 On ARM EABI, double-precision floating-point values are stored in the
byte order the CPU is running in, i.e. little-endian on most ARM cores.

This allows ARMv4 and ARMv5 boards to boot with the ARM EABI.
2013-09-07 14:04:10 +00:00
Davide Italiano
ab97ad0806 Don't clear the unused SI_CHEAPCLONE flag in tap_create()/tuncreate().
Reviewed by:	kib
2013-09-07 13:50:13 +00:00
Davide Italiano
d56b4cd4ac - Use make_dev_credf(MAKEDEV_REF) instead of the race-prone make_dev()+
dev_ref() in the clone handlers that still use it.
- Don't set the SI_CHEAPCLONE flag; it's not used anywhere, not even in devfs
(for anything real)

Reviewed by:	kib
2013-09-07 13:45:44 +00:00
Hans Petter Selasky
c6fe3731df Revert parts of r245132 and r245175. We don't need to write to the
IMAN register to clear the pending interrupt status bits. This patch
tries to solve problems seen on the MacBook Air, as reported by
Johannes Lundberg <johannes@brilliantservice.co.jp>

MFC after:	1 week
2013-09-07 10:42:00 +00:00
Gleb Smirnoff
af85e9b0c4 Fix !INET6 build. 2013-09-07 09:47:18 +00:00
Mark Murray
9d32fc31c7 MFC 2013-09-07 07:58:29 +00:00
Gleb Smirnoff
fee4c621fc Fix of r255318: move sf_buf_alloc()/sf_buf_free() out of #ifdef
ARM_USE_SMALL_ALLOC.
2013-09-07 07:56:55 +00:00
Navdeep Parhar
34c916c6d2 Add a vtprintf. It is to tprintf what vprintf is to printf.
Reviewed by:	kib
2013-09-07 07:53:21 +00:00
Hans Petter Selasky
549c5c8798 Disable USB 3.0 streams mode by default to avoid undefined behaviour,
since not all XHCI chipsets implement it.
2013-09-07 06:53:59 +00:00
Neel Natu
45e51299b3 Allocate VPIDs by using the unit number allocator to do the bookkeeping.
Also deal with VPID exhaustion by allocating out of a reserved range as the
last resort.
2013-09-07 05:30:34 +00:00
Peter Grehan
8a02f69652 Mask off the vector from the MSI-x data word.
Some o/s's set the trigger-mode level bit which
results in an invalid vector and pass-thru interrupts
not being delivered.
2013-09-07 03:33:36 +00:00
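The masking described in 8a02f69652 can be sketched as below. The bit positions follow the x86 MSI message data layout (vector in bits 7:0, trigger mode in bit 15); the macro and function names are hypothetical and are not the bhyve identifiers.

```c
#include <stdint.h>

#define SK_MSI_VECTOR_MASK	0xffu		/* bits 7:0: interrupt vector */
#define SK_MSI_TRIGGER_LEVEL	(1u << 15)	/* bit 15: trigger mode = level */

/*
 * Extract only the vector from an MSI/MSI-X data word, ignoring the
 * delivery-mode and trigger-mode bits a guest may have set.
 */
static uint8_t
sk_msi_vector(uint32_t data)
{
	return ((uint8_t)(data & SK_MSI_VECTOR_MASK));
}
```

Without the mask, a data word such as `SK_MSI_TRIGGER_LEVEL | 0x41` would be treated as vector 0x8041, which is out of range.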
Pedro F. Giffuni
1f7c9f2bc8 ext2fs: temporarily disable htree directory index.
Our code does not yet consider the case of hash collisions. This
is a rather annoying situation where two or more files that
happen to have the same hash value will not appear accessible.

The situation is not difficult to work-around but given that things
will just work without enabling htree we will save possible
embarrassments for the next release.

Reported by:	Kevin Lo
2013-09-07 02:45:51 +00:00
Michael Tuexen
d4d23375d3 When computing the partial delivery point, take the
receiver socket buffer size correctly into account.

MFC after: 1 week
2013-09-07 00:45:24 +00:00
Luiz Otavio O Souza
ebcbd8aeff Remove the hardcoded limit for the number of gpio_pins that can be used.
Allocate it dynamically.

Approved by:	adrian (mentor)
2013-09-06 23:47:50 +00:00
Luiz Otavio O Souza
8d900240b0 Fix an off-by-one bug in ar71xx_gpio and bcm2835_gpio which makes the last
pin unavailable.

Reported and tested by:	sbruno (ar71xx)
Approved by:	adrian (mentor)
Pointy hat to:	loos
2013-09-06 23:39:56 +00:00
Rick Macklem
5ee5ec755d Intermittent crashes in the NLM (rpc.lockd) code during system
shutdown were reported via email. The crashes occurred because the
client side NLM would attempt to use its socket after it had been
destroyed. Looking at the code, it would soclose() once the reference
count on the socket handling structure went to 0. Unfortunately,
nlm_host_get_rpc() will simply allocate a new socket handling structure
when none exists and use the now soclose()d socket. Since there doesn't
seem to be a safe way to determine when the socket is no longer needed,
this patch modifies the code so that it never soclose()es the socket.
Since there is only one socket ever created, this does not introduce a
leak when the rpc.lockd is stopped/restarted. The patch also disables
unloading of the nfslockd module, since it is not safe to do so (and
has never been safe to do so, from what I can see).

Reported by:	mav
Tested by:	mav
MFC after:	2 weeks
2013-09-06 23:14:31 +00:00
Cy Schubert
bfc88dcbf7 Update ipfilter 4.1.28 --> 5.1.2.
Approved by:		glebius (mentor)
BSD Licensed by:	Darren Reed <darrenr@reed.wattle.id.au> (author)
2013-09-06 23:11:19 +00:00
Justin T. Gibbs
e44af46e4c Implement PV IPIs for PVHVM guests and further converge PV and HVM
IPI implementations.

Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
Submitted by: gibbs (misc cleanup, table driven config)
Reviewed by:  gibbs
MFC after: 2 weeks

sys/amd64/include/cpufunc.h:
sys/amd64/amd64/pmap.c:
	Move invltlb_globpcid() into cpufunc.h so that it can be
	used by the Xen HVM version of tlb shootdown IPI handlers.

sys/x86/xen/xen_intr.c:
sys/xen/xen_intr.h:
	Rename xen_intr_bind_ipi() to xen_intr_alloc_and_bind_ipi(),
	and remove the ipi vector parameter.  This api allocates
	an event channel port that can be used for ipi services,
	but knows nothing of the actual ipi for which that port
	will be used.  Removing the unused argument and cleaning
	up the comments surrounding its declaration helps clarify
	its actual role.

sys/amd64/amd64/mp_machdep.c:
sys/amd64/include/cpu.h:
sys/i386/i386/mp_machdep.c:
sys/i386/include/cpu.h:
	Implement a generic framework for amd64 and i386 that allows
	the implementation of certain CPU management functions to
	be selected at runtime.  Currently this is only used for
	the ipi send function, which we optimize for Xen when running
	on a Xen hypervisor, but can easily be expanded to support
	more operations.

sys/x86/xen/hvm.c:
	Implement Xen PV IPI handlers and operations, replacing native
	send IPI.

sys/amd64/include/pcpu.h:
sys/i386/include/pcpu.h:
sys/i386/include/smp.h:
	Remove NR_VIRQS and NR_IPIS from FreeBSD headers.  NR_VIRQS
	is defined already for us in the xen interface files.
	NR_IPIS is only needed in one file per Xen platform and is
	easily inferred by the IPI vector table that is defined in
	those files.

sys/i386/xen/mp_machdep.c:
	Restructure to more closely match the HVM implementation by
	performing table driven IPI setup.
2013-09-06 22:17:02 +00:00
Davide Italiano
933e681d93 Retire netisr.netisr_direct and netisr.netisr_direct_force sysctls.
These were used to control/export dispatch policy but they're not anymore.
This commit cannot be MFC'ed to 9 because old netstat(9) binary relies
on such sysctl to work. On the other hand, there's no real reason to
keep'em around in 10.
2013-09-06 21:02:43 +00:00
Bryan Venteicher
ddb4ffd0c6 Add vmx device to the i386 and amd64 NOTES files 2013-09-06 20:24:21 +00:00
David Chisnall
52b42bace1 On platforms where clang is the default compiler, don't build gcc or libstdc++.
To enable them, set WITH_GCC and WITH_GNUCXX in src.conf.
Make clang default to using libc++ on FreeBSD 10.
Bumped __FreeBSD_version for the change.

GCC is still enabled on PC98, because the PC98 bootloader requires GCC to build
(or, at least, hard-codes the use of gcc into its build).

Thanks to everyone who helped make the ports tree ready for this (and bapt
for coordinating them all).  Also to imp for reviewing this and working on the
forward-porting of the changes in our gcc so that we're getting to a much
better place with regard to external toolchains.

Sorry to all of the people who helped who I forgot to mention by name.

Reviewed by:	bapt, imp, dim, ...
2013-09-06 20:08:03 +00:00
Xin LI
7acd42244e Return BUS_PROBE_DEFAULT instead of BUS_PROBE_SPECIFIC.
This change is a 9.2-RELEASE candidate.

Approved by:	HighPoint Technologies
2013-09-06 18:41:57 +00:00
Mark Murray
c6c7b2912c Yarrow wants entropy estimations to be conservative; the usual idea
is that if you are certain you have N bits of entropy, you declare
N/2.
2013-09-06 17:51:52 +00:00
Gleb Smirnoff
2ee9b44cae Fix build with gcc. Move sf_buf_alloc()/sf_buf_free() declarations
to MD headers.
2013-09-06 17:44:13 +00:00
Mark Murray
0fbf163e60 MFC 2013-09-06 17:42:12 +00:00
Jamie Gritton
bb56d716ea Keep PRIV_KMEM_READ permitted inside jails as it is on the outside. 2013-09-06 17:32:29 +00:00
Konstantin Belousov
9430f833ca Only lock pvh_global_lock read-only for pmap_page_wired_mappings(),
pmap_is_modified() and pmap_is_referenced(), same as it was done for
pmap_ts_referenced().

Consolidate identical code for pmap_is_modified() and
pmap_is_referenced() into helper pmap_page_test_mappings().

Reviewed by:	alc
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
2013-09-06 16:53:48 +00:00
Konstantin Belousov
3e4f32be7d In pmap_ts_referenced(), when restarting the loop due to pv list
generation changed, do not drop and immediately relock the pv list.

Suggested and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
2013-09-06 16:48:34 +00:00
Alexander Motin
f9004a5db0 Make SES driver react adequately to simple enclosure devices -- read Short
Enclosure status to enclosure status field, clear previous state and exit.
2013-09-06 15:41:37 +00:00
Bryan Venteicher
ffead710d5 Add camcontrol support for the SCSI sanitize command
Reviewed by:	ken, mjacob (earlier version)
Sponsored by:	Netapp
2013-09-06 15:19:57 +00:00
Pawel Jakub Dawidek
0d1322f22f Bump __FreeBSD_version to 1000053 after cap_rights_t change.
Suggested by:	danfe
2013-09-06 14:34:20 +00:00
Alexander Motin
d7a52e7b49 Fix kernel panic if cache->nelms is zero.
MFC after:	2 weeks
2013-09-06 14:31:52 +00:00
Luiz Otavio O Souza
ce6ba017fa Fix the leakage of dma tags on if_arge. The leak occurs when arge_start()
adds some packet(s) to the tx ring and arge_stop() is called before the
sent-packet interrupt is received from the hardware.  Fix arge_stop() to unload the in-use
dma tags and free the associated mbuf.

PR:		178319, 163670
Approved by:	adrian (mentor)
2013-09-06 12:47:14 +00:00
Gleb Smirnoff
1338ab601f Fix build. 2013-09-06 05:38:20 +00:00
Gleb Smirnoff
e16477e8d9 On those machines, where sf_bufs do not represent any real object, make
sf_buf_alloc()/sf_buf_free() inlines, to save two calls to absolutely
empty functions.

Reviewed by:	alc, kib, scottl
Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2013-09-06 05:37:49 +00:00
Peter Grehan
76c35ba80f Emulate reading of the IA32_MISC_ENABLE MSR, by returning
the host MSR and masking off features that aren't supported.
Linux reads this MSR to detect if NX has been disabled via
BIOS.
2013-09-06 05:20:11 +00:00
Peter Grehan
8b7e3e3022 Allow CPUID leaf 0xD to be read as zeroes.
Linux reads this even though extended features
aren't exposed.

Support for 0xD will be expanded once AVX[2]
is exposed to the guest in upcoming work.
2013-09-06 05:16:10 +00:00
Rick Macklem
318677ad92 It was reported via email that the cu_sent field used by the
krpc client side UDP was observed as way out of range and
caused the rpc.lockd daemon to hang trying to do an RPC.
Inspection of the code found two places where the RPC request
is re-queued, but the value of cu_sent was not incremented.
Since cu_sent is always decremented when the RPC request is
dequeued, I think this could have caused cu_sent to go out of
range. This patch adds lines to increment cu_sent for these
two cases.

Reported by:	dwhite@ixsystems.com
Discussed with:	dwhite@ixsystems.com
MFC after:	2 weeks
2013-09-06 02:34:34 +00:00
Nathan Whitehorn
653a5825b8 Also align the 32-bit PowerPC stacks. 2013-09-05 23:28:50 +00:00
Carl Delsey
2a6e50d2ce Remove contractions.
Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 23:14:27 +00:00
Carl Delsey
538779c1a0 Only tear down interface and transport if they've been successfully set up.
Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 23:12:58 +00:00
Carl Delsey
218b961f0e Work around an issue with hardware by accessing remote device through mem
window.

Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 23:11:11 +00:00
Carl Delsey
b0f569217d Simplify register access macros by removing one level of indirection.
Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 23:08:22 +00:00
Carl Delsey
ff53f82bfd Cleaning up spacing and making hex value case consistent.
Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 23:06:25 +00:00
Carl Delsey
bfb4daf19d Implement workaround for IvyTown 4K BAR size issue.
Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 23:04:36 +00:00
Carl Delsey
93d43573eb Simplifying bus alloc resource call since we only need the default values.
Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 23:02:43 +00:00
Carl Delsey
e70d7a7c79 Add support for per device features and workarounds.
Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 23:00:59 +00:00
Nathan Whitehorn
a5715964b1 Align stacks of kernel threads correctly at 16-byte boundaries rather than
making sure they are all misaligned at +8 bytes. This fixes clang builds
of powerpc64 kernels (aside from a required increase in KSTACK_PAGES which
will come later).

This commit from FreeBSD/powerpc64 with a clang-built kernel.

MFC after:	2 weeks
2013-09-05 23:00:24 +00:00
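The alignment rule from a5715964b1 amounts to rounding the initial stack pointer down to a 16-byte boundary. A minimal sketch, assuming a downward-growing stack; the helper name is hypothetical and this is not the powerpc machine-dependent code:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round a stack pointer down to the given power-of-two alignment.
 * For a downward-growing stack, rounding down keeps the pointer
 * inside the allocated region.
 */
static uintptr_t
sk_align_down(uintptr_t sp, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (sp & ~(align - 1));
}
```

For example, `sk_align_down(0x1008, 16)` yields 0x1000, whereas a pointer left at 0x1008 is the "+8 bytes" misalignment the message mentions.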
Carl Delsey
87a7a3f08d Restructure the PCI bar initialization code in anticipation of upcoming
bug fixes.

Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 22:59:18 +00:00
Carl Delsey
64d957247f Fix name change from ntb_transport to if_ntb. A few places were
overlooked.

Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 22:56:52 +00:00
Carl Delsey
12c5baf9f5 Throw a bit to enable the link to come up on Xeon.
Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 22:52:40 +00:00
Carl Delsey
d43c0fbae1 Add some logging to ntb link up.
Approved by:	jimharris
Sponsored by:	Intel
2013-09-05 22:46:48 +00:00
Hiren Panchasara
b6f49c23a3 Fixing a small typo.
Reviewed by:	gjb
Approved by:	sbruno (mentor)
2013-09-05 18:18:23 +00:00
Sean Bruno
7befb5c2ca Minor printf nit to keep out clean 2013-09-05 16:38:26 +00:00
John Baldwin
86d93a15ff Use LIST_FOREACH_SAFE() instead of doing it by hand. 2013-09-05 14:26:37 +00:00
John Baldwin
fa302f207f Use an unsigned long when indexing into mfchashtbl[] and mf6ctable[]. This
matches the types used when computing hash indices and the type of the
maximum size of mfchashtbl[].

PR:		kern/181821
Submitted by:	Sven-Thorsten Dietrich <sven@vyatta.com> (IPv4)
MFC after:	1 week
2013-09-05 14:16:37 +00:00
Gleb Smirnoff
98fa035135 Fix build. 2013-09-05 13:53:25 +00:00
Gleb Smirnoff
434c3d4783 Fix build.
counter.h requires systm.h
2013-09-05 13:46:30 +00:00
Konstantin Belousov
a677b31425 The vm_pageout_flush() function sbusies pages in the passed pages
run.  After that, the pager put method is called, usually translated
to VOP_WRITE().  For the filesystems which use buffer cache,
bufwrite() sbusies the buffer pages again, waiting for the xbusy state
to drain.  The latter is done in vfs_drain_busy_pages(), which is
called with the buffer pages already sbusied (by vm_pageout_flush()).

Since vfs_drain_busy_pages() can only wait for one page at a time,
and during the wait, the object lock is dropped, previous pages in the
buffer must be protected from other threads busying them.  Until
now, it was done by xbusying the pages, which is incompatible with
the sbusy state in the new implementation of busy.  Switch to sbusy.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-09-05 12:56:08 +00:00
Konstantin Belousov
7a4b2bc56c vm_page_trysbusy() should not fail when the shared busy counter or
the VPB_BIT_WAITERS flag changed between the read of busy_lock and the
cas.  vm_page_sbusy(), which is the only user of
vm_page_trysbusy() in the tree, panics on failure, which in these
cases is transient and does not mean that the current page state
prevents sbusying.

Retry the operation inside vm_page_trysbusy() if cas failed, only
return a failure when VPB_BIT_SHARED is cleared.

Reported and tested by:	pho
Reviewed by:	attilio
Sponsored by:	The FreeBSD Foundation
2013-09-05 12:54:40 +00:00
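The retry logic described in 7a4b2bc56c can be sketched with C11 atomics. This is only an approximation of the idea: the constants and the function name are hypothetical stand-ins, and the kernel uses its own atomic primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SK_BIT_SHARED	0x1u	/* hypothetical: lock is in shared mode */
#define SK_ONE_SHARER	0x8u	/* hypothetical: increment per sharer */

/*
 * Try to take a shared busy reference.  A failed compare-and-swap only
 * means another thread changed the word in the meantime, so retry; give
 * up only when the shared bit is actually clear.
 */
static bool
sk_trysbusy(atomic_uint *busy_lock)
{
	unsigned int x = atomic_load(busy_lock);

	for (;;) {
		if ((x & SK_BIT_SHARED) == 0)
			return (false);
		/* On failure, x is reloaded with the current value. */
		if (atomic_compare_exchange_weak(busy_lock, &x,
		    x + SK_ONE_SHARER))
			return (true);
	}
}
```

The point of the loop is that a concurrent change to the sharer count or the waiters flag no longer surfaces as a failure to the caller.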
Pawel Jakub Dawidek
96a62209fb The fget() function now takes pointer to cap_rights_t, so change 0 to NULL. 2013-09-05 11:59:23 +00:00
Pawel Jakub Dawidek
ab568de789 Handle cases where capability rights are not provided.
Reported by:	kib
2013-09-05 11:58:12 +00:00
Gleb Smirnoff
2af0c790ec Fix !CAPABILITIES build. 2013-09-05 10:24:09 +00:00
Ruslan Bukin
c7e4729da4 Add support for DLINK DWA-127 Wireless Adapter
Approved by:	cognet (mentor)
2013-09-05 10:09:24 +00:00
Andrey V. Elsukov
87c0c612d8 Remove stub implementation.
MFC after:	1 week
2013-09-05 09:44:09 +00:00
Pawel Jakub Dawidek
44fcd367c5 Correct the logic broken in my last commit.
Reported by:	tijl
2013-09-05 09:36:19 +00:00
Andrey V. Elsukov
d983befd2f Remove unused code and sort variables declarations.
PR:		kern/181822
MFC after:	1 week
2013-09-05 08:12:36 +00:00
Sean Bruno
8e076eff36 Restore builds on architectures that don't support CAPABILITIES (mips). 2013-09-05 03:46:44 +00:00
Sean Bruno
88eb548859 This looks like a typo that breaks the build. Yell at me if this isn't the
intended declaration.
2013-09-05 03:36:57 +00:00
Justin Hibbits
995df27c66 Fix the build. 2013-09-05 01:13:26 +00:00
Pawel Jakub Dawidek
7e473ea146 Add sysctl/tunables for various metaslab variables. 2013-09-05 00:53:01 +00:00
Pawel Jakub Dawidek
a686a7be03 Style fixes. 2013-09-05 00:19:30 +00:00
Pawel Jakub Dawidek
547561f1b0 Style fixes. Most fixes are about not treating integers and pointers as
booleans.
2013-09-05 00:17:38 +00:00
Pawel Jakub Dawidek
00a7f703b3 Regenerate after r255219.
Sponsored by:	The FreeBSD Foundation
2013-09-05 00:11:59 +00:00
Pawel Jakub Dawidek
7008be5bd7 Change the cap_rights_t type from uint64_t to a structure that we can extend
in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

	struct cap_rights {
		uint64_t	cr_rights[CAP_RIGHTS_VERSION + 2];
	};

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.

The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.

To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.

	#define	CAP_PDKILL	CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine a few rights, but the rights have to belong
to the same array element, eg:

	#define	CAP_LOOKUP	CAPRIGHT(0, 0x0000000000000400ULL)
	#define	CAP_FCHMOD	CAPRIGHT(0, 0x0000000000002000ULL)

	#define	CAP_FCHMODAT	(CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

	cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
	void cap_rights_set(cap_rights_t *rights, ...);
	void cap_rights_clear(cap_rights_t *rights, ...);
	bool cap_rights_is_set(const cap_rights_t *rights, ...);

	bool cap_rights_is_valid(const cap_rights_t *rights);
	void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
	void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
	bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:

	cap_rights_t rights;

	cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:

	#define	cap_rights_set(rights, ...)				\
		__cap_rights_set((rights), __VA_ARGS__, 0ULL)
	void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:

	cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belong to the same array element this way is
correct, but is not advised. It should only be used for aliases definition.

This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.

Sponsored by:	The FreeBSD Foundation
2013-09-05 00:09:56 +00:00
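The bit layout described in 7008be5bd7 can be sketched as below. The shift of 57 is an assumption derived from the message (64 bits minus the two size bits and the five index bits leaves 57 bits per element, i.e. 114 rights for two elements); the sk_ names are hypothetical and are not the real CAPRIGHT() or right_to_index() definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Build a right: one index bit in the five-bit field, plus the right bit. */
static uint64_t
sk_capright(int idx, uint64_t bit)
{
	assert(idx >= 0 && idx < 5);
	assert((bit >> 57) == 0);	/* must fit in the low 57 bits */
	return ((1ULL << (57 + idx)) | bit);
}

/*
 * Recover the array index from a right.  Returns -1 when zero or more
 * than one index bit is set, which is how rights from different array
 * elements OR-ed together can be detected.
 */
static int
sk_right_to_index(uint64_t right)
{
	unsigned int field = (unsigned int)((right >> 57) & 0x1f);
	int idx = 0;

	if (field == 0 || (field & (field - 1)) != 0)
		return (-1);
	while ((field & 1) == 0) {
		field >>= 1;
		idx++;
	}
	return (idx);
}
```

Two rights from the same element share one index bit, so an alias such as CAP_FCHMOD | CAP_LOOKUP still maps to a single index, while a mix of element 0 and element 1 sets two index bits and is rejected.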
Justin T. Gibbs
c70fe93ad8 Correct blkback handling of the BLKIF_OP_FLUSH_DISKCACHE opcode.
Properly round-trip the "operation code" for client requests.

sys/dev/xen/blkback/blkback.c:
	In xbb_dispatch_dev() when processing a flush request,
	correctly set bio->bio_caller1 to the request list (not
	bare request) for the operation, as is expected by the
	completion handler xbb_bio_done().

	In xbb_get_resources(), initialize "operation" in the
	driver's internal request object from the client's "ring
	request", so it is correct when used to populate the reply
	when this operation completes.

Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
2013-09-04 23:32:49 +00:00
Konstantin Belousov
6aceaa3e17 Tidy up some loose ends in the PCID code:
- Restore the pre-PCID TLB shootdown handlers for whole address space
  and single page invalidation asm code, and assign the IPI handler to
  them when PCID is not supported or disabled.  Old handlers have
  linear control flow.  But, still use the common return sequence.

- Stop using pcpu for INVPCID descriptors in the invlrng handler.  It
  is enough to allocate descriptors on the stack.  As a result, two
  SWAPGS instructions are shaved off from the code for Haswell+.

- Fix the inverted condition in invlrng for checking of the PCID
  support [1], also in invlrng check that pmap is kernel pmap before
  performing other tests.  For the kernel pmap, which provides global
  mappings, the INVLPG must be used for invalidation always.

- Save the pre-computed pmap's %CR3 register in the struct pmap.  This
  allows removing several checks for pm_pcid validity when %CR3 is
  reloaded [2].

Noted by:   gibbs [1]
Discussed with:	alc [2]
Tested by:	pho, flo
Sponsored by:	The FreeBSD Foundation
2013-09-04 23:31:29 +00:00
Rick Macklem
f7d8291af0 Crashes have been observed for NFSv4.1 mounts when the system
is being shut down which were caused by the nfscbd_pool being
destroyed before the backchannel is disabled. This patch is
believed to fix the problem, by simply avoiding ever destroying
the nfscbd_pool. Since the NFS client module cannot be unloaded,
this should not cause a memory leak.

MFC after:	2 weeks
2013-09-04 22:47:56 +00:00
Peter Grehan
46ed9e4908 IFC @ r255209 2013-09-04 20:55:56 +00:00
Oleksandr Tymoshenko
3b15395e04 Add 32-bit support for Gxemul's oldtestmips machine emulation
Original work by: kan@
2013-09-04 20:34:36 +00:00
Eitan Adler
4c8d7275a4 Revert r255152:
It turns out that synaptics_support was turned off by default
because its probing method is too intrusive not because it was unstable.

Once this is fixed it should be enabled once again.

Reported by:	delphij, jkim
2013-09-04 18:42:05 +00:00
Brooks Davis
f43581345b MFP4 217312, 222008, 222052, 222053, 222673, 231484, 231491, 231565, 570643
Rework the timeout code to use actual time rather than a DELAY() loop and
to use both typical and maximum to allow logging of timeout failures.
Also correct the erase timeout, it is specified in milliseconds not
microseconds like the other timeouts.  Do not invoke DELAY() between
status queries as this adds significant latency which in turn reduced
write performance substantially.

Sanity check timeout values from the hardware.

Implement support for buffered writes (only enabled on Intel/Sharp parts
for now).  This yields an order of magnitude speedup on the 64MB Intel
StrataFlash parts we use.

When making a copy of the block to modify, also keep a clean copy around
until we are ready to commit the block and use it to avoid unnecessary
erases.  In the non-buffer write case, also use it to avoid
unnecessary writes when the block has not been erased.  This yields a
significant speedup when doing things like zeroing a block.

Sponsored by:	DARPA, AFRL
Reviewed by:	imp (previous version)
2013-09-04 17:19:21 +00:00
John Baldwin
5396f9ec5a Trim a couple of panic messages. 2013-09-04 11:52:28 +00:00
Gleb Smirnoff
5185640523 Make default cache size more modern.
Requested by:	Slawa Olhovchenkov <slw zxy.spb.ru>
2013-09-04 10:17:50 +00:00
Justin Hibbits
177f0102f4 Fix hwpmc(4) for 32-bit PowerPC. 2013-09-04 04:11:38 +00:00
Navdeep Parhar
4f641559c7 For TOE connections, the window scale factor in CPL_PASS_ACCEPT_REQ is
set to 15 to indicate that the peer did not send a window scale option
with its SYN.  Do not send a window scale option in the SYN|ACK reply
in that case.
2013-09-03 23:34:04 +00:00
Sean Bruno
2b2cd594f5 Add options GEOM_PART_GPT and options MSDOSFS to the DIR-825
Reviewed by:	adrian@
2013-09-03 22:33:06 +00:00
Warner Losh
ce7c952a8e Newer versions of gcc define __INT64_C and __UINT64_C, so avoid
redefining them if gcc provides them.
2013-09-03 22:04:55 +00:00
John Baldwin
dffe0dc4d2 Add support for the 'invpcid' instruction to binutils and DDB's
disassembler on amd64.

MFC after:	1 month
2013-09-03 21:21:47 +00:00
Michael Tuexen
0ddb429900 Remove redundant field pr_sctp_on.
MFC after: 1 week
2013-09-03 19:31:59 +00:00
John-Mark Gurney
ff6c7bf5ca Use the fact that the AES-NI instructions can be pipelined to improve
performance... Use SSE2 instructions for calculating the XTS tweak
factor...  Let the compiler do more work and handle register allocation
by using intrinsics, now only the key schedule is in assembly...

Replace .byte hard coded instructions w/ the proper instructions now
that both clang and gcc support them...

On my machine, pulling the code to userland I saw performance go from
~150MB/sec to 2GB/sec in XTS mode.  GELI on GNOP saw a more modest
increase of about 3x due to other system overhead (geom and
opencrypto)...

These changes allow almost full disk io rate w/ geli...

Reviewed by:	-current, -security
Thanks to:	Mike Hamburg for the XTS tweak algorithm
2013-09-03 18:31:23 +00:00