Commit Graph

92299 Commits

Author SHA1 Message Date
Marius Strobl
105421ff81 Merge r247814 from x86 modulo whitespace bug:
Turn on the CTL disable tunable by default.

This will allow GENERIC configurations to boot on small memory boxes, but
not require end users who want to use CTL to recompile their kernel.  They
can simply set kern.cam.ctl.disable=0 in loader.conf.
2013-03-08 13:11:45 +00:00
Andre Oppermann
15ae0c9af9 Move the callout subsystem initialization to its own SYSINIT()
from being indirectly called via cpu_startup()+vm_ksubmap_init().
The boot order position remains the same at SI_SUB_CPU.

Allocation of the callout array is changed to standard kernel malloc
from a slightly obscure direct kernel_map allocation.

kern_timeout_callwheel_alloc() is renamed to callout_callwheel_init()
to better describe its purpose.
kern_timeout_callwheel_init() is removed simplifying the per-cpu
initialization.

Reviewed by:	davide
2013-03-08 10:37:17 +00:00
Andre Oppermann
f8ccf82a4c Move the auto-sizing of the callout array from init_param2() to
kern_timeout_callwheel_alloc() where it is actually used.

This is a mechanical move and no tuning parameters are changed.

The pre-allocated callout array is only used for legacy timeout(9)
calls and is only allocated and active on cpu0.  Eventually all
remaining users of timeout(9) should switch to the callout_* API.

Reviewed by:	davide
2013-03-08 10:14:58 +00:00
Tim Kientzle
08907adea3 This file is specific to arm11x6 processors, so tell the
assembler it's okay to use arm11x6 instructions.
2013-03-08 03:29:05 +00:00
David E. O'Brien
4b52061e17 Fix GCC build:
/usr/src/sys/modules/nvme/../../dev/nvme/nvme.c:211: warning: format '%qx' expects type 'long unsigned int', but argument 9 has type 'long long unsigned int' [-Wformat]
2013-03-07 22:54:28 +00:00
Gavin Atkinson
10f29053d2 Support the FAT16 partition type in gpart(8)
PR:		kern/174714
Submitted by:	4721 at hushmail dot com
MFC after:	1 week
2013-03-07 22:32:41 +00:00
Alexander Motin
34d3281c57 Fix panic when Secondary_Element_Count == 1 and Secondary_Element_Seq
is not set (255).

Reported by:	sbruno
MFC after:	1 week
2013-03-07 18:55:37 +00:00
Alexander Motin
836972b877 Fix off-by-one error in nanoseconds validation.
Submitted by:	bde
2013-03-07 16:50:07 +00:00
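The commit above does not show the affected code, so the following is only a generic sketch of the kind of range check an off-by-one in nanoseconds validation concerns. The names `DEMO_NSEC_PER_SEC` and `timespec_nsec_valid()` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Illustrative constant: one full second expressed in nanoseconds. */
#define DEMO_NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: the nanoseconds field must lie in
 * [0, 999999999]. A full second belongs in tv_sec instead.
 */
static bool
timespec_nsec_valid(const struct timespec *ts)
{
	assert(ts != NULL);
	/* An off-by-one variant would test "<=" here and accept 1000000000. */
	return (ts->tv_nsec >= 0 && ts->tv_nsec < DEMO_NSEC_PER_SEC);
}
```

The upper bound is exclusive, so 999999999 passes and 1000000000 is rejected.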
Gavin Atkinson
9ec80eff4c Correct two spelling mistakes in a comment. 2013-03-07 13:24:49 +00:00
Gleb Smirnoff
2112695c03 Add quirks to enable headphones redirection on a number of Lenovo
laptops, namely X1, X1 Carbon, T420, T520.

PR:		misc/176656
Submitted by:	Hiren Panchasar <hiren.panchasara gmail.com>
Tested by:	glebius, X1 Carbon
Tested by:	osa, X1
Tested by:	Hiren Panchasar, T420
Tested by:	sbruno, T520
Reviewed by:	mav
Sponsored by:	Nginx, Inc.
2013-03-07 08:00:04 +00:00
Gleb Smirnoff
a95940fd46 Plug a memory leak.
Reviewed by:	mav
Sponsored by:	Nginx, Inc.
2013-03-07 07:54:50 +00:00
Lawrence Stewart
1e0e83d760 The hashmask returned by hashinit() is a valid index in the returned hash array.
Fix a siftr(4) potential memory leak and INVARIANTS triggered kernel panic in
hashdestroy() by ensuring the last array index in the flow counter hash table is
flushed of entries.

MFC after:	3 days
2013-03-07 04:42:20 +00:00
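The point made above, that the mask is itself a valid index, can be sketched in plain C. This is not the siftr(4) code; `flush_buckets()` and the integer bucket array are hypothetical stand-ins for a table whose size is a power of two and whose mask is size minus one.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: clears every bucket of a table indexed by
 * (hash & hashmask) and returns how many buckets were visited.
 */
static size_t
flush_buckets(int *buckets, unsigned long hashmask)
{
	size_t visited = 0;

	assert(buckets != NULL);
	/* "<=" is deliberate: index hashmask is the last valid bucket. */
	for (unsigned long i = 0; i <= hashmask; i++) {
		buckets[i] = 0;
		visited++;
	}
	return (visited);
}
```

A loop written with `i < hashmask` would skip the final bucket, which is the kind of leftover entry the commit message describes.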
Ian Lepore
9a2bff7ca6 Call sched_prio() to immediately change the priority of the thread in
response to an rtprio_thread() call, when the priority is different
than the old priority, and either the old or the new priority class is
not RTP_PRIO_NORMAL (timeshare).

The reasoning for the second half of the test is that if it's a change in
timeshare priority, then the scheduler is going to adjust that priority
in a way that completely wipes out the requested change anyway, so
what's the point?  (If that's not true, then allowing a thread to change
its own timeshare priority would subvert the scheduler's adjustments and
let a cpu-bound thread monopolize the cpu; if allowed at all, that
should require privileges.)

On the other hand, if either the old or new priority class is not
timeshare, then the scheduler doesn't make automatic adjustments, so we
should honor the request and make the priority change right away.  The
reason the old class gets caught up in this is the very reason for this
change:  when thread A changes the priority of its child thread B from
idle back to timeshare, thread B never actually gets moved to a
timeshare-range run queue unless there are some idle cycles available
to allow it to first get scheduled again as an idle thread.

Reviewed by:	jhb@
2013-03-07 02:53:29 +00:00
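The two-part test described above can be written as a small predicate. This is a sketch only: the enum values are stand-ins invented here, not the kernel's RTP_PRIO_* constants, and `apply_priority_now()` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in class names for this sketch only. */
enum demo_class { DEMO_REALTIME, DEMO_NORMAL, DEMO_IDLE };

/*
 * Returns true when the priority change should take effect right away:
 * the priority actually differs, and at least one of the old or new
 * classes is not the timeshare (normal) class.
 */
static bool
apply_priority_now(enum demo_class old_class, int old_prio,
    enum demo_class new_class, int new_prio)
{
	assert(old_prio >= 0 && new_prio >= 0);
	bool changed = (new_prio != old_prio);
	bool non_timeshare =
	    (old_class != DEMO_NORMAL || new_class != DEMO_NORMAL);
	return (changed && non_timeshare);
}
```

Checking the old class as well is what covers the idle-to-timeshare case the message singles out.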
Alexander Motin
b5ea3779da Reduce minimal time intervals of setitimer(2) from 1/HZ to 1/(16*HZ) by
using callout_reset_sbt() instead of callout_reset().  We can't remove the
lower limit completely in this case because of the significant processing
overhead caused by the inability to use direct callout execution, since the
callout handler takes the process mutex to send the SIGALRM signal.  With
support for periodic events, that would allow an unprivileged user to abuse
the system.

Reviewed by:	davide
2013-03-06 22:40:47 +00:00
Alexander Motin
980c545d76 Fix time math overflows and improve zero intervals handling in poll(),
select(), nanosleep() and kevent() functions after calloutng changes.

Reported by:	bde
2013-03-06 19:37:38 +00:00
Ulrich Spörlein
7732eaccce Fix 'make depend' 2013-03-06 11:44:19 +00:00
Xin LI
cdaba8920e Update driver to version 4.6.95.0.
Submitted by:	"Duvvuru,Venkat Kumar" <VenkatKumar.Duvvuru Emulex.Com>
MFC after:	3 days
2013-03-06 09:53:38 +00:00
Bryan Venteicher
0cfbcf8c7b Remove the virtio dependency entry for the VirtIO device drivers. This
will prevent the kernel from linking if the device drivers are included
without the virtio module. Remove pci and scbus for the same reason.

Also explain the relationship and necessity of the virtio and virtio_pci
modules. Currently in FreeBSD, we only support VirtIO PCI, but it could
be replaced with a different interface (like MMIO) and the device
(network, block, etc) will still function.

Requested by:	luigi
Approved by:	grehan (mentor)
MFC after:	3 days
2013-03-06 07:17:53 +00:00
Andrew Turner
078996e049 Fix stack alignment in the kernel to be on an 8 byte boundary as required
by AAPCS.
2013-03-06 06:19:56 +00:00
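As a rough illustration of the 8-byte requirement mentioned above, the sketch below rounds a stack pointer value down to an 8-byte boundary, assuming a stack that grows toward lower addresses. The real fix lives in kernel entry code that is not shown in the message; `DEMO_STACK_ALIGN` and `align_stack_down()` are invented names.

```c
#include <assert.h>
#include <stdint.h>

/* Alignment used in this sketch, matching the 8 bytes named above. */
#define DEMO_STACK_ALIGN 8u

/* Hypothetical helper: round sp down to the nearest 8-byte boundary. */
static uintptr_t
align_stack_down(uintptr_t sp)
{
	uintptr_t aligned = sp & ~(uintptr_t)(DEMO_STACK_ALIGN - 1);

	assert(aligned % DEMO_STACK_ALIGN == 0 && aligned <= sp);
	return (aligned);
}
```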
Oleksandr Tymoshenko
e9401a9e0e - Reset DMA channel if error occurred
- Initialize info field in bcm_dma_reset

Submitted by:	Daisuke Aoyama <aoyama@peach.ne.jp>
2013-03-05 20:00:11 +00:00
Martin Matuska
400c4069a5 MFV r247845:
Import ZFS bpobj bugfix from vendor.

Illumos ZFS issues:
  3603 panic from bpobj_enqueue_subobj()
  3604 zdb should print bpobjs more verbosely

References:
  https://www.illumos.org/issues/3603
  https://www.illumos.org/issues/3604

MFC after:	1 week
2013-03-05 18:54:41 +00:00
Konstantin Belousov
257d427d5f Fix build with gcc, do not use unnamed union.
Reported and tested by:	gjb
MFC after:	1 month
2013-03-05 16:15:34 +00:00
Konstantin Belousov
94415dd90e Fix build with gcc, remove redundant declarations.
Reported and tested by:	gjb
MFC after:	1 month
2013-03-05 16:14:55 +00:00
Alexander V. Chernikov
14126522cf Write lock is not required for find&compare operation.
MFC after:	2 weeks
2013-03-05 13:38:45 +00:00
Jean-Sébastien Pédron
f0916543a6 drm_global.c: Destroy sx in drm_global_release()
This fixes a build error at the same time (unused variable "item"), if
the kernel is compiled without INVARIANTS.

Reported by:	Hartmann, O. <ohartman@zedat.fu-berlin.de> (build error)
Reviewed by:	Konstantin Belousov (kib@)
2013-03-05 11:18:57 +00:00
Konstantin Belousov
9b7efb7458 Correct the r247832.
Noted by:	marius, rdivacky
MFC after:	1 month
2013-03-05 11:02:38 +00:00
Jean-Sébastien Pédron
5943eed4b9 g_label_ntfs.c: Mark structures as __packed
Without this, read data is mis-interpreted. This could trigger a panic,
as was the case on one computer where computed "recsize" was zero,
leading to a call to g_read_page() asking for 0 bytes.
2013-03-05 11:02:05 +00:00
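Why a missing packed attribute makes on-disk data be misread can be shown with a made-up three-byte header. This is not the NTFS layout; the struct and field names are hypothetical, and the attribute spelling assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk header: 1-byte tag directly followed by a 16-bit size. */
struct demo_hdr_packed {
	uint8_t  tag;
	uint16_t recsize;
} __attribute__((__packed__));

/* Same fields without the attribute: the compiler may pad before recsize. */
struct demo_hdr_natural {
	uint8_t  tag;
	uint16_t recsize;
};

/* Offsets at which each variant expects recsize within the raw bytes. */
static size_t
packed_recsize_offset(void)
{
	return (offsetof(struct demo_hdr_packed, recsize));
}

static size_t
natural_recsize_offset(void)
{
	return (offsetof(struct demo_hdr_natural, recsize));
}
```

When the two offsets differ, overlaying the unpacked struct on a disk buffer reads `recsize` from the wrong bytes, which is how a bogus value such as zero can appear.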
Fabien Thomas
d49302aead Add a generic way to call per event allocate / release function.
Reviewed by:	mav
MFC after:	1 month
2013-03-05 10:18:48 +00:00
Konstantin Belousov
e6cd8542ed Import the preliminary port of the TTM.
The early commit is done to facilitate the off-tree work on the
porting of the Radeon driver.

Sponsored by:	The FreeBSD Foundation
Debugged and tested by:	    dumbbell
MFC after:	1 month
2013-03-05 09:49:34 +00:00
Konstantin Belousov
9cfa0e9e3c Import the drm_global references helpers.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2013-03-05 09:27:21 +00:00
Konstantin Belousov
8f3993c1f1 Import the drm_mm_debug_table() function.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2013-03-05 09:07:58 +00:00
Konstantin Belousov
214bb83805 Import the likely() compat macro.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2013-03-05 09:07:01 +00:00
Gleb Smirnoff
fa0fbaece3 Simplify TAILQ usage and avoid additional memory allocations.
Tested by:	Eugene M. Zheganin <emz norma.perm.ru>
Sponsored by:	Nginx, Inc
2013-03-05 08:08:16 +00:00
Bryan Venteicher
4dbc384845 Only set the barrier flag if the feature was negotiated
When the VirtIO barrier feature is not negotiated, the driver
must enforce the proper ordering for BIO_ORDERED BIOs. All the
in-flight BIOs must complete before starting the BIO, and the
ordered BIO must complete before subsequent BIOs can start.

Also fix a few whitespace nits.

Reported by:	neel
Approved by:	grehan (mentor)
MFC after:	3 days
2013-03-05 07:00:05 +00:00
Jack F Vogel
facc592d88 Fix a small, but important bug, a task drain was mistakenly
being compiled only when setting LEGACY_TX, this means you would
not get the drain when needed on detach!!

Thanks to Bryan Venteicher (bryanv@freebsd.org) for catching this
little gremlin!! :)
2013-03-04 23:15:07 +00:00
Jack F Vogel
0ecc2ff0e8 First, sync to internal shared code, and then
Fixes:
	- flow control - don't override user value on re-init
	- fix to make 1G optics work correctly
	- change to interrupt enabling - some bits were incorrect
	  for certain hardware.
	- certain stats fixes, remove a duplicate increment of
	  ierror, thanks to Scott Long for pointing these out.
	- shared code link interface changed, requiring some
	  core code changes to accommodate this.
	- add an m_adj() to ETHER_ALIGN on the receive side, this
	  was requested by Mike Karels, thanks Mike.
	- Multicast code corrections also thanks to Mike Karels.
2013-03-04 23:07:40 +00:00
Davide Italiano
23d44ab528 - Bump __FreeBSD_version after recent callout(9) changes.
- Add an entry in UPDATING to notify users about breakages.
2013-03-04 22:41:49 +00:00
Justin T. Gibbs
7e2a739f03 Fix assertion failure when using userland DTrace probes from
the pid provider on a kernel compiled with INVARIANTS.

sys/cddl/contrib/opensolaris/uts/intel/dtrace/fasttrap_isa.c:
	In fasttrap_probe_pid(), attempts to write to the
	address space of the thread that fired the probe
	must be performed with the process of the thread
	held.  Use _PHOLD() to ensure this is the case.

	In fasttrap_probe_pid(), use proc_write_regs() instead
	of calling set_regs() directly.  proc_write_regs()
	performs invariant checks to verify the calling
	environment of set_regs().  PROC_LOCK()/UNLOCK() around
	the call to proc_write_regs() so that its invariants
	are satisfied.

Sponsored by:	Spectra Logic Corporation
Reviewed by:	gnn, rpaulo
MFC after:	1 week
2013-03-04 22:07:36 +00:00
Davide Italiano
ac42a1726a Complete r247813:
Use true/false instead of TRUE/FALSE.

Reported by:	attilio
Requested by:	jhb
2013-03-04 21:52:12 +00:00
Alexander Motin
32ea29e2eb Add quirk to enable headphones redirection on Lenovo X220.
PR:		kern/174876
MFC after:	1 week
2013-03-04 21:20:13 +00:00
Kenneth D. Merry
3a45b4781a Re-enable CTL in GENERIC on i386 and amd64, but turn on the CTL disable
tunable by default.

This will allow GENERIC configurations to boot on small memory boxes, but
not require end users who want to use CTL to recompile their kernel.  They
can simply set kern.cam.ctl.disable=0 in loader.conf.

The eventual solution to the memory usage problem is to change the way
CTL allocates memory to be more configurable, but this should fix things
for small memory situations in the mean time.

UPDATING:		Explain the change in the CTL configuration, and
			how users can enable CTL if they would like to use
			it.

sys/conf/options:	Add a new option, CTL_DISABLE, that prevents CTL
			from initializing.

ctl.c:			If CTL_DISABLE is turned on, don't initialize.

i386/conf/GENERIC,
amd64/conf/GENERIC:	Re-enable device ctl, and add the CTL_DISABLE
			option.
2013-03-04 21:18:45 +00:00
Davide Italiano
a4a3ce9919 Use C99 'bool' rather than Machish 'boolean_t'.
Requested by:	jhb
2013-03-04 21:09:22 +00:00
Davide Italiano
40e794ab19 MFcalloutng:
- Rewrite kevent() timeout implementation to allow sub-tick precision.
- Make the interval timings for EVFILT_TIMER more accurate. This also
removes a hack introduced in r238424.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 16:55:16 +00:00
Davide Italiano
cf5e4fe6bb MFcalloutng:
Fix kern_select() and sys_poll() so that they can handle sub-tick
precision for timeouts (in the same fashion it was done for nanosleep()
in r247797).

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 16:41:27 +00:00
Davide Italiano
4601bab1fb MFcalloutng (r244251 with minor changes):
Specify that precision of 0.5s is enough for resource limitation.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 16:25:12 +00:00
Davide Italiano
36d0b73102 MFcalloutng (r236314 by mav):
Specify that wakeup rate of 7.5-10Hz is enough for yarrow harvesting
thread.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 16:16:23 +00:00
Davide Italiano
c38250c9b9 MFcalloutng (r244255 by mav, with minor changes):
Specify that syslog doesn't need exactly 5 wakeups per second.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 16:07:55 +00:00
Davide Italiano
098176f0d0 MFcalloutng:
kern_nanosleep() is now converted to use tsleep_sbt(). With this change
nanosleep() and usleep() can handle sub-tick precision for timeouts.
Also, try to help event coalescing by passing to tsleep_sbt() a
precision value calculated as a percentage of the sleep time.
This percentage defaults to 5%, but it can be tuned according to users'
needs via the sysctl interface.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 15:57:41 +00:00
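The idea of a precision value derived as a percentage of the sleep time can be sketched as follows. The type mirrors the 32.32 fixed-point layout of sbintime_t, but the `demo_` names and `sleep_precision()` are assumptions for illustration; the kernel may well compute the value differently.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for sbintime_t: seconds in 32.32 fixed point. */
typedef int64_t demo_sbintime_t;
#define DEMO_SBT_1S ((demo_sbintime_t)1 << 32)

/*
 * Hypothetical helper: tolerance allowed around a sleep of the given
 * length, as a whole-number percentage of that length.
 */
static demo_sbintime_t
sleep_precision(demo_sbintime_t sleep_time, int percent)
{
	assert(sleep_time >= 0 && percent >= 0 && percent <= 100);
	/* Divide first so very long sleeps cannot overflow the multiply. */
	return (sleep_time / 100 * percent);
}
```

With the default of 5%, a 100-second sleep tolerates up to 5 seconds of slack, which gives the timer code room to fire several nearby events together.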
Davide Italiano
037637812d Fix build with DIAGNOSTIC/CALLOUT_PROFILING options turned on.
Reported by:	kib, David Wolfskill <david at catwhisker dot org>
Pointy-hat to:	davide
2013-03-04 15:03:52 +00:00
Davide Italiano
6b98f11545 MFcalloutng (r244249, r244306 by mav):
- Switch syscons from timeout() to callout_reset_flags() and specify that
precision is not important there -- anything from 20 to 30Hz will be fine.
- Reduce syscons "refresh" rate to 1-2Hz when console is in graphics mode
and there is nothing to do except some polling for keyboard.  Text mode
refresh would also be nice to have adaptive, but this change at least
should help laptop users who are running X.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 14:00:58 +00:00
Attilio Rao
198da1b2fa Merge from vmcontention:
As vm objects are type-stable there is no need to initialize the
resident splay tree pointer and the cache splay tree pointer in
_vm_object_allocate() but this could be done in the init UMA zone
handler.

The destructor UMA zone handler will further check that the condition is
retained at every destruction, to catch bugs.

Sponsored by:	EMC / Isilon storage division
Submitted by:	alc
2013-03-04 13:10:59 +00:00
Davide Italiano
24e48c6d5b MFcalloutng:
Introduce sbt variants of msleep(), msleep_spin(), pause(), tsleep() in
the KPI, allowing to specify timeout in 'sbintime_t' rather than ticks.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 12:48:41 +00:00
Davide Italiano
461537356a MFcalloutng:
Extend condvar(9) KPI introducing sbt variant of cv_timedwait. This
relies on the previously committed sleepq_set_timeout_sbt().

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 12:20:48 +00:00
Davide Italiano
7392d01c36 Style fix: remove useless braces. Sorry, my bad.
Submitted by:	bde
2013-03-04 11:55:32 +00:00
Davide Italiano
965ac611ec MFcalloutng:
Convert sleepqueue(9) bits to the new callout KPI. Take advantage of
the possibility to run callback directly from hw interrupt context.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 11:51:46 +00:00
Davide Italiano
dbd2e1677f MFcalloutng (r244355):
Make loadavg calculation callout direct. There are several reasons for it:
 - it is very simple and isn't worth a context switch to SWI;
 - since SWI is no longer used here, we can remove the twelve-year-old hack
excluding this SWI from the loadavg statistics;
 - it fixes the problem when an eventtimer (HPET) shares an interrupt with some
other device and that interrupt thread is counted as a permanent loadavg of 1;
now loadavg is accounted before that interrupt thread is scheduled.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, Fabian Keil, markj
2013-03-04 11:22:19 +00:00
Davide Italiano
5b999a6be0 - Make callout(9) tickless, relying on eventtimers(4) as backend for
precise time event generation. This greatly improves granularity of
callouts, which are no longer constrained to wait for the next tick to be
scheduled.
- Extend the callout KPI introducing a set of callout_reset_sbt* functions,
which take a sbintime_t as timeout argument. The new KPI also offers a
way for consumers to specify precision tolerance they allow, so that
callout can coalesce events and reduce number of interrupts as well as
potentially avoid scheduling a SWI thread.
- Introduce support for dispatching callouts directly from hardware
interrupt context, specifying an additional flag. This feature should be
used carefully, as long as interrupt context has some limitations
(e.g. no sleeping locks can be held).
- Enhance mechanisms to gather information about the callwheel, introducing
a new sysctl to obtain stats.

This change breaks the KBI. struct callout fields have been changed, in
particular 'int ticks' (4 bytes) has been replaced with 'sbintime_t'
(8 bytes) and another 'sbintime_t' field was added for precision.

Together with:	mav
Reviewed by:	attilio, bde, luigi, phk
Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo (amd64, sparc64), marius (sparc64), ian (arm),
		markj (amd64), mav, Fabian Keil
2013-03-04 11:09:56 +00:00
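To make the move from tick counts to sbintime_t concrete, here is a small sketch of converting ticks into a 32.32 fixed-point time value. The `demo_` names and `ticks_to_sbt()` are illustrative, not the kernel's own helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for sbintime_t: 64-bit, seconds in 32.32 fixed point. */
typedef int64_t demo_sbintime_t;
#define DEMO_SBT_1S ((demo_sbintime_t)1 << 32)

/* The wider type is what grows a 4-byte 'int ticks' field to 8 bytes. */
_Static_assert(sizeof(demo_sbintime_t) == 8, "fixed-point time is 64-bit");

/* Hypothetical helper: convert a tick count at rate hz to fixed point. */
static demo_sbintime_t
ticks_to_sbt(int ticks, int hz)
{
	assert(hz > 0);
	return ((demo_sbintime_t)ticks * DEMO_SBT_1S / hz);
}
```

At hz = 1000 a single tick is 4294967 units, so intervals far shorter than one tick remain representable, which is what allows sub-tick timeouts.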
Olivier Houchard
8fd49af627 If we're using a PIPT L2 cache, only merge 2 segments if both the virtual
and the physical addresses are contiguous.

Submitted by:	Thomas Skibo <ThomasSkibo@sbcglobal.net>
2013-03-04 10:41:54 +00:00
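The merge condition above amounts to checking contiguity in both address spaces. The sketch below uses a hypothetical segment struct and helper name; it is not the busdma code the commit touches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor for this sketch. */
struct demo_seg {
	uintptr_t va;	/* virtual start address */
	uint64_t  pa;	/* physical start address */
	uint64_t  len;	/* length in bytes */
};

/* Merge only when next starts exactly where prev ends, virtually and physically. */
static bool
can_merge(const struct demo_seg *prev, const struct demo_seg *next)
{
	assert(prev != NULL && next != NULL);
	bool va_contig = (prev->va + prev->len == next->va);
	bool pa_contig = (prev->pa + prev->len == next->pa);
	return (va_contig && pa_contig);
}
```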
Adrian Chadd
bdb9fa5c87 add a method to set/clear the VMF field in the TX descriptor.
Obtained from:	Qualcomm Atheros
2013-03-04 07:40:49 +00:00
Eitan Adler
1eb9ea583b Remove check for NULL prior to free(9) and m_freem(9).
Approved by:	cperciva (mentor)
2013-03-04 02:21:34 +00:00
Pawel Jakub Dawidek
8cb539f18f For some reason, when I started to pass filedescent structures instead of
pointers to the file structure, receiving descriptors stopped working when
at least a few kilobytes of data are sent along with them. In the kernel, the
soreceive_generic() function doesn't see the control mbuf as the first mbuf and
unp_externalize() is never called; the first 6(?) kilobytes of data are missing
as well on the receiving end.

This breaks for example tmux.

I don't know yet why going from 8 bytes to sizeof(struct filedescent) per
descriptor (or even to 16 bytes per descriptor) breaks things, but to
work-around it for now use 8 bytes per file descriptor at the cost of memory
allocation.

Reported by:	flo, Diane Bruce, Jan Beich <jbeich@tormail.org>
Simple testcase provided by:	mjg
2013-03-03 23:39:30 +00:00
Pawel Jakub Dawidek
5f39e56581 Use dedicated malloc type for filecaps-related data, so we can detect any
memory leaks easier.
2013-03-03 23:25:45 +00:00
Pawel Jakub Dawidek
a6157c3d61 Plug memory leaks in file descriptors passing. 2013-03-03 23:23:35 +00:00
Ulrich Spörlein
21e0559cbc Fix 'make depend' 2013-03-03 16:17:09 +00:00
Davide Italiano
3f555c45eb callwheelmask and callwheelsize are always greater than zero.
Switch their type to u_int.
2013-03-03 15:01:33 +00:00
Davide Italiano
0fb285b716 Remove a couple of unused includes. 2013-03-03 14:47:02 +00:00
Alexander Motin
4514d6fa18 MFcalloutng:
Some whitespace fixes.
2013-03-03 09:11:24 +00:00
Rui Paulo
e0dffa2de2 Remove the extra parentheses from the cv_init() macro. They are not
necessary because we already use parentheses in zfs_cv_init().

This fixes a long standing bug where there would be an extra ")" at the
end of the string. This extra parenthesis would show up in the WCHAN of
the process (top, stty status, etc.).
2013-03-03 06:42:36 +00:00
Attilio Rao
03e78eac37 Fix-up r247622 by also renaming pv_list iterator into the xen
pmap verbatim copy.

Sponsored by:	EMC / Isilon storage division
Reported by:	tinderbox
2013-03-03 01:02:57 +00:00
Alexander Motin
27eae7e9ad Add protective parentheses for macro argument, missed in r247671. 2013-03-02 22:41:06 +00:00
Alexander Motin
25e533d3e5 Polish few spaces/tabs. 2013-03-02 22:28:20 +00:00
Alexander Motin
d4d29475e6 MFcalloutng:
Give OFED Linux wrapper own "expires" field instead of abusing callout's
c_time, which will change its type and units with calloutng commit.
2013-03-02 22:19:17 +00:00
Pawel Jakub Dawidek
378a73d1bd Regen after r247667. 2013-03-02 21:12:54 +00:00
Pawel Jakub Dawidek
7493f24ee6 - Implement two new system calls:
	int bindat(int fd, int s, const struct sockaddr *addr, socklen_t addrlen);
	int connectat(int fd, int s, const struct sockaddr *name, socklen_t namelen);

  which allow to bind and connect respectively to a UNIX domain socket with a
  path relative to the directory associated with the given file descriptor 'fd'.

- Add manual pages for the new syscalls.

- Make the new syscalls available for processes in capability mode sandbox.

- Add capability rights CAP_BINDAT and CAP_CONNECTAT that have to be present on
  the directory descriptor for the syscalls to work.

- Update audit(4) to support those two new syscalls and to handle path
  in sockaddr_un structure relative to the given directory descriptor.

- Update procstat(1) to recognize the new capability rights.

- Document the new capability rights in cap_rights_limit(2).

Sponsored by:	The FreeBSD Foundation
Discussed with:	rwatson, jilles, kib, des
2013-03-02 21:11:30 +00:00
Attilio Rao
737a61a1ee Garbage collect NTFS bits which are now completely disconnected from
the tree for a few months.

This patch is not targeted for MFC.
2013-03-02 18:40:04 +00:00
Attilio Rao
0f90e981cb Remove ntfs headers dependency for g_label_ntfs.c by redefining the
used structs and values.

This patch is not targeted for MFC.
2013-03-02 18:23:59 +00:00
Alan Cox
55f33f2caf The value held by the vm object's field pg_color is only considered
valid if the flag OBJ_COLORED is set.  Since _vm_object_allocate()
doesn't set this flag, it needn't initialize pg_color.

Sponsored by:	EMC / Isilon Storage Division
2013-03-02 18:07:29 +00:00
Attilio Rao
4eb0218ace Garbage collect PORTALFS bits which are now completely disconnected from
the tree for a few months.

This patch is not targeted for MFC.
2013-03-02 16:43:28 +00:00
Attilio Rao
f51fb78533 Garbage collect CODAFS bits which are now completely disconnected from
the tree for a few months.

This patch is not targeted for MFC.
2013-03-02 16:30:18 +00:00
Marius Strobl
4495286fb2 - Complete r231621 by also blacklisting the bridge used by VMware for PCIe
  devices. While at it, update the comment now that we know that MSI-X
  doesn't work with ESXi 5.1 for Intel 82576 either and the underlying issue
  is a bug in the MSI-X allocation code of the hypervisor.
  Reported by: Harald Schmalzbauer
- Make the nomatch table const.

MFC after:	1 week
2013-03-02 15:54:02 +00:00
Attilio Rao
67f1f66fc7 Garbage collect XFS bits which are now already completely disconnected
from the tree for a few months.

This is not targeted for MFC.
2013-03-02 15:33:54 +00:00
Attilio Rao
258bee160c Garbage collect HPFS bits which are now already completely disconnected
from the tree for a few months (please note that the userland bits
were already disconnected a long time ago, thus there is no need
to update the OLD* entries).

This is not targeted for MFC.
2013-03-02 14:54:33 +00:00
Alexander V. Chernikov
39bddcde96 Fix callout expiring dynamic rules.
PR:		kern/175530
Submitted by:	Vladimir Spiridenkov <vs@gtn.ru>
MFC after:	2 weeks
2013-03-02 14:47:10 +00:00
Attilio Rao
b38d37f7b5 Merge from vmc-playground branch:
Rename the pv_entry_t iterator from pv_list to pv_next.
Besides being more correct technically (as the name seems to suggest
this is a list while it is an iterator), it will also be needed by
vm_radix work to avoid a nameclash on macro expansions.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc, jeff
Tested by:	flo, pho, jhb, davide
2013-03-02 14:19:08 +00:00
Marius Strobl
e8aabc79db - Revert the part of r247601 which turned the overtemperature and power fail
  interrupt shutdown handlers into filters. Shutdown_nice(9) acquires a sleep
  lock, which filters shouldn't do. It also seems that kern_reboot(9) still
  may require Giant to be held.
- Correct an incorrect argument to shutdown_nice(9).

Submitted by:	bde
2013-03-02 13:08:13 +00:00
Marius Strobl
562799bb30 Revert the part of r247600 which turned the overtemperature and power fail
interrupt shutdown handlers into filters. Shutdown_nice(9) acquires a sleep
lock, which filters shouldn't do. It also seems that kern_reboot(9) still
may require Giant to be held.

Submitted by:	bde
2013-03-02 13:04:58 +00:00
Jilles Tjoelker
6d6a91c50f nullfs: Improve f_flags in statfs().
Include some flags of the nullfs mount itself:
MNT_RDONLY, MNT_NOEXEC, MNT_NOSUID, MNT_UNION, MNT_NOSYMFOLLOW.

This allows userland code calling statfs() or fstatfs() to see these flags.
In particular, this allows opendir() to detect that a -t nullfs -o union
mount needs deduplication (otherwise at least . and .. are returned twice)
and allows rtld to detect a -t nullfs -o noexec mount as noexec.

Turn off the MNT_ROOTFS flag from the underlying filesystem because the
nullfs mount is definitely not the root filesystem.

Reviewed by:	kib
MFC after:	1 week
2013-03-02 12:42:23 +00:00
Pawel Jakub Dawidek
6d4e99aaef If the target file already exists, check for the CAP_UNLINKAT capability right
on the target directory descriptor, but only if this is renameat(2) and real
target directory descriptor is given (not AT_FDCWD). Without this fix regular
rename(2) fails if the target file already exists.

Reported by:	Michael Butler <imb@protected-networks.net>
Reported by:	Larry Rosenman <ler@lerctr.org>
Sponsored by:	The FreeBSD Foundation
2013-03-02 09:58:47 +00:00
Adrian Chadd
fe138cc2af Disable the ctl driver in GENERIC.
It unfortunately steals a fair chunk of RAM at startup even if it's not
actively used, which prevents FreeBSD VMs of 128MB from successfully
booting and running.
2013-03-02 08:12:41 +00:00
Andrew Turner
e40f53aa44 Move some virtual memory constants to the top of the file where they are on
other architectures [1].

While here:
 - Remove an unused and commented out include.
 - Add a comment describing the file that other copies have.
 - Fix the style of the defines and add a comment on what each one is.

Suggested by:	[1] alc
2013-03-02 05:02:29 +00:00
Andrew Turner
6f02c16b63 Build the Raspberry Pi dtb file when building the kernel so we can copy it
to the boot partition for U-Boot.
2013-03-02 03:23:14 +00:00
Andrew Turner
61fc9468e0 Ensure the stack is correctly aligned before calling the first C function. 2013-03-02 02:19:04 +00:00
Pawel Jakub Dawidek
1dc31587bf Regen after r247602. 2013-03-02 00:55:09 +00:00
Pawel Jakub Dawidek
2609222ab4 Merge Capsicum overhaul:
- Capability is no longer separate descriptor type. Now every descriptor
  has set of its own capability rights.

- The cap_new(2) system call is left, but it is no longer documented and
  should not be used in new code.

- The new syscall cap_rights_limit(2) should be used instead of
  cap_new(2), which limits capability rights of the given descriptor
  without creating a new one.

- The cap_getrights(2) syscall is renamed to cap_rights_get(2).

- If CAP_IOCTL capability right is present we can further reduce allowed
  ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed
  ioctls can be retrieved with cap_ioctls_get(2) syscall.

- If CAP_FCNTL capability right is present we can further reduce fcntls
  that can be used with the new cap_fcntls_limit(2) syscall and retrieve
  them with cap_fcntls_get(2).

- To support ioctl and fcntl white-listing the filedesc structure was
  heavily modified.

- The audit subsystem, kdump and procstat tools were updated to
  recognize new syscalls.

- Capability rights were revised and even though I tried hard to provide
  backward API and ABI compatibility there are some incompatible changes
  that are described in detail below:

	CAP_CREATE old behaviour:
	- Allow for openat(2)+O_CREAT.
	- Allow for linkat(2).
	- Allow for symlinkat(2).
	CAP_CREATE new behaviour:
	- Allow for openat(2)+O_CREAT.

	Added CAP_LINKAT:
	- Allow for linkat(2). ABI: Reuses CAP_RMDIR bit.
	- Allow to be target for renameat(2).

	Added CAP_SYMLINKAT:
	- Allow for symlinkat(2).

	Removed CAP_DELETE. Old behaviour:
	- Allow for unlinkat(2) when removing non-directory object.
	- Allow to be source for renameat(2).

	Removed CAP_RMDIR. Old behaviour:
	- Allow for unlinkat(2) when removing directory.

	Added CAP_RENAMEAT:
	- Required for source directory for the renameat(2) syscall.

	Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR):
	- Allow for unlinkat(2) on any object.
	- Required if target of renameat(2) exists and will be removed by this
	  call.

	Removed CAP_MAPEXEC.

	CAP_MMAP old behaviour:
	- Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and
	  PROT_WRITE.
	CAP_MMAP new behaviour:
	- Allow for mmap(2)+PROT_NONE.

	Added CAP_MMAP_R:
	- Allow for mmap(PROT_READ).
	Added CAP_MMAP_W:
	- Allow for mmap(PROT_WRITE).
	Added CAP_MMAP_X:
	- Allow for mmap(PROT_EXEC).
	Added CAP_MMAP_RW:
	- Allow for mmap(PROT_READ | PROT_WRITE).
	Added CAP_MMAP_RX:
	- Allow for mmap(PROT_READ | PROT_EXEC).
	Added CAP_MMAP_WX:
	- Allow for mmap(PROT_WRITE | PROT_EXEC).
	Added CAP_MMAP_RWX:
	- Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC).

	Renamed CAP_MKDIR to CAP_MKDIRAT.
	Renamed CAP_MKFIFO to CAP_MKFIFOAT.
	Renamed CAP_MKNOD to CAP_MKNODAT.

	CAP_READ old behaviour:
	- Allow pread(2).
	- Disallow read(2), readv(2) (if there is no CAP_SEEK).
	CAP_READ new behaviour:
	- Allow read(2), readv(2).
	- Disallow pread(2) (CAP_SEEK was also required).

	CAP_WRITE old behaviour:
	- Allow pwrite(2).
	- Disallow write(2), writev(2) (if there is no CAP_SEEK).
	CAP_WRITE new behaviour:
	- Allow write(2), writev(2).
	- Disallow pwrite(2) (CAP_SEEK was also required).

	Added convenient defines:

	#define	CAP_PREAD		(CAP_SEEK | CAP_READ)
	#define	CAP_PWRITE		(CAP_SEEK | CAP_WRITE)
	#define	CAP_MMAP_R		(CAP_MMAP | CAP_SEEK | CAP_READ)
	#define	CAP_MMAP_W		(CAP_MMAP | CAP_SEEK | CAP_WRITE)
	#define	CAP_MMAP_X		(CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL)
	#define	CAP_MMAP_RW		(CAP_MMAP_R | CAP_MMAP_W)
	#define	CAP_MMAP_RX		(CAP_MMAP_R | CAP_MMAP_X)
	#define	CAP_MMAP_WX		(CAP_MMAP_W | CAP_MMAP_X)
	#define	CAP_MMAP_RWX		(CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X)
	#define	CAP_RECV		CAP_READ
	#define	CAP_SEND		CAP_WRITE

	#define	CAP_SOCK_CLIENT \
		(CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \
		 CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN)
	#define	CAP_SOCK_SERVER \
		(CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \
		 CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \
		 CAP_SETSOCKOPT | CAP_SHUTDOWN)

	Added defines for backward API compatibility:

	#define	CAP_MAPEXEC		CAP_MMAP_X
	#define	CAP_DELETE		CAP_UNLINKAT
	#define	CAP_MKDIR		CAP_MKDIRAT
	#define	CAP_RMDIR		CAP_UNLINKAT
	#define	CAP_MKFIFO		CAP_MKFIFOAT
	#define	CAP_MKNOD		CAP_MKNODAT
	#define	CAP_SOCK_ALL		(CAP_SOCK_CLIENT | CAP_SOCK_SERVER)

Sponsored by:	The FreeBSD Foundation
Reviewed by:	Christoph Mallon <christoph.mallon@gmx.de>
Many aspects discussed with:	rwatson, benl, jonathan
ABI compatibility discussed with:	kib
2013-03-02 00:53:12 +00:00
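The composite defines listed above, such as CAP_PREAD being CAP_SEEK | CAP_READ, imply that a descriptor must hold every bit of a composite right. The sketch below shows that check with invented bit values; the `DEMO_` constants and `rights_allow()` are illustrative and do not reflect the real header values or the kernel's checking routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; not the real capability values. */
#define DEMO_CAP_READ	0x1ULL
#define DEMO_CAP_WRITE	0x2ULL
#define DEMO_CAP_SEEK	0x4ULL
#define DEMO_CAP_PREAD	(DEMO_CAP_SEEK | DEMO_CAP_READ)
#define DEMO_CAP_PWRITE	(DEMO_CAP_SEEK | DEMO_CAP_WRITE)

/* Hypothetical check: every bit in needed must be present in held. */
static bool
rights_allow(uint64_t held, uint64_t needed)
{
	assert(needed != 0);
	return ((held & needed) == needed);
}
```

Under this model, a descriptor limited to the read bit alone permits a plain read but not a positioned read, matching the new CAP_READ behaviour described in the message.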
Marius Strobl
11be09b056 - Apparently, it's no longer a problem to call shutdown_nice(9) from within
  an interrupt filter (some other drivers in the tree do the same). So
  change the overtemperature and power fail interrupts from handlers to
  filters in order to simplify the code and get rid of the !INTR_MPSAFE
  handlers.
- Mark unused parameters as such.
- Use NULL instead of 0 for pointers.

MFC after:	1 week
2013-03-02 00:41:51 +00:00
Marius Strobl
7e026d15d5 - While Netra X1 generally show no ill effects when registering a power
  fail interrupt handler, there seems to be either a broken batch of them
  or a tendency to develop a defect which causes this interrupt to fire
  inadvertently. Given that apart from this problem these machines work
  just fine, add a tunable allowing the setup of the power fail interrupt
  to be disabled.
  While at it, remove the DEBUGGER_ON_POWERFAIL compile time option and
  make that behavior also selectable via the newly added tunable.
- Apparently, it's no longer a problem to call shutdown_nice(9) from within
  an interrupt filter (some other drivers in the tree do the same). So
  change the power fail interrupt from a handler to a filter in order to
  simplify the code and get rid of a !INTR_MPSAFE handler.
- Use NULL instead of 0 for pointers.

MFC after:	1 week
2013-03-02 00:37:31 +00:00
Xin LI
69136e792a Fix wrong assignment.
Submitted by:	Sascha Wildner <saw online de>
Obtained from:	DragonFly rev 9568dd07a22a136e380e6c19a8ea188eb92976d5
MFC after:	2 weeks
2013-03-01 23:21:18 +00:00
Xin LI
0e47e251a9 Fix a typo in mfi_stp_cmd() that would give wrong assignment.
Submitted by:	Sascha Wildner <saw online de>
Obtained from:	DragonFly rev 0dc98fff2206d7bb78ce5e07ac34d6954e4bd96a
MFC after:	3 days
2013-03-01 23:18:20 +00:00
Xin LI
5c737b11df MFV r247575:
Import a fix to tighten the assertion on SPA versions from vendor (Illumos).

Illumos ZFS issue:

  3543 Feature flags causes assertion in spa.c to miss certain cases

MFC after:	2 weeks
2013-03-01 22:20:13 +00:00
Marius Strobl
20132a2238 Initialize count in order to appease clang.
Submitted by:	delphij
2013-03-01 22:09:08 +00:00