96044 Commits

Author SHA1 Message Date
kib
74ea46d9ee Import the likely() compat macro.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2013-03-05 09:07:01 +00:00
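
The commit above does not show the macro body. As a rough sketch, assuming the usual GCC/Clang definition built on __builtin_expect(), a likely()/unlikely() pair typically looks like the following; the sum_bytes() helper is hypothetical and exists only to show the hint in use.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definitions: branch hints built on the GCC/Clang builtin. */
#ifndef likely
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)
#endif

/* Hypothetical helper: sum a buffer, treating a NULL buffer as the rare case. */
static long
sum_bytes(const unsigned char *buf, size_t len)
{
	long total = 0;
	size_t i;

	if (unlikely(buf == NULL))
		return (-1);
	for (i = 0; i < len; i++)
		total += buf[i];
	return (total);
}
```

The hints only influence code layout and branch prediction; the value of the condition is unchanged.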
glebius
793da79af8 Simplify TAILQ usage and avoid additional memory allocations.
Tested by:	Eugene M. Zheganin <emz norma.perm.ru>
Sponsored by:	Nginx, Inc
2013-03-05 08:08:16 +00:00
bryanv
78bc791543 Only set the barrier flag if the feature was negotiated
When the VirtIO barrier feature is not negotiated, the driver
must enforce the proper ordering for BIO_ORDERED BIOs. All the
in-flight BIOs must complete before starting the BIO, and the
ordered BIO must complete before subsequent BIOs can start.

Also fix a few whitespace nits.

Reported by:	neel
Approved by:	grehan (mentor)
MFC after:	3 days
2013-03-05 07:00:05 +00:00
jfv
879e516752 Fix a small but important bug: a task drain was mistakenly
being compiled only when LEGACY_TX was set, which means you would
not get the drain when needed on detach!!

Thanks to Bryan Venteicher (bryanv@freebsd.org) for catching this
little gremlin!! :)
2013-03-04 23:15:07 +00:00
jfv
7b20f97709 First, sync to internal shared code, and then
Fixes:
	- flow control - don't override user value on re-init
	- fix to make 1G optics work correctly
	- change to interrupt enabling - some bits were incorrect
	  for certain hardware.
	- certain stats fixes, remove a duplicate increment of
	  ierror, thanks to Scott Long for pointing these out.
	- shared code link interface changed, requiring some
	  core code changes to accommodate this.
	- add an m_adj() to ETHER_ALIGN on the receive side, this
	  was requested by Mike Karels, thanks Mike.
	- Multicast code corrections also thanks to Mike Karels.
2013-03-04 23:07:40 +00:00
davide
bfc7c5f119 - Bump __FreeBSD_version after recent callout(9) changes.
- Add an entry in UPDATING to notify users about breakages.
2013-03-04 22:41:49 +00:00
gibbs
7829309113 Fix assertion failure when using userland DTrace probes from
the pid provider on a kernel compiled with INVARIANTS.

sys/cddl/contrib/opensolaris/uts/intel/dtrace/fasttrap_isa.c:
	In fasttrap_probe_pid(), attempts to write to the
	address space of the thread that fired the probe
	must be performed with the process of the thread
	held.  Use _PHOLD() to ensure this is the case.

	In fasttrap_probe_pid(), use proc_write_regs() instead
	of calling set_regs() directly.  proc_write_regs()
	performs invariant checks to verify the calling
	environment of set_regs().  PROC_LOCK()/UNLOCK() around
	the call to proc_write_regs() so that its invariants
	are satisfied.

Sponsored by:	Spectra Logic Corporation
Reviewed by:	gnn, rpaulo
MFC after:	1 week
2013-03-04 22:07:36 +00:00
davide
63dc09eae5 Complete r247813:
Use true/false instead of TRUE/FALSE.

Reported by:	attilio
Requested by:	jhb
2013-03-04 21:52:12 +00:00
mav
aabe85e339 Add quirk to enable headphones redirection on Lenovo X220.
PR:		kern/174876
MFC after:	1 week
2013-03-04 21:20:13 +00:00
ken
d11db422c6 Re-enable CTL in GENERIC on i386 and amd64, but turn on the CTL disable
tunable by default.

This will allow GENERIC configurations to boot on small memory boxes, but
not require end users who want to use CTL to recompile their kernel.  They
can simply set kern.cam.ctl.disable=0 in loader.conf.

The eventual solution to the memory usage problem is to change the way
CTL allocates memory to be more configurable, but this should fix things
for small memory situations in the mean time.

UPDATING:		Explain the change in the CTL configuration, and
			how users can enable CTL if they would like to use
			it.

sys/conf/options:	Add a new option, CTL_DISABLE, that prevents CTL
			from initializing.

ctl.c:			If CTL_DISABLE is turned on, don't initialize.

i386/conf/GENERIC,
amd64/conf/GENERIC:	Re-enable device ctl, and add the CTL_DISABLE
			option.
2013-03-04 21:18:45 +00:00
davide
322c45390e Use C99 'bool' rather than Machish 'boolean_t'.
Requested by:	jhb
2013-03-04 21:09:22 +00:00
davide
844e77458d MFcalloutng:
- Rewrite kevent() timeout implementation to allow sub-tick precision.
- Make the interval timings for EVFILT_TIMER more accurate. This also
removes a hack introduced in r238424.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 16:55:16 +00:00
davide
473ef72351 MFcalloutng:
Fix kern_select() and sys_poll() so that they can handle sub-tick
precision for timeouts (in the same fashion it was done for nanosleep()
in r247797).

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 16:41:27 +00:00
davide
0aca596713 MFcalloutng (r244251 with minor changes):
Specify that precision of 0.5s is enough for resource limitation.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 16:25:12 +00:00
davide
9157a1f8c6 MFcalloutng (r236314 by mav):
Specify that wakeup rate of 7.5-10Hz is enough for yarrow harvesting
thread.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 16:16:23 +00:00
davide
40e58fe548 MFcalloutng (r244255 by mav, with minor changes):
Specify that syslog doesn't need exactly 5 wakeups per second.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 16:07:55 +00:00
davide
4d11390875 MFcalloutng:
kern_nanosleep() is now converted to use tsleep_sbt(). With this change
nanosleep() and usleep() can handle sub-tick precision for timeouts.
Also, help event coalescing by passing tsleep_sbt() a precision value
calculated as a percentage of the sleep time.
This percentage defaults to 5%, but it can be tuned to the user's
needs via the sysctl interface.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 15:57:41 +00:00
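
As an illustration of the "precision as a percentage of the sleep time" idea in the commit above, here is a minimal userland sketch. The sbintime_t typedef and SBT_1S mirror the FreeBSD definitions (64-bit, 32.32 fixed point); sleep_precision() is a hypothetical helper, and the kernel's actual computation may differ (for instance, it may use a shift rather than a division).

```c
#include <assert.h>
#include <stdint.h>

/* 32.32 fixed-point seconds, mirroring the FreeBSD type. */
typedef int64_t sbintime_t;
#define SBT_1S	((sbintime_t)1 << 32)

/*
 * Hypothetical helper: the tolerance a sleeper is willing to accept,
 * expressed as pct percent of the requested sleep time.
 */
static sbintime_t
sleep_precision(sbintime_t sleep_time, unsigned int pct)
{
	return (sleep_time / 100 * pct);
}
```

With the default of 5%, a 100 second sleep would tolerate up to 5 seconds of slack, which gives the callout code room to coalesce it with neighbouring events.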
davide
2ca848c3f6 Fix build with DIAGNOSTIC/CALLOUT_PROFILING options turned on.
Reported by:	kib, David Wolfskill <david at catwhisker dot org>
Pointy-hat to:	davide
2013-03-04 15:03:52 +00:00
davide
c3613bbf40 MFcalloutng (r244249, r244306 by mav):
- Switch syscons from timeout() to callout_reset_flags() and specify that
precision is not important there -- anything from 20 to 30Hz will be fine.
- Reduce syscons "refresh" rate to 1-2Hz when console is in graphics mode
and there is nothing to do except some polling for keyboard.  Text mode
refresh would also be nice to have adaptive, but this change at least
should help laptop users who are running X.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 14:00:58 +00:00
attilio
e5bdd2f06e Merge from vmcontention:
As vm objects are type-stable there is no need to initialize the
resident splay tree pointer and the cache splay tree pointer in
_vm_object_allocate() but this could be done in the init UMA zone
handler.

The destructor UMA zone handler will further check that the condition is
retained at every destruction and catch bugs.

Sponsored by:	EMC / Isilon storage division
Submitted by:	alc
2013-03-04 13:10:59 +00:00
davide
8374b0e141 MFcalloutng:
Introduce sbt variants of msleep(), msleep_spin(), pause(), tsleep() in
the KPI, allowing the timeout to be specified in 'sbintime_t' rather than ticks.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 12:48:41 +00:00
davide
7395c58e52 MFcalloutng:
Extend the condvar(9) KPI by introducing an sbt variant of cv_timedwait. This
relies on the previously committed sleepq_set_timeout_sbt().

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 12:20:48 +00:00
davide
9392b519d7 Style fix: remove useless braces. Sorry, my bad.
Submitted by:	bde
2013-03-04 11:55:32 +00:00
davide
a893437175 MFcalloutng:
Convert sleepqueue(9) bits to the new callout KPI. Take advantage of
the possibility to run callback directly from hw interrupt context.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, markj, Fabian Keil
2013-03-04 11:51:46 +00:00
davide
9ad2265733 MFcalloutng (r244355):
Make loadavg calculation callout direct. There are several reasons for it:
 - it is very simple and is not worth a context switch to SWI;
 - since SWI is no longer used here, we can remove a twelve-year-old hack
excluding this SWI from the loadavg statistics;
 - it fixes a problem when the eventtimer (HPET) shares an interrupt with some
other device and that interrupt thread is counted as a permanent loadavg of 1;
now loadavg is accounted before that interrupt thread is scheduled.

Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo, marius, ian, Fabian Keil, markj
2013-03-04 11:22:19 +00:00
davide
431035cf16 - Make callout(9) tickless, relying on eventtimers(4) as backend for
precise time event generation. This greatly improves granularity of
callouts, which are no longer constrained to wait for the next tick to be
scheduled.
- Extend the callout KPI introducing a set of callout_reset_sbt* functions,
which take a sbintime_t as timeout argument. The new KPI also offers a
way for consumers to specify precision tolerance they allow, so that
callout can coalesce events and reduce number of interrupts as well as
potentially avoid scheduling a SWI thread.
- Introduce support for dispatching callouts directly from hardware
interrupt context, specifying an additional flag. This feature should be
used carefully, since interrupt context has some limitations
(e.g. no sleeping locks can be held).
- Enhance mechanisms to gather information about the callwheel, introducing
a new sysctl to obtain stats.

This change breaks the KBI. struct callout fields have been changed, in
particular 'int ticks' (4 bytes) has been replaced with 'sbintime_t'
(8 bytes) and another 'sbintime_t' field was added for precision.

Together with:	mav
Reviewed by:	attilio, bde, luigi, phk
Sponsored by:	Google Summer of Code 2012, iXsystems inc.
Tested by:	flo (amd64, sparc64), marius (sparc64), ian (arm),
		markj (amd64), mav, Fabian Keil
2013-03-04 11:09:56 +00:00
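
To make the move from 'int ticks' to 'sbintime_t' described above more concrete, the following userland sketch shows why a sub-tick timeout is representable in the new type but not in ticks. The typedef and SBT_1S mirror the FreeBSD definitions; us_to_sbt() and sbt_to_ticks_roundup() are hypothetical helpers written for this example, not kernel functions, and they do not guard against overflow for very large inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Signed 64-bit, 32.32 fixed-point seconds, mirroring the FreeBSD type. */
typedef int64_t sbintime_t;
#define SBT_1S	((sbintime_t)1 << 32)

/* Hypothetical helper: convert microseconds to sbintime_t (truncating). */
static sbintime_t
us_to_sbt(int64_t us)
{
	return (us * SBT_1S / 1000000);
}

/* Hypothetical helper: whole ticks needed to cover sbt at hz, rounded up. */
static int64_t
sbt_to_ticks_roundup(sbintime_t sbt, int hz)
{
	return ((sbt * hz + SBT_1S - 1) / SBT_1S);
}
```

At hz = 1000 a tick lasts 1 ms, so a 250 microsecond timeout has to be rounded up to one full tick, while us_to_sbt(250) keeps the requested interval at sub-tick resolution.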
cognet
e52f997818 If we're using a PIPT L2 cache, only merge 2 segments if both the virtual
and the physical addresses are contiguous.

Submitted by:	Thomas Skibo <ThomasSkibo@sbcglobal.net>
2013-03-04 10:41:54 +00:00
adrian
9b8f8df1c5 Add a method to set/clear the VMF field in the TX descriptor.
Obtained from:	Qualcomm Atheros
2013-03-04 07:40:49 +00:00
eadler
a0bd41720a Remove check for NULL prior to free(9) and m_freem(9).
Approved by:	cperciva (mentor)
2013-03-04 02:21:34 +00:00
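
The cleanup above relies on the deallocation routines accepting NULL. A userland analogue, using the C standard library's free() (which is defined to do nothing for a null pointer) rather than the kernel's free(9), might look like this; struct record and record_clear() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical structure with two optional heap allocations. */
struct record {
	char *name;
	char *note;
};

/*
 * Release both members.  No "if (ptr != NULL)" guards are needed
 * because free(NULL) is a no-op.
 */
static void
record_clear(struct record *r)
{
	free(r->name);
	free(r->note);
	r->name = NULL;
	r->note = NULL;
}
```

Resetting the pointers afterwards makes a second call to record_clear() harmless as well.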
pjd
c3b73942d3 For some reason, when I started to pass filedescent structures instead of
pointers to the file structure, receiving descriptors stopped working when at
least a few kilobytes of data are also being sent. In the kernel the
soreceive_generic() function doesn't see the control mbuf as the first mbuf and
unp_externalize() is never called; the first 6(?) kilobytes of data are missing
as well on the receiving end.

This breaks for example tmux.

I don't know yet why going from 8 bytes to sizeof(struct filedescent) per
descriptor (or even to 16 bytes per descriptor) breaks things, but to
work around it for now, use 8 bytes per file descriptor at the cost of memory
allocation.

Reported by:	flo, Diane Bruce, Jan Beich <jbeich@tormail.org>
Simple testcase provided by:	mjg
2013-03-03 23:39:30 +00:00
pjd
386f382f2d Use a dedicated malloc type for filecaps-related data, so we can detect any
memory leaks more easily.
2013-03-03 23:25:45 +00:00
pjd
1df614f5db Plug memory leaks in file descriptors passing.
2013-03-03 23:23:35 +00:00
uqs
82dd943a5e Fix 'make depend'
2013-03-03 16:17:09 +00:00
davide
bcd9ca99af callwheelmask and callwheelsize are always greater than zero.
Switch their type to u_int.
2013-03-03 15:01:33 +00:00
davide
ff0e3dad9e Remove a couple of unused includes.
2013-03-03 14:47:02 +00:00
mav
1cd8093cdd MFcalloutng:
Some whitespace fixes.
2013-03-03 09:11:24 +00:00
rpaulo
f050aa21ec Remove the extra parentheses from the cv_init() macro. They are not
necessary because we already use parentheses in zfs_cv_init().

This fixes a long standing bug where there would be an extra ")" at the
end of the string. This extra parenthesis would show up in the WCHAN of
the process (top, stty status, etc.).
2013-03-03 06:42:36 +00:00
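
The stray ")" described above is what one would expect when a parenthesized macro argument gets stringified. The sketch below reproduces the effect in isolation; the macro names and the skip_to_lowercase() helper are hypothetical, and the assumption that the name is derived by stringifying the argument and skipping leading punctuation is ours, not something the commit message states.

```c
#include <assert.h>
#include <string.h>

#define STRINGIFY(x)		#x
/* Extra parentheses around the argument end up inside the string. */
#define NAME_WRAPPED(cv)	STRINGIFY((cv))
/* Passing the argument through unchanged keeps the string clean. */
#define NAME_PLAIN(cv)		STRINGIFY(cv)

/* Hypothetical helper: skip characters up to the first lowercase letter. */
static const char *
skip_to_lowercase(const char *s)
{
	while (*s != '\0' && !(*s >= 'a' && *s <= 'z'))
		s++;
	return (s);
}
```

Skipping the leading "(&" hides the opening parenthesis, but the closing one stays at the end of the string, which matches the symptom of a trailing ")" in the wait channel name.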
attilio
be8012c4e6 Fix-up r247622 by also renaming pv_list iterator into the xen
pmap verbatim copy.

Sponsored by:	EMC / Isilon storage division
Reported by:	tinderbox
2013-03-03 01:02:57 +00:00
mav
b9da6c918f Add protective parentheses for macro argument, missed in r247671.
2013-03-02 22:41:06 +00:00
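
For readers unfamiliar with why macro arguments need protective parentheses, here is a small generic illustration. The SCALE_* macros are hypothetical and unrelated to the macro touched by the commit; they only show the operator-precedence problem.

```c
#include <assert.h>

/* Without parentheses the argument is pasted in as-is: 1 + 2 * 2. */
#define SCALE_UNSAFE(x)	(x * 2)
/* With parentheses the argument is evaluated first: (1 + 2) * 2. */
#define SCALE_SAFE(x)	((x) * 2)

static int
scale_unsafe_demo(void)
{
	return (SCALE_UNSAFE(1 + 2));
}

static int
scale_safe_demo(void)
{
	return (SCALE_SAFE(1 + 2));
}
```

The unsafe variant silently computes 1 + (2 * 2), which is why kernel style wraps every macro argument in parentheses.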
mav
a5e43a09af Polish few spaces/tabs.
2013-03-02 22:28:20 +00:00
mav
dc07b9e1fa MFcalloutng:
Give the OFED Linux wrapper its own "expires" field instead of abusing callout's
c_time, which will change its type and units with calloutng commit.
2013-03-02 22:19:17 +00:00
pjd
369ed4d4ad Regen after r247667.
2013-03-02 21:12:54 +00:00
pjd
702516e70b - Implement two new system calls:
	int bindat(int fd, int s, const struct sockaddr *addr, socklen_t addrlen);
	int connectat(int fd, int s, const struct sockaddr *name, socklen_t namelen);

  which allow binding and connecting, respectively, to a UNIX domain socket with a
  path relative to the directory associated with the given file descriptor 'fd'.

- Add manual pages for the new syscalls.

- Make the new syscalls available for processes in capability mode sandbox.

- Add capability rights CAP_BINDAT and CAP_CONNECTAT that have to be present on
  the directory descriptor for the syscalls to work.

- Update audit(4) to support those two new syscalls and to handle path
  in sockaddr_un structure relative to the given directory descriptor.

- Update procstat(1) to recognize the new capability rights.

- Document the new capability rights in cap_rights_limit(2).

Sponsored by:	The FreeBSD Foundation
Discussed with:	rwatson, jilles, kib, des
2013-03-02 21:11:30 +00:00
attilio
5d57dc997e Garbage collect NTFS bits, which have been completely disconnected from
the tree for a few months.

This patch is not targeted for MFC.
2013-03-02 18:40:04 +00:00
attilio
5775bdb2a4 Remove ntfs headers dependency for g_label_ntfs.c by redefining the
used structs and values.

This patch is not targeted for MFC.
2013-03-02 18:23:59 +00:00
alc
90d4aeb975 The value held by the vm object's field pg_color is only considered
valid if the flag OBJ_COLORED is set.  Since _vm_object_allocate()
doesn't set this flag, it needn't initialize pg_color.

Sponsored by:	EMC / Isilon Storage Division
2013-03-02 18:07:29 +00:00
attilio
59a3d435c9 Garbage collect PORTALFS bits, which have been completely disconnected from
the tree for a few months.

This patch is not targeted for MFC.
2013-03-02 16:43:28 +00:00
attilio
5d33ae7487 Garbage collect CODAFS bits, which have been completely disconnected from
the tree for a few months.

This patch is not targeted for MFC.
2013-03-02 16:30:18 +00:00
marius
718767a4c1 - Complete r231621 by also blacklisting the bridge used by VMware for PCIe
devices. While at it, update the comment now that we know that MSI-X
  doesn't work with ESXi 5.1 for Intel 82576 either and the underlying issue
  is a bug in the MSI-X allocation code of the hypervisor.
  Reported by: Harald Schmalzbauer
- Make the nomatch table const.

MFC after:	1 week
2013-03-02 15:54:02 +00:00
attilio
44df97db57 Garbage collect XFS bits, which have been completely disconnected
from the tree for a few months.

This is not targeted for MFC.
2013-03-02 15:33:54 +00:00