This fixes a build error at the same time (unused variable "item"), if
the kernel is compiled without INVARIANTS.
Reported by: Hartmann, O. <ohartman@zedat.fu-berlin.de> (build error)
Reviewed by: Konstantin Belousov (kib@)
Without this, read data is mis-interpreted. This could trigger a panic,
as was the case on one computer where computed "recsize" was zero,
leading to a call to g_read_page() asking for 0 bytes.
The early commit is done to facilitate the off-tree work on the
porting of the Radeon driver.
Sponsored by: The FreeBSD Foundation
Debugged and tested by: dumbbell
MFC after: 1 month
When the VirtIO barrier feature is not negotiated, the driver
must enforce the proper ordering for BIO_ORDERED BIOs. All the
in-flight BIOs must complete before starting the BIO, and the
ordered BIO must complete before subsequent BIOs can start.
Also fix a few whitespace nits.
Reported by: neel
Approved by: grehan (mentor)
MFC after: 3 days
being compiled only when setting LEGACY_TX, this means you would
not get the drain when needed on detach!!
Thanks to Bryan Venteicher (bryanv@freebsd.org) for catching this
little gremlin!! :)
Fixes:
- flow control - don't override user value on re-init
- fix to make 1G optics work correctly
- change to interrupt enabling - some bits were incorrect
for certain hardware.
- certain stats fixes, remove a duplicate increment of
ierror, thanks to Scott Long for pointing these out.
- shared code link interface changed, requiring some
core code changes to accomodate this.
- add an m_adj() to ETHER_ALIGN on the recieve side, this
was requested by Mike Karels, thanks Mike.
- Multicast code corrections also thanks to Mike Karels.
the pid provider on a kernel compiled with INVARIANTS.
sys/cddl/contrib/opensolaris/uts/intel/dtrace/fasttrap_isa.c:
In fasttrap_probe_pid(), attempts to write to the
address space of the thread that fired the probe
must be performed with the process of the thread
held. Use _PHOLD() to ensure this is the case.
In fasttrap_probe_pid(), use proc_write_regs() instead
of calling set_regs() directly. proc_write_regs()
performs invariant checks to verify the calling
environment of set_regs(). PROC_LOCK()/UNLOCK() around
the call to proc_write_regs() so that it's invariants
are satisfied.
Sponsored by: Spectra Logic Corporation
Reviewed by: gnn, rpaulo
MFC after: 1 week
tunable by default.
This will allow GENERIC configurations to boot on small memory boxes, but
not require end users who want to use CTL to recompile their kernel. They
can simply set kern.cam.ctl.disable=0 in loader.conf.
The eventual solution to the memory usage problem is to change the way
CTL allocates memory to be more configurable, but this should fix things
for small memory situations in the mean time.
UPDATING: Explain the change in the CTL configuration, and
how users can enable CTL if they would like to use
it.
sys/conf/options: Add a new option, CTL_DISABLE, that prevents CTL
from initializing.
ctl.c: If CTL_DISABLE is turned on, don't initialize.
i386/conf/GENERIC,
amd64/conf/GENERIC: Re-enable device ctl, and add the CTL_DISABLE
option.
- Rewrite kevent() timeout implementation to allow sub-tick precision.
- Make the interval timings for EVFILT_TIMER more accurate. This also
removes an hack introduced in r238424.
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, markj, Fabian Keil
Fix kern_select() and sys_poll() so that they can handle sub-tick
precision for timeouts (in the same fashion it was done for nanosleep()
in r247797).
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, markj, Fabian Keil
Specify that precision of 0.5s is enough for resource limitation.
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, markj, Fabian Keil
Specify that wakeup rate of 7.5-10Hz is enough for yarrow harvesting
thread.
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, markj, Fabian Keil
Specify that syslog doesn't need exactly 5 wakeups per second.
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, markj, Fabian Keil
kern_nanosleep() is now converted to use tsleep_sbt(). With this change
nanosleep() and usleep() can handle sub-tick precision for timeouts.
Also, try to help coalesce of events passing as argument to tsleep_bt()
a precision value calculated as a percentage of the sleep time.
This percentage is default 5%, but it can tuned according to users
need via the sysctl interface.
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, markj, Fabian Keil
- Switch syscons from timeout() to callout_reset_flags() and specify that
precision is not important there -- anything from 20 to 30Hz will be fine.
- Reduce syscons "refresh" rate to 1-2Hz when console is in graphics mode
and there is nothing to do except some polling for keyboard. Text mode
refresh would also be nice to have adaptive, but this change at least
should help laptop users who running X.
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, markj, Fabian Keil
As vm objects are type-stable there is no need to initialize the
resident splay tree pointer and the cache splay tree pointer in
_vm_object_allocate() but this could be done in the init UMA zone
handler.
The destructor UMA zone handler, will further check if the condition is
retained at every destruction and catch for bugs.
Sponsored by: EMC / Isilon storage division
Submitted by: alc
Introduce sbt variants of msleep(), msleep_spin(), pause(), tsleep() in
the KPI, allowing to specify timeout in 'sbintime_t' rather than ticks.
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, markj, Fabian Keil
Extend condvar(9) KPI introducing sbt variant of cv_timedwait. This
rely on the previously committed sleepq_set_timeout_sbt().
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, markj, Fabian Keil
Convert sleepqueue(9) bits to the new callout KPI. Take advantage of
the possibility to run callback directly from hw interrupt context.
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, markj, Fabian Keil
Make loadavg calculation callout direct. There are several reasons for it:
- it is very simple and doesn't worth context switch to SWI;
- since SWI is no longer used here, we can remove twelve years old hack,
excluding this SWI from from the loadavg statistics;
- it fixes problem when eventtimer (HPET) shares interrupt with some other
device, and that interrupt thread counted as permanent loadavg of 1; now
loadavg accounted before that interrupt thread is scheduled.
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, Fabian Keil, markj
precise time event generation. This greatly improves granularity of
callouts which are not anymore constrained to wait next tick to be
scheduled.
- Extend the callout KPI introducing a set of callout_reset_sbt* functions,
which take a sbintime_t as timeout argument. The new KPI also offers a
way for consumers to specify precision tolerance they allow, so that
callout can coalesce events and reduce number of interrupts as well as
potentially avoid scheduling a SWI thread.
- Introduce support for dispatching callouts directly from hardware
interrupt context, specifying an additional flag. This feature should be
used carefully, as long as interrupt context has some limitations
(e.g. no sleeping locks can be held).
- Enhance mechanisms to gather informations about callwheel, introducing
a new sysctl to obtain stats.
This change breaks the KBI. struct callout fields has been changed, in
particular 'int ticks' (4 bytes) has been replaced with 'sbintime_t'
(8 bytes) and another 'sbintime_t' field was added for precision.
Together with: mav
Reviewed by: attilio, bde, luigi, phk
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo (amd64, sparc64), marius (sparc64), ian (arm),
markj (amd64), mav, Fabian Keil
pointers to the file structure receiving descriptors stopped to work when also
at least few kilobytes of data is being send. In the kernel the
soreceive_generic() function doesn't see control mbuf as the first mbuf and
unp_externalize() is never called, first 6(?) kilobytes of data is missing as
well on receiving end.
This breaks for example tmux.
I don't know yet why going from 8 bytes to sizeof(struct filedescent) per
descriptor (or even to 16 bytes per descriptor) breaks things, but to
work-around it for now use 8 bytes per file descriptor at the cost of memory
allocation.
Reported by: flo, Diane Bruce, Jan Beich <jbeich@tormail.org>
Simple testcase provided by: mjg
necessary because we already use parenthesis in zfs_cv_init().
This fixes a long standing bug where there would be an extra ")" at the
end of the string. This extra parenthesis would show up in the WCHAN of
the process (top, stty status, etc.).
int bindat(int fd, int s, const struct sockaddr *addr, socklen_t addrlen);
int connectat(int fd, int s, const struct sockaddr *name, socklen_t namelen);
which allow to bind and connect respectively to a UNIX domain socket with a
path relative to the directory associated with the given file descriptor 'fd'.
- Add manual pages for the new syscalls.
- Make the new syscalls available for processes in capability mode sandbox.
- Add capability rights CAP_BINDAT and CAP_CONNECTAT that has to be present on
the directory descriptor for the syscalls to work.
- Update audit(4) to support those two new syscalls and to handle path
in sockaddr_un structure relative to the given directory descriptor.
- Update procstat(1) to recognize the new capability rights.
- Document the new capability rights in cap_rights_limit(2).
Sponsored by: The FreeBSD Foundation
Discussed with: rwatson, jilles, kib, des