1825 Commits

Author SHA1 Message Date
jhb
0dfdde5788 Sort NDHASGIANT.9 link properly. 2010-01-19 20:36:15 +00:00
gavin
4f0862f400 Xref sysctl(3)
Approved by:	ed (mentor)
2010-01-16 14:31:01 +00:00
ru
fdf6718f33 Use the newly brought %U macro. 2010-01-15 16:01:22 +00:00
jhb
415adcce72 - Note that if_xname, if_dname, and if_dunit are usually initialized via
if_initname().
- Document if_drv_flags and replace references to IFF_(RUNNING|OACTIVE)
  with references to IFF_DRV_(RUNNING|OACTIVE).
- Complete truncated sentence in the description of if_transmit by copying
  from the description in if_qflush.
- Add missing line breaks for translators.

Reviewed by:	brooks (1)
MFC after:	3 days
2010-01-14 14:43:16 +00:00
jhb
8019b29e4f - Update required headers for namei() to add <sys/fcntl.h> and remove
<sys/proc.h>.
- Add RETURN VALUES and ERROR sections for namei()'s error return values.
- Add a missing link to NDHASGIANT.9.

PR:		docs/142815, docs/142816
Submitted by:	Lachlan Kang (1, 2)
MFC after:	3 days
2010-01-14 14:36:39 +00:00
attilio
fde84f320b Introduce the new kernel thread called "deadlock resolver".
While the name is pretentious, a good explanation of its goals is
given in this 17-month-old introductory e-mail:
http://lists.freebsd.org/pipermail/freebsd-arch/2008-August/008452.html

In order to implement it, the sq_type in sleepqueues is now mandatory
rather than compiled in only with the INVARIANTS option. Additionally, a
new sleepqueue function, sleepq_type(), is added, returning the type of
the sleepqueue linked to a wchan.
Three new sysctls are added in order to configure the thread:
debug.deadlkres.slptime_threshold
debug.deadlkres.blktime_threshold
debug.deadlkres.sleepfreq

representing the thresholds for sleep and block time that will lead to
a deadlock match (when exceeded), while sleepfreq represents the
number of seconds between 2 consecutive runs of the thread.
In order to enable the deadlock resolver thread, recompile your kernel
with the option DEADLKRES.

Reviewed by:	jeff
Tested by:	pho, Giovanni Trematerra
Sponsored by:	Nokia Incorporated, Sandvine Incorporated
MFC after:	2 weeks
2010-01-09 01:46:38 +00:00
brueffer
e4fee8b8e9 Catch up with the VFS_VPTOFH(9) -> VOP_VPTOFH(9) repocopy that happened
almost three years ago in r166794.

PR:		140989
Submitted by:	Lachlan Kang
MFC after:	1 week
2010-01-04 22:22:00 +00:00
kib
16db58ba54 PG_NOSYNC has been called VPO_NOSYNC for a long time.
MFC after:	3 days
2010-01-04 14:58:41 +00:00
ru
0b6e1af801 Removed duplicate usbd_xfer_state(9) link. 2009-12-22 16:05:28 +00:00
ru
577c2cea5c Sort mlinks. 2009-12-22 16:02:08 +00:00
julian
e4c705b6d5 Make man page reflect the output columns
MFC after:	1 week
2009-12-16 19:37:38 +00:00
kib
061d83630f Document PBDRY and SLEEPQ_STOP_ON_BDRY.
Requested and reviewed by:	attilio
MFC after:	3 days
2009-12-12 22:08:37 +00:00
attilio
b1c6888d87 In current code, threads performing an interruptible sleep (on either
sxlock, via the sx_{s, x}lock_sig() interface, or plain lockmgr) will
leave the waiters flag on, forcing the owner to do a wakeup even if
the waiter queue is empty.
That operation may lead to a deadlock in the case of doing a fake wakeup
on the "preferred" (based on the wakeup algorithm) queue while the other
queue has real waiters on it, because nobody is going to wake up the 2nd
queue waiters and they will sleep indefinitely.

A similar bug is present for lockmgr in the case where the waiters are
sleeping with LK_SLEEPFAIL on.  In this case, even if the waiters queue
is not empty, the waiters won't progress after being awakened but they will
just fail, still not taking care of the 2nd queue waiters (as the
lock owner doing the wakeup would instead expect).

In order to fix this bug in a cheap way (without adding too much locking
or overcomplicating the semantics), add a sleepqueue interface which
reports the actual number of waiters on a specified queue of a
wait channel (sleepq_sleepcnt()) and use it to determine whether the
exclusive waiters (or shared waiters) are actually present on the lockmgr
(or sx) before giving them precedence in the wakeup algorithm.
This fix alone, however, doesn't solve the LK_SLEEPFAIL bug. In order to
cope with it, add the tracking of how many exclusive LK_SLEEPFAIL waiters
a lockmgr has and if all the waiters on the exclusive waiters queue are
LK_SLEEPFAIL just wake both queues.

The sleepq_sleepcnt() introduction and ABI breakage require
__FreeBSD_version bumping.

Reported by:	avg, kib, pho
Reviewed by:	kib
Tested by:	pho
2009-12-12 21:31:07 +00:00
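
Note: the wakeup decision described in the commit above can be sketched in userspace C. This is only a simplified model under stated assumptions, not the kernel code: the structure, the queue constants, and the helpers model_sleepcnt() and choose_wakeup() are hypothetical names made up for illustration. Only the idea (trust the real sleeper count rather than the waiters flag, and wake both queues when every exclusive waiter used LK_SLEEPFAIL) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical queue indices; the kernel uses its own constants. */
enum { QUEUE_EXCLUSIVE = 0, QUEUE_SHARED = 1, QUEUE_COUNT = 2 };

/* Hypothetical, simplified model of the wait channel of one lock. */
struct wait_channel {
	bool excl_waiters_flag;     /* may be stale after an interrupted sleep */
	int  sleepers[QUEUE_COUNT]; /* threads really asleep on each queue */
	int  excl_sleepfail;        /* exclusive sleepers using LK_SLEEPFAIL */
};

/* Stand-in for sleepq_sleepcnt(): real number of sleepers on one queue. */
int
model_sleepcnt(const struct wait_channel *wc, int queue)
{
	assert(queue >= 0 && queue < QUEUE_COUNT);
	return (wc->sleepers[queue]);
}

/* Return a bitmask of the queues the lock owner should wake. */
unsigned
choose_wakeup(const struct wait_channel *wc)
{
	unsigned mask = 0;
	int excl = model_sleepcnt(wc, QUEUE_EXCLUSIVE);

	if (wc->excl_waiters_flag && excl > 0) {
		/* Exclusive waiters really exist, so they keep precedence. */
		mask |= 1u << QUEUE_EXCLUSIVE;
		/* If all of them will simply fail, wake the other queue too. */
		if (wc->excl_sleepfail == excl)
			mask |= 1u << QUEUE_SHARED;
	} else if (model_sleepcnt(wc, QUEUE_SHARED) > 0) {
		/* Flag was stale or unset: do not waste the wakeup. */
		mask |= 1u << QUEUE_SHARED;
	}
	return (mask);
}
```

With a stale exclusive flag and two shared sleepers, choose_wakeup() selects only the shared queue, which is the case that previously led to the indefinite sleep.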
trasz
3173e1cb39 Add missing parameter description. 2009-12-02 18:11:14 +00:00
bz
3595d5cfed Unbreak user space after if_timer/if_watchdog removal in r199975.
Tested by:	glebius
2009-12-01 14:56:00 +00:00
ru
0bbb7d3211 Back in 2003, get_cyclecount() was changed to use binuptime() instead
of nanotime().  Reflect this change in a manpage.

Reviewed by:	phk, markm
2009-10-29 09:45:05 +00:00
emaste
7e19eab5d0 Add link for callout_schedule(9). 2009-10-27 14:37:25 +00:00
brueffer
4fa5145d52 Sort SEE ALSO. 2009-10-16 12:32:07 +00:00
jhb
45688ed39d Add a facility for associating optional descriptions with active interrupt
handlers.  This is primarily intended as a way to allow devices that use
multiple interrupts (e.g. MSI) to meaningfully distinguish the various
interrupt handlers.
- Add a new BUS_DESCRIBE_INTR() method to the bus interface to associate
  a description with an active interrupt handler setup by BUS_SETUP_INTR.
  It has a default method (bus_generic_describe_intr()) which simply passes
  the request up to the parent device.
- Add a bus_describe_intr() wrapper around BUS_DESCRIBE_INTR() that supports
  printf(9) style formatting using var args.
- Reserve MAXCOMLEN bytes in the intr_handler structure to hold the name of
  an interrupt handler and copy the name passed to intr_event_add_handler()
  into that buffer instead of just saving the pointer to the name.
- Add a new intr_event_describe_handler() which appends a description string
  to an interrupt handler's name.
- Implement support for interrupt descriptions on amd64 and i386 by having
  the nexus(4) driver supply a custom bus_describe_intr method that invokes
  a new intr_describe() MD routine which in turn looks up the associated
  interrupt event and invokes intr_event_describe_handler().

Requested by:	many
Reviewed by:	scottl
MFC after:	2 weeks
2009-10-15 14:54:35 +00:00
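
Note: the name handling described in the commit above (copy the handler name into a fixed buffer, then append a description to it) can be illustrated with a small userspace C sketch. Everything here is an assumption for illustration: MODEL_NAMELEN stands in for MAXCOMLEN, the struct and function names are hypothetical, and the ':' separator and the "replace an earlier description" behaviour are choices made for this sketch, not facts taken from the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODEL_NAMELEN 19	/* hypothetical stand-in for MAXCOMLEN */

/* Hypothetical model of a handler that owns its name buffer. */
struct model_handler {
	char name[MODEL_NAMELEN + 1];
};

/* Copy the name into the buffer instead of saving the caller's pointer. */
void
model_handler_init(struct model_handler *h, const char *name)
{
	snprintf(h->name, sizeof(h->name), "%s", name);
}

/*
 * Append ":descr" to the handler name, replacing any earlier description.
 * Returns 0 on success, or -1 (leaving the name untouched) if the result
 * would not fit in the buffer.
 */
int
model_handler_describe(struct model_handler *h, const char *descr)
{
	const char *sep = strchr(h->name, ':');
	size_t base = (sep != NULL) ? (size_t)(sep - h->name) : strlen(h->name);

	/* base name + ':' + description + terminating NUL must fit. */
	if (base + 1 + strlen(descr) >= sizeof(h->name))
		return (-1);
	snprintf(h->name + base, sizeof(h->name) - base, ":%s", descr);
	return (0);
}
```

Checking the length before writing keeps the existing name intact when a description is too long, so a failed call has no side effects.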
jhb
43b02493d8 Oops, add a return values section to note that these routines return an error
on failure or zero on success.
2009-10-14 16:00:20 +00:00
jhb
8f8a23ab53 Add a manual page for BUS_BIND_INTR() and bus_bind_intr().
MFC after:	1 week
2009-10-14 15:58:59 +00:00
trasz
843e93056a Make fetch(9) and store(9) manual pages closer to reality. 2009-10-05 15:16:28 +00:00
trasz
7af86ad4d9 Add pieces of infrastructure required for NFSv4 ACL support in UFS.
Reviewed by:	rwatson
2009-09-22 15:15:03 +00:00
brueffer
c67cf94b25 Fix mdoc, typos, contractions.
This includes:
PR:		135520
Submitted by:	Nobuyuki Koganemaru
Patch by:	gavin
MFC after:	3 days
2009-09-18 14:05:56 +00:00
brueffer
f5fb7c5f64 Various mdoc, spelling etc fixes.
MFC after:	3 days
2009-09-18 00:33:47 +00:00
julian
c5b50090c7 Add clarifications to the kproc and kthread manpages and link
the kthread_create(9) man page to the kproc(9) page as it had migrated and
people looking for it may need a hand to find its new name.

MFC after:	1 week
2009-08-23 07:48:11 +00:00
jhb
9137c5d8b4 This patch fixes two bugs in sglist(9) and improves robustness of the API via
better semantics if a request to append an address range to an existing list
fails.
- When cloning an sglist, properly set the length in the new sglist instead of
  leaving the new list empty.
- Properly compute the amount of data added to an sglist via
  _sglist_append_buf().  This allows sglist_consume_uio() to properly update
  uio_resid.
- When a request to append an address range to a scatter/gather list fails,
  restore the sglist to the state it had at the start of the function call
  instead of resetting it to an empty list.

Requested by:	np (3)
Approved by:	re (kib)
2009-08-21 02:59:07 +00:00
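
Note: the third fix in the commit above (restore the list on a failed append instead of emptying it) can be sketched in userspace C. This is a minimal model, not the sglist(9) implementation: the types, the capacity MODEL_MAXSEGS, the function names, and the -1 error value are all hypothetical. The sketch assumes that appending may merge a range into the last segment, so both the segment count and the last segment's length are saved before the call modifies anything.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAXSEGS 4		/* hypothetical fixed capacity */

struct model_seg {
	unsigned long addr;
	size_t len;
};

struct model_sglist {
	struct model_seg segs[MODEL_MAXSEGS];
	size_t nseg;
};

/* Append one range, merging it into the last segment when contiguous. */
int
model_append_one(struct model_sglist *sg, unsigned long addr, size_t len)
{
	if (sg->nseg > 0) {
		struct model_seg *last = &sg->segs[sg->nseg - 1];

		if (last->addr + last->len == addr) {
			last->len += len;
			return (0);
		}
	}
	if (sg->nseg == MODEL_MAXSEGS)
		return (-1);		/* no room for another segment */
	sg->segs[sg->nseg].addr = addr;
	sg->segs[sg->nseg].len = len;
	sg->nseg++;
	return (0);
}

/* Append n ranges; on failure, put the list back as it was on entry. */
int
model_append_ranges(struct model_sglist *sg, const struct model_seg *r,
    size_t n)
{
	size_t saved_nseg = sg->nseg;
	size_t saved_len = (saved_nseg > 0) ? sg->segs[saved_nseg - 1].len : 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (model_append_one(sg, r[i].addr, r[i].len) != 0) {
			/* Roll back: drop new segments, undo any merge. */
			sg->nseg = saved_nseg;
			if (saved_nseg > 0)
				sg->segs[saved_nseg - 1].len = saved_len;
			return (-1);
		}
	}
	return (0);
}
```

Saving only the count and the last length is enough in this model because a merge can only ever touch the current last segment; anything appended after the saved point is discarded by resetting nseg.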
jhb
bb1c942f11 Change the 'resid' parameter to sglist_consume_uio() from an int to a
size_t to match the recent type change of the uio_resid member of struct
uio.

Approved by:	re (kib)
2009-08-20 19:23:58 +00:00
pjd
099429aa32 Remove unused taskqueue_find() function.
Reviewed by:	dfr
Approved by:	re (kib)
2009-08-18 13:55:48 +00:00
pjd
ba16bdec3c Correct typo in the previous commit.
Noticed by:	pluknet <pluknet@gmail.com>
Approved by:	re (kib, implicit)
2009-08-17 10:20:22 +00:00
pjd
ea8df6fcea Remove OpenSolaris taskq port (it performs very poorly in our kernel) and
replace it with wrappers around our taskqueue(9).
To make this possible, implement the taskqueue_member() function, which
returns 1 if the given thread was created by the given taskqueue.

Approved by:	re (kib)
2009-08-17 09:01:20 +00:00
sam
6fe9bef817 First (early) draft of net80211 documentation. Note this is
focused on driver writers (as opposed to folks adding to net80211).

Reviewed by:	wkoszek
Approved by:	re (rwatson)
2009-08-12 21:03:16 +00:00
bz
83f1495433 Update epair(4) to the new netisr implementation and polish
things a bit:
- use dpcpu data to track the ifps with packets queued up,
- per-cpu locking and driver flags
- along with .nh_drainedcpu and NETISR_POLICY_CPU.
- Put the mbufs in flight reference count, preventing interfaces
  from going away, under INVARIANTS as this is a general problem
  of the stack and should be solved in if.c/netisr but still good
  to verify the internal queuing logic.
- Permit changing the MTU to virtually everything like we do for loopback.

Hook epair(4) up to the build.

Approved by:	re (kib)
2009-07-26 12:20:07 +00:00
cperciva
01fee564e1 Fix typo: kproc_resume,.9 -> kproc_resume.9.
Approved by:	re (kib)
2009-07-11 17:36:59 +00:00
thompsa
5910fa6cfc Move programming info from usb(4) to usbdi(9) and update for the usb stack
changeover. Needs much more content still.
2009-06-24 17:01:17 +00:00
rwatson
df217187ce Add stack_print_short() and stack_print_short_ddb() interfaces to
stack(9), which generate a more compact rendition of a stack trace
via the kernel's printf.

MFC after:	1 week
2009-06-24 12:06:15 +00:00
kib
e91d5cfe69 Usermode portion of the support for swap allocation accounting:
- update for getrlimit(2) manpage;
- support for setting RLIMIT_SWAP in login class;
- addition to the limits(1) and sh and csh limit-setting builtins;
- tuning(7) documentation on the sysctls controlling overcommit.

In collaboration with:	pho
Reviewed by:	alc
Approved by:	re (kensmith)
2009-06-23 20:57:27 +00:00
brooks
e271e202d0 Document crcopysafe() and crsetgroups().
Reminded by:	julian
2009-06-19 19:16:35 +00:00
attilio
256667d4fb Introduce support for adaptive spinning in lockmgr.
Since it has received little tuning, the support is disabled by
default, but it can be enabled with the option ADAPTIVE_LOCKMGRS.
Due to the nature of lockmgrs, adaptive spinning needs to be
selectively enabled for any interested lockmgr.
The support is bi-directional; in other words, it works whether
the lock is held for read or for write.  In particular, the
read path is open to further tuning using the sysctls
debug.lockmgr.retries and debug.lockmgr.loops.  Ideally, such sysctls
should be axed or compiled out before release.

Additionally, note that adaptive spinning doesn't cope well with
LK_SLEEPFAIL.  The reason is that many (and probably all) consumers
of LK_SLEEPFAIL are mainly interested in knowing if the interlock was
dropped or not in order to reacquire it and re-test initial conditions.
This directly interacts with adaptive spinning because lockmgr needs
to drop the interlock while spinning in order to avoid a deadlock
(further details in the comments inside the patch).

Final note: finding someone willing to help tune this with
relevant workloads would be both very important and appreciated.

Tested by:	jeff, pho
Requested by:	many
2009-06-17 01:55:42 +00:00
bz
56983733aa Add an optional callback function that will be invoked when a per-CPU
queue was drained.  It will never fire for a directly dispatched packet.

You will most likely never want to use this for any ordinary netisr usage
and you will never blame netisr in case you try to use it and it does
not work as expected.

Reviewed by:	rwatson
2009-06-14 17:15:18 +00:00
bz
1ce6fee7c0 Remove a line break leaving a function return type attached to the old
function declaration bottom rather than the new function declaration
start.
2009-06-14 12:11:15 +00:00
imp
4bc21efa2c These are no longer public, so remove the man page. 2009-06-09 23:38:19 +00:00
jhb
77373ed468 Add support for multiple passes of the device tree during the boot-time
probe.  The current device order is unchanged.  This commit just adds the
infrastructure and ABI changes so that it is easier to merge later changes
into 8.x.
- Driver attachments now have an associated pass level.  Attachments are
  not allowed to probe or attach to drivers until the system-wide pass level
  is >= the attachment's pass level.  By default driver attachments use the
  "last" pass level (BUS_PASS_DEFAULT).  Drivers that wish to probe during
  an earlier pass use EARLY_DRIVER_MODULE() instead of DRIVER_MODULE() which
  accepts the pass level as an additional parameter.
- A new method BUS_NEW_PASS has been added to the bus interface.  This
  method is invoked when the system-wide pass level is changed to kick off
  a rescan of the device tree so that drivers that have just been made
  "eligible" can probe and attach.
- The bus_generic_new_pass() function provides a default implementation of
  BUS_NEW_PASS().  It first allows drivers that were just made eligible for
  this pass to identify new child devices.  Then it propagates the rescan to
  child devices that already have an attached driver by invoking their
  BUS_NEW_PASS() method.  It also reprobes devices without a driver.
- BUS_PROBE_NOMATCH() is only invoked for devices that do not have
  an attached driver after being scanned during the final pass.
- The bus_set_pass() function is used during boot to raise the pass level.
  Currently it is only called once during root_bus_configure() to raise
  the pass level to BUS_PASS_DEFAULT.  This has the effect of probing all
  devices in a single pass identical to previous behavior.

Reviewed by:	imp
Approved by:	re (kib)
2009-06-09 14:26:23 +00:00
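
Note: the pass-level gating described in the commit above can be sketched as a small userspace C model. It is not the newbus code: the struct, the numeric pass values, and model_set_pass() are hypothetical. The sketch assumes only what the message states, namely that an attachment becomes eligible once the system-wide pass level is >= its own pass level and that raising the level triggers a rescan; ignoring attempts to lower the level is an extra assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pass values; only "default is the last pass" is sourced. */
enum { MODEL_PASS_EARLY = 10, MODEL_PASS_DEFAULT = 100 };

/* Hypothetical model of one driver attachment. */
struct model_driver {
	const char *name;
	int pass;	/* pass level of this attachment */
	int attached;	/* nonzero once probe/attach has run */
};

/*
 * Raise the system-wide pass level to new_pass and rescan the drivers.
 * Returns how many drivers became eligible and were attached by this call.
 */
int
model_set_pass(int *current_pass, struct model_driver *drv, size_t n,
    int new_pass)
{
	int newly = 0;
	size_t i;

	/* Assumption of this sketch: the level is only ever raised. */
	if (new_pass <= *current_pass)
		return (0);
	*current_pass = new_pass;

	for (i = 0; i < n; i++) {
		/* Eligible once the system-wide level reaches its level. */
		if (!drv[i].attached && drv[i].pass <= *current_pass) {
			drv[i].attached = 1;
			newly++;
		}
	}
	return (newly);
}
```

Calling model_set_pass() once with MODEL_PASS_DEFAULT attaches every driver in a single pass, which mirrors the unchanged boot behaviour the commit describes.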
rwatson
2b053d10b2 Try again to add beginnings of netisr(9) man page: this time add
netisr.9.
2009-06-07 21:32:01 +00:00
rwatson
0083dbe71d Add beginnings of a netisr(9) man page. 2009-06-07 21:31:06 +00:00
jhb
e45af7ed87 Add a simple API to manage scatter/gather lists of physical addresses.
Each list describes a logical memory object that is backed by one or more
physical address ranges.  To minimize locking, the sglist objects
themselves are immutable once they are shared.

These objects may be used in the future to facilitate I/O requests using
physically-addressed buffers.  For the immediate future I plan to use them
to implement a new type of VM object and pager.

Reviewed by:	jeff, scottl
MFC after:	1 month
2009-06-01 20:35:39 +00:00
trasz
7be2e0fb7a Use the "flag" word consistently.
Submitted by:	Ben Kaduk <minimarmot at gmail.com>
2009-06-01 07:48:27 +00:00
trasz
0c63bcbfa4 Add VOP_ACCESSX, which can be used to query for newly added V*
permissions, such as VWRITE_ACL.  For filesystems that don't
implement it, there is a default implementation, which works
as a wrapper around VOP_ACCESS.

Reviewed by:	rwatson@
2009-05-30 13:59:05 +00:00
rwatson
52ba259960 Make the rmlock(9) interface a bit more like the rwlock(9) interface:
- Add rm_init_flags() and accept extended options only for that variation.
- Add a flags space specifically for rm_init_flags(), rather than borrowing
  the lock_init() flag space.
- Define flag RM_RECURSE to use instead of LO_RECURSABLE.
- Define flag RM_NOWITNESS to allow an rmlock to be exempt from WITNESS
  checking; this wasn't possible previously as rm_init() always passed
  LO_WITNESS when initializing an rmlock's struct lock.
- Add RM_SYSINIT_FLAGS().
- Rename embedded mutex in rmlocks to make it more obvious what it is.
- Update consumers.
- Update man page.
2009-05-29 10:52:37 +00:00
attilio
e05714ba70 Reverse the logic for ADAPTIVE_SX option and enable it by default.
Introduce for this operation the reverse NO_ADAPTIVE_SX option.
The flag SX_ADAPTIVESPIN to be passed to sx_init_flags(9) gets suppressed
and the new flag, offering the reversed logic, SX_NOADAPTIVE is added.

Additionally, implement adaptive spinning for sx locks held in shared mode.
The spinning limit can be tuned through sysctls until the code reaches
a release, after which time they should probably be dropped.

This change has been made necessary by recent benchmarks where it
improves concurrency of workloads in the presence of high contention
(e.g. ZFS).

KPI breakage is documented by __FreeBSD_version bumping, manpage and
UPDATING updates.

Requested by:	jeff, kmacy
Reviewed by:	jeff
Tested by:	pho
2009-05-29 01:49:27 +00:00