Commit Graph

1813 Commits

Author SHA1 Message Date
Attilio Rao
2028867def In current code, threads performing an interruptible sleep (on both
sxlock, via the sx_{s, x}lock_sig() interface, or plain lockmgr), will
leave the waiters flag on forcing the owner to do a wakeup even when if
the waiter queue is empty.
That operation may lead to a deadlock in the case of doing a fake wakeup
on the "preferred" (based on the wakeup algorithm) queue while the other
queue has real waiters on it, because nobody is going to wakeup the 2nd
queue waiters and they will sleep indefinitively.

A similar bug, is present, for lockmgr in the case the waiters are
sleeping with LK_SLEEPFAIL on.  In this case, even if the waiters queue
is not empty, the waiters won't progress after being awake but they will
just fail, still not taking care of the 2nd queue waiters (as instead the
lock owned doing the wakeup would expect).

In order to fix this bug in a cheap way (without adding too much locking
and complicating too much the semantic) add a sleepqueue interface which
does report the actual number of waiters on a specified queue of a
waitchannel (sleepq_sleepcnt()) and use it in order to determine if the
exclusive waiters (or shared waiters) are actually present on the lockmgr
(or sx) before to give them precedence in the wakeup algorithm.
This fix alone, however doesn't solve the LK_SLEEPFAIL bug. In order to
cope with it, add the tracking of how many exclusive LK_SLEEPFAIL waiters
a lockmgr has and if all the waiters on the exclusive waiters queue are
LK_SLEEPFAIL just wake both queues.

The sleepq_sleepcnt() introduction and ABI breakage require
__FreeBSD_version bumping.

Reported by:	avg, kib, pho
Reviewed by:	kib
Tested by:	pho
2009-12-12 21:31:07 +00:00
Edward Tomasz Napierala
4cbe4bf979 Add missing parameter description. 2009-12-02 18:11:14 +00:00
Bjoern A. Zeeb
aaae58c491 Unbreak user space after if_timer/if_watchdog removal in r199975.
Tested by:	glebius
2009-12-01 14:56:00 +00:00
Ruslan Ermilov
9589412053 Back in 2003, get_cyclecount() was changed to use binuptime() instead
of nanotime().  Reflect this change in a manpage.

Reviewed by:	phk, markm
2009-10-29 09:45:05 +00:00
Ed Maste
e04cb6afe4 Add link for callout_schedule(9). 2009-10-27 14:37:25 +00:00
Christian Brueffer
7024354df3 Sort SEE ALSO. 2009-10-16 12:32:07 +00:00
John Baldwin
37b8ef16cd Add a facility for associating optional descriptions with active interrupt
handlers.  This is primarily intended as a way to allow devices that use
multiple interrupts (e.g. MSI) to meaningfully distinguish the various
interrupt handlers.
- Add a new BUS_DESCRIBE_INTR() method to the bus interface to associate
  a description with an active interrupt handler setup by BUS_SETUP_INTR.
  It has a default method (bus_generic_describe_intr()) which simply passes
  the request up to the parent device.
- Add a bus_describe_intr() wrapper around BUS_DESCRIBE_INTR() that supports
  printf(9) style formatting using var args.
- Reserve MAXCOMLEN bytes in the intr_handler structure to hold the name of
  an interrupt handler and copy the name passed to intr_event_add_handler()
  into that buffer instead of just saving the pointer to the name.
- Add a new intr_event_describe_handler() which appends a description string
  to an interrupt handler's name.
- Implement support for interrupt descriptions on amd64 and i386 by having
  the nexus(4) driver supply a custom bus_describe_intr method that invokes
  a new intr_describe() MD routine which in turn looks up the associated
  interrupt event and invokes intr_event_describe_handler().

Requested by:	many
Reviewed by:	scottl
MFC after:	2 weeks
2009-10-15 14:54:35 +00:00
John Baldwin
38d3501a8e Oops, add a return values section to note that these routines return an error
on failure or zero on success.
2009-10-14 16:00:20 +00:00
John Baldwin
2783151570 Add a manual page for BUS_BIND_INTR() and bus_bind_intr().
MFC after:	1 week
2009-10-14 15:58:59 +00:00
Edward Tomasz Napierala
55b95b33ed Make fetch(9) and store(9) manual pages closer to reality. 2009-10-05 15:16:28 +00:00
Edward Tomasz Napierala
a9315dded6 Add pieces of infrastructure required for NFSv4 ACL support in UFS.
Reviewed by:	rwatson
2009-09-22 15:15:03 +00:00
Christian Brueffer
aa8775c640 Fix mdoc, typos, contractions.
This includes:
PR:		135520
Submitted by:	Nobuyuki Koganemaru
Patch by:	gavin
MFC after:	3 days
2009-09-18 14:05:56 +00:00
Christian Brueffer
ace02a6d46 Various mdoc, spelling etc fixes.
MFC after:	3 days
2009-09-18 00:33:47 +00:00
Julian Elischer
58cf5c84e0 Add claraifications to the kproc and kthread manpages and link
the kthread_create(9) man page to the kproc(9) page as it had migrated and
people looking for it may need a hand to find its new name.

MFC after:	1 week
2009-08-23 07:48:11 +00:00
John Baldwin
eb5a1e8f38 This patch fixes two bugs in sglist(9) and improves robustness of the API via
better semantics if a request to append an address range to an existing list
fails.
- When cloning an sglist, properly set the length in the new sglist instead of
  leaving the new list empty.
- Properly compute the amount of data added to an sglist via
  _sglist_append_buf().  This allows sglist_consume_uio() to properly update
  uio_resid.
- When a request to append an address range to a scatter/gather list fails,
  restore the sglist to the state it had at the start of the function call
  instead of resetting it to an empty list.

Requested by:	np (3)
Approved by:	re (kib)
2009-08-21 02:59:07 +00:00
John Baldwin
0cef25aeb2 Change the 'resid' parameter to sglist_consume_uio() from an int to a
size_t to match the recent type change of the uio_resid member of struct
uio.

Approved by:	re (kib)
2009-08-20 19:23:58 +00:00
Pawel Jakub Dawidek
e477e4fe8e Remove unused taskqueue_find() function.
Reviewed by:	dfr
Approved by:	re (kib)
2009-08-18 13:55:48 +00:00
Pawel Jakub Dawidek
9c1a8ce494 Correct typo in the previous commit.
Noticed by:	pluknet <pluknet@gmail.com>
Approved by:	re (kib, implicit)
2009-08-17 10:20:22 +00:00
Pawel Jakub Dawidek
159ef108e1 Remove OpenSolaris taskq port (it performs very poorly in our kernel) and
replace it with wrappers around our taskqueue(9).
To make it possible implement taskqueue_member() function which returns 1
if the given thread was created by the given taskqueue.

Approved by:	re (kib)
2009-08-17 09:01:20 +00:00
Sam Leffler
692eebe092 First (early) draft of net80211 documentation. Note this is
focused on driver writers (as opposed to folks adding to net80211).

Reviewed by:	wkoszek
Approved by:	re (rwatson)
2009-08-12 21:03:16 +00:00
Bjoern A. Zeeb
d0ea47437a Update epair(4) to the new netisr implementation and polish
things a bit:
- use dpcpu data to track the ifps with packets queued up,
- per-cpu locking and driver flags
- along with .nh_drainedcpu and NETISR_POLICY_CPU.
- Put the mbufs in flight reference count, preventing interfaces
  from going away, under INVARIANTS as this is a general problem
  of the stack and should be solved in if.c/netisr but still good
  to verify the internal queuing logic.
- Permit changing the MTU to virtually everythinkg like we do for loopback.

Hook epair(4) up to the build.

Approved by:	re (kib)
2009-07-26 12:20:07 +00:00
Colin Percival
383334b383 Fix typo: kproc_resume,.9 -> kproc_resume.9.
Approved by:	re (kib)
2009-07-11 17:36:59 +00:00
Andrew Thompson
b35f050eb2 Move programming info from usb(4) to usbdi(9) and update for the usb stack
changeover. Needs much more content still.
2009-06-24 17:01:17 +00:00
Robert Watson
36fecbf302 Add stack_print_short() and stack_print_short_ddb() interfaces to
stack(9), which generate a more compact rendition of a stack trace
via the kernel's printf.

MFC after:	1 week
2009-06-24 12:06:15 +00:00
Konstantin Belousov
c9253e931d Usermode portion of the support for swap allocation accounting:
- update for getrlimit(2) manpage;
- support for setting RLIMIT_SWAP in login class;
- addition to the limits(1) and sh and csh limit-setting builtins;
- tuning(7) documentation on the sysctls controlling overcommit.

In collaboration with:	pho
Reviewed by:	alc
Approved by:	re (kensmith)
2009-06-23 20:57:27 +00:00
Brooks Davis
e8477da212 Document crcopysafe() and crsetgroups().
Reminded by:	julian
2009-06-19 19:16:35 +00:00
Attilio Rao
651175c9db Introduce support for adaptive spinning in lockmgr.
Actually, as it did receive few tuning, the support is disabled by
default, but it can opt-in with the option ADAPTIVE_LOCKMGRS.
Due to the nature of lockmgrs, adaptive spinning needs to be
selectively enabled for any interested lockmgr.
The support is bi-directional, or, in other ways, it will work in both
cases if the lock is held in read or write way.  In particular, the
read path is passible of further tunning using the sysctls
debug.lockmgr.retries and debug.lockmgr.loops .  Ideally, such sysctls
should be axed or compiled out before release.

Addictionally note that adaptive spinning doesn't cope well with
LK_SLEEPFAIL.  The reason is that many (and probabilly all) consumers
of LK_SLEEPFAIL are mainly interested in knowing if the interlock was
dropped or not in order to reacquire it and re-test initial conditions.
This directly interacts with adaptive spinning because lockmgr needs
to drop the interlock while spinning in order to avoid a deadlock
(further details in the comments inside the patch).

Final note: finding someone willing to help on tuning this with
relevant workloads would be either very important and appreciated.

Tested by:	jeff, pho
Requested by:	many
2009-06-17 01:55:42 +00:00
Bjoern A. Zeeb
ed655c8c07 Add an optional callback function that will be invoked when a per-CPU
queue was drained.  It will never fire for a directly dispatched packet.

You will most likely never want to use this for any ordinary netisr usage
and you will never blame netisr in case you try to use it and it does
not work as expected.

Reviewed by:	rwatson
2009-06-14 17:15:18 +00:00
Bjoern A. Zeeb
44549f1334 Remove a line break leaving a function return type attached to the old
function declaration bottom rather than the new function declaration
start.
2009-06-14 12:11:15 +00:00
Warner Losh
7adb51acc7 These are no longer public, so remove the man page. 2009-06-09 23:38:19 +00:00
John Baldwin
4ef60d2686 Add support for multiple passes of the device tree during the boot-time
probe.  The current device order is unchanged.  This commit just adds the
infrastructure and ABI changes so that it is easier to merge later changes
into 8.x.
- Driver attachments now have an associated pass level.  Attachments are
  not allowed to probe or attach to drivers until the system-wide pass level
  is >= the attachment's pass level.  By default driver attachments use the
  "last" pass level (BUS_PASS_DEFAULT).  Driver's that wish to probe during
  an earlier pass use EARLY_DRIVER_MODULE() instead of DRIVER_MODULE() which
  accepts the pass level as an additional parameter.
- A new method BUS_NEW_PASS has been added to the bus interface.  This
  method is invoked when the system-wide pass level is changed to kick off
  a rescan of the device tree so that drivers that have just been made
  "eligible" can probe and attach.
- The bus_generic_new_pass() function provides a default implementation of
  BUS_NEW_PASS().  It first allows drivers that were just made eligible for
  this pass to identify new child devices.  Then it propogates the rescan to
  child devices that already have an attached driver by invoking their
  BUS_NEW_PASS() method.  It also reprobes devices without a driver.
- BUS_PROBE_NOMATCH() is only invoked for devices that do not have
  an attached driver after being scanned during the final pass.
- The bus_set_pass() function is used during boot to raise the pass level.
  Currently it is only called once during root_bus_configure() to raise
  the pass level to BUS_PASS_DEFAULT.  This has the effect of probing all
  devices in a single pass identical to previous behavior.

Reviewed by:	imp
Approved by:	re (kib)
2009-06-09 14:26:23 +00:00
Robert Watson
6b8eb655fd Try again to add beginnings of netisr(8) man page: this time add
netisr.9.
2009-06-07 21:32:01 +00:00
Robert Watson
3c9c33bba1 Add beginnings of a netisr(9) man page. 2009-06-07 21:31:06 +00:00
John Baldwin
b36cfff75d Add a simple API to manage scatter/gather lists of phyiscal addresses.
Each list describes a logical memory object that is backed by one or more
physical address ranges.  To minimize locking, the sglist objects
themselves are immutable once they are shared.

These objects may be used in the future to facilitate I/O requests using
physically-addressed buffers.  For the immediate future I plan to use them
to implement a new type of VM object and pager.

Reviewed by:	jeff, scottl
MFC after:	1 month
2009-06-01 20:35:39 +00:00
Edward Tomasz Napierala
f0fa0e7faa Use the "flag" word consistently.
Submitted by:	Ben Kaduk <minimarmot at gmail.com>
2009-06-01 07:48:27 +00:00
Edward Tomasz Napierala
c97fcdba57 Add VOP_ACCESSX, which can be used to query for newly added V*
permissions, such as VWRITE_ACL.  For a filsystems that don't
implement it, there is a default implementation, which works
as a wrapper around VOP_ACCESS.

Reviewed by:	rwatson@
2009-05-30 13:59:05 +00:00
Robert Watson
1a109c1cb0 Make the rmlock(9) interface a bit more like the rwlock(9) interface:
- Add rm_init_flags() and accept extended options only for that variation.
- Add a flags space specifically for rm_init_flags(), rather than borrowing
  the lock_init() flag space.
- Define flag RM_RECURSE to use instead of LO_RECURSABLE.
- Define flag RM_NOWITNESS to allow an rmlock to be exempt from WITNESS
  checking; this wasn't possible previously as rm_init() always passed
  LO_WITNESS when initializing an rmlock's struct lock.
- Add RM_SYSINIT_FLAGS().
- Rename embedded mutex in rmlocks to make it more obvious what it is.
- Update consumers.
- Update man page.
2009-05-29 10:52:37 +00:00
Attilio Rao
1ae1c2a3bd Reverse the logic for ADAPTIVE_SX option and enable it by default.
Introduce for this operation the reverse NO_ADAPTIVE_SX option.
The flag SX_ADAPTIVESPIN to be passed to sx_init_flags(9) gets suppressed
and the new flag, offering the reversed logic, SX_NOADAPTIVE is added.

Additively implements adaptive spininning for sx held in shared mode.
The spinning limit can be handled through sysctls in order to be tuned
while the code doesn't reach the release, after which time they should
be dropped probabilly.

This change has made been necessary by recent benchmarks where it does
improve concurrency of workloads in presence of high contention
(ie. ZFS).

KPI breakage is documented by __FreeBSD_version bumping, manpage and
UPDATING updates.

Requested by:	jeff, kmacy
Reviewed by:	jeff
Tested by:	pho
2009-05-29 01:49:27 +00:00
Zachary Loafman
eae608ef4c Fix style/grammar issues in fail(9) man page.
Suggested by:       Ben Kaduk
Approved by:        dfr (mentor)
2009-05-28 15:02:52 +00:00
Zachary Loafman
cfeb7489c2 fail(9) support:
Add support for kernel fault injection using KFAIL_POINT_* macros and
fail_point_* infrastructure. Add example fail point in vfs_bio.c to
simulate VM buf pressure.

Approved by:        dfr (mentor)
2009-05-27 16:36:54 +00:00
Edward Tomasz Napierala
4a48670553 There are things too complex to be fixed in one commit.
Fix a typo in acl(9) manual page.

Submitted by:	avg
2009-05-24 20:34:29 +00:00
Tom McLaughlin
c327ec0021 Update man pages after VFS_* changes in r191990.
Approved by:	brueffer, attilio
2009-05-24 18:34:54 +00:00
Edward Tomasz Napierala
7070b4fc87 Fix typo in the manual page. 2009-05-24 17:08:00 +00:00
Edward Tomasz Napierala
3f8cd45f79 Add new constants to the acl(9) manual page. 2009-05-24 09:42:53 +00:00
John Baldwin
e6b089446b Attempt to clarify some confusing wording regarding atomic_load() and
atomic_store().
2009-05-21 13:39:46 +00:00
Christian Brueffer
72fba9d714 Document sbuf_new_auto().
While here, add a missing `-' in phk's name.

MFC after:	3 days
2009-05-17 21:28:37 +00:00
Marius Strobl
08390d3b63 Correct r190283 (partially reverting it) as on sparc64 BUS_DMA_NOCACHE
actually is only valid for bus_dmamap_load().

MFC after:	3 days
2009-05-12 20:56:34 +00:00
Robert Watson
78fc60e401 Garbage collect man page reference to IFF_NEEDSGIANT. 2009-04-18 20:09:43 +00:00
Edward Tomasz Napierala
d0d7c39c72 Remove 'IMPLEMENTATION NOTES' section from acl(9); it was just a copy/paste
from <sys/acl.h> and it would get out-of-date pretty soon.
2009-04-11 10:37:04 +00:00
Robert Watson
cd5213b94b Remove VOP_LEASE(9) man page, as we no longer have a VOP_LEASE() in the
kernel.
2009-04-10 10:59:48 +00:00