Commit Graph

7280 Commits

Author SHA1 Message Date
rwatson
18f9130ceb Grab the socket buffer send or receive mutex when performing a
read-modify-write on the sb_state field.  This commit catches only
the "easy" ones where it doesn't interact with as yet unmerged
locking.
2004-06-15 03:51:44 +00:00
peter
360d78b6e8 Fix symbol lookups between modules. This caused modules that depend on
other modules to explode.  eg: snd_ich->snd_pcm and umass->usb.
The problem was that I was using the unified base address of the module
instead of finding the start address of the section in question.
2004-06-15 01:35:57 +00:00
peter
30faf2f683 Insurance: cause a proper symbol lookup failure for symbol entries that
reference unknown sections.. rather than returning a small value.
2004-06-15 01:33:39 +00:00
jdp
9e267c7df9 Change the return value of sema_timedwait() so it returns 0 on
success and a proper errno value on failure.  This makes it
consistent with cv_timedwait(), and paves the way for the
introduction of functions such as sema_timedwait_sig() which can
fail in multiple ways.

Bump __FreeBSD_version and add a note to UPDATING.

Approved by:	scottl (ips driver), arch
2004-06-14 18:19:05 +00:00
rwatson
f467792e5a The socket field so_state is used to hold a variety of socket related
flags relating to several aspects of socket functionality.  This change
breaks out several bits relating to send and receive operation into a
new per-socket buffer field, sb_state, in order to facilitate locking.
This is required because, in order to provide more granular locking of
sockets, different state fields have different locking properties.  The
following fields are moved to sb_state:

  SS_CANTRCVMORE            (so_state)
  SS_CANTSENDMORE           (so_state)
  SS_RCVATMARK              (so_state)

Rename respectively to:

  SBS_CANTRCVMORE           (so_rcv.sb_state)
  SBS_CANTSENDMORE          (so_snd.sb_state)
  SBS_RCVATMARK             (so_rcv.sb_state)

This facilitates locking by isolating fields to be located with other
identically locked fields, and permits greater granularity in socket
locking by avoiding storing fields with different locking semantics in
the same short (avoiding locking conflicts).  In the future, we may
wish to coallesce sb_state and sb_flags; for the time being I leave
them separate and there is no additional memory overhead due to the
packing/alignment of shorts in the socket buffer structure.
2004-06-14 18:16:22 +00:00
phk
fc06c1fa80 Remove a left over from userland buffer-cache access to disks. 2004-06-14 14:25:03 +00:00
rwatson
0f9a4a0ff3 Socket MAC labels so_label and so_peerlabel are now protected by
SOCK_LOCK(so):

- Hold socket lock over calls to MAC entry points reading or
  manipulating socket labels.

- Assert socket lock in MAC entry point implementations.

- When externalizing the socket label, first make a thread-local
  copy while holding the socket lock, then release the socket lock
  to externalize to userspace.
2004-06-13 02:50:07 +00:00
rwatson
baee76e6ce Introduce socket and UNIX domain socket locks into hard-coded lock
order definition for witness.  Send lock before receive lock, and
socket locks after accept but  before select:

  filedesc -> accept -> so_snd -> so_rcv -> sellck

All routing locks after send lock:

  so_rcv -> radix node head

All protocol locks before socket locks:

  unp -> so_snd
  udp -> udpinp -> so_snd
  tcp -> tcpinp -> so_snd
2004-06-13 00:23:03 +00:00
rwatson
9c0b0e6b91 Correct whitespace errors in merge from rwatson_netperf: tabs instead of
spaces, no trailing tab at the end of line.

Pointed out by:	csjp
2004-06-12 23:36:59 +00:00
rwatson
75ff0eae83 Extend coverage of SOCK_LOCK(so) to include so_count, the socket
reference count:

- Assert SOCK_LOCK(so) macros that directly manipulate so_count:
  soref(), sorele().

- Assert SOCK_LOCK(so) in macros/functions that rely on the state of
  so_count: sofree(), sotryfree().

- Acquire SOCK_LOCK(so) before calling these functions or macros in
  various contexts in the stack, both at the socket and protocol
  layers.

- In some cases, perform soisdisconnected() before sotryfree(), as
  this could result in frobbing of a non-present socket if
  sotryfree() actually frees the socket.

- Note that sofree()/sotryfree() will release the socket lock even if
  they don't free the socket.

Submitted by:	sam
Sponsored by:	FreeBSD Foundation
Obtained from:	BSD/OS
2004-06-12 20:47:32 +00:00
rwatson
246fb139dd Introduce a mutex into struct sockbuf, sb_mtx, which will be used to
protect fields in the socket buffer.  Add accessor macros to use the
mutex (SOCKBUF_*()).  Initialize the mutex in soalloc(), and destroy
it in sodealloc().  Add addition, add SOCK_*() access macros which
will protect most remaining fields in the socket; for the time being,
use the receive socket buffer mutex to implement socket level locking
to reduce memory overhead.

Submitted by:	sam
Sponosored by:	FreeBSD Foundation
Obtained from:	BSD/OS
2004-06-12 16:08:41 +00:00
phk
2add68d62c Fix registration of loadable line disciplines.
This should make watch(8)/snp(4) work again.
2004-06-12 12:31:42 +00:00
bmilekic
6a6c17a9a6 Gah! Plug a mbuf leak I introduced in the last commit.
I don the pointy-hat.

Problem reported by: Peter Holm <pho@>
2004-06-11 18:17:25 +00:00
julian
d8b86fa6b9 Shuffle some code around. 2004-06-11 17:48:20 +00:00
phk
a78896a72f Deorbit COMPAT_SUNOS.
We inherited this from the sparc32 port of BSD4.4-Lite1.  We have neither
a sparc32 port nor a SunOS4.x compatibility desire these days.
2004-06-11 11:16:26 +00:00
green
f26bd1c34a Make sysctl_wire_old_buffer() respect ENOMEM from vslock() by marking
the valid length as 0.  This prevents vsunlock() from removing a system
wire from memory that was not successfully wired (by us).

Submitted by:	tegge
2004-06-11 02:20:37 +00:00
rwatson
2881ecab09 Introduce a subsystem lock around UNIX domain sockets in order to protect
global and allocated variables.  This strategy is derived from work
originally developed by BSDi for BSD/OS, and applied to FreeBSD by Sam
Leffler:

- Add unp_mtx, a global mutex which will protect all UNIX domain socket
  related variables, structures, etc.

- Add UNP_LOCK(), UNP_UNLOCK(), UNP_LOCK_ASSERT() macros.

- Acquire unp_mtx on entering most UNIX domain socket code,
  drop/re-acquire around calls into VFS, and release it on return.

- Avoid performing sodupsockaddr() while holding the mutex, so in general
  move to allocating storage before acquiring the mutex to copy the data.

- Make a stack copy of the xucred rather than copying out while holding
  unp_mtx.  Copy the peer credential out after releasing the mutex.

- Add additional assertions of vnode locks following VOP_CREATE().

A few notes:

- Use of an sx lock for the file list mutex may cause problems with regard
  to unp_mtx when garbage collection passed file descriptors.

- The locking in unp_pcblist() for sysctl monitoring is correct subject to
  the unpcb zone not returning memory for reuse by other subsystems
  (consistent with similar existing concerns).

- Sam's version of this change, as with the BSD/OS version, made use of
  both a global lock and per-unpcb locks.  However, in practice, the
  global lock covered all accesses, so I have simplified out the unpcb
  locks in the interest of getting this merged faster (reducing the
  overhead but not sacrificing granularity in most cases).  We will want
  to explore possibilities for improving lock granularity in this code in
  the future.

Submitted by:	sam
Sponsored by:	FreeBSD Foundatiuon
Obtained from:	BSD/OS 5 snapshot provided by BSDi
2004-06-10 21:34:38 +00:00
bmilekic
49d39cb9b4 Plug a race where upon free this scenario could occur:
(time grows downward)
thread 1         thread 2
------------|------------
dec ref_cnt |
            | dec ref_cnt  <-- ref_cnt now zero
cmpset      |
free all    |
return      |
            |
alloc again,|
reuse prev  |
ref_cnt     |
            | cmpset, read
            | already freed
            | ref_cnt
------------|------------

This should fix that by performing only a single
atomic test-and-set that will serve to decrement
the ref_cnt, only if it hasn't changed since the
earlier read, otherwise it'll loop and re-read.
This forces ordering of decrements so that truly
the thread which did the LAST decrement is the
one that frees.

This is how atomic-instruction-based refcnting
should probably be handled.

Submitted by: Julian Elischer
2004-06-10 00:04:27 +00:00
mux
90deaa0a63 Fix a panic happening when m_getm() is called with len < MCLBYTES.
Reported by:	ale
Tested by:	ale
Reviewed by:	bosko
2004-06-09 14:53:35 +00:00
jmallett
00cffe3fcf Add a comment explaining td_critnest's initial state and its life from that
point on, as it happens relatively indirectly, and in a codepath the casual
reader may not be acquainted with or find obvious.

Glanced at by:	jhb
2004-06-09 14:06:44 +00:00
phk
dd18d90a0a Rename struct pt_ioctl to "ptsc" and pointers to it from "pti" to "pt" 2004-06-09 10:21:53 +00:00
phk
a9af1dba3e Ditch K&R function style 2004-06-09 10:16:14 +00:00
phk
3692195f5a Reference count struct tty.
Add two new functions: ttyref() and ttyrel().  ttymalloc() creates a struct
tty with a reference count of one.  when ttyrel sees the count go to zero,
struct tty is freed.

Hold references for open ttys and for ttys which are controlling terminal
for sessions.

Until drivers start using ttyrel(), this commit will make no difference.
2004-06-09 09:41:30 +00:00
phk
1777f5359e Fix a race in destruction of sessions. 2004-06-09 09:29:08 +00:00
phk
5989134159 Move PTY private defines into PTY private files. 2004-06-09 09:09:54 +00:00
stefanf
ce8e58fb1a Avoid assignments to cast expressions.
Reviewed by:	md5
Approved by:	das (mentor)
2004-06-08 13:08:19 +00:00
tjr
687eba3097 Remove remnants of PGINPROF. 2004-06-08 10:37:30 +00:00
rwatson
f78dd34491 Correct a resource leak introduced in recent accept locking changes:
when I reordered events in accept1() to allocate a file descriptor
earlier, I didn't properly update use of goto on exit to unwind for
cases where the file descriptor is now held, but wasn't previously.
The result was that, in the event of accept() on a non-blocking socket,
or in the event of a socket error, a file descriptor would be leaked.

This ended up being non-fatal in many cases, as the file descriptor
would be properly GC'd on process exit, so only showed up for processes
that do a lot of non-blocking accept() calls, and also live for a long
time (such as qmail).

This change updates the use of goto targets to do additional unwinding.

Eyes provided by:	Brian Feldman <green@freebsd.org>
Feet, hands provided by:	Stefan Ehmann <shoesoft@gmx.net>,
				Dimitry Andric <dimitry@andric.com>
				Arjan van Leeuwen <avleeuwen@piwebs.com>
2004-06-07 21:45:44 +00:00
phk
415c3feb5c Make linesw[] an array of pointers to linedesc instead of an array of
linedisc.
2004-06-07 20:45:45 +00:00
julian
00c753c659 Split kern_thread.c into 2 parts. kern_kse.c and kern_thread.c
Kern_kse has already been committed.
This separates out the KSE threading ABI from  generic thread support.
2004-06-07 19:00:57 +00:00
davidxu
2ca8138fdb According to SUSv3, sigwait is different with sigwaitinfo, sigwait
returns error code in return value, not in errno.
2004-06-07 13:35:02 +00:00
pjd
dca5c7ffaa Remove unused code.
Submitted by:	Bjoern A. Zeeb
2004-06-07 12:19:55 +00:00
ume
6f313bdf58 allow more than MLEN bytes for ancillary data to meet the
requirement of Section 20.1 of RFC3542.

Obtained from:	KAME
MFC after:	1 week
2004-06-07 09:59:50 +00:00
tjr
7914516cb4 Remove a stale and misleading comment. 2004-06-07 09:35:00 +00:00
julian
7835fe4007 Move the KSE ABI specific code here and separate it from code that
is generic to any threading system. This commit does not link this
file to the build yet, nor does it remove these functions from their
current location in kern_thread.c. (that commit coming up after further review)
2004-06-07 07:25:03 +00:00
phk
1d74a27efd Remove filename+line number from panic messages. 2004-06-06 21:26:49 +00:00
bde
f3c51ac0d4 Detect interrupt storms better. The storm detection didn't work at all
with an ASUS A7N8X-E motherboard in APIC mode, since storming interrupts
don't repeat immediately.  Use DELAY(1) to wait a bit for them to repeat.
This affects all systems.  Only delay for the first
(10 * intr_storm_threshold) interrupts (per interrupt handler) so that
this is only a pessimization while warming up.  Throttle after calling
the sub-handlers instead of before so that the long delay given by
throttling can be used instead of the DELAY(1) to detect storms after
warming up.

Reduced the throttling period from 1/10 second to 1/hz seconds so that
throttling doesn't destroy performance so much.  Interrupts that are
detected as storming are effectively handled by polling at a frequency
of hz Hz.  On A7N8X-E's there is another hardware or configuration bug
that makes the throttled frequency closer to 2*hz Hz.
2004-06-05 18:27:28 +00:00
mux
289f1bd7df When we don't have any meaningful value to print for the device sysctl
tree, output an empty string instead of "?".  This is already what
happened with DEVICE_SYSCTL_LOCATION and DEVICE_SYSCTL_PNPINFO.  This
makes the output of "sysctl dev" much nicer (it won't display those
empty sysctls).

Reviewed by:	des
2004-06-05 11:39:05 +00:00
tjr
5545221528 Change the types of vn_rdwr_inchunks()'s len and aresid arguments to
size_t and size_t *, respectively. Update callers for the new interface.
This is a better fix for overflows that occurred when dumping segments
larger than 2GB to core files.
2004-06-05 02:18:28 +00:00
tjr
30c66c4349 Back out workaround for vn_rdwr_inchunks()'s INT_MAX length limitation
after discussions with bde; vn_rdwr_inchunks() itself should be fixed.
2004-06-05 02:00:12 +00:00
phk
ab9c3b8324 Centralize the line discipline optimization determination in a function
called ttyldoptim().

Use this function from all the relevant drivers.

I belive no drivers finger linesw[] directly anymore, paving the way for
locking and refcounting.
2004-06-04 21:55:55 +00:00
phk
db2bdd44aa Manual edits to change linesw[]-frobbing to ttyld_*() calls. 2004-06-04 20:04:52 +00:00
phk
c2fc58d70f Machine generated patch which changes linedisc calls from accessing
linesw[] directly to using the ttyld...() functions

The ttyld...() functions ar inline so there is no performance hit.
2004-06-04 16:02:56 +00:00
tjr
27c6e572f9 Remove a stale comment. 2004-06-04 11:00:22 +00:00
des
d0d588d925 Add a devclass level to the dev sysctl tree, in order to support per-
class variables in addition to per-device variables.  In plain English,
this means that dev.foo0.bar is now called dev.foo.0.bar, and it is
possible to to have dev.foo.bar as well.
2004-06-04 10:23:00 +00:00
phk
e38a975fdb Get rid of ttyregister(). All drivers now use ttymalloc() for struct
tty, so now we stand a chance of implementing refcounting and getting
rid of the damn things again.
2004-06-04 07:17:03 +00:00
phk
51f05dbc85 Use ttymalloc() instead of ttyregister(). Use ttyioctl() instead of
direct calls to the linedisc.
2004-06-04 06:50:35 +00:00
tjr
94e7af9e6c Write segments to core dump files in maximally-sized chunks that neither
exceed vn_rdwr_inchunks()'s INT_MAX length limitation nor span a block
boundary. This fixes dumping segments larger than 2GB.

PR:	67546
2004-06-04 06:30:16 +00:00
rwatson
2763c817f5 Mark sun_noname as const since it's immutable. Update definitions
of functions that potentially accept &sun_noname (sbappendaddr(),
et al) to accept a const sockaddr pointer.
2004-06-04 04:07:08 +00:00
alc
7d5d99d158 Move the definitions of SWAPBLK_NONE and SWAPBLK_MASK from vm_page.h to
blist.h, enabling the removal of numerous #includes from subr_blist.c.
(subr_blist.c and swap_pager.c are the only users of these definitions.)
2004-06-04 04:03:26 +00:00