Commit Graph

169 Commits

Author SHA1 Message Date
rwatson
83e5d6bce1 Remove unused (and ifdef'd) unp_abort() and unp_drain().
MFC after:	1 month
2006-06-16 22:11:49 +00:00
maxim
0cf435f016 o There are two methods to get a process credentials over the unix
sockets:

1) A sender sends SCM_CREDS message to a reciever, struct cmsgcred;
2) A reciever sets LOCAL_CREDS socket option and gets sender
credentials in control message, struct sockcred.

Both methods use the same control message type SCM_CREDS with the
same control message level SOL_SOCKET, so they are indistinguishable
for the receiver.  A difference in struct cmsgcred and struct sockcred
layouts may lead to unwanted effects.

Now for sockets with LOCAL_CREDS option remove all previous linked
SCM_CREDS control messages and then add a control message with
struct sockcred so the process specifically asked for the peer
credentials by LOCAL_CREDS option always gets struct sockcred.

PR:		kern/90800
Submitted by:	Andrey Simonenko
Regres. tests:	tools/regression/sockets/unix_cmsg/
MFC after:	1 month
2006-06-13 14:33:35 +00:00
maxim
b583a2a914 Inherit LOCAL_CREDS option from listen socket for sockets returned
by accept(2).

PR:		kern/90644
Submitted by:	Andrey Simonenko
OK'ed by:	mdodd
Tested by:	NetBSD regress/sys/kern/unfdpass/unfdpass.c
MFC after:	1 month
2006-04-24 19:09:33 +00:00
ps
10b2fe8dea Allow for nmbclusters and maxsockets to be increased via sysctl.
An eventhandler is used to update all the various zones that depend
on these values.
2006-04-21 09:25:40 +00:00
rwatson
5479e5d692 Chance protocol switch method pru_detach() so that it returns void
rather than an error.  Detaches do not "fail", they other occur or
the protocol flags SS_PROTOREF to take ownership of the socket.

soclose() no longer looks at so_pcb to see if it's NULL, relying
entirely on the protocol to decide whether it's time to free the
socket or not using SS_PROTOREF.  so_pcb is now entirely owned and
managed by the protocol code.  Likewise, no longer test so_pcb in
other socket functions, such as soreceive(), which have no business
digging into protocol internals.

Protocol detach routines no longer try to free the socket on detach,
this is performed in the socket code if the protocol permits it.

In rts_detach(), no longer test for rp != NULL in detach, and
likewise in other protocols that don't permit a NULL so_pcb, reduce
the incidence of testing for it during detach.

netinet and netinet6 are not fully updated to this change, which
will be in an upcoming commit.  In their current state they may leak
memory or panic.

MFC after:	3 months
2006-04-01 15:42:02 +00:00
rwatson
8622e776f9 Change protocol switch pru_abort() API so that it returns void rather
than an int, as an error here is not meaningful.  Modify soabort() to
unconditionally free the socket on the return of pru_abort(), and
modify most protocols to no longer conditionally free the socket,
since the caller will do this.

This commit likely leaves parts of netinet and netinet6 in a situation
where they may panic or leak memory, as they have not are not fully
updated by this commit.  This will be corrected shortly in followup
commits to these components.

MFC after:      3 months
2006-04-01 15:15:05 +00:00
rwatson
f7a01805a2 Modify UNIX domain sockets to guarantee, and assume, that so_pcb is always
defined for an in-use socket.  This allows us to eliminate countless tests
of whether so_pcb is non-NULL, eliminating dozens of error cases.  For
now, retain the call to sotryfree() in the uipc_abort() path, but this
will eventually move to soabort().

These new assumptions should be largely correct, and will become more so
as the socket/pcb reference model is fixed.  Removing the notion that
so_pcb can be non-NULL is a critical step towards further fine-graining
of the UNIX domain socket locking, as the so_pcb reference no longer
needs to be protected using locks, instead it is a property of the socket
life cycle.
2006-03-17 13:52:57 +00:00
jeff
822d3c8355 - Lock access to vrele() with VFS_LOCK_GIANT() rather than mtx_lock(&Giant).
Sponsored by:	Isilon Systems, Inc.
2006-01-30 08:19:01 +00:00
rwatson
a59d345748 XXX a comment in uipc_usrreq.c that requires updating. 2006-01-13 00:00:32 +00:00
mux
b29e3549b8 Fix a bunch of SYSCTL_INT() that should have been SYSCTL_ULONG() to
match the type of the variable they are exporting.

Spotted by:	Thomas Hurst <tom@hur.st>
MFC after:	3 days
2005-12-14 22:27:48 +00:00
rwatson
9487c057e2 Correct a number of serious and closely related bugs in the UNIX domain
socket file descriptor garbage collection code, which is intended to
detect and clear cycles of orphaned file descriptors that are "in-flight"
in a socket when that socket is closed before they are received.  The
algorithm present was both run at poor times (resulting in recursion and
reentrance), and also buggy in the presence of parallelism.  In order to
fix these problems, make the following changes:

- When there are in-flight sockets and a UNIX domain socket is destroyed,
  asynchronously schedule the garbage collector, rather than running it
  synchronously in the current context.  This avoids lock order issues
  when the garbage collection code reenters the UNIX domain socket code,
  avoiding lock order reversals, deadlocks, etc.  Run the code
  asynchronously in a task queue.

- In the garbage collector, when skipping file descriptors that have
  entered a closing state (i.e., have f_count == 0), re-test the FDEFER
  flag, and decrement unp_defer.  As file descriptors can now transition
  to a closed state, while the garbage collector is running, it is no
  longer the case that unp_defer will remain an accurate count of
  deferred sockets in the mark portion of the GC algorithm.  Otherwise,
  the garbage collector will loop waiting waiting for unp_defer to reach
  zero, which it will never do as it is skipping file descriptors that
  were marked in an earlier pass, but now closed.

- Acquire the UNIX domain socket subsystem lock in unp_discard() when
  modifying the unp_rights counter, or a read/write race is risked with
  other threads also manipulating the counter.

While here:

- Remove #if 0'd code regarding acquiring the socket buffer sleep lock in
  the garbage collector, this is not required as we are able to use the
  socket buffer receive lock to protect scanning the receive buffer for
  in-flight file descriptors on the socket buffer.

- Annotate that the description of the garbage collector implementation
  is increasingly inaccurate and needs to be updated.

- Add counters of the number of deferred garbage collections and recycled
  file descriptors.  This will be removed and is here temporarily for
  debugging purposes.

With these changes in place, the unp_passfd regression test now appears
to be passed consistently on UP and SMP systems for extended runs,
whereas before it hung quickly or panicked, depending on which bug was
triggered.

Reported by:	Philip Kizer <pckizer at nostrum dot com>
MFC after:	2 weeks
2005-11-10 16:06:04 +00:00
rwatson
be4f357149 Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in
  memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
  as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
  memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
  attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion.  Similar changes are required for UMA zone names.
2005-10-31 15:41:29 +00:00
rwatson
49831ed8da Push the assignment of a new or updated so_qlimit from solisten()
following the protocol pru_listen() call to solisten_proto(), so
that it occurs under the socket lock acquisition that also sets
SO_ACCEPTCONN.  This requires passing the new backlog parameter
to the protocol, which also allows the protocol to be aware of
changes in queue limit should it wish to do something about the
new queue limit.  This continues a move towards the socket layer
acting as a library for the protocol.

Bump __FreeBSD_version due to a change in the in-kernel protocol
interface.  This change has been tested with IPv4 and UNIX domain
sockets, but not other protocols.
2005-10-30 19:44:40 +00:00
rwatson
32e95735e8 Canonicalize the UNIX domain socket copyright layout: original holders
before more recent holders.

MFC after:	3 days
2005-09-23 12:41:06 +00:00
cperciva
a199a4f74b Fix two issues which were missed in FreeBSD-SA-05:08.kmem.
Reported by:	Uwe Doering
2005-05-07 00:41:36 +00:00
mdodd
56c42039a5 Add missing break.
Found by:	marcus
2005-04-25 00:48:04 +00:00
mdodd
7826c585d5 Check sopt_level in uipc_ctloutput() and return early if it is non-zero.
This prevents unintended consequnces when an application calls things like
setsockopt(x, SOL_SOCKET, SO_REUSEADDR, ...) on a Unix domain socket.
2005-04-20 02:57:56 +00:00
mdodd
bdcac6ad82 Implement unix(4) socket options LOCAL_CREDS and LOCAL_CONNWAIT.
- Add unp_addsockcred() (for LOCAL_CREDS).
- Add an argument to unp_connect2() to differentiate between
  PRU_CONNECT and PRU_CONNECT2. (for LOCAL_CONNWAIT)

Obtained from:	 NetBSD (with some changes)
2005-04-13 00:01:46 +00:00
rwatson
26df80bf2c In the current world order, solisten() implements the state transition of
a socket from a regular socket to a listening socket able to accept new
connections.  As part of this state transition, solisten() calls into the
protocol to update protocol-layer state.  There were several bugs in this
implementation that could result in a race wherein a TCP SYN received
in the interval between the protocol state transition and the shortly
following socket layer transition would result in a panic in the TCP code,
as the socket would be in the TCPS_LISTEN state, but the socket would not
have the SO_ACCEPTCONN flag set.

This change does the following:

- Pushes the socket state transition from the socket layer solisten() to
  to socket "library" routines called from the protocol.  This permits
  the socket routines to be called while holding the protocol mutexes,
  preventing a race exposing the incomplete socket state transition to TCP
  after the TCP state transition has completed.  The check for a socket
  layer state transition is performed by solisten_proto_check(), and the
  actual transition is performed by solisten_proto().

- Holds the socket lock for the duration of the socket state test and set,
  and over the protocol layer state transition, which is now possible as
  the socket lock is acquired by the protocol layer, rather than vice
  versa.  This prevents additional state related races in the socket
  layer.

This permits the dual transition of socket layer and protocol layer state
to occur while holding locks for both layers, making the two changes
atomic with respect to one another.  Similar changes are likely require
elsewhere in the socket/protocol code.

Reported by:		Peter Holm <peter@holm.cc>
Review and fixes from:	emax, Antoine Brodin <antoine.brodin@laposte.net>
Philosophical head nod:	gnn
2005-02-21 21:58:17 +00:00
rwatson
42f4eba53e When aborting a UNIX domain socket bind() because VOP_CREATE() failed,
make sure to call vn_finished_write(mp) before returning.

MFC after:	3 days
2005-02-21 14:21:50 +00:00
rwatson
09cb31f04d style(9)-ize function headers, remove use of 'register'.
MFC after:	3 days
2005-02-20 23:22:13 +00:00
rwatson
bc84357313 In unp_attach(), allow uma_zalloc to zero the new unpcb rather than
explicitly using bzero().

Update copyright.

MFC after:	3 days
2005-02-20 20:05:11 +00:00
rwatson
aa96e34a9c Move assignment of UNIX domain socket pcb during unp_attach() outside
of the global UNIX domain socket mutex: no protection is needed that
early in the setup of the UNIX domain socket and socket structures.

MFC after:	3 days
2005-02-20 04:18:22 +00:00
imp
20280f1431 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 23:35:40 +00:00
rwatson
e1ce7eb9ce Remove temporary debugging printf that was used to detect the presence
of a race that had previously caused a panic in order to determine if
the fix was for the right problem.  It was.

MFC after:	2 weeks
2004-12-23 01:19:27 +00:00
alc
04b2362b0f Add send buffer locking to uipc_send(). Without this locking a race can
occur between a reader and a writer that results in a panic upon close,
e.g.,
	"panic: sbflush_locked: cc 4 || mb 0xffffff0052afa400 || mbcnt 0"

Reviewed by: rwatson@
MFC after: 2 weeks
2004-12-22 20:28:46 +00:00
phk
9b4cd725f1 "nfiles" is a bad name for a global variable. Call it "openfiles" instead
as this is more correct and matches the sysctl variable.
2004-12-01 09:22:26 +00:00
phk
027fce30f5 Initialize struct pr_userreqs in new/sparse style and fill in common
default elements in net_init_domain().

This makes it possible to grep these structures and see any bogosities.
2004-11-08 14:44:54 +00:00
rwatson
4b81ce6dd2 Push acquisition of the accept mutex out of sofree() into the caller
(sorele()/sotryfree()):

- This permits the caller to acquire the accept mutex before the socket
  mutex, avoiding sofree() having to drop the socket mutex and re-order,
  which could lead to races permitting more than one thread to enter
  sofree() after a socket is ready to be free'd.

- This also covers clearing of the so_pcb weak socket reference from
  the protocol to the socket, preventing races in clearing and
  evaluation of the reference such that sofree() might be called more
  than once on the same socket.

This appears to close a race I was able to easily trigger by repeatedly
opening and resetting TCP connections to a host, in which the
tcp_close() code called as a result of the RST raced with the close()
of the accepted socket in the user process resulting in simultaneous
attempts to de-allocate the same socket.  The new locking increases
the overhead for operations that may potentially free the socket, so we
will want to revise the synchronization strategy here as we normalize
the reference counting model for sockets.  The use of the accept mutex
in freeing of sockets that are not listen sockets is primarily
motivated by the potential need to remove the socket from the
incomplete connection queue on its parent (listen) socket, so cleaning
up the reference model here may allow us to substantially weaken the
synchronization requirements.

RELENG_5_3 candidate.

MFC after:	3 days
Reviewed by:	dwhite
Discussed with:	gnn, dwhite, green
Reported by:	Marc UBM Bocklet <ubm at u-boot-man dot de>
Reported by:	Vlad <marchenko at gmail dot com>
2004-10-18 22:19:43 +00:00
rwatson
4dfef2e0e5 Don't hold the UNIX domain socket subsystem lock over the body of the
UNIX domain socket garbage collection implementation, as that risks
holding the mutex over potentially sleeping operations (as well as
introducing some nasty lock order issues, etc).  unp_gc() will hold
the lock long enough to do necessary deferal checks and set that it's
running, but then release it until it needs to reset the gc state.

RELENG_5 candidate.

Discussed with:	alfred
2004-08-25 21:24:36 +00:00
rwatson
3d9f38d578 Add UNP_UNLOCK_ASSERT() to asser that the UNIX domain socket subsystem
lock is not held.

Rather than annotating that the lock is released after calls to
unp_detach() with a comment, annotate with an assertion.

Assert that the UNIX domain socket subsystem lock is not held when
unp_externalize() and unp_internalize() are called.
2004-08-19 01:45:16 +00:00
rwatson
7ac0169fa1 Always acquire the UNIX domain socket subsystem lock (UNP lock)
before dereferencing sotounpcb() and checking its value, as so_pcb
is protected by protocol locking, not subsystem locking.  This
prevents races during close() by one thread and use of ths socket
in another.

unp_bind() now assert the UNP lock, and uipc_bind() now acquires
the lock around calls to unp_bind().
2004-08-16 04:41:03 +00:00
rwatson
2cd0dbc8de Annotate the current UNIX domain socket locking strategies, order,
strengths, and weaknesses in a comment.  Assert a copyright over the
changes made as part of the locking work.
2004-08-16 01:52:04 +00:00
rwatson
136013f29f After completing a name lookup for a target UNIX domain socket to
connect to, re-check that the local UNIX domain socket hasn't been
closed while we slept, and if so, return EINVAL.  This affects the
system running both with and without Giant over the network stack,
and recent ULE changes appear to cause it to trigger more
frequently than previously under load.  While here, improve catching
of possibly closed UNIX domain sockets in one or two additional
circumstances.  I have a much larger set of related changes in
Perforce, but they require more testing before they can be merged.

One debugging printf is left in place to indicate when such a race
takes place: this is typically triggered by a buggy application
that simultaenously connect()'s and close()'s a UNIX domain socket
file descriptor.  I'll remove this at some point in the future, but
am interested in seeing how frequently this is reported.  In the
case of Martin's reported problem, it appears to be a result of a
non-thread safe syslog() implementation in the C library, which
does not synchronize access to its logging file descriptor.

Reported by:	mbr
2004-08-14 03:43:49 +00:00
rwatson
4c9acdbfaf In uipc_connect(), assert that the passed thread is curthread, and pass
td into unp_connect() instead of reading curthread.
2004-07-25 23:30:43 +00:00
rwatson
06c2659597 Drop Giant and acquire the UNIX domain socket subsystem lock a bit
earlier in unp_connect() so that vp->v_socket can't change between
our copying its value to a local variable and later use of that
variable.  This may have been responsible for a panic during
shutdown that I experienced where simultaneous closing of a listen
socket by rpcbind and a new connection being made to rpcbind by
mountd.
2004-07-18 01:29:43 +00:00
alfred
f05df8a881 We allocate an array of pointers to the global file table while
not holding the filelist_lock.  This means the filelist can change
size while allocating.  Detect this race and retry the allocation.
2004-07-02 07:40:10 +00:00
rwatson
b9d22ffbfa Acquire the socket buffer lock when calling unp_scan() on
so->so_rcv.sb_mb to prevent the mbuf chain from changing during the
scan.
2004-06-27 03:29:25 +00:00
rwatson
758f90deb8 Reduce the number of unnecessary unlock-relocks on socket buffer mutexes
associated with performing a wakeup on the socket buffer:

- When performing an sbappend*() followed by a so[rw]wakeup(), explicitly
  acquire the socket buffer lock and use the _locked() variants of both
  calls.  Note that the _locked() sowakeup() versions unlock the mutex on
  return.  This is done in uipc_send(), divert_packet(), mroute
  socket_send(), raw_append(), tcp_reass(), tcp_input(), and udp_append().

- When the socket buffer lock is dropped before a sowakeup(), remove the
  explicit unlock and use the _locked() sowakeup() variant.  This is done
  in soisdisconnecting(), soisdisconnected() when setting the can't send/
  receive flags and dropping data, and in uipc_rcvd() which adjusting
  back-pressure on the sockets.

For UNIX domain sockets running mpsafe with a contention-intensive SMP
mysql benchmark, this results in a 1.6% query rate improvement due to
reduce mutex costs.
2004-06-26 19:10:39 +00:00
rwatson
7203fb63d4 Release UNIX domain socket subsystem lock earlier -- don't need to
hold it over free of unp_addr if we've already removed all references
to unp.
2004-06-25 20:12:06 +00:00
rwatson
21164a78ac Merge next step in socket buffer locking:
- sowakeup() now asserts the socket buffer lock on entry.  Move
  the call to KNOTE higher in sowakeup() so that it is made with
  the socket buffer lock held for consistency with other calls.
  Release the socket buffer lock prior to calling into pgsigio(),
  so_upcall(), or aio_swake().  Locking for this event management
  will need revisiting in the future, but this model avoids lock
  order reversals when upcalls into other subsystems result in
  socket/socket buffer operations.  Assert that the socket buffer
  lock is not held at the end of the function.

- Wrapper macros for sowakeup(), sorwakeup() and sowwakeup(), now
  have _locked versions which assert the socket buffer lock on
  entry.  If a wakeup is required by sb_notify(), invoke
  sowakeup(); otherwise, unconditionally release the socket buffer
  lock.  This results in the socket buffer lock being released
  whether a wakeup is required or not.

- Break out socantsendmore() into socantsendmore_locked() that
  asserts the socket buffer lock.  socantsendmore()
  unconditionally locks the socket buffer before calling
  socantsendmore_locked().  Note that both functions return with
  the socket buffer unlocked as socantsendmore_locked() calls
  sowwakeup_locked() which has the same properties.  Assert that
  the socket buffer is unlocked on return.

- Break out socantrcvmore() into socantrcvmore_locked() that
  asserts the socket buffer lock.  socantrcvmore() unconditionally
  locks the socket buffer before calling socantrcvmore_locked().
  Note that both functions return with the socket buffer unlocked
  as socantrcvmore_locked() calls sorwakeup_locked() which has
  similar properties.  Assert that the socket buffer is unlocked
  on return.

- Break out sbrelease() into a sbrelease_locked() that asserts the
  socket buffer lock.  sbrelease() unconditionally locks the
  socket buffer before calling sbrelease_locked().
  sbrelease_locked() now invokes sbflush_locked() instead of
  sbflush().

- Assert the socket buffer lock in socket buffer sanity check
  functions sblastrecordchk(), sblastmbufchk().

- Assert the socket buffer lock in SBLINKRECORD().

- Break out various sbappend() functions into sbappend_locked()
  (and variations on that name) that assert the socket buffer
  lock.  The !_locked() variations unconditionally lock the socket
  buffer before calling their _locked counterparts.  Internally,
  make sure to call _locked() support routines, etc, if already
  holding the socket buffer lock.

- Break out sbinsertoob() into sbinsertoob_locked() that asserts
  the socket buffer lock.  sbinsertoob() unconditionally locks the
  socket buffer before calling sbinsertoob_locked().

- Break out sbflush() into sbflush_locked() that asserts the
  socket buffer lock.  sbflush() unconditionally locks the socket
  buffer before calling sbflush_locked().  Update panic strings
  for new function names.

- Break out sbdrop() into sbdrop_locked() that asserts the socket
  buffer lock.  sbdrop() unconditionally locks the socket buffer
  before calling sbdrop_locked().

- Break out sbdroprecord() into sbdroprecord_locked() that asserts
  the socket buffer lock.  sbdroprecord() unconditionally locks
  the socket buffer before calling sbdroprecord_locked().

- sofree() now calls socantsendmore_locked() and re-acquires the
  socket buffer lock on return.  It also now calls
  sbrelease_locked().

- sorflush() now calls socantrcvmore_locked() and re-acquires the
  socket buffer lock on return.  Clean up/mess up other behavior
  in sorflush() relating to the temporary stack copy of the socket
  buffer used with dom_dispose by more properly initializing the
  temporary copy, and selectively bzeroing/copying more carefully
  to prevent WITNESS from getting confused by improperly
  initialized mutexes.  Annotate why that's necessary, or at
  least, needed.

- soisconnected() now calls sbdrop_locked() before unlocking the
  socket buffer to avoid locking overhead.

Some parts of this change were:

Submitted by:	sam
Sponsored by:	FreeBSD Foundation
Obtained from:	BSD/OS
2004-06-21 00:20:43 +00:00
rwatson
ffabeb7229 In uipc_rcvd(), lock the socket buffers at either end of the UNIX
domain sokcet when updating fields at both ends.

Submitted by:	sam
Sponsored by:	FreeBSD Foundation
2004-06-20 21:43:13 +00:00
rwatson
8c7c75cc62 Hold SOCK_LOCK(so) when frobbing so_state when disconnecting a
connected UNIX domain datagram socket.
2004-06-20 21:29:56 +00:00
phk
40dd98a3bd Second half of the dev_t cleanup.
The big lines are:
	NODEV -> NULL
	NOUDEV -> NODEV
	udev_t -> dev_t
	udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.
2004-06-17 17:16:53 +00:00
rwatson
f2c0db1521 The socket field so_state is used to hold a variety of socket related
flags relating to several aspects of socket functionality.  This change
breaks out several bits relating to send and receive operation into a
new per-socket buffer field, sb_state, in order to facilitate locking.
This is required because, in order to provide more granular locking of
sockets, different state fields have different locking properties.  The
following fields are moved to sb_state:

  SS_CANTRCVMORE            (so_state)
  SS_CANTSENDMORE           (so_state)
  SS_RCVATMARK              (so_state)

Rename respectively to:

  SBS_CANTRCVMORE           (so_rcv.sb_state)
  SBS_CANTSENDMORE          (so_snd.sb_state)
  SBS_RCVATMARK             (so_rcv.sb_state)

This facilitates locking by isolating fields to be located with other
identically locked fields, and permits greater granularity in socket
locking by avoiding storing fields with different locking semantics in
the same short (avoiding locking conflicts).  In the future, we may
wish to coallesce sb_state and sb_flags; for the time being I leave
them separate and there is no additional memory overhead due to the
packing/alignment of shorts in the socket buffer structure.
2004-06-14 18:16:22 +00:00
rwatson
f1bc833e95 Socket MAC labels so_label and so_peerlabel are now protected by
SOCK_LOCK(so):

- Hold socket lock over calls to MAC entry points reading or
  manipulating socket labels.

- Assert socket lock in MAC entry point implementations.

- When externalizing the socket label, first make a thread-local
  copy while holding the socket lock, then release the socket lock
  to externalize to userspace.
2004-06-13 02:50:07 +00:00
rwatson
82295697cd Extend coverage of SOCK_LOCK(so) to include so_count, the socket
reference count:

- Assert SOCK_LOCK(so) macros that directly manipulate so_count:
  soref(), sorele().

- Assert SOCK_LOCK(so) in macros/functions that rely on the state of
  so_count: sofree(), sotryfree().

- Acquire SOCK_LOCK(so) before calling these functions or macros in
  various contexts in the stack, both at the socket and protocol
  layers.

- In some cases, perform soisdisconnected() before sotryfree(), as
  this could result in frobbing of a non-present socket if
  sotryfree() actually frees the socket.

- Note that sofree()/sotryfree() will release the socket lock even if
  they don't free the socket.

Submitted by:	sam
Sponsored by:	FreeBSD Foundation
Obtained from:	BSD/OS
2004-06-12 20:47:32 +00:00
rwatson
fcbd16cf61 Introduce a subsystem lock around UNIX domain sockets in order to protect
global and allocated variables.  This strategy is derived from work
originally developed by BSDi for BSD/OS, and applied to FreeBSD by Sam
Leffler:

- Add unp_mtx, a global mutex which will protect all UNIX domain socket
  related variables, structures, etc.

- Add UNP_LOCK(), UNP_UNLOCK(), UNP_LOCK_ASSERT() macros.

- Acquire unp_mtx on entering most UNIX domain socket code,
  drop/re-acquire around calls into VFS, and release it on return.

- Avoid performing sodupsockaddr() while holding the mutex, so in general
  move to allocating storage before acquiring the mutex to copy the data.

- Make a stack copy of the xucred rather than copying out while holding
  unp_mtx.  Copy the peer credential out after releasing the mutex.

- Add additional assertions of vnode locks following VOP_CREATE().

A few notes:

- Use of an sx lock for the file list mutex may cause problems with regard
  to unp_mtx when garbage collection passed file descriptors.

- The locking in unp_pcblist() for sysctl monitoring is correct subject to
  the unpcb zone not returning memory for reuse by other subsystems
  (consistent with similar existing concerns).

- Sam's version of this change, as with the BSD/OS version, made use of
  both a global lock and per-unpcb locks.  However, in practice, the
  global lock covered all accesses, so I have simplified out the unpcb
  locks in the interest of getting this merged faster (reducing the
  overhead but not sacrificing granularity in most cases).  We will want
  to explore possibilities for improving lock granularity in this code in
  the future.

Submitted by:	sam
Sponsored by:	FreeBSD Foundatiuon
Obtained from:	BSD/OS 5 snapshot provided by BSDi
2004-06-10 21:34:38 +00:00
rwatson
87449e4f90 Mark sun_noname as const since it's immutable. Update definitions
of functions that potentially accept &sun_noname (sbappendaddr(),
et al) to accept a const sockaddr pointer.
2004-06-04 04:07:08 +00:00
imp
74cf37bd00 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core
2004-04-05 21:03:37 +00:00