121 Commits

Author SHA1 Message Date
rwatson
bc5e7eb1f3 Introduce no-op nosup fifo kqueue filter and detach routine, which are
used when a read filter is requested on a write-only fifo descriptor, or
a write filter is requested on a read-only fifo descriptor.  This
permits the filters to be registered, but never raises the event, which
causes kqueue behavior for fifos to more closely match similar semantics
for poll and select, which permit testing for the condition even though
the condition will never be raised, and is consistent with POSIX's notion
that a fifo has identical semantics to a one-way IPC channel created
using pipe() on most operating systems.

The fifo regression test suite can now run to completion on HEAD without
errors.

MFC after:	3 days
2005-09-12 19:59:12 +00:00
rwatson
491de3e2d2 When a request is made to register a filter on a fifo that doesn't
apply to the fifo (i.e., not EVFILT_READ or EVFILT_WRITE), reject
it as EINVAL, not by returning 1 (EPERM).

MFC after:	3 days
2005-09-12 18:07:49 +00:00
rwatson
6b308e01a1 Remove DFLAG_SEEKABLE from fifo file descriptors: fifos are not seekable
according to POSIX, not to mention the fact that it doesn't make sense
(and hence isn't really implemented).  This causes the fifo_misc
regression test to succeed.
2005-09-12 12:15:12 +00:00
rwatson
1481446aae Only poll the fifo for read events if the fifo is attached to a readable
file descriptor.  Otherwise, the read end of a fifo might return that it
is writable (which it isn't).

Only poll the fifo for write events if the fifo attached to a writable
file descriptor.  Otherwise, the write end of a fifo might return that
it is readable (which it isn't).

In the event that a file is FREAD|FWRITE (which is allowed by POSIX, but
has undefined behavior), we poll for both.

MFC after:	3 days
2005-09-12 10:16:18 +00:00
rwatson
919d519cbb After going to some trouble to identify only the write-related events
to poll the write socket for, the fifo polling code proceeded to poll
for the complete set of events.  Use 'levents' instead of 'events' as
the argument to poll, and only poll the write socket if there is
interest in write events.

MFC after:	3 days
2005-09-12 10:13:15 +00:00
rwatson
5be69d1d56 When a writer opens a fifo, wake up the read socket for read, not the
write socket.

MFC after:	3 days
2005-09-12 10:07:21 +00:00
rwatson
c9f007b159 Add an assertion that fifo_open() doesn't race against other threads
while sleeping to allocate fifo state: due to using the vnode lock to
serialize access to a fifo during open, it shouldn't happen (tm).

MFC after:	3 days
2005-09-12 10:06:38 +00:00
rwatson
a079789c36 Rather than reaching into the internals of the UNIX domain socket code
by calling uipc_connect2() to connect two socket endpoints to create a
fifo, call soconnect2().

MFC after:	3 days
2005-09-12 10:05:08 +00:00
jeff
0d9df2e12d - The VI_DOOMED flag now signals the end of a vnode's relationship with
the filesystem.  Check that rather than VI_XLOCK.
 - VOP_INACTIVE should no longer drop the vnode lock.
 - The vnode lock is required around calls to vrecycle() and vgone().

Sponsored by:	Isilon Systems, Inc.
2005-03-13 12:18:25 +00:00
phk
7617afd15d Whitespace in vop_vector{} initializations. 2005-01-13 18:59:48 +00:00
imp
fb0a393e99 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 18:10:42 +00:00
phk
7bb3e163dd Don't forget to bypass vnodes in corner cases.
Found by:	kkenn and ports/shell/zsh
Thanks to:	jeffr
2004-12-13 10:07:57 +00:00
phk
eda2bd825d Explicitly panic vop_read/vop_write on fifos. 2004-12-13 07:07:50 +00:00
phk
59f305606c Back when VOP_* was introduced, we did not have new-style struct
initializations but we did have lofty goals and big ideals.

Adjust to more contemporary circumstances and gain type checking.

	Replace the entire vop_t frobbing thing with properly typed
	structures.  The only casualty is that we can not add a new
	VOP_ method with a loadable module.  History has not given
	us reason to belive this would ever be feasible in the the
	first place.

	Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.

	Give coda correct prototypes and function definitions for
	all vop_()s.

	Generate a bit more data from the vnode_if.src file:  a
	struct vop_vector and protype typedefs for all vop methods.

	Add a new vop_bypass() and make vop_default be a pointer
	to another struct vop_vector.

	Remove a lot of vfs_init since vop_vector is ready to use
	from the compiler.

	Cast various vop_mumble() to void * with uppercase name,
	for instance VOP_PANIC, VOP_NULL etc.

	Implement VCALL() by making vdesc_offset the offsetof() the
	relevant function pointer in vop_vector.  This is disgusting
	but since the code is generated by a script comparatively
	safe.  The alternative for nullfs etc. would be much worse.

	Fix up all vnode method vectors to remove casts so they
	become typesafe.  (The bulk of this is generated by scripts)
2004-12-01 23:16:38 +00:00
phk
05b9cb2a46 Mechanically change prototypes for vnode operations to use the new typedefs. 2004-12-01 12:24:41 +00:00
phk
a1fa5f3147 Add dropped implementation of ioctl for fifos. 2004-11-18 17:18:11 +00:00
phk
a94fe69bb7 Make vnode bypass for fifos (read, write, poll) mandatory. 2004-11-17 07:30:02 +00:00
phk
0e1bc6bd7d Add file ops to fifofs so that we can bypass vnodes (and Giant) for the
heavy-duty operations (read, write, poll/select, kqueue).

Disabled for now, enable with "vfs.fifofs.fops=1" in loader.conf.
2004-11-15 14:51:44 +00:00
phk
7204610fc8 fifos doesn't need a vop_lookup, the default will do fine. 2004-11-13 18:51:13 +00:00
phk
723cc1105c Properly implement a default version of VOP_GETWRITEMOUNT.
Remove improper access to vop_stdgetwritemount() which should and
will instead rely on the VOP default path.
2004-11-06 11:41:22 +00:00
jmg
bc1805c6e8 Add locking to the kqueue subsystem. This also makes the kqueue subsystem
a more complete subsystem, and removes the knowlege of how things are
implemented from the drivers.  Include locking around filter ops, so a
module like aio will know when not to be unloaded if there are outstanding
knotes using it's filter ops.

Currently, it uses the MTX_DUPOK even though it is not always safe to
aquire duplicate locks.  Witness currently doesn't support the ability
to discover if a dup lock is ok (in some cases).

Reviewed by:	green, rwatson (both earlier versions)
2004-08-15 06:24:42 +00:00
rwatson
083bcb28d6 Remove unlocked read annotation for sbspace(); the read is locked. 2004-06-23 00:35:50 +00:00
rwatson
d87fad9f08 Merge some additional leaf node socket buffer locking from
rwatson_netperf:

Introduce conditional locking of the socket buffer in fifofs kqueue
filters; KNOTE() will be called holding the socket buffer locks in
fifofs, but sometimes the kqueue() system call will poll using the
same entry point without holding the socket buffer lock.

Introduce conditional locking of the socket buffer in the socket
kqueue filters; KNOTE() will be called holding the socket buffer
locks in the socket code, but sometimes the kqueue() system call
will poll using the same entry points without holding the socket
buffer lock.

Simplify the logic in sodisconnect() since we no longer need spls.

NOTE: To remove conditional locking in the kqueue filters, it would
make sense to use a separate kqueue API entry into the socket/fifo
code when calling from the kqueue() system call.
2004-06-18 02:57:55 +00:00
rwatson
855c4bb01f Merge additional socket buffer locking from rwatson_netperf:
- Lock down low hanging fruit use of sb_flags with socket buffer
  lock.

- Lock down low hanging fruit use of so_state with socket lock.

- Lock down low hanging fruit use of so_options.

- Lock down low-hanging fruit use of sb_lowwat and sb_hiwat with
  socket buffer lock.

- Annotate situations in which we unlock the socket lock and then
  grab the receive socket buffer lock, which are currently actually
  the same lock.  Depending on how we want to play our cards, we
  may want to coallesce these lock uses to reduce overhead.

- Convert a if()->panic() into a KASSERT relating to so_state in
  soaccept().

- Remove a number of splnet()/splx() references.

More complex merging of socket and socket buffer locking to
follow.
2004-06-17 22:48:11 +00:00
rwatson
029226f3a8 Grab the socket buffer send or receive mutex when performing a
read-modify-write on the sb_state field.  This commit catches only
the "easy" ones where it doesn't interact with as yet unmerged
locking.
2004-06-15 03:51:44 +00:00
rwatson
f2c0db1521 The socket field so_state is used to hold a variety of socket related
flags relating to several aspects of socket functionality.  This change
breaks out several bits relating to send and receive operation into a
new per-socket buffer field, sb_state, in order to facilitate locking.
This is required because, in order to provide more granular locking of
sockets, different state fields have different locking properties.  The
following fields are moved to sb_state:

  SS_CANTRCVMORE            (so_state)
  SS_CANTSENDMORE           (so_state)
  SS_RCVATMARK              (so_state)

Rename respectively to:

  SBS_CANTRCVMORE           (so_rcv.sb_state)
  SBS_CANTSENDMORE          (so_snd.sb_state)
  SBS_RCVATMARK             (so_rcv.sb_state)

This facilitates locking by isolating fields to be located with other
identically locked fields, and permits greater granularity in socket
locking by avoiding storing fields with different locking semantics in
the same short (avoiding locking conflicts).  In the future, we may
wish to coallesce sb_state and sb_flags; for the time being I leave
them separate and there is no additional memory overhead due to the
packing/alignment of shorts in the socket buffer structure.
2004-06-14 18:16:22 +00:00
truckman
d503c79cad Add MSG_NBIO flag option to soreceive() and sosend() that causes
them to behave the same as if the SS_NBIO socket flag had been set
for this call.  The SS_NBIO flag for ordinary sockets is set by
fcntl(fd, F_SETFL, O_NONBLOCK).

Pass the MSG_NBIO flag to the soreceive() and sosend() calls in
fifo_read() and fifo_write() instead of frobbing the SS_NBIO flag
on the underlying socket for each I/O operation.  The O_NONBLOCK
flag is a property of the descriptor, and unlike ordinary sockets,
fifos may be referenced by multiple descriptors.
2004-06-01 01:18:51 +00:00
truckman
6174e9d812 Switch from using the vnode interlock to a private mutex in fifo_open()
to avoid lock order problems when manipulating the sockets associated
with the fifo.

Minor optimization of a couple of calls to fifo_cleanup() from
fifo_open().
2004-05-17 20:16:40 +00:00
imp
b49b7fe799 Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
rwatson
8eeaad5c11 Export uipc_connect2() from uipc_usrreq.c instead of unp_connect2(),
and consume that interface in portalfs and fifofs instead.  In the
new world order, unp_connect2() assumes that the unpcb mutex is
held, whereas uipc_connect2() validates that the passed sockets are
UNIX domain sockets, then grabs the mutex.

NB: the portalfs and fifofs code gets down and dirty with UNIX domain
sockets.  Maybe this is a bad thing.
2004-03-31 01:41:30 +00:00
truckman
24b3a6d135 Use "fip->fi_readers == 0 && fip->fi_writers == 0" as the condition for
disposing fifo resources in fifo_cleanup() instead using of
"vp->v_usecount == 1".  There may be other references to the vnode, for
instance by nullfs, at the time fifo_open() or fifo_close() is called,
which could cause a resource leak.

Don't bother grabbing the vnode interlock in fifo_cleanup() since it no
longer accesses v_usecount.
2003-11-16 01:11:11 +00:00
truckman
2305818240 If fifo_open() is interrupted, fifo_close() may not get called, causing
a resource leak.  Move the resource deallocation code from fifo_close()
to a new function, fifo_cleanup(), and call fifo_cleanup() from
fifo_close() and the appropriate places in fifo_open().

Tested by: 	Lukas Ertl
Pointy hat to:	truckman
2003-11-10 22:21:00 +00:00
truckman
78ee1563af Partially back out rev 1.87 by nuking fifo_inactive() and moving the
resource deallocation back to fifo_close().  This eliminates any
stale data that might be stuck in the socket buffers after all the
readers and writers have closed the fifo.

Tested by: Thorsten Schroeder <ths@katjusha.de>
2003-06-16 17:17:09 +00:00
truckman
6f638a7438 Clean up the fifo_open() implementation:
Restructure the error handling portion of the resource allocation
        code to eliminate duplicated code.

        Test for the O_NONBLOCK && fi_readers == 0 case before incrementing
        fi_writers and modifying the the socket flag to avoid having to
        undo these operations in this error case.

        Restructure and simplify the code that handles blocking opens.

There should be no change to functionality.
2003-06-13 06:58:11 +00:00
truckman
0c845cdfd3 Fix up locking problems in fifo_open() and fifo_close():
Sleep on the vnode interlock while waiting for another
	caller to increment fi_readers or fi_writers.  Hold the
	vnode interlock while incrementing fi_readers or fi_writers
	to prevent a wakeup from being missed.

	Only access fi_readers and fi_writers while holding the vnode
	lock.  Previously fifo_close() decremented their values without
	holding a lock.

	Move resource deallocation from fifo_close() to fifo_inactive(),
	which allows the VOP_CLOSE() call in the error return path in
	fifo_open() to be removed.  Fifo_open() was calling VOP_CLOSE()
	with the vnode lock held, in violation the current vnode locking
	API.  Also the way fifo_close() used vrefcnt() to decide whether
	to deallocate resources was bogus according to comments in the
	vrefcnt() implementation.

Reviewed by:	bde
2003-06-01 06:24:32 +00:00
phk
ed8b540a0c Remove unused variable.
Found by:       FlexeLint
2003-05-31 18:45:32 +00:00
bde
3933bc0c41 Better fix for the problem addressed by rev.1.79: don't loop in
fifo_open() waiting for another reader or writer if one arrived and
departed while we were waiting (or a little earlier).

Rev.1.79 broke blocking opens of fifos by making them time out after 1
second.  This was bad for at least apsfilter.

Tested by:	"Simon 'corecode' Schubert" <corecode@corecode.ath.cx>,
		Alexander Leidinger <Alexander@leidinger.net>,
		phk
MFC after:	4 weeks
2003-03-24 11:03:42 +00:00
des
2756b6c964 More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9). 2003-03-02 16:54:40 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
dillon
ccd5574cc6 Bow to the whining masses and change a union back into void *. Retain
removal of unnecessary casts and throw in some minor cleanups to see if
anyone complains, just for the hell of it.
2003-01-13 00:33:17 +00:00
dillon
ddf9ef103e Change struct file f_data to un_data, a union of the correct struct
pointer types, and remove a huge number of casts from code using it.

Change struct xfile xf_data to xun_data (ABI is still compatible).

If we need to add a #define for f_data and xf_data we can, but I don't
think it will be necessary.  There are no operational changes in this
commit.
2003-01-12 01:37:13 +00:00
phk
816919ad39 There is some sort of race/deadlock which I have not identified
here.  It manifests itself by sendmail hanging in "fifoow" during
boot on a diskless machine with sendmail disabled.

Giving the sleep a 1sec timout breaks the deadlock, but does not solve
the underlying problem.

XXX comment applied.
2002-12-29 10:32:16 +00:00
phk
f01369965f Fix comments and one resulting code confusion about the type of the
"command" argument to VOP_IOCTL.

Spotted by:	FlexeLint.
2002-10-16 08:04:11 +00:00
phk
1dfc2c167f Be consistent about "static" functions: if the function is marked
static in its prototype, mark it static at the definition too.

Inspired by:    FlexeLint warning #512
2002-09-28 17:15:38 +00:00
jeff
f7e588347b - Use vrefcnt() where it is safe to do so instead of doing direct and
unlocked accesses to v_usecount.
 - Lock access to the buf lists in the various sync routines.  interlock
   locking could be avoided almost entirely in leaf filesystems if the
   fsync function had a generic helper.
2002-09-25 02:32:42 +00:00
njl
00c79f5c92 Remove any VOP_PRINT that redundantly prints the tag.
Move lockmgr_printinfo() into vprint() for everyone's benefit.

Suggested by: bde
2002-09-18 20:42:04 +00:00
njl
0590c43070 Remove all use of vnode->v_tag, replacing with appropriate substitutes.
v_tag is now const char * and should only be used for debugging.

Additionally:
1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK
2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which
is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP.

Suggested by:   phk
Reviewed by:    bde, rwatson (earlier version)
2002-09-14 09:02:28 +00:00
rwatson
6cbbcf7f28 Handle one more case of a fifofs filetmp: set filetmp.f_cred to
ap->a_cred, and pass in ap->a_td->td_ucred as the active_cred to
soo_poll().

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-08-20 02:17:59 +00:00
rwatson
3246fbf45f In continuation of early fileop credential changes, modify fo_ioctl() to
accept an 'active_cred' argument reflecting the credential of the thread
initiating the ioctl operation.

- Change fo_ioctl() to accept active_cred; change consumers of the
  fo_ioctl() interface to generally pass active_cred from td->td_ucred.
- In fifofs, initialize filetmp.f_cred to ap->a_cred so that the
  invocations of soo_ioctl() are provided access to the calling f_cred.
  Pass ap->a_td->td_ucred as the active_cred, but note that this is
  required because we don't yet distinguish file_cred and active_cred
  in invoking VOP's.
- Update kqueue_ioctl() for its new argument.
- Update pipe_ioctl() for its new argument, pass active_cred rather
  than td_ucred to MAC for authorization.
- Update soo_ioctl() for its new argument.
- Update vn_ioctl() for its new argument, use active_cred rather than
  td->td_ucred to authorize VOP_IOCTL() and the associated VOP_GETATTR().

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-08-17 02:36:16 +00:00