Commit Graph

9461 Commits

Author SHA1 Message Date
alc
b98eae58a6 Introduce a field to struct vm_page for storing flags that are
synchronized by the lock on the object containing the page.

Transition PG_WANTED and PG_SWAPINPROG to use the new field,
eliminating the need for holding the page queues lock when setting
or clearing these flags.  Rename PG_WANTED and PG_SWAPINPROG to
VPO_WANTED and VPO_SWAPINPROG, respectively.

Eliminate the assertion that the page queues lock is held in
vm_page_io_finish().

Eliminate the acquisition and release of the page queues lock
around calls to vm_page_io_finish() in kern_sendfile() and
vfs_unbusy_pages().
2006-08-09 17:43:27 +00:00
pjd
7f9e892ea9 Add a bandaid to avoid a deadlock in a situation, when we are trying to suspend
a file system, but need to obtain a vnode. We may not be able to do it, because
all vnodes could be already in use and other processes cannot release them,
because they are waiting in "suspfs" state.

In such situation, we allow to allocate a vnode anyway.

This is a temporary fix - there is no backpressure to free vnodes allocated in
those circumstances.

MFC after:	1 week
Reviewed by:	tegge
2006-08-09 12:47:30 +00:00
alc
45c95873a5 Reduce the scope of the page queues lock in vfs_busy_pages() now that
vm_page_sleep_if_busy() no longer requires the caller to hold the page
queues lock.
2006-08-08 06:00:49 +00:00
rwatson
3bd21f489d Move definition of UNIX domain socket protosw and domain entries from
uipc_proto.c to uipc_usrreq.c, making localdomain static.  Remove
uipc_proto.c as it's no longer used.  With this change, UNIX domain
sockets are entirely encapsulated in uipc_usrreq.c.
2006-08-07 12:02:43 +00:00
rwatson
9119bbc087 Improve commenting of vaccess(), making sure to be clear that the ifdef
capabilities code is there for reference and never actually used.  Slight
style tweak.
2006-08-06 10:43:35 +00:00
rwatson
295b39f6ec Don't set pru_sosend, pru_soreceive, pru_sopoll to default values, as they
are already set to default values.
2006-08-06 10:39:21 +00:00
alc
67d9b76d0e Reduce the scope of the page queues lock in kern_sendfile() now that
vm_page_sleep_if_busy() no longer requires the caller to hold the page
queues lock.
2006-08-06 01:00:09 +00:00
rwatson
3a02c014e3 Remove register, use ANSI function headers. 2006-08-05 21:40:59 +00:00
rwatson
a6a5fadace We now spell "inode" as "vnode" in the VFS layer, so update comment
for new world order.

MFC after:	3 days
Pointed out by:	mckusick
2006-08-05 21:08:47 +00:00
jb
3aec549f3a Add support for the generated file systrace_args.c. 2006-08-05 19:25:14 +00:00
yar
209e4786e7 Commit the results of the typo hunt by Darren Pilgrim.
This change affects documentation and comments only,
no real code involved.

PR:		misc/101245
Submitted by:	Darren Pilgrim <darren pilgrim bitfreak org>
Tested by:	md5(1)
MFC after:	1 week
2006-08-04 07:56:35 +00:00
alc
fe447f8ea1 The page queues lock is no longer required by vm_page_io_start(). Reduce
the scope of the page queues lock in kern_sendfile() accordingly.
2006-08-04 05:53:20 +00:00
jb
3cb851c2d1 Report the correct function name in a DPRINTF. 2006-08-03 21:19:13 +00:00
jb
3f68e6bfc3 Regen.
Note the addition of the extra file now generated.
2006-08-03 05:32:43 +00:00
jb
3f04b04c3f Generate another file called systrace_args.c. This will be compiled
into systrace and is used to map the syscall arguments into the 64-bit
parameter array.
2006-08-03 05:29:09 +00:00
rwatson
774fbf7736 Move destroying kqueue state from above pru_detach to below it in
sofree(), as a number of protocols expect to be able to call
soisdisconnected() during detach.  That may not be a good assumption,
but until I'm sure if it's a good assumption or not, allow it.
2006-08-02 18:37:44 +00:00
rwatson
d53af1ffe2 Change two XXX's to two notes: the fact that SOCK_LOCK(so) ==
SOCKBUF_LOCK(&so->so_rcv) is encoded, which is worth noting, but not a
bug.
2006-08-02 16:23:52 +00:00
jhb
70c20770c0 Fix some bugs in the previous revision (1.419). Don't perform extra
vfs_rel() on the mountpoint if the MAC checks fail in kern_statfs() and
kern_fstatfs().  Similarly, don't perform an extra vfs_rel() if we get
a doomed vnode in kern_fstatfs(), and handle the case of mp being NULL
(for some doomed vnodes) by conditionalizing the vfs_rel() in
kern_fstatfs() on mp != NULL.

CID:		1517
Found by:	Coverity Prevent (tm) (kern_fstatfs())
Pointy hat to:	jhb
2006-08-02 15:27:48 +00:00
rwatson
fb3cc7e20d Remove now unneeded ENOTCONN clause from SOCK_DGRAM side of uipc_send():
we have to check it regardless of the target address, so don't check it
twice.
2006-08-02 14:30:58 +00:00
rwatson
8db4b3b586 Remove 'register'.
Use ANSI C prototypes/function headers.
More deterministically line wrap comments.
2006-08-02 13:01:58 +00:00
davidxu
5ffd62af88 Don't include sys/thr.h and umtx.h in sys/sysproto.h, it is unnecessary. 2006-08-02 08:09:24 +00:00
davidxu
a956bbf6a4 INT_MAX is defined in file sys/limits.h, include the file now. 2006-08-02 07:34:51 +00:00
rwatson
641d5a5e44 Move updated of 'numopensockets' from bottom of sodealloc() to the top,
eliminating a second set of identical mutex operations at the bottom.
This allows brief exceeding of the max sockets limit, but only by
sockets in the last stages of being torn down.
2006-08-02 00:45:27 +00:00
jhb
023ecadf48 Make system call modules a bit more robust:
- If we fail to register the system call during MOD_LOAD, then note that
  so that we don't try to deregister it or invoke the chained event handler
  during the subsequent MOD_UNLOAD event.  Doing the deregister when the
  register failed could result in trashing system call entries.
- Add a SI_SUB_SYSCALLS just before starting up init and use that to
  register syscall modules instead of SI_SUB_DRIVERS.  Registering system
  calls as late as possible increases the chances that any other module
  event handlers or SYSINITs in a module are executed to initialize the
  data in a kld before a syscall dependent on that data is able to be
  invoked.

MFC after:	3 days
2006-08-01 16:32:20 +00:00
jhb
9cbcfd1fff Don't lock each of the processes while looking for a pid. The allproc and
proctree locks that we already hold provide sufficient protection.
2006-08-01 15:30:56 +00:00
rwatson
cedd19512d Reimplement socket buffer tear-down in sofree(): as the socket is no
longer referenced by other threads (hence our freeing it), we don't need
to set the can't send and can't receive flags, wake up the consumers,
perform two levels of locking, etc.  Implement a fast-path teardown,
sbdestroy(), which flushes and releases each socket buffer.  A manual
dom_dispose of the receive buffer is still required explicitly to GC
any in-flight file descriptors, etc, before flushing the buffer.

This results in a 9% UP performance improvement and 16% SMP performance
improvement on a tight loop of socket();close(); in micro-benchmarking,
but will likely also affect CPU-bound macro-benchmark performance.
2006-08-01 10:30:26 +00:00
rwatson
66e61accdf Close a race that occurs when using sendto() to connect and send on a
UNIX domain socket at the same time as the remote host is closing the
new connections as quickly as they open.  Since the connect() and
send() paths are non-atomic with respect to another, it is possible
for the second thread's close() call to disconnect the two sockets
as connect() returns, leading to the consumer (which plans to send())
with a NULL kernel pointer to its proposed peer.  As a result, after
acquiring the UNIX domain socket subsystem lock, we need to revalidate
the connection pointers even though connect() has technically succeed,
and reurn an error to say that there's no connection on which to
perform the send.

We might want to rethink the specific errno number, perhaps ECONNRESET
would be better.

PR:		100940
Reported by:	Young Hyun <youngh at caida dot org>
MFC after:	2 weeks
MFC note:	Some adaptation will be required
2006-07-31 23:00:05 +00:00
jhb
5f5d488d28 Trim an obsolete comment. ktrgenio() stopped doing crazy gymnastics when
ktrace was redone to be mostly synchronous again.
2006-07-31 15:31:43 +00:00
jhb
dee1b3da95 Regen for MPSAFE flag removal. 2006-07-28 19:08:37 +00:00
jhb
c62c38439f Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to
mark system calls as being MPSAFE:
- Stop conditionally acquiring Giant around system call invocations.
- Remove all of the 'M' prefixes from the master system call files.
- Remove support for the 'M' prefix from the script that generates the
  syscall-related files from the master system call files.
- Don't explicitly set SYF_MPSAFE when registering nfssvc.
2006-07-28 19:05:28 +00:00
jhb
6a211b6d81 Various fixes to comments in the syscall master files including removing
cruft from the audit import and adding mention of COMPAT4 to freebsd32.
2006-07-28 18:55:18 +00:00
jhb
e1d747891e Adjust td_locks for non-spin mutexes, rwlocks, and sx locks so that it is
a count of all non-spin locks, not just lockmgr locks.  This can give us a
much cheaper way to see if we have any locks held (such as when returning
to userland via userret()) without requiring WITNESS.

MFC after:	1 week
2006-07-27 21:45:55 +00:00
jhb
15daf52a2d Hold the reference on the mountpoint slightly longer in kern_statfs() and
kern_fstatfs() so that it is still held when prison_enforce_statfs() is
called (since that function likes to poke and prod at the mountpoint
structure).

MFC after:	3 days
2006-07-27 20:00:27 +00:00
jhb
9b98bcfb28 Write a magic value into mtx_lock when destroying a mutex that will force
all other mtx_lock() operations to block.  Previously, when the mutex was
destroyed, it would still have a valid value in mtx_lock(): either the
unowned cookie, which would allow a subsequent mtx_lock() to succeed, or a
pointer to the thread who destroyed the mutex if the mutex was locked when
it was destroyed.

MFC after:	3 days
2006-07-27 19:58:18 +00:00
jhb
6b46a69f12 Fix a file descriptor race I reintroduced when I split accept1() up into
kern_accept() and accept1().  If another thread closed the new file
descriptor and the first thread later got an error trying to copyout the
socket address, then it would attempt to close the wrong file object.  To
fix, add a struct file ** argument to kern_accept().  If it is non-NULL,
then on success kern_accept() will store a pointer to the new file object
there and not release any of the references.  It is up to the calling code
to drop the references appropriately (including a call to fdclose() in case
of error to safely handle the aforementioned race).  While I'm at it, go
ahead and fix the svr4 streams code to not leak the accept fd if it gets an
error trying to copyout the streams structures.
2006-07-27 19:54:41 +00:00
rwatson
2268f15916 Remove call to soisdisconnected() in uipc_detach(), since it will already
have been invoked by uipc_close() or uipc_abort(), and the socket is in a
state of being torn down by the time we get to this point, so kqueue
state frobbed by soisdisconnected() is not available, so frobbing it will
result in a panic.

Reported by:	Munehiro Matsuda <haro at h4 dot dion dot ne dot jp>
2006-07-26 19:16:34 +00:00
rwatson
c5a16c08ba Remove non-socket buffer routines from uipc_sockbuf.c, and socket buffer
specific routines from uipc_socket2.c following repo-copy.  We might
rethink the location of one or two at some point, but the division was
relatively clean.  uipc_sockbuf.c is now the home of routines that
manipulate socket buffers.
2006-07-24 16:21:31 +00:00
rwatson
40868fda8a soreceive_generic(), and sopoll_generic(). Add new functions sosend(),
soreceive(), and sopoll(), which are wrappers for pru_sosend,
pru_soreceive, and pru_sopoll, and are now used univerally by socket
consumers rather than either directly invoking the old so*() functions
or directly invoking the protocol switch method (about an even split
prior to this commit).

This completes an architectural change that was begun in 1996 to permit
protocols to provide substitute implementations, as now used by UDP.
Consumers now uniformly invoke sosend(), soreceive(), and sopoll() to
perform these operations on sockets -- in particular, distributed file
systems and socket system calls.

Architectural head nod:	sam, gnn, wollman
2006-07-24 15:20:08 +00:00
rwatson
1aefa1572f Remove duplicate 'or'.
Submitted by:	ru
2006-07-23 21:01:09 +00:00
rwatson
af93451039 Update various uipc_socket.c comments, and reformat others. 2006-07-23 20:36:04 +00:00
rwatson
407a79e120 Add additional comments to the top of the UNIX domain socket implementation
providing some high level pointers regarding the implementation.
2006-07-23 20:06:45 +00:00
rwatson
711726cd79 Remove old kern.malloc sysctl, which generated a text representation of
the kernel malloc(9) state for vmstat -m.  libmemstat is now used to
generate a machine-readable version which is converged by vmstat -m
into a human-readable version.

Not for MFC.
2006-07-23 19:55:41 +00:00
rwatson
91cb1c84be Expand comments for malloc(9) to better describe the design and
statistics / memory types model.
2006-07-23 19:51:39 +00:00
rwatson
39c8e12140 Update and reformat comments for POSIX.1e ACL utility routines. 2006-07-23 19:35:10 +00:00
rwatson
c6cb3a5582 Add two new unpcb flags, UNP_BINDING and UNP_CONNECTING, which will be
used to mark UNIX domain sockets as being in the process of binding or
connecting.  Use these to prevent simultaneous bind or connect
operations by multiple threads or processes on the same socket at the
same time, which closes race conditions present in the UNIX domain
socket implementation since inception.
2006-07-23 12:01:14 +00:00
rwatson
e5969e57e6 Merge unp_bind() into uipc_bind(), as it is called only from uipc_bind(). 2006-07-23 11:02:12 +00:00
rwatson
b383d2c883 Since unp_attach() and unp_detach() are now called only from uipc_attach()
and uipc_detach(), merge them into their calling functions.
2006-07-23 10:25:28 +00:00
rwatson
777bd5286c Move various UNIX socket global variables and sysctls from the middle of
the file to the top.
2006-07-23 10:19:04 +00:00
rwatson
5d4d7953f0 In uipc_send() and uipc_rcvd(), store unp->unp_conn pointer in unp2
while working with the second unpcb to make the code more clear.
2006-07-22 18:41:42 +00:00
rwatson
b176399b59 Re-wrap and other minor formatting and punctuation fixes for UNIX domain
socket comments.
2006-07-22 17:24:55 +00:00