Commit Graph

100 Commits

Author SHA1 Message Date
Alan Cox
cd430164f1 Allow resursion on the pipe mutex because filt_piperead() and filt_pipewrite()
can be called both with and without the pipe mutex held.  (For example,
if called by pipeselwakeup(), it is held.  Whereas, if called by kqueue_scan(),
it is not.)

Reviewed by:	alfred
2002-03-27 21:47:50 +00:00
Alfred Perlstein
db51256707 When "cloning" a pipe's buffer bcopy the data after dropping the pipe's
lock as the data may be paged out and cause a fault.
2002-03-22 16:09:22 +00:00
Jeff Roberson
c897b81311 Remove references to vm_zone.h and switch over to the new uma API.
Also, remove maxsockets.  If you look carefully you'll notice that the old
zone allocator never honored this anyway.
2002-03-20 04:09:59 +00:00
Jeff Roberson
8355f576a9 This is the first part of the new kernel memory allocator. This replaces
malloc(9) and vm_zone with a slab like allocator.

Reviewed by:	arch@
2002-03-19 09:11:49 +00:00
Alfred Perlstein
3b018f572d Bug fixes:
Missed a place where the pipe sleep lock was needed in order to safely grab
Giant, fix it and add an assertion to make sure this doesn't happen again.

Fix typos in the PIPE_GET_GIANT/PIPE_DROP_GIANT that could cause the
wrong mutex to get passed to PIPE_LOCK/PIPE_UNLOCK.

Fix a location where the wrong pipe was being passed to
PIPE_GET_GIANT/PIPE_DROP_GIANT.
2002-03-15 07:18:09 +00:00
Alfred Perlstein
be4af4b723 Don't deref NULL mutex pointer when pipeclose()'ing a pipe that is not
fully instaniated.

Revert the logic in pipeclose so that we don't have the entire function
pretty much under a single if() statement, instead invert the test and
just return if it fails.

Submitted (in different form) by: bde

Don't use pool mutexes for pipes.  We can not use pool mutexes
because we will need to grab the select lock while holding a pipe
lock which is not allowed because you may not aquire additional
mutexes when holding a pool mutex.

Instead malloc(9) space for the mutex that is shared between the
pipes.
2002-03-09 22:06:31 +00:00
Seigo Tanimura
996abba928 Track the number of wired pages to avoid unwiring unwired pages.
Reviewed by:	alfred
2002-03-05 00:51:03 +00:00
Alfred Perlstein
9f01374de5 kill __P. 2002-02-27 18:51:53 +00:00
Alfred Perlstein
566c1313a3 add assertions in the places where giant is required to catch when
the pipe is locked and shouldn't be.

initialize pipe->pipe_mtxp to NULL when creating pipes in order not
to trip the above assertions.

swap pipe lock with giant around calls to pipe_destroy_write_buffer()

pipe_destroy_write_buffer issue noticed by: jhb
2002-02-27 18:49:58 +00:00
Alfred Perlstein
21dbcfd500 Fix a NULL deref panic in pipe_write, we can't blindly lock
pipe->pipe_peer->pipe_mtxp because it may be NULL, so lock the
passed in pipe's mutex instead.
2002-02-27 17:23:16 +00:00
Alfred Perlstein
ffddaaeeeb MPsafe fixes:
use SYSINIT to initialize pipe_zone.
use PIPE_LOCK to protect kevent ops.
2002-02-27 11:27:48 +00:00
Alfred Perlstein
f81b04d96c First rev at making pipe(2) pipe's MPsafe.
Both ends of the pipe share a pool_mutex, this makes allocation
and deadlock avoidance easy.

Remove some un-needed FILE_LOCK ops while I'm here.

There are some issues wrt to select and the f{s,g}etown code that
we'll have to deal with, I think we may also need to move the calls
to vfs_timestamp outside of the sections covered by PIPE_LOCK.
2002-02-27 07:35:59 +00:00
Alfred Perlstein
426da3bcfb SMP Lock struct file, filedesc and the global file list.
Seigo Tanimura (tanimura) posted the initial delta.

I've polished it quite a bit reducing the need for locking and
adapting it for KSE.

Locks:

1 mutex in each filedesc
   protects all the fields.
   protects "struct file" initialization, while a struct file
     is being changed from &badfileops -> &pipeops or something
     the filedesc should be locked.

1 mutex in each struct file
   protects the refcount fields.
   doesn't protect anything else.
   the flags used for garbage collection have been moved to
     f_gcflag which was the FILLER short, this doesn't need
     locking because the garbage collection is a single threaded
     container.
  could likely be made to use a pool mutex.

1 sx lock for the global filelist.

struct file *	fhold(struct file *fp);
        /* increments reference count on a file */

struct file *	fhold_locked(struct file *fp);
        /* like fhold but expects file to locked */

struct file *	ffind_hold(struct thread *, int fd);
        /* finds the struct file in thread, adds one reference and
                returns it unlocked */

struct file *	ffind_lock(struct thread *, int fd);
        /* ffind_hold, but returns file locked */

I still have to smp-safe the fget cruft, I'll get to that asap.
2002-01-13 11:58:06 +00:00
Maxim Sobolev
783c41d432 Make kevents on pipes work as described in the manpage - when the last
reader/writer disconnects, ensure that anybody who is waiting for the
kevent on the other end of the pipe gets EV_EOF.

MFC after:	2 weeks
2001-11-19 09:25:30 +00:00
John Baldwin
ed01445d8f Use the passed in thread to selrecord() instead of curthread. 2001-09-21 22:46:54 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Matthew Dillon
7b9673fa28 cleanup: GIANT macros, rename DEPRECIATE to DEPRECATE
Move p_giant_optional to proc zero'd section
Remove (old) XXX zfree comment in pipe code
2001-07-04 17:11:03 +00:00
Matthew Dillon
0cddd8f023 With Alfred's permission, remove vm_mtx in favor of a fine-grained approach
(this commit is just the first stage).  Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.
2001-07-04 16:20:28 +00:00
Jonathan Lemon
7b748f0a21 Correctly hook up the write kqfilter to pipes.
Submitted by:  Niels Provos <provos@citi.umich.edu>
2001-06-15 20:45:01 +00:00
Matthew Dillon
1b3e974a71 The pipe_write() code was locking the pipe without busying it first in
certain cases, and a close() by another process could potentially rip the
pipe out from under the (blocked) locking operation.

Reported-by: Alexander Viro <viro@math.psu.edu>
2001-06-04 04:04:45 +00:00
Alfred Perlstein
0cea693084 whitespace/style 2001-05-24 18:06:22 +00:00
Alfred Perlstein
53240603ee aquire vm_mutex a little bit earlier to protect a pmap call. 2001-05-23 10:26:36 +00:00
John Baldwin
270b041d95 - Assert that the vm mutex is held in pipe_free_kmem().
- Don't release the vm mutex early in pipespace() but instead hold it
  across vm_object_deallocate() if vm_map_find() returns an error and
  across pipe_free_kmem() if vm_map_find() succeeds.
- Add a XXX above a zfree() since zalloc already has its own locking,
  one would hope that zfree() wouldn't need the vm lock.
2001-05-21 18:47:17 +00:00
Alfred Perlstein
2395531439 Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level
vm operations.

faults can not be taken without holding Giant.

Memory subsystems can now call the base page allocators safely.

Almost all atomic ops were removed as they are covered under the
vm mutex.

Alpha and ia64 now need to catch up to i386's trap handlers.

FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).

Reviewed (partially) by: jake, jhb
2001-05-19 01:28:09 +00:00
Alfred Perlstein
0fd061c0c4 Cleanup
Remove comment about setting error for reads on EOF, read returns 0 on
EOF so the code should be ok.

Remove non-effective priority boost, PRIO+1 doesn't do anything
(according to McKusick), if a real priority boost is needed it should
have been +4.

Style fixes:
.) return foo -> return (foo)
.) FLAG1|FlAG2 -> FLAG1 | FlAG2
.) wrap long lines
.) unwrap short lines
.) for(i=0;i=foo;i++) -> for (i = 0; i=foo; i++)
.) remove braces for some conditionals with a single statement
.) fix continuation lines.

md5 couldn't verify the binary because some code had to
be shuffled around to address the style issues.
2001-05-17 19:47:09 +00:00
Alfred Perlstein
2deb4a20c3 initialize pipe pointers 2001-05-17 18:22:58 +00:00
Alfred Perlstein
82a283fcf3 pipe_create has to zero out the select record earlier to avoid
returning a half-initialized pipe and causing pipeclose() to follow
a junk pointer.

Discovered by: "Nick S" <snicko@noid.org>
2001-05-17 17:59:28 +00:00
Alfred Perlstein
97d4578662 Remove an 'optimization' I hope to never see again.
The pipe code could not handle running out of kva, it would panic
if that happened.  Instead return ENFILE to the application which
is an acceptable error return from pipe(2).

There was some slightly tricky things that needed to be worked on,
namely that the pipe code can 'realloc' the size of the buffer if
it detects that the pipe could use a bit more room.  However if it
failed the reallocation it could not cope and would panic.  Fix
this by attempting to grow the pipe while holding onto our old
resources.  If all goes well free the old resources and use the
new ones, otherwise continue to use the smaller buffer already
allocated.

While I'm here add a few blank lines for style(9) and remove
'register'.
2001-05-08 09:09:18 +00:00
Mark Murray
fb919e4d5a Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
Jonathan Lemon
608a3ce62a Extend kqueue down to the device layer.
Backwards compatible approach suggested by: peter
2001-02-15 16:34:11 +00:00
David Malone
3b54736e19 Style improvements for last fix. Should be functionally the same.
Submitted by:	bde
2001-01-11 00:13:54 +00:00
Garrett Wollman
0a2c3d48c6 select() DKI is now in <sys/selinfo.h>. 2001-01-09 04:33:49 +00:00
David Malone
2ebaaccd47 If we failed to allocate the file discriptor for the write end of
the pipe, then we were corrupting the pipe_zone free list by calling
pipeclose on rpipe twice. NULL out rpipe to avoid this.

Reviewed by:	dillon
Reviewed by:	iedowse
2001-01-08 22:14:48 +00:00
Matthew Dillon
279d722604 This patchset fixes a large number of file descriptor race conditions.
Pre-rfork code assumed inherent locking of a process's file descriptor
    array.  However, with the advent of rfork() the file descriptor table
    could be shared between processes.  This patch closes over a dozen
    serious race conditions related to one thread manipulating the table
    (e.g. closing or dup()ing a descriptor) while another is blocked in
    an open(), close(), fcntl(), read(), write(), etc...

PR: kern/11629
Discussed with: Alexander Viro <viro@math.psu.edu>
2000-11-18 21:01:04 +00:00
Jonathan Lemon
6f451c99b3 Pipes are not writeable while a direct write is in progress. However,
the kqueue filter got the sense of the test reversed, so fix it.

Spotted by:	Michael Elkins <me@sigpipe.org>
2000-09-14 20:10:19 +00:00
Jake Burkholder
e39756439c Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
Jake Burkholder
740a1973a6 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
Chris Costello
12861d58db Include UID and GID information for stat() calls using the values filled
into the file descriptor data by falloc().

Reviewed by:	phk
2000-05-11 22:08:20 +00:00
Jonathan Lemon
cb679c385e Introduce kqueue() and kevent(), a kernel event notification facility. 2000-04-16 18:53:38 +00:00
Matthew Dillon
f1924a54f8 Fix in-kernel infinite loop in pipe_write() when the reader goes away
at just the wrong time.
2000-03-24 00:47:37 +00:00
Bruce Evans
da211f5bf6 Use vfs_timestamp() instead of getnanotime() to set timestamps. This
fixee incoherency of pipe timestamps relative to file timestamps in
the usual case where getnanotime() is not used for the latter.  (File
and pipe timestamps are still incoherent relative to real time unless
the vfs_timestamp_precision sysctl is set to 2 or 3).
1999-12-26 13:04:52 +00:00
Tor Egge
f52523701c Fix two problems with pipe_write():
1. Data written beyond end of pipe buffer, causing kernel memory corruption.

    - Check that space is still valid after obtaining the pipe lock.

    - Defer the calculation of transfer size until the pipe
      lock has been obtained.

    - Update the pipe buffer pointers while holding the pipe lock.

 2. Writes of size <= PIPE_BUF not always atomic.

    - Allow an internal write to span two contiguous segments,
      so writes of size <= PIPE_BUF can be kept atomic
      when wrapping around from the end to the start of the
      pipe buffer.

PR:		15235
Reviewed by:	Matt Dillon <dillon@FreeBSD.org>
1999-12-13 02:55:47 +00:00
Peter Wemm
29e040e5c1 Update pipe code for fo_stat() entry point - pipe_stat() is now no longer
used outside the pipe code.
1999-11-08 03:28:49 +00:00
Poul-Henning Kamp
923502ff91 useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>.  This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
1999-10-29 18:09:36 +00:00
Matthew Dillon
4cc712004c Fix bug in pipe code relating to writes of mmap'd but illegal address
spaces which cross a segment boundry in the page table.  pmap_kextract()
    is not designed for access to the user space portion of the page
    table and cannot handle the null-page-directory-entry case.

    The fix is to have vm_fault_quick() return a success or failure which
    is then used to avoid calling pmap_kextract().
1999-09-20 19:08:48 +00:00
Brian Feldman
13ccadd4b0 This is what was "fdfix2.patch," a fix for fd sharing. It's pretty
far-reaching in fd-land, so you'll want to consult the code for
changes.  The biggest change is that now, you don't use
	fp->f_ops->fo_foo(fp, bar)
but instead
	fo_foo(fp, bar),
which increments and decrements the fp refcount upon entry and exit.
Two new calls, fhold() and fdrop(), are provided.  Each does what it
seems like it should, and if fdrop() brings the refcount to zero, the
fd is freed as well.

Thanks to peter ("to hell with it, it looks ok to me.") for his review.
Thanks to msmith for keeping me from putting locks everywhere :)

Reviewed by:	peter
1999-09-19 17:00:25 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Brian Feldman
e32c66c539 Fix fd race conditions (during shared fd table usage.) Badfileops is
now used in f_ops in place of NULL, and modifications to the files
are more carefully ordered. f_ops should also be set to &badfileops
upon "close" of a file.

This does not fix other problems mentioned in this PR than the first
one.

PR:		11629
Reviewed by:	peter
1999-08-04 18:53:50 +00:00
Alan Cox
3d41489171 Restructure pipe_read in order to eliminate several race conditions.
Submitted by:	Matthew Dillon <dillon@apollo.backplane.com> and myself
1999-06-05 03:53:57 +00:00
Dmitrij Tejblum
8fe387ab84 Add standard padding argument to pread and pwrite syscall. That should make them
NetBSD compatible.

Add parameter to fo_read and fo_write. (The only flag FOF_OFFSET mean that
the offset is set in the struct uio).

Factor out some common code from read/pread/write/pwrite syscalls.
1999-04-04 21:41:28 +00:00