Commit Graph

105 Commits

Author SHA1 Message Date
Alan Cox
eae43d0e56 o Some style(9)-motivated changes to white space. 2002-01-01 00:40:29 +00:00
Alan Cox
5ca50a4bc9 o Correct an off-by-one error in aio_suspend(2).
PR:		18350
2001-12-31 03:13:24 +00:00
Alan Cox
516d256401 o Use "td->td_proc" instead of "curproc" where possible.
o Eliminate the unnecessary initialization of several static variables
   to zero.
2001-12-31 02:03:39 +00:00
Alfred Perlstein
21d56e9c33 Make AIO a loadable module.
Remove the explicit call to aio_proc_rundown() from exit1(), instead AIO
will use at_exit(9).

Add functions at_exec(9), rm_at_exec(9) which function nearly the
same as at_exec(9) and rm_at_exec(9), these functions are called
on behalf of modules at the time of execve(2) after the image
activator has run.

Use a modified version of tegge's suggestion via at_exec(9) to close
an exploitable race in AIO.

Fix SYSCALL_MODULE_HELPER such that it's archetecuterally neutral,
the problem was that one had to pass it a paramater indicating the
number of arguments which were actually the number of "int".  Fix
it by using an inline version of the AS macro against the syscall
arguments.  (AS should be available globally but we'll get to that
later.)

Add a primative system for dynamically adding kqueue ops, it's really
not as sophisticated as it should be, but I'll discuss with jlemon when
he's around.
2001-12-29 07:13:47 +00:00
Alan Cox
604035c5f2 o Eliminate compilation warnings on 64-bit architectures. 2001-12-10 03:34:06 +00:00
Alan Cox
91369fc768 o Eliminate unnecessary synchronization from filt_aiodetach().
o The manual page for kevent says that EVFILT_AIO returns under the same
   conditions as aio_error().  With that in mind, set the data field
   of the returned struct kevent to the value that would be returned
   by aio_error().
 o Fix two compilation warnings.
2001-12-09 08:16:36 +00:00
John Baldwin
43150722c9 The aio kthreads start off with a root credential just like all other
kthreads, so don't malloc a ucred just so we can create a duplicate of the
one we already have.
2001-10-05 17:55:11 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Alfred Perlstein
2f3cf91876 Check validity of signal callback requested via aio routines.
Also move the insertion of the request to after the request is validated,
there's still looks like there may be some problems if an invalid address
is passed to the aio routines, basically a possible leak or having a
not completely initialized structure on the queue may still be possible.

A new sig macro was made _SIG_VALID to check the validity of a signal,
it would be advisable to use it from now on (in kern/kern_sig.c) rather
than rolling your own.

PR: kern/17152
2001-04-18 22:18:39 +00:00
Alan Cox
136446540a When aio_read/write() is used on a raw device, physical buffers are
used for up to "vfs.aio.max_buf_aio" of the requests.  If a request
size is MAXPHYS, but the request base isn't page aligned, vmapbuf()
will map the end of the user space buffer into the start of the kva
allocated for the next physical buffer.  Don't use a physical buffer
in this case.  (This change addresses problem report 25617.)

When an aio_read/write() on a raw device has completed, timeout() is
used to schedule a signal to the process.  Thus, the reporting is
delayed up to 10 ms (assuming hz is 100).  The process might have
terminated in the meantime, causing a trap 12 when attempting to
deliver the signal.  Thus, the timeout must be cancelled when removing
the job.

aio jobs in state JOBST_JOBQGLOBAL should be removed from the
kaio_jobqueue list during process rundown.

During process rundown, some aio jobs might move from one list to a
different list that has already been "emptied", causing the rundown to
be incomplete.  Retry the rundown.

A call to BUF_KERNPROC() is needed after obtaining a physical buffer
to disassociate the lock from the running process since it can return
to userland without releasing that lock.

PR:		25617
Submitted by:	tegge
2001-03-10 22:47:57 +00:00
Alan Cox
c9a970a79f Use the kthread API to create and destroy AIO daemons.
Submitted by:	jhb
2001-03-09 06:27:01 +00:00
John Baldwin
19eb87d22a Grab the process lock while calling psignal and before calling psignal. 2001-03-07 03:37:06 +00:00
Alan Cox
9c8a2647f6 Add a missing splx() to aio_fphysio(). (This change is a no-op in -5.0,
but potentially significant in -4.x.)

Eliminate a pointless parameter to aio_fphysio().

Remove unnecessary casts from aio_fphysio() and aio_physwakeup().
2001-03-06 15:54:38 +00:00
Alan Cox
88ed460e6b Eliminate the aio_freejobs list. Its purpose was to store free
aiocb's allocated by zalloc().  In other words, zfree() was never
 called.  Now, we call zfree().  Why eliminate this micro-
 optimization?  At some later point, when we multithread the AIO
 system, we would need a mutex to synchronize access to aio_freejobs,
 making its use nearly indistinguishable in cost from zalloc() and
 zfree().

Remove unnecessary fhold() and fdrop() calls from aio_qphysio(),
 undo'ing a part of revision 1.86.  The reference count on the file
 structure is already incremented by _aio_aqueue() before it calls
 aio_qphysio().  (Update the comments to document this fact.)

Remove unnecessary casts from _aio_aqueue(), aio_read(), aio_write()
 and aio_waitcomplete().

Remove an unnecessary "return;" from aio_process().

Add "static" in various places.
2001-03-05 01:30:23 +00:00
Alan Cox
fb579e9a61 Remove the field privatemodes from struct __aiocb_private and the
related code from aio_read() and aio_write().  This field was
intended, but never used, to allow a mythical user-level library to
make an aio_read() or aio_write() behave like an ordinary read() or
write(), i.e., a blocking I/O operation.
2001-03-04 01:22:23 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
Jeroen Ruigrok van der Werven
f09deb6962 Fix typo: wierd -> weird.
There is no such thing as wierd in the english language.
2001-02-06 09:25:10 +00:00
Poul-Henning Kamp
37d4006626 Another round of the <sys/queue.h> FOREACH transmogriffer.
Created with:   sed(1)
Reviewed by:    md5(1)
2001-02-04 16:08:18 +00:00
Jake Burkholder
86360fee54 Remove thr_sleep and thr_wakeup. Remove fields p_nthread and p_wakeup
from struct proc, which are now unused (p_nthread already was).
Remove process flag P_KTHREADP which was untested and only set
in vfs_aio.c (it should use kthread_create).  Move the yield
system call to kern_synch.c as kern_threads.c has been removed
completely.

moral support from:	alfred, jhb
2000-12-02 05:41:30 +00:00
Alan Cox
c6fa9f78d2 Provide a new interface for the user of aio_read() and aio_write() to request
a kevent upon completion of the I/O.  Specifically, introduce a new type
of sigevent notification, SIGEV_EVENT.  If sigev_notify is SIGEV_EVENT,
then sigev_notify_kqueue names the kqueue that should receive the event
and sigev_value contains the "void *" is copied into the kevent's udata
field.

In contrast to the existing interface, this one: 1) works on
the Alpha 2) avoids the extra copyin() call for the kevent because all
of the information needed is in the sigevent and 3) could be
applied to request a single kevent upon completion of an entire lio_listio().

Reviewed by:	jlemon
2000-11-21 19:36:36 +00:00
Matthew Dillon
279d722604 This patchset fixes a large number of file descriptor race conditions.
Pre-rfork code assumed inherent locking of a process's file descriptor
    array.  However, with the advent of rfork() the file descriptor table
    could be shared between processes.  This patch closes over a dozen
    serious race conditions related to one thread manipulating the table
    (e.g. closing or dup()ing a descriptor) while another is blocked in
    an open(), close(), fcntl(), read(), write(), etc...

PR: kern/11629
Discussed with: Alexander Viro <viro@math.psu.edu>
2000-11-18 21:01:04 +00:00
Alan Cox
39b2b25fa0 _aio_aqueue(): Change kevent registration to use its own struct file pointer.
Otherwise, aio_read() and aio_write() on sockets are broken if a kevent is
 registered.  (The code after kevent registration for handling sockets assumes
 that the struct file pointer "fp" still refers to the socket, not the kqueue.)
2000-10-29 21:38:28 +00:00
John Baldwin
35e0e5b311 Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h
2000-10-20 07:58:15 +00:00
Alan Cox
b92bb032d8 aio_qphysio: Eliminate one instance of an out-of-range check that is
performed twice.  Eliminate initialization that is already performed
 by _aio_aqueue.

aio_physwakeup: Eliminate redundant synchronization that is already
 performed by bufdone.
2000-09-26 06:35:22 +00:00
Bruce Evans
621dbe43df Added used include of <sys/mutex.h> (don't depend on pollution in
<sys/signalvar.h>).
2000-09-17 12:20:49 +00:00
John Baldwin
a93a7807b2 aio processes need to have the Giant mutex before doing work.
Submitted by:	tegge
2000-09-11 04:06:48 +00:00
Don Lewis
f535380cb6 Remove uidinfo hash table lookup and maintenance out of chgproccnt() and
chgsbsize(), which are called rather frequently and may be called from an
interrupt context in the case of chgsbsize().  Instead, do the hash table
lookup and maintenance when credentials are changed, which is a lot less
frequent.  Add pointers to the uidinfo structures to the ucred and pcred
structures for fast access.  Pass a pointer to the credential to chgproccnt()
and chgsbsize() instead of passing the uid.  Add a reference count to the
uidinfo structure and use it to decide when to free the structure rather
than freeing the structure when the resource consumption drops to zero.
Move the resource tracking code from kern_proc.c to kern_resource.c.  Move
some duplicate code sequences in kern_prot.c to separate helper functions.
Change KASSERTs in this code to unconditional tests and calls to panic().
2000-09-05 22:11:13 +00:00
Alan Cox
b70158bae1 Make filt_aio() check the jobstate for JOBST_JOBBFINISHED (in addition
to JOBST_JOBFINISHED) in case the aio_read() or aio_write() was performed
via the high-performance physio method, i.e., aio_qphysio().
2000-09-04 07:56:32 +00:00
Peter Wemm
5dec52bada Fix the #ifdef VFS_AIO to not compile a whole bunch of unused stuff in the
!VFS_AIO case.  Lots of things have hooks into here (kqueue, exit(),
 sockets, etc), I elected to keep the external interfaces the same
 rather than spread more #ifdefs around the kernel.
2000-07-28 23:10:10 +00:00
Jake Burkholder
e39756439c Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
Jake Burkholder
740a1973a6 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
Poul-Henning Kamp
9626b608de Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by:    peter
2000-05-05 09:59:14 +00:00
Jonathan Lemon
cb679c385e Introduce kqueue() and kevent(), a kernel event notification facility. 2000-04-16 18:53:38 +00:00
Poul-Henning Kamp
c244d2de43 Move B_ERROR flag to b_ioflags and call it BIO_ERROR.
(Much of this done by script)

Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.

Move b_pblkno and b_iodone_chain to struct bio while we transition, they
will be obsoleted once bio structs chain/stack.

Add bio_queue field for struct bio aware disksort.

Address a lot of stylistic issues brought up by bde.
2000-04-02 15:24:56 +00:00
Poul-Henning Kamp
b99c307a21 Rename the existing BUF_STRATEGY() to DEV_STRATEGY()
substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)

substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)

This patch is machine generated except for the ccd.c and buf.h parts.
2000-03-20 11:29:10 +00:00
Poul-Henning Kamp
21144e3bf1 Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new
field in struct buf: b_iocmd.  The b_iocmd is enforced to have
exactly one bit set.

B_WRITE was bogusly defined as zero giving rise to obvious coding
mistakes.

Also eliminate the redundant struct buf flag B_CALL, it can just
as efficiently be done by comparing b_iodone to NULL.

Should you get a panic or drop into the debugger, complaining about
"b_iocmd", don't continue.  It is likely to write on your disk
where it should have been reading.

This change is a step in the direction towards a stackable BIO capability.

A lot of this patch were machine generated (Thanks to style(9) compliance!)

Vinum users:  Greg has not had time to test this yet, be careful.
2000-03-20 10:44:49 +00:00
Jason Evans
dd85920a4f Add the VFS_AIO config option and leave it off by default. Unless the
VFS_AIO option is specified, all aio-related syscalls return ENOSYS.

The aio code is very fragile right now, and is unsuitable for default
inclusion in a production shell box.

Approved by:	jkh
2000-02-23 07:44:25 +00:00
Jason Evans
b7592c7bea Back out the previous spl change, since it opens a race window.
Reviewed by:	alfred, dillon, peter
2000-01-20 08:15:13 +00:00
Jason Evans
60ffb01993 Don't tsleep() while at splbio().
Correctly return EINPROGRESS from aio_error() even when an aio request
is still in the socket queue.

Submitted by:	Adrian Chadd <adrian@bofh.co.uk>
2000-01-20 01:59:58 +00:00
Brian Feldman
f582ac0630 Fix vn_isdisk() usage to make AIO work on non-disk-files again, rather
than just return ENOTBLK.

PR:	16163
Submitted by:	Adrian Chadd <adrian@FreeBSD.org>
2000-01-17 21:18:39 +00:00
Jason Evans
bfbbc4aa44 Add aio_waitcomplete(). Make aio work correctly for socket descriptors.
Make gratuitous style(9) fixes (me, not the submitter) to make the aio
code more readable.

PR:		kern/12053
Submitted by:	Chris Sedore <cmsedore@maxwell.syr.edu>
2000-01-14 02:53:29 +00:00
Poul-Henning Kamp
ba4ad1fcea Give vn_isdisk() a second argument where it can return a suitable errno.
Suggested by:	bde
2000-01-10 12:04:27 +00:00
Poul-Henning Kamp
38224dcd59 Convert various pieces of code to use vn_isdisk() rather than checking
for vp->v_type == VBLK.

In ccd: we don't need to call VOP_GETATTR to find the type of a vnode.

Reviewed by:    sos
1999-11-22 10:33:55 +00:00
Poul-Henning Kamp
008626c39e Simplify and de-bogotify check for raw disk. 1999-11-07 13:09:09 +00:00
Poul-Henning Kamp
02c58685a4 Change useracc() and kernacc() to use VM_PROT_{READ|WRITE|EXECUTE} for the
"rw" argument, rather than hijacking B_{READ|WRITE}.

Fix two bugs (physio & cam) resulting by the confusion caused by this.

Submitted by:   Tor.Egge@fast.no
Reviewed by:    alc, ken (partly)
1999-10-30 06:32:05 +00:00
Peter Wemm
d1f088dab5 Trim unused options (or #ifdef for undoc options).
Submitted by:	phk
1999-10-11 15:19:12 +00:00
Brian Feldman
13ccadd4b0 This is what was "fdfix2.patch," a fix for fd sharing. It's pretty
far-reaching in fd-land, so you'll want to consult the code for
changes.  The biggest change is that now, you don't use
	fp->f_ops->fo_foo(fp, bar)
but instead
	fo_foo(fp, bar),
which increments and decrements the fp refcount upon entry and exit.
Two new calls, fhold() and fdrop(), are provided.  Each does what it
seems like it should, and if fdrop() brings the refcount to zero, the
fd is freed as well.

Thanks to peter ("to hell with it, it looks ok to me.") for his review.
Thanks to msmith for keeping me from putting locks everywhere :)

Reviewed by:	peter
1999-09-19 17:00:25 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Poul-Henning Kamp
49ff4debd3 Spring cleaning around strategy and disklabels/slices:
Introduce BUF_STRATEGY(struct buf *, int flag) macro, and use it throughout.
please see comment in sys/conf.h about the flag argument.

Remove strategy argument from all the diskslice/label/bad144
implementations, it should be found from the dev_t.

Remove bogus and unused strategy1 routines.

Remove open/close arguments from dssize().  Pick them up from dev_t.

Remove unused and unfinished setgeom support from diskslice/label/bad144 code.
1999-08-14 11:40:51 +00:00
Poul-Henning Kamp
4d4f932326 s/v_specinfo/v_rdev/ 1999-08-13 10:10:12 +00:00