Commit Graph

267 Commits

Author SHA1 Message Date
jeff
318cbeeecf Remove references to vm_zone.h and switch over to the new uma API.
Also, remove maxsockets.  If you look carefully you'll notice that the old
zone allocator never honored this anyway.
2002-03-20 04:09:59 +00:00
eivind
05c20bbea2 Document all functions, global and static variables, and sysctls.
Includes some minor whitespace changes, and re-ordering to be able to document
properly (e.g, grouping of variables and the SYSCTL macro calls for them, where
the documentation has been added.)

Reviewed by:	phk (but all errors are mine)
2002-03-05 15:38:49 +00:00
tanimura
a09da29859 Lock struct pgrp, session and sigio.
New locks are:

- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.

Please refer to sys/proc.h for the coverage of these locks.

Changes on the pgrp/session interface:

- pgfind() needs the pgrpsess_lock held.

- The caller of enterpgrp() is responsible to allocate a new pgrp and
  session.

- Call enterthispgrp() in order to enter an existing pgrp.

- pgsignal() requires a pgrp lock held.

Reviewed by:	jhb, alfred
Tested on:	cvsup.jp.FreeBSD.org
		(which is a quad-CPU machine running -current)
2002-02-23 11:12:57 +00:00
alc
f64c08b326 o Clearing p/td_retval[0] after aio_newproc() is unnecessary. (We stopped
calling rfork() to create aio threads in revision 1.46.)
 o Don't recompute the FILE * when it's already stored in the kernel's AIOCB.
2002-02-12 17:40:41 +00:00
julian
b5eb64d6f0 Pre-KSE/M3 commit.
this is a low-functionality change that changes the kernel to access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embeded there
but remove all teh code that assumes that in preparation for the next commit
which will actually move it out.

Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
2002-02-07 20:58:47 +00:00
alc
ca68430b0d o Remove the unused vestiges of JOBST_JOBQPROC and
the per-thread jobtorun queue.
 o Use TAILQ_EMPTY() instead of TAILQ_FIRST(...) == NULL.
2002-01-20 18:59:58 +00:00
alc
60143f00ec o Revision 1.99 ("KSE Milestone 2") left the aio daemons
sleeping on a process object but changed the corresponding
   wakeup()s to the thread object.  The result was that non-raw
   aio ops waited for an aio daemon to timeout before action
   was taken.  Now, we sleep on the thread object.

PR:		kern/34016
2002-01-20 00:52:44 +00:00
alc
babe0aff74 o Eliminate an unused parameter from aio_fphysio(). 2002-01-17 17:19:40 +00:00
alc
6a4a71604b o Correct the initialization of aiolio_zone: Each entry was 16 times larger
than necessary.
 o Move a rarely-used goto label inside a critical section so that we don't
   perform an splnet() for which there is no corresponding splx().
 o Remove unnecessary splnet()/splx() around accesses to kaioinfo::kaio_jobdone
   in aio_return().
 o Use TAILQ_FOREACH for simple cases of iteration over kaioinfo::kaio_jobdone.
2002-01-14 07:26:33 +00:00
alc
13725ff1d5 o Correct a 32/64-bit error in the initialization of aiol_zone, specifically,
sizeof(int) is not the size of a pointer.
2002-01-09 06:40:45 +00:00
alc
938cb766b8 o Add missing synchronization (splnet()/splx()) in aio_free_entry().
o Move the definition of struct aiocblist from sys/aio.h to kern/vfs_aio.c.
 o Make aio_swake_cb() static.
2002-01-06 21:03:39 +00:00
alc
e78b8215cc o Properly check the file descriptor passed to aio_cancel(2). (Previously,
no out-of-bounds check was performed on the file descriptor.)
 o Eliminate some excessive white space from aio_cancel(2).
2002-01-02 07:04:38 +00:00
alc
e5d0c7a325 o Some style(9)-motivated changes to white space. 2002-01-01 00:40:29 +00:00
alc
4687eb5aa5 o Correct an off-by-one error in aio_suspend(2).
PR:		18350
2001-12-31 03:13:24 +00:00
alc
0fe9459a66 o Use "td->td_proc" instead of "curproc" where possible.
o Eliminate the unnecessary initialization of several static variables
   to zero.
2001-12-31 02:03:39 +00:00
alfred
f097734c27 Make AIO a loadable module.
Remove the explicit call to aio_proc_rundown() from exit1(), instead AIO
will use at_exit(9).

Add functions at_exec(9), rm_at_exec(9) which function nearly the
same as at_exec(9) and rm_at_exec(9), these functions are called
on behalf of modules at the time of execve(2) after the image
activator has run.

Use a modified version of tegge's suggestion via at_exec(9) to close
an exploitable race in AIO.

Fix SYSCALL_MODULE_HELPER such that it's archetecuterally neutral,
the problem was that one had to pass it a paramater indicating the
number of arguments which were actually the number of "int".  Fix
it by using an inline version of the AS macro against the syscall
arguments.  (AS should be available globally but we'll get to that
later.)

Add a primative system for dynamically adding kqueue ops, it's really
not as sophisticated as it should be, but I'll discuss with jlemon when
he's around.
2001-12-29 07:13:47 +00:00
alc
a49c1c9183 o Eliminate compilation warnings on 64-bit architectures. 2001-12-10 03:34:06 +00:00
alc
be4cbfd029 o Eliminate unnecessary synchronization from filt_aiodetach().
o The manual page for kevent says that EVFILT_AIO returns under the same
   conditions as aio_error().  With that in mind, set the data field
   of the returned struct kevent to the value that would be returned
   by aio_error().
 o Fix two compilation warnings.
2001-12-09 08:16:36 +00:00
jhb
563c710867 The aio kthreads start off with a root credential just like all other
kthreads, so don't malloc a ucred just so we can create a duplicate of the
one we already have.
2001-10-05 17:55:11 +00:00
julian
5596676e6c KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
alfred
3405c2ccfa Check validity of signal callback requested via aio routines.
Also move the insertion of the request to after the request is validated,
there's still looks like there may be some problems if an invalid address
is passed to the aio routines, basically a possible leak or having a
not completely initialized structure on the queue may still be possible.

A new sig macro was made _SIG_VALID to check the validity of a signal,
it would be advisable to use it from now on (in kern/kern_sig.c) rather
than rolling your own.

PR: kern/17152
2001-04-18 22:18:39 +00:00
alc
d25198ddf6 When aio_read/write() is used on a raw device, physical buffers are
used for up to "vfs.aio.max_buf_aio" of the requests.  If a request
size is MAXPHYS, but the request base isn't page aligned, vmapbuf()
will map the end of the user space buffer into the start of the kva
allocated for the next physical buffer.  Don't use a physical buffer
in this case.  (This change addresses problem report 25617.)

When an aio_read/write() on a raw device has completed, timeout() is
used to schedule a signal to the process.  Thus, the reporting is
delayed up to 10 ms (assuming hz is 100).  The process might have
terminated in the meantime, causing a trap 12 when attempting to
deliver the signal.  Thus, the timeout must be cancelled when removing
the job.

aio jobs in state JOBST_JOBQGLOBAL should be removed from the
kaio_jobqueue list during process rundown.

During process rundown, some aio jobs might move from one list to a
different list that has already been "emptied", causing the rundown to
be incomplete.  Retry the rundown.

A call to BUF_KERNPROC() is needed after obtaining a physical buffer
to disassociate the lock from the running process since it can return
to userland without releasing that lock.

PR:		25617
Submitted by:	tegge
2001-03-10 22:47:57 +00:00
alc
60a613e429 Use the kthread API to create and destroy AIO daemons.
Submitted by:	jhb
2001-03-09 06:27:01 +00:00
jhb
9cd254601b Grab the process lock while calling psignal and before calling psignal. 2001-03-07 03:37:06 +00:00
alc
499c6382fb Add a missing splx() to aio_fphysio(). (This change is a no-op in -5.0,
but potentially significant in -4.x.)

Eliminate a pointless parameter to aio_fphysio().

Remove unnecessary casts from aio_fphysio() and aio_physwakeup().
2001-03-06 15:54:38 +00:00
alc
a27aa6e3d5 Eliminate the aio_freejobs list. Its purpose was to store free
aiocb's allocated by zalloc().  In other words, zfree() was never
 called.  Now, we call zfree().  Why eliminate this micro-
 optimization?  At some later point, when we multithread the AIO
 system, we would need a mutex to synchronize access to aio_freejobs,
 making its use nearly indistinguishable in cost from zalloc() and
 zfree().

Remove unnecessary fhold() and fdrop() calls from aio_qphysio(),
 undo'ing a part of revision 1.86.  The reference count on the file
 structure is already incremented by _aio_aqueue() before it calls
 aio_qphysio().  (Update the comments to document this fact.)

Remove unnecessary casts from _aio_aqueue(), aio_read(), aio_write()
 and aio_waitcomplete().

Remove an unnecessary "return;" from aio_process().

Add "static" in various places.
2001-03-05 01:30:23 +00:00
alc
6a1cc9f79f Remove the field privatemodes from struct __aiocb_private and the
related code from aio_read() and aio_write().  This field was
intended, but never used, to allow a mythical user-level library to
make an aio_read() or aio_write() behave like an ordinary read() or
write(), i.e., a blocking I/O operation.
2001-03-04 01:22:23 +00:00
bmilekic
f364d4ac36 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
asmodai
e69fe706f1 Fix typo: wierd -> weird.
There is no such thing as wierd in the english language.
2001-02-06 09:25:10 +00:00
phk
709379c1ae Another round of the <sys/queue.h> FOREACH transmogriffer.
Created with:   sed(1)
Reviewed by:    md5(1)
2001-02-04 16:08:18 +00:00
jake
7cd1ca1cdc Remove thr_sleep and thr_wakeup. Remove fields p_nthread and p_wakeup
from struct proc, which are now unused (p_nthread already was).
Remove process flag P_KTHREADP which was untested and only set
in vfs_aio.c (it should use kthread_create).  Move the yield
system call to kern_synch.c as kern_threads.c has been removed
completely.

moral support from:	alfred, jhb
2000-12-02 05:41:30 +00:00
alc
dfa19cb0ce Provide a new interface for the user of aio_read() and aio_write() to request
a kevent upon completion of the I/O.  Specifically, introduce a new type
of sigevent notification, SIGEV_EVENT.  If sigev_notify is SIGEV_EVENT,
then sigev_notify_kqueue names the kqueue that should receive the event
and sigev_value contains the "void *" is copied into the kevent's udata
field.

In contrast to the existing interface, this one: 1) works on
the Alpha 2) avoids the extra copyin() call for the kevent because all
of the information needed is in the sigevent and 3) could be
applied to request a single kevent upon completion of an entire lio_listio().

Reviewed by:	jlemon
2000-11-21 19:36:36 +00:00
dillon
15a44d16ca This patchset fixes a large number of file descriptor race conditions.
Pre-rfork code assumed inherent locking of a process's file descriptor
    array.  However, with the advent of rfork() the file descriptor table
    could be shared between processes.  This patch closes over a dozen
    serious race conditions related to one thread manipulating the table
    (e.g. closing or dup()ing a descriptor) while another is blocked in
    an open(), close(), fcntl(), read(), write(), etc...

PR: kern/11629
Discussed with: Alexander Viro <viro@math.psu.edu>
2000-11-18 21:01:04 +00:00
alc
452893ff8d _aio_aqueue(): Change kevent registration to use its own struct file pointer.
Otherwise, aio_read() and aio_write() on sockets are broken if a kevent is
 registered.  (The code after kevent registration for handling sockets assumes
 that the struct file pointer "fp" still refers to the socket, not the kqueue.)
2000-10-29 21:38:28 +00:00
jhb
d944886e4d Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h
2000-10-20 07:58:15 +00:00
alc
e062da3a80 aio_qphysio: Eliminate one instance of an out-of-range check that is
performed twice.  Eliminate initialization that is already performed
 by _aio_aqueue.

aio_physwakeup: Eliminate redundant synchronization that is already
 performed by bufdone.
2000-09-26 06:35:22 +00:00
bde
10844db3a7 Added used include of <sys/mutex.h> (don't depend on pollution in
<sys/signalvar.h>).
2000-09-17 12:20:49 +00:00
jhb
bd6b65a757 aio processes need to have the Giant mutex before doing work.
Submitted by:	tegge
2000-09-11 04:06:48 +00:00
truckman
0575d8a3f9 Remove uidinfo hash table lookup and maintenance out of chgproccnt() and
chgsbsize(), which are called rather frequently and may be called from an
interrupt context in the case of chgsbsize().  Instead, do the hash table
lookup and maintenance when credentials are changed, which is a lot less
frequent.  Add pointers to the uidinfo structures to the ucred and pcred
structures for fast access.  Pass a pointer to the credential to chgproccnt()
and chgsbsize() instead of passing the uid.  Add a reference count to the
uidinfo structure and use it to decide when to free the structure rather
than freeing the structure when the resource consumption drops to zero.
Move the resource tracking code from kern_proc.c to kern_resource.c.  Move
some duplicate code sequences in kern_prot.c to separate helper functions.
Change KASSERTs in this code to unconditional tests and calls to panic().
2000-09-05 22:11:13 +00:00
alc
cb0cf9cc92 Make filt_aio() check the jobstate for JOBST_JOBBFINISHED (in addition
to JOBST_JOBFINISHED) in case the aio_read() or aio_write() was performed
via the high-performance physio method, i.e., aio_qphysio().
2000-09-04 07:56:32 +00:00
peter
87a4a5f84e Fix the #ifdef VFS_AIO to not compile a whole bunch of unused stuff in the
!VFS_AIO case.  Lots of things have hooks into here (kqueue, exit(),
 sockets, etc), I elected to keep the external interfaces the same
 rather than spread more #ifdefs around the kernel.
2000-07-28 23:10:10 +00:00
jake
961b97d434 Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
jake
d93fbc9916 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
phk
36c3965ff9 Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by:    peter
2000-05-05 09:59:14 +00:00
jlemon
c41c876463 Introduce kqueue() and kevent(), a kernel event notification facility. 2000-04-16 18:53:38 +00:00
phk
8ee11d587f Move B_ERROR flag to b_ioflags and call it BIO_ERROR.
(Much of this done by script)

Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.

Move b_pblkno and b_iodone_chain to struct bio while we transition, they
will be obsoleted once bio structs chain/stack.

Add bio_queue field for struct bio aware disksort.

Address a lot of stylistic issues brought up by bde.
2000-04-02 15:24:56 +00:00
phk
5df766a0f8 Rename the existing BUF_STRATEGY() to DEV_STRATEGY()
substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)

substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)

This patch is machine generated except for the ccd.c and buf.h parts.
2000-03-20 11:29:10 +00:00
phk
a246e10f55 Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new
field in struct buf: b_iocmd.  The b_iocmd is enforced to have
exactly one bit set.

B_WRITE was bogusly defined as zero giving rise to obvious coding
mistakes.

Also eliminate the redundant struct buf flag B_CALL, it can just
as efficiently be done by comparing b_iodone to NULL.

Should you get a panic or drop into the debugger, complaining about
"b_iocmd", don't continue.  It is likely to write on your disk
where it should have been reading.

This change is a step in the direction towards a stackable BIO capability.

A lot of this patch were machine generated (Thanks to style(9) compliance!)

Vinum users:  Greg has not had time to test this yet, be careful.
2000-03-20 10:44:49 +00:00
jasone
8fc7b7d841 Add the VFS_AIO config option and leave it off by default. Unless the
VFS_AIO option is specified, all aio-related syscalls return ENOSYS.

The aio code is very fragile right now, and is unsuitable for default
inclusion in a production shell box.

Approved by:	jkh
2000-02-23 07:44:25 +00:00
jasone
cec957051d Back out the previous spl change, since it opens a race window.
Reviewed by:	alfred, dillon, peter
2000-01-20 08:15:13 +00:00
jasone
ff778f6b27 Don't tsleep() while at splbio().
Correctly return EINPROGRESS from aio_error() even when an aio request
is still in the socket queue.

Submitted by:	Adrian Chadd <adrian@bofh.co.uk>
2000-01-20 01:59:58 +00:00
green
24ae07bb54 Fix vn_isdisk() usage to make AIO work on non-disk-files again, rather
than just return ENOTBLK.

PR:	16163
Submitted by:	Adrian Chadd <adrian@FreeBSD.org>
2000-01-17 21:18:39 +00:00
jasone
241bd93929 Add aio_waitcomplete(). Make aio work correctly for socket descriptors.
Make gratuitous style(9) fixes (me, not the submitter) to make the aio
code more readable.

PR:		kern/12053
Submitted by:	Chris Sedore <cmsedore@maxwell.syr.edu>
2000-01-14 02:53:29 +00:00
phk
ae0c1ec8f7 Give vn_isdisk() a second argument where it can return a suitable errno.
Suggested by:	bde
2000-01-10 12:04:27 +00:00
phk
1848d96439 Convert various pieces of code to use vn_isdisk() rather than checking
for vp->v_type == VBLK.

In ccd: we don't need to call VOP_GETATTR to find the type of a vnode.

Reviewed by:    sos
1999-11-22 10:33:55 +00:00
phk
b6e0662982 Simplify and de-bogotify check for raw disk. 1999-11-07 13:09:09 +00:00
phk
8d8f53dcdc Change useracc() and kernacc() to use VM_PROT_{READ|WRITE|EXECUTE} for the
"rw" argument, rather than hijacking B_{READ|WRITE}.

Fix two bugs (physio & cam) resulting by the confusion caused by this.

Submitted by:   Tor.Egge@fast.no
Reviewed by:    alc, ken (partly)
1999-10-30 06:32:05 +00:00
peter
787140aa42 Trim unused options (or #ifdef for undoc options).
Submitted by:	phk
1999-10-11 15:19:12 +00:00
green
140cb4ff83 This is what was "fdfix2.patch," a fix for fd sharing. It's pretty
far-reaching in fd-land, so you'll want to consult the code for
changes.  The biggest change is that now, you don't use
	fp->f_ops->fo_foo(fp, bar)
but instead
	fo_foo(fp, bar),
which increments and decrements the fp refcount upon entry and exit.
Two new calls, fhold() and fdrop(), are provided.  Each does what it
seems like it should, and if fdrop() brings the refcount to zero, the
fd is freed as well.

Thanks to peter ("to hell with it, it looks ok to me.") for his review.
Thanks to msmith for keeping me from putting locks everywhere :)

Reviewed by:	peter
1999-09-19 17:00:25 +00:00
peter
3b842d34e8 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
phk
5f45261e99 Spring cleaning around strategy and disklabels/slices:
Introduce BUF_STRATEGY(struct buf *, int flag) macro, and use it throughout.
please see comment in sys/conf.h about the flag argument.

Remove strategy argument from all the diskslice/label/bad144
implementations, it should be found from the dev_t.

Remove bogus and unused strategy1 routines.

Remove open/close arguments from dssize().  Pick them up from dev_t.

Remove unused and unfinished setgeom support from diskslice/label/bad144 code.
1999-08-14 11:40:51 +00:00
phk
683c2698ff s/v_specinfo/v_rdev/ 1999-08-13 10:10:12 +00:00
phk
e938d317d5 Decommision miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>,
a few lines into <sys/vnode.h>.

Add a few fields to struct specinfo, paving the way for the fun part.
1999-08-08 18:43:05 +00:00
peter
6cb5fe6c6b Slight reorganization of kernel thread/process creation. Instead of using
SYSINIT_KT() etc (which is a static, compile-time procedure), use a
NetBSD-style kthread_create() interface.  kproc_start is still available
as a SYSINIT() hook.  This allowed simplification of chunks of the
sysinit code in the process.  This kthread_create() is our old kproc_start
internals, with the SYSINIT_KT fork hooks grafted in and tweaked to work
the same as the NetBSD one.

One thing I'd like to do shortly is get rid of nfsiod as a user initiated
process.  It makes sense for the nfs client code to create them on the
fly as needed up to a user settable limit.  This means that nfsiod
doesn't need to be in /sbin and is always "available".  This is a fair bit
easier to do outside of the SYSINIT_KT() framework.
1999-07-01 13:21:46 +00:00
peter
a9c3f31bb0 Slight tweak to fork1() calling conventions. Add a third argument so
the caller can easily find the child proc struct.  fork(), rfork() etc
syscalls set p->p_retval[] themselves.  Simplify the SYSINIT_KT() code
and other kernel thread creators to not need to use pfind() to find the
child based on the pid.  While here, partly tidy up some of the fork1()
code for RF_SIGSHARE etc.
1999-06-30 15:33:41 +00:00
mckusick
5b58f2f951 Convert buffer locking from using the B_BUSY and B_WANTED flags to using
lockmgr locks. This commit should be functionally equivalent to the old
semantics. That is, all buffer locking is done with LK_EXCLUSIVE
requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will
be done in future commits.
1999-06-26 02:47:16 +00:00
phk
a6b150ee30 Introduce the makebdev() function, it does the same as the makedev()
function for now, but that will change.
1999-06-01 18:56:26 +00:00
phk
5a312ef1f6 major(something) can never become NODEV. 1999-05-09 13:13:52 +00:00
phk
500e41bd71 I got tired of seeing all the cdevsw[major(foo)] all over the place.
Made a new (inline) function devsw(dev_t dev) and substituted it.

Changed to the BDEV variant to this format as well: bdevsw(dev_t dev)

DEVFS will eventually benefit from this change too.
1999-05-08 06:40:31 +00:00
phk
f57a01ebfc remove b_proc from struct buf, it's (now) unused.
Reviewed by:	dillon, bde
1999-05-06 20:00:34 +00:00
peter
459d4a2cc5 Fix up a few easy 'assignment used as truth value' and 'suggest parens
around && within ||' type warnings.  I'm pretty sure I have not masked
any problems here, I've committed real problem fixes seperately.
1999-05-06 18:44:42 +00:00
luoqi
af7e9be5cc Enable vmspace sharing on SMP. Major changes are,
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
  is added to point to per-cpu pages, per-cpu global variables are now
  accessed through this new selector (%fs). The selectors in gdt table are
  rearranged for cache line optimization.
- fask_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		John Dyson	<dyson@iquest.net>
		Julian Elischer	<julian@whistel.com>
		Bruce Evans	<bde@zeta.org.au>
		David Greenman	<dg@root.com>
1999-04-28 01:04:33 +00:00
dt
f13dd5fa6d Add standard padding argument to pread and pwrite syscall. That should make them
NetBSD compatible.

Add parameter to fo_read and fo_write. (The only flag FOF_OFFSET mean that
the offset is set in the struct uio).

Factor out some common code from read/pread/write/pwrite syscalls.
1999-04-04 21:41:28 +00:00
bde
ae4dfde7bc Added a used #include (don't depend on "vnode_if.h" including <sys/buf.h>). 1999-02-25 15:54:06 +00:00
luoqi
082d37c1ac Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This
is the preparation step for moving pmap storage out of vmspace proper.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		Matthew Dillion	<dillon@apollo.backplane.com>
1999-02-19 14:25:37 +00:00
dillon
a784c6e33d More const fixes for -Wall, -Wcast-qual 1999-01-29 23:18:50 +00:00
bde
cf8e8cebcc Removed bogus casts to c_caddr_t. This is part of terminating
c_caddr_t with extreme prejudice.  Here the original casts to
caddr_t were to support K&R compilers (or missing prototypes),
but the relevant source files require an ANSI compiler.
1999-01-29 08:29:05 +00:00
dillon
ca558df378 Fix warnings related to -Wall -Wcast-qual 1999-01-28 17:32:05 +00:00
dillon
975fba8a24 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-28 00:57:57 +00:00
dillon
a40e0249d4 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-27 21:50:00 +00:00
dillon
df24433bbe This is a rather large commit that encompasses the new swapper,
changes to the VM system to support the new swapper, VM bug
    fixes, several VM optimizations, and some additional revamping of the
    VM code.  The specific bug fixes will be documented with additional
    forced commits.  This commit is somewhat rough in regards to code
    cleanup issues.

Reviewed by:	"John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
1999-01-21 08:29:12 +00:00
des
ecc123364e Wrap two macros into do { ... } while (0), and fix the way they're used
in the kernel.

Reviewed by: bde
1998-12-15 17:38:33 +00:00
tegge
03c8e15ce9 Don't forget to update the pmap associated with aio daemons when adding
new page directory entries for a growing kernel virtual address space.
1998-11-27 01:14:21 +00:00
phk
13c66194f4 Nitpicking and dusting performed on a train. Removes trivial warnings
about unused variables, labels and other lint.
1998-10-25 17:44:59 +00:00
bde
faebf197e2 Fixed nonsense overflow checking (checking that a long variable is less
than INT_MAX after it has possibly overflowed).

Removed an unused variable and its associated 2 style bugs.

Removed unused includes.
1998-08-17 17:28:10 +00:00
bde
259a90a05b Cast between longs and pointers via intptr_t. There shouldn't be
nearly so many casts here.  Casting an pointer that was an integer
back to an integer just to compare it with -1 is bad, and casting
it back just to compare it with NULL is just wrong.
1998-07-15 06:51:14 +00:00
julian
788be0aa49 fix braino from yesterdays' megacommit
Not sure of the result of it..
(may or may not effect anything) but it's fixed now.
(found by: comparing what cvsup sent back to me with what I tested..)
1998-07-05 20:33:18 +00:00
julian
0262543b5f There is no such thing any more as "struct bdevsw".
There is only cdevsw (which should be renamed in a later edit to deventry
or something). cdevsw contains the union of what were in both bdevsw an
cdevsw entries.  The bdevsw[] table stiff exists and is a second pointer
to the cdevsw entry of the device. it's major is in d_bmaj rather than
d_maj. some cleanup still to happen (e.g. dsopen now gets two pointers
to the same cdevsw struct instead of one to a bdevsw and one to a cdevsw).

rawread()/rawwrite() went away as part of this though it's not strictly
the same  patch, just that it involves all the same lines in the drivers.

cdroms no longer have write() entries (they did have rawwrite (?)).
tapes no longer have support for bdev operations.

Reviewed by: Eivind Eklund and Mike Smith
	Changes suggested by eivind.
1998-07-04 22:30:26 +00:00
dfr
2e6fba7d51 64bit fixes: don't cast pointers to int. 1998-06-10 10:31:08 +00:00
des
396b114475 Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108. 1998-04-17 22:37:19 +00:00
phk
9b703b1455 Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't a atomic variable, so splfoo() protection were needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now uses the new variable time_second instead.

gettime() changed to getmicrotime(0.

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeard time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passwd &time as args.

Change profiling in ncr.c to use ticks instead of time.  Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by:	bde
1998-03-30 09:56:58 +00:00
dufault
8ed0defc6e Finish _POSIX_PRIORITY_SCHEDULING. Needs P1003_1B and
_KPOSIX_PRIORITY_SCHEDULING options to work.  Changes:

Change all "posix4" to "p1003_1b".  Misnamed files are left
as "posix4" until I'm told if I can simply delete them and add
new ones;

Add _POSIX_PRIORITY_SCHEDULING system calls for FreeBSD and Linux;

Add man pages for _POSIX_PRIORITY_SCHEDULING system calls;

Add options to LINT;

Minor fixes to P1003_1B code during testing.
1998-03-28 11:51:01 +00:00
bde
cd450d6714 Moved some #includes from <sys/param.h> nearer to where they are actually
used.
1998-03-28 10:33:27 +00:00
bde
e15b8a8b3a Removed a stale comment and staler code. 1998-02-25 06:30:15 +00:00
eivind
d7a6ab2803 Staticize. 1998-02-09 06:11:36 +00:00
eivind
4547a09753 Back out DIAGNOSTIC changes. 1998-02-06 12:14:30 +00:00
eivind
c552a9a1c3 Turn DIAGNOSTIC into a new-style option. 1998-02-04 22:34:03 +00:00
dyson
eb84a1691e Quiet some lint. 1997-12-10 04:14:23 +00:00
dyson
f22f07ffe0 Correct prototypes to match POSIX. Correct return code for aio_cancel.
Submitted by:	Alex Nash <nash@mcs.com>
1997-12-08 02:18:25 +00:00
dyson
ba00c38faa Fix a problem when creating a new kernel thread. In some cases, aio_read
or aio_write can return the pid of the new thread.  This is due to the
way that return values from system calls being passed by side-effect in
the proc structure now.  This commit fixes the problem with aio_read and
aio_write.
1997-12-01 18:41:08 +00:00
dyson
f6a7ea5e3a Fix error handling for VCHR type I/O. Also, fix another spl problem, and
remove alot of overly verbose debugging statements.
ioproclist {
	int aioprocflags;			/* AIO proc flags */
	TAILQ_ENTRY(aioproclist) list;		/* List of processes */
	struct proc *aioproc;			/* The AIO thread */
	TAILQ_HEAD (,aiocblist) jobtorun;	/* suggested job to run */
};

/*
 * data-structure for lio signal management
 */
struct aio_liojob {
	int lioj_flags;
	int	lioj_buffer_count;
	int	lioj_buffer_finished_count;
	int	lioj_queue_count;
	int	lioj_queue_finished_count;
	struct sigevent lioj_signal;	/* signal on all I/O done */
	TAILQ_ENTRY (aio_liojob) lioj_list;
	struct kaioinfo *lioj_ki;
};
#define	LIOJ_SIGNAL			0x1 /* signal on all done (lio) */
#define	LIOJ_SIGNAL_POSTED	0x2	/* signal has been posted */

/*
 * per process aio data structure
 */
struct kaioinfo {
	int	kaio_flags;			/* per process kaio flags */
	int	kaio_maxactive_count;	/* maximum number of AIOs */
	int	kaio_active_count;	/* number of currently used AIOs */
	int	kaio_qallowed_count;	/* maxiumu size of AIO queue */
	int	kaio_queue_count;	/* size of AIO queue */
	int	kaio_ballowed_count;	/* maximum number of buffers */
	int	kaio_queue_finished_count;	/* number of daemon jobs finished */
	int	kaio_buffer_count;	/* number of physio buffers */
	int	kaio_buffer_finished_count;	/* count of I/O done */
	struct proc *kaio_p;			/* process that uses this kaio block */
	TAILQ_HEAD (,aio_liojob) kaio_liojoblist;	/* list of lio jobs */
	TAILQ_HEAD (,aiocblist)	kaio_jobqueue;	/* job queue for process */
	TAILQ_HEAD (,aiocblist)	kaio_jobdone;	/* done queue for process */
	TAILQ_HEAD (,aiocblist)	kaio_bufqueue;	/* buffer job queue for process */
	TAILQ_HEAD (,aiocblist)	kaio_bufdone;	/* buffer done queue for process */
};

#define KAIO_RUNDOWN 0x1		/* process is being run down */
#define KAIO_WAKEUP 0x2			/* wakeup process when there is a significant
								   event */


TAILQ_HEAD (,aioproclist) aio_freeproc, aio_activeproc;
TAILQ_HEAD(,aiocblist) aio_jobs;			/* Async job list */
TAILQ_HEAD(,aiocblist) aio_bufjobs;			/* Phys I/O job list */
TAILQ_HEAD(,aiocblist) aio_freejobs;		/* Pool of free jobs */

static void aio_init_aioinfo(struct proc *p) ;
static void aio_onceonly(void *) ;
static int aio_free_entry(struct aiocblist *aiocbe);
static void aio_process(struct aiocblist *aiocbe);
static int aio_newproc(void) ;
static int aio_aqueue(struct proc *p, struct aiocb *job, int type) ;
static void aio_physwakeup(struct buf *bp);
static int aio_fphysio(struct proc *p, struct aiocblist *aiocbe, int type);
static int aio_qphysio(struct proc *p, struct aiocblist *iocb);
static void aio_daemon(void *uproc);

SYSINIT(aio, SI_SUB_VFS, SI_ORDER_ANY, aio_onceonly, NULL);

static vm_zone_t kaio_zone=0, aiop_zone=0,
	aiocb_zone=0, aiol_zone=0, aiolio_zone=0;

/*
 * Single AIOD vmspace shared amongst all of them
 */
static struct vmspace *aiovmspace = NULL;

/*
 * Startup initialization
 */
void
aio_onceonly(void *na)
{
	TAILQ_INIT(&aio_freeproc);
	TAILQ_INIT(&aio_activeproc);
	TAILQ_INIT(&aio_jobs);
	TAILQ_INIT(&aio_bufjobs);
	TAILQ_INIT(&aio_freejobs);
	kaio_zone = zinit("AIO", sizeof (struct kaioinfo), 0, 0, 1);
	aiop_zone = zinit("AIOP", sizeof (struct aioproclist), 0, 0, 1);
	aiocb_zone = zinit("AIOCB", sizeof (struct aiocblist), 0, 0, 1);
	aiol_zone = zinit("AIOL", AIO_LISTIO_MAX * sizeof (int), 0, 0, 1);
	aiolio_zone = zinit("AIOLIO",
		AIO_LISTIO_MAX * sizeof (struct aio_liojob), 0, 0, 1);
	aiod_timeout = AIOD_TIMEOUT_DEFAULT;
	aiod_lifetime = AIOD_LIFETIME_DEFAULT;
	jobrefid = 1;
}

/*
 * Init the per-process aioinfo structure.
 * The aioinfo limits are set per-process for user limit (resource) management.
 */
void
aio_init_aioinfo(struct proc *p)
{
	struct kaioinfo *ki;
	if (p->p_aioinfo == NULL) {
		ki = zalloc(kaio_zone);
		p->p_aioinfo = ki
1997-12-01 07:01:45 +00:00
dyson
4f2c953311 Correct a last minute code change. Would have been an infinite loop under
certain error conditions.
Submitted by:	pst@shockwave.com
1997-11-30 23:21:08 +00:00
dyson
f2ef972c8c Fix an spl nit. 1997-11-30 21:47:36 +00:00
dyson
eb130e286e Finish up the vast majority of the AIO/LIO functionality. Proper signal
support was missing in the previous version of the AIO code.  More
tunables added, and very efficient support for VCHR files has been added.
Kernel threads are not used for VCHR files, all work for such files is
done for the requesting process directly.  Some attempt has been made to
charge the requesting process for resource utilization, but more work
is needed.  aio_fsync is still missing (but the original fsync system
call can be used for now.)  aio_cancel is essentially a noop, but that
is okay per POSIX.  More aio_cancel functionality can be added later,
if it is found to be needed.

The functions implemented include:
	aio_read, aio_write, lio_listio, aio_error, aio_return,
	aio_cancel, aio_suspend.

The code has been implemented to support the POSIX spec 1003.1b
(formerly known as POSIX 1003.4 spec) features of the above.  The
async I/O features are truly async, with the VCHR mode of operation
being essentially the same as physio (for appropriate files) for
maximum efficiency.  This code also supports the signal capability,
is highly tunable, allowing management of resource usage, and
has been written to allow a per process usage quota.

Both the O'Reilly POSIX.4 book and the actual POSIX 1003.1b document
were the reference specs used.  Any filedescriptor can be used with
these new system calls.  I know of no exceptions where these
system calls will not work.  (TTY's will also probably work.)
1997-11-30 04:36:31 +00:00
dyson
bdc807a3bb Disable the VCHR optimization for AIO until I have implemented it. Just in
case anyone wants to play with the POSIX AIO/LIO stuff.  (As it is, it should
work with ANY vnode, on UP systems only, for now.)
1997-11-29 02:57:46 +00:00
dyson
6781e003d5 Fix and complete the AIO syscalls. There are some performance enhancements
coming up soon, but the code is functional.  Docs will be forthcoming.
1997-11-29 01:33:10 +00:00
bde
a9d75411ee Get locking stuff by #including <sys/lock.h> instead of <sys/vnode.h>. 1997-11-18 10:02:40 +00:00
phk
4d26888936 Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by:	-Wunused
1997-11-07 08:53:44 +00:00
phk
4c8218a5c7 Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warning, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
recompiled.
1997-11-06 19:29:57 +00:00
phk
36e7a51ea1 Last major round (Unless Bruce thinks of somthing :-) of malloc changes.
Distribute all but the most fundamental malloc types.  This time I also
remembered the trick to making things static:  Put "static" in front of
them.

A couple of finer points by:	bde
1997-10-12 20:26:33 +00:00
phk
645e7b2ab6 Distribute and statizice a lot of the malloc M_* types.
Substantial input from:	bde
1997-10-11 18:31:40 +00:00
dyson
a00683f93c Make the target for the number of AIO daemons work. 1997-10-11 01:07:03 +00:00
dyson
0b1a7c5e85 Major cleanup and debugging of the new AIO/LIO code. The LIO code is
now corrected.  New tunables/instrumentation added.  The code is now
likely "good enough to use."  I will add the userland support soon.
The "high performance" mode for raw devices is still missing, and will
be added next.  POSIX system calls that now appear to work:
aio_cancel, aio_error, aio_read, aio_return, aio_suspend, aio_write,
lio_listio.  Missing, but to be added soon: aio_fsync.
1997-10-09 04:14:41 +00:00
bde
6ffb8bf9af Removed unused #includes. 1997-09-02 20:06:59 +00:00
dyson
8e56f80f0e Clean up some lint associated with the AIO code. 1997-07-17 04:49:43 +00:00
dyson
532dea9042 This is an upgrade so that the kernel supports the AIO calls from
POSIX.4.  Additionally, there is some initial code that supports LIO.
This code supports AIO/LIO for all types of file descriptors, with
few if any restrictions.  There will be a followup very soon that
will support significantly more efficient operation for VCHR type
files (raw.)  This code is also dependent on some kernel features
that don't work under SMP yet.  After I commit the changes to the
kernel to support proper address space sharing on SMP, this code
will also work under SMP.
1997-07-06 02:40:43 +00:00
dyson
195ce8e1d0 Add initial AIO/LIO kernel thread support files. This is preliminary, and
further features will be added.
1997-06-16 00:27:26 +00:00