Commit Graph

175 Commits

Author SHA1 Message Date
jake
a4ad237eaa - Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead
of explicit calls to lockmgr.  Also provides macros for the flags
  pased to specify shared, exclusive or release which map to the
  lockmgr flags.  This is so that the use of lockmgr can be easily
  replaced with optimized reader-writer locks.
- Add some locking that I missed the first time.
2000-12-13 00:17:05 +00:00
dwmalone
dd75d1d73b Convert more malloc+bzero to malloc+M_ZERO.
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
2000-12-08 21:51:06 +00:00
jake
0c0be4e826 Protect the following with a lockmgr lock:
allproc
	zombproc
	pidhashtbl
	proc.p_list
	proc.p_hash
	nextpid

Reviewed by:	jhb
Obtained from:	BSD/OS and netbsd
2000-11-22 07:42:04 +00:00
dillon
15a44d16ca This patchset fixes a large number of file descriptor race conditions.
Pre-rfork code assumed inherent locking of a process's file descriptor
    array.  However, with the advent of rfork() the file descriptor table
    could be shared between processes.  This patch closes over a dozen
    serious race conditions related to one thread manipulating the table
    (e.g. closing or dup()ing a descriptor) while another is blocked in
    an open(), close(), fcntl(), read(), write(), etc...

PR: kern/11629
Discussed with: Alexander Viro <viro@math.psu.edu>
2000-11-18 21:01:04 +00:00
phk
4e063f5534 Take VBLK devices further out of their missery.
This should fix the panic I introduced in my previous commit on this topic.
2000-11-02 21:14:13 +00:00
jhb
d944886e4d Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h
2000-10-20 07:58:15 +00:00
jasone
4e290e67b7 Convert lockmgr locks from using simple locks to using mutexes.
Add lockdestroy() and appropriate invocations, which corresponds to
lockinit() and must be called to clean up after a lockmgr lock is no
longer needed.
2000-10-04 01:29:17 +00:00
eivind
e6b9704a88 Add function comments for functions missing them 2000-09-14 19:13:59 +00:00
eivind
957c331f7f Blow away COMPAT_43 support for mount 2000-09-14 18:11:44 +00:00
bp
a7bc78c86d Add three new VOPs: VOP_CREATEVOBJECT, VOP_DESTROYVOBJECT and VOP_GETVOBJECT.
They will be used by nullfs and other stacked filesystems to support full
cache coherency.

Reviewed in general by:	mckusick, dillon
2000-09-12 09:49:08 +00:00
rwatson
15609dd6ef o Remove commented out code which modified return values from
extattr_{get,set} syscalls in the face of partial reads or writes.

Obtained from:	TrustedBSD Project
2000-09-05 02:13:14 +00:00
truckman
bc29ebd97d access() shouldn't diddle with the contents of a potentially shared
credential.  Create a temporary copy of the current credential and
modify the copy.

Submitted by:	tegge
2000-09-02 12:31:55 +00:00
tegge
31d32eb7b1 Don't set flags on the mount structure before all permission checks have
been done.

Don't allow multiple mount operations with MNT_UPDATE at the same
time on the same mount point.  When the first mount operation
completed, MNT_UPDATE was cleared in the mount structure, causing
the second to complete as if it was a no-update mount operation
with the following bad side effects:

        - mount structure inserted multiple times onto the mountlist
        - vp->v_mountedhere incorrectly set, causing next namei
          operation walking into the mountpoint to crash with
          a locking against myself panic.

Plug a vnode leak in case vinvalbuf fails.
2000-08-09 01:57:11 +00:00
rwatson
bf8742e138 o Modify extattr_{set,get}() syscalls so that partial reads and writes
with an error condition such as EINTR, EWOULDBLOCK, and ERESTART,
  are reported to the application, not silently conceal.  This
  behavior was copied from the {read,write}v() syscalls, and is
  appropriate there but not here.
o Correct a bug in extattr_delete() wherein the LOCKLEAF flag is
  passed to the wrong argument in namei(), resulting in some
  unexpected errors during name resolution, and passing in an unlocked
  vnode.

Obtained from:	TrustedBSD Project
2000-07-28 19:52:38 +00:00
rwatson
97118d11db o Lock vnode before calling extattr_* VOP's, and modify vnode spec to
allow for that.
o Remember to call NDFREE() if exiting as a result of a failed
  vn_start_write() when snapshotting.

Reviewed by:	mckusick
Obtained from:	TrustedBSD Project
2000-07-26 20:29:20 +00:00
mckusick
d439206673 Do not need vrele(nd.ni_vp) as that is done by NDFREE(&nd, 0);
Submitted by:	Peter Holm <pho@freebsd.org>
2000-07-25 05:38:54 +00:00
mckusick
a3d0c189ea Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
2000-07-11 22:07:57 +00:00
mckusick
040e64cd97 Move the truncation code out of vn_open and into the open system call
after the acquisition of any advisory locks. This fix corrects a case
in which a process tries to open a file with a non-blocking exclusive
lock. Even if it fails to get the lock it would still truncate the
file even though its open failed. With this change, the truncation
is done only after the lock is successfully acquired.

Obtained from:	 BSD/OS
2000-07-04 03:34:11 +00:00
phk
2a91a9dd04 Make the two calls from kern/* into softupdates #ifdef SOFTUPDATES,
that is way cleaner than using the softupdates_stub stunt, which
should be killed when convenient.

Discussed with:	mckusick
2000-07-03 13:26:54 +00:00
archie
0e6c8a1f1b Move the securelevel check before loading KLD's into linker_load_file(),
instead of requiring every caller of linker_load_file() to perform the
check itself. This avoids netgraph loading KLD's when securelevel > 0,
not to mention any future code that may call linker_load_file().

Reviewed by:	dfr
2000-06-29 17:57:04 +00:00
phk
cb90cb2b60 Revert part of my bioops change which implemented panic(8). 2000-06-16 14:32:13 +00:00
phk
4ec91666fa Virtualizes & untangles the bioops operations vector.
Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
2000-06-16 08:48:51 +00:00
phk
36c3965ff9 Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by:    peter
2000-05-05 09:59:14 +00:00
dillon
689641c1ea Commit major SMP cleanups and move the BGL (big giant lock) in the
syscall path inward.  A system call may select whether it needs the MP
    lock or not (the default being that it does need it).

    A great deal of conditional SMP code for various deadended experiments
    has been removed.  'cil' and 'cml' have been removed entirely, and the
    locking around the cpl has been removed.  The conditional
    separately-locked fast-interrupt code has been removed, meaning that
    interrupts must hold the CPL now (but they pretty much had to anyway).
    Another reason for doing this is that the original separate-lock for
    interrupts just doesn't apply to the interrupt thread mechanism being
    contemplated.

    Modifications to the cpl may now ONLY occur while holding the MP
    lock.  For example, if an otherwise MP safe syscall needs to mess with
    the cpl, it must hold the MP lock for the duration and must (as usual)
    save/restore the cpl in a nested fashion.

    This is precursor work for the real meat coming later: avoiding having
    to hold the MP lock for common syscalls and I/O's and interrupt threads.
    It is expected that the spl mechanisms and new interrupt threading
    mechanisms will be able to run in tandem, allowing a slow piecemeal
    transition to occur.

    This patch should result in a moderate performance improvement due to
    the considerable amount of code that has been removed from the critical
    path, especially the simplification of the spl*() calls.  The real
    performance gains will come later.

Approved by: jkh
Reviewed by: current, bde (exception.s)
Some work taken from: luoqi's patch
2000-03-28 07:16:37 +00:00
mckusick
2f9951ffbd Add bwillwrite to all system calls that create things in the filesystem.
Benchmarks that create huge trees of empty files overwhelm the buffer cache.
2000-01-10 00:08:53 +00:00
rwatson
4b6baecfc7 Second pass commit to introduce new ACL and Extended Attribute system
calls, vnops, vfsops, both in /kern, and to individual file systems that
require a vfsop_ array entry.

Reviewed by:	eivind
1999-12-19 06:08:07 +00:00
eivind
87724eb673 Introduce NDFREE (and remove VOP_ABORTOP) 1999-12-15 23:02:35 +00:00
dillon
7cbe0b8f16 Remove accidental pollution unrelated to previous commit. The issue
here is real but has not yet been discussed with Eivind.
1999-12-12 03:28:14 +00:00
dillon
b66fb2c648 Add MAP_NOSYNC feature to mmap(), and MADV_NOSYNC and MADV_AUTOSYNC to
madvise().

    This feature prevents the update daemon from gratuitously flushing
    dirty pages associated with a mapped file-backed region of memory.  The
    system pager will still page the memory as necessary and the VM system
    will still be fully coherent with the filesystem.  Modifications made
    by other means to the same area of memory, for example by write(), are
    unaffected.  The feature works on a page-granularity basis.

    MAP_NOSYNC allows one to use mmap() to share memory between processes
    without incuring any significant filesystem overhead, putting it in
    the same performance category as SysV Shared memory and anonymous memory.

Reviewed by: julian, alc, dg
1999-12-12 03:19:33 +00:00
phk
1adcecffd9 struct mountlist and struct mount.mnt_list have no business being
a CIRCLEQ.  Change them to TAILQ_HEAD and TAILQ_ENTRY respectively.

This removes ugly  mp != (void*)&mountlist  comparisons.

Requested by:   phk
Submitted by:   Jake Burkholder jake@checker.org
PR:             14967
1999-11-20 10:00:46 +00:00
dillon
13bf2dbc72 Ensure that garbage from the kernel stack does not wind up being
returned to user mode in the spare fields of the stat structure.

PR:		kern/14966
Reviewed by:	dillon@freebsd.org
Submitted by:	Kelly Yancey kbyanc@posi.net
1999-11-18 08:14:20 +00:00
phk
ec4e24bd52 Commit the remaining part of PR14914:
Alot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY
   structures for list operations.  This patch makes all list operations
   in sys/kern use the queue(3) macros, rather than directly accessing the
   *Q_{HEAD,ENTRY} structures.

Reviewed by:    phk
Submitted by:   Jake Burkholder <jake@checker.org>
PR:     14914
1999-11-16 16:28:58 +00:00
eivind
4ce73d7096 Remove WILLRELE from VOP_SYMLINK
Note: Previous commit to these files (except coda_vnops and devfs_vnops)
that claimed to remove WILLRELE from VOP_RENAME actually removed it from
VOP_MKNOD.
1999-11-13 20:58:17 +00:00
eivind
6d6289a5a6 Fix style bugs from last commit 1999-11-13 14:35:50 +00:00
eivind
21fff7b1c2 Remove WILLRELE from VOP_RENAME 1999-11-12 03:34:28 +00:00
julian
88e6664e72 Most modern OSs have the ability to flag certain mounts as ones to
be ignored by default by the df(1) program.  This is used mostly to
avoid stat()-ing entries that do not represent "real" disk mount
points (such as those made by an automounter such as amd.)  It is
also useful not to have to stat() these entries because it takes
longer to report them that for other file systems, being that these
mount points are served by a user-level file server and resulting in
several context switches.  Worse, if the automounter is down
unexpectedly, a causal df(1) will hang in an interruptible way.

PR:		kern/9764
Submitted by:	Erez Zadok <ezk@cs.columbia.edu>
1999-11-01 04:57:43 +00:00
peter
787140aa42 Trim unused options (or #ifdef for undoc options).
Submitted by:	phk
1999-10-11 15:19:12 +00:00
phk
322edeeaa9 Before we start to mess with the VFS name-cache clean things up a little bit:
Isolate the namecache in its own file, and give it a dedicated malloc type.
1999-10-03 12:18:29 +00:00
phk
073b941095 Remove v_maxio from struct vnode.
Replace it with mnt_iosize_max in struct mount.

Nits from:	bde
1999-09-29 20:05:33 +00:00
phk
3588b9beb9 Fix a hole in jail(2).
Noticed by:	Alexander Bezroutchko <abb@zenon.net>
1999-09-25 14:14:21 +00:00
alfred
b9136a6115 Seperate the export check in VFS_FHTOVP, exports are now checked via
VFS_CHECKEXP.

Add fh(open|stat|stafs) syscalls to allow userland to query filesystems
based on (network) filehandle.

Obtained from:	NetBSD
1999-09-11 00:46:08 +00:00
peter
3b842d34e8 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
phk
591c94d4c6 Simplify the handling of VCHR and VBLK vnodes using the new dev_t:
Make the alias list a SLIST.

        Drop the "fast recycling" optimization of vnodes (including
        the returning of a prexisting but stale vnode from checkalias).
        It doesn't buy us anything now that we don't hardlimit
        vnodes anymore.

        Rename checkalias2() and checkalias() to addalias() and
        addaliasu() - which takes dev_t and udev_t arg respectively.

        Make the revoke syscalls use vcount() instead of VALIASED.

        Remove VALIASED flag, we don't need it now and it is faster
        to traverse the much shorter lists than to maintain the
        flag.

        vfs_mountedon() can check the dev_t directly, all the vnodes
        point to the same one.

Print the devicename in specfs/vprint().

Remove a couple of stale LFS vnode flags.

Remove unimplemented/unused LK_DRAINED;
1999-08-26 14:53:31 +00:00
jdp
545b4031fc Go back to using microtime() to get the timestamps for {f,l,}utimes(path,
NULL) for now.  Bruce says I jumped the gun with my change in
revision 1.131, or maybe it should use nanotime(), or maybe it
shouldn't be decided in the VFS layer at all.  I'm leaving it with
the old behavior until the Trans-Pacific Internet Vulcan Mind Meld
yields fuller understanding.
1999-08-22 16:50:30 +00:00
jdp
36ba7111cd Use the new vfs_timestamp() function to create the timestamps used
by utimes(path, NULL).  This gives them the same precision as the
timestamps produced by write operations.  Do likewise for lutimes()
and futimes().

Suggested by bde.
1999-08-22 01:46:57 +00:00
alfred
e7ab2e043a Replace a redundant vfs_object_create() call (already done in vn_open)
with a KASSERT.

Reviewed by: Eivind, Alan Cox
1999-08-12 20:38:32 +00:00
green
c03366a55d Fix fd race conditions (during shared fd table usage.) Badfileops is
now used in f_ops in place of NULL, and modifications to the files
are more carefully ordered. f_ops should also be set to &badfileops
upon "close" of a file.

This does not fix other problems mentioned in this PR than the first
one.

PR:		11629
Reviewed by:	peter
1999-08-04 18:53:50 +00:00
imp
54b72818fb o Typo in prior version kept it from compiling (blush).
Noticed by: Nobody!

o Add comment about why we restrict chflags to root for devices.
o nit noticed by bde wrt return values.
1999-08-04 04:52:18 +00:00
imp
9e1a007dbf brucify:
o use suser_xxx rather than suser to support JAIL code.
	o KNF comment convention
	o use vp->type rather than vaddr.type and eliminate call to
	  VOP_GETATTR.  Bruce says that vp->type is valid at this
	  point.

Submitted by: bde.

Not fixed:
	o return (value)
	o Comment needs to be longer and more explicit.  It will be after
	  the advisory.
1999-08-03 17:07:04 +00:00
imp
b9c832cd1f Only allow root to set file flags on devices. 1999-08-02 21:34:46 +00:00