Commit Graph

650 Commits

Author SHA1 Message Date
Kirk McKusick
52a3bfa2e7 Cannot do MALLOC with M_WAITOK while holding ACQUIRE_LOCK
Obtained from:	Ethan Solomita <ethan@geocast.com>
2000-09-07 23:02:55 +00:00
Jason Evans
0384fff8c5 Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*().  See mutex(9).  (Note: The
  alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
  preempted (i386 only).

Partially contributed by:	BSDi (BSD/OS)
Submissions by (at least):	cp, dfr, dillon, grog, jake, jhb, sheldonh
2000-09-07 01:33:02 +00:00
Robert Watson
bbf0607700 Modify extended attribute protection model to authorize based on
attribute namespace and DAC protection on file:
	- Attribute names beginning with '$' are in the system namespace
	- The attribute name "$" is reserved
	- System namespace attributes may only be read/set by suser()
	  or by kernel (cred == NULL)
	- Other attribute names are in the application namespace
	- The attribute name "" is reserved
	- Application namespace attributes are protected in the manner
	  of the target file permission

o Kernel changes
	- Add ufs_extattr_valid_attrname() to check whether the requested
	  attribute "set" or "enable" is appropriate (i.e., non-reserved)
	- Modify ufs_extattr_credcheck() to accept target file vnode, not
	  to take inode uid
	- Modify ufs_extattr_credcheck() to check namespace, then enforce
	  either kernel/suser for system namespace, or vaccess() for
	  application namespace
o EA backing file format changes
	- Remove permission fields from extended attribute backing file
	  header
	- Bump extended attribute backing file header version to 3
o Update extattrctl.c and extattrctl.8
	- Remove now deprecated -r and -w arguments to initattr, as
	  permissions are now implicit
	- (unrelated) fix error reporting and unlinking during failed
	  initattr to remove duplicate/inaccurate error messages, and to
	  only unlink if the failure wasn't in the backing file open()

Obtained from:	TrustedBSD Project
2000-09-02 20:31:26 +00:00
Robert Watson
012c643d3e o Restructure vaccess() so as to check for DAC permission to modify the
object before falling back on privilege.  Make vaccess() accept an
  additional optional argument, privused, to determine whether
  privilege was required for vaccess() to return 0.  Add commented
  out capability checks for reference.  Rename some variables to make
  it more clear which modes/uids/etc are associated with the object,
  and which with the access mode.
o Update file system use of vaccess() to pass NULL as the optional
  privused argument.  Once additional patches are applied, suser()
  will no longer set ASU, so privused will permit passing of
  privilege information up the stack to the caller.

Reviewed by:	bde, green, phk, -security, others
Obtained from:	TrustedBSD Project
2000-08-29 14:45:49 +00:00
Robert Watson
877dd71fc6 o Correct spelling of ufs_exttatr_find_attr -> ufs_extattr_find_attr
o Add "const" qualifier to attrname argument of various calls to remove
  warnings

Obtained from:	TrustedBSD Project
2000-08-26 22:00:58 +00:00
Poul-Henning Kamp
3f54a085a6 Remove all traces of Julians DEVFS (incl from kern/subr_diskslice.c)
Remove old DEVFS support fields from dev_t.

  Make uid, gid & mode members of dev_t and set them in make_dev().

  Use correct uid, gid & mode in make_dev in disk minilayer.

  Add support for registering alias names for a dev_t using the
  new function make_dev_alias().  These will show up as symlinks
  in DEVFS.

  Use makedev() rather than make_dev() for MFSs magic devices to prevent
  DEVFS from noticing this abuse.

  Add a field for DEVFS inode number in dev_t.

  Add new DEVFS in fs/devfs.

  Add devfs cloning to:
        disk minilayer (ie: ad(4), sd(4), cd(4) etc etc)
        md(4), tun(4), bpf(4), fd(4)

  If DEVFS add -d flag to /sbin/inits args to make it mount devfs.

  Add commented out DEVFS to GENERIC
2000-08-20 21:34:39 +00:00
Poul-Henning Kamp
e39c53eda5 Centralize the canonical vop_access user/group/other check in vaccess().
Discussed with: bde
2000-08-20 08:36:26 +00:00
Tor Egge
b5ee7ec63a Initialize *countp to 0 in stub for softdep_flushworklist().
This allows ffs_fsync() to break out of a loop that might otherwise
be infinite on kernels compiled without the SOFTUPDATES option.
The observed symptom was a system hang at the first unmount attempt.
2000-08-09 00:41:54 +00:00
Ollivier Robert
8694d8e912 Fix the lockmgr panic everyone is seeing at shutdown time.
vput assumes curproc is the lock holder, but it's not true in this case.

Thanks a lot Luoqi !

Submitted by:	luoqi
Tested by:	phk
2000-08-01 14:15:07 +00:00
Peter Wemm
68e530258a Minor tweak - removed unused variable 'struct mount *mp'; 2000-07-28 22:28:05 +00:00
Peter Wemm
6ee6b42ef7 Minor change: fix warning - move a 'struct vnode *vp' declaration inside a
#ifdef DIAGNOSTIC to match its corresponding usage.
2000-07-28 22:27:00 +00:00
Kirk McKusick
3592b7155c Clean up the snapshot code so that it no longer depends on the use of
the SF_IMMUTABLE flag to prevent writing. Instead put in explicit
checking for the SF_SNAPSHOT flag in the appropriate places. With
this change, it is now possible to rename and link to snapshot files.
It is also possible to set or clear any of the owner, group, or
other read bits on the file, though none of the write or execute
bits can be set. There is also an explicit test to prevent the
setting or clearing of the SF_SNAPSHOT flag via chflags() or
fchflags(). Note also that the modify time cannot be changed as
it needs to accurately reflect the time that the snapshot was taken.

Submitted by:	Robert Watson <rwatson@FreeBSD.org>
2000-07-26 23:07:01 +00:00
Poul-Henning Kamp
a0580699e1 Fix the "mfs_badop[vop_getwritemount] = 45" messages. 2000-07-26 17:53:04 +00:00
Kirk McKusick
55ba28c60a Add stub for softdep_flushworklist() so that kernels compiled
without the SOFTUPDATES option will load correctly.

Obtained from:	John Baldwin <jhb@bsdi.com>
2000-07-25 05:28:59 +00:00
Kirk McKusick
d56bdab31c Eliminate periodic 'mfs_badop[vop_getwritemount] = 45' messages.
Submitted by:	Sheldon Hearn <sheldonh@uunet.co.za>
2000-07-25 05:11:57 +00:00
Kirk McKusick
9b97113391 This patch corrects the first round of panics and hangs reported
with the new snapshot code.

Update addaliasu to correctly implement the semantics of the old
checkalias function. When a device vnode first comes into existence,
check to see if an anonymous vnode for the same device was created
at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than
creating a new vnode for the device. This corrects a problem which
caused the kernel to panic when taking a snapshot of the root
filesystem.

Change the calling convention of vn_write_suspend_wait() to be the
same as vn_start_write().

Split out softdep_flushworklist() from softdep_flushfiles() so that
it can be used to clear the work queue when suspending filesystem
operations.

Access to buffers becomes recursive so that snapshots can recursively
traverse their indirect blocks using ffs_copyonwrite() when checking
for the need for copy on write when flushing one of their own indirect
blocks. This eliminates a deadlock between the syncer daemon and a
process taking a snapshot.

Ensure that softdep_process_worklist() can never block because of a
snapshot being taken. This eliminates a problem with buffer starvation.

Cleanup change in ffs_sync() which did not synchronously wait when
MNT_WAIT was specified. The result was an unclean filesystem panic
when doing forcible unmount with heavy filesystem I/O in progress.

Return a zero'ed block when reading a block that was not in use at
the time that a snapshot was taken. Normally, these blocks should
never be read. However, the readahead code will occationally read
them which can cause unexpected behavior.

Clean up the debugging code that ensures that no blocks be written
on a filesystem while it is suspended. Snapshots must explicitly
label the blocks that they are writing during the suspension so that
they do not cause a `write on suspended filesystem' panic.

Reorganize ffs_copyonwrite() to eliminate a deadlock and also to
prevent a race condition that would permit the same block to be
copied twice. This change eliminates an unexpected soft updates
inconsistency in fsck caused by the double allocation.

Use bqrelse rather than brelse for buffers that will be needed
soon again by the snapshot code. This improves snapshot performance.
2000-07-24 05:28:33 +00:00
Robert Watson
bc373480dc o Marius pointed out an unusually inconvenient upper bound on extended
attribute data size.
o Fortunately it turned out to be an unused constant left over from an
  earlier implementation, and is therefore being removed so as not to
  confuse casual observers.

Submitted by:	mbendiks@eunet.no
2000-07-14 03:30:52 +00:00
Boris Popov
3fbd97427e Prevent possible dereference of NULL pointer.
Submitted by:	Marius Bendiksen <mbendiks@eunet.no>
2000-07-13 02:17:14 +00:00
Kirk McKusick
d303f71fdc Brain fault, forgot to update ffs_snapshot.c with the new calling convention
for vn_start_write.
2000-07-12 00:27:27 +00:00
Kirk McKusick
f2a2857bb3 Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
2000-07-11 22:07:57 +00:00
Kirk McKusick
d4c1816924 Clean up warning about undeclared function by declaring softdep_fsync
in mount.h instead of ffs_extern.h. The correct solution is to use
an indirect function pointer so that the kernel does not have to be
built with options FFS, but that will be left for another day.
2000-07-11 19:28:26 +00:00
Poul-Henning Kamp
88bab4e40c Finish repo-copy:
Move ufs/ufs/ufs_disksubr.c to kern/subr_disklabel.c.

These functions are not UFS specific and are in fact used all over the place.
2000-07-10 13:48:06 +00:00
Kirk McKusick
cc3962a9cd Delete README as it is now obsolete. Relevant information is in
README.softupdates.
2000-07-08 02:32:49 +00:00
Kirk McKusick
876578906d Update to reflect current status. 2000-07-08 02:31:21 +00:00
Kirk McKusick
22e5a6234e Get userland visible flags added for snapshots to give a few days
advance preparation for them to get migrated into place so that
subsequent changes in utilities will not fail to compile for lack
of up-to-date header files in /usr/include.
2000-07-04 04:58:34 +00:00
Kirk McKusick
e6796b67d9 Move the truncation code out of vn_open and into the open system call
after the acquisition of any advisory locks. This fix corrects a case
in which a process tries to open a file with a non-blocking exclusive
lock. Even if it fails to get the lock it would still truncate the
file even though its open failed. With this change, the truncation
is done only after the lock is successfully acquired.

Obtained from:	 BSD/OS
2000-07-04 03:34:11 +00:00
Poul-Henning Kamp
3275cf7379 Make the two calls from kern/* into softupdates #ifdef SOFTUPDATES,
that is way cleaner than using the softupdates_stub stunt, which
should be killed when convenient.

Discussed with:	mckusick
2000-07-03 13:26:54 +00:00
Poul-Henning Kamp
a8b1f9d2c9 Move prtactive to vfs from ufs. It is used all over the place. 2000-06-27 07:46:22 +00:00
Andrey A. Chernov
2d90744fd8 Remove obsoleted info about linking from contrib 2000-06-24 13:29:25 +00:00
Kirk McKusick
858c16fab8 Update to new copyright. 2000-06-22 00:29:53 +00:00
Kirk McKusick
6019e6208f When running with quotas enabled on a filesystem using soft updates,
the system would panic when a user's inode quota was exceeded (see
PR 18959 for details). This fixes that problem.

PR:		18959
Submitted by:	Jason Godsey <jason@unixguy.fidalgo.net>
2000-06-18 22:14:28 +00:00
Kirk McKusick
d3abb52714 Some additional performance improvements. When freeing an inode
check to see if it has been committed to disk. If it has never
been written, it can be freed immediately. For short lived files
this change allows the same inode to be reused repeatedly.
Similarly, when upgrading a fragment to a larger size, if it
has never been claimed by an inode on disk, it too can be freed
immediately making it available for reuse often in the next slowly
growing block of the same file.
2000-06-18 22:05:57 +00:00
Poul-Henning Kamp
7c50d77218 Revert part of my bioops change which implemented panic(8). 2000-06-16 14:32:13 +00:00
Poul-Henning Kamp
7523681895 ARGH! I have too many source trees :-(
Fix prototype errors in last commit.
2000-06-16 13:00:33 +00:00
Poul-Henning Kamp
a2e7a027a7 Virtualizes & untangles the bioops operations vector.
Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
2000-06-16 08:48:51 +00:00
Poul-Henning Kamp
6ea6805f8c Remove a comment which should never have made it in. 2000-06-14 21:48:19 +00:00
Robert Watson
192851dbff o Remove unneeded off_t variable to clean up compile warning
Obtained from:	TrustedBSD Project
2000-06-05 14:22:51 +00:00
Robert Watson
b2b0497ab5 o If FFS_EXTATTR is defined, don't print out an error message on unmount
if an FFS partition returns EOPNOTSUPP, as it just means extended
  attributes weren't enabled on that partition.  Prevents spurious
  warning per-partition at shutdown.
2000-06-04 04:50:36 +00:00
Jake Burkholder
e39756439c Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
Jake Burkholder
740a1973a6 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
Robert Watson
f3706a0361 s/ffs_unmonut/ffs_unmount/ in a gratuitous ufs_extattr printf.
Reported by:	knu
2000-05-07 17:21:08 +00:00
Poul-Henning Kamp
9626b608de Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by:    peter
2000-05-05 09:59:14 +00:00
Robert Watson
a7e8b37043 Don't allow VOP_GETEXTATTR to set uio->uio_offset != 0, as we don't
provide locking over extended attribute operations, requiring that
individual operations be atomic.  Allowing non-zero starting offsets
permits applications/etc to put themselves at risk for inconsistent
behavior.  As VOP_SETEXTATTR already prohibited non-zero write offsets,
this makes sense.

Suggested by:	Andreas Gruenbacher <a.gruenbacher@bestbits.at>
2000-05-03 05:50:46 +00:00
Poul-Henning Kamp
2c9b67a8df Remove unneeded #include <vm/vm_zone.h>
Generated by:	src/tools/tools/kerninclude
2000-04-30 18:52:11 +00:00
Poul-Henning Kamp
87150cb06d s/biowait/bufwait/g
Prodded by: several.
2000-04-29 16:25:22 +00:00
Poul-Henning Kamp
eb95c536ad Remove unneeded #include <sys/kernel.h> 2000-04-29 15:36:14 +00:00
Kirk McKusick
0fa301ad86 When files are given to users by root, the quota system failed to
reset their grace timer as their ownership crossed the soft limit
threshhold. Thus if they had been over their limit in the past,
they were suddenly penalized as if they had been over their limit
ever since. The fix is to check when root gives away files, that
when the receiving user crosses their soft limit, their grace timer
is reset. See the PR report for a detailed method of reproducing
the bug.

PR:		kern/17128
Submitted by:	Andre Albsmeier <andre.albsmeier@mchp.siemens.de>
Reviewed by:	Kirk McKusick <mckusick@mckusick.com>
2000-04-28 06:12:56 +00:00
Poul-Henning Kamp
a1da82154d Convert the magic MFS device to a VCHR.
Detected by:	obrien
2000-04-22 05:45:38 +00:00
Robert Watson
747b0fa36c o Introduce an extended attribute backing file header magic number
o Introduce an extended attribute backing file header version number
2000-04-19 20:12:41 +00:00
Poul-Henning Kamp
3389ae9350 Remove ~25 unneeded #include <sys/conf.h>
Remove ~60 unneeded #include <sys/malloc.h>
2000-04-19 14:58:28 +00:00