Commit Graph

1281 Commits

Author SHA1 Message Date
mckusick
a697c3ed65 Fixes a bug that caused UFS2 filesystems bigger than 2TB to
prematurely report that they were full and/or to panic the kernel
with the message ``ffs_clusteralloc: allocated out of group''.

Submitted by:	Henry Whincup <henry@jot.to>
MFC after:	1 week
2004-12-09 21:24:00 +00:00
phk
125513db75 Fix snapshot creation. 2004-12-08 11:54:06 +00:00
phk
8bef9a211a Fix nfs exports (for now). The real fix is to teach mountd about
nmount.
2004-12-07 15:09:30 +00:00
phk
4a639d6164 The remaining part of nmount/omount/rootfs mount changes. I cannot sensibly
split the conversion of the remaining three filesystems out from the root
mounting changes, so in one go:

cd9660:
	Convert to nmount.
	Add omount compat shims.
	Remove dedicated rootfs mounting code.
	Use vfs_mountedfrom()
	Rely on vfs_mount.c calling VFS_STATFS()

nfs(client):
	Convert to nmount (the simple way, mount_nfs(8) is still necessary).
	Add omount compat shims.
	Drop COMPAT_PRELITE2 mount arg compatibility.

ffs:
	Convert to nmount.
	Add omount compat shims.
	Remove dedicated rootfs mounting code.
	Use vfs_mountedfrom()
	Rely on vfs_mount.c calling VFS_STATFS()

Remove vfs_omount() method, all filesystems are now converted.

Remove MNTK_WANTRDWR, handling RO/RW conversions is a filesystem
task, and they all do it now.

Change rootmounting to use DEVFS trampoline:

vfs_mount.c:
	Mount devfs on /.  Devfs needs no 'from' so this is clean.
	symlink /dev to /.  This makes it possible to lookup /dev/foo.
	Mount "real" root filesystem on /.
	Surgically move the devfs mountpoint from under the real root
	filesystem onto /dev in the real root filesystem.

Remove now unnecessary getdiskbyname().

kern_init.c:
	Don't do devfs mounting and rootvnode assignment here, it was
	already handled by vfs_mount.c.

Remove now unused bdevvp(), addaliasu() and addalias().  Put the
few necessary lines in devfs where they belong.  This eliminates the
second-last source of bogo vnodes, leaving only the lemming-syncer.

Remove rootdev variable, it doesn't give meaning in a global context and
was not trustworth anyway.  Correct information is provided by
statfs(/).
2004-12-07 08:15:41 +00:00
phk
6c14f71ef7 VFS_STATFS(mp, ...) is mostly called with &mp->mnt_stat, but a few cases
doesn't.  Most of the implementations have grown weeds for this so they
copy some fields from mnt_stat if the passed argument isn't that.

Fix this the cleaner way:  Always call the implementation on mnt_stat
and copy that in toto to the VFS_STATFS argument if different.
2004-12-05 22:41:02 +00:00
marcel
8b42e21d12 Fix null-pointer indirect function calls introduced in the previous
commit. In the new world order, the transitive closure on the vector
operations is not precomputed. As such, it's unsafe to actually use
any of the function pointers in an indirect function call. They can
be null, and we need to use the default vector in that case.
This is mostly a quick fix for the four function pointers that are
ed explicitly. A more generic or scalable solution is likely to see
the light of day.

No pathos on: current@
2004-12-05 22:30:28 +00:00
phk
a3935187b0 typo in comment. 2004-12-03 20:36:55 +00:00
phk
59f305606c Back when VOP_* was introduced, we did not have new-style struct
initializations but we did have lofty goals and big ideals.

Adjust to more contemporary circumstances and gain type checking.

	Replace the entire vop_t frobbing thing with properly typed
	structures.  The only casualty is that we can not add a new
	VOP_ method with a loadable module.  History has not given
	us reason to belive this would ever be feasible in the the
	first place.

	Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.

	Give coda correct prototypes and function definitions for
	all vop_()s.

	Generate a bit more data from the vnode_if.src file:  a
	struct vop_vector and protype typedefs for all vop methods.

	Add a new vop_bypass() and make vop_default be a pointer
	to another struct vop_vector.

	Remove a lot of vfs_init since vop_vector is ready to use
	from the compiler.

	Cast various vop_mumble() to void * with uppercase name,
	for instance VOP_PANIC, VOP_NULL etc.

	Implement VCALL() by making vdesc_offset the offsetof() the
	relevant function pointer in vop_vector.  This is disgusting
	but since the code is generated by a script comparatively
	safe.  The alternative for nullfs etc. would be much worse.

	Fix up all vnode method vectors to remove casts so they
	become typesafe.  (The bulk of this is generated by scripts)
2004-12-01 23:16:38 +00:00
phk
05b9cb2a46 Mechanically change prototypes for vnode operations to use the new typedefs. 2004-12-01 12:24:41 +00:00
phk
5525bf4639 Use system wide no-op vfs_start function. 2004-11-25 09:11:27 +00:00
jeff
9caab2e843 - Eliminate the acquisition and release of the bqlock in bremfree() by
setting the B_REMFREE flag in the buf.  This is done to prevent lock order
   reversals with code that must call bremfree() with a local lock held.
   This also reduces overhead by removing two lock operations per buf for
   fsync() and similar.
 - Check for the B_REMFREE flag in brelse() and bqrelse() after the bqlock
   has been acquired so that we may remove ourself from the free-list.
 - Provide a bremfreef() function to immediately remove a buf from a
   free-list for use only by NFS.  This is done because the nfsclient code
   overloads the b_freelist queue for its own async. io queue.
 - Simplify the numfreebuffers accounting by removing a switch statement
   that executed the same code in every possible case.
 - getnewbuf() can encounter locked bufs on free-lists once Giant is removed.
   Remove a panic associated with this condition and delay asserts that
   inspect the buf until after it is locked.

Reviewed by:	phk
Sponsored by:	Isilon Systems, Inc.
2004-11-18 08:44:09 +00:00
phk
d8b3df3cb9 Make VOP_BMAP return a struct bufobj for the underlying storage device
instead of a vnode for it.

The vnode_pager does not and should not have any interest in what
the filesystem uses for backend.

(vfs_cluster doesn't use the backing store argument.)
2004-11-15 09:18:27 +00:00
phk
0eb6213e4d Be prepared to accept NULL mountargs as part of root-mounting. 2004-11-13 13:04:31 +00:00
phk
488ffe7864 Put back the vfs_object_create() calls, they do make a difference when
my test-setup does what I want it to instead of what I ask it to.

Pointed out by:	tegge
2004-11-12 10:27:14 +00:00
phk
95d99361de fix some comments 2004-11-10 06:53:31 +00:00
phk
a633563324 Use mount flags instead of NULL path to detect root filesystem mount. 2004-11-09 23:38:10 +00:00
phk
8fe1c85657 Stop pretending to have a vm_object backing the underlying disk vnode:
it isn't used for anything anywhere and the vnode_pager would explode
if we attempted to.
2004-11-09 23:12:45 +00:00
phk
723cc1105c Properly implement a default version of VOP_GETWRITEMOUNT.
Remove improper access to vop_stdgetwritemount() which should and
will instead rely on the VOP default path.
2004-11-06 11:41:22 +00:00
phk
248c63e073 Don't grab the exclusive bit on a root filesystem until we are willing
to mount it.  Doing so prevented fsck to be run after a refused mount.
2004-11-04 09:11:22 +00:00
phk
d9d9558b8b Move UFS from DEVFS backing to GEOM backing.
This eliminates a bunch of vnode overhead (approx 1-2 % speed
improvement) and gives us more control over the access to the storage
device.

Access counts on the underlying device are not correctly tracked and
therefore it is possible to read-only mount the same disk device multiple
times:
	syv# mount -p
	/dev/md0        /var    ufs rw  2 2
	/dev/ad0        /mnt    ufs ro  1 1
	/dev/ad0        /mnt2   ufs ro  1 1
	/dev/ad0        /mnt3   ufs ro  1 1

Since UFS/FFS is not a synchrousely consistent filesystem (ie: it caches
things in RAM) this is not possible with read-write mounts, and the system
will correctly reject this.

Details:

	Add a geom consumer and a bufobj pointer to ufsmount.

	Eliminate the vnode argument from softdep_disk_prewrite().
	Pick the vnode out of bp->b_vp for now.  Eventually we
	should find it through bp->b_bufobj->b_private.

	In the mountcode, use g_vfs_open() once we have used
	VOP_ACCESS() to check permissions.

	When upgrading and downgrading between r/o and r/w do the
	right thing with GEOM access counts.  Remove all the
	workarounds for not being able to do this with VOP_OPEN().

	If we are the root mount, drop the exclusive access count
	until we upgrade to r/w.  This allows fsck of the root
	filesystem and the MNT_RELOAD to work correctly.

	Set bo_private to the GEOM consumer on the device bufobj.

	Change the ffs_ops->strategy function to call g_vfs_strategy()

	In ufs_strategy() directly call the strategy on the disk
	bufobj.  Same in rawread.

	In ffs_fsync() we will no longer see VCHR device nodes, so
	remove code which synced the filesystem mounted on it, in
	case we came there.  I'm not sure this code made sense in
	the first place since we would have taken the specfs route
	on such a vnode.

	Redo the highly bogus readblock() function in the snapshot
	code to something slightly less bogus: Constructing an uio
	and using physio was really quite a detour.  Instead just
	fill in a bio and ship it down.
2004-10-29 10:15:56 +00:00
phk
414fca23b7 We only support backing UFS/FFS with disks. 2004-10-28 06:19:28 +00:00
phk
a75ad58326 Eliminate unnecessary KASSERTS. 2004-10-27 06:45:06 +00:00
phk
2678190fba KASSERT that we only get to prewrite() on writes. 2004-10-26 20:13:49 +00:00
phk
08b5d8832a White space changes. Add missing static. 2004-10-26 20:13:21 +00:00
phk
5e6094204e Replace single case switch() with if(). 2004-10-26 20:12:25 +00:00
phk
7cd4756a3b Vertically align comment. 2004-10-26 20:12:00 +00:00
phk
fd2239c999 The island council met and voted buf_prewrite() home.
Give ffs it's own bufobj->bo_ops vector and create a private strategy
routine, (currently misnamed for forwards compatibility), which is
just a copy of the generic bufstrategy routine except we call
softdep_disk_prewrite() directly instead of through the buf_prewrite()
indirection.

Teach UFS about the need for softdep_disk_prewrite() and call the
function directly in FFS.

Remove buf_prewrite() from the default bufstrategy() and from the
global bio_ops method vector.
2004-10-26 10:44:10 +00:00
phk
274ea3ac53 Fix syntax errors introduced by last commit.
Why isn't DIRECTIO in NOTES/LINT ?
2004-10-26 09:04:20 +00:00
phk
c66aa10c8e Put the I/O block size in bufobj->bo_bsize.
We keep si_bsize_phys around for now as that is the simplest way to pull
the number out of disk device drivers in devfs_open().  The correct solution
would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably mooth
when filesystems sit on GEOM, so don't bother for now.
2004-10-26 07:39:12 +00:00
phk
e0db6b548c Degeneralize the per cdev copyonwrite callback. The only possible value
is ffs_copyonwrite() and the only place it can be called from is FFS which
would never want to call another filesystems copyonwrite method, should one
exist, so there is no reason why anything generic should know about this.
2004-10-26 06:25:56 +00:00
phk
0e87ab8bc6 Loose the v_dirty* and v_clean* alias macros.
Check the count field where we just want to know the full/empty state,
rather than using TAILQ_EMPTY() or TAILQ_FIRST().
2004-10-25 09:14:03 +00:00
phk
3a8a530155 Remove vnode->v_bsize. This was a dead-end. 2004-10-25 07:50:59 +00:00
phk
1b25a59886 Move the buffer method vector (buf->b_op) to the bufobj.
Extend it with a strategy method.

Add bufstrategy() which do the usual VOP_SPECSTRATEGY/VOP_STRATEGY
song and dance.

Rename ibwrite to bufwrite().

Move the two NFS buf_ops to more sensible places, add bufstrategy
to them.

Add inlines for bwrite() and bstrategy() which calls through
buf->b_bufobj->b_ops->b_{write,strategy}().

Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
2004-10-24 20:03:41 +00:00
phk
52a089c526 Add b_bufobj to struct buf which eventually will eliminate the need for b_vp.
Initialize b_bufobj for all buffers.

Make incore() and gbincore() take a bufobj instead of a vnode.

Make inmem() local to vfs_bio.c

Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj)
also VI_MTX() to BO_MTX(),

Make buf_vlist_add() take a bufobj instead of a vnode.

Eliminate other uses of bp->b_vp where bp->b_bufobj will do.

Various minor polishing: remove "register", turn panic into KASSERT,
use new function declarations, TAILQ_FOREACH_SAFE() etc.
2004-10-22 08:47:20 +00:00
phk
3833976d12 Move the VI_BWAIT flag into no bo_flag element of bufobj and call it BO_WWAIT
Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write
count on a bufobj.  Bufobj_wdrop() replaces vwakeup().

Use these functions all relevant places except in ffs_softdep.c where
the use if interlocked_sleep() makes this impossible.

Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.
2004-10-21 15:53:54 +00:00
rwatson
049aec7270 Explicitly break out NETA license from Berkeley license to clearly
indicate license grant, as well as to indicate that NETA is asserting
only two clauses, not four clauses.

Requested by:	imp
2004-10-20 08:05:02 +00:00
njl
8b9984e218 Fix fsbtodb() for UFS1. This fixes an overflow for file sizes >1 TB,
allowing for sizes up to 4 TB.  This doesn't affect UFS2 since b is already
a 64 bit type, coincidental with daddr_t.

Submitted by:	bde
2004-10-09 20:16:06 +00:00
pjd
c944ef39d6 Back out changes which were introduced to delay mounting root file system.
Those changes were made on gmirror needs, but now gmirror handles this
by itself.
2004-10-05 11:26:43 +00:00
phk
fb7e95019c Remove support for accessing device nodes in UFS/FFS.
Device nodes can still be created and exported with NFS.
2004-09-28 13:30:58 +00:00
phk
6e31d065d3 Give cluster_write() an explicit vnode argument.
In the future a struct buf will not automatically point out a vnode for us.
2004-09-27 19:14:10 +00:00
pjd
99b0ffd3c0 Introduce new /boot/loader.conf variable: root_mount_delay.
It can be used to delay mounting root partition to give a chance to GEOM
providers to show up.
Now, when there is no needed provider, vfs_rootmount() function will look
for it every second and if it can't be find in defined time, it'll ask
for root device name (before this change it was done immediately).

This will allow to boot from gmirror device in degraded mode.
2004-09-23 10:13:18 +00:00
phk
73cf913d5f The getpages VOP was a good stab at getting scatter/gather I/O without
too much kernel copying, but it is not the right way to do it, and it is
in the way for straightening out the buffer cache.

The right way is to pass the VM page array down through the struct
bio to the disk device driver and DMA directly in to/out off the
physical memory.  Once the VM/buf thing is sorted out it is next on
the list.

Retire most of vnode method. ffs_getpages().  It is not clear if what is
left shouldn't be in the default implementation which we now fall back to.

Retire specfs_getpages() as well, as it has no users now.
2004-09-19 08:14:55 +00:00
phk
d90d8244cd Do not traverse list of snapshots if there isn't one.
Found by:	scottl
2004-09-16 17:28:56 +00:00
phk
f3cf14c41e Missed a place where snapshots were allocated in my last commit to
this file.
2004-09-16 15:58:18 +00:00
phk
a915c8947e Create struct snapdata which contains the snapshot fields from cdev
and the previously malloc'ed snapshot lock.

Malloc struct snapdata instead of just the lock.

Replace snapshot fields in cdev with pointer to snapdata (saves 16 bytes).

While here, give the private readblock() function a vnode argument
in preparation for moving UFS to access GEOM directly.
2004-09-13 07:29:45 +00:00
phk
2806321da1 Remove the buffercache/vnode side of BIO_DELETE processing in
preparation for integration of p4::phk_bufwork.  In the future,
local filesystems will talk to GEOM directly and they will consequently
be able to issue BIO_DELETE directly.  Since the removal of the fla
driver, BIO_DELETE has effectively been a no-op anyway.
2004-09-13 06:50:42 +00:00
phk
1912367ebb Create simple function init_va_filerev() for initializing a va_filerev
field.

Replace three instances of longhaired initialization va_filerev fields.

Added XXX comment wondering why we don't use random bits instead of
uptime of the system for this purpose.
2004-09-07 09:17:05 +00:00
csjp
d0350352a9 Currently, if the secure level is low enough, system flags can
be manipulated by prison root. In 4.x prison root can not manipulate
system flags, regardless of the security level. This behavior
should remain consistent to avoid any surprises which could lead
to security problems for system administrators which give out
privileged access to jails.

This commit changes suser_cred's flag argument from SUSER_ALLOWJAIL
to 0. This will prevent prison root from being able to manipulate
system flags on files.

This may be a MFC candidate for RELENG_5.

Discussed with:	cperciva
Reviewed by:	rwatson
Approved by:	bmilekic (mentor)
PR:		kern/70298
2004-08-22 02:03:41 +00:00
jhb
e4ddba3ab3 Generalize the UFS bad magic value used to determine when a filesystem
has only been partly initialized via newfs(8) so that it applies to both
UFS1 and UFS2.

Submitted by:	"Xin LI" delphij at frontfree dot net
MFC:		maybe?
2004-08-19 11:09:13 +00:00
dwmalone
2aab4410a1 When looking for some extra data to include in the hash, use the
address of the dirhash, rather than the first sizeof(struct dirhash
*) bytes of the structure (which, thankfully, seem to be constant).

Submitted by:	Ted Unangst <tedu@zeitbombe.org>
MFC after:	2 weeks
2004-08-16 10:00:44 +00:00
jmg
bc1805c6e8 Add locking to the kqueue subsystem. This also makes the kqueue subsystem
a more complete subsystem, and removes the knowlege of how things are
implemented from the drivers.  Include locking around filter ops, so a
module like aio will know when not to be unloaded if there are outstanding
knotes using it's filter ops.

Currently, it uses the MTX_DUPOK even though it is not always safe to
aquire duplicate locks.  Witness currently doesn't support the ability
to discover if a dup lock is ok (in some cases).

Reviewed by:	green, rwatson (both earlier versions)
2004-08-15 06:24:42 +00:00
phk
db95f8ec86 use bufdone() not biodone(). 2004-08-08 13:23:05 +00:00
phk
2d868d02cf Put a version element in the VFS filesystem configuration structure
and refuse initializing filesystems with a wrong version.  This will
aid maintenance activites on the 5-stable branch.

s/vfs_mount/vfs_omount/

s/vfs_nmount/vfs_mount/

Name our filesystems mount function consistently.

Eliminate the namiedata argument to both vfs_mount and vfs_omount.
It was originally there to save stack space.  A few places abused
it to get hold of some credentials to pass around.  Effectively
it is unused.

Reorganize the root filesystem selection code.
2004-07-30 22:08:52 +00:00
phk
075684f5fd Remove global variable rootdevs and rootvp, they are unused as such.
Add local rootvp variables as needed.

Remove checks for miniroot's in the swappartition.  We never did that
and most of the filesystems could never be used for that, but it had
still been copy&pasted all over the place.
2004-07-28 20:21:04 +00:00
kan
586367666d Avoid using casts as lvalues. Introduce DIP_SET macro which sets proper
inode field based on UFS version. Use DIP ro read values and DIP_SET
to modify them throughout FFS code base.
2004-07-28 06:41:27 +00:00
cperciva
d9fecc83c8 Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is
somewhat clearer, but more importantly allows for a consistent naming
scheme for suser_cred flags.

The old name is still defined, but will be removed in a few days (unless I
hear any complaints...)

Discussed with:	rwatson, scottl
Requested by:	jhb
2004-07-26 07:24:04 +00:00
phk
4880ba75c6 Make sure to update the mnt_stats before UFS1 extattr tried to
do I/O on the device.  Otherwise the blocksize is undefined in the
buffer cache.
2004-07-14 14:19:32 +00:00
alfred
8a1713aada Make VFS_ROOT() and vflush() take a thread argument.
This is to allow filesystems to decide based on the passed thread
which vnode to return.
Several filesystems used curthread, they now use the passed thread.
2004-07-12 08:14:09 +00:00
marcel
cdeb3179e7 Update for the KDB debugger framework:
o  Make debugging code conditional upon KDB.
o  Use kdb_backtrace() instead of backtrace().
o  Remove inclusion of opt_ddb.h.
2004-07-10 20:45:47 +00:00
phk
368b68e3c6 Explicity initialize vp->v_bsize. 2004-07-07 20:04:06 +00:00
phk
070a613a48 When we traverse the vnodes on a mountpoint we need to look out for
our cached 'next vnode' being removed from this mountpoint.  If we
find that it was recycled, we restart our traversal from the start
of the list.

Code to do that is in all local disk filesystems (and a few other
places) and looks roughly like this:

		MNT_ILOCK(mp);
	loop:
		for (vp = TAILQ_FIRST(&mp...);
		    (vp = nvp) != NULL;
		    nvp = TAILQ_NEXT(vp,...)) {
			if (vp->v_mount != mp)
				goto loop;
			MNT_IUNLOCK(mp);
			...
			MNT_ILOCK(mp);
		}
		MNT_IUNLOCK(mp);

The code which takes vnodes off a mountpoint looks like this:

	MNT_ILOCK(vp->v_mount);
	...
	TAILQ_REMOVE(&vp->v_mount->mnt_nvnodelist, vp, v_nmntvnodes);
	...
	MNT_IUNLOCK(vp->v_mount);
	...
	vp->v_mount = something;

(Take a moment and try to spot the locking error before you read on.)

On a SMP system, one CPU could have removed nvp from our mountlist
but not yet gotten to assign a new value to vp->v_mount while another
CPU simultaneously get to the top of the traversal loop where it
finds that (vp->v_mount != mp) is not true despite the fact that
the vnode has indeed been removed from our mountpoint.

Fix:

Introduce the macro MNT_VNODE_FOREACH() to traverse the list of
vnodes on a mountpoint while taking into account that vnodes may
be removed from the list as we go.  This saves approx 65 lines of
duplicated code.

Split the insmntque() which potentially moves a vnode from one mount
point to another into delmntque() and insmntque() which does just
what the names say.

Fix delmntque() to set vp->v_mount to NULL while holding the
mountpoint lock.
2004-07-04 08:52:35 +00:00
rwatson
7a9902cd18 Annotate that we don't check the returned data length from ufs_readdir()
because UFS uses fixed-size directory blocks.  When using this code with
other file systems, such as HFS+, the value of auio.uio_resid will need
to be taken into account.
2004-06-24 18:31:23 +00:00
rwatson
ef6253fcd5 Remove unnecessary setting of VV_SYSTEM on extended attribute backing
files.  When this flag is used in our port of this code to Darwin, it
caused remarkable pain, and doesn't offer a benefit in FreeBSD.
2004-06-24 18:17:41 +00:00
rwatson
95406ba9f6 Protect a non-text comment with a '-'. 2004-06-24 17:45:45 +00:00
rwatson
8e7d2654a9 White space cleanup: use spaces instead of tabs in variable declarations
local to a function.  Remove a couple of blank lines in variable
declarations.

In one case, explicitly test against NULL rather than using a pointer
as a boolean directly.
2004-06-24 17:44:14 +00:00
bde
61308bc09f Backed out previous commit. The dev_t -> `struct cdev *' changes have
lots of errors.  Blind substitution of "dev_t foo" by "struct cdev *foo"
in comments usually just created an English syntax error (e.g.,
"struct cdev *changes"), but here it did less than that since the dev_t
is a user dev_t.
2004-06-20 03:11:19 +00:00
kuriyama
bf763fabc7 Avoid deadlock which is caused by locking VDIR of parent and VREG of
snapshot itself in wrong order.
We can skip unlink check of that directory because it must have
snapshot in it.

Reviewed by:	mckusick and current@
2004-06-18 14:35:17 +00:00
phk
dfd1f7fd50 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
julian
6c9d81ae0d Nice, is a property of a process as a whole..
I mistakenly moved it to the ksegroup when breaking up the process
structure. Put it back in the proc structure.
2004-06-16 00:26:31 +00:00
stefanf
d7af95e868 Avoid assignments to cast expressions.
Reviewed by:	md5
Approved by:	das (mentor)
2004-06-08 13:08:19 +00:00
tjr
7a46b27935 Move TDF_DEADLKTREAT into td_pflags (and rename it accordingly) to avoid
having to acquire sched_lock when manipulating it in lockmgr(), uiomove(),
and uiomove_fromphys().

Reviewed by:	jhb
2004-06-03 01:47:37 +00:00
krion
2ca388c921 - Fix typo
Approved by:	tobez
2004-05-31 16:55:12 +00:00
kensmith
827f9222d6 Upon further review it was decided this piece of the msync(2)
fixes was applicable to HEAD, originally it was thought this
should only be done in RELENG_4.  Implement IO_INVAL in the vnode
op for writing by marking the buffer as "no cache".  This fix
has already been applied to RELENG_4 as Rev. 1.65.2.15 of
ufs/ufs/ufs_readwrite.c.

Reviewed by:	alc, tegge
2004-05-21 12:05:48 +00:00
kensmith
ed41973344 Style fixup in previous commit.
Noticed by:	bde (thanks!)
2004-05-19 18:06:21 +00:00
kensmith
7e5c41897c Change ffs_realloccg() to set the valid bits for the extended part of the
fragment to zero the valid parts of a VM_IO buffer.

RE would like this to be part of 4.10-RC3 so this will be MFC-ed immediately.

Reviewed by:	alc, tegge
2004-05-14 22:00:08 +00:00
bmilekic
97390ebf55 Revert previous change to this file because it breaks some
things which compare /etc/fstab entries to results from
getfsstat().  The real way to fix this is to make 'ufs2'
a recognized filesystem (for real, no beating around the
bush).

This should fix things like 'umount -a -t ufs' now.
Appologies for the previous breakage.
2004-04-29 15:10:42 +00:00
bmilekic
788a94ec83 The previous change to mount(8) to report ufs or ufs2 used
libufs, which only works for Charlie root.

This change reverts the introduction of libufs and moves the
check into the kernel.  Since the f_fstypename is the same
for both ufs and ufs2, we check fs_magic for presence of
ufs2 and copy "ufs2" explicitly instead.

Submitted by: Christian S.J. Peron <maneo@bsdpro.com>
2004-04-26 15:13:46 +00:00
bde
5e501817ae Record where half the bits in this file came from (from ufs_readwrite.c).
Damage to history from moving bits was especially large since a repo copy
is not feasible for partial files.
2004-04-07 11:21:18 +00:00
imp
cbeab61b3a Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and irc message from Robert
Watson saying that clause 3 can be removed from those files with an
NAI copyright that also have only a University of California
copyrights.

Approved by: core, rwatson
2004-04-07 03:47:21 +00:00
jhb
ea51c85889 Fix a paste-o from the buf_prewrite() cleanup commit and check for the
MNTK_SUSPEND flag on the correct vnode pointer in softdep_disk_prewrite().

Reviewed by:	phk
Tested by:	kensmith
2004-04-06 19:20:24 +00:00
mux
3ceb770141 Fix the remaining warnings of growfs(8) on my sparc64 box with
WARNS=6.  I don't change the WARNS level in the Makefile because I
didn't tested this on other archs.

The fs.h fix was suggested by:	marcel
Reviewed by:	md5(1)
2004-04-03 23:30:59 +00:00
kan
7decbcc07e Avoid doing bawrite to initialize inode block while holding cylinder
group block locked. If filesystem has any active snapshots, bawrite
can come back trying to allocate new snapshot data block from the same
cylinder group and cause panic due to recursive lock attempt.

PR:		64206
Reviewed by:	mckusick
Tested by:	pjd
2004-03-16 22:06:32 +00:00
phk
5c532f7fd4 When I was a kid my work table was one cluttered mess an cleaning it up
were a rather overwhelming task.  I soon learned that if you don't know
where you're going to store something, at least try to pile it next to
something slightly related in the hope that a pattern emerges.

Apply the same principle to the ffs/snapshot/softupdates code which have
leaked into specfs:  Add yet a buf-quasi-method and call it from the
only two places I can see it can make a difference and implement the
magic in ffs_softdep.c where it belongs.

It's not pretty, but at least it's one less layer violated.
2004-03-11 18:50:33 +00:00
phk
2a5e157787 Properly vector all bwrite() and BUF_WRITE() calls through the same path
and s/BUF_WRITE()/bwrite()/ since it now does the same as bwrite().
2004-03-11 18:02:36 +00:00
mckusick
962d700311 A more accurate test in the new ufs_lock than that in 1.235. 2004-02-23 19:05:05 +00:00
mckusick
5b78fad42b In the function clear_inodedeps(), a FREE_LOCK() should be called
AFTER the call to vn_start_write(), not before it. Otherwise, it is
possible to unlock it multiple times if the vn_start_write() fails.

Submitted by:	Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
2004-02-23 06:56:31 +00:00
mckusick
d1dbb3b2d4 Change UFS from using vop_stdlock to using its own ufs_lock.
In ufs_lock, check for attempts to acquire shared locks on
snapshot files and change them to be exclusive locks. This
change eliminates deadlocks and machine lockups reported in
-current since most read requests started using shared lock
requests.

Submitted by:	Jun Kuriyama <kuriyama@imgsrc.co.jp>
2004-02-23 06:40:17 +00:00
rwatson
90431761a2 Update my personal copyrights and NETA copyrights in the kernel
to use the "year1-year3" format, as opposed to "year1, year2, year3".
This seems to make lawyers more happy, but also prevents the
lines from getting excessively long as the years start to add up.

Suggested by:	imp
2004-02-22 00:33:12 +00:00
dwmalone
900024a6aa Abstract dirhash's locking using macros. This should make it easier to
use the same dirhash code on different branches/platforms.

Reviewed by:	Ted Unangst <tedu@zeitbombe.org>
Reviewed by:	iedowse
MFC after:	3 weeks
2004-02-15 21:39:35 +00:00
bde
4dca0a78ca Fixed some style bugs:
- don't unlock the vnode after vinvalbuf() only to have to relock it
  almost immediately.
- don't refer to devices classified by vn_isdisk() as block devices.
2004-02-14 04:41:13 +00:00
bde
5b9992a0bf MFextfs: backed out secondary changes in rev.1.40 that had become just
style bugs (a variable that is used only once, and misformattings).
2004-02-13 03:05:12 +00:00
kuriyama
5b2e0c4e9c Fix style bugs in previous commit.
Submitted by:	bde
2004-02-13 02:02:06 +00:00
bde
a2bb8cef87 Fixed some minor style bugs (English usage and formatting of binary
operators) in and near revs.1.169-1.170 (open mode bandaid).  This
(or better a proper fix) should have been done before cloning the
bandaid to many other file systems.
2004-02-12 16:52:24 +00:00
kuriyama
d9ccee2813 Reverse lock order by using local variable. This will shut up "acquiring
duplicate lock of same type" message.

Reviewed by:	mckusick
2004-02-12 08:52:08 +00:00
bde
7e5e459beb Removed more vestiges of vfs_ioopt:
- rev.1.42 of ffs_readwrite.c added a special case in ffs_read() for reads
  that are initially at EOF, and rev.1.62 of ufs_readwrite.c fixed
  timestamp bugs in it.  Removal of most of vfs_ioopt made it just and
  optimization, and removal of the vm object reference calls made it less
  than an optimization.  It was cloned in rev.1.94 of ufs_readwrite.c as
  part of cloning ffs_extwrite() although it was always less than an
  optimization in ffs_extwrite().
- some comments, compound statements and vertical whitespace were vestiges
  of dead code.
2004-02-11 15:27:26 +00:00
jhb
279b2b8278 Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count.  The plimit
  structure is treated similarly to struct ucred in that is is always copy
  on write, so having a reference to a structure is sufficient to read from
  it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading
  limits from a process to keep the limit structure from changing out from
  under you while reading from it.
- Various global limits that are ints are not protected by a lock since
  int writes are atomic on all the archs we support and thus a lock
  wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted
  behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return
  either an rlimit, or the current or max individual limit of the specified
  resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of
  other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit()
  (it didn't used the stackgap when it should have) but uses lim_rlimit()
  and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls,
  but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits.  It
  also no longer uses the stackgap for accessing sysctl's for the
  ibcs2_sysconf() syscall but uses kernel_sysctl() instead.  As a result,
  ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.

Submitted by:	mtm (mostly, I only did a few cleanups and catchups)
Tested on:	i386
Compiled on:	alpha, amd64
2004-02-04 21:52:57 +00:00
alc
b8f86642e4 Remove unnecessary vm object reference and deallocate calls from ffs_read()
and ffs_write().  These calls trace their origins to the dead vfs_ioopt
code, first appearing in revision 1.39 of ufs_readwrite.c.

Observed by:	bde
Discussed with:	tegge
2004-01-31 05:42:58 +00:00
ache
3951249708 Turn uio_resid/uio_offset comments into KASSERTs
Reviewed by:    bde
2004-01-27 11:28:38 +00:00
ache
359cfdfd99 Copy comment about caller check from ffs_read to ffs_extread, don't
check for uio_resid < 0 here too.
2004-01-23 06:00:41 +00:00
ache
1d8ca37452 Fix various panic() strings to reflect true function name to allow
easy grep.
Small code reorganization to look more logic.
Copy ffs_write check from prev. commit to ffs_extwrite.
2004-01-23 05:52:31 +00:00