Commit Graph

1075 Commits

Author SHA1 Message Date
rwatson
1963dd0ba4 Return EOPNOTSUPP for attempted EA operations on VCHR vnodes in UFS2;
if we permit them to occur, the kernel panics due to our performing
EA operations using VOP_STRATEGY on the vnode.  This went unnoticed
previously because there are very for users of device nodes on UFS2
due to the introduction of devfs.  However, this can come up with
the Linux compat directories and its hard-coded dev nodes (which will
need to go away as we move away from hard-coded device numbers).
This can come up if you use EA-intensive features such as ACLs and
MAC.

The proper fix is pretty complicated, but this band-aid would be
an excellent MFC candidate for the release.
2003-06-01 02:42:18 +00:00
phk
34f931b00b Remove unused variable.
Found by:       FlexeLint
2003-05-31 19:56:09 +00:00
phk
9bb8512e8f Remove unused local variables.
Found by:       FlexeLint
2003-05-31 18:17:32 +00:00
phk
0129a20107 The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent
deadlocks with vnode backed md(4) devices because md now uses a
kthread to run the bio requests instead of doing it directly from
the bio down path.
2003-05-31 16:42:45 +00:00
alc
57b29f0e87 Lock the vm object when performing vm_object_page_clean().
Approved by:	re (rwatson)
2003-05-18 22:02:51 +00:00
rwatson
7e2cfac5e0 Jeff added locking assertions that the VV_ flags on vnodes were modified
only while holding appropriate vnode locks.  This patch slides the lock
release for ufs_extattr_enable() to continue to hold the active vnode lock
on a backing file until after the flag change; it also acquires a vnode
lock when disabling an attribute and hence clearing a flag on the backing
vnode.  This permits VFS_DEBUG_LOCKS to run UFS1 extended attributes
without panicking, as well as preventing a potential race and vnode flag
problem.

Approved by:	re (jhb)
Pointed out by:	DEBUG_VFS_LOCKS
2003-05-15 21:07:33 +00:00
alc
f9966ce9e8 Lock the vm_object on entry to vm_object_vndeallocate(). 2003-05-03 20:28:26 +00:00
tjr
854348219c Do not attempt to free NULL dinodes (i_din1 or i_din2) in ffs_ifree().
These fields can be left as NULL if ffs_vget() allocates an inode but
fails before the dinode memory has been allocated. There are two cases
when this can occur: when we lose a race and another process has added
the inode to the hash, and when reading the inode off disk fails.

The bug was observed by Kris on one of the package-building machines.
See http://marc.theaimsgroup.com/?l=freebsd-current&m=105172731013411&w=2
In Kris's case, it was the bread() that failed because of a disk error.

The alternative to this patch is to ensure that ffs_vget() does not call
vput() when the inode that hasn't been properly initialised.
2003-05-01 06:41:59 +00:00
tjr
0b639b63af Free i_din2 instead of i_din1 in ffs_ifree() on UFS2 filesystems.
This is purely a cosmetic change because these members are in a
union together.
2003-05-01 06:38:27 +00:00
markm
6cc289554b Fix some easy, global, lint warnings. In most cases, this means
making some local variables static. In a couple of cases, this means
removing an unused variable.
2003-04-30 12:57:40 +00:00
kan
9468fdaf14 Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on:	standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
2003-04-29 13:36:06 +00:00
jhb
ced60d737a Lock both the proc lock and sched_lock when calling sched_nice since
kg_nice is now protected by both.  Being protected by both means that
other places in the kernel that want to read kg_nice only need one of the
two locks.
2003-04-22 20:45:38 +00:00
jeff
886a932d7b - Use the sched_nice() api instead of setting the nice value directly.
Tested by:	Steve Kargl <sgk@troutmask.apl.washington.edu>
2003-04-12 01:05:19 +00:00
alc
ac3de07d6f Sufficient access checks are performed by vmapbuf() that calling useracc()
is pointless.  Remove the call to useracc().

Don't reinitialize fields that are already initialized by getpbuf().

Reviewed by:	tegge
2003-04-06 19:26:30 +00:00
tegge
23e9ae3483 Check return value from vmapbuf instead of the function address. 2003-03-27 20:48:34 +00:00
tegge
d1a3d87bf5 Eliminate a buffer sleep/wakeup race. 2003-03-27 19:28:11 +00:00
tegge
ede5ebede7 Add support for reading directly from file to userland buffer when the
O_DIRECT descriptor status flag is set and both offset and length is a
multiple of the physical media sector size.
2003-03-26 23:40:42 +00:00
jhb
b8b062b09b Use td->td_ucred instead of td->td_proc->p_ucred. 2003-03-20 21:17:40 +00:00
jhb
15ebade0f4 Minor fixes to ffs_fserr():
- Assume that curthread is not NULL.  It never is in -current.
- Use td_ucred instead of p_ucred.
2003-03-20 21:15:54 +00:00
phk
e059b79437 Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a newsted include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.
2003-03-18 08:45:25 +00:00
jeff
ae3c8799da - Remove a race between fsync like functions and flushbufqueues() by
requiring locked bufs in vfs_bio_awrite().  Previously the buf could
   have been written out by fsync before we acquired the buf lock if it
   weren't for giant.  The cluster_wbuild() handles this race properly but
   the single write at the end of vfs_bio_awrite() would not.
 - Modify flushbufqueues() so there is only one copy of the loop.  Pass a
   parameter in that says whether or not we should sync bufs with deps.
 - Call flushbufqueues() a second time and then break if we couldn't find
   any bufs without deps.
2003-03-13 07:19:23 +00:00
mckusick
9c654a63e6 Use the appropriate size when zeroing out the unused portion
of a snapshot's copy of a superblock. This patch fixes a panic
when taking a snapshot of a 4096/512 filesystem.

Reported by:	Ian Freislich <ianf@za.uu.net>
Sponsored by:   DARPA & NAI Labs.
2003-03-07 23:49:16 +00:00
alc
c50367da67 Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on:	arch@
2003-03-06 03:41:02 +00:00
jeff
4de0ae322c - Add a new 'flags' parameter to getblk().
- Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT
   flag to the initial BUF_LOCK().  This will eventually be used in cases
   were we want to use a buffer only if it is not currently in use.
 - Convert all consumers of the getblk() api to use this extra parameter.

Reviwed by:	arch
Not objected to by:	mckusick
2003-03-04 00:04:44 +00:00
njl
5a225ad933 Finish cleanup of vprint() which was begun with changing v_tag to a string.
Remove extraneous uses of vop_null, instead defering to the default op.
Rename vnode type "vfs" to the more descriptive "syncer".
Fix formatting for various filesystems that use vop_print.
2003-03-03 19:15:40 +00:00
des
2756b6c964 More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9). 2003-03-02 16:54:40 +00:00
mckusick
259fedfc3c Change the field used to test whether the superblock has been updated
from the filesystem size field to the filesystem maximum blocksize
field. The problem is that older versions of growfs updated only the
new size field and not the old size field. This resulted in the old
(smaller) size field being copied up to the new size field which
caused the filesystem to appear to fsck to be badly trashed.

This also adds a sanity check to ensure that the superblock is not
being updated when the filesystem is mounted read-only. Obviously
such an update should never happen.

Reported by:	Nate Lawson <nate@root.org>
Sponsored by:   DARPA & NAI Labs.
2003-02-25 23:21:08 +00:00
jeff
9e4c9a6ce9 - Add an interlock argument to BUF_LOCK and BUF_TIMELOCK.
- Remove the buftimelock mutex and acquire the buf's interlock to protect
   these fields instead.
 - Hold the vnode interlock while locking bufs on the clean/dirty queues.
   This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
   BUF_LOCK with a LK_TIMEFAIL to a single lock.

Reviewed by:	arch, mckusick
2003-02-25 03:37:48 +00:00
das
1cc0669761 Expand the reference count on struct dquot to 32 bits.
This fixes a panic on large systems where a single user
may have more than 64K active or inactive vnodes.

PR:		48234
Reviewed by:	mike (mentor)
2003-02-24 08:49:59 +00:00
mckusick
46e9534a11 When removing the last item from a non-empty worklist, the worklist
tail pointer must be updated.

Reported by:	Kris Kennaway <kris@obsecurity.org>
Sponsored by:   DARPA & NAI Labs.
2003-02-24 07:28:41 +00:00
mckusick
9ed5a11f37 This patch fixes a deadlock between the bufdaemon and a process taking
a snapshot. As part of taking a snapshot of a filesystem, the kernel
builds up a list of the filesystem metadata (such as the cylinder
group bitmaps) that are contained in the snapshot. When doing a
copy-on-write check, the list is first consulted. If the block being
written is found on the list, then the full snapshot lookup can be
avoided. Besides providing an important performance speedup this
check also avoids a potential deadlock between the code creating
the snapshot and the bufdaemon trying to cleanup snapshot related
buffers. This fix creates a temporary list containing the key
metadata blocks that can cause the deadlock. This temporary list
is used between the time that the snapshot is first enabled and the
time that the fully complete list is built.

Reported by:	Attila Nagy <bra@fsn.hu>
Sponsored by:   DARPA & NAI Labs.
2003-02-22 00:59:34 +00:00
mckusick
5340f28b15 This patch fixes a bug on an active filesystem on which a snapshot
is being taken from panicing with either "freeing free block" or
"freeing free inode". The problem arises when the snapshot code
is scanning the filesystem looking for inodes with a reference
count of zero (e.g., unlinked but still open) so that it can
expunge them from its view. If it encounters a reclaimed vnode
and has to restart its scan, then it will panic if it encounters
and tries to free an inode that it has already processed. The fix
is to check each candidate inode to see if it has already been
processed before trying to delete it from the snapshot image.

Sponsored by:   DARPA & NAI Labs.
2003-02-22 00:29:51 +00:00
mckusick
e6aeffdeb5 This patch fixes a bug in the logical block calculation macros so
that they convert to 64-bit values before shifting rather than
afterwards. Once fixed, they can be used rather than inline expanded.

Sponsored by:   DARPA & NAI Labs.
2003-02-22 00:19:26 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
mckusick
d9ebbec084 Replace use of random() with arc4random() to provide less guessable
values for the initial inode generation numbers in newfs and for
newly allocated inode generation numbers in the kernel.

Submitted by:	Theo de Raadt <deraadt@cvs.openbsd.org>
Sponsored by:   DARPA & NAI Labs.
2003-02-14 21:31:58 +00:00
mckusick
d8fb26b1c6 Correct lines incorrectly added to the copyright message.
Submitted by:	Frank van der Linden <fvdl@wasabisystems.com>
Sponsored by:   DARPA & NAI Labs.
2003-02-14 00:31:06 +00:00
jeff
87e306ad71 - Cleanup unlocked accesses to buf flags by introducing a new b_vflag member
that is protected by the vnode lock.
 - Move B_SCANNED into b_vflags and call it BV_SCANNED.
 - Create a vop_stdfsync() modeled after spec's sync.
 - Replace spec_fsync, msdos_fsync, and hpfs_fsync with the stdfsync and some
   fs specific processing.  This gives all of these filesystems proper
   behavior wrt MNT_WAIT/NOWAIT and the use of the B_SCANNED flag.
 - Annotate the locking in buf.h
2003-02-09 11:28:35 +00:00
alfred
86daf0cca6 Catch more uses of MIN(). 2003-02-02 13:30:00 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
dillon
ccd5574cc6 Bow to the whining masses and change a union back into void *. Retain
removal of unnecessary casts and throw in some minor cleanups to see if
anyone complains, just for the hell of it.
2003-01-13 00:33:17 +00:00
dillon
ddf9ef103e Change struct file f_data to un_data, a union of the correct struct
pointer types, and remove a huge number of casts from code using it.

Change struct xfile xf_data to xun_data (ABI is still compatible).

If we need to add a #define for f_data and xf_data we can, but I don't
think it will be necessary.  There are no operational changes in this
commit.
2003-01-12 01:37:13 +00:00
marcel
111c003344 o Improve wording of the comment that accompanies fs_pad. The
padding is not specific to non-i386 architectures. It is
   caused by non-i386 specific alignment requirements of
   fs_swuid,
o  Add a CTASSERT to catch a change in the size of struct fs
   at compile-time rather than run-time.

Ok'd: gordon
Tested on: i386 ia64
2003-01-10 06:59:34 +00:00
gordon
2af32f18dc Fix superblock alignment problems on non-i386 platforms. Also change fs_uuid
to fs_swuid, making it more descriptive.

Submitted by:	marcel
Reviewed by:	peter
Pointy hat to:	gordon
2003-01-09 23:53:30 +00:00
gordon
f1018f664a Steal some space from fs_fsmnt to create fs_volname and fs_uuid. The volname
will be used to support volume names with the help of a GEOM module (to be
committed). uuid will be used to deal with conflicting volume names (which
doesn't work just yet).

Approved by:	mckusick@
2003-01-08 22:53:54 +00:00
mckusick
db74e87c2d This patch fixes a problem caused by applications that rapidly and
repeatedly truncate the same file. Each time the file is truncated,
a buffer is grabbed to store the indirect block numbers that need
to be freed. Those blocks cannot be freed until the inode claiming
them is written to disk. Thus, the number of buffers being held by
soft updates explodes and in extreme cases can run the kernel out
of buffers. The problem can be avoided by doing an fsync on the
file every debug.maxindirdep truncates (currently defaulted to 50).
The fsync causes the inode to be written so that the held buffers
can be freed. The check for excessive buffers is checked as part
of the existing hook for excessive dependencies (softdep_slowdown)
in the truncate code.

Reported by:	David Schultz <dschultz@uclink.Berkeley.EDU>
Sponsored by:   DARPA & NAI Labs.
MFC after:	3 weeks
2003-01-07 18:23:50 +00:00
phk
131885aa2f Temporarily introduce a new VOP_SPECSTRATEGY operation while I try
to sort out disk-io from file-io in the vm/buffer/filesystem space.

The intent is to sort VOP_STRATEGY calls into those which operate
on "real" vnodes and those which operate on VCHR vnodes.  For
the latter kind, the call will be changed to VOP_SPECSTRATEGY,
possibly conditionally for those places where dual-use happens.

Add a default VOP_SPECSTRATEGY method which will call the normal
VOP_STRATEGY.  First time it is called it will print debugging
information.  This will only happen if a normal vnode is passed
to VOP_SPECSTRATEGY by mistake.

Add a real VOP_SPECSTRATEGY in specfs, which does what VOP_STRATEGY
does on a VCHR vnode today.

Add a new VOP_STRATEGY method in specfs to catch instances where
the conversion to VOP_SPECSTRATEGY has not yet happened.  Handle
the request just like we always did, but first time called print
debugging information.

Apart up to two instances of console messages per boot, this amounts
to a glorified no-op commit.

If you get any of the messages on your console I would very much
like a copy of them mailed to phk@freebsd.org
2003-01-04 22:10:36 +00:00
phk
157437ec08 Since Jeffr made the std* functions the default in rev 1.63 of
kern/vfs_defaults.c it is wrong for the individual filesystems to use
the std* functions as that prevents override of the default.

Found by:       src/tools/tools/vop_table
2003-01-04 08:47:19 +00:00
phk
daf6948653 Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since
all BUF_STRATEGY did in the first place was call VOP_STRATEGY.
2003-01-03 06:32:15 +00:00
schweikh
d3367c5f5d Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.
2003-01-01 18:49:04 +00:00
alfred
927595101c When compiling the kernel do not implicitly include filedesc.h from proc.h,
this was causing filedesc work to be very painful.
In order to make this work split out sigio definitions to thier own header
(sigio.h) which is included from proc.h for the time being.
2003-01-01 01:56:19 +00:00