394 Commits

Author SHA1 Message Date
phk
a5e8f5f094 Initialize b_saveaddr when we hand out buffers 2003-06-20 08:26:38 +00:00
alc
df7799dd77 Lock the vm object when removing a page. 2003-06-11 16:37:33 +00:00
obrien
3b8fff9e4c Use __FBSDID(). 2003-06-11 00:56:59 +00:00
phk
0129a20107 The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent
deadlocks with vnode backed md(4) devices because md now uses a
kthread to run the bio requests instead of doing it directly from
the bio down path.
2003-05-31 16:42:45 +00:00
alc
afa5d8b791 Finish the vm_object locking for this file, including holding the vm_object
lock when accessing the vm_object's flags or calling vm_page_lookup().
2003-04-28 05:40:45 +00:00
alc
3db24e0c46 - Lock the vm_object when performing vm_page_alloc() in allocbuf(). 2003-04-26 07:42:24 +00:00
alc
c9e51c9b11 Lock the vm_object in vfs_busy_pages(). 2003-04-20 00:17:05 +00:00
alc
dc48d3db81 - Lock the vm_object when performing vm_object_pip_subtract().
- Assert that the vm_object lock is held in vm_object_pip_subtract().
2003-04-19 22:11:41 +00:00
alc
ef4e8a19cf - Lock the vm_object when performing vm_object_pip_wakeupn().
- Assert that the vm_object lock is held in vm_object_pip_wakeupn().
 - Add a new macro VM_OBJECT_LOCK_ASSERT().
2003-04-19 21:15:44 +00:00
alc
a05b4b3347 Update locking on the kernel_object to use the new macros. 2003-04-14 00:36:53 +00:00
alc
e847e1c64f Remove an unnecessary trunc_page() from vmapbuf().
Reviewed by:	tegge
2003-04-06 00:40:54 +00:00
alc
cbd6318ffd o Check the b_bufsize passed to vmapbuf() returning an error
if it is invalid.
 o Remove a debugging printf() from vmapbuf().

Suggested by:   tegge
2003-04-04 06:14:54 +00:00
phk
a0fbf93755 Preparation commit before I start on the bioqueue lockdown:
Collect all the bits of bioqueue handling in subr_disk.c, vfs_bio.c is big
enough as it is and disksort already lives in subr_disk.c.
2003-03-30 08:51:23 +00:00
tegge
ede5ebede7 Add support for reading directly from file to userland buffer when the
O_DIRECT descriptor status flag is set and both offset and length are
multiples of the physical media sector size.
2003-03-26 23:40:42 +00:00
jake
783ae539c3 - Add vm_paddr_t, a physical address type. This is required for systems
where physical addresses are larger than virtual addresses, such as i386s
  with PAE.
- Use this to represent physical addresses in the MI vm system and in the
  i386 pmap code.  This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory
  detection code, and due to kvtop returning vm_paddr_t instead of u_long.

Note that this is a name change only; vm_paddr_t is still the same as
vm_offset_t on all currently supported platforms.

Sponsored by:	DARPA, Network Associates Laboratories
Discussed with:	re, phk (cdevsw change)
2003-03-25 00:07:06 +00:00
phk
e059b79437 Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a nested include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.
2003-03-18 08:45:25 +00:00
jeff
459181e3ed - Add a lock for protecting against msleep(bp, ...) wakeup(bp) races.
- Create a new function bdone() which sets B_DONE and calls wakeup(bp). This
   is suitable for use as b_iodone for buf consumers who are not going
   through the buf cache.
 - Create a new function bwait() which waits for the buf to be done at a set
   priority and with a specific wmesg.
 - Replace several cases where the above functionality was implemented
   without locking with the new functions.
2003-03-13 07:31:45 +00:00
jeff
ae3c8799da - Remove a race between fsync like functions and flushbufqueues() by
requiring locked bufs in vfs_bio_awrite().  Previously the buf could
   have been written out by fsync before we acquired the buf lock if it
   weren't for giant.  The cluster_wbuild() handles this race properly but
   the single write at the end of vfs_bio_awrite() would not.
 - Modify flushbufqueues() so there is only one copy of the loop.  Pass a
   parameter in that says whether or not we should sync bufs with deps.
 - Call flushbufqueues() a second time and then break if we couldn't find
   any bufs without deps.
2003-03-13 07:19:23 +00:00
jeff
4de0ae322c - Add a new 'flags' parameter to getblk().
- Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT
   flag to the initial BUF_LOCK().  This will eventually be used in cases
   were we want to use a buffer only if it is not currently in use.
 - Convert all consumers of the getblk() api to use this extra parameter.

Reviewed by:	arch
Not objected to by:	mckusick
2003-03-04 00:04:44 +00:00
jeff
8e95e91722 - Hold the vnode interlock across calls to bgetvp instead of acquiring it
internally.  This is required to stop multiple bufs from being associated
   with a single lblkno.
2003-03-02 06:05:23 +00:00
jeff
98d7696db0 - gc USE_BUFHASH. The smp locking of the buf cache renders this useless. 2003-03-01 05:55:03 +00:00
mckusick
0309ffd2e7 When doing cleanup of excessive buffers in bdwrite (see kern/vfs_bio.c
delta 1.371) we must ensure that we do not get ourselves into a
recursive trap endlessly trying to clean up after ourselves.

Reported by:	Attila Nagy <bra@fsn.hu>
Sponsored by:   DARPA & NAI Labs.
2003-02-25 23:59:09 +00:00
jeff
1228dbd648 - Add the missing NULL interlock argument to a recently added BUF_LOCK. 2003-02-25 08:23:11 +00:00
mckusick
6e9f6f2d6d Prevent large files from monopolizing the system buffers. Keep
track of the number of dirty buffers held by a vnode. When a
bdwrite is done on a buffer, check the existing number of dirty
buffers associated with its vnode. If the number rises above
vfs.dirtybufthresh (currently 90% of vfs.hidirtybuffers), one
of the other (hopefully older) dirty buffers associated with
the vnode is written (using bawrite). In the event that this
approach fails to curb the growth in the vnode's number of
dirty buffers (due to soft updates rollback dependencies),
the more drastic approach of doing a VOP_FSYNC on the vnode
is used. This code primarily affects very large and actively
written files such as snapshots. This change should eliminate
hanging when taking snapshots or doing background fsck on
very large filesystems.

Hopefully, one day it will be possible to cache filesystem
metadata in the VM cache as is done with file data. As it
stands, only the buffer cache can be used which limits total
metadata storage to about 20MB no matter how much memory is
available on the system. This rather small memory gets badly
thrashed causing a lot of extra I/O. For example, taking a
snapshot of a 1TB filesystem minimally requires about 35,000
write operations, but because of the cache thrashing (we only
have about 350 buffers at our disposal) ends up doing about
237,540 I/O's thus taking twenty-five minutes instead of four
if it could run entirely in the cache.

Reported by:	Attila Nagy <bra@fsn.hu>
Sponsored by:   DARPA & NAI Labs.
2003-02-25 06:44:42 +00:00
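
The throttling policy described in 6e9f6f2d6d can be illustrated with a small userland model. This is only a sketch under stated assumptions, not the code from vfs_bio.c: every `toy_` name is hypothetical, the `rollback` flag is an invented stand-in for soft updates rollback dependencies (a written buffer that immediately becomes dirty again), and the model assumes a full sync always clears the vnode's dirty count. Only the 90% ratio and the order of the two remedies are taken from the commit message.

```c
#include <assert.h>

/* Hypothetical model of a vnode's dirty-buffer accounting. */
struct toy_vnode {
    int dirty;     /* dirty buffers currently attached to the vnode */
    int flushes;   /* single-buffer writes issued (bawrite analogue) */
    int fsyncs;    /* whole-vnode syncs issued (VOP_FSYNC analogue) */
    int rollback;  /* assumption: nonzero means a written buffer is re-dirtied */
};

/* Threshold as stated in the commit message: 90% of hidirtybuffers. */
static int toy_dirtybufthresh(int hidirtybuffers)
{
    return hidirtybuffers * 9 / 10;
}

/* Write out one other dirty buffer; with rollback the count does not drop. */
static void toy_flush_one(struct toy_vnode *vp)
{
    vp->flushes++;
    if (!vp->rollback)
        vp->dirty--;
}

/* Drastic fallback: assume a full sync leaves no dirty buffers behind. */
static void toy_fsync(struct toy_vnode *vp)
{
    vp->fsyncs++;
    vp->dirty = 0;
}

/* Delayed write of one more buffer belonging to vp. */
static void toy_bdwrite(struct toy_vnode *vp, int thresh)
{
    assert(vp->dirty >= 0);
    vp->dirty++;
    if (vp->dirty <= thresh)
        return;                 /* below the threshold: nothing to do */
    toy_flush_one(vp);          /* first remedy: push out one other buffer */
    if (vp->dirty > thresh)
        toy_fsync(vp);          /* growth not curbed: sync the whole vnode */
}
```

In this model a vnode without rollback dependencies settles at the threshold, because each delayed write beyond it pays for itself with one flush. A vnode whose buffers are re-dirtied falls through to the sync path on the first write that crosses the threshold.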
jeff
9e4c9a6ce9 - Add an interlock argument to BUF_LOCK and BUF_TIMELOCK.
- Remove the buftimelock mutex and acquire the buf's interlock to protect
   these fields instead.
 - Hold the vnode interlock while locking bufs on the clean/dirty queues.
   This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
   BUF_LOCK with a LK_TIMEFAIL to a single lock.

Reviewed by:	arch, mckusick
2003-02-25 03:37:48 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
jeff
de87d496d3 - Introduce a new function bremfreel() that does a bremfree with the buf
queue lock already held.
 - In getblk() and flushbufqueues() use bremfreel() while we still have the
   buf queue lock held to keep the lists consistent.
 - Add LK_NOWAIT to two cases where we're essentially asserting that the bufs
   are not locked while acquiring the locks.  This will make sure that we get
   the appropriate panic() and not another one for sleeping with a lock held.
2003-02-16 10:43:06 +00:00
jeff
4d663017dc - Add a comment about a race that will happen without Giant. 2003-02-10 22:47:34 +00:00
jeff
2492221864 - Unlock the nblock after the loop in bwillwrite(). 2003-02-10 22:33:59 +00:00
jeff
2de830f8f6 - In getnewbuf() unlock the bq lock prior to sleeping when we're out of
buffers.

Submitted by:	tegge
2003-02-10 06:02:51 +00:00
jeff
6bab19f3ac - Correct another atomic op.
Spotted by:	alc
2003-02-09 22:39:51 +00:00
jeff
528cceebc4 - Move some code out from #ifdef INVARIANTS. 2003-02-09 12:11:37 +00:00
jeff
87e306ad71 - Cleanup unlocked accesses to buf flags by introducing a new b_vflag member
that is protected by the vnode lock.
 - Move B_SCANNED into b_vflags and call it BV_SCANNED.
 - Create a vop_stdfsync() modeled after spec's sync.
 - Replace spec_fsync, msdos_fsync, and hpfs_fsync with the stdfsync and some
   fs specific processing.  This gives all of these filesystems proper
   behavior wrt MNT_WAIT/NOWAIT and the use of the B_SCANNED flag.
 - Annotate the locking in buf.h
2003-02-09 11:28:35 +00:00
jeff
75e9ed76e4 - spell add 'add' and not 'subtract' in an atomic op.
Spotted by:	alc
Pointy hat to:	jeff
2003-02-09 11:21:40 +00:00
jeff
734283166f - Lock down the buffer cache's infrastructure code. This includes locks on
buf lists, synchronization variables, and atomic ops for the counters.
   This change does not remove giant from any code although some pushdown
   may be possible.
 - In vfs_bio_awrite() don't access buf fields without the buf lock.
2003-02-09 09:47:31 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
dillon
e7be7a0432 Close the remaining user address mapping races for physical
I/O, CAM, and AIO.  Still TODO: streamline useracc() checks.

Reviewed by:	alc, tegge
MFC after:	7 days
2003-01-20 17:46:48 +00:00
alc
43a8628b32 - Hold the page queues lock around vm_page_hold().
- Assert that the page queues lock rather than Giant is held in
   vm_page_hold().
2003-01-20 09:24:03 +00:00
alc
b8c33f3dc0 Fix two long-standing, but likely harmless, errors in the use of
vm_pageout_deficit:
1. Update vm_pageout_deficit before VM_WAIT.  There is no sense in
   delaying the update; the sooner the pageout daemon receives this
   information the better.  Reviewed by: tegge
2. Update vm_pageout_deficit according to the number of pages still
   needed to complete the allocation, not the original size of the
   allocation.  Submitted by: tegge

(These errors have existed since the introduction of vm_pageout_deficit
in revision 1.144.)
2003-01-16 08:14:56 +00:00
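
The two corrections in b8c33f3dc0 can be sketched in userland C as follows. This is an illustrative model only, assuming a simple page-at-a-time allocation loop: the `toy_` names are hypothetical, C11 atomics stand in for the kernel's atomic operations, and where the kernel would sleep in VM_WAIT and retry, the sketch just stops.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the global vm_pageout_deficit counter. */
static atomic_long toy_pageout_deficit;

/*
 * Try to allocate `wanted` pages from a pool of `*avail` free pages.
 * Returns the number of pages obtained.
 */
static long toy_alloc_pages(long wanted, long *avail)
{
    long got = 0;

    assert(wanted >= 0 && *avail >= 0);
    while (got < wanted) {
        if (*avail == 0) {
            /*
             * Point 1: report the shortfall before waiting.
             * Point 2: report only the pages still needed,
             * not the original size of the request.
             */
            atomic_fetch_add(&toy_pageout_deficit, wanted - got);
            break;  /* the kernel would wait for free pages and retry here */
        }
        (*avail)--;
        got++;
    }
    return got;
}
```

For a request of 8 pages against 3 free pages, this model records a deficit of 5 rather than 8, which is the difference the second point of the commit describes.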
dillon
bd6fdb8977 Merge all the various copies of vmapbuf() and vunmapbuf() into a single
portable copy.  Note that pmap_extract() must be used instead of
pmap_kextract().

This is precursor work to a reorganization of vmapbuf() to close remaining
user/kernel races (which can lead to a panic).
2003-01-15 23:54:35 +00:00
alc
c7ca47fcc7 - Update vm_pageout_deficit using atomic operations. It's a simple
counter outside the scope of existing locks.
 - Eliminate a redundant clearing of vm_pageout_deficit.
2003-01-14 06:57:03 +00:00
alc
a1d37b604a vm_hold_load_pages() needn't clear PG_ZERO because it didn't pass
VM_ALLOC_ZERO to vm_page_alloc(). (PG_ZERO is clear by default.)
2003-01-12 06:30:15 +00:00
alc
0943de6b01 Make bogus_offset local to bufinit(). 2003-01-07 19:55:08 +00:00
phk
97ae520f5d Fix cut&paste bug which would result in a panic because buffer was
being biodone'ed multiple times.
2003-01-05 22:01:08 +00:00
alc
9b01bac020 Allocate bogus_page with VM_ALLOC_WIRED. (Previously, bogus_page's
allocation incremented the global count of wired pages, but not the
page's own wire count.  This inconsistency was introduced in
revision 1.230.)
2003-01-05 18:46:13 +00:00
phk
131885aa2f Temporarily introduce a new VOP_SPECSTRATEGY operation while I try
to sort out disk-io from file-io in the vm/buffer/filesystem space.

The intent is to sort VOP_STRATEGY calls into those which operate
on "real" vnodes and those which operate on VCHR vnodes.  For
the latter kind, the call will be changed to VOP_SPECSTRATEGY,
possibly conditionally for those places where dual-use happens.

Add a default VOP_SPECSTRATEGY method which will call the normal
VOP_STRATEGY.  First time it is called it will print debugging
information.  This will only happen if a normal vnode is passed
to VOP_SPECSTRATEGY by mistake.

Add a real VOP_SPECSTRATEGY in specfs, which does what VOP_STRATEGY
does on a VCHR vnode today.

Add a new VOP_STRATEGY method in specfs to catch instances where
the conversion to VOP_SPECSTRATEGY has not yet happened.  Handle
the request just like we always did, but first time called print
debugging information.

Apart from up to two instances of console messages per boot, this amounts
to a glorified no-op commit.

If you get any of the messages on your console I would very much
like a copy of them mailed to phk@freebsd.org
2003-01-04 22:10:36 +00:00
phk
8969090670 Don't call VOP_BMAP on VCHR vnodes when the logical and physical block
numbers are identical: it cannot even hope to accomplish anything.
2003-01-04 09:37:42 +00:00
phk
daf6948653 Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since
all BUF_STRATEGY did in the first place was call VOP_STRATEGY.
2003-01-03 06:32:15 +00:00
schweikh
d3367c5f5d Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.
2003-01-01 18:49:04 +00:00
alc
a221c3d422 Hold the page queues lock when calling vm_page_flag_clear(). 2002-12-27 06:52:32 +00:00