Commit Graph

170 Commits

Author SHA1 Message Date
alc
83fe46be18 Update locking around vm_object_page_remove() to use the new macros. 2003-04-18 16:39:03 +00:00
phk
e059b79437 Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a newsted include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.
2003-03-18 08:45:25 +00:00
alc
c50367da67 Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on:	arch@
2003-03-06 03:41:02 +00:00
phk
12e99581a7 We can get past here on a normal vnode as well, so use VOP_STRATEGY if so. 2003-01-13 21:32:16 +00:00
phk
0552c4bec8 Convert VOP_STRATEGY to VOP_SPECSTRATEGY in the generic getpages and
the pager input for small filesystems.
2003-01-05 20:32:03 +00:00
phk
daf6948653 Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since
all BUF_STRATEGY did in the first place was call VOP_STRATEGY.
2003-01-03 06:32:15 +00:00
dillon
1f367998c6 Allow the VM object flushing code to cluster. When the filesystem syncer
comes along and flushes a file which has been mmap()'d SHARED/RW, with
dirty pages, it was flushing the underlying VM object asynchronously,
resulting in thousands of 8K writes.  With this change the VM Object flushing
code will cluster dirty pages in 64K blocks.

Note that until the low memory deadlock issue is reviewed, it is not safe
to allow the pageout daemon to use this feature.  Forced pageouts still
use fs block size'd ops for the moment.

MFC after:	3 days
2002-12-28 21:03:42 +00:00
alc
6be448f264 Perform vm_object_lock() and vm_object_unlock() around
vm_object_page_remove().
2002-12-15 05:41:56 +00:00
alc
1337d432f2 Hold the page queues lock when performing pmap_clear_modify().
Approved by:	re (blanket)
2002-11-27 19:51:48 +00:00
alc
cd298c7231 Hold the page queues/flags lock when calling vm_page_set_validclean().
Approved by:	re
2002-11-23 19:10:31 +00:00
alc
2918fada10 Add page queue and flag locking in vnode_pager_setsize().
Approved by:	re
2002-11-23 03:58:35 +00:00
alc
5e336b1d19 Now that pmap_remove_all() is exported by our pmap implementations
use it directly.
2002-11-16 07:44:25 +00:00
alc
fc8a5bc419 When prot is VM_PROT_NONE, call pmap_page_protect() directly rather than
indirectly through vm_page_protect().  The one remaining page flag that
is updated by vm_page_protect() is already being updated by our various
pmap implementations.

Note: A later commit will similarly change the VM_PROT_READ case and
eliminate vm_page_protect().
2002-11-10 07:12:04 +00:00
mux
aae3d5b8b6 Better printf() formats. 2002-11-07 23:16:22 +00:00
phk
1dfc2c167f Be consistent about "static" functions: if the function is marked
static in its prototype, mark it static at the definition too.

Inspired by:    FlexeLint warning #512
2002-09-28 17:15:38 +00:00
jeff
ee079921fc - Add a ASSERT_VOP_LOCKED in vnode_pager_alloc.
- Lock access to v_iflags.
2002-09-25 01:23:43 +00:00
alc
cdcc7b3446 o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever since
pmap_zero_page() and pmap_zero_page_area() were modified to accept
   a struct vm_page * instead of a physical address, vm_page_zero_fill()
   and vm_page_zero_fill_area() have served no purpose.
2002-08-25 00:22:31 +00:00
jeff
02517b6731 - Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
   with VOP calls is needed.
 - v_iflag is protected by interlock and is used for dealing with vnode
   management issues.  These flags include X/O LOCK, FREE, DOOMED, etc.
 - All accesses to v_iflag and v_vflag have either been locked or marked with
   mp_fixme's.
 - Many ASSERT_VOP_LOCKED calls have been added where the locking was not
   clear.
 - Many functions in vfs_subr.c were restructured to provide for stronger
   locking.

Idea stolen from:	BSD/OS
2002-08-04 10:29:36 +00:00
alc
0cc69019ed o Lock page queue accesses by vm_page_free().
o Apply some style fixes.
2002-07-28 20:13:48 +00:00
alc
abf1927809 o Lock page queue accesses by vm_page_activate() and vm_page_deactivate(). 2002-07-27 04:30:46 +00:00
robert
20e451e8dc - Use (OFF_TO_IDX(off) - pi) instead of (OFF_TO_IDX(off - IDX_TO_OFF(pi))).
- Reformat a comment.
2002-07-01 14:14:07 +00:00
alc
b81ea3e3e5 o Replace GIANT_REQUIRED in vnode_pager_alloc() by the acquisition and
release of Giant.  (Annotate as MPSAFE.)
 o Also, in vnode_pager_alloc(), remove an unnecessary re-initialization
   of struct vm_object::flags and move a statement that is duplicated
   in both branches of an if-else.
2002-06-22 07:28:06 +00:00
trhodes
28d42899b7 More s/file system/filesystem/g 2002-05-16 21:28:32 +00:00
phk
8536ea3cdb Make daddr_t and u_daddr_t 64bits wide.
Retire daddr64_t and use daddr_t instead.

Sponsored by:	DARPA & NAI Labs.
2002-05-14 11:09:43 +00:00
alc
d60a525036 o Condition the compilation and use of vm_freeze_copyopts()
on ENABLE_VFS_IOOPT.
2002-05-06 05:45:57 +00:00
peter
0feba1c376 We do not necessarily need to map/unmap pages to zero parts of them.
On systems where physical memory is also direct mapped (alpha, sparc,
ia64 etc) this is slightly harmful.
2002-04-28 00:15:48 +00:00
alfred
1446d09429 Remove __P. 2002-03-19 22:20:14 +00:00
mckusick
e929f2e4f0 Introduce the new 64-bit size disk block, daddr64_t. Change
the bio and buffer structures to have daddr64_t bio_pblkno,
b_blkno, and b_lblkno fields which allows access to disks
larger than a Terabyte in size. This change also requires
that the VOP_BMAP vnode operation accept and return daddr64_t
blocks. This delta should not affect system operation in
any way. It merely sets up the necessary interfaces to allow
the development of disk drivers that work with these larger
disk block addresses. It also allows for the development of
UFS2 which will use 64-bit block addresses.
2002-03-15 18:49:47 +00:00
eivind
0799ec54b1 - Remove a number of extra newlines that do not belong here according to
style(9)
- Minor space adjustment in cases where we have "( ", " )", if(), return(),
  while(), for(), etc.
- Add /* SYMBOL */ after a few #endifs.

Reviewed by:	alc
2002-03-10 21:52:48 +00:00
jhb
3706cd3509 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
dillon
cd4d323ad3 This fixes a large number of bugs in our NFS client side code. A recent
commit by Kirk also fixed a softupdates bug that could easily be triggered
by server side NFS.

	* An edge case with shared R+W mmap()'s and truncate whereby
	  the system would inappropriately clear the dirty bits on
	  still-dirty data.  (applicable to all filesystems)

	  THIS FIX TEMPORARILY DISABLED PENDING FURTHER TESTING.
	  see vm/vm_page.c line 1641

	* The straddle case for VM pages and buffer cache buffers when
	  truncating.  (applicable to NFS client side)

	* Possible SMP database corruption due to vm_pager_unmap_page()
	  not clearing the TLB for the other cpu's.  (applicable to NFS
	  client side but could effect all filesystems).  Note: not
	  considered serious since the corruption occurs beyond the file
	  EOF.

	* When flusing a dirty buffer due to B_CACHE getting cleared,
	  we were accidently setting B_CACHE again (that is, bwrite() sets
	  B_CACHE), when we really want it to stay clear after the write
	  is complete.  This resulted in a corrupt buffer.  (applicable
	  to all filesystems but probably only triggered by NFS)

	* We have to call vtruncbuf() when ftruncate()ing to remove
	  any buffer cache buffers.  This is still tentitive, I may
	  be able to remove it due to the second bug fix.  (applicable
	  to NFS client side)

	* vnode_pager_setsize() race against nfs_vinvalbuf()... we have
	  to set n_size before calling nfs_vinvalbuf or the NFS code
	  may recursively vnode_pager_setsize() to the original value
	  before the truncate.  This is what was causing the user mmap
	  bus faults in the nfs tester program.  (applicable to NFS
	  client side)

	* Fix to softupdates (see ufs/ffs/ffs_inode.c 1.73, commit made
	  by Kirk).

Testing program written by: Avadis Tevanian, Jr.
Testing program supplied by: jkh / Apple (see Dec2001 posting to freebsd-hackers with Subject 'NFS: How to make FreeBS fall on its face in one easy step')
MFC after:	1 week
2001-12-14 01:16:57 +00:00
dillon
a08a119cfc Adjust vnode_pager_input_smlfs() to not attempt to BMAP blocks beyond the
file EOF.  This works around a bug in the ISOFS (CDRom) BMAP code which
returns bogus values for requests beyond the file EOF rather then returning
an error, resulting in either corrupt data being mmap()'d beyond the file EOF
or resulting in a seg-fault on the last page of a mmap()'d file (mmap()s of
CDRom files).

Reported by: peter / Yahoo
MFC after:	3 days
2001-11-05 18:58:47 +00:00
dillon
8a2a967bbc Finally fix the VM bug where a file whos EOF occurs in the middle of a page
would sometimes prevent a dirty page from being cleaned, even when synced,
resulting in the dirty page being re-flushed to disk every 30-60 seconds or
so, forever.  The problem is that when the filesystem flushes a page to
its backing file it typically does not clear dirty bits representing areas
of the page that are beyond the file EOF.  If the file is also mmap()'d and
a fault is taken, vm_fault (properly, is required to) set the vm_page_t->dirty
bits to VM_PAGE_BITS_ALL.  This combination could leave us with an uncleanable,
unfreeable page.

The solution is to have the vnode_pager detect the edge case and manually
clear the dirty bits representing areas beyond the file EOF.  The filesystem
does the rest and the page comes up clean after the write completes.

MFC after:	3 days
2001-10-12 18:17:34 +00:00
jhb
4806d88677 Change the kernel's ucred API as follows:
- crhold() returns a reference to the ucred whose refcount it bumps.
- crcopy() now simply copies the credentials from one credential to
  another and has no return value.
- a new crshared() primitive is added which returns true if a ucred's
  refcount is > 1 and false (0) otherwise.
2001-10-11 23:38:17 +00:00
julian
5596676e6c KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
jhb
d47050ac2b Whitespace fixes. 2001-08-04 20:49:29 +00:00
dillon
cbc4469f38 whitespace / register cleanup 2001-07-04 19:00:13 +00:00
dillon
e028603b7e With Alfred's permission, remove vm_mtx in favor of a fine-grained approach
(this commit is just the first stage).  Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.
2001-07-04 16:20:28 +00:00
jhb
34ed38abdc Fix a XXX comment by moving the initialization of the number of pbuf's
for the vnode pager to a new vnode pager init method instead of making it
a hack in getpages().
2001-07-03 07:35:56 +00:00
jhb
0c84f37525 Don't hold the VM lock across VOP's and other things that can sleep. 2001-05-29 16:58:25 +00:00
jhb
c6bb467070 - Assert Giant is held in the vnode pager methods.
- Lock the VM while walking down a vm_object's backing_object list in
  vnode_pager_lock().
2001-05-23 22:51:23 +00:00
alfred
a3f0842419 Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level
vm operations.

faults can not be taken without holding Giant.

Memory subsystems can now call the base page allocators safely.

Almost all atomic ops were removed as they are covered under the
vm mutex.

Alpha and ia64 now need to catch up to i386's trap handlers.

FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).

Reviewed (partially) by: jake, jhb
2001-05-19 01:28:09 +00:00
grog
4b9d9cbaac Revert consequences of changes to mount.h, part 2.
Requested by:	bde
2001-04-29 02:45:39 +00:00
grog
1f5de30718 Correct #includes to work with fixed sys/mount.h. 2001-04-23 09:05:15 +00:00
alfred
1e50a5f33e vnode_pager_freepage() is really vm_page_free() in disguise,
nuke vnode_pager_freepage() and replace all calls to it with vm_page_free()
2001-04-19 06:18:23 +00:00
dillon
fd223545d4 This implements a better launder limiting solution. There was a solution
in 4.2-REL which I ripped out in -stable and -current when implementing the
low-memory handling solution.  However, maxlaunder turns out to be the saving
grace in certain very heavily loaded systems (e.g. newsreader box).  The new
algorithm limits the number of pages laundered in the first pageout daemon
pass.  If that is not sufficient then suceessive will be run without any
limit.

Write I/O is now pipelined using two sysctls, vfs.lorunningspace and
vfs.hirunningspace.  This prevents excessive buffered writes in the
disk queues which cause long (multi-second) delays for reads.  It leads
to more stable (less jerky) and generally faster I/O streaming to disk
by allowing required read ops (e.g. for indirect blocks and such) to occur
without interrupting the write stream, amoung other things.

NOTE: eventually, filesystem write I/O pipelining needs to be done on a
per-device basis.  At the moment it is globalized.
2000-12-26 19:41:38 +00:00
mckusick
a3d0c189ea Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
2000-07-11 22:07:57 +00:00
peter
ee5cd6988f Implement an optimization of the VM<->pmap API. Pass vm_page_t's directly
to various pmap_*() functions instead of looking up the physical address
and passing that.  In many cases, the first thing the pmap code was doing
was going to a lot of trouble to get back the original vm_page_t, or
it's shadow pv_table entry.

Inspired by: John Dyson's 1998 patches.

Also:
Eliminate pv_table as a seperate thing and build it into a machine
dependent part of vm_page_t.  This eliminates having a seperate set of
structions that shadow each other in a 1:1 fashion that we often went to
a lot of trouble to translate from one to the other. (see above)
This happens to save 4 bytes of physical memory for each page in the
system.  (8 bytes on the Alpha).

Eliminate the use of the phys_avail[] array to determine if a page is
managed (ie: it has pv_entries etc).  Store this information in a flag.
Things like device_pager set it because they create vm_page_t's on the
fly that do not have pv_entries.  This makes it easier to "unmanage" a
page of physical memory (this will be taken advantage of in subsequent
commits).

Add a function to add a new page to the freelist.  This could be used
for reclaiming the previously wasted pages left over from preloaded
loader(8) files.

Reviewed by:	dillon
2000-05-21 12:50:18 +00:00
phk
36c3965ff9 Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by:    peter
2000-05-05 09:59:14 +00:00
phk
8ee11d587f Move B_ERROR flag to b_ioflags and call it BIO_ERROR.
(Much of this done by script)

Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.

Move b_pblkno and b_iodone_chain to struct bio while we transition, they
will be obsoleted once bio structs chain/stack.

Add bio_queue field for struct bio aware disksort.

Address a lot of stylistic issues brought up by bde.
2000-04-02 15:24:56 +00:00