Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.
Manpage and FreeBSD_version will be updated through further commits.
As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.
Tested by: Diego Sardina <siarodx at gmail dot com>,
Andrea Di Pasquale <whyx dot it at gmail dot com>
comments from vnode_pager_setsize(). This call was introduced in
revision 1.140 to address a problem that no longer exists.
Specifically, pmap_zero_page_area() has replaced a (possibly)
problematic implementation of page zeroing that was based on
vm_pager_map(), bzero(), and vm_pager_unmap().
cache: vnode_pager_setsize() must handle the case where a file is
truncated to a non-page-size-aligned boundary and there is a cached
page underlying the new end of file.
Reported by: kris, tegge
Tested by: kris
MFC after: 3 days
Now, we assume no more sched_lock protection for some of them and use the
distribuited loads method for vmmeter (distribuited through CPUs).
Reviewed by: alc, bde
Approved by: jeff (mentor)
Probabilly, a general approach is not the better solution here, so we should
solve the sched_lock protection problems separately.
Requested by: alc
Approved by: jeff (mentor)
vmcnts. This can be used to abstract away pcpu details but also changes
to use atomics for all counters now. This means sched lock is no longer
responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
it introduced a check after the call to file system's get pages method
that assumes that the get pages method does not change the array of pages
that is passed to it. In the case of vnode_pager_generic_getpages(),
this assumption has been incorrect. The contents of the array of pages
may be shifted by vnode_pager_generic_getpages(). Likely, the problem
has been hidden by vnode_pager_haspage() limiting the set of pages that
are passed to vnode_pager_generic_getpages() such that a shift never
occurs.
The fix implemented herein is to adjust the pointer to the array of pages
rather than shifting the pages within the array.
MFC after: 3 weeks
Fix suggested by: tegge
an error returned by VOP_BMAP() and a hole in the file.
Change the callers to vnode_pager_addr() such that they return
VM_PAGER_ERROR when VOP_BMAP fails instead of a zero-filled page.
Reviewed by: tegge
MFC after: 3 weeks
vnode_pager_generic_getpages(): (1) that VOP_BMAP() is unsupported by the
underlying file system and (2) an error in performing the VOP_BMAP().
Previously, vnode_pager_generic_getpages() assumed that all errors were
of the first type. If, in fact, the error was of the second type, the
likely outcome was for the process to become permanently blocked on a busy
page.
MFC after: 3 weeks
Reviewed by: tegge
synchronized by the lock on the object containing the page.
Transition PG_WANTED and PG_SWAPINPROG to use the new field,
eliminating the need for holding the page queues lock when setting
or clearing these flags. Rename PG_WANTED and PG_SWAPINPROG to
VPO_WANTED and VPO_SWAPINPROG, respectively.
Eliminate the assertion that the page queues lock is held in
vm_page_io_finish().
Eliminate the acquisition and release of the page queues lock
around calls to vm_page_io_finish() in kern_sendfile() and
vfs_unbusy_pages().
be called without any vnode locks held. Remove calls to vn_start_write() and
vn_finished_write() in vnode_pager_putpages() and add these calls before the
vnode lock is obtained to most of the callers that don't already have them.
lock also protects this flag so it is not necessary.
- Don't rely on v_mount to detect whether or not we've been recycled, use
the more appropriate VI_DOOMED instead.
Sponsored by: Isilon Systems, Inc.
MFC After: 1 week
The former type, size_t, was causing truncation to 32 bits on i386,
which immediately led to undersizing of VM objects backed by
files >4GB. In particular, sendfile(2) was broken for such files.
PR: kern/92243
MFC after: 5 days
vm_pager_init() is run before required nswbuf variable has been set
to correct value. This caused system to run with single pbuf available
for vnode_pager. Handle both cluster_pbuf_freecnt and vnode_pbuf_freecnt
variable in the same way.
Reported by: ade
Obtained from: alc
MFC after: 2 days
underlying vnode requires Giant.
- In vm_fault only acquire Giant if the underlying object has NEEDSGIANT
set.
- In vm_object_shadow inherit the NEEDSGIANT flag from the backing object.
down. If we have dirty pages, the putpages routine will need to know
what the vnode's object is so that it may write out dirty pages.
Pointy hat: phk
Found by: obrien
patch from kan@).
Pull bufobj_invalbuf() out of vinvalbuf() and make g_vfs call it on
close. This is not yet a generally safe function, but for this very
specific use it is safe. This solves the problem with buffers not
being flushed by unmount or after failed mount attempts.
the name Sande^H^H^H^H^Hvnode_create_vobject().
Make the new function take a size argument which removes the need for
a VOP_STAT() or a very pessimistic guess for disks.
Call that new function from vop_stdcreatevobject().
Make vnode_pager_alloc() private now that its only user came home.
- Use VFS_LOCK_GIANT() rather than directly acquiring giant in places
where giant is only held because vfs requires it.
Sponsored By: Isilon Systems, Inc.
revision 1.55, the address parameter to vnode_pager_addr() was changed
from an unsigned 32-bit quantity to a signed 64-bit quantity. However,
an out-of-range check on the address was not updated. Consequently,
memory-mapped I/O on files greater than 2GB could cause a kernel panic.
Since the address is now a signed 64-bit quantity, the problem resolution
is simply to remove a cast.
Reviewed by: bde@ and tegge@
PR: 73010
MFC after: 1 week
to implement the sanity check should have been changed when we converted
the implementation of vm_pindex_t from 32 to 64 bits. (Thus, RELENG_4 is
not affected.) The consequence of this error would be a legimate write to
an extremely large file being treated as an errant attempt to write meta-
data.
Discussed with: tegge@
instead of a vnode for it.
The vnode_pager does not and should not have any interest in what
the filesystem uses for backend.
(vfs_cluster doesn't use the backing store argument.)
because this call is only needed to wake threads that slept when they
discovered a dead object connected to a vnode. To eliminate unnecessary
calls to wakeup() by vnode_pager_dealloc(), introduce a new flag,
OBJ_DISCONNECTWNT.
Reviewed by: tegge@
We keep si_bsize_phys around for now as that is the simplest way to pull
the number out of disk device drivers in devfs_open(). The correct solution
would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably mooth
when filesystems sit on GEOM, so don't bother for now.
Extend it with a strategy method.
Add bufstrategy() which do the usual VOP_SPECSTRATEGY/VOP_STRATEGY
song and dance.
Rename ibwrite to bufwrite().
Move the two NFS buf_ops to more sensible places, add bufstrategy
to them.
Add inlines for bwrite() and bstrategy() which calls through
buf->b_bufobj->b_ops->b_{write,strategy}().
Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
allocation and deallocation. This flag's principal use is shortly after
allocation. For such cases, clearing the flag is pointless. The only
unusual use of PG_ZERO is in vfs_bio_clrbuf(). However, allocbuf() never
requests a prezeroed page. So, vfs_bio_clrbuf() never sees a prezeroed
page.
Reviewed by: tegge@
on non-VCHR vnodes. This fixes a panic when reading data from files on a
filesystem with a small (less than a page) block size.
PR: 59271
Reviewed by: alc
vm_pageout_page_stats() from Giant.
- Modify vm_pager_put_pages() and vm_pager_page_unswapped() to expect the
vm object to be locked on entry. (All of the pager routines now expect
this.)
releasing the lock only if we are about to sleep (e.g., vm_pager_get_pages()
or vm_pager_has_pages()). If we sleep, we have marked the vm object with
the paging-in-progress flag.
comes along and flushes a file which has been mmap()'d SHARED/RW, with
dirty pages, it was flushing the underlying VM object asynchronously,
resulting in thousands of 8K writes. With this change the VM Object flushing
code will cluster dirty pages in 64K blocks.
Note that until the low memory deadlock issue is reviewed, it is not safe
to allow the pageout daemon to use this feature. Forced pageouts still
use fs block size'd ops for the moment.
MFC after: 3 days
indirectly through vm_page_protect(). The one remaining page flag that
is updated by vm_page_protect() is already being updated by our various
pmap implementations.
Note: A later commit will similarly change the VM_PROT_READ case and
eliminate vm_page_protect().
pmap_zero_page() and pmap_zero_page_area() were modified to accept
a struct vm_page * instead of a physical address, vm_page_zero_fill()
and vm_page_zero_fill_area() have served no purpose.
- v_vflag is protected by the vnode lock and is used when synchronization
with VOP calls is needed.
- v_iflag is protected by interlock and is used for dealing with vnode
management issues. These flags include X/O LOCK, FREE, DOOMED, etc.
- All accesses to v_iflag and v_vflag have either been locked or marked with
mp_fixme's.
- Many ASSERT_VOP_LOCKED calls have been added where the locking was not
clear.
- Many functions in vfs_subr.c were restructured to provide for stronger
locking.
Idea stolen from: BSD/OS
release of Giant. (Annotate as MPSAFE.)
o Also, in vnode_pager_alloc(), remove an unnecessary re-initialization
of struct vm_object::flags and move a statement that is duplicated
in both branches of an if-else.
the bio and buffer structures to have daddr64_t bio_pblkno,
b_blkno, and b_lblkno fields which allows access to disks
larger than a Terabyte in size. This change also requires
that the VOP_BMAP vnode operation accept and return daddr64_t
blocks. This delta should not affect system operation in
any way. It merely sets up the necessary interfaces to allow
the development of disk drivers that work with these larger
disk block addresses. It also allows for the development of
UFS2 which will use 64-bit block addresses.
style(9)
- Minor space adjustment in cases where we have "( ", " )", if(), return(),
while(), for(), etc.
- Add /* SYMBOL */ after a few #endifs.
Reviewed by: alc
commit by Kirk also fixed a softupdates bug that could easily be triggered
by server side NFS.
* An edge case with shared R+W mmap()'s and truncate whereby
the system would inappropriately clear the dirty bits on
still-dirty data. (applicable to all filesystems)
THIS FIX TEMPORARILY DISABLED PENDING FURTHER TESTING.
see vm/vm_page.c line 1641
* The straddle case for VM pages and buffer cache buffers when
truncating. (applicable to NFS client side)
* Possible SMP database corruption due to vm_pager_unmap_page()
not clearing the TLB for the other cpu's. (applicable to NFS
client side but could effect all filesystems). Note: not
considered serious since the corruption occurs beyond the file
EOF.
* When flusing a dirty buffer due to B_CACHE getting cleared,
we were accidently setting B_CACHE again (that is, bwrite() sets
B_CACHE), when we really want it to stay clear after the write
is complete. This resulted in a corrupt buffer. (applicable
to all filesystems but probably only triggered by NFS)
* We have to call vtruncbuf() when ftruncate()ing to remove
any buffer cache buffers. This is still tentitive, I may
be able to remove it due to the second bug fix. (applicable
to NFS client side)
* vnode_pager_setsize() race against nfs_vinvalbuf()... we have
to set n_size before calling nfs_vinvalbuf or the NFS code
may recursively vnode_pager_setsize() to the original value
before the truncate. This is what was causing the user mmap
bus faults in the nfs tester program. (applicable to NFS
client side)
* Fix to softupdates (see ufs/ffs/ffs_inode.c 1.73, commit made
by Kirk).
Testing program written by: Avadis Tevanian, Jr.
Testing program supplied by: jkh / Apple (see Dec2001 posting to freebsd-hackers with Subject 'NFS: How to make FreeBS fall on its face in one easy step')
MFC after: 1 week
file EOF. This works around a bug in the ISOFS (CDRom) BMAP code which
returns bogus values for requests beyond the file EOF rather then returning
an error, resulting in either corrupt data being mmap()'d beyond the file EOF
or resulting in a seg-fault on the last page of a mmap()'d file (mmap()s of
CDRom files).
Reported by: peter / Yahoo
MFC after: 3 days