with NFS.
We are moving responsibility for creating the vnode_pager object into
the filesystems which own the vnode, and this is one of the places
we have to cover.
We call vnode_create_vobject() directly because we own the vnode.
If we can get the size easily, pass it as an argument to save the
call to VOP_GETATTR() in vnode_create_vobject()
I'm not sure why a credential was added to these in the first place, it is
not used anywhere and it doesn't make much sense:
The credentials for syncing a file (ability to write to the
file) should be checked at the system call level.
Credentials for syncing one or more filesystems ("none")
should be checked at the system call level as well.
If the filesystem implementation needs a particular credential
to carry out the syncing it would logically have to the
cached mount credential, or a credential cached along with
any delayed write data.
Discussed with: rwatson
Add a vfs_cmount() function which converts omount argument stucture
to nmount arguments.
Convert vfs_omount() to vfs_mount() and parse nmount arguments.
This is 100% compatible with existing userland.
Later on, but before userland gets converted to nmount we may want
to revisit the names of the mountoptions, for instance it may make
sense to use consistent options for charset conversion etc.
doesn't. Most of the implementations have grown weeds for this so they
copy some fields from mnt_stat if the passed argument isn't that.
Fix this the cleaner way: Always call the implementation on mnt_stat
and copy that in toto to the VFS_STATFS argument if different.
- Use copyinstr() to read cs_win, cs_dos, cs_local strings from the
mount argument structure instead of reading through user-space pointers(!).
- When mounting a filesystem, or updating an existing mount, only try to
update the iconv handles from the information in the mount argument
structure if the structure itself has the MSDOSFSMNT_KICONV flag set.
- Attempt to handle failure of update_mp() in the MNT_UPDATE case.
and refuse initializing filesystems with a wrong version. This will
aid maintenance activites on the 5-stable branch.
s/vfs_mount/vfs_omount/
s/vfs_nmount/vfs_mount/
Name our filesystems mount function consistently.
Eliminate the namiedata argument to both vfs_mount and vfs_omount.
It was originally there to save stack space. A few places abused
it to get hold of some credentials to pass around. Effectively
it is unused.
Reorganize the root filesystem selection code.
Add local rootvp variables as needed.
Remove checks for miniroot's in the swappartition. We never did that
and most of the filesystems could never be used for that, but it had
still been copy&pasted all over the place.
This is to allow filesystems to decide based on the passed thread
which vnode to return.
Several filesystems used curthread, they now use the passed thread.
our cached 'next vnode' being removed from this mountpoint. If we
find that it was recycled, we restart our traversal from the start
of the list.
Code to do that is in all local disk filesystems (and a few other
places) and looks roughly like this:
MNT_ILOCK(mp);
loop:
for (vp = TAILQ_FIRST(&mp...);
(vp = nvp) != NULL;
nvp = TAILQ_NEXT(vp,...)) {
if (vp->v_mount != mp)
goto loop;
MNT_IUNLOCK(mp);
...
MNT_ILOCK(mp);
}
MNT_IUNLOCK(mp);
The code which takes vnodes off a mountpoint looks like this:
MNT_ILOCK(vp->v_mount);
...
TAILQ_REMOVE(&vp->v_mount->mnt_nvnodelist, vp, v_nmntvnodes);
...
MNT_IUNLOCK(vp->v_mount);
...
vp->v_mount = something;
(Take a moment and try to spot the locking error before you read on.)
On a SMP system, one CPU could have removed nvp from our mountlist
but not yet gotten to assign a new value to vp->v_mount while another
CPU simultaneously get to the top of the traversal loop where it
finds that (vp->v_mount != mp) is not true despite the fact that
the vnode has indeed been removed from our mountpoint.
Fix:
Introduce the macro MNT_VNODE_FOREACH() to traverse the list of
vnodes on a mountpoint while taking into account that vnodes may
be removed from the list as we go. This saves approx 65 lines of
duplicated code.
Split the insmntque() which potentially moves a vnode from one mount
point to another into delmntque() and insmntque() which does just
what the names say.
Fix delmntque() to set vp->v_mount to NULL while holding the
mountpoint lock.
FAT32 filesystems to be mounted, subject to some fairly serious limitations.
This works by extending the internal pseudo-inode-numbers generated from
the file's starting cluster number to 64-bits, then creating a table
mapping these into arbitrary 32-bit inode numbers, which can fit in
struct dirent's d_fileno and struct vattr's va_fileid fields. The mappings
do not persist across unmounts or reboots, so it's not possible to export
these filesystems through NFS. The mapping table may grow to be rather
large, and may grow large enough to exhaust kernel memory on filesystems
with millions of files.
Don't enable this option unless you understand the consequences.
255; USB keychains exist that use 256 as the number of heads. This
check has also been removed in Darwin (along with most of the other
head/sector sanity checks).
by 1 u_int if the number of clusters was 1 more than a multiple of
(8 * sizeof(u_int)). The bitmap is malloced and large (often huge), so
fatal overrun probably only occurred if the number of clusters was 1
more than 1 multiple of PAGE_SIZE/8.
it means that the correct value is unknown. Since this value is just
a hint to improve performance, initially assume that the first non-reserved
cluster is free, then correct this assumption if necessary before writing
the FSInfo block back to disk.
PR: 62826
MFC after: 2 weeks
- don't unlock the vnode after vinvalbuf() only to have to relock it
almost immediately.
- don't refer to devices classified by vn_isdisk() as block devices.
Introduce two new macros MNT_ILOCK(mp)/MNT_IUNLOCK(mp) to
operate on this mutex transparently.
Eventually new mutex will be protecting more fields in
struct mount, not only vnode list.
Discussed with: jeff
wasn't curthread, i.e. when we receive a thread pointer to use
as a function argument. Use VOP_UNLOCK/vrele in these cases.
The only case there td != curthread known at the moment is
boot() calling sync with thread0 pointer.
This fixes the panic on shutdown people have reported.
masks for files and directories. This should make some
of the Midnight Commander users happy.
Remove an extra ')' in the manual page.
PR: 35699
Submitted by: Eugene Grosbein <eugen@grosbein.pp.ru> (original version)
Tested by: simon
contain the filedescriptor number on opens from userland.
The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*
For now pass -1 all over the place.
the user requests a read-only mount. This is necessary because we
don't do the VOP_OPEN again if they upgrade a read-only mount to
read-write.
Fixes lockup when creating files on msdosfs mounts that have been
mounted read-only then upgraded to read-write. The exact cause of
the lockup is not known, but it is likely to be the kernel getting
stuck in an infinite loop trying to write dirty buffers to a device
without write permission.
Reported/tested by andreas, discussed with phk.
unlocked accesses to v_usecount.
- Lock access to the buf lists in the various sync routines. interlock
locking could be avoided almost entirely in leaf filesystems if the
fsync function had a generic helper.
v_tag is now const char * and should only be used for debugging.
Additionally:
1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK
2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which
is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP.
Suggested by: phk
Reviewed by: bde, rwatson (earlier version)
these in the main filesystems. This does not change the resulting code
but makes the source a little bit more grepable.
Sponsored by: DARPA and NAI Labs.
- v_vflag is protected by the vnode lock and is used when synchronization
with VOP calls is needed.
- v_iflag is protected by interlock and is used for dealing with vnode
management issues. These flags include X/O LOCK, FREE, DOOMED, etc.
- All accesses to v_iflag and v_vflag have either been locked or marked with
mp_fixme's.
- Many ASSERT_VOP_LOCKED calls have been added where the locking was not
clear.
- Many functions in vfs_subr.c were restructured to provide for stronger
locking.
Idea stolen from: BSD/OS
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@