Commit Graph

204 Commits

Author SHA1 Message Date
Edward Tomasz Napierala
c52fd858ae Remove unused thread argument from vtruncbuf().
Reviewed by:	kib
2012-04-23 13:21:28 +00:00
Konstantin Belousov
2aacee7779 Use DOINGASYNC() to test for async allowance, to honor VFS syncing requests.
Noted by:	bde
MFC after:	1 week
2012-02-22 13:01:17 +00:00
Konstantin Belousov
526d0bd547 Fix found places where uio_resid is truncated to int.
Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the
sysctl to zero allows SSIZE_MAX-sized i/o requests to be issued from
usermode.

Discussed with:	bde, das (previous versions)
MFC after:	1 month
2012-02-21 01:05:12 +00:00
Rebecca Cran
6bccea7c2b Fix typos - remove duplicate "the".
PR:	bin/154928
Submitted by:	Eitan Adler <lists at eitanadler.com>
MFC after: 	3 days
2011-02-21 09:01:34 +00:00
Konstantin Belousov
b0d5391101 Add a comment describing the reason for calling cache_purge(fvp).
Requested by:	danfe
MFC after:	6 days
2010-10-08 07:17:22 +00:00
Konstantin Belousov
4d477d5c77 The msdosfs lookup is case insensitive. Several aliases may be inserted for
a single directory entry. As a consequence, name cache purge done by lookup
for fvp when DELETE op for namei is specified, might be not enough to
expunge all namecache entries that were installed for this direntry.

Explicitly call cache_purge(fvp) when msdosfs_rename() succeeded.

PR:	kern/93634
MFC after:	1 week
2010-10-07 08:36:02 +00:00
Edward Tomasz Napierala
307d88b787 Style fixes and removal of unneeded variable.
Submitted by:	bde@
2010-05-06 18:43:19 +00:00
Edward Tomasz Napierala
b5f770bd86 Move checking against RLIMIT_FSIZE into one place, vn_rlimit_fsize().
Reviewed by:	kib
2010-05-05 16:44:25 +00:00
Konstantin Belousov
2e45cc5bf6 Remove seemingly unneeded unlock/relock of the dvp in msdosfs_rmdir,
causing LOR.

Reported and tested by:	pho
MFC after:	3 weeks
2010-02-28 17:09:09 +00:00
Ulrich Spörlein
8fa03d08ca Fix common misspelling of hierarchy
Pointed out by:		bf1783 at gmail
Approved by:		np (cxgb), kientzle (tar, etc.), philip (mentor)
2010-02-20 10:19:19 +00:00
Konstantin Belousov
48d1bcf8e0 - Add idempotency guards so the structures can be used in other utilities.
- Update bpb structs with reserved fields.
- In direntry struct join deName with deExtension. Although a
  fix was attempted in the past, these fields were being overflowed.
  Now this is consistent with the spec, and we can now share the
  WinChksum code with NetBSD.

Submitted by:	Pedro F. Giffuni <giffunip tutopia com>
Mostly obtained from:	NetBSD
Reviewed by:	bde
MFC after:	2 weeks
2010-02-13 12:41:07 +00:00
Konstantin Belousov
d6da640860 Fix r193923 by noting that type of a_fp is struct file *, not int.
It was assumed that r193923 was trivial change that cannot be done
wrong.

MFC after:	2 weeks
2009-06-10 14:24:31 +00:00
Konstantin Belousov
e4d9bdc105 s/a_fdidx/a_fp/ for VOP_OPEN comments that inline struct vop_open_args
definition.

Discussed with:	bde
MFC after:	2 weeks
2009-06-10 14:09:05 +00:00
John Baldwin
c72ae1423b - Hold a reference on the cdev a filesystem is mounted from in the mount.
- Remove the cdev pointers from the denode and instead use the mountpoint's
  reference to call dev2udev() in getattr().

Reviewed by:	kib, julian
2009-02-27 20:00:15 +00:00
Edward Tomasz Napierala
0da50f6ef8 According to phk@, VOP_STRATEGY should never, _ever_, return
anything other than 0.  Make it so.  This fixes
"panic: VOP_STRATEGY failed bp=0xc320dd90 vp=0xc3b9f648",
encountered when writing to an orphaned filesystem.  Reason
for the panic was the following assert:
KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp));
at vfs_bio:bufstrategy().

Reviewed by:	scottl, phk
Approved by:	rwatson (mentor)
Sponsored by:	FreeBSD Foundation
2008-12-16 21:13:11 +00:00
Edward Tomasz Napierala
15bc6b2bd8 Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.

Approved by:	rwatson (mentor)
2008-10-28 13:44:11 +00:00
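The motivation given in 15bc6b2bd8 (access flags being assigned to a 16-bit type) can be sketched as follows. The flag values and type names here are hypothetical stand-ins chosen for illustration. They are not the kernel's V* constants or its actual accmode_t definition.

```c
#include <stdint.h>

typedef uint16_t narrow_mode_t;	/* stand-in for a 16-bit mode_t */
typedef int wide_accmode_t;	/* stand-in for accmode_t; width is assumed */

#define EX_VREAD	0x00000100	/* hypothetical flag, fits in 16 bits */
#define EX_VEXTRA	0x00010000	/* hypothetical flag, needs bit 16 */

/* Test a flag after the request has passed through the 16-bit type. */
static int
narrow_has(unsigned int requested, unsigned int flag)
{
	narrow_mode_t m = (narrow_mode_t)requested;	/* bits above 15 are lost */

	return ((m & flag) != 0);
}

/* Test a flag after the request has passed through the wider type. */
static int
wide_has(unsigned int requested, unsigned int flag)
{
	wide_accmode_t m = (wide_accmode_t)requested;

	return (((unsigned int)m & flag) != 0);
}
```

A request carrying EX_VEXTRA loses that bit in the narrow type but keeps it in the wide one, which is why adding further constants calls for a dedicated, wider type.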
Dag-Erling Smørgrav
1ede983cc9 Retire the MALLOC and FREE macros. They are an abomination unto style(9).
MFC after:	3 months
2008-10-23 15:53:51 +00:00
Konstantin Belousov
4c5a20e3da Initialize va_rdev to NODEV instead of 0 or VNOVAL in VOP_GETATTR().
NODEV is more appropriate when va_rdev doesn't have a meaningful value.

Submitted by:   Jaakko Heinonen <jh saunalahti fi>
Suggested by:   bde
Discussed on:   freebsd-fs
MFC after:	1 month
2008-09-20 19:49:15 +00:00
Attilio Rao
0359a12ead Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
2008-08-28 15:23:18 +00:00
Konstantin Belousov
813d71de08 The uniqdosname() function takes char[12] as its third argument.
Found by:	-fstack-protector
Reported by:	dougb
Tested by:	dougb, Rainer Hurling <rhurlin gwdg de>
MFC after:	3 days
2008-07-04 09:40:52 +00:00
Konstantin Belousov
eab626f110 Move the head of byte-level advisory lock list from the
filesystem-specific vnode data to the struct vnode. Provide the
default implementation for the vop_advlock and vop_advlockasync.
Purge the locks on the vnode reclaim by using the lf_purgelocks().
The default implementation is augmented for the nfs and smbfs.
In the nfs_advlock, push the Giant inside the nfs_dolock.

Before the change, the vop_advlock and vop_advlockasync have taken the
unlocked vnode and dereferenced the fs-private inode data, racing with
the vnode reclamation due to forced unmount. Now, the vop_getattr
under the shared vnode lock is used to obtain the inode size, and
later, in the lf_advlockasync, after locking the vnode interlock, the
VI_DOOMED flag is checked to prevent an operation on the doomed vnode.

The implementation of the lf_purgelocks() is submitted by dfr.

Reported by:	kris
Tested by:	kris, pho
Discussed with:	jeff, dfr
MFC after:	2 weeks
2008-04-16 11:33:32 +00:00
Doug Rabson
dfdcada31e Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC
  client handle safely with replies being de-multiplexed at the socket
  upcall (typically driven directly by the NIC interrupt) and handed
  off to whichever thread matches the reply. For UDP sockets, many RPC
  clients can share the same socket. This allows the use of a single
  privileged UDP port number to talk to an arbitrary number of remote
  hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded
  server would be relatively straightforward and would follow
  approximately the Solaris KPI. A single thread should be sufficient
  for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted
  callbacks. I've tested the NLM server reasonably extensively - it
  passes both my own tests and the NFS Connectathon locking tests
  running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have
  support for the local NFS client's locking needs, it does have to
  field async replies and granted callbacks from remote NLMs that the
  local client has contacted. We relay these replies to the userland
  rpc.lockd over a local domain RPC socket.

* Robust deadlock detection for the local lock manager. In particular
  it will detect deadlocks caused by a lock request that covers more
  than one blocking request. As required by the NLM protocol, all
  deadlock detection happens synchronously - a user is guaranteed that
  if a lock request isn't rejected immediately, the lock will
  eventually be granted. The old system allowed for a 'deferred
  deadlock' condition where a blocked lock request could wake up and
  find that some other deadlock-causing lock owner had beaten them to
  the lock.

* Since both local and remote locks are managed by the same kernel
  locking code, local and remote processes can safely use file locks
  for mutual exclusion. Local processes have no fairness advantage
  compared to remote processes when contending to lock a region that
  has just been unlocked - the local lock manager enforces a strict
  first-come first-served model for both local and remote lockers.

Sponsored by:	Isilon Systems
PR:		95247 107555 115524 116679
MFC after:	2 weeks
2008-03-26 15:23:12 +00:00
Attilio Rao
22db15c06f VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjunction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
2008-01-13 14:44:15 +00:00
Attilio Rao
cb05b60a89 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
Bruce Evans
cb65c1ee29 Implement the async (really, delayed-write) mount option for msdosfs.
This is much simpler than for ffs since there are many fewer places
where we need to choose between a delayed write and a sync write --
just 5 in msdosfs and more than 30 in ffs.

This is more complete and correct than in ffs.  Several places in ffs
are still missing the choice.  ffs_update() has a layering violation
that breaks callers which want to force a sync update (mainly fsync(2)
and O_SYNC write(2)).

However, fsync(2) and O_SYNC write(2) are still more broken than in
ffs, since they are broken for default (non-sync non-async) mounts
too.  Both fail to sync the FAT in all cases, and both fail to sync
the directory entry in some cases after losing a race.  Async everything
is probably safer than the half-baked sync of metadata given by default
mounts.
2007-10-19 12:23:25 +00:00
Bruce Evans
cefb55828f In msdosfs_setattr(), don't do synchronous updates of the denode
(except indirectly for the size pseudo-attribute).  If anything deserves
a sync update, then it is ids and immutable flags, since these are
related to security, but ffs never synced these and msdosfs doesn't
support them.  (ufs_setattr() only does an update in one case where
it is least needed (for timestamps); it did pessimal sync updates for
timestamps until 1998/03/08 but was changed for unlogged reasons related
to soft updates.)

Now msdosfs calls deupdat() with waitfor == 0, which normally gives a
delayed update to disk but always gives a sync update of timestamps
in core, while for ffs everything is delayed until the syncer daemon
or other activity causes an update (except for timestamps).

This gives a large optimization mainly for things like cp -p, where
attribute adjustment could easily triple the number of physical I/O's
if it is done synchronously (but cp -p to msdosfs is not as bad as
that, since msdosfs doesn't support many attributes so null adjustments
are more common, and msdosfs doesn't support ctimes so even if cp
doesn't weed out null adjustments they don't become non-null after
clobbering the ctime).
2007-10-18 07:26:21 +00:00
Bruce Evans
c2819440b3 Fix races in msdosfs_lookup() and msdosfs_readdir(). These functions
can easily block in bread(), and then there was nothing to prevent the
static buffer (nambuf_{ptr,len,last_id}) being clobbered by another
thread.

The effects of the bug seem to have been limited to failed lookups and
mangled names in readdir(), since Giant locking provides enough
serialization to prevent concurrent calls to the functions that access
the buffer.  They were very obvious for multiple concurrent tree walks,
especially with a small cluster size.

The bug was introduced in msdosfs_conv.c 1.34 and associated changes,
and is in all releases starting with 5.2.

The fix is to allocate the buffer as a local variable and pass around
pointers to it like "_r" functions in libc do.  Stack use from this
is large but not too large.  This also fixes a memory leak on module
unload.

Reviewed by:	kib
Approved by:	re (kensmith)
2007-08-31 22:29:55 +00:00
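The fix described in c2819440b3 (a caller-owned buffer passed by pointer instead of a shared static one) follows the same pattern as the "_r" functions in libc. A minimal sketch of that pattern is shown below. The struct, its size, and the function names are hypothetical and only loosely modelled on the nambuf_{ptr,len,last_id} state mentioned in the commit.

```c
#include <stddef.h>
#include <string.h>

#define EX_NAMEBUF_LEN	32	/* hypothetical capacity, including the NUL */

/* Per-call state that used to live in static variables. */
struct name_buf {
	char	data[EX_NAMEBUF_LEN];
	size_t	len;
};

static void
name_buf_init(struct name_buf *nb)
{
	nb->len = 0;
	nb->data[0] = '\0';
}

/*
 * Append one chunk of a name to the caller's buffer.
 * Returns 0 on success, -1 if the chunk plus its NUL would not fit.
 */
static int
name_buf_append(struct name_buf *nb, const char *chunk)
{
	size_t n = strlen(chunk);

	if (n >= EX_NAMEBUF_LEN - nb->len)
		return (-1);
	memcpy(nb->data + nb->len, chunk, n + 1);	/* copies the NUL too */
	nb->len += n;
	return (0);
}
```

Because each caller declares its own struct name_buf on the stack, a thread that sleeps between two appends cannot have its partial name overwritten by another thread.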
Bruce Evans
a4e6807c49 In msdosfs_read() and msdosfs_write(), don't check explicitly for
(uio_offset < 0) since this can't happen.  If this happens, then the
general code handles the problem safely (better than before for reading,
returning 0 (EOF) instead of the bogus errno EINVAL, and the same as
before for writing, returning EFBIG).

In msdosfs_read(), don't check for (uio_resid < 0).  msdosfs_write()
already didn't check.

In msdosfs_read(), document in a comment our assumptions that the caller
passed a valid uio_offset and uio_resid.  ffs checks using KASSERT(),
and that is enough sanity checking.  In the same comment, partly document
there is no need to check for the EOVERFLOW case, unlike in ffs where this
case can happen at least in theory.

In msdosfs_write(), add a comment about why the checking of
(uio_resid == 0) is explicit, unlike in ffs.

In msdosfs_write(), check for impossibly large final offsets before
checking if the file size rlimit would be exceeded, so that we don't
have an overflow bug in the rlimit check and are consistent with ffs.
We now return EFBIG instead of EFBIG plus a SIGXFSZ signal if the final
offset would be impossibly large but not so large as to cause overflow.
Overflow normally gave the benign behaviour of no signal.

Approved by:	re (kensmith) (blanket)
2007-08-07 10:35:27 +00:00
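The ordering of checks described for msdosfs_write() in a4e6807c49 can be sketched in userland C. This is an illustration under stated assumptions, not the kernel code: the 4 GiB - 1 cap, the function name check_write_bounds(), and the raise_sig out-parameter (standing in for delivery of SIGXFSZ) are all hypothetical, and offset and resid are assumed to be non-negative, as the commit says the caller guarantees.

```c
#include <errno.h>
#include <stdint.h>

#define EX_FILESIZE_MAX	INT64_C(0xffffffff)	/* assumed cap: 4 GiB - 1 */

/*
 * Returns 0 if the write may proceed, EFBIG otherwise.
 * *raise_sig is set to 1 only when the resource limit is what was exceeded.
 */
static int
check_write_bounds(int64_t offset, int64_t resid, int64_t rlimit,
    int *raise_sig)
{
	*raise_sig = 0;

	/*
	 * First reject impossibly large final offsets.  The comparison is
	 * written as a subtraction so that offset + resid is never computed
	 * when it could overflow.
	 */
	if (offset > EX_FILESIZE_MAX || resid > EX_FILESIZE_MAX - offset)
		return (EFBIG);

	/* Only now is the sum known to be small enough to add safely. */
	if (offset + resid > rlimit) {
		*raise_sig = 1;
		return (EFBIG);
	}
	return (0);
}
```

Running the size check first means the rlimit comparison only ever sees sums that are known to fit, so it cannot be defeated by wraparound.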
Bruce Evans
b7837a91c9 Fix and update the comments about the effect of the read-only flag on writing.
They are still too verbose.

Remove nearby unreachable code for handling symlinks.

Approved by:	re (kensmith) (blanket)
2007-08-07 05:42:10 +00:00
Bruce Evans
c0f5121cac Fix some style bugs (don't assume that off_t == int64_t; fix some comments;
remove some parentheses; fix only a couple of whitespace errors).

Approved by:	re (kensmith) (blanket)
2007-08-07 03:43:28 +00:00
Bruce Evans
d2bb66bacd Sort includes.
Remove rotted banal comment attached to includes.

Approved by:	re (kensmith) (blanket)
2007-08-07 02:28:33 +00:00
Bruce Evans
eba34270fa Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of
depending on namespace pollution in <sys/buf.h> and/or <sys/vnode.h>

Approved by:	re (kensmith) (blanket)
2007-08-07 01:40:27 +00:00
Bruce Evans
6fd81fc7a6 Remove unused include(s).
Approved by:	re (kensmith) (blanket)
2007-08-07 01:07:16 +00:00
Bruce Evans
6b6c5f5ef9 Implement vfs clustering for msdosfs.
This gives a very large speedup for small block sizes (in my tests,
about 5 times for write and 3 times for read with a block size of 512,
if clustering is possible) and a moderate speedup for the moderately
large block sizes that should be used on non-small media (4K is the
best size in most cases, and the speedup for that is about 1.3 times
for write and 1.2 times for read).  mmap() should benefit from clustering
like read()/write(), but the current implementation of vm only supports
clustering (at least for getpages) if the fs block size is >= PAGE_SIZE.

msdosfs is now only slightly slower than ffs with soft updates for
writing and slightly faster for reading when both use their best block
sizes.  Writing is slower for msdosfs because of more sync writes.
Reading is faster for msdosfs because indirect blocks interfere with
clustering in ffs.

The changes in msdosfs_read() and msdosfs_write() are simpler merges
of corresponding code in ffs (after fixing some style bugs in ffs).
msdosfs_bmap() needs fs-specific code.  This implementation loops
calling a lower level bmap function to do the hard parts.  This is a
bit inefficient, but is efficient enough since msdosfs_bmap() is only
called when there is physical i/o to do.

Approved by:	re (hrs)
2007-07-20 17:06:57 +00:00
Bruce Evans
d34b0a1bac Clean up before implementing vfs clustering for msdosfs:
In msdosfs_read(), mainly reorder the main loop to the same order as in
ffs_read().

In msdosfs_write() and extendfile(), use vfs_bio_clrbuf() instead of
clrbuf().  I think this is just a bogus optimization, but ffs always
does it and msdosfs already did it in one place, and it is what I've
tested.

In msdosfs_write(), merge good bits from a comment in ffs_write(), and
fix 1 style bug.

In the main comment for msdosfs_pcbmap(), improve wording and catch
up with 13 years of changes in the function.  This comment belongs in
VOP_BMAP.9 but that doesn't exist.

In msdosfs_bmap(), return EFBIG if the requested cluster number is out
of bounds instead of blindly truncating it, and fix many style bugs.

Approved by:	re (hrs)
2007-07-20 16:21:47 +00:00
Robert Watson
32f9753cfb Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.

Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.

We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths.  Do, however, move those prototypes to priv.h.

Reviewed by:	csjp
Obtained from:	TrustedBSD Project
2007-06-12 00:12:01 +00:00
Pawel Jakub Dawidek
10bcafe9ab Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method.
This way we may support multiple structures in v_data vnode field within
one file system without using black magic.

Vnode-to-file-handle should be VOP in the first place, but was made VFS
operation to keep interface as compatible as possible with SUN's VFS.
BTW. Now Solaris also implements vnode-to-file-handle as VOP operation.

VFS_VPTOFH() was left for API backward compatibility, but is marked for
removal before 8.0-RELEASE.

Approved by:	mckusick
Discussed with:	many (on IRC)
Tested with:	ufs, msdosfs, cd9660, nullfs and zfs
2007-02-15 22:08:35 +00:00
Tai-hwa Liang
61ad2e26ef Fixing compilation bustage by removing references to opt_msdosfs.h.
This auto-generated header file no longer exists since the removal of
MSDOSFS_LARGE in sys/conf/options:1.574.
2007-01-30 08:05:04 +00:00
Craig Rodrigues
f458f2a553 Add a "-o large" mount option for msdosfs. Convert compile-time checks for
#ifdef MSDOSFS_LARGE to run-time checks to see if "-o large" was specified.

Test case provided by Oliver Fromme:
  truncate -s 200G test.img
  mdconfig -a -t vnode -f test.img -u 9
  newfs_msdos -s 419430400 -n 1 /dev/md9 zip250
  mount -t msdosfs /dev/md9 /mnt    # should fail
  mount -t msdosfs -o large /dev/md9 /mnt   # should succeed

PR:		105964
Requested by:	Oliver Fromme <olli lurza secnetix de>
Tested by:	trhodes
MFC after:	2 weeks
2007-01-30 03:11:45 +00:00
Maxim Konovalov
1c5cf521ae o Do not leave uninitialized birthtime: in MSDOSFSMNT_LONGNAME
set birthtime to FAT CTime (creation time) and in the other cases
set birthtime to -1.

o Set ctime to mtime instead of FAT CTime which has completely
different meaning.

PR:		kern/106018
Submitted by:	Oliver Fromme
MFC after:	1 month
2006-12-03 19:04:26 +00:00
Robert Watson
acd3428b7d Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
Poul-Henning Kamp
3c960d9379 Replace slightly crummy fattime<->timespec conversion functions.
2006-10-24 11:14:05 +00:00
Jeff Roberson
89b0e10910 - Reorder calls to vrele() after calls to vput() when the vrele is a
   directory.  vrele() may lock the passed vnode, which in these cases would
   give an invalid lock order of child -> parent.  These situations are
   deadlock prone although do not typically deadlock because the vrele
   is typically not releasing the last reference to the vnode.  Users of
   vrele must consider it as a call to vn_lock() and order it appropriately.

MFC After: 	1 week
Sponsored by:	Isilon Systems, Inc.
Tested by:	kkenn
2006-02-01 00:25:26 +00:00
Tom Rhodes
9fc31f8a5f Update incorrect comments here, there should not be a call to panic()
over fs corruption.

Discussed with:	alfred, phk
2006-01-23 17:45:57 +00:00
Max Khon
710a9accfe Do not assume that `char direntry::deExtension[3]' starts right after
`char direntry::deName[8]' and access deExtension[] explicitly.

Found by:	Coverity Prevent(tm)
CID:		350, 351, 352
2006-01-22 21:09:38 +00:00
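The change in 710a9accfe (addressing deExtension[] explicitly rather than running past the end of deName[]) can be sketched as below. The struct is a simplified, hypothetical stand-in that carries only the two name fields, and copy_name83() is an illustrative helper, not a function from msdosfs.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the name portion of a FAT directory entry. */
struct ex_direntry {
	unsigned char	name[8];	/* base name, space padded */
	unsigned char	ext[3];		/* extension, space padded */
};

/*
 * Copy the 11 name bytes into out by addressing each field on its own,
 * instead of reading 11 bytes starting at name[] and relying on ext[]
 * being laid out immediately after it.
 */
static void
copy_name83(const struct ex_direntry *de, unsigned char out[11])
{
	memcpy(out, de->name, sizeof(de->name));
	memcpy(out + sizeof(de->name), de->ext, sizeof(de->ext));
}
```

Each memcpy() stays within the bounds of the array it reads, so the code stays correct regardless of padding and does not trip bounds checkers or static analysis.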
Poul-Henning Kamp
7ce296cf04 Remove debug printout of major/minor numbers, print name instead.
2005-02-27 21:16:26 +00:00
Peter Edwards
72b3e305af Unbreak a few filesystems for which vnode_create_vobject() wasn't being
called in "open", causing mmap() to fail.

Where possible, pass size of file to vnode_create_vobject() rather
than having it find it out the hard way via VOP_LOOKUP

Reviewed by: phk
2005-01-29 16:23:39 +00:00
Poul-Henning Kamp
83c6439714 Whitespace in vop_vector{} initializations.
2005-01-13 18:59:48 +00:00
Poul-Henning Kamp
0391e5a151 Wrap the bufobj operations in macros: BO_STRATEGY() and BO_WRITE()
2005-01-11 09:10:46 +00:00
Warner Losh
d167cf6f3a /* -> /*- for copyright notices, minor format tweaks as necessary
2005-01-06 18:10:42 +00:00