Commit Graph

3024 Commits

Author SHA1 Message Date
Rick Macklem
694a586a43 Add a lock flags argument to the VFS_FHTOVP() file system
method, so that callers can indicate the minimum vnode
locking requirement. This will allow some file systems to choose
to return a LK_SHARED locked vnode when LK_SHARED is specified
for the flags argument. This patch only adds the flag. It
does not change any file system to use it and all callers
specify LK_EXCLUSIVE, so file system semantics are not changed.

Reviewed by:	kib
2011-05-22 01:07:54 +00:00
Rick Macklem
b70cddba44 Add a sanity check for the existence of an "addr" option
to both NFS clients. This avoids the crash reported by
Sergey Kandaurov (pluknet@gmail.com) to the freebsd-fs@
list with subject "[old nfsclient] different nmount()
args passed from mount vs mount_nfs" dated May 17, 2011.

Tested by:	pluknet at gmail.com (old nfs client)
MFC after:	2 weeks
2011-05-18 18:36:40 +00:00
Rick Macklem
1f3765902c Change the sysctl naming for the old and new NFS clients
to vfs.oldnfs.xxx and vfs.nfs.xxx respectively. This makes
the default nfs client use vfs.nfs.xxx after r221124.
2011-05-15 20:52:43 +00:00
John Baldwin
5b4f35a4f0 Merge comments about converting directory entries to be more direct and
concise.

Inspired by:	Gleb Kurtsou
2011-05-14 01:10:57 +00:00
Rick Macklem
a0c2c3691c Change the new NFS server so that it uses vfs.nfsd naming
for its sysctls instead of vfs.newnfs. This separates the
names from the ones used by the client.
2011-05-08 01:01:27 +00:00
Rick Macklem
1dcad8ec9a Set the initial value of maxfilesize to OFF_MAX in the
new NFS client. It will then be reduced to whatever the
server says it can support. There might be an argument
that this could be one block larger, but since NFS is
a byte granular system, I chose not to do that.

Suggested by:	Matt Dillon
Tested by:	Daniel Braniss (earlier version)
MFC after:	2 weeks
2011-05-06 17:51:00 +00:00
Alexander Motin
08aadbe3b4 Increase NFS_TICKINTVL value from 10 to 500. Now that callout does useful
things only once per second, so other 99 calls per second were useless and
just don't allow idle system to sleep properly.

Reviewed by:	rmacklem
2011-05-06 13:11:50 +00:00
Rick Macklem
78e4b1f838 Change the new NFS server so that it returns 0 when the f_bavail
or f_ffree fields of "struct statfs" are negative, since the
values that go on the wire are unsigned and will appear to be
very large positive values otherwise. This makes the handling
of a negative f_bavail compatible with the old/regular NFS server.

MFC after:	2 weeks
2011-05-06 01:29:14 +00:00
Rick Macklem
f96712c2e6 Fix the new NFS client so that it handles the 64bit fields
that are now in "struct statfs" for NFSv3 and NFSv4. Since
the ffiles value is uint64_t on the wire, I clip the value
to INT64_MAX to avoid setting f_ffree negative.

Tested by:	kib
MFC after:	2 weeks
2011-05-05 00:11:09 +00:00
Rick Macklem
5a816b92a3 Add a comment noting that the NFS code assumes that the
values of error numbers in sys/errno.h will be the same
as the ones specified by the NFS RFCs and that the code
needs to be fixed if error numbers are changed in sys/errno.h.

Suggested by:	Peter Jeremy
MFC after:	2 weeks
2011-05-04 22:02:33 +00:00
Rick Macklem
2e3b981a4d Add kernel support for NFSSVC_ZEROCLTSTATS and NFSSVC_ZEROSRVSTATS
so that they can be used by nfsstat(1) to implement the "-z" option
for the new NFS subsystem.

MFC after:	2 weeks
2011-05-04 13:36:18 +00:00
Rick Macklem
2b08b570cb Revert r221306, since NFSSVC_ZEROSTATS zero'd both client and
server stats, when separate modifiers for NFSSVC_GETSTATS for
each of client and server stats is what it required by nfsstat(1).
2011-05-04 13:30:38 +00:00
Ruslan Ermilov
e2f2b37089 Implemented a mount option "nocto" that disables cache coherency
checking at open time.  It may improve performance for read-only
NFS mounts.  Use deliberately.

MFC after:	1 week
Reviewed by:	rmacklem, jhb (earlier version)
2011-05-04 13:27:45 +00:00
Ruslan Ermilov
55cde634cf In ncl_printf(), call vprintf() instead of printf().
MFC after:	3 days
2011-05-04 11:22:52 +00:00
Rick Macklem
b2946fadcd Add the kernel support needed to zero out the nfsstats
structure for the new NFS subsystem. This will be used
by nfsstats.c to implement the "-z" option.

MFC after:	2 weeks
2011-05-01 22:19:52 +00:00
Konstantin Belousov
4417ac326a Clarify the comment.
MFC after:	1 week
2011-04-30 13:49:03 +00:00
Rick Macklem
8b713a2f8a The build was broken by r221190 for 64bit arches like amd64.
This patch fixes it.

MFC after:	2 weeks
2011-04-29 12:30:15 +00:00
Rick Macklem
61c827204b Fix the new NFS client so that it handles the "nfs_args" value
in mnt_optnew. This is needed so that the old mount(2) syscall
works and that is needed so that amd(8) works. The code was
basically just cribbed from sys/nfsclient/nfs_vfsops.c with minor
changes. This patch is mainly to fix the new NFS client so that
amd(8) works with it. Thanks go to Craig Rodrigues for helping with
this.

Tested by:	Craig Rodrigues (for amd)
MFC after:	2 weeks
2011-04-28 23:21:50 +00:00
John Baldwin
7d74606889 Update a comment since ext2fs does not use SU.
Reviewed by:	kib
2011-04-28 20:25:15 +00:00
John Baldwin
466a71d75e The b_dep field of buffers is always empty for ext2fs, it is only used
for SU in FFS.

Reported by:	kib
2011-04-28 17:36:26 +00:00
John Baldwin
9e880b876d Sync with several changes in UFS/FFS:
- 77115: Implement support for O_DIRECT.
- 98425: Fix a performance issue introduced in 70131 that was causing
  reads before writes even when writing full blocks.
- 98658: Rename the BALLOC flags from B_* to BA_* to avoid confusion with
  the struct buf B_ flags.
- 100344: Merge the BA_ and IO_ flags so so that they may both be used in
  the same flags word. This merger is possible by assigning the IO_ flags
  to the low sixteen bits and the BA_ flags the high sixteen bits.
- 105422: Fix a file-rewrite performance case.
- 129545: Implement IO_INVAL in VOP_WRITE() by marking the buffer as
  "no cache".
- Readd the DOINGASYNC() macro and use it to control asynchronous writes.
  Change i-node updates to honor DOINGASYNC() instead of always being
  synchronous.
- Use a PRIV_VFS_RETAINSUGID check instead of checking cr_uid against 0
  directly when deciding whether or not to clear suid and sgid bits.

Submitted by:	Pedro F. Giffuni  giffunip at yahoo
2011-04-28 14:27:17 +00:00
Rick Macklem
afea74655f Fix module names and dependencies so the NFS clients will
load correctly as modules after r221124.
2011-04-27 20:42:30 +00:00
John Baldwin
bbfe24fbf2 Use a private EXT2_ROOTINO constant instead of redefining ROOTINO.
Submitted by:	Pedro F. Giffuni  giffunip at yahoo
2011-04-27 18:25:35 +00:00
John Baldwin
4d2ede6798 Various style fixes including using uint*_t instead of u_int*_t.
Submitted by:	Pedro F. Giffuni  giffunip at yahoo
2011-04-27 18:15:34 +00:00
Rick Macklem
4309e17add This patch changes head so that the default NFS client is now the new
NFS client (which I guess is no longer experimental). The fstype "newnfs"
is now "nfs" and the regular/old NFS client is now fstype "oldnfs".
Although mounts via fstype "nfs" will usually work without userland
changes, an updated mount_nfs(8) binary is needed for kernels built with
"options NFSCL" but not "options NFSCLIENT". Updated mount_nfs(8) and
mount(8) binaries are needed to do mounts for fstype "oldnfs".
The GENERIC kernel configs have been changed to use options
NFSCL and NFSD (the new client and server) instead of NFSCLIENT and NFSSERVER.
For kernels being used on diskless NFS root systems, "options NFSCL"
must be in the kernel config.
Discussed on freebsd-fs@.
2011-04-27 17:51:51 +00:00
Rick Macklem
541cb7a358 Fix a kernel linking problem introduced by r221032, r221040
when building kernels that don't have "options NFS_ROOT"
specified. I plan on moving the functions that use these
data structures into the shared code in sys/nfs/nfs_diskless.c
in a future commit. At that time, these definitions will no
longer be needed in nfs_vfsops.c and nfs_clvfsops.c.

MFC after:	2 weeks
2011-04-26 13:50:11 +00:00
Rick Macklem
8954032f0d Modify the experimental (newnfs) NFS client so that it uses the
same diskless NFS root code as the regular client, which
was moved to sys/nfs by r221032. This fixes the newnfs
client so that it can do an NFSv3 diskless root file system.

MFC after:	2 weeks
2011-04-25 23:12:18 +00:00
Rick Macklem
151c163e4d Fix the experimental NFS client so that it does not bogusly
set the f_flags field of "struct statfs". This had the interesting
effect of making the NFSv4 mounts "disappear" after r221014,
since NFSMNT_NFSV4 and MNT_IGNORE became the same bit.

MFC after:	2 weeks
2011-04-25 14:51:08 +00:00
Rick Macklem
385edc8e71 Modify the experimental NFS client so that it uses the same
"struct nfs_args" as the regular NFS client. This is needed
so that the old mount(2) syscall will work and it makes
sharing of the diskless NFS root code easier. Eary in the
porting exercise I introduced a new revision of nfs_args, but
didn't actually need it, thanks to nmount(2). I re-introduced the
NFSMNT_KERB flag, since it does essentially the same thing and
the old one would not have been used because it never worked.
I also added a few new NFSMNT_xxx flags to sys/nfsclient/nfs_args.h
that are used by the experimental NFS client.

MFC after:	2 weeks
2011-04-25 13:09:32 +00:00
Rick Macklem
24e2bcc006 Remove the nm_mtx mutex locking from the test for
nm_maxfilesize. This value rarely, if ever, changes
and the nm_mtx mutex is locked/unlocked earlier in
the function, which should be sufficient to avoid
getting a stale cached value for it. There is a
discussion w.r.t. what these tests should be, but
I've left them basically the same as the regular
NFS client for now.

Suggested by:	pjd
MFC after:	2 weeks
2011-04-21 19:56:06 +00:00
Rick Macklem
920ae5d96a Revert r220906, since the vp isn't always locked when
nfscl_request() is called. It will need a more involved
patch.
2011-04-21 12:38:12 +00:00
Rick Macklem
69bcf84509 Add a check for VI_DOOMED at the beginning of nfscl_request()
so that it won't try and use vp->v_mount to do an RPC during
a forced dismount. There needs to be at least one more kernel
commit, plus a change to the umount(8) command before forced
dismounts will work for the experimental NFS client.

MFC after:	2 weeks
2011-04-20 23:25:18 +00:00
Rick Macklem
b29b9bcbfb Modify the offset + size checks for read and write in the
experimental NFS client to take care of overflows for the calls
above the buffer cache layer in a manner similar to r220876.
Thanks go to dillon at apollo.backplane.com for providing the
snippet of code that does this.

MFC after:	2 weeks
2011-04-20 01:15:22 +00:00
Rick Macklem
b1297f142f Modify the offset + size checks for read and write in the
experimental NFS client to take care of overflows. Thanks
go to dillon at apollo.backplane.com for providing the
snippet of code that does this.

MFC after:	2 weeks
2011-04-20 00:21:51 +00:00
Rick Macklem
58c969c8de Fix up handling of the nfsmount structure in read and write
within the experimental NFS client. Mostly add mutex locking
and use the same rsize, wsize during the operation by keeping
a local copy of it. This is another change that brings it
closer to the regular NFS client.

MFC after:	2 weeks
2011-04-19 01:09:51 +00:00
Rick Macklem
a8bafa5d3b Revert r220761 since, as kib@ pointed out, the case of
adding the check to nfsrpc_close() isn't useful. Also,
the check in nfscl_getcl() must be more involved, since
it needs to check before and after the acquisition of
the refcnt on nfsc_lock, while the mutex that protects
the client state data is held.
2011-04-18 23:35:16 +00:00
Rick Macklem
bc62b5cf6a Add a vput() to nfs_lookitup() in the experimental NFS client
for a case that will probably never happen. It can only
happen if a server were to successfully lookup a file, but not
return attributes for that file. Although technically allowed
by the NFSv3 RFC, I doubt any server would ever do this.
However, if it did, the client would have not vput()'d the
new vnode when it needed to do so.

MFC after:	2 weeks
2011-04-18 01:02:43 +00:00
Rick Macklem
ab42af2708 Add vput() calls in two places in the experimental NFS client
that would be needed if, in the future, nfscl_loadattrcache()
were to return an error. Currently nfscl_loadattrcache()
never returns an error, so these cases never currently happen.

MFC after:	2 weeks
2011-04-18 00:41:23 +00:00
Rick Macklem
78d8a60009 Change the mutex locking for several locations in the
experimental NFS client's vnode op functions to make
them compatible with the regular NFS client. I'll admit
I'm not sure that the mutex locks around the assignments
are needed, but the regular client has them, so I added them.
Also, add handling of the case of partial attributes in
setattr to be compatible with the regular client.

MFC after:	2 weeks
2011-04-17 23:56:57 +00:00
Rick Macklem
be8b35eda7 Add checks for MNTK_UNMOUNTF at the beginning of three
functions, so that threads don't get stuck in them during
a forced dismount. nfs_sync/VFS_SYNC() needs this, since it is
called by dounmount() before VFS_UNMOUNT(). The nfscl_nget()
case makes sure that a thread doing an VOP_OPEN() or
VOP_ADVLOCK() call doesn't get blocked before attempting
the RPC. Attempting RPCs don't block, since they all
fail once a forced dismount is in progress.
The third one at the beginning of nfsrpc_close()
is done so threads don't get blocked while doing VOP_INACTIVE()
as the vnodes are cleared out.
With these three changes plus a change to the umount(1)
command so that it doesn't do "sync()" for the forced case
seem to make forced dismounts work for the experimental NFS
client.

MFC after:	2 weeks
2011-04-17 23:04:03 +00:00
Rick Macklem
ebd9ef339f Get rid of the "nfscl: consider increasing kern.ipc.maxsockbuf"
message that was generated when doing experimental NFS client
mounts. I put that message in because the krpc would hang with
the default size for mounts that used large rsize/wsize values.
Since the bug that caused these hangs was fixed by r213756,
I think the message is no longer needed.

MFC after:	2 weeks
2011-04-17 20:01:32 +00:00
Rick Macklem
0a9f005dff Fix up some of the sysctls for the experimental NFS client so
that they use the same names as the regular client. Also add
string descriptions for them.

MFC after:	2 weeks
2011-04-17 18:56:17 +00:00
Rick Macklem
8e82d541da Change some defaults in the experimental NFS client to be the
same as the regular NFS client for NFSv3. The main one is making
use of a reserved port# the default. Also, set the retry limit
for TCP the same and fix the code so that it doesn't disable
readdirplus for NFSv4.

MFC after:	2 weeks
2011-04-17 14:10:12 +00:00
Rick Macklem
f5613c1d97 Fix readdirplus in the experimental NFS client so that it
skips over ".." to avoid a LOR race with nfs_lookup(). This
fix is analagous to r138256 in the regular NFS client.

MFC after:	2 weeks
2011-04-17 02:44:51 +00:00
Rick Macklem
4b3a38ecdf Add a lktype flags argument to nfscl_nget() and ncl_nget() in the
experimental NFS client so that its nfs_lookup() function can use
cn_lkflags in a manner analagous to the regular NFS client.

MFC after:	2 weeks
2011-04-16 23:20:21 +00:00
Rick Macklem
f8a2f6b03a Add mutex locking on the nfs node in ncl_inactive() for the
experimental NFS client.

MFC after:	2 weeks
2011-04-16 22:15:59 +00:00
Rick Macklem
7b8c319be4 Change the experimental NFS client so that it creates nfsiod
threads in the same manner as the regular NFS client after
r214026 was committed. This resolves the lors fixed by r214026
and its predecessors for the regular client.

Reviewed by:	jhb
MFC after:	2 weeks
2011-04-15 23:07:48 +00:00
Rick Macklem
a09001a82b Fix the experimental NFSv4 server so that it uses VOP_PATHCONF()
to determine if a file system supports NFSv4 ACLs. Since
VOP_PATHCONF() must be called with a locked vnode, the function
is called before nfsvno_fillattr() and the result is passed in
as an extra argument.

MFC after:	2 weeks
2011-04-14 23:46:15 +00:00
Rick Macklem
07c0c166e4 Modify the experimental NFSv4 server so that it handles
crossing of server mount points properly. The functions
nfsvno_fillattr() and nfsv4_fillattr() were modified to
take the extra arguments that are the mount point, a flag
to indicate that it is a file system root and the mounted
on fileno. The mount point argument needs to be busy when
nfsvno_fillattr() is called, since the vp argument is not
locked.

Reviewed by:	kib
MFC after:	2 weeks
2011-04-14 21:49:52 +00:00
Rick Macklem
149ce1025c Add VOP_PATHCONF() support to the experimental NFS client
so that it can, along with other things, report whether or
not NFS4 ACLs are supported.

MFC after:	2 weeks
2011-04-13 22:37:28 +00:00
Rick Macklem
3707cf8962 Fix the experimental NFSv4 client so that it recognizes server
mount point crossings correctly. It was testing the wrong flag.
Also, try harder to make sure that the fsid is different than
the one assigned to the client mount point, by hashing the
server's fsid (just to create a different value deterministically)
when it is the same.

MFC after:	2 weeks
2011-04-13 22:16:52 +00:00
Rick Macklem
f659876f01 Vrele ni_startdir in the experimental NFS server for the case
of NFSv2 getting an error return from VOP_MKNOD(). Without this
patch, the server file system remains busy after an NFSv2
VOP_MKNOD() fails.

MFC after:	2 weeks
2011-04-11 20:54:30 +00:00
Rick Macklem
806e2e4bb6 Add some cleanup code to the module unload operation for
the experimental NFS server, so that it doesn't leak memory
when unloaded. However, unloading the NFSv4 server is not
recommended, since all NFSv4 state will be lost by the unload
and clients will have to recover the state after a server
reload/restart as if the server crashed/rebooted.

MFC after:	2 weeks
2011-04-10 20:43:07 +00:00
Rick Macklem
8d2f180ea4 Add a VOP_UNLOCK() for the directory, when that is not what
VOP_LOOKUP() returned. This fixes a bug in the experimental
NFS server for the case where VFS_VGET() fails returning EOPNOTSUPP
in the ReaddirPlus RPC, forcing the use of VOP_LOOKUP() instead.

MFC after:	2 weeks
2011-04-09 23:55:27 +00:00
Konstantin Belousov
e06c3d4363 Linuxolator calls VOP_READDIR with ncookies pointer. Implement a
workaround for fdescfs to not panic when ncookies is not NULL, similar
to the one committed as r152254, but simpler, due to fdescfs_readdir()
not calling vfs_read_dirent().

PR:	kern/156177
MFC after:	1 week
2011-04-09 21:40:48 +00:00
Edward Tomasz Napierala
722581d9e6 Add RACCT_NOFILE accounting.
Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-04-06 19:13:04 +00:00
Zack Kirsch
418802a96c This patch fixes the Experimental NFS client to properly deal with 32 bit or 64
bit fileid's in NFSv2 and NFSv3. Without this fix, invalid casting (and sign
extension) was creating problems for any fileid greater than 2^31.

We discovered this because we have test clusters with more than 2 billion
allocated files and 64-bit ino_t's (and friend structures).

Reviewed by:    rmacklem
Approved by:    zml (mentor)
MFC after:      2 weeks
2011-03-30 01:10:11 +00:00
Konstantin Belousov
9ba671debc Report EBUSY instead of EROFS for attempt of deleting or renaming the
root directory of msdosfs mount. The VFS code would handle deletion
case itself too, assuming VV_ROOT flag is not lost. The msdosfs_rename()
should also note attempt to rename root via doscheckpath() or different
mount point check leading to EXDEV. Nonetheless, keep the checks for now.

The change is inspired by NetBSD change referenced in PR, but return
EBUSY like kern_unlinkat() does.

PR:	kern/152079
MFC after:	1 week
2011-03-25 22:31:28 +00:00
John Baldwin
8e6fa660f2 Fix some locking nits with the p_state field of struct proc:
- Hold the proc lock while changing the state from PRS_NEW to PRS_NORMAL
  in fork to honor the locking requirements.  While here, expand the scope
  of the PROC_LOCK() on the new process (p2) to avoid some LORs.  Previously
  the code was locking the new child process (p2) after it had locked the
  parent process (p1).  However, when locking two processes, the safe order
  is to lock the child first, then the parent.
- Fix various places that were checking p_state against PRS_NEW without
  having the process locked to use PROC_LOCK().  Every place was already
  locking the process, just after the PRS_NEW check.
- Remove or reduce the use of PROC_SLOCK() for places that were checking
  p_state against PRS_NEW.  The PROC_LOCK() alone is sufficient for reading
  the current state.
- Reorder fill_kinfo_proc() slightly so it only acquires PROC_SLOCK() once.

MFC after:	1 week
2011-03-24 18:40:11 +00:00
Alexander Leidinger
de5b19526b Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/
PMC/SYSV/...).

No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availablility if
needed.

Sponsored by:   Google Summer of Code 2010
Submitted by:   kibab
Reviewed by:    arch@ (parts by rwatson, trasz, jhb)
X-MFC after:    to be determined in last commit with code from this project
2011-02-25 10:11:01 +00:00
John Baldwin
056c6c933c Use ffs() to locate free bits in the inode and block bitmaps rather than
loops with bit shifts.
2011-02-24 22:11:36 +00:00
Rebecca Cran
974206cf70 Fix typos - remove duplicate "is".
PR:		docs/154934
Submitted by:	Eitan Adler <lists at eitanadler.com>
MFC after:	3 days
2011-02-23 09:22:33 +00:00
Alan Cox
4d2f3d2cde Eliminate two dubious attempts at optimizing the implementation of a
file's last accessed, modified, and changed times:

TMPFS_NODE_ACCESSED and TMPFS_NODE_CHANGED should be set unconditionally
in tmpfs_remove() without regard to the number of hard links to the file.
Otherwise, after the last directory entry for a file has been removed, a
process that still has the file open could read stale values for the last
accessed and changed times with fstat(2).

Similarly, tmpfs_close() should update the time-related fields even if all
directory entries for a file have been removed.  In this case, the effect
is that the time-related fields will have values that are later than
expected.  They will correspond to the time at which fstat(2) is called.

In collaboration with:	kib
MFC after:	1 week
2011-02-22 14:47:10 +00:00
Rebecca Cran
6bccea7c2b Fix typos - remove duplicate "the".
PR:	bin/154928
Submitted by:	Eitan Adler <lists at eitanadler.com>
MFC after: 	3 days
2011-02-21 09:01:34 +00:00
Alan Cox
7ded42ba28 tmpfs_remove() isn't modifying the file's data, so it shouldn't set
TMPFS_NODE_MODIFIED on the node.

PR:		152488
Submitted by:	Anton Yuzhaninov
Reviewed by:	kib
MFC after:	1 week
2011-02-19 21:04:36 +00:00
Bjoern A. Zeeb
1fb51a12f2 Mfp4 CH=177274,177280,177284-177285,177297,177324-177325
VNET socket push back:
  try to minimize the number of places where we have to switch vnets
  and narrow down the time we stay switched.  Add assertions to the
  socket code to catch possibly unset vnets as seen in r204147.

  While this reduces the number of vnet recursion in some places like
  NFS, POSIX local sockets and some netgraph, .. recursions are
  impossible to fix.

  The current expectations are documented at the beginning of
  uipc_socket.c along with the other information there.

  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by:  jhb
  Tested by:    zec

Tested by:	Mikolaj Golub (to.my.trociny gmail.com)
MFC after:	2 weeks
2011-02-16 21:29:13 +00:00
Alan Cox
4673c751f8 Further simplify tmpfs_reg_resize(). Also, update its comments, including
style fixes.
2011-02-14 15:36:38 +00:00
Alan Cox
b10d1d5d60 Eliminate tn_reg.tn_aobj_pages. Instead, correctly maintain the vm
object's size field.  Previously, that field was always zero, even
when the object tn_reg.tn_aobj contained numerous pages.

Apply style fixes to tmpfs_reg_resize().

In collaboration with:	kib
2011-02-13 14:46:39 +00:00
John Baldwin
73dd6d1f8f After reading a bitmap block for i-nodes or blocks, recheck the count of
free i-nodes or blocks to handle a race where another thread might have
allocated the last i-node or block while we were waiting for the buffer.

Tested by:	dougb
2011-02-08 13:02:25 +00:00
Alan Cox
17f3095d1a Unless "cnt" exceeds MAX_COMMIT_COUNT, nfsrv_commit() and nfsvno_fsync() are
incorrectly calling vm_object_page_clean().  They are passing the length of
the range rather than the ending offset of the range.

Perform the OFF_TO_IDX() conversion in vm_object_page_clean() rather than the
callers.

Reviewed by:	kib
MFC after:	3 weeks
2011-02-05 21:21:27 +00:00
John Baldwin
a3ebd02675 Collapse duplicate definitions of EXT2_SB().
Submitted by:	Pedro F. Giffuni  giffunip at yahoo
2011-02-04 14:20:27 +00:00
John Baldwin
8e42a40607 Fix build with DIAGNOSTIC enabled.
Pointy hat to:	jhb
2011-02-02 14:59:05 +00:00
John Baldwin
45641afb72 Some cosmetic fixes and remove a duplicate constant.
Submitted by:	Pedro F. Giffuni  giffunip at yahoo
2011-02-01 18:30:52 +00:00
John Baldwin
c767faa558 - Set the next_alloc fields for an i-node after allocating a new block
so that future allocations start with most recently allocated block
  rather than the beginning of the filesystem.
- Fix ext2_alloccg() to properly scan for 8 block chunks that are not
  aligned on 8-bit boundaries.  Previously this was causing new blocks
  to be allocated in a highly fragmented fashion (block 0 of a file at
  lbn N, block 1 at lbn N + 8, block 2 at lbn N + 16, etc.).
- Cosmetic tweaks to the currently-disabled fancy realloc sysctls.

PR:		kern/153584
Discussed with:	bde
Tested by:	Pedro F. Giffuni  giffunip at yahoo, Zheng Liu (lz)
2011-02-01 18:21:45 +00:00
George V. Neville-Neil
64181ef324 Quick fix to a comment. 2011-01-27 03:32:16 +00:00
Dmitry Chagin
a5c1afadeb Add macro to test the sv_flags of any process. Change some places to test
the flags instead of explicit comparing with address of known sysentvec
structures.

MFC after:	1 month
2011-01-26 20:03:58 +00:00
John Baldwin
cd2895aab0 - Move special inode constants to ext2_dinode.h and rename them to match
NetBSD.
- Add a constant for the HASJOURNAL compat flag.

PR:		kern/153584
Submitted by:	Pedro F. Giffuni  giffunip at yahoo
2011-01-21 22:00:40 +00:00
John Baldwin
84edda0a2c Restore support for the 'async' and 'sync' mount options lost when
switching to nmount(2).  While here, sort the options.

PR:		kern/153584
Submitted by:	Pedro F. Giffuni  giffunip at yahoo
MFC after:	1 week
2011-01-21 21:33:46 +00:00
Konstantin Belousov
9fb9c623a6 In tmpfs_readdir(), normalize handling of the directory entries that
either overflow the supplied buffer, or cause uiomove fail.
Do not advance cached de when directory entry was not copied out.
Do not return EOF when no entries could be copied due to first entry
too large for supplied buffer, signal EINVAL instead.

Reported by:	Beat G?tzi <beat chruetertee ch>
MFC after:	1 week
2011-01-20 09:39:16 +00:00
John Baldwin
a2add8d070 Fix build with KDB defined.
Pointy hat to:	jhb
Submitted by:	jkim
2011-01-19 19:49:48 +00:00
John Baldwin
08b1d53573 Whitespace and style fixes. 2011-01-19 16:55:32 +00:00
John Baldwin
f82a066c72 Move calculation of 'bmask' earlier to match it's current location in
ufs_lookup().
2011-01-19 16:52:22 +00:00
John Baldwin
007c620744 Merge 118969 from UFS:
Eliminate the i_devvp field from the incore inodes, we can get the same
value from ip->i_ump->um_devvp.

Submitted by:	Pedro F. Giffuni  giffunip at yahoo
MFC after:	1 week
2011-01-19 16:46:13 +00:00
Rick Macklem
8207db3ec3 Fix the experimental NFSv4 server so that it uses VOP_ACCESSX()
to check for VREAD_ACL instead of VOP_ACCESS().

MFC after:	3 days
2011-01-18 14:34:45 +00:00
Rick Macklem
5f73287a6e Modify the experimental NFSv4 server so that it posts a SIGUSR2
signal to the master nfsd daemon whenever the stable restart
file has been modified. This will allow the master nfsd daemon
to maintain an up to date backup copy of the file. This is
enabled via the nfssvc() syscall, so that older nfsd daemons
will not be signaled.

Reviewed by:	jhb
MFC after:	1 week
2011-01-14 23:30:35 +00:00
Zack Kirsch
770b49a314 In the experimental NFS server, when converting an open-owner to a lock-owner,
start at sequence id 1 instead of 0, to match up with both Solaris and Linux.

Reviewed by:    rmacklem
Approved by:    zml (mentor)
2011-01-12 23:46:12 +00:00
Zack Kirsch
52776c502b Clean up the experimental NFS server replay cache when the module is unloaded.
Reviewed by:    rmacklem
Approved by:    zml (mentor)
2011-01-12 23:34:09 +00:00
Rick Macklem
f9266eb1f9 Modify readdirplus in the experimental NFS server in a
manner analogous to r216633 for the regular server. This
change busies the file system so that VFS_VGET() is
guaranteed to be using the correct mount point even
during a forced dismount attempt. Since nfsd_fhtovp() is
not called immediately before readdirplus, the patch is
actually a clone of pjd@'s nfs_serv.c.4.patch instead of
the one committed in r216633.

Reviewed by:	kib
MFC after:	10 days
2011-01-09 02:10:54 +00:00
Rick Macklem
fbf0af3fcb Delete the NFS_STARTWRITE() and NFS_ENDWRITE() macros that
obscured vn_start_write() and vn_finished_write() for the
old OpenBSD port, since most uses have been replaced by the
correct calls.

MFC after:	12 days
2011-01-06 20:31:33 +00:00
Rick Macklem
8974bc2f3a Since the VFS_LOCK_GIANT() code in the experimental NFS
server is broken and the major file systems are now all
mpsafe, modify the server so that it will only export
mpsafe file systems. This was discussed on freebsd-fs@
and removes a fair bit of crufty code.

MFC after:	12 days
2011-01-06 19:50:11 +00:00
Rick Macklem
785f073be9 Modify the experimental NFS server so that it calls
vn_start_write() with a non-NULL vp. That way it will
find the correct mount point mp and use that mp for the
subsequent vn_finished_write() call. Also, it should fail
without crashing if the mount point is being forced dismounted
because vn_start_write() will set the mp NULL via VOP_GETWRITEMOUNT().

Reviewed by:	kib
MFC after:	12 days
2011-01-05 19:35:35 +00:00
Rick Macklem
47524363da Fix the experimental NFS server to use vfs_busyfs() instead
of vfs_getvfs() so that the mount point is busied for the
VFS_FHTOVP() call. This is analagous to r185432 for the
regular NFS server.

Reviewed by:	kib
MFC after:	12 days
2011-01-05 18:46:05 +00:00
Rick Macklem
90305aa38b Fix the nlm so that it no longer depends on the regular
nfs client and, as such, can be loaded for the experimental
nfs client without the regular client.

Reviewed by:	jhb
MFC after:	2 weeks
2011-01-03 20:37:31 +00:00
Rick Macklem
fa5ecdd3b9 Fix the experimental NFS server so that it doesn't leak
a reference count on the directory when creating device
special files.

MFC after:	2 weeks
2011-01-03 00:40:13 +00:00
Rick Macklem
81f78d997d Modify the experimental NFSv4 server so that the lookup
ops return a locked vnode. This ensures that the associated mount
point will always be valid for the code that follows the operation.
Also add a couple of additional checks
for non-error to the other functions that create file objects.

MFC after:	2 weeks
2011-01-03 00:33:32 +00:00
Rick Macklem
c9aad40f5f Delete some cruft from the experimental NFS server that was
only used by the OpenBSD port for its pseudo-fs.

MFC after:	2 weeks
2011-01-02 21:34:01 +00:00
Rick Macklem
629fa50e68 Add checks for VI_DOOMED and vn_lock() failures to the
experimental NFS server, to handle the case where an
exported file system is forced dismounted while an RPC
is in progress. Further commits will fix the cases where
a mount point is used when the associated vnode isn't locked.

Reviewed by:	kib
MFC after:	2 weeks
2011-01-02 19:58:39 +00:00
Rick Macklem
5a12538bd7 Add support for shared vnode locks for the Read operation
in the experimental NFSv4 server.

Reviewed by:	kib
MFC after:	2 weeks
2011-01-01 18:50:49 +00:00
Rick Macklem
bd2fa726e0 Delete the nfsvno_localconflict() function in the experimental
NFS server since it is no longer used and is broken.

MFC after:	2 weeks
2010-12-28 23:50:13 +00:00
Rick Macklem
17891d0082 Modify the experimental NFS server so that it uses LK_SHARED
for RPC operations when it can. Since VFS_FHTOVP() currently
always gets an exclusively locked vnode and is usually called
at the beginning of each RPC, the RPCs for a given vnode will
still be serialized. As such, passing a lock type argument to
VFS_FHTOVP() would be preferable to doing the vn_lock() with
LK_DOWNGRADE after the VFS_FHTOVP() call.

Reviewed by:	kib
MFC after:	2 weeks
2010-12-25 21:56:25 +00:00
Rick Macklem
0cf42b622b Add an argument to nfsvno_getattr() in the experimental
NFS server, so that it can avoid calling VOP_ISLOCKED()
when the vnode is known to be locked. This will allow
LK_SHARED to be used for these cases, which happen to
be all the cases that can use LK_SHARED. This does not
fix any bug, but it reduces the number of calls to
VOP_ISLOCKED() and prepares the code so that it can be
switched to using LK_SHARED in a future patch.

Reviewed by:	kib
MFC after:	2 weeks
2010-12-24 21:31:18 +00:00
Rick Macklem
a852f40b7a Simplify vnode locking in the expeimental NFS server's
readdir functions. In particular, get rid of two bogus
VOP_ISLOCKED() calls. Removing the VOP_ISLOCKED() calls
is the only actual bug fixed by this patch.

Reviewed by:	kib
MFC after:	2 weeks
2010-12-24 20:24:07 +00:00
Rick Macklem
63e1cb4308 Since VOP_READDIR() for ZFS does not return monotonically
increasing directory offset cookies, disable the UFS related
loop that skips over directory entries at the beginning of
the block for the experimental NFS server. This loop is
required for UFS since it always returns directory entries
starting at the beginning of the block that
the requested directory offset is in. In discussion with pjd@
and mckusick@ it seems that this behaviour of UFS should maybe
change, with this fix being an interim patch until then.
This patch only fixes the experimental server, since pjd@ is
working on a patch for the regular server.

Discussed with:	pjd, mckusick
MFC after:	5 days
2010-12-24 18:46:44 +00:00
Rick Macklem
d6ec8427bc Fix two vnode locking problems in nfsd_recalldelegation() in the
experimental NFSv4 server. The first was a bogus use of VOP_ISLOCKED()
in a KASSERT() and the second was the need to lock the vnode for the
nfsrv_checkremove() call. Also, delete a "__unused" that was bogus,
since the argument is used.

Reviewed by:	zack.kirsch at isilon.com
MFC after:	2 weeks
2010-12-17 22:18:09 +00:00
Jaakko Heinonen
2d843e7d34 Don't allow user created symbolic links to cover another entries marked
with DE_USER. If a devfs rule hid such entry, it was possible to create
infinite number of symbolic links with the same name.

Reviewed by:	kib
2010-12-15 16:49:47 +00:00
Jaakko Heinonen
ef456eec95 - Assert that dm_lock is exclusively held in devfs_rules_apply() and
in devfs_vmkdir() while adding the entry to de_list of the parent.
- Apply devfs rules to newly created directories and symbolic links.

PR:		kern/125034
Submitted by:	Mateusz Guzik (original version)
2010-12-15 16:42:44 +00:00
Jaakko Heinonen
2f66e90fc7 Handle the special ruleset 0 in devfs_ruleset_use(). An attempt set the
current ruleset to 0 with command "devfs ruleset 0" triggered a KASSERT
in devfs_ruleset_create().

PR:		kern/125030
Submitted by:	Mateusz Guzik
2010-12-12 08:52:13 +00:00
Rick Macklem
b4a8d95279 Disable attempts to establish a callback connection from the
experimental NFSv4 server to a NFSv4 client when delegations are not
being issued, even if the client advertises a callback path.
This avoids a problem where a Linux client advertises a
callback path that doesn't work, due to a firewall, and then
times out an Open attempt before the FreeBSD server gives up
its callback connection attempt. (Suggested by
drb at karlov.mff.cuni.cz to fix the Linux client problem that
he reported on the fs-stable mailing list.)
The server should probably have
a 1sec timeout on callback connection attempts when there are
no delegations issued to the client, but that patch will require
changes to the krpc and this serves as a work around until then.

Tested by:	drb at karlov.mff.cuni.cz
MFC after:	5 days
2010-12-09 19:02:23 +00:00
Edward Tomasz Napierala
ef694c1ac4 Replace pointer to "struct uidinfo" with pointer to "struct ucred"
in "struct vm_object".  This is required to make it possible to account
for per-jail swap usage.

Reviewed by:	kib@
Tested by:	pho@
Sponsored by:	FreeBSD Foundation
2010-12-02 17:37:16 +00:00
Konstantin Belousov
847e02e941 For non-stopped threads, td_frame pointer is undefined. As a
consequence, fill_regs() and fill_fpregs() access random data, usually
on the thread kernel stack. Most often the td_frame points to the
previous frame saved by last kernel entry sequence, but this is not
guaranteed.

For /proc/<pid>/{regs,fpregs} read access, require the thread to be in
stopped state. Otherwise, return EBUSY as is done for write case.

Reported and tested by:	pho
Approved by:	des (procfs maintainer)
MFC after:	1 week
2010-12-02 12:44:51 +00:00
Konstantin Belousov
730b63b0c2 Remove prtactive variable and related printf()s in the vop_inactive
and vop_reclaim() methods. They seems to be unused, and the reported
situation is normal for the forced unmount.

MFC after:   1 week
X-MFC-note:  keep prtactive symbol in vfs_subr.c
2010-11-19 21:17:34 +00:00
John Baldwin
b3e3402d3a Remove unused includes of <sys/mutex.h> and <machine/mutex.h>. 2010-11-09 20:41:10 +00:00
Rick Macklem
f93d95cbf6 Modify nfs_open() in the experimental NFS client to be compatible
with the regular NFS client. Also, fix a couple of mutex lock issues.

MFC after:	1 week
2010-10-29 13:46:21 +00:00
Rick Macklem
0661e0348b Add a call for nfsrpc_close() to ncl_reclaim() in the experimental
NFSv4 client, since the call in ncl_inactive() might be missed
because VOP_INACTIVE() is not guaranteed to be called before
VOP_RECLAIM().

MFC after:	1 week
2010-10-29 13:34:57 +00:00
Rick Macklem
c5dd9d8c37 Add a flag to the experimental NFSv4 client to indicate when
delegations are being returned for reasons other than a Recall.
Also, re-organize nfscl_recalldeleg() slightly, so that it leaves
clearing NMODIFIED to the ncl_flush() call and invalidates the
attribute cache after flushing. It is hoped that these changes
might fix the problem others have seen when using the NFSv4
client with delegations enabled, since I can't reliably reproduce
the problem. These changes only affect the client when doing NFSv4
mounts with delegations enabled.

MFC after:	10 days
2010-10-26 23:18:37 +00:00
Rick Macklem
377c50f67a Modify the experimental NFSv4 server's file handle hash function
to use the generic hash32_buf() function. Although adding the
bytes seemed sufficient for UFS and ZFS, since most of the bytes
are the same for file handles on the same volume, this might not
be sufficient for other file systems. Use of a generic function
also seems preferable to one specific to NFSv4.

Suggested by:	gleb.kurtsou at gmail.com
MFC after:	10 days
2010-10-23 22:28:29 +00:00
Rick Macklem
91027b4ef0 Modify the file handle hash function in the experimental NFS
server so that it will work better for non-UFS file systems.
The new function simply sums the bytes of the fh_fid field
of fhandle_t.

MFC after:	10 days
2010-10-22 21:38:56 +00:00
Rick Macklem
8a1b5ade5f Modify the experimental NFS server in a manner analagous to
r214049 for the regular NFS server, so that it will not do
a VOP_LOOKUP() of ".." when at the root of a file system
when performing a ReaddirPlus RPC.

MFC after:	10 days
2010-10-21 18:49:12 +00:00
Rick Macklem
4d4f9a3721 Fix the type of the 3rd argument for nm_getinfo so that it works
for architectures like sparc64.

Suggested by:	kib
MFC after:	2 weeks
2010-10-19 11:55:58 +00:00
Rick Macklem
ca27c028d8 Modify the NFS clients and the NLM so that the NLM can be used
by both clients. Since the NLM uses various fields of the
nfsmount structure, those fields were extracted and put in a
separate nfs_mountcommon structure stored in sys/nfs/nfs_mountcommon.h.
This structure also has a function pointer for a function that
extracts the required information from the mount point and nfs vnode
for that particular client, for information stored differently by the
clients.

Reviewed by:	jhb
MFC after:	2 weeks
2010-10-19 00:20:00 +00:00
Kevin Lo
4bc8fad7bd Fix a possible race where the directory dirent is moved to the location
that was used by ".." entry.
This change seems fixed panic during attempt to access msdosfs data
over nfs.

Reviewed by:	kib
MFC after:	1 week
2010-10-18 03:34:33 +00:00
Rui Paulo
0b53cc9f56 Ignore the return value of DE_INTERNALIZE(). 2010-10-13 11:37:39 +00:00
Andriy Gapon
e07b64c567 tmpfs + sendfile: do not produce partially valid pages for vnode's tail
See r213730 for details of analogous change in ZFS.

MFC after:	3 days
2010-10-12 17:16:51 +00:00
Jaakko Heinonen
27877c9903 Format prototypes to follow style(9) more closely.
Discussed with:	kib, phk
2010-10-12 15:58:52 +00:00
Rick Macklem
db0a33d219 Try and make the nfsrv_localunlock() function in the experimental
NFSv4 server more readable. Mostly changes to comments, but a
case of >= is changed to >, since == can never happen. Also, I've
added a couple of KASSERT()s and a slight optimization, since
once the "else if" case happens, subsequent locks in the list can't
have any effect. None of these changes fixes any known bug.

MFC after:	2 weeks
2010-10-11 23:15:18 +00:00
Konstantin Belousov
d0cc54f3b4 The r184588 changed the layout of struct export_args, causing an ABI
breakage for old mount(2) syscall, since most struct <filesystem>_args
embed export_args. The mount(2) is supposed to provide ABI
compatibility for pre-nmount mount(8) binaries, so restore ABI to
pre-r184588.

Requested and reviewed by:	bde
MFC after:    2 weeks
2010-10-10 07:05:47 +00:00
Konstantin Belousov
b0d5391101 Add a comment describing the reason for calling cache_purge(fvp).
Requested by:	danfe
MFC after:	6 days
2010-10-08 07:17:22 +00:00
Konstantin Belousov
4d477d5c77 The msdosfs lookup is case insensitive. Several aliases may be inserted for
a single directory entry. As a consequnce, name cache purge done by lookup
for fvp when DELETE op for namei is specified, might be not enough to
expunge all namecache entries that were installed for this direntry.

Explicitely call cache_purge(fvp) when msdosfs_rename() succeeded.

PR:	kern/93634
MFC after:	1 week
2010-10-07 08:36:02 +00:00
Alan Cox
a03e344a7f M_USE_RESERVE has been deprecated for a decade. Eliminate any uses that
have no run-time effect.
2010-10-02 17:58:57 +00:00
Jaakko Heinonen
47bcfb6422 Add a new function devfs_dev_exists() to be able to find out if a
specific devfs path already exists.

The function will be used from kern_conf.c to detect duplicate device
registrations. Callers must hold the devmtx mutex.

Reviewed by:	kib
2010-09-27 18:20:56 +00:00
Jaakko Heinonen
d318c565d7 Add reference counting for devfs paths containing user created symbolic
links. The reference counting is needed to be able to determine if a
specific devfs path exists. For true device file paths we can traverse
the cdevp_list but a separate directory list is needed for user created
symbolic links.

Add a new directory entry flag DE_USER to mark entries which should
unreference their parent directory on deletion.

A new function to traverse cdevp_list and the directory list will be
introduced in a separate commit.

Idea from:	kib
Reviewed by:	kib
2010-09-27 17:47:09 +00:00
Jaakko Heinonen
6adc52306a Modify devfs_fqpn() for future use in devfs path reference counting
code:

- Accept devfs_mount and devfs_dirent as the arguments instead of a
  vnode. This generalizes the function so that it can be used from
  contexts where vnode references are not available.
- Accept NULL cnp argument. No '/' will be appended, if a NULL cnp is
  provided.
- Make the function global and add its prototype to devfs.h.

Reviewed by:	kib
2010-09-21 16:49:02 +00:00
Rick Macklem
a212c01aac Fix nfsrv_freeallnfslocks() in the experimental NFSv4 server so that
it frees local locks correctly upon close. In order for
nfsrv_localunlock() to work correctly, the lock can no longer be in
the lockowner's stateid list. As such, nfsrv_freenfslock() has to
be called before nfsrv_localunlock(), to get rid of the lock structure
on the lockowner's stateid list. This only affected operation when
local locks (vfs.newnfs.enable_locallocks=1) are enabled, which is
not the default at this time.

MFC after:	1 week
2010-09-19 01:18:03 +00:00
Rick Macklem
c7aafc24c4 Fix the experimental NFSv4 server so that it performs local VOP_ADVLOCK()
unlock operations correctly. It was passing in F_SETLK instead of
F_UNLCK as the operation for the unlock case. This only affected
operation when local locking (vfs.newnfs.enable_locallocks=1) was enabled.

MFC after:	1 week
2010-09-19 01:05:19 +00:00
Jaakko Heinonen
8570d045e5 - For consistency, remove "." and ".." entries from de_dlist before
calling devfs_delete() (and thus possibly dropping dm_lock) in
  devfs_rmdir_empty().
- Assert that we don't return doomed entries from devfs_find(). [1]

Suggested by:	kib [1]
Reviewed by:	kib
2010-09-18 18:37:41 +00:00
Jaakko Heinonen
89d10571db Remove empty devfs directories automatically.
devfs_delete() now recursively removes empty parent directories unless
the DEVFS_DEL_NORECURSE flag is specified. devfs_delete() can't be
called anymore with a parent directory vnode lock held because the
possible parent directory deletion needs to lock the vnode. Thus we
unlock the parent directory vnode in devfs_remove() before calling
devfs_delete().

Call devfs_populate_vp() from devfs_symlink() and devfs_vptocnp() as now
directories can get removed.

Add a check for DE_DOOMED flag to devfs_populate_vp() because
devfs_delete() drops dm_lock before the VI_DOOMED vnode flag gets set.
This ensures that devfs_populate_vp() returns an error for directories
which are in progress of deletion.

Reviewed by:	kib
Discussed on:	freebsd-current (mostly silence)
2010-09-15 14:23:55 +00:00
Andriy Gapon
21bd3e2576 tmpfs, zfs + sendfile: mark page bits as valid after populating it with data
Otherwise, adding insult to injury, in addition to double-caching of data
we would always copy the data into a vnode's vm object page from backend.
This is specific to sendfile case only (VOP_READ with UIO_NOCOPY).

PR:		kern/141305
Reported by:	Wiktor Niesiobedzki <bsd@vink.pl>
Reviewed by:	alc
Tested by:	tools/regression/sockets/sendfile
MFC after:	2 weeks
2010-09-15 10:31:27 +00:00
Rick Macklem
2c6d0e01f8 This patch applies one of the two fixes suggested by
zack.kirsch at isilon.com for a race between nfsrv_freeopen()
and nfsrv_getlockfile() in the experimental NFS server that
he found during testing. Although nfsrv_freeopen() holds a
sleep lock on the lock file structure when called with
cansleep != 0, nfsrv_getlockfile() could still search the
list, once it acquired the NFSLOCKSTATE() mutex. I believe
that acquiring the mutex in nfsrv_freeopen() fixes the race.

MFC after:	2 weeks
2010-09-10 23:49:33 +00:00
Rick Macklem
37fe683250 Fix the NFSVNO_CMPFH() macro in the experimental NFS server so
that it works correctly for ZFS file handles. It is possible to
have two ZFS file handles that differ only in the bytes in the
fid_reserved field of the generic "struct fid" and comparing the
bytes in fid_data didn't catch this case. This patch changes the
macro to compare all bytes of "struct fid".

Tested by:	gull at gull.us
MFC after:	2 weeks
2010-09-10 23:18:45 +00:00
Rick Macklem
a8c0af5906 Fix the experimental NFS client so that it doesn't panic when
NFSv2,3 byte range locking is attempted. A fix that allows the
nlm_advlock() to work with both clients is in progress, but
may take a while. As such, I am doing this commit so that
the kernel doesn't panic in the meantime.

Submitted by:	jh
MFC after:	2 weeks
2010-09-09 15:45:11 +00:00
Ivan Voras
b2143ecb99 Avoid "Entry can disappear before we lock fdvp" panic.
PR:		150143
Submitted by:	Gleb Kurtsou <gk at FreeBSD.org>
Pretty sure it won't blow up: mckusick
MFC after:	2 weeks
2010-09-07 22:40:45 +00:00
John Baldwin
8e27c18282 Store the full timestamp when caching timestamps of files and
directories for purposes of validating name cache entries.  This
closes races where two updates to a file or directory within the same
second could result in stale entries in the name cache.  While here,
remove the 'n_expiry' field as it is no longer used.

Reviewed by:	rmacklem
MFC after:	1 week
2010-09-07 14:29:45 +00:00
Daichi GOTO
21f9b7b28a Allowed unionfs to use whiteout not supporting file system as
upper layer. Until now, unionfs prevents to use that kind of
file system as upper layer. This time, I changed to allow
that kind of file system as upper layer. By this change, you
can use whiteout not supporting file system (e.g., especially
for tmpfs) as upper layer. It's very useful for combination of
tmpfs as upper layer and read only file system as lower layer.

By difinition, without whiteout support from the file system
backing the upper layer, there is no way that delete and rename
operations on lower layer objects can be done.  EOPNOTSUPP is
returned for this kind of operations as generated by VOP_WHITEOUT()
along with any others which would make modifica tions to the
lower layer, such as chmod(1).

This change is suggested by ed.

Submitted by:	ed
2010-09-05 04:58:16 +00:00
Rick Macklem
848fd2c0e2 Change the code in ncl_bioread() in the experimental NFS
client to return an error when rabp is not set, so it
behaves the same way as the regular NFS client for this
case. It does not affect NFSv4, since nfs_getcacheblk()
only fails for "intr" mounts and NFSv4 can't use the
"intr" mount option.

MFC after:	2 weeks
2010-09-05 00:47:44 +00:00
Rick Macklem
0372f5f411 Disable use of the NLM in the experimental NFS client, since
it will crash the kernel because it uses the nfsmount and
nfsnode structures of the regular NFS client.

MFC after:	2 weeks
2010-09-05 00:10:18 +00:00
Ulf Lilleengen
0cc17ce608 - Remove duplicate comment.
PR:		kern/148820
Submitted by:	pluknet <pluknet - at - gmail.com>
2010-09-01 05:34:17 +00:00
Rick Macklem
2d0c83b139 Add a null_remove() function to nullfs, so that the v_usecount
of the lower level vnode is incremented to greater than 1 when
the upper level vnode's v_usecount is greater than one. This
is necessary for the NFS clients, so that they will do a silly
rename of the file instead of actually removing it when the
file is still in use. It is "racy", since the v_usecount is
incremented in many places in the kernel with
minimal synchronization, but an extraneous silly rename is
preferred to not doing a silly rename when it is required.
The only other file systems that currently check the value
of v_usecount in their VOP_REMOVE() functions are nwfs and
smbfs. These file systems choose to fail a remove when the
v_usecount is greater than 1 and I believe will function
more correctly with this patch, as well.

Tested by:	to.my.trociny at gmail.com
Submitted by:	to.my.trociny at gmail.com (earlier version)
Reviewed by:	kib
MFC after:	2 weeks
2010-08-31 01:16:45 +00:00
Rick Macklem
b5cb66df25 Add acquisition of a reference count on nfsv4root_lock to the
nfsd_recalldelegation() function, since this function is called
by nfsd threads when they are handling NFSv2 or NFSv3 RPCs, where
no reference count would have been acquired.

MFC after:	2 weeks
2010-08-28 23:50:09 +00:00
Rick Macklem
2ec3f92528 The timer routine in the experimental NFS server did not acquire
the correct mutex when checking nfsv4root_lock. Although this
could be fixed by adding mutex lock/unlock calls, zack.kirsch at
isilon.com suggested a better fix that uses a non-blocking
acquisition of a reference count on nfsv4root_lock. This fix
allows the weird NFSLOCKSTATE(); NFSUNLOCKSTATE(); synchronization
to be deleted. This patch applies this fix.

Tested by:	zack.kirsch at isilon.com
MFC after:	2 weeks
2010-08-28 21:41:18 +00:00
Jaakko Heinonen
4136388a18 Set de_dir for user created symbolic links. This will be needed to be
able to resolve their parent directories.
2010-08-26 16:01:29 +00:00
Edward Tomasz Napierala
81f6480d42 Revert r210194, adding a comment explaining why calls to chgproccnt()
in unionfs are actually needed.  I have a better fix in trasz_hrl p4 branch,
but now is not a good moment to commit it.

Reported by:	Alex Kozlov
2010-08-25 21:32:08 +00:00
Jaakko Heinonen
f5efcd64f4 Call devfs_populate_vp() from devfs_getattr(). It was possible that
fstat(2) returned stale information through an open file descriptor.
2010-08-25 15:29:12 +00:00
Jaakko Heinonen
0f6bb099ae Introduce and use devfs_populate_vp() to unlock a vnode before calling
devfs_populate(). This is a prerequisite for the automatic removal of
empty directories which will be committed in the future.

Reviewed by:	kib (previous version)
2010-08-22 16:08:12 +00:00
Ed Schouten
99d57a6bd8 Add support for whiteouts on tmpfs.
Right now unionfs only allows filesystems to be mounted on top of
another if it supports whiteouts. Even though I have sent a patch to
daichi@ to let unionfs work without it, we'd better also add support for
whiteouts to tmpfs.

This patch implements .vop_whiteout and makes necessary changes to
lookup() and readdir() to take them into account. We must also make sure
that when adding or removing a file, we honour the componentname's
DOWHITEOUT and ISWHITEOUT, to prevent duplicate filenames.

MFC after:	1 month
2010-08-22 05:36:06 +00:00
John Baldwin
3634d5b241 Add dedicated routines to toggle lockmgr flags such as LK_NOSHARE and
LK_CANRECURSE after a lock is created.  Use them to implement macros that
otherwise manipulated the flags directly.  Assert that the associated
lockmgr lock is exclusively locked by the current thread when manipulating
these flags to ensure the flag updates are safe.  This last change required
some minor shuffling in a few filesystems to exclusively lock a brand new
vnode slightly earlier.

Reviewed by:	kib
MFC after:	3 days
2010-08-20 19:46:50 +00:00
Jaakko Heinonen
96835d61b6 Call dev_rel() in error paths.
Reported by:	kib
Reviewed by:	kib
MFC after:	2 weeks
2010-08-19 16:39:00 +00:00
Jaakko Heinonen
64040d3978 Allow user created symbolic links to cover device files and directories
if the device file appears during or after the link creation.

User created symbolic links are now inserted at the head of the
directory entry list after the "." and ".." entries. A new directory
entry flag DE_COVERED indicates that an entry is covered by a symbolic
link.

PR:		kern/114057
Reviewed by:	kib
Idea from:	kib
Discussed on:	freebsd-current (mostly silence)
2010-08-12 15:29:07 +00:00
Robert Watson
be80264279 Properly bounds check ioctl/pioctl data arguments for Coda:
1. Use unsigned rather than signed lengths
2. Bound messages to/from Venus to VC_MAXMSGSIZE
3. Bound messages to/from general user processes to VC_MAXDATASIZE
4. Update comment regarding data limits for pioctl

Without (1) and (3), it may be possible for unprivileged user processes to
read sensitive portions of kernel memory.  This issue is only present if
the Coda kernel module is loaded and venus (the userspace Coda daemon) is
running and has /coda mounted.

As Coda is considered experimental and production use is warned against in
the coda(4) man page, and because Coda must be explicitly configured for a
configuration to be vulnerable, we won't be issuing a security advisory.
However, if you are using Coda, then you are advised to apply these fixes.

Reported by:	Dan J. Rosenberg <drosenberg at vsecurity.com>
Obtained from:	NetBSD (Christos Zoulas)
Security:	Kernel memory disclosure; no advisory as feature experimental
MFC after:	3 days
2010-08-07 08:08:14 +00:00
Konstantin Belousov
d3c5a40780 Enable shared lookups and externed shared ops for devfs.
In collaboration with:	pho
MFC after:	1 month
2010-08-06 09:46:53 +00:00
Konstantin Belousov
3979450b4c Add new make_dev_p(9) flag MAKEDEV_ETERNAL to inform devfs that created
cdev will never be destroyed. Propagate the flag to devfs vnodes as
VV_ETERNVALDEV. Use the flags to avoid acquiring devmtx and taking a
thread reference on such nodes.

In collaboration with:	pho
MFC after:	1 month
2010-08-06 09:42:15 +00:00
Konstantin Belousov
9968a42675 Enable shared locks for the devfs vnodes. Honor the locking mode
requested by lookup(). This should be a nop at the moment.

In collaboration with:	pho
MFC after:	1 month
2010-08-06 09:23:47 +00:00
Konstantin Belousov
3a6fc63c9f Initialize VV_ISTTY vnode flag on the devfs vnode creation instead of
doing it on each open.

In collaboration with:	pho
MFC after:	1 month
2010-08-06 09:06:55 +00:00
Rick Macklem
e3649d5a2f Modify the return value for nfscl_mustflush() from boolean_t,
which I mistakenly thought was correct w.r.t. style(9), back
to int and add the checks for != 0. This is just a stylistic
modification.

MFC after:	1 week
2010-08-03 01:49:28 +00:00
Rick Macklem
f92bbff248 Move sys/nfsclient/nfs_lock.c into sys/nfs and build it as a separate
module that can be used by both the regular and experimental nfs
clients. This fixes the problem reported by jh@ where /dev/nfslock
would be registered twice when both nfs clients were used.
I also defined the size of the lm_fh field to be the correct value,
as it should be the maximum size of an NFSv3 file handle.

Reviewed by:	jh
MFC after:	2 weeks
2010-07-24 22:11:11 +00:00
Rick Macklem
66c0f45a3d For the experimental NFSv4 server's dumplocks operation, add the
MPSAFE flag to cn_flags so that it doesn't panic. The panics weren't
seen since nfsdumpstate(8) is broken for the "-l" case, so this
was never done. I'll do a separate commit to fix nfsdumpstate(8).

Submitted by:	zack.kirsch at isilon.com
MFC after:	2 weeks
2010-07-19 23:33:42 +00:00
Rick Macklem
6ec1ef63d1 Add a call to nfscl_mustflush() in nfs_close() of the experimental
NFSv4 client, so that attributes are not acquired from the server
when a delegation for the file is held. This can reduce the number
of Getattr Ops significantly.

MFC after:	2 weeks
2010-07-18 22:35:46 +00:00
Edward Tomasz Napierala
dce36a0159 Fix build.
Submitted by:	Andreas Tobler <andreast-list at fgznet.ch>
2010-07-18 07:55:22 +00:00
Rick Macklem
5813b99c83 Change the nfscl_mustflush() function in the experimental NFSv4
client to return a boolean_t in order to make it more compatible
with style(9).

MFC after:	2 weeks
2010-07-18 00:24:01 +00:00
Edward Tomasz Napierala
b29d02f258 Remove updating process count by unionfs. It serves no purpose, unionfs just
needs root credentials for a moment.
2010-07-17 15:45:20 +00:00
Rick Macklem
2cf552b115 Patch the experimental NFSv4 server so that it acquires a reference
count on nfsv4rootfs_lock when dumping state, since these functions
are not called by nfsd threads. Without this reference count, it
is possible for an nfsd thread to acquire an exclusive lock on
nfsv4rootfs_lock while the dump is in progress and then change the
lists, potentially causing a crash.

Reported by:	zack.kirsch at isilon.com
MFC after:	2 weeks
2010-07-16 23:17:05 +00:00
John Baldwin
61e1c19319 Revert the previous commit. The race is not applicable to the lockmgr
implementation in 8.0 and later as its flags field does not hold dynamic
state such as waiters flags, but is only modified in lockinit() aside
from VN_LOCK_*().

Discussed with:	attilio
2010-07-16 19:52:03 +00:00
John Baldwin
dbfcf8cfea When the MNTK_EXTENDED_SHARED mount option was added, some filesystems were
changed to defer the setting of VN_LOCK_ASHARE() (which clears LK_NOSHARE
in the vnode lock's flags) until after they had determined if the vnode was
a FIFO.  This occurs after the vnode has been inserted a VFS hash or some
similar table, so it is possible for another thread to find this vnode via
vget() on an i-node number and block on the vnode lock.  If the lockmgr
interlock (vnode interlock for vnode locks) is not held when clearing the
LK_NOSHARE flag, then the lk_flags field can be clobbered.  As a result
the thread blocked on the vnode lock may never get woken up.  Fix this by
holding the vnode interlock while modifying the lock flags in this case.

MFC after:	3 days
2010-07-16 19:20:20 +00:00
Rick Macklem
866e6c5adb Delete comments related to soft clock interrupts that don't apply
to the FreeBSD port of the experimental NFSv4 server.

Submitted by:	zack.kirsch at isilon.com
MFC after:	2 weeks
2010-07-16 01:44:49 +00:00
John Baldwin
3c497facfb Retire the NFS access cache timestamp structure. It was used in VOP_OPEN()
to avoid sending multiple ACCESS/GETATTR RPCs during a single open()
between VOP_LOOKUP() and VOP_OPEN().  Now we always send the RPC in
VOP_LOOKUP() and not VOP_OPEN() in the cases that multiple RPCs could be
sent.

MFC after:	2 weeks
2010-07-15 19:40:48 +00:00
John Baldwin
f9b1a4a3b6 Merge 208603, 209946, and 209948 to the new NFS client:
Move attribute cache flushes from VOP_OPEN() to VOP_LOOKUP() to provide
more graceful recovery for stale filehandles and eliminate the need for
conditionally clearing the attribute cache in the !NMODIFIED case in
VOP_OPEN().

Reviewed by:	rmacklem
MFC after:	2 weeks
2010-07-15 19:21:48 +00:00
Rick Macklem
63f6e5bf6f This patch fixes a bug in the experimental NFSv4 server where it
released a reference count on nfsv4rootfs_lock erroneously when
administrative revocation of state was done.

Submitted by:	zack.kirsch at isilon.com
MFC after:	2 weeks
2010-07-15 03:02:10 +00:00
Rick Macklem
86836fcf1f For the experimental NFSv4 client, make sure that attributes that
predate the issue of a delegation are not cached once the delegation
is held. This is necessary, since cached attributes remain valid
while the delegation is held.

MFC after:	2 weeks
2010-07-13 23:14:39 +00:00
Rick Macklem
4bc59a660c For the experimental NFSv4 client, do not use cached attributes
that were invalidated, even when a delegation for the file is held.

MFC after:	2 weeks
2010-07-13 23:07:32 +00:00
Rick Macklem
95b1c51b6c Fix a bogus comment that mentions lru lists that don't exist.
Reported by:	zack.kirsch at isilon.com
MFC after:	2 weeks
2010-07-13 22:44:14 +00:00
Andriy Gapon
12b3a08d09 udf_vnops: cosmetic followup to r208671 - better looking code
Suggested by:	jhb
MFC after:	3 days
2010-06-22 08:22:25 +00:00
Alan Cox
61a2a5dcd2 Eliminate unnecessary page queues locking. 2010-06-18 22:12:12 +00:00
Alan Cox
8393d186b9 Eliminate unnecessary page queues locking. 2010-06-16 00:41:21 +00:00
Rick Macklem
a8437c97f1 Add MODULE_DEPEND() macros to the experimental NFS client and
server so that the modules will load when kernels are built with
none of the NFS* configuration options specified. I believe this
resolves the problems reported by PR kern/144458 and the email on
freebsd-stable@ posted by Dmitry Pryanishnikov on June 13.

Tested by:	kib
PR:		kern/144458
Reviewed by:	kib
MFC after:	1 week
2010-06-15 00:25:04 +00:00
Konstantin Belousov
b38f7723eb In NFS clients, instead of inconsistently using #ifdef
DIAGNOSTIC and #ifndef DIAGNOSTIC for debug assertions, prefer
KASSERT(). Also change one #ifdef DIAGNOSTIC in the new nfs server.

Submitted by:	Mikolaj Golub <to.my.trociny gmail com>
MFC after:	2 weeks
2010-06-13 05:24:27 +00:00
Andriy Gapon
1bdfff2252 fix a few cases where a string is passed via format argument instead of
via %s

Most of the cases looked harmless, but this is done for the sake of
correctness.  In one case it even allowed to drop an intermediate buffer.

Found by:	clang
MFC after:	2 week
2010-06-11 19:27:21 +00:00
Jaakko Heinonen
f40645c83d Add a new function devfs_parent_dirent() for resolving devfs parent
directory entry. Use the new function in devfs_fqpn(), devfs_lookupx()
and devfs_vptocnp() instead of manually resolving the parent entry.

Reviewed by:	kib
2010-06-09 15:29:12 +00:00
Jaakko Heinonen
59e0452e82 Don't try to call cdevsw d_close() method when devfs_close() is called
because of insmntque1() failure.

Found with:	stress2
Suggested and reviewed by:	kib
2010-06-01 18:57:21 +00:00
Andriy Gapon
6b3ee24839 udf_readlink: fix malloc call with uninitialized size parameter
Found by:	clang static analyzer
MFC after:	4 days
2010-05-31 09:08:44 +00:00
Rick Macklem
f8c5fbf7c1 Allow the experimental NFSv4 client to use cached attributes
when a write delegation is held. Also, add a missing
mtx_unlock() call for the ACL debugging code.

MFC after:	5 days
2010-05-18 05:18:21 +00:00
Rick Macklem
5ed9b96420 Add a sanity check for a negative args.fhsize to the experimental
NFS client.

MFC after:	5 days
2010-05-17 23:55:38 +00:00
Konstantin Belousov
de082cd17a Disable bypass for the vop_advlockpurge(). The vop is called after
vop_revoke(), the v_data is already destroyed.

Reported and tested by:	ed
2010-05-16 05:00:29 +00:00
Konstantin Belousov
c3fd23a2dc The thread_unsuspend() requires both process mutex and process spinlock
locked. Postpone the process unlock till the thread_unsuspend() is called.

Approved by:	des (procfs maintainer)
MFC after:	1 week
2010-05-10 15:19:12 +00:00
Konstantin Belousov
53731b3c44 For detach procfs ctl command, also clear P_STOPPED_TRACE process stop
flag, and for each thread, TDB_SUSPEND debug flag, same as it is done by
exit1() for orphaned debugee.

Approved by:	des (procfs maintainer)
MFC after:	1 week
2010-05-10 15:18:03 +00:00
Rick Macklem
c19f54267c Fix typos in macros.
PR:		kern/146375
Submitted by:	simon AT comsys.ntu-kpi.kiev.ua
MFC after:	1 week
2010-05-08 14:50:12 +00:00
Rick Macklem
23d9efa7a8 Patch the experimental NFS client so that it works for NFSv2
by adding the necessary mapping from NFSv3 procedure numbers
to NFSv2 procedure numbers when doing NFSv2 RPCs.

MFC after:	1 week
2010-05-08 01:24:18 +00:00
Alan Cox
03679e2334 Push down the page queues lock into vm_page_activate(). 2010-05-07 15:49:43 +00:00
Konstantin Belousov
d2ba618a63 Add MAKEDEV_NOWAIT flag to make_dev_credf(9), to create a device node
in a no-sleep context. If resource allocation cannot be done without
sleep, make_dev_credf() fails and returns NULL.

Reviewed by:	jh
MFC after:	2 weeks
2010-05-06 19:22:50 +00:00
Alan Cox
eb00b276ab Eliminate page queues locking around most calls to vm_page_free(). 2010-05-06 18:58:32 +00:00
Edward Tomasz Napierala
307d88b787 Style fixes and removal of unneeded variable.
Submitted by:	bde@
2010-05-06 18:43:19 +00:00
Alan Cox
5ac59343be Acquire the page lock around all remaining calls to vm_page_free() on
managed pages that didn't already have that lock held.  (Freeing an
unmanaged page, such as the various pmaps use, doesn't require the page
lock.)

This allows a change in vm_page_remove()'s locking requirements.  It now
expects the page lock to be held instead of the page queues lock.
Consequently, the page queues lock is no longer required at all by callers
to vm_page_rename().

Discussed with: kib
2010-05-05 18:16:06 +00:00
Edward Tomasz Napierala
b5f770bd86 Move checking against RLIMIT_FSIZE into one place, vn_rlimit_fsize().
Reviewed by:	kib
2010-05-05 16:44:25 +00:00
Alan Cox
e3ef0d2fcf Push down the acquisition of the page queues lock into vm_page_unwire().
Update the comment describing which lock should be held on entry to
vm_page_wire().

Reviewed by:	kib
2010-05-05 03:45:46 +00:00
Konstantin Belousov
fc0c3802f0 Lock the page around vm_page_activate() and vm_page_deactivate() calls
where it was missed. The wrapped fragments now protect wire_count with
page lock.

Reviewed by:	alc
2010-05-03 20:31:13 +00:00
Alan Cox
c5a648516e Acquire the page lock around vm_page_unwire() and vm_page_wire().
Reviewed by:	kib
2010-05-03 16:41:11 +00:00
Alan Cox
b88b6c9d80 It makes no sense for vm_page_sleep_if_busy()'s helper, vm_page_sleep(),
to unconditionally set PG_REFERENCED on a page before sleeping.  In many
cases, it's perfectly ok for the page to disappear, i.e., be reclaimed by
the page daemon, before the caller to vm_page_sleep() is reawakened.
Instead, we now explicitly set PG_REFERENCED in those cases where having
the page persist until the caller is awakened is clearly desirable.  Note,
however, that setting PG_REFERENCED on the page is still only a hint,
and not a guarantee that the page should persist.
2010-05-02 17:33:46 +00:00
Rick Macklem
8583f92fdf For the experimental NFS client, it should always flush dirty
buffers before closing the NFSv4 opens, as the comment states.
This patch deletes the call to nfscl_mustflush() which would
return 0 for the case where a delegation still exists, which
was incorrect and could cause crashes during recovery from
an expired lease.

MFC after:	1 week
2010-04-28 23:16:21 +00:00
Rick Macklem
cb8a84e08e Delete a diagnostic statement that is no longer useful from
the experimental NFS client.

MFC after:	1 week
2010-04-28 23:05:42 +00:00
Rick Macklem
23f929dfe8 An NFSv4 server will reply NFSERR_GRACE for non-recovery RPCs
during the grace period after startup. This grace period must
be at least the lease duration, which is typically 1-2 minutes.
It seems prudent for the experimental NFS client to wait a few
seconds before retrying such an RPC, so that the server isn't
flooded with non-recovery RPCs during recovery. This patch adds
an argument to nfs_catnap() to implement a 5 second delay
for this case.

MFC after:	1 week
2010-04-24 22:52:14 +00:00
Rick Macklem
67c5c2d2d8 When the experimental NFS client is handling an NFSv4 server reboot
with delegations enabled, the recovery could fail if the renew
thread is trying to return a delegation, since it will not do the
recovery. This patch fixes the above by having nfscl_recalldeleg()
fail with the I/O operations returning EIO, so that they will be
attempted later. Most of the patch consists of adding an argument
to various functions to indicate the delegation recall case where
this needs to be done.

MFC after:	1 week
2010-04-22 23:51:01 +00:00
Konstantin Belousov
5673e3cb08 The cache_enter(9) function shall not be called for doomed dvp.
Assert this.

In the reported panic, vdestroy() fired the assertion "vp has namecache
for ..", because pseudofs may end up doing cache_enter() with reclaimed
dvp, after dotdot lookup temporary unlocked dvp.
Similar problem exists in ufs_lookup() for "." lookup, when vnode
lock needs to be upgraded.

Verify that dvp is not reclaimed before calling cache_enter().

Reported and tested by:	pho
Reviewed by:	kan
MFC after:	2 weeks
2010-04-20 10:19:27 +00:00
Rick Macklem
a318bc273d For the experimental NFS client doing an NFSv4 mount,
set the NFSCLFLAGS_RECVRINPROG while doing recovery from an expired
lease in a manner similar to r206818 for server reboot recovery.
This will prevent the function that acquires stateids for I/O
operations from acquiring out of date stateids during recovery.
Also, fix up mutex locking on the nfsc_flags field.

MFC after:	1 week
2010-04-20 01:02:39 +00:00
Rick Macklem
7ea710b3b1 Avoid extraneous recovery cycles in the experimental NFS client
when an NFSv4 server reboots, by doing two things.
1 - Make the function that acquires a stateid for I/O operations
    block until recovery is complete, so that it doesn't acquire
    out of date stateids.
2 - Only allow a recovery once every 1/2 of a lease duration, since
    the NFSv4 server must provide a recovery grace period of at
    least a lease duration. This should avoid recoveries caused
    by an out of date stateid that was acquired for an I/O op.
    just before a recovery cycle started.

MFC after:	1 week
2010-04-18 22:21:23 +00:00
Jaakko Heinonen
17f820725e Revert r206560. The change doesn't work correctly in all cases with
multiple devfs mounts.
2010-04-16 07:02:28 +00:00
Rick Macklem
0ac68bd339 Add mutex lock calls to 2 cases in the experimental NFS client's
renew thread where they were missing.

MFC after:	1 week
2010-04-15 23:56:05 +00:00
Rick Macklem
55909abf07 The experimental NFS client was not filling in recovery credentials
for opens done locally in the client when a delegation for the file
was held. This could cause the client to crash in crsetgroups() when
recovering from a server crash/reboot. This patch fills in the
recovery credentials for this case, in order to avoid the client crash.
Also, add KASSERT()s to the credential copy functions, to catch any
other cases where the credentials aren't filled in correctly.

MFC after:	1 week
2010-04-15 22:57:30 +00:00
Jaakko Heinonen
70781bf94e - Ignore and report duplicate and empty device names in devfs_populate_loop()
instead of causing erratic behavior. Currently make_dev(9) can't fail, so
  there is no way to report an error to make_dev(9) callers.
- Disallow using "." and ".." in device path names. It didn't work previously
  but now it is reported rather than panicing.
- Treat multiple sequential slashes as single in device path names.

Discussed with:	pjd
2010-04-13 18:53:39 +00:00
Joel Dahl
d122d78412 Switch to our preferred 2-clause BSD license.
Approved by:	bp
2010-04-07 16:50:38 +00:00
Rick Macklem
2a45247c7a Harden the experimental NFS server a little, by adding range
checks on the length of the client's open/lock owner name. Also,
add free()'s for one case where they were missing and would
have caused a leak if NFSERR_BADXDR had been replied. Probably
never happens, but the leak is now plugged, just in case.

MFC after:	2 weeks
2010-04-06 01:14:49 +00:00
Robert Watson
f1853d0fc2 Synchronize Coda kernel module definitions in our coda.h to Coda 6's
coda.h:

- CodaFid typdef -> struct CodaFid throughout.
- Use unsigned int instead of unsigned long for venus_dirent and other
  cosmetic fixes.
- Introduce cuid_t and cgid_t and use instead of uid_t and gid_t in RPCs.
- Synchronize comments and macros.
- Use u_int32_t instead of unsigned long for coda_out_hdr.

With these changes, a 64-bit Coda kernel module now works with
coda6_client, whereas previous userspace and kernel versions of RPCs
differed sufficiently to prevent using the file system.  This has been
verified only with casual testing, but /coda is now usable for at least
basic operations on amd64.

MFC after:	1 week
2010-04-05 20:12:54 +00:00
Robert Watson
1c482201ef Correct definition of CIOC_KERNEL_VERSION Coda ioctl() for systems
where sizeof(int) != sizeof(sizeof(int)), or the ioctl will return
EINVAL.

MFC after:	3 days
2010-04-05 19:40:13 +00:00
Rick Macklem
54bde1faa5 Harden the experimental NFS server a little, by adding extra checks
in the readdir functions for non-positive byte count arguments.
For the negative case, set it to the maximum allowable, since it
was actually a large positive value (unsigned) on the wire.
Also, fix up the readdir function comment a bit.

Suggested by:	dillon AT apollo.backplane.com
MFC after:	2 weeks
2010-04-04 23:19:11 +00:00
Andriy Gapon
423b0fb7ad mountmsdosfs: reject too high value of bytes per cluster
Bytes per cluster are calcuated as bytes per sector times sectors per
cluster.  Too high value can overflow an internal variable with type
that can hold only values in valid range.  Trying to use a wider type
results in an attempt to read more than MAXBSIZE at once, a panic.
Unfortunately, it is FreeBSD newfs_msdos that  produces filesystems
with invalid parameters for certain types of media.

Reported by:	Fabian Keil <freebsd-listen@fabiankeil.de>,
		Paul B. Mahol <onemda@gmail.com>
Discussed with:	bde, kib
MFC after:	1 week
X-ToDo:		fix newfs_msdos
2010-04-02 15:22:23 +00:00
Konstantin Belousov
ea01588095 Add function vop_rename_fail(9) that performs needed cleanup for locks
and references of the VOP_RENAME(9) arguments. Use vop_rename_fail()
in deadfs_rename().

Tested by:	Mikolaj Golub
MFC after:	1 week
2010-04-02 14:03:01 +00:00
Rick Macklem
15b28cb82d For the experimental NFS server, add a call to free the lookup
path buffer for one case where it was missing when doing mkdir.
This could have conceivably resulted in a leak of a buffer, but
a leak was never observed during testing, so I suspect it would
have occurred rarely, if ever, in practice.

MFC after:	2 weeks
2010-04-02 02:19:28 +00:00
Rick Macklem
f61786cb60 Add SAVENAME to the cn_flags for all cases in the experimental
NFS server for the CREATE cn_nameiop where SAVESTART isn't set.
I was not aware that this needed to be done by the caller until
recently.

Tested by:	lampa AT fit.vutbr.cz (link case)
Submitted by:	lampa AT fit.vutbr.cz (link case)
MFC after:	2 weeks
2010-04-02 01:53:48 +00:00
Rick Macklem
a43fcbe34d This patch should fix handling of byte range locks locally
on the server for the experimental nfs server. When enabled
by setting vfs.newnfs.locallocks_enable to non-zero, the
experimental nfs server will now acquire byte range locks
on the file on behalf of NFSv4 clients, such that lock
conflicts between the NFSv4 clients and processes running
locally on the server, will be recognized and handled correctly.

MFC after:	2 weeks
2010-03-30 23:11:50 +00:00
Rick Macklem
7482701cd4 Patch the experimental NFS server in a manner analagous to r205661
for the regular NFS server, to ensure that ESTALE is
returned to the client for all errors returned by VFS_FHTOVP().

MFC after:	2 weeks
2010-03-26 01:35:19 +00:00
Rick Macklem
3dfe81c650 Fix the experimental NFS subsystem so that it uses the correct
preprocessor macro name for not requiring strict data alignment.

Suggested by:	marius
MFC after:	2 weeks
2010-03-24 02:02:02 +00:00
Jung-uk Kim
d04be5775f Fix a long standing regression of readdir(3) in fdescfs(5) introduced
in r1.48.  We were stopping at the first null pointer when multiple file
descriptors were opened and one in the middle was closed.  This restores
traditional behaviour of fdescfs.

MFC after:	3 days
2010-03-16 19:59:14 +00:00
Nathan Whitehorn
841c0c7ec7 Provide groundwork for 32-bit binary compatibility on non-x86 platforms,
for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32
option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts
of the kernel and enhances the freebsd32 compatibility code to support
big-endian platforms.

Reviewed by:	kib, jhb
2010-03-11 14:49:06 +00:00
Robert Watson
2684bef615 Update nfsrv_getsocksndseq() for changes in TCP internals since FreeBSD 6.x:
- so_pcb is now guaranteed to be non-NULL and valid if a valid socket
  reference is held.

- Need to check INP_TIMEWAIT and INP_DROPPED before assuming inp_ppcb is a
  tcpcb, as it might be a tcptw or NULL otherwise.

- tp can never be NULL by the end of the function, so only check
  TCPS_ESTABLISHED before extracting tcpcb fields.

The NFS server arguably incorporates too many assumptions about TCP
internals, but fixing that is left for nother day.

MFC after:		1 week
Reviewed by:		bz
Reviewed and tested by:	rmacklem
Sponsored by:		Juniper Networks
2010-03-11 11:33:04 +00:00
Konstantin Belousov
84caee6bbb When returning error from msdosfs_lookup(), make sure that *vpp is NULL.
lookup() KASSERTs this condition.

Reported and tested by:	pho
MFC after:	3 weeks
2010-03-03 21:59:45 +00:00
Konstantin Belousov
6c0358cc98 Do not leak vnode lock when msdosfs mount is updated and specified
device is different from the device used to the original mount.

Note that update_mp does not need devvp locked, and pmp->pm_devvp cannot
be freed meantime.

Reported and tested by:	pho
MFC after:	3 weeks
2010-03-02 17:24:33 +00:00
Konstantin Belousov
a84ec05f56 Only destroy pm_fatlock on error if it was initialized.
MFC after:	3 weeks
2010-03-02 11:02:59 +00:00
Konstantin Belousov
f0147e0a46 Mark msdosfs as mpsafe.
Tested by:	pho
MFC after:	3 weeks
2010-02-28 17:19:22 +00:00
Konstantin Belousov
db811dd724 Fix the race between dotdot lookup and forced unmount, by using
msdosfs-specific variant of vn_vget_ino(), msdosfs_deget_dotdot().

As was done for UFS, relookup the dotdot denode after the call to
msdosfs_deget_dotdot(), because vnode lock is dropped and directory
might be moved.

Tested by:	pho
MFC after:	3 weeks
2010-02-28 17:17:29 +00:00
Konstantin Belousov
30e65ad1bb Use pm_fatlock to protect per-filesystem rb tree used to allocate fileno
on the large FAT volumes. Previously, a single global mutex was used.

Tested by:	pho
MFC after:	3 weeks
2010-02-28 17:16:43 +00:00
Konstantin Belousov
eb739c7cd5 Add assertions for FAT bitmap state.
Tested by:	pho
MFC after:	3 weeks
2010-02-28 17:15:45 +00:00
Konstantin Belousov
6be1a4cc5f Use pm_fatlock to protect fat bitmap.
Tested by:	pho
MFC after:	3 weeks
2010-02-28 17:13:59 +00:00
Konstantin Belousov
23b6c23084 Add per-mountpoint lockmgr lock for msdosfs. It is intended to be used
as fat bitmap lock and to replace global mutex protecting fileno rbtree.

Tested by:	pho
MFC after:	3 weeks
2010-02-28 17:13:07 +00:00
Konstantin Belousov
0bdbd6270f In msdosfs deget(), properly handle the case when the vnode is found in hash.
Tested by:	pho
MFC after:	3 weeks
2010-02-28 17:11:31 +00:00
Konstantin Belousov
740a720142 In msdosfs_inactive(), reclaim the vnodes both for SLOT_DELETED and
SLOT_EMPTY deName[0] values. Besides conforming to FAT specification, it
also clears the issue where vfs_hash_insert found the vnode in hash, and
newly allocated vnode is vput()ed. There, deName[0] == 0, and vnode is
not reclaimed, indefinitely kept on mountlist.

Tested by:	pho
MFC after:	3 weeks
2010-02-28 17:10:41 +00:00
Konstantin Belousov
2e45cc5bf6 Remove seemingly unneeded unlock/relock of the dvp in msdosfs_rmdir,
causing LOR.

Reported and tested by:	pho
MFC after:	3 weeks
2010-02-28 17:09:09 +00:00
Konstantin Belousov
ef6a2be307 Assert that the msdosfs vnode is (e)locked in several places.
The plan is to use vnode lock to protect denode and fat cache,
and having separate lock for block use map.

Change the check and return on impossible condition into KASSERT().

Tested by:	pho
MFC after:	3 weeks
2010-02-28 17:07:49 +00:00
Konstantin Belousov
35fcc0662b Remove unused global statistic about fat cache usage.
Tested by:	pho
MFC after:	3 weeks
2010-02-28 17:06:42 +00:00
Ulrich Spörlein
8fa03d08ca Fix common misspelling of hierarchy
Pointed out by:		bf1783 at gmail
Approved by:		np (cxgb), kientzle (tar, etc.), philip (mentor)
2010-02-20 10:19:19 +00:00
Konstantin Belousov
3c8b687fe1 Invalid filesystem might cause the bp to be never read.
Noted by:	Pedro F. Giffuni <giffunip tutopia com>
Obtanined from:	NetBSD
MFC after:	1 week
2010-02-14 12:10:49 +00:00
Rick Macklem
9c360f222d Change the default value for vfs.newnfs.enable_locallocks to 0 for
the experimental NFS server, since local locking is known to be
broken and the patch to fix it is still a work in progress.

MFC after:	5 days
2010-02-14 00:18:32 +00:00
Rick Macklem
d5ad662523 This fixes the experimental NFS server so that it won't crash in the
caching code for IPv6 by fixing a typo that used the incorrect variable.
It also fixes the indentation of the statement above it.

Reported by:	simon AT comsys.ntu-kpi.kiev.ua
MFC after:	5 days
2010-02-13 23:56:19 +00:00
Konstantin Belousov
699d124f23 Fix function name in the comment in the second location too.
Submitted by:	ed
MFC after:	1 week
2010-02-13 12:50:09 +00:00
Konstantin Belousov
48d1bcf8e0 - Add idempotency guards so the structures can be used in other utilities.
- Update bpb structs with reserved fields.
- In direntry struct join deName with deExtension. Although a
  fix was attempted in the past, these fields were being overflowed,
  Now this is consistent with the spec, and we can now share the
  WinChksum code with NetBSD.

Submitted by:	Pedro F. Giffuni <giffunip tutopia com>
Mostly obtained from:	NetBSD
Reviewed by:	bde
MFC after:	2 weeks
2010-02-13 12:41:07 +00:00
Konstantin Belousov
8b36e81367 Use M_ZERO instead of calling bzero().
Fix function name in the comment.

MFC after:	1 week
2010-02-13 12:11:03 +00:00
Konstantin Belousov
4f160a1c59 Remove unused macros.
MFC after:	1 week
2010-02-13 11:34:25 +00:00
Rick Macklem
dd5b5a9431 Patch the experimental NFS client so that there is a timeout
for negative name cache entries in a manner analogous to
r202767 for the regular NFS client. Also, make the code in
nfs_lookup() compatible with that of the regular client
and replace the sysctl variable that enabled negative name
caching with the mount point option.

MFC after:	2 weeks
2010-01-31 19:12:24 +00:00
Ed Schouten
bf2296be8b Properly use dev_refl()/dev_rel() in kern.devname.
While there, perform some clean-up fixes. Update some stale comments on
struct cdev * instead of dev_t and devfs_random(). Also add some missing
whitespace.

MFC after:	1 week
2010-01-31 15:19:16 +00:00
Jaakko Heinonen
dec3772ee4 Add "maxfilesize" mount option for tmpfs to allow specifying the
maximum file size limit. Default is UINT64_MAX when the option is
not specified. It was useless to set the limit to the total amount of
memory and swap in the system.

Use tmpfs_mem_info() rather than get_swpgtotal() in tmpfs_mount() to
check if there is enough memory available.

Remove now unused get_swpgtotal().

Reviewed by:	Gleb Kurtsou
Approved by:	trasz (mentor)
2010-01-29 12:09:14 +00:00
Rick Macklem
80169e41d6 Patch the experimental NFS client in a manner analogous to
r203072 for the regular NFS client. Also, delete two fields
of struct nfsmount that are not used by the FreeBSD port of
the client.

MFC after:	2 weeks
2010-01-28 16:17:24 +00:00
Edward Tomasz Napierala
678a6d7a4a Don't touch v_interlock; use VI_* macros instead. 2010-01-27 19:30:44 +00:00
Marius Strobl
9251d56f5f On LP64 struct ifid is 64-bit aligned while struct fid is 32-bit aligned
so on architectures with strict alignment requirements we can't just simply
cast the latter to the former but need to copy it bytewise instead.

PR:		143010
MFC after:	3 days
2010-01-23 22:38:01 +00:00
Jaakko Heinonen
e48fbf26e8 Truncate read request rather than returning EIO if the request is
larger than MAXPHYS + 1. This fixes a problem with cat(1) when it
uses a large I/O buffer.

Reported by:	Fernando Apesteguía
Suggested by:	jilles
Reviewed by:	des
Approved by:	trasz (mentor)
2010-01-22 08:45:12 +00:00
Jaakko Heinonen
189ee6be40 - Change the type of nodes_max to u_int and use "%u" format string to
convert its value. [1]
- Set default tm_nodes_max to min(pages + 3, UINT32_MAX). It's more
  reasonable than the old four nodes per page (with page size 4096) because
  non-empty regular files always use at least one page. This fixes possible
  overflow in the calculation. [2]
- Don't allow more than tm_nodes_max nodes allocated in tmpfs_alloc_node().

PR:		kern/138367
Suggested by:	bde [1], Gleb Kurtsou [2]
Approved by:	trasz (mentor)
2010-01-20 16:56:20 +00:00
Ulf Lilleengen
2ab38c399e Revert parts of r202283:
- Return EOPNOTSUPP before EROFS to be consistent with other filesystems.
- Fix setting of the nodump flag for users without PRIV_VFS_SYSFLAGS privilege.

Submitted by:	jh@
2010-01-18 19:09:16 +00:00
Ulf Lilleengen
e09c00cada Bring in the ext2fs work done by Aditya Sarawgi during and after Google Summer
of Code 2009:

- BSDL block and inode allocation policies for ext2fs. This involves the use
  FFS1 style block and inode allocation for ext2fs. Preallocation was removed
  since it was GPL'd.
- Make ext2fs MPSAFE by introducing locks to per-mount datastructures.
- Fixes for kern/122047 PR.
- Various small bugfixes.
- Move out of gnu/ directory.

Sponsored by:   Google Inc.
Submitted by:	Aditya Sarawgi <sarawgi.aditya AT SPAMFREE gmail DOT com>
2010-01-14 14:30:54 +00:00
Jaakko Heinonen
5364a38dba - Fix some style bugs in tmpfs_mount(). [1]
- Remove a stale comment about tmpfs_mem_info() 'total' argument.

Reported by:	bde [1]
2010-01-13 14:17:21 +00:00
Brooks Davis
646063122d Update the comment on printing group membership to reflect that fact
that each groupt the process is a member of is printed rather than an
entry for each group the user could be a member of.

MFC after:	3 days
2010-01-09 23:23:52 +00:00
Edward Tomasz Napierala
f92a68eec3 Remove unused smbfs_smb_qpathinfo(). 2010-01-08 15:53:07 +00:00
Jaakko Heinonen
720c50b339 - Change the type of size_max to u_quad_t because its value is converted
with vfs_scanopt(9) using the "%qu" format string.
- Limit the maximum value of size_max to (SIZE_MAX - PAGE_SIZE) to
  prevent overflow in howmany() macro.

PR:		kern/141194
Approved by:	trasz (mentor)
MFC after:	2 weeks
2010-01-08 07:57:43 +00:00
Rick Macklem
3968293472 The test for "same client" for the experimental nfs server over NFSv4
was broken w.r.t. byte range lock conflicts when it was the same client
and the request used the open_to_lock_owner4 case, since lckstp->ls_clp
was not set. This patch fixes it by using "clp" instead of "lckstp->ls_clp".

MFC after:	2 weeks
2010-01-03 20:08:10 +00:00
Rick Macklem
b04b4acda0 Fix three related problems in the experimental nfs client when
checking for conflicts w.r.t. byte range locks for NFSv4.
1 - Return 0 instead of EACCES when a conflict is found, for F_GETLK.
2 - Check for "same file" when checking for a conflict.
3 - Don't check for a conflict for the F_UNLCK case.
2010-01-03 18:27:10 +00:00
Rick Macklem
5d60668c70 Fix the experimental NFS client so that it can create Unix
domain sockets on an NFSv4 mount point. It was generating
incorrect XDR in the request for this case.

Tested by:	infofarmer
MFC after:	2 weeks
2009-12-31 18:02:48 +00:00
Rick Macklem
4a8e21764d When porting the experimental nfs subsystem to the FreeBSD8 krpc,
I added 3 functions that were already in the experimental client
under different names. This patch deletes the functions in the
experimental client and renames the calls to use the other set.
(This is just removal of duplicated code and does not fix any bug.)

MFC after:	2 weeks
2009-12-26 19:15:15 +00:00
Rick Macklem
8da45f2c6e Modify the experimental server so that it uses VOP_ACCESSX().
This is necessary in order to enable NFSv4 ACL support. The
argument to nfsvno_accchk() was changed to an accmode_t and
the function nfsrv_aclaccess() was no longer needed and,
therefore, deleted.

Reviewed by:	trasz
MFC after:	2 weeks
2009-12-25 20:44:19 +00:00
Ed Schouten
8dc9b4cf04 Let access overriding to TTYs depend on the cdev_priv, not the vnode.
Basically this commit changes two things, which improves access to TTYs
in exceptional conditions. Basically the problem was that when you ran
jexec(8) to attach to a jail, you couldn't use /dev/tty (well, also the
node of the actual TTY, e.g. /dev/pts/X). This is very inconvenient if
you want to attach to screens quickly, use ssh(1), etc.

The fixes:

- Cache the cdev_priv of the controlling TTY in struct session. Change
  devfs_access() to compare against the cdev_priv instead of the vnode.
  This allows you to bypass UNIX permissions, even across different
  mounts of devfs.

- Extend devfs_prison_check() to unconditionally expose the device node
  of the controlling TTY, even if normal prison nesting rules normally
  don't allow this. This actually allows you to interact with this
  device node.

To be honest, I'm not really happy with this solution. We now have to
store three pointers to a controlling TTY (s_ttyp, s_ttyvp, s_ttydp).
In an ideal world, we should just get rid of the latter two and only use
s_ttyp, but this makes certian pieces of code very impractical (e.g.
devfs, kern_exit.c).

Reported by:	Many people
2009-12-19 18:42:12 +00:00
Xin LI
d9cf8753e3 Allow using IPv6 in nfsrvd_sentcache() callback.
PR:		kern/141289
Submitted by:	Petr Lampa <lampa fit vutbr cz>
Approved by:	rmacklem
MFC after:	1 week
2009-12-08 23:43:50 +00:00
Guido van Rooij
c53a28102a Fix ntfs such that it understand media with a non-512-bytes sector size:
1. Fixups are always done on 512 byte chunks (in stead of sectors). This
is kind of stupid.
2. Conevrt between NTFS blocknumbers (the blocksize equals the media
sector size) and the bread() and getblk() blocknr (which are 512-byte
sized)

NB: this change should not affect ntfs for 512-byte sector sizes.
2009-12-07 15:15:08 +00:00
Edward Tomasz Napierala
74991298d9 Remove unneeded ifdefs.
Reviewed by:	rmacklem
2009-12-03 18:03:42 +00:00
Edward Tomasz Napierala
92bd961f62 Don't use ap->a_td->td_ucred when we were passed ap->a_cred. 2009-12-02 18:09:22 +00:00
Rick Macklem
38e3ea69d4 Modify the experimental nfs server so that it falls back to
using VOP_LOOKUP() when VFS_VGET() returns EOPNOTSUPP in the
ReaddirPlus RPC. This patch is based upon one by pjd@ for the
regular nfs server which has not yet been committed. It is needed
when a ZFS volume is exported and ReaddirPlus (which almost
always happens for NFSv4) is performed by a client. The patch
also simplifies vnode lock handling somewhat.

MFC after:	2 weeks
2009-11-23 16:08:15 +00:00
Rick Macklem
086f6e0cc7 Patch the experimental NFS server is a manner analagous to
r197525, so that the creation verifier is handled correctly
in va_atime for 64bit architectures. There were two problems.
One was that the code incorrectly assumed that
sizeof (struct timespec) == 8 and the other was that the tv_sec
field needs to be assigned from a signed 32bit integer, so that
sign extension occurs on 64bit architectures. This is required
for correct operation when exporting ZFS volumes.

Reviewed by:	pjd
MFC after:	2 weeks
2009-11-20 21:21:13 +00:00
Jaakko Heinonen
1bb015c07c Create verifier used by FreeBSD NFS client is suboptimal because the
first part of a verifier is set to the first IP address from
V_in_ifaddrhead list. This address is typically the loopback address
making the first part of the verifier practically non-unique. The second
part of the verifier is initialized to zero making its initial value
non-unique too.

This commit changes the strategy for create verifier initialization:
just initialize it to a random value. Also move verifier handling into
its own function and use a mutex to protect the variable.

This change is a candidate for porting to sys/nfsclient.

Reviewed by:	jhb, rmacklem
Approved by:	trasz (mentor)
2009-11-11 15:43:07 +00:00
Attilio Rao
9c76640868 - Improve comments about locking of the "struct fifoinfo" which is a bit
unclear.
- Fix a memory leak [0]

[0] Diagnosed by:	Dorr H. Clark <dclark at engr dot scu dot edu>
MFC:	1 week
2009-11-06 22:29:46 +00:00
Alan Cox
4afcae9ba3 There is no need to "busy" a page when the object is locked for the duration
of the operation.
2009-10-26 18:02:05 +00:00
Ruslan Ermilov
90147b7506 Spell DIAGNOSTIC correctly. 2009-10-24 18:49:17 +00:00
Jaakko Heinonen
dd697cbf66 Unloading of the nfscl module is unsupported because newnfslock doesn't
support unloading. It's not trivial to implement newnfslock unloading so
for now just admit that unloading is unsupported and refuse to attempt
unload in all nfscl module event handlers.

Reviewed by:	rmacklem
Approved by:	trasz (mentor)
2009-10-20 15:06:18 +00:00
Jaakko Heinonen
349af4de7d Fix ordering of nfscl_modevent() and ncl_uninit(). nfscl_modevent() must
be called after ncl_uninit() when unloading the nfscl module because
ncl_uninit() uses ncl_iod_mutex which is destroyed in nfscl_modevent().

Reviewed by:	rmacklem
Approved by:	trasz (mentor)
2009-10-20 15:01:46 +00:00
Jaakko Heinonen
417612e0e4 Fix comment typos.
Reviewed by:	rmacklem
Approved by:	trasz (mentor)
2009-10-20 14:57:26 +00:00
Xin LI
82cf92d483 Add locking around access to parent node, and bail out when the parent
node is already freed rather than panicking the system.

PR:		kern/122038
Submitted by:	gk
Tested by:	pho
MFC after:	1 week
2009-10-11 07:03:56 +00:00
Xin LI
3fa0694aaa Add a special workaround to handle UIO_NOCOPY case. This fixes data
corruption observed when sendfile() is being used.

PR:		kern/127213
Submitted by:	gk
MFC after:	2 weeks
2009-10-07 23:17:15 +00:00
Xin LI
7441ac4618 Fix a bug that causes the fsx test case of mmap'ed page being out of sync
of read/write, inspired by ZFS's counterpart.

PR:		kern/139312
Submitted by:	gk@
MFC after:	1 week
2009-10-04 10:38:04 +00:00
Edward Tomasz Napierala
2c29cfa083 Provide default implementation for VOP_ACCESS(9), so that filesystems which
want to provide VOP_ACCESSX(9) don't have to implement both.  Note that
this commit makes implementation of either of these two mandatory.

Reviewed by:	kib
2009-10-01 17:22:03 +00:00
Edward Tomasz Napierala
5aea82db46 Fix typo in the comment. 2009-09-30 18:50:50 +00:00
Konstantin Belousov
17dfbc1c43 Add per-process osrel node to the procfs, to allow read and set p_osrel
value for the process.

Approved by:	des (procfs maintainer)
MFC after:	3 weeks
2009-09-23 12:08:08 +00:00
Robert Watson
e76d823b81 Use C99 initialization for struct filterops.
Obtained from:	Mac OS X
Sponsored by:	Apple Inc.
MFC after:	3 weeks
2009-09-12 20:03:45 +00:00
Rick Macklem
8f63187ec1 Add LK_NOWITNESS to the vn_lock() calls done on newly created nfs
vnodes, since these nodes are not linked into the mount queue and,
as such, the vn_lock() cannot cause a deadlock so LORs are harmless.

Suggested by:	kib
Approved by:	kib (mentor)
MFC after:	3 days
2009-09-09 20:37:49 +00:00
Poul-Henning Kamp
6778431478 Revert previous commit and add myself to the list of people who should
know better than to commit with a cat in the area.
2009-09-08 13:19:05 +00:00
Poul-Henning Kamp
b34421bf9c Add necessary include. 2009-09-08 13:16:55 +00:00
Konstantin Belousov
481208a815 If a race is detected, pfs_vncache_alloc() may reclaim a vnode that had
never been inserted into the pfs_vncache list. Since pfs_vncache_free()
does not anticipate this case, it decrements pfs_vncache_entries
unconditionally; if the vnode was not in the list, pfs_vncache_entries
will no longer reflect the actual number of list entries. This may cause
size of the cache to exceed the configured maximum. It may also trigger
a panic during module unload or system shutdown.

Do not decrement pfs_vncache_entries for the vnode that was not in the
list.

Submitted by:	tegge
Reviewed by:	des
MFC after:	1 week
2009-09-07 12:10:41 +00:00
Konstantin Belousov
6cc745d2d7 insmntque_stddtr() clears vp->v_data and resets vp->v_op to
dead_vnodeops before calling vgone(). Revert r189706 and corresponding
part of the r186560.

Noted and reviewed by:	tegge
Approved by:	des (pseudofs part)
MFC after:	3 days
2009-09-07 11:55:34 +00:00
Konstantin Belousov
34f83c86f6 Remove spurious pfs_unlock().
PR:	kern/137310
Reviewed by:	des
MFC after:	3 days
2009-08-31 09:26:04 +00:00
Jilles Tjoelker
74d1c4927a Fix poll() on half-closed sockets, while retaining POLLHUP for fifos.
This reverts part of r196460, so that sockets only return POLLHUP if both
directions are closed/error. Fifos get POLLHUP by closing the unused
direction immediately after creating the sockets.

The tools/regression/poll/*poll.c tests now pass except for two other things:
- if POLLHUP is returned, POLLIN is always returned as well instead of only
  when there is data left in the buffer to be read
- fifo old/new reader distinction does not work the way POSIX specs it

Reviewed by:	kib, bde
2009-08-25 21:44:14 +00:00
Marko Zec
0348c661d1 Fix NFS panics with options VIMAGE kernels by apropriately setting curvnet
context inside the RPC code.

Temporarily set td's cred to mount's cred before calling socreate() via
__rpc_nconf2socket().

Submitted by:	rmacklem (in part)
Reviewed by:	rmacklem, rwatson
Discussed with:	dfr, bz
Approved by:	re (rwatson), julian (mentor)
MFC after:	3 days
2009-08-24 10:09:30 +00:00
Rick Macklem
60f04e38ca Apply the same patch as r196205 for nfs_upgrade_lock() and
nfs_downgrade_lock() to the experimental nfs client.

Approved by:	re (kensmith), kib (mentor)
2009-08-17 16:12:28 +00:00
Robert Watson
530c006014 Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks.  Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-08-01 19:26:27 +00:00
John Baldwin
87eca70e0c Fix some LORs between vnode locks and filedescriptor table locks.
- Don't grab the filedesc lock just to read fd_cmask.
- Drop vnode locks earlier when mounting the root filesystem and before
  sanitizing stdin/out/err file descriptors during execve().

Submitted by:	kib
Approved by:	re (rwatson)
MFC after:	1 week
2009-07-31 13:40:06 +00:00
Rick Macklem
6a795918c1 Fix the experimental nfs client so that it only calls ncl_vinvalbuf()
for NFSv2 and not NFSv4 when nfscl_mustflush() returns 0. Since
nfscl_mustflush() only returns 0 when there is a valid write delegation
issued to the client, it only affects the case of an NFSv4 mount with
callbacks/delegations enabled.

Approved by:	 re (kensmith), kib (mentor)
2009-07-29 14:50:31 +00:00
John Baldwin
013818111a Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to
a device pager (OBJT_DEVICE) object in that it uses fictitious pages to
provide aliases to other memory addresses.  The primary difference is that
it uses an sglist(9) to determine the physical addresses for a given offset
into the object instead of invoking the d_mmap() method in a device driver.

Reviewed by:	alc
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-24 13:50:29 +00:00
Rick Macklem
93a15e2054 When vfs.newnfs.callback_addr is set to an IPv4 address, the
experimental NFSv4 client might try and use it as an IPv6 address,
breaking callbacks. The fix simply initializes the isinet6 variable
for this case.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 18:10:44 +00:00
Rick Macklem
c79e697621 Add changes to the experimental nfs client to use the PBDRY flag for
msleep(9) when a vnode lock or similar may be held. The changes are
just a clone of the changes applied to the regular nfs client by
r195703.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 14:37:53 +00:00
Rick Macklem
7f1968ba10 When using an NFSv4 mount in the experimental nfs client with delegations
being issued from the server, there was a case where an Open issued locally
based on the delegation would be released before the associated vnode
became inactive. If the delegation was recalled after the open was released,
an Open against the server would not have been acquired and subsequent I/O
operations would need to use the special stateid of all zeros. This patch
fixes that case.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 14:32:28 +00:00
Rick Macklem
80e556956e Fix two bugs in the experimental nfs client:
- When the root vnode was acquired during mounting, mnt_stat.f_iosize was
  still set to 0, so getnewvnode() would set bo_bsize == 0. This would
  confuse getblk(), so that it always returned the first block causing
  the problem when the root directory of the mount point was greater
  than one block in size. It was fixed by setting mnt_stat.f_iosize to
  NFS_DIRBLKSIZ before calling ncl_nget() to acquire the root vnode.
- NFSMNT_INT was being set temporarily while the initial connect to a
  server was being done. This erroneously configured the krpc for
  interruptible RPCs, which caused problems because signals weren't
  being masked off as they would have been for interruptible mounts.
  This code was deleted to fix the problem. Since mount_nfs does an
  NFS null RPC before the mount system call, connections to the server
  should work ok.

Tested by:	swell dot k at gmail dot com
Approved by:	re (kensmith), kib (mentor)
2009-07-19 16:44:26 +00:00
Rick Macklem
405229f913 Fix the experimental nfs client so that it does not cause a
"share->excl" panic when doing a lookup of dotdot at the root
of a server's file system. The patch avoids calling vn_lock()
for that case, since nfscl_nget() has already acquired a lock
for the vnode.

Approved by:	re (kensmith), kib (mentor)
2009-07-14 23:10:23 +00:00
Robert Watson
eddfbb763d Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator.  Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...).  This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack.  Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory.  Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy.  Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address.  When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by:  bz
Reviewed by:            bz, zec
Discussed with:         gnn, jamie, jeff, jhb, julian, sam
Suggested by:           peter
Approved by:            re (kensmith)
2009-07-14 22:48:30 +00:00
Rick Macklem
089f366ab0 Add calls to the experimental nfs client for the case of an "intr" mount,
so that signals that aren't supposed to terminate RPCs in progress are
masked off during the RPC.

Approved by:	re (kensmith), kib (mentor)
2009-07-12 17:07:35 +00:00
Rick Macklem
ad86aef9af Fix the handling of dotdot in lookup for the experimental nfs client
in a manner analagous to the change in r195294 for the regular nfs client.

Approved by:	re (kensmith), kib (mentor)
2009-07-12 17:02:17 +00:00
Rick Macklem
9ca27b565b Since the nfscl_getclose() function both decremented open counts and,
optionally, created a separate list of NFSv4 opens to be closed, it
was possible for the associated OpenOwner to be free'd before the Open
was closed. The problem was that the Open was taken off the OpenOwner
list before the Close RPC was done and OpenOwners can be free'd once the
list is empty. This patch separates out the case of doing the Close RPC
into a separate function called nfscl_doclose() and simplifies nfsrpc_doclose()
so that it closes a single open instead of a list of them. This avoids
removing the Open from the OpenOwner list before doing the Close RPC.

Approved by:	re (kensmith), kib (mentor)
2009-07-09 19:00:29 +00:00
Konstantin Belousov
7f5dff5064 Fix poll(2) and select(2) for named pipes to return "ready for read"
when all writers, observed by reader, exited. Use writer generation
counter for fifo, and store the snapshot of the fifo generation in the
f_seqcount field of struct file, that is otherwise unused for fifos.
Set FreeBSD-undocumented POLLINIGNEOF flag only when file f_seqcount is
equal to fifo' fi_wgen, and revert r89376.

Fix POLLINIGNEOF for sockets and pipes, and return POLLHUP for them.
Note that the patch does not fix not returning POLLHUP for fifos.

PR:	kern/94772
Submitted by:	bde (original version)
Reviewed by:	rwatson, jilles
Approved by:	re (kensmith)
MFC after:	6 weeks (might be)
2009-07-07 09:43:44 +00:00
Konstantin Belousov
f1eccd05ec In vn_vget_ino() and their inline equivalents, mnt_ref() the mount point
around the sequence that drop vnode lock and then busies the mount point.
Not having vlocked node or direct reference to the mp allows for the
forced unmount to proceed, making mp unmounted or reused.

Tested by:	pho
Reviewed by:	jeff
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-02 18:02:55 +00:00
Konstantin Belousov
9f80ce043d Change the type of uio_resid member of struct uio from int to ssize_t.
Note that this does not actually enable full-range i/o requests for
64 architectures, and is done now to update KBI only.

Tested by:	pho
Reviewed by:	jhb, bde (as part of the review of the bigger patch)
2009-06-25 18:46:30 +00:00
Robert Watson
2d9cfabad4 Add a new global rwlock, in_ifaddr_lock, which will synchronize use of the
in_ifaddrhead and INADDR_HASH address lists.

Previously, these lists were used unsynchronized as they were effectively
never changed in steady state, but we've seen increasing reports of
writer-writer races on very busy VPN servers as core count has gone up
(and similar configurations where address lists change frequently and
concurrently).

For the time being, use rwlocks rather than rmlocks in order to take
advantage of their better lock debugging support.  As a result, we don't
enable ip_input()'s read-locking of INADDR_HASH until an rmlock conversion
is complete and a performance analysis has been done.  This means that one
class of reader-writer races still exists.

MFC after:      6 weeks
Reviewed by:    bz
2009-06-25 11:52:33 +00:00
Konstantin Belousov
3364c323e6 Implement global and per-uid accounting of the anonymous memory. Add
rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved
for the uid.

The accounting information (charge) is associated with either map entry,
or vm object backing the entry, assuming the object is the first one
in the shadow chain and entry does not require COW. Charge is moved
from entry to object on allocation of the object, e.g. during the mmap,
assuming the object is allocated, or on the first page fault on the
entry. It moves back to the entry on forks due to COW setup.

The per-entry granularity of accounting makes the charge process fair
for processes that change uid during lifetime, and decrements charge
for proper uid when region is unmapped.

The interface of vm_pager_allocate(9) is extended by adding struct ucred *,
that is used to charge appropriate uid when allocation if performed by
kernel, e.g. md(4).

Several syscalls, among them is fork(2), may now return ENOMEM when
global or per-uid limits are enforced.

In collaboration with:	pho
Reviewed by:	alc
Approved by:	re (kensmith)
2009-06-23 20:45:22 +00:00
Konstantin Belousov
c808c9632d Add explicit struct ucred * argument for VOP_VPTOCNP, to be used by
vn_open_cred in default implementation. Valid struct ucred is needed for
audit and MAC, and curthread credentials may be wrong.

This further requires modifying the interface of vn_fullpath(9), but it
is out of scope of this change.

Reviewed by:	rwatson
2009-06-21 19:21:01 +00:00
Roman Divacky
23057f089b In non-debugging mode make this define (void)0 instead of nothing. This
helps to catch bugs like the below with clang.

	if (cond);		<--- note the trailing ;
	   something();

Approved by:	ed (mentor)
Discussed on:	current@
2009-06-21 08:36:30 +00:00
Rick Macklem
65cc6600c5 Replace RPCAUTH_UNIXGIDS with NFS_MAXGRPS so that nfscbd.c will build.
Approved by:	kib (mentor)
2009-06-20 17:11:07 +00:00
Ed Schouten
f8f6146082 Improve nested jail awareness of devfs by handling credentials.
Now that we start to use credentials on character devices more often
(because of MPSAFE TTY), move the prison-checks that are in place in the
TTY code into devfs.

Instead of strictly comparing the prisons, use the more common
prison_check() function to compare credentials. This means that
pseudo-terminals are only visible in devfs by processes within the same
jail and parent jails.

Even though regular users in parent jails can now interact with
pseudo-terminals from child jails, this seems to be the right approach.
These processes are also capable of interacting with the jailed
processes anyway, through signals for example.

Reviewed by:	kib, rwatson (older version)
2009-06-20 14:50:32 +00:00
Rick Macklem
2c1e6cce5b Change the size of the nfsc_groups[] array in the experimental nfs
client to RPCAUTH_UNIXGIDS + 1 (17), since that is what can go on
the wire for AUTH_SYS authentication.

Reviewed by:	brooks
Approved by:	kib (mentor)
2009-06-20 00:54:57 +00:00
Brooks Davis
838d985825 Rework the credential code to support larger values of NGROUPS and
NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024
and 1023 respectively.  (Previously they were equal, but under a close
reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it
is the number of supplemental groups, not total number of groups.)

The bulk of the change consists of converting the struct ucred member
cr_groups from a static array to a pointer.  Do the equivalent in
kinfo_proc.

Introduce new interfaces crcopysafe() and crsetgroups() for duplicating
a process credential before modifying it and for setting group lists
respectively.  Both interfaces take care for the details of allocating
groups array. crsetgroups() takes care of truncating the group list
to the current maximum (NGROUPS) if necessary.  In the future,
crsetgroups() may be responsible for insuring invariants such as sorting
the supplemental groups to allow groupmember() to be implemented as a
binary search.

Because we can not change struct xucred without breaking application
ABIs, we leave it alone and introduce a new XU_NGROUPS value which is
always 16 and is to be used or NGRPS as appropriate for things such as
NFS which need to use no more than 16 groups.  When feasible, truncate
the group list rather than generating an error.

Minor changes:
  - Reduce the number of hand rolled versions of groupmember().
  - Do not assign to both cr_gid and cr_groups[0].
  - Modify ipfw to cache ucreds instead of part of their contents since
    they are immutable once referenced by more than one entity.

Submitted by:	Isilon Systems (initial implementation)
X-MFC after:	never
PR:		bin/113398 kern/133867
2009-06-19 17:10:35 +00:00
Alan Cox
57a7e73261 Fix some of the style errors in *getpages(). 2009-06-18 05:56:24 +00:00
Rick Macklem
76b30a0cd4 Add the SVC_RELEASE(xprt), as required by r194407.
Approved by:	kib (mentor)
2009-06-17 22:55:59 +00:00
Bjoern A. Zeeb
ebd8672cc3 Add explicit includes for jail.h to the files that need them and
remove the "hidden" one from vimage.h.
2009-06-17 15:01:01 +00:00
Rick Macklem
81e3c4fc8e Fix handling of ".." in nfs_lookup() for the forced dismount case
by cribbing the change made to the regular nfs client in r194358.

Approved by:	kib (mentor)
2009-06-17 14:10:18 +00:00
Bjoern A. Zeeb
7654a365db Add the explicit include of vimage.h to another five .c files still
missing it.

Remove the "hidden" kernel only include of vimage.h from ip_var.h added
with the very first Vimage commit r181803 to avoid further kernel poisoning.
2009-06-17 12:44:11 +00:00
Rick Macklem
47b7dc9933 Remove the "int *" typecast for the aresid argument to vn_rdwr()
and change the type of the argument from size_t to int. This
should avoid issues on 64bit architectures.

Suggested by:	kib
Approved by:	kib (mentor)
2009-06-16 13:52:21 +00:00
Alan Cox
47f11d9a46 Eliminate unnecessary variables. 2009-06-13 20:21:08 +00:00
Jamie Gritton
c1f192193d Rename the host-related prison fields to be the same as the host.*
parameters they represent, and the variables they replaced, instead of
abbreviated versions of them.

Approved by:	bz (mentor)
2009-06-13 15:39:12 +00:00
Jamie Gritton
01de879ac7 Use getcredhostuuid instead of accessing the prison directly.
Approved by:	bz (mentor)
2009-06-13 15:35:22 +00:00
John Baldwin
04f7f4636f Update the inline version of vn_get_ino() for ".." lookups to match the
recentish changes to vn_get_ino().

MFC after:	1 week
2009-06-12 21:19:57 +00:00
Rick Macklem
934a309971 This commit is analagous to r193952, but for the experimental nfs
subsystem. Add a test for VI_DOOMED just after ncl_upgrade_vnlock() in
ncl_bioread_check_cons(). This is required since it is possible
for the vnode to be vgonel()'d while in ncl_upgrade_vnlock() when
a forced dismount is in progress. Also, move the check for VI_DOOMED
in ncl_vinvalbuf() down to after ncl_upgrade_vnlock() and replace the
out of date comment for it.

Approved by:	kib (mentor)
2009-06-10 21:16:39 +00:00
Konstantin Belousov
93bc76dc3e For cd9660_ioctl, check for recycled vnode after locking it.
Noted by:	Jaakko Heinonen <jh saunalahti fi>
MFC after:	2 weeks
2009-06-10 15:48:34 +00:00
Konstantin Belousov
d6da640860 Fix r193923 by noting that type of a_fp is struct file *, not int.
It was assumed that r193923 was trivial change that cannot be done
wrong.

MFC after:	2 weeks
2009-06-10 14:24:31 +00:00
Konstantin Belousov
e4d9bdc105 s/a_fdidx/a_fp/ for VOP_OPEN comments that inline struct vop_open_args
definition.

Discussed with:	bde
MFC after:	2 weeks
2009-06-10 14:09:05 +00:00
Konstantin Belousov
c4702e66f4 Remove unused VOP_IOCTL and VOP_KQFILTER implementations for fifofs.
MFC after:	2 weeks
2009-06-10 14:02:22 +00:00
Konstantin Belousov
c4df27d5c8 VOP_IOCTL takes unlocked vnode as an argument. Due to this, v_data may
be NULL or derefenced memory may become free at arbitrary moment.

Lock the vnode in cd9660, devfs and pseudofs implementation of VOP_IOCTL
to prevent reclaim; check whether the vnode was already reclaimed after
the lock is granted.

Reported by:	georg at dts su
Reviewed by:	des (pseudofs)
MFC after:	2 weeks
2009-06-10 13:57:36 +00:00
Rick Macklem
410654ec1b Since vn_lock() with the LK_RETRY flag never returns an error
for FreeBSD-CURRENT, the code that checked for and returned the
error was broken. Change it to check for VI_DOOMED set after
vn_lock() and return an error for that case. I believe this
should only happen for forced dismounts.

Approved by:	kib (mentor)
2009-06-09 15:18:01 +00:00
Rick Macklem
2fa32154e4 Fix nfscl_getcl() so that it doesn't crash when it is called to
do an NFSv4 Close operation with the cred argument NULL. Also,
clarify what NULL arguments mean in the function's comment.

Approved by:	kib (mentor)
2009-06-08 18:41:23 +00:00
Robert Watson
dde155e95a Use #ifdef APPLE_MAC instead of #ifdef MAC to conditionalize Apple-specific
behavior for unicode support in UDF so as not to conflict with the MAC
Framework.

Note that Apple's XNU kernel also uses #ifdef MAC for the MAC Framework.

Suggested by:	pjd
MFC after:	3 days
2009-06-06 07:13:57 +00:00
Dag-Erling Smørgrav
c097b30885 Drop Giant.
MFC after:	1 week
2009-06-06 00:44:13 +00:00
Robert Watson
bcf11e8d00 Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.

Discussed with:	pjd
2009-06-05 14:55:22 +00:00
Robert Watson
37ba986a5f Don't check MAC in the NFS server ACL set path, right now we aren't
enforcing MAC for NFS clients.
2009-06-05 14:15:00 +00:00
Robert Watson
927e0a56ce Re-add opt_mac.h include, which is required in order for MNT_MULTILABEL
to be set properly on devfs.  Otherwise, it isn't possible to set labels
on /dev nodes.

Reported by:	Sergio Rodriguez <sergiorr at yahoo.com>
MFC after:	3 days
2009-06-04 10:30:18 +00:00
Alan Cox
1f17689408 nfs_write() can use the recently introduced vfs_bio_set_valid() instead of
vfs_bio_set_validclean(), thereby avoiding the page queues lock.

Garbage collect vfs_bio_set_validclean().  Nothing uses it any longer.
2009-05-31 20:18:02 +00:00
Konstantin Belousov
b00098d164 Unlock the pseudofs vnode before calling fill method for pfs_readlink().
The fill code may need to lock another vnode, e.g. procfs file
implementation.

Reviewed by:	des
Tested by:	pho
MFC after:	2 weeks
2009-05-31 15:01:50 +00:00
Konstantin Belousov
b0f34bb643 Implement the bypass routine for VOP_VPTOCNP in nullfs.
Among other things, this makes procfs <pid>/file working for executables
started from nullfs mount.

Tested by:	pho
PR:	94269, 104938
2009-05-31 14:58:43 +00:00
Konstantin Belousov
b9131889d2 Do not drop vnode interlock in null_checkvp(). null_lock() verifies that
v_data is not-null before calling NULLVPTOLOWERVP(), and dropping the
interlock allows for reclaim to clean v_data and free the memory.

While there, remove unneeded semicolons and convert the infinite loops
to panics. I have a will to remove null_checkvp() altogether, or leave
it as a trivial stub, but not now.

Reported and tested by:	pho
2009-05-31 14:54:20 +00:00
Konstantin Belousov
cec9ed6d7f Lock the real null vnode lock before substitution of vp->v_vnlock.
This should not really matter for correctness, since vp->v_lock is
not locked before the call, and null_lock() holds the interlock,
but makes the control flow for reclaim more clear.

Tested by:	pho
2009-05-31 14:52:45 +00:00
Marko Zec
705fe7ce35 Unbreak options VIMAGE kernel builds.
Approved by:	julian (mentor)
2009-05-31 11:57:51 +00:00
Rick Macklem
d00a615a98 Add a check to v_type == VREG for the recently modified code that
does NFSv4 Closes in the experimental client's VOP_INACTIVE().
I also replaced a bunch of ap->a_vp with a local copy of vp,
because I thought that made it more readable.

Approved by:	kib (mentor)
2009-05-30 22:11:12 +00:00
Edward Tomasz Napierala
c97fcdba57 Add VOP_ACCESSX, which can be used to query for newly added V*
permissions, such as VWRITE_ACL.  For a filsystems that don't
implement it, there is a default implementation, which works
as a wrapper around VOP_ACCESS.

Reviewed by:	rwatson@
2009-05-30 13:59:05 +00:00
Jamie Gritton
76ca6f88da Place hostnames and similar information fully under the prison system.
The system hostname is now stored in prison0, and the global variable
"hostname" has been removed, as has the hostname_mtx mutex.  Jails may
have their own host information, or they may inherit it from the
parent/system.  The proper way to read the hostname is via
getcredhostname(), which will copy either the hostname associated with
the passed cred, or the system hostname if you pass NULL.  The system
hostname can still be accessed directly (and without locking) at
prison0.pr_host, but that should be avoided where possible.

The "similar information" referred to is domainname, hostid, and
hostuuid, which have also become prison parameters and had their
associated global variables removed.

Approved by:	bz (mentor)
2009-05-29 21:27:12 +00:00
Alan Cox
3933ec4d15 Make *getpages()s' assertion on the state of each page's dirty bits
stricter.
2009-05-28 18:11:09 +00:00
Dag-Erling Smørgrav
26088b9d4d Use a temporary variable to avoid a duplicate strlen().
Submitted by:	kib
MFC after:	1 week
2009-05-28 10:24:26 +00:00
Rick Macklem
fdd88deeba Fix handling of NFSv4 Close operations in ncl_inactive(). Only
do them for NFSv4 and flush writes to the server before doing
the Close(s), as required. Also, use the a_td argument instead of
curthread.

Approved by:	kib (mentor)
2009-05-27 19:41:29 +00:00
Alan Cox
e2d0be0172 Eliminate redundant setting of a page's valid bits and pointless clearing
of the same page's dirty bits.
2009-05-27 18:12:10 +00:00
Rick Macklem
ee1c1433e3 Add a function to the experimental nfs subsystem that tests to see
if a local file system supports NFSv4 ACLs. This allows the
NFSHASNFS4ACL() macro to be correctly implemented. The NFSv4 ACL
support should now work when the server exports a ZFS volume.

Approved by:	kib (mentor)
2009-05-27 15:16:56 +00:00
Jamie Gritton
0304c73163 Add hierarchical jails. A jail may further virtualize its environment
by creating a child jail, which is visible to that jail and to any
parent jails.  Child jails may be restricted more than their parents,
but never less.  Jail names reflect this hierarchy, being MIB-style
dot-separated strings.

Every thread now points to a jail, the default being prison0, which
contains information about the physical system.  Prison0's root
directory is the same as rootvnode; its hostname is the same as the
global hostname, and its securelevel replaces the global securelevel.
Note that the variable "securelevel" has actually gone away, which
should not cause any problems for code that properly uses
securelevel_gt() and securelevel_ge().

Some jail-related permissions that were kept in global variables and
set via sysctls are now per-jail settings.  The sysctls still exist for
backward compatibility, used only by the now-deprecated jail(2) system
call.

Approved by:	bz (mentor)
2009-05-27 14:11:23 +00:00
Rick Macklem
c3e22f831f Fix the experimental nfs subsystem so that it builds with the
current NFSv4 ACLs, as defined in sys/acl.h. It still needs a
way to test a mount point for NFSv4 ACL support before it will
work. Until then, the NFSHASNFS4ACL() macro just always returns 0.

Approved by:	kib (mentor)
2009-05-26 22:21:53 +00:00
Edward Tomasz Napierala
ff392a1d87 Adapt to the new ACL #define names.
Reviewed by:	rmacklem@
2009-05-26 17:01:00 +00:00
Rick Macklem
10a5a16efa Add two sysctl variables to the experimental nfs server, so
that the range of versions of NFS handled by the server can
be limited. The nfsd daemon must be restarted after these
sysctl variables are changed, in order for the change to take
effect.

Approved by:	kib (mentor)
2009-05-26 01:47:37 +00:00
Rick Macklem
9183a2a33e Fix the handling of NFSv4 Illegal Operation number to conform
to RFC3530 (the operation number in the reply must be set to
the value for OP_ILLEGAL). Also cleaned up some indentation.

Approved by:	kib (mentor)
2009-05-26 01:16:09 +00:00
Rick Macklem
415670e4c2 Fix the experimental nfs server's interface to the new krpc so
that it handles the case of a non-exported NFSv4 root correctly.
Also, delete handling for the case where nd_repstat is already
set in nfs_proc(), since that no longer happens.

Approved by:	kib (mentor)
2009-05-26 01:09:33 +00:00
Rick Macklem
b1cfc0d961 Add NFSv4 root export checks to the DelegPurge, Renew and
ReleaseLockOwner operations analagous to what is already
in place for SetClientID and SetClientIDConfirm. These are
the five NFSv4 operations that do not use file handle(s),
so the checks are done using the NFSv4 root export entries
in /etc/exports.

Approved by:	kib (mentor)
2009-05-25 01:00:09 +00:00
Rick Macklem
6098ac23c1 Temporarily #undef NFS4_ACL_EXTATTR_NAME, so that the
experimental nfs subsystem will build while the NFSv4 ACL
support is going into the kernel.

Approved by:	kib (mentor)
2009-05-24 23:47:22 +00:00
Rick Macklem
05c965a221 Crib the realign function out of nfs_krpc.c and add a call
to it for the client side reply. Hopefully this fixes the
problem with using the new krpc for arm for the experimental
nfs client.

Approved by:	kib (mentor)
2009-05-24 19:46:12 +00:00
Rick Macklem
c1edc4480e Fix the experimental NFSv4 server so that it handles the case
where a client is not allowed NFSv4 access correctly. This
restriction is specified in the "V4: ..." line(s) in
/etc/exports.

Approved by:	kib (mentor)
2009-05-24 18:49:53 +00:00
Rick Macklem
63bde62e9a Fix the experimental nfsv4 client so that it works for the
case of a kerberized mount without a host based principal
name. This will only work for mounts being done by a user
other than root. Support for a host based principal name
will not work until proposed changes to the rpcsec_gss part
of the krpc are committed. It now builds for "options KGSSAPI".

Approved by:	kib (mentor)
2009-05-24 03:22:49 +00:00
Alan Cox
5662349ed5 Eliminate the unnecessary clearing of a page's dirty bits from
nwfs_getpages().
2009-05-23 18:25:11 +00:00
Rick Macklem
e2b84e0308 Fix the rpc_gss_secfind() call in nfs_commonkrpc.c so that
the code will build when "options KGSSAPI" is specified
without requiring the proposed changes that add host based
initiator principal support. It will not handle the case where
the client uses a host based initiator principal until those
changes are committed. The code that uses those changes is
#ifdef'd notyet until the krpc rpcsec_changes are committed.

Approved by:	kib (mentor)
2009-05-23 00:40:17 +00:00
Rick Macklem
b66124f314 Change the sysctl_base argument to svcpool_create() to NULL for
client side callbacks so that leaf names are not re-used,
since they are already being used by the server.

Approved by:	kib (mentor)
2009-05-22 23:22:56 +00:00
Rick Macklem
9dd7554f09 Fix the name of the module common to the client and server
in the experimental nfs subsystem to the correct one for
the MODULE_DEPEND() macro.

Approved by:	kib (mentor)
2009-05-22 20:55:29 +00:00
Rick Macklem
ed41a7ccee Change the printf of r192595 to identify the function,
as requested by Sam.

Approved by:	kib (mentor)
2009-05-22 19:05:48 +00:00
Rick Macklem
476174008c Modified the printf message of r192590 to remove the
possible DOS attack, as suggested by Sam.

Approved by:	kib (mentor)
2009-05-22 18:10:39 +00:00
Rick Macklem
8757104e6b Change the comment at the beginning of the function to reflect the
change from panic() to printf() done by r192588.
2009-05-22 16:46:01 +00:00
Rick Macklem
199685bca9 Change the reboot panic that would have occurred if clientid
numbers wrapped around to a printf() warning of a possible
DOS attack, in the experimental nfsv4 server.

Approved by:	kib (mentor)
2009-05-22 16:41:33 +00:00
Rick Macklem
682eec57c4 Modify the mount handling code in the experimental nfs client to
use the newer nmount() style arguments, as is used by mount_nfs.c.
This prepares the kernel code for the use of a mount_nfs.c with
changes for the experimental client integrated into it.

Approved by:	kib (mentor)
2009-05-22 15:08:12 +00:00
Rick Macklem
6ffb637d4a Change the code in the experimental nfs client to avoid flushing
writes upon close when a write delegation is held by the client.
This should be safe to do, now that nfsv4 Close operations are
delayed until ncl_inactive() is called for the vnode.

Approved by:	kib (mentor)
2009-05-22 15:01:47 +00:00
Rick Macklem
7951235502 Fix the comment in sys/fs/nfs/nfs.h to correctly reflect the
current use of the R_xxx flags. This changed when the
NFS_LEGACYRPC code was removed from the subsystem.

Approved by:	kib (mentor)
2009-05-22 14:53:26 +00:00
Robert Watson
86ce6a83d1 Remove the unmaintained University of Michigan NFSv4 client from 8.x
prior to 8.0-RELEASE.  Rick Macklem's new and more feature-rich NFSv234
client and server are replacing it.

Discussed with:	rmacklem
2009-05-22 12:35:12 +00:00
Rick Macklem
92f7f12bca Fix the experimental nfs server so that it depends on the nlm,
since it now calls nlm_acquire_next_sysid().

Approved by:	kib (mentor)
2009-05-22 01:15:07 +00:00
Rick Macklem
bac9ff3446 Fix the comment at line 3711 to be consistent with the change
applied for r192537.

Approved by:	kib (mentor)
2009-05-21 14:52:36 +00:00
Rick Macklem
b839e625b0 Modify sys/fs/nfsserver/nfs_nfsdport.c to use nlm_acquire_next_sysid()
to set the l_sysid for locks correctly.

Approved by:	kib (mentor)
2009-05-21 01:50:27 +00:00
Rick Macklem
29e890f126 Although it should never happen, all the nfsv4 server can do
when it runs out of clientids is reboot. I had replaced cpu_reboot()
with printf(), since cpu_reboot() doesn't exist for sparc64.
This change replaces the printf() with panic(), so the reboot
would occur for this highly unlikely occurrence.

Approved by:	kib (mentor)
2009-05-20 18:58:07 +00:00
Rick Macklem
47a598563d Change the experimental NFSv4 client so that it does not do
the NFSv4 Close operations until ncl_inactive(). This is
necessary so that the Open StateIDs are available for doing
I/O on mmap'd files after VOP_CLOSE(). I also changed some
indentation for the nfscl_getclose() function.

Approved by:	kib (mentor)
2009-05-18 21:22:03 +00:00
Rick Macklem
2c1b26b976 Fix the acquisition of local locks via VOP_ADVLOCK() by the
experimental nfsv4 server. It was setting the a_id argument
to a fixed value, but that wasn't sufficient for FreeBSD8.
Instead, set l_pid and l_sysid to 0 plus set the F_REMOTE
flag to indicate that these fields are used to check for
same lock owner. Since, for NFSv4, a lockowner is a ClientID plus
an up to 1024byte name, it can't be put in l_sysid easily.
I also renamed the p variable to td, since it's a thread ptr.

Approved by:	kib (mentor)
2009-05-17 19:33:48 +00:00
Rick Macklem
57d1e46484 Added a SYSCTL to sys/fs/nfsserver/nfs_nfsdport.c so that the value of
nfsrv_dolocallocks can be changed via sysctl. I also added some non-empty
descriptor strings and reformatted some overly long lines.

Approved by:	kib (mentor)
2009-05-17 17:54:01 +00:00
Alan Cox
f2eb372e02 Merge r191964: Eliminate a case of unnecessary page queues locking. 2009-05-17 06:45:30 +00:00
Rick Macklem
72d1bbba52 Changed sys/fs/nfs_clbio.c in the same way Alan Cox changed
sys/nfsclient/nfs_bio.c for r192134, so that the sources stay
in sync.

Approved by:	kib (mentor)
2009-05-16 22:31:38 +00:00
Rick Macklem
15e8331f0e Fixed the Null callback RPCs so that they work with the new krpc. This
required two changes: setting the program and version numbers before
connect and fixing the handling of the Null Rpc case in newnfs_request().

Approved by:	kib (mentor)
2009-05-16 03:12:55 +00:00
Rick Macklem
0c794c5db1 Move the nfsstat structure and proc/op number definitions on the
experimental nfs subsystem from sys/fs/nfs/nfs.h and sys/fs/nfs/nfsproto.h
to sys/fs/nfs/nfsport.h and rename nfsstat to ext_nfsstat. This was done
so that src/usr.bin/nfsstat.c could use it alongside the regular nfs
include files and struct nfsstat.

Approved by:	kib (mentor)
2009-05-15 19:33:59 +00:00
Konstantin Belousov
0e9bd89d7d Devfs replaces file ops vector with devfs-specific one in devfs_open(),
before the struct file is fully initialized in vn_open(), in particular,
fp->f_vnode is NULL. Other thread calling file operation before f_vnode
is set results in NULL pointer dereference in devvn_refthread().

Initialize f_vnode before calling d_fdopen() cdevsw method, that might
set file ops too.

Reported and tested by:	Chris Timmons <cwt networks cwu edu>
	(RELENG_7 version)
MFC after:	3 days
2009-05-15 19:23:05 +00:00
Rick Macklem
d34b41c9c2 Modify the diskless booting code in sys/fs/nfsclient to be compatible
with what is in sys/nfsclient, so that it will at least build now.

Approved by:	kib (mentor)
2009-05-15 16:03:11 +00:00
Alan Cox
42eb41087c Eliminate unnecessary clearing of the page's dirty mask from various
getpages functions.

Eliminate a stale comment.
2009-05-15 04:33:35 +00:00
Rick Macklem
98ad44534e Apply changes to the experimental nfs server so that it uses the security
flavors as exported in FreeBSD-CURRENT. This allows it to use a
slightly modified mountd.c instead of a different utility.

Approved by:	kib (mentor)
2009-05-14 21:39:08 +00:00
Rick Macklem
cec6336717 Change the file names in the comments in sys/fs/nfs/nfs_var.h so
that they are the names used in FreeBSD-CURRENT. Also shuffled a
few entries around, so that they under the correct comment.

Approved by:	kib (mentor)
2009-05-14 20:39:09 +00:00
Rick Macklem
a770e183dd Apply a one line change to nfs_clbio.c (which is largely a copy
of sys/nfsclient/nfs_bio.c) to track the change recently committed
by acl for nfs_bio.c.

Approved by:	kib (mentor)
2009-05-13 21:18:34 +00:00
Rick Macklem
7e74551956 Modify the experimental nfs server to use the new nfsd_nfsd_args
structure for nfsd. Includes a change that clarifies the use of
an empty principal name string to indicate AUTH_SYS only.

Approved by:	kib (mentor)
2009-05-12 16:04:51 +00:00
Konstantin Belousov
3fe65eb8fe Report all fdescfs vnodes as VCHR for stat(2). Fake the unique
major/minor numbers of the devices.

Pretending that the vnodes are character devices prevents file tree
walkers from descending into the directories opened by current process.
Also, not doing stat on the filedescriptors prevents the recursive entry
into the VFS.

Requested by:	kientzle
Discussed with:	Jilles Tjoelker <jilles stack nl>
2009-05-12 09:28:45 +00:00
Konstantin Belousov
ada3b6a9ea Return controlled EINVAL when the fdescfs lookup routine is given string
representing too large integer, instead of overflowing and possibly
returning a random but valid vnode.

Noted by:	Jilles Tjoelker <jilles stack nl>
MFC after:	3 days
2009-05-12 09:22:33 +00:00
Alan Cox
12aa4fdca9 Eliminate gratuitous clearing of the page's dirty mask. 2009-05-12 05:49:02 +00:00
Rick Macklem
1c6c0ed937 Change the name of the nfs server addsock structure from nfsd_args
to nfsd_addsock_args, so that it is consistent with the one in
	sys/nfsserver/nfs.h.

Approved by:	kib (mentor)
2009-05-11 19:37:05 +00:00
Rick Macklem
70839889c6 Modify nfsvno_fhtovp() to ensure that it always sets the credp
argument. Returning without credp set could result in a caller
	doing crfree() on garbage.

Reviewed by:	kan
Approved by:	kib (mentor)
2009-05-11 18:45:04 +00:00
Attilio Rao
dfd233edd5 Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS.  Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled.  Bump __FreeBSD_version in order to signal such
situation.
2009-05-11 15:33:26 +00:00
Alan Cox
f45cc06eb1 Eliminate stale comments.
Eliminate a case of unnecessary page queues locking.
2009-05-10 17:05:43 +00:00
Alexander Kabaev
5679fe1957 Do not embed struct ucred into larger netcred parent structures.
Credential might need to hang around longer than its parent and be used
outside of mnt_explock scope controlling netcred lifetime. Use separate
reference-counted ucred allocated separately instead.

While there, extend mnt_explock coverage in vfs_stdexpcheck and clean-up
some unused declarations in new NFS code.

Reported by:	John Hickey
PR:		kern/133439
Reviewed by:	dfr, kib
2009-05-09 18:09:17 +00:00
Rick Macklem
9ec7b004d0 Add the experimental nfs subtree to the kernel, that includes
support for NFSv4 as well as NFSv2 and 3.
	It lives in 3 subdirs under sys/fs:
	nfs - functions that are common to the client and server
	nfsclient - a mutation of sys/nfsclient that call generic functions
	to do RPCs and handle state. As such, it retains the
	buffer cache handling characteristics and vnode semantics that
	are found in sys/nfsclient, for the most part.
	nfsserver - the server. It includes a DRC designed specifically for
	NFSv4, that is used instead of the generic DRC in sys/rpc.
	The build glue will be checked in later, so at this point, it
	consists of 3 new subdirs that should not affect kernel building.

Approved by:	kib (mentor)
2009-05-04 15:23:58 +00:00
Robert Watson
885868cd8f Remove VOP_LEASE and supporting functions. This hasn't been used since
the removal of NQNFS, but was left in in case it was required for NFSv4.
Since our new NFSv4 client and server can't use it for their
requirements, GC the old mechanism, as well as other unused lease-
related code and interfaces.

Due to its impact on kernel programming and binary interfaces, this
change should not be MFC'd.

Proposed by:    jeff
Reviewed by:    jeff
Discussed with: rmacklem, zach loafman @ isilon
2009-04-10 10:52:19 +00:00
Dag-Erling Smørgrav
4970b8ae0a Remove spurious locking in pfs_write().
Reported by:	Andrew Brampton <me@bramp.net>
MFC after:	1 week
2009-04-08 09:02:42 +00:00
Dag-Erling Smørgrav
59fc1b068f Fix an inverted KASSERT. Add similar assertions in other similar places.
Reported by:	Andrew Brampton <me@bramp.net>
MFC after:	1 week
2009-04-07 16:13:10 +00:00
Peter Holm
aa73f8c7a2 Do not use null_bypass for VOP_ISLOCKED, directly call default
implementation. null_bypass cannot work for the !nullfs-vnodes, in
particular, for VBAD vnodes.

In collaboration with:	kib
2009-03-18 13:54:35 +00:00
Attilio Rao
b13ec5e016 Remove the null_islocked() overloaded vop because the standard one does
the same.
2009-03-13 07:09:20 +00:00
John Baldwin
33fc362512 Add a new internal mount flag (MNTK_EXTENDED_SHARED) to indicate that a
filesystem supports additional operations using shared vnode locks.
Currently this is used to enable shared locks for open() and close() of
read-only file descriptors.
- When an ISOPEN namei() request is performed with LOCKSHARED, use a
  shared vnode lock for the leaf vnode only if the mount point has the
  extended shared flag set.
- Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but
  not O_CREAT.
- Use a shared vnode lock around VOP_CLOSE() if the file was opened with
  O_RDONLY and the mountpoint has the extended shared flag set.
- Adjust md(4) to upgrade the vnode lock on the vnode it gets back from
  vn_open() since it now may only have a shared vnode lock.
- Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since
  FIFO's require exclusive vnode locks for their open() and close()
  routines.  (My recent MPSAFE patches for UDF and cd9660 already included
  this change.)
- Enable extended shared operations on UFS, cd9660, and UDF.

Submitted by:	ups
Reviewed by:	pjd (ZFS bits)
MFC after:	1 month
2009-03-11 14:13:47 +00:00
Konstantin Belousov
52dc9305aa Enable advisory file locking for devfs vnodes.
Reported by:	Timothy Redaelli <timothy redaelli eu>
MFC after:	1 week
2009-03-11 12:53:16 +00:00
Konstantin Belousov
062ef8a5f8 Do not use bypass for vop_vptocnp() from nullfs, call standard
implementation instead. The bypass does not assume that returned vnode
is only held.

Reported by:	Paul B. Mahol <onemda gmail com>, pluknet <pluknet gmail com>
Reviewed by:	jhb
Tested by:	pho, pluknet <pluknet gmail com>
2009-03-10 14:35:21 +00:00
Konstantin Belousov
125dcf8c7d Extract the no_poll() and vop_nopoll() code into the common routine
poll_no_poll().
Return a poll_no_poll() result from devfs_poll_f() when
filedescriptor does not reference the live cdev, instead of ENXIO.

Noted and tested by:	hps
MFC after:	1 week
2009-03-06 15:35:37 +00:00
Andriy Gapon
159da68b48 udf: use truly unique directory cookie
'off' is an offset within current block, so there is a good chance
it can be non-unique, so use complete offset.

Submitted by:	bde
Approved by:	jhb
2009-03-04 13:54:10 +00:00
Andriy Gapon
9cd835bba9 udf_strategy: remove redundant comment
We fail mapping for any udf_bmap_internal error and there can be
different reasons for it, so no need to (over-)emphasize files with
data in fentry.

Submitted by:	bde
Approved by:	jhb
2009-03-04 13:53:57 +00:00
Andriy Gapon
ff9e355b51 udf_readdir: do not advance offset if entry can not be uio-ed
Previosly readdir missed some directory entries because there was
no space for them in current uio but directory stream offset
was advanced nevertheless.
jhb has discoved the issue and provided a test-case.

Reviewed by:	bde
Approved by:	jhb (mentor)
2009-03-03 13:10:25 +00:00
Konstantin Belousov
2883703e00 Use the p_sysent->sv_flags flag SV_ILP32 to detect 32bit process
executing on 64bit kernel. This eliminates the direct comparisions
of p_sysent with &ia32_freebsd_sysvec, that were left intact after
r185169.
2009-03-02 18:43:50 +00:00
John Baldwin
c72ae1423b - Hold a reference on the cdev a filesystem is mounted from in the mount.
- Remove the cdev pointers from the denode and instead use the mountpoint's
  reference to call dev2udev() in getattr().

Reviewed by:	kib, julian
2009-02-27 20:00:15 +00:00
Andriy Gapon
b2c91b67b6 udf_readatoffset: return correct size and data pointer for data in fentry
This should help correct reading of directories with data located
in fentry.

Submitted by:	bde
Approved by:	jhb (mentor)
2009-02-27 17:29:31 +00:00
Andriy Gapon
b0c0fb5940 udf_readatoffset: read through directory vnode, do not read > MAXBSIZE
Currently bread()-ing through device vnode with
(1) VMIO enabled,
(2) bo_bsize != DEV_BSIZE
(3) more than 1 block
results in data being incorrectly cached.
So instead a more common approach of using a vnode belonging to fs is now
employed.
Also, prevent attempt to bread more than MAXBSIZE bytes because of
adjustments made to account for offset that doesn't start on block
boundary.
Add expanded comments to explain the calculations.
Also drop unused inline function while here.

PR: kern/120967
PR: kern/129084

Reviewed by: scottl, kib
Approved by: jhb (mentor)
2009-02-26 18:58:41 +00:00
Andriy Gapon
fb2a76ccf8 udf: add read-ahead support modeled after cd9660
Reviewed by: scottl
Approved by: jhb (mentor)
2009-02-26 12:33:22 +00:00
Andriy Gapon
82467096ff udf_map: return proper error code instead of leaking an internal one
Incidentally this also allows for small files with data embedded into
fentry to be mmap-ed.

Approved by: jhb (mentor)
2009-02-26 12:33:17 +00:00
Andriy Gapon
be52a95d88 udf_read: correctly read data from files with data embedded into fentry,
... as opposed to files with data in extents.
Some UDF authoring tools produce this type of file for sufficiently small
data files.

Approved by: jhb (mentor)
2009-02-26 12:33:12 +00:00
Andriy Gapon
5792e04db9 udf_strategy: tiny optimization of logic, calculations; extra diagnostics
Use bit-shift instead of division/multiplication.
Act on error as soon as it is detected.
Report attempt to read data embedded in file entry via regular way.
While there, fix lblktosize macro and make use of it.

No functionality should change as a result.

Approved by: jhb (mentor)
2009-02-26 12:33:02 +00:00
Edward Tomasz Napierala
4f560d7595 Right now, when trying to unmount a device that's already gone,
msdosfs_unmount() and ffs_unmount() exit early after getting ENXIO.
However, dounmount() treats ENXIO as a success and proceeds with
unmounting.  In effect, the filesystem gets unmounted without closing
GEOM provider etc.

Reviewed by:	kib
Approved by:	rwatson (mentor)
Tested by:	dho
Sponsored by:	FreeBSD Foundation
2009-02-23 21:09:28 +00:00
Alan Cox
2984411d7b Use uiomove_fromphys() instead of the combination of sf_buf and uiomove().
This is not only shorter; it also eliminates unnecessary thread pinning on
architectures that implement a direct map.

MFC after:	3 weeks
2009-02-22 19:50:09 +00:00
Alan Cox
2ade0391c2 Simplify the unwiring and activation of pages.
MFC after:	1 week
2009-02-22 18:15:17 +00:00
Andriy Gapon
7b1eb68a49 style nit in r188815
Pointed out by:	jhb, rpaulo
Approved by:	jhb (mentor)
2009-02-19 15:37:43 +00:00
Andriy Gapon
84206c741f fs/udf: fix incorrect error return (-1) when reading a large dir
Not enough space in user-land buffer is not an error, userland
will read further until eof is reached. So instead of propagating
-1 to caller we convert it to zero/success.

cd9660 code works exactly the same way.

PR:		kern/78987
Reviewed by:	jhb (mentor)
Approved by:	jhb (mentor)
2009-02-19 15:05:30 +00:00
Dag-Erling Smørgrav
655fcdaa00 Fix a logic bug that caused the pfs_attr method to be called only for
PFS_PROCDEP nodes.

Submitted by:	Andrew Brampton <brampton@gmail.com>
MFC after:	2 weeks
2009-02-16 15:17:26 +00:00
John Baldwin
ea77ff0a15 Use shared vnode locks when invoking VOP_READDIR().
MFC after:	1 month
2009-02-13 18:18:14 +00:00
John Baldwin
0b2ab5ec8e - Consolidate error handling in the cd9660 and udf mount routines.
- Always read the character device pointer while the associated devfs vnode
  is locked.  Also, use dev_ref() to obtain a new reference on the vnode for
  the mountpoint.  This reference is released on unmount.  This mirrors the
  earlier fix to FFS.

Reviewed by:	kib
2009-02-11 22:22:26 +00:00
John Baldwin
4ad0d60bb8 Mark udf(4) MPSAFE and add support for shared vnode locks during pathname
lookups:
- Honor the caller's locking flags in udf_root() and udf_vget().
- Set VV_ROOT for the root vnode in udf_vget() instead of only doing it in
  udf_root().
- Honor the requested locking flags during pathname lookups in udf_lookup().
- Release the buffer holding the directory data before looking up the vnode
  for a given file to avoid a LOR between the "udf" vnode locks and
  "bufwait".
- Use vn_vget_ino() to handle ".." lookups.
- Special case "." lookups instead of calling udf_vget().  We have to do
  extra checking for the vnode lock for "." lookups.
2009-02-09 20:50:23 +00:00
John Baldwin
a40ad858de Use the same style as the rest of the file for the optional data string
after each path component rather than a GCC-ism.
2009-02-09 20:42:51 +00:00
Konstantin Belousov
e3c7e75305 Lookup up the directory entry for the tmpfs node that are deleted by
both node pointer and name component. This does the right thing for
hardlinks to the same node in the same directory.

Submitted by:	Yoshihiro Ota <ota j email ne jp>
PR:	kern/131356
MFC after:	2 weeks
2009-02-08 19:18:33 +00:00
John Baldwin
e3024df2e0 Add rudimentary support for symbolic links on UDF. Links are stored as a
sequence of pathname components.  We walk the list building a string in
the caller's passed in buffer.  Currently this only handles path names
in CS8 (character set 8) as that is what mkisofs generates for UDF images.

MFC after:	1 month
2009-02-06 22:24:03 +00:00
John Baldwin
61e69c80e4 Add support for fifos to UDF:
- Add a separate set of vnode operations that inherits from the fifo ops
  and use it for fifo nodes.
- Add a VOP_SETATTR() method that allows setting the size (by silently
  ignoring the requests) of fifos.  This is to allow O_TRUNC opens of
  fifo devices (e.g. I/O redirection in shells using ">").
- Add a VOP_PRINT() handler while I'm here.
2009-02-06 20:09:14 +00:00
John Baldwin
8941aad19b Tweak the output of VOP_PRINT/vn_printf() some.
- Align the fifo output in fifo_print() with other vn_printf() output.
- Remove the leading space from lockmgr_printinfo() so its output lines up
  in vn_printf().
- lockmgr_printinfo() now ends with a newline, so remove an extra newline
  from vn_printf().
2009-02-06 20:06:48 +00:00
Bjoern A. Zeeb
13fd4d2163 After r186194 the *fs_strategy() functions always return 0.
So we are no longer interested in the error returned from
the *fs_doio() functions. With that we can remove the
error variable as its value is unused now.

Submitted by:	Christoph Mallon christoph.mallon@gmx.de
2009-01-31 18:06:34 +00:00
Bjoern A. Zeeb
7956d34b95 Remove unused local variables.
Submitted by:	Christoph Mallon christoph.mallon@gmx.de
Reviewed by:	kib
MFC after:	2 weeks
2009-01-31 17:36:22 +00:00
Ed Schouten
f3b86a5fd7 Mark most often used sysctl's as MPSAFE.
After running a `make buildkernel', I noticed most of the Giant locks in
sysctl are only caused by a very small amount of sysctl's:

- sysctl.name2oid. This one is locked by SYSCTL_LOCK, just like
  sysctl.oidfmt.

- kern.ident, kern.osrelease, kern.version, etc. These are just constant
  strings.

- kern.arandom, used by the stack protector. It is already protected by
  arc4_mtx.

I also saw the following sysctl's show up. Not as often as the ones
above, but still quite often:

- security.jail.jailed. Also mark security.jail.list as MPSAFE. They
  don't need locking or already use allprison_lock.

- kern.devname, used by devname(3), ttyname(3), etc.

This seems to reduce Giant locking inside sysctl by ~75% in my primitive
test setup.
2009-01-28 19:58:05 +00:00