Commit Graph

3474 Commits

Author SHA1 Message Date
cem
3d8d5f23ac nfsclient: Protest loudly when GETATTR responses are invalid
BROKEN NFS SERVER OR MIDDLEWARE: Certain WAN "accelerators" attempt to cache
NFS GETATTR traffic, but actually corrupt it (e.g., responding to requests
with attributes for totally different files).

Warn very verbosely when this is detected. Linux' NFS client has a similar
warning.

Adds a sysctl/tunable (vfs.nfs.fileid_maxwarnings) to configure the quantity
of warnings; default to 10. (Zero disables; -1 is unlimited.)

Adds a failpoint to aid in validating the warning / behavior with a
non-broken server. Use something like:

    sysctl 'debug.fail_point.nfscl_force_fileid_warning=10%return(1)'

Reviewed by:	rmacklem
Approved by:	markj (mentor)
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3304
2015-08-05 22:27:30 +00:00
rmacklem
e15fd657a2 This patch fixes a problem where, if the NFSv4 server has a previous
unconfirmed clientid structure for the same client on the last hash list,
this old entry would not be removed/deleted. I do not think this bug would have
caused serious problems, since the new entry would have been before the old one
on the list. This old entry would have eventually been scavenged/removed.
Detected while reading the code looking for another bug.

MFC after:	3 days
2015-07-29 23:06:30 +00:00
jeff
b7b72de7da - Remove some dead code copied from ffs. 2015-07-29 03:06:08 +00:00
brueffer
95419ac921 In tmpfs_chtimes(), remove checks on the nanosecond level when
determining whether a node changed.

Other filesystems, e.g., UFS, only check on seconds, when determining
whether something changed.

This also corrects the birthtime case, where we checked tv_nsec
twice, instead of tv_sec and tv_nsec (PR).

PR:			201284
Submitted by:		David Binderman
Patch suggested by:	kib
Reviewed by:		kib
MFC after:		2 weeks
Committed from:		Essen FreeBSD Hackathon
2015-07-26 08:33:46 +00:00
kib
48ccbdea81 The si_status field of the siginfo_t, provided by the waitid(2) and
SIGCHLD signal, should keep full 32 bits of the status passed to the
_exit(2).

Split the combined p_xstat of the struct proc into the separate exit
status p_xexit for normal process exit, and signalled termination
information p_xsig.  Kernel-visible macro KW_EXITCODE() reconstructs
old p_xstat from p_xexit and p_xsig.  p_xexit contains complete status
and copied out into si_status.

Requested by:	Joerg Schilling
Reviewed by:	jilles (previous version), pho
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2015-07-18 09:02:50 +00:00
markj
d19ba3f89d Check suspendability on the mountpoint returned by VOP_GETWRITEMOUNT.
This obviates the need for a MNTK_SUSPENDABLE flag, since passthrough
filesystems like nullfs and unionfs no longer need to inherit this
information from their lower layer(s). This change also restores the
pre-r273336 behaviour of using the presence of a susp_clean VFS method to
request suspension support.

Reviewed by:	kib, mjg
Differential Revision:	https://reviews.freebsd.org/D2937
2015-07-05 22:37:33 +00:00
mjg
feeee4c707 fd: make 'rights' a manadatory argument to fget* functions 2015-07-05 19:05:16 +00:00
rmacklem
5ebe352487 If a "principal" argument isn't provided for a Kerberized NFS mount,
the kernel would generate a bogus one with a ":/<path>" suffix.
This would only occur for the case where there was no explicit
"principal" argument and the getaddrinfo() call in mount_nfs.c failed to a
return a cannonical name for the server.
This patch fixes this unusual case.

PR:		201073
Submitted by:	masato@itc.naist.jp
MFC after:	2 weeks
2015-07-03 22:11:07 +00:00
rmacklem
97c35a724e Alex Burlyga reported a POLA violation for the new NFS client as
compared to the old NFS client via email to the freebsd-fs@ mailing list.
For the new client, when multiple clients attempted to create a symbolic
link concurrently, more that one client would report success instead of
EEXIST. This was caused by code in the new client that mapped EEXIST to
OK assuming it was caused by a retried RPC request.
Since the old client did not do this, the patch defaults to the old
behaviour and permits the new behaviour to be enabled via a sysctl.

Reported by:	alex.burlyga.ietf@gmail.com
Tested by:	alex.burlyga.ietf@gmail.com
MFC after:	2 weeks
2015-07-03 01:15:21 +00:00
markm
d586165577 Huge cleanup of random(4) code.
* GENERAL
- Update copyright.
- Make kernel options for RANDOM_YARROW and RANDOM_DUMMY. Set
  neither to ON, which means we want Fortuna
- If there is no 'device random' in the kernel, there will be NO
  random(4) device in the kernel, and the KERN_ARND sysctl will
  return nothing. With RANDOM_DUMMY there will be a random(4) that
  always blocks.
- Repair kern.arandom (KERN_ARND sysctl). The old version went
  through arc4random(9) and was a bit weird.
- Adjust arc4random stirring a bit - the existing code looks a little
  suspect.
- Fix the nasty pre- and post-read overloading by providing explictit
  functions to do these tasks.
- Redo read_random(9) so as to duplicate random(4)'s read internals.
  This makes it a first-class citizen rather than a hack.
- Move stuff out of locked regions when it does not need to be
  there.
- Trim RANDOM_DEBUG printfs. Some are excess to requirement, some
  behind boot verbose.
- Use SYSINIT to sequence the startup.
- Fix init/deinit sysctl stuff.
- Make relevant sysctls also tunables.
- Add different harvesting "styles" to allow for different requirements
  (direct, queue, fast).
- Add harvesting of FFS atime events. This needs to be checked for
  weighing down the FS code.
- Add harvesting of slab allocator events. This needs to be checked for
  weighing down the allocator code.
- Fix the random(9) manpage.
- Loadable modules are not present for now. These will be re-engineered
  when the dust settles.
- Use macros for locks.
- Fix comments.

* src/share/man/...
- Update the man pages.

* src/etc/...
- The startup/shutdown work is done in D2924.

* src/UPDATING
- Add UPDATING announcement.

* src/sys/dev/random/build.sh
- Add copyright.
- Add libz for unit tests.

* src/sys/dev/random/dummy.c
- Remove; no longer needed. Functionality incorporated into randomdev.*.

* live_entropy_sources.c live_entropy_sources.h
- Remove; content moved.
- move content to randomdev.[ch] and optimise.

* src/sys/dev/random/random_adaptors.c src/sys/dev/random/random_adaptors.h
- Remove; plugability is no longer used. Compile-time algorithm
  selection is the way to go.

* src/sys/dev/random/random_harvestq.c src/sys/dev/random/random_harvestq.h
- Add early (re)boot-time randomness caching.

* src/sys/dev/random/randomdev_soft.c src/sys/dev/random/randomdev_soft.h
- Remove; no longer needed.

* src/sys/dev/random/uint128.h
- Provide a fake uint128_t; if a real one ever arrived, we can use
  that instead. All that is needed here is N=0, N++, N==0, and some
  localised trickery is used to manufacture a 128-bit 0ULLL.

* src/sys/dev/random/unit_test.c src/sys/dev/random/unit_test.h
- Improve unit tests; previously the testing human needed clairvoyance;
  now the test will do a basic check of compressibility. Clairvoyant
  talent is still a good idea.
- This is still a long way off a proper unit test.

* src/sys/dev/random/fortuna.c src/sys/dev/random/fortuna.h
- Improve messy union to just uint128_t.
- Remove unneeded 'static struct fortuna_start_cache'.
- Tighten up up arithmetic.
- Provide a method to allow eternal junk to be introduced; harden
  it against blatant by compress/hashing.
- Assert that locks are held correctly.
- Fix the nasty pre- and post-read overloading by providing explictit
  functions to do these tasks.
- Turn into self-sufficient module (no longer requires randomdev_soft.[ch])

* src/sys/dev/random/yarrow.c src/sys/dev/random/yarrow.h
- Improve messy union to just uint128_t.
- Remove unneeded 'staic struct start_cache'.
- Tighten up up arithmetic.
- Provide a method to allow eternal junk to be introduced; harden
  it against blatant by compress/hashing.
- Assert that locks are held correctly.
- Fix the nasty pre- and post-read overloading by providing explictit
  functions to do these tasks.
- Turn into self-sufficient module (no longer requires randomdev_soft.[ch])
- Fix some magic numbers elsewhere used as FAST and SLOW.

Differential Revision: https://reviews.freebsd.org/D2025
Reviewed by: vsevolod,delphij,rwatson,trasz,jmg
Approved by: so (delphij)
2015-06-30 17:00:45 +00:00
kib
f69fd0ed6d Restore the td_cookie value for the tmpfs directory entry which was a
dup entry, upon detach from the parent directory.  If the node is
renamed, the entry is re-attached at the different directory, and
invalud cookie value triggers assert (or corrupts directory rb tree,
it seems).

Reported by:	clusteradm (gjb, antoine)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-06-19 07:25:15 +00:00
glebius
5b81a20433 o Un-inline vm_pager_get_pages(), vm_pager_get_pages_async().
o Provide an extensive set of assertions for input array of pages.
o Remove now duplicate assertions from different pagers.

Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2015-06-17 22:44:27 +00:00
mjg
1a3e7a935e Replace struct filedesc argument in getvnode with struct thread
This is is a step towards removal of spurious arguments.
2015-06-16 13:09:18 +00:00
glebius
519f1ccd36 Make KPI of vm_pager_get_pages() more strict: if a pager changes a page
in the requested array, then it is responsible for disposition of previous
page and is responsible for updating the entry in the requested array.
Now consumers of KPI do not need to re-lookup the pages after call to
vm_pager_get_pages().

Reviewed by:	kib
Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2015-06-12 11:32:20 +00:00
mjg
d7bc9285a6 Implement lockless resource limits.
Use the same scheme implemented to manage credentials.

Code needing to look at process's credentials (as opposed to thred's) is
provided with *_proc variants of relevant functions.

Places which possibly had to take the proc lock anyway still use the proc
pointer to access limits.
2015-06-10 10:48:12 +00:00
markj
8563211eff unionfs: fix suspendability check bugs
- MNTK_SUSPENDABLE is set in mnt_kern_flag, not mnt_flag.
- The lower layer of a unionfs mount is read-only, so the mount should
  be suspendable iff the upper layer is suspendable.
- Remove a couple of superfluous comments.

Differential Revision:	https://reviews.freebsd.org/D2714
Reviewed by:	kib, mjg
2015-06-06 16:36:13 +00:00
jhb
bba1e1e047 Add a new file operations hook for mmap operations. File type-specific
logic is now placed in the mmap hook implementation rather than requiring
it to be placed in sys/vm/vm_mmap.c.  This hook allows new file types to
support mmap() as well as potentially allowing mmap() for existing file
types that do not currently support any mapping.

The vm_mmap() function is now split up into two functions.  A new
vm_mmap_object() function handles the "back half" of vm_mmap() and accepts
a referenced VM object to map rather than a (handle, handle_type) tuple.
vm_mmap() is now reduced to converting a (handle, handle_type) tuple to a
a VM object and then calling vm_mmap_object() to handle the actual mapping.
The vm_mmap() function remains for use by other parts of the kernel
(e.g. device drivers and exec) but now only supports mapping vnodes,
character devices, and anonymous memory.

The mmap() system call invokes vm_mmap_object() directly with a NULL object
for anonymous mappings.  For mappings using a file descriptor, the
descriptors fo_mmap() hook is invoked instead.  The fo_mmap() hook is
responsible for performing type-specific checks and adjustments to
arguments as well as possibly modifying mapping parameters such as flags
or the object offset.  The fo_mmap() hook routines then call
vm_mmap_object() to handle the actual mapping.

The fo_mmap() hook is optional.  If it is not set, then fo_mmap() will
fail with ENODEV.  A fo_mmap() hook is implemented for regular files,
character devices, and shared memory objects (created via shm_open()).

While here, consistently use the VM_PROT_* constants for the vm_prot_t
type for the 'prot' variable passed to vm_mmap() and vm_mmap_object()
as well as the vm_mmap_vnode() and vm_mmap_cdev() helper routines.
Previously some places were using the mmap()-specific PROT_* constants
instead.  While this happens to work because PROT_xx == VM_PROT_xx,
using VM_PROT_* is more correct.

Differential Revision:	https://reviews.freebsd.org/D2658
Reviewed by:	alc (glanced over), kib
MFC after:	1 month
Sponsored by:	Chelsio
2015-06-04 19:41:15 +00:00
vangyzen
597cee37df Provide vnode in memory map info for files on tmpfs
When providing memory map information to userland, populate the vnode pointer
for tmpfs files.  Set the memory mapping to appear as a vnode type, to match
FreeBSD 9 behavior.

This fixes the use of tmpfs files with the dtrace pid provider,
procstat -v, procfs, linprocfs, pmc (pmcstat), and ptrace (PT_VM_ENTRY).

Submitted by:   Eric Badger <eric@badgerio.us> (initial revision)
Obtained from:  Dell Inc.
PR:             198431
MFC after:      2 weeks
Reviewed by:    jhb
Approved by:    kib (mentor)
2015-06-02 18:37:04 +00:00
delphij
77722a1db5 Clear p_stops upon PROCFS_CTL_DETACH, similar to r283889.
Noticed by:	jhb
Reviewed by:	sef
Sponsored by:	iXsystems, Inc.
MFC after:	2 weeks
2015-06-01 18:49:31 +00:00
rmacklem
d7f3fa6b20 Make the NFS server use shared vnode locks for a few cases
that are allowed by the VFS/VOP interface instead of using
exclusive locks.

MFC after:	2 weeks
2015-05-29 20:22:53 +00:00
pfg
dce46f4095 Provide VOP_GETPAGES_ASYNC() for extfs.
Merge the filesystem specific part from r274914 to ext2fs.

I only did regular testing with the change but UFS and our ext2fs
are similar enough that the code should just work with the new
sendfile.

Discussed with:	glebius
2015-05-28 21:06:59 +00:00
rmacklem
7c550ca17d Make the size of the hash tables used by the NFSv4 server tunable.
No appreciable change in performance was observed after increasing
the sizes of these tables and then testing with a single client.
However, there was an email that indicated high CPU overheads for
a heavily loaded NFSv4 and it is hoped that increasing the sizes
of the hash tables via these tunables might help.
The tables remain the same size by default.

Differential Revision:	https://reviews.freebsd.org/D2596
MFC after:	2 weeks
2015-05-27 22:00:05 +00:00
kib
ff588ae9b0 Currently, softupdate code detects overstepping on the workitems
limits in the code which is deep in the call stack, and owns several
critical system resources, like vnode locks.  Attempt to wait while
the per-mount softupdate thread cleans up the backlog may deadlock,
because the thread might need to lock the same vnode which is owned by
the waiting thread.

Instead of synchronously waiting for the worker, perform the worker'
tickle and pause until the backlog is cleaned, at the safe point
during return from kernel to usermode.  A new ast request to call
softdep_ast_cleanup() is created, the SU code now only checks the size
of queue and schedules ast.

There is no ast delivery for the kernel threads, so they are exempted
from the mechanism, except NFS daemon threads.  NFS server loop
explicitely checks for the request, and informs the schedule_cleanup()
that it is capable of handling the requests by the process P2_AST_SU
flag.  This is needed because nfsd may be the sole cause of the SU
workqueue overflow.  But, to not cause nsfd to spawn additional
threads just because we slow down existing workers, only tickle su
threads, without waiting for the backlog cleanup.

Reviewed by:	jhb, mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-05-27 09:20:42 +00:00
dchagin
4e888d6b57 Hide vfs.pfs.trace variable if it is not used. 2015-05-24 18:11:22 +00:00
rmacklem
b51d622ba8 The NFS client generated directory block(s) with d_fileno == 0
so that it would not return less data than requested.
Since returning less directory data than requested is not a problem
for FreeBSD and even UFS no longer returns directory structures
with d_fileno == 0, this patch stops the client from doing this.
Although entries with d_fileno == 0 should not be a problem,
the man pages no longer document that these entries should be
ignored, so there was a concern that these entries might be an
issue in the future.

Suggested by:	trasz
Tested by:	trasz
MFC after:	2 weeks
2015-05-23 21:58:41 +00:00
jkim
318c4f97e6 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
jhb
9b4a921aab Always set p_oppid when attaching to an existing process via procfs
tracing.  This matches the behavior of ptrace(PT_ATTACH).  Also,
the procfs detach request assumes p_oppid is always set.

Reviewed by:	kib
MFC after:	2 weeks
2015-05-22 11:03:51 +00:00
rmacklem
9a564b79b5 The NFS client wasn't handling getdirentries(2) requests for sizes
that are not an exact multiple of DIRBLKSIZ correctly. Fortunately
readdir(3) always uses an exact multiple of DIRBLKSIZ, so few applications
were affected. This patch fixes this problem by reducing the size
of the directory read to an exact multiple of DIRBLKSIZ.

Tested by:	trasz
Reported by:	trasz
Reviewed by:	trasz
MFC after:	2 weeks
2015-05-21 23:14:18 +00:00
mav
983d16c8e2 Do not promote large async writes to sync.
Present implementation of large sync writes is too strict and so can be
quite slow.  Instead of doing that, execute large async write in chunks,
syncing each chunk separately.

It would be good to fix large sync writes too, but I leave it to somebody
with more skills in this area.

Reviewed by:	rmacklem
MFC after:	1 week
2015-05-14 10:04:42 +00:00
rmacklem
5a5431c415 Fix the NFS server's handling of a bogus NFSv2 ROOT RPC.
The ROOT RPC is deprecated in the NFSv2 RFC, RFC-1094
and should never be used by a client.

Tested by:	thmu@freenet.de
MFC after:	1 week
2015-04-25 00:58:24 +00:00
rmacklem
e0a9cb76d2 MAXBSIZE defines both the largest UFS block size and the
largest size for a buffer in the buffer cache. This patch
defines a new constant MAXBCACHEBUF, which is the largest
size for a buffer in the buffer cache. Having a separate
constant allows MAXBCACHEBUF to be set larger than MAXBSIZE
on a per-architecture basis, so that NFS can do larger read/writes
for these architectures. It modifies sys/param.h so that BKVASIZE
can also be set on a per-architecture basis.
A couple of cases where NFS used MAXBSIZE instead of NFS_MAXBSIZE
is fixed as well.

Differential Revision:	https://reviews.freebsd.org/D2330
Reviewed by:	mav, kib
MFC after:	2 weeks
2015-04-25 00:52:01 +00:00
pfg
2c313e6688 Prevent a double free.
This is similar to r281756 so set the ptr NULL after free as a safety belt
against future changes.

Obtained from:	HardenedBSD (b2e77ced9ae213d358b44d98f552d9ae4636ecac)
Submitted by:	Oliver Pinter
Revewed by:	rmacklem
2015-04-20 16:40:13 +00:00
pfg
32880107f3 nfsrpc_createv4: fix double free.
Reported by:	Oliver Pinter, clang static checker
Obtained from:	HardenedBSD (commit 63cac77c42c0c3fc67da62f97d5ab651d52ae707)
Reviewed by:	rmacklem
MFC after:	5 days
2015-04-19 23:55:59 +00:00
mav
d9ba2b8e84 Change wcommitsize default from one empirical value to another.
The new value is more predictable with growing RAM size:

        hibufspace maxvnodes      old      new
i386:
  256MB   32980992     15800  2198732  2097152
    2GB   94027776    107677   878764  4194304
amd64:
  256MB   32980992     15800  2198732  2097152
    1GB  114114560     68062  1678155  4194304
    4GB  217055232    111807  1955452  4194304
   16GB 1717846016    337308  5097465 16777216
   64GB 1734918144   1164427  1490479 16777216
  256GB 1734918144   4426453   391983 16777216

Reviewed by:	rmacklem
MFC after:	2 weeks
2015-04-19 11:34:41 +00:00
trasz
4252f860ce Replace "new NFS" with just "NFS" in some sysctl description strings.
Sponsored by:	The FreeBSD Foundation
2015-04-19 06:18:41 +00:00
pfg
2bead96db0 Drop experimental dir_index support.
The htree directory index is a highly desirable feature for research
purposes and was meant to improve performance in our ext2/3 driver.
Unfortunately our implementation has two problems:

- It never really delivered any performance improvement.
- It appears to corrupt the filesystem in undetermined circumstances.

Strictly speaking dir_index is not required for read/write support in
ext2/3 and our limited ext4 support still works fine without it.

Regain stability in the ext2 driver by removing it. We may need it back
(fixed) if we want to support encrypted ext4 support but thanks to the
wonders of version control we can always revert this change and bring it
back.

PR:	191895
PR:	198731
PR:	199309

MFC after:	5 days
2015-04-17 22:26:01 +00:00
rmacklem
b4d8a8d1f7 mav@ has found that NFS servers exporting ZFS file systems
can perform better when using a 128K read/write data size.
This patch changes NFS_MAXDATA from 64K to 128K so that
clients can use 128K for NFS mounts to allow this.
The patch also renames NFS_MAXDATA to NFS_SRVMAXIO so
that it is clear that it applies to the NFS server side
only. It also avoids a name conflict with the NFS_MAXDATA
defined in rpcsvc/nfs_prot.h, that is used for userland RPC.

Tested by:	mav
Reviewed by:	mav
MFC after:	2 weeks
2015-04-16 22:35:15 +00:00
rmacklem
ad77d0b1c1 File systems that do not use the buffer cache (such as ZFS) must
use VOP_FSYNC() to perform the NFS server's Commit operation.
This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which
is set by file systems that use the buffer cache. If this flag
is not set, the NFS server always does a VOP_FSYNC().
This should be ok for old file system modules that do not set
MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although
it might not be optimal for file systems that use the buffer cache.

Reviewed by:	kib
MFC after:	2 weeks
2015-04-15 20:16:31 +00:00
will
e2c616f11c tmpfs_getattr(): Return more correct allocated byte counts.
For VREG vnodes, return the resident page count (multiplied by PAGE_SIZE)
for the tmpfs node's anonymous VM object that stores actual file contents.

For all other vnodes, return the tmpfs_node's tn_size, which should not
be rounded to a page.

This change allows using stat(2) to identify a sparse file on tmpfs.

Reviewed by:	kib
MFC after:	1 week
2015-04-10 19:04:39 +00:00
kib
1440f3812a Do not call msdosfs_sync() on the read-only msdosfs mounts. In fact,
it should be a nop for ro.

PR:	199152
Reviewed by:	bde (PR version of the patch)
Submitted by:	longwitz@incore.de
MFC after:	1 week
2015-04-05 21:10:38 +00:00
kib
95e5199578 Assert that an msdosfs mount is not read-only when FAT modifications
are requested.

PR:	199152
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-04-05 21:08:04 +00:00
kib
b7c417708f Refine r280308. Do not completely disable timestamping of devfs nodes
on reads or writes, the time marks are used to display idle time by
w(1) [1].  Instead, use vfs.devfs.dotimes as the selector of default
precision vs. using time_second.  The later gives seconds precision,
which is good enough for the purpose.

Note that timestamp updates are unlocked and the updates itself, as
well as the check in devfs_timestamp, are non-atomic.

Noted by:	truckman [1]
Reviewed by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-04-01 08:25:40 +00:00
kib
89382b6533 msdosfs: mark unused compat-mount fields
The magic number MSDOSFS_ARGSMAGIC, which used to distinguish
"old" vs "new" msdosfs mount arguments, has not been used since
2005; it should just go away now.

Likewise, the local-to-Unicode table that changed at the same
time is unused.

Leave the space reserved in the old style mount arguments, though,
since we still support the old mount call (via the cmount entry
point).

Submitted by:	Chris Torek <chris.torek@gmail.com>
MFC after:	2 weeks
2015-03-22 09:09:26 +00:00
delphij
041657da93 Disable timestamping on devfs read/write operations by default.
Currently we update timestamps unconditionally when doing read or
write operations.  This may slow things down on hardware where
reading timestamps is expensive (e.g. HPET, because of the default
vfs.timestamp_precision setting is nanosecond now) with limited
benefit.

A new sysctl variable, vfs.devfs.dotimes is added, which can be
set to non-zero value when the old behavior is desirable.

Differential Revision:	https://reviews.freebsd.org/D2104
Reported by:	Mike Tancsa <mike sentex net>
Reviewed by:	kib
Relnotes:	yes
Sponsored by:	iXsystems, Inc.
MFC after:	2 weeks
2015-03-21 01:14:11 +00:00
glebius
398be53682 o Enhance vm_pager_free_nonreq() function:
- Allow to call the function with vm object lock held.
  - Allow to specify reqpage that doesn't match any page in the region,
    meaning freeing all pages.
o Utilize the new function in couple more places in vnode pager.

Reviewed by:	alc, kib
Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2015-03-17 19:19:19 +00:00
jkim
d07e2757d9 Fix white spaces. 2015-03-02 19:14:58 +00:00
trasz
ab90d82e08 Make fuse(4) respect FOPEN_DIRECT_IO. This is required for correct
operation of GlusterFS.

PR:		192701
Submitted by:	harsha at harshavardhana.net
Reviewed by:	kib@
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-03-02 19:04:27 +00:00
imp
58c9460670 nandfs_meta_bread() calls bread() which can set bp to NULL in some
error cases. Calling brelse() with a NULL pointer is not allowed,
so only call brelse() when the bp is non-NULL.

Reported by: Maxime Villard (reported as uninitialized variable)
2015-03-01 21:41:37 +00:00
kan
a95ac78b9c Do not leak 'copy' buffer if bmap_truncate_indirect fails.
Reported by: Brainy Code Scanner, by Maxime Villard.
MFC after: 2 weeks
2015-02-28 22:24:45 +00:00
kib
661b19b40e Some fixes for fdescfs lookup code.
Do not ever return doomed vnode from lookup.  This could happen, if
not checked, since dvp is relocked in the 'looking up ourselves' case.

In the other case, since dvp is relocked, mount point might go away
while fdesc_allocvp() is called.  Prevent the situation by doing
vfs_busy() before unlocking dvp.  Reuse the vn_vget_ino_gen() helper.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-02-28 19:57:22 +00:00
kib
3bc9cbc06a The VNASSERT in vflush() FORCECLOSE case is trying to panic early to
prevent errors from yanking devices out from under filesystems.  Only
care about special vnodes on devfs, special nodes on other kinds of
filesystems do not have special properties.

Sponsored by:  EMC / Isilon Storage Division
Submitted by:   Conrad Meyer
MFC after:	1 week
2015-02-27 16:43:50 +00:00
pfg
e22521379a ext2fs: Plug small memory leak
free() e2fs_contigdirs upon error.
Undo zeroing of e2fs_gd as this was actually a false positive.

X-MFC with:	278790
2015-02-15 14:25:00 +00:00
pfg
03988df8a2 Reuse value of cursize instead of recalculating.
Reported by:	Clang static checker
MFC after:	1 week
2015-02-15 01:34:00 +00:00
pfg
d2ad05642a Initialize the allocation of variables related to the ext2 allocator.
The e2fs_gd struct was not being initialized and garbage was
being used for hinting the ext2 allocator variant.
Use malloc to clear the values and also initialize e2fs_contigdirs
during allocation to keep consistency.

While here clean up small style issues.

Reported by:	Clang static analyser
MFC after:	1 week
2015-02-15 01:12:15 +00:00
trasz
e13ac6cd7e Restore ABI compatibility, broken in r273127. Note that while this fixes
ABI with 10.1, it breaks ABI for 11-CURRENT, so rebuild of automountd(8)
is neccessary.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2015-02-10 16:17:16 +00:00
kib
09bdd8a7f8 Remove duplicated assignment.
CID:	1267988
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-02-03 12:09:48 +00:00
kib
1831e3d7dc Update directory times immediately after an entry is created or
removed.  Postponing it until tmpfs_getattr() is called causes
discordant values reported for file times vs. directory times.

Reported and tested by:	madpilot
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-31 21:31:53 +00:00
kib
1ccf1fa71b Remove single-use boolean.
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-01-31 12:58:04 +00:00
kib
8773784e5a POSIX states that write(2) "shall mark for update the last data
modification and last file status change timestamps of the file".
Currently, tmpfs only modifies ctime when file was extended.  Since
r277828 followed tmpfs_write(), mmaped writes also do not modify
ctime.

Fix this, by updating both ctime and mtime for writes to tmpfs files.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-01-31 12:27:18 +00:00
dim
6b8eea4924 Fix a -Wcast-qual warning in smbfs_subr.c, by using __DECONST. No
functional change.

MFC after:	3 days
2015-01-30 22:02:32 +00:00
dim
edbaff1357 Fix a -Wcast-qual warning in udf_vnops.c, by using __DECONST. No
functional change.

MFC after:	3 days
2015-01-30 22:01:45 +00:00
dim
edba0c462e Fix a bunch of -Wcast-qual warnings in cd9660_util.c, by using
__DECONST.  No functional change.

MFC after:	3 days
2015-01-29 20:40:25 +00:00
dim
07f28f1df7 Fix a bunch of -Wcast-qual warnings in msdosfs_conv.c, by using
__DECONST.  No functional change.

MFC after:	3 days
2015-01-29 20:30:13 +00:00
jamie
c7d0935d11 Add allow.mount.fdescfs jail flag.
PR:		192951
Submitted by:	ruben@verweg.com
MFC after:	3 days
2015-01-28 21:08:09 +00:00
kib
19abfd4698 Update mtime for tmpfs files modified through memory mapping. Similar
to UFS, perform updates during syncer scans, which in particular means
that tmpfs now performs scan on sync.  Also, this means that a mtime
update may be delayed up to 30 seconds after the write.

The vm_object' OBJ_TMPFS_DIRTY flag for tmpfs swap object is similar
to the OBJ_MIGHTBEDIRTY flag for the vnode object, it indicates that
object could have been dirtied.  Adapt fast page fault handler and
vm_object_set_writeable_dirty() to handle OBJ_TMPFS_NODE same as
OBJT_VNODE.

Reported by:	Ronald Klop <ronald-lists@klop.ws>
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-01-28 10:37:23 +00:00
kib
53810519b4 tmpfs does not use UVM on FreeBSD.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2015-01-28 10:25:35 +00:00
kib
f748dc7ade Stop enforcing additional reference on all cdevs, which was introduced
in r277199.  Acquire the neccessary reference in delist_dev_locked()
and inform destroy_devl() about it using CDP_UNREF_DTR flag.

Fix some style nits, add asserts.

Discussed with:	hselasky
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-19 17:36:52 +00:00
kib
b3741c8701 Ignore devfs directory entries for devices either being destroyed or
delisted.  The check is racy.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-19 17:24:52 +00:00
ngie
f94357dba9 Fix the build when INVARIANTS is defined by restoring bo's definition in
ext2_truncate(..) and by putting it under INVARIANTS ifdefs

X-MFC with: r277354
MFC after: 2 weeks
2015-01-19 07:10:08 +00:00
pfg
8fa2e2513f ext2: Garbage-collect some unused variables
Reported by:	clang static analysis
MFC after:	2 weeks
2015-01-19 03:30:45 +00:00
pfg
142fb530ca ext2: fix for uninitialized pointer read.
path.ep_bp was being used uninitialized in ext4_ext_find_extent().

CID:		1062344
MFC after:	1 week
2015-01-18 21:18:28 +00:00
pfg
f71f36cb87 Remove dead code.
After the ext2 variant of the "orlov allocator" was implemented,
the case for a negative or zero dirsize disappeared.

Drop the dead code and unsign dirsize given that it can't be
negative anyways.

CID:		1008669
MFC after:	1 week
2015-01-18 20:26:27 +00:00
kib
53832db395 Make SIGSTOP working for sleeps done while waiting for fifo readers or
writers in open(2), when the fifo is located on an NFS mount.

Reported by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-01-18 15:03:26 +00:00
pfg
51b45a8019 ext2: cosmetical issues
Minor sorting and note when the cases are expected to fall through.

MFC after:	1 week
2015-01-17 15:19:18 +00:00
hselasky
b04cbf0c36 Avoid race with "dev_rel()" when using the recently added
"delist_dev()" function. Make sure the character device structure
doesn't go away until the end of the "destroy_dev()" function due to
concurrently running cleanup code inside "devfs_populate()".

MFC after:	1 week
Reported by:	dchagin@
2015-01-14 22:07:13 +00:00
hselasky
4d7a9f7cc1 Don't use POLLNVAL as a return value from the client side poll
function. Many existing clients don't understand POLLNVAL and instead
relies on an error code from the read(), write() or ioctl() system
call. Also make sure we wakeup any client pollers before the cuse
server is closing, so they don't wait forever for an event.
2015-01-13 13:32:18 +00:00
emaste
c3135c0392 ANSIfy msdosfs
Add a few cases and style(9) fixes missed in r276887

Sponsored by:	The FreeBSD Foundation
2015-01-12 21:55:48 +00:00
emaste
9d6a34744f ANSIfy sys/fs/msdosfs
There are a number of msdosfs improvements in NetBSD that may be worth
bringing over, and this reduces noise in the comparison.

Differential Revision:	https://reviews.freebsd.org/D1466
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
2015-01-09 14:50:08 +00:00
rwatson
608a6cd69c Use M_SIZE() instead of hand-crafted (and mostly correct) NFSMSIZ() macro
in the NFS server; garbage collect now-unused NFSMSIZ() and M_HASCL()
macros.  Also garbage collect now-unused versions in headers for the
removed previous NFS client and server.

Reviewed by:	rmacklem
Sponsored by:	EMC / Isilon Storage Division
2015-01-07 17:22:56 +00:00
mjg
d80e0e454f Convert nullfs hash lock from a mutex to an rwlock. 2014-12-30 21:41:35 +00:00
rmacklem
f96de0d7cc r245508 modified the NFS client's Setattr RPC to
use VA_UTIMES_NULL to indicate whether it should
set the time to the current tod on the server.
This had the side effect of making the NFS client
use the client's timestamp for exclusive create,
starting with FreeBSD9.2.
Unfortunately a bug in some Solaris NFS servers
causes these servers to return NFS_OK to the
Setattr RPC done during exclusive create, but not
actually set the file's mode, leaving the file's
mode == 0.
This patch restores the NFS client's behaviour to
use the server's tod for the exclusive open's
Setattr RPC, to avoid the Solaris server bug and
to restore the pre-FreeBSD9.2 NFS behaviour.

Discussed on:	freebsd-fs
PR:	186293
MFC after:	3 months
2014-12-28 21:13:52 +00:00
rmacklem
4818055005 Delete some duplicate code that was harmless because
exactly the same code is at the end of the nfscl_checksattr()
function that is called just before it. As such, this code
had already been executed and didn't do anything.

MFC after:	1 week
2014-12-25 22:29:37 +00:00
rmacklem
5309296f87 A deadlock in the NFSv4 server with vfs.nfsd.enable_locallocks=1
was reported via email. This was caused by a LOR between the
sleep lock used to serialize the local locking (nfsrv_locklf())
and locking the vnode. I believe this patch fixes the problem
by delaying relocking of the vnode until the sleep lock is
unlocked (nfsrv_unlocklf()). To avoid nfsvno_advlock() having the side
effect of unlocking the vnode, unlocking the vnode was moved to before
the functions that call nfsvno_advlock().
It shouldn't affect the execution of the default case where
vfs.nfsd.enable_locallocks=0.

Reported by:	loic.blot@unix-experience.fr
Discussed with:	kib
MFC after:	1 week
2014-12-25 01:55:17 +00:00
rmacklem
d86ceb5d4a Fix kernel builds with "options NFS_DEBUG" that
were broken by r276096. Also delete the two
kernel options NFS_GATHERDELAY, NFS_WDELAYHASHSIZ
which are no longer used.

Reported by:	bz
2014-12-23 14:24:36 +00:00
rmacklem
dc4905f459 Remove the old NFS client and server from head,
which means that the NFSCLIENT and NFSSERVER
kernel options will no longer work. This commit
only removes the kernel components. Removal of
unused code in the user utilities will be done
later. This commit does not include an addition
to UPDATING, but that will be committed in a
few minutes.

Discussed on: freebsd-fs
2014-12-23 00:47:46 +00:00
kib
4e541c8756 Handle MAKEENTRY cnp flag in the VOP_CREATE(). Curiously, some
fs, e.g. smbfs, already did it.

Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-12-21 13:29:33 +00:00
benno
1f4cf8121b Adjust the test of a KASSERT to better match the intent.
This assertion was added in r246213 as a guard against corrupted mbufs
arriving from drivers, the key distinguishing factor of said mbufs being
that they had a negative length. Given we're in a while loop specifically
designed to skip over zero-length mbufs, panicking on a zero-length mbuf
seems incorrect.

No objection from:	kib
2014-12-19 19:09:22 +00:00
kib
77c9d3f4e8 The VOP_LOOKUP() implementations for CREATE op do not put the name
into namecache, to avoid cache trashing when doing large operations.
E.g., tar archive extraction is not usually followed by access to many
of the files created.

Right now, each VOP_LOOKUP() implementation explicitely knowns about
this quirk and tests for both MAKEENTRY flag presence and op != CREATE
to make the call to cache_enter().  Centralize the handling of the
quirk into VFS, by deciding to cache only by MAKEENTRY flag in VOP.
VFS now sets NOCACHE flag for CREATE namei() calls.

Note that the change in semantic is backward-compatible and could be
merged to the stable branch, and is compatible with non-changed
third-party filesystems which correctly handle MAKEENTRY.

Suggested by:	Chris Torek <torek@pi-coral.com>
Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-12-18 10:01:12 +00:00
gleb
5c99f46b3b Adjust printf format specifiers for dev_t and ino_t in kernel.
ino_t and dev_t are about to become uint64_t.

Reviewed by:	kib, mckusick
2014-12-17 07:27:19 +00:00
pfg
1730fb872a ext2fs: Fix old out-of-bounds access.
Overrunning buffer pointed to by (caddr_t)&oip->i_db[0] of 48 bytes by
passing it to a function which accesses it at byte offset 59 using
argument 60UL.

The issue was inherited from an older FFS implementation and
fixed there with by merging UFS2 in r98542. We follow the
FFS fix.

Discussed with:	bde
CID:		1007665
MFC after:	3 days
2014-12-09 14:56:00 +00:00
kib
c1b643c449 Do not call VFS_SYNC() before VFS_UNMOUNT() for forced unmount.
Since VFS does not/cannot stop writes, sync might run indefinitely, or
be a wrong thing to do at all.  E. g. NFS ignores VFS_SYNC() for
forced unmounts, since non-responding server does not allow sync to
finish.  On the other hand, filesystems can and do stop writes using
fs-specific facilities, and should already fully flush caches in
VFS_UNMOUNT() due to the race.

Adjust msdosfs tp sync in unmount for forced call, to accomodate the
new behaviour.  Note that it is still racy, since writes are not
stopped.

Discussed with:	avg, bjk, mckusick
Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2014-12-09 10:00:47 +00:00
kib
11cee2ecf7 The process spin lock currently has the following distinct uses:
- Threads lifetime cycle, in particular, counting of the threads in
  the process, and interlocking with process mutex and thread lock.
  The main reason of this is that turnstile locks are after thread
  locks, so you e.g. cannot unlock blockable mutex (think process
  mutex) while owning thread lock.

- Virtual and profiling itimers, since the timers activation is done
  from the clock interrupt context.  Replace the p_slock by p_itimmtx
  and PROC_ITIMLOCK().

- Profiling code (profil(2)), for similar reason.  Replace the p_slock
  by p_profmtx and PROC_PROFLOCK().

- Resource usage accounting.  Need for the spinlock there is subtle,
  my understanding is that spinlock blocks context switching for the
  current thread, which prevents td_runtime and similar fields from
  changing (updates are done at the mi_switch()).  Replace the p_slock
  by p_statmtx and PROC_STATLOCK().

The split is done mostly for code clarity, and should not affect
scalability.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-11-26 14:10:00 +00:00
trasz
4afc5ee040 Implement "automount -c".
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2014-11-22 16:48:29 +00:00
trasz
7e9273db3f Fix smbfs to not zero out statfs f_flags field. Previously, this
made getmntinfo() return empty flags for smbfs filesystems when
called with MNT_WAIT. It's not visible with mount(8), since it uses
MNT_NOWAIT, but broke autounmount(8) operation.

PR:		195161
Differential Revision:	https://reviews.freebsd.org/D1194
Reviewed by:	kib@
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2014-11-21 06:21:39 +00:00
pfg
62ba21698a ifdef ext2_print_inode which is not really used.
ext2_print_inode is not really used but it was nice to
have for initial development work. #ifdef it under a
new EXT2FS_DEBUG knob so that we don't spend time
compiling it.

MFC after:	3 days
2014-11-12 16:23:56 +00:00
mjg
5c258fe4b9 Fix up some session-related races in devfs.
One was introduced with r272596, the rest was there to begin with.

Noted by: jhb
2014-11-03 03:12:15 +00:00
trasz
63b5e04c28 Fix handling of "conn" mount_nfs(8) option.
Reviewed by:	rmacklem@
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2014-10-30 09:25:03 +00:00
trasz
d140ecd680 Add support for "timeo", "actimeo", "noac", and "proto" options
to mount_nfs(8).  They are implemented on Linux, OS X, and Solaris,
and thus can be expected to appear in automounter maps.

Reviewed by:	rmacklem@
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2014-10-30 08:50:01 +00:00
kib
99d20fccef Allow the vfs.nfsd knobs to be set from loader.conf (or using
kenv(8)).  This is useful when nfsd is loaded as module.

Reviewed by:	rmacklem
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-10-27 07:47:13 +00:00
rmacklem
8695c1d285 Clip the settings for the NFS rsize, wsize mount options
to a power of 2. For non-power of 2 settings, intermittent
page faults have been reported. Although the bug that causes
these page faults/crashes has not been identified, it does
not appear to occur when rsize, wsize is a power of 2.

Reported by:	tcberner@gmail.com
MFC after:	2 weeks
2014-10-22 22:27:51 +00:00
rmacklem
d26983570d Revert r273481 so it can be recoded using fls(), which
some feel will make it more readable.
2014-10-22 21:57:35 +00:00
rmacklem
5777570c4b Clip the settings for the NFS rsize, wsize mount options
to a power of 2. For non-power of 2 settings, intermittent
page faults have been reported. Although the bug that causes
these page faults/crashes has not been identified, it does
not appear to occur when rsize, wsize is a power of 2.

Reported by:	tcberner@gmail.com
MFC after:	2 weeks
2014-10-22 20:47:11 +00:00
mjg
b8143f5206 tmpfs: allow shared file lookups
Tested by: pho
2014-10-21 21:27:13 +00:00
hselasky
49c137f7be Fix multiple incorrect SYSCTL arguments in the kernel:
- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, hence
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-21 07:31:21 +00:00
mjg
12e0034dd0 Provide vfs suspension support only for filesystems which need it, take
two.

nullfs and unionfs need to request suspension if underlying filesystem(s)
use it. Utilize mnt_kern_flag for this purpose.

This is a fixup for 273271.

No strong objections from: kib
Pointy hat to: mjg
MFC after:	2 weeks
2014-10-20 18:00:50 +00:00
mjg
d6c735be77 unionfs: hold mount interlock while manipulating mnt_flag
This is for consistency with other filesystems.
2014-10-20 17:53:49 +00:00
mjg
65ead9d18a Provide vfs suspension support only for filesystems which need it.
Need is expressed by providing vfs_susp_clean function in vfsops.

Differential Revision:	D952
Reviewed by:	kib (previous version)
MFC after:	2 weeks
2014-10-19 06:59:33 +00:00
trasz
dd166aeaa3 Remove useless debug.
Sponsored by:	The FreeBSD Foundation
2014-10-17 12:06:48 +00:00
araujo
33df97f342 Make the sysctl(8) for checkutf8 positively defined and improve
the description of it.

Submitted by:	Ronald Klop <ronald-lists@klop.ws>
Reviewed by:	rmacklem
Approved by:	rmacklem
Sponsored by:	QNAP Systems Inc.
2014-10-17 02:11:09 +00:00
davide
e88bd26b3f Follow up to r225617. In order to maximize the re-usability of kernel code
in userland rename in-kernel getenv()/setenv() to kern_setenv()/kern_getenv().
This fixes a namespace collision with libc symbols.

Submitted by:   kmacy
Tested by:      make universe
2014-10-16 18:04:43 +00:00
araujo
bfe82e3333 Add two sysctl(8) to enable/disable NFSv4 server to check when setting
user nobody and/or setting group nogroup as owner of a file or directory.
Usually at the client side, if there is an username that is not in the
client's passwd database, some clients will send 'nobody@<your.dns.domain>'
in the wire and the NFSv4 server will treat it as an ERROR.
However, if you have a valid user nobody in your passwd database,
the NFSv4 server will treat it as a NFSERR_BADOWNER as its believes the
client doesn't has the username mapped.

Submitted by:	Loic Blot <loic.blot@unix-experience.fr>
Reviewed by:	rmacklem
Approved by:	rmacklem
MFC after:	2 weeks
2014-10-16 02:24:19 +00:00
kib
4032517985 Style changes for deadfs:
- ANSIfy VOPs.
- Remove trivial comments.
- Remove ARGSUSED.
- Remove copies of the vop_XXX_args structure definitions in comments.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-10-15 13:22:33 +00:00
kib
6b880aa521 When vnode bypass cannot be performed on the cdev file descriptor for
read/write/poll/ioctl, call standard vnode filedescriptor fop.  This
restores the special handling for terminals by calling the deadfs VOP,
instead of always returning ENXIO for destroyed devices or revoked
terminals.

Since destroyed (and not revoked) device would use devfs_specops VOP
vector, make dead_read/write/poll non-static and fill VOP table with
pointers to the functions, to instead of VOP_PANIC.

Noted and reviewed by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-10-15 13:16:51 +00:00
kib
66572524ce Change the deadfs poll VOP to return POLLIN|POLLRDNORM if the caller
is interested in i/o state.  Return POLLNVAL for invalid bits, similar
to poll_no_poll().  Note that POLLOUT must not be returned, since
POLLHUP is set.

Noted and reviewed by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-10-15 13:08:53 +00:00
trasz
1687a6636a Make automountd(8) inform autofs(4) whether directory being handled can
have wildcards.  This makes it possible for autofs(4) to avoid requesting
automountd(8) action on access to nonexistent nodes - unless wildcards
are actually used.

Note that this change breaks ABI for automountd(8).

Tested by:	dhw@
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2014-10-15 09:28:45 +00:00
kib
1b90da9ab8 Do not set IN_ACCESS flag for read-only mounts. The IN_ACCESS
survives remount in rw, also it is set for vnodes on rootfs before
noatime can be set or clock is adjusted.  All conditions result in
wrong atime for accessed vnodes.

Submitted by:	bde
MFC after:	1 week
2014-10-11 19:09:56 +00:00
trasz
b24bb0c435 Add assertion to catch duplicated notes.
Sponsored by:	The FreeBSD Foundation
2014-10-11 05:11:23 +00:00
trasz
3da2b907c2 Remove remnants of some cleanup; no functional changes.
Sponsored by:	The FreeBSD Foundation
2014-10-09 18:49:58 +00:00
trasz
ce66e045bf Simplify; no functional changes.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2014-10-08 09:44:02 +00:00
mjg
b14804421c devfs: tidy up after 272596
This moves a var to an if statement, no functional changes.

MFC after:	1 week
2014-10-06 07:22:48 +00:00
mjg
59a46e800a devfs: don't take proctree_lock unconditionally in devfs_close
MFC after:	1 week
2014-10-06 06:20:35 +00:00
trasz
1db3c5c0a6 Make autofs use shared vnode locks.
Reviewed by:	kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2014-10-04 09:37:40 +00:00
trasz
04f680ae6f Fix autofs debug macros.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2014-10-03 10:18:22 +00:00
trasz
cd1d9d476d Make autofs(4) use shared lock for lookups, instead of exclusive one.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2014-10-03 09:58:05 +00:00
araujo
4100d9b0a6 Fix failures and warnings reported by newpynfs20090424 test tool.
This fix addresses only issues with the pynfs reports, none of these
issues are know to create problems for extant real clients.

Submitted by:	Bart Hsiao <bart.hsiao@gmail.com>
Reworked by:	myself
Reviewed by:	rmacklem
Approved by:	rmacklem
Sponsored by:	QNAP Systems Inc.
2014-10-03 02:24:41 +00:00
trasz
d62de4d36d Call uma_zfree() outside of lock, and improve comment.
Sponsored by:	The FreeBSD Foundation
2014-10-02 10:37:56 +00:00
trasz
f820ba5865 Make autofs timeout handling use timeout task instead of callout;
that's because the handler can sleep on sx lock.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2014-10-02 10:31:32 +00:00
trasz
49648b28cf Fix thinko that, with two map entries like shown below, in that order,
made autofs mix them up: the second one wasn't visible in ls(1) output,
and trying to access it would trigger mount for the first one.

foobar		host:/foobar
foo		host:/foo

MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2014-09-23 11:27:43 +00:00
jhb
8f082668d0 Add a new fo_fill_kinfo fileops method to add type-specific information to
struct kinfo_file.
- Move the various fill_*_info() methods out of kern_descrip.c and into the
  various file type implementations.
- Rework the support for kinfo_ofile to generate a suitable kinfo_file object
  for each file and then convert that to a kinfo_ofile structure rather than
  keeping a second, different set of code that directly manipulates
  type-specific file information.
- Remove the shm_path() and ksem_info() layering violations.

Differential Revision:	https://reviews.freebsd.org/D775
Reviewed by:	kib, glebius (earlier version)
2014-09-22 16:20:47 +00:00
trasz
3f03c07734 Turns out -1 is a perfectly valid error number, ERESTART. Remove useless
code written under assumption that it wasn't.

Sponsored by:	The FreeBSD Foundation
2014-09-21 10:34:15 +00:00
trasz
bd9494937c Fix typos.
Sponsored by:	The FreeBSD Foundation
2014-09-18 10:33:23 +00:00
kib
ebd8a253bb Provide the unique implementation for the VOP_GETPAGES() method used
by ffs and ext2fs.  Remove duplicated call to vm_page_zero_invalid(),
done by VOP and by vm_pager_getpages().  Use vm_pager_free_nonreq().

Reviewed by:	alc (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	6 weeks (after r271596)
2014-09-15 12:28:29 +00:00
alc
810772ca9c Avoid an exclusive acquisition of the object lock on the expected execution
path through the NFS clients' getpages functions.

Introduce vm_pager_free_nonreq().  This function can be used to eliminate
code that is duplicated in many getpages functions.  Also, in contrast to
the code that currently appears in those getpages functions,
vm_pager_free_nonreq() avoids acquiring an exclusive object lock in one
case.

Reviewed by:	kib
MFC after:	6 weeks
Sponsored by:	EMC / Isilon Storage Division
2014-09-14 18:07:55 +00:00
alc
e169525bb8 We don't need an exclusive object lock on the expected execution path
through {ext2,ffs}_getpages().

Reviewed by:	kib, pfg
MFC after:	6 weeks
Sponsored by:	EMC / Isilon Storage Division
2014-09-13 18:26:13 +00:00
pfg
84922f7368 Extra space from r271467.
MFC after:	2 months
2014-09-12 15:54:18 +00:00
pfg
0ad1313a29 ext2fs: add ext2_getpages().
Literally copy/pasted from ffs_getpages().

Tested with:	fsx
MFC after:	2 months
2014-09-12 15:49:21 +00:00
glebius
5939c729a8 Remove unused arguments for VOP_GETPAGES(), VOP_PUTPAGES(). 2014-09-10 12:36:41 +00:00
rwatson
a4fcb94b32 Garbage collect NFSMINOFF() from the NFS stack; this unused macro replicates
mbuf-initialisation logic that is best left to centralised mbuf utility
code rather than scattered around the kernel.

MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2014-09-05 17:05:51 +00:00
trasz
bf5f21dde4 Fix bug that, assuming a/ is a root of NFS filesystem mounted on autofs,
prevented "mv a/from a/to" from working, while "cd a && mv from to" was ok.

PR:		192948
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2014-08-24 17:03:52 +00:00
trasz
347e2b6e44 Autofs softc needs to be global anyway, so don't pass it as a local
variable, and don't store in autofs_mount.  Also rename it from 'sc'
to 'autofs_softc', since it's global and extern.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2014-08-23 11:45:14 +00:00
trasz
bd42c0dce6 Add comment explaining one of the quirks in autofs.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2014-08-23 11:38:31 +00:00
trasz
d946778919 Fix includes.
Suggested by:	pluknet@
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2014-08-21 15:59:25 +00:00
trasz
a07b17da6a Use __FBSDID() properly.
Suggested by:	pluknet@
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2014-08-21 15:07:25 +00:00
trasz
90fa877deb Rework ".." lookup; previous one failed to properly busy the mountpoint.
Reviewed by:	kib@
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2014-08-20 13:46:51 +00:00
trasz
cac9beab7d Bring in the new automounter, similar to what's provided in most other
UNIX systems, eg. MacOS X and Solaris.  It uses Sun-compatible map format,
has proper kernel support, and LDAP integration.

There are still a few outstanding problems; they will be fixed shortly.

Reviewed by:	allanjude@, emaste@, kib@, wblock@ (earlier versions)
Phabric:	D523
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	The FreeBSD Foundation
2014-08-17 09:44:42 +00:00
rmacklem
cb34fe67aa Change the NFS server's printf related to hitting
the DRC cache's flood level so that it suggests
increasing vfs.nfsd.tcphighwater.

Suggested by:	h.schmalzbauer@omnilan.de
2014-08-10 01:13:32 +00:00
kib
34def2183a VOP_LOOKUP() may relock the directory vnode for some reasons. Since
nullfs vnode shares vnode lock with lower vnode, this allows the
reclamation of nullfs directory vnode in null_lookup().  In this
situation, VOP must return ENOENT.

More, since after the reclamation, the locks of nullfs directory vnode
and lower vnode are no longer shared, the relock of the ldvp does not
restore the correct locking state of dvp, and leaks ldvp lock.
Correct this by unlocking ldvp and locking dvp.

Use cached value of dvp->v_mount.

Reported by:	bdrewery
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-08-08 11:39:05 +00:00
pfg
4600198022 Revert r269523:
Providing a higher EXT2_LINK_MAX limit is a bad idea for ext2/3.

Discussed with:	bde
2014-08-05 01:25:14 +00:00
pfg
83f0bdc908 set EXT2_LINK_MAX to LINK_MAX
In linux EXT4_LINK_MAX is now 64000.  We can't really do that
since i_nlink and va_nlink are signed so setting higher values
is likely to cause trouble.

This is a system limitation so set the EXT_LINK_MAX to
what the system can handle.

MFC after:	3 days
2014-08-04 16:41:06 +00:00
imp
c7ddb2dba0 Set the erase block size properly in the case the underlying media
doesn't advertise an erase block size.

Submitted by: bjg@
Pointy hat to: imp@
2014-08-02 05:05:16 +00:00
imp
30d213fa5d Follow the ufs practice for disallowing permission changes as well as
writes to files for read-only file systems. Since there are already
checks in nandfs_setattr that return an error, this moves detection of
the error earlier.
2014-08-02 05:05:10 +00:00
imp
57758616d2 Fix a minor style(9) issue. 2014-08-02 05:05:05 +00:00
kib
b8112840fb Do not generate 1000 unique lock names for nfsrc hash chain locks.
It overflows witness.

Shorten the names of some nfs mutexes.

Reported and tested by:	pho
No objections from:	rmacklem, mav
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-07-31 19:24:44 +00:00
kib
617cdeea47 Assert that nullfs vnode has VV_ROOT set whenever lower vnode has.
Assert that dotdot lookup on the root vnode is not performed.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-07-28 14:20:31 +00:00
kib
61d1c715a6 Fix typo.
MFC after:	3 days
2014-07-24 23:14:03 +00:00
imp
3393ae3fe1 Fix typo in comment: noone -> no one.
Fix minor style(9) nits.
2014-07-23 16:18:51 +00:00
kib
4a70d74f6e Do not ignore error from tmpfs_alloc_vp(). It results in access to
the random memory.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-16 14:08:01 +00:00
kib
8708f6a5a2 Remove unused header.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-16 14:06:16 +00:00
kib
32a7383c85 Check for the cross-device cross-link attempt in the VFS, instead of
forcing filesystem VOP_LINK() methods to repeat the code.  In
tmpfs_link(), remove redundand check for the type of the source,
already done by VFS.

Note that NFS server already performs this check before calling
VOP_LINK().

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-16 14:04:46 +00:00
kib
7eb2d27b75 Rework the tmpfs unmount.
- Suspend filesystem for unmount.  This prevents new tmpfs nodes from
  instantiating, and also ensures that only unmount thread can destroy
  nodes.

- Do not start tmpfs node deletion until all vnodes are reclaimed,
  which guarantees that no thread can access tmpfs data.  For this,
  call vflush() in the loop, until the mnt_nvnodelistsize is non-zero.
  Note that after mnt_nvnodelistsize becomes 0, insmntque() blocks
  insertion of a vnode germ into the mount list of vnodes.

- Fail node allocation when the filesystem is being unmounted.  This
  is race-free due to the vflush() call in loop.  This is mostly
  cosmetic, avoiding some more work which might be done until
  suspension in unmount is started.

Note that there is currently no way to prevent new vnode instantiation
from readers during the unmount.  Due to this, forced unmount might
live-lock if vflush() loop cannot get to the zero vnode count due to
races with readers.  The unmount would proceed after the load is
lifted.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:52:33 +00:00
kib
2488a7f8be Change forgotten in r268615. Set the OBJ_TMPFS_NODE flag for
vm_object of VREG tmpfs node.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:35:14 +00:00
kib
e152d89652 Use tmpfs_vn_get_ino_gen() to handle the races with reclaim in tmpfs
dotdot lookup.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:16:55 +00:00
kib
1ae17f77a1 Style. Add comment about lock mode.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:13:56 +00:00
kib
a3c7856b26 In tmpfs_alloc_file(), code after the 'out' label does only 'return
error;'.  Replace goto's with the return.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:02:40 +00:00
kib
2aa8688209 Add convenience macro to assert tmpfs node lock.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:59:25 +00:00
kib
7dd9ab980a Add some assertions for the code handling vm_object for tmpfs vnode.
In particular, vnode must be exclusively locked when the tmpfs vnode
and object are divorced.  When the vnode is opened, the object must be
still alive, since only live vnode can be opened, and the tmpfs node
owns a reference on the object.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:55:02 +00:00
kib
8542c8b735 The tmpfs_link() must not dereference the filesystem-specific data for
a vnode until it is verified that the vnode indeed belongs to tmpfs
mount.  Otherwise, it might access random memory, at least in the
debug kernel.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:45:29 +00:00
kib
2f41c9023e Generalize vn_get_ino() to allow filesystems to use custom vnode
producer, instead of hard-coding VFS_VGET().  New function, which
takes callback, is called vn_get_ino_gen(), standard callback for
vn_get_ino() is provided.

Convert inline copies of vn_get_ino() in msdosfs and cd9660 into the
uses of vn_get_ino_gen().

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:34:54 +00:00
kib
d84e7ad50c Remove code separator lines which do not conform to style(9).
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:17:11 +00:00
imp
bffc4efe0b Naughty NANDFS was using hidden unused flag, hiding the fact that the
flag was used and wasn't really available. Change the name without
fixing any laying issues that might be present in NANDFS' use of this
flag.
2014-07-07 23:21:07 +00:00
rmacklem
210206ed5b The new NFSv3 server did not generate directory postop attributes for
the reply to ReaddirPlus when the server failed within the loop
that calls VFS_VGET(). This failure is most likely an error
return from VFS_VGET() caused by a bogus d_fileno that was
truncated to 32bits.
This patch fixes the server so that it will return directory postop
attributes for the failure. It does not fix the underlying issue caused
by d_fileno being uint32_t when a file system like ZFS generates
a fileno that is greater than 32bits.

Reported by:	jpaetzel
Reviewed by:	jpaetzel
MFC after:	1 month
2014-07-04 22:47:07 +00:00
rmacklem
8c4948bf02 Merge the NFSv4.1 server code in projects/nfsv4.1-server over
into head. The code is not believed to have any effect
on the semantics of non-NFSv4.1 server behaviour.
It is a rather large merge, but I am hoping that there will
not be any regressions for the NFS server.

MFC after:	1 month
2014-07-01 20:47:16 +00:00
bdrewery
be0e28fbe8 Change NFS readdir() to only ignore cookies preceding the given offset for
UFS rather than for all but ZFS.  This code was assuming that offsets were
monotonically increasing for all file systems except ZFS and that the
cookies from a previous call may have been rewound to a block boundary.
According to mckusick@ only UFS is known to do this, so only requests against
UFS file systems should remove cookies smaller than the given offset.  This
fixes serving TMPFS over NFS as it too does not have monotonically increasing
offsets.  The comment around the code also indicated it was specific to UFS.

Some of the code using 'not_zfs' is specific to ZFS snapshot handling, so
add a 'is_zfs' variable for those cases.

It's possible that 'is_zfs' check for VFS_VGET() support may not be
specific to ZFS.  This needs more research and testing.

After this fix TMPFS and other file systems can be served over NFS.

To test I compared the results of syncing a /usr/src tree into a tmpfs and
serving that over NFS.  Before the fix 3589 files were missing on the remote
view.  After the fix all files were successfully found.

Reviewed by:	rmacklem
Discussed with:	mckusick, rmacklem via fs@
Discussed at:	http://lists.freebsd.org/pipermail/freebsd-fs/2014-April/019264.html
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2014-07-01 20:00:35 +00:00
rmacklem
13b85f8306 There might be a potential race condition for the NFSv4 client
when a newly created file has another open done on it that
update the open mode. This patch moves the code that updates
the open mode up into the block where the mutex is held to
ensure this cannot happen. No bug caused by this potential
race has been observed, but this fix is a safety belt to ensure
it cannot happen.

MFC after:	2 weeks
2014-06-28 21:47:15 +00:00
hselasky
6ed453b3ad Use existing PHOLD() and PRELE() macros.
Submitted by:	kib @
2014-06-24 18:25:43 +00:00
kib
f5cc3af6c8 In msdosfs_setattr(), add a check for result of the utimes(2)
permissions test, forgotten in r164033.

Refactor the permission checks for utimes(2) into vnode helper
function vn_utimes_perm(9), and simplify its code comparing with the
UFS origin, by writing the call to VOP_ACCESSX only once.  Use the
helper for UFS(5), tmpfs(5), devfs(5) and msdosfs(5).

Reported by:	bde
Reviewed by:	bde, trasz
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-06-17 07:11:00 +00:00
rmacklem
579917b494 The new NFS server would not allow a hard link to be
created to a symlink. This restriction (which was
inherited from OpenBSD) is not required by the NFS RFCs.
Since this is allowed by the old NFS server, it is a
POLA violation to not allow it. This patch modifies the
new NFS server to allow this.

Reported by:	jhb
Reviewed by:	jhb
MFC after:	3 days
2014-06-06 21:38:49 +00:00
kib
48aedb6ddc Allow shared locking for the tmpfs vnodes.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-06-04 15:30:49 +00:00
hselasky
672c28ec68 Initial import of character device in userspace support for FreeBSD.
The CUSE library is a wrapper for the devfs kernel functionality which
is exposed through /dev/cuse . In order to function the CUSE kernel
code must either be enabled in the kernel configuration file or loaded
separately as a module. Currently none of the committed items are
connected to the default builds, except for installing the needed
header files. The CUSE code will be connected to the default world and
kernel builds in a follow-up commit.

The CUSE module was written by Hans Petter Selasky, somewhat inspired
by similar functionality found in FUSE. The CUSE library can be used
for many purposes. Currently CUSE is used when running Linux kernel
drivers in user-space, which need to create a character device node to
communicate with its applications. CUSE has full support for almost
all devfs functionality found in the kernel:
 - kevents
 - read
 - write
 - ioctl
 - poll
 - open
 - close
 - mmap
 - private per file handle data

Requested by several people. Also see "multimedia/cuse4bsd-kmod" in
ports.
2014-05-23 08:46:28 +00:00
kib
42829c1795 After r254627, the deupdate() started writing the directory entries to
disk.  That has a side effect of corrupting the "." entries names on
rename, since the call to createde() in the msdosfs_rename() sets the
de_Name to the target name.  If any change to the directory attributes
is performed, the wrong name is written back to the on-disk direntry
on update.

Overwrite the de_Name for the directories on rename to correct the dot
name.

Submitted by:	bde
MFC after:	1 week
2014-05-03 16:11:55 +00:00
rmacklem
2c816214c0 The new draft specification for NFSv4.0 specifies that a server
should either accept owner and owner_group strings that are just
the digits of the uid/gid or return NFS4ERR_BADOWNER.
This patch adds a sysctl vfs.nfsd.enable_stringtouid, which can
be set to enable the server w.r.t. accepting numeric string. It
also ensures that NFS4ERR_BADOWNER is returned if numeric uid/gid
strings are not enabled. This fixes the server for recent Linux
nfs4 clients that use numeric uid/gid strings by default.

Reported and tested by:	craigyk@gmail.com
MFC after:	2 weeks
2014-05-03 00:13:45 +00:00
mjg
7ec4134d19 Ignore the error from pipespace_new when creating a pipe.
It can fail if pipe map is exhausted (as a result of too many pipes created),
but it is not fatal and could be provoked by unprivileged users. The only
consequence is worse performance with given pipe.

Reported by:	ivoras
Suggested by:	kib
MFC after:	1 week
2014-05-02 00:52:13 +00:00
rmacklem
285b96444f The PR reported that the old NFS server did not set uio_td == NULL
for the VOP_READ() call. This patch fixes both the old and new
server for this case.

PR:		185232
Submitted by:	PR had patch for old server
Reviewed by:	kib
MFC after:	2 weeks
2014-04-24 20:47:58 +00:00
rmacklem
a993ac3816 Remove an unnecessary level of indirection for an argument.
This simplifies the code and should avoid the clang sparc
port from generating an abort() call.

Requested by:	rdivacky
Submitted by:	jhb
MFC after:	2 weeks
2014-04-23 23:13:46 +00:00
rmacklem
73ba58db38 Modify the NFSv4 client's Pathconf RPC (actually a Getattr Op.)
so that it only does the RPC for names that are answered by the RPC.
Doing the RPC for other names is harmless, but unnecessary.

MFC after:	2 weeks
2014-04-23 22:13:10 +00:00
rmacklem
9ce3bb474c Fixes mkdir for the NFSv2 client that was broken by r264705.
Reported by:	bdrewery
MFC after:	2 weeks
2014-04-22 04:42:46 +00:00
rmacklem
36db983e6f For an NFSv4 mount with the "nocto" option, don't get the
up to date file attributes upon close. This reduces the
Getattr RPC count by about 65% for software builds.

MFC after:	2 weeks
2014-04-21 19:10:23 +00:00
rmacklem
4dfafb6abe Modify the NFSv4 client create/mkdir RPC so that it acquires
post-create/mkdir directory attributes. This allows the RPC to
name cache the newly created directory and reduces the lookup RPC
count for applications creating a lot of directories.

MFC after:	2 weeks
2014-04-20 22:19:00 +00:00
rmacklem
f068ab6909 Modify the NFSv4 client open/create RPC so that it acquires
post-open/create directory attributes. This allows the RPC to
name cache the newly created file and reduces the lookup RPC
count by about 10% for software builds.

MFC after:	2 weeks
2014-04-19 19:40:20 +00:00
rmacklem
f55eac8e60 Modify the Lookup RPC for NFSv4 so that it acquires directory
attributes. This allows the client to cache directory names
when they are looked up, reducing the Lookup RPC count by
about 40% for software builds.

MFC after:	2 weeks
2014-04-18 22:05:34 +00:00
imp
af1f03cbf4 Take out the hack to write -1's to non-NAND. Always do a BIO_DELETE on
the ranges we want to erase. This is nicer to SSDs that want TRIMs
anyway.
2014-04-18 17:03:43 +00:00
imp
dc1d630338 More properly account for free/reserved segments to avoid deadlock or
worse when filling up a device and then trying to erase files to make
space. Without enough space, you can't do that. Also, ensure that the
metadata writes don't generate ENOSPC. They will be retried later
since the buffers are still dirty...

Submitted by: mjg@
2014-04-18 17:03:35 +00:00
ae
b95b3ecef4 Use SMB_QUERY_FS_SIZE_INFO request to populate statfs structure.
When server doesn't support this request, try to use SMB_INFO_ALLOCATION.
And use SMB_COM_QUERY_INFORMATION_DISK request as fallback.

MFC after:	2 weeks
2014-04-15 09:10:01 +00:00
delphij
26c4b55c2e Fix NFS deadlock vulnerability. [SA-14:05]
Fix "Heartbleed" vulnerability and ECDSA Cache Side-channel
Attack in OpenSSL. [SA-14:06]
2014-04-08 18:27:32 +00:00
bdrewery
6fcf6199a4 Rename global cnt to vm_cnt to avoid shadowing.
To reduce the diff struct pcu.cnt field was not renamed, so
PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in
kvm(3) and vmstat(8). The goal was to not affect externally used KPI.

Bump __FreeBSD_version_ in case some out-of-tree module/code relies on the
the global cnt variable.

Exp-run revealed no ports using it directly.

No objection from:	arch@
Sponsored by:	EMC / Isilon Storage Division
2014-03-22 10:26:09 +00:00
pfg
b7c5f322bc Revert r263449;
ext2fs: minor update to the dirpref policy.

The change in UFS r254996, reverted the change as the
older code seems to work better. This was not visible
in local testing but we can trust UFS is vastly more
exercised in diferent environments.
2014-03-21 04:33:38 +00:00
pfg
81396b10ba ext2fs: minor update to the dirpref policy.
Bring in a minor change to the dirpref policy based on r248623.

This is pretty minimal change to keep the implementation in
sync with UFS but other parts from the original change are not
directly applicable so don't expect improvements in fsck times.

MFC after:	2 weeks
2014-03-20 21:19:13 +00:00
pfg
0b3f2d5e48 msdosfs: minor format fix - spaces vs tab
MFC after:	3 days
2014-03-20 20:14:04 +00:00
rwatson
33fdc14c0c Update kernel inclusions of capability.h to use capsicum.h instead; some
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.

MFC after:	3 weeks
2014-03-16 10:55:57 +00:00
bdrewery
996c2ffe42 Add missing FALLTHROUGH comment in tmpfs_dir_getdents for looking up '.' and
'..'.

Reviewed by:	Russell Cattelan
Sponsored by:	EMC / Isilon Storage Division
MFC after:	2 weeks
2014-03-14 13:58:02 +00:00