in the requested array, then it is responsible for disposition of previous
page and is responsible for updating the entry in the requested array.
Now consumers of the KPI do not need to re-lookup the pages after a call to
vm_pager_get_pages().
Reviewed by: kib
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
Use the same scheme implemented to manage credentials.
Code needing to look at the process's credentials (as opposed to the thread's) is
provided with *_proc variants of relevant functions.
Places which possibly had to take the proc lock anyway still use the proc
pointer to access limits.
- MNTK_SUSPENDABLE is set in mnt_kern_flag, not mnt_flag.
- The lower layer of a unionfs mount is read-only, so the mount should
be suspendable iff the upper layer is suspendable.
- Remove a couple of superfluous comments.
Differential Revision: https://reviews.freebsd.org/D2714
Reviewed by: kib, mjg
logic is now placed in the mmap hook implementation rather than requiring
it to be placed in sys/vm/vm_mmap.c. This hook allows new file types to
support mmap() as well as potentially allowing mmap() for existing file
types that do not currently support any mapping.
The vm_mmap() function is now split up into two functions. A new
vm_mmap_object() function handles the "back half" of vm_mmap() and accepts
a referenced VM object to map rather than a (handle, handle_type) tuple.
vm_mmap() is now reduced to converting a (handle, handle_type) tuple to a
VM object and then calling vm_mmap_object() to handle the actual mapping.
The vm_mmap() function remains for use by other parts of the kernel
(e.g. device drivers and exec) but now only supports mapping vnodes,
character devices, and anonymous memory.
The mmap() system call invokes vm_mmap_object() directly with a NULL object
for anonymous mappings. For mappings using a file descriptor, the
descriptor's fo_mmap() hook is invoked instead. The fo_mmap() hook is
responsible for performing type-specific checks and adjustments to
arguments as well as possibly modifying mapping parameters such as flags
or the object offset. The fo_mmap() hook routines then call
vm_mmap_object() to handle the actual mapping.
The fo_mmap() hook is optional. If it is not set, then fo_mmap() will
fail with ENODEV. A fo_mmap() hook is implemented for regular files,
character devices, and shared memory objects (created via shm_open()).
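The optional-hook behaviour described above can be sketched in userland C
as follows. This is only an illustration, not the kernel code:
struct sketch_fileops, sketch_fo_mmap() and sketch_shm_mmap() are
hypothetical names, and the only detail taken from the description is
that a missing hook results in ENODEV.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-file-type operations table. */
struct sketch_fileops {
	/* Optional hook; NULL means this file type cannot be mapped. */
	int (*fo_mmap)(size_t len, int prot);
};

/* Dispatch wrapper: fail with ENODEV when no hook is installed. */
static int
sketch_fo_mmap(const struct sketch_fileops *ops, size_t len, int prot)
{
	assert(ops != NULL);
	if (ops->fo_mmap == NULL)
		return (ENODEV);
	return (ops->fo_mmap(len, prot));
}

/* Hypothetical hook for a type that supports mapping; always succeeds. */
static int
sketch_shm_mmap(size_t len, int prot)
{
	(void)len;
	(void)prot;
	return (0);
}
```

A table initialised with { NULL } makes sketch_fo_mmap() return ENODEV,
while one initialised with { sketch_shm_mmap } reaches the hook.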
While here, consistently use the VM_PROT_* constants for the vm_prot_t
type for the 'prot' variable passed to vm_mmap() and vm_mmap_object()
as well as the vm_mmap_vnode() and vm_mmap_cdev() helper routines.
Previously some places were using the mmap()-specific PROT_* constants
instead. While this happens to work because PROT_xx == VM_PROT_xx,
using VM_PROT_* is more correct.
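As a rough userland illustration of why relying on PROT_xx == VM_PROT_xx
is fragile, the sketch below states the equality as a compile-time check
and converts explicitly. The SK_VM_PROT_* macros are local stand-ins
defined here for illustration, and sketch_prot_to_vm_prot() is a
hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <sys/mman.h>

/* Local stand-ins for the kernel's protection bits (illustrative). */
#define SK_VM_PROT_NONE		0x00
#define SK_VM_PROT_READ		0x01
#define SK_VM_PROT_WRITE	0x02
#define SK_VM_PROT_EXECUTE	0x04

/* Turn the "happens to work" assumption into a build-time check. */
_Static_assert(PROT_READ == SK_VM_PROT_READ, "PROT_READ differs");
_Static_assert(PROT_WRITE == SK_VM_PROT_WRITE, "PROT_WRITE differs");
_Static_assert(PROT_EXEC == SK_VM_PROT_EXECUTE, "PROT_EXEC differs");

/* Explicit conversion that stays correct even if the values diverge. */
static int
sketch_prot_to_vm_prot(int prot)
{
	int vmprot = SK_VM_PROT_NONE;

	assert((prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC)) == 0);
	if (prot & PROT_READ)
		vmprot |= SK_VM_PROT_READ;
	if (prot & PROT_WRITE)
		vmprot |= SK_VM_PROT_WRITE;
	if (prot & PROT_EXEC)
		vmprot |= SK_VM_PROT_EXECUTE;
	return (vmprot);
}
```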
Differential Revision: https://reviews.freebsd.org/D2658
Reviewed by: alc (glanced over), kib
MFC after: 1 month
Sponsored by: Chelsio
When providing memory map information to userland, populate the vnode pointer
for tmpfs files. Set the memory mapping to appear as a vnode type, to match
FreeBSD 9 behavior.
This fixes the use of tmpfs files with the dtrace pid provider,
procstat -v, procfs, linprocfs, pmc (pmcstat), and ptrace (PT_VM_ENTRY).
Submitted by: Eric Badger <eric@badgerio.us> (initial revision)
Obtained from: Dell Inc.
PR: 198431
MFC after: 2 weeks
Reviewed by: jhb
Approved by: kib (mentor)
Merge the filesystem specific part from r274914 to ext2fs.
I only did regular testing with the change but UFS and our ext2fs
are similar enough that the code should just work with the new
sendfile.
Discussed with: glebius
No appreciable change in performance was observed after increasing
the sizes of these tables and then testing with a single client.
However, there was an email that indicated high CPU overheads for
a heavily loaded NFSv4 and it is hoped that increasing the sizes
of the hash tables via these tunables might help.
The tables remain the same size by default.
Differential Revision: https://reviews.freebsd.org/D2596
MFC after: 2 weeks
limits in the code which is deep in the call stack, and owns several
critical system resources, like vnode locks. Attempt to wait while
the per-mount softupdate thread cleans up the backlog may deadlock,
because the thread might need to lock the same vnode which is owned by
the waiting thread.
Instead of synchronously waiting for the worker, perform the worker's
tickle and pause until the backlog is cleaned, at the safe point
during return from kernel to usermode. A new ast request to call
softdep_ast_cleanup() is created, the SU code now only checks the size
of queue and schedules ast.
There is no ast delivery for the kernel threads, so they are exempted
from the mechanism, except NFS daemon threads. NFS server loop
explicitly checks for the request, and informs the schedule_cleanup()
that it is capable of handling the requests by the process P2_AST_SU
flag. This is needed because nfsd may be the sole cause of the SU
workqueue overflow. But, to not cause nfsd to spawn additional
threads just because we slow down existing workers, only tickle su
threads, without waiting for the backlog cleanup.
Reviewed by: jhb, mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
so that it would not return less data than requested.
Since returning less directory data than requested is not a problem
for FreeBSD and even UFS no longer returns directory structures
with d_fileno == 0, this patch stops the client from doing this.
Although entries with d_fileno == 0 should not be a problem,
the man pages no longer document that these entries should be
ignored, so there was a concern that these entries might be an
issue in the future.
Suggested by: trasz
Tested by: trasz
MFC after: 2 weeks
years for head. However, it is continuously misused as the mpsafe argument
for callout_init(9). Deprecate the flag and clean up callout_init() calls
to make them more consistent.
Differential Revision: https://reviews.freebsd.org/D2613
Reviewed by: jhb
MFC after: 2 weeks
tracing. This matches the behavior of ptrace(PT_ATTACH). Also,
the procfs detach request assumes p_oppid is always set.
Reviewed by: kib
MFC after: 2 weeks
that are not an exact multiple of DIRBLKSIZ correctly. Fortunately
readdir(3) always uses an exact multiple of DIRBLKSIZ, so few applications
were affected. This patch fixes this problem by reducing the size
of the directory read to an exact multiple of DIRBLKSIZ.
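The size reduction can be pictured with the small sketch below. It is an
assumption-laden illustration: SK_DIRBLKSIZ is a local stand-in with an
illustrative value of 512, and sketch_trim_to_dirblk() is a hypothetical
helper rather than the code in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative block size; the real DIRBLKSIZ is not restated here. */
#define SK_DIRBLKSIZ	512

_Static_assert(SK_DIRBLKSIZ > 0, "block size must be positive");

/* Round a requested read length down to a whole number of blocks. */
static size_t
sketch_trim_to_dirblk(size_t len)
{
	return (len - (len % SK_DIRBLKSIZ));
}
```

A request shorter than one block trims to zero in this sketch; how the
caller treats that case is outside its scope.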
Tested by: trasz
Reported by: trasz
Reviewed by: trasz
MFC after: 2 weeks
Present implementation of large sync writes is too strict and so can be
quite slow. Instead of doing that, execute large async write in chunks,
syncing each chunk separately.
It would be good to fix large sync writes too, but I leave it to somebody
with more skills in this area.
Reviewed by: rmacklem
MFC after: 1 week
largest size for a buffer in the buffer cache. This patch
defines a new constant MAXBCACHEBUF, which is the largest
size for a buffer in the buffer cache. Having a separate
constant allows MAXBCACHEBUF to be set larger than MAXBSIZE
on a per-architecture basis, so that NFS can do larger read/writes
for these architectures. It modifies sys/param.h so that BKVASIZE
can also be set on a per-architecture basis.
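One common way to allow such a per-architecture override is a guarded
default, sketched below. The SK_ names and the value 65536 are
illustrative stand-ins, and the compile-time check that the override is
never smaller than the base size is an assumption of this sketch, not
something stated above.

```c
#include <assert.h>

#define SK_MAXBSIZE	65536	/* illustrative base size */

/* An architecture header could define SK_MAXBCACHEBUF before this. */
#ifndef SK_MAXBCACHEBUF
#define SK_MAXBCACHEBUF	SK_MAXBSIZE
#endif

/* Assumption: an override only ever raises the limit. */
_Static_assert(SK_MAXBCACHEBUF >= SK_MAXBSIZE, "override too small");

/* Hypothetical helper: does a buffer of this size fit in the cache? */
static int
sketch_fits_in_bcache(long size)
{
	return (size > 0 && size <= SK_MAXBCACHEBUF);
}
```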
A couple of cases where NFS used MAXBSIZE instead of NFS_MAXBSIZE
are fixed as well.
Differential Revision: https://reviews.freebsd.org/D2330
Reviewed by: mav, kib
MFC after: 2 weeks
This is similar to r281756 so set the ptr NULL after free as a safety belt
against future changes.
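The safety-belt pattern looks roughly like this in userland C. The
struct sketch_ctx type and sketch_release() are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner of a heap buffer. */
struct sketch_ctx {
	char *buf;
};

/* Release the buffer and clear the pointer so reuse is detectable. */
static void
sketch_release(struct sketch_ctx *ctx)
{
	assert(ctx != NULL);
	free(ctx->buf);		/* free(NULL) is a no-op */
	ctx->buf = NULL;	/* a repeated release stays harmless */
}
```

With the pointer cleared, a second sketch_release() call or a later
NULL check behaves predictably instead of touching freed memory.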
Obtained from: HardenedBSD (b2e77ced9ae213d358b44d98f552d9ae4636ecac)
Submitted by: Oliver Pinter
Reviewed by: rmacklem
The htree directory index is a highly desirable feature for research
purposes and was meant to improve performance in our ext2/3 driver.
Unfortunately our implementation has two problems:
- It never really delivered any performance improvement.
- It appears to corrupt the filesystem in undetermined circumstances.
Strictly speaking dir_index is not required for read/write support in
ext2/3 and our limited ext4 support still works fine without it.
Regain stability in the ext2 driver by removing it. We may need it back
(fixed) if we want to support encrypted ext4 but thanks to the
wonders of version control we can always revert this change and bring it
back.
PR: 191895
PR: 198731
PR: 199309
MFC after: 5 days
can perform better when using a 128K read/write data size.
This patch changes NFS_MAXDATA from 64K to 128K so that
clients can use 128K for NFS mounts to allow this.
The patch also renames NFS_MAXDATA to NFS_SRVMAXIO so
that it is clear that it applies to the NFS server side
only. It also avoids a name conflict with the NFS_MAXDATA
defined in rpcsvc/nfs_prot.h, that is used for userland RPC.
Tested by: mav
Reviewed by: mav
MFC after: 2 weeks
use VOP_FSYNC() to perform the NFS server's Commit operation.
This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which
is set by file systems that use the buffer cache. If this flag
is not set, the NFS server always does a VOP_FSYNC().
This should be ok for old file system modules that do not set
MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although
it might not be optimal for file systems that use the buffer cache.
Reviewed by: kib
MFC after: 2 weeks
For VREG vnodes, return the resident page count (multiplied by PAGE_SIZE)
for the tmpfs node's anonymous VM object that stores actual file contents.
For all other vnodes, return the tmpfs_node's tn_size, which should not
be rounded to a page.
This change allows using stat(2) to identify a sparse file on tmpfs.
Reviewed by: kib
MFC after: 1 week
on reads or writes, the time marks are used to display idle time by
w(1) [1]. Instead, use vfs.devfs.dotimes as the selector of default
precision vs. using time_second. The latter gives seconds precision,
which is good enough for the purpose.
Note that timestamp updates are unlocked and the updates themselves, as
well as the check in devfs_timestamp, are non-atomic.
Noted by: truckman [1]
Reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The magic number MSDOSFS_ARGSMAGIC, which used to distinguish
"old" vs "new" msdosfs mount arguments, has not been used since
2005; it should just go away now.
Likewise, the local-to-Unicode table that changed at the same
time is unused.
Leave the space reserved in the old style mount arguments, though,
since we still support the old mount call (via the cmount entry
point).
Submitted by: Chris Torek <chris.torek@gmail.com>
MFC after: 2 weeks
Currently we update timestamps unconditionally when doing read or
write operations. This may slow things down on hardware where
reading timestamps is expensive (e.g. HPET, because the default
vfs.timestamp_precision setting is nanosecond now) with limited
benefit.
A new sysctl variable, vfs.devfs.dotimes is added, which can be
set to non-zero value when the old behavior is desirable.
Differential Revision: https://reviews.freebsd.org/D2104
Reported by: Mike Tancsa <mike sentex net>
Reviewed by: kib
Relnotes: yes
Sponsored by: iXsystems, Inc.
MFC after: 2 weeks
- Allow the function to be called with the vm object lock held.
- Allow specifying a reqpage that doesn't match any page in the region,
meaning that all pages are freed.
o Utilize the new function in a couple more places in the vnode pager.
Reviewed by: alc, kib
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
error cases. Calling brelse() with a NULL pointer is not allowed,
so only call brelse() when the bp is non-NULL.
Reported by: Maxime Villard (reported as uninitialized variable)
Do not ever return doomed vnode from lookup. This could happen, if
not checked, since dvp is relocked in the 'looking up ourselves' case.
In the other case, since dvp is relocked, mount point might go away
while fdesc_allocvp() is called. Prevent the situation by doing
vfs_busy() before unlocking dvp. Reuse the vn_vget_ino_gen() helper.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
prevent errors from yanking devices out from under filesystems. Only
care about special vnodes on devfs, special nodes on other kinds of
filesystems do not have special properties.
Sponsored by: EMC / Isilon Storage Division
Submitted by: Conrad Meyer
MFC after: 1 week
The e2fs_gd struct was not being initialized and garbage was
being used for hinting the ext2 allocator variant.
Use malloc to clear the values and also initialize e2fs_contigdirs
during allocation to keep consistency.
While here clean up small style issues.
Reported by: Clang static analyser
MFC after: 1 week
removed. Postponing it until tmpfs_getattr() is called causes
discordant values reported for file times vs. directory times.
Reported and tested by: madpilot
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
modification and last file status change timestamps of the file".
Currently, tmpfs only modifies ctime when file was extended. Since
r277828 followed tmpfs_write(), mmaped writes also do not modify
ctime.
Fix this, by updating both ctime and mtime for writes to tmpfs files.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
to UFS, perform updates during syncer scans, which in particular means
that tmpfs now performs scan on sync. Also, this means that a mtime
update may be delayed up to 30 seconds after the write.
The vm_object's OBJ_TMPFS_DIRTY flag for tmpfs swap object is similar
to the OBJ_MIGHTBEDIRTY flag for the vnode object: it indicates that
object could have been dirtied. Adapt fast page fault handler and
vm_object_set_writeable_dirty() to handle OBJ_TMPFS_NODE same as
OBJT_VNODE.
Reported by: Ronald Klop <ronald-lists@klop.ws>
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
in r277199. Acquire the necessary reference in delist_dev_locked()
and inform destroy_devl() about it using CDP_UNREF_DTR flag.
Fix some style nits, add asserts.
Discussed with: hselasky
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
After the ext2 variant of the "orlov allocator" was implemented,
the case for a negative or zero dirsize disappeared.
Drop the dead code and unsign dirsize given that it can't be
negative anyways.
CID: 1008669
MFC after: 1 week
"delist_dev()" function. Make sure the character device structure
doesn't go away until the end of the "destroy_dev()" function due to
concurrently running cleanup code inside "devfs_populate()".
MFC after: 1 week
Reported by: dchagin@
function. Many existing clients don't understand POLLNVAL and instead
rely on an error code from the read(), write() or ioctl() system
call. Also make sure we wake up any client pollers before the cuse
server is closing, so they don't wait forever for an event.
There are a number of msdosfs improvements in NetBSD that may be worth
bringing over, and this reduces noise in the comparison.
Differential Revision: https://reviews.freebsd.org/D1466
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
in the NFS server; garbage collect now-unused NFSMSIZ() and M_HASCL()
macros. Also garbage collect now-unused versions in headers for the
removed previous NFS client and server.
Reviewed by: rmacklem
Sponsored by: EMC / Isilon Storage Division
use VA_UTIMES_NULL to indicate whether it should
set the time to the current tod on the server.
This had the side effect of making the NFS client
use the client's timestamp for exclusive create,
starting with FreeBSD9.2.
Unfortunately a bug in some Solaris NFS servers
causes these servers to return NFS_OK to the
Setattr RPC done during exclusive create, but not
actually set the file's mode, leaving the file's
mode == 0.
This patch restores the NFS client's behaviour to
use the server's tod for the exclusive open's
Setattr RPC, to avoid the Solaris server bug and
to restore the pre-FreeBSD9.2 NFS behaviour.
Discussed on: freebsd-fs
PR: 186293
MFC after: 3 months
exactly the same code is at the end of the nfscl_checksattr()
function that is called just before it. As such, this code
had already been executed and didn't do anything.
MFC after: 1 week
was reported via email. This was caused by a LOR between the
sleep lock used to serialize the local locking (nfsrv_locklf())
and locking the vnode. I believe this patch fixes the problem
by delaying relocking of the vnode until the sleep lock is
unlocked (nfsrv_unlocklf()). To avoid nfsvno_advlock() having the side
effect of unlocking the vnode, unlocking the vnode was moved to before
the functions that call nfsvno_advlock().
It shouldn't affect the execution of the default case where
vfs.nfsd.enable_locallocks=0.
Reported by: loic.blot@unix-experience.fr
Discussed with: kib
MFC after: 1 week
which means that the NFSCLIENT and NFSSERVER
kernel options will no longer work. This commit
only removes the kernel components. Removal of
unused code in the user utilities will be done
later. This commit does not include an addition
to UPDATING, but that will be committed in a
few minutes.
Discussed on: freebsd-fs
This assertion was added in r246213 as a guard against corrupted mbufs
arriving from drivers, the key distinguishing factor of said mbufs being
that they had a negative length. Given we're in a while loop specifically
designed to skip over zero-length mbufs, panicking on a zero-length mbuf
seems incorrect.
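The distinction between an empty buffer and a corrupted one can be
sketched as below. struct sketch_buf is a hypothetical, much-simplified
stand-in for an mbuf, and sketch_first_nonempty() is not the function
touched by this change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one link of a buffer chain. */
struct sketch_buf {
	struct sketch_buf *next;
	int len;
};

/* Return the first buffer holding data, or NULL if none does. */
static const struct sketch_buf *
sketch_first_nonempty(const struct sketch_buf *b)
{
	while (b != NULL) {
		/* Negative length means corruption; zero is just empty. */
		assert(b->len >= 0);
		if (b->len > 0)
			break;
		b = b->next;	/* skip the empty buffer */
	}
	return (b);
}
```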
No objection from: kib
into namecache, to avoid cache trashing when doing large operations.
E.g., tar archive extraction is not usually followed by access to many
of the files created.
Right now, each VOP_LOOKUP() implementation explicitly knows about
this quirk and tests for both MAKEENTRY flag presence and op != CREATE
to make the call to cache_enter(). Centralize the handling of the
quirk into VFS, by deciding to cache only by MAKEENTRY flag in VOP.
VFS now sets NOCACHE flag for CREATE namei() calls.
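The before and after of this centralization can be sketched as follows.
All SK_ names and flag values are hypothetical, and the step where
NOCACHE suppresses MAKEENTRY is an assumption of this sketch about how
the two flags interact, not a quote of the VFS code.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_op { SK_LOOKUP, SK_CREATE };

/* Hypothetical flag values, for illustration only. */
#define SK_MAKEENTRY	0x1
#define SK_NOCACHE	0x2

/* Old per-filesystem test: flag present and not a CREATE. */
static bool
sketch_old_should_cache(int flags, enum sketch_op op)
{
	return ((flags & SK_MAKEENTRY) != 0 && op != SK_CREATE);
}

/* VFS side: CREATE asks for NOCACHE, which clears MAKEENTRY here. */
static int
sketch_vfs_prepare_flags(int flags, enum sketch_op op)
{
	assert((flags & ~(SK_MAKEENTRY | SK_NOCACHE)) == 0);
	if (op == SK_CREATE)
		flags |= SK_NOCACHE;
	if (flags & SK_NOCACHE)
		flags &= ~SK_MAKEENTRY;
	return (flags);
}

/* New per-filesystem test: look at MAKEENTRY only. */
static bool
sketch_new_should_cache(int flags)
{
	return ((flags & SK_MAKEENTRY) != 0);
}
```

Under these assumptions both tests agree: a CREATE is never cached, and
a plain lookup with MAKEENTRY set still is.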
Note that the change in semantics is backward-compatible and could be
merged to the stable branch, and is compatible with non-changed
third-party filesystems which correctly handle MAKEENTRY.
Suggested by: Chris Torek <torek@pi-coral.com>
Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Overrunning buffer pointed to by (caddr_t)&oip->i_db[0] of 48 bytes by
passing it to a function which accesses it at byte offset 59 using
argument 60UL.
The issue was inherited from an older FFS implementation and
fixed there by merging UFS2 in r98542. We follow the
FFS fix.
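The class of bug reported here can be illustrated with the sketch below.
It assumes 12 direct and 3 indirect 32-bit block slots, which matches
the 48 and 60 byte figures above but is otherwise an assumption, and
struct sketch_inode and sketch_save_blocks() are hypothetical names. The
point is that each array is copied with its own size instead of reading
60 bytes starting at i_db[0].

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_NDADDR	12	/* direct slots: 12 * 4 = 48 bytes */
#define SK_NIADDR	3	/* indirect slots: 3 * 4 = 12 bytes */

/* Hypothetical inode fragment holding both block pointer arrays. */
struct sketch_inode {
	uint32_t i_db[SK_NDADDR];
	uint32_t i_ib[SK_NIADDR];
};

/* Save all 15 block pointers without reading past either array. */
static void
sketch_save_blocks(const struct sketch_inode *ip,
    uint32_t out[SK_NDADDR + SK_NIADDR])
{
	assert(ip != NULL && out != NULL);
	memcpy(out, ip->i_db, sizeof(ip->i_db));
	memcpy(out + SK_NDADDR, ip->i_ib, sizeof(ip->i_ib));
}
```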
Discussed with: bde
CID: 1007665
MFC after: 3 days
Since VFS does not/cannot stop writes, sync might run indefinitely, or
be a wrong thing to do at all. E.g., NFS ignores VFS_SYNC() for
forced unmounts, since non-responding server does not allow sync to
finish. On the other hand, filesystems can and do stop writes using
fs-specific facilities, and should already fully flush caches in
VFS_UNMOUNT() due to the race.
Adjust msdosfs to sync in unmount for forced call, to accommodate the
new behaviour. Note that it is still racy, since writes are not
stopped.
Discussed with: avg, bjk, mckusick
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
- Thread lifetime cycle, in particular, counting of the threads in
the process, and interlocking with process mutex and thread lock.
The main reason for this is that turnstile locks are after thread
locks, so you e.g. cannot unlock blockable mutex (think process
mutex) while owning thread lock.
- Virtual and profiling itimers, since the timers activation is done
from the clock interrupt context. Replace the p_slock by p_itimmtx
and PROC_ITIMLOCK().
- Profiling code (profil(2)), for similar reason. Replace the p_slock
by p_profmtx and PROC_PROFLOCK().
- Resource usage accounting. Need for the spinlock there is subtle,
my understanding is that spinlock blocks context switching for the
current thread, which prevents td_runtime and similar fields from
changing (updates are done at the mi_switch()). Replace the p_slock
by p_statmtx and PROC_STATLOCK().
The split is done mostly for code clarity, and should not affect
scalability.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
made getmntinfo() return empty flags for smbfs filesystems when
called with MNT_WAIT. It's not visible with mount(8), since it uses
MNT_NOWAIT, but broke autounmount(8) operation.
PR: 195161
Differential Revision: https://reviews.freebsd.org/D1194
Reviewed by: kib@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
ext2_print_inode is not really used but it was nice to
have for initial development work. #ifdef it under a
new EXT2FS_DEBUG knob so that we don't spend time
compiling it.
MFC after: 3 days
to mount_nfs(8). They are implemented on Linux, OS X, and Solaris,
and thus can be expected to appear in automounter maps.
Reviewed by: rmacklem@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
to a power of 2. For non-power of 2 settings, intermittent
page faults have been reported. Although the bug that causes
these page faults/crashes has not been identified, it does
not appear to occur when rsize, wsize is a power of 2.
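Reducing a size to a power of 2 can be done as in the sketch below. It
assumes the value is rounded down, which the text above does not spell
out, and sketch_floor_pow2() and sketch_is_pow2() are hypothetical
helpers rather than the functions in the patch.

```c
#include <assert.h>

/* True if v is a non-zero power of two. */
static int
sketch_is_pow2(unsigned int v)
{
	return (v != 0 && (v & (v - 1)) == 0);
}

/* Largest power of two that does not exceed v; v must be non-zero. */
static unsigned int
sketch_floor_pow2(unsigned int v)
{
	unsigned int p = 1;

	assert(v > 0);
	/* Comparing against v / 2 keeps p * 2 from overflowing. */
	while (p <= v / 2)
		p *= 2;
	return (p);
}
```

For example, a requested size of 49152 would become 32768, while 65536
is left unchanged.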
Reported by: tcberner@gmail.com
MFC after: 2 weeks
- Wrong integer type was specified.
- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.
- Logical OR where binary OR was expected.
- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.
- Properly assert the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, since
there is no easy way to assert types in the C language outside a
C-function.
- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.
- Updated "EXAMPLES" section in SYSCTL manual page.
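The difference between a logical and a binary OR of flag constants, and
a compile-time check on the "access" value, can be sketched as below.
The SK_ macros and their values are hypothetical stand-ins, not the real
SYSCTL definitions, and _Static_assert is used here in place of the
CTASSERT macro.

```c
#include <assert.h>

/* Hypothetical layout: low bits hold the type, higher bits the flags. */
#define SK_CTLTYPE_MASK		0x0fu
#define SK_CTLFLAG_RD		0x10u
#define SK_CTLFLAG_MPSAFE	0x20u

/* Binary OR combines both flag bits. */
#define SK_ACCESS_GOOD	(SK_CTLFLAG_RD | SK_CTLFLAG_MPSAFE)
/* Logical OR collapses to 1, which lands in the type bits here. */
#define SK_ACCESS_BAD	(SK_CTLFLAG_RD || SK_CTLFLAG_MPSAFE)

/* The access argument must not carry any type bits. */
_Static_assert((SK_ACCESS_GOOD & SK_CTLTYPE_MASK) == 0,
    "access specifier contains type bits");

static unsigned int
sketch_access_good(void)
{
	return (SK_ACCESS_GOOD);
}

static unsigned int
sketch_access_bad(void)
{
	return (SK_ACCESS_BAD);
}
```

Applying the same compile-time check to SK_ACCESS_BAD would fail the
build in this sketch, which is the kind of mistake such an assertion is
meant to catch.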
MFC after: 3 days
Sponsored by: Mellanox Technologies
two.
nullfs and unionfs need to request suspension if underlying filesystem(s)
use it. Utilize mnt_kern_flag for this purpose.
This is a fixup for 273271.
No strong objections from: kib
Pointy hat to: mjg
MFC after: 2 weeks
in userland rename in-kernel getenv()/setenv() to kern_getenv()/kern_setenv().
This fixes a namespace collision with libc symbols.
Submitted by: kmacy
Tested by: make universe
user nobody and/or setting group nogroup as owner of a file or directory.
Usually at the client side, if there is a username that is not in the
client's passwd database, some clients will send 'nobody@<your.dns.domain>'
on the wire and the NFSv4 server will treat it as an ERROR.
However, if you have a valid user nobody in your passwd database,
the NFSv4 server will treat it as a NFSERR_BADOWNER as it believes the
client doesn't have the username mapped.
Submitted by: Loic Blot <loic.blot@unix-experience.fr>
Reviewed by: rmacklem
Approved by: rmacklem
MFC after: 2 weeks
read/write/poll/ioctl, call standard vnode filedescriptor fop. This
restores the special handling for terminals by calling the deadfs VOP,
instead of always returning ENXIO for destroyed devices or revoked
terminals.
Since destroyed (and not revoked) device would use devfs_specops VOP
vector, make dead_read/write/poll non-static and fill VOP table with
pointers to the functions, instead of VOP_PANIC.
Noted and reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
is interested in i/o state. Return POLLNVAL for invalid bits, similar
to poll_no_poll(). Note that POLLOUT must not be returned, since
POLLHUP is set.
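A poll routine following these rules might look like the userland sketch
below. SK_POLL_VALID is a mask chosen for this illustration rather than
the kernel's own, sketch_dead_poll() is a hypothetical name, and echoing
back POLLIN when the caller asked for it is an assumption about the part
of the description not shown above.

```c
#include <assert.h>
#include <poll.h>

/* Bits this sketch accepts as valid requests (an assumption). */
#define SK_POLL_VALID \
	(POLLIN | POLLPRI | POLLOUT | POLLERR | POLLHUP | POLLNVAL)

/* Poll result for a descriptor whose device is gone. */
static int
sketch_dead_poll(int events)
{
	if (events & ~SK_POLL_VALID)
		return (POLLNVAL);
	/* Report hangup, plus POLLIN if requested; never POLLOUT. */
	return (POLLHUP | (events & POLLIN));
}
```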
Noted and reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
have wildcards. This makes it possible for autofs(4) to avoid requesting
automountd(8) action on access to nonexistent nodes - unless wildcards
are actually used.
Note that this change breaks ABI for automountd(8).
Tested by: dhw@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
survives remount in rw, also it is set for vnodes on rootfs before
noatime can be set or clock is adjusted. All conditions result in
wrong atime for accessed vnodes.
Submitted by: bde
MFC after: 1 week
This fix addresses only issues with the pynfs reports; none of these
issues are known to create problems for extant real clients.
Submitted by: Bart Hsiao <bart.hsiao@gmail.com>
Reworked by: myself
Reviewed by: rmacklem
Approved by: rmacklem
Sponsored by: QNAP Systems Inc.
made autofs mix them up: the second one wasn't visible in ls(1) output,
and trying to access it would trigger mount for the first one.
foobar host:/foobar
foo host:/foo
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
struct kinfo_file.
- Move the various fill_*_info() methods out of kern_descrip.c and into the
various file type implementations.
- Rework the support for kinfo_ofile to generate a suitable kinfo_file object
for each file and then convert that to a kinfo_ofile structure rather than
keeping a second, different set of code that directly manipulates
type-specific file information.
- Remove the shm_path() and ksem_info() layering violations.
Differential Revision: https://reviews.freebsd.org/D775
Reviewed by: kib, glebius (earlier version)
by ffs and ext2fs. Remove duplicated call to vm_page_zero_invalid(),
done by VOP and by vm_pager_getpages(). Use vm_pager_free_nonreq().
Reviewed by: alc (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 6 weeks (after r271596)
path through the NFS clients' getpages functions.
Introduce vm_pager_free_nonreq(). This function can be used to eliminate
code that is duplicated in many getpages functions. Also, in contrast to
the code that currently appears in those getpages functions,
vm_pager_free_nonreq() avoids acquiring an exclusive object lock in one
case.
Reviewed by: kib
MFC after: 6 weeks
Sponsored by: EMC / Isilon Storage Division
mbuf-initialisation logic that is best left to centralised mbuf utility
code rather than scattered around the kernel.
MFC after: 3 days
Sponsored by: EMC / Isilon Storage Division
variable, and don't store in autofs_mount. Also rename it from 'sc'
to 'autofs_softc', since it's global and extern.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
UNIX systems, eg. MacOS X and Solaris. It uses Sun-compatible map format,
has proper kernel support, and LDAP integration.
There are still a few outstanding problems; they will be fixed shortly.
Reviewed by: allanjude@, emaste@, kib@, wblock@ (earlier versions)
Phabric: D523
MFC after: 2 weeks
Relnotes: yes
Sponsored by: The FreeBSD Foundation
nullfs vnode shares vnode lock with lower vnode, this allows the
reclamation of nullfs directory vnode in null_lookup(). In this
situation, VOP must return ENOENT.
More, since after the reclamation, the locks of nullfs directory vnode
and lower vnode are no longer shared, the relock of the ldvp does not
restore the correct locking state of dvp, and leaks ldvp lock.
Correct this by unlocking ldvp and locking dvp.
Use cached value of dvp->v_mount.
Reported by: bdrewery
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
In Linux EXT4_LINK_MAX is now 64000. We can't really do that
since i_nlink and va_nlink are signed so setting higher values
is likely to cause trouble.
This is a system limitation so set the EXT_LINK_MAX to
what the system can handle.
MFC after: 3 days
writes to files for read-only file systems. Since there are already
checks in nandfs_setattr that return an error, this moves detection of
the error earlier.
It overflows witness.
Shorten the names of some nfs mutexes.
Reported and tested by: pho
No objections from: rmacklem, mav
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
forcing filesystem VOP_LINK() methods to repeat the code. In
tmpfs_link(), remove redundant check for the type of the source,
already done by VFS.
Note that NFS server already performs this check before calling
VOP_LINK().
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
- Suspend filesystem for unmount. This prevents new tmpfs nodes from
instantiating, and also ensures that only unmount thread can destroy
nodes.
- Do not start tmpfs node deletion until all vnodes are reclaimed,
which guarantees that no thread can access tmpfs data. For this,
call vflush() in a loop until mnt_nvnodelistsize drops to zero.
Note that after mnt_nvnodelistsize becomes 0, insmntque() blocks
insertion of a vnode germ into the mount list of vnodes.
- Fail node allocation when the filesystem is being unmounted. This
is race-free due to the vflush() call in loop. This is mostly
cosmetic, avoiding some more work which might be done until
suspension in unmount is started.
Note that there is currently no way to prevent new vnode instantiation
from readers during the unmount. Due to this, forced unmount might
live-lock if vflush() loop cannot get to the zero vnode count due to
races with readers. The unmount would proceed after the load is
lifted.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks