freebsd-dev

Author	SHA1	Message	Date
Pedro F. Giffuni	386b134364	ext2: rename some directory index constants. Missed from r294653. Pointyhat: me	2016-01-24 04:30:30 +00:00
Pedro F. Giffuni	c22ff471b4	Fix comment.	2016-01-24 02:44:00 +00:00
Pedro F. Giffuni	9b58c8019f	Rename some directory index constants. Directory index was introduced in ext3. We don't always use the prefix to denote the ext2 variant they belong to but when we do we should try to be accurate.	2016-01-24 02:41:49 +00:00
Pedro F. Giffuni	e08ad8f068	ext2: Initialize i_flag after allocation. We use i_flag to carry some flags like IN_E4INDEX which newer ext2fs variants use internally. fsck.ext3 rightfully complains after our implementation tags non-directory inodes with INDEX_FL. Initializing i_flag during allocation removes the noise factor and quiets down fsck. Patch from: Damjan Jovanovic PR: 206530	2016-01-24 02:25:41 +00:00
Konstantin Belousov	1a2dd035fb	When devfs dirent is freed, a vnode might still keep a pointer to it, apparently. Interlock and clear the pointer to avoid free memory dereference. Submitted by: bde (previous version) MFC after: 3 weeks	2016-01-22 20:30:51 +00:00
Pedro F. Giffuni	9824e4adbe	ext2fs: Bring back the htree dir_index implementation. The htree dir_index is perhaps one of the most characteristic features of the linux ext3 implementation. It was removed in r281670, due to repeated bug reports. Damjan Jovanovic detected and fixed three bugs and did some stress testing by building Apache OpenOffice on top of it so it is now in good shape to bring back. Differential Revision: https://reviews.freebsd.org/D5007 Submitted by: Damjan Jovanovic Reviewed by: pfg Tested by: pho Relnotes: Yes MFC after: 2 months (only 10.x)	2016-01-21 14:50:28 +00:00
Konstantin Belousov	aeace3c33c	Assert that the linkage between struct cdev_privdata and struct file is consistent. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-01-17 08:34:35 +00:00
Ravi Pokala	bc089e5d7d	[PR 206224] bv_cnt is sometimes examined without holding the bufobj lock Add locking around access to bv_cnt which is currently being done unlocked PR: 206224 Reviewed by: imp Approved by: jhb MFC after: 1 week Sponsored by: Panasas, Inc. Differential Revision: https://reviews.freebsd.org/D4931	2016-01-17 01:04:20 +00:00
Bjoern A. Zeeb	8676704962	Unbreak NOIP builds after r294084.	2016-01-15 16:45:36 +00:00
Alexander V. Chernikov	d3bf8f6486	Make nfscl_getmyip() use new routing KPI. * Use standard IPv6 SAS instead of rt->rt_ifa address. * Make address lookup work for IPv6 LLA. * Save address into buffer provided by caller instead of using static vars. Discussed with: rmacklem	2016-01-15 09:05:14 +00:00
Konstantin Belousov	a53b7c692d	Make devfs_fpdrop() static. It was not a public KPI, and it has no reason to remain exported for some time. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-01-13 14:03:06 +00:00
Pedro F. Giffuni	daf884fa9f	ext4: mount panic from freeing invalid pointers Initialize the struct with those fields to zeroes on allocation, preventing the panic. Patch by: Damjan Jovanovic. PR: 206056 MFC after: 3 days	2016-01-11 19:25:43 +00:00
Pedro F. Giffuni	e813d9d7fa	ext4: add support for reading sparse files Add support for sparse files in ext4. Also implement read-ahead, which greatly increases the performance when transferring files from ext4. Both features implemented by Damjan Jovanovic. PR: 205816 MFC after: 1 week	2016-01-11 19:14:55 +00:00
Andrey V. Elsukov	c829016e85	Change the type of newsize argument in the smbfs_smb_setfsize() function from int to int64. MSDN says that SMB_SET_FILE_END_OF_FILE_INFO uses signed 64-bit integer to specify offset, but since smbfs_smb_setfsize() has used plain int, a value was truncated in case when offset was larger than 2G. https://msdn.microsoft.com/en-us/library/ff469975.aspx In particular, now `truncate -s 10G` will work correctly on the mounted SMB share. Reported and tested by: Eugene Grosbein <eugen at grosbein dot net> MFC after: 1 week	2016-01-11 18:11:06 +00:00
Pedro F. Giffuni	7135ca50c1	ext2fs: reading mmaped file in Ext4 causes panic Always call brelse(path.ep_bp), fixing reading EXT4 files using mmap(). Patch by Damjan Jovanovic. PR: 205938 MFC after: 1 week	2016-01-07 21:43:43 +00:00
Konstantin Belousov	fb57d63e47	Hide transient EBADF errors caused by the parallel revoke(2) or forced unmount of devfs mounts, by restarting the failed syscall. When restarted, failing syscalls eventually either stop finding the node and return ENOENT, or the vnode op vectors finally transition to the deadfs vop. The latter return EIO or other error, more appropriate for the operation. Submitted by: bde Tested by: pho MFC after: 3 weeks	2016-01-02 20:29:28 +00:00
Konstantin Belousov	d52aff3c7a	Minor style cleanup. Submitted by: bde MFC after: 1 week	2016-01-01 15:48:48 +00:00
Konstantin Belousov	6f73b583d9	Force nullfs vnode reclaim after unlinking, to potentially unlink lower vnode. Otherwise, reference to the lower vnode from the upper one prevents final unlink. PR: 178238 Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-12-30 19:49:22 +00:00
Pedro F. Giffuni	26069aec57	ext2: recognize ext4 INCOMPAT_RECOVER flag This is a flag specific for journalling in ext4. Add it to the list of ext4 features we ignore for read-only purposes. PR: 205668 MFC after: 1 week	2015-12-29 15:51:52 +00:00
Konstantin Belousov	cccac8a1ef	Make it possible for the cdevsw d_close() driver method to detect last close and close due to revoke(2)-like operation. A new FLASTCLOSE flag indicates that this is last close. FREVOKE is set for revokes, and FNONBLOCK is also set, same as is already done for VOP_CLOSE() call from vgonel(). The flags reuse user open(2) flags which are never stored in f_flag, to not consume bit space in the ABI visible way. Assert this with the static check. Requested and reviewed by: bde Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-12-22 20:37:34 +00:00
Konstantin Belousov	b63d070ad1	Keep devfs mount locked for the whole duration of the devfs_setattr(), and ensure that our dirent is instantiated. Reported and tested by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-12-22 20:22:17 +00:00
Hans Petter Selasky	beebd9aac8	Make CUSE usable with platforms where the size of "unsigned long" is different from the size of a pointer.	2015-12-22 09:55:44 +00:00
Hans Petter Selasky	9fb69c9d9d	Make CUSE usable with platforms where the size of "unsigned long" is different from the size of a pointer.	2015-12-22 09:41:33 +00:00
Hans Petter Selasky	ac60e80129	Guard against the same process being both CUSE server and client at the same time. This can easily lead to a deadlock when destroying the character devices nodes.	2015-12-22 09:26:24 +00:00
Gleb Smirnoff	f17f88d3e0	Fix breakage caused by r292373 in ZFS/FUSE/NFS/SMBFS. With the new VOP_GETPAGES() KPI the "count" argument counts pages already, and doesn't need to be translated from bytes to pages. While here make it consistent that rbehind and rahead are updated only if we don't return error. Pointy hat to: glebius	2015-12-16 23:48:50 +00:00
Gleb Smirnoff	b0cd20172d	A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES(). o With new KPI consumers can request contiguous ranges of pages, and unlike before, all pages will be kept busied on return, like it was done before with the 'reqpage' only. Now the reqpage goes away. With new interface it is easier to implement code protected from race conditions. Such arrayed requests for now should be preceded by a call to vm_pager_haspage() to make sure that request is possible. This could be improved later, making vm_pager_haspage() obsolete. Strengthening the promises on the business of the array of pages allows us to remove such hacks as swp_pager_free_nrpage() and vm_pager_free_nonreq(). o New KPI accepts two integer pointers that may optionally point at values for read ahead and read behind, that a pager may do, if it can. These pages are completely owned by pager, and not controlled by the caller. This shifts the UFS-specific readahead logic from vm_fault.c, which should be file system agnostic, into vnode_pager.c. It also removes one VOP_BMAP() request per hard fault. Discussed with: kib, alc, jeff, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-12-16 21:30:45 +00:00
John Baldwin	8d7e0f5889	The cdevpriv_dtr_t typedef was not able to be used in a function prototype like the various d_*_t typedefs since it declared a function pointer rather than a function. Add a new d_priv_dtor_t typedef that declares the function and can be used as a function prototype. The previous typedef wasn't useful outside of the cdevpriv implementation, so retire it. The name d_priv_dtor_t was chosen to be more consistent with cdev methods since it is commonly used in place of d_close_t even though it is not a direct pointer in struct cdevsw. Reviewed by: kib, imp MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D4340	2015-12-02 18:27:30 +00:00
Rick Macklem	65171ebbc8	Fix the memory leak that occurs when the nfscommon.ko module is unloaded. This leak was introduced by r291527. Since the nfscommon.ko module is rarely unloaded, this leak would not have been much of an issue. MFC after: 2 weeks	2015-12-02 02:47:13 +00:00
Rick Macklem	10b2e06e3e	Delete the TUNABLE_INT() line. It was in r291527 so that it could be MFC'd to stable/10 and still work.	2015-11-30 23:37:09 +00:00
Rick Macklem	84be7e0952	Add kernel support to the NFS server for the "-manage-gids" option that will be added to the nfsuserd daemon in a future commit. It modifies the cache used by NFSv4 for name<-->id translation (both username/uid and group/gid) to support this. When "-manage-gids" is set, the server looks up each uid for the RPC and uses the list of groups cached in the server instead of the list of groups provided in the RPC request. The cached group list is acquired for the cache by the nfsuserd daemon via getgrouplist(3). This avoids the 16 groups limit for the list in the RPC request. Since the cache is now used for every RPC when "-manage-gids" is enabled, the code also modifies the cache to use a separate mutex for each hash list instead of a single global mutex. Suggested by: jpaetzel Tested by: jpaetzel MFC after: 2 weeks	2015-11-30 21:54:27 +00:00
Kirk McKusick	43a993bb7d	For performance reasons, it is useful to have a single string used as the name of a filesystem when setting it as the first parameter to the getnewvnode() function. Most filesystems call getnewvnode from just one place so can use a literal string as the first parameter. However, NFS calls getnewvnode from two places, so we create a global constant string that can be used by the two instances. This change also collapses two instances of getnewvnode() in the UFS filesystem to a single call. Reviewed by: kib Tested by: Peter Holm	2015-11-29 21:01:02 +00:00
Rick Macklem	a0962bf8bc	When the nfsd threads are terminated, the NFSv4 server state (opens, locks, etc) is retained, which I believe is correct behaviour. However, for NFSv4.1, the server also retained a reference to the xprt (RPC transport socket structure) for the backchannel. This caused svcpool_destroy() to not call SVC_DESTROY() for the xprt and allowed a socket upcall to occur after the mutexes in the svcpool were destroyed, causing a crash. This patch fixes the code so that the backchannel xprt structure is dereferenced just before svcpool_destroy() is called, so the code does do an SVC_DESTROY() on the xprt, which shuts down the socket upcall. Tested by: g_amanakis@yahoo.com PR: 204340 MFC after: 2 weeks	2015-11-21 23:55:46 +00:00
Rick Macklem	b179878dde	Revert r283330 since it broke directory caching in the client. At this time I cannot see a way to fix directory caching when it has partial blocks in the buffer cache, due to the fact that the syscall's uio_offset won't stay the same as the lblkno * NFS_DIRBLKSIZ offset. Reported by: bde MFC after: 2 weeks	2015-11-21 00:15:41 +00:00
Rick Macklem	f315383406	mnt_stat.f_iosize (which is used to set bo_bsize) must be set to the largest size of buffer cache block or the mapping of the buffer is bogus. When a mount with rsize=4096,wsize=4096 was done, f_iosize would be set to 4096. This resulted in corrupted directory data, since the buffer cache block size for directories is NFS_DIRBLKSIZ (8192). This patch fixes the code so that it always sets f_iosize to at least NFS_DIRBLKSIZ. Tested by: krichy@cflinux.hu PR: 177971 MFC after: 2 weeks	2015-11-17 01:44:26 +00:00
Mark Johnston	d28713378a	- Consistently use PROC_ASSERT_HELD() to verify that a process' hold count is non-zero. - Include the process address in the PROC_ASSERT_HELD() and PROC_ASSERT_NOT_HELD() assertion messages so that the corresponding process can be found easily when debugging. MFC after: 1 week	2015-11-08 01:38:56 +00:00
Konstantin Belousov	1d48f121d8	Ensure that when a blockable open of fifo returns success, a valid file descriptor opened for complementary access exists as well. The implementation of the guarantee is done by counting the generations of readers and writers opens. We return success and not EINTR or ERESTART error, when the sleep for complementary opening is interrupted, but the generation was changed during the sleep. Longer explanation: assume there are two threads, A doing open("fifo", O_RDONLY) and B doing open("fifo", O_WRONLY), and no other threads either trying to open the fifo, nor there are any file descriptors referencing the fifo. Before the change, it was possible e.g. for thread A to return a valid file descriptor, while thread B returned EINTR if a signal to B was delivered simultaneously with the wakeup from A. After the change, in this situation both A::open() and B::open() succeed and the signal is made "as if" it was noticed slightly later. Note that the signal actual delivery is not changed, it is done by ast on syscall return path, so signal handler is still executed before first instruction after syscall. See PR for the code demonstrating the issue. PR: 203162 Reported by: Victor Stinner victor.stinner@gmail.com Reviewed by: jilles Tested by: bapt, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-09-20 21:18:33 +00:00
Edward Tomasz Napierala	1d4c0424c8	Fix an NFS server bug that manifested in "ls -al" displaying a plus sign on every directory exported via NFSv4 with NFSv4 ACLs enabled. Reviewed by: rmacklem@ MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D3502	2015-08-28 14:26:11 +00:00
Edward Tomasz Napierala	643e5ec210	Make it possible to forcibly unmount devfs. MFC after: 1 month Sponsored by: The FreeBSD Foundation	2015-08-24 14:04:44 +00:00
Edward Tomasz Napierala	6e572e084b	After r286237 it should be fine to call vgone(9) on a busy GEOM vnode; remove KASSERT that would prevent forced devfs unmount from working. MFC after: 1 month Sponsored by: The FreeBSD Foundation	2015-08-23 14:53:54 +00:00
Rick Macklem	29dc40b6be	For the case where an NFSv4.1 ExchangeID operation has the client identifier that already has a confirmed ClientID, the nfsrv_setclient() function would not fill in the clientidp being returned. As such, the value of ClientID returned would be whatever garbage was on the stack. An NFSv4.1 client would not normally do this, but it appears that it can happen for certain Linux clients. When it happens, the client persistently retries the ExchangeID and Create_session after Create_session fails when it uses the bogus clientid. With this patch, the correct clientid is replied. This problem was identified in a packet trace supplied by Ahmed Kamal via email. Reported by: email.ahmedkamal@googlemail.com MFC after: 2 weeks	2015-08-14 22:02:14 +00:00
John Baldwin	fada4adf95	The changes that introduced fo_mmap() treated all character device mappings as if MAP_SHARED was always present since in general MAP_PRIVATE is not permitted for character devices. However, there is one exception in that MAP_PRIVATE mappings are permitted for /dev/zero. Only require a writable file descriptor (FWRITE) for shared, writable mappings of character devices. vm_mmap_cdev() will reject any private mappings for other devices. Reviewed by: kib Reported by: sbruno (broke qemu cross-builds), peter Differential Revision: https://reviews.freebsd.org/D3316	2015-08-06 16:50:37 +00:00
Conrad Meyer	b5af3f30a7	nfsclient: Protest loudly when GETATTR responses are invalid BROKEN NFS SERVER OR MIDDLEWARE: Certain WAN "accelerators" attempt to cache NFS GETATTR traffic, but actually corrupt it (e.g., responding to requests with attributes for totally different files). Warn very verbosely when this is detected. Linux' NFS client has a similar warning. Adds a sysctl/tunable (vfs.nfs.fileid_maxwarnings) to configure the quantity of warnings; default to 10. (Zero disables; -1 is unlimited.) Adds a failpoint to aid in validating the warning / behavior with a non-broken server. Use something like: sysctl 'debug.fail_point.nfscl_force_fileid_warning=10%return(1)' Reviewed by: rmacklem Approved by: markj (mentor) Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3304	2015-08-05 22:27:30 +00:00
Rick Macklem	25f37276e5	This patch fixes a problem where, if the NFSv4 server has a previous unconfirmed clientid structure for the same client on the last hash list, this old entry would not be removed/deleted. I do not think this bug would have caused serious problems, since the new entry would have been before the old one on the list. This old entry would have eventually been scavenged/removed. Detected while reading the code looking for another bug. MFC after: 3 days	2015-07-29 23:06:30 +00:00
Jeff Roberson	7d07bfd8a3	- Remove some dead code copied from ffs.	2015-07-29 03:06:08 +00:00
Christian Brueffer	382353e2e8	In tmpfs_chtimes(), remove checks on the nanosecond level when determining whether a node changed. Other filesystems, e.g., UFS, only check on seconds, when determining whether something changed. This also corrects the birthtime case, where we checked tv_nsec twice, instead of tv_sec and tv_nsec (PR). PR: 201284 Submitted by: David Binderman Patch suggested by: kib Reviewed by: kib MFC after: 2 weeks Committed from: Essen FreeBSD Hackathon	2015-07-26 08:33:46 +00:00
Konstantin Belousov	b4490c6e93	The si_status field of the siginfo_t, provided by the waitid(2) and SIGCHLD signal, should keep full 32 bits of the status passed to the _exit(2). Split the combined p_xstat of the struct proc into the separate exit status p_xexit for normal process exit, and signalled termination information p_xsig. Kernel-visible macro KW_EXITCODE() reconstructs old p_xstat from p_xexit and p_xsig. p_xexit contains complete status and copied out into si_status. Requested by: Joerg Schilling Reviewed by: jilles (previous version), pho Tested by: pho Sponsored by: The FreeBSD Foundation	2015-07-18 09:02:50 +00:00
Mark Johnston	5f34e93c58	Check suspendability on the mountpoint returned by VOP_GETWRITEMOUNT. This obviates the need for a MNTK_SUSPENDABLE flag, since passthrough filesystems like nullfs and unionfs no longer need to inherit this information from their lower layer(s). This change also restores the pre-r273336 behaviour of using the presence of a susp_clean VFS method to request suspension support. Reviewed by: kib, mjg Differential Revision: https://reviews.freebsd.org/D2937	2015-07-05 22:37:33 +00:00
Mateusz Guzik	f131759f54	fd: make 'rights' a mandatory argument to fget* functions	2015-07-05 19:05:16 +00:00
Rick Macklem	2a3508eb48	If a "principal" argument isn't provided for a Kerberized NFS mount, the kernel would generate a bogus one with a ":/<path>" suffix. This would only occur for the case where there was no explicit "principal" argument and the getaddrinfo() call in mount_nfs.c failed to return a canonical name for the server. This patch fixes this unusual case. PR: 201073 Submitted by: masato@itc.naist.jp MFC after: 2 weeks	2015-07-03 22:11:07 +00:00
Rick Macklem	d189dcb6e2	Alex Burlyga reported a POLA violation for the new NFS client as compared to the old NFS client via email to the freebsd-fs@ mailing list. For the new client, when multiple clients attempted to create a symbolic link concurrently, more than one client would report success instead of EEXIST. This was caused by code in the new client that mapped EEXIST to OK assuming it was caused by a retried RPC request. Since the old client did not do this, the patch defaults to the old behaviour and permits the new behaviour to be enabled via a sysctl. Reported by: alex.burlyga.ietf@gmail.com Tested by: alex.burlyga.ietf@gmail.com MFC after: 2 weeks	2015-07-03 01:15:21 +00:00
Mark Murray	d1b06863fb	Huge cleanup of random(4) code. * GENERAL - Update copyright. - Make kernel options for RANDOM_YARROW and RANDOM_DUMMY. Set neither to ON, which means we want Fortuna - If there is no 'device random' in the kernel, there will be NO random(4) device in the kernel, and the KERN_ARND sysctl will return nothing. With RANDOM_DUMMY there will be a random(4) that always blocks. - Repair kern.arandom (KERN_ARND sysctl). The old version went through arc4random(9) and was a bit weird. - Adjust arc4random stirring a bit - the existing code looks a little suspect. - Fix the nasty pre- and post-read overloading by providing explictit functions to do these tasks. - Redo read_random(9) so as to duplicate random(4)'s read internals. This makes it a first-class citizen rather than a hack. - Move stuff out of locked regions when it does not need to be there. - Trim RANDOM_DEBUG printfs. Some are excess to requirement, some behind boot verbose. - Use SYSINIT to sequence the startup. - Fix init/deinit sysctl stuff. - Make relevant sysctls also tunables. - Add different harvesting "styles" to allow for different requirements (direct, queue, fast). - Add harvesting of FFS atime events. This needs to be checked for weighing down the FS code. - Add harvesting of slab allocator events. This needs to be checked for weighing down the allocator code. - Fix the random(9) manpage. - Loadable modules are not present for now. These will be re-engineered when the dust settles. - Use macros for locks. - Fix comments. * src/share/man/... - Update the man pages. * src/etc/... - The startup/shutdown work is done in D2924. * src/UPDATING - Add UPDATING announcement. * src/sys/dev/random/build.sh - Add copyright. - Add libz for unit tests. * src/sys/dev/random/dummy.c - Remove; no longer needed. Functionality incorporated into randomdev.. live_entropy_sources.c live_entropy_sources.h - Remove; content moved. - move content to randomdev.[ch] and optimise. 
* src/sys/dev/random/random_adaptors.c src/sys/dev/random/random_adaptors.h - Remove; plugability is no longer used. Compile-time algorithm selection is the way to go. * src/sys/dev/random/random_harvestq.c src/sys/dev/random/random_harvestq.h - Add early (re)boot-time randomness caching. * src/sys/dev/random/randomdev_soft.c src/sys/dev/random/randomdev_soft.h - Remove; no longer needed. * src/sys/dev/random/uint128.h - Provide a fake uint128_t; if a real one ever arrived, we can use that instead. All that is needed here is N=0, N++, N==0, and some localised trickery is used to manufacture a 128-bit 0ULLL. * src/sys/dev/random/unit_test.c src/sys/dev/random/unit_test.h - Improve unit tests; previously the testing human needed clairvoyance; now the test will do a basic check of compressibility. Clairvoyant talent is still a good idea. - This is still a long way off a proper unit test. * src/sys/dev/random/fortuna.c src/sys/dev/random/fortuna.h - Improve messy union to just uint128_t. - Remove unneeded 'static struct fortuna_start_cache'. - Tighten up arithmetic. - Provide a method to allow eternal junk to be introduced; harden it against blatant by compress/hashing. - Assert that locks are held correctly. - Fix the nasty pre- and post-read overloading by providing explicit functions to do these tasks. - Turn into self-sufficient module (no longer requires randomdev_soft.[ch]) * src/sys/dev/random/yarrow.c src/sys/dev/random/yarrow.h - Improve messy union to just uint128_t. - Remove unneeded 'static struct start_cache'. - Tighten up arithmetic. - Provide a method to allow eternal junk to be introduced; harden it against blatant by compress/hashing. - Assert that locks are held correctly. - Fix the nasty pre- and post-read overloading by providing explicit functions to do these tasks. - Turn into self-sufficient module (no longer requires randomdev_soft.[ch]) - Fix some magic numbers elsewhere used as FAST and SLOW. 
Differential Revision: https://reviews.freebsd.org/D2025 Reviewed by: vsevolod,delphij,rwatson,trasz,jmg Approved by: so (delphij)	2015-06-30 17:00:45 +00:00
Konstantin Belousov	8551285097	Restore the td_cookie value for the tmpfs directory entry which was a dup entry, upon detach from the parent directory. If the node is renamed, the entry is re-attached at the different directory, and invalid cookie value triggers assert (or corrupts directory rb tree, it seems). Reported by: clusteradm (gjb, antoine) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-06-19 07:25:15 +00:00
Gleb Smirnoff	093ebe1d28	o Un-inline vm_pager_get_pages(), vm_pager_get_pages_async(). o Provide an extensive set of assertions for input array of pages. o Remove now duplicate assertions from different pagers. Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-06-17 22:44:27 +00:00
Mateusz Guzik	4da8456f0a	Replace struct filedesc argument in getvnode with struct thread This is a step towards removal of spurious arguments.	2015-06-16 13:09:18 +00:00
Gleb Smirnoff	093c7f396d	Make KPI of vm_pager_get_pages() more strict: if a pager changes a page in the requested array, then it is responsible for disposition of previous page and is responsible for updating the entry in the requested array. Now consumers of KPI do not need to re-lookup the pages after call to vm_pager_get_pages(). Reviewed by: kib Sponsored by: Netflix Sponsored by: Nginx, Inc.	2015-06-12 11:32:20 +00:00
Mateusz Guzik	f6f6d24062	Implement lockless resource limits. Use the same scheme implemented to manage credentials. Code needing to look at process's credentials (as opposed to thread's) is provided with *_proc variants of relevant functions. Places which possibly had to take the proc lock anyway still use the proc pointer to access limits.	2015-06-10 10:48:12 +00:00
Mark Johnston	068a3d319a	unionfs: fix suspendability check bugs - MNTK_SUSPENDABLE is set in mnt_kern_flag, not mnt_flag. - The lower layer of a unionfs mount is read-only, so the mount should be suspendable iff the upper layer is suspendable. - Remove a couple of superfluous comments. Differential Revision: https://reviews.freebsd.org/D2714 Reviewed by: kib, mjg	2015-06-06 16:36:13 +00:00
John Baldwin	7077c42623	Add a new file operations hook for mmap operations. File type-specific logic is now placed in the mmap hook implementation rather than requiring it to be placed in sys/vm/vm_mmap.c. This hook allows new file types to support mmap() as well as potentially allowing mmap() for existing file types that do not currently support any mapping. The vm_mmap() function is now split up into two functions. A new vm_mmap_object() function handles the "back half" of vm_mmap() and accepts a referenced VM object to map rather than a (handle, handle_type) tuple. vm_mmap() is now reduced to converting a (handle, handle_type) tuple to a VM object and then calling vm_mmap_object() to handle the actual mapping. The vm_mmap() function remains for use by other parts of the kernel (e.g. device drivers and exec) but now only supports mapping vnodes, character devices, and anonymous memory. The mmap() system call invokes vm_mmap_object() directly with a NULL object for anonymous mappings. For mappings using a file descriptor, the descriptor's fo_mmap() hook is invoked instead. The fo_mmap() hook is responsible for performing type-specific checks and adjustments to arguments as well as possibly modifying mapping parameters such as flags or the object offset. The fo_mmap() hook routines then call vm_mmap_object() to handle the actual mapping. The fo_mmap() hook is optional. If it is not set, then fo_mmap() will fail with ENODEV. A fo_mmap() hook is implemented for regular files, character devices, and shared memory objects (created via shm_open()). While here, consistently use the VM_PROT_* constants for the vm_prot_t type for the 'prot' variable passed to vm_mmap() and vm_mmap_object() as well as the vm_mmap_vnode() and vm_mmap_cdev() helper routines. Previously some places were using the mmap()-specific PROT_* constants instead. While this happens to work because PROT_xx == VM_PROT_xx, using VM_PROT_* is more correct. 
Differential Revision: https://reviews.freebsd.org/D2658 Reviewed by: alc (glanced over), kib MFC after: 1 month Sponsored by: Chelsio	2015-06-04 19:41:15 +00:00
Eric van Gyzen	63e4c6cdf9	Provide vnode in memory map info for files on tmpfs When providing memory map information to userland, populate the vnode pointer for tmpfs files. Set the memory mapping to appear as a vnode type, to match FreeBSD 9 behavior. This fixes the use of tmpfs files with the dtrace pid provider, procstat -v, procfs, linprocfs, pmc (pmcstat), and ptrace (PT_VM_ENTRY). Submitted by: Eric Badger <eric@badgerio.us> (initial revision) Obtained from: Dell Inc. PR: 198431 MFC after: 2 weeks Reviewed by: jhb Approved by: kib (mentor)	2015-06-02 18:37:04 +00:00
Xin LI	6e55e724a6	Clear p_stops upon PROCFS_CTL_DETACH, similar to r283889. Noticed by: jhb Reviewed by: sef Sponsored by: iXsystems, Inc. MFC after: 2 weeks	2015-06-01 18:49:31 +00:00
Rick Macklem	0c419e226c	Make the NFS server use shared vnode locks for a few cases that are allowed by the VFS/VOP interface instead of using exclusive locks. MFC after: 2 weeks	2015-05-29 20:22:53 +00:00
Pedro F. Giffuni	e54a659a26	Provide VOP_GETPAGES_ASYNC() for extfs. Merge the filesystem specific part from r274914 to ext2fs. I only did regular testing with the change but UFS and our ext2fs are similar enough that the code should just work with the new sendfile. Discussed with: glebius	2015-05-28 21:06:59 +00:00
Rick Macklem	1f54e596ad	Make the size of the hash tables used by the NFSv4 server tunable. No appreciable change in performance was observed after increasing the sizes of these tables and then testing with a single client. However, there was an email that indicated high CPU overheads for a heavily loaded NFSv4 and it is hoped that increasing the sizes of the hash tables via these tunables might help. The tables remain the same size by default. Differential Revision: https://reviews.freebsd.org/D2596 MFC after: 2 weeks	2015-05-27 22:00:05 +00:00
Konstantin Belousov	1bc93bb7b9	Currently, softupdate code detects overstepping on the workitems limits in the code which is deep in the call stack, and owns several critical system resources, like vnode locks. Attempt to wait while the per-mount softupdate thread cleans up the backlog may deadlock, because the thread might need to lock the same vnode which is owned by the waiting thread. Instead of synchronously waiting for the worker, perform the worker's tickle and pause until the backlog is cleaned, at the safe point during return from kernel to usermode. A new ast request to call softdep_ast_cleanup() is created, the SU code now only checks the size of queue and schedules ast. There is no ast delivery for the kernel threads, so they are exempted from the mechanism, except NFS daemon threads. NFS server loop explicitly checks for the request, and informs the schedule_cleanup() that it is capable of handling the requests by the process P2_AST_SU flag. This is needed because nfsd may be the sole cause of the SU workqueue overflow. But, to not cause nfsd to spawn additional threads just because we slow down existing workers, only tickle su threads, without waiting for the backlog cleanup. Reviewed by: jhb, mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-05-27 09:20:42 +00:00
Dmitry Chagin	63cc3320d9	Hide vfs.pfs.trace variable if it is not used.	2015-05-24 18:11:22 +00:00
Rick Macklem	262a84286d	The NFS client generated directory block(s) with d_fileno == 0 so that it would not return less data than requested. Since returning less directory data than requested is not a problem for FreeBSD and even UFS no longer returns directory structures with d_fileno == 0, this patch stops the client from doing this. Although entries with d_fileno == 0 should not be a problem, the man pages no longer document that these entries should be ignored, so there was a concern that these entries might be an issue in the future. Suggested by: trasz Tested by: trasz MFC after: 2 weeks	2015-05-23 21:58:41 +00:00
Jung-uk Kim	fd90e2ed54	CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten years for head. However, it is continuously misused as the mpsafe argument for callout_init(9). Deprecate the flag and clean up callout_init() calls to make them more consistent. Differential Revision: https://reviews.freebsd.org/D2613 Reviewed by: jhb MFC after: 2 weeks	2015-05-22 17:05:21 +00:00
John Baldwin	312827253b	Always set p_oppid when attaching to an existing process via procfs tracing. This matches the behavior of ptrace(PT_ATTACH). Also, the procfs detach request assumes p_oppid is always set. Reviewed by: kib MFC after: 2 weeks	2015-05-22 11:03:51 +00:00
Rick Macklem	86b9457f5b	The NFS client wasn't handling getdirentries(2) requests for sizes that are not an exact multiple of DIRBLKSIZ correctly. Fortunately readdir(3) always uses an exact multiple of DIRBLKSIZ, so few applications were affected. This patch fixes this problem by reducing the size of the directory read to an exact multiple of DIRBLKSIZ. Tested by: trasz Reported by: trasz Reviewed by: trasz MFC after: 2 weeks	2015-05-21 23:14:18 +00:00
Alexander Motin	a87627b26b	Do not promote large async writes to sync. Present implementation of large sync writes is too strict and so can be quite slow. Instead of doing that, execute large async write in chunks, syncing each chunk separately. It would be good to fix large sync writes too, but I leave it to somebody with more skills in this area. Reviewed by: rmacklem MFC after: 1 week	2015-05-14 10:04:42 +00:00
Rick Macklem	2fbb9d563b	Fix the NFS server's handling of a bogus NFSv2 ROOT RPC. The ROOT RPC is deprecated in the NFSv2 RFC, RFC-1094 and should never be used by a client. Tested by: thmu@freenet.de MFC after: 1 week	2015-04-25 00:58:24 +00:00
Rick Macklem	7cfdc2a7bc	MAXBSIZE defines both the largest UFS block size and the largest size for a buffer in the buffer cache. This patch defines a new constant MAXBCACHEBUF, which is the largest size for a buffer in the buffer cache. Having a separate constant allows MAXBCACHEBUF to be set larger than MAXBSIZE on a per-architecture basis, so that NFS can do larger read/writes for these architectures. It modifies sys/param.h so that BKVASIZE can also be set on a per-architecture basis. A couple of cases where NFS used MAXBSIZE instead of NFS_MAXBSIZE is fixed as well. Differential Revision: https://reviews.freebsd.org/D2330 Reviewed by: mav, kib MFC after: 2 weeks	2015-04-25 00:52:01 +00:00
Pedro F. Giffuni	2f39c91019	Prevent a double free. This is similar to r281756 so set the ptr NULL after free as a safety belt against future changes. Obtained from: HardenedBSD (b2e77ced9ae213d358b44d98f552d9ae4636ecac) Submitted by: Oliver Pinter Reviewed by: rmacklem	2015-04-20 16:40:13 +00:00
Pedro F. Giffuni	a3a4b110da	nfsrpc_createv4: fix double free. Reported by: Oliver Pinter, clang static checker Obtained from: HardenedBSD (commit 63cac77c42c0c3fc67da62f97d5ab651d52ae707) Reviewed by: rmacklem MFC after: 5 days	2015-04-19 23:55:59 +00:00
Alexander Motin	afdfc9a40d	Change wcommitsize default from one empirical value to another. The new value is more predictable with growing RAM size: hibufspace maxvnodes old new i386: 256MB 32980992 15800 2198732 2097152 2GB 94027776 107677 878764 4194304 amd64: 256MB 32980992 15800 2198732 2097152 1GB 114114560 68062 1678155 4194304 4GB 217055232 111807 1955452 4194304 16GB 1717846016 337308 5097465 16777216 64GB 1734918144 1164427 1490479 16777216 256GB 1734918144 4426453 391983 16777216 Reviewed by: rmacklem MFC after: 2 weeks	2015-04-19 11:34:41 +00:00
Edward Tomasz Napierala	50a220c699	Replace "new NFS" with just "NFS" in some sysctl description strings. Sponsored by: The FreeBSD Foundation	2015-04-19 06:18:41 +00:00
Pedro F. Giffuni	f738ee4825	Drop experimental dir_index support. The htree directory index is a highly desirable feature for research purposes and was meant to improve performance in our ext2/3 driver. Unfortunately our implementation has two problems: - It never really delivered any performance improvement. - It appears to corrupt the filesystem in undetermined circumstances. Strictly speaking dir_index is not required for read/write support in ext2/3 and our limited ext4 support still works fine without it. Regain stability in the ext2 driver by removing it. We may need it back (fixed) if we want to support encrypted ext4 support but thanks to the wonders of version control we can always revert this change and bring it back. PR: 191895 PR: 198731 PR: 199309 MFC after: 5 days	2015-04-17 22:26:01 +00:00
Rick Macklem	66e80f77d2	mav@ has found that NFS servers exporting ZFS file systems can perform better when using a 128K read/write data size. This patch changes NFS_MAXDATA from 64K to 128K so that clients can use 128K for NFS mounts to allow this. The patch also renames NFS_MAXDATA to NFS_SRVMAXIO so that it is clear that it applies to the NFS server side only. It also avoids a name conflict with the NFS_MAXDATA defined in rpcsvc/nfs_prot.h, that is used for userland RPC. Tested by: mav Reviewed by: mav MFC after: 2 weeks	2015-04-16 22:35:15 +00:00
Rick Macklem	dda11d4ab9	File systems that do not use the buffer cache (such as ZFS) must use VOP_FSYNC() to perform the NFS server's Commit operation. This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which is set by file systems that use the buffer cache. If this flag is not set, the NFS server always does a VOP_FSYNC(). This should be ok for old file system modules that do not set MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although it might not be optimal for file systems that use the buffer cache. Reviewed by: kib MFC after: 2 weeks	2015-04-15 20:16:31 +00:00
Will Andrews	677c3c0c66	tmpfs_getattr(): Return more correct allocated byte counts. For VREG vnodes, return the resident page count (multiplied by PAGE_SIZE) for the tmpfs node's anonymous VM object that stores actual file contents. For all other vnodes, return the tmpfs_node's tn_size, which should not be rounded to a page. This change allows using stat(2) to identify a sparse file on tmpfs. Reviewed by: kib MFC after: 1 week	2015-04-10 19:04:39 +00:00
Konstantin Belousov	2359e2dcc3	Do not call msdosfs_sync() on the read-only msdosfs mounts. In fact, it should be a nop for ro. PR: 199152 Reviewed by: bde (PR version of the patch) Submitted by: longwitz@incore.de MFC after: 1 week	2015-04-05 21:10:38 +00:00
Konstantin Belousov	420d65d9e4	Assert that an msdosfs mount is not read-only when FAT modifications are requested. PR: 199152 Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-04-05 21:08:04 +00:00
Konstantin Belousov	bda2eb9ae8	Refine r280308. Do not completely disable timestamping of devfs nodes on reads or writes, the time marks are used to display idle time by w(1) [1]. Instead, use vfs.devfs.dotimes as the selector of default precision vs. using time_second. The latter gives seconds precision, which is good enough for the purpose. Note that timestamp updates are unlocked and the updates themselves, as well as the check in devfs_timestamp, are non-atomic. Noted by: truckman [1] Reviewed by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-04-01 08:25:40 +00:00
Konstantin Belousov	c66d17c7bc	msdosfs: mark unused compat-mount fields The magic number MSDOSFS_ARGSMAGIC, which used to distinguish "old" vs "new" msdosfs mount arguments, has not been used since 2005; it should just go away now. Likewise, the local-to-Unicode table that changed at the same time is unused. Leave the space reserved in the old style mount arguments, though, since we still support the old mount call (via the cmount entry point). Submitted by: Chris Torek <chris.torek@gmail.com> MFC after: 2 weeks	2015-03-22 09:09:26 +00:00
Xin LI	4f9343fc7c	Disable timestamping on devfs read/write operations by default. Currently we update timestamps unconditionally when doing read or write operations. This may slow things down on hardware where reading timestamps is expensive (e.g. HPET, because of the default vfs.timestamp_precision setting is nanosecond now) with limited benefit. A new sysctl variable, vfs.devfs.dotimes is added, which can be set to non-zero value when the old behavior is desirable. Differential Revision: https://reviews.freebsd.org/D2104 Reported by: Mike Tancsa <mike sentex net> Reviewed by: kib Relnotes: yes Sponsored by: iXsystems, Inc. MFC after: 2 weeks	2015-03-21 01:14:11 +00:00
Gleb Smirnoff	4d6481a4c9	o Enhance vm_pager_free_nonreq() function: - Allow to call the function with vm object lock held. - Allow to specify reqpage that doesn't match any page in the region, meaning freeing all pages. o Utilize the new function in couple more places in vnode pager. Reviewed by: alc, kib Sponsored by: Netflix Sponsored by: Nginx, Inc.	2015-03-17 19:19:19 +00:00
Jung-uk Kim	2d427c524d	Fix white spaces.	2015-03-02 19:14:58 +00:00
Edward Tomasz Napierala	ead063e0a2	Make fuse(4) respect FOPEN_DIRECT_IO. This is required for correct operation of GlusterFS. PR: 192701 Submitted by: harsha at harshavardhana.net Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation	2015-03-02 19:04:27 +00:00
Warner Losh	03afe9ba70	nandfs_meta_bread() calls bread() which can set bp to NULL in some error cases. Calling brelse() with a NULL pointer is not allowed, so only call brelse() when the bp is non-NULL. Reported by: Maxime Villard (reported as uninitialized variable)	2015-03-01 21:41:37 +00:00
Alexander Kabaev	0fd841369a	Do not leak 'copy' buffer if bmap_truncate_indirect fails. Reported by: Brainy Code Scanner, by Maxime Villard. MFC after: 2 weeks	2015-02-28 22:24:45 +00:00
Konstantin Belousov	4fc4286fbc	Some fixes for fdescfs lookup code. Do not ever return doomed vnode from lookup. This could happen, if not checked, since dvp is relocked in the 'looking up ourselves' case. In the other case, since dvp is relocked, mount point might go away while fdesc_allocvp() is called. Prevent the situation by doing vfs_busy() before unlocking dvp. Reuse the vn_vget_ino_gen() helper. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-02-28 19:57:22 +00:00
Konstantin Belousov	08189ed667	The VNASSERT in vflush() FORCECLOSE case is trying to panic early to prevent errors from yanking devices out from under filesystems. Only care about special vnodes on devfs, special nodes on other kinds of filesystems do not have special properties. Sponsored by: EMC / Isilon Storage Division Submitted by: Conrad Meyer MFC after: 1 week	2015-02-27 16:43:50 +00:00
Pedro F. Giffuni	e5c356b2a2	ext2fs: Plug small memory leak free() e2fs_contigdirs upon error. Undo zeroing of e2fs_gd as this was actually a false positive. X-MFC with: 278790	2015-02-15 14:25:00 +00:00
Pedro F. Giffuni	6be4cf2244	Reuse value of cursize instead of recalculating. Reported by: Clang static checker MFC after: 1 week	2015-02-15 01:34:00 +00:00
Pedro F. Giffuni	f3ee91ed2b	Initialize the allocation of variables related to the ext2 allocator. The e2fs_gd struct was not being initialized and garbage was being used for hinting the ext2 allocator variant. Use malloc to clear the values and also initialize e2fs_contigdirs during allocation to keep consistency. While here clean up small style issues. Reported by: Clang static analyser MFC after: 1 week	2015-02-15 01:12:15 +00:00
Edward Tomasz Napierala	29836e077a	Restore ABI compatibility, broken in r273127. Note that while this fixes ABI with 10.1, it breaks ABI for 11-CURRENT, so rebuild of automountd(8) is necessary. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2015-02-10 16:17:16 +00:00
Konstantin Belousov	bf5fce2bee	Remove duplicated assignment. CID: 1267988 Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-02-03 12:09:48 +00:00
Konstantin Belousov	e0a60ae16a	Update directory times immediately after an entry is created or removed. Postponing it until tmpfs_getattr() is called causes discordant values reported for file times vs. directory times. Reported and tested by: madpilot Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-01-31 21:31:53 +00:00
Konstantin Belousov	f1a90a7bac	Remove single-use boolean. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-01-31 12:58:04 +00:00
Konstantin Belousov	311d39f2ee	POSIX states that write(2) "shall mark for update the last data modification and last file status change timestamps of the file". Currently, tmpfs only modifies ctime when file was extended. Since r277828 followed tmpfs_write(), mmaped writes also do not modify ctime. Fix this, by updating both ctime and mtime for writes to tmpfs files. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-01-31 12:27:18 +00:00
Dimitry Andric	7f4daa88f1	Fix a -Wcast-qual warning in smbfs_subr.c, by using __DECONST. No functional change. MFC after: 3 days	2015-01-30 22:02:32 +00:00
Dimitry Andric	38fc0aa484	Fix a -Wcast-qual warning in udf_vnops.c, by using __DECONST. No functional change. MFC after: 3 days	2015-01-30 22:01:45 +00:00
Dimitry Andric	03ce3d7219	Fix a bunch of -Wcast-qual warnings in cd9660_util.c, by using __DECONST. No functional change. MFC after: 3 days	2015-01-29 20:40:25 +00:00
Dimitry Andric	09bb4a314a	Fix a bunch of -Wcast-qual warnings in msdosfs_conv.c, by using __DECONST. No functional change. MFC after: 3 days	2015-01-29 20:30:13 +00:00
Jamie Gritton	464aad1407	Add allow.mount.fdescfs jail flag. PR: 192951 Submitted by: ruben@verweg.com MFC after: 3 days	2015-01-28 21:08:09 +00:00
Konstantin Belousov	f40cb1c645	Update mtime for tmpfs files modified through memory mapping. Similar to UFS, perform updates during syncer scans, which in particular means that tmpfs now performs scan on sync. Also, this means that a mtime update may be delayed up to 30 seconds after the write. The vm_object's OBJ_TMPFS_DIRTY flag for tmpfs swap object is similar to the OBJ_MIGHTBEDIRTY flag for the vnode object, it indicates that object could have been dirtied. Adapt fast page fault handler and vm_object_set_writeable_dirty() to handle OBJ_TMPFS_NODE same as OBJT_VNODE. Reported by: Ronald Klop <ronald-lists@klop.ws> Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-01-28 10:37:23 +00:00
Konstantin Belousov	3544b0f68f	tmpfs does not use UVM on FreeBSD. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2015-01-28 10:25:35 +00:00
Konstantin Belousov	3b50dff506	Stop enforcing additional reference on all cdevs, which was introduced in r277199. Acquire the necessary reference in delist_dev_locked() and inform destroy_devl() about it using CDP_UNREF_DTR flag. Fix some style nits, add asserts. Discussed with: hselasky Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-01-19 17:36:52 +00:00
Konstantin Belousov	a57a934a38	Ignore devfs directory entries for devices either being destroyed or delisted. The check is racy. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-01-19 17:24:52 +00:00
Enji Cooper	ae266f893f	Fix the build when INVARIANTS is defined by restoring `bo`'s definition in ext2_truncate(..) and by putting it under INVARIANTS ifdefs X-MFC with: r277354 MFC after: 2 weeks	2015-01-19 07:10:08 +00:00
Pedro F. Giffuni	9a53618ab2	ext2: Garbage-collect some unused variables Reported by: clang static analysis MFC after: 2 weeks	2015-01-19 03:30:45 +00:00
Pedro F. Giffuni	7075482d4a	ext2: fix for uninitialized pointer read. path.ep_bp was being used uninitialized in ext4_ext_find_extent(). CID: 1062344 MFC after: 1 week	2015-01-18 21:18:28 +00:00
Pedro F. Giffuni	955ba37baa	Remove dead code. After the ext2 variant of the "orlov allocator" was implemented, the case for a negative or zero dirsize disappeared. Drop the dead code and unsign dirsize given that it can't be negative anyways. CID: 1008669 MFC after: 1 week	2015-01-18 20:26:27 +00:00
Konstantin Belousov	e3612a4c1f	Make SIGSTOP working for sleeps done while waiting for fifo readers or writers in open(2), when the fifo is located on an NFS mount. Reported by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-01-18 15:03:26 +00:00
Pedro F. Giffuni	84b170d298	ext2: cosmetical issues Minor sorting and note when the cases are expected to fall through. MFC after: 1 week	2015-01-17 15:19:18 +00:00
Hans Petter Selasky	d2955419cd	Avoid race with "dev_rel()" when using the recently added "delist_dev()" function. Make sure the character device structure doesn't go away until the end of the "destroy_dev()" function due to concurrently running cleanup code inside "devfs_populate()". MFC after: 1 week Reported by: dchagin@	2015-01-14 22:07:13 +00:00
Hans Petter Selasky	9570004318	Don't use POLLNVAL as a return value from the client side poll function. Many existing clients don't understand POLLNVAL and instead rely on an error code from the read(), write() or ioctl() system call. Also make sure we wakeup any client pollers before the cuse server is closing, so they don't wait forever for an event.	2015-01-13 13:32:18 +00:00
Ed Maste	4882501b5c	ANSIfy msdosfs Add a few cases and style(9) fixes missed in r276887 Sponsored by: The FreeBSD Foundation	2015-01-12 21:55:48 +00:00
Ed Maste	10c9700f3e	ANSIfy sys/fs/msdosfs There are a number of msdosfs improvements in NetBSD that may be worth bringing over, and this reduces noise in the comparison. Differential Revision: https://reviews.freebsd.org/D1466 Reviewed by: kib Sponsored by: The FreeBSD Foundation	2015-01-09 14:50:08 +00:00
Robert Watson	eae6da3db4	Use M_SIZE() instead of hand-crafted (and mostly correct) NFSMSIZ() macro in the NFS server; garbage collect now-unused NFSMSIZ() and M_HASCL() macros. Also garbage collect now-unused versions in headers for the removed previous NFS client and server. Reviewed by: rmacklem Sponsored by: EMC / Isilon Storage Division	2015-01-07 17:22:56 +00:00
Mateusz Guzik	cd29b292b8	Convert nullfs hash lock from a mutex to an rwlock.	2014-12-30 21:41:35 +00:00
Rick Macklem	07d491dede	r245508 modified the NFS client's Setattr RPC to use VA_UTIMES_NULL to indicate whether it should set the time to the current tod on the server. This had the side effect of making the NFS client use the client's timestamp for exclusive create, starting with FreeBSD9.2. Unfortunately a bug in some Solaris NFS servers causes these servers to return NFS_OK to the Setattr RPC done during exclusive create, but not actually set the file's mode, leaving the file's mode == 0. This patch restores the NFS client's behaviour to use the server's tod for the exclusive open's Setattr RPC, to avoid the Solaris server bug and to restore the pre-FreeBSD9.2 NFS behaviour. Discussed on: freebsd-fs PR: 186293 MFC after: 3 months	2014-12-28 21:13:52 +00:00
Rick Macklem	2f88b3d20a	Delete some duplicate code that was harmless because exactly the same code is at the end of the nfscl_checksattr() function that is called just before it. As such, this code had already been executed and didn't do anything. MFC after: 1 week	2014-12-25 22:29:37 +00:00
Rick Macklem	52f1bb38c2	A deadlock in the NFSv4 server with vfs.nfsd.enable_locallocks=1 was reported via email. This was caused by a LOR between the sleep lock used to serialize the local locking (nfsrv_locklf()) and locking the vnode. I believe this patch fixes the problem by delaying relocking of the vnode until the sleep lock is unlocked (nfsrv_unlocklf()). To avoid nfsvno_advlock() having the side effect of unlocking the vnode, unlocking the vnode was moved to before the functions that call nfsvno_advlock(). It shouldn't affect the execution of the default case where vfs.nfsd.enable_locallocks=0. Reported by: loic.blot@unix-experience.fr Discussed with: kib MFC after: 1 week	2014-12-25 01:55:17 +00:00
Rick Macklem	62c23db947	Fix kernel builds with "options NFS_DEBUG" that were broken by r276096. Also delete the two kernel options NFS_GATHERDELAY, NFS_WDELAYHASHSIZ which are no longer used. Reported by: bz	2014-12-23 14:24:36 +00:00
Rick Macklem	c15882f091	Remove the old NFS client and server from head, which means that the NFSCLIENT and NFSSERVER kernel options will no longer work. This commit only removes the kernel components. Removal of unused code in the user utilities will be done later. This commit does not include an addition to UPDATING, but that will be committed in a few minutes. Discussed on: freebsd-fs	2014-12-23 00:47:46 +00:00
Konstantin Belousov	789bdfdbc6	Handle MAKEENTRY cnp flag in the VOP_CREATE(). Curiously, some fs, e.g. smbfs, already did it. Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-12-21 13:29:33 +00:00
Benno Rice	6d659a5d9b	Adjust the test of a KASSERT to better match the intent. This assertion was added in r246213 as a guard against corrupted mbufs arriving from drivers, the key distinguishing factor of said mbufs being that they had a negative length. Given we're in a while loop specifically designed to skip over zero-length mbufs, panicking on a zero-length mbuf seems incorrect. No objection from: kib	2014-12-19 19:09:22 +00:00
Konstantin Belousov	6c21f6edb8	The VOP_LOOKUP() implementations for CREATE op do not put the name into namecache, to avoid cache thrashing when doing large operations. E.g., tar archive extraction is not usually followed by access to many of the files created. Right now, each VOP_LOOKUP() implementation explicitly knows about this quirk and tests for both MAKEENTRY flag presence and op != CREATE to make the call to cache_enter(). Centralize the handling of the quirk into VFS, by deciding to cache only by MAKEENTRY flag in VOP. VFS now sets NOCACHE flag for CREATE namei() calls. Note that the change in semantic is backward-compatible and could be merged to the stable branch, and is compatible with non-changed third-party filesystems which correctly handle MAKEENTRY. Suggested by: Chris Torek <torek@pi-coral.com> Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-12-18 10:01:12 +00:00
Gleb Kurtsou	dde58752db	Adjust printf format specifiers for dev_t and ino_t in kernel. ino_t and dev_t are about to become uint64_t. Reviewed by: kib, mckusick	2014-12-17 07:27:19 +00:00
Pedro F. Giffuni	8f87059b41	ext2fs: Fix old out-of-bounds access. Overrunning buffer pointed to by (caddr_t)&oip->i_db[0] of 48 bytes by passing it to a function which accesses it at byte offset 59 using argument 60UL. The issue was inherited from an older FFS implementation and fixed there with by merging UFS2 in r98542. We follow the FFS fix. Discussed with: bde CID: 1007665 MFC after: 3 days	2014-12-09 14:56:00 +00:00
Konstantin Belousov	b2344ab5ff	Do not call VFS_SYNC() before VFS_UNMOUNT() for forced unmount. Since VFS does not/cannot stop writes, sync might run indefinitely, or be a wrong thing to do at all. E. g. NFS ignores VFS_SYNC() for forced unmounts, since non-responding server does not allow sync to finish. On the other hand, filesystems can and do stop writes using fs-specific facilities, and should already fully flush caches in VFS_UNMOUNT() due to the race. Adjust msdosfs to sync in unmount for forced call, to accommodate the new behaviour. Note that it is still racy, since writes are not stopped. Discussed with: avg, bjk, mckusick Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2014-12-09 10:00:47 +00:00
Konstantin Belousov	5c7bebf961	The process spin lock currently has the following distinct uses: - Threads lifetime cycle, in particular, counting of the threads in the process, and interlocking with process mutex and thread lock. The main reason of this is that turnstile locks are after thread locks, so you e.g. cannot unlock blockable mutex (think process mutex) while owning thread lock. - Virtual and profiling itimers, since the timers activation is done from the clock interrupt context. Replace the p_slock by p_itimmtx and PROC_ITIMLOCK(). - Profiling code (profil(2)), for similar reason. Replace the p_slock by p_profmtx and PROC_PROFLOCK(). - Resource usage accounting. Need for the spinlock there is subtle, my understanding is that spinlock blocks context switching for the current thread, which prevents td_runtime and similar fields from changing (updates are done at the mi_switch()). Replace the p_slock by p_statmtx and PROC_STATLOCK(). The split is done mostly for code clarity, and should not affect scalability. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-11-26 14:10:00 +00:00
Edward Tomasz Napierala	e3d5f1fe3b	Implement "automount -c". MFC after: 1 month Sponsored by: The FreeBSD Foundation	2014-11-22 16:48:29 +00:00
Edward Tomasz Napierala	836856e3e6	Fix smbfs to not zero out statfs f_flags field. Previously, this made getmntinfo() return empty flags for smbfs filesystems when called with MNT_WAIT. It's not visible with mount(8), since it uses MNT_NOWAIT, but broke autounmount(8) operation. PR: 195161 Differential Revision: https://reviews.freebsd.org/D1194 Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation	2014-11-21 06:21:39 +00:00
Pedro F. Giffuni	33587684a6	ifdef ext2_print_inode which is not really used. ext2_print_inode is not really used but it was nice to have for initial development work. #ifdef it under a new EXT2FS_DEBUG knob so that we don't spend time compiling it. MFC after: 3 days	2014-11-12 16:23:56 +00:00
Mateusz Guzik	4dba07b216	Fix up some session-related races in devfs. One was introduced with r272596, the rest was there to begin with. Noted by: jhb	2014-11-03 03:12:15 +00:00
Edward Tomasz Napierala	2fbe0cff73	Fix handling of "conn" mount_nfs(8) option. Reviewed by: rmacklem@ MFC after: 1 month Sponsored by: The FreeBSD Foundation	2014-10-30 09:25:03 +00:00
Edward Tomasz Napierala	5a06ac3540	Add support for "timeo", "actimeo", "noac", and "proto" options to mount_nfs(8). They are implemented on Linux, OS X, and Solaris, and thus can be expected to appear in automounter maps. Reviewed by: rmacklem@ MFC after: 1 month Sponsored by: The FreeBSD Foundation	2014-10-30 08:50:01 +00:00
Konstantin Belousov	42ecb595f2	Allow the vfs.nfsd knobs to be set from loader.conf (or using kenv(8)). This is useful when nfsd is loaded as module. Reviewed by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-10-27 07:47:13 +00:00
Rick Macklem	6a30c96cdc	Clip the settings for the NFS rsize, wsize mount options to a power of 2. For non-power of 2 settings, intermittent page faults have been reported. Although the bug that causes these page faults/crashes has not been identified, it does not appear to occur when rsize, wsize is a power of 2. Reported by: tcberner@gmail.com MFC after: 2 weeks	2014-10-22 22:27:51 +00:00
Rick Macklem	fcf121d481	Revert r273481 so it can be recoded using fls(), which some feel will make it more readable.	2014-10-22 21:57:35 +00:00
Rick Macklem	88cc4e92da	Clip the settings for the NFS rsize, wsize mount options to a power of 2. For non-power of 2 settings, intermittent page faults have been reported. Although the bug that causes these page faults/crashes has not been identified, it does not appear to occur when rsize, wsize is a power of 2. Reported by: tcberner@gmail.com MFC after: 2 weeks	2014-10-22 20:47:11 +00:00
Mateusz Guzik	12e2a30ef9	tmpfs: allow shared file lookups Tested by: pho	2014-10-21 21:27:13 +00:00
Hans Petter Selasky	f0188618f2	Fix multiple incorrect SYSCTL arguments in the kernel: - Wrong integer type was specified. - Wrong or missing "access" specifier. The "access" specifier sometimes included the SYSCTL type, which it should not, except for procedural SYSCTL nodes. - Logical OR where binary OR was expected. - Properly assert the "access" argument passed to all SYSCTL macros, using the CTASSERT macro. This applies to both static- and dynamically created SYSCTLs. - Properly assert the data type for both static and dynamic SYSCTLs. In the case of static SYSCTLs we only assert that the data pointed to by the SYSCTL data pointer has the correct size, hence there is no easy way to assert types in the C language outside a C-function. - Rewrote some code which doesn't pass a constant "access" specifier when creating dynamic SYSCTL nodes, which is now a requirement. - Updated "EXAMPLES" section in SYSCTL manual page. MFC after: 3 days Sponsored by: Mellanox Technologies	2014-10-21 07:31:21 +00:00
Mateusz Guzik	4fce16e4c9	Provide vfs suspension support only for filesystems which need it, take two. nullfs and unionfs need to request suspension if underlying filesystem(s) use it. Utilize mnt_kern_flag for this purpose. This is a fixup for 273271. No strong objections from: kib Pointy hat to: mjg MFC after: 2 weeks	2014-10-20 18:00:50 +00:00
Mateusz Guzik	a8a07fd613	unionfs: hold mount interlock while manipulating mnt_flag This is for consistency with other filesystems.	2014-10-20 17:53:49 +00:00
Mateusz Guzik	020b8f17a0	Provide vfs suspension support only for filesystems which need it. Need is expressed by providing vfs_susp_clean function in vfsops. Differential Revision: D952 Reviewed by: kib (previous version) MFC after: 2 weeks	2014-10-19 06:59:33 +00:00
Edward Tomasz Napierala	5742494d29	Remove useless debug. Sponsored by: The FreeBSD Foundation	2014-10-17 12:06:48 +00:00
Marcelo Araujo	f9246664f5	Make the sysctl(8) for checkutf8 positively defined and improve the description of it. Submitted by: Ronald Klop <ronald-lists@klop.ws> Reviewed by: rmacklem Approved by: rmacklem Sponsored by: QNAP Systems Inc.	2014-10-17 02:11:09 +00:00


3460 Commits