freebsd-skq

Author	SHA1	Message	Date
Pedro F. Giffuni	fca15474a0	ext2fs: renumber the license clauses to avoid skipping #3 . This is to keep consistency with other files, and help license-checking utilities determine the number of clauses that apply. No functional change.	2016-12-02 19:47:23 +00:00
Konstantin Belousov	abc1515601	NFSv4 client tracks opens, and the track records are only dropped when the vnode is inactivated. This contradicts with the nullfs caching which keeps upper vnode around, as consequence keeping the use reference to lower vnode. Add a filesystem flag to request nullfs to not cache when mounted over that filesystem, and set the flag for nfs v4 mounts. Reported by: asomers Reviewed by: rmacklem Tested by: asomers, rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-11-27 09:20:58 +00:00
Pedro F. Giffuni	bb9535bbc7	ext2: avoid possible overflow when calculating malloc size. This is inspired on r308064 for case of reloading UFS. MFC after: 1 week	2016-11-26 02:06:33 +00:00
Rick Macklem	1a2079d936	Stop "nfsstat -z" from clearing counts of NFSv4 state structures. The "-z" option on nfsstats was erroneously zeroing out the counts of NFSv4 state structures. These counts will normally go back down to zero as state is released. When zeroed out by "-z", these counts can go negative. This patch fixes this problem. MFC after: 2 weeks	2016-11-25 23:28:09 +00:00
Mark Johnston	99e6e1930c	Release laundered vnode pages to the head of the inactive queue. The swap pager enqueues laundered pages near the head of the inactive queue to avoid another trip through LRU before reclamation. This change adds support for this behaviour to the vnode pager and makes use of it in UFS and ext2fs. Some ioflag handling is consolidated into a common subroutine so that this support can be easily extended to other filesystems which make use of the buffer cache. No changes are needed for ZFS since its putpages routine always undirties the pages before returning, and the laundry thread requeues the pages appropriately in this case. Reviewed by: alc, kib Differential Revision: https://reviews.freebsd.org/D8589	2016-11-23 17:53:07 +00:00
Alan Cox	bba39b9ae3	Remove PG_CACHED-related fields from struct vmmeter, because they are no longer used. More precisely, they are always zero because the code that decremented and incremented them no longer exists. Bump __FreeBSD_version to mark this change. Reviewed by: kib, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D8583	2016-11-22 18:13:46 +00:00
Konstantin Belousov	1fa81dab7d	On error, bread(9) zeroes buffer pointer, do not dereference it. See r294954 for the bread(9) change and r297401 for similar cd9660 fix. Reported and tested by: Joshua Kinard <kumba@gentoo.org> PR: 214705 Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-11-22 13:24:57 +00:00
Konstantin Belousov	753a007f0d	Use buffer pager for NFS. The pager, due to its construction, implements clustering for the page-ins. In particular, buildworld load demonstrates reduction of the READ RPCs from 39k down to 24k. No change in real or CPU time was observed. Discussed with, and measured by: bde No objections from: rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-11-22 10:58:24 +00:00
Konstantin Belousov	fc2c3afee0	Minor cleanup, remove unneeded XXX comments and unused re-define. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-11-22 10:24:59 +00:00
Colin Percival	63659ba6df	Reduce NFS "NFSv4( mounted on)? fileid > 32bits" log spam. Rather than printing a warning for every time we receive a fileid > 2^32 from the NFS server, count warnings and print at most one of each warning type per minute, e.g., Nov 15 05:17:34 ip-172-30-1-221 kernel: NFSv4 fileid > 32bits (24730 occurrences) Nov 15 05:17:56 ip-172-30-1-221 kernel: NFSv4 mounted on fileid > 32bits (178 occurrences) Nov 15 05:18:53 ip-172-30-1-221 kernel: NFSv4 fileid > 32bits (7582 occurrences) Nov 15 05:18:58 ip-172-30-1-221 kernel: NFSv4 mounted on fileid > 32bits (23 occurrences) A buildworld with an NFS mounted /usr/obj can otherwise result in hundreds of thousands of lines being printed, which seems unnecessarily verbose. When ino_t becomes a 64-bit type, these printfs will no longer be needed (and the problems associated with truncating 64-bit fileids to generate 32-bit inode numbers will also go away). Reviewed by: rmacklem MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D8523	2016-11-16 01:11:49 +00:00
Alan Cox	7667839a7e	Remove most of the code for implementing PG_CACHED pages. (This change does not remove user-space visible fields from vm_cnt or all of the references to cached pages from comments. Those changes will come later.) Reviewed by: kib, markj Tested by: pho Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D8497	2016-11-15 18:22:50 +00:00
Edward Tomasz Napierala	53232c0d1d	Remove spurious space. MFC after: 1 month	2016-11-13 12:06:25 +00:00
Bryan Drewery	28323add09	Fix improper use of "its". Sponsored by: Dell EMC Isilon	2016-11-08 23:59:41 +00:00
Edward Tomasz Napierala	a79e9d0fec	Value returned by taskqueue_enqueue_timeout(9) is not an error; don't treat it as such. MFC after: 1 month	2016-11-05 12:30:10 +00:00
Konstantin Belousov	7359fdcf5f	Allow some dotdot lookups in capability mode. If dotdot lookup does not escape from the file descriptor passed as the lookup root, we can allow the component traversal. Track the directories traversed, and check the result of dotdot lookup against the recorded list of the directory vnodes. Dotdot lookups are enabled by sysctl vfs.lookup_cap_dotdot, currently disabled by default until more verification of the approach is done. Disallow non-local filesystems for dotdot, since remote server might conspire with the local process to allow it to escape the namespace. This might be too cautious, provide the knob vfs.lookup_cap_dotdot_nonlocal to override as well. Idea by: rwatson Discussed with: emaste, jonathan, rwatson Reviewed by: mjg (previous version) Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 week Differential revision: https://reviews.freebsd.org/D8110	2016-11-02 12:43:15 +00:00
Konstantin Belousov	c329ee711b	Use buffer pager for cd9660. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-10-28 11:46:39 +00:00
Konstantin Belousov	06965e96b3	Use buffer pager for msdosfs. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-10-28 11:46:15 +00:00
Konstantin Belousov	2aa3944510	Enable vn_io_fault() deadlock avoidance for msdosfs. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-10-28 11:35:06 +00:00
Konstantin Belousov	b05088aeeb	Ensure that cluster allocations never allocate clusters outside the volume limits. In particular: - Assert that usemap_alloc() and usemap_free() cluster number argument is valid. - In chainlength(), return 0 if cluster start is after the max cluster. - In chainlength(), cut the calculated cluster chain length at the max cluster. - For true paranoia, after the pm_inusemap is calculated in fillinusemap(), reset all bits in the array for clusters after the max cluster, as in-use. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-10-28 11:34:32 +00:00
Konstantin Belousov	03b8a419e4	If the fatchain() call in chainalloc() returned an error, revert marking the cluster run as in-use. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-10-28 11:26:44 +00:00
Konstantin Belousov	f33d62b2d2	Use symbolic name for the value of fully free word in pm_inusemap. Explicitely mention every bit in the value. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-10-28 11:23:36 +00:00
Konstantin Belousov	1c4ec415e2	Use symbolic name for the free cluster number. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-10-28 11:01:49 +00:00
Konstantin Belousov	f220587d03	Fix comment formatting. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-10-28 10:59:34 +00:00
Konstantin Belousov	8b3c8abc45	Remove useless NULL check. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-10-28 10:57:41 +00:00
Rick Macklem	dcb19c3886	A problem w.r.t. interoperation between the FreeBSD NFSv4.1 server with delegations enabled and the Linux NFSv4.1 client was reported in reviews.freebsd.org/D7891. I believe that the FreeBSD server behaviour conforms to the RFC and that the Linux client has a bug. Therefore, I do not think the proposed patch is appropriate. When nfsrv_writedelegifpos is non-zero, the FreeBSD server will issue a write delegation for a read open if possible. The Linux client then erroneously assumes that the credentials used for the read open can write the file. This patch reverses the default value for nfsrv_writedelegifpos to 0 so that the default behaviour is Linux compatible and adds a sysctl that can be used to set nfsrv_writedelegifpos. This change should only affect users that are mounting a FreeBSD server with delegations enabled (they are not enabled by default) with a Linux NFSv4.1 client mount. Reported by: fatih.acar@gandi.net Tested by: fatih.acar@gandi.net MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D7891	2016-10-20 23:53:16 +00:00
Ganael LAPLANCHE	2ee1ec5d82	Fix panic() message reporting ufs instead of nandfs PR: 213438 Approved by: kib	2016-10-13 19:33:07 +00:00
Mateusz Guzik	8660b707ff	vfs: remove the __bo_vnode field from struct vnode The pointer can be obtained using __containerof instead. Reviewed by: kib	2016-09-30 17:11:03 +00:00
Alan Somers	0696afbe09	Mount msdosfs with longnames support by default. The old behavior depended on the FAT version and on what files were in the root directory. "mount_msdosfs -o shortnames" is still supported. Reviewed by: wblock, cem Discussed with: trasz, adrian, imp MFC after: 4 weeks X-MFC-Notes: Don't MFC the removal of findwin95 Differential Revision: https://reviews.freebsd.org/D8018	2016-09-23 19:05:07 +00:00
Hans Petter Selasky	19f1b6fb7b	Prevent cuse4bsd.ko and cuse.ko from loading at the same time by declaring support for the cuse4bsd interface in cuse.ko. Found by: Sergey V. Dyatko <sergey.dyatko@gmail.com> MFC after: 1 week	2016-09-23 07:41:23 +00:00
Edward Tomasz Napierala	e583d99909	Change the getnewvnode(9) tag for nullfs from "null" to "nullfs". It's more consistent, and besides, the "null" alone looks weird. MFC after: 1 month	2016-09-15 13:57:37 +00:00
Mateusz Guzik	6a3e46059a	nullfs: plug vnode ref leak in null_vptocnp The lower vnode is already referenced and nodeget is supposed to consume the reference. Thus the extra vref call was causing a leak. Reported by: pho Reviewed by: kib MFC after: 1 week	2016-09-09 10:40:55 +00:00
Mateusz Guzik	2740551545	nullfs: stop special-casing directories in null_vptocnp The previous code was forcing an expensive walk in vop_stdvptocnp, which was causing performance issues on highly contended zfs. No objections: kib MFC after: 2 weeks	2016-09-06 21:22:03 +00:00
Konstantin Belousov	47e61f6cc6	Implement VOP_FDATASYNC() for msdosfs. Standard VOP_FSYNC() implementation just syncs data buffers, and due to this, is the correct and efficient implementation for msdosfs or any other filesystem which uses bufer cache trivially. Provide globally visible wrapper vop_stdfdatasync_buf() for future consumption by other filesystems. Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D7471	2016-08-15 19:17:00 +00:00
Rick Macklem	1b819cf265	Update the nfsstats structure to include the changes needed by the patch in D1626 plus changes so that it includes counts for NFSv4.1 (and the draft of NFSv4.2). Also, make all the counts uint64_t and add a vers field at the beginning, so that future revisions can easily be implemented. There is code in place to handle the old vesion of the nfsstats structure for backwards binary compatibility. Subsequent commits will update nfsstat(8) to use the new fields. Submitted by: will (earlier version) Reviewed by: ken MFC after: 1 month Relnotes: yes Differential Revision: https://reviews.freebsd.org/D1626	2016-08-12 22:44:59 +00:00
Edward Tomasz Napierala	c12582546e	Implement autofs_print(), for improved debugging experience. MFC after: 1 month	2016-08-11 14:27:23 +00:00
Edward Tomasz Napierala	411455a8fb	Replace all remaining calls to vprint(9) with vn_printf(9), and remove the old macro. MFC after: 1 month	2016-08-10 16:12:31 +00:00
Konstantin Belousov	15ad3e51c5	Convert another tmpfs assert into runtime check. The offset of the directory file, passed to getdirentries(2) syscall, is user-controllable. The value of the offset must not be asserted, instead the invalid value should be checked and rejected if invalid. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-08-10 13:50:21 +00:00
Pedro F. Giffuni	25efc7c822	ext2fs: Add defines for some missing ext4 feature flags. These are currently unused in our implementation and some even appear to have not been implemented yet on linux but it is good to keep them for reference. Obtained from: NetBSD (CVS Rev. 1.41) MFC after: 1 month	2016-08-06 17:24:35 +00:00
Pedro F. Giffuni	ef4736fea8	ext2fs: Add some more inode flags. These are currently unused in out implementation but it is good to keep them for reference. Obtained from: NetBSD (CVS Rev. 1.35) MFC after: 1 month	2016-08-06 16:48:40 +00:00
Konstantin Belousov	ad600ac8e3	Remove ncl_printf(), use printf(9) directly. After r303710 the function duplicates printf(). Correct function names in the messages []. Noted by: bde [] Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-08-03 15:58:20 +00:00
Konstantin Belousov	83d7cf21ea	Remove unneeded (recursing) Giant acquisition around vprintf(9). Reviewed by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-08-03 11:49:17 +00:00
Konstantin Belousov	29ffb32ccd	Remove Giant asserts. Update comment. Owning Giant in the init/uninit is accidental due to the moment where VFS modules initialization is performed, and is not enforced by the VFS interface. The Giant lock does not prevent a parallel execution of the code, it is VFS which implements the proper protocol. Approved by: des (pseudofs maintainer) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-08-03 08:57:15 +00:00
Konstantin Belousov	6828ba639a	Some style changes. Fix a typo in comment. Approved by: des (pseudofs maintainer) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-08-03 08:53:29 +00:00
Edward Tomasz Napierala	905807264d	Remove write-only variable. MFC after: 1 month	2016-07-29 12:15:55 +00:00
Konstantin Belousov	584b675ed6	Hide the boottime and bootimebin globals, provide the getboottime(9) and getboottimebin(9) KPI. Change consumers of boottime to use the KPI. The variables were renamed to avoid shadowing issues with local variables of the same name. Issue is that boottime* should be adjusted from tc_windup(), which requires them to be members of the timehands structure. As a preparation, this commit only introduces the interface. Some uses of boottime were found doubtful, e.g. NLM uses boottime to identify the system boot instance. Arguably the identity should not change on the leap second adjustment, but the commit is about the timekeeping code and the consumers were kept bug-to-bug compatible. Tested by: pho (as part of the bigger patch) Reviewed by: jhb (same) Discussed with: bde Sponsored by: The FreeBSD Foundation MFC after: 1 month X-Differential revision: https://reviews.freebsd.org/D7302	2016-07-27 11:08:59 +00:00
Conrad Meyer	af326ace9d	devfs: Move most ioctl logic down to vnode layer Devfs' file layer ioctl is now just a thin shim around the vnode layer. Reviewed by: kib Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D7286	2016-07-25 16:28:02 +00:00
Hans Petter Selasky	010638ab22	Handle IOC_VOID special case of passing an integer IOCTL argument through CUSE. Submitted by: Vladimir Kondratyev <wulf@cicgroup.ru> Approved by: re (gjb)	2016-07-06 22:21:22 +00:00
Konstantin Belousov	3a1e5dd8e6	Rewrite sigdeferstop(9) and sigallowstop(9) into more flexible framework allowing to set the suspension policy for the dynamic block. Extend the currently possible policies of stopping on interruptible sleeps and ignoring such sleeps by two more: do not suspend at interruptible sleeps, but interrupt them with either EINTR or ERESTART. Reviewed by: jilles Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Approved by: re (gjb)	2016-06-26 20:07:24 +00:00
Konstantin Belousov	20de93c6c0	Clean other flags in ncl_inactive, only. Add comment explaining why other flags should be unset. Suggested and reviewed by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 12 days Approved by: re (gjb)	2016-06-26 14:18:28 +00:00
Konstantin Belousov	8f73d398ed	Since VOP_INACTIVE() is not guaranteed to be called, all cleanups executed by inactive methods, must be repeated on reclaim. In particular, unlink and free sillyrenamed vnode both on inactivation and reclaim. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Approved by: re (gjb)	2016-06-25 11:34:06 +00:00
Konstantin Belousov	e37dfd3d2b	Do not access NFS data for reclaimed vnode. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Approved by: re (delphij)	2016-06-19 18:29:43 +00:00
Konstantin Belousov	2d5bba3ae3	Another follow-up to r291460. Only access vp->v_rdev for VCHR vnodes in devfs_reclaim(). Reported and tested by: pho Sponsored by: The FreeBSD Foundation Approved by: re (gjb) MFC after: 1 week	2016-06-15 15:55:14 +00:00
Kevin Lo	45cbcf9b83	Fix a style bug.	2016-06-08 02:39:10 +00:00
Pedro F. Giffuni	633785280d	ext2fs: Stop dropping and reacquiring Giant around geom calls. As in UFS r300366.	2016-06-07 21:40:42 +00:00
Conrad Meyer	ab8316b8df	nfs_clvfsops: Fix leading whitespace introduced in r299848 Replace spaces with tabs. No functional change. Sponsored by: EMC / Isilon Storage Division	2016-06-07 20:16:01 +00:00
Conrad Meyer	15634fd60c	nfs_clvfsops: Prevent strdup of stack garbage with bogus mount specs If strlen(hostp) was zero, the stack array 'nam' would never be initialized before being strdup()ed. Fix this by initializing it to the empty string. It's possible some external condition makes this case impossible, in which case, an assertion instead of this workaround is appropriate. Introduced in r299848. Reported by: Coverity CID: 1355336 Sponsored by: EMC / Isilon Storage Division	2016-06-07 20:00:20 +00:00
Pedro F. Giffuni	2e621997eb	ext2fs: rearrange ext4_bmapext(). While here assign error a bit later. Reviewed by: Damjan Jovanovich Obtained from: NetBSD	2016-06-07 18:23:22 +00:00
Pedro F. Giffuni	96e9f46789	ext2fs(5): Cosmetic cleanups, mostly to the ext4 code. Obtained from: NetBSD	2016-06-07 17:08:34 +00:00
Pedro F. Giffuni	43ce40e891	ext2fs: cleanup generation number management. Ext2/3/4 manages generation numbers differently than UFS so adopt some rules that should work well. When allocating a new inode, make sure we generate a "good" random value specifically avoiding zero. Don't interfere with the numbers that are already generated in the filesystem: ext2fs doesn't have the backwards compatibility issues where there were no generation numbers. Reviewed by: kevlo MFC after: 1 week	2016-06-07 14:37:43 +00:00
Konstantin Belousov	df5905fe7d	Remove drop/reacquire of Giant around geom calls for cd9660 and udf. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-05-22 18:16:25 +00:00
Kevin Lo	57d2ac2f90	arc4random() returns 0 to (2**32)−1, use an alternative to initialize i_gen if it's zero rather than a divide by 2. With inputs from delphij, mckusick, rmacklem Reviewed by: mckusick	2016-05-22 14:31:20 +00:00
Konstantin Belousov	bb8297e6d4	Same as for UFS, remove drop/reacquire of Giant, and use si_mountpt as the mount semaphore. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-05-21 11:40:41 +00:00
Konstantin Belousov	ae40237874	Remove zero assignments in the cdev allocator. cdp memory is requested with M_ZERO. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-05-21 09:55:32 +00:00
Rick Macklem	372b97d0b6	If a local (AF_LOCAL, AF_UNIX) socket creation (bind) is attempted on a fuse mounted file system, it will crash. Although it may be possible to make this work correctly, this patch avoids the crash in the meantime. I removed the MPASS(), since panicing for the FIFO case didn't make a lot of sense when it returns an error for the others. PR: 195000 Submitted by: henry.hu.sh@gmail.com (earlier version) MFC after: 2 weeks	2016-05-18 22:23:20 +00:00
Gleb Smirnoff	fefbf77024	Comment fix: the getsockaddr() is actually meant here. Reviewed by: rmacklem	2016-05-18 17:40:53 +00:00
Edward Tomasz Napierala	e635011374	Silence down the "insmntque() failed" autofs error; it happens on shutdown and is perfectly normal. MFC after: 1 month Sponsored by: The FreeBSD Foundation	2016-05-17 12:04:39 +00:00
Rick Macklem	e6e2445622	Fix fuse for "cp" of a mode 0444 file to the file system. When "cp" of a file with read-only (mode 0444) to a fuse mounted file system was attempted it would fail with EACCES. This was because fuse would attempt to open the file WRONLY and the open would fail. This patch changes the fuse_vnop_open() to test for an extant read-write open and use that, if it is available. This makes the "cp" of a read-only file to the fuse mounted file system work ok. There are simpler ways to fix this than adding the fuse_filehandle_validrw() function, but this function is useful for future patches related to exporting a fuse filesystem via NFS. MFC after: 2 weeks	2016-05-15 23:15:10 +00:00
Edward Tomasz Napierala	0d1654c39b	Make it possible to reroot into NFS. This means one can have eg an NFSv4 root over WiFi: boot from md_root (small rootfs image preloaded by loader(8)), setup WiFi, and then reroot into the actual root, over NFS. Note that it's currently limited to NFSv4, and due to problems with nfsuserd(8) it requres a workaround on the server side: one needs to set the vfs.nfsd.enable_stringtouid=1 sysctl and not run nfsuserd(8) on either the server or the client side. Reviewed by: rmacklem@ MFC after: 1 month Relnotes: yes Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D6347	2016-05-15 08:34:59 +00:00
Rick Macklem	72393e3d80	Fix fuse so that stale buffer cache data isn't read. When I/O on a file under fuse is switched from buffered to DIRECT_IO, it was possible to read stale (before a recent modification) data from the buffer cache. This patch invalidates the buffer cache for the file to fix this. PR: 194293 MFC after: 2 weeks	2016-05-15 00:45:17 +00:00
Rick Macklem	1390cca2b1	Fix fuse to use DIRECT_IO when required. When a file is opened write-only and a partial block was written, buffered I/O would try and read the whole block in. This would result in a hung thread, since there was no open (fuse filehandle) that allowed reading. This patch avoids the problem by forcing DIRECT_IO for this case. It also sets DIRECT_IO when the file system specifies the FN_DIRECTIO flag in its reply to the open. Tested by: nishida@asusa.net, freebsd@moosefs.com PR: 194293, 206238 MFC after: 2 weeks	2016-05-14 20:03:22 +00:00
Conrad Meyer	5ecc225fc5	nfsd: Fix use-after-free in NFS4 lock test service Trivial use-after-free where stp was freed too soon in the non-error path. To fix, simply move its release to the end of the routine. Reported by: Coverity CID: 1006105 Sponsored by: EMC / Isilon Storage Division	2016-05-12 05:03:12 +00:00
Konstantin Belousov	b6a60ae74a	Use vfs_hash_ref(9) to eliminate LK_EXCLOTHER kludge. As a consequence, the nfs client override of VOP_LOCK1() is no longer needed. Reviewed and tested by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-05-11 06:35:46 +00:00
Rick Macklem	de2413b95e	Don't increment srvrpccnt[] for the NFSv4.1 operations. When support for NFSv4.1 was added to the NFS server, it broke the server rpc count stats, since newnfsstats.srvrpccnt[] doesn't have entries for the new NFSv4.1 operations. Without this patch, the code was incrementing bogus entries in newnfsstats for the new NFSv4.1 operations. This patch is an interim fix. The nfsstats structure needs to be updated and that will come in a future commit. Reported by: cem MFC after: 2 weeks	2016-05-07 22:45:08 +00:00
Pedro F. Giffuni	ee58b56452	nfsserver: minor spelling fix in comment. No functional change.	2016-05-06 23:40:37 +00:00
Rick Macklem	8eabbbe24b	Give mountd -S priority over outstanding RPC requests when suspending the nfsd. It was reported via email that under certain heavy RPC loads long delays before the exports would be updated was observed when using "mountd -S". This patch reverses the priority between the exclusive lock request to suspend the nfsd threads and the shared lock request for performing RPCs. As such, when mountd attempts to suspend the nfsd threads, it gets priority over outstanding RPC requests to do this. I suspect that the case reported was an artificial test load, but this patch did fix the problem for the reporter. Reported and Tested by: josephlai@qnap.com MFC after: 2 weeks	2016-05-06 23:26:17 +00:00
Ed Maste	8edac6eee6	Add nid_namelen bounds check to nfssvc system call This is only allowed by root and only used by the nfs daemon, which should not provide an incorrect value. However, it's still good practice to validate data provided by userland. PR: 206626 Reported by: CTurt <cturt@hardenedbsd.org> Reviewed by: rmacklem MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D6201	2016-05-06 21:19:28 +00:00
Ed Maste	58fef175e4	Rationalize license numbering in fdescfs(5)	2016-04-30 16:01:37 +00:00
Pedro F. Giffuni	4ed3c0e713	sys: Make use of our rounddown() macro when sys/param.h is available. No functional change.	2016-04-30 14:41:18 +00:00
Ed Maste	799e4e488f	ANSIfy fdescfs(5)	2016-04-30 12:44:03 +00:00
Pedro F. Giffuni	b3a15ddd5b	sys/fs: spelling fixes in comments. No functional change.	2016-04-29 20:51:24 +00:00
Pedro F. Giffuni	91a25a7d6d	fs/ext2fs: spelling fixes on comment. No functional change.	2016-04-29 20:45:50 +00:00
Pedro F. Giffuni	a96c9b30e2	NFS: spelling fixes on comments. No funcional change.	2016-04-29 16:07:25 +00:00
Pedro F. Giffuni	b114da42af	sys/devfs: unsign an index to prevent signed integer overflow. cdp_maxdirent in struct:cdev_priv is of type u_int. Use the same type for the corresponding index in devfs_revoke(). MFC after: 1 week	2016-04-28 02:39:43 +00:00
Kristof Provost	66527f742b	msdosfs: Prevent buffer overflow when expanding win95 names In win2unixfn() we expand Windows 95 style long names. In some cases that requires moving the data in the nbp->nb_buf buffer backwards to make room. That code failed to check for overflows, leading to a stack overflow in win2unixfn(). We now check for this event, and mark the entire conversion as failed in that case. This means we present the 8 character, dos style, name instead. PR: 204643 Differential Revision: https://reviews.freebsd.org/D6015	2016-04-26 20:36:32 +00:00
Pedro F. Giffuni	55e0987aea	sys: extend use of the howmany() macro when available. We have a howmany() macro in the <sys/param.h> header that is convenient to re-use as it makes things easier to read.	2016-04-26 15:38:17 +00:00
Pedro F. Giffuni	ee7ae58a45	ext2fs: make use of the howmany() macro when available. We have a howmany() macro in the <sys/param.h> header that is convenient to re-use as it makes things easier to read. MFC after: 2 weeks	2016-04-26 01:41:15 +00:00
Rick Macklem	ae03cbd7f3	Allow the NFSv4 server to reply NFSERR_WRONGSEC for the SetClientID operation. It was reported via email that a Linux client couldn't do a Kerberized NFS mount when only "sec=krb5" was specified for the exports. The Linux client attempted a mount via krb5i and the server replied NFSERR_SERVERFAULT. Although NFSERR_WRONGSEC isn't listed as an error for SetClientID, I think it is the correct reply, so this patch enables that. I do not know if this fixes the mount attempt, but adding "krb5i" to the list of allowed security flavours does allow the mount to work. Reported by: joef@spectralogic.com MFC after: 2 weeks	2016-04-23 21:18:45 +00:00
Pedro F. Giffuni	4cb92c4cf4	ext2_htree_release(): prevent signed integer overflow in a loop. h_levels_num, as most data structs in ext2fs, is unsigned so the index that addresses it has to be unsigned as well. To get to overflow here we would probably be considering a degenerate case though. MFC after: 5 days	2016-04-23 18:28:59 +00:00
Rick Macklem	0533d72612	Fix a LOR in the NFSv4.1 server. The ordering of acquisition of the state and session mutexes was reversed in two cases executed when an NFSv4.1 client created/freed a session. Since clients will typically do this only when mounting and dismounting, the likelyhood of causing a deadlock was low but possible. This can only occur for NFSv4.1 mounts, since the others do not use sessions. This was detected while testing the pNFS server/client where the client crashed during dismounting. The patch also reorders the unlocks, although that isn't necessary for correct operation. MFC after: 2 weeks	2016-04-23 01:22:04 +00:00
Pedro F. Giffuni	d9c9c81c08	sys: use our roundup2/rounddown2() macros when param.h is available. rounddown2 tends to produce longer lines than the original code and when the code has a high indentation level it was not really advantageous to do the replacement. This tries to strike a balance between readability using the macros and flexibility of having the expressions, so not everything is converted.	2016-04-21 19:57:40 +00:00
Pedro F. Giffuni	02abd40029	kernel: use our nitems() macro when it is available through param.h. No functional change, only trivial cases are done in this sweep, Discussed in: freebsd-current	2016-04-19 23:48:27 +00:00
Pedro F. Giffuni	0d3e502f92	fs misc: for pointers replace 0 with NULL. Mostly cosmetical, no functional change. Found with devel/coccinelle.	2016-04-15 17:28:24 +00:00
Rick Macklem	13c581fc54	If the VOP_SETATTR() call that saves the exclusive create verifier failed, the NFS server would leave the newly created vnode locked. This could result in a file system that would not unmount and processes wedged, waiting for the file to be unlocked. Since this VOP_SETATTR() never fails for most file systems, this bug doesn't normally manifest itself. I found it during testing of an exported GlusterFS file system, which can fail. This patch adds the vput() and changes the error to the correct NFS one. MFC after: 2 weeks	2016-04-12 20:23:09 +00:00
Rick Macklem	84aa8a8ad1	Bruce Evans reported that there was a performance regression between the old and new NFS clients. He did a good job of isolating the problem which was caused by the new NFS client not setting the post write mtime correctly. The new NFS client code was cloned from the old client, but was incorrect, because the mtime in the nfs vnode's cache wasn't yet updated. This patch fixes this problem. The patch also adds missing mutex locking. Reported and tested by: bde MFC after: 2 weeks	2016-04-11 21:55:21 +00:00
Pedro F. Giffuni	e45e8680ed	ext2fs: replace 0 with NULL for pointers. While here do late initialization of ebap, similar as was done in UFS. Found with devel/coccinelle. MFC after: 2 weeks	2016-04-11 00:12:24 +00:00
Pedro F. Giffuni	74b8d63dcc	Cleanup unnecessary semicolons from the kernel. Found with devel/coccinelle.	2016-04-10 23:07:00 +00:00
Kevin Lo	2b3506d919	Fix comment.	2016-04-08 04:29:05 +00:00
Edward Tomasz Napierala	ae34b6ff96	Add four new RCTL resources - readbps, readiops, writebps and writeiops, for limiting disk (actually filesystem) IO. Note that in some cases these limits are not quite precise. It's ok, as long as it's within some reasonable bounds. Testing - and review of the code, in particular the VFS and VM parts - is very welcome. MFC after: 1 month Relnotes: yes Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D5080	2016-04-07 04:23:25 +00:00
Kevin Lo	df04a188af	Update comment: Linux does set a randomized generation number of an inode on ext2/3/4. While here use arc4random() instead of random(). Reviewed by: pfg MFC after: 3 days	2016-04-01 03:21:01 +00:00
Konstantin Belousov	cc4916adf2	Do not access buffer if bread(9) or cluster_read(9) failed. On error, the functions free the buffer and set the pointer to NULL. Also remove useless call to brelse(9) on the error path. PR: 208275 Submitted by: Fabian Keil <fk@fabiankeil.de> MFC after: 2 weeks	2016-03-29 19:59:44 +00:00
Kevin Lo	98a768596e	Update superblock and inode structs for ext4. Reviewed by: pfg	2016-03-28 07:44:55 +00:00
Edward Tomasz Napierala	42ed64e39b	Speed up lookups in autofs(5) by using red-black trees instead of linear searches. Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D5627	2016-03-24 13:34:39 +00:00
Edward Tomasz Napierala	f8eecb9709	Pacify Coverity in a better way, to avoid write-only variable when building without INVARIANTS. MFC after: 1 month Sponsored by: The FreeBSD Foundation	2016-03-16 14:00:45 +00:00
Edward Tomasz Napierala	ee4256cf01	Pacify Coverity. MFC after: 1 month Sponsored by: The FreeBSD Foundation	2016-03-15 20:42:36 +00:00
Edward Tomasz Napierala	49d8ebfe0e	Remove name length limitation from autofs(5). The linear search with strlens is somewhat suboptimal, but it's a temporary measure that will be replaced with red-black trees later on. PR: 204417 Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D5266	2016-03-13 14:17:23 +00:00
Edward Tomasz Napierala	7571d31339	Use S_BLKSIZE instead of magic constant. MFC after: 1 month Sponsored by: The FreeBSD Foundation	2016-03-12 09:33:26 +00:00
Edward Tomasz Napierala	f69db55151	Remove cn_consume from 'struct componentname'. It was never set to anything other than 0. Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D5611	2016-03-12 08:50:38 +00:00
Edward Tomasz Napierala	213ed83855	Fix autofs triggering problem. Assume you have an NFS server, 192.168.1.1, with share "share". This commit fixes a problem where "mkdir /net/192.168.1.1/share/meh" would return spurious error instead of creating the directory if the target filesystem wasn't mounted yet; subsequent attempts would work correctly. The failure scenario is kind of complicated to explain, but it all boils down to calling VOP_MKDIR() for the target filesystem (NFS) with wrong dvp - the autofs vnode instead of the filesystem root mounted over it. Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D5442	2016-03-12 07:54:42 +00:00
Konstantin Belousov	ffc161df9f	Do not perform unneccessary shared recursion on the allproc_lock in pfs_visible(). The recursion does not cause deadlock because the sx implementation does not prefer exclusive waiters over the shared, but this is an implementation detail. Reported by: pho, Matthew Bryan <matthew.bryan@isilon.com> Reviewed by: jhb Tested by: pho Approved by: des (pseudofs maintainer) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-03-11 11:51:38 +00:00
Konstantin Belousov	f36aa2b792	Pass MNTK_NO_IOPF and MNTK_UNMAPPED_BUFS flags from the lower filesystem to the nullfs mount. MNTK_NO_IOPF must be present on the nullfs struct mount so that struct file fo_read and fo_write fops operate in the mode requested by the lower mount. MNTK_UNMAPPED_BUFS allows VOP_GETPAGES() to use unmapped buffers. It does not matter for VOP_GETPAGES() calls from vm_fault() since handle of the vm_object always points to the lower vnode. But it may be useful for other situations where VOP_GETPAGES() is used. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-03-04 17:24:28 +00:00
Pedro F. Giffuni	308c3c240f	Ext2: cleanup setting of ctime/mtime/birthtime. This adopts the same change as r291936 for UFS. Directly clear IN_ACCESS or IN_UPDATE when user supplied the time, and copy the value into the inode. This keeps the behaviour cleaner and is consistent with UFS. Reviewed by: bde MFC after: 1 month (only 10)	2016-02-19 15:53:08 +00:00
Konstantin Belousov	830cd4b810	After nullfs rmdir operation, reclaim the directory vnode which was unlinked. Otherwise the vnode stays cached, causing leak. This is similar to r292961 for regular files. Reported and tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-02-17 19:43:03 +00:00
Pedro F. Giffuni	00e24e4173	ext2fs: Remove panics for rename() race conditions. Sync with r84642 from UFS: The panics are inappropriate because the IN_RENAME flag only fixes a few of the huge number of race conditions that can result in the source path becoming invalid even prior to the VOP_RENAME() call. Found accidentally while checking an issue from PVS Static Analysis. MFC after: 3 days	2016-02-14 19:52:50 +00:00
Pedro F. Giffuni	e296c1df6f	cd9660: More "check for NULL" cleaunps. Cleanup some checks for NULL. Most of these were always unnecessary and starting with r294954 brelse() doesn't need any NULL checks at all. For now keep the checks somewhat consistent with NetBSD in case we want to merge the cleanups to older versions.	2016-02-12 22:46:14 +00:00
Mark Johnston	785eb42adf	Clear the cookie pointer on error in tmpfs_readdir(). It is otherwise left dangling, and callers that request cookies always free the cookie buffer, even when VOP_READDIR(9) returns an error. This results in a double free if tmpfs_readdir() returns an error to the NFS server or the Linux getdents(2) emulation code. Reported by: pho MFC after: 1 week Security: double free of malloc(9)-backed memory Sponsored by: EMC / Isilon Storage Division	2016-02-12 20:43:53 +00:00
Pedro F. Giffuni	a633908d21	Ext4: Use boolean type instead of '0' and '1' There are precedents of uses of bool in the kernel and it is incorrect style to use integers as replacement for a boolean type.	2016-02-11 15:27:14 +00:00
Pedro F. Giffuni	78f6ea5440	Ext4: fix handling of files with sparse blocks before extent's index. This is ongoing work from Damjan Jovanovic to improve ext4 read support with sparse files: Keep track of the first and last block in each extent as it descends down the extent tree, thus being able to work out that some blocks are sparse earlier. This solves an issue on r293680. In ext4_bmapext() start supporting the runb parameter, which appears to be the number of adjacent blocks prior to the block being converted in the same way that runp is the number of blocks after, speding up random access to mmaped files. PR: 206652	2016-02-11 00:34:11 +00:00
Pedro F. Giffuni	9da20ec3a0	Revert r295359: CID 1018688 is a false positive. The initialization is done by calling vn_start_write(... &mp, flags). mp is only an output parameter unless (flags & V_MNTREF), and fdesc doesn't put V_MNTREF in flags. Pointed out by: bde	2016-02-07 15:40:01 +00:00
Pedro F. Giffuni	db7e4ae81a	msdosfs_rename: yet another unused value. As with r295355, it seems to be left over from a cleanup in r33548. The code is not in NetBSD either. Thanks to bde for checking out the history.	2016-02-07 15:36:16 +00:00
Pedro F. Giffuni	062b0cc0e6	cd9660: Drop an unnecessary check for NULL. This was unnecessary and also confused Coverity. Confirmed on: NetBSD CID: 978558	2016-02-07 03:48:40 +00:00
Pedro F. Giffuni	0ae08af46a	fdesc_setattr: unitialized pointer read CID: 1018688	2016-02-07 01:09:38 +00:00
Pedro F. Giffuni	2799a46fdf	msdosfs_rename: Unused value Assigned value to pmp, is immediatedly overwritten before it can be used. CID: 1304892	2016-02-06 21:54:02 +00:00
Pedro F. Giffuni	817dd2573e	Revert r294695: ext2fs: passthrough any extra timestamps to the dinode struct. While it passed the classic testing, the change appears to have caused some regression and still requires some more precautions. PR: 206820 MFC after: 3 days	2016-02-03 14:31:23 +00:00
Pedro F. Giffuni	afbad87898	ext2fs: passthrough any extra timestamps to the dinode struct. In general we don't trust any of the extended timestamps unless the EXT2F_ROCOMPAT_EXTRA_ISIZE feature is set. However, in the case where we freshly allocated a new inode the information is valid and it is better to pass it along instead of leaving the value undefined. This should have no practical effect but should reduce the amount of garbage if EXT2F_ROCOMPAT_EXTRA_ISIZE is set, like in cases where the filesystem is converted from ext3 to ext4. MFC after: 4 days	2016-01-24 23:24:47 +00:00
Pedro F. Giffuni	386b134364	ext2: rename some directory index constants. Missed from r294653. Pointyhat: me	2016-01-24 04:30:30 +00:00
Pedro F. Giffuni	c22ff471b4	Fix comment.	2016-01-24 02:44:00 +00:00
Pedro F. Giffuni	9b58c8019f	Rename some directory index constants. Directory index was introduced in ext3. We don't always use the prefix to denote the ext2 variant they belong to but when we do we should try to be accurate.	2016-01-24 02:41:49 +00:00
Pedro F. Giffuni	e08ad8f068	ext2: Initialize i_flag after allocation. We use i_flag to carry some flags like IN_E4INDEX which newer ext2fs variants uses internally. fsck.ext3 rightfully complains after our implementation tags non-directory inodes with INDEX_FL. Initializing i_flag during allocation removes the noise factor and quiets down fsck. Patch from: Damjan Jovanovic PR: 206530	2016-01-24 02:25:41 +00:00
Konstantin Belousov	1a2dd035fb	When devfs dirent is freed, a vnode might still keep a pointer to it, apparently. Interlock and clear the pointer to avoid free memory dereference. Submitted by: bde (previous version) MFC after: 3 weeks	2016-01-22 20:30:51 +00:00
Pedro F. Giffuni	9824e4adbe	ext2fs: Bring back the htree dir_index implementation. The htree dir_index is perhaps one of the most characteristic features of the linux ext3 implementation. It was removed in r281670, due to repeated bug reports. Damjan Jovanic detected and fixed three bugs and did some stress testing by building Apache OpenOffice on top of it so it is now in good shape to bring back. Differential Revision: https://reviews.freebsd.org/D5007 Submitted by: Damjan Jovanovic Reviewed by: pfg Tested by: pho Relnotes: Yes MFC after: 2 months (only 10.x)	2016-01-21 14:50:28 +00:00
Konstantin Belousov	aeace3c33c	Assert that the linkage between struct cdev_privdata and and struct file is consistent. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-01-17 08:34:35 +00:00
Ravi Pokala	bc089e5d7d	[PR 206224] bv_cnt is sometimes examined without holding the bufobj lock Add locking around access to bv_cnt which is currently being done unlocked PR: 206224 Reviewed by: imp Approved by: jhb MFC after: 1 week Sponsored by: Panasas, Inc. Differential Revision: https://reviews.freebsd.org/D4931	2016-01-17 01:04:20 +00:00
Bjoern A. Zeeb	8676704962	Unbreak NOIP builds after r294084.	2016-01-15 16:45:36 +00:00
Alexander V. Chernikov	d3bf8f6486	Make nfscl_getmyip() use new routing KPI. * Use standard IPv6 SAS instead of rt->rt_ifa address. * Make address lookup work for IPv6 LLA. * Save address into buffer provided by caller instead of using static vars. Discussed with: rmacklem	2016-01-15 09:05:14 +00:00
Konstantin Belousov	a53b7c692d	Make devfs_fpdrop() static. It was not a public KPI, and it has no reason to remain exported for some time. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-01-13 14:03:06 +00:00
Pedro F. Giffuni	daf884fa9f	ext4: mount panic from freeing invalid pointers Initialize the struct with those fields to zeroes on allocation, preventing the panic. Patch by: Damjan Jovanovic. PR: 206056 MFC after: 3 days	2016-01-11 19:25:43 +00:00
Pedro F. Giffuni	e813d9d7fa	ext4: add support for reading sparse files Add support for sparse files in ext4. Also implement read-ahead, which greatly increases the performance when transferring files from ext4. Both features implemented by Damjan Jovanovic. PR: 205816 MFC after: 1 week	2016-01-11 19:14:55 +00:00
Andrey V. Elsukov	c829016e85	Change the type of newsize argument in the smbfs_smb_setfsize() function from int to int64. MSDN says that SMB_SET_FILE_END_OF_FILE_INFO uses signed 64-bit integer to specify offset, but since smbfs_smb_setfsize() has used plain int, a value was truncated in case when offset was larger than 2G. https://msdn.microsoft.com/en-us/library/ff469975.aspx In particular, now `truncate -s 10G` will work correctly on the mounted SMB share. Reported and tested by: Eugene Grosbein <eugen at grosbein dot net> MFC after: 1 week	2016-01-11 18:11:06 +00:00
Pedro F. Giffuni	7135ca50c1	ext2fs: reading mmaped file in Ext4 causes panic Always call brelse(path.ep_bp), fixing reading EXT4 files using mmap(). Patch by Damjan Jovanovic. PR: 205938 MFC after: 1 week	2016-01-07 21:43:43 +00:00
Konstantin Belousov	fb57d63e47	Hide transient EBADF errors caused by the parallel revoke(2) or forced unmount of devfs mounts, by restarting the failed syscall. When restarted, failing syscalls eventually either stop finding the node and returning ENOENT, or the vnode op vectors finally transition to the deadfs vop. The later return EIO or other error, more appropriate for the operation. Submitted by: bde Tested by: pho MFC after: 3 weeks	2016-01-02 20:29:28 +00:00
Konstantin Belousov	d52aff3c7a	Minor style cleanup. Submitted by: bde MFC after: 1 week	2016-01-01 15:48:48 +00:00
Konstantin Belousov	6f73b583d9	Force nullfs vnode reclaim after unlinking, to potentially unlink lower vnode. Otherwise, reference to the lower vnode from the upper one prevents final unlink. PR: 178238 Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-12-30 19:49:22 +00:00
Pedro F. Giffuni	26069aec57	ext2: recognize ext4 INCOMPAT_RECOVER flag This is a flag specific for journalling in ext4. Add it to the list of ext4 features we ignore for read-only purposes. PR: 205668 MFC after: 1 week	2015-12-29 15:51:52 +00:00
Konstantin Belousov	cccac8a1ef	Make it possible for the cdevsw d_close() driver method to detect last close and close due to revoke(2)-like operation. A new FLASTCLOSE flag indicates that this is last close. FREVOKE is set for revokes, and FNONBLOCK is also set, same as is already done for VOP_CLOSE() call from vgonel(). The flags reuse user open(2) flags which are never stored in f_flag, to not consume bit space in the ABI visible way. Assert this with the static check. Requested and reviewed by: bde Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-12-22 20:37:34 +00:00
Konstantin Belousov	b63d070ad1	Keep devfs mount locked for the whole duration of the devfs_setattr(), and ensure that our dirent is instantiated. Reported and tested by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-12-22 20:22:17 +00:00
Hans Petter Selasky	beebd9aac8	Make CUSE usable with platforms where the size of "unsigned long" is different from the size of a pointer.	2015-12-22 09:55:44 +00:00
Hans Petter Selasky	9fb69c9d9d	Make CUSE usable with platforms where the size of "unsigned long" is different from the size of a pointer.	2015-12-22 09:41:33 +00:00
Hans Petter Selasky	ac60e80129	Guard against the same process being both CUSE server and client at the same time. This can easily lead to a deadlock when destroying the character devices nodes.	2015-12-22 09:26:24 +00:00
Gleb Smirnoff	f17f88d3e0	Fix breakage caused by r292373 in ZFS/FUSE/NFS/SMBFS. With the new VOP_GETPAGES() KPI the "count" argument counts pages already, and doesn't need to be translated from bytes to pages. While here make it consistent that rbehind and rahead are updated only if we doesn't return error. Pointy hat to: glebius	2015-12-16 23:48:50 +00:00
Gleb Smirnoff	b0cd20172d	A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES(). o With new KPI consumers can request contiguous ranges of pages, and unlike before, all pages will be kept busied on return, like it was done before with the 'reqpage' only. Now the reqpage goes away. With new interface it is easier to implement code protected from race conditions. Such arrayed requests for now should be preceeded by a call to vm_pager_haspage() to make sure that request is possible. This could be improved later, making vm_pager_haspage() obsolete. Strenghtening the promises on the business of the array of pages allows us to remove such hacks as swp_pager_free_nrpage() and vm_pager_free_nonreq(). o New KPI accepts two integer pointers that may optionally point at values for read ahead and read behind, that a pager may do, if it can. These pages are completely owned by pager, and not controlled by the caller. This shifts the UFS-specific readahead logic from vm_fault.c, which should be file system agnostic, into vnode_pager.c. It also removes one VOP_BMAP() request per hard fault. Discussed with: kib, alc, jeff, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-12-16 21:30:45 +00:00
John Baldwin	8d7e0f5889	The cdevpriv_dtr_t typedef was not able to be used in a function prototype like the various d_*_t typedefs since it declared a function pointer rather than a function. Add a new d_priv_dtor_t typedef that declares the function and can be used as a function prototype. The previous typedef wasn't useful outside of the cdevpriv implementation, so retire it. The name d_priv_dtor_t was chosen to be more consistent with cdev methods since it is commonly used in place of d_close_t even though it is not a direct pointer in struct cdevsw. Reviewed by: kib, imp MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D4340	2015-12-02 18:27:30 +00:00
Rick Macklem	65171ebbc8	Fix the memory leak that occurs when the nfscommon.ko module is unloaded. This leak was introduced by r291527. Since the nfscommon.ko module is rarely unloaded, this leak would not have been much of an issue. MFC after: 2 weeks	2015-12-02 02:47:13 +00:00
Rick Macklem	10b2e06e3e	Delete the TUNABLE_INT() line. It was in r291527 so that it could be MFC'd to stable/10 and still work.	2015-11-30 23:37:09 +00:00
Rick Macklem	84be7e0952	Add kernel support to the NFS server for the "-manage-gids" option that will be added to the nfsuserd daemon in a future commit. It modifies the cache used by NFSv4 for name<-->id translation (both username/uid and group/gid) to support this. When "-manage-gids" is set, the server looks up each uid for the RPC and uses the list of groups cached in the server instead of the list of groups provided in the RPC request. The cached group list is acquired for the cache by the nfsuserd daemon via getgrouplist(3). This avoids the 16 groups limit for the list in the RPC request. Since the cache is now used for every RPC when "-manage-gids" is enabled, the code also modifies the cache to use a separate mutex for each hash list instead of a single global mutex. Suggested by: jpaetzel Tested by: jpaetzel MFC after: 2 weeks	2015-11-30 21:54:27 +00:00
Kirk McKusick	43a993bb7d	For performance reasons, it is useful to have a single string used as the name of a filesystem when setting it as the first parameter to the getnewvnode() function. Most filesystems call getnewvnode from just one place so can use a literal string as the first parameter. However, NFS calls getnewvnode from two places, so we create a global constant string that can be used by the two instances. This change also collapses two instances of getnewvnode() in the UFS filesystem to a single call. Reviewed by: kib Tested by: Peter Holm	2015-11-29 21:01:02 +00:00
Rick Macklem	a0962bf8bc	When the nfsd threads are terminated, the NFSv4 server state (opens, locks, etc) is retained, which I believe is correct behaviour. However, for NFSv4.1, the server also retained a reference to the xprt (RPC transport socket structure) for the backchannel. This caused svcpool_destroy() to not call SVC_DESTROY() for the xprt and allowed a socket upcall to occur after the mutexes in the svcpool were destroyed, causing a crash. This patch fixes the code so that the backchannel xprt structure is dereferenced just before svcpool_destroy() is called, so the code does do an SVC_DESTROY() on the xprt, which shuts down the socket upcall. Tested by: g_amanakis@yahoo.com PR: 204340 MFC after: 2 weeks	2015-11-21 23:55:46 +00:00
Rick Macklem	b179878dde	Revert r283330 since it broke directory caching in the client. At this time I cannot see a way to fix directory caching when it has partial blocks in the buffer cache, due to the fact that the syscall's uio_offset won't stay the same as the lblkno * NFS_DIRBLKSIZ offset. Reported by: bde MFC after: 2 weeks	2015-11-21 00:15:41 +00:00
Rick Macklem	f315383406	mnt_stat.f_iosize (which is used to set bo_bsize) must be set to the largest size of buffer cache block or the mapping of the buffer is bogus. When a mount with rsize=4096,wsize=4096 was done, f_iosize would be set to 4096. This resulted in corrupted directory data, since the buffer cache block size for directories is NFS_DIRBLKSIZ (8192). This patch fixes the code so that it always sets f_iosize to at least NFS_DIRBLKSIZ. Tested by: krichy@cflinux.hu PR: 177971 MFC after: 2 weeks	2015-11-17 01:44:26 +00:00
Mark Johnston	d28713378a	- Consistently use PROC_ASSERT_HELD() to verify that a process' hold count is non-zero. - Include the process address in the PROC_ASSERT_HELD() and PROC_ASSERT_NOT_HELD() assertion messages so that the corresponding process can be found easily when debugging. MFC after: 1 week	2015-11-08 01:38:56 +00:00
Konstantin Belousov	1d48f121d8	Ensure that when a blockable open of fifo returns success, a valid file descriptor opened for complimentary access exists as well. The implementation of the guarantee is done by counting the generations of readers and writers opens. We return success and not EINTR or ERESTART error, when the sleep for complimentary opening is interrupted, but the generation was changed during the sleep. Longer explanation: assume there are two threads, A doing open("fifo", O_RDONLY) and B doing open("fifo", O_WRONLY), and no other threads either trying to open the fifo, nor there are any file descriptors referencing the fifo. Before the change, it was possible e.g. for for thread A to return a valid file descriptor, while thread B returned EINTR if a signal to B was delivered simultaneously with the wakeup from A. After the change, in this situation both A::open() and B::open() succeed and the signal is made "as if" it was noticed slightly later. Note that the signal actual delivery is not changed, it is done by ast on syscall return path, so signal handler is still executed before first instruction after syscall. See PR for the code demonstrating the issue. PR: 203162 Reported by: Victor Stinner victor.stinner@gmail.com Reviewed by: jilles Tested by: bapt, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-09-20 21:18:33 +00:00
Edward Tomasz Napierala	1d4c0424c8	Fix an NFS server bug that manifested in "ls -al" displaying a plus sign on every directory exported via NFSv4 with NFSv4 ACLs enabled. Reviewed by: rmacklem@ MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D3502	2015-08-28 14:26:11 +00:00
Edward Tomasz Napierala	643e5ec210	Make it possible to forcibly unmount devfs. MFC after: 1 month Sponsored by: The FreeBSD Foundation	2015-08-24 14:04:44 +00:00
Edward Tomasz Napierala	6e572e084b	After r286237 it should be fine to call vgone(9) on a busy GEOM vnode; remove KASSERT that would prevent forced devfs unmount from working. MFC after: 1 month Sponsored by: The FreeBSD Foundation	2015-08-23 14:53:54 +00:00
Rick Macklem	29dc40b6be	For the case where an NFSv4.1 ExchangeID operation has the client identifier that already has a confirmed ClientID, the nfsrv_setclient() function would not fill in the clientidp being returned. As such, the value of ClientID returned would be whatever garbage was on the stack. An NFSv4.1 client would not normally do this, but it appears that it can happen for certain Linux clients. When it happens, the client persistently retries the ExchangeID and Create_session after Create_session fails when it uses the bogus clientid. With this patch, the correct clientid is replied. This problem was identified in a packet trace supplied by Ahmed Kamal via email. Reported by: email.ahmedkamal@googlemail.com MFC after: 2 weeks	2015-08-14 22:02:14 +00:00
John Baldwin	fada4adf95	The changes that introduced fo_mmap() treated all character device mappings as if MAP_SHARED was always present since in general MAP_PRIVATE is not permitted for character devices. However, there is one exception in that MAP_PRIVATE mappings are permitted for /dev/zero. Only require a writable file descriptor (FWRITE) for shared, writable mappings of character devices. vm_mmap_cdev() will reject any private mappings for other devices. Reviewed by: kib Reported by: sbruno (broke qemu cross-builds), peter Differential Revision: https://reviews.freebsd.org/D3316	2015-08-06 16:50:37 +00:00
Conrad Meyer	b5af3f30a7	nfsclient: Protest loudly when GETATTR responses are invalid BROKEN NFS SERVER OR MIDDLEWARE: Certain WAN "accelerators" attempt to cache NFS GETATTR traffic, but actually corrupt it (e.g., responding to requests with attributes for totally different files). Warn very verbosely when this is detected. Linux' NFS client has a similar warning. Adds a sysctl/tunable (vfs.nfs.fileid_maxwarnings) to configure the quantity of warnings; default to 10. (Zero disables; -1 is unlimited.) Adds a failpoint to aid in validating the warning / behavior with a non-broken server. Use something like: sysctl 'debug.fail_point.nfscl_force_fileid_warning=10%return(1)' Reviewed by: rmacklem Approved by: markj (mentor) Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3304	2015-08-05 22:27:30 +00:00
Rick Macklem	25f37276e5	This patch fixes a problem where, if the NFSv4 server has a previous unconfirmed clientid structure for the same client on the last hash list, this old entry would not be removed/deleted. I do not think this bug would have caused serious problems, since the new entry would have been before the old one on the list. This old entry would have eventually been scavenged/removed. Detected while reading the code looking for another bug. MFC after: 3 days	2015-07-29 23:06:30 +00:00
Jeff Roberson	7d07bfd8a3	- Remove some dead code copied from ffs.	2015-07-29 03:06:08 +00:00
Christian Brueffer	382353e2e8	In tmpfs_chtimes(), remove checks on the nanosecond level when determining whether a node changed. Other filesystems, e.g., UFS, only check on seconds, when determining whether something changed. This also corrects the birthtime case, where we checked tv_nsec twice, instead of tv_sec and tv_nsec (PR). PR: 201284 Submitted by: David Binderman Patch suggested by: kib Reviewed by: kib MFC after: 2 weeks Committed from: Essen FreeBSD Hackathon	2015-07-26 08:33:46 +00:00
Konstantin Belousov	b4490c6e93	The si_status field of the siginfo_t, provided by the waitid(2) and SIGCHLD signal, should keep full 32 bits of the status passed to the _exit(2). Split the combined p_xstat of the struct proc into the separate exit status p_xexit for normal process exit, and signalled termination information p_xsig. Kernel-visible macro KW_EXITCODE() reconstructs old p_xstat from p_xexit and p_xsig. p_xexit contains complete status and copied out into si_status. Requested by: Joerg Schilling Reviewed by: jilles (previous version), pho Tested by: pho Sponsored by: The FreeBSD Foundation	2015-07-18 09:02:50 +00:00
Mark Johnston	5f34e93c58	Check suspendability on the mountpoint returned by VOP_GETWRITEMOUNT. This obviates the need for a MNTK_SUSPENDABLE flag, since passthrough filesystems like nullfs and unionfs no longer need to inherit this information from their lower layer(s). This change also restores the pre-r273336 behaviour of using the presence of a susp_clean VFS method to request suspension support. Reviewed by: kib, mjg Differential Revision: https://reviews.freebsd.org/D2937	2015-07-05 22:37:33 +00:00
Mateusz Guzik	f131759f54	fd: make 'rights' a manadatory argument to fget* functions	2015-07-05 19:05:16 +00:00
Rick Macklem	2a3508eb48	If a "principal" argument isn't provided for a Kerberized NFS mount, the kernel would generate a bogus one with a ":/<path>" suffix. This would only occur for the case where there was no explicit "principal" argument and the getaddrinfo() call in mount_nfs.c failed to a return a cannonical name for the server. This patch fixes this unusual case. PR: 201073 Submitted by: masato@itc.naist.jp MFC after: 2 weeks	2015-07-03 22:11:07 +00:00
Rick Macklem	d189dcb6e2	Alex Burlyga reported a POLA violation for the new NFS client as compared to the old NFS client via email to the freebsd-fs@ mailing list. For the new client, when multiple clients attempted to create a symbolic link concurrently, more that one client would report success instead of EEXIST. This was caused by code in the new client that mapped EEXIST to OK assuming it was caused by a retried RPC request. Since the old client did not do this, the patch defaults to the old behaviour and permits the new behaviour to be enabled via a sysctl. Reported by: alex.burlyga.ietf@gmail.com Tested by: alex.burlyga.ietf@gmail.com MFC after: 2 weeks	2015-07-03 01:15:21 +00:00
Mark Murray	d1b06863fb	Huge cleanup of random(4) code. * GENERAL - Update copyright. - Make kernel options for RANDOM_YARROW and RANDOM_DUMMY. Set neither to ON, which means we want Fortuna - If there is no 'device random' in the kernel, there will be NO random(4) device in the kernel, and the KERN_ARND sysctl will return nothing. With RANDOM_DUMMY there will be a random(4) that always blocks. - Repair kern.arandom (KERN_ARND sysctl). The old version went through arc4random(9) and was a bit weird. - Adjust arc4random stirring a bit - the existing code looks a little suspect. - Fix the nasty pre- and post-read overloading by providing explictit functions to do these tasks. - Redo read_random(9) so as to duplicate random(4)'s read internals. This makes it a first-class citizen rather than a hack. - Move stuff out of locked regions when it does not need to be there. - Trim RANDOM_DEBUG printfs. Some are excess to requirement, some behind boot verbose. - Use SYSINIT to sequence the startup. - Fix init/deinit sysctl stuff. - Make relevant sysctls also tunables. - Add different harvesting "styles" to allow for different requirements (direct, queue, fast). - Add harvesting of FFS atime events. This needs to be checked for weighing down the FS code. - Add harvesting of slab allocator events. This needs to be checked for weighing down the allocator code. - Fix the random(9) manpage. - Loadable modules are not present for now. These will be re-engineered when the dust settles. - Use macros for locks. - Fix comments. * src/share/man/... - Update the man pages. * src/etc/... - The startup/shutdown work is done in D2924. * src/UPDATING - Add UPDATING announcement. * src/sys/dev/random/build.sh - Add copyright. - Add libz for unit tests. * src/sys/dev/random/dummy.c - Remove; no longer needed. Functionality incorporated into randomdev.. live_entropy_sources.c live_entropy_sources.h - Remove; content moved. - move content to randomdev.[ch] and optimise. * src/sys/dev/random/random_adaptors.c src/sys/dev/random/random_adaptors.h - Remove; plugability is no longer used. Compile-time algorithm selection is the way to go. * src/sys/dev/random/random_harvestq.c src/sys/dev/random/random_harvestq.h - Add early (re)boot-time randomness caching. * src/sys/dev/random/randomdev_soft.c src/sys/dev/random/randomdev_soft.h - Remove; no longer needed. * src/sys/dev/random/uint128.h - Provide a fake uint128_t; if a real one ever arrived, we can use that instead. All that is needed here is N=0, N++, N==0, and some localised trickery is used to manufacture a 128-bit 0ULLL. * src/sys/dev/random/unit_test.c src/sys/dev/random/unit_test.h - Improve unit tests; previously the testing human needed clairvoyance; now the test will do a basic check of compressibility. Clairvoyant talent is still a good idea. - This is still a long way off a proper unit test. * src/sys/dev/random/fortuna.c src/sys/dev/random/fortuna.h - Improve messy union to just uint128_t. - Remove unneeded 'static struct fortuna_start_cache'. - Tighten up up arithmetic. - Provide a method to allow eternal junk to be introduced; harden it against blatant by compress/hashing. - Assert that locks are held correctly. - Fix the nasty pre- and post-read overloading by providing explictit functions to do these tasks. - Turn into self-sufficient module (no longer requires randomdev_soft.[ch]) * src/sys/dev/random/yarrow.c src/sys/dev/random/yarrow.h - Improve messy union to just uint128_t. - Remove unneeded 'staic struct start_cache'. - Tighten up up arithmetic. - Provide a method to allow eternal junk to be introduced; harden it against blatant by compress/hashing. - Assert that locks are held correctly. - Fix the nasty pre- and post-read overloading by providing explictit functions to do these tasks. - Turn into self-sufficient module (no longer requires randomdev_soft.[ch]) - Fix some magic numbers elsewhere used as FAST and SLOW. Differential Revision: https://reviews.freebsd.org/D2025 Reviewed by: vsevolod,delphij,rwatson,trasz,jmg Approved by: so (delphij)	2015-06-30 17:00:45 +00:00
Konstantin Belousov	8551285097	Restore the td_cookie value for the tmpfs directory entry which was a dup entry, upon detach from the parent directory. If the node is renamed, the entry is re-attached at the different directory, and invalud cookie value triggers assert (or corrupts directory rb tree, it seems). Reported by: clusteradm (gjb, antoine) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-06-19 07:25:15 +00:00
Gleb Smirnoff	093ebe1d28	o Un-inline vm_pager_get_pages(), vm_pager_get_pages_async(). o Provide an extensive set of assertions for input array of pages. o Remove now duplicate assertions from different pagers. Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-06-17 22:44:27 +00:00
Mateusz Guzik	4da8456f0a	Replace struct filedesc argument in getvnode with struct thread This is is a step towards removal of spurious arguments.	2015-06-16 13:09:18 +00:00
Gleb Smirnoff	093c7f396d	Make KPI of vm_pager_get_pages() more strict: if a pager changes a page in the requested array, then it is responsible for disposition of previous page and is responsible for updating the entry in the requested array. Now consumers of KPI do not need to re-lookup the pages after call to vm_pager_get_pages(). Reviewed by: kib Sponsored by: Netflix Sponsored by: Nginx, Inc.	2015-06-12 11:32:20 +00:00
Mateusz Guzik	f6f6d24062	Implement lockless resource limits. Use the same scheme implemented to manage credentials. Code needing to look at process's credentials (as opposed to thred's) is provided with *_proc variants of relevant functions. Places which possibly had to take the proc lock anyway still use the proc pointer to access limits.	2015-06-10 10:48:12 +00:00
Mark Johnston	068a3d319a	unionfs: fix suspendability check bugs - MNTK_SUSPENDABLE is set in mnt_kern_flag, not mnt_flag. - The lower layer of a unionfs mount is read-only, so the mount should be suspendable iff the upper layer is suspendable. - Remove a couple of superfluous comments. Differential Revision: https://reviews.freebsd.org/D2714 Reviewed by: kib, mjg	2015-06-06 16:36:13 +00:00
John Baldwin	7077c42623	Add a new file operations hook for mmap operations. File type-specific logic is now placed in the mmap hook implementation rather than requiring it to be placed in sys/vm/vm_mmap.c. This hook allows new file types to support mmap() as well as potentially allowing mmap() for existing file types that do not currently support any mapping. The vm_mmap() function is now split up into two functions. A new vm_mmap_object() function handles the "back half" of vm_mmap() and accepts a referenced VM object to map rather than a (handle, handle_type) tuple. vm_mmap() is now reduced to converting a (handle, handle_type) tuple to a a VM object and then calling vm_mmap_object() to handle the actual mapping. The vm_mmap() function remains for use by other parts of the kernel (e.g. device drivers and exec) but now only supports mapping vnodes, character devices, and anonymous memory. The mmap() system call invokes vm_mmap_object() directly with a NULL object for anonymous mappings. For mappings using a file descriptor, the descriptors fo_mmap() hook is invoked instead. The fo_mmap() hook is responsible for performing type-specific checks and adjustments to arguments as well as possibly modifying mapping parameters such as flags or the object offset. The fo_mmap() hook routines then call vm_mmap_object() to handle the actual mapping. The fo_mmap() hook is optional. If it is not set, then fo_mmap() will fail with ENODEV. A fo_mmap() hook is implemented for regular files, character devices, and shared memory objects (created via shm_open()). While here, consistently use the VM_PROT_* constants for the vm_prot_t type for the 'prot' variable passed to vm_mmap() and vm_mmap_object() as well as the vm_mmap_vnode() and vm_mmap_cdev() helper routines. Previously some places were using the mmap()-specific PROT_* constants instead. While this happens to work because PROT_xx == VM_PROT_xx, using VM_PROT_* is more correct. Differential Revision: https://reviews.freebsd.org/D2658 Reviewed by: alc (glanced over), kib MFC after: 1 month Sponsored by: Chelsio	2015-06-04 19:41:15 +00:00
Eric van Gyzen	63e4c6cdf9	Provide vnode in memory map info for files on tmpfs When providing memory map information to userland, populate the vnode pointer for tmpfs files. Set the memory mapping to appear as a vnode type, to match FreeBSD 9 behavior. This fixes the use of tmpfs files with the dtrace pid provider, procstat -v, procfs, linprocfs, pmc (pmcstat), and ptrace (PT_VM_ENTRY). Submitted by: Eric Badger <eric@badgerio.us> (initial revision) Obtained from: Dell Inc. PR: 198431 MFC after: 2 weeks Reviewed by: jhb Approved by: kib (mentor)	2015-06-02 18:37:04 +00:00
Xin LI	6e55e724a6	Clear p_stops upon PROCFS_CTL_DETACH, similar to r283889. Noticed by: jhb Reviewed by: sef Sponsored by: iXsystems, Inc. MFC after: 2 weeks	2015-06-01 18:49:31 +00:00
Rick Macklem	0c419e226c	Make the NFS server use shared vnode locks for a few cases that are allowed by the VFS/VOP interface instead of using exclusive locks. MFC after: 2 weeks	2015-05-29 20:22:53 +00:00
Pedro F. Giffuni	e54a659a26	Provide VOP_GETPAGES_ASYNC() for extfs. Merge the filesystem specific part from r274914 to ext2fs. I only did regular testing with the change but UFS and our ext2fs are similar enough that the code should just work with the new sendfile. Discussed with: glebius	2015-05-28 21:06:59 +00:00
Rick Macklem	1f54e596ad	Make the size of the hash tables used by the NFSv4 server tunable. No appreciable change in performance was observed after increasing the sizes of these tables and then testing with a single client. However, there was an email that indicated high CPU overheads for a heavily loaded NFSv4 and it is hoped that increasing the sizes of the hash tables via these tunables might help. The tables remain the same size by default. Differential Revision: https://reviews.freebsd.org/D2596 MFC after: 2 weeks	2015-05-27 22:00:05 +00:00
Konstantin Belousov	1bc93bb7b9	Currently, softupdate code detects overstepping on the workitems limits in the code which is deep in the call stack, and owns several critical system resources, like vnode locks. Attempt to wait while the per-mount softupdate thread cleans up the backlog may deadlock, because the thread might need to lock the same vnode which is owned by the waiting thread. Instead of synchronously waiting for the worker, perform the worker' tickle and pause until the backlog is cleaned, at the safe point during return from kernel to usermode. A new ast request to call softdep_ast_cleanup() is created, the SU code now only checks the size of queue and schedules ast. There is no ast delivery for the kernel threads, so they are exempted from the mechanism, except NFS daemon threads. NFS server loop explicitely checks for the request, and informs the schedule_cleanup() that it is capable of handling the requests by the process P2_AST_SU flag. This is needed because nfsd may be the sole cause of the SU workqueue overflow. But, to not cause nsfd to spawn additional threads just because we slow down existing workers, only tickle su threads, without waiting for the backlog cleanup. Reviewed by: jhb, mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-05-27 09:20:42 +00:00
Dmitry Chagin	63cc3320d9	Hide vfs.pfs.trace variable if it is not used.	2015-05-24 18:11:22 +00:00
Rick Macklem	262a84286d	The NFS client generated directory block(s) with d_fileno == 0 so that it would not return less data than requested. Since returning less directory data than requested is not a problem for FreeBSD and even UFS no longer returns directory structures with d_fileno == 0, this patch stops the client from doing this. Although entries with d_fileno == 0 should not be a problem, the man pages no longer document that these entries should be ignored, so there was a concern that these entries might be an issue in the future. Suggested by: trasz Tested by: trasz MFC after: 2 weeks	2015-05-23 21:58:41 +00:00
Jung-uk Kim	fd90e2ed54	CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten years for head. However, it is continuously misused as the mpsafe argument for callout_init(9). Deprecate the flag and clean up callout_init() calls to make them more consistent. Differential Revision: https://reviews.freebsd.org/D2613 Reviewed by: jhb MFC after: 2 weeks	2015-05-22 17:05:21 +00:00
John Baldwin	312827253b	Always set p_oppid when attaching to an existing process via procfs tracing. This matches the behavior of ptrace(PT_ATTACH). Also, the procfs detach request assumes p_oppid is always set. Reviewed by: kib MFC after: 2 weeks	2015-05-22 11:03:51 +00:00
Rick Macklem	86b9457f5b	The NFS client wasn't handling getdirentries(2) requests for sizes that are not an exact multiple of DIRBLKSIZ correctly. Fortunately readdir(3) always uses an exact multiple of DIRBLKSIZ, so few applications were affected. This patch fixes this problem by reducing the size of the directory read to an exact multiple of DIRBLKSIZ. Tested by: trasz Reported by: trasz Reviewed by: trasz MFC after: 2 weeks	2015-05-21 23:14:18 +00:00
Alexander Motin	a87627b26b	Do not promote large async writes to sync. Present implementation of large sync writes is too strict and so can be quite slow. Instead of doing that, execute large async write in chunks, syncing each chunk separately. It would be good to fix large sync writes too, but I leave it to somebody with more skills in this area. Reviewed by: rmacklem MFC after: 1 week	2015-05-14 10:04:42 +00:00
Rick Macklem	2fbb9d563b	Fix the NFS server's handling of a bogus NFSv2 ROOT RPC. The ROOT RPC is deprecated in the NFSv2 RFC, RFC-1094 and should never be used by a client. Tested by: thmu@freenet.de MFC after: 1 week	2015-04-25 00:58:24 +00:00
Rick Macklem	7cfdc2a7bc	MAXBSIZE defines both the largest UFS block size and the largest size for a buffer in the buffer cache. This patch defines a new constant MAXBCACHEBUF, which is the largest size for a buffer in the buffer cache. Having a separate constant allows MAXBCACHEBUF to be set larger than MAXBSIZE on a per-architecture basis, so that NFS can do larger read/writes for these architectures. It modifies sys/param.h so that BKVASIZE can also be set on a per-architecture basis. A couple of cases where NFS used MAXBSIZE instead of NFS_MAXBSIZE is fixed as well. Differential Revision: https://reviews.freebsd.org/D2330 Reviewed by: mav, kib MFC after: 2 weeks	2015-04-25 00:52:01 +00:00
Pedro F. Giffuni	2f39c91019	Prevent a double free. This is similar to r281756 so set the ptr NULL after free as a safety belt against future changes. Obtained from: HardenedBSD (b2e77ced9ae213d358b44d98f552d9ae4636ecac) Submitted by: Oliver Pinter Revewed by: rmacklem	2015-04-20 16:40:13 +00:00
Pedro F. Giffuni	a3a4b110da	nfsrpc_createv4: fix double free. Reported by: Oliver Pinter, clang static checker Obtained from: HardenedBSD (commit 63cac77c42c0c3fc67da62f97d5ab651d52ae707) Reviewed by: rmacklem MFC after: 5 days	2015-04-19 23:55:59 +00:00
Alexander Motin	afdfc9a40d	Change wcommitsize default from one empirical value to another. The new value is more predictable with growing RAM size: hibufspace maxvnodes old new i386: 256MB 32980992 15800 2198732 2097152 2GB 94027776 107677 878764 4194304 amd64: 256MB 32980992 15800 2198732 2097152 1GB 114114560 68062 1678155 4194304 4GB 217055232 111807 1955452 4194304 16GB 1717846016 337308 5097465 16777216 64GB 1734918144 1164427 1490479 16777216 256GB 1734918144 4426453 391983 16777216 Reviewed by: rmacklem MFC after: 2 weeks	2015-04-19 11:34:41 +00:00
Edward Tomasz Napierala	50a220c699	Replace "new NFS" with just "NFS" in some sysctl description strings. Sponsored by: The FreeBSD Foundation	2015-04-19 06:18:41 +00:00
Pedro F. Giffuni	f738ee4825	Drop experimental dir_index support. The htree directory index is a highly desirable feature for research purposes and was meant to improve performance in our ext2/3 driver. Unfortunately our implementation has two problems: - It never really delivered any performance improvement. - It appears to corrupt the filesystem in undetermined circumstances. Strictly speaking dir_index is not required for read/write support in ext2/3 and our limited ext4 support still works fine without it. Regain stability in the ext2 driver by removing it. We may need it back (fixed) if we want to support encrypted ext4 support but thanks to the wonders of version control we can always revert this change and bring it back. PR: 191895 PR: 198731 PR: 199309 MFC after: 5 days	2015-04-17 22:26:01 +00:00
Rick Macklem	66e80f77d2	mav@ has found that NFS servers exporting ZFS file systems can perform better when using a 128K read/write data size. This patch changes NFS_MAXDATA from 64K to 128K so that clients can use 128K for NFS mounts to allow this. The patch also renames NFS_MAXDATA to NFS_SRVMAXIO so that it is clear that it applies to the NFS server side only. It also avoids a name conflict with the NFS_MAXDATA defined in rpcsvc/nfs_prot.h, that is used for userland RPC. Tested by: mav Reviewed by: mav MFC after: 2 weeks	2015-04-16 22:35:15 +00:00
Rick Macklem	dda11d4ab9	File systems that do not use the buffer cache (such as ZFS) must use VOP_FSYNC() to perform the NFS server's Commit operation. This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which is set by file systems that use the buffer cache. If this flag is not set, the NFS server always does a VOP_FSYNC(). This should be ok for old file system modules that do not set MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although it might not be optimal for file systems that use the buffer cache. Reviewed by: kib MFC after: 2 weeks	2015-04-15 20:16:31 +00:00
Will Andrews	677c3c0c66	tmpfs_getattr(): Return more correct allocated byte counts. For VREG vnodes, return the resident page count (multiplied by PAGE_SIZE) for the tmpfs node's anonymous VM object that stores actual file contents. For all other vnodes, return the tmpfs_node's tn_size, which should not be rounded to a page. This change allows using stat(2) to identify a sparse file on tmpfs. Reviewed by: kib MFC after: 1 week	2015-04-10 19:04:39 +00:00
Konstantin Belousov	2359e2dcc3	Do not call msdosfs_sync() on the read-only msdosfs mounts. In fact, it should be a nop for ro. PR: 199152 Reviewed by: bde (PR version of the patch) Submitted by: longwitz@incore.de MFC after: 1 week	2015-04-05 21:10:38 +00:00
Konstantin Belousov	420d65d9e4	Assert that an msdosfs mount is not read-only when FAT modifications are requested. PR: 199152 Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-04-05 21:08:04 +00:00
Konstantin Belousov	bda2eb9ae8	Refine r280308. Do not completely disable timestamping of devfs nodes on reads or writes, the time marks are used to display idle time by w(1) [1]. Instead, use vfs.devfs.dotimes as the selector of default precision vs. using time_second. The later gives seconds precision, which is good enough for the purpose. Note that timestamp updates are unlocked and the updates itself, as well as the check in devfs_timestamp, are non-atomic. Noted by: truckman [1] Reviewed by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-04-01 08:25:40 +00:00
Konstantin Belousov	c66d17c7bc	msdosfs: mark unused compat-mount fields The magic number MSDOSFS_ARGSMAGIC, which used to distinguish "old" vs "new" msdosfs mount arguments, has not been used since 2005; it should just go away now. Likewise, the local-to-Unicode table that changed at the same time is unused. Leave the space reserved in the old style mount arguments, though, since we still support the old mount call (via the cmount entry point). Submitted by: Chris Torek <chris.torek@gmail.com> MFC after: 2 weeks	2015-03-22 09:09:26 +00:00
Xin LI	4f9343fc7c	Disable timestamping on devfs read/write operations by default. Currently we update timestamps unconditionally when doing read or write operations. This may slow things down on hardware where reading timestamps is expensive (e.g. HPET, because of the default vfs.timestamp_precision setting is nanosecond now) with limited benefit. A new sysctl variable, vfs.devfs.dotimes is added, which can be set to non-zero value when the old behavior is desirable. Differential Revision: https://reviews.freebsd.org/D2104 Reported by: Mike Tancsa <mike sentex net> Reviewed by: kib Relnotes: yes Sponsored by: iXsystems, Inc. MFC after: 2 weeks	2015-03-21 01:14:11 +00:00
Gleb Smirnoff	4d6481a4c9	o Enhance vm_pager_free_nonreq() function: - Allow to call the function with vm object lock held. - Allow to specify reqpage that doesn't match any page in the region, meaning freeing all pages. o Utilize the new function in couple more places in vnode pager. Reviewed by: alc, kib Sponsored by: Netflix Sponsored by: Nginx, Inc.	2015-03-17 19:19:19 +00:00
Jung-uk Kim	2d427c524d	Fix white spaces.	2015-03-02 19:14:58 +00:00
Edward Tomasz Napierala	ead063e0a2	Make fuse(4) respect FOPEN_DIRECT_IO. This is required for correct operation of GlusterFS. PR: 192701 Submitted by: harsha at harshavardhana.net Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation	2015-03-02 19:04:27 +00:00
Warner Losh	03afe9ba70	nandfs_meta_bread() calls bread() which can set bp to NULL in some error cases. Calling brelse() with a NULL pointer is not allowed, so only call brelse() when the bp is non-NULL. Reported by: Maxime Villard (reported as uninitialized variable)	2015-03-01 21:41:37 +00:00
Alexander Kabaev	0fd841369a	Do not leak 'copy' buffer if bmap_truncate_indirect fails. Reported by: Brainy Code Scanner, by Maxime Villard. MFC after: 2 weeks	2015-02-28 22:24:45 +00:00
Konstantin Belousov	4fc4286fbc	Some fixes for fdescfs lookup code. Do not ever return doomed vnode from lookup. This could happen, if not checked, since dvp is relocked in the 'looking up ourselves' case. In the other case, since dvp is relocked, mount point might go away while fdesc_allocvp() is called. Prevent the situation by doing vfs_busy() before unlocking dvp. Reuse the vn_vget_ino_gen() helper. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-02-28 19:57:22 +00:00
Konstantin Belousov	08189ed667	The VNASSERT in vflush() FORCECLOSE case is trying to panic early to prevent errors from yanking devices out from under filesystems. Only care about special vnodes on devfs, special nodes on other kinds of filesystems do not have special properties. Sponsored by: EMC / Isilon Storage Division Submitted by: Conrad Meyer MFC after: 1 week	2015-02-27 16:43:50 +00:00
Pedro F. Giffuni	e5c356b2a2	ext2fs: Plug small memory leak free() e2fs_contigdirs upon error. Undo zeroing of e2fs_gd as this was actually a false positive. X-MFC with: 278790	2015-02-15 14:25:00 +00:00
Pedro F. Giffuni	6be4cf2244	Reuse value of cursize instead of recalculating. Reported by: Clang static checker MFC after: 1 week	2015-02-15 01:34:00 +00:00
Pedro F. Giffuni	f3ee91ed2b	Initialize the allocation of variables related to the ext2 allocator. The e2fs_gd struct was not being initialized and garbage was being used for hinting the ext2 allocator variant. Use malloc to clear the values and also initialize e2fs_contigdirs during allocation to keep consistency. While here clean up small style issues. Reported by: Clang static analyser MFC after: 1 week	2015-02-15 01:12:15 +00:00
Edward Tomasz Napierala	29836e077a	Restore ABI compatibility, broken in r273127. Note that while this fixes ABI with 10.1, it breaks ABI for 11-CURRENT, so rebuild of automountd(8) is neccessary. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2015-02-10 16:17:16 +00:00
Konstantin Belousov	bf5fce2bee	Remove duplicated assignment. CID: 1267988 Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-02-03 12:09:48 +00:00
Konstantin Belousov	e0a60ae16a	Update directory times immediately after an entry is created or removed. Postponing it until tmpfs_getattr() is called causes discordant values reported for file times vs. directory times. Reported and tested by: madpilot Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-01-31 21:31:53 +00:00
Konstantin Belousov	f1a90a7bac	Remove single-use boolean. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-01-31 12:58:04 +00:00
Konstantin Belousov	311d39f2ee	POSIX states that write(2) "shall mark for update the last data modification and last file status change timestamps of the file". Currently, tmpfs only modifies ctime when file was extended. Since r277828 followed tmpfs_write(), mmaped writes also do not modify ctime. Fix this, by updating both ctime and mtime for writes to tmpfs files. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-01-31 12:27:18 +00:00
Dimitry Andric	7f4daa88f1	Fix a -Wcast-qual warning in smbfs_subr.c, by using __DECONST. No functional change. MFC after: 3 days	2015-01-30 22:02:32 +00:00
Dimitry Andric	38fc0aa484	Fix a -Wcast-qual warning in udf_vnops.c, by using __DECONST. No functional change. MFC after: 3 days	2015-01-30 22:01:45 +00:00
Dimitry Andric	03ce3d7219	Fix a bunch of -Wcast-qual warnings in cd9660_util.c, by using __DECONST. No functional change. MFC after: 3 days	2015-01-29 20:40:25 +00:00
Dimitry Andric	09bb4a314a	Fix a bunch of -Wcast-qual warnings in msdosfs_conv.c, by using __DECONST. No functional change. MFC after: 3 days	2015-01-29 20:30:13 +00:00
Jamie Gritton	464aad1407	Add allow.mount.fdescfs jail flag. PR: 192951 Submitted by: ruben@verweg.com MFC after: 3 days	2015-01-28 21:08:09 +00:00
Konstantin Belousov	f40cb1c645	Update mtime for tmpfs files modified through memory mapping. Similar to UFS, perform updates during syncer scans, which in particular means that tmpfs now performs scan on sync. Also, this means that a mtime update may be delayed up to 30 seconds after the write. The vm_object' OBJ_TMPFS_DIRTY flag for tmpfs swap object is similar to the OBJ_MIGHTBEDIRTY flag for the vnode object, it indicates that object could have been dirtied. Adapt fast page fault handler and vm_object_set_writeable_dirty() to handle OBJ_TMPFS_NODE same as OBJT_VNODE. Reported by: Ronald Klop <ronald-lists@klop.ws> Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-01-28 10:37:23 +00:00
Konstantin Belousov	3544b0f68f	tmpfs does not use UVM on FreeBSD. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2015-01-28 10:25:35 +00:00
Konstantin Belousov	3b50dff506	Stop enforcing additional reference on all cdevs, which was introduced in r277199. Acquire the neccessary reference in delist_dev_locked() and inform destroy_devl() about it using CDP_UNREF_DTR flag. Fix some style nits, add asserts. Discussed with: hselasky Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-01-19 17:36:52 +00:00
Konstantin Belousov	a57a934a38	Ignore devfs directory entries for devices either being destroyed or delisted. The check is racy. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-01-19 17:24:52 +00:00
Enji Cooper	ae266f893f	Fix the build when INVARIANTS is defined by restoring `bo`'s definition in ext2_truncate(..) and by putting it under INVARIANTS ifdefs X-MFC with: r277354 MFC after: 2 weeks	2015-01-19 07:10:08 +00:00
Pedro F. Giffuni	9a53618ab2	ext2: Garbage-collect some unused variables Reported by: clang static analysis MFC after: 2 weeks	2015-01-19 03:30:45 +00:00
Pedro F. Giffuni	7075482d4a	ext2: fix for uninitialized pointer read. path.ep_bp was being used uninitialized in ext4_ext_find_extent(). CID: 1062344 MFC after: 1 week	2015-01-18 21:18:28 +00:00
Pedro F. Giffuni	955ba37baa	Remove dead code. After the ext2 variant of the "orlov allocator" was implemented, the case for a negative or zero dirsize disappeared. Drop the dead code and unsign dirsize given that it can't be negative anyways. CID: 1008669 MFC after: 1 week	2015-01-18 20:26:27 +00:00
Konstantin Belousov	e3612a4c1f	Make SIGSTOP working for sleeps done while waiting for fifo readers or writers in open(2), when the fifo is located on an NFS mount. Reported by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-01-18 15:03:26 +00:00
Pedro F. Giffuni	84b170d298	ext2: cosmetical issues Minor sorting and note when the cases are expected to fall through. MFC after: 1 week	2015-01-17 15:19:18 +00:00
Hans Petter Selasky	d2955419cd	Avoid race with "dev_rel()" when using the recently added "delist_dev()" function. Make sure the character device structure doesn't go away until the end of the "destroy_dev()" function due to concurrently running cleanup code inside "devfs_populate()". MFC after: 1 week Reported by: dchagin@	2015-01-14 22:07:13 +00:00
Hans Petter Selasky	9570004318	Don't use POLLNVAL as a return value from the client side poll function. Many existing clients don't understand POLLNVAL and instead relies on an error code from the read(), write() or ioctl() system call. Also make sure we wakeup any client pollers before the cuse server is closing, so they don't wait forever for an event.	2015-01-13 13:32:18 +00:00
Ed Maste	4882501b5c	ANSIfy msdosfs Add a few cases and style(9) fixes missed in r276887 Sponsored by: The FreeBSD Foundation	2015-01-12 21:55:48 +00:00
Ed Maste	10c9700f3e	ANSIfy sys/fs/msdosfs There are a number of msdosfs improvements in NetBSD that may be worth bringing over, and this reduces noise in the comparison. Differential Revision: https://reviews.freebsd.org/D1466 Reviewed by: kib Sponsored by: The FreeBSD Foundation	2015-01-09 14:50:08 +00:00
Robert Watson	eae6da3db4	Use M_SIZE() instead of hand-crafted (and mostly correct) NFSMSIZ() macro in the NFS server; garbage collect now-unused NFSMSIZ() and M_HASCL() macros. Also garbage collect now-unused versions in headers for the removed previous NFS client and server. Reviewed by: rmacklem Sponsored by: EMC / Isilon Storage Division	2015-01-07 17:22:56 +00:00
Mateusz Guzik	cd29b292b8	Convert nullfs hash lock from a mutex to an rwlock.	2014-12-30 21:41:35 +00:00
Rick Macklem	07d491dede	r245508 modified the NFS client's Setattr RPC to use VA_UTIMES_NULL to indicate whether it should set the time to the current tod on the server. This had the side effect of making the NFS client use the client's timestamp for exclusive create, starting with FreeBSD9.2. Unfortunately a bug in some Solaris NFS servers causes these servers to return NFS_OK to the Setattr RPC done during exclusive create, but not actually set the file's mode, leaving the file's mode == 0. This patch restores the NFS client's behaviour to use the server's tod for the exclusive open's Setattr RPC, to avoid the Solaris server bug and to restore the pre-FreeBSD9.2 NFS behaviour. Discussed on: freebsd-fs PR: 186293 MFC after: 3 months	2014-12-28 21:13:52 +00:00
Rick Macklem	2f88b3d20a	Delete some duplicate code that was harmless because exactly the same code is at the end of the nfscl_checksattr() function that is called just before it. As such, this code had already been executed and didn't do anything. MFC after: 1 week	2014-12-25 22:29:37 +00:00
Rick Macklem	52f1bb38c2	A deadlock in the NFSv4 server with vfs.nfsd.enable_locallocks=1 was reported via email. This was caused by a LOR between the sleep lock used to serialize the local locking (nfsrv_locklf()) and locking the vnode. I believe this patch fixes the problem by delaying relocking of the vnode until the sleep lock is unlocked (nfsrv_unlocklf()). To avoid nfsvno_advlock() having the side effect of unlocking the vnode, unlocking the vnode was moved to before the functions that call nfsvno_advlock(). It shouldn't affect the execution of the default case where vfs.nfsd.enable_locallocks=0. Reported by: loic.blot@unix-experience.fr Discussed with: kib MFC after: 1 week	2014-12-25 01:55:17 +00:00
Rick Macklem	62c23db947	Fix kernel builds with "options NFS_DEBUG" that were broken by r276096. Also delete the two kernel options NFS_GATHERDELAY, NFS_WDELAYHASHSIZ which are no longer used. Reported by: bz	2014-12-23 14:24:36 +00:00
Rick Macklem	c15882f091	Remove the old NFS client and server from head, which means that the NFSCLIENT and NFSSERVER kernel options will no longer work. This commit only removes the kernel components. Removal of unused code in the user utilities will be done later. This commit does not include an addition to UPDATING, but that will be committed in a few minutes. Discussed on: freebsd-fs	2014-12-23 00:47:46 +00:00

... 3 4 5 6 7 ...

3684 Commits