freebsd-dev

Author	SHA1	Message	Date
Attilio Rao	258bee160c	Garbage collect HPFS bits which are now already completely disconnected from the tree since few months (please note that the userland bits were already disconnected since a long time, thus there is no need to update the OLD* entries). This is not targeted for MFC.	2013-03-02 14:54:33 +00:00
Jilles Tjoelker	6d6a91c50f	nullfs: Improve f_flags in statfs(). Include some flags of the nullfs mount itself: MNT_RDONLY, MNT_NOEXEC, MNT_NOSUID, MNT_UNION, MNT_NOSYMFOLLOW. This allows userland code calling statfs() or fstatfs() to see these flags. In particular, this allows opendir() to detect that a -t nullfs -o union mount needs deduplication (otherwise at least . and .. are returned twice) and allows rtld to detect a -t nullfs -o noexec mount as noexec. Turn off the MNT_ROOTFS flag from the underlying filesystem because the nullfs mount is definitely not the root filesystem. Reviewed by: kib MFC after: 1 week	2013-03-02 12:42:23 +00:00
Pawel Jakub Dawidek	2609222ab4	Merge Capsicum overhaul: - Capability is no longer separate descriptor type. Now every descriptor has set of its own capability rights. - The cap_new(2) system call is left, but it is no longer documented and should not be used in new code. - The new syscall cap_rights_limit(2) should be used instead of cap_new(2), which limits capability rights of the given descriptor without creating a new one. - The cap_getrights(2) syscall is renamed to cap_rights_get(2). - If CAP_IOCTL capability right is present we can further reduce allowed ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed ioctls can be retrived with cap_ioctls_get(2) syscall. - If CAP_FCNTL capability right is present we can further reduce fcntls that can be used with the new cap_fcntls_limit(2) syscall and retrive them with cap_fcntls_get(2). - To support ioctl and fcntl white-listing the filedesc structure was heavly modified. - The audit subsystem, kdump and procstat tools were updated to recognize new syscalls. - Capability rights were revised and eventhough I tried hard to provide backward API and ABI compatibility there are some incompatible changes that are described in detail below: CAP_CREATE old behaviour: - Allow for openat(2)+O_CREAT. - Allow for linkat(2). - Allow for symlinkat(2). CAP_CREATE new behaviour: - Allow for openat(2)+O_CREAT. Added CAP_LINKAT: - Allow for linkat(2). ABI: Reuses CAP_RMDIR bit. - Allow to be target for renameat(2). Added CAP_SYMLINKAT: - Allow for symlinkat(2). Removed CAP_DELETE. Old behaviour: - Allow for unlinkat(2) when removing non-directory object. - Allow to be source for renameat(2). Removed CAP_RMDIR. Old behaviour: - Allow for unlinkat(2) when removing directory. Added CAP_RENAMEAT: - Required for source directory for the renameat(2) syscall. Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR): - Allow for unlinkat(2) on any object. - Required if target of renameat(2) exists and will be removed by this call. Removed CAP_MAPEXEC. CAP_MMAP old behaviour: - Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and PROT_WRITE. CAP_MMAP new behaviour: - Allow for mmap(2)+PROT_NONE. Added CAP_MMAP_R: - Allow for mmap(PROT_READ). Added CAP_MMAP_W: - Allow for mmap(PROT_WRITE). Added CAP_MMAP_X: - Allow for mmap(PROT_EXEC). Added CAP_MMAP_RW: - Allow for mmap(PROT_READ \| PROT_WRITE). Added CAP_MMAP_RX: - Allow for mmap(PROT_READ \| PROT_EXEC). Added CAP_MMAP_WX: - Allow for mmap(PROT_WRITE \| PROT_EXEC). Added CAP_MMAP_RWX: - Allow for mmap(PROT_READ \| PROT_WRITE \| PROT_EXEC). Renamed CAP_MKDIR to CAP_MKDIRAT. Renamed CAP_MKFIFO to CAP_MKFIFOAT. Renamed CAP_MKNODE to CAP_MKNODEAT. CAP_READ old behaviour: - Allow pread(2). - Disallow read(2), readv(2) (if there is no CAP_SEEK). CAP_READ new behaviour: - Allow read(2), readv(2). - Disallow pread(2) (CAP_SEEK was also required). CAP_WRITE old behaviour: - Allow pwrite(2). - Disallow write(2), writev(2) (if there is no CAP_SEEK). CAP_WRITE new behaviour: - Allow write(2), writev(2). - Disallow pwrite(2) (CAP_SEEK was also required). Added convinient defines: #define CAP_PREAD (CAP_SEEK \| CAP_READ) #define CAP_PWRITE (CAP_SEEK \| CAP_WRITE) #define CAP_MMAP_R (CAP_MMAP \| CAP_SEEK \| CAP_READ) #define CAP_MMAP_W (CAP_MMAP \| CAP_SEEK \| CAP_WRITE) #define CAP_MMAP_X (CAP_MMAP \| CAP_SEEK \| 0x0000000000000008ULL) #define CAP_MMAP_RW (CAP_MMAP_R \| CAP_MMAP_W) #define CAP_MMAP_RX (CAP_MMAP_R \| CAP_MMAP_X) #define CAP_MMAP_WX (CAP_MMAP_W \| CAP_MMAP_X) #define CAP_MMAP_RWX (CAP_MMAP_R \| CAP_MMAP_W \| CAP_MMAP_X) #define CAP_RECV CAP_READ #define CAP_SEND CAP_WRITE #define CAP_SOCK_CLIENT \ (CAP_CONNECT \| CAP_GETPEERNAME \| CAP_GETSOCKNAME \| CAP_GETSOCKOPT \| \ CAP_PEELOFF \| CAP_RECV \| CAP_SEND \| CAP_SETSOCKOPT \| CAP_SHUTDOWN) #define CAP_SOCK_SERVER \ (CAP_ACCEPT \| CAP_BIND \| CAP_GETPEERNAME \| CAP_GETSOCKNAME \| \ CAP_GETSOCKOPT \| CAP_LISTEN \| CAP_PEELOFF \| CAP_RECV \| CAP_SEND \| \ CAP_SETSOCKOPT \| CAP_SHUTDOWN) Added defines for backward API compatibility: #define CAP_MAPEXEC CAP_MMAP_X #define CAP_DELETE CAP_UNLINKAT #define CAP_MKDIR CAP_MKDIRAT #define CAP_RMDIR CAP_UNLINKAT #define CAP_MKFIFO CAP_MKFIFOAT #define CAP_MKNOD CAP_MKNODAT #define CAP_SOCK_ALL (CAP_SOCK_CLIENT \| CAP_SOCK_SERVER) Sponsored by: The FreeBSD Foundation Reviewed by: Christoph Mallon <christoph.mallon@gmx.de> Many aspects discussed with: rwatson, benl, jonathan ABI compatibility discussed with: kib	2013-03-02 00:53:12 +00:00
Alan Cox	2c8472682c	Eliminate a duplicate #include. Sponsored by: EMC / Isilon Storage Division	2013-02-26 07:00:24 +00:00
Attilio Rao	590f9303e5	Merge from vmobj-rwlock branch: Remove unused inclusion of vm/vm_pager.h and vm/vnode_pager.h. Sponsored by: EMC / Isilon storage division Tested by: pho Reviewed by: alc	2013-02-26 01:00:11 +00:00
John Baldwin	593efaf9f7	Further refine the handling of stop signals in the NFS client. The changes in r246417 were incomplete as they did not add explicit calls to sigdeferstop() around all the places that previously passed SBDRY to _sleep(). In addition, nfs_getcacheblk() could trigger a write RPC from getblk() resulting in sigdeferstop() recursing. Rather than manually deferring stop signals in specific places, change the VFS_() and VOP_() methods to defer stop signals for filesystems which request this behavior via a new VFCF_SBDRY flag. Note that this has to be a VFC flag rather than a MNTK flag so that it works properly with VFS_MOUNT() when the mount is not yet fully constructed. For now, only the NFS clients are set this new flag in VFS_SET(). A few other related changes: - Add an assertion to ensure that TDF_SBDRY doesn't leak to userland. - When a lookup request uses VOP_READLINK() to follow a symlink, mark the request as being on behalf of the thread performing the lookup (cnp_thread) rather than using a NULL thread pointer. This causes NFS to properly handle signals during this VOP on an interruptible mount. PR: kern/176179 Reported by: Russell Cattelan (sigdeferstop() recursion) Reviewed by: kib MFC after: 1 month	2013-02-21 19:02:50 +00:00
Warner Losh	b96f7e0a60	The request queue is already locked, so we don't need the splsofclock/splx here to note future work.	2013-02-21 02:43:44 +00:00
Konstantin Belousov	bb7ca8229d	Do not update the fsinfo block on each update of any fat block, this is excessive. Postpone the flush of the fsinfo to VFS_SYNC(), remembering the need for update with the flag MSDOSFS_FSIMOD, stored in pm_flags. FAT32 specification describes both FSI_Free_Count and FSI_Nxt_Free as the advisory hints, not requiring them to be correct. Based on the patch from bde, modified by me. Reviewed by: bde MFC after: 2 weeks	2013-02-17 20:35:54 +00:00
Baptiste Daroussin	3c4223a65a	Revert r246791 as it needs a security review first Reported by: gavin, rwatson	2013-02-14 15:17:53 +00:00
Baptiste Daroussin	f4365abd91	Allow fdescfs to be mounted from inside a jail MFC after: 1 week	2013-02-14 13:03:15 +00:00
Pedro F. Giffuni	a9d1b29995	ext2fs: Use prototype declarations for function definitions Submitted by: Christoph Mallon MFC after: 2 weeks	2013-02-10 19:49:37 +00:00
Attilio Rao	5e60cb948e	Remove a racy checks on resident and cached pages for tmpfs_mapped{read, write}() functions: - tmpfs_mapped{read, write}() are only called within VOP_{READ, WRITE}(), which check before-hand to work only on valid VREG vnodes. Also the vnode is locked for the duration of the work, making vnode reclaiming impossible, during the operation. Hence, vobj can never be NULL. - Currently check on resident pages and cached pages without vm object lock held is racy and can do even more harm than good, as a page could be transitioning between these 2 pools and then be skipped entirely. Skip the checks as lookups on empty splay trees are very cheap. Discussed with: alc Tested by: flo MFC after: 2 weeks	2013-02-10 01:04:10 +00:00
Pedro F. Giffuni	4e1e0e2582	ext2fs: Replace redundant EXT2_MIN_BLOCK with EXT2_MIN_BLOCK_SIZE. Submitted by: Christoph Mallon MFC after: 2 weeks	2013-02-08 21:09:44 +00:00
Pedro F. Giffuni	1a125d6d85	ext2fs: make e2fs_maxcontig local and remove tautological check. e2fs_maxcontig was modelled after UFS when bringing the "Orlov allocator" to ext2. On UFS fs_maxcontig is kept in the superblock and is used by userland tools (fsck and growfs), In ext2 this information is volatile so it is not available for userland tools, so in this case it doesn't have sense to carry it in the in-memory superblock. Also remove a pointless check for MAX(1, x) > 0. Submitted by: Christoph Mallon MFC after: 2 weeks	2013-02-08 20:58:00 +00:00
Pedro F. Giffuni	a940ce65cd	Remove unused MAXSYMLINKLEN macro. Reviewed by: mckusick PR: kern/175794 MFC after: 1 week	2013-02-08 20:30:19 +00:00
Konstantin Belousov	2ca4998342	Stop translating the ERESTART error from the open(2) into EINTR. Posix requires that open(2) is restartable for SA_RESTART. For non-posix objects, in particular, devfs nodes, still disable automatic restart of the opens. The open call to a driver could have significant side effects for the hardware. Noted and reviewed by: jilles Discussed with: bde MFC after: 2 weeks	2013-02-07 14:53:33 +00:00
John Baldwin	a120a7a3cd	Rework the handling of stop signals in the NFS client. The changes in 195702, 195703, and 195821 prevented a thread from suspending while holding locks inside of NFS by forcing the thread to fail sleeps with EINTR or ERESTART but defer the thread suspension to the user boundary. However, this had the effect that stopping a process during an NFS request could abort the request and trigger EINTR errors that were visible to userland processes (previously the thread would have suspended and completed the request once it was resumed). This change instead effectively masks stop signals while in the NFS client. It uses the existing TDF_SBDRY flag to effect this since SIGSTOP cannot be masked directly. Also, instead of setting PBDRY on individual sleeps, the NFS client now sets the TDF_SBDRY flag around each NFS request and stop signals are masked for all sleeps during that region (the previous change missed sleeps in lockmgr locks). The end result is that stop signals sent to threads performing an NFS request are completely ignored until after the NFS request has finished processing and the thread prepares to return to userland. This restores the behavior of stop signals being transparent to userland processes while still preventing threads from suspending while holding NFS locks. Reviewed by: kib MFC after: 1 month	2013-02-06 17:06:51 +00:00
Pedro F. Giffuni	d427334435	ext2fs: move assignment where it is not dead. Submitted by: Christoph Mallon MFC after: 2 weeks	2013-02-05 03:26:34 +00:00
Pedro F. Giffuni	80b6a61199	ext2fs: Remove unused em_e2fsb definition.. Submitted by: Christoph Mallon MFC after: 2 weeks	2013-02-05 03:23:56 +00:00
Pedro F. Giffuni	555368dcf1	ext2fs: Remove useless rootino local variable. Submitted by: Christoph Mallon MFC after: 2 weeks	2013-02-05 03:17:41 +00:00
Pedro F. Giffuni	ef024b0da9	ext2fs: Correct off-by-one errors in FFTODT() and DDTOFT(). Submitted by: Christoph Mallon MFC after: 2 weeks	2013-02-05 03:13:05 +00:00
Pedro F. Giffuni	666116e4a3	ext2fs: Use nitems(). Submitted by: Christoph Mallon MFC after: 2 weeks	2013-02-05 03:08:56 +00:00
Pedro F. Giffuni	fdc100e4c2	ext2fs: Use EXT2_LINK_MAX instead of LINK_MAX Submitted by: Christoph Mallon MFC after: 2 weeks	2013-02-05 03:01:04 +00:00
Pedro F. Giffuni	757224cbdb	ext2fs: general cleanup. - Remove unused extern declarations in fs.h - Correct comments in ext2_dir.h - Several panic() messages showed wrong function names. - Remove commented out stray line in ext2_alloc.c. - Remove the unused macro EXT2_BLOCK_SIZE_BITS() and the then write-only member e2fs_blocksize_bits from struct m_ext2fs. - Remove the unused macro EXT2_FIRST_INO() and the then write-only member e2fs_first_inode from struct m_ext2fs. - Remove EXT2_DESC_PER_BLOCK() and the member e2fs_descpb from struct m_ext2fs. - Remove the unused members e2fs_bmask, e2fs_dbpg and e2fs_mount_opt from struct m_ext2fs - Correct harmless off-by-one error for fspath in ext2_vfsops.c. - Remove the unused and broken macros EXT2_ADDR_PER_BLOCK_BITS() and EXT2_DESC_PER_BLOCK_BITS(). - Remove the !_KERNEL versions of the EXT2_* macros. Submitted by: Christoph Mallon MFC after: 2 weeks	2013-02-02 22:23:45 +00:00
Konstantin Belousov	11fca81ccd	The MSDOSFSMNT_WAITONFAT flag is bogus and broken. It does less than track the MNT_SYNCHRONOUS flag. It is set to the latter at mount time but not updated by MNT_UPDATE. Use MNT_SYNCHRONOUS to decide to write the FAT updates syncrhonously. Submitted by: bde MFC after: 1 week	2013-02-01 18:30:41 +00:00
Konstantin Belousov	79fb7dd167	Backup FATs were sometimes marked dirty by copying their first block from the primary FAT, and then they were not marked clean on unmount. Force marking them clean when appropriate. Submitted by: bde MFC after: 1 week	2013-02-01 18:25:53 +00:00
Konstantin Belousov	9ec062dddd	The directory entry for dotdot was corrupted in the FAT32 case when moving a directory to a subdir of the root directory from somewhere else. For all directory moves that change the parent directory, the dotdot entry must be fixed up. For msdosfs, the root directory is magic for non-FAT32. It is less magic for FAT32, but needs the same magic for the dotdot fixup. It didn't have it. Both chkdsk and fsck_msdosfs fix the corrupt directory entries with no problems. The fix is to use the same magic for dotdot in msdosfs_rename() as in msdosfs_mkdir(). For msdosfs_mkdir(), document the magic. When writing the dotdot entry in mkdir, use explicitly set pcl variable instead on relying on the start cluster of the root directory typically has a value < 65536. Submitted by: bde MFC after: 1 week	2013-02-01 18:06:06 +00:00
Konstantin Belousov	48efa33b49	The mountmsdosfs() function had an insane sanity test, remove it. Trying FAT32 on a small partition failed to mount because pmp->pm_Sectors was nonzero. Normally, FAT32 file systems are so large that the 16-bit pm_Sectors can't hold the size. This is indicated by setting it to 0 and using only pm_HugeSectors. But at least old versions of newfs_msdos use the 16-bit field if possible, and msdosfs supports this except for breaking its own support in the sanity check. This is quite different from the handling of pm_FATsecs -- now the 16-bit value is always ignored for FAT32 except for checking that it is 0, and newfs_msdos doesn't use the 16-bit value for FAT32. Submitted by: bde MFC after: 1 week	2013-02-01 18:01:03 +00:00
Konstantin Belousov	a26b949f2d	Fix a backwards comment in markvoldirty(). Submitted by: bde MFC after: 1 week	2013-02-01 17:58:37 +00:00
Konstantin Belousov	dd6035234a	Assert that the mbuf in the chain has sane length. Proper place for this check is somewhere in the network code, but this assertion already proven to be useful in catching what seems to be driver bugs causing NFS scrambling random memory. Discussed with: rmacklem MFC after: 1 week	2013-02-01 16:57:02 +00:00
Konstantin Belousov	6168020f66	Be conservative and do not try to consume more bytes than was requested from the server for the read operation. Server shall not reply with too large size, but client should be resilent too. Reviewed by: rmacklem MFC after: 1 week	2013-01-27 09:34:25 +00:00
Pedro F. Giffuni	646a7fea0c	Clean some 'svn:executable' properties in the tree. Submitted by: Christoph Mallon MFC after: 3 days	2013-01-26 22:08:21 +00:00
Pedro F. Giffuni	879aeda7b6	Cosmetical off-by-one Technically, the case when all the blocks are released is not a sanity check. Move further the comment while here. Suggested by: bde MFC after: 3 days	2013-01-26 21:50:52 +00:00
John Baldwin	a89a2c8ba4	Further cleanups to use of timestamps in NFS: - Use NFSD_MONOSEC (which maps to time_uptime) instead of the seconds portion of wall-time stamps to manage timeouts on events. - Remove unused nd_starttime from the per-request structure in the new NFS server. - Use nanotime() for the modification time on a delegation to get as precise a time as possible. - Use time_second instead of extracting the second from a call to getmicrotime(). Submitted by: bde (3) Reviewed by: bde, rmacklem MFC after: 2 weeks	2013-01-25 15:25:24 +00:00
Pedro F. Giffuni	69017f8d8c	ext2fs: fix a check for negative block numbers. The previous change accidentally left the substraction we were trying to avoid in case that i_blocks could become negative. Reported by: bde MFC after: 4 days	2013-01-23 14:29:29 +00:00
Pedro F. Giffuni	1d04725a7a	ext2fs: make some inode fields match the ext2 spec. Ext2fs uses unsigned fields in its dinode struct. FreeBSD can have negative values in some of those fields and the inode is meant to interact with the system so we have never respected the unsigned nature of most of those fields. Block numbers and the NFS generation number do not need to be signed so redefine them as unsigned to better match the on-disk information. MFC after: 1 week	2013-01-22 18:54:03 +00:00
Pedro F. Giffuni	4b21c8fda9	ext2fs: temporarily disable the reallocation code. Testing with fsx has revealed problems and in order to hunt the bugs properly we need reduce the complexity. This seems to help but is not a complete solution. MFC after: 3 days	2013-01-22 18:36:31 +00:00
Xin LI	e4558aacfc	Make it possible to force async at server side on new NFS server, similar to the old one's nfs.nfsrv.async. Please note that by enabling this option (default is disabled), the system could potentionally have silent data corruption if the server crashes before write is committed to non-volatile storage, as the client side have no way to tell if the data is already written. Submitted by: rmacklem MFC after: 2 weeks	2013-01-18 19:42:08 +00:00
Pedro F. Giffuni	b187108f66	ext2fs: Add some DOINGASYNC check to match ffs. This is mostly cosmetical. Reviewed by: bde MFC after: 3 days	2013-01-18 19:11:17 +00:00
John Baldwin	d177f14da9	Use vfs_timestamp() to set file timestamps rather than invoking getmicrotime() or getnanotime() directly in NFS. Reviewed by: rmacklem, bde MFC after: 1 week	2013-01-18 18:43:38 +00:00
John Baldwin	39804bc89d	Remove a no-longer-used variable after the previous change to use VA_UTIMES_NULL. Submitted by: bde, rmacklem MFC after: 1 week	2013-01-17 18:45:20 +00:00
John Baldwin	5055536eec	Use the VA_UTIMES_NULL flag to detect when NULL was passed to utimes() instead of comparing the desired time against the current time as a heuristic. Reviewed by: rmacklem MFC after: 1 week	2013-01-16 21:52:31 +00:00
Konstantin Belousov	e8f966eeb8	Remove the filtering of the acceptable mount options for nullfs, added in r245004. Although the report was for noatime option which is non-functional for the nullfs, other standard options like nosuid or noexec are useful with it. Reported by: Dewayne Geraghty <dewayne.geraghty@heuristicsystems.com.au> MFC after: 3 days	2013-01-16 05:32:49 +00:00
John Baldwin	6910d7a0d8	- More properly handle interrupted NFS requests on an interruptible mount by returning an error of EINTR rather than EACCES. - While here, bring back some (but not all) of the NFS RPC statistics lost when krpc was committed. Reviewed by: rmacklem MFC after: 1 week	2013-01-15 22:08:17 +00:00
Konstantin Belousov	603f963e56	The current default size of the nullfs hash table used to lookup the existing nullfs vnode by the lower vnode is only 16 slots. Since the default mode for the nullfs is to cache the vnodes, hash has extremely huge chains. Size the nullfs hashtbl based on the current value of desiredvnodes. Use vfs_hash_index() to calculate the hash bucket for a given vnode. Pointy hat to: kib Diagnosed and reviewed by: peter Tested by: peter, pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 5 days	2013-01-14 05:44:47 +00:00
Konstantin Belousov	6b17595133	When nullfs mount is forcibly unmounted and nullfs vnode is reclaimed, get back the leased write reference from the lower vnode. There is no other path which can correct v_writecount on the lowervp. Reported by: flo Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 days	2013-01-10 18:24:48 +00:00
Baptiste Daroussin	3d94054c30	Add support for IO_APPEND flag in fuse This make open(..., O_APPEND) actually works on fuse filesystem. Reviewed by: attilio	2013-01-08 12:21:50 +00:00
Pedro F. Giffuni	98f5c0d41a	ext2fs: cleanup de dinode structure. It was plagued with style errors and the offsets had been lost. While here took the time to update the fields according to the latest ext4 documentation. Reviewed by: bde MFC after: 3 days	2013-01-07 03:36:32 +00:00
Gleb Kurtsou	4fd5efe79e	tmpfs: Replace directory entry linked list with RB-Tree. Use file name hash as a tree key, handle duplicate keys. Both VOP_LOOKUP and VOP_READDIR operations utilize same tree for search. Directory entry offset (cookie) is either file name hash or incremental id in case of hash collisions (duplicate-cookies). Keep sorted per directory list of duplicate-cookie entries to facilitate cookie number allocation. Don't fail if previous VOP_READDIR() offset is no longer valid, start with next dirent instead. Other file system handle it similarly. Workaround race prone tn_readdir_last[pn] fields update. Add tmpfs_dir_destroy() to free all dirents. Set NFS cookies in tmpfs_dir_getdents(). Return EJUSTRETURN from tmpfs_dir_getdents() instead of hard coded -1. Mark directory traversal routines static as they are no longer used outside of tmpfs_subr.c	2013-01-06 22:15:44 +00:00
Konstantin Belousov	268dd286a0	Fix reversed condition in the assertion. Pointy hat to: kib MFC after: 13 days	2013-01-04 07:52:47 +00:00

1 2 3 4 5 ...

2958 Commits