freebsd-dev

Author	SHA1	Message	Date
Konstantin Belousov	c31480a1f6	UFS snapshots: properly set the vm object size. Citing Kirk: The previous code [before `8563de2f27` -- kib] did not call vnode_pager_setsize() but worked because later in ffs_snapshot() it does a UFS_WRITE() to output the snaplist. Previously the UFS_WRITE() allocated the extra block at the end of the file which caused it to do the needed vnode_pager_setsize(). But the new code had already allocated the extra block, so UFS_WRITE() did not extend the size and thus did not do the vnode_pager_setsize(). PR: 253158 Reported by: Harald Schmalzbauer <bugzilla.freebsd@omnilan.de> Reviewed by: mckusick Tested by: cy Sponsored by: The FreeBSD Foundation MFC after: 1 week	2021-02-16 07:11:52 +02:00
Kirk McKusick	8563de2f27	Fix bug 253158 - Panic: snapacct_ufs2: bad block - mksnap_ffs(8) crash The panic reported in 253158 arises because the /mnt/.snap/.factory snapshot allocated the last block in the filesystem. The snapshot code allocates the last block in the filesystem as a way of setting its length to be the size of the filesystem. Part of taking a snapshot is to remove all the earlier snapshots from the image of the newest snapshot so that newer snapshots will not claim the blocks of the earlier snapshots. The panic occurs when the new snapshot finds that both it and an earlier snapshot claim the same block. The fix is to set the size of the snapshot to be one block after the last block in the filesystem. This block can never be allocated since it is not a valid block in the filesystem. This extra block is used as a place to store the initial list of blocks that the snapshot has already copied and is used to avoid a deadlock in and speed up the ffs_copyonwrite() function. Reported by: Harald Schmalzbauer Tested by: Peter Holm PR: 253158 Sponsored by: Netflix	2021-02-11 21:31:16 -08:00
Konstantin Belousov	adf28ab456	fifo: minor comment and assert improvements. In particular, replace a note that reload through vget() is obsoleted, with explanation why this code is required. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:22 +02:00
Konstantin Belousov	26af9f72f7	ffs_unlock: assert that IN_ENDOFF is not leaked past locked scope This catches both missed processing of IN_ENDOFF and missed application of VOP_VPUT_PAIR() after VOP that created an entry in the directory. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:22 +02:00
Konstantin Belousov	28703d2713	ffs softdep: Force processing of VI_OWEINACT vnodes when there is inode shortage Such vnodes prevent inode reuse, and should be force-cleared when ffs_valloc() is unable to find a free inode. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:22 +02:00
Konstantin Belousov	2011b44fa3	softdep_request_cleanup: wait for softdep_request_clean_flush() to pass if we noted a parallel request is active and declined to overflow the system with parallel redundant sync of the vnodes. But we need to wait for the flush to finish to see if there are any freed resources. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:22 +02:00
Konstantin Belousov	013168db8c	ufs_inactive(): stop hiding ERELOOKUP from ffs_truncate(), return it. VFS should retry inactivation when possible, then. This should provide timely removal of unlinked unreferenced inodes. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:21 +02:00
Konstantin Belousov	b59a8e63d6	Stop ignoring ERELOOKUP from VOP_INACTIVE() When possible, relock the vnode and retry inactivation. Only vunref() is required not to drop the vnode lock, so handle it specially by not retrying. This is a part of the efforts to ensure that unlinked not referenced vnode does not prevent inode from reusing. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:21 +02:00
Konstantin Belousov	6aed2435c8	ufs vnops: brace softdep_prelink() with DOINGSUJ instead of DOINGSOFTDEP because softdep_prelink() is reverted to NOP for non-J case. There is no need to do anything before ufs_direnter() in SU/non-J case, everything required to sync the directory is done in VOP_VPUT_PAIR(). Suggested by: mckusick Reviewed by: chs, mckusick Tested by: pho MFC after: 2 week Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:21 +02:00
Konstantin Belousov	ede40b0675	ffs softdep: remove will_direnter argument of softdep_prelink() Originally this was done in `8a1509e442` to forcibly cover cases where a hole in the directory could be created by extending into indirect block, since dependency of writing out indirect block is not tracked. This results in excessive amount of fsyncing the directories, where all creation of new entry forced fsync before it. This is not needed, it is enough to fsync when IN_NEEDSYNC is set, and VOP_VPUT_PAIR() provides the required hook to only perform required syncing. The series of changes culminating in this commit puts the performance of metadata-intensive loads back to that before `8a1509e442`. Analyzed by: mckusick Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:21 +02:00
Konstantin Belousov	06f2918ab8	ufs_direnter: directory truncation does not need special case for rename In ufs_rename case, tdvp is locked from the place where ufs_direnter() is done till VOP_VPUT_PAIR(), which means that we no longer need to specially handle rename in ufs_direnter(). Truncation, if possible, is done in the same way in ffs_vput_pair() both for rename and other VOPs calling ufs_direnter(). Remove isrename argument and set IN_ENDOFF if ufs_direnter() succeeded and directory needs truncation. In ffs_vput_pair(), stop verifying the condition that directory needs truncation when IN_ENDOFF is set, instead assert that the condition is true. Suggested by: mckusick Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:21 +02:00
Konstantin Belousov	038fe6e089	ufs_rename: use VOP_VPUT_PAIR and rely on directory sync/truncation there Suggested by: mckusick Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:21 +02:00
Konstantin Belousov	74a3652f83	ufs_direnter: move directory truncation to ffs_vput_pair(). VOP_VPUT_PAIR() provides the hook to do the truncation right before unlock, which is required since truncation might need to fsync(), which itself might unlock the directory vnode. Set new flag IN_ENDOFF which indicates that i_endoff is valid and should be checked against inode size. Excessive size is chomped, but this operation is advisory and failure to truncate should not result in the failure of the main VOP. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:21 +02:00
Konstantin Belousov	30bfb2fa0f	ffs_vput_pair(): try harder to recover from the vnode reclaim In particular, if unlock_vp is false, save vp's inode number and generation. If ffs_inotovp() can re-create the vnode with the same number and generation after we finished with handling dvp, then we most likely raced with unmount, and were able to restore atomicity of open. We use FFSV_REPLACE_DOOMED there, to drop the old vnode. This additional recovery is not strictly required, but it improves the quality of the implementation. Suggested by: mckusick Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:21 +02:00
Konstantin Belousov	f2c9d038bd	FFS: implement special VOP_VPUT_PAIR(). It cleans IN_NEEDSYNC flag on dvp before returning, by applying ffs_syncvnode() until success or an error different from ERELOOKUP. IN_NEEDSYNC cleanup is required to avoid creating holes in the directories when extended into indirect block. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:21 +02:00
Konstantin Belousov	be44e98637	ffs_snapshot: use VOP_VPUT_PAIR after VOP_CREATE. If the snapshot embrio was reclaimed under us, return error outright. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:20 +02:00
Konstantin Belousov	08c2dc2841	ufs_direnter/SU: unconditionally UFS_UPDATE inode when extending directory for all kinds of async/SU mount variants. Submitted by: mckusick Reviewed by: chs Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:20 +02:00
Konstantin Belousov	1de1e2bfbf	ffs_syncvnode: only clear IN_NEEDSYNC after successfull sync If it is cleaned before the sync, other threads might see the inode without the flag set, because syncing could unlock it. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:20 +02:00
Konstantin Belousov	89fd61d955	Merge ufs_fhtovp() into ffs_inotovp(). The function alone was not used for anything but ffs_fstovp() for long time. Suggested by: mckusick Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:20 +02:00
Konstantin Belousov	5952c86c78	ffs_inotovp(): interface to convert (ino, gen) into alive vnode It generalizes the VFS_FHTOVP() interface, making it possible to fetch the inode without faking filehandle. Also it adds the ffs flags argument which allows to control ffs_vgetf() call. Requested by: mckusick Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:20 +02:00
Konstantin Belousov	f16c26b1c0	ffs: Add FFSV_REPLACE_DOOMED flag to ffs_vgetf() It specifies that caller requests a fresh non-doomed vnode. If doomed vnode is found in the hash, it should behave similarly to FFSV_REPLACE. Or, to put it differently, the flag is same as FFSV_REPLACE, but only when the found hashed vnode is doomed. Reviewed by: chs, mkcusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:20 +02:00
Konstantin Belousov	e94f2f1be3	ffs: call ufsdirhash_dirtrunc() right after setting directory size Later processing of ffs_truncate() might temporary unlock the directory vnode, causing unsychronized dirhash and inode sizes if update is postponed to UFS_TRUNCATE() callers. Reviewed by: chs, mkcusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-02-12 03:02:19 +02:00
Konstantin Belousov	bf0db19339	buf SU hooks: track buf_start() calls with B_IOSTARTED flag and only call buf_complete() if previously started. Some error paths, like CoW failire, might skip buf_start() and do bufdone(), which itself call buf_complete(). Various SU handle_written_XXX() functions check that io was started and incomplete parts of the buffer data reverted before restoring them. This is a useful invariant that B_IO_STARTED on buffer layer allows to keep instead of changing check and panic into check and return. Reported by: pho Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundations	2021-02-12 03:02:19 +02:00
Konstantin Belousov	0281f88e5d	ffs_vnops.c: Move opt_*.h includes to the top. as it is done in other places. Header files might need options defined for correct operation. Reviewed by: chs, mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2021-02-12 03:02:19 +02:00
Kirk McKusick	a63eae65ff	Revert `2d4422e799`, Eliminate lock order reversal in UFS ffs_unmount(). After discussion with Chuck Silvers (chs@) we have decided that there is a better way to resolve this lock order reversal which will be committed separately. Sponsored by: Netflix	2021-01-30 00:03:37 -08:00
Mateusz Guzik	c892d60a1d	ufs: denote lack of support for lockless symlink lookup It is unclear without investigating if it can be provided without using extra memory, so for the time being just don't.	2021-01-23 15:04:43 +00:00
Kirk McKusick	79a5c790bd	Eliminate a locking panic when cleaning up UFS snapshots after a disk failure. Each vnode has an embedded lock that controls access to its contents. However vnodes describing a UFS snapshot all share a single snapshot lock to coordinate their access and update. As part of mounting a UFS filesystem with snapshots, each of the vnodes describing a snapshot has its individual lock replaced with the snapshot lock. When the filesystem is unmounted the vnode's original lock is returned replacing the snapshot lock. When a disk fails while the UFS filesystem it contains is still mounted (for example when a thumb drive is removed) UFS forcibly unmounts the filesystem. The loss of the drive causes the GEOM subsystem to orphan the provider, but the consumer remains until the filesystem has finished with the unmount. Information describing the snapshot locks was being prematurely cleared during the orphaning causing the return of the snapshot vnode's original locks to fail. The fix is to not clear the needed information prematurely. Sponsored by: Netflix	2021-01-15 16:36:42 -08:00
Kirk McKusick	173779b98f	Eliminate lock order reversal in UFS when unmounting filesystems with snapshots. Each vnode has an embedded lock that controls access to its contents. However vnodes describing a UFS snapshot all share a single snapshot lock to coordinate their access and update. As part of mounting a UFS filesystem with snapshots, each of the vnodes describing a snapshot has its individual lock replaced with the snapshot lock. When the filesystem is unmounted the vnode's original lock is returned replacing the snapshot lock. The lock order reversal happens because vnode locks must be acquired before snapshot locks. When unmounting we must lock both the snapshot lock and the vnode lock before swapping them so that the vnode will be continuously locked during the swap. For each vnode representing a snapshot, we must first acquire the snapshot lock to ensure exclusive access to it and its original lock. We then face a lock order reversal when we try to acquire the original vnode lock. The problem is eliminated by doing a non-blocking exclusive lock on the original lock which will always succeed since there are no users of that lock. Sponsored by: Netflix	2021-01-15 16:03:01 -08:00
Mateusz Guzik	6b3a9a0f3d	Convert remaining cap_rights_init users to cap_rights_init_one semantic patch: @@ expression rights, r; @@ - cap_rights_init(&rights, r) + cap_rights_init_one(&rights, r)	2021-01-12 13:16:10 +00:00
Kirk McKusick	2d4422e799	Eliminate lock order reversal in UFS ffs_unmount(). UFS uses a new "mntfs" pseudo file system which provides private device vnodes for a file system to safely access its disk device. The original device vnode is saved in um_odevvp to hold the exclusive lock on the device so that any attempts to open it for writing will fail. But it is otherwise unused and has its BO_NOBUFS flag set to enforce that file systems using mntfs vnodes do not accidentally use the original devfs vnode. When the file system is unmounted, um_odevvp is no longer needed and is released. The lock order reversal happens because device vnodes must be locked before UFS vnodes. During unmount, the root directory vnode lock is held. When when calling vrele() on um_odevvp, vrele() attempts to exclusive lock um_odevvp causing the lock order reversal. The problem is eliminated by doing a non-blocking exclusive lock on um_odevvp which will always succeed since there are no users of um_odevvp. With um_odevvp locked, it can be released using vput which does not attempt to do a blocking exclusive lock request and thus avoids the lock order reversal. Sponsored by: Netflix	2021-01-11 16:49:07 -08:00
Thomas Munro	e7347be9e3	ffs: Support O_DSYNC. Respect the new IO_DATASYNC flag when performing synchronous writes. Compared to O_SYNC, O_DSYNC lets us skip updating the inode in some cases, matching the behaviour of fdatasync(2). Reviewed by: kib Differential Review: https://reviews.freebsd.org/D25160	2021-01-08 13:15:56 +13:00
Mateusz Guzik	3e506a67bb	vfs: add v_irflag accessors Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D27793	2021-01-03 06:50:06 +00:00
Mateusz Guzik	9997aedb8f	ufs: use VNPASS when asserting on a vnode in ufs_read_pgcache	2021-01-01 03:14:11 +00:00
Mark Johnston	ace3d9475c	ffs: Avoid out-of-bounds accesses in the fs_active bitmap We use a bitmap to track which cylinder groups have changed between snapshot creation and filesystem suspension. The "legs" of the bitmap are four bytes wide (see ACTIVESET()) so we must round up the allocation size to a multiple of four bytes. I believe this bug is harmless since UMA/kmem_* will both pad the allocation and zero the full allocation. Note that malloc() does inline zeroing when the allocation size is known at compile-time. Reported by: pho (using KASAN) Reviewed by: kib, mckusick MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27731	2020-12-23 11:16:40 -05:00
Ryan Libby	93dba42c0e	ffs: quiet -Wstrict-prototypes Reviewed by: kib, markj, mckusick Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D27558	2020-12-11 22:51:57 +00:00
Kirk McKusick	bb3c01ec79	Document the BA_CLRBUF flag used in ufs and ext2fs filesystems. Suggested by: kib MFC after: 3 days Sponsored by: Netflix	2020-12-06 20:50:21 +00:00
Konstantin Belousov	2c7ada9917	ufs: handle two more cases of possible VNON vnode returned from VFS_VGET(). Reported by: kevans Reviewed by: mckusick, mjg Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27457	2020-12-06 18:09:14 +00:00
Konstantin Belousov	21a45add50	ffs: do not read full direct blocks if they are going to be overwritten. BA_CLRBUF specifies that existing context of the block will be completely overwritten by caller, so there is no reason to spend io fetching existing data. We do the same for indirect blocks. Reported by: tmunro Reviewed by: mckusick, tmunro Tested by: pho, tmunro Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D27353	2020-11-30 17:03:26 +00:00
Konstantin Belousov	cd85379104	Make MAXPHYS tunable. Bump MAXPHYS to 1M. Replace MAXPHYS by runtime variable maxphys. It is initialized from MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys. Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1. The +1 for pbufs allow several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, esp. when such buffers come from userspace (). Overall, we save significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to desired high value. Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except a place which initialize maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embeded structures, get private MAXPHYS-like constant; their convertion is out of scope for this work. Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis, where either submitted by, or based on changes by mav. Suggested by: mav () Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27225	2020-11-28 12:12:51 +00:00
Konstantin Belousov	92bcefd1d2	clear_inodedeps: handle ERELOOKUP from ffs_syncvnode(). Reported and tested by: pho Sponsored by: The FreeBSD Foundation	2020-11-26 18:03:24 +00:00
Konstantin Belousov	07ef907f6e	ffs_softdep.c: get_parent_vp(): Fix bp lock leak when inum inode was already freed. Reported by: markj, pho Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2020-11-25 17:12:21 +00:00
Konstantin Belousov	8a1509e442	Handle LoR in flush_pagedep_deps(). When operating in SU or SU+J mode, ffs_syncvnode() might need to instantiate other vnode by inode number while owning syncing vnode lock. Typically this other vnode is the parent of our vnode, but due to renames occuring right before fsync (or during fsync when we drop the syncing vnode lock, see below) it might be no longer parent. More, the called function flush_pagedep_deps() needs to lock other vnode while owning the lock for vnode which owns the buffer, for which the dependencies are flushed. This creates another instance of the same LoR as was fixed in softdep_sync(). Put the generic code for safe relocking into new SU helper get_parent_vp() and use it in flush_pagedep_deps(). The case for safe relocking of two vnodes with undefined lock order was extracted into vn helper vn_lock_pair(). Due to call sequence ffs_syncvnode()->softdep_sync_buf()->flush_pagedep_deps(), ffs_syncvnode() indicates with ERELOOKUP that passed vnode was unlocked in process, and can return ENOENT if the passed vnode reclaimed. All callers of the function were inspected. Because UFS namei lookups store auxiliary information about directory entry in in-memory directory inode, and this information is then used by UFS code that creates/removed directory entry in the actual mutating VOPs, it is critical that directory vnode lock is not dropped between lookup and VOP. For softdep_prelink(), which ensures that later link/unlink operation can proceed without overflowing the journal, calls were moved to the place where it is safe to drop processing VOP because mutations are not yet applied. Then, ERELOOKUP causes restart of the whole VFS operation (typically VFS syscall) at top level, including the re-lookup of the involved pathes. [Note that we already do the same restart for failing calls to vn_start_write(), so formally this patch does not introduce new behavior.] Similarly, unsafe calls to fsync in snapshot creation code were plugged. A possible view on these failures is that it does not make sense to continue creating snapshot if the snapshot vnode was reclaimed due to forced unmount. It is possible that relock/ERELOOKUP situation occurs in ffs_truncate() called from ufs_inactive(). In this case, dropping the vnode lock is not safe. Detect the situation with VI_DOINGINACT and reschedule inactivation by setting VI_OWEINACT. ufs_inactive() rechecks VI_OWEINACT and avoids reclaiming vnode is truncation failed this way. In ffs_truncate(), allocation of the EOF block for partial truncation is re-done after vnode is synced, since we cannot leave the buffer locked through ffs_syncvnode(). In collaboration with: pho Reviewed by: mckusick (previous version), markj Tested by: markj (syzkaller), pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26136	2020-11-14 05:30:10 +00:00
Konstantin Belousov	738ea0010b	Add ffs_inode_bwrite() helper. In collaboration with: pho Reviewed by: mckusick (previous version), markj Tested by: markj (syzkaller), pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26136	2020-11-14 05:19:59 +00:00
Konstantin Belousov	7b795aa3c0	Revert r367669 to re-commit with proper message	2020-11-14 05:19:44 +00:00
Konstantin Belousov	c0d2077f41	Add a framework that tracks exclusive vnode lock generation count for UFS. This count is memoized together with the lookup metadata in directory inode, and we assert that accesses to lookup metadata are done under the same lock generation as they were stored. Enabled under DIAGNOSTICS. UFS saves additional data for parent dirent when doing lookup (i_offset, i_count, i_endoff), and this data is used later by VOPs operating on dirents. If parent vnode exclusive lock is dropped and re-acquired between lookup and the VOP call, we corrupt directories. Framework asserts that corruption cannot occur that way, by tracking vnode lock generation counter. Updates to inode dirent members also save the counter, while users compare current and saved counters values. Also, fix a case in ufs_lookup_ino() where i_offset and i_count could be updated under shared lock. It is not a bug on its own since dvp i_offset results from such lookup cannot be used, but it causes false positive in the checker. In collaboration with: pho Reviewed by: mckusick (previous version), markj Tested by: markj (syzkaller), pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26136	2020-11-14 05:17:04 +00:00
Konstantin Belousov	61846fc4dc	Add a framework that tracks exclusive vnode lock generation count for UFS. This count is memoized together with the lookup metadata in directory inode, and we assert that accesses to lookup metadata are done under the same lock generation as they were stored. Enabled under DIAGNOSTICS. UFS saves additional data for parent dirent when doing lookup (i_offset, i_count, i_endoff), and this data is used later by VOPs operating on dirents. If parent vnode exclusive lock is dropped and re-acquired between lookup and the VOP call, we corrupt directories. Framework asserts that corruption cannot occur that way, by tracking vnode lock generation counter. Updates to inode dirent members also save the counter, while users compare current and saved counters values. Also, fix a case in ufs_lookup_ino() where i_offset and i_count could be updated under shared lock. It is not a bug on its own since dvp i_offset results from such lookup cannot be used, but it causes false positive in the checker. In collaboration with: pho Reviewed by: mckusick (previous version), markj Tested by: markj (syzkaller), pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26136	2020-11-14 05:10:39 +00:00
Mark Johnston	f44994874b	ffs: Clamp BIO_SPEEDUP length On 32-bit platforms, the computed size of the BIO_SPEEDUP requested by softdep_request_cleanup() may be negative when assigned to bp->b_bcount, which has type "long". Clamp the size to LONG_MAX. Also convert the unused g_io_speedup() to use an off_t for the magnitude of the shortage for consistency with softdep_send_speedup(). Reviewed by: chs, kib Reported by: pho Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27081	2020-11-11 13:48:07 +00:00
Conrad Meyer	e6790841f7	UFS2: Fix DoS due to corrupted extattrfile Prior versions of FreeBSD (11.x) may have produced a corrupt extattr file. (Specifically, r312416 accidentally fixed this defect by removing a strcpy.) CURRENT FreeBSD supports disk images from those prior versions of FreeBSD. Validate the internal structure as soon as we read it in from disk, to prevent these extattr files from causing invariants violations and DoS. Attempting to access the extattr portion of these files results in EINTEGRITY. At this time, the only way to repair files damaged in this way is to copy the contents to another file and move it over the original. PR: 244089 Reported by: Andrea Venturoli <ml AT netfence.it> Reviewed by: kib Discussed with: mckusick (earlier draft) Security: no Differential Revision: https://reviews.freebsd.org/D27010	2020-10-30 19:00:42 +00:00
Mateusz Guzik	4bfebc8d2c	cache: add cache_vop_mkdir and rename cache_rename to cache_vop_rename	2020-10-30 10:46:35 +00:00
Edward Tomasz Napierala	bce7ee9d41	Drop "All rights reserved" from all my stuff. This includes Foundation copyrights, approved by emaste@. It does not include files which carry other people's copyrights; if you're one of those people, feel free to make similar change. Reviewed by: emaste, imp, gbe (manpages) Differential Revision: https://reviews.freebsd.org/D26980	2020-10-28 13:46:11 +00:00

1 2 3 4 5 ...

2316 Commits