freebsd-dev

Author	SHA1	Message	Date
Mateusz Guzik	a9a047bc87	vfs: handle doomed vnodes in vdefer_inactive vgone dooms the vnode while keeping VI_OWEINACT set and then drops the interlock. vputx can pick up the interlock and pass it to vdefer_inactive since the flag is set. The race is harmless, just don't defer anything as vgone will take care of it. Reported by: pho	2020-01-07 20:24:21 +00:00
Mateusz Guzik	c8b3463dd0	vfs: reimplement deferred inactive to use a dedicated flag (VI_DEFINACT) The previous behavior of leaving VI_OWEINACT vnodes on the active list without a hold count is eliminated. Hold count is kept and inactive processing gets explicitly deferred by setting the VI_DEFINACT flag. The syncer is then responsible for vdrop. Reviewed by: kib (previous version) Tested by: pho (in a larger patch, previous version) Differential Revision: https://reviews.freebsd.org/D23036	2020-01-07 15:56:24 +00:00
Mateusz Guzik	b7cc9d1847	vfs: trylock in vfs_msync and refactor the func - use LK_NOWAIT instead of calling VOP_ISLOCKED before deciding to lock - evaluate flags before looping over vnodes Reviewed by: kib Tested by: pho (in a larger patch, previous version) Differential Revision: https://reviews.freebsd.org/D23035	2020-01-07 15:44:19 +00:00
Mateusz Guzik	c92fe112a7	vfs: use a dedicated counter for free vnode recycling Otherwise vlrureclaim activitity is mixed in and it is hard to tell which vnodes got reclaimed.	2020-01-07 15:42:01 +00:00
Mateusz Guzik	cc2b586d69	vfs: prevent numvnodes and freevnodes re-reads when appropriate Otherwise in code like this: if (numvnodes > desiredvnodes) vnlru_free_locked(numvnodes - desiredvnodes, NULL); numvnodes can drop below desiredvnodes prior to the call and if the compiler generated another read the subtraction would get a negative value.	2020-01-07 04:34:03 +00:00
Mateusz Guzik	37fe521a6f	vfs: annotate numvnodes and vnode_free_list_mtx with __exclusive_cache_line	2020-01-07 04:30:49 +00:00
Mateusz Guzik	478368ca41	vfs: eliminate v_tag from struct vnode There was only one consumer and it was using it incorrectly. It is given an equivalent hack. Reviewed by: jeff Differential Revision: https://reviews.freebsd.org/D23037	2020-01-07 04:29:34 +00:00
Mateusz Guzik	a91190c63e	vfs: add a helper for allocating marker vnodes	2020-01-07 04:27:40 +00:00
Pawel Biernacki	91c4b68fa3	kern_sysctl: make sysctl.debug work as intended r136999 introduced SYSTCL_DEBUG but apparently "opt_sysctl.h" was never included making the option ignored. r322954 introduced sysctl.reuse_test with OID number equal to 0, effectively shadowing the very special sysctl.debug one. Use OID_AUTO as it doesn't need any special treatment. Reviewed by: kib (mentor) Approved by: kib (mentor) Differential Revision: https://reviews.freebsd.org/D23056	2020-01-06 19:47:59 +00:00
Mateusz Guzik	2e77cad11d	locks: add default delay struct Use it for all primitives. This makes everything fit in 8 bytes.	2020-01-05 12:48:19 +00:00
Mateusz Guzik	6b8dd26e7c	locks: convert delay times to u_short int is just a waste of space for this purpose.	2020-01-05 12:47:29 +00:00
Mateusz Guzik	d6ae918835	Mark mtxpool_sleep as read mostly, not frequently. The latter is not justified.	2020-01-05 12:46:35 +00:00
Kyle Evans	535b1df993	shm: correct KPI mistake introduced around memfd_create When file sealing and shm_open2 were introduced, we should have grown a new kern_shm_open2 helper that did the brunt of the work with the new interface while kern_shm_open remains the same. Instead, more complexity was introduced to kern_shm_open to handle the additional features and consumers had to keep changing in somewhat awkward ways, and a kern_shm_open2 was added to wrap kern_shm_open. Backpedal on this and correct the situation- kern_shm_open returns to the interface it had prior to file sealing being introduced, and neither function needs an initial_seals argument anymore as it's handled in kern_shm_open2 based on the shmflags.	2020-01-05 04:06:40 +00:00
Kyle Evans	58366f05c0	shmfd/mmap: restrict maxprot with MAP_SHARED + F_SEAL_WRITE If a write seal is set on a shared mapping, we must exclude VM_PROT_WRITE as the fd is effectively read-only. This was discovered by running devel/linux-ltp, which mmap's with acceptable protections specified then attempts to raise to PROT_READ\|PROT_WRITE with mprotect(2), which we allowed. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D22978	2020-01-05 03:15:16 +00:00
Mateusz Guzik	7e2ea5772b	vfs: factor out avoidable branches in _vn_lock	2020-01-05 01:00:11 +00:00
Mateusz Guzik	8dbc63520c	vfs: drop thread argument from vinactive	2020-01-05 00:59:47 +00:00
Mateusz Guzik	867fd730c6	vfs: patch up vnode count assertions to report found value	2020-01-05 00:59:16 +00:00
Jeff Roberson	727c691857	Use a separate lock for the zone and keg. This provides concurrency between populating buckets from the slab layer and fetching full buckets from the zone layer. Eliminate some nonsense locking patterns where we lock to fetch a single variable. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D22828	2020-01-04 03:15:34 +00:00
Mateusz Guzik	b249ce48ea	vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427	2020-01-03 22:29:58 +00:00
Kyle Evans	3a22f09cbf	emulated atomic64: disable interrupts as the lock mechanism on !SMP Reviewed by: jhibbits, bdragon Differential Revision: https://reviews.freebsd.org/D23015	2020-01-03 18:29:20 +00:00
Brandon Bergren	9aafc7c052	[PowerPC] [MIPS] Implement 32-bit kernel emulation of atomic64 operations This is a lock-based emulation of 64-bit atomics for kernel use, split off from an earlier patch by jhibbits. This is needed to unblock future improvements that reduce the need for locking on 64-bit platforms by using atomic updates. The implementation allows for future integration with userland atomic64, but as that implies going through sysarch for every use, the current status quo of userland doing its own locking may be for the best. Submitted by: jhibbits (original patch), kevans (mips bits) Reviewed by: jhibbits, jeff, kevans Differential Revision: https://reviews.freebsd.org/D22976	2020-01-02 23:20:37 +00:00
Konstantin Belousov	478ca4b004	Rename umtxq_check_susp() to thread_check_susp() and make it usable outside of kern_umtx.c. To be used in several future changes. Discussed with: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week	2020-01-02 22:13:59 +00:00
Konstantin Belousov	8f4d74eb1e	Style: remove trailing spaces/tabs. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2020-01-02 22:07:03 +00:00
Pawel Biernacki	e8cdbb4815	sysctl: hide 2.x era compat node r23081 introduced kern.dummy oid as a semi ABI compat for kern.maxsockbuf that was moved to a new namespace. It never functioned as an alias of any kind and was just returning 0 unconditionally, hence it was probably provided to keep some 3rd party programmes happy about sysctl(3) not reporting an error because of non-existing oid. After nearly 23 years it seems reasonable to just hide it from sysctl(8) list not to cause unnecessary confusion as for its purpose. Reported by: Antranig Vartanian <antranigv@freebsd.am> Reviewed by: kib (mentor) Approved by: kib (mentor) Differential Revision: https://reviews.freebsd.org/D22982	2020-01-02 01:23:43 +00:00
Mateusz Guzik	57db0e12c8	vfs: drop an always-false check from vlrureclaim The vnode gets held few lines prior, making the VI_FREE condition illegal.	2020-01-01 22:51:17 +00:00
Alexander V. Chernikov	c83dda362e	Split gigantic rtsock route_output() into smaller functions. Amount of changes to the original code has been intentionally minimised to ease diffing. The changes are mostly mechanical, with the following exceptions: * lltable handler is now called directly based of RTF_LLINFO flag presense. * "report" logic for updating rtm in RTM_GET/RTM_DELETE has been simplified, fixing several potential use-after-free cases in rt_addrinfo. * llable asserts has been replaced with error-returning, preventing kernel crashes when lltable gw af family is invalid (root required). MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D22864	2019-12-31 17:26:53 +00:00
Alexander Motin	024932aae9	Use atomic for start_count in devstat_start_transaction(). Combined with earlier nstart/nend removal it allows to remove several locks from request path of GEOM and few other places. It would be cool if we had more SMP-friendly statistics, but this helps too. Sponsored by: iXsystems, Inc.	2019-12-30 03:13:38 +00:00
Mark Johnston	9f5632e6c8	Remove page locking for queue operations. With the previous reviews, the page lock is no longer required in order to perform queue operations on a page. It is also no longer needed in the page queue scans. This change effectively eliminates remaining uses of the page lock and also the false sharing caused by multiple pages sharing a page lock. Reviewed by: jeff Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D22885	2019-12-28 19:04:00 +00:00
Justin Hibbits	741dfd86b3	Fix the powerpc copyout fixup from r356113 Summary: r356113 used an older patch, which predated the freebsd_copyout_auxargs() addition. Fix this by using a private powerpc_copyout_auxargs() instead, and keep it private to powerpc, not in MI files. Reviewed by: kib, bdragon Differential Revision: https://reviews.freebsd.org/D22935	2019-12-27 17:38:25 +00:00
Mateusz Guzik	3983dc32d7	Plug a warning in read-mostly spinlocks reported by gcc.	2019-12-27 13:37:19 +00:00
Mateusz Guzik	eb9764615d	vfs: remove production kernel checks and mp == NULL support from vdrop 1. The only place in the tree which calls getnewvnode with mp == NULL does it for vp_crossmp which will never execute this codepath. Any vnode which legally has ->v_mount == NULL is also doomed, which once more wont execute this code. 2. Remove an assertion for v_holdcnt from production kernels. It gets taken care of by refcount macros in debug kernels. Any code which would want to pass NULL mp can construct a fake one instead. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D22722	2019-12-27 11:26:12 +00:00
Mateusz Guzik	1f162fef76	Add read-mostly sleepable locks To be used when like rmlocks, except when sleeping for readers needs to be allowed. See the manpage for more information. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D22823	2019-12-27 11:19:57 +00:00
Justin Hibbits	6b45727338	Fix the build from r356113. Types had changed from when the patch was first created, and a final build was not done pre-commit.	2019-12-27 04:52:17 +00:00
Justin Hibbits	adea0d6368	Eliminate the last MI difference in AT_* definitions (for powerpc). Summary: As a transition aide, implement an alternative elfN_freebsd_fixup which is called for old powerpc binaries. Similarly, add a translation to rtld to convert old values to new ones (as expected by a new rtld). Translation of old<->new values is incomplete, but sufficient to allow an installworld of a new userspace from an old one when a new kernel is running. Test Plan: Someone needs to see how a new kernel/rtld/libc works with an old binary. If if works we can probalby ship this. If not we probalby need some more compat bits. Submitted by: brooks Reviewed by: jhibbits Differential Revision: https://reviews.freebsd.org/D20799	2019-12-27 04:07:03 +00:00
Conrad Meyer	f3bae413e9	random(9): Deprecate random(9), remove meaningless srandom(9) srandom(9) is meaningless on SMP systems or any system with, say, interrupts. One could never rely on random(9) to produce a reproducible sequence of outputs on the basis of a specific srandom() seed because the global state was shared by all kernel contexts. As such, removing it is literally indistinguishable to random(9) consumers (as compared with retaining it). Mark random(9) as deprecated and slated for quick removal. This is not to say we intend to remove all fast, non-cryptographic PRNG(s) in the kernel. It/they just won't be random(9), as it exists today, in either name or implementation. Before random(9) is removed, a replacement will be provided and in-tree consumers will be converted. Note that despite the name, the random(9) interface does not bear any resemblance to random(3). Instead, it is the same crummy 1988 Park-Miller LCG used in libc rand(3).	2019-12-26 19:41:09 +00:00
Conrad Meyer	af00898b5d	gone_in(9): Trivial string grammar and style cleanups	2019-12-26 18:25:07 +00:00
Kyle Evans	f46412c021	kern_cons: add a stub kbdinit for configs with no keyboard/console drivers A weak symbol here is decidedly cleaner than any #ifdef soup or relocating kbdinit, the former leading to maintenance required on addition of any console/keyboard drivers and the latter pushing kbd init bits away from where they're used.	2019-12-26 15:47:19 +00:00
Kyle Evans	3ed7166aca	kbd: merge linker set drivers into standard kbd driver list This leads to the revert of r355806; this reduces duplication in keyboard registration and driver switch lookup and leaves us with one authoritative source for currently registered drivers. The reduced duplication later is nice as we have more procedure involved in keyboard setup. keyboard_driver->flags is used to more quickly detect bogus adds/removes. From KPI consumers' perspective, nothing changes- kbd_add_driver of an already-registered driver will succeed, and a single kbd_delete_driver will later remove it as expected. In contrast to historical behavior, kbd_delete_driver on a driver registered via linker set will now actually de-register the driver so that it may not be used -- e.g. if kbdmux's MOD_LOAD handler fails somewhere. Detection for already-registered drivers in kbd_add_driver has improved, as the previous SLIST_NEXT(driver) != NULL check would not have caught a driver that's at the tail end. kbdinit is now called from cninit() rather than via SYSINIT so that keyboard drivers are available as early as console drivers. This is particularly important as cnprobe will, in both syscons and vt, attempt to do any early configuration of keyboard drivers built-in (see: kbd_configure). Reviewed by: imp (earlier version, pre-cninit change) Differential Revision: https://reviews.freebsd.org/D22835	2019-12-26 15:21:34 +00:00
Conrad Meyer	fea73412a0	sleep(9), sleepqueue(9): const'ify wchan pointers _sleep(9), wakeup(9), sleepqueue(9), et al do not dereference or modify the channel pointers provided in any way; they are merely used as intptrs into a dictionary structure to match waiters with wakers. Correctly annotate this such that _sleep() and wakeup() may be used on const pointers without invoking ugly patterns like __DECONST(). Plumb const through all of the underlying sleepqueue bits. No functional change. Reviewed by: rlibby Discussed with: kib, markj Differential Revision: https://reviews.freebsd.org/D22914	2019-12-24 16:19:33 +00:00
Brandon Bergren	7821a820d0	[PowerPC] Implement Secure-PLT jump table processing for ppc32. Due to clang and LLD's tendency to use a PLT for builtins, and as they don't have full support for EABI, we sometimes have to deal with a PLT in .ko files in a clang-built kernel. As such, augment the in-kernel linker to support jump table processing. As there is no particular reason to support lazy binding in kernel modules, only implement Secure-PLT immediate binding. As part of these changes, add elf_cpu_parse_dynamic() to the MD API of the in-kernel linker (except on platforms that use raw object files.) The new function will allow MD code to act on MD tags in _DYNAMIC. Use this new function in the PowerPC MD code to ensure BSS-PLT modules using PLT will be rejected during insertion, and to poison the runtime resolver to ensure we get a clear panic reason if a call is made to the resolver. Reviewed by: jhibbits Differential Revision: https://reviews.freebsd.org/D22608	2019-12-24 15:56:24 +00:00
Conrad Meyer	e30f025ff9	kern_synch: Fix some UB It is UB to evaluate pointer comparisons when pointers do not point within the same object. Instead, convert the pointers to numbers and compare the numbers. Reported by: kib Discussed with: rlibby	2019-12-24 06:08:29 +00:00
Konstantin Belousov	52f3524cfd	Do not use waitable allocation of pbuf when creating cluster for write. Previously just ensuring that we do not sleep when clustering for md(4) vnode was enough. Now, with the switch of the pbuf allocator to uma and completely broken per-subsystem pbuf limits, it might cause unbounded sleep even for non-md(4) vnodes. Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22899	2019-12-23 20:15:19 +00:00
Philip Paeps	1b6dd6d772	Print upper 32 bits in devmap table entries If the devmap entry uses the upper 32 bits they wouldn't be printed in devmap_dump_table(). This fixes that. Submitted by: Nicholas O'Brien <nickisobrien_gmail.com> Sponsored by: Axiado	2019-12-20 03:40:53 +00:00
Mark Johnston	5d3bd9e242	Fix SIGINFO stack collection to ignore threads with swapped-out stacks. We by definition cannot trace the stack of such a thread. Also remove a redundant stack_zero() call in the SIGINFO handler, the stack structure is cleared by the MD stack_capture(). Sponsored by: The FreeBSD Foundation	2019-12-19 19:34:25 +00:00
Jeff Roberson	d8d5f03610	Fix a bug in r355784. I missed a sched_add() call that needed to reacquire the thread lock. Reported by: mjg	2019-12-19 18:22:11 +00:00
Hans Petter Selasky	cc79ea3a26	Restore important comment in RCU/EPOCH support in FreeBSD after r355784. Sponsored by: Mellanox Technologies	2019-12-18 09:30:32 +00:00
Mateusz Guzik	6fa079fc3f	vfs: flatten vop vectors This eliminates the following loop from all VOP calls: while(vop != NULL && \ vop->vop_spare2 == NULL && vop->vop_bypass == NULL) vop = vop->vop_default; Reviewed by: jeff Tesetd by: pho Differential Revision: https://reviews.freebsd.org/D22738	2019-12-16 00:06:22 +00:00
Mateusz Guzik	3fd19ce7a5	mtx: eliminate recursion support from thread lock Now that it is not used after schedlock changes got merged. Note the unlock routine temporarily still checks for it on account of just using regular spin unlock. This is a prelude towards a general clean up.	2019-12-16 00:04:33 +00:00
Jeff Roberson	686bcb5c14	schedlock 4/4 Don't hold the scheduler lock while doing context switches. Instead we unlock after selecting the new thread and switch within a spinlock section leaving interrupts and preemption disabled to prevent local concurrency. This means that mi_switch() is entered with the thread locked but returns without. This dramatically simplifies scheduler locking because we will not hold the schedlock while spinning on blocked lock in switch. This change has not been made to 4BSD but in principle it would be more straightforward. Discussed with: markj Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D22778	2019-12-15 21:26:50 +00:00
Jeff Roberson	1c81a87efd	schedlock 3/4 Eliminate lock recursion from turnstiles. This was simply used to avoid tracking the top-level turnstile lock. explicitly check for it before picking up and dropping locks. Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D22746	2019-12-15 21:19:41 +00:00

1 2 3 4 5 ...

17066 Commits