freebsd-skq

Author	SHA1	Message	Date
mjg	bcfa67ab8b	vfs: introduce v_irflag and make v_type smaller The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time. v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED. Reviewed by: kib, jeff Differential Revision: https://reviews.freebsd.org/D22715	2019-12-08 21:30:04 +00:00
mjg	4b9989aca8	vfs: clean up vputx a little 1. replace hand-rolled macros for operation type with enum 2. unlock the vnode in vput itself, there is no need to branch on it. existence of VPUTX_VPUT remains significant in that the inactive variant adds LK_NOWAIT to locking request. 3. remove the useless v_usecount assertion. a few lines above, the code checks if v_usecount > 0 and leaves. should the value be negative, refcount would fail. 4. the CTR return vnode %p to the freelist is incorrect as vdrop may find the vnode with holdcnt > 1. if the like should exist, it should be moved there 5. no need to error = 0 for everyone Reviewed by: kib, jeff (previous version) Differential Revision: https://reviews.freebsd.org/D22718	2019-12-08 21:13:07 +00:00
mjg	872f296f3c	vfs: factor out vnode destruction out of vdrop Sponsored by: The FreeBSD Foundation	2019-12-08 21:11:25 +00:00
jeff	389afb1898	Handle multiple clock interrupts simultaneously in sched_clock(). Reviewed by: kib, markj, mav Differential Revision: https://reviews.freebsd.org/D22625	2019-12-08 01:17:38 +00:00
kib	5f45f7a6f5	Only return EPERM from kill(-pid) when no process was signalled. As mandated by POSIX. Also clarify the kill(2) manpage. While there, restructure the code in killpg1() to use a helper which keeps the overall state of the process list iteration in the killpg1_ctx structure, later used to infer the error returned. Reported by: amdmi3 Reviewed by: jilles Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D22621	2019-12-07 18:07:49 +00:00
mjg	0a3ea4b564	vfs: clean up delmntque similarly to vdrop r355414	2019-12-07 12:56:24 +00:00
mjg	818ef82e15	vfs: catch vn_printf up with reality - add the missing VV_VMSIZEVNLOCK and VV_READLINK flags - add decoding v_mflag While here sort flags.	2019-12-07 12:55:58 +00:00
brooks	dfa2e15cbe	sysent: Reduce duplication and improve readability. Use the power of variables to avoid spelling out source and generated files too many times. The previous Makefiles were hard to read, hard to edit, and badly formatted. Reviewed by: kevans, emaste Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D22714	2019-12-06 23:59:23 +00:00
mav	30af7c4d0c	Make devstat_end_transaction_bio() count BIO_ORDERED. MFC after: 2 weeks	2019-12-06 18:39:05 +00:00
bz	b42975a154	Improve EPOCH_TRACE Two changes to EPOCH_TRACE: (1) add a sysctl to suppress the backtrace from epoch_trace_report(). Sometimes the log line for the recursion is enough and the backtrace massively spams the console. (2) In order to be able to go without the backtrace do not only print where the previous occurrence happened, but also where the current one happens. That way we have file:line information for both and can look at them without the need for getting line numbers from backtrace and a debugging tool. Reviewed by: glebius Sponsored by: Netflix (originally) Differential Revision: https://reviews.freebsd.org/D22641	2019-12-06 16:34:04 +00:00
mjg	b72734b537	sx: check for SX_LOCK_SHARED | SX_LOCK_WRITE_SPINNER when exclusive-locking First, this removes a spurious difference compared to rw locks. More importantly though this avoids a trip through sleepq code if the lock happens to be caught in this state.	2019-12-05 13:43:44 +00:00
mjg	3e04f4b855	vfs: remove 'active' variable from _vdrop No functional changes.	2019-12-05 13:40:10 +00:00
mav	0e1fa50f0d	Mark some more hot global variables with __read_mostly. MFC after: 1 week	2019-12-04 21:26:03 +00:00
rlibby	b95a02bb86	mbuf zones: take out the trash The mbuf zones were explicitly specifying the uma trash procedures on zcreate, conditionally on INVARIANTS, because that used to be necessary in order to get use-after-free checking for uma zones with non-empty constructors or destructors. After r355137 uma automatically invokes the trash constructor and destructor as long as no init and fini are specified. This now allows the mbuf zones to pass their constructors and destructors without needing to add on the uma trash procedures conditionally. Reviewed by: cem, jhb, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22583	2019-12-04 18:21:29 +00:00
jhb	0d8d23a6a3	Use uintptr_t instead of register_t * for the stack base. - Use ustringp for the location of the argv and environment strings and allow destp to travel further down the stack for the stackgap and auxv regions. - Update the Linux copyout_strings variants to move destp down the stack as was done for the native ABIs in r263349. - Stop allocating a space for a stack gap in the Linux ABIs. This used to hold translated system call arguments, but hasn't been used since r159992. Reviewed by: kib Tested on: amd64 (amd64, i386, linux64), i386 (i386, linux) Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D22501	2019-12-03 23:17:54 +00:00
mckusick	d137f58263	Currently the breadn_flags() and getblkx() interfaces are passed the vnode, logical block number, and size of data block that is being requested. They then use the VOP_BMAP function to calculate the mapping from logical block number to physical block number from which to access the data. This change expands the interface to also pass the physical block number in cases where the VOP_BMAP function may no longer work, for example when a file is being truncated. No functional change. Reviewed by: kib Tested by: Peter Holm Sponsored by: Netflix	2019-12-03 23:07:09 +00:00
jeff	5f3e7444d9	Use a precise bit count for the slab free items in UMA. This significantly shrinks embedded slab structures. Reviewed by: markj, rlibby (prior version) Differential Revision: https://reviews.freebsd.org/D22584	2019-12-02 22:44:34 +00:00
jeff	18bccfabd0	Fix the last few cases that grab without busy or valid. The grab functions must return the page in some held state for consistency elsewhere. Reviewed by: alc, kib, markj Differential Revision: https://reviews.freebsd.org/D22610	2019-12-02 22:38:25 +00:00
jeff	e7288d9732	Initialize the idle thread's lock sooner so it's not evaluated on every fork exit and we can rely on it elsewhere. Reviewed by: mav, kib, jhb, markj Differential Revision: https://reviews.freebsd.org/D22624	2019-12-02 22:35:45 +00:00
mjg	080ffac31b	lockmgr: remove more remnants of adaptive spinning Sponsored by: The FreeBSD Foundation	2019-12-01 00:35:08 +00:00
kevans	4e48e813a9	tty: implement TIOCNOTTY Generally, it's preferred that an application fork/setsid if it doesn't want to keep its controlling TTY, but it could be that a debugger is trying to steal it instead -- so it would hook in, drop the controlling TTY, then do some magic to set things up again. In this case, TIOCNOTTY is quite handy and still respected by at least OpenBSD, NetBSD, and Linux as far as I can tell. I've dropped the note about obsoletion, as I intend to support TIOCNOTTY as long as it doesn't impose a major burden. Reviewed by: bcr (manpages), kib Differential Revision: https://reviews.freebsd.org/D22572	2019-11-30 20:10:50 +00:00
mjg	c3fab6f99b	smp: cast the read in quiesce_all_critical through void * Fixes compilation on some 32-bit arm platforms. Sponsored by: The FreeBSD Foundation	2019-11-30 19:33:02 +00:00
mjg	5c9cf176a8	lockprof: use IPI-injected fences to fix hangs on stat dump and reset The previously used quiesce_all_cpus walks all CPUs and waits until curthread can run on them. Even on contemporary machines this becomes a significant problem under load when it can literally take minutes for the operation to complete. With the patch the stall is normally less than 1 second. Reviewed by: kib, jeff (previous version) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21740	2019-11-30 17:24:42 +00:00
mjg	6aadcd4686	Add a way to inject fences using IPIs A variant of this facility was already used by rmlocks where IPIs would enforce ordering. This allows to elide fences where they are rarely needed and the cost of IPI (should it be necessary) is cheaper. Reviewed by: kib, jeff (previous version) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21740	2019-11-30 17:22:10 +00:00
mjg	afeeba244b	devfs: introduce a per-dev lock to protect ->si_devsw This allows bumping threadcount without taking the global devmtx lock. In particular this eliminates contention on said lock while using bhyve with multiple vms. Reviewed by: kib Tested by: markj MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22548	2019-11-30 16:46:19 +00:00
kevans	6fa3c0bd17	tty_rel_gone: add locking assertion We already assert the lock is held later during tty_rel_free(), but it is arguably good form to clarify locking expectations here as well at the top-level that other drivers use.	2019-11-29 14:46:13 +00:00
kib	b4c6542df3	Add a VN_OPEN_INVFS flag. vn_open_cred() assumes that it is called from the top-level of a VFS syscall. Writers must call bwillwrite() before locking any VFS resource to wait for cleanup of dirty buffers. ZFS getextattr() and setextattr() VOPs do call vn_open_cred(), which results in wait for unrelated buffers while owning ZFS vnode lock (and ZFS does not use buffer cache). VN_OPEN_INVFS allows caller to skip bwillwrite. Note that ZFS is still incorrect there, because it starts write on an mp and locks a vnode while holding another vnode lock. Reported by: Willem Jan Withagen <wjw@digiware.nl> Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-11-29 14:02:32 +00:00
rlibby	75bc5dbcbe	ktls_session zone: don't need to specify uma trash The use of the uma trash procedures is automatic, there's no need to pass them explicitly here. Reviewed by: markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22582	2019-11-29 06:25:03 +00:00
kevans	77fb93e1b7	tty_pts: don't rely on tty header pollution for sys/mutex.h tty_pts.c relies on sys/tty.h for sys/mutex.h. Include it directly instead of relying on this pollution to ease the diff for anyone that wants to try converting the tty lock to anything other than a mutex.	2019-11-29 03:56:01 +00:00
jeff	a65d31ef2d	Handle large mallocs by going directly to kmem. Taking a detour through UMA does not provide any additional value. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D22563	2019-11-29 03:14:10 +00:00
jeff	6e09ead90c	Fix DEBUG_REDZONE build after r355169	2019-11-28 08:56:14 +00:00
hselasky	79dc3a05bf	Factor out check for mounted root file system. Differential Revision: https://reviews.freebsd.org/D22571 PR: 241639 MFC after: 1 week Sponsored by: Mellanox Technologies	2019-11-28 08:47:36 +00:00
jeff	049ad3955f	Garbage collect the mostly unused us_keg field. Use appropriately named union members in vm_page.h to store the zone and slab. Remove some nearby dead code. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D22564	2019-11-28 07:49:25 +00:00
kib	60d99c176d	Requested and tested by: kevans Reviewed by: kevans (previous version), markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22546	2019-11-27 20:33:53 +00:00
rlibby	b5630a819f	witness: sleepable rm locks are not sleepable in read mode There are two classes of rm lock, one "sleepable" and one not. But even a "sleepable" rm lock is only sleepable in write mode, and is non-sleepable when taken in read mode. Warn about sleepable rm locks in read mode as non-sleepable locks. Do this by defining a new lock operation flag, LOP_NOSLEEP, to indicate that a lock is non-sleepable despite what the LO_SLEEPABLE flag would indicate, and defining a new witness lock instance flag, LI_SLEEPABLE, to track the product of LO_SLEEPABLE and LOP_NOSLEEP on the lock instance. Reviewed by: markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22527	2019-11-27 01:54:39 +00:00
mjg	d21b67186e	cache: stop reusing .. entries on enter It almost never happens in practice anyway. With this eliminated ->nc_vp cannot change vnodes, removing an obstacle on the road to lockless lookup.	2019-11-27 01:21:42 +00:00
mjg	5e8cfe32e0	cache: fix numcache accounting on entry . entries are never created and .. can reuse existing entries, meaning the early count bump is both spurious and leading to overcounting in certain cases.	2019-11-27 01:20:55 +00:00
mjg	a93204e206	cache: hide "doingcache" behind DEBUG_CACHE	2019-11-27 01:20:21 +00:00
hselasky	abea55f57f	Fix panic when loading kernel modules before root file system is mounted. Make sure the rootvnode is always NULL checked. Differential Revision: https://reviews.freebsd.org/D22545 PR: 241639 MFC after: 1 week Sponsored by: Mellanox Technologies	2019-11-26 12:20:44 +00:00
oshogbo	4ae67fb7ab	procdesc: allow to collect status through wait(2) if process is traced A debugger like truss(1) depends on the wait(2) syscall. This syscall waits for ALL children. When it is waiting for ALL children, the children created by process descriptors are not returned. This behavior was introduced because we want to implement libraries which may pdfork(2). The behavior of process descriptors breaks truss(1) because it will not be able to collect the status of processes with process descriptors. To address this problem the status is returned to the parent when the child is traced. While the process is traced the debugger is the new parent. In case the original parent and debugger are the same process it means the debugger explicitly used pdfork() to create the child. In that case the debugger should be using kqueue()/pdwait() instead of wait(). Add test case to verify that. The test case was implemented by markj@. Reviewed by: kib, markj Discussed with: jhb MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D20362	2019-11-25 18:33:21 +00:00
rlibby	32e5f65de4	sysctl sysctls: wire old buf before output with sysctl lock Several sysctl sysctls output to a user buffer while holding a non-sleepable lock that protects the sysctl topology. They need to wire the output buffer, or else they may try to sleep on a page fault. Reviewed by: cem, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22528	2019-11-25 07:38:27 +00:00
kib	404183d739	Record part of the owner struct thread pointer into busy_lock. Record as many bits from curthread into busy_lock as fit. Low bits for struct thread * representation are zero due to struct and zone alignment, and they leave space for busy flags (perhaps except statically allocated thread0). Upper bits are not very interesting for assert, and in most practical situations the recorded value should allow to manually identify the owner with certainty. Assert that unbusy is performed by the owner, except in a few places where unbusy is done in the io completion handler. For this case, add _unchecked variants of asserts and unbusy primitives. Reviewed by: markj (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D22298	2019-11-24 19:12:23 +00:00
imp	3193ed06d2	Add a warning about Giant Locked devices Add a warning when a device registers with devfs and requests D_NEEDGIANT. The warning says the device will go away before 13.0. This is needed to flush out the devices in the tree that are still Giant locked. This warning, or some variant of it, should have gone into the tree a long time ago... The intention is to require all devices be converted to not use automatic giant in this way, or remove any such devices that remain that we don't have the hardware to test a conversion of. kbd so far is the only device that can't leave the tree, yet needs something sensible done to avoid the auto giant lock (even if it is just doing the wrapping itself). There may be others added to this list... Any discussions of this topic will take place on arch@.	2019-11-23 23:57:26 +00:00
cem	35d496b56a	Add explicit SI_SUB_EPOCH Add explicit SI_SUB_EPOCH, after SI_SUB_TASKQ and before SI_SUB_SMP (EARLY_AP_STARTUP). Rename existing "SI_SUB_TASKQ + 1" to SI_SUB_EPOCH. epoch(9) consumers cannot epoch_alloc() before SI_SUB_EPOCH:SI_ORDER_SECOND, but likely should allocate before SI_SUB_SMP. Prior to this change, consumers (well, epoch itself, and net/if.c) just open-coded the SI_SUB_TASKQ + 1 order to match epoch.c, but this was fragile. Reviewed by: mmacy Differential Revision: https://reviews.freebsd.org/D22503	2019-11-22 23:23:40 +00:00
glebius	63e627ce4f	cc_ktr_event_name is used only with KTR	2019-11-21 23:55:43 +00:00
mav	7484143fd8	Add variant of root_mount_hold() without allocation. It allows to use this KPI in non-sleepable contexts. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2019-11-21 21:59:35 +00:00
andrew	d5bfef0bc3	Disable KCSAN within a panic. The kernel is single threaded at this point and the panic is more important. Sponsored by: DARPA, AFRL	2019-11-21 13:59:01 +00:00
andrew	e95c204297	Add kcsan_md_unsupported from NetBSD. It's used to ignore virtual addresses that may have a different physical address depending on the CPU. Sponsored by: DARPA, AFRL	2019-11-21 13:22:23 +00:00
andrew	34537aa902	Fix the bus_space functions with KCSAN on arm64. Arm64 doesn't define the bus_space_set_multi_stream and bus_space_set_region_stream functions. Don't try to define them there. Sponsored by: DARPA, AFRL	2019-11-21 13:12:58 +00:00
andrew	6e5970c8f4	Port the NetBSD KCSAN runtime to FreeBSD. Update the NetBSD Kernel Concurrency Sanitizer (KCSAN) runtime to work in the FreeBSD kernel. It is a useful tool for finding data races between threads executing on different CPUs. This can be enabled by enabling KCSAN in the kernel config, or by using the GENERIC-KCSAN amd64 kernel. It works on amd64 and arm64, however the latter needs a compiler change to allow -fsanitize=thread that KCSAN uses. Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D22315	2019-11-21 11:22:08 +00:00

17006 Commits