freebsd-dev

Author	SHA1	Message	Date
Jeff Roberson	af00971419	Handle pagein clustering in vm_page_grab_valid() so that it can be used by exec_map_first_page(). This will also enable pagein clustering for other interested consumers (tmpfs, md, etc). Discussed with: alc Approved by: kib Differential Revision: https://reviews.freebsd.org/D22731	2019-12-15 02:00:32 +00:00
Doug Moore	9f70442a04	Simplify the processing of a leaf mask to find big-enough ranges of set bits, by storing and modifying the complement of the original leaf mask, and by avoiding some unnecessary intermediate variables in computing the shift amounts. The logic is similar to what has recently been committed to sys/sys/bitstring.h. Compute better hint updates for the case when the cursor starts in mid-leaf, and eliminates some otherwise viable solutions. Assume the worst case, that all the eliminated offsets could have been solutions, and you can still compute a better hint than we use now. Eliminate some unnecessary conditional control flow. Approved by: alc Tested by: pho Differential Revision: https://reviews.freebsd.org/D22666	2019-12-14 19:44:42 +00:00
Mateusz Guzik	6f836483ec	Remove the useless return value from proc_set_cred	2019-12-14 00:43:17 +00:00
John Baldwin	4b28d96e5d	Remove the deprecated timeout(9) interface. All in-tree consumers have been converted to callout(9). Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D22602	2019-12-13 21:03:12 +00:00
Warner Losh	b832a7e505	Create new wrapper function: bus_delayed_attach_children() Delay the attachment of children, when requested, until after interrupts are running. This is often needed to allow children to run transactions on i2c or spi busses. It's a common enough idiom that it will be useful to have its own wrapper. Reviewed by: ian Differential Revision: https://reviews.freebsd.org/D21465	2019-12-13 19:39:33 +00:00
John Baldwin	bf2276f378	Use callout(9) instead of deprecated timeout(9) for fail points. Allocate the callout structure on-demand from fail_point_use_timeout_path() since most fail points do not use timeouts. Reviewed by: markj (earlier version), cem Differential Revision: https://reviews.freebsd.org/D22599	2019-12-13 19:26:04 +00:00
Edward Tomasz Napierala	34ad5ac242	Add kern_kill() and use it in Linuxulator. It's just a cleanup, no functional changes. Reviewed by: kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22645	2019-12-13 18:44:02 +00:00
Edward Tomasz Napierala	be2cfdbc86	Add kern_getsid() and use it in Linuxulator; no functional changes. Reviewed by: kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22647	2019-12-13 18:39:36 +00:00
Ryan Libby	9825eadf2c	bitset: rename confusing macro NAND to ANDNOT s/BIT_NAND/BIT_ANDNOT/, and for CPU and DOMAINSET too. The actual implementation is "and not" (or "but not"), i.e. A but not B. Fortunately this does appear to be what all existing callers want. Don't supply a NAND (not (A and B)) operation at this time. Discussed with: jeff Reviewed by: cem Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22791	2019-12-13 09:32:16 +00:00
Conrad Meyer	cd5650407e	kern/subr_unit: Rip srandomdev, random(3) out of dead code The simulation cannot be reproduced, so the value of using a deterministic PRNG like random(3) is dubious. The number of repetitions used in the sample isn't a problem for the Chacha implementation of arc4random we have today. (Also, no one actually runs this code; it was provided as an example of the work the author did validating the implementation. It's not even test code.)	2019-12-13 04:48:20 +00:00
Rick Macklem	ea9a16b252	r355677 requires that vop_stdioctl() be global so it can be called from NFS. r355677 modified the NFS client so that it does lseek(SEEK_DATA/SEEK_HOLE) for NFSv4.2, but calls vop_stdioctl() otherwise. As such, vop_stdioctl() needs to be a global function. Missed during the code merge for r355677.	2019-12-13 00:14:12 +00:00
Edward Tomasz Napierala	d6fee74a0c	Add kern_sync(9), and make kernel code call it instead of going via sys_sync(2). Minor cleanup, no functional changes. Reviewed by: kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D19366	2019-12-12 18:45:31 +00:00
Mark Johnston	7789ab32b3	Rename tdq_ipipending and clear it in sched_switch(). This fixes a regression after r355311. Specifically, sched_preempt() may trigger a context switch by calling thread_lock(), since thread_lock() calls critical_exit() in its slow path and the interrupted thread may have already been marked for preemption. This would happen before tdq_ipipending is cleared, blocking further preemption IPIs. The CPU can be left in this state indefinitely if the interrupted thread migrates. Rename tdq_ipipending to tdq_owepreempt. Any switch satisfies a remote preemption request, so clear tdq_owepreempt in sched_switch() instead of sched_preempt() to avoid subtle problems of the sort described above. Reviewed by: jeff, kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22758	2019-12-12 02:43:24 +00:00
Mateusz Guzik	c8b29d1212	vfs: locking primitives which elide ->v_vnlock and shared locking disablement Both of these features are not needed by many consumers and result in avoidable reads which in turn puts them on profiles due to cache-line ping ponging. On top of that the current lockmgr entry point is slower than necessary single-threaded. As an attempted clean up preparing for other changes, provide new routines which don't support any of the aforementioned features. With these patches in place vop_stdlock and vop_stdunlock disappear from flamegraphs during -j 104 buildkernel. Reviewed by: jeff (previous version) Tested by: pho Differential Revision: https://reviews.freebsd.org/D22665	2019-12-11 23:11:21 +00:00
Mateusz Guzik	55eb92db8d	fd: static-ize and devolatile openfiles Almost all access is using atomics. The only read is sysctl which should use a whole-int-at-a-time friendly read internally.	2019-12-11 23:09:12 +00:00
Andriy Gapon	64ebbdd54d	add a sanity check to the system call registration code A system call number should be at least reserved. We do not expect an attempt to register a fixed number system call when nothing at all is known about it. MFC after: 3 weeks Sponsored by: Panzura	2019-12-11 15:52:29 +00:00
John Baldwin	a8a03706fb	Add a callout_func_t typedef for functions used with callout_*(). This typedef is the same as timeout_t except that it is in the callout namespace and header. Use this typedef in various places of the callout implementation that were either using the raw type or timeout_t. While here, add <sys/callout.h> to the manpage. Reviewed by: kib, imp MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D22751	2019-12-10 21:58:30 +00:00
Mateusz Guzik	ff4486e827	vfs: refactor vhold and vdrop No functional changes.	2019-12-10 00:08:05 +00:00
John Baldwin	d8010b1175	Copy out aux args after the argument and environment vectors. Partially revert r354741 and r354754 and go back to allocating a fixed-size chunk of stack space for the auxiliary vector. Keep sv_copyout_auxargs but change it to accept the address at the end of the environment vector as an input stack address and no longer allocate room on the stack. It is now called at the end of copyout_strings after the argv and environment vectors have been copied out. This should fix a regression in r354754 that broke the stack alignment for newer Linux amd64 binaries (and probably broke Linux arm64 as well). Reviewed by: kib Tested on: amd64 (native, linux64 (only linux-base-c7), and i386) Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D22695	2019-12-09 19:17:28 +00:00
Mateusz Guzik	abd80ddb94	vfs: introduce v_irflag and make v_type smaller The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time. v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED. Reviewed by: kib, jeff Differential Revision: https://reviews.freebsd.org/D22715	2019-12-08 21:30:04 +00:00
Mateusz Guzik	791a24c7ea	vfs: clean up vputx a little 1. replace hand-rolled macros for operation type with enum 2. unlock the vnode in vput itself, there is no need to branch on it. existence of VPUTX_VPUT remains significant in that the inactive variant adds LK_NOWAIT to locking request. 3. remove the useless v_usecount assertion. few lines above the checks if v_usecount > 0 and leaves. should the value be negative, refcount would fail. 4. the CTR return vnode %p to the freelist is incorrect as vdrop may find the vnode with holdcnt > 1. if the like should exist, it should be moved there 5. no need to error = 0 for everyone Reviewed by: kib, jeff (previous version) Differential Revision: https://reviews.freebsd.org/D22718	2019-12-08 21:13:07 +00:00
Mateusz Guzik	fd6e0c43a6	vfs: factor out vnode destruction out of vdrop Sponsored by: The FreeBSD Foundation	2019-12-08 21:11:25 +00:00
Jeff Roberson	c3cccf95bf	Handle multiple clock interrupts simultaneously in sched_clock(). Reviewed by: kib, markj, mav Differential Revision: https://reviews.freebsd.org/D22625	2019-12-08 01:17:38 +00:00
Konstantin Belousov	0cc9fb7551	Only return EPERM from kill(-pid) when no process was signalled. As mandated by POSIX. Also clarify the kill(2) manpage. While there, restructure the code in killpg1() to use a helper which keeps the overall state of the process list iteration in the killpg1_ctx structure, later used to infer the error returned. Reported by: amdmi3 Reviewed by: jilles Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D22621	2019-12-07 18:07:49 +00:00
Mateusz Guzik	12e483e5f7	vfs: clean up delmntque similarly to vdrop r355414	2019-12-07 12:56:24 +00:00
Mateusz Guzik	4f4d9a086a	vfs: catch vn_printf up with reality - add the missing VV_VMSIZEVNLOCK and VV_READLINK flags - add decoding v_mflag While here sort flags.	2019-12-07 12:55:58 +00:00
Brooks Davis	af796bfa71	sysent: Reduce duplication and improve readability. Use the power of variables to avoid spelling out source and generated files too many times. The previous Makefiles were hard to read, hard to edit, and badly formatted. Reviewed by: kevans, emaste Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D22714	2019-12-06 23:59:23 +00:00
Alexander Motin	cb847b8152	Make devstat_end_transaction_bio() count BIO_ORDERED. MFC after: 2 weeks	2019-12-06 18:39:05 +00:00
Bjoern A. Zeeb	173c062a56	Improve EPOCH_TRACE Two changes to EPOCH_TRACE: (1) add a sysctl to suppress the backtrace from epoch_trace_report(). Sometimes the log line for the recursion is enough and the backtrace massively spams the console. (2) In order to be able to go without the backtrace do not only print where the previous occurrence happened, but also where the current one happens. That way we have file:line information for both and can look at them without the need for getting line numbers from backtrace and a debugging tool. Reviewed by: glebius Sponsored by: Netflix (originally) Differential Revision: https://reviews.freebsd.org/D22641	2019-12-06 16:34:04 +00:00
Mateusz Guzik	befd3e35b3	sx: check for SX_LOCK_SHARED | SX_LOCK_WRITE_SPINNER when exclusive-locking First, this removes a spurious difference compared to rw locks. More importantly though this avoids a trip through sleepq code if the lock happens to be caught in this state.	2019-12-05 13:43:44 +00:00
Mateusz Guzik	3eeb8a1fba	vfs: remove 'active' variable from _vdrop No functional changes.	2019-12-05 13:40:10 +00:00
Alexander Motin	61322a0a8a	Mark some more hot global variables with __read_mostly. MFC after: 1 week	2019-12-04 21:26:03 +00:00
Ryan Libby	30be9685a3	mbuf zones: take out the trash The mbuf zones were explicitly specifying the uma trash procedures on zcreate, conditionally on INVARIANTS, because that used to be necessary in order to get use-after-free checking for uma zones with non-empty constructors or destructors. After r355137 uma automatically invokes the trash constructor and destructor as long as no init and fini are specified. This now allows the mbuf zones to pass their constructors and destructors without needing to add on the uma trash procedures conditionally. Reviewed by: cem, jhb, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22583	2019-12-04 18:21:29 +00:00
John Baldwin	31174518d2	Use uintptr_t instead of register_t * for the stack base. - Use ustringp for the location of the argv and environment strings and allow destp to travel further down the stack for the stackgap and auxv regions. - Update the Linux copyout_strings variants to move destp down the stack as was done for the native ABIs in r263349. - Stop allocating a space for a stack gap in the Linux ABIs. This used to hold translated system call arguments, but hasn't been used since r159992. Reviewed by: kib Tested on: amd64 (amd64, i386, linux64), i386 (i386, linux) Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D22501	2019-12-03 23:17:54 +00:00
Kirk McKusick	d00066a5f9	Currently the breadn_flags() and getblkx() interfaces are passed the vnode, logical block number, and size of data block that is being requested. They then use the VOP_BMAP function to calculate the mapping from logical block number to physical block number from which to access the data. This change expands the interface to also pass the physical block number in cases where the VOP_BMAP function may no longer work, for example when a file is being truncated. No functional change. Reviewed by: kib Tested by: Peter Holm Sponsored by: Netflix	2019-12-03 23:07:09 +00:00
Jeff Roberson	9b78b1f433	Use a precise bit count for the slab free items in UMA. This significantly shrinks embedded slab structures. Reviewed by: markj, rlibby (prior version) Differential Revision: https://reviews.freebsd.org/D22584	2019-12-02 22:44:34 +00:00
Jeff Roberson	4504268a1b	Fix the last few cases that grab without busy or valid. The grab functions must return the page in some held state for consistency elsewhere. Reviewed by: alc, kib, markj Differential Revision: https://reviews.freebsd.org/D22610	2019-12-02 22:38:25 +00:00
Jeff Roberson	e15046952d	Initialize the idle thread's lock sooner so it's not evaluated on every fork exit and we can rely on it elsewhere. Reviewed by: mav, kib, jhb, markj Differential Revision: https://reviews.freebsd.org/D22624	2019-12-02 22:35:45 +00:00
Mateusz Guzik	5fe188b1e8	lockmgr: remove more remnants of adaptive spinning Sponsored by: The FreeBSD Foundation	2019-12-01 00:35:08 +00:00
Kyle Evans	1b50b999f9	tty: implement TIOCNOTTY Generally, it's preferred that an application fork/setsid if it doesn't want to keep its controlling TTY, but it could be that a debugger is trying to steal it instead -- so it would hook in, drop the controlling TTY, then do some magic to set things up again. In this case, TIOCNOTTY is quite handy and still respected by at least OpenBSD, NetBSD, and Linux as far as I can tell. I've dropped the note about obsoletion, as I intend to support TIOCNOTTY as long as it doesn't impose a major burden. Reviewed by: bcr (manpages), kib Differential Revision: https://reviews.freebsd.org/D22572	2019-11-30 20:10:50 +00:00
Mateusz Guzik	e0a1a1e6cb	smp: cast the read in quiesce_all_critical through void * Fixes compilation on some 32-bit arm platforms. Sponsored by: The FreeBSD Foundation	2019-11-30 19:33:02 +00:00
Mateusz Guzik	3ac2ac2e08	lockprof: use IPI-injected fences to fix hangs on stat dump and reset The previously used quiesce_all_cpus walks all CPUs and waits until curthread can run on them. Even on contemporary machines this becomes a significant problem under load when it can literally take minutes for the operation to complete. With the patch the stall is normally less than 1 second. Reviewed by: kib, jeff (previous version) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21740	2019-11-30 17:24:42 +00:00
Mateusz Guzik	5032fe17a2	Add a way to inject fences using IPIs A variant of this facility was already used by rmlocks where IPIs would enforce ordering. This allows to elide fences where they are rarely needed and the cost of IPI (should it be necessary) is cheaper. Reviewed by: kib, jeff (previous version) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21740	2019-11-30 17:22:10 +00:00
Mateusz Guzik	a02cab334c	devfs: introduce a per-dev lock to protect ->si_devsw This allows bumping threadcount without taking the global devmtx lock. In particular this eliminates contention on said lock while using bhyve with multiple vms. Reviewed by: kib Tested by: markj MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22548	2019-11-30 16:46:19 +00:00
Kyle Evans	9e387c3da2	tty_rel_gone: add locking assertion We already assert the lock is held later during tty_rel_free(), but it is arguably good form to clarify locking expectations here as well at the top-level that other drivers use.	2019-11-29 14:46:13 +00:00
Konstantin Belousov	fdc6b10d44	Add a VN_OPEN_INVFS flag. vn_open_cred() assumes that it is called from the top-level of a VFS syscall. Writers must call bwillwrite() before locking any VFS resource to wait for cleanup of dirty buffers. ZFS getextattr() and setextattr() VOPs do call vn_open_cred(), which results in wait for unrelated buffers while owning ZFS vnode lock (and ZFS does not use buffer cache). VN_OPEN_INVFS allows caller to skip bwillwrite. Note that ZFS is still incorrect there, because it starts write on an mp and locks a vnode while holding another vnode lock. Reported by: Willem Jan Withagen <wjw@digiware.nl> Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-11-29 14:02:32 +00:00
Ryan Libby	815db2f6f8	ktls_session zone: don't need to specify uma trash The use of the uma trash procedures is automatic, there's no need to pass them explicitly here. Reviewed by: markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22582	2019-11-29 06:25:03 +00:00
Kyle Evans	cf29433090	tty_pts: don't rely on tty header pollution for sys/mutex.h tty_pts.c relies on sys/tty.h for sys/mutex.h. Include it directly instead of relying on this pollution to ease the diff for anyone that wants to try converting the tty lock to anything other than a mutex.	2019-11-29 03:56:01 +00:00
Jeff Roberson	6d6a03d7a8	Handle large mallocs by going directly to kmem. Taking a detour through UMA does not provide any additional value. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D22563	2019-11-29 03:14:10 +00:00
Jeff Roberson	b476ae7f52	Fix DEBUG_REDZONE build after r355169	2019-11-28 08:56:14 +00:00
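
Commit 9825eadf2c above renames BIT_NAND to BIT_ANDNOT because the operation is "A but not B" rather than a true NAND. A minimal sketch of the difference, assuming plain 64-bit words instead of the kernel's bitset macros; the helper names word_andnot() and word_nand() are hypothetical and do not appear in the tree:

```c
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; the kernel operates on
 * multi-word bitsets through macros, not on a single uint64_t.
 */

/* "A but not B": keep the bits of a that are clear in b. */
static inline uint64_t
word_andnot(uint64_t a, uint64_t b)
{
	return (a & ~b);
}

/* True NAND: the complement of the intersection of a and b. */
static inline uint64_t
word_nand(uint64_t a, uint64_t b)
{
	return (~(a & b));
}
```

With a = 0xC and b = 0xA, word_andnot() leaves only bit 2 set, while word_nand() sets every bit except bit 3, which is why the old name was misleading for callers that wanted set subtraction.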

17011 Commits