This introduces a facility to EVENTHANDLER(9) for explicitly defining a
reference to an event handler list. This is useful since previously all
invokers of events had to do a locked traversal of the global list of
event handler lists in order to find the appropriate event handler list.
By keeping a pointer to the appropriate list an invoker can avoid this
traversal completely. The pointer is initialized with SYSINIT(9) during
the eventhandler stage. Users registering interest in events do not need
to know if the event is backed by such a list, since the list is added
to the global list of lists. As with lists that are not pre-defined, it
is safe to register for an event before its list has been created.
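For illustration, a minimal sketch of how the pieces fit together,
assuming the EVENTHANDLER_LIST_DECLARE/DEFINE and
EVENTHANDLER_DIRECT_INVOKE macro names from this change:

    /* In a header: expose the pre-defined list to invokers. */
    EVENTHANDLER_LIST_DECLARE(process_exit);

    /* In one .c file: SYSINIT creates the list at the
     * eventhandler stage. */
    EVENTHANDLER_LIST_DEFINE(process_exit);

    /* Invoking: no locked walk of the global list of lists. */
    EVENTHANDLER_DIRECT_INVOKE(process_exit, p);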
This converts the process_* and thread_* events to use the new
facility, as their locked traversals show up significantly in ports
build workloads (and presumably other workloads with many short-lived
threads/procs). It may be advantageous to convert other events to the
new facility as well.
The el_flags field is now unused, but it is left in place so that this
revision can be MFC'd.
Reviewed by: bdrewery, markj, mjg
Approved by: rstone (mentor)
In collaboration with: ian
MFC after: 4 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12814
Replace many instances of VM_WAIT with blocking page allocation flags
similar to the kernel memory allocator.
This simplifies NUMA allocation because the domain will be known at wait
time and races between failure and sleeping are eliminated. This also
reduces boilerplate code and simplifies callers.
A wait primitive is supplied for uma zones for similar reasons. This
eliminates some non-specific VM_WAIT calls in favor of more explicit
sleeps that may be satisfied without new pages.
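As a hedged before/after sketch (the VM_ALLOC_WAITOK flag and
uma_zwait() names are assumed from this change):

    /* Before: sleep outside the allocator, racy with respect to
     * the failure, and with no NUMA domain information. */
    while ((m = vm_page_alloc(obj, pindex, VM_ALLOC_NORMAL)) == NULL)
            VM_WAIT;

    /* After: the allocator sleeps itself, in the right domain. */
    m = vm_page_alloc(obj, pindex, VM_ALLOC_NORMAL | VM_ALLOC_WAITOK);

    /* UMA analogue: sleep until the zone can satisfy a request. */
    uma_zwait(zone);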
Reviewed by: alc, kib, markj
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Since the case of an empty chain was already covered, it is very likely
that the existing entry matches. Skipping read-locking saves a lock
upgrade.
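A hedged illustration of the resulting shape (the helper names are made
up for the example):

    /* Take the bucket write lock up front: with the empty-chain
     * case already handled, the entry almost always matches, so
     * rlock-then-upgrade would just pay for the upgrade. */
    rw_wlock(blp);
    if (entry_matches(ncp))
            update_entry(ncp);
    rw_wunlock(blp);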
It is used on each new entry addition to decide whether to whack an
existing negative entry in order to prevent a blowout in size, but the
parameter was set years ago and never revisited.
Building with poudriere results in about 400 evictions per second,
which unnecessarily grab entries from the hot list.
With the new parameter there are next to no evictions of the sort.
When multiple threads wish to report a tracing event to a debugger,
both threads call ptracestop() and one thread will win the race to be
the reporting thread (p->p_xthread). The debugger uses PT_LWPINFO
with the process ID to determine which thread / LWP is reporting an
event and the details of that event. This event is cleared as a side
effect of the subsequent ptrace request that resumes the process
(PT_CONTINUE, PT_STEP, etc.). However, ptrace() was clearing the
event identified by the LWP ID passed to the resume request even if
that thread wasn't 'p_xthread'. This could result in clearing an event
that had not yet been observed by the debugger and leaving the
existing event for 'p_xthread' pending, so that it was reported a
second time.
Specifically, if the debugger stopped due to a software breakpoint in
one thread, but then switched to another thread that was used to
resume (e.g. if the user switched to a different thread and issued a
step), the resume request (PT_STEP) cleared a pending event (if any)
for the thread being stepped. However, the process immediately
stopped and the first thread reported its breakpoint event a second
time. The debugger decremented the PC for "both" breakpoint events
which resulted in the PC now pointing into the middle of an
instruction (on x86) and a SIGILL fault when the process was resumed a
second time.
To fix, always clear the pending event for 'p_xthread' when resuming a
process. ptrace() still honors the requested LWP ID when enabling
single-stepping (PT_STEP) or setting a different PC (PT_CONTINUE).
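Roughly, the shape of the fix (field and flag names as in the tree,
though the exact code may differ):

    struct thread *td2;

    /* On resume, clear the pending stop on the reporting thread,
     * not on the thread named by the request's LWP ID. */
    td2 = p->p_xthread;
    if (td2 != NULL) {
            td2->td_dbgflags &= ~TDB_XSIG;
            p->p_xthread = NULL;
    }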
Reported by: GDB testsuite (gdb.threads/continue-pending-status.exp)
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D12794
An off-by-one error has been present since the system call first
appeared in r185878. It additionally became a memory corruption bug
after change r324941. The failure is actually revealed by our existing
AIO tests; however, apparently nobody has been running those in 32-bit
emulation mode.
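Purely for illustration (not the actual code), the classic shape of
such a bug:

    /* A bounds check that admits one element too many becomes an
     * out-of-bounds write once the array is indexed. */
    if (i <= nitems(slots))         /* should be '<' */
            slots[i] = value;       /* writes one past the end */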
Reported by: Coverity, cem
CID: 1382114
MFC after: 18 days
X-MFC-With: 324941
Sponsored by: Spectra Logic Corp
When using a kernel built with the GZIO config option, dumpon -z can be
used to configure gzip compression using the in-kernel copy of zlib.
This is useful on systems with large amounts of RAM, which require a
correspondingly large dump device. Recovery of compressed dumps is also
faster since fewer bytes need to be copied from the dump device.
Because we have no way of knowing the final size of a compressed dump
until it is written, the kernel will always attempt to dump when
compression is configured, regardless of the dump device size. If the
dump is aborted because we run out of space, an error is reported on
the console.
savecore(8) is modified to handle compressed dumps and save them to
vmcore.<index>.gz, as it does when given the -z option.
A new rc.conf variable, dumpon_flags, is added. Its value is added to
the boot-time dumpon(8) invocation that occurs when a dump device is
configured in rc.conf.
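A hedged usage example (the device name is illustrative):

    # One-off, with a GZIO kernel:
    dumpon -z /dev/ada0p3

    # Persistently, in rc.conf:
    dumpdev="/dev/ada0p3"
    dumpon_flags="-z"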
Reviewed by: cem (earlier version)
Discussed with: def, rgrimes
Relnotes: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11723
In r322258 I made p1003_1b.aio_listio_max a tunable. However, further
investigation shows that there was never any good reason for that limit to
exist in the first place. It's used in two completely different ways:
* To size a UMA zone, which globally limits the number of concurrent
aio_suspend calls.
* To artificially limit the number of operations in a single lio_listio call.
There doesn't seem to be any memory allocation associated with this limit.
This change does two things:
* Properly names aio_suspend's UMA zone, and sizes it based on a new constant.
* Eliminates the artificial restriction on lio_listio. Instead, lio_listio
calls will now be limited by the more generous max_aio_queue_per_proc. The
old p1003_1b.aio_listio_max is now an alias for
vfs.aio.max_aio_queue_per_proc, so sysconf(3) will still work with
_SC_AIO_LISTIO_MAX, as the example below shows.
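For example, the POSIX-visible limit is still queried the usual way:

    #include <unistd.h>

    /* Still works, now backed by vfs.aio.max_aio_queue_per_proc. */
    long lmax = sysconf(_SC_AIO_LISTIO_MAX);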
Reported by: bde
Reviewed by: jhb
MFC after: 3 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D12120
While here, cache-align the chains.
This shortens the longest found chain during poudriere -j 80 from 32 to 16.
Pushing this higher will probably require allocation on boot.
Remove the support for creating files of VBAD type.
The FFS ffs_write() VOP catches such vnodes and panics; other VOPs do
not check for the type, and their behaviour is really undefined. The
comment claims that this support was done for 'badsect' to flag bad
sectors, but we do not have such a facility in the kernel anyway.
Reported by: Dmitry Vyukov <dvyukov@google.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
- Allocate a value for the new AT_HWCAP2 auxiliary vector on all platforms.
- Expand 'struct sysentvec' with a new 'u_long *sv_hwcap2' member, in
exactly the same way as for AT_HWCAP; see the sketch below.
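A hedged sketch of what a platform's sysentvec might look like after
this change (the initializer is abbreviated):

    u_long elf_hwcap, elf_hwcap2;

    struct sysentvec elf_freebsd_sysvec = {
            /* ... */
            .sv_hwcap = &elf_hwcap,
            .sv_hwcap2 = &elf_hwcap2,
    };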
MFC after: 1 month
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D12699
The end of the loop must re-lookup the next buf since the bufobj lock
is dropped in the loop body. If the lookup fails, the loop is restarted.
This mechanism non-obviously also terminates the loop when the end of
the buf list is reached. Split up the loop's two termination cases to
make the code a bit less fragile, as sketched below. No functional
change intended.
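A hedged sketch of the restructured loop (the validity helper is made
up for the example):

    for (bp = first; bp != NULL; bp = nbp) {
            nbp = TAILQ_NEXT(bp, b_bobufs);
            BO_UNLOCK(bo);
            /* ... work on bp with the bufobj lock dropped ... */
            BO_LOCK(bo);
            if (nbp != NULL && !buf_still_valid(bo, nbp))
                    goto restart;   /* re-lookup failed: restart */
            /* nbp == NULL now explicitly means end of list. */
    }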
Reviewed by: kib
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12730
1) Shorten the fast path by pushing the lockstat probe into the slow
path (sketched below).
2) Test for a kernel panic only once it turns out we will have to spin,
and in particular only after we know we are not recursing.
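A hedged sketch of the resulting shape (names illustrative, not the
actual code):

    if (try_lock_fast(lk))
            return;                 /* fast path: no probe, no checks */
    /* Slow path only: fire the lockstat probe, then check for a
     * panic once we know we must spin and are not recursing. */
    lock_slow(lk);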
MFC after: 1 week
The previous limit of just one page is hit by ps.
The entire mechanism should be reworked, if not whacked. It seems the
intent is to reduce kernel DoS-ability: some handlers wire the amount
of memory passed here. Handlers should probably stop wiring in the
first place, or in the worst case indicate that they are doing so, so
that the check is done only when necessary. It should also probably be
a counter, not a lock.
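For context, a hedged sketch of where the limit applies;
sysctl_wire_old_buffer() is the standard way handlers wire the old
buffer:

    /* Handlers that copy out large buffers wire the user memory
     * first; the size check guards this path. */
    error = sysctl_wire_old_buffer(req, 0);
    if (error != 0)
            return (error);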
MFC after: 1 week
Not only does this lock not play any role here, dirtying it slows down
other things a little, as Giant-held checks (e.g. DROP_GIANT) are
spread all over the kernel.
MFC after: 1 week
All of the kernel dump implementations keep track of the current offset
("dumplo") within the dump device. However, except for textdumps, they
all write the dump sequentially, so we can reduce code duplication by
having the MI code keep track of the current offset. The new
dump_append() API can be used to write at the current offset.
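A hedged sketch of a caller (the buffer and length names are
illustrative):

    /* The MI code tracks the offset; the dumper just appends. */
    error = dump_append(di, buf, 0, len);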
This is needed to implement support for kernel dump compression in the
MI kernel dump code.
Also simplify dump_encrypted_write() somewhat: use dump_write() instead
of duplicating its bounds checks, and get rid of the redundant offset
tracking.
Reviewed by: cem
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11722
mbpool existed to support NICs with memory interfaces, and all remaining
consumers were removed earlier this year with NATM.
Reviewed by: jhb
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D10513
MNT_VNODE_FOREACH_ALL() is supposed to avoid returning doomed vnodes,
but the VI_DOOMED check it used was done without the vnode interlock
held, so it could race with a concurrent vgone().
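A hedged sketch of the corrected check:

    /* Check VI_DOOMED only with the vnode interlock held, so the
     * check cannot race with a concurrent vgone(). */
    VI_LOCK(vp);
    if (vp->v_iflag & VI_DOOMED) {
            VI_UNLOCK(vp);
            continue;               /* skip the doomed vnode */
    }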
Submitted by: Don Morris <don.morris@isilon.com>
Reviewed by: kib, mckusick
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12704