freebsd-nq

Author	SHA1	Message	Date
Konstantin Belousov	c39baa7480	Generalize UFS buffer pager to allow it serving other filesystems which also use buffer cache. Most important addition to the code is the handling of filesystems where the block size is less than the machine page size, which might require reading several buffers to validate single page. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-10-28 11:43:59 +00:00
Marcel Moolenaar	07f862a769	Include <stdarg.h> instead of <machine/stdarg.h> when compiled as part of libsbuf. The former is the standard header, and allows us to compile libsbuf on macOS/linux.	2016-10-24 18:03:04 +00:00
Konstantin Belousov	835c2787be	Handle broadcast NMIs. On several Intel chipsets, diagnostic NMIs sent from BMC or NMIs reporting hardware errors are broadcasted to all CPUs. When kernel is configured to enter kdb on NMI, the outcome is problematic, because each CPU tries to enter kdb. All CPUs are executing NMI handlers, which set the latches disabling the nested NMI delivery; this means that stop_cpus_hard(), used by kdb_enter() to stop other cpus by broadcasting IPI_STOP_HARD NMI, cannot work. One indication of this is the harmless but annoying diagnostic "timeout stopping cpus". Much more harming behaviour is that because all CPUs try to enter kdb, and if ddb is used as debugger, all CPUs issue prompt on console and race for the input, not to mention the simultaneous use of the ddb shared state. Try to fix this by introducing a pseudo-lock for simultaneous attempts to handle NMIs. If one core happens to enter NMI trap handler, other cores see it and simulate reception of the IPI_STOP_HARD. More, generic_stop_cpus() avoids sending IPI_STOP_HARD and avoids waiting for the acknowledgement, relying on the nmi handler on other cores suspending and then restarting the CPU. Since it is impossible to detect at runtime whether some stray NMI is broadcast or unicast, add a knob for administrator (really developer) to configure debugging NMI handling mode. The updated patch was debugged with the help from Andrey Gapon (avg) and discussed with him. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D8249	2016-10-24 16:40:27 +00:00
Konstantin Belousov	55ee7a4c5f	In the fueword64(9) wrapper for architectures which do not yet implement native fueword64(9), use the proper type for the local where the fuword64() result is stored. Note that fueword64() is unused in the tree. Submitted by: Chunhui He <hchunhui@mail.ustc.edu.cn> PR: 212520 MFC after: 1 week	2016-10-23 11:23:17 +00:00
Conrad Meyer	8798ef0679	ddb(4): Add sleepchains to "show allchains" Reported by: markj Reviewed by: markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D8320	2016-10-22 18:02:20 +00:00
Hiren Panchasara	9d71a3975e	Rework r306337. In sendit(), if mp->msg_control is present, then in sockargs() we are allocating mbuf to store mp->msg_control. Later in kern_sendit(), call to getsock_cap(), will check validity of file pointer passed, if this fails EBADF is returned but mbuf allocated in sockargs() is not freed. Made code changes to free the same. Since freeing control mbuf in sendit() after checking (control != NULL) may lead to double freeing of control mbuf in sendit(), we can free control mbuf in kern_sendit() if there are any errors in the routine. Submitted by: Lohith Bellad <lohith.bellad@me.com> Reviewed by: glebius MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D8152	2016-10-21 18:27:30 +00:00
Mariusz Zaborski	4b83a77606	capsicum: perform copyout without the fildesc lock held in sys_cap_ioctls_get Reviewed by: pjd	2016-10-21 16:12:23 +00:00
Mateusz Guzik	bb697a20d7	cache: fix up a corner case in r307650 If no negative entry is found on the last list, the ncp pointer will be left uninitialized and a non-null value will make the function assume an entry was found. Fix the problem by initializing to NULL on entry. Reported by: glebius	2016-10-20 19:55:50 +00:00
Kevin Lo	61f481fb7e	Remove register keyword. Reviewed by: kib	2016-10-20 01:21:10 +00:00
Kevin Lo	7c68685366	Remove a sentence about putting initialization in init_proc.c or kern_proc.c and useless comment. Reviewed by: kib	2016-10-20 01:19:37 +00:00
Sean Bruno	026204b4c6	Resolve whitespace diff to NextBSD. Check to see that the taskqueue thread count requires us to actually iterate over the thread count to bind to cpus. Submitted by: mmacy@nextbsd.org	2016-10-19 21:01:24 +00:00
Mateusz Guzik	53dc58f2dc	Mark a bunch of mpsafe sysctls as such. This gives me a sysctl Giant-free buildworld.	2016-10-19 19:42:01 +00:00
Mateusz Guzik	a45a1a25b8	cache: split negative entry LRU into multiple lists This splits the ncneg_mtx lock while preserving the hit ratio at least during buildworld. Create N dedicated lists for new negative entries. Entries with at least one hit get promoted to the hot list, where they get requeued every M hits. Shrinking demotes one hot entry and performs a round-robin shrinking of regular lists. Reviewed by: kib	2016-10-19 18:29:52 +00:00
Sean Bruno	abf38392c6	Assert that we're assigning a non-null taskqueue. ref: `535865d02c` Fix cpu assignment by assuring stride is non-zero, assert that all tasks have a valid taskqueue. ref: `db39817623` Start cpu assignment from zero. ref: `d99d39b6b6` Submitted by: mmacy@nextbsd.org	2016-10-18 14:00:26 +00:00
Sean Bruno	12d1b8c9f3	Ensure that tasks with a specific cpu set prior to smp starting get re-attached to a thread running on that cpu. ref: `fcc20e306b` Submitted by: mmacy@nextbsd.org	2016-10-18 13:55:34 +00:00
Sean Bruno	dc35f36560	Tell gtask to what we've been bound. ref: `54414984cf` Submitted by: mmacy@nextbsd.org	2016-10-18 13:16:27 +00:00
Ed Maste	9e62195361	makesyscalls.sh: remove trailing space on the "created from" line In r10905 and r10906 makesyscalls was modified to avoid emitting a literal $Id$ string in the generated file, with: gsub("[$]Id: ", "", $0) gsub(" [$]", "", $0) Then r11294 added some functionality and also tried to address the $Id$ problem in a different way, by removing every $: sed -e 's/\$//g ... This rendered the gsub ineffective. The gsub was later updated to track the $Id$ -> $FreeBSD$ switch, even though it did not do anything. Revert the addition of the s/\$//g, and update the gsub to keep the resulting format the same. Discussed with: bde MFC after: 1 week Sponsored by: The FreeBSD Foundation	2016-10-17 13:52:24 +00:00
Hans Petter Selasky	d3bf5efc1f	Fix device delete child function. When detaching device trees parent devices must be detached prior to detaching its children. This is because parent devices can have pointers to the child devices in their softcs which are not invalidated by device_delete_child(). This can cause use after free issues and panic(). Device drivers implementing trees, must ensure its detach function detaches or deletes all its children before returning. While at it remove now redundant device_detach() calls before device_delete_child() and device_delete_children(), mostly in the USB controller drivers. Tested by: Jan Henrik Sylvester <me@janh.de> Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D8070 MFC after: 2 weeks	2016-10-17 10:20:38 +00:00
Konstantin Belousov	5975e53d40	Fix a race in vm_page_busy_sleep(9). Suppose that we have an exclusively busy page, and a thread which can accept shared-busy page. In this case, typical code waiting for the page xbusy state to pass is again: VM_OBJECT_WLOCK(object); ... if (vm_page_xbusied(m)) { vm_page_lock(m); VM_OBJECT_WUNLOCK(object); <---1 vm_page_busy_sleep(p, "vmopax"); goto again; } Suppose that the xbusy state owner locked the object, unbusied the page and unlocked the object after we are at the line [1], but before we executed the load of the busy_lock word in vm_page_busy_sleep(). If it happens that there are still no waiters recorded for the busy state, the xbusy owner did not acquire the page lock, so it proceeded. More, suppose that some other thread happens to share-busy the page after xbusy state was relinquished but before the m->busy_lock is read in vm_page_busy_sleep(). Again, that thread only needs vm_object lock to proceed. Then, vm_page_busy_sleep() reads busy_lock value equal to the VPB_SHARERS_WORD(1). In this case, all tests in vm_page_busy_sleep(9) pass and we are going to sleep, despite the page being share-busied. Update check for m->busy_lock == VPB_UNBUSIED in vm_page_busy_sleep(9) to also accept shared-busy state if we only wait for the xbusy state to pass. Merge sequential if()s with the same 'then' clause in vm_page_busy_sleep(). Note that the current code does not share-busy pages from parallel threads, the only way to have more than one sbusy owner right now is to recurse. Reported and tested by: pho (previous version) Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D8196	2016-10-13 14:41:05 +00:00
Conrad Meyer	d9ce8a41ea	kern_linker: Handle module-loading failures in preloaded .ko files The runtime kernel loader, linker_load_file, unloads kernel files that failed to load all of their modules. For consistency, treat preloaded (loader.conf loaded) kernel files in the same way. Reviewed by: kib Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D8200	2016-10-13 02:06:23 +00:00
Ed Maste	2a059700b6	Use correct size type in do_setopt_accept_filter Submitted by: ecturt@gmail.com	2016-10-12 00:56:49 +00:00
Oleksandr Tymoshenko	609b0fe966	INTRNG - fix MSI/MSIX release path Use isrc in attached MSI data structure instead of using map's isrc directly. map's isrc is set to NULL on IRQ deactivation which happens prior to pci_release_msi so MSI_RELEASE_MSI receives array of NULLs Reviewed by: mmel Differential Revision: https://reviews.freebsd.org/D8206	2016-10-11 17:00:29 +00:00
Sean Bruno	1ee17b070d	Fix bug where malloc(.., M_NOWAIT) return value is not checked, Change to M_WAITOK and move outside the mutex Submitted by: shurd Reviewed by: mmacy@nextbsd.org MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D7649	2016-10-11 14:08:53 +00:00
Mateusz Guzik	45571f8886	vfs: assert empty tmp free list on unmount	2016-10-08 13:38:05 +00:00
Mateusz Guzik	c6c44ff7eb	vfs: clear the tmp free list flag before taking the free vnode list lock Safe access is already guaranteed because of the mnt_listmx lock.	2016-10-08 13:36:59 +00:00
Konstantin Belousov	f71d08566c	Limit scope of the optimization in r306608 to dounmount() caller only. Other uses of cache_purgevfs() do rely on the cache purge for correct operations, when paths are invalidated without unmount. Reported and tested by: jkim Discussed with: mjg Sponsored by: The FreeBSD Foundation	2016-10-07 11:38:28 +00:00
Bryan Drewery	32641585a9	vrefl: Assert that the interlock is held. Sponsored by: Dell EMC Isilon MFC after: 2 weeks	2016-10-06 18:10:19 +00:00
Bryan Drewery	5a22c9582c	Add vrecyclel() to vrecycle() a vnode with the interlock already held. Obtained from: OneFS Sponsored by: Dell EMC Isilon MFC after: 2 weeks	2016-10-06 18:09:22 +00:00
Conrad Meyer	f43292ecf4	vfs_bio: Remove a leading space (style) Introduced in r282085. Sponsored by: Dell EMC Isilon	2016-10-05 23:42:02 +00:00
Bryan Drewery	0617f64ec6	Correct some comments after r294299. Sponsored by: Dell EMC Isilon	2016-10-04 21:44:20 +00:00
Ed Maste	65eea7ede6	ANSIfy inflate.c Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D8143	2016-10-04 17:57:30 +00:00
Konstantin Belousov	5420f76b59	Style. Reviewed by: emaste Sponsored by: The FreeBSD Foundation MFC after: 3 days	2016-10-04 15:23:03 +00:00
Mateusz Guzik	4876636eb7	cache: ignore purgevfs requests for filesystems with few vnodes purgevfs is purely optional and induces lock contention in workloads which frequently mount and unmount filesystems. In particular, poudriere will do this for filesystems with 4 vnodes or less. Full cache scan is clearly wasteful. Since there is no explicit counter for namecache entries, the number of vnodes used by the target fs is checked. The default limit is the number of bucket locks. Reviewed by: kib	2016-10-03 00:02:32 +00:00
Mateusz Guzik	5bb81f9b2d	vfs: batch free vnodes in per-mnt lists Previously free vnodes would always be directly returned to the global LRU list. With this change up to mnt_free_list_batch vnodes are collected first. syncer runs always return the batch regardless of its size. While vnodes on per-mnt lists are not counted as free, they can be returned in case of vnode shortage. Reviewed by: kib Tested by: pho	2016-09-30 17:27:17 +00:00
Mateusz Guzik	8660b707ff	vfs: remove the __bo_vnode field from struct vnode The pointer can be obtained using __containerof instead. Reviewed by: kib	2016-09-30 17:11:03 +00:00
Gleb Smirnoff	7ed6b78b92	Provide kern.maxphys sysctl, which returns MAXPHYS. Naming matches NetBSD.	2016-09-29 23:07:28 +00:00
Allan Jude	0176ca2ed5	Allow reading the following sysctl MIBs in capability mode: kern.hostname, kern.domainname, and kern.hostuuid This allows sandboxed applications to read these sysctls Submitted by: cem (original version) Reviewed by: cem, jonathan, rwatson (original version) Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D8015	2016-09-29 16:29:49 +00:00
Hans Petter Selasky	99eca1b2b3	While draining a timeout task prevent the taskqueue_enqueue_timeout() function from restarting the timer. Commonly taskqueue_enqueue_timeout() is called from within the task function itself without any checks for teardown. Then it can happen the timer stays active after the return of taskqueue_drain_timeout(), because the timeout and task is drained separately. This patch factors out the teardown flag into the timeout task itself, allowing existing code to stay as-is instead of applying a teardown flag to each and every of the timeout task consumers. Add assert to taskqueue_drain_timeout() which prevents parallel execution on the same timeout task. Update manual page documenting the return value of taskqueue_enqueue_timeout(). Differential Revision: https://reviews.freebsd.org/D8012 Reviewed by: kib, trasz MFC after: 1 week	2016-09-29 10:38:20 +00:00
Hiren Panchasara	7c9a4d09d6	Revert r306337. dhw@ reported a panic which seems related to this and bde@ has raised some issues.	2016-09-26 15:45:30 +00:00
Eric van Gyzen	310ab671b8	Make no assertions about mutex state when the scheduler is stopped. This changes the assert path to match the lock and unlock paths. MFC after: 1 week Sponsored by: Dell EMC	2016-09-26 15:30:30 +00:00
Hiren Panchasara	41bb1a25a9	In sendit(), if mp->msg_control is present, then in sockargs() we are allocating mbuf to store mp->msg_control. Later in kern_sendit(), call to getsock_cap(), will check validity of file pointer passed, if this fails EBADF is returned but mbuf allocated in sockargs() is not freed. Fix this possible leak. Submitted by: Lohith Bellad <lohith.bellad@me.com> Reviewed by: adrian MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D7910	2016-09-26 10:13:58 +00:00
Julian Elischer	1c8260b61d	Give the user a clue as to which process hit maxfiles. MFC after: 1 week Sponsored by: Panzura	2016-09-24 22:56:13 +00:00
Konstantin Belousov	939457e3e0	Add the foundation copyrights to procctl kernel sources. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-09-23 12:32:20 +00:00
Mariusz Zaborski	ad5e83dd3c	fd: fix up fget_cap If the kernel is not compiled with the CAPABILITIES kernel option, fget_unlocked doesn't return the sequence number, so fd_modify will always report modification; in that case we get an infinite loop. Reported by: br Reviewed by: mjg Tested by: br, def	2016-09-23 08:13:46 +00:00
Mateusz Guzik	deffc4a026	fd: fix up fgetvp_rights after r306184 fget_cap_locked returns a referenced file, but the fgetvp_rights does not need it. Instead, due to the filedesc lock being held, it can ref the vnode after the file was looked up. Fix up fget_cap_locked to be consistent with other _locked helpers and not ref the file. This plugs a leak introduced in r306184. Pointy hat to: mjg, oshogbo	2016-09-23 06:51:46 +00:00
Mateusz Guzik	1d2541fd1a	cache: get rid of the global lock Add a table of vnode locks and use them along with bucketlocks to provide concurrent modification support. The approach taken is to preserve the current behaviour of the namecache and just lock all relevant parts before any changes are made. Lookups still require the relevant bucket to be locked. Discussed with: kib Tested by: pho	2016-09-23 04:45:11 +00:00
Gleb Smirnoff	a2d8f9d2fc	Fix regression from r297400, which truncates headers in case of low socket buffer and put a small optimization for low socket buffer case: - Do not hack uio_resid, and let m_uiotombuf() properly take care of it. This fixes truncation of headers at low buffer. - If headers ate all the space, jump right to the end of the cycle, to avoid doing single page I/O and allocating zero length mbuf. - Clear hdr_uio only if space is positive, which indicates that all uio was copied in. Reviewed by: pluknet, jtl, emax, rrs, lstewart, emax, gallatin, scottl	2016-09-22 20:34:44 +00:00
Ruslan Bukin	30f3bfe58e	Adjust the sopt_val pointer on bigendian systems (e.g. MIPS64EB). sooptcopyin() checks if size of data provided by user is <= than we can accept, else it strips down the size. On bigendian platforms we have to move pointer as well so we copy the actual data. Reviewed by: gnn Sponsored by: DARPA, AFRL Sponsored by: HEIF5 Differential Revision: https://reviews.freebsd.org/D7980	2016-09-22 12:41:53 +00:00
Mariusz Zaborski	6490bc6529	fd: simplify fgetvp_rights by using fget_cap_locked Reviewed by: mjg	2016-09-22 11:54:20 +00:00
Mariusz Zaborski	85b0f9de11	capsicum: propagate rights on accept(2) Descriptor returned by accept(2) should inherit capability rights from the listening socket. PR: 201052 Reviewed by: emaste, jonathan Discussed with: many Differential Revision: https://reviews.freebsd.org/D7724	2016-09-22 09:58:46 +00:00
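
Commit 30f3bfe58e above notes that when a caller supplies more bytes than the kernel accepts, trimming the size is not enough on big-endian systems: the source pointer must also move so that the low-order bytes are the ones copied. The following is a minimal user-space sketch of that idea, not the kernel's sooptcopyin(). The names copy_low_bytes, store64, load32 and truncate_option are hypothetical, and byte order is passed as an explicit flag (an assumption made so both cases can be exercised on any host).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy at most `want` bytes of an integer that the
 * caller stored in `have` bytes. When the value is wider than we accept,
 * keep its low-order bytes.
 */
static void
copy_low_bytes(void *dst, size_t want, const void *src, size_t have,
    int big_endian)
{
	const unsigned char *p = src;

	assert(dst != NULL && src != NULL);
	if (have > want) {
		/* Big-endian: the low-order bytes sit at the end. */
		if (big_endian)
			p += have - want;
		have = want;
	}
	memcpy(dst, p, have);
}

/* Store a 64-bit value into buf using the requested byte order. */
static void
store64(unsigned char buf[8], uint64_t v, int big_endian)
{
	for (int i = 0; i < 8; i++) {
		int shift = big_endian ? 8 * (7 - i) : 8 * i;
		buf[i] = (unsigned char)(v >> shift);
	}
}

/* Load a 32-bit value from buf using the requested byte order. */
static uint32_t
load32(const unsigned char buf[4], int big_endian)
{
	uint32_t v = 0;

	for (int i = 0; i < 4; i++) {
		int shift = big_endian ? 8 * (3 - i) : 8 * i;
		v |= (uint32_t)buf[i] << shift;
	}
	return (v);
}

/* Simulate a caller passing a 64-bit value where 32 bits are accepted. */
static uint32_t
truncate_option(uint64_t v, int big_endian)
{
	unsigned char src[8], dst[4];

	store64(src, v, big_endian);
	copy_low_bytes(dst, sizeof(dst), src, sizeof(src), big_endian);
	return (load32(dst, big_endian));
}
```

In this sketch, truncate_option(1, 1) and truncate_option(1, 0) both yield 1. Without the pointer adjustment, the big-endian case would copy the four high-order bytes and yield 0.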

15115 Commits