freebsd-skq

Author	SHA1	Message	Date
Kyle Evans	d529de874b	res_find: Fix fallback logic The fallback logic was broken if hints were found in multiple environments. If we found a hint in either the loader environment or the static environment, fallback would be incremented excessively when we returned to the environment-selection bits. These checks should have also been guarded by the fbacklvl checks. As a result, fbacklvl could quickly get to a point where we skip either the static environment and/or the static hints depending on which environments contained valid hints. The impact of this bug is minimal, mostly affecting mips boards that use static hints and may have hints in either the loader environment or the static environment. There may be better ways to express the searchable environments and describing their characteristics (immutable, already searched, etc.) but this may be revisited after 12 branches. Reported by: Dan Nelson <dnelson_1901@yahoo.com> Triaged by: Dan Nelson <dnelson_1901@yahoo.com> MFC after: 3 days	2018-08-18 19:45:56 +00:00
Xin LI	ed1fa01ac4	Regen after r337998.	2018-08-18 06:33:51 +00:00
Xin LI	0362ec1e8e	getrandom(2) should not be restricted in capability mode.	2018-08-18 06:31:49 +00:00
Mark Johnston	1436ff1ebb	Typo. X-MFC with: r337974	2018-08-17 16:07:06 +00:00
Mark Johnston	3ccbdc8254	Add INVARIANTS-only fences around lockless vnode refcount updates. Some internal KASSERTs access the v_iflag field without the vnode interlock held after such a refcount update. The fences are needed for the assertions to be correct in the face of store reordering. Reported and tested by: jhibbits Reviewed by: kib, mjg MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16756	2018-08-17 15:41:01 +00:00
Mariusz Zaborski	2fe6aefff8	capsicum: allow the setproctitle(3) function in capability mode In the past, Capsicum allowed changing the process title. This was broken with r335939. PR: 230584 Submitted by: Yuichiro NAITO <naito.yuichiro@gmail.com> Reported by: ian@niw.com.au MFC after: 1 week	2018-08-17 14:35:10 +00:00
Kyle Evans	45625675e7	subr_prf: Don't write kern.boot_tag if it's empty This change allows one to set kern.boot_tag="" and not get a blank line preceding other boot messages. While this isn't super critical- blank lines are easy to filter out both mentally and in processing dmesg later- it allows for a mode of operation that matches previous behavior. I intend to MFC this whole series to stable/11 by the end of the month with boot_tag empty by default to make this effectively a nop in the stable branch.	2018-08-17 03:42:57 +00:00
Jamie Gritton	c542c43ef1	Revert r337922, except for some documentation-only bits. This needs to wait until user is changed to stop using jail(2). Differential Revision: D14791	2018-08-16 19:09:43 +00:00
Jamie Gritton	284001a222	Put jail(2) under COMPAT_FREEBSD11. It has been the "old" way of creating jails since FreeBSD 7. Along with the system call, put the various security.jail.allow_foo and security.jail.foo_allowed sysctls partly under COMPAT_FREEBSD11 (or BURN_BRIDGES). These sysctls had two disparate uses: on the system side, they were global permissions for jails created via jail(2) which lacked fine-grained permission controls; inside a jail, they're read-only descriptions of what the current jail is allowed to do. The first use is obsolete along with jail(2), but keep them for the second-read-only use. Differential Revision: D14791	2018-08-16 18:40:16 +00:00
Edward Tomasz Napierala	e77b6cfe34	In the help message at the mountroot prompt, suggest something that actually works and matches the bsdinstall(8) default. MFC after: 2 weeks Sponsored by: DARPA, AFRL	2018-08-15 12:12:21 +00:00
Alan Cox	c65ed2ff53	Eliminate a redundant assignment. MFC after: 1 week	2018-08-11 19:21:53 +00:00
Kyle Evans	0915d9d070	subr_prf: remove think-o that had returned to local patch Reported by: cognet	2018-08-10 15:35:02 +00:00
Kyle Evans	170bc29131	boot tagging: minor fixes msgbufinit may be called multiple times as we initialize the msgbuf into a progressively larger buffer. This doesn't happen as of now on head, but it may happen in the future and we generally support this. As such, only print the boot tag if we've just initialized the buffer for the first time. The boot tag also now has a newline appended to it for better visibility, and has been switched to a normal printf, by request of bde, after we've denoted that the msgbuf is mapped.	2018-08-10 15:29:06 +00:00
Kyle Evans	240fcda1e8	subr_prf: style(9) the sizeof Reported by: jkim, ian	2018-08-09 19:09:06 +00:00
Kyle Evans	4c793b68da	subr_prf: Use "sizeof current_boot_tag" instead	2018-08-09 17:53:18 +00:00
Kyle Evans	2a4650cc11	BOOT_TAG: Make a config(5) option, expose as sysctl and loader tunable BOOT_TAG lived shortly in sys/msgbuf.h, but this wasn't necessarily great for changing it or removing it. Move it into subr_prf.c and add options for it to opt_printf.h. One can specify both the BOOT_TAG and BOOT_TAG_SZ (really, size of the buffer that holds the BOOT_TAG). We expose it as kern.boot_tag and also add a loader tunable by the same name that we'll fetch upon initialization of the msgbuf. This allows for flexibility and also ensures that there's a consistent way to figure out the boot tag of the running kernel, rather than relying on headers to be in-sync. Prodded super-super-lightly by: imp	2018-08-09 17:47:47 +00:00
Kyle Evans	21aa6e8345	msgbuf: Light detailing (const'ify and bool'itize)	2018-08-09 17:42:27 +00:00
Leandro Lupori	c8e2123b6a	[ppc] Fix kernel panic when using BOOTP_NFSROOT On PowerPC (and possibly other architectures) that don't use EARLY_AP_STARTUP, the config task queue may be used uninitialized. This was observed while trying to mount the root fs from NFS, as reported here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230168. This patch has 2 main changes: 1- Perform a basic initialization of qgroup_config, similar to what is done in taskqgroup_adjust, but simpler. This makes qgroup_config ready to be used during NFS root mount. 2- When EARLY_AP_STARTUP is not used, call inm_init() and in6m_init() right before SI_SUB_ROOT_CONF, because bootp needs to send multicast packets to request an IP. PR: Bug 230168 Reported by: sbruno Reviewed by: jhibbits, mmacy, sbruno Approved by: jhibbits Differential Revision: D16633	2018-08-09 14:04:51 +00:00
Matt Macy	9fec45d8e5	epoch_block_wait: don't check TD_RUNNING struct epoch_thread is not type safe (stack allocated) and thus cannot be dereferenced from another CPU Reported by: novel@	2018-08-09 05:18:27 +00:00
Kyle Evans	2834d61202	kern: Add a BOOT_TAG marker at the beginning of boot dmesg From the "newly licensed to drive" PR department, add a BOOT_TAG marker (by default, --<<BOOT>>--) to the beginning of each boot's dmesg. This makes it easier to do textproc magic to locate the start of each boot and, of particular interest to some, the dmesg of the current boot. The PR has a dmesg(8) component as well that I've opted not to include for the moment- it was the more contentious part of this PR. bde@ also made the statement that this boot tag should be written with an ordinary printf, which I've- for the moment- declined to change about this patch to keep it more transparent to observers of the boot process. PR: 43434 Submitted by: dak <aurelien.nephtali@wanadoo.fr> (basically rewritten) MFC after: maybe never	2018-08-09 01:32:09 +00:00
Konstantin Belousov	8f94195022	Followup to r337430: only call elf_reloc_ifunc on x86. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-08-07 20:43:50 +00:00
Konstantin Belousov	289ead7cb0	Add missed handling of local relocs against ifunc target in the obj modules. Reported and tested by: wulf Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-08-07 18:26:46 +00:00
Mark Johnston	c7902fbeae	Improve handling of control message truncation. If a recvmsg(2) or recvmmsg(2) caller doesn't provide sufficient space for all control messages, the kernel sets MSG_CTRUNC in the message flags to indicate truncation of the control messages. In the case of SCM_RIGHTS messages, however, we were failing to dispose of the rights that had already been externalized into the recipient's file descriptor table. Add a new function and mbuf type to handle this cleanup task, and use it any time we fail to copy control messages out to the recipient. To simplify cleanup, control message truncation is now only performed at control message boundaries. The change also fixes a few related bugs: - Rights could be leaked to the recipient process if an error occurred while copying out a message's contents. - We failed to set MSG_CTRUNC if the truncation occurred on a control message boundary, e.g., if the caller received two control messages and provided only the exact amount of buffer space needed for the first. PR: 131876 Reviewed by: ed (previous version) MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16561	2018-08-07 16:36:48 +00:00
Konstantin Belousov	a70e9a1388	Swap in WKILLED processes. Swapped-out process that is WKILLED must be swapped in as soon as possible. The reason is that such process can be killed by OOM and its pages can be only freed if the process exits. To exit, the kernel stack of the process must be mapped. When allocating pages for the stack of the WKILLED process on swap in, use VM_ALLOC_SYSTEM requests to increase the chance of the allocation to succeed. Add counter of the swapped out processes to avoid unneeded iteration over the allprocs list when there is no work to do, reducing the allproc_lock ownership. Reviewed by: alc, markj (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D16489	2018-08-04 20:45:43 +00:00
Mark Johnston	5b0480f2cc	Don't check rcv sockbuf limits when sending on a unix stream socket. sosend_generic() performs an initial comparison of the amount of data (including control messages) to be transmitted with the send buffer size. When transmitting on a unix socket, we then compare the amount of data being sent with the amount of space in the receive buffer size; if insufficient space is available, sbappendcontrol() returns an error and the data is lost. This is easily triggered by sending control messages together with an amount of data roughly equal to the send buffer size, since the control message size may change in uipc_send() as file descriptors are internalized. Fix the problem by removing the space check in sbappendcontrol(), whose only consumer is the unix sockets code. The stream sockets code uses the SB_STOP mechanism to ensure that senders will block if the receive buffer fills up. PR: 181741 MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16515	2018-08-04 20:26:54 +00:00
Mark Johnston	e62ca80bde	Style.	2018-08-04 20:16:36 +00:00
Andriy Gapon	e0fa977ea5	safer wait-free iteration of shared interrupt handlers The code that iterates a list of interrupt handlers for a (shared) interrupt, whether in the ISR context or in the context of an interrupt thread, does so in a lock-free fashion. Thus, the routines that modify the list need to take special steps to ensure that the iterating code has a consistent view of the list. Previously, those routines tried to play nice only with the code running in the ithread context. The iteration in the ISR context was left to chance. After commit r336635 atomic operations and memory fences are used to ensure that ie_handlers list is always safe to navigate with respect to insertion and removal of list elements. There is still a question of when it is safe to actually free a removed element. The idea of this change is somewhat similar to the idea of the epoch based reclamation. There are some simplifications compared to the general epoch based reclamation. All writers are serialized using a mutex, so we do not need to worry about concurrent modifications. Also, all read accesses from the open context are serialized too. So, we can get away with just two epochs / phases. When a thread removes an element it switches the global phase from the current phase to the other and then drains the previous phase. Only after the draining does the removed element actually get freed. The code that iterates the list in the ISR context takes a snapshot of the global phase and then increments the use count of that phase before iterating the list. The use count (in the same phase) is decremented after the iteration. This should ensure that there is no iteration over the removed element when it gets freed. This commit also simplifies the coordination with the interrupt thread context. Now we always schedule the interrupt thread when removing one of the handlers for its interrupt. This makes the code both simpler and safer as the interrupt thread masks the interrupt thus ensuring that there is no interaction with the ISR context. P.S. This change matters only for shared interrupts and I realize that those are becoming a thing of the past (and quickly). I also understand that the problem that I am trying to solve is extremely rare. PR: 229106 Reviewed by: cem Discussed with: Samy Al Bahra MFC after: 5 weeks Differential Revision: https://reviews.freebsd.org/D15905	2018-08-03 14:27:28 +00:00
Alan Somers	da4465506d	Fix LOCAL_PEERCRED with socketpair(2) Enable the LOCAL_PEERCRED socket option for unix domain stream sockets created with socketpair(2). Previously, it only worked with unix domain stream sockets created with socket(2)/listen(2)/connect(2)/accept(2). PR: 176419 Reported by: Nicholas Wilson <nicholas@nicholaswilson.me.uk> MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D16350	2018-08-03 01:37:00 +00:00
Andriy Gapon	a260971458	fix a typo resulting in a wrong variable in kern_syscall_deregister The difference is between sysent, a global, and sysents, a function parameter.	2018-08-02 09:41:55 +00:00
Mark Johnston	4765717321	Remove a redundant check. MFC after: 3 days Sponsored by: The FreeBSD Foundation	2018-07-30 17:58:41 +00:00
Alan Somers	6040822c4e	Make timespecadd(3) and friends public The timespecadd(3) family of macros were imported from NetBSD back in r35029. However, they were initially guarded by #ifdef _KERNEL. In the meantime, we have grown at least 28 syscalls that use timespecs in some way, leading many programs both inside and outside of the base system to redefine those macros. It's better just to make the definitions public. Our kernel currently defines two-argument versions of timespecadd and timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define three-argument versions. Solaris also defines a three-argument version, but only in its kernel. This revision changes our definition to match the common three-argument version. Bump _FreeBSD_version due to the breaking KPI change. Discussed with: cem, jilles, ian, bde Differential Revision: https://reviews.freebsd.org/D14725	2018-07-30 15:46:40 +00:00
Andrew Turner	cd2106eaea	Ensure the DPCPU and VNET module spaces are aligned to hold a pointer. Previously they may have been aligned to a char, leading to misaligned DPCPU and VNET variables. Sponsored by: DARPA, AFRL	2018-07-30 14:25:17 +00:00
David E. O'Brien	455d358977	Correct copyright dates.	2018-07-30 07:01:00 +00:00
Antoine Brodin	ccd6ac9f6e	Add allow.mlock to jail parameters It allows locking or unlocking physical pages in memory within a jail This allows running elasticsearch with "bootstrap.memory_lock" inside a jail Reviewed by: jamie@ Differential Revision: https://reviews.freebsd.org/D16342	2018-07-29 12:41:56 +00:00
Don Lewis	290d906084	Fix the long term ULE load balancer so that it actually works. The initial call to sched_balance() during startup is meant to initialize balance_ticks, but does not actually do that since smp_started is still zero at that time. Since balance_ticks does not get set, there are no further calls to sched_balance(). Fix this by setting balance_ticks in sched_initticks() since we know the value of balance_interval at that time, and eliminate the useless startup call to sched_balance(). We don't need to randomize the initial value of balance_ticks. Since there is now only one call to sched_balance(), we can hoist the tests at the top of this function out to the caller and avoid the overhead of the function call when running a SMP kernel on UP hardware. PR: 223914 Reviewed by: kib MFC after: 2 weeks	2018-07-29 00:30:06 +00:00
David Bright	95c05062ec	Allow a EVFILT_TIMER kevent to be updated. If a timer is updated (re-added) with a different time period (specified in the .data field of the kevent), the new time period has no effect; the timer will not expire until the original time has elapsed. This violates the documented behavior as the kqueue(2) man page says (in part) "Re-adding an existing event will modify the parameters of the original event, and not result in a duplicate entry." This modification, adapted from a patch submitted by cem@ to PR214987, fixes the kqueue system to allow updating a timer entry. The kevent timer behavior is changed to: * When a timer is re-added, update the timer parameters and re-start the timer using the new parameters. * Allow updating both active and already expired timers. * When the timer has already expired, dequeue any undelivered events and clear the count of expirations. All of these changes address the original PR and also bring the FreeBSD and macOS kevent timer behaviors into agreement. A few other changes were made along the way: * Update the kqueue(2) man page to reflect the new timer behavior. * Fix man page style issues in kqueue(2) diagnosed by igor. * Update the timer libkqueue system test to test for the updated timer behavior. * Fix the (test) libkqueue common.h file so that it includes config.h which defines various HAVE_* feature defines, before the #if tests for such variables in common.h. This enables the use of the actual err(3) family of functions. * Fix the usages of the err(3) functions in the tests for incorrect type of variables. Those were formerly undiagnosed due to the disablement of the err(3) functions (see previous bullet point). PR: 214987 Reported by: Brian Wellington <bwelling@xbill.org> Reviewed by: kib MFC after: 1 week Relnotes: yes Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D15778	2018-07-27 13:49:17 +00:00
Andriy Gapon	111b043cdf	change interrupt event's list of handlers from TAILQ to CK_SLIST The primary reason for this commit is to separate mechanical and nearly mechanical code changes from an upcoming fix for unsafe teardown of shared interrupt handlers that have only filters (see D15905). The technical rationale is that SLIST is sufficient. The only operation that gets worse performance -- O(n) instead of O(1) -- is a removal of a handler, but it is not a critical operation and the list is expected to be rather short. Additionally, it is easier to reason about SLIST when considering the concurrent lock-free access to the list from the interrupt context and the interrupt thread. CK_SLIST is used because the upcoming change depends on the memory order provided by CK_SLIST insert and the fact that CK_SLIST remove does not trash the linkage in a removed element. While here, I also fixed a couple of whitespace issues, made code under ifdef notyet compilable, added a lock assertion to ithread_update() and made intr_event_execute_handlers() static as it had no external callers. Reviewed by: cem (earlier version) MFC after: 4 weeks Differential Revision: https://reviews.freebsd.org/D16016	2018-07-23 12:51:23 +00:00
Emmanuel Vadot	c54fe25dcb	Raise the size of L3 table for early devmap on arm64 Some drivers (like efifb) need to map more than the current L2_SIZE. Raise the size so we can map the framebuffer set up by the bootloader. Reviewed by: cognet	2018-07-19 21:58:06 +00:00
Mark Johnston	bf923a556d	Delete an XXX comment addressed by r336505. X-MFC with: r336505 Sponsored by: The FreeBSD Foundation	2018-07-19 20:11:08 +00:00
Mark Johnston	483f692ea6	Have preload_delete_name() free pages backing preloaded data. On i386 and amd64, add a vm_phys segment for physical memory used to store the kernel binary and other preloaded data. This makes it possible to free such memory back to the system once it is no longer needed, e.g., when a preloaded kernel module is unloaded. Previously, it would have remained unused. Reviewed by: kib, royger MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16330	2018-07-19 20:00:28 +00:00
Mark Johnston	73624a804a	Provide the full module path to preload_delete_name(). The basename will never match against the preload metadata, so these calls previously had no effect. Reviewed by: kib, royger MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16330	2018-07-19 19:50:42 +00:00
Konstantin Belousov	53e20b2702	When reporting an error, print the errno value. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2018-07-19 19:03:18 +00:00
Emmanuel Vadot	326867616f	kern_cpu: When adding abs frequency allow for unordered insertion Keep the list ordered, as some code assumes that it is, but allow for unordered cf_settings sets.	2018-07-19 11:28:14 +00:00
Mark Johnston	9295517ac9	Add a FALLTHROUGH comment to kvprintf(). Submitted by: Sebastian Huber <sebastian.huber@embedded-brains.de> MFC after: 3 days	2018-07-17 14:56:54 +00:00
Mariusz Zaborski	f1fe1e020f	Extend amount of possible coredumps from 10 to 100000 when using index format. The amount of digits in the name of corefile is assigned dynamically. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D16118	2018-07-15 17:10:12 +00:00
Mateusz Guzik	95ab076d6e	lockmgr: tidy up slock/sunlock similar to other locks	2018-07-13 22:40:14 +00:00
Warner Losh	25bc561e68	There are two files in the sys tree named inflate.c, in addition to it being a common name elsewhere. Rename the old kzip one to subr_inflate.c. This actually fixes the build issues on sparc64 that my inclusion of .PATH ${SYSDIR}/kern created in r336244, so also revert the broken workaround I committed in r336249. This slipped past me because apparently, I never did a clean build.	2018-07-13 17:41:28 +00:00
Warner Losh	52379d36a9	Create helper functions for parsing boot args. boot_parse_arg to parse a single arg; boot_parse_cmdline to parse a command line string; boot_parse_args to parse all the args in a vector; boot_howto_to_env Convert howto bits to env vars; boot_env_to_howto Return howto mask based on what's set in the environment. All these routines return an int that's the bitmask of the args translated to RB_* flags. As a special case, the 'S' flag sets the comconsole_speed env var. Any arg that looks like a=b will set the env key 'a' to value 'b'. If =b is omitted, 'a' is set to '1'. This should help us reduce the number of redundant copies of these routines in the tree. It should also give a more uniform experience between platforms. Also, invent a new flag RB_PROBE that's set when 'P' is parsed. On x86 + BIOS, this means 'probe for the keyboard, and if it's not there set both RB_MULTIPLE and RB_SERIAL (which means show the output on both video and serial consoles, but make serial primary)'. On others it may be some similar concept of probing, but it's loader dependent what, exactly, it means. These routines are suitable for /boot/loader and/or the kernel, though they may not be suitable for the tightly hand-rolled-for-space environments like boot2. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D16205	2018-07-13 16:43:05 +00:00
Brooks Davis	d92da75941	Round down the location of execpathp to slightly improve copyout speed. In practice, this moves the padding from below the canary to above execpathp and has no impact on stack consumption. Submitted by: Wuyang-Chung (via github pull request #159) MFC after: 1 week	2018-07-13 11:32:27 +00:00
Mateusz Guzik	bcbc8d35eb	fd: stop passing M_ZERO to uma_zalloc The optimisation seen with malloc cannot be used here as zone sizes are not known at compilation. Thus bzero by hand to get the optimisation instead.	2018-07-12 22:48:18 +00:00
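
Note on commit 6040822c4e (timespecadd(3) and friends): the message describes moving from a two-argument form to the common three-argument form, where the result goes into a separate third operand. A minimal sketch of that calling convention follows. The ex_ functions and EX_NSEC_PER_SEC are hypothetical stand-ins written for illustration; they are not copied from sys/time.h, and the real definitions are macros whose details may differ.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical constant for this sketch only. */
#define EX_NSEC_PER_SEC 1000000000L

/*
 * Three-argument style: *res = *a + *b.
 * Assumes both inputs are normalized (0 <= tv_nsec < 1e9).
 */
static void
ex_timespecadd(const struct timespec *a, const struct timespec *b,
    struct timespec *res)
{
	res->tv_sec = a->tv_sec + b->tv_sec;
	res->tv_nsec = a->tv_nsec + b->tv_nsec;
	/* Carry one second if the nanosecond field overflowed. */
	if (res->tv_nsec >= EX_NSEC_PER_SEC) {
		res->tv_sec++;
		res->tv_nsec -= EX_NSEC_PER_SEC;
	}
	assert(res->tv_nsec >= 0 && res->tv_nsec < EX_NSEC_PER_SEC);
}

/*
 * Three-argument style: *res = *a - *b.
 * Assumes both inputs are normalized (0 <= tv_nsec < 1e9).
 */
static void
ex_timespecsub(const struct timespec *a, const struct timespec *b,
    struct timespec *res)
{
	res->tv_sec = a->tv_sec - b->tv_sec;
	res->tv_nsec = a->tv_nsec - b->tv_nsec;
	/* Borrow one second if the nanosecond field went negative. */
	if (res->tv_nsec < 0) {
		res->tv_sec--;
		res->tv_nsec += EX_NSEC_PER_SEC;
	}
	assert(res->tv_nsec >= 0 && res->tv_nsec < EX_NSEC_PER_SEC);
}
```

With a two-argument form the first operand is overwritten in place; with the three-argument form a caller that wants the old behavior passes the same pointer as both the first operand and the result.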

16232 Commits