freebsd-nq

Author	SHA1	Message	Date
Alan Cox	3e7cb27cdd	Use a single, consistent approach to returning success versus failure in vm_map_madvise(). Previously, vm_map_madvise() used a traditional Unix- style "return (0);" to indicate success in the common case, but Mach- style return values in the edge cases. Since KERN_SUCCESS equals zero, the only problem with this inconsistency was stylistic. vm_map_madvise() has exactly two callers in the entire source tree, and only one of them cares about the return value. That caller, kern_madvise(), can be simplified if vm_map_madvise() consistently uses Unix-style return values. Since vm_map_madvise() uses the variable modify_map as a Boolean, make it one. Eliminate a redundant error check from kern_madvise(). Add a comment explaining where the check is performed. Explicitly note that exec_release_args_kva() doesn't care about vm_map_madvise()'s return value. Since MADV_FREE is passed as the behavior, the return value will always be zero. Reviewed by: kib, markj MFC after: 7 days	2018-06-04 16:28:06 +00:00
Ruslan Bukin	8cd6c09e7e	Fix build: ignore a GCC 7.2.0 warning which says that third argument of memset(3) should contain the number of elements multiplied by the element size. Sponsored by: DARPA, AFRL	2018-06-04 16:20:22 +00:00
Justin Hibbits	12f691959f	Align UMA data to 128 byte cacheline size Suggested by: mjg	2018-06-04 15:44:17 +00:00
Mark Johnston	bcc51ef48d	Fix the NUMA build for non-x86 platforms. acpi_map_pxm_to_vm_domainid() is currently implemented only on x86. MFC after: 1 week	2018-06-04 14:56:02 +00:00
Rick Macklem	8472f76005	Revert r334586 since I now think __unused is the better way to handle this.	2018-06-04 11:35:04 +00:00
Matt Macy	9645bcabdf	hwpmc: fix fixed counters checks	2018-06-04 04:49:06 +00:00
Matt Macy	07d80fd8dc	hwpmc: ABI fixes - increase pmc cpuid field from 8 to 12 bits - add cpuid version string to initialize entry in the log so that filter can identify which counter index an event name maps to - GC unused config flags - make fixed counter assignment more robust as well as the changes needed to be properly identified for filter	2018-06-04 02:05:48 +00:00
Matt Macy	5de96e33d6	hwpmc: support sampling both kernel and user stacks when interrupted in kernel This adds the -U options to pmcstat which will attribute in-kernel samples back to the user stack that invoked the system call. It is not the default, because when looking at kernel profiles it is generally more desirable to merge all instances of a given system call together. Although heavily revised, this change is directly derived from D7350 by Jonathan T. Looney. Obtained from: jtl Sponsored by: Juniper Networks, Limelight Networks	2018-06-04 01:10:23 +00:00
Rick Macklem	12c7a494ad	Fix a gcc8 warning about a write only variable. gcc8 warns that "verf" was set but not used. This was because the code that uses it is disabled via a "#if 0". This patch adds a "#if 0" to the variable's declaration and assignment to get rid of the warning. This way the code could be re-enabled without difficulty. Requested by: mmacy MFC after: 2 weeks	2018-06-03 19:46:44 +00:00
Matt Macy	2ce69a4d04	hwpmc: ensure that mapin updates are synchronous	2018-06-03 19:37:17 +00:00
Matt Macy	a1a1e8a89a	pmc: remove assert that is invalid in interrupt context	2018-06-03 19:37:09 +00:00
Vladimir Kondratyev	6bae6b2538	[evdev] Sync event codes with Linux kernel 4.16 MFC after: 2 weeks	2018-06-03 10:53:10 +00:00
Justin Hibbits	a1a990d8a4	Revert r326083, it doesn't behave as expected. Even though there do appear to be more artificial frames, with 12, stack traces no longer list at all. Revert until a better, more stable value can be determined.	2018-06-03 03:53:11 +00:00
Mateusz Guzik	d0a22279db	Remove an unused argument to turnstile_unpend. PR: 228694 Submitted by: Julian Pszczołowski <julian.pszczolowski@gmail.com>	2018-06-02 22:37:53 +00:00
Mateusz Guzik	34c538c356	malloc: try to use builtins for zeroing at the callsite Plenty of allocation sites pass M_ZERO and sizes which are small and known at compilation time. Handling them internally in malloc loses this information and results in avoidable calls to memset. Instead, let the compiler take the advantage of it whenever possible. Discussed with: jeff	2018-06-02 22:20:09 +00:00
Justin Hibbits	5167f178ab	Included VSX registers in powerpc core dumps Summary: Included VSX registers in powerpc core dumps (both kernel and gcore) Submitted by: Luis Pires Differential Revision: https://reviews.freebsd.org/D15512	2018-06-02 20:28:58 +00:00
Mateusz Guzik	15825d5b78	amd64: add a mild depessimization to rep mov/stos users Currently all the primitives are waiting for a rewrite, tidy them up in the meantime. Vast majority of cases pass sizes which are multiple of 8. Which means the following rep stosb/movb has nothing to do. Turns out testing first if there is anything to do is a big win across the board (cpus with and without ERMS, Intel and AMD) while not pessimizing the case where there is work to do. Sample results for zeroing 64 bytes (ops/second): Ryzen Threadripper 1950X 91433212 -> 147265741 Intel(R) Xeon(R) CPU X5675 @ 3.07GHz 90714044 -> 121992888 bzero and bcopy are on their way out and were not modified. Nothing in the tree uses them.	2018-06-02 20:14:43 +00:00
Justin Hibbits	2e65567500	Added ptrace support for reading/writing powerpc VSX registers Summary: Added ptrace support for getting/setting the remaining part of the VSX registers (the part that's not already covered by FPR or VR registers). This is necessary to add support for VSX registers in debuggers. Submitted by: Luis Pires Differential Revision: https://reviews.freebsd.org/D15458	2018-06-02 19:17:11 +00:00
Mateusz Guzik	ba96f37758	Use __builtin for various mem* and b* (e.g. bzero) routines. Some of the routines were using artificially limited builtin already, drop the explicit limit. The use of builtins allows quite often allows the compiler to elide the call or most zeroing to begin with. For instance, if the target object is 32 bytes in size and gets zeroed + has 16 bytes initialized, the compiler can just add code to zero out the rest. Note not all the primites have asm variants and some of the existing ones are not optimized. Maintaines are strongly encourage to take a look (regardless of this change).	2018-06-02 18:03:35 +00:00
Mateusz Guzik	97e8984893	libkern: tidy up memset 1. Remove special-casing of 0 as it just results in an extra function call. This is clearly pessimal. 2. Drop the inline stuff. For the most part it is much better served with __builtin_memset (coming later). 3. Move the declaration to systm.h to match other funcs. Archs are encouraged to implement the variant for their own platform so that this implementation can be dropped.	2018-06-02 17:57:09 +00:00
Michael Tuexen	13500cbb61	Don't overflow a buffer if we receive an INIT or INIT-ACK chunk without a RANDOM parameter but with a CHUNKS or HMAC-ALGO parameter. Please note that sending this combination violates the specification. Thnanks to Ronald E. Crane for reporting the issue for the userland stack. MFC after: 3 days	2018-06-02 16:28:10 +00:00
Bruce Evans	9729130321	Improve defaults for per-CPU kernel console colors, especially with 2 or 4 CPUs. Add a compile-time option SC_KERNEL_CONS_ATTRS to control the defaults. Default to color numbers in reverse order to CPU numbers (instead of in the same order with white first and wrapping to dark grey), so that the brightest bright colors are used first. Don't use dark grey at all; replace it by dark green. Syscons has too many compile-time options, but this one is needed in in case the defaults give something like white on white, or the user really hates this feature and can't wait to turn it off in rc. MFC after: next release?	2018-06-02 14:07:27 +00:00
Bruce Evans	fa49511709	Use per-CPU attributes earlier. The per-CPU ts is not initialized early, so the global kernel ts is used early, but it ony has 1 (normal) attribute. Switch this to the per-CPU attribute. The difference is most visible with EARLY_AP_STARTUP. Change to using the curcpu macro instead of PCPU_GET(cpuid) in 2 places for the above and in 1 other place in my old code in syscons. The function-like spelling is perhaps better for indicating that curcpu is volatile (unlike curthread), but for CPU attributes volatility is a feature.	2018-06-02 10:36:30 +00:00
Bruce Evans	d10566cf49	Oops, the last minute reduction in the clobber list for i386 MCOUNT_OVERHEAD() in r334522 was too agressive. Only mcount exit preserves %eax and %edx.	2018-06-02 09:59:27 +00:00
Bruce Evans	b9cedb46e2	Fix low-level locking during panics. The SCHEDULER_STOPPED() hack breaks locking generally, and mtx_trylock_() especially. When mtx_trylock_() returns nonzero, naive code version here trusts it to have worked. But when SCHEDULER_STOPPED() is true, mtx_trylock_() returns 1 without doing anything. Then mtx_unlock_() crashes especially badly attempting to unlock iff the error is detected, since mutex unlocking functions don't check SCHEDULER_STOPPED(). syscons already didn't trust mtx_trylock_spin(), but it was missing the logic to turn on sp->kdb_locked when turning off sp->mtx_locked during panics. It also used panicstr instead of SCHEDULER_LOCKED because I thought that panicstr was more fragile. They only differ for a window of lines in panic(), and in broken cases where stop_cpus_hard() in panic() didn't work.	2018-06-02 08:38:59 +00:00
Bruce Evans	c507c512b9	Finish COMPAT_AOUT support for amd64. It wasn't in any amd64 or MI file in /sys/conf, so was unavailable in configurations that don't use modules, and was not testable or notable in NOTES. Its normal configuration (not using a module) is still silently deprecated in aout(4) by not mentioning it there. Update i386 NOTES for COMPAT_AOUT. It is not i386-only, or even very MD. Sort its entry better. Finish gzip configuration (but not support) for amd64. gzip is really gzipped aout. It is currently broken even for i386 (a call to vm fails). amd64 has always attempted to configure and test it, but it depends on COMPAT_AOUT (as noted). The bug that it depends on unconfigured files was not detected since it is configured as a device. All other optional image activators are configured properly using an option.	2018-06-02 06:40:15 +00:00
Bruce Evans	49c871278a	Fix high resolution kernel profiling just enough to not crash at boot time, especially for SMP. If configured, it turns itself on at boot time for calibration, so is fragile even if never otherwise used. Both types of kernel profiling were supposed to use a global spinlock in the SMP case. If hi-res profiling is configured (but not necessarily used), this was supposed to be optimized by only using it when necessary, and slightly more efficiently, in asm. But it was not done at all for mcount entry where it is necessary. This caused crashes in the SMP case when either type of profiling was enabled. For mcount exit, it only caused wrong times. The times were wrongest with an i8254 timer since using that requires exclusive access to the hardware. The i8254 timer was too slow to use here 20 years ago and is much less usable now, but it is the default for the SMP case since TSCs weren't invariant when SMP was new. Do the locking in all hi-res SMP cases for simplicity. Calibration uses special asms, and the clobber lists in these were sort of inverted. They contained the arg and return registers which are not clobbered, but on amd64 they didn't contain the residue of the call-used registers which may be clobbered (%r10 and %r11). This usually caused hangs at boot time. This usually affected even the UP case.	2018-06-02 05:48:44 +00:00
Bruce Evans	dbe3061729	Fix recent breakages of kernel profiling, mostly on i386 (high resolution kernel profiling remains broken). memmove() was broken using ALTENTRY(). ALTENTRY() is only different from ENTRY() in the profiling case, and its use in that case was sort of backwards. The backwardness magically turned memmove() into memcpy() instead of completely breaking it. Only the high resolution parts of profiling itself were broken. Use ordinary ENTRY() for memmove(). Turn bcopy() into a tail call to memmove() to reduce complications. This gives slightly different pessimizations and profiling lossage. The pessimizations are minimized by not using a frame pointer() for bcopy(). Calls to profiling functions from exception trampolines were not relocated. This caused crashes on the first exception. Fix this using function pointers. Addresses of exception handlers in trampolines were not relocated. This caused unknown offsets in the profiling data. Relocate by abusing setidt_disp as for pmc although this is slower than necessary and requires namespace pollution. pmc seems to be missing some relocations. Stack traces and lots of other things in debuggers need similar relocations. Most user addresses were misclassified as unknown kernel addresses and then ignored. Treat all unknown addresses as user. Now only user addresses in the kernel text range are significantly misclassified (as known kernel addresses). The ibrs functions didn't preserve enough registers. This is the only recent breakage on amd64. Although these functions are written in asm, in the profiling case they call profiling functions which are mostly for the C ABI, so they only have to save call-used registers. They also have to save arg and return registers in some cases and actually save them in all cases to reduce complications. They end up saving all registers except %ecx on i386 and %r10 and %r11 on amd64. Saving these is only needed for 1 caller on each of amd64 and i386. Save them there. This is slightly simpler. Remove saving %ecx in handle_ibrs_exit on i386. Both handle_ibrs_entry and handle_ibrs_exit use %ecx, but only the latter needed to or did save it. But saving it there doesn't work for the profiling case. amd64 has more automatic saving of the most common scratch registers %rax, %rcx and %rdx (its complications for %r10 are from unusual use of %r10 by SYSCALL). Thus profiling of handle_ibrs_exit_rs() was not broken, and I didn't simplify the saving by moving the saving of these registers from it to the caller.	2018-06-02 04:25:09 +00:00
Rick Macklem	dec8894b45	Fix the default number of threads for Flex File layout pNFS client I/O. The intent was that the default would be based on number of CPUs, but the code disabled using taskqueue() by default. This code is only executed when mounting a NFSv4.1 server that supports the Flexible File layout for pNFS and, since such servers are rare, this change shouldn't result in a POLA violation. (The FreeBSD pNFS server is still a project and the only other one that uses Flexible File layout is being developed by Primary Data and I don't know if they have even shipped any to customers yet.) Found while testing the pNFS server.	2018-06-02 00:11:26 +00:00
Mark Johnston	49a3710c89	Remove the "pass" variable from the page daemon control loop. It serves little purpose after r308474 and r329882. As a side effect, the removal fixes a bug in r329882 which caused the page daemon to periodically invoke lowmem handlers even in the absence of memory pressure. Reviewed by: jeff Differential Revision: https://reviews.freebsd.org/D15491	2018-06-02 00:01:07 +00:00
Konstantin Belousov	633d3b1c71	Only check for MAP_32BIT when available. Reported by: mmacy Sponsored by: The FreeBSD Foundation MFC after: 10 days	2018-06-01 23:50:51 +00:00
Mark Johnston	3fb14f61e1	Avoid completing I/O when dumping core after a panic. Filesystem or pager completion callbacks are generally non-functional after a panic and may trigger deadlocks if invoked in this context (e.g., by attempting to destroying a buffer mapping). To avoid this situation, short-circuit I/O completion in biodone(). Reviewed by: imp Discussed with: mav MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D15592	2018-06-01 23:49:32 +00:00
Alan Cox	60221a5701	Only a small subset of mmap(2)'s flags should be used in combination with the flag MAP_GUARD. Rather than enumerating the flags that are not allowed, enumerate the flags that are allowed. The list of allowed flags is much shorter and less likely to change. (As an aside, one of the previously enumerated flags, MAP_PREFAULT, was not even a legal flag for mmap(2). However, because of an earlier check within kern_mmap(), this misuse of MAP_PREFAULT was harmless.) Reviewed by: kib MFC after: 10 days	2018-06-01 21:37:42 +00:00
Justin Hibbits	3254c39f83	Increase powerpc64 KVA from ~7.25GB to 32GB This will let us use much more KVA for ZFS ARC where needed. This may be incresed in the future if memory requirements increase. Discussed with: nwhitehorn	2018-06-01 21:37:20 +00:00
Michael Tuexen	c14f9fe5ef	Limit the retransmission timer for SYN-ACKs by TCPTV_REXMTMAX. Use the same logic to handle the SYN-ACK retransmission when sent from the syn cache code as when sent from the main code. MFC after: 3 days Sponsored by: Netflix, Inc.	2018-06-01 21:24:27 +00:00
Michael Tuexen	badef00d58	Ensure net.inet.tcp.syncache.rexmtlimit is limited by TCP_MAXRXTSHIFT. If the sysctl variable is set to a value larger than TCP_MAXRXTSHIFT+1, the array tcp_syn_backoff[] is accessed out of bounds. Discussed with: jtl@ MFC after: 3 days Sponsored by: Netflix, Inc.	2018-06-01 19:58:19 +00:00
Rick Macklem	9442a64e53	Add the BindConnectiontoSession operation to the NFSv4.1 server. Under some fairly unusual circumstances, the Linux NFSv4.1 client is doing a BindConnectiontoSession operation for TCP connections. It is also used by the ESXi6.5 NFSv4.1 client. This patch adds this operation to the NFSv4.1 server. Reported by: andreas.nagy@frequentis.com Tested by: andreas.nagy@frequentis.com MFC after: 2 weeks	2018-06-01 19:47:41 +00:00
Warner Losh	16bc63ec75	Add PNP_INFO to aac Reviewed by: imp, chuck Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com> Sponsored by: Google, Inc. (GSoC 2018)	2018-06-01 19:42:59 +00:00
Navdeep Parhar	c27fcc70cc	cxgbe(4): Include full duplex mediaopt in media that can be reported as active. Always report full duplex in active media. Sponsored by: Chelsio Communications	2018-06-01 16:46:29 +00:00
Justin Hibbits	a608b7d313	Unbreak 32-bit binaries on powerpc64 Recently a change was made which broke loading 32-bit binaries on powerpc64, with an assertion in ld-elf32.so.1: ld-elf32.so.1: assert failed: /usr/local/poudriere/jails/ppc64/usr/src/libexec/rtld-elf/rtld.c:390 It turns out Elf32_AuxInfo was broken for a very long time on powerpc64, as it uses long and pointers, which are both 64 bits on powerpc64, and only manifested with the recent work on auxargs.	2018-06-01 16:31:05 +00:00
Ed Maste	b8d908b71e	ANSIfy sys/kern	2018-06-01 13:26:45 +00:00
Breno Leitao	48f64992f2	powerpc64: Avoid overwriting initrd area Currently kexec loads an initrd file into the main memory but does not mark that region as reserved, thus the area is not protected. If any initrd/md file is loaded from kexec/petitboot, the region might become corarupted/overwritten since FreeBSD does not know the region is 'reserved'. This patch simply adds the initrd area as a reserved memory region. Approved by: jhibbits Differential Revision: https://reviews.freebsd.org/D15610	2018-06-01 12:43:13 +00:00
Hans Petter Selasky	57a865f808	Implement the __sg_alloc_table_from_pages() function based on the existing sg_alloc_table_from_pages() function in the LinuxKPI. This basically allow segments to have a limit, max_segment. Submitted by: Johannes Lundberg <johalun0@gmail.com> MFC after: 1 week Sponsored by: Mellanox Technologies Sponsored by: Limelight Networks	2018-06-01 12:09:07 +00:00
Hans Petter Selasky	6fad8d171a	Implement radix_tree_iter_delete() in the LinuxKPI. Submitted by: Johannes Lundberg <johalun0@gmail.com> MFC after: 1 week Sponsored by: Mellanox Technologies Sponsored by: Limelight Networks	2018-06-01 11:42:09 +00:00
Hans Petter Selasky	0a85496223	Improve high resolution timer support in the LinuxKPI. Submitted by: Johannes Lundberg <johalun0@gmail.com> MFC after: 1 week Sponsored by: Mellanox Technologies Sponsored by: Limelight Networks	2018-06-01 11:33:14 +00:00
Hans Petter Selasky	f03ae7e802	Add more GFP macro definitions in the LinuxKPI. Submitted by: Johannes Lundberg <johalun0@gmail.com> MFC after: 1 week Sponsored by: Mellanox Technologies Sponsored by: Limelight Networks	2018-06-01 11:14:59 +00:00
Andriy Gapon	0a15ff37d6	call AcpiLeaveSleepStatePrep after re-enabling interrupts I want to do this change because this call (actually, AcpiHwLegacyWakePrep) does a memory allocation and ACPI namespace evaluation. Although it is not very likely to run into any trouble, it is still not safe to make those calls with interrupts disabled. witness(4) and malloc(9) do not currently check for a context with interrupts disabled via intr_disable and we lack a facility for doing that. So, those unsafe operations fly under the radar. But if intr_disable in acpi_EnterSleepState was replaced with spinlock_enter (which it probably should be), then witness and malloc would immediately complain. Also, AcpiLeaveSleepStatePrep is documented as called when interrupts are enabled. It used to require disabled interrupts, but that requirement was changed a long time ago when support for _BFS and _GTS was removed from ACPICA. The ACPI wakeup sequence is very sensitive to changes. I consider this change to be correct, but there can be fallouts from it. What AcpiHwLegacyWakePrep essentially does is writing a value corresponding to S0 into SLP_TYPx bits of PM1 Control Register(s). According to ACPI specifications that write should be a NOP as SLP_EN bit is not set. But I see in some chipset specifications that they allow to ignore SLP_EN altogether and to act on a change of SLP_TYPx alone. Also, there are a couple of accesses to ACPI hardware before the new location of the call to AcpiLeaveSleepStatePrep. One is to clear the power button status and the other is to enable SCI. So, the move may affect the interaction between then OS and ACPI platform. I have not seen any regressions on my test system, but it's a desktop. MFC after: 5 weeks	2018-06-01 09:44:23 +00:00
Edward Tomasz Napierala	e8a5d07df5	Set bDeviceClass properly for composite device (template 8). There should be no functional change. PR: 203289 Reviewed by: hselasky@ MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2018-06-01 09:17:20 +00:00
Navdeep Parhar	b9330ed7a2	cxgbe(4): Retire an old check.	2018-06-01 01:05:34 +00:00
Matt Macy	b824b2d67a	Update FreeBSD_version to reflect removal of in-kernel pmc tables for Intel	2018-06-01 00:49:20 +00:00

1 2 3 4 5 ...

122475 Commits