freebsd-skq

Author	SHA1	Message	Date
markj	61d7f51c05	Implement kernel support for early loading of Intel microcode updates. Updates in the format described in section 9.11 of the Intel SDM can now be applied as one of the first steps in booting the kernel. Updates that are loaded this way are automatically re-applied upon exit from ACPI sleep states, in contrast with the existing cpucontrol(8)-based method. For the time being only Intel updates are supported. Microcode update files are passed to the kernel via loader(8). The file type must be "cpu_microcode" in order for the file to be recognized as a candidate microcode update. Updates for multiple CPU types may be concatenated together into a single file, in which case the kernel will select and apply a matching update. Memory used to store the update file will be freed back to the system once the update is applied, so this approach will not consume more memory than required. Reviewed by: kib MFC after: 6 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16370	2018-08-13 17:13:09 +00:00
dteske	03591a0d28	Fix misspellings of transmitter/transmitted Reviewed by: emaste, bcr Sponsored by: Smule, Inc. Differential Revision: https://reviews.freebsd.org/D16025	2018-08-10 20:37:32 +00:00
hselasky	5b5bf02859	Implement missing atomic_fcmpset_XXX() support for i386. This also fixes i386 build after r337527. MFC after: 1 week Sponsored by: Mellanox Technologies	2018-08-09 11:30:13 +00:00
kib	3792b2b104	Add pmap_is_valid_memattr(9). Discussed with: alc Sponsored by: The FreeBSD Foundation, Mellanox Technologies MFC after: 1 week Differential revision: https://reviews.freebsd.org/D15583	2018-08-01 18:45:51 +00:00
imp	11397021c9	Rename VM_FREELIST_ISADMA to VM_FREELIST_LOWMEM. There's no difference between VM_FREELIST_ISADMA and VM_FREELIST_LOWMEM except for the default boundary (16MB on x86 and 256MB on MIPS, but they are otherwise the same). We don't need both for any system we support (there were some really old ARC systems that did have ISA/EISA bus, but we never ran on them and they are too old to ever grow support for). Differential Review: https://reviews.freebsd.org/D16290	2018-07-27 18:34:20 +00:00
markj	21c018b44b	Fix handling of KVA in kmem_bootstrap_free(). Do not use vm_map_remove() to release KVA back to the system. Because kernel map entries do not have an associated VM object, with r336030 the vm_map_remove() call will not update the kernel page tables. Avoid relying on the vm_map layer and instead update the pmap and release KVA to the kernel arena directly in kmem_bootstrap_free(). Because the pmap updates will generally result in superpage demotions, modify pmap_init() to insert PTPs shadowed by superpage mappings into the kernel pmap's radix tree. While here, port r329171 to i386. Reported by: alc Reviewed by: alc, kib X-MFC with: r336505 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16426	2018-07-27 15:46:34 +00:00
kib	42ec259d74	Extend ranges of the critical sections to ensure that context switch code never sees FPU pcb flags not consistent with the hardware state. This is uncovered by the eager FPU switch mode. Analyzed, reviewed and tested by: gleb Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-07-24 19:22:52 +00:00
alc	2d577dea62	Annotate a parameter as unused. X-MFC with: r336288	2018-07-20 16:31:25 +00:00
markj	1810bcca4b	Have preload_delete_name() free pages backing preloaded data. On i386 and amd64, add a vm_phys segment for physical memory used to store the kernel binary and other preloaded data. This makes it possible to free such memory back to the system once it is no longer needed, e.g., when a preloaded kernel module is unloaded. Previously, it would have remained unused. Reviewed by: kib, royger MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16330	2018-07-19 20:00:28 +00:00
markj	322c13228f	Restore the check for the page size extension after r332489. Without this, the support for transparent superpage promotion on i386 was left disabled. Reviewed by: alc, kib Differential Revision: https://reviews.freebsd.org/D16279	2018-07-15 22:18:31 +00:00
alc	c0242a15de	Correct some typos. Reviewed by: kib	2018-07-14 19:35:41 +00:00
alc	dd64d030ae	Add support for pmap_enter(..., psind=1) to the i386 pmap. In other words, add support for explicitly requesting that pmap_enter() create a 2 or 4 MB page mapping. (Essentially, this feature allows the machine-independent layer to create superpage mappings preemptively, and not wait for automatic promotion to occur.) Export pmap_ps_enabled() to the machine-independent layer. Add a flag to pmap_pv_insert_pde() that specifies whether it should fail or reclaim a PV entry when one is not available. Refactor pmap_enter_pde() into two functions, one by the same name, that is a general-purpose function for creating PDE PG_PS mappings, and another, pmap_enter_4mpage(), that is used to prefault 2 or 4 MB read- and/or execute-only mappings for execve(2), mmap(2), and shmat(2). Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D16246	2018-07-14 17:20:27 +00:00
alc	935a2a380a	Eliminate unnecessary differences between i386's pmap_enter() and amd64's. For example, fully construct the new PTE before entering the critical section. This change is a stepping stone to psind == 1 support on i386. Reviewed by: kib, markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D16188	2018-07-10 18:00:55 +00:00
alc	f3d2853f4b	Invalidate the mapping before updating its physical address. Doing so ensures that all threads sharing the pmap have a consistent view of the mapping. This fixes the problem described in the commit log messages for r329254 without the overhead of an extra fault in the common case. Once other pmap_enter() implementations are similarly modified, the workaround added in r329254 can be removed, reducing the overhead of CoW faults. See also r335784 for amd64. The i386 implementation of pmap_enter() already reused the PV entry from the old mapping. Reviewed by: kib, markj Tested by: pho MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D16133	2018-07-08 16:51:54 +00:00
kib	d5fe3eb988	Expand x86 struct pcpus to UMA_PCPU_ALLOC_SIZE AKA PAGE_SIZE. This restores counters(9) operation. Revert r336024. Improve assert of pcpu size on x86. Reviewed by: mmacy Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D16163	2018-07-06 19:50:44 +00:00
kib	4bb78ec9b8	Revert to recommit with the proper message.	2018-07-06 19:50:25 +00:00
kib	49a6d02633	Save a call to pmap_remove() if entry cannot have any pages mapped. Due to the way rtld creates mappings for the shared objects, each dso causes unmap of at least three guard map entries. For instance, in the buildworld load, this change reduces the amount of pmap_remove() calls by 1/5. Profiled by: alc Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D16148	2018-07-06 19:48:47 +00:00
hselasky	147a9bb331	Make sure kernel modules built by default are portable between UP and SMP systems by extending defined(SMP) to include defined(KLD_MODULE). This is a regression issue after r335873 . Discussed with: mmacy@ Sponsored by: Mellanox Technologies	2018-07-06 10:13:42 +00:00
mmacy	ff20311f27	Back pcpu zone with domain correct pages - Change pcpu zone consumers to use a stride size of PAGE_SIZE. (defined as UMA_PCPU_ALLOC_SIZE to make future identification easier) - Allocate page from the correct domain for a given cpu. - Don't initialize pc_domain to non-zero value if NUMA is not defined There are some misconceptions surrounding this field. It is the _VM_ NUMA domain and should only ever correspond to valid domain values as understood by the VM. The former slab size of sizeof(struct pcpu) was somewhat arbitrary. The new value is PAGE_SIZE because that's the smallest granularity which the VM can allocate a slab for a given domain. If you have fewer than PAGE_SIZE/8 counters on your system there will be some memory wasted, but this is obviously something where you want the cache line to be coming from the correct domain. Reviewed by: jeff Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15933	2018-07-06 02:06:03 +00:00
kib	caf6998654	Extend r335969 to superpages. It is possible that a fictitious unmanaged userspace mapping of superpage is created on x86, e.g. by pmap_object_init_pt(), with the physical address outside the vm_page_array[] coverage. Noted and reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D16085	2018-07-05 17:28:06 +00:00
kib	288402cc53	Revert r335999 to re-commit with the correct error message.	2018-07-05 17:26:13 +00:00
kib	400673b3c1	Use vm_page_unhold_pages() instead of manually rolling unoptimized version of it. Noted by: alc Sponsored by: The FreeBSD Foundation	2018-07-05 16:40:20 +00:00
kib	12d583b35c	In x86 pmap_extract_and_hold(), there is no need to recalculate the physical address, which is readily available after successful vm_page_pa_tryrelock(). Noted and reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D16085	2018-07-05 16:38:54 +00:00
kib	a0b8819d1d	In x86 pmap_extract_and_hold(), there is no need to recalculate the physical address, which is readily available after successful vm_page_pa_tryrelock(). Noted and reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D16085	2018-07-05 16:27:34 +00:00
kib	7c8694b186	In x86 pmap_extract_and_hold()s, handle the case of PHYS_TO_VM_PAGE() returning NULL. vm_fault_quick_hold_pages() can be legitimately called on userspace mappings backed by fictitious pages created by unmanaged device and sg pagers. Note that other architectures pmap_extract_and_hold() might need similar fix, but I postponed the examination. Reported by: bde Discussed with: alc Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D16085	2018-07-04 21:21:59 +00:00
mmacy	40fc34fd82	inline atomics and allow tied modules to inline locks - inline atomics in modules on i386 and amd64 (they were always inline on other arches) - allow modules to opt in to inlining locks by specifying MODULE_TIED=1 in the makefile Reviewed by: kib Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16079	2018-07-02 19:48:38 +00:00
chuck	360d52c628	Fix the Linux kernel version number calculation The Linux compatibility code was converting the version number (e.g. 2.6.32) in two different ways and then comparing the results. The linux_map_osrel() function converted MAJOR.MINOR.PATCH similar to what FreeBSD does natively. I.e. where major=v0, minor=v1, and patch=v2 v = v0 * 1000000 + v1 * 1000 + v2; The LINUX_KERNVER() macro, on the other hand, converted the value with bit shifts. I.e. where major=a, minor=b, and patch=c v = (((a) << 16) + ((b) << 8) + (c)) The Linux kernel uses the latter format via the KERNEL_VERSION() macro in include/generated/uapi/linux/version.h Fix is to use the LINUX_KERNVER() macro in linux_map_osrel() as well as in the .trans_osrel functions. PR: 229209 Reviewed by: emaste, cem, imp (mentor) Approved by: imp (mentor) Differential Revision: https://reviews.freebsd.org/D15952	2018-06-22 00:02:03 +00:00
emaste	bc07d95d93	linuxulator: do not include legacy syscalls on arm64 Existing linuxulator platforms (i386, amd64) support legacy syscalls, such as non-*at ones like open, but arm64 and other new platforms do not. Wrap these in #ifdef LINUX_LEGACY_SYSCALLS, #defined in the MD linux.h files. We may need finer grained control in the future but this is sufficient for now. Reviewed by: andrew Sponsored by: Turing Robotic Industries Differential Revision: https://reviews.freebsd.org/D15237	2018-06-15 14:41:51 +00:00
brooks	8e419faaf8	Regen after 335177 (rename sys_obreak to sys_break).	2018-06-14 21:29:31 +00:00
brooks	ad6bae500f	Name the implementation of brk and sbrk sys_break(). The break() system call was renamed (several times) starting in v3 AT&T UNIX when C was invented and break was a language keyword. The last vestige of a need for it to be called something else (eg obreak) was removed in r225617 which consistently prefixed all syscall implementations. Reviewed by: emaste, kib (older version) Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15638	2018-06-14 21:27:25 +00:00
kib	a3a3e0b0ab	Reorganize code flow in fpudna()/npxdna() to highlight the critical section scope. Sprinkle __predict_false() for conditions known to never occur or occur only on rare platforms. Sponsored by: The FreeBSD Foundation	2018-06-14 11:09:51 +00:00
kib	18574304a4	Remove printf() in #NM handler. Give up and remove the almost useless informational message reporting that device not available exception occurred while our state tracking indicates the current CPU has FPU context loaded for the current thread. It seems that this is a recurring bug with some VM monitors. Sponsored by: The FreeBSD Foundation	2018-06-14 10:33:26 +00:00
kib	f2aa526948	Enable eager FPU context switch by default on i386 too, based on amd64 r335072. Security: CVE-2018-3665 Sponsored by: The FreeBSD Foundation	2018-06-13 21:10:23 +00:00
rlibby	6e36dd43ee	i386: copyin/copyout error is EFAULT Discussed with: kib MFC with: r332489 Sponsored by: Dell EMC Isilon	2018-06-13 19:57:03 +00:00
jtl	8222f5cb7c	Make UMA and malloc(9) return non-executable memory in most cases. Most kernel memory that is allocated after boot does not need to be executable. There are a few exceptions. For example, kernel modules do need executable memory, but they don't use UMA or malloc(9). The BPF JIT compiler also needs executable memory and did use malloc(9) until r317072. (Note that a side effect of r316767 was that the "small allocation" path in UMA on amd64 already returned non-executable memory. This meant that some calls to malloc(9) or the UMA zone(9) allocator could return executable memory, while others could return non-executable memory. This change makes the behavior consistent.) This change makes malloc(9) return non-executable memory unless the new M_EXEC flag is specified. After this change, the UMA zone(9) allocator will always return non-executable memory, and a KASSERT will catch attempts to use the M_EXEC flag to allocate executable memory using uma_zalloc() or its variants. Allocations that do need executable memory have various choices. They may use the M_EXEC flag to malloc(9), or they may use a different VM interface to obtain executable pages. Now that malloc(9) again allows executable allocations, this change also reverts most of r317072. PR: 228927 Reviewed by: alc, kib, markj, jhb (previous version) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D15691	2018-06-13 17:04:41 +00:00
kib	b7403f6b1b	All exceptions IDT descriptors must use interrupt gates on 4/4 kernel. Fix it for #MF. Noted by: rlibby Sponsored by: The FreeBSD Foundation	2018-06-12 10:43:20 +00:00
kib	5021be544c	Fix typo. Sponsored by: The FreeBSD Foundation	2018-06-12 10:41:26 +00:00
bde	1096fe355c	Fix panics in potentially all x86bios calls on i386 since r332489. A call to npxsave() in the exception trampolines was not relocated. This call to a garbage address usually panicked when made, but it is only made when the thread has used an FPU recently, and this is not the usual case. PR: 228755 Reviewed by: kib	2018-06-10 14:21:01 +00:00
markj	e402a8384b	Tell the compiler that rdtscp clobbers %ecx.	2018-06-09 18:31:19 +00:00
mmacy	33d22ed3f8	hwpmc: simplify calling convention for hwpmc interrupt handling pmc_process_interrupt takes 5 arguments when only 3 are needed. cpu is always available in curcpu and inuserspace can always be derived from the passed trapframe. While facially a reasonable cleanup this change was motivated by the need to work around a compiler bug. core2_intr(cpu, tf) -> pmc_process_interrupt(cpu, ring, pmc, tf, inuserspace) -> pmc_add_sample(cpu, ring, pm, tf, inuserspace) In the process of optimizing the tail call the tf pointer was getting clobbered: (kgdb) up at /storage/mmacy/devel/freebsd/sys/dev/hwpmc/hwpmc_mod.c:4709 4709 pmc_save_kernel_callchain(ps->ps_pc, (kgdb) up 1205 error = pmc_process_interrupt(cpu, PMC_HR, pm, tf, resulting in a crash in pmc_save_kernel_callchain.	2018-06-08 04:58:03 +00:00
mmacy	a2a3facbc2	cpufunc: add rdtscp for x86	2018-06-07 00:54:11 +00:00
mmacy	3b8eb9c59a	hwpmc: ABI fixes - increase pmc cpuid field from 8 to 12 bits - add cpuid version string to initialize entry in the log so that filter can identify which counter index an event name maps to - GC unused config flags - make fixed counter assignment more robust as well as the changes needed to be properly identified for filter	2018-06-04 02:05:48 +00:00
bde	48738e7c73	Oops, the last minute reduction in the clobber list for i386 MCOUNT_OVERHEAD() in r334522 was too aggressive. Only mcount exit preserves %eax and %edx.	2018-06-02 09:59:27 +00:00
bde	1cd9b8b732	Finish COMPAT_AOUT support for amd64. It wasn't in any amd64 or MI file in /sys/conf, so was unavailable in configurations that don't use modules, and was not testable or notable in NOTES. Its normal configuration (not using a module) is still silently deprecated in aout(4) by not mentioning it there. Update i386 NOTES for COMPAT_AOUT. It is not i386-only, or even very MD. Sort its entry better. Finish gzip configuration (but not support) for amd64. gzip is really gzipped aout. It is currently broken even for i386 (a call to vm fails). amd64 has always attempted to configure and test it, but it depends on COMPAT_AOUT (as noted). The bug that it depends on unconfigured files was not detected since it is configured as a device. All other optional image activators are configured properly using an option.	2018-06-02 06:40:15 +00:00
bde	0409ad53fa	Fix high resolution kernel profiling just enough to not crash at boot time, especially for SMP. If configured, it turns itself on at boot time for calibration, so is fragile even if never otherwise used. Both types of kernel profiling were supposed to use a global spinlock in the SMP case. If hi-res profiling is configured (but not necessarily used), this was supposed to be optimized by only using it when necessary, and slightly more efficiently, in asm. But it was not done at all for mcount entry where it is necessary. This caused crashes in the SMP case when either type of profiling was enabled. For mcount exit, it only caused wrong times. The times were wrongest with an i8254 timer since using that requires exclusive access to the hardware. The i8254 timer was too slow to use here 20 years ago and is much less usable now, but it is the default for the SMP case since TSCs weren't invariant when SMP was new. Do the locking in all hi-res SMP cases for simplicity. Calibration uses special asms, and the clobber lists in these were sort of inverted. They contained the arg and return registers which are not clobbered, but on amd64 they didn't contain the residue of the call-used registers which may be clobbered (%r10 and %r11). This usually caused hangs at boot time. This usually affected even the UP case.	2018-06-02 05:48:44 +00:00
bde	d4e99d50f0	Fix recent breakages of kernel profiling, mostly on i386 (high resolution kernel profiling remains broken). memmove() was broken using ALTENTRY(). ALTENTRY() is only different from ENTRY() in the profiling case, and its use in that case was sort of backwards. The backwardness magically turned memmove() into memcpy() instead of completely breaking it. Only the high resolution parts of profiling itself were broken. Use ordinary ENTRY() for memmove(). Turn bcopy() into a tail call to memmove() to reduce complications. This gives slightly different pessimizations and profiling lossage. The pessimizations are minimized by not using a frame pointer() for bcopy(). Calls to profiling functions from exception trampolines were not relocated. This caused crashes on the first exception. Fix this using function pointers. Addresses of exception handlers in trampolines were not relocated. This caused unknown offsets in the profiling data. Relocate by abusing setidt_disp as for pmc although this is slower than necessary and requires namespace pollution. pmc seems to be missing some relocations. Stack traces and lots of other things in debuggers need similar relocations. Most user addresses were misclassified as unknown kernel addresses and then ignored. Treat all unknown addresses as user. Now only user addresses in the kernel text range are significantly misclassified (as known kernel addresses). The ibrs functions didn't preserve enough registers. This is the only recent breakage on amd64. Although these functions are written in asm, in the profiling case they call profiling functions which are mostly for the C ABI, so they only have to save call-used registers. They also have to save arg and return registers in some cases and actually save them in all cases to reduce complications. They end up saving all registers except %ecx on i386 and %r10 and %r11 on amd64. Saving these is only needed for 1 caller on each of amd64 and i386. Save them there. This is slightly simpler. Remove saving %ecx in handle_ibrs_exit on i386. Both handle_ibrs_entry and handle_ibrs_exit use %ecx, but only the latter needed to or did save it. But saving it there doesn't work for the profiling case. amd64 has more automatic saving of the most common scratch registers %rax, %rcx and %rdx (its complications for %r10 are from unusual use of %r10 by SYSCALL). Thus profiling of handle_ibrs_exit_rs() was not broken, and I didn't simplify the saving by moving the saving of these registers from it to the caller.	2018-06-02 04:25:09 +00:00
mmacy	2f6bd2cd39	hwpmc: remove unused pre-table driven bits for intel Intel now provides comprehensive tables for all performance counters and the various valid configuration permutations as text .json files. Libpmc has been converted to use these and hwpmc_core has been greatly simplified by moving to passthrough of the table values. The one gotcha is that said tables don't support pentium pro and pentium IV. There's very few users of hwpmc on _amd64_ kernels on new hardware. It is unlikely that anyone is doing low level optimization on 15 year old Intel hardware. Nonetheless, if someone feels strongly enough to populate the corresponding tables for p4 and ppro I will reinstate the files into the build. Code for the K8 counters and !x86 architectures remains unchanged.	2018-05-31 22:41:07 +00:00
dim	a34aa33cd6	Resolve conflicts between macros in fenv.h and ieeefp.h This is a follow-up to r321483, which disabled -Wmacro-redefined for some lib/msun tests. If an application included both fenv.h and ieeefp.h, several macros such as __fldcw(), __fldenv() were defined in both headers, with slightly different arguments, leading to conflicts. Fix this by putting all the common macros in the machine-specific versions of ieeefp.h. Where needed, update the arguments in places where the macros are invoked. This also slightly reduces the differences between the amd64 and i386 versions of ieeefp.h. Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D15633	2018-05-31 20:22:47 +00:00
kib	b412910664	Use pmap_pte_ufast() instead of pmap_pte() in pmap_extract(), pmap_is_prefaultable() and pmap_incore(), pushing the number of shootdown IPIs back to the 3/1 kernel. Benchmarked by: bde Tested by: pho Sponsored by: The FreeBSD Foundation	2018-05-30 20:47:20 +00:00
kib	398a2a262a	Extract code for fast mapping of pte from pmap_extract_and_hold() into the helper function pmap_pte_ufast(). Benchmarked by: bde Tested by: pho Sponsored by: The FreeBSD Foundation	2018-05-30 20:43:48 +00:00
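
Note on commit 360d52c628: the two version encodings quoted in that message can be compared side by side. The following is a minimal sketch, not the code from the FreeBSD tree. The macro body and the decimal formula are copied from the commit message; the function names decimal_kernver() and encodings_agree() are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Shift-based packing, as quoted in the commit message for LINUX_KERNVER(). */
#define LINUX_KERNVER(a, b, c)	(((a) << 16) + ((b) << 8) + (c))

/*
 * Decimal packing, as quoted in the commit message for the old
 * linux_map_osrel() behaviour.  The function name is hypothetical.
 */
static inline uint32_t
decimal_kernver(uint32_t v0, uint32_t v1, uint32_t v2)
{
	return (v0 * 1000000 + v1 * 1000 + v2);
}

/*
 * Hypothetical helper: returns nonzero when both encodings give the
 * same number for a MAJOR.MINOR.PATCH triple.
 */
static inline int
encodings_agree(uint32_t major, uint32_t minor, uint32_t patch)
{
	return (decimal_kernver(major, minor, patch) ==
	    (uint32_t)LINUX_KERNVER(major, minor, patch));
}
```

For the 2.6.32 example from the message, the shift form gives 132640 and the decimal form gives 2006032, so comparing one against the other cannot work unless both sides use the same packing.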

13553 Commits