freebsd-dev

Author	SHA1	Message	Date
Konstantin Belousov	5803d744c7	Reorganize code flow in fpudna()/npxdna() to highlight the critical section scope. Sprinkle __predict_false() for conditions known to never occur or occur only on rare platforms. Sponsored by: The FreeBSD Foundation	2018-06-14 11:09:51 +00:00
Konstantin Belousov	fa7fad8ab9	Remove printf() in #NM handler. Give up and remove the almost useless informational message reporting that a device-not-available exception occurred while our state tracking indicates the current CPU has FPU context loaded for the current thread. It seems that this is a recurring bug with some VM monitors. Sponsored by: The FreeBSD Foundation	2018-06-14 10:33:26 +00:00
Konstantin Belousov	fc3e80c322	Enable eager FPU context switch by default on i386 too, based on amd64 r335072. Security: CVE-2018-3665 Sponsored by: The FreeBSD Foundation	2018-06-13 21:10:23 +00:00
Ryan Libby	a7be368aec	i386: copyin/copyout error is EFAULT Discussed with: kib MFC with: r332489 Sponsored by: Dell EMC Isilon	2018-06-13 19:57:03 +00:00
Jonathan T. Looney	0766f278d8	Make UMA and malloc(9) return non-executable memory in most cases. Most kernel memory that is allocated after boot does not need to be executable. There are a few exceptions. For example, kernel modules do need executable memory, but they don't use UMA or malloc(9). The BPF JIT compiler also needs executable memory and did use malloc(9) until r317072. (Note that a side effect of r316767 was that the "small allocation" path in UMA on amd64 already returned non-executable memory. This meant that some calls to malloc(9) or the UMA zone(9) allocator could return executable memory, while others could return non-executable memory. This change makes the behavior consistent.) This change makes malloc(9) return non-executable memory unless the new M_EXEC flag is specified. After this change, the UMA zone(9) allocator will always return non-executable memory, and a KASSERT will catch attempts to use the M_EXEC flag to allocate executable memory using uma_zalloc() or its variants. Allocations that do need executable memory have various choices. They may use the M_EXEC flag to malloc(9), or they may use a different VM interface to obtain executable pages. Now that malloc(9) again allows executable allocations, this change also reverts most of r317072. PR: 228927 Reviewed by: alc, kib, markj, jhb (previous version) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D15691	2018-06-13 17:04:41 +00:00
Konstantin Belousov	6ee7d5afcc	All exceptions IDT descriptors must use interrupt gates on 4/4 kernel. Fix it for #MF. Noted by: rlibby Sponsored by: The FreeBSD Foundation	2018-06-12 10:43:20 +00:00
Konstantin Belousov	7a18e90447	Fix typo. Sponsored by: The FreeBSD Foundation	2018-06-12 10:41:26 +00:00
Bruce Evans	09e3c9a4ec	Fix panics in potentially all x86bios calls on i386 since r332489. A call to npxsave() in the exception trampolines was not relocated. This call to a garbage address usually panicked when made, but it is only made when the thread has used an FPU recently, and this is not the usual case. PR: 228755 Reviewed by: kib	2018-06-10 14:21:01 +00:00
Mark Johnston	f090f67503	Tell the compiler that rdtscp clobbers %ecx.	2018-06-09 18:31:19 +00:00
Matt Macy	eb7c901995	hwpmc: simplify calling convention for hwpmc interrupt handling pmc_process_interrupt takes 5 arguments when only 3 are needed. cpu is always available in curcpu and inuserspace can always be derived from the passed trapframe. While facially a reasonable cleanup this change was motivated by the need to workaround a compiler bug. core2_intr(cpu, tf) -> pmc_process_interrupt(cpu, ring, pmc, tf, inuserspace) -> pmc_add_sample(cpu, ring, pm, tf, inuserspace) In the process of optimizing the tail call the tf pointer was getting clobbered: (kgdb) up at /storage/mmacy/devel/freebsd/sys/dev/hwpmc/hwpmc_mod.c:4709 4709 pmc_save_kernel_callchain(ps->ps_pc, (kgdb) up 1205 error = pmc_process_interrupt(cpu, PMC_HR, pm, tf, resulting in a crash in pmc_save_kernel_callchain.	2018-06-08 04:58:03 +00:00
Matt Macy	155046394a	cpufunc: add rdtscp for x86	2018-06-07 00:54:11 +00:00
Matt Macy	07d80fd8dc	hwpmc: ABI fixes - increase pmc cpuid field from 8 to 12 bits - add cpuid version string to initialize entry in the log so that filter can identify which counter index an event name maps to - GC unused config flags - make fixed counter assignment more robust as well as the changes needed to be properly identified for filter	2018-06-04 02:05:48 +00:00
Bruce Evans	d10566cf49	Oops, the last minute reduction in the clobber list for i386 MCOUNT_OVERHEAD() in r334522 was too aggressive. Only mcount exit preserves %eax and %edx.	2018-06-02 09:59:27 +00:00
Bruce Evans	c507c512b9	Finish COMPAT_AOUT support for amd64. It wasn't in any amd64 or MI file in /sys/conf, so was unavailable in configurations that don't use modules, and was not testable or notable in NOTES. Its normal configuration (not using a module) is still silently deprecated in aout(4) by not mentioning it there. Update i386 NOTES for COMPAT_AOUT. It is not i386-only, or even very MD. Sort its entry better. Finish gzip configuration (but not support) for amd64. gzip is really gzipped aout. It is currently broken even for i386 (a call to vm fails). amd64 has always attempted to configure and test it, but it depends on COMPAT_AOUT (as noted). The bug that it depends on unconfigured files was not detected since it is configured as a device. All other optional image activators are configured properly using an option.	2018-06-02 06:40:15 +00:00
Bruce Evans	49c871278a	Fix high resolution kernel profiling just enough to not crash at boot time, especially for SMP. If configured, it turns itself on at boot time for calibration, so is fragile even if never otherwise used. Both types of kernel profiling were supposed to use a global spinlock in the SMP case. If hi-res profiling is configured (but not necessarily used), this was supposed to be optimized by only using it when necessary, and slightly more efficiently, in asm. But it was not done at all for mcount entry where it is necessary. This caused crashes in the SMP case when either type of profiling was enabled. For mcount exit, it only caused wrong times. The times were wrongest with an i8254 timer since using that requires exclusive access to the hardware. The i8254 timer was too slow to use here 20 years ago and is much less usable now, but it is the default for the SMP case since TSCs weren't invariant when SMP was new. Do the locking in all hi-res SMP cases for simplicity. Calibration uses special asms, and the clobber lists in these were sort of inverted. They contained the arg and return registers which are not clobbered, but on amd64 they didn't contain the residue of the call-used registers which may be clobbered (%r10 and %r11). This usually caused hangs at boot time. This usually affected even the UP case.	2018-06-02 05:48:44 +00:00
Bruce Evans	dbe3061729	Fix recent breakages of kernel profiling, mostly on i386 (high resolution kernel profiling remains broken). memmove() was broken using ALTENTRY(). ALTENTRY() is only different from ENTRY() in the profiling case, and its use in that case was sort of backwards. The backwardness magically turned memmove() into memcpy() instead of completely breaking it. Only the high resolution parts of profiling itself were broken. Use ordinary ENTRY() for memmove(). Turn bcopy() into a tail call to memmove() to reduce complications. This gives slightly different pessimizations and profiling lossage. The pessimizations are minimized by not using a frame pointer() for bcopy(). Calls to profiling functions from exception trampolines were not relocated. This caused crashes on the first exception. Fix this using function pointers. Addresses of exception handlers in trampolines were not relocated. This caused unknown offsets in the profiling data. Relocate by abusing setidt_disp as for pmc although this is slower than necessary and requires namespace pollution. pmc seems to be missing some relocations. Stack traces and lots of other things in debuggers need similar relocations. Most user addresses were misclassified as unknown kernel addresses and then ignored. Treat all unknown addresses as user. Now only user addresses in the kernel text range are significantly misclassified (as known kernel addresses). The ibrs functions didn't preserve enough registers. This is the only recent breakage on amd64. Although these functions are written in asm, in the profiling case they call profiling functions which are mostly for the C ABI, so they only have to save call-used registers. They also have to save arg and return registers in some cases and actually save them in all cases to reduce complications. They end up saving all registers except %ecx on i386 and %r10 and %r11 on amd64. Saving these is only needed for 1 caller on each of amd64 and i386. Save them there. This is slightly simpler. Remove saving %ecx in handle_ibrs_exit on i386. Both handle_ibrs_entry and handle_ibrs_exit use %ecx, but only the latter needed to or did save it. But saving it there doesn't work for the profiling case. amd64 has more automatic saving of the most common scratch registers %rax, %rcx and %rdx (its complications for %r10 are from unusual use of %r10 by SYSCALL). Thus profiling of handle_ibrs_exit_rs() was not broken, and I didn't simplify the saving by moving the saving of these registers from it to the caller.	2018-06-02 04:25:09 +00:00
Matt Macy	e92a1350b5	hwpmc: remove unused pre-table driven bits for intel Intel now provides comprehensive tables for all performance counters and the various valid configuration permutations as text .json files. Libpmc has been converted to use these and hwpmc_core has been greatly simplified by moving to passthrough of the table values. The one gotcha is that said tables don't support Pentium Pro and Pentium IV. There are very few users of hwpmc on _amd64_ kernels on new hardware. It is unlikely that anyone is doing low level optimization on 15 year old Intel hardware. Nonetheless, if someone feels strongly enough to populate the corresponding tables for p4 and ppro I will reinstate the files into the build. Code for the K8 counters and !x86 architectures remains unchanged.	2018-05-31 22:41:07 +00:00
Dimitry Andric	b451efbedc	Resolve conflicts between macros in fenv.h and ieeefp.h This is a follow-up to r321483, which disabled -Wmacro-redefined for some lib/msun tests. If an application included both fenv.h and ieeefp.h, several macros such as __fldcw(), __fldenv() were defined in both headers, with slightly different arguments, leading to conflicts. Fix this by putting all the common macros in the machine-specific versions of ieeefp.h. Where needed, update the arguments in places where the macros are invoked. This also slightly reduces the differences between the amd64 and i386 versions of ieeefp.h. Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D15633	2018-05-31 20:22:47 +00:00
Konstantin Belousov	d05d616c35	Use pmap_pte_ufast() instead of pmap_pte() in pmap_extract(), pmap_is_prefaultable() and pmap_incore(), pushing the number of shootdown IPIs back to the 3/1 kernel. Benchmarked by: bde Tested by: pho Sponsored by: The FreeBSD Foundation	2018-05-30 20:47:20 +00:00
Konstantin Belousov	c5981f69ee	Extract code for fast mapping of pte from pmap_extract_and_hold() into the helper function pmap_pte_ufast(). Benchmarked by: bde Tested by: pho Sponsored by: The FreeBSD Foundation	2018-05-30 20:43:48 +00:00
Konstantin Belousov	7883d57a72	Restore pmap_copy() for 4/4 i386 pmap. Create yet another temporary pte mapping routine pmap_pte_quick3(), which is a copy of pmap_pte_quick() and relies on the pvh_global_lock to protect the frame. It accounts into the same counters as pmap_pte_quick(). It is needed since pmap_copy() uses pmap_pte_quick() already, and since a user pmap is no longer the current pmap. pmap_copy() still provides an advantage for real-world workloads involving a lot of forks where processes do not exec immediately. Benchmarked by: bde Sponsored by: The FreeBSD Foundation	2018-05-30 20:39:22 +00:00
Konstantin Belousov	d94bd3726b	Do use pmap_pte_quick() in pmap_enter_quick_locked(). Benchmarked by: bde Tested by: pho Sponsored by: The FreeBSD Foundation	2018-05-30 20:26:47 +00:00
Konstantin Belousov	1095694a73	Avoid unnecessary TLB shootdowns in pmap_unwire_ptp() for user pmaps, which no longer create recursive page table mappings. Benchmarked by: bde Tested by: pho Sponsored by: The FreeBSD Foundation	2018-05-30 20:24:21 +00:00
Brooks Davis	cbf7e0cba7	Correct pointer subtraction in KASSERT(). The assertion would never fire without truly spectacular future programming errors. Reported by: Coverity CID: 1391370 Sponsored by: DARPA, AFRL	2018-05-29 20:03:24 +00:00
Hans Petter Selasky	43bb1274d0	Implement atomic_add_64() and atomic_subtract_64() for the i386 target. While at it add missing _acq_ and _rel_ variants for 64-bit atomic operations under i386. Reviewed by: kib @ MFC after: 1 week Sponsored by: Mellanox Technologies	2018-05-29 11:59:02 +00:00
Konstantin Belousov	ded29bd9a5	Optimize i386 pmap_extract_and_hold(). In particular, stop using pmap_pte() to read non-promoted pte while walking the page table. pmap_pte() needs to shoot down the kernel mapping globally which causes IPI broadcast. Since pmap_extract_and_hold() is used for slow copyin(9), it is a very significant hit for the 4/4 kernels. Instead, create a single-purpose per-processor page frame and use it to locally map the page table page inside the critical section, to avoid reuse of the frame by another thread if context switched. Measurement demonstrated very significant improvements in any load that utilizes copyin/copyout. Found and benchmarked by: bde Sponsored by: The FreeBSD Foundation	2018-05-25 16:29:22 +00:00
Konstantin Belousov	feac1d4808	Cleanup. Remove unused instruction and label. Tested by: bde Sponsored by: The FreeBSD Foundation	2018-05-25 16:24:20 +00:00
Andriy Gapon	279be68bfd	re-synchronize TSC-s on SMP systems after resume, if necessary The TSC-s are checked and synchronized only if they were good originally. That is, invariant, synchronized, etc. This is necessary on an AMD-based system where after a wakeup from STR I see that BSP clock differs from AP clocks by a count that roughly corresponds to one second. The APs are in sync with each other. Not sure if this is a hardware quirk or a firmware bug. This is what I see after a resume with this change: SMP: passed TSC synchronization test after adjustment acpi_timer0: restoring timecounter, ACPI-fast -> TSC-low Reviewed by: kib MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D15551	2018-05-25 07:33:20 +00:00
Warner Losh	fb3bfdf186	Make memmove and bcopy share code Make memmove the primary interface, but have bcopy be an alternative entry point that jumps into memmove. This will slightly pessimize bcopy calls, but those are about to get much rarer. Return dst always, but it will be ignored by bcopy callers. We can remove just the alt entry point if we ever remove bcopy entirely. Differential Revision: https://reviews.freebsd.org/D15374	2018-05-24 21:11:33 +00:00
Brooks Davis	5f77b8a88b	Avoid two suword() calls per auxarg entry. Instead, construct an auxargs array and copy it out all at once. Use an array of Elf_Auxinfo rather than pairs of Elf_Addr * to represent the array. This is the correct type where pairs of words just happened to work. To reduce the size of the diff, AUXARGS_ENTRY is altered to act on this array rather than introducing a new macro. Return errors on copyout() and suword() failures and handle them in the caller. Incidentally fixes AT_RANDOM and AT_EXECFN in 32-bit linux on amd64 which incorrectly used AUXARG_ENTRY instead of AUXARGS_ENTRY_32 (now removed due to the use of proper types). Reviewed by: kib Comments from: emaste, jhb Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15485	2018-05-24 16:25:18 +00:00
Konstantin Belousov	8936419a6c	x86: stop unconditionally clearing PSL_T on the trace trap. We certainly should clear PSL_T when calling the SIGTRAP signal handler, which is already done by all x86 sendsig(9) ABI code. On the other hand, there is no obvious reason why PSL_T needs to be cleared when returning from the signal handler. For instance, Linux allows userspace to set PSL_T and keep tracing enabled for the desired period. There are userspace programs which would use PSL_T if we make it possible, for instance sbcl. Remember if PSL_T was set by PT_STEP or PT_SETSTEP by means of the TDB_STEP flag, and only clear it when the flag is set. Discussed with: Ali Mashtizadeh Reviewed by: jhb (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D15054	2018-05-23 21:39:29 +00:00
Konstantin Belousov	79547c52b8	Stop obliterating actual exception type for emulated traps from vm86. Wording and reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D15054	2018-05-23 21:26:41 +00:00
Konstantin Belousov	61bc50d032	Style. Wording and reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 3 days Differential revision: https://reviews.freebsd.org/D15054	2018-05-23 21:25:49 +00:00
Konstantin Belousov	3ae6b51919	Support IBRS for i386. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D15522	2018-05-23 16:31:46 +00:00
Konstantin Belousov	82a4284d4b	Use local unique labels inside most often used macros. Discussed with: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-05-22 13:45:40 +00:00
Konstantin Belousov	a3c7cd11d2	Fix double-load of %cr3 and double-copy of the stack frame for the kernel entry from userspace vm86. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-05-22 13:30:56 +00:00
Matt Macy	137fd41bd9	fix i386 builds after r334005 and r334009 r334005: add pc_ibpb_set as it is now referenced by common code (although presumably not needed on i386 since it has been there since the first spectre mitigation work on amd64) r334009: there is no amd64 rflags -> i386 eflags	2018-05-22 05:09:33 +00:00
John Baldwin	9e2154ff1c	Cleanups related to debug exceptions on x86. - Add constants for fields in DR6 and the reserved fields in DR7. Use these constants instead of magic numbers in most places that use DR6 and DR7. - Refer to T_TRCTRAP as "debug exception" rather than a "trace trap" as it is not just for trace exceptions. - Always read DR6 for debug exceptions and only clear TF in the flags register for user exceptions where DR6.BS is set. - Clear DR6 before returning from a debug exception handler as recommended by the SDM dating all the way back to the 386. This allows debuggers to determine the cause of each exception. For kernel traps, clear DR6 in the T_TRCTRAP case and pass DR6 by value to other parts of the handler (namely, user_dbreg_trap()). For user traps, wait until after trapsignal to clear DR6 so that userland debuggers can read DR6 via PT_GETDBREGS while the thread is stopped in trapsignal(). Reviewed by: kib, rgrimes MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D15189	2018-05-22 00:45:00 +00:00
Mark Johnston	892bdccca0	Enable kernel dump features in GENERIC for most platforms. This turns on support for kernel dump encryption and compression, and netdump. arm and mips platforms are omitted for now, since they are more constrained and don't benefit as much from these features. Reviewed by: cem, manu, rgrimes Tested by: manu (arm64) Relnotes: yes Differential Revision: https://reviews.freebsd.org/D15465	2018-05-19 19:53:23 +00:00
Ed Maste	891cf3ed44	Use NULL for SYSINIT's last arg, which is a pointer type Sponsored by: The FreeBSD Foundation	2018-05-18 17:58:09 +00:00
Konstantin Belousov	3f3a2d0f8d	Fix PMC_IN_TRAP_HANDLER() for i386 after the 4/4 split. Sponsored by: The FreeBSD Foundation	2018-05-13 20:10:02 +00:00
Konstantin Belousov	a9c53bbb24	Kernel entry from vm86 mode, where PCB_VM86CALL pcb flag is not set, is executed on the right stack already. No copy from the entry stack to the kstack must be performed for vm86 bios call code to function. To access the pcb flags on kernel entry, unconditionally switch to kernel address space if vm86 mode is detected. This fixes very early vm86 bios calls, typically done when boot is performed by boot2 without loader, and kernel falls back to BIOS calls to get SMAP. Reported by: bde Sponsored by: The FreeBSD Foundation	2018-05-12 11:06:59 +00:00
Konstantin Belousov	801bf88ce3	On return from exception or interrupt, returns to vm86 mode with PCB_VM86CALL pcb flag not set should be treated same as return to userspace. Most important, the address space must be switched. This fixes usermode vm86 operations after the 4/4 split. Sponsored by: The FreeBSD Foundation	2018-05-12 11:02:39 +00:00
Konstantin Belousov	507e50d5f9	Initialize tramp_idleptd during cold pmap startup, before the exception code is copied to the trampoline. The correct value is then copied to trampoline automatically, so tramp_idleptd_reloced can be eliminated. This will allow to use the same exception entry code to handle traps from vm86 bios calls on early boot stage, as after the trampoline is configured. Sponsored by: The FreeBSD Foundation	2018-05-12 10:57:34 +00:00
Konstantin Belousov	6652b9d9ea	Create a macro for PIC code which loads %cr3 from tramp_idleptd. Sponsored by: The FreeBSD Foundation	2018-05-12 10:51:50 +00:00
Konstantin Belousov	2017ad1e81	Fix use of the custom TSS on i386 after the 4/4 split. Record common_tssd, the descriptor to be written in GDT to point to the common TSS, before LTR is executed. The LTR instruction sets the loaded descriptor type to 386 TSS busy, which traps on reloads. Sponsored by: The FreeBSD Foundation	2018-05-12 10:48:53 +00:00
Konstantin Belousov	0de8041c8e	Remove dead declaration. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2018-05-11 20:47:45 +00:00
Warner Losh	3429b518c9	Remove unused bcopyb. Differential Revision: https://reviews.freebsd.org/D15374	2018-05-10 02:31:54 +00:00
Konstantin Belousov	053641bb1c	Prepare DB# handler for deferred trigger of watchpoints. Since pop %ss/mov %ss instructions defer all interrupts and exceptions for the next instruction, it is possible that the userspace watchpoint trap executes on the first instruction of the kernel entry for syscall/bpt. In this case, DB# should be treated similarly to NMI: on amd64 we must always load GSBASE even if the trap comes from kernel mode, and load the kernel page table root into %cr3. Moreover, the trap must use the dedicated stack, because we are still on the user stack when trapped on syscall entry. For i386, we must reload %cr3. The syscall instruction is not configured, so there is no issue with executing on user stack when trapping. Due to some CPU errata it is not always possible to detect that the userspace watchpoint triggered by inspecting %dr6. In trap(), compare the trap %rip with the known unsafe entry points and if matched pretend that the watchpoint did not fire at all. Thank you to the MSRC Incident Response Team, and in particular Greg Lenti and Nate Warfield, for coordinating the response to this issue across multiple vendors. Thanks to Computer Recycling at The Working Center of Kitchener for making hardware available to allow us to test the patch on additional CPU families. Reviewed by: jhb Discussed with: Matthew Dillon Tested by: emaste Sponsored by: The FreeBSD Foundation Security: CVE-2018-8897 Security: FreeBSD-SA-18:06.debugreg	2018-05-08 17:00:34 +00:00
Konstantin Belousov	7035cf14ee	Implement support for ifuncs in the kernel linker. Required MD bits are only provided for x86. Reviewed by: jhb (previous version, as part of the larger patch) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D13838	2018-05-03 21:37:46 +00:00


13220 Commits