freebsd-skq

Author	SHA1	Message	Date
Konstantin Belousov	fd8d844f76	amd64 KPTI: add control from procctl(2). Add the infrastructure to allow MD procctl(2) commands, and use it to introduce amd64 PTI control and reporting. PTI mode cannot be modified for existing pmap, the knob controls PTI of the new vmspace created on exec. Requested by: jhb Reviewed by: jhb, markj (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D19514	2019-03-16 11:44:33 +00:00
Konstantin Belousov	7e0a345bc5	Add symbolic name for TSC_AUX MSR address. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-03-15 16:39:05 +00:00
Konstantin Belousov	3dcf329ee5	Add register number, CPUID bits, and print identification for TSX force abort errata. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-03-12 18:59:01 +00:00
John Baldwin	2e43efd0bb	Drop "All rights reserved" from my copyright statements. Reviewed by: rgrimes MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D19485	2019-03-06 22:11:45 +00:00
Konstantin Belousov	a2d95495ee	Add usermode helpers for for Intel userspace protection keys feature. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D18893	2019-02-20 09:56:23 +00:00
Konstantin Belousov	e7a9df16e6	Add kernel support for Intel userspace protection keys feature on Skylake Xeons. See SDM rev. 68 Vol 3 4.6.2 Protection Keys and the description of the RDPKRU and WRPKRU instructions. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D18893	2019-02-20 09:51:13 +00:00
Konstantin Belousov	5671e0d62e	Add definition for %cr4 PKRU enable bit. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 days Differential revision: https://reviews.freebsd.org/D18893	2019-02-19 19:13:48 +00:00
Konstantin Belousov	eb785fab3b	Port sysctl kern.elf32.read_exec from amd64 to i386. Make it more comprehensive on i386, by not setting nx bit for any mapping, not just adding PF_X to all kernel-loaded ELF segments. This is needed for the compatibility with older i386 programs that assume that read access implies exec, e.g. old X servers with hand-rolled module loader. Reported and tested by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-02-07 02:17:34 +00:00
Konstantin Belousov	ccc2d07e77	Update CPUID bits definitions and CPU identification based on changes in SDM rev. 069. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-02-04 23:57:59 +00:00
Konstantin Belousov	9a52756044	i386: Merge PAE and non-PAE pmaps into same kernel. Effectively all i386 kernels now have two pmaps compiled in: one managing PAE pagetables, and another non-PAE. The implementation is selected at cold time depending on the CPU features. The vm_paddr_t is always 64bit now. As result, nx bit can be used on all capable CPUs. Option PAE only affects the bus_addr_t: it is still 32bit for non-PAE configs, for drivers compatibility. Kernel layout, esp. max kernel address, low memory PDEs and max user address (same as trampoline start) are now same for PAE and for non-PAE regardless of the type of page tables used. Non-PAE kernel (when using PAE pagetables) can handle physical memory up to 24G now, larger memory requires re-tuning the KVA consumers and instead the code caps the maximum at 24G. Unfortunately, a lot of drivers do not use busdma(9) properly so by default even 4G barrier is not easy. There are two tunables added: hw.above4g_allow and hw.above24g_allow, the first one is kept enabled for now to evaluate the status on HEAD, second is only for dev use. i386 now creates three freelists if there is any memory above 4G, to allow proper bounce pages allocation. Also, VM_KMEM_SIZE_SCALE changed from 3 to 1. The PAE_TABLES kernel config option is retired. In collaboarion with: pho Discussed with: emaste Reviewed by: markj MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D18894	2019-01-30 02:07:13 +00:00
Konstantin Belousov	957b9bbf3c	x86 busdma: fix mis-use of bus_addr_t where vm_paddr_t is assumed. Right now bus_addr_t and vm_paddr_t are always aliased to the same underlying integer type on x86, which makes the interchange hard to detect. Shortly, i386 kernel would use uint64_t for vm_paddr_t to enable automatic use of PAE paging structures if hardware allows it, while bus_addr_t would be extended to 64bit only when PAE option is specified. Fix all places that were identified as using bus_addr_t while page address was assumed. This was performed by testing the complete PAE merging patch on machine with > 4G of RAM enabled. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D18854	2019-01-18 13:38:56 +00:00
Conrad Meyer	16068ae479	Add definitions for AMD Spectre/Meltdown CPUID information No functional change, aside from printing recognized bits in CPU identification. The bits are documented in 111006-B "Indirect Branch Control Extension"[1] and 124441 "Speculative Store Bypass Disable."[2] Notably missing (left as future work): * Integration with hw.spec_store_bypass_disable and hw_ssb_active flag, which are currently Intel-specific * Integration with hw_ibrs_active global flag, which are currently Intel-specific * SSB_NO integration in hw_ssb_recalculate() * Bhyve integration (PR 235010) [1]: https://developer.amd.com/wp-content/resources/111006-B_AMD64TechnologyIndirectBranchControlExtenstion_WP_7-18Update_FNL.pdf [2]: https://developer.amd.com/wp-content/resources/124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf PR: 235010 (related, but does not fix) MFC after: a week	2019-01-17 19:44:47 +00:00
Ben Widawsky	91890b73ad	Add definitions for Intel Speed Shift These definitions will be used by a driver to implement Hardware P-States (autonomous control of HWP, via Intel Speed Shift technology). Reviewed by: kib Approved by: emaste (mentor) Differential Revision: https://reviews.freebsd.org/D18050	2018-11-21 00:21:58 +00:00
John Baldwin	e13507f6f0	Axe MINIMUM_MSI_INT. Just allow MSI interrupts to always start at the end of the I/O APIC pins. Since existing machines already have more than 255 I/O APIC pins, IRQ 255 is no longer reliably invalid, so just remove the minimum starting value for MSI. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D17991	2018-11-16 23:39:39 +00:00
Konstantin Belousov	2343757338	Align IA32_ARCH_CAP MSR definitions and use with SDM rev. 068. SDM rev. 068 was released yesterday and it contains the description of the MSR 0x10a IA32_ARCH_CAP. This change adds symbolic definitions for all bits present in the document, and decode them in the CPU identification lines printed on boot. But also, the document defines SSB_NO as bit 4, while FreeBSD used but 2 to detect the need to work-around Speculative Store Bypass issue. Change code to use the bit from SDM. Similarly, the document describes bit 3 as an indicator that L1TF issue is not present, in particular, no L1D flush is needed on VMENTRY. We used RDCL_NO to avoid flushing, and again I changed the code to follow new spec from SDM. In fact my Apollo Lake machine with latest ucode shows this: IA32_ARCH_CAPS=0x19<RDCL_NO,SKIP_L1DFL_VME,SSB_NO> Reviewed by: bwidawsk Sponsored by: The FreeBSD Foundation MFC after: 3 days Differential revision: https://reviews.freebsd.org/D18006	2018-11-16 21:27:11 +00:00
John Baldwin	b6b42932db	Convert the number of MSI IRQs on x86 from a constant to a tunable. The number of MSI IRQs still defaults to 512, but it can now be changed at boot time via the machdep.num_msi_irqs tunable. Reviewed by: kib, royger (older version) Reviewed by: markj MFC after: 1 month Relnotes: yes Differential Revision: https://reviews.freebsd.org/D17977	2018-11-15 18:37:41 +00:00
Konstantin Belousov	83813c6696	Apply fix to un-cripple max cpu id on BSP earlier. We need to know actual value for the standard extended features before ifuncs are resolved. Reported and tested by: madpilot Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-11-12 19:17:26 +00:00
Brooks Davis	c3adaa3305	Consolidate identical ELF auxargs type defintions. All platforms except powerpc use the same values and powerpc shares a majority of them. Go ahead and declare AT_NOTELF, AT_UID, and AT_EUID in favor of the unused AT_DCACHEBSIZE, AT_ICACHEBSIZE, and AT_UCACHEBSIZE for powerpc. Reviewed by: jhb, imp Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D17397	2018-10-22 22:24:32 +00:00
Konstantin Belousov	632227a739	Update x86/ifunc.h. Remove ifunc emulation. Add helper for usermode ifunc resolver definition. Update copyright years. Sponsored by: The FreeBSD Foundation Approved by: re (rgrimes) MFC after: 1 week	2018-09-30 16:57:30 +00:00
Mark Johnston	d5089b3aed	Log a message after a successful boot-time microcode update. Reviewed by: kib Approved by: re (delphij) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D17135	2018-09-14 17:04:36 +00:00
John Baldwin	fd036deac1	Dynamically allocate IRQ ranges on x86. Previously, x86 used static ranges of IRQ values for different types of I/O interrupts. Interrupt pins on I/O APICs and 8259A PICs used IRQ values from 0 to 254. MSI interrupts used a compile-time-defined range starting at 256, and Xen event channels used a compile-time-defined range after MSI. Some recent systems have more than 255 I/O APIC interrupt pins which resulted in those IRQ values overflowing into the MSI range triggering an assertion failure. Replace statically assigned ranges with dynamic ranges. Do a single pass computing the sizes of the IRQ ranges (PICs, MSI, Xen) to determine the total number of IRQs required. Allocate the interrupt source and interrupt count arrays dynamically once this pass has completed. To minimize runtime complexity these arrays are only sized once during bootup. The PIC range is determined by the PICs present in the system. The MSI and Xen ranges continue to use a fixed size, though this does make it possible to turn the MSI range size into a tunable in the future. As a result, various places are updated to use dynamic limits instead of constants. In addition, the vmstat(8) utility has been taught to understand that some kernels may treat 'intrcnt' and 'intrnames' as pointers rather than arrays when extracting interrupt stats from a crashdump. This is determined by the presence (vs absence) of a global 'nintrcnt' symbol. This change reverts r189404 which worked around a buggy BIOS which enumerated an I/O APIC twice (using the same memory mapped address for both entries but using an IRQ base of 256 for one entry and a valid IRQ base for the second entry). Making the "base" of MSI IRQ values dynamic avoids the panic that r189404 worked around, and there may now be valid I/O APICs with an IRQ base above 256 which this workaround would incorrectly skip. If in the future the issue reported in PR 130483 reoccurs, we will have to add a pass over the I/O APIC entries in the MADT to detect duplicates using the memory mapped address and use some strategy to choose the "correct" one. While here, reserve room in intrcnts for the Hyper-V counters. PR: 229429, 130483 Reviewed by: kib, royger, cem Tested by: royger (Xen), kib (DMAR) Approved by: re (gjb) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D16861	2018-08-28 21:09:19 +00:00
John Baldwin	a800b45c18	Merge amd64 and i386 <machine/intr_machdep.h> headers. Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D16803	2018-08-20 12:31:39 +00:00
John Baldwin	a568818913	Remove some vestiges of IPI_LAZYPMAP on i386. The support for lazy pmap invalidations on i386 was removed in r281707. This removes the constant for the IPI and stops accounting for it when sizing the interrupt count arrays. Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D16801	2018-08-19 16:14:59 +00:00
Konstantin Belousov	8d32b46379	Add definitions related to the L1D flush operation capability and MSR. Sponsored by: The FreeBSD Foundation	2018-08-14 17:19:11 +00:00
Mark Johnston	97edfc1b45	Implement kernel support for early loading of Intel microcode updates. Updates in the format described in section 9.11 of the Intel SDM can now be applied as one of the first steps in booting the kernel. Updates that are loaded this way are automatically re-applied upon exit from ACPI sleep states, in contrast with the existing cpucontrol(8)-based method. For the time being only Intel updates are supported. Microcode update files are passed to the kernel via loader(8). The file type must be "cpu_microcode" in order for the file to be recognized as a candidate microcode update. Updates for multiple CPU types may be concatenated together into a single file, in which case the kernel will select and apply a matching update. Memory used to store the update file will be freed back to the system once the update is applied, so this approach will not consume more memory than required. Reviewed by: kib MFC after: 6 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16370	2018-08-13 17:13:09 +00:00
Mark Johnston	a18e40aad4	Use the existing MSR_BIOS_SIGN on AMD. Reported by: kib Sponsored by: The FreeBSD Foundation	2018-07-13 20:56:20 +00:00
Mark Johnston	5612bb23d0	Define the MSR used to fetch the current microcode patch level on AMD. It is defined in the AMD family 17h register reference. MFC after: 3 days Sponsored by: The FreeBSD Foundation	2018-07-13 19:42:59 +00:00
Konstantin Belousov	300c34e431	Add a name for the MSR controlling standard extended features report on AMD. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2018-07-05 10:44:18 +00:00
Konstantin Belousov	fe15b8543e	Order the portion of the AMD-specific MSRs names definitions numerically. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2018-07-05 10:34:01 +00:00
Konstantin Belousov	7705dd4df0	Provide a helper function acpi_get_fadt_bootflags() to fetch the FADT x86 boot flags. Reviewed by: royger Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D16004 MFC after: 1 week	2018-06-25 11:01:12 +00:00
John Baldwin	9e2154ff1c	Cleanups related to debug exceptions on x86. - Add constants for fields in DR6 and the reserved fields in DR7. Use these constants instead of magic numbers in most places that use DR6 and DR7. - Refer to T_TRCTRAP as "debug exception" rather than a "trace trap" as it is not just for trace exceptions. - Always read DR6 for debug exceptions and only clear TF in the flags register for user exceptions where DR6.BS is set. - Clear DR6 before returning from a debug exception handler as recommended by the SDM dating all the way back to the 386. This allows debuggers to determine the cause of each exception. For kernel traps, clear DR6 in the T_TRCTRAP case and pass DR6 by value to other parts of the handler (namely, user_dbreg_trap()). For user traps, wait until after trapsignal to clear DR6 so that userland debuggers can read DR6 via PT_GETDBREGS while the thread is stopped in trapsignal(). Reviewed by: kib, rgrimes MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D15189	2018-05-22 00:45:00 +00:00
Konstantin Belousov	3621ba1ede	Add Intel Spec Store Bypass Disable control. Speculative Store Bypass (SSB) is a speculative execution side channel vulnerability identified by Jann Horn of Google Project Zero (GPZ) and Ken Johnson of the Microsoft Security Response Center (MSRC) https://bugs.chromium.org/p/project-zero/issues/detail?id=1528. Updated Intel microcode introduces a MSR bit to disable SSB as a mitigation for the vulnerability. Introduce a sysctl hw.spec_store_bypass_disable to provide global control over the SSBD bit, akin to the existing sysctl that controls IBRS. The sysctl can be set to one of three values: 0: off 1: on 2: auto Future work will enable applications to control SSBD on a per-process basis (when it is not enabled globally). SSBD bit detection and control was verified with prerelease microcode. Security: CVE-2018-3639 Tested by: emaste (previous version, without updated microcode) Sponsored by: The FreeBSD Foundation MFC after: 3 days	2018-05-21 21:08:19 +00:00
Konstantin Belousov	9be4bbbb21	Add definition for Intel Speculative Store Bypass Disable MSR bits Security: CVE-2018-3639 Sponsored by: The FreeBSD Foundation MFC after: 3 days	2018-05-21 21:07:13 +00:00
Konstantin Belousov	d5effb01f1	Add helper macros to hide some boring repeatable ceremonies to define ifuncs on x86. Also keep helpers to define 'pseudo-ifuncs' which are emulated by the indirect jmp. Reviewed by: jhb (previous version, as part of the larger patch) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D13838	2018-05-03 21:45:59 +00:00
Konstantin Belousov	986c4ca387	Turn off IBRS on suspend. Resume starts CPU from the init state, which clears any loaded microcode updates. As result, IBRS MSRs are no longer available, until the microcode is reloaded. I have to forcibly clear cpu_stdext_feature3, which assumes that CPUID leaf 7 reg %ebx does not report anything except Meltdown/Spectre bugs bits. If future CPUs add new bits there, hw_ibrs_recalculate() and identify_cpu1()/identify_cpu2() need to be adjusted for that. Submitted and tested by: Michael Danilov <mike.d.ft402@gmail.com> PR: 227866 Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D15236	2018-04-30 20:18:32 +00:00
Roger Pau Monné	9dba82a442	x86: improve reservation of AP trampoline memory So that it doesn't rely on physmap[1] containing an address below 1MiB. Instead scan the full physmap and search for a suitable address to place the trampoline code (below 1MiB) and the initial memory pages (below 4GiB). Sponsored by: Citrix Systems R&D Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D14878	2018-04-05 14:39:51 +00:00
John Baldwin	d41e41f9f0	Remove very old and unused signal information codes. These have been supplanted by the MI signal information codes in <sys/signal.h> since 7.0. The FPE_*_TRAP ones were deprecated even earlier in 1999. PR: 226579 (exp-run) Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D14637	2018-03-27 20:57:51 +00:00
Konstantin Belousov	8fbcc3343f	Move the CR0.WP manipulation KPI to x86. This should allow to avoid some #ifdefs in the common x86/ code. Requested by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-03-20 20:20:49 +00:00
John Baldwin	7af5f2acfb	Fix a typo. Reviewed by: kib	2018-03-19 17:14:56 +00:00
Ed Maste	315fbaeca2	Correct pseudo misspelling in sys/ comments contrib code and #define in intel_ata.h unchanged.	2018-02-23 18:15:50 +00:00
Warner Losh	ef1fcaf0f5	Do not include float interfaces when using libsa. We don't support float in the boot loaders, so don't include interfaces for float or double in systems headers. In addition, take the unusual step of spiking double and float to prevent any more accidental seepage.	2018-02-23 04:04:25 +00:00
Konstantin Belousov	c688c9051b	Fix build with gas. Do not use C constant suffixes. Bit values are small enough to not require typing, despite they are used for 64bit MSR writes. The added cast in hw_ibrs_recalculate() is redundand but I prefer to add it for clarity. Reported by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-02-13 15:30:31 +00:00
Warner Losh	62bca77843	Move __va_list and related defines to sys/sys/_types.h __va_list and related defines are identical in all the ARCH/include/_types.h files. Move them to sys/sys/_types.h Sponsored by: Netflix	2018-02-12 14:48:20 +00:00
Konstantin Belousov	319117fd57	IBRS support, AKA Spectre hardware mitigation. It is coded according to the Intel document 336996-001, reading of the patches posted on lkml, and some additional consultations with Intel. For existing processors, you need a microcode update which adds IBRS CPU features, and to manually enable it by setting the tunable/sysctl hw.ibrs_disable to 0. Current status can be checked in sysctl hw.ibrs_active. The mitigation might be inactive if the CPU feature is not patched in, or if CPU reports that IBRS use is not required, by IA32_ARCH_CAP_IBRS_ALL bit. Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D14029	2018-01-31 14:36:27 +00:00
Konstantin Belousov	c8f9c1f3d9	Use PCID to optimize PTI. Use PCID to avoid complete TLB shootdown when switching between user and kernel mode with PTI enabled. I use the model close to what I read about KAISER, user-mode PCID has 1:1 correspondence to the kernel-mode PCID, by setting bit 11 in PCID. Full kernel-mode TLB shootdown is performed on context switches, since KVA TLB invalidation only works in the current pmap. User-mode part of TLB is flushed on the pmap activations as well. Similarly, IPI TLB shootdowns must handle both kernel and user address spaces for each address. Note that machines which implement PCID but do not have INVPCID instructions, cause the usual complications in the IPI handlers, due to the need to switch to the target PCID temporary. This is racy, but because for PCID/no-INVPCID we disable the interrupts in pmap_activate_sw(), IPI handler cannot see inconsistent state of CPU PCID vs PCPU pmap/kcr3/ucr3 pointers. On the other hand, on kernel/user switches, CR3_PCID_SAVE bit is set and we do not clear TLB. I can imagine alternative use of PCID, where there is only one PCID allocated for the kernel pmap. Then, there is no need to shootdown kernel TLB entries on context switch. But copyout(3) would need to either use method similar to proc_rwmem() to access the userspace data, or (in reverse) provide a temporal mapping for the kernel buffer into user mode PCID and use trampoline for copy. Reviewed by: markj (previous version) Tested by: pho Discussed with: alc (some aspects) Sponsored by: The FreeBSD Foundation MFC after: 3 weeks Differential revision: https://reviews.freebsd.org/D13985	2018-01-27 11:49:37 +00:00
Ed Maste	b3327f62f0	Enable KPTI by default on amd64 for non-AMD CPUs Kernel Page Table Isolation (KPTI) was introduced in r328083 as a mitigation for the 'Meltdown' vulnerability. AMD CPUs are not affected, per https://www.amd.com/en/corporate/speculative-execution: We believe AMD processors are not susceptible due to our use of privilege level protections within paging architecture and no mitigation is required. Thus default KPTI to off for AMD CPUs, and to on for others. This may be refined later as we obtain more specific information on the sets of CPUs that are and are not affected. Submitted by: Mitchell Horne Reviewed by: cem Relnotes: Yes Security: CVE-2017-5754 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D13971	2018-01-19 15:42:34 +00:00
Konstantin Belousov	bd50262f70	PTI for amd64. The implementation of the Kernel Page Table Isolation (KPTI) for amd64, first version. It provides a workaround for the 'meltdown' vulnerability. PTI is turned off by default for now, enable with the loader tunable vm.pmap.pti=1. The pmap page table is split into kernel-mode table and user-mode table. Kernel-mode table is identical to the non-PTI table, while usermode table is obtained from kernel table by leaving userspace mappings intact, but only leaving the following parts of the kernel mapped: kernel text (but not modules text) PCPU GDT/IDT/user LDT/task structures IST stacks for NMI and doublefault handlers. Kernel switches to user page table before returning to usermode, and restores full kernel page table on the entry. Initial kernel-mode stack for PTI trampoline is allocated in PCPU, it is only 16 qwords. Kernel entry trampoline switches page tables. then the hardware trap frame is copied to the normal kstack, and execution continues. IST stacks are kept mapped and no trampoline is needed for NMI/doublefault, but of course page table switch is performed. On return to usermode, the trampoline is used again, iret frame is copied to the trampoline stack, page tables are switched and iretq is executed. The case of iretq faulting due to the invalid usermode context is tricky, since the frame for fault is appended to the trampoline frame. Besides copying the fault frame and original (corrupted) frame to kstack, the fault frame must be patched to make it look as if the fault occured on the kstack, see the comment in doret_iret detection code in trap(). Currently kernel pages which are mapped during trampoline operation are identical for all pmaps. They are registered using pmap_pti_add_kva(). Besides initial registrations done during boot, LDT and non-common TSS segments are registered if user requested their use. In principle, they can be installed into kernel page table per pmap with some work. Similarly, PCPU can be hidden from userspace mapping using trampoline PCPU page, but again I do not see much benefits besides complexity. PDPE pages for the kernel half of the user page tables are pre-allocated during boot because we need to know pml4 entries which are copied to the top-level paging structure page, in advance on a new pmap creation. I enforce this to avoid iterating over the all existing pmaps if a new PDPE page is needed for PTI kernel mappings. The iteration is a known problematic operation on i386. The need to flush hidden kernel translations on the switch to user mode make global tables (PG_G) meaningless and even harming, so PG_G use is disabled for PTI case. Our existing use of PCID is incompatible with PTI and is automatically disabled if PTI is enabled. PCID can be forced on only for developer's benefit. MCE is known to be broken, it requires IST stack to operate completely correctly even for non-PTI case, and absolutely needs dedicated IST stack because MCE delivery while trampoline did not switched from PTI stack is fatal. The fix is pending. Reviewed by: markj (partially) Tested by: pho (previous version) Discussed with: jeff, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2018-01-17 11:44:21 +00:00
Konstantin Belousov	e8c770a66e	Enumerate and print Intel CPU features for Speculative Execution Side Channel Mitigations. The definitions are taken from the document 336996-001. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-01-14 12:36:23 +00:00
Conrad Meyer	233933cb00	amd64: Add a 48-bit MAXADDR constant Some devices (e.g., ccp(4) -- to be committed) can only access the low 48 bits of physical memory. Reviewed by: markj Sponsored by: Dell EMC Isilon	2018-01-13 17:55:22 +00:00
Jeff Roberson	6f4acaf4c9	Add support for NUMA domains to bus dma tags. This causes all memory allocated with a tag to come from the specified domain if it meets the other constraints provided by the tag. Automatically create a tag at the root of each bus specifying the domain local to that bus if available. Reviewed by: jhb, kib Tested by: pho Sponsored by: Netflix, Dell/EMC Isilon Differential Revision: https://reviews.freebsd.org/D13545	2018-01-12 23:34:16 +00:00

1 2 3 4 5

240 Commits