The current mitigation for L1TF in bhyve flushes L1D either with an
explicit WRMSR of the flush command MSR, or by software reading enough
uninteresting data to fully populate all lines of L1D. If an NMI occurs
after either method completes, but before VM entry, L1D becomes polluted
with the cache lines touched by the NMI handler. The NMI handler itself
accesses no interesting data, but something sensitive might be co-located
on the same cache line, and then L1TF exposes it to a rogue guest.
Use the VM entry MSR load list to make the L1D flush atomic with VM
entry when updated microcode is loaded. If only the software flush
method is available, help the bhyve software flusher by also flushing
L1D on NMI exit to kernel mode.
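A minimal sketch of the MSR-load-list idea follows; the entry layout is
taken from the Intel SDM, while the structure, helper and variable names
are illustrative and not the committed bhyve code:

/*
 * Sketch only: append an IA32_FLUSH_CMD = L1D_FLUSH entry to the
 * VM-entry MSR-load area, so the flush happens as part of VM entry
 * itself and no NMI can run between the flush and the entry.
 */
#include <stdint.h>

struct vmx_msr_entry {			/* layout per the Intel SDM */
	uint32_t	index;		/* MSR number */
	uint32_t	reserved;
	uint64_t	value;		/* value loaded on VM entry */
};

#define	MSR_IA32_FLUSH_CMD	0x10b
#define	IA32_FLUSH_CMD_L1D	0x1

static void
vmx_add_l1d_flush_msr(struct vmx_msr_entry *load_area, uint32_t *count)
{
	load_area[*count].index = MSR_IA32_FLUSH_CMD;
	load_area[*count].reserved = 0;
	load_area[*count].value = IA32_FLUSH_CMD_L1D;
	(*count)++;
	/*
	 * The updated count must also be written to the VMCS
	 * VM-entry MSR-load count field, and the area address to the
	 * VM-entry MSR-load address field.
	 */
}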
Suggested by and discussed with: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D16790
- In configurations with a pseudo devices section, move 'device crypto'
into that section.
- Use a consistent comment. Note that other things common in kernel
configs such as GELI also require 'device crypto', not just IPSEC.
Reviewed by: rgrimes, cem, imp
Differential Revision: https://reviews.freebsd.org/D16775
Ensure that valid PCID state is created for the proc0 pmap, since it
might be used by efirt enter() before the first context switch on the BSP.
Sponsored by: The FreeBSD Foundation
MFC after: 6 days
On guest entry in bhyve, flush the L1 data cache, using either the L1D
flush command MSR if available, or by reading enough uninteresting data
to fill the whole cache.
The flush is automatically enabled on CPUs which do not report RDCL_NO,
and can be disabled with the hw.vmm.l1d_flush tunable/kenv.
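A rough sketch of the two methods, with illustrative names and a buffer
size chosen only to comfortably exceed current L1D sizes; the committed
code differs in detail:

#include <sys/param.h>
#include <machine/cpufunc.h>		/* wrmsr() */

#define	MSR_IA32_FLUSH_CMD	0x10b
#define	IA32_FLUSH_CMD_L1D	0x1
#define	L1D_FLUSH_BUF_SIZE	(64 * 1024)

static char l1d_flush_buf[L1D_FLUSH_BUF_SIZE];

static void
l1d_flush_sketch(int have_flush_cmd)
{
	volatile char sink;
	int i;

	if (have_flush_cmd) {
		/* Architectural flush via the L1D flush command MSR. */
		wrmsr(MSR_IA32_FLUSH_CMD, IA32_FLUSH_CMD_L1D);
		return;
	}
	/* Fallback: read enough data to displace every L1D line. */
	for (i = 0; i < L1D_FLUSH_BUF_SIZE; i += 64)
		sink = l1d_flush_buf[i];
}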
Security: CVE-2018-3646
Reviewed by:	emaste, jhb, Tony Luck <tony.luck@intel.com>
Sponsored by: The FreeBSD Foundation
We always zero the invalidated PTE/PDE for superpages, which means that
the L1TF CPU vulnerability (CVE-2018-3620) can only be used for reading
from the physical page at address zero.
Note that both i386 and amd64 exclude that page from the phys_avail[]
array, so this change is redundant, but I believe that phys_avail[] on
UEFI boots is not required to exclude it. Eventually the blacklisting
should be skipped on CPUs which report that they are not vulnerable to
L1TF.
Reviewed by:	emaste, jhb
Sponsored by: The FreeBSD Foundation
curpmap.
When performing a context switch on a machine without PCID, if the
current %cr3 equals the new pmap's %cr3, which is typical for
kernel_pmap vs. a kernel process, I neglected to update the PCPU
curpmap value. Remove the check for %cr3 differing from pm_cr3 when
deciding whether to do the update; it is believed that this case cannot
happen at all, due to other changes in this revision.
Also, do not set the very first curpmap to kernel_pmap; it should be
the vmspace0 pmap instead, to match curproc.
Move the common code that activates the initial pmap, both on the BSP
and on APs, into the pmap_activate_boot() helper.
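The non-PCID path now looks roughly like the sketch below; this is a
simplification, the real pmap_activate_sw() also handles PCID, PTI and
the IPI interlocks:

#include <sys/param.h>
#include <sys/pcpu.h>
#include <sys/proc.h>
#include <vm/vm.h>
#include <vm/pmap.h>
#include <vm/vm_map.h>
#include <machine/cpufunc.h>

static void
pmap_activate_sw_sketch(struct thread *td)
{
	pmap_t pmap;

	pmap = vmspace_pmap(td->td_proc->p_vmspace);
	/* Update curpmap unconditionally, even when %cr3 does not change. */
	PCPU_SET(curpmap, pmap);
	if (pmap->pm_cr3 != rcr3())
		load_cr3(pmap->pm_cr3);
}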
Reported by: eadler, ambrisko
Discussed with: kevans
Reviewed by: alc, markj (previous version)
Tested by: ambrisko (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D16618
Updates in the format described in section 9.11 of the Intel SDM can
now be applied as one of the first steps in booting the kernel. Updates
that are loaded this way are automatically re-applied upon exit from
ACPI sleep states, in contrast with the existing cpucontrol(8)-based
method. For the time being only Intel updates are supported.
Microcode update files are passed to the kernel via loader(8). The
file type must be "cpu_microcode" in order for the file to be recognized
as a candidate microcode update. Updates for multiple CPU types may be
concatenated together into a single file, in which case the kernel
will select and apply a matching update. Memory used to store the
update file will be freed back to the system once the update is applied,
so this approach will not consume more memory than required.
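For illustration, a hedged sketch of the per-update match test implied
by the SDM header layout; extended signature tables are ignored here and
the function name is made up:

#include <stdbool.h>
#include <stdint.h>

struct intel_ucode_header {		/* per SDM section 9.11 */
	uint32_t header_version;
	uint32_t update_revision;
	uint32_t date;			/* BCD mmddyyyy */
	uint32_t processor_signature;	/* CPUID(1).EAX of the target CPU */
	uint32_t checksum;
	uint32_t loader_revision;
	uint32_t processor_flags;	/* one bit per platform ID */
	uint32_t data_size;		/* 0 means 2000 bytes */
	uint32_t total_size;		/* 0 means 2048 bytes */
	uint32_t reserved[3];
};

static bool
ucode_intel_matches(const struct intel_ucode_header *hdr,
    uint32_t cpu_signature, uint32_t platform_id)
{
	return (hdr->processor_signature == cpu_signature &&
	    (hdr->processor_flags & (1u << platform_id)) != 0);
}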
Reviewed by: kib
MFC after: 6 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16370
efi_enter was needed here because dereferencing efi_runtime faults
outside of the EFI context, since the runtime table lives in runtime
service space. This may cause problems early in boot, though, so
instead access the table by converting its paddr to a KVA.
While here, remove the other direct PHYS_TO_DMAP calls and the explicit DMAP
requirement from efidev.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D16591
This patch adds a new sysctl(8) knob, "security.jail.vmm_allowed",
which is disabled by default.
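A hedged sketch of the check a vmm device method might perform;
PR_ALLOW_VMM is assumed here to be the prison flag backing the sysctl:

#include <sys/param.h>
#include <sys/errno.h>
#include <sys/jail.h>
#include <sys/proc.h>

static int
vmm_jail_check_sketch(struct thread *td)
{
	/* Deny access from jails unless the jail opted in. */
	if (jailed(td->td_ucred) &&
	    !prison_allow(td->td_ucred, PR_ALLOW_VMM))
		return (EPERM);
	return (0);
}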
Submitted by: Shawn Webb <shawn.webb____hardenedbsd.org>
Reviewed by: jamie@ and myself.
Relnotes: Yes.
Sponsored by: HardenedBSD and G2, Inc.
Differential Revision: https://reviews.freebsd.org/D16057
As noted in UPDATING, the new loader tunable efi.rt_disabled may be used
to disable EFIRT at runtime. It should have no effect if the system was
not booted via UEFI.
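For example, a sketch of how such a tunable is typically consumed; the
variable and function names are illustrative, only the tunable name
comes from this change:

#include <sys/param.h>
#include <sys/kernel.h>

static int efirt_disabled;

static int
efirt_enabled_sketch(void)
{
	/* Honor the loader tunable before touching EFI runtime state. */
	TUNABLE_INT_FETCH("efi.rt_disabled", &efirt_disabled);
	return (efirt_disabled == 0);
}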
MFC after: 6 weeks
Ifunc selectors dispatch the copyin(9) family to the suitable variant,
which sets rflags.AC around userspace access. The rflags.AC bit is
cleared unconditionally in all kernel entry points, even on machines
not supporting SMAP.
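A sketch of the mechanism using the plain compiler ifunc attribute; the
committed code uses the kernel's DEFINE_IFUNC wrapper and the real
SMAP/non-SMAP copyin implementations, so the names below are
illustrative:

#include <stddef.h>

int copyin_smap(const void *uaddr, void *kaddr, size_t len);
int copyin_nosmap(const void *uaddr, void *kaddr, size_t len);

extern unsigned int cpu_stdext_feature;	/* CPUID leaf 7 EBX */
#define	CPUID_STDEXT_SMAP	0x00100000	/* bit 20 */

static int (*copyin_resolver(void))(const void *, void *, size_t)
{
	/* Pick the variant once, at ifunc resolution time. */
	return ((cpu_stdext_feature & CPUID_STDEXT_SMAP) != 0 ?
	    copyin_smap : copyin_nosmap);
}

int copyin(const void *uaddr, void *kaddr, size_t len)
    __attribute__((ifunc("copyin_resolver")));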
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D13838
There is no difference between VM_FREELIST_ISADMA and VM_FREELIST_LOWMEM
except for the default boundary (16MB on x86 and 256MB on MIPS); they
are otherwise the same. We don't need both for any system we support
(there were some really old ARC systems that did have an ISA/EISA bus,
but we never ran on them and they are too old to ever grow support).
Differential Revision: https://reviews.freebsd.org/D16290
Do not use vm_map_remove() to release KVA back to the system. Because
kernel map entries do not have an associated VM object, with r336030
the vm_map_remove() call will not update the kernel page tables. Avoid
relying on the vm_map layer and instead update the pmap and release KVA
to the kernel arena directly in kmem_bootstrap_free().
Because the pmap updates will generally result in superpage demotions,
modify pmap_init() to insert PTPs shadowed by superpage mappings into
the kernel pmap's radix tree.
While here, port r329171 to i386.
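The core of the approach, as a hedged sketch; the committed
kmem_bootstrap_free() does additional bookkeeping beyond this:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/vmem.h>
#include <vm/vm.h>
#include <vm/pmap.h>
#include <vm/vm_extern.h>
#include <vm/vm_kern.h>

static void
kva_release_sketch(vm_offset_t start, vm_size_t size)
{
	vm_offset_t end;

	end = round_page(start + size);
	start = trunc_page(start);
	/* No VM object backs this KVA, so unmap directly in the pmap... */
	pmap_remove(kernel_pmap, start, end);
	/* ...and hand the address range back to the kernel KVA arena. */
	vmem_free(kernel_arena, start, end - start);
}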
Reported by: alc
Reviewed by: alc, kib
X-MFC with: r336505
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16426
the AMD document 55449 'Revision Guide for AMD Family 17h Models
00h-0Fh Processors' rev 1.12.
The errata numbers are mentioned near each action.
It seems that newer BIOSes already include the required chicken-bit
settings, so the magic MSR updates are only needed when the BIOS cannot
be updated. On the other hand, MWAIT avoidance seems to be important.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
code never sees FPU pcb flags not consistent with the hardware state.
This was exposed by the eager FPU switch mode.
Analyzed, reviewed and tested by: gleb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
On i386 and amd64, add a vm_phys segment for physical memory used to
store the kernel binary and other preloaded data. This makes it
possible to free such memory back to the system once it is no longer
needed, e.g., when a preloaded kernel module is unloaded. Previously,
it would have remained unused.
Reviewed by: kib, royger
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16330
In order to set up an initial environment and jump into the generic
hammer_time initialization function. Some of the code is shared with
PVHv1, while other code is PVHv2 specific.
This allows booting FreeBSD as a PVHv2 DomU and Dom0.
Sponsored by: Citrix Systems R&D
The PVHv2 entry point is fairly similar to the multiboot1 one. The
kernel is started in protected mode with paging disabled. More
information about the exact BSP state can be found in the pvh.markdown
document in the Xen tree.
This entry point is going to be joined with the native entry point at
hammer_time, and in order to do so the BSP needs to be bootstrapped
into long mode with the same set of page tables as used on bare metal.
Sponsored by: Citrix Systems R&D
This restores counter(9) operation.
Revert r336024. Improve the assertion on the pcpu size on x86.
Reviewed by: mmacy
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D16163
Due to the way rtld creates mappings for shared objects, each dso
causes at least three guard map entries to be unmapped. For instance,
under a buildworld load, this change reduces the number of
pmap_remove() calls by 1/5.
Profiled by: alc
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D16148
SMP systems by extending defined(SMP) to include defined(KLD_MODULE).
This is a regression from r335873.
Discussed with: mmacy@
Sponsored by: Mellanox Technologies
Apply a temporary fix to counter(9) until daylight hours.
The fact that the counter_u64_add assembly relied on sizeof(struct pcpu)
as the basis for its otherwise arbitrary offset never came up in D15933.
critical_{enter,exit} is now inline, so the only real added overhead is
the (mostly false) conditional branch in exit.
- Change pcpu zone consumers to use a stride size of PAGE_SIZE.
(defined as UMA_PCPU_ALLOC_SIZE to make future identification easier)
- Allocate page from the correct domain for a given cpu.
- Don't initialize pc_domain to a non-zero value if NUMA is not defined
There are some misconceptions surrounding this field. It is the
_VM_ NUMA domain and should only ever correspond to valid domain
values as understood by the VM.
The former slab size of sizeof(struct pcpu) was somewhat arbitrary.
The new value is PAGE_SIZE because that's the smallest granularity at
which the VM can allocate a slab for a given domain. If you have fewer
than PAGE_SIZE/8 counters on your system, some memory will be wasted,
but this is obviously something where you want the cache line to come
from the correct domain.
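A sketch of the resulting addressing scheme, similar in spirit to the
zpcpu_get() accessor; the helper name here is illustrative:

#include <sys/param.h>
#include <sys/pcpu.h>

/*
 * A counter allocated from the pcpu zone points into CPU 0's page;
 * every other CPU's copy lives at a fixed UMA_PCPU_ALLOC_SIZE
 * (== PAGE_SIZE) stride from that base.
 */
static inline uint64_t *
counter_cpu_ptr_sketch(uint64_t *base, int cpuid)
{
	return ((uint64_t *)((char *)base +
	    (uintptr_t)cpuid * UMA_PCPU_ALLOC_SIZE));
}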
Reviewed by: jeff
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15933
It is possible that a fictitious unmanaged userspace mapping of a
superpage is created on x86, e.g. by pmap_object_init_pt(), with a
physical address outside the vm_page_array[] coverage.
Noted and reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D16085
physical address, which is readily available after a successful
vm_page_pa_tryrelock().
Noted and reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D16085
mapping, then it leaks the unlinked PV entry. This change eliminates that
leak, freeing the PV entry.
Reviewed by: kib, markj
X-MFC with: r335784
Differential Revision: https://reviews.freebsd.org/D16130
returning NULL.
vm_fault_quick_hold_pages() can be legitimately called on userspace
mappings backed by fictitious pages created by unmanaged device and sg
pagers.
Note that other architectures' pmap_extract_and_hold() might need a
similar fix, but I postponed that examination.
Reported by: bde
Discussed with: alc
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D16085
The ADD, AND, OR, and SUB instructions take at most a 32-bit
sign-extended immediate operand. 64-bit constants that do not fit into
that constraint need to be loaded into a register. The 'i' constraint
tells the compiler it can pass any integer constant to the assembler,
whereas the 'e' constraint only permits constants that fit into a 32-bit
sign-extended value. This fixes using
atomic_add/clear/set/subtract_long/64 with constants that do not fit into
a 32-bit sign-extended immediate.
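A reduced illustration of the constraint change; the real atomic(9)
macros are more elaborate than this sketch:

#include <stdint.h>

static inline void
atomic_add_64_sketch(volatile uint64_t *p, uint64_t v)
{
	/*
	 * "e" accepts only immediates that fit a 32-bit sign-extended
	 * operand; anything larger is forced into a register by "r".
	 * The previous "i" constraint allowed the compiler to emit an
	 * immediate that ADDQ cannot encode.
	 */
	__asm __volatile("lock; addq %1,%0"
	    : "+m" (*p)
	    : "er" (v)
	    : "cc");
}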
Reported by: several folks
Tested by: Pete Wright <pete@nomadlogic.org>
MFC after: 2 weeks
- inline atomics in modules on i386 and amd64 (they were always
inline on other arches)
- allow modules to opt in to inlining locks by specifying
MODULE_TIED=1 in the makefile
Reviewed by: kib
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16079
Doing so ensures that all threads sharing the pmap have a consistent
view of the mapping. This fixes the problem described in the commit
log messages for r329254 without the overhead of an extra fault in the
common case. Once other pmap_enter() implementations are similarly
modified, the workaround added in r329254 can be removed, reducing the
overhead of CoW faults.
With this change we can reuse the PV entry from the old mapping,
potentially avoiding a call to reclaim_pv_chunk(). Otherwise, there is
nothing preventing the old PV entry from being reclaimed. In rare
cases this could result in the PTE's page table page being freed,
leading to a use-after-free of the page when the updated PTE is written
following the allocation of the PV entry for the new mapping.
Reported and tested by: pho
Reviewed by: alc, kib
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D16005
without error code. Doing so mis-aligned the stack.
Since the only consumer of SSE instructions with alignment
requirements is the AES-NI module, and since the FPU context cannot be
accessed in interrupts, the only situation where the alignment matters
is compat32 syscalls, as reported in the PR.
PR: 229222
Reported and tested by: dewayne@heuristicsystems.com.au
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The call to reclaim_pv_chunk() in reserve_pv_entries() may free a
PV chunk with free entries belonging to the current pmap. In this
case we must account for the free entries that were reclaimed, or
reserve_pv_entries() may return without having reserved the requested
number of entries.
Reviewed by: alc, kib
Tested by: pho (previous version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D15911
The Linux compatibility code was converting the version number (e.g.
2.6.32) in two different ways and then comparing the results.
The linux_map_osrel() function converted MAJOR.MINOR.PATCH similar to
what FreeBSD does natively. I.e. where major=v0, minor=v1, and patch=v2
v = v0 * 1000000 + v1 * 1000 + v2;
The LINUX_KERNVER() macro, on the other hand, converted the value with
bit shifts. I.e. where major=a, minor=b, and patch=c
v = (((a) << 16) + ((b) << 8) + (c))
The Linux kernel uses the latter format via the KERNEL_VERSION() macro in
include/generated/uapi/linux/version.h
Fix is to use the LINUX_KERNVER() macro in linux_map_osrel() as well as
in the .trans_osrel functions.
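A standalone worked example of the two encodings of 2.6.32:

#include <assert.h>

#define	LINUX_KERNVER(a, b, c)	(((a) << 16) + ((b) << 8) + (c))

int
main(void)
{
	/* Old linux_map_osrel()-style encoding. */
	assert(2 * 1000000 + 6 * 1000 + 32 == 2006032);
	/* KERNEL_VERSION()-style encoding, now used by both paths. */
	assert(LINUX_KERNVER(2, 6, 32) == 0x020620);
	return (0);
}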
PR: 229209
Reviewed by: emaste, cem, imp (mentor)
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15952
Update the driver to use iflib in order to bring performance,
maintainability, and (hopefully) stability benefits.
The driver currently isn't completely ported; features that are missing:
- VF driver (ixlv)
- SR-IOV host support
- RDMA support
The plan is to have these re-added to the driver before the next FreeBSD release.
Reviewed by: gallatin@
Contributions by: gallatin@, mmacy@, krzysztof.galazka@intel.com
Tested by: jeffrey.e.pieper@intel.com
MFC after: 1 month
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D15577
Existing linuxulator platforms (i386, amd64) support legacy syscalls,
such as non-*at ones like open, but arm64 and other new platforms do
not.
Wrap these in #ifdef LINUX_LEGACY_SYSCALLS, #defined in the MD linux.h
files. We may need finer-grained control in the future, but this is
sufficient for now.
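The pattern looks roughly like this, using linux_open() as an
illustration; the body shown is a simplification, not the committed
handler:

#ifdef LINUX_LEGACY_SYSCALLS
int
linux_open(struct thread *td, struct linux_open_args *args)
{

	/*
	 * Legacy, non-*at flavour: funnel into the common *at
	 * implementation with AT_FDCWD.  (Simplified; the real handler
	 * also performs path translation.)
	 */
	return (linux_common_open(td, AT_FDCWD, args->path, args->flags,
	    args->mode));
}
#endif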
Reviewed by: andrew
Sponsored by: Turing Robotic Industries
Differential Revision: https://reviews.freebsd.org/D15237